Vukosi Marivate: Charting a Path for African Low-Resource Languages: A Multifaceted Approach to Research and Development

ICML 2024 Read our full coverage of this year's conference  

Vukosi Marivate’s talk covered a range of work, spanning research, development, and community-building, aimed at improving support for low-resource languages, with a particular focus on those spoken on the African continent.

He made it clear that building chatbots in African languages isn’t necessarily the right path, pointing out that for many of these low-resource languages, reliable and accurate digital dictionaries and thesauruses would already be a useful step.

Highlighted work included generating sentence-pair datasets directly between low-resource languages (so that translation does not have to go via English), as well as text augmentation tools and dataset curation.

Vukosi Marivate presenting at ICML 2024

Links to mentioned resources:

  • Deep Learning Indaba: ML meetings across the African continent (the next one is 1-7 September 2024, in Dakar)
  • Masakhane: a grassroots NLP research community for African languages
  • Lelapa AI: Vukosi Marivate’s Africa-centric AI research lab
  • Data Workers’ Inquiry: a community-based research project on data workers’ workplaces