Not lost in translation: training AI to speak African languages - African Business

The African continent is home to more than 2,000 languages, yet a 2025 study of large language models – advanced artificial intelligence systems designed to understand, generate and interact with human language – published in the Proceedings of Machine Learning Research found only limited support for African tongues.

The paper, which comparatively analysed African language coverage across six large language models, eight small language models and six specialised small language models (SSLMs), found support for 41 African languages and 23 available public data sets. But it found “a big gap” with only four languages – Amharic, Swahili, Afrikaans and Malagasy – always handled, while over 98% of African languages went unsupported.

With so many African languages neglected, there are fears that entire groups of speakers could be cut out of the AI revolution. So one team of computer scientists from the University of Cape Town is looking to close the gap that has left millions underserved by mainstream AI tools.

Earlier this year, researchers Anri Lombard, Jan Buys, Francois Meyer and their team unveiled MzansiLM, a language model specifically built to include data from all 11 of South Africa’s official written languages. Alongside it, they released MzansiText, the curated multilingual dataset on which MzansiLM was trained.

Not lost in translation: training AI to speak African languages - African Business

Not lost in translation: training AI to speak African languages - African Business

Other newsrooms on this story

Related reading

AI in Africa: Experts aim to close the language gap

Google backs African push to reclaim AI language data

China Is Providing AI That’s Literate in Africa’s Languages

From postponed tour to platform: Nkenne’s Zoom-fueled mission to preserve…

The AI infrastructure window is open. Africa shouldn't watch this one close. |…

How a Community Effort is Teaching AI to See Africa’s Richness

Other newsrooms on this story

Related reading

AI in Africa: Experts aim to close the language gap

Google backs African push to reclaim AI language data

China Is Providing AI That’s Literate in Africa’s Languages

From postponed tour to platform: Nkenne’s Zoom-fueled mission to preserve…

The AI infrastructure window is open. Africa shouldn't watch this one close. |…

How a Community Effort is Teaching AI to See Africa’s Richness