
UCT Joins National Powerhouse to Build AI for African Languages

Large Language Models isiXhosa isiZulu Sepedi

As Artificial Intelligence (AI) rapidly evolves, a national collaboration is working to ensure that South Africa’s linguistic heritage isn’t left behind. Researchers from the University of Cape Town (UCT) have teamed up with the University of Zululand, the University of Limpopo, and the University of Fort Hare to develop Large Language Models (LLMs) specifically for isiXhosa, isiZulu, and Sepedi.

Supported by the National Research Foundation (NRF) and the Telkom Centres of Excellence, this project represents a shift toward “sovereign AI”—creating technology that reflects local needs rather than simply importing models from the Global North.



The Challenge: Complex Structures and Data Scarcity

Building a chatbot like ChatGPT for English is comparatively straightforward because the internet is saturated with English text to train on. For African languages, researchers face two major hurdles:

  1. The “Data Gap”: There is significantly less digital text available in isiZulu or isiXhosa than in English. To narrow this gap, the team is digitizing physical archives and library books that have never been available online.
  2. Morphological Complexity: Unlike English, many African languages are “agglutinative,” meaning words are formed by adding multiple prefixes and suffixes to a root, so a single word can carry the meaning of a whole English phrase. This requires specialized algorithms that can handle intricate word structures.
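
To make the idea of agglutination concrete, here is a minimal Python sketch that splits two isiZulu verb forms into their parts. It is an illustration only, not the project's method: the morpheme lists and the `segment` function are hypothetical and hand-written for this example, and real systems typically learn such segmentations from data rather than from a fixed list.

```python
# Toy illustration: hand-written morpheme slots for a few isiZulu verb forms.
# These lists are hypothetical stand-ins, not a real lexicon.
SLOTS = [
    # (slot name, candidate morphemes, is the slot required?)
    ("subject", ["ngi", "u", "si", "ba"], True),
    ("tense", ["ya"], False),
    ("object", ["ku", "ngi", "si", "ba"], False),
    ("root", ["thand", "bon", "bong"], True),
    ("final", ["a"], True),
]


def segment(word, slot=0):
    """Return a list of (slot, morpheme) pairs covering the whole word, or None."""
    if slot == len(SLOTS):
        # Succeed only if every character has been consumed.
        return [] if word == "" else None
    name, options, required = SLOTS[slot]
    for morpheme in options:
        if word.startswith(morpheme):
            rest = segment(word[len(morpheme):], slot + 1)
            if rest is not None:
                return [(name, morpheme)] + rest
    # Optional slots may be skipped; required ones may not.
    return None if required else segment(word, slot + 1)


# "ngiyakuthanda" is one written word, roughly "I love you" in English.
parts = segment("ngiyakuthanda")
print(" + ".join(m for _, m in parts))   # ngi + ya + ku + thand + a
print(len("ngiyakuthanda".split()))      # 1 token if we split on spaces
print(len("I love you".split()))         # 3 tokens in English
```

Splitting on spaces, which works tolerably for English, treats the whole isiZulu word as a single unit, while the sketch recovers five meaningful pieces. That gap is one reason models for these languages need to handle word structure more carefully.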

Why It Matters: Accuracy in Healthcare and Education

When AI fails to understand a language correctly, the consequences can be more than just a bad translation—they can be dangerous.

  • Combating Misinformation: In healthcare, a poorly framed AI response to a medical query in isiXhosa could provide life-threatening advice.
  • Access to Services: Reliable AI tools allow citizens to access public services and information in the language they speak most comfortably at home.
  • Bilingual Support: Early trials in neonatal wards have already shown success, allowing parents to listen to medical information in English while reading a synchronized translation in isiXhosa.


A Vision for “Digital Ownership”

Led at UCT by Associate Professor Melissa Densmore and Dr. Jan Buys, the project emphasizes ethics and community consultation. The goal isn’t just to build a tool, but to empower communities to build their own tools.

“My long-term vision is that people can build technologies themselves in their own languages,” says Densmore. “Whether those are powered by language models or other kinds of AI, the key is that communities have ownership over them.”


Building a National AI Legacy

The project, which runs until 2027, is a cornerstone of UCT’s broader ambition to establish a dedicated AI Institute. By funding Master’s, PhD, and postdoctoral researchers across four provinces, the initiative is training the next generation of South African computer scientists to lead the continent’s digital future.
