African researchers are working hard to create cutting-edge AI applications that are specifically designed for African languages.
Kathleen Siminyu of the Masakhane Research Foundation pointed out that billions of people who do not speak the most widely used languages, such as English, French, or Spanish, cannot use popular AI technologies such as the text generator ChatGPT or voice-activated assistants like Siri. This highlights the need for greater diversity and representation in the development of language technology.
“It doesn’t make sense to me that there are limited AI tools for African languages. Inclusion and representation in the advancement of language technology is not a patch you put at the end — it’s something you think about upfront,” Ms. Siminyu noted.
These specialised tools frequently rely on natural language processing, an area of artificial intelligence that enables computers to understand human languages. Computers learn a language by being trained to recognise patterns in speech and text data. The technology therefore fails when little data is available in a particular language, as is the case with African languages. To close this gap, the research team first identified the key players involved in creating African language tools.
Language tool infrastructure is shaped by the experiences, motivations, priorities, and challenges of content creators such as writers, editors, linguists, software engineers, and entrepreneurs. In-depth conversations with these stakeholders revealed four key principles to consider when creating African language tools.
First, the multilingual context of Africa must be addressed, acknowledging the long-lasting effects of colonisation. Indigenous languages play a crucial role in education, politics, and the economy, and they carry great cultural significance. Second, the production of African content must be encouraged. This requires creating basic tools for African languages, including dictionaries, spell checkers, and keyboards. It also entails removing administrative and financial barriers to translating official communications into various national languages, including African languages. Third, fostering collaboration between linguistics and computer science is key, because these fields intersect to yield innovative, human-centred solutions.
Fourth, ethical practices and community involvement throughout data collection and use are equally important. The team’s next phase involves addressing barriers that may impede people’s access to this technology. Their study aims to provide a roadmap for developing a range of language tools, from translation services to content moderation tools that target misinformation.
“I would love for us to live in a world where Africans can have as good quality of life and access to information and opportunities as somebody fluent in English, French, Mandarin, or other languages,” Siminyu said.
“There is a growing number of organizations working in this space, and this study allows us to coordinate efforts in building impactful language tools. The findings highlight and articulate what the priorities are, in terms of time and financial investments,” Siminyu added.