
Bemba Speech Translation: Exploring a Low-Resource African Language
Our system submission to the International Conference on Spoken Language Translation (IWSLT 2025).
At Kreasof AI, we believe in pushing the boundaries of artificial intelligence to create a more connected and inclusive world. We are thrilled to announce a significant step forward in this mission: a new research paper detailing our state-of-the-art speech translation system for Bemba, a language spoken by over 30% of the population in Zambia.
This pioneering work, led by Muhammad Hazim Al Farouq of our Kreasof AI Research Labs, in collaboration with researchers from the African Institute for Mathematical Sciences (AIMS) and the ADAPT Centre in Ireland, has been submitted to the prestigious International Conference on Spoken Language Translation (IWSLT 2025).
The Challenge: Giving a Voice to Low-Resource Languages
In today's digital age, language technology offers unprecedented access to information and services. However, most of these advancements are concentrated on a handful of high-resource languages like English or Mandarin. Languages like Bemba, despite being spoken by millions, are often left behind due to a scarcity of high-quality data needed to train AI models. This creates a digital divide, limiting access to technology for entire communities.
Our team took on this challenge directly, aiming to build a high-quality, robust speech translation system that could understand spoken Bemba and translate it accurately into English.
Our Innovative Approach: A Smart, Two-Step Solution
To overcome the data scarcity, our team developed a cascaded speech translation system. Instead of a single, complex end-to-end model, this approach breaks the problem into two manageable steps (sketched in code after the list):
- Automatic Speech Recognition (ASR): First, we use a fine-tuned version of OpenAI's Whisper model to accurately transcribe spoken Bemba into written Bemba text.
- Machine Translation (MT): Next, that transcript is fed into a fine-tuned, Bemba-specialized version of Meta AI's NLLB-200 model, which translates the text into English.
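For readers who want a concrete picture of the cascade, here is a minimal sketch using the Hugging Face Transformers library. The model names are public base checkpoints used as placeholders, and the audio file name is hypothetical; the fine-tuned Bemba checkpoints and decoding settings described in the paper are not assumed here.

```python
# Minimal sketch of the two-step cascade with Hugging Face Transformers.
# NOTE: model names are public placeholders, not the paper's fine-tuned checkpoints.
from transformers import pipeline

# Step 1 (ASR): transcribe spoken Bemba into written Bemba text with Whisper.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",  # placeholder; a Bemba fine-tune would go here
)
bemba_text = asr("bemba_utterance.wav")["text"]  # hypothetical audio file

# Step 2 (MT): translate the Bemba transcript into English with NLLB-200.
translator = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",  # placeholder NLLB-200 checkpoint
    src_lang="bem_Latn",
    tgt_lang="eng_Latn",
)
english_text = translator(bemba_text)[0]["translation_text"]
print(english_text)
```

Because the two stages are decoupled, each one can be fine-tuned, evaluated, and swapped out independently, which is exactly what makes a cascade attractive when data is scarce.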
The real innovation lies not just in using these models, but in how we adapted them. We fine-tuned them using available Bemba datasets, including the Big-C and BembaSpeech corpora, teaching them the unique nuances, grammar, and sounds of the language.
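To make the adaptation step more tangible, below is a minimal sketch of fine-tuning NLLB-200 on Bemba-English sentence pairs with the Transformers Trainer API. The tiny in-memory dataset, hyperparameters, and output directory are illustrative placeholders only; the paper's training uses the Big-C and BembaSpeech corpora with its own settings, and the Whisper ASR model is adapted in a similar spirit on the speech side.

```python
# Illustrative MT fine-tuning sketch; not the training script used for the paper.
from datasets import Dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

model_name = "facebook/nllb-200-distilled-600M"  # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(
    model_name, src_lang="bem_Latn", tgt_lang="eng_Latn"
)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy in-memory pairs standing in for Big-C style Bemba-English data.
pairs = Dataset.from_dict({
    "bem": ["Muli shani?", "Natotela sana."],
    "eng": ["How are you?", "Thank you very much."],
})

def preprocess(batch):
    # Tokenize Bemba as the source and English as the target (labels).
    return tokenizer(
        batch["bem"], text_target=batch["eng"], truncation=True, max_length=128
    )

tokenized = pairs.map(preprocess, batched=True, remove_columns=pairs.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="nllb200-bem-eng",   # hypothetical output directory
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=1e-5,
    logging_steps=1,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```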
Furthermore, to expand our training data, we employed a technique called back-translation. We took a large dataset of English text, translated it into Bemba using a preliminary model, and then paired each synthetic Bemba sentence with its original English counterpart to create new training examples for the Bemba-to-English direction. This data augmentation strategy proved crucial for boosting the final translation quality.
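The snippet below sketches that idea under the same placeholder assumptions as before: a general-purpose NLLB-200 checkpoint stands in for the preliminary English-to-Bemba model, and two hard-coded English sentences stand in for the large monolingual corpus used in the actual experiments.

```python
# Back-translation sketch: English monolingual text -> synthetic Bemba sources.
from transformers import pipeline

# Placeholder for the preliminary English-to-Bemba model described above.
eng_to_bem = pipeline(
    "translation",
    model="facebook/nllb-200-distilled-600M",
    src_lang="eng_Latn",
    tgt_lang="bem_Latn",
)

# Stand-ins for a large English monolingual corpus.
english_monolingual = [
    "The meeting starts at nine in the morning.",
    "Clean water is important for every community.",
]

# Pair each synthetic Bemba translation with its authentic English original.
synthetic_pairs = [
    {"bem": eng_to_bem(sentence)[0]["translation_text"], "eng": sentence}
    for sentence in english_monolingual
]

# These pairs are then mixed with the authentic parallel data before
# fine-tuning the Bemba-to-English translation model.
print(synthetic_pairs)
```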
The Results: A Quantum Leap in Performance
The results speak for themselves. Our fine-tuned and data-augmented system achieved a dramatic improvement over the baseline, off-the-shelf models.
Figure: Our cascaded system first transcribes Bemba audio to text, then translates the text to English.
As shown in the paper, our primary system saw its COMET score—a key metric for measuring translation quality—soar from a baseline of 16.23 to an impressive 51.74. Similarly, the BLEU score jumped from just 0.72 to 27.45. This represents a monumental leap in quality, producing translations that are not only grammatically correct but also semantically coherent.
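For readers who want to run this style of evaluation themselves, the sketch below computes BLEU with the sacrebleu package and COMET with the unbabel-comet package. The Unbabel/wmt22-comet-da checkpoint is an assumption, not necessarily the one used in the paper, and it outputs scores between 0 and 1, while the figures quoted above appear to be on a 0-100 scale; the toy sentences are placeholders for a real test set.

```python
# Evaluation sketch (pip install sacrebleu unbabel-comet).
import sacrebleu
from comet import download_model, load_from_checkpoint

# Toy data standing in for the real test set: Bemba sources, system outputs,
# and English reference translations, aligned one-to-one.
sources = ["(Bemba source sentence)"]
hypotheses = ["The farmers planted maize last week."]
references = ["The farmers planted maize last week."]

# Corpus-level BLEU over the whole test set.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU:  {bleu.score:.2f}")

# COMET: a neural metric that also conditions on the source sentence.
comet_path = download_model("Unbabel/wmt22-comet-da")  # assumed checkpoint
comet_model = load_from_checkpoint(comet_path)
comet_data = [
    {"src": s, "mt": h, "ref": r}
    for s, h, r in zip(sources, hypotheses, references)
]
comet_out = comet_model.predict(comet_data, batch_size=8, gpus=0)
print(f"COMET: {comet_out.system_score:.4f}")
```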
Interestingly, our cascaded system also outperformed a direct end-to-end model in translation quality (as measured by the COMET score), demonstrating the power and effectiveness of our two-step approach for this low-resource challenge.
Why This Matters
This research is more than an academic exercise; it's a blueprint for breaking down language barriers across the globe. By developing effective methodologies for low-resource languages, we are paving the way for:
- Greater Digital Inclusion: Enabling Bemba speakers to interact with global digital content and services in their native language.
- Preservation of Culture: Creating tools that help preserve and promote languages that are underrepresented online.
- Scalable Solutions: Providing a framework that can be adapted to support hundreds of other low-resource languages across Africa and the world.
Kreasof AI is proud to be at the forefront of this vital research, leveraging our expertise and computational resources to drive innovation that matters. We remain committed to building AI that serves all of humanity, not just a select few.
Read the full research paper to dive deeper into our methodology and results:
Title: Bemba Speech Translation: Exploring a Low-Resource African Language
Authors: Muhammad Hazim Al Farouq, Aman Kassahun Wassie, Yasmin Moslem
Link: arXiv:2505.02518v3 [cs.CL]