Developing Statistical Machine Translation System for English and Nigerian Languages

Ignatius Ikechukwu Ayogu
Adebayo Olusola Adetunmbi
Bolanle Adefowoke Ojokoh


The global demand for translation and translation tools currently surpasses the capacity of available solutions. Moreover, there is no one-size-fits-all, off-the-shelf solution for all languages. Thus, the need and urgency to increase the scale of research on the development of translation tools and devices continue to grow, especially for languages suffering under the pressure of globalisation. This paper discusses our experiments on translation systems between English and two Nigerian languages: Igbo and Yorùbá. The study was set up to build parallel corpora and to train and evaluate English-to-Igbo, English-to-Yorùbá and Igbo-to-Yorùbá phrase-based statistical machine translation (SMT) systems. The systems were trained on parallel corpora that were created for each language pair in the course of this research, using text from the religious domain. BLEU scores of 30.04, 29.01 and 18.72 were recorded for the English-to-Igbo, English-to-Yorùbá and Igbo-to-Yorùbá MT systems, respectively. An error analysis of the systems' outputs, conducted using a linguistically motivated MT error analysis approach, showed that errors occurred mostly at the lexical, grammatical and semantic levels. While the study reveals the potential of our corpora, it also shows that the size of the corpora remains an issue that requires further attention. Thus, an important target for the immediate future is to increase the quantity and quality of the data.
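For readers unfamiliar with the metric, the BLEU scores reported above can be read against the following minimal sketch of corpus-level BLEU. It is illustrative only: the abstract does not say which BLEU implementation or tokenisation was used, so the function name `corpus_bleu`, the whitespace tokenisation, the single reference per sentence and the lack of smoothing are all assumptions, and the example sentences are invented rather than taken from the authors' corpora.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale, one reference per hypothesis.

    Assumptions (not from the paper): whitespace tokenisation, uniform
    n-gram weights, no smoothing. A zero match count at any n-gram
    order therefore gives a score of 0.
    """
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection clips each hypothesis n-gram count
            # by its count in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())

    if hyp_len == 0 or min(matches) == 0:
        return 0.0

    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalise output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


# Invented toy data, for illustration only.
refs = ["the cat sat on the mat"]
print(corpus_bleu(["the cat sat on the mat"], refs))
print(corpus_bleu(["the cat sat on the mat today"], refs))
```

An exact match scores 100, and the score falls as fewer n-grams of the system output are found in the reference. Because no smoothing is applied, this sketch is only meaningful over a whole test set, not single short sentences.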


Machine translation, Igbo language, Yoruba language, parallel corpora, SMT

How to Cite
Ayogu, I. I., Adetunmbi, A. O., & Ojokoh, B. A. (2018). Developing Statistical Machine Translation System for English and Nigerian Languages. Asian Journal of Research in Computer Science, 1(4), 1-8.
Original Research Article