Research Article Open Access

SuVashantor: English to Bangla Machine Translation Systems

Mahjabeen Akter¹, M. Shahidur Rahman¹, Muhammed Zafar Iqbal¹ and Mohammad Reza Selim¹
  • ¹ Shahjalal University of Science and Technology, Bangladesh

Abstract

This paper describes two Machine Translation (MT) systems for the English-Bangla language pair. Our goal was to create two benchmark MT systems that produce better-quality translations and comparatively higher evaluation scores than existing English-to-Bangla MT systems. In our experiments, we implemented two baseline MT systems for this language pair, one statistical and one neural. Our phrase-based statistical model and 2-layer LSTM neural model were trained and evaluated on a large, carefully pre-processed dataset that contains unique training data, which avoids bias from the cross-validation and test data. With these setups, we achieved the highest BLEU scores in our experiments. Furthermore, we improved the performance of the neural model using pre-trained embeddings and synthetic monolingual data, both of which are cutting-edge techniques for neural models.

Journal of Computer Science
Volume 16 No. 8, 2020, 1128-1138

DOI: https://doi.org/10.3844/jcssp.2020.1128.1138

Submitted On: 13 June 2020 Published On: 16 August 2020

How to Cite: Akter, M., Rahman, M. S., Iqbal, M. Z. & Selim, M. R. (2020). SuVashantor: English to Bangla Machine Translation Systems. Journal of Computer Science, 16(8), 1128-1138. https://doi.org/10.3844/jcssp.2020.1128.1138

Keywords

  • Machine Learning
  • Machine Translation Systems
  • Statistical Machine Translation Systems
  • Neural Network
  • Neural Machine Translation Systems
  • Pre-trained Word Embedding
  • Synthetic Monolingual Data