SuVashantor: English to Bangla Machine Translation Systems
- Shahjalal University of Science and Technology, Bangladesh
Abstract
This paper describes Machine Translation (MT) systems for the English-Bangla language pair. Our goal was to create two benchmark MT systems that produce better-quality translations and higher evaluation scores than existing English-to-Bangla MT systems. In our experiments, we implemented two baseline MT systems for this language pair, one statistical and one neural. Our phrase-based statistical model and 2-layer LSTM neural model were trained and evaluated on a large, carefully pre-processed dataset that contains unique training data to avoid bias from the cross-validation and test data. These setups achieved the highest BLEU scores in our experiments. Furthermore, we improved the performance of the neural model using pre-trained embeddings and synthetic monolingual data, both of which are state-of-the-art techniques for neural models.
DOI: https://doi.org/10.3844/jcssp.2020.1128.1138
Copyright: © 2020 Mahjabeen Akter, M. Shahidur Rahman, Muhammed Zafar Iqbal and Mohammad Reza Selim. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Machine Learning
- Machine Translation Systems
- Statistical Machine Translation Systems
- Neural Network
- Neural Machine Translation Systems
- Pre-trained Word Embedding
- Synthetic Monolingual Data