Research Article Open Access

A Context-Free Grammar for Parsing Manipuri Language

Yumnam Nirmal1 and Utpal Sharma1
  • 1 Tezpur University, India

Abstract

Parsing, i.e., identifying the underlying hierarchical structure of natural language expressions, is important for several natural language processing applications. In recent times, Machine Learning (ML) approaches to this task have been developed for many languages. Most of the effective techniques require an annotated corpus of the language for training and validation. For the Manipuri language of the Tibeto-Burman family, neither such a corpus nor a grammar framework to automatically analyse and represent the structure of sentences exists yet. This study proposes a Context-Free Grammar (CFG) that provides a framework to represent the structure of Manipuri sentences. This paves the way for parsing Manipuri sentences with CFG-based parsers for various applications and for conveniently building a Treebank for developing ML-based parsers for Manipuri. The rules of the proposed CFG are handcrafted after extensive analysis of the structure of Manipuri sentences. The grammar covers simple, compound, complex and compound-complex sentences. For evaluation, we drive an Earley parser with the proposed CFG and test it on a collection of sentences that covers the possible varieties of structure. A recognition rate of 83.20% achieved in these experiments indicates the effectiveness of the proposed grammar.
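The evaluation described in the abstract (an Earley parser driven by a CFG, scored by recognition rate) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: `TOY_GRAMMAR` is a hypothetical verb-final toy grammar over part-of-speech tags, not the handcrafted rule set proposed in the article; `earley_recognize` and `recognition_rate` are hypothetical helper names; and the recognition rate is assumed to mean the percentage of test sentences the parser accepts.

```python
from typing import Dict, List, Sequence, Set, Tuple

# A grammar maps each non-terminal to its list of right-hand sides.
Grammar = Dict[str, List[Tuple[str, ...]]]
# An Earley item: (lhs, rhs, dot position, origin chart index).
Item = Tuple[str, Tuple[str, ...], int, int]

# Hypothetical toy grammar over POS tags (NOT the article's rules).
# It has no empty productions, which keeps the recognizer simple.
TOY_GRAMMAR: Grammar = {
    "S": [("NP", "VP"), ("S", "CONJ", "S")],
    "NP": [("N",), ("DET", "N")],
    "VP": [("NP", "V"), ("V",)],
}


def earley_recognize(grammar: Grammar, start: str, tokens: Sequence[str]) -> bool:
    """Return True if `tokens` can be derived from `start` (no epsilon rules)."""
    n = len(tokens)
    chart: List[Set[Item]] = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))

    for i in range(n + 1):
        agenda = list(chart[i])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in grammar:
                    # Predict: expand the non-terminal after the dot.
                    for production in grammar[symbol]:
                        new = (symbol, production, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)
                elif i < n and tokens[i] == symbol:
                    # Scan: the terminal matches the next input token.
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
            else:
                # Complete: advance every item that was waiting for `lhs`.
                for l2, r2, d2, o2 in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        new = (l2, r2, d2 + 1, o2)
                        if new not in chart[i]:
                            chart[i].add(new)
                            agenda.append(new)

    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[n]
    )


def recognition_rate(grammar: Grammar, start: str,
                     sentences: Sequence[Sequence[str]]) -> float:
    """Percentage of sentences accepted by the grammar (assumed definition)."""
    recognized = sum(earley_recognize(grammar, start, s) for s in sentences)
    return 100.0 * recognized / len(sentences)


# Made-up test sentences, already reduced to POS tags.
TEST_SENTENCES = [
    ["N", "N", "V"],                       # subject object verb
    ["N", "V"],                            # subject verb
    ["N", "V", "CONJ", "DET", "N", "V"],   # compound sentence
    ["V", "N"],                            # verb-initial, not covered
]
```

With these made-up inputs, `recognition_rate(TOY_GRAMMAR, "S", TEST_SENTENCES)` accepts three of the four tag sequences. The sketch only recognizes sentences; building parse trees for a Treebank would additionally require storing back-pointers in the chart.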

Journal of Computer Science
Volume 17 No. 9, 2021, 855-869

DOI: https://doi.org/10.3844/jcssp.2021.855.869

Submitted On: 23 March 2021 Published On: 9 October 2021

How to Cite: Nirmal, Y. & Sharma, U. (2021). A Context-Free Grammar for Parsing Manipuri Language. Journal of Computer Science, 17(9), 855-869. https://doi.org/10.3844/jcssp.2021.855.869


Keywords

  • Context-Free Grammar
  • Parsing
  • Manipuri
  • Tibeto-Burman