First Token Algorithm for Searching Compound Terms Using Thesaurus Database

Yousef Abuzir; Thabit Sabbah

doi:10.3844/jcssp.2012.61.67

Research Article Open Access

Abstract

Problem statement: Searching text is one of the most important operations carried out by search engines, in both web and desktop applications. Searching algorithms are sometimes required to find a single word in a text and at other times to find a multi-word term (pattern matching). Searching for a term in a thesaurus database can be carried out with many searching algorithms, such as the brute-force algorithm. Approach: We addressed several issues in developing an algorithm that searches for terms in a thesaurus database. Two exact algorithms were discussed and compared: the brute-force algorithm, and a second algorithm proposed in this study to enhance it. Results: We proposed an efficient search algorithm and compared it with the brute-force technique. Computational results showed that our algorithm reduces both the number of queries and the total time required to complete the task. Conclusion: Our study showed an optimum solution for larger instances of the studied problem, with much less processing time than the brute-force algorithm. The modified algorithm is more efficient at handling thesaurus database searching problems.
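The contrast between the two approaches can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the algorithm's details, so the sketch assumes that the First Token idea amounts to indexing thesaurus terms by their first token and issuing one lookup per text token. It also models the thesaurus as an in-memory Python set rather than a database. All function names (`brute_force_search`, `build_first_token_index`, `first_token_search`) and the `max_len` parameter are hypothetical.

```python
from collections import defaultdict


def brute_force_search(tokens, thesaurus, max_len):
    """Query the thesaurus for every n-gram of up to max_len tokens."""
    found, queries = [], 0
    for i in range(len(tokens)):
        for n in range(1, max_len + 1):
            if i + n > len(tokens):
                break
            candidate = " ".join(tokens[i:i + n])
            queries += 1  # one thesaurus query per candidate n-gram
            if candidate in thesaurus:
                found.append((i, candidate))
    return found, queries


def build_first_token_index(thesaurus):
    """Assumed preprocessing step: group terms by their first token."""
    index = defaultdict(list)
    for term in thesaurus:
        parts = term.split()
        if parts:  # skip empty terms
            index[parts[0]].append(parts)
    return index


def first_token_search(tokens, index):
    """Issue one lookup per token, then verify only the terms that start with it."""
    found, queries = [], 0
    for i, token in enumerate(tokens):
        queries += 1  # one index lookup per text token
        for parts in index.get(token, ()):
            if tokens[i:i + len(parts)] == parts:
                found.append((i, " ".join(parts)))
    return found, queries
```

Under these assumptions, the brute-force variant issues roughly `max_len` queries per text token, while the first-token variant issues one. That is one plausible reading of the reported reduction in the number of queries.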

Journal of Computer Science

Volume 8 No. 1, 2012, 61-67

DOI: https://doi.org/10.3844/jcssp.2012.61.67

Submitted On: 9 August 2011; Published On: 28 October 2011

How to Cite: Abuzir, Y. & Sabbah, T. (2012). First Token Algorithm for Searching Compound Terms Using Thesaurus Database. Journal of Computer Science, 8(1), 61-67. https://doi.org/10.3844/jcssp.2012.61.67

Copyright: © 2012 Yousef Abuzir and Thabit Sabbah. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

4,721 Views
3,469 Downloads
2 Citations


Keywords

Brute-force
pattern matching
information retrieval
compound terms searching
First Token (FT)
thesaurus database
training thesaurus