Research Article Open Access

Genetic Algorithm Based Probabilistic Motif Discovery in Unaligned Biological Sequences

M. Hemalatha and K. Vivekanandan

Abstract

Finding motifs in biosequences is the most important primitive operation in computational biology. A motif discovery algorithm places many demands on computing resources, such as memory space and computational complexity. To overcome the complexity of motif discovery, we propose an alternative solution that integrates a genetic algorithm with the Fuzzy ART machine learning approach and eliminates the multiple sequence alignment process.

Problem statement: More than a hundred methods have been proposed for motif discovery in recent years, representing a large variation in both the algorithmic approaches and the underlying models of regulatory regions. The aim of this study was to develop an alternative solution for motif discovery that benefits from both data mining and genetic algorithms and, at the same time, eliminates the cost incurred by multiple sequence alignment.

Approach: A genetic algorithm based probabilistic motif discovery model was designed to solve the problem. The proposed algorithm was implemented in Matlab and tested with large DNA sequence data sets and synthetic data sets.

Results: The proposed model was compared with the existing method in terms of the speed of discovery and the length of the motif found. The proposed method finds a motif of length 11 in 18 sec and a motif of length 15 in 24 sec, whereas the existing method takes 34 sec to find a motif of length 11. On these measures, the proposed method outperforms the popular existing method.

Conclusion: In this study, we proposed a model that discovers motifs in a large set of unaligned sequences in considerably less time. The motifs discovered were also long. The proposed algorithm was implemented in Matlab and tested with large DNA sequence data sets and synthetic data sets.
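The abstract does not give the details of the proposed model, so the following is only a minimal, generic sketch of how a genetic algorithm can search for a motif in unaligned sequences without a prior multiple sequence alignment. It is not the authors' method, it is written in Python rather than Matlab, and it omits the Fuzzy ART component entirely. The names `score` and `ga_motif`, the information-content fitness, and all parameter values are assumptions made for illustration: each individual is one candidate start position per sequence, and fitness is the summed per-column information content of the windows those positions select.

```python
import math
import random


def score(seqs, starts, w):
    """Summed per-column information content (bits) of the chosen windows.

    Assumes a 4-letter DNA alphabet, so each column contributes at most 2 bits.
    """
    n = len(seqs)
    total = 0.0
    for col in range(w):
        counts = {}
        for seq, pos in zip(seqs, starts):
            base = seq[pos + col]
            counts[base] = counts.get(base, 0) + 1
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        total += 2.0 - entropy
    return total


def ga_motif(seqs, w, pop_size=60, generations=150, mut_rate=0.2, seed=0):
    """Hypothetical GA: evolve one motif start position per sequence."""
    rng = random.Random(seed)
    limits = [len(s) - w for s in seqs]  # largest valid start in each sequence

    # Initial population: random start positions for every sequence.
    pop = [[rng.randint(0, m) for m in limits] for _ in range(pop_size)]

    for _ in range(generations):
        # Truncation selection: keep the fitter half unchanged (elitism).
        pop.sort(key=lambda ind: score(seqs, ind, w), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover: each start position comes from either parent.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: occasionally re-draw a start position at random.
            for i, m in enumerate(limits):
                if rng.random() < mut_rate:
                    child[i] = rng.randint(0, m)
            children.append(child)
        pop = elite + children

    best = max(pop, key=lambda ind: score(seqs, ind, w))
    return best, score(seqs, best, w)
```

Calling `ga_motif(seqs, w)` with a list of DNA strings and a motif length `w` returns the best start positions found and their score. Because the search is stochastic, the result depends on `seed` and is not guaranteed to be the optimal motif.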

Journal of Computer Science
Volume 4 No. 8, 2008, 625-630

DOI: https://doi.org/10.3844/jcssp.2008.625.630

Submitted On: 8 December 2007 Published On: 31 August 2008

How to Cite: Hemalatha, M. & Vivekanandan, K. (2008). Genetic Algorithm Based Probabilistic Motif Discovery in Unaligned Biological Sequences. Journal of Computer Science, 4(8), 625-630. https://doi.org/10.3844/jcssp.2008.625.630

Keywords

  • Bioinformatics
  • genetic algorithm
  • motif
  • DNA sequence
  • multiple unaligned