Research Article Open Access

Fast Algorithms for Discovering Sequential Patterns in Massive Datasets

S. Dharani, Justus Rabi, Nanda Kumar and Darly

Abstract

Problem statement: Sequential pattern mining is a data mining task applied in particular to retail data. The task is to discover all sequential patterns with a user-specified minimum support, where the support of a pattern is the number of data-sequences that contain the pattern. Approach: Several algorithms for finding sequential patterns already exist, such as AprioriAll and Generalized Sequential Patterns (GSP). We present fast and efficient algorithms, called AprioriAllSID and GSPSID, for mining sequential patterns that are fundamentally different from known algorithms. Results: The proposed algorithms were implemented and compared with AprioriAll and GSP, and their performance was studied experimentally. We also combined the AprioriAllSID algorithm with the AprioriAll algorithm into a hybrid algorithm, called AprioriAll Hybrid. Conclusion: The implementation shows that the execution time of an algorithm for finding sequential patterns depends on the total number of candidates generated at each level and on the time taken to scan the database. Our performance study shows that the proposed algorithms outperform the best existing algorithms.
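
The support definition in the abstract can be illustrated with a minimal Python sketch. It assumes the usual representation in which a data-sequence is an ordered list of itemsets and a pattern is contained in it when each pattern element is a subset of a distinct, later-occurring itemset. The function names (contains, support, frequent_patterns) and the toy retail data are hypothetical and chosen for illustration only; this is a brute-force support count, not the AprioriAllSID or GSPSID algorithm.

```python
from typing import FrozenSet, List, Sequence

Itemset = FrozenSet[str]
DataSequence = Sequence[Itemset]


def contains(data_sequence: DataSequence, pattern: DataSequence) -> bool:
    """Return True if `pattern` occurs in `data_sequence` in order.

    Each pattern element must be a subset of some itemset, and the
    matched itemsets must appear in the same order as the pattern.
    Greedy left-to-right matching is sufficient for this definition.
    """
    pos = 0  # index of the next pattern element still to be matched
    for itemset in data_sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def support(database: Sequence[DataSequence], pattern: DataSequence) -> int:
    """Count the data-sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if contains(seq, pattern))


def frequent_patterns(
    database: Sequence[DataSequence],
    candidates: Sequence[DataSequence],
    min_support: int,
) -> List[DataSequence]:
    """Keep only the candidates whose support reaches `min_support`."""
    return [c for c in candidates if support(database, c) >= min_support]


# Hypothetical toy database: one data-sequence per customer.
DATABASE: List[DataSequence] = [
    [frozenset({"bread"}), frozenset({"milk"})],
    [frozenset({"bread", "butter"}), frozenset({"jam"}), frozenset({"milk"})],
    [frozenset({"milk"}), frozenset({"bread"})],
    [frozenset({"bread"}), frozenset({"butter", "milk"})],
]

BREAD_THEN_MILK: DataSequence = [frozenset({"bread"}), frozenset({"milk"})]
MILK_THEN_BREAD: DataSequence = [frozenset({"milk"}), frozenset({"bread"})]
```

With this toy data, support(DATABASE, BREAD_THEN_MILK) counts the customers who bought bread and, in a later transaction, milk. Every candidate checked this way costs one pass over the database, which is consistent with the abstract's observation that execution time depends on the number of candidates generated and on the time taken to scan the database.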

Journal of Computer Science
Volume 7 No. 9, 2011, 1325-1329

DOI: https://doi.org/10.3844/jcssp.2011.1325.1329

Submitted On: 13 April 2011 Published On: 23 July 2011

How to Cite: Dharani, S., Rabi, J., Kumar, N. & Darly (2011). Fast Algorithms for Discovering Sequential Patterns in Massive Datasets. Journal of Computer Science, 7(9), 1325-1329. https://doi.org/10.3844/jcssp.2011.1325.1329

Keywords

  • Data mining
  • sequential pattern mining
  • AprioriAll Hybrid
  • proposed algorithm
  • temporary database
  • candidate sequences
  • minimum support