Research Article Open Access

Speech Segmentation Using Dynamic Windows and Thresholds for Arabic and English Languages

Yahia Hasan Jazyah¹
  • ¹ Arab Open University, Kuwait

Abstract

Segmenting audio data such as human speech (splitting each word into a separate .WAV audio file) has been a major concern when working with multimedia such as radio or TV recordings. Work on detecting the boundaries of spoken language has focused mainly on energy and zero-crossing thresholds for endpoint detection. Errors in endpoint detection remain a main cause of low accuracy in segmentation systems. The goal of this research is to develop an efficient algorithm that segments human speech in both English and Arabic, at different speaking speeds, with high accuracy. Simulation results show that the developed algorithm segments human speech with an average accuracy of up to 91.6% for English and 89.0% for Arabic.
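For readers unfamiliar with threshold-based endpoint detection, the following is a minimal, generic sketch of the classic energy and zero-crossing approach that the abstract refers to. It is not the algorithm developed in this article: it uses fixed, non-overlapping frames and fixed thresholds, whereas the article concerns dynamic windows and thresholds. The function names, the frame length of 160 samples, and both threshold values are assumptions chosen for illustration only.

```python
def frame_features(samples, frame_len):
    """Return a list of (energy, zero_crossing_rate) per non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Short-time energy: mean of squared amplitudes in the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Zero-crossing rate: fraction of adjacent sample pairs that change sign.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def detect_segments(samples, frame_len=160, energy_thresh=0.01, zcr_thresh=0.25):
    """Return (start_sample, end_sample) pairs for runs of active frames.

    Assumed rule (illustrative, not from the article): a frame is active
    if its energy exceeds energy_thresh, or if its zero-crossing rate
    exceeds zcr_thresh while its energy is at least a tenth of
    energy_thresh (a rough way to keep weak, noisy consonants).
    """
    feats = frame_features(samples, frame_len)
    segments = []
    start = None
    for i, (energy, zcr) in enumerate(feats):
        active = energy > energy_thresh or (
            zcr > zcr_thresh and energy > 0.1 * energy_thresh
        )
        if active and start is None:
            start = i * frame_len          # a segment begins at this frame
        elif not active and start is not None:
            segments.append((start, i * frame_len))  # segment ended
            start = None
    if start is not None:
        # The signal ended while a segment was still open.
        segments.append((start, len(feats) * frame_len))
    return segments
```

Each returned pair marks one candidate segment in sample indices; in a word-splitting pipeline, each pair would correspond to one slice of the recording to be written to its own file. Fixed thresholds like these tend to fail when speaking speed or loudness varies, which is the kind of endpoint error the abstract identifies.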

Journal of Computer Science
Volume 14 No. 4, 2018, 485-490

DOI: https://doi.org/10.3844/jcssp.2018.485.490

Submitted On: 4 January 2018
Published On: 18 April 2018

How to Cite: Jazyah, Y. H. (2018). Speech Segmentation Using Dynamic Windows and Thresholds for Arabic and English Languages. Journal of Computer Science, 14(4), 485-490. https://doi.org/10.3844/jcssp.2018.485.490



Keywords

  • Audio
  • Voice
  • Speech
  • Segmentation