Audio Search Based on Keyword Spotting in Arabic Language

Abstract

Keyword spotting is an important application of speech recognition. This research introduces a keyword spotting approach for audio search of uttered words in Arabic speech. The matching process depends on the utterance nucleus, which is insensitive to its context. To spot the target utterances, the matched nuclei are expanded to cover the whole utterances. Applying this approach to the Quran and to standard Arabic gave promising results. To improve the spotting approach, it is combined with a text search when a transcript exists. This can be applied to the Quran, as there is exact correspondence between the audio and text files of each verse. The developed approach starts with a text search to identify the verses that include the target utterance(s). For each located verse, the occurrence(s) of the target utterance are determined. The target utterance (the reference) is manually segmented from a located verse. Keyword spotting is then performed for the extracted reference against the corresponding audio file. The spotting accuracy reached 97%. The experiments showed that combining text and audio search reduced the search time by 90% compared with audio-only search on the same content. The developed approach has also been applied to non-transcribed audio files (sermons and news) to search for chosen utterances, with promising results: spotting accuracy was around 84% for sermons and 88% for news.
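The combined text and audio search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `combined_search`, the verse dictionary layout, and the `spot` callback are hypothetical, and the nucleus-based matching itself is left as a placeholder supplied by the caller. The sketch only shows why a transcript reduces search time: the costly audio pass runs solely on verses whose text contains the target.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical layout: verse id -> (transcript text, path to the verse audio).
Verse = Tuple[str, str]
# A spotted occurrence as (start_seconds, end_seconds) within the audio file.
Hit = Tuple[float, float]


def combined_search(
    verses: Dict[str, Verse],
    target: str,
    spot: Callable[[str], List[Hit]],
) -> Dict[str, List[Hit]]:
    """Run the audio spotter only on verses whose transcript contains target.

    `spot` stands in for the keyword spotting step (assumed, not specified
    here); it receives an audio path and returns the spotted occurrences.
    """
    results: Dict[str, List[Hit]] = {}
    for verse_id, (transcript, audio_path) in verses.items():
        if target not in transcript:
            continue  # text search rules this verse out, so skip the audio pass
        hits = spot(audio_path)
        if hits:
            results[verse_id] = hits
    return results
```

With an audio-only search, `spot` would be called once per audio file; here it is called only for the verses that pass the text filter, which is the source of the time saving the abstract reports.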

Authors and Affiliations

Mostafa Awaid, Sahar Fawzi, Ahmed Kandil

  • EP ID EP141854
  • DOI 10.14569/IJACSA.2014.050219

How To Cite

Mostafa Awaid, Sahar Fawzi, Ahmed Kandil (2014). Audio Search Based on Keyword Spotting in Arabic Language. International Journal of Advanced Computer Science & Applications, 5(2), 128-133. https://europub.co.uk/articles/-A-141854