Partial Greedy Algorithm to Extract a Minimum Phonetically-and-Prosodically Rich Sentence Set

Abstract

A phonetically-and-prosodically rich sentence set is important for collecting a read-speech corpus used to develop phoneme-based speech recognition. Such a set is usually selected from a large text corpus of millions of sentences using optimization methods. One commonly used method for this task is the Least-to-Most Greedy (LTMG) algorithm. It is effective in minimizing the number of phoneme units but, unfortunately, does not balance their frequencies. In this paper, a new method called the Partial LTMG algorithm (PLTMG) is proposed to search for an optimum set whose triphones and prosodies are distributed in a near-uniform fashion. Tests on an Indonesian text corpus of ten million sentences, crawled from newspaper and novel websites, show that the proposed method is not only capable of minimizing both phoneme units and prosodies but also effective in distributing their frequencies.
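The abstract does not spell out how a least-to-most greedy selection proceeds, so the following is only a minimal sketch of the general idea as it is commonly described: units are visited from rarest to most frequent, and for each uncovered unit the sentence that adds the most new units is picked. It is not the authors' PLTMG method. The function names `triphones` and `least_to_most_greedy`, the toy phoneme sequences, and the tie-breaking rule are all assumptions made for illustration, and prosody is ignored.

```python
from collections import Counter


def triphones(phonemes):
    """Return all triphones (sliding windows of three phonemes) in a sequence."""
    return [tuple(phonemes[i:i + 3]) for i in range(len(phonemes) - 2)]


def least_to_most_greedy(sentences):
    """Select sentence ids so that every triphone in the corpus is covered.

    `sentences` maps a sentence id to its phoneme list (hypothetical format).
    Units are processed from least to most frequent; this is a generic sketch,
    not the PLTMG algorithm from the paper.
    """
    units = {sid: set(triphones(p)) for sid, p in sentences.items()}
    # Number of sentences in which each triphone occurs.
    freq = Counter(u for us in units.values() for u in us)

    covered = set()
    selected = []
    # Visit the rarest units first (ties broken by the unit itself).
    for unit, _count in sorted(freq.items(), key=lambda kv: (kv[1], kv[0])):
        if unit in covered:
            continue
        candidates = [sid for sid, us in units.items() if unit in us]
        # Assumed scoring rule: prefer the sentence adding the most new units.
        best = max(candidates, key=lambda sid: len(units[sid] - covered))
        selected.append(best)
        covered |= units[best]
    return selected


# Toy corpus: each "phoneme" is a single letter.
corpus = {
    "s1": list("abcd"),
    "s2": list("bcde"),
    "s3": list("abcde"),
}
chosen = least_to_most_greedy(corpus)
```

In this toy corpus the single sentence `s3` already contains every triphone, so it is the only one selected. Note that a selection of this kind only aims at coverage with few sentences; it does nothing to even out how often each unit occurs, which is the limitation the abstract attributes to LTMG.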

Authors and Affiliations

Fahmi Alfiansyah, Suyanto Suyanto

  • EP ID EP429238
  • DOI 10.14569/IJACSA.2018.091274

How To Cite

Fahmi Alfiansyah, Suyanto Suyanto (2018). Partial Greedy Algorithm to Extract a Minimum Phonetically-and-Prosodically Rich Sentence Set. International Journal of Advanced Computer Science & Applications, 9(12), 530-534. https://europub.co.uk/articles/-A-429238