Partial Greedy Algorithm to Extract a Minimum Phonetically-and-Prosodically Rich Sentence Set
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2018, Vol 9, Issue 12
Abstract
A phonetically-and-prosodically rich sentence set is important for collecting a read-speech corpus used to develop phoneme-based speech recognition. Such a set is usually extracted from a huge text corpus of millions of sentences using optimization methods. One commonly used method for this task is the Least-to-Most Greedy (LTMG) algorithm. It is effective in minimizing the number of phoneme units, but it does not distribute their frequencies evenly. In this paper, a new method called the Partial LTMG (PLTMG) algorithm is proposed to search for an optimum set containing triphones and prosodies that are distributed in a near-uniform fashion. Tests on an Indonesian text corpus of ten million sentences crawled from newspaper and novel websites show that the proposed method is not only capable of minimizing both phoneme units and prosodies but also effective in distributing their frequencies.
Authors and Affiliations
Fahmi Alfiansyah, Suyanto Suyanto