Greedy Algorithms to Optimize a Sentence Set Near-Uniformly Distributed on Syllable Units and Punctuation Marks

Abstract

An optimum sentence set that is near-uniformly distributed over syllable units and punctuation marks is important for developing a syllable-based automatic speech recognition (ASR) system. Such a set is usually extracted from a mother set of millions of unique sentences using the Modified Least-to-Most (LTM) Greedy algorithm. The Modified LTM Greedy algorithm can minimize the number of syllables but ignores how their frequencies are distributed. Hence, two schemes are proposed that minimize the number of syllables while also distributing their frequencies near-uniformly. Tests on a mother set of 10 million Indonesian sentences show that both schemes perform better than the Modified LTM Greedy algorithm for two syllable units: monosyllables and bisyllables.
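To make the idea of greedy extraction from a mother set concrete, the following is a minimal sketch of a generic least-to-most greedy cover. It is an illustration only: it is neither the Modified LTM Greedy algorithm nor either of the two proposed schemes, whose details are not given in the abstract. The function name `ltm_greedy`, the `units_of` callback, and the whitespace "tokenizer" standing in for a real syllabifier are all assumptions made for the example.

```python
from collections import Counter


def ltm_greedy(sentences, units_of):
    """Select sentences so that every unit in the mother set is covered.

    Units are visited from least to most frequent; for each unit that is
    still uncovered, the sentence containing it that adds the most
    uncovered units is selected (hypothetical tie-break: lowest index).
    """
    # Set of units per sentence; units_of is supplied by the caller.
    unit_sets = [set(units_of(s)) for s in sentences]
    # Number of sentences in which each unit occurs.
    freq = Counter(u for units in unit_sets for u in units)

    covered, chosen = set(), []
    # Sort by (frequency, unit) so the visiting order is deterministic.
    for unit, _ in sorted(freq.items(), key=lambda kv: (kv[1], kv[0])):
        if unit in covered:
            continue
        candidates = [i for i, units in enumerate(unit_sets) if unit in units]
        best = max(candidates, key=lambda i: (len(unit_sets[i] - covered), -i))
        chosen.append(best)
        covered |= unit_sets[best]
    return [sentences[i] for i in chosen]


# Toy mother set; whitespace-separated tokens stand in for syllable units.
mother_set = ["a b", "b c", "c d e", "a"]
selected = ltm_greedy(mother_set, str.split)
```

A cover of this kind only keeps the selected set small. It does nothing to balance how often each unit occurs in the result, which is the gap the abstract says the proposed schemes address.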

Authors and Affiliations

Bagus Nugroho Budi Nurtomo, Suyanto Suyanto

  • DOI 10.14569/IJACSA.2018.091035

How To Cite

Bagus Nugroho Budi Nurtomo, Suyanto Suyanto (2018). Greedy Algorithms to Optimize a Sentence Set Near-Uniformly Distributed on Syllable Units and Punctuation Marks. International Journal of Advanced Computer Science & Applications, 9(10), 291-296. https://europub.co.uk/articles/-A-408080