Developing an Adaptive Language Model for Bahasa Indonesia
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 1
Abstract
A language model is one of the important components in a speech recognition system. It is commonly developed using a statistical method called n-gram. However, a standard n-gram model is not well suited to general domains, where many sentences are semantically ambiguous. This paper focuses on developing an adaptive n-gram language model for Bahasa Indonesia. First, a text corpus of ten million distinct sentences is crawled from hundreds of websites, including news sites, magazines, personal blogs, and writing forums. The text corpus is then used to construct an adaptive language model using Latent Dirichlet Allocation (LDA) trained with Collapsed Gibbs Sampling (CGS). Compared to the standard n-gram, the adaptive language model performs better at word selection when producing the best sentence.
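To make the training method named in the abstract concrete, the following is a minimal sketch of collapsed Gibbs sampling for LDA on a toy corpus. It is not the authors' implementation: the function names, the hyperparameters (alpha, beta, the iteration count), and the six-word vocabulary are all assumptions chosen for illustration, and the abstract does not say how the resulting topics are combined with the n-gram model.

```python
import random


def lda_collapsed_gibbs(docs, num_topics, vocab_size,
                        alpha=0.1, beta=0.01, iters=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (illustrative, not the paper's code).

    docs is a list of documents, each a list of integer word ids.
    Returns the count tables (doc_topic, topic_word, topic_total).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # n_dk
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # n_kw
    topic_total = [0] * num_topics                               # n_k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalised conditional p(z = t | all other assignments).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word, topic_total


def topic_word_probs(topic_word, topic_total, beta=0.01):
    """Smoothed estimate of p(word | topic) from the final counts."""
    vocab_size = len(topic_word[0])
    return [
        [(n_kw + beta) / (n_k + vocab_size * beta) for n_kw in row]
        for row, n_k in zip(topic_word, topic_total)
    ]


# Hypothetical toy data: word ids index into this vocabulary.
vocab = ["bola", "gol", "pemain", "saham", "pasar", "bank"]
docs = [[0, 1, 2, 0, 1], [1, 2, 0, 2], [3, 4, 5, 3], [4, 5, 3, 5, 4]]

doc_topic, topic_word, topic_total = lda_collapsed_gibbs(
    docs, num_topics=2, vocab_size=len(vocab)
)
phi = topic_word_probs(topic_word, topic_total)
```

Each row of `phi` is a per-topic word distribution. One common way to adapt an n-gram model, assumed here rather than taken from the abstract, is to mix such topic-conditioned word probabilities with the baseline n-gram probabilities according to the topic proportions inferred for the current context.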
Authors and Affiliations
Satria Nur Hidayatullah, Suyanto Suyanto