Language Model Adaptation Using Dirichlet Class Language Model Based on Part-of-Speech

Journal Title: Journal of Information Systems and Telecommunication - Year 2014, Vol 2, Issue 1

Abstract

Language modeling has applications in a wide variety of domains. The performance of a language model depends on how well it is adapted to a particular style of data. Accordingly, adaptation methods endeavour to exploit syntactic and semantic characteristics of the language. Previous adaptation methods, such as the family of Dirichlet class language models (DCLM), extract the class of the history words. Because these methods lack syntactic information, they are not well suited to morphologically rich languages such as Farsi. In this paper, we present an idea for using syntactic information such as part-of-speech (POS) in DCLM and combining it with one of the language models of the n-gram family. In our work, word clustering is based on the POS of previous words and on the history words in DCLM. The performance of the language models is evaluated on the BijanKhan corpus using a hidden Markov model based ASR system. The results show that using POS information along with the history words and the class of history words improves the performance of the language model and decreases the perplexity on our corpus. By exploiting POS information along with DCLM, the word error rate of the ASR system decreases by 1.2% compared to DCLM.
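
The abstract does not say how the class-based model is combined with the n-gram model or how perplexity is computed. As a rough illustration only, the sketch below assumes linear interpolation, which is one common way to combine two language models, and uses the standard definition of perplexity. The function names, the weight `lam`, and the toy probabilities are hypothetical and are not taken from the paper.

```python
import math


def interpolate(p_class, p_ngram, lam=0.5):
    """Linearly interpolate two word probabilities.

    p_class: probability from a class-based model (hypothetical stand-in
             for a POS-aware DCLM); p_ngram: probability from an n-gram model.
    lam:     interpolation weight in [0, 1] (an assumed value, not from the paper).
    """
    return lam * p_class + (1.0 - lam) * p_ngram


def perplexity(word_probs):
    """Perplexity = exp(-(1/N) * sum_i log p(w_i | history_i))."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


# Toy per-word probabilities for a three-word test sequence (made up).
p_class = [0.2, 0.1, 0.4]
p_ngram = [0.1, 0.3, 0.2]

mixed = [interpolate(a, b, lam=0.5) for a, b in zip(p_class, p_ngram)]

print("class-based only:", perplexity(p_class))
print("n-gram only:     ", perplexity(p_ngram))
print("interpolated:    ", perplexity(mixed))
```

Lower perplexity means the model assigns higher probability to the test words. In practice the weight would be tuned on held-out data rather than fixed.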

  • EP ID EP185971
  • DOI 10.7508/jist.2014.01.005

How To Cite

(2014). Language Model Adaptation Using Dirichlet Class Language Model Based on Part-of-Speech. Journal of Information Systems and Telecommunication, 2(1), 41-46. https://europub.co.uk/articles/-A-185971