Language Model Adaptation Using Dirichlet Class Language Model Based on Part-of-Speech

Journal Title: Journal of Information Systems and Telecommunication - Year 2014, Vol 2, Issue 1

Abstract

Language modeling has applications in a wide variety of domains. The performance of a language model depends on how well it is adapted to a particular style of data. Accordingly, adaptation methods try to exploit syntactic and semantic characteristics of the language. Previous adaptation methods, such as the Dirichlet class language model (DCLM) family, extract the class of history words. Because they lack syntactic information, these methods are not well suited to morphologically rich languages such as Farsi. In this paper, we propose using syntactic information, namely part-of-speech (POS) tags, in DCLM and combining it with a language model of the n-gram family. In our work, word clustering is based on the POS of previous words and on the history words in DCLM. The language models are evaluated on the BijanKhan corpus using a hidden Markov model based ASR system. The results show that using POS information along with history words and the class of history words improves language model performance and decreases perplexity on our corpus. With POS information added to DCLM, the word error rate of the ASR system decreases by 1.2% compared to DCLM alone.
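The two evaluation metrics named in the abstract, perplexity and word error rate, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names and toy inputs are assumptions made for illustration, and the formulas are the standard textbook definitions (perplexity as the exponential of the average negative log-probability, word error rate as word-level edit distance divided by reference length).

```python
import math


def perplexity(word_probs):
    """Perplexity of a test sequence, given the model probability of each word.

    word_probs is a hypothetical list of P(w_i | history) values in (0, 1].
    """
    if not word_probs:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Toy values only; they are not taken from the paper.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
    print(word_error_rate("a b c d", "a x c"))
```

A lower perplexity means the model assigns higher probability to the test text, and a lower word error rate means the recognizer output is closer to the reference transcript.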

  • EP ID EP185971
  • DOI 10.7508/jist.2014.01.005

How To Cite

(2014). Language Model Adaptation Using Dirichlet Class Language Model Based on Part-of-Speech. Journal of Information Systems and Telecommunication, 2(1), 41-46. https://europub.co.uk/articles/-A-185971