Data Categorization and Model Weighting Approach for Language Model Adaptation in Statistical Machine Translation
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 1
Abstract
A language model encapsulates semantic, syntactic, and pragmatic information about a specific task. Intelligent systems, especially natural language processing systems, can vary in performance and precision when moving between genres and domains. Researchers have therefore explored different language model adaptation strategies to overcome this effectiveness issue. Language model adaptation techniques fall into two main categories. The first category comprises techniques based on data selection, in which a task-oriented corpus is extracted and used to train models for specific translations. The second category focuses on developing a weighting criterion that assigns the test data to a specific model corpus. The purpose of this research is to introduce a language model adaptation approach that combines both categories (data selection and weighting criterion). The approach applies data selection for task-specific translation by dividing the corpus into smaller, topic-related corpora through a clustering process. We investigate how different approaches to clustering the bilingual data affect the language model adaptation process in terms of translation quality, using the WMT07 Europarl corpus, which includes bilingual data for English-Spanish, English-German, and English-French. A mixture of language models should then assign any given data to the right language model for use in the translation process, according to a specific weighting criterion. The proposed language model adaptation achieves better translation quality compared to the baseline model in Statistical Machine Translation (SMT).
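The combination of clustering and weighting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the clustering method, the model order, or the weighting criterion, so the sketch assumes the clusters are already given, uses add-one smoothed unigram models, and weights each cluster model by the normalised likelihood it assigns to the input sentence. All function names and the toy sentences are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary (assumed model type)."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values()) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def log_likelihood(model, sentence):
    """Log-probability of the sentence; out-of-vocabulary words are skipped."""
    return sum(math.log(model[w]) for w in sentence.split() if w in model)


def mixture_weights(models, sentence):
    """Assumed weighting criterion: normalised likelihood under each cluster model."""
    logs = [log_likelihood(m, sentence) for m in models]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    norm = sum(exps)
    return [e / norm for e in exps]


def mixture_prob(models, weights, word):
    """Linear interpolation of the cluster models for a single word."""
    return sum(w * m[word] for w, m in zip(weights, models))


# Toy topic clusters standing in for the output of a clustering step.
clusters = [
    ["the parliament voted on the budget", "the budget debate continues"],
    ["the fishing quota was reduced", "fishing fleets protest the quota"],
]
vocab = {w for cluster in clusters for s in cluster for w in s.split()}
models = [train_unigram(cluster, vocab) for cluster in clusters]

weights = mixture_weights(models, "the budget vote")
print(weights)
print(mixture_prob(models, weights, "budget"))
```

In this toy setting the budget-related cluster receives the larger weight for the input sentence, so its model dominates the interpolated probability. A real system would use higher-order n-gram models and tune the weights on held-out data.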
Authors and Affiliations
Mohammed AbuHamad, Masnizah Mohd