Cross-Lingual Sentiment Classification from English to Arabic using Machine Translation
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 12
Abstract
Cross-lingual sentiment learning is becoming increasingly important due to the multilingual nature of user-generated content on social media and the scarcity of resources for languages other than English. However, it is a challenging task, both because translated data and original data follow different distributions and because of the language gap, i.e., each language has its own ways of expressing sentiment. This work explores the adaptation of English sentiment analysis resources to a new language, Arabic. The aim is to design a lightweight model for cross-lingual sentiment classification from English to Arabic that requires no manual annotation effort, is easy to build, and does not require deep linguistic analysis. The ultimate goal is to find an optimal baseline model and to determine the relation between the noise in the translated data and the accuracy of sentiment classification. To find this baseline, different configurations of several factors are investigated, including feature representation, feature reduction methods, and learning algorithms. Experiments show that a good classification model can be obtained from translated data despite the artificial noise added by machine translation. The results also show a significant cost to automation, and thus the best path to future enhancement is through the inclusion of language-specific knowledge and resources.
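The configuration search described in the abstract can be illustrated with a minimal sketch. The abstract does not name the specific feature representations, reduction methods, or learning algorithms, so everything below is an assumption chosen for illustration: binary versus term-frequency features, an optional document-frequency cutoff as the reduction step, and a multinomial Naive Bayes classifier with add-one smoothing. The toy sentences are invented placeholders standing in for machine-translated text, and the names `featurize`, `select_vocab`, `train_nb`, `predict_nb`, and `run_grid` are hypothetical.

```python
from collections import Counter
from itertools import product
import math

# Invented toy data standing in for machine-translated reviews
# (label 1 = positive, 0 = negative). Not taken from the paper.
TRAIN = [
    ("good great film good", 1),
    ("great acting good story", 1),
    ("bad boring film", 0),
    ("boring bad bad story", 0),
]
TEST = [("good story", 1), ("boring film bad", 0)]


def featurize(text, representation):
    """Return token weights: presence (binary) or raw counts (tf)."""
    counts = Counter(text.split())
    if representation == "binary":
        return Counter({tok: 1 for tok in counts})
    return counts


def select_vocab(docs, top_k):
    """Feature reduction: keep the top_k tokens by document frequency."""
    df = Counter()
    for text, _ in docs:
        df.update(set(text.split()))
    if top_k is None:
        return set(df)
    ranked = sorted(df, key=lambda tok: (-df[tok], tok))
    return set(ranked[:top_k])


def train_nb(docs, representation, vocab):
    """Collect per-class document and token counts for Naive Bayes."""
    class_counts = Counter()
    token_counts = {0: Counter(), 1: Counter()}
    for text, label in docs:
        class_counts[label] += 1
        for tok, n in featurize(text, representation).items():
            if tok in vocab:
                token_counts[label][tok] += n
    return class_counts, token_counts


def predict_nb(model, text, representation, vocab):
    """Pick the class with the highest smoothed log-probability."""
    class_counts, token_counts = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total = sum(token_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for tok, n in featurize(text, representation).items():
            if tok in vocab:
                p = (token_counts[label][tok] + 1) / (total + len(vocab))
                score += n * math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def run_grid():
    """Evaluate every (representation, reduction) configuration."""
    results = {}
    for representation, top_k in product(["binary", "tf"], [None, 4]):
        vocab = select_vocab(TRAIN, top_k)
        model = train_nb(TRAIN, representation, vocab)
        correct = sum(
            predict_nb(model, text, representation, vocab) == label
            for text, label in TEST
        )
        results[(representation, top_k)] = correct / len(TEST)
    return results


if __name__ == "__main__":
    for config, accuracy in run_grid().items():
        print(config, accuracy)
```

In a setting like the one the abstract describes, the training set would be translated English data and the test set native Arabic text; adding further learners or reduction methods only means extending the lists passed to `product`.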
Authors and Affiliations
Adel Al-Shabi, Aisah Adel, Nazlia Omar, Tareq Al-Moslmi