Opinion Mining: An Approach to Feature Engineering
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 3
Abstract
Sentiment analysis, or opinion mining, is the process of identifying and categorizing subjective information in source materials using natural language processing (NLP), text analytics and statistical linguistics. Its main purpose is to determine the writer’s attitude towards the topic under discussion, which is done by identifying the polarity of a given text passage using different feature sets. Feature engineering in the pre-processing phase plays a vital role in improving the performance of a classifier. In this paper we empirically evaluate various feature weighting mechanisms against well-established classification techniques for opinion mining, i.e. Naive Bayes-Multinomial for binary polarity cases and SVM-LIN for multiclass cases. To evaluate these classification techniques we train the classifiers on the publicly available Rotten Tomatoes movie reviews dataset, which is widely used by the research community for this purpose. The experiments show that the feature set containing noun, verb, adverb and adjective lemmas with the feature-frequency (FF) function performs best among all feature settings, with 84% and 85% correctly classified test instances for Naive Bayes and SVM, respectively.
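As a rough illustration of the feature-frequency (FF) idea mentioned in the abstract, the following sketch weights each lemma by its raw count in a review and feeds those weights to a small multinomial Naive Bayes classifier with Laplace smoothing. It is not the authors' implementation: the toy reviews, the `feature_presence` baseline, the class name `TinyMultinomialNB` and the smoothing value are all assumptions made for the example, and the lemma lists are assumed to have been POS-filtered and lemmatized upstream.

```python
import math
from collections import Counter, defaultdict


def feature_frequency(lemmas):
    """FF weighting: a feature's weight is its raw count in the document."""
    return Counter(lemmas)


def feature_presence(lemmas):
    """Binary baseline (assumed for comparison): weight 1 if the feature occurs."""
    return Counter(set(lemmas))


class TinyMultinomialNB:
    """Minimal multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_totals = defaultdict(Counter)
        for weights, label in zip(docs, labels):
            self.feature_totals[label].update(weights)
        self.vocab = {f for c in self.feature_totals.values() for f in c}
        return self

    def predict(self, weights):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.feature_totals[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(count / n_docs)  # log prior
            for feat, w in weights.items():
                if feat not in self.vocab:
                    continue  # ignore features never seen in training
                p = (self.feature_totals[label][feat] + self.alpha) / denom
                score += w * math.log(p)  # weight scales the log-likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, already-lemmatized toy reviews (not from Rotten Tomatoes).
train = [
    (["love", "great", "act"], "pos"),
    (["enjoy", "great", "great", "story"], "pos"),
    (["hate", "boring", "plot"], "neg"),
    (["boring", "bad", "act", "bad"], "neg"),
]
model = TinyMultinomialNB().fit(
    [feature_frequency(lemmas) for lemmas, _ in train],
    [label for _, label in train],
)
prediction = model.predict(feature_frequency(["great", "story", "act"]))
```

Swapping `feature_frequency` for `feature_presence` in both the training and prediction calls gives a binary-weighted variant, which is one simple way to compare weighting functions on the same classifier.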
Authors and Affiliations
Shafaq Siddiqui, M. Abdul Rehman, Sher M. Daudpota, Ahmad Waqas
Separability Detection Cooperative Particle Swarm Optimizer based on Covariance Matrix Adaptation
The particle swarm optimizer (PSO) is a population-based optimization technique that can be applied to a wide range of problems. The cooperative particle swarm optimization (CPSO) applies cooperative behavior to i...
Extraction of Line Features from Multifidus Muscle of CT Scanned Images with Morphologic Filter Together with Wavelet Multi Resolution Analysis
A method for line feature extraction from multifidus muscle of Computer Tomography (CT) scanned image with morphologic filter together with wavelet based Multi Resolution Analysis (MRA) is proposed. The contour of the mu...
Post-Editing Error Correction Algorithm For Speech Recognition using Bing Spelling Suggestion
ASR, short for Automatic Speech Recognition, is the process of converting spoken speech into text that can be manipulated by a computer. Although ASR has several applications, it is still erroneous and imprecise es...
BAAC: Bangor Arabic Annotated Corpus
This paper describes the creation of the new Bangor Arabic Annotated Corpus (BAAC) which is a Modern Standard Arabic (MSA) corpus that comprises 50K words manually annotated by parts-of-speech. For evaluating the quality...
Extreme Learning Machine and Particle Swarm Optimization for Inflation Forecasting
Inflation is one indicator to measure the development of a nation. If inflation is not controlled, it will have a lot of negative impacts on people in a country. There are many ways to control inflation, one of them is f...