Sentiment Summerization and Analysis of Sindhi Text
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2017, Vol 8, Issue 10
Abstract
A text corpus is important for assessing language features and analyzing variation. Machine learning techniques identify language terms, features, text structures and sentiment from a linguistic corpus. Sindhi is one of the oldest languages of the world, with a proper script and a complete grammar, yet it remains computationally under-resourced even in this digital era. To address this problem, a Sindhi NLP toolkit has been developed to solve Sindhi NLP and computational linguistics problems, and this research work may therefore be an addition to NLP. The study develops its own sentiment-structured and analyzed Sindhi corpus from the accumulated results of a Sindhi sentiment analysis tool. The corpus is normalized and analyzed for language features and variation using document-term matrix (DTM) and TF-IDF techniques, both computed with an n-gram model. A supervised machine learning model is formulated using SVM and K-NN techniques to analyze the Sindhi sentiment analysis corpus dataset. Precision, recall and F-score indicate that the machine learning technique performs better than other techniques. 10-fold cross-validation is used to validate and evaluate the data set randomly for the supervised machine learning analysis. The study opens doors for linguists, data analysts and decision makers to do further work on sentiment summarization and visual tracking.
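The DTM and TF-IDF analysis over n-grams mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the whitespace tokenization, the unsmoothed weighting tf × log(N/df), and the toy English documents are all assumptions made for illustration, since the abstract does not give these details.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the word n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def document_term_matrix(docs, n=1):
    """Build a document-term matrix of raw n-gram counts.

    Tokenization is plain whitespace splitting (an assumption; a real
    Sindhi pipeline would need its own normalization and tokenizer).
    """
    counts = [Counter(ngrams(doc.split(), n)) for doc in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix


def tf_idf(matrix):
    """Weight a count matrix by tf * log(N / df), without smoothing."""
    n_docs = len(matrix)
    n_terms = len(matrix[0]) if matrix else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in matrix if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in matrix:
        total = sum(row)
        weighted.append([
            (row[j] / total) * math.log(n_docs / df[j]) if total else 0.0
            for j in range(n_terms)
        ])
    return weighted


# Hypothetical placeholder documents, not taken from the Sindhi corpus.
docs = ["good good film", "bad film", "good story"]
vocab, dtm = document_term_matrix(docs, n=1)
weights = tf_idf(dtm)
bigram_vocab, bigram_dtm = document_term_matrix(docs, n=2)
```

Calling `document_term_matrix` with `n=2` gives the bigram variant of the same matrix. Features of this kind are what a supervised classifier such as an SVM or K-NN would typically consume.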
Authors and Affiliations
Mazhar Ali, Asim Imdad Wagan
User Intent Discovery using Analysis of Browsing History
Search engines retrieve information from the web using keyword queries. They are responsible for returning relevant results that meet users’ search intents. Nowadays, all search engines...
On the Parallel Design and Analysis for 3-D ADI Telegraph Problem with MPI
In this paper we describe the 3-D Telegraph Equation (3-DTEL) with the use of Alternating Direction Implicit (ADI) method on Geranium Cadcam Cluster (GCC) with Message Passing Interface (MPI) parallel software. The algor...
Evaluation of LoRa-based Air Pollution Monitoring System
Air pollution is a threat to human health and the environment. Pollution is caused by harmful gases emitted from car exhausts, factories, forest fires and other sources. Carbon monoxide, nitrogen oxides and carbon dioxide a...
IMPROVING THE SECURITY OF THE MEDICAL IMAGES
Applying security to transmitted medical images is important to protect the privacy of patients. Secure transmission requires cryptography and watermarking to achieve confidentiality and data integrity. Improving c...
Self Interference Cancellation in Co-Time-Co-Frequency Full Duplex Cellular Communication
The performance of co-time co-frequency full duplex (CCDF) communication systems is limited by the self-interference (SI), which is the result of using the same frequency for transmission and reception. However, current...