Twitter Sentiment Analysis in Under-Resourced Languages using Byte-Level Recurrent Neural Model

Abstract

Sentiment analysis in non-English languages can be more challenging than in English because publicly available resources for building high-accuracy prediction models are scarce. To alleviate this under-resourced problem, this article proposes using a byte-level recurrent neural model to generate text representations for Twitter sentiment analysis in the Indonesian language. Because the main part of the proposed model's training is unsupervised and does not require much labeled data, the approach can scale by using the large amounts of unlabeled data that are easily gathered on the Internet, without much dependence on human-generated resources. This paper also introduces an Indonesian dataset for general sentiment analysis. It consists of 10,806 tweets selected from a total of 454,559 tweets gathered directly from Twitter using the Twitter API. The 10,806 tweets are classified into three categories: positive, negative, and neutral. This dataset could help the development of Indonesian sentiment analysis, especially general sentiment analysis, and encourage others to publish similar datasets in the future.
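As a rough illustration of what "byte-level" input means here, the sketch below turns a tweet into a sequence of UTF-8 byte values, so that a model only ever sees 256 possible symbols regardless of language, slang, or emoji. The abstract does not describe the authors' actual preprocessing, so the function names and the example tweet are hypothetical and shown only to make the idea concrete.

```python
from typing import List


def tweet_to_bytes(text: str) -> List[int]:
    """Encode a tweet as UTF-8 byte values, each in the range 0..255.

    Hypothetical helper: a byte-level model would consume a sequence
    like this one symbol at a time, with no tokenizer or word list.
    """
    return list(text.encode("utf-8"))


def bytes_to_tweet(byte_ids: List[int]) -> str:
    """Invert tweet_to_bytes, recovering the original text."""
    return bytes(byte_ids).decode("utf-8")


# Made-up Indonesian example tweet (not taken from the paper's dataset).
example = "Filmnya bagus banget 😍"
byte_ids = tweet_to_bytes(example)

# The emoji occupies several bytes, so the byte sequence is longer
# than the character count.
print(len(example), len(byte_ids))
```

Because every string maps onto the same 256-symbol alphabet, no language-specific dictionary or tokenizer is needed, which is one plausible reason a byte-level representation suits an under-resourced language.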

Authors and Affiliations

Ridi Ferdiana, Wiliam Fajar, Desi Dwi Purwanti, Artmita Sekar Tri Ayu, Fahim Jatmiko

  • EP ID EP626575
  • DOI 10.14569/IJACSA.2019.0100815

How To Cite

Ridi Ferdiana, Wiliam Fajar, Desi Dwi Purwanti, Artmita Sekar Tri Ayu, Fahim Jatmiko (2019). Twitter Sentiment Analysis in Under-Resourced Languages using Byte-Level Recurrent Neural Model. International Journal of Advanced Computer Science & Applications, 10(8), 108-112. https://europub.co.uk/articles/-A-626575