EFFICIENCY ESTIMATION OF METHODS FOR SENTIMENT ANALYSIS OF SOCIAL NETWORK MESSAGES

Abstract

This paper presents the results of evaluating the effectiveness of machine learning methods for sentiment analysis of social network messages. It substantiates the importance of sentiment analysis as one of the key tasks of natural language processing in general and of textual information processing in particular. Existing methods and software for sentiment analysis are reviewed, and the choice of classifiers for this research is justified. The operating principles of a naive Bayes classifier and of a classifier based on a recurrent neural network are described. The classifiers were trained sequentially on two corpora: first on RuTweetCorp, a corpus of short messages from the social network Twitter, and then on the Slang corpus, a corpus of messages from the social networks Facebook and Instagram and posts from the Pikabu website. The second corpus is annotated with the tonality of slang words; this tonality information was taken from a youth slang dictionary compiled from a user survey. Texts were classified by tonality into three classes: positive, negative, and neutral. The efficiency of the classifiers was evaluated with the standard metrics Recall, Precision, F-measure, and Accuracy. For the naive Bayes classifier, training on the first corpus gave Recall = 0.853, Precision = 0.869, F-measure = 0.861, and Accuracy = 0.855; after additional training on the second corpus the values were Recall = 0.948, Precision = 0.975, F-measure = 0.961, and Accuracy = 0.960. For the classifier based on a recurrent neural network, training on the first corpus gave Recall = 0.870, Precision = 0.878, F-measure = 0.874, and Accuracy = 0.861; after additional training on the second corpus the values were Recall = 0.965, Precision = 0.982, F-measure = 0.973, and Accuracy = 0.973. These results indicate that additional training on the second corpus raised each metric by roughly 10–11 percentage points for both classifiers.
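The reported figures can be cross-checked with a few lines of Python. The sketch below assumes that "F-measure" means the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state this, but the reported values are consistent with it. The dictionary keys and function names are illustrative and do not come from the paper.

```python
# Metric values as reported in the abstract.
# Keys are illustrative labels: (classifier, training stage).
REPORTED = {
    ("naive_bayes", "RuTweetCorp"): {
        "recall": 0.853, "precision": 0.869, "f_measure": 0.861, "accuracy": 0.855},
    ("naive_bayes", "RuTweetCorp+Slang"): {
        "recall": 0.948, "precision": 0.975, "f_measure": 0.961, "accuracy": 0.960},
    ("rnn", "RuTweetCorp"): {
        "recall": 0.870, "precision": 0.878, "f_measure": 0.874, "accuracy": 0.861},
    ("rnn", "RuTweetCorp+Slang"): {
        "recall": 0.965, "precision": 0.982, "f_measure": 0.973, "accuracy": 0.973},
}


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1), assumed here: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def accuracy_gain_points(classifier: str) -> float:
    """Accuracy gain from the second training stage, in percentage points."""
    before = REPORTED[(classifier, "RuTweetCorp")]["accuracy"]
    after = REPORTED[(classifier, "RuTweetCorp+Slang")]["accuracy"]
    return round((after - before) * 100, 1)


if __name__ == "__main__":
    for (classifier, stage), m in REPORTED.items():
        recomputed = round(f_measure(m["precision"], m["recall"]), 3)
        print(classifier, stage, "reported F:", m["f_measure"], "recomputed F:", recomputed)
    for classifier in ("naive_bayes", "rnn"):
        print(classifier, "accuracy gain (points):", accuracy_gain_points(classifier))
```

Under this assumption the recomputed F-measure matches the reported value to three decimals in all four cases, and the accuracy gains are absolute differences (percentage points) rather than relative increases.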

Authors and Affiliations

Natalia Borysova, Karina Melnyk

  • EP ID EP669220
  • DOI 10.20998/2079-0023.2019.02.13

How To Cite

Natalia Borysova, Karina Melnyk (2019). EFFICIENCY ESTIMATION OF METHODS FOR SENTIMENT ANALYSIS OF SOCIAL NETWORK MESSAGES. Вісник Національного технічного університету «ХПІ». Серія: Системний аналiз, управління та iнформацiйнi технологiї, 0(2), 76-81. https://europub.co.uk/articles/-A-669220