EFFICIENCY ESTIMATION OF METHODS FOR SENTIMENT ANALYSIS OF SOCIAL NETWORK MESSAGES

Abstract

This paper presents the results of evaluating the effectiveness of machine learning methods for sentiment analysis of social network messages. The importance of sentiment analysis as one of the key tasks of natural language processing in general, and of textual information processing in particular, is substantiated. Existing methods and software for sentiment analysis are reviewed. The choice of classifiers for this research is substantiated. The operating principles of a naive Bayes classifier and of a classifier based on a recurrent neural network are described. The classifiers were trained sequentially on two corpora: first on RuTweetCorp, a corpus of short messages from the social network Twitter, and then on the Slang corpus, a corpus of messages from the social networks Facebook and Instagram and posts from the Pikabu website. The second corpus was annotated with the sentiment of slang words. Information about the sentiment of slang words was taken from a youth slang dictionary compiled from a survey of users. Texts were separated by sentiment into three classes: positive, negative and neutral. The efficiency of the classifiers was evaluated using the standard metrics Recall, Precision, F-measure and Accuracy. For the naive Bayes classifier, training on the first corpus gave Recall = 0.853, Precision = 0.869, F-measure = 0.861 and Accuracy = 0.855; after training on the second corpus the values were Recall = 0.948, Precision = 0.975, F-measure = 0.961 and Accuracy = 0.960.
For the classifier based on a recurrent neural network, training on the first corpus gave Recall = 0.870, Precision = 0.878, F-measure = 0.874 and Accuracy = 0.861; after training on the second corpus the values were Recall = 0.965, Precision = 0.982, F-measure = 0.973 and Accuracy = 0.973. These results indicate that additional training on the second corpus increased the efficiency of the classifiers by 10–11%.
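The reported figures can be cross-checked with a few lines of Python. The sketch below is not the authors' code: it assumes that the F-measure in the abstract is the balanced F1 score (the harmonic mean of Precision and Recall), which the abstract does not state explicitly, and the names `REPORTED`, `f_measure` and `accuracy_gain` are hypothetical helpers introduced only for illustration. The numbers are copied from the abstract.

```python
# Metric values copied from the abstract.
# Keys: (classifier, corpus used in the most recent training stage).
REPORTED = {
    ("naive_bayes", "RuTweetCorp"): {"recall": 0.853, "precision": 0.869,
                                     "f_measure": 0.861, "accuracy": 0.855},
    ("naive_bayes", "Slang"): {"recall": 0.948, "precision": 0.975,
                               "f_measure": 0.961, "accuracy": 0.960},
    ("rnn", "RuTweetCorp"): {"recall": 0.870, "precision": 0.878,
                             "f_measure": 0.874, "accuracy": 0.861},
    ("rnn", "Slang"): {"recall": 0.965, "precision": 0.982,
                       "f_measure": 0.973, "accuracy": 0.973},
}


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1), assumed here: harmonic mean of P and R."""
    return 2 * precision * recall / (precision + recall)


def accuracy_gain(classifier: str) -> float:
    """Absolute change in Accuracy after the additional Slang training."""
    first = REPORTED[(classifier, "RuTweetCorp")]["accuracy"]
    second = REPORTED[(classifier, "Slang")]["accuracy"]
    return second - first


if __name__ == "__main__":
    # Compare each reported F-measure with the value recomputed from P and R.
    for (classifier, corpus), m in REPORTED.items():
        recomputed = f_measure(m["precision"], m["recall"])
        print(f"{classifier:12s} {corpus:12s} "
              f"reported F = {m['f_measure']:.3f}, recomputed F = {recomputed:.4f}")
    # Show the Accuracy change for each classifier.
    for classifier in ("naive_bayes", "rnn"):
        print(f"{classifier:12s} accuracy gain = {accuracy_gain(classifier):.3f}")
```

Under the F1 assumption, each recomputed F-measure agrees with the reported one to three decimal places, and the absolute Accuracy gains fall in the 10–11 point range quoted above.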

Authors and Affiliations

Natalia Borysova, Karina Melnyk

  • EP ID EP669220
  • DOI 10.20998/2079-0023.2019.02.13

How To Cite

Natalia Borysova, Karina Melnyk (2019). EFFICIENCY ESTIMATION OF METHODS FOR SENTIMENT ANALYSIS OF SOCIAL NETWORK MESSAGES. Вісник Національного технічного університету «ХПІ». Серія: Системний аналiз, управління та iнформацiйнi технологiї, 0(2), 76-81. https://europub.co.uk/articles/-A-669220