EFFICIENCY ESTIMATION OF METHODS FOR SENTIMENT ANALYSIS OF SOCIAL NETWORK MESSAGES
Journal Title: Вісник Національного технічного університету «ХПІ». Серія: Системний аналiз, управління та iнформацiйнi технологiї - Year 2019, Vol 0, Issue 2
Abstract
This paper presents an evaluation of the effectiveness of machine learning methods for sentiment analysis of social network messages. The importance of sentiment analysis as a task of natural language processing in general, and of textual information processing in particular, is substantiated. Existing methods and software for sentiment analysis are reviewed, and the choice of classifiers for this research is justified. The operating principles of a naive Bayes classifier and of a classifier based on a recurrent neural network are described. The classifiers were trained sequentially on two corpora: first on RuTweetCorp, a corpus of short messages from the social network Twitter, and then on the Slang corpus, which contains messages from the social networks Facebook and Instagram and posts from the Pikabu website. In the second corpus, slang words are annotated with their tonality; this information was taken from a youth slang dictionary compiled from a survey of users. Texts were classified by tonality into three classes: positive, negative and neutral. The efficiency of the classifiers was evaluated using the standard metrics Recall, Precision, F-measure and Accuracy. For the naive Bayes classifier, training on the first corpus gave Recall = 0.853; Precision = 0.869; F-measure = 0.861; Accuracy = 0.855, and additional training on the second corpus gave Recall = 0.948; Precision = 0.975; F-measure = 0.961; Accuracy = 0.960. For the classifier based on a recurrent neural network, training on the first corpus gave Recall = 0.870; Precision = 0.878; F-measure = 0.874; Accuracy = 0.861, and additional training on the second corpus gave Recall = 0.965; Precision = 0.982; F-measure = 0.973; Accuracy = 0.973. These results indicate that additional training on the second corpus increased the efficiency of the classifiers by roughly 10–11 percentage points.
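The reported figures can be cross-checked with a short sketch. It assumes that "F-measure" here means the balanced F1 score, the harmonic mean of precision and recall; the abstract does not state this explicitly. The dictionary keys and function names below are hypothetical labels chosen for illustration, and the numbers are only the ones quoted in the abstract.

```python
# Metric values quoted in the abstract, as (recall, precision, f_measure, accuracy).
# The key names are illustrative labels, not identifiers from the paper.
REPORTED = {
    ("naive_bayes", "RuTweetCorp"): (0.853, 0.869, 0.861, 0.855),
    ("naive_bayes", "RuTweetCorp+Slang"): (0.948, 0.975, 0.961, 0.960),
    ("rnn", "RuTweetCorp"): (0.870, 0.878, 0.874, 0.861),
    ("rnn", "RuTweetCorp+Slang"): (0.965, 0.982, 0.973, 0.973),
}


def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the paper uses the balanced form (beta = 1).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f_measure_matches(tolerance=0.001):
    """For each run, check the reported F-measure against F1 recomputed from P and R."""
    return {
        key: abs(f_measure(precision, recall) - reported_f) <= tolerance
        for key, (recall, precision, reported_f, _accuracy) in REPORTED.items()
    }


def absolute_gains(model):
    """Gain per metric, in percentage points, after additional training on Slang.

    Order of the returned list: recall, precision, F-measure, accuracy.
    """
    before = REPORTED[(model, "RuTweetCorp")]
    after = REPORTED[(model, "RuTweetCorp+Slang")]
    return [round((a - b) * 100, 1) for a, b in zip(after, before)]


if __name__ == "__main__":
    print(f_measure_matches())
    for model in ("naive_bayes", "rnn"):
        print(model, absolute_gains(model))
```

Under the F1 assumption, the recomputed F-measure agrees with each reported value to within rounding to three decimals. The absolute gains range from about 9.5 to 11.2 percentage points across the four metrics, which is consistent with the stated improvement of 10–11%.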
Authors and Affiliations
Natalia Borysova, Karina Melnyk
Forecasting the results of football matches on the Internet based information
The purpose of the article is to build a model for forecasting the results of football matches that performs better than bookmaker organizations. Lately the popularity of football forecasting has increased. The exis...
Situational forecasting of electricity demand in the region
The process of forecasting volumes of electricity sales on the wholesale market is considered. To improve the quality of the forecast, it is proposed to use the Random Forests machine learning method as part of...
DEVELOPMENT OF METHODOLOGICAL FOUNDATIONS FOR IMPROVING THE EFFICIENCY OF MATHEMATICAL TOOLS FOR SOLVING PRODUCTION AND TRANSPORT LOGISTICS PROBLEMS
Methodological foundations have been developed for improving the efficiency of mathematical tools for solving production and transport logistics problems. It is shown that the results obtained on the basis of methods of mathemat...
MODELS AND SOFTWARE SOLUTIONS FOR THE PROBLEM OF DIAGNOSING THE FINANCIAL STATE OF IT ENTERPRISE
Today, the economy of Ukraine is in a relatively unstable position; therefore, Ukrainian enterprises require effective management. But to manage an enterprise effectively, one needs to know what state it is in....
Study of the motion of magnetogasdynamic shock waves in an inhomogeneous plasma medium by Whitham's method
The propagation of a plane magnetogasdynamic shock wave in an inhomogeneous plasma medium is considered. The study was carried out by Whitham's method, which was used for the case of a transverse magnetic field...