THE EXTRACTION OF LEXICAL AND METRORHYTHMIC FEATURES WHICH ARE CHARACTERISTIC FOR THE GENRE AND THE STYLE AND FOR THEIR COMBINATIONS WITHIN THE PROCESS OF AUTOMATED PROCESSING OF TEXTS IN RUSSIAN

Abstract

This paper describes an algorithm for automatically extracting features characteristic of genre and style. The work was carried out as part of the development of a software system created at the Institute of Computational Technologies of SB RAS (ICT SB RAS) and designed for the comprehensive analysis of metrorhythmic and genre-stylistic characteristics of poetic texts in Russian. The paper presents the structure of this system, which combines original program modules, written by the system developers for single-purpose tasks in the analysis of poetic texts, with open-access software products. The generalized approach, in which poetic features are treated as a vector, makes it possible to use modern classification algorithms and their ensembles; on the other hand, it has disadvantages for the small volumes of text with which it is necessary to work. A verification step therefore allows specialists to adjust the operation of the system on the basis of expert knowledge and also makes the classification process transparent. Two Python libraries were used as tools: scikit-learn, which implements the classification algorithms and the methods for combining them, and ELI5, which makes it possible to establish a correspondence between the components of the feature vector and specific features. The extraction of lexical and metrorhythmic features characteristic of genre and style, and of their combinations, improved the automated processing of poetic texts in Russian, as shown on a corpus of poetic texts by A.S. Pushkin and K.N. Batyushkov.
The results obtained can be used to verify the classifier and to compile a list of features characteristic of a poet's genre and style.

Authors and Affiliations

Vladimir Barakhnin, Olga Kozhemyakina, Elena Rychkova, Ilya Pastushkov, Yuliya Borzilova

Download PDF file
  • EP ID EP523558
  • DOI 10.25559/SITITO.14.201804.888-895

How To Cite

Vladimir Barakhnin, Olga Kozhemyakina, Elena Rychkova, Ilya Pastushkov, Yuliya Borzilova (2018). THE EXTRACTION OF LEXICAL AND METRORHYTHMIC FEATURES WHICH ARE CHARACTERISTIC FOR THE GENRE AND THE STYLE AND FOR THEIR COMBINATIONS WITHIN THE PROCESS OF AUTOMATED PROCESSING OF TEXTS IN RUSSIAN. Современные информационные технологии и ИТ-образование, 14(4), 888-895. https://europub.co.uk/articles/-A-523558