THE MODEL AND METHOD OF TEXT THEMATIC STRUCTURIZATION ON THE BASIS OF STOCHASTIC AUTOMATICS

Abstract

Purpose. To propose a model for analyzing text structure that uses stochastic matrices and automata to trace the text. The model makes it possible to study how the significance of particular aspects and story lines changes along the length of the text, and to reveal the program by which verbal images are presented. Results. The stages of a method for analyzing text content are given. The method differs from others in that thematic aspects are distinguished in the content, the change in their significance is traced, stochastic matrices of associative connections between entities are formed, and an annotation is formulated for each significant aspect of the content. This makes it possible to structure the analyzed text thematically. The method and model were verified experimentally: graphs of the change in aspect significance were obtained, and annotations were formed for each significant aspect. Two texts were selected to test the operability of the proposed method and model: a technical text of 2400 words (excluding the title, conclusions and list of references) and a literary text of 1500 words. The keywords were selected using the TextAnalyst program. For the plot lines, graphs of the change in aspect significance were constructed. Significant aspects were highlighted and annotations were formed. The compression ratio was 10%. Practical value. The experimental results confirm that the method is operable. The advantages of the proposed approach are its relative simplicity of implementation and the 100% completeness, with even some redundancy, with which relevant information is represented in the collection of annotations. The redundancy is easily eliminated by threshold processing of the material found for each aspect. The original structure of the document is of practically no importance: the input may be arbitrary texts or sets of field values from a heterogeneous database.
In future studies, it is planned to analyze trends in the significance of aspects, to study the correlation of aspects and their consistency, and to construct the phase structure of the text. With the use of knowledge bases, it becomes possible to identify contradictions and collisions and to build meaningful interpretations. References 12, figure 1, table 2.
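
The abstract does not give the formulas behind the method, so the following Python sketch only illustrates the two central ideas under stated assumptions; it is not the authors' implementation. It assumes that the significance of an aspect in a text window is the share of window tokens that belong to the aspect's keyword set, and that an "associative connection" can be approximated by one keyword directly following another in the keyword sequence. The function names, the window size and the sample keywords are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def significance_trace(tokens, aspects, window):
    """Trace aspect significance along the text.

    Assumption (not from the article): significance of an aspect in a
    window is the fraction of window tokens found in its keyword set.
    """
    trace = []
    for start in range(0, len(tokens), window):
        chunk = tokens[start:start + window]
        counts = Counter(chunk)
        trace.append({
            name: sum(counts[k] for k in keywords) / len(chunk)
            for name, keywords in aspects.items()
        })
    return trace


def stochastic_matrix(tokens, keywords):
    """Build a row-stochastic matrix of keyword-to-keyword transitions.

    Assumption (not from the article): two keywords are associated when
    one follows the other in the sequence of keyword occurrences.
    Rows with no observed transitions are left as all zeros.
    """
    index = {k: i for i, k in enumerate(keywords)}
    n = len(keywords)
    counts = [[0] * n for _ in range(n)]
    sequence = [t for t in tokens if t in index]
    for current, following in zip(sequence, sequence[1:]):
        counts[index[current]][index[following]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total for c in row] if total else [0.0] * n)
    return matrix


# Hypothetical toy input; real keywords would come from a tool
# such as TextAnalyst, as described in the abstract.
sample_tokens = tokenize("Pump valve pump sensor valve pump")
sample_aspects = {"hydraulics": {"pump", "valve"}, "control": {"sensor"}}
sample_trace = significance_trace(sample_tokens, sample_aspects, window=3)
sample_matrix = stochastic_matrix(sample_tokens, ["pump", "valve", "sensor"])
```

Plotting each aspect's values from `sample_trace` against the window index gives a significance graph of the kind the abstract mentions, and each non-empty row of `sample_matrix` sums to 1, which is the property that makes the matrix stochastic.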

Authors and Affiliations

I. Shevchenko, A. Lebedinets, D. Vasilyev

  • EP ID EP659434
  • DOI 10.30929/1995-0519.2018.1.29-37

How To Cite

I. Shevchenko, A. Lebedinets, D. Vasilyev (2018). THE MODEL AND METHOD OF TEXT THEMATIC STRUCTURIZATION ON THE BASIS OF STOCHASTIC AUTOMATICS. Вісник Кременчуцького національного університету імені Михайла Остроградського, 1(108), 29-37. https://europub.co.uk/articles/-A-659434