THE MODEL AND METHOD OF TEXT THEMATIC STRUCTURIZATION ON THE BASIS OF STOCHASTIC AUTOMATICS

Abstract

Purpose. To propose a model for analyzing text structure that uses stochastic matrices and automata to trace the text. The model makes it possible to study how the significance of particular aspects and story lines changes along the length of the text and to reveal the program by which verbal images are presented. Results. The stages of the text content analysis method are given. The method differs in that thematic aspects are distinguished in the content, the change in their significance is traced, stochastic matrices of associative connections between entities are formed, and an annotation is formulated for each significant aspect of the content. This makes it possible to structure the analyzed text thematically. The method and model were verified experimentally on two texts: a technical text of 2400 words (excluding the title, conclusions, and list of references) and a literary text of 1500 words. Keywords were selected using the TextAnalyst program. For the plot lines, graphs of the change in the significance of aspects were constructed, significant aspects were identified, and an annotation was formed for each of them. The compression ratio was 10%. Practical value. The experimental results confirm that the method is workable. The advantages of the proposed approach are its relative simplicity of implementation and the 100% completeness, even with some redundancy, with which relevant information is represented in the collection of annotations. The redundancy is easily eliminated by threshold processing of the material found for each aspect. The original structure of the document is of practically no importance: the input can be arbitrary texts or sets of field values from a heterogeneous database.
In future studies, it is planned to analyze trends in the significance of aspects, to study the correlation of aspects and their consistency, and to construct the phase structure of the text. With the use of knowledge bases, it becomes possible to identify contradictions and collisions and to build meaningful interpretations. References 12, figure 1, table 2.
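
The abstract does not give the formulas behind the method, so the following is only a rough sketch of the three ideas it names: tracing aspect significance along the text, forming a stochastic matrix of associative connections, and threshold processing. Everything concrete here is an assumption made for illustration: significance is taken as the share of an aspect's keywords in a fixed-size token window, associative connection is taken as co-occurrence within a sentence, and the function names `aspect_significance`, `association_matrix`, and `significant_aspects` are hypothetical.

```python
from collections import Counter


def aspect_significance(tokens, aspects, window=50):
    """Trace each aspect's significance along the text.

    Assumption: significance in a window is the share of tokens that
    belong to the aspect's keyword set. `aspects` maps an aspect name
    to a set of keywords.
    """
    profile = []
    for start in range(0, len(tokens), window):
        chunk = tokens[start:start + window]  # never empty inside this loop
        counts = Counter(chunk)
        profile.append({
            name: sum(counts[word] for word in keywords) / len(chunk)
            for name, keywords in aspects.items()
        })
    return profile


def association_matrix(sentences, entities):
    """Build a row-stochastic matrix of associations between entities.

    Assumption: two entities are associated when they occur in the same
    sentence. Each non-empty row is normalized to sum to 1; an entity
    with no associations keeps an all-zero row.
    """
    index = {entity: i for i, entity in enumerate(entities)}
    size = len(entities)
    counts = [[0.0] * size for _ in range(size)]
    for sentence in sentences:
        words = set(sentence)
        present = [entity for entity in entities if entity in words]
        for a in present:
            for b in present:
                if a != b:
                    counts[index[a]][index[b]] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([value / total for value in row] if total else row[:])
    return matrix


def significant_aspects(profile, threshold):
    """Threshold processing: keep aspects whose peak significance
    reaches `threshold` in at least one window."""
    peaks = {}
    for row in profile:
        for name, value in row.items():
            peaks[name] = max(peaks.get(name, 0.0), value)
    return sorted(name for name, peak in peaks.items() if peak >= threshold)
```

Under these assumptions, the list returned by `aspect_significance` can be plotted window by window to obtain a significance graph for each aspect, and `significant_aspects` selects the aspects for which an annotation would then be formed.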

Authors and Affiliations

I. Shevchenko, A. Lebedinets, D. Vasilyev

  • EP ID EP659434
  • DOI 10.30929/1995-0519.2018.1.29-37

How To Cite

I. Shevchenko, A. Lebedinets, D. Vasilyev (2018). THE MODEL AND METHOD OF TEXT THEMATIC STRUCTURIZATION ON THE BASIS OF STOCHASTIC AUTOMATICS. Вісник Кременчуцького національного університету імені Михайла Остроградського, 1(108), 29-37. https://europub.co.uk/articles/-A-659434