THE DETERMINATION METHOD FOR CONTEXTUAL MEANINGS OF WORDS AND DOCUMENTS

Abstract

This paper considers problems and methods of automatic (program-based) context recognition for words and text documents. A survey of existing text processing methods is provided, and a simple numeric algorithm is given for determining the context of words and documents with the help of a semantic net in the form of a tree-type graph. The structure of the semantic net is described in detail. The semantic net is needed to fix the context of a basic word W1 by means of the meaning words W2 coupled with it. The words W2 represent the possible context meanings of W1. To every word W2 there correspond some characteristic words W3. In the context calculation, the distances between the words W2 and W3 are taken into account; these distances are measured as the number of words in between. Every word W3 has a metric that reflects its conceptual proximity to W2. The words W1, W2 and W3 are stored in a table together with their metric values. In the analysis of document context, variations of words in case and number were taken into account. A simple formula for the context calculation is presented. A method of verifying the results with the help of the Chebyshev inequality is also provided. The context analysis method was checked by Monte Carlo simulations. Tables of the investigation results are provided, and some recommendations for tuning and optimizing the algorithm parameters are given. The analysis showed that the proposed method is quite effective for context estimation in text analysis and for any system that requires computer recognition of context.
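The abstract does not reproduce the paper's formula, so the following is only a minimal sketch of how a W1/W2/W3 table with metrics and word distances might be combined into a score. The weighting `metric / (1 + distance)`, the search window, the measurement of distance from the occurrence of W1 (rather than from W2), and all names (`Meaning`, `score_meanings`, the sample words and metric values) are assumptions made for illustration, not the authors' method. Handling of case and number variations is reduced here to lower-casing.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Meaning:
    """A candidate context meaning (W2) with its characteristic words (W3)."""
    name: str
    # Maps each characteristic word W3 to its metric (proximity to W2).
    characteristics: Dict[str, float] = field(default_factory=dict)


def score_meanings(tokens: List[str], base_word: str,
                   meanings: List[Meaning], window: int = 10) -> Dict[str, float]:
    """Score every candidate meaning of base_word (W1) in a token list.

    ASSUMPTION: each W3 found within `window` tokens of W1 contributes
    metric / (1 + distance), where distance is the number of words
    lying between the two. This weighting is hypothetical.
    """
    tokens = [t.lower() for t in tokens]  # crude stand-in for normalization
    scores = {m.name: 0.0 for m in meanings}
    for pos, token in enumerate(tokens):
        if token != base_word:
            continue
        lo = max(0, pos - window)
        hi = min(len(tokens), pos + window + 1)
        for i in range(lo, hi):
            if i == pos:
                continue
            distance = abs(i - pos) - 1  # words strictly between W1 and W3
            for m in meanings:
                metric = m.characteristics.get(tokens[i])
                if metric is not None:
                    scores[m.name] += metric / (1 + distance)
    return scores


# Hypothetical table: W1 = "bank", two W2 meanings, W3 words with metrics.
meanings = [
    Meaning("finance", {"interest": 1.0, "rate": 0.8}),
    Meaning("river", {"river": 1.0, "water": 0.7}),
]
tokens = "The bank raised its interest rate near the river".split()
scores = score_meanings(tokens, "bank", meanings)
best = max(scores, key=scores.get)
print(best)  # finance
```

In this toy sentence the financial reading wins because its characteristic words lie closer to "bank" than "river" does. A different distance weighting or window size would shift the balance, which is presumably the kind of parameter tuning the abstract refers to.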

Authors and Affiliations

Elizaveta Dorenskaya, Yuri Semenov

  • EP ID EP523561
  • DOI 10.25559/SITITO.14.201804.896-902

How To Cite

Elizaveta Dorenskaya, Yuri Semenov (2018). THE DETERMINATION METHOD FOR CONTEXTUAL MEANINGS OF WORDS AND DOCUMENTS. Современные информационные технологии и ИТ-образование, 14(4), 896-902. https://europub.co.uk/articles/-A-523561