Development of the method for filtering verbal noise while search keywords for the English text
Journal Title: Technology Audit and Production Reserves (Технологический аудит и резервы производства) - Year 2018, Vol 6, Issue 2
Abstract
The object of research is the processing of verbal information to identify keywords in a text. The most important step in searching for key terms is calculating their weights in the document under consideration, which makes it possible to evaluate their significance relative to one another in that context. The many approaches to this problem are conventionally divided into two groups: those that require training and those that do not. Training means that the original text corpus must be pre-processed to extract information about how frequently terms occur across the whole corpus. An alternative approach uses linguistic ontologies, which are more or less approximate models of the set of words in a given language. Systems for the automatic extraction of key terms are built on both approaches. Nevertheless, research on keyword search continues, both to improve the accuracy and completeness of the results and to apply methods of extracting information from text to new problems.
Existing approaches to keyword identification are characterized. The best text-processing quality is achieved by linguistic methods or by combining them with statistical ones. A system for automatically determining key phrases in natural-language text should therefore be developed using a morphological dictionary and syntax rules.
The study uses an approach to identifying keywords that is based on finding syntactic links between word forms in the sentences of an English text, using the tools of modern linguistic packages. Within the general approach to reducing verbal noise, the method achieves the reduction through formalized operations: replacing pronouns with the corresponding nouns; removing noise links; removing noise words; and removing stop words. The described operations can be used as additional modules that improve keyword search results, both for the developed method for determining the keywords of English text and for other keyword search algorithms.
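As a rough illustration of two of the operations listed in the abstract (removing noise words and removing stop words), the sketch below filters a token stream and then ranks what is left by raw frequency. It is not the authors' implementation: the word lists `STOP_WORDS` and `NOISE_WORDS`, the function names, and the frequency ranking are all assumptions made for the example. Pronoun replacement and the removal of noise links need a syntactic parser and are left out.

```python
import re
from collections import Counter

# Hypothetical word lists, chosen only for illustration; they are not from the paper.
STOP_WORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "are", "for", "on", "it", "that"}
NOISE_WORDS = {"very", "really", "basically", "actually"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def filter_noise(tokens, stop_words=STOP_WORDS, noise_words=NOISE_WORDS):
    """Drop stop words and noise words, keeping the original token order."""
    return [t for t in tokens if t not in stop_words and t not in noise_words]


def keyword_candidates(text, top_n=3):
    """Rank the remaining tokens by raw frequency (a stand-in for a real weighting)."""
    counts = Counter(filter_noise(tokenize(text)))
    return counts.most_common(top_n)


sample = "The parser is really fast. The parser builds a tree, and the tree is very small."
print(keyword_candidates(sample, top_n=2))
```

Because each operation is a separate function over a token list, the filter can be placed in front of a different keyword ranking step, which matches the abstract's description of the operations as additional modules.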
Authors and Affiliations
Oleg Bisikalo, Alexander Yahimovich, Yaroslav Yahimovich