Analysis of statistical methods for stable combinations determination of keywords identification

Abstract

The study compares statistical methods for determining stable word combinations and selects the optimal one for identifying keywords when processing English-language and Ukrainian-language web resources. The effectiveness of the method is directly proportional to the quality of the linguistic analysis of the Ukrainian and English texts, which is based on Web Mining and NLP technologies. The methods of linguistic analysis were decomposed to determine their impact on the quality of forming stable word combinations as keywords. A distinctive feature of the method is that the morphological and syntactic analyses of lexical units are adapted to the peculiarities of Ukrainian words and texts.

To determine stable word combinations effectively, it is essential to exclude function words (stop words or reference words), pronouns, numerals, and verbs, because they are not related to the subject and content of a published work. A set of stable word combinations as keywords is determined by high-quality morphological and syntactic analyses of the relevant texts. The set of identified stable word combinations is then used to compare texts and determine how relevant a text is to a specific topic or user request. The study investigated the internal "dynamics" of forming a set of stable word combinations as keywords, depending on the statistical method applied to the texts. The obtained results have been verified.

The study presents the results of experimental testing of the proposed content-monitoring method for determining stable word combinations to identify keywords when processing English-language and Ukrainian-language web resources with technical content, based on Web Mining technology. It has been determined that the authors of published works often assign keywords that are far from what the work actually considers. It has also been shown that the quality of the result depends on the quality of the linguistic analysis of the texts and on subsequent filtering. Further experimental research requires testing the proposed keyword-determination method on other categories of texts: scientific, humanities, fiction, journalistic, etc.
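The filtering-then-scoring idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the statistical methods that were compared, so pointwise mutual information (PMI) is used here purely as one common association measure and is an assumption, not the authors' chosen method. The stop list, the function names `tokenize` and `stable_combinations`, and the `min_count` threshold are likewise hypothetical. A fixed stop list stands in for the morphological and syntactic analysis that the study relies on to exclude function words, pronouns, numerals, and verbs.

```python
import math
import re
from collections import Counter

# Hypothetical stop list (an assumption). The study uses morphological
# analysis to exclude function words, pronouns, numerals, and verbs;
# a fixed set of words is only a rough stand-in for that step.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are", "for",
              "on", "it", "they", "we", "this", "that", "with", "by", "as"}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def stable_combinations(text, min_count=2):
    """Return (bigram, count, pmi) tuples for frequent bigrams, best PMI first."""
    tokens = tokenize(text)
    unigrams = Counter(tokens)
    # Count adjacent word pairs in which neither word is on the stop list.
    # Simplification: pairs may span sentence boundaries.
    bigrams = Counter(
        (w1, w2) for w1, w2 in zip(tokens, tokens[1:])
        if w1 not in STOP_WORDS and w2 not in STOP_WORDS
    )
    n = len(tokens)
    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        # PMI = log2(P(w1, w2) / (P(w1) * P(w2))), with every probability
        # estimated as count / n (a simple approximation).
        pmi = math.log2(count * n / (unigrams[w1] * unigrams[w2]))
        scored.append(((w1, w2), count, pmi))
    # Highest PMI first; ties broken by count, then alphabetically.
    return sorted(scored, key=lambda item: (-item[2], -item[1], item[0]))
```

Calling `stable_combinations(text)` on a document returns candidate two-word keywords ranked by association strength. Other measures, such as a t-score or log-likelihood ratio, could be substituted for the PMI line to compare how the ranked set changes with the statistical method.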

Authors and Affiliations

Vasyl Lytvyn, Victoria Vysotska, Dmytro Uhryn, Mariya Hrendus, Oleh Naum

  • EP ID EP527851
  • DOI 10.15587/1729-4061.2018.126009

How To Cite

Vasyl Lytvyn, Victoria Vysotska, Dmytro Uhryn, Mariya Hrendus, Oleh Naum (2018). Analysis of statistical methods for stable combinations determination of keywords identification. Восточно-Европейский журнал передовых технологий, 2(2), 23-37. https://europub.co.uk/articles/-A-527851