Analysis of statistical methods for stable combinations determination of keywords identification
Journal Title: Eastern-European Journal of Enterprise Technologies (Восточно-Европейский журнал передовых технологий) - Year 2018, Vol 2, Issue 2
Abstract
<p class="a">The study compares statistical methods and selects an optimal one for determining stable word combinations when identifying keywords in English-language and Ukrainian-language Web resources. The effectiveness of the method depends directly on the quality of the linguistic analysis of the Ukrainian and English texts, which is based on Web Mining and NLP technology. The methods of linguistic analysis were decomposed to determine their impact on the quality of the stable word combinations formed as keywords. A distinguishing feature of the method is that the morphological and syntactic analyses of lexical units are adapted to the peculiarities of Ukrainian-language words and texts.</p>
<p class="a">To determine stable word combinations effectively, it is essential to exclude function words (stop words or references), pronouns, numerals and verbs, because they are not related to the subject and content of a published work. The set of stable word combinations used as keywords is determined by qualitative morphological and syntactic analyses of the relevant texts. The set of identified stable word combinations is then used to compare texts and to determine how relevant a text is to a specific topic or user request. The study investigated the internal “dynamics” of forming a set of stable word combinations as keywords, depending on the statistical method applied to the texts. The obtained results have been verified.</p>
<p class="a">The study reports the results of experimental testing of the proposed content-monitoring method, which determines stable word combinations in order to identify keywords when processing English-language and Ukrainian-language Web resources with technical content, based on Web Mining technology. It has been determined that the authors of published works often specify keywords that are far from carefully considered. It has also been shown that the quality of the result is influenced by the quality of the linguistic analysis of the texts and of the subsequent filtering. Further experimental research requires testing the proposed method for determining keywords on other categories of texts: scientific, humanities, belletristic, journalistic, etc.</p>
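The pipeline outlined in the abstract (filter out non-content words, collect recurring word combinations, compare the resulting set with a topic) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not name the statistical methods that were compared, so plain co-occurrence frequency stands in for them here. The tiny `STOP_WORDS` list, the `min_count` threshold, and the Jaccard overlap used as a relevance score are all assumptions made for illustration. Excluding verbs, as the abstract requires, would need real morphological analysis, which this sketch omits.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list (an assumption for illustration).
# A real system would rely on morphological analysis to drop function words,
# pronouns, numerals and verbs.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for",
              "on", "it", "they", "this", "that", "with", "as", "by"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stable_bigrams(text, min_count=2):
    """Count adjacent word pairs in which neither word is a stop word,
    and keep the pairs that occur at least min_count times."""
    tokens = tokenize(text)
    pairs = Counter(
        (left, right)
        for left, right in zip(tokens, tokens[1:])
        if left not in STOP_WORDS and right not in STOP_WORDS
    )
    return {pair: count for pair, count in pairs.items() if count >= min_count}


def jaccard(set_a, set_b):
    """Jaccard overlap of two keyword sets, used here as an assumed
    stand-in for the relevance measure; returns 0.0 for two empty sets."""
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    sample = ("Web mining supports keyword identification. "
              "Keyword identification relies on web mining of the text.")
    found = stable_bigrams(sample)
    print(found)

    # Compare the extracted combinations with a hypothetical topic profile.
    extracted = {" ".join(pair) for pair in found}
    topic = {"web mining", "text classification"}
    print(jaccard(extracted, topic))
```

Raising `min_count` makes the filter stricter. A frequency threshold is only the simplest possible notion of "stable", and an association measure could be substituted inside `stable_bigrams` without changing the rest of the sketch.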
Authors and Affiliations
Vasyl Lytvyn, Victoria Vysotska, Dmytro Uhryn, Mariya Hrendus, Oleh Naum