CLUSTERING OF SCIENTISTS' PUBLICATIONS, CONSIDERING FINDING SIMILARITIES IN ABSTRACTS AND TEXTS OF PUBLICATIONS BASED ON N-GRAM ANALYSIS AND IDENTIFYING POTENTIAL PROJECT GROUPS

Journal Title: Scientific Journal of Astana IT University - Year 2023, Vol 16, Issue 16

Abstract

The article addresses the problem of clustering scientists' publications by finding similarities in the abstracts and texts of those publications through n-gram analysis and cross-references, together with the related task of identifying potential project groups for research and educational projects from the clustering results. In world practice, scientific partners are usually selected without a comprehensive assessment of their activities, and most well-known indexes for evaluating scientists' research activities have yet to take citation information fully into account. The methods developed in the study for evaluating the scientific activities of scientists and universities, together with methods for selecting scientific partners for educational and scientific projects on a scientific basis, make it possible to organize the work of universities effectively. The study constructs a probabilistic topic model that clusters scientists' publications by scientific field while taking the citation network into account, which is an important step toward identifying subject-specific scientific spaces. Constructing the model resolves the problem of citation-graph clustering becoming increasingly unstable as the number of clusters decreases. The main objective of this work is to address the challenge of selecting suitable partners for collaboration in scientific and educational projects. To this end, a method for choosing project executors has been developed that uses fuzzy logical inference to reconcile expert opinions on candidate requirements, which supports the multi-criteria selection of potential partners. Several software modules have also been created as part of this research.
These modules automate the collection of information on scientists' publications and citation records from international scientometric databases, and they include a visualization module and a user interface that support the evaluation of the scientific activities of university teaching staff. Choosing partners for grants or strategic collaborations remains a pertinent issue, especially in a globalized and highly mobile scientific community. The approach described here clusters the scientific publications of potential project partners, performs comparative citation analysis of those publications, and establishes their proximity through n-gram analysis of abstracts. These methods provide a scientific basis for informed partner selection, which is crucial for initiating and advancing research projects. Selecting partners to form research project teams is therefore a pressing task.
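
The abstract does not specify how proximity between abstracts is computed from n-grams. As a minimal sketch only, the following assumes word-level n-grams compared with the Jaccard coefficient; the function names (`word_ngrams`, `jaccard`, `ngram_similarity`) and the choice of Jaccard are illustrative assumptions, not the authors' method.

```python
import re


def word_ngrams(text, n=2):
    """Return the set of word n-grams in a lowercased text (hypothetical helper)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def jaccard(a, b):
    """Jaccard coefficient of two sets; defined as 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ngram_similarity(text_a, text_b, n=2):
    """Proximity of two abstracts as the Jaccard overlap of their word n-grams.

    Assumption: the paper may use a different n-gram measure.
    """
    return jaccard(word_ngrams(text_a, n), word_ngrams(text_b, n))


if __name__ == "__main__":
    a = "clustering of scientific publications"
    b = "clustering of scientific projects"
    # The two texts share 2 of 4 distinct bigrams.
    print(ngram_similarity(a, b))  # 0.5
```

Pairwise scores of this kind could feed a clustering step, but how the study combines them with the citation network is not stated in the abstract.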

Authors and Affiliations

Andrii Biloshchytskyi, Olexandr Kuchansky, Aidos Mukhatayev, Svitlana Biloshchytska, Yurii Andrashko, Sapar Toxanov, Adil Faizullin

  • EP ID EP729512
  • DOI https://doi.org/10.37943/16AADE3851

How To Cite

Andrii Biloshchytskyi, Olexandr Kuchansky, Aidos Mukhatayev, Svitlana Biloshchytska, Yurii Andrashko, Sapar Toxanov, Adil Faizullin (2023). CLUSTERING OF SCIENTISTS' PUBLICATIONS, CONSIDERING FINDING SIMILARITIES IN ABSTRACTS AND TEXTS OF PUBLICATIONS BASED ON N-GRAM ANALYSIS AND IDENTIFYING POTENTIAL PROJECT GROUPS. Scientific Journal of Astana IT University, 16(16), -. https://europub.co.uk/articles/-A-729512