MODELING MORPHOLOGICAL ANALYSIS BASED ON WORD-ENDING FOR UZBEK LANGUAGE

Journal Title: International scientific journal Science and Innovation - Year 2023, Vol 2, Issue 11

Abstract

Uzbek is an agglutinative language: words are formed by attaching affixes to roots, and inflectional endings express a range of morphological features. This property yields a large number of possible word-ending combinations, which greatly increases vocabulary size and causes data sparseness problems for statistical models. This paper presents a morphological analysis model that performs stemming, lemmatization, and extraction of morphological information while accounting for morpho-phonetic exceptions. The core of the model is a complete set of word endings with assigned morphological information, together with additional datasets for morphological analysis. The proposed model was evaluated on a curated test set of 5.3K words. It achieved a word-level accuracy of over 91%, as determined by linguistic experts who manually verified the correctness of the stems, lemmas, and morphological features. The tool built on the proposed methodology is available as an open-source Python package and as a web-based application with a public API.
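
To illustrate the word-ending idea, the following is a minimal Python sketch, not the paper's implementation or the published package's API. It assumes a hypothetical table `WORD_ENDINGS` that maps whole endings to features, a hypothetical stem list `KNOWN_STEMS`, and a hypothetical function `analyze` that strips the longest matching ending. The morpho-phonetic exceptions mentioned in the abstract are not handled here.

```python
# Hypothetical, tiny word-ending table: each complete ending maps to its
# morphological features. These entries are illustrative only and are not
# taken from the paper's dataset.
WORD_ENDINGS = {
    "lar": {"number": "plural"},
    "da": {"case": "locative"},
    "ni": {"case": "accusative"},
    "larda": {"number": "plural", "case": "locative"},
    "larim": {"number": "plural", "possessive": "1sg"},
    "larimda": {"number": "plural", "possessive": "1sg", "case": "locative"},
}

# Hypothetical stem lexicon, used to avoid stripping too much from a word.
KNOWN_STEMS = {"kitob", "maktab", "bola"}


def analyze(word):
    """Return stem, ending, and features using longest-ending-first matching."""
    word = word.lower()
    # Try longer endings first so that "larimda" wins over "da".
    for ending in sorted(WORD_ENDINGS, key=len, reverse=True):
        if word.endswith(ending):
            stem = word[: -len(ending)]
            if stem in KNOWN_STEMS:
                return {
                    "stem": stem,
                    "ending": ending,
                    "features": dict(WORD_ENDINGS[ending]),
                }
    # No ending matched: treat the whole word as the stem.
    return {"stem": word, "ending": "", "features": {}}


if __name__ == "__main__":
    for w in ["kitoblarimda", "maktabda", "bola"]:
        print(w, analyze(w))
```

Storing complete endings rather than individual suffixes means one lookup returns all features of a word form at once, at the cost of a larger table.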

Authors and Affiliations

Ulugbek Salaev

  • EP ID EP725418
  • DOI 10.5281/zenodo.10155225

How To Cite

Ulugbek Salaev (2023). MODELING MORPHOLOGICAL ANALYSIS BASED ON WORD-ENDING FOR UZBEK LANGUAGE. International scientific journal Science and Innovation, 2(11), -. https://europub.co.uk/articles/-A-725418