STATISTICAL METHODS OF FORMATION OF TEXT CORPORA AND LEXICOGRAPHIC RESOURCES (ON THE BASIS OF THE SPECIALTY “ACOUSTICS AND ULTRASONIC”)

Abstract

The article describes the step-by-step procedure for compiling a text corpus and then frequency dictionaries, using the specialty Acoustics and Ultrasonic Technique (AUST) as an example; its texts belong to scientific and technical discourse. It argues that present-day research requires corpora of real texts compiled with statistical methods. Statistical methods make it possible to determine a mandatory parameter: the reliability of the text corpus and of the lexicographic resources built on it (frequency dictionaries, alphabetical-frequency dictionaries, etc.). The AUST example shows how statistically verified characteristics of the text corpus made it possible to create a reliable probabilistic-statistical model (a frequency dictionary) of this subject area. The statistical reliability of the dictionary is shown by the fact that the units of the base dictionary (the first 2 thousand words) cover 86% of the AUST texts, which makes it possible to understand the content of almost any AUST text using the lexical units in the base dictionary.
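The coverage figure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the tokenizer, the function names (`tokenize`, `frequency_dictionary`, `coverage`), and the sample sentence are assumptions made for illustration only. The sketch builds a frequency dictionary from a token list and computes the share of running words covered by the top-ranked entries (the "base dictionary").

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens.

    Assumption: a simple regex tokenizer for English; a real corpus
    would need language-specific tokenization and lemmatization.
    """
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def frequency_dictionary(tokens):
    """Return (word, count) pairs sorted by descending count, ties alphabetical."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def coverage(tokens, freq_dict, base_size):
    """Share of running words covered by the top `base_size` dictionary entries."""
    if not tokens:
        return 0.0
    base = {word for word, _ in freq_dict[:base_size]}
    covered = sum(1 for token in tokens if token in base)
    return covered / len(tokens)


# Hypothetical sample text, not taken from the AUST corpus.
sample = "the wave travels through the medium and the wave reflects"
tokens = tokenize(sample)
freq = frequency_dictionary(tokens)

print(freq[:2])                    # two most frequent words with their counts
print(coverage(tokens, freq, 2))   # share of tokens covered by those two words
```

With a real corpus, `base_size` would be set to 2000 to reproduce the kind of measurement reported in the abstract; the 86% figure itself comes from the article and is not derived here.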

Authors and Affiliations

G. F. Dyachenko, S. L. Mykhailiuk, I. F. Duvanskaya

How To Cite

G. F. Dyachenko, S. L. Mykhailiuk, I. F. Duvanskaya (2018). STATISTICAL METHODS OF FORMATION OF TEXT CORPORA AND LEXICOGRAPHIC RESOURCES (ON THE BASIS OF THE SPECIALTY “ACOUSTICS AND ULTRASONIC”). Закарпатські філологічні студії, 6(), 158-161. https://europub.co.uk/articles/-A-562563