Formulaicity in Turkish: Evidence from the Turkish National Corpus
Journal Title: Mersin Üniversitesi Dil ve Edebiyat Dergisi - Year 2016, Vol 13, Issue 2
Abstract
Formulaic sequences are the most frequently occurring forms in a language. Identifying them is useful in a wide range of areas, including linguistics, second language learning, and natural language processing. The preferred method is corpus-based: a corpus is compiled from written texts or tape-recorded conversations in the language, the frequencies of sequences in the corpus are counted, and the most frequent sequences are then examined to find formulas. Numerous studies have identified formulas for languages such as English. Only a few studies address formulaicity in Turkish, and most of them focus on formulas in the form of multi-word units. Turkish, however, is an agglutinating language with a rich and complex morphology, so formulaic sequences in affixation should also be investigated. Very few studies of formulaicity in Turkish affixation exist in the literature. In this study, we try to discover formulaic sequences in Turkish affixation by counting frequent suffix n-grams in written and spoken Turkish using the Turkish National Corpus, a balanced, large-scale, general-purpose corpus of contemporary Turkish. We list the most frequent suffix combinations not only for verbs but for all lexical categories (nouns, adjectives, verbs, and adverbs) in both the written and spoken parts of the Turkish National Corpus, and we discuss similarities and differences in affixation between written and spoken Turkish. We observe that spoken Turkish favors shorter suffix sequences than written Turkish does, and that as the length of the suffix n-grams increases, the formulaic sequences used in written and spoken Turkish diverge.
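The counting step described in the abstract can be illustrated with a minimal sketch. It assumes each word has already been morphologically analyzed into an ordered list of suffix labels. The labels, the toy data, and the function names `suffix_ngrams` and `count_suffix_ngrams` are hypothetical and chosen for illustration only; they are not the tagset of the Turkish National Corpus or the authors' implementation.

```python
from collections import Counter


def suffix_ngrams(suffixes, n):
    """Return all contiguous suffix n-grams of length n within one word."""
    return [tuple(suffixes[i:i + n]) for i in range(len(suffixes) - n + 1)]


def count_suffix_ngrams(analyses, n):
    """Count suffix n-grams over a collection of analyzed words."""
    counts = Counter()
    for suffixes in analyses:
        counts.update(suffix_ngrams(suffixes, n))
    return counts


# Hypothetical toy data: one list of suffix labels per word token.
toy_analyses = [
    ["PL", "POSS.1SG", "LOC"],  # ev-ler-im-de  "in my houses"
    ["PL", "POSS.1SG", "ABL"],  # ev-ler-im-den "from my houses"
    ["PL", "LOC"],              # ev-ler-de     "in the houses"
    ["PAST", "1SG"],            # gel-di-m      "I came"
]

bigram_counts = count_suffix_ngrams(toy_analyses, 2)
print(bigram_counts.most_common(3))
```

Running the same function separately on written and spoken samples, for several values of n, would give frequency lists that can be compared in the way the abstract describes.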
Authors and Affiliations
Selma Ayşe Özel, Yasin Bektaş, Hakan Yılmazer
A Theoretical View to Collocations with an Argument of Integrant Approach in Teaching Level
In the sense of the process to generate data aiming the language teaching, the main research question of a study that requires to be defined as collocation is undoubtedly about how to reach the collocations that are comm...
Foreword
This issue of Mersin Üniversitesi Dil ve Edebiyat Dergisi is devoted to the concept of stance (Turkish: duruş), which has drawn steadily growing interest since the 2000s. In the linguistics literature, by a wide range of approaches...
Stance and Perception of Phonetic Variable
Perception of phonetic variables alongside social meanings has been the preliminary research question in the field of sociolinguistics in the last twenty years. The theoretical debate fostered in answering this research...
Multi-word Expressions in Genre Specification
Corpus analyses of lexical structures have uncovered different functions that they come to serve in textual organisation. Frequently occurring patterns of lexical items, the multi-word units, display different distributi...
The Teller/Receiver-Oriented Functions of Ondan Sonra As A Discourse Marker in Conversational Narratives
Discourse markers that are largely used in everyday talk carry out various functions in conversations. One of the conversational genres in which discourse markers are highly used is conversational narrative. Conversation...