Stemming and root-based approaches to the retrieval of Arabic documents on the Web
Journal Title: Webology - Year 2006, Vol 3, Issue 1
Abstract
Using information retrieval systems to gain access to documents in languages other than English is becoming an increasingly significant problem. Rules, theories, algorithms, and retrieval methods designed and developed for English and other morphologically similar languages may or may not apply in the linguistic environments of other languages. The problem is particularly acute in languages whose morphological rules differ radically from those of English. This paper compares the effects of stemming and root retrieval on information retrieval in Arabic through an exploratory study of the handling of Arabic words by an English-language search engine (ELSE). Search experiments, using 2000 Arabic documents and 40 Arabic search terms (nouns), were conducted in a Web search engine developed for English (AltaVista) and in an Arabic search engine (al-Idrisi) to compare the performance of stemming and root retrieval and to investigate the possibility of adapting AltaVista for use with Arabic text. The results of the experiments show that more effective retrieval can be accomplished through stemming, and that it is possible to adapt an ELSE for use with Arabic without the need to develop root-retrieval features.
Authors and Affiliations
Haidar Moukdad