Stemming and root-based approaches to the retrieval of Arabic documents on the Web

Journal Title: Webology - Year 2006, Vol 3, Issue 1

Abstract

Using information retrieval systems to gain access to documents in languages other than English is becoming an increasingly significant problem. Rules, theories, algorithms, and retrieval methods designed and developed for English and other morphologically similar languages may or may not apply in the linguistic environments of other languages. The problem is particularly acute in languages whose morphological rules differ radically from those of English. This paper compares the effects of stemming and root retrieval on information retrieval in Arabic through an exploratory study of the handling of Arabic words by an English-language search engine (ELSE). Search experiments, using 2000 Arabic documents and 40 Arabic search terms (nouns), were conducted in a Web search engine developed for English (AltaVista) and in an Arabic search engine (al-Idrisi) to compare the performances of stemming and root retrieval and to investigate the possibility of adapting AltaVista for use with Arabic text. The results of the experiments show that more effective retrieval can be accomplished through stemming, and that it is possible to adapt an ELSE for use with Arabic without the need to develop root-retrieval features.
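The difference between the two approaches named in the abstract can be illustrated with a small sketch. It is not the method used in the paper or in either search engine: the affix lists, the `light_stem` and `root_of` helpers, and the hand-made `TOY_ROOTS` table are all assumptions chosen for illustration, and real Arabic stemmers and root extractors are considerably more involved. The sketch only shows that stemming strips affixes and keeps related nouns distinct, while root retrieval maps them to one shared root.

```python
# Illustrative sketch only: tiny affix lists and a hand-made root table.
# None of these names or lists come from the paper.

PREFIXES = ("وال", "بال", "ال", "و")          # e.g. "and the", "with the", "the", "and"
SUFFIXES = ("ات", "ون", "ين", "ان", "ة", "ي")  # common plural, dual, and feminine endings


def light_stem(word: str, min_len: int = 3) -> str:
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


# Hypothetical lookup from a stem to its three-letter root.
TOY_ROOTS = {
    "كتاب": "كتب",  # book
    "مكتب": "كتب",  # stem of "library" after affix stripping
    "كاتب": "كتب",  # writer
}


def root_of(word: str) -> str:
    """Return the toy root of a word, falling back to its stem if unknown."""
    stem = light_stem(word)
    return TOY_ROOTS.get(stem, stem)


if __name__ == "__main__":
    for term in ("الكتاب", "المكتبة", "والكاتب"):
        print(term, "stem:", light_stem(term), "root:", root_of(term))
```

In this toy setup, "the book", "the library", and "and the writer" each reduce to a different stem but to the same root, so a root-based index would return all three for any one of them, whereas a stem-based index would not.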

Authors and Affiliations

Haidar Moukdad

How To Cite

Haidar Moukdad (2006). Stemming and root-based approaches to the retrieval of Arabic documents on the Web. Webology, 3(1), -. https://europub.co.uk/articles/-A-687497