Stemming and root-based approaches to the retrieval of Arabic documents on the Web

Journal Title: Webology - Year 2006, Vol 3, Issue 1

Abstract

Using information retrieval systems to gain access to documents in languages other than English is becoming an increasingly significant problem. Rules, theories, algorithms, and retrieval methods designed and developed for English and other morphologically similar languages may or may not apply in the linguistic environments of other languages. The problem is particularly acute in languages whose morphological rules differ radically from those of English. This paper compares the effects of stemming and root retrieval on information retrieval in Arabic through an exploratory study of the handling of Arabic words by an English-language search engine (ELSE). Search experiments, using 2000 Arabic documents and 40 Arabic search terms (nouns), were conducted in a Web search engine developed for English (AltaVista) and in an Arabic search engine (al-Idrisi) to compare the performances of stemming and root retrieval and to investigate the possibility of adapting AltaVista for use with Arabic text. The results of the experiments show that more effective retrieval can be accomplished through stemming, and that it is possible to adapt an ELSE for use with Arabic without the need to develop root-retrieval features.
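
To make the distinction between stemming and root retrieval concrete, the following is a minimal sketch of light (affix-stripping) stemming in Python. It is not the procedure used in the study: the affix lists, the `light_stem` name, and the `min_len` threshold are assumptions chosen for illustration only. The sketch shows that stemming keeps كتاب (book) and مكتبة (library) apart, whereas a root-based approach would group both under the shared root ك-ت-ب.

```python
# Illustrative affix lists (an assumption, not taken from the paper).
# Longer prefixes come first so that "وال" is tried before "ال" and "و".
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي"]


def light_stem(word: str, min_len: int = 3) -> str:
    """Strip at most one prefix and one suffix, keeping at least min_len letters."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


if __name__ == "__main__":
    # "the book", "and the book", "the library", "the libraries"
    for term in ["الكتاب", "والكتاب", "المكتبة", "المكتبات"]:
        print(term, "->", light_stem(term))
```

Under these assumed affix lists, the two forms of "book" reduce to one stem and the two forms of "library" reduce to another, so a query on one noun does not pull in every word derived from the same root. That narrower matching is the behaviour the abstract contrasts with root retrieval.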

Authors and Affiliations

Haidar Moukdad

How To Cite

Haidar Moukdad (2006). Stemming and root-based approaches to the retrieval of Arabic documents on the Web. Webology, 3(1), -. https://europub.co.uk/articles/-A-687497