Effective Performance of Information Retrieval by using Domain Based Crawler

Abstract

The World Wide Web continuously introduces new capabilities and attracts many people [1]. It consists of more than 60 billion pages online. Because of this explosion in size, information retrieval systems, or search engines, are upgraded day by day so that information can be accessed effectively and efficiently. In this paper, we present a Domain Based Information Retrieval (DBIR) system. The system crawls the web and adds to the database only those links that are related to a specific domain; links that are not related to that domain are ignored. This saves Storage Space (SS) and Searching Time (ST) and, as a result, improves the performance of the system. DBIR is an extension of the Effective Performance of Web Crawler (EPOW) system [2], which has two crawler modules. The first is the Basic Crawler, which consists of multiple downloaders to implement a parallelization policy. The second is the Master Crawler, which filters the URLs sent by the Basic Crawler according to domain and sends them back to the Basic Crawler so that the related links can be extracted. All of these related links are stored together in the database under a unique domain name.
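The two-module design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The in-memory `FAKE_WEB`, the `DOMAIN_KEYWORDS` table, the keyword-overlap relevance test, and the function names `download`, `master_filter`, and `crawl` are all assumptions made for illustration; the abstract does not say how relatedness to a domain is decided.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real crawler would fetch pages over HTTP instead.
FAKE_WEB = {
    "http://a.example/start": ("cricket scores and match news",
                               ["http://a.example/bat", "http://a.example/stocks"]),
    "http://a.example/bat": ("cricket bat reviews", ["http://a.example/start"]),
    "http://a.example/stocks": ("stock market prices", ["http://a.example/bonds"]),
    "http://a.example/bonds": ("bond yields", []),
}

# Assumed relevance rule: a page belongs to a domain if it mentions a keyword.
DOMAIN_KEYWORDS = {"sports": {"cricket", "match", "football"}}


def download(url):
    """Stand-in for one downloader: return (url, text, links)."""
    text, links = FAKE_WEB.get(url, ("", []))
    return url, text, links


def master_filter(pages, domain):
    """Master Crawler role: keep only pages related to the domain."""
    keywords = DOMAIN_KEYWORDS[domain]
    return [(url, links) for url, text, links in pages
            if keywords & set(text.lower().split())]


def crawl(seeds, domain, workers=4):
    """Basic Crawler role: download in parallel, expand only related pages."""
    database = {domain: []}   # related links stored under the domain name
    seen = set(seeds)
    frontier = list(seeds)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier:
            # Multiple downloaders work on the current frontier in parallel.
            pages = list(pool.map(download, frontier))
            frontier = []
            for url, links in master_filter(pages, domain):
                database[domain].append(url)
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return database


if __name__ == "__main__":
    print(crawl(["http://a.example/start"], "sports"))
```

In this sketch the unrelated page `http://a.example/stocks` is downloaded once, rejected by the filter, and never stored or expanded, so `http://a.example/bonds` is never fetched at all. That is the storage and search-time saving the abstract refers to, shown here only on toy data.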

Authors and Affiliations

Sk. Nabi, Dr. Premchand

  • EP ID EP115027
  • DOI 10.14569/IJACSA.2013.040713

How To Cite

Sk. Nabi, Dr. Premchand (2013). Effective Performance of Information Retrieval by using Domain Based Crawler. International Journal of Advanced Computer Science & Applications, 4(7), 88-92. https://europub.co.uk/articles/-A-115027