Effective Performance of Information Retrieval by using Domain Based Crawler

Abstract

The World Wide Web continuously introduces new capabilities and attracts many people [1]. It consists of more than 60 billion pages online. Because of this explosion in size, information retrieval systems, or search engines, are being upgraded day by day so that information can be accessed effectively and efficiently. In this paper, we address the Domain Based Information Retrieval (DBIR) system. This system crawls the web and adds to the database only those links that are related to a specific domain; links that are not related to that domain are ignored. This saves Storage Space (SS) and Searching Time (ST), and as a result improves the performance of the system. DBIR is an extension of the Effective Performance of Web Crawler (EPOW) system [2] and has two crawler modules. The first is the Basic Crawler, which consists of multiple downloaders to implement a parallelization policy. The second is the Master Crawler, which filters the URLs sent by the Basic Crawler based on the domain and sends them back to the Basic Crawler so that it can extract the related links. All of these related links are stored together in the database under a unique domain name.
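
The two-module loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the Master Crawler decides that a URL belongs to a domain, so the keyword match in `master_filter` is an assumption, as are all names used here (`FAKE_WEB`, `fake_fetch`, `master_filter`, `crawl_domain`). To keep the sketch runnable without network access, the downloaders read from a small in-memory link graph instead of fetching real pages.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Set

# Hypothetical in-memory "web": each URL maps to the links found on that page.
FAKE_WEB: Dict[str, List[str]] = {
    "http://a.example/sports": [
        "http://a.example/sports/cricket",
        "http://a.example/recipes",
    ],
    "http://a.example/sports/cricket": ["http://b.example/cricket-scores"],
    "http://a.example/recipes": ["http://b.example/cake"],
    "http://b.example/cricket-scores": [],
    "http://b.example/cake": [],
}


def fake_fetch(url: str) -> List[str]:
    """Stand-in downloader: return the outgoing links of a page."""
    return FAKE_WEB.get(url, [])


def master_filter(urls: List[str], keywords: List[str]) -> List[str]:
    """Master Crawler role: keep only URLs judged relevant to the domain.

    Assumption: relevance is a case-insensitive keyword match on the URL.
    """
    return [u for u in urls if any(k in u.lower() for k in keywords)]


def crawl_domain(
    domain_name: str,
    seeds: List[str],
    keywords: List[str],
    fetch: Callable[[str], List[str]],
    workers: int = 4,
    max_rounds: int = 10,
) -> Dict[str, Set[str]]:
    """Crawl from the seeds and store relevant links under one domain name."""
    database: Dict[str, Set[str]] = {domain_name: set()}
    frontier = master_filter(seeds, keywords)
    seen: Set[str] = set(frontier)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_rounds):
            if not frontier:
                break
            database[domain_name].update(frontier)
            # Basic Crawler role: several downloaders work in parallel.
            link_lists = list(pool.map(fetch, frontier))
            candidates = {link for links in link_lists for link in links} - seen
            seen.update(candidates)
            # Only URLs that pass the Master Crawler go back for extraction.
            frontier = master_filter(sorted(candidates), keywords)
    return database


if __name__ == "__main__":
    db = crawl_domain(
        "sports", ["http://a.example/sports"], ["sports", "cricket"], fake_fetch
    )
    for url in sorted(db["sports"]):
        print(url)
```

In this toy graph the recipe page is rejected by the filter, so it is never stored and its outgoing links are never downloaded. That is the storage and search-time saving the abstract refers to, shown here only at toy scale.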

Authors and Affiliations

Sk. Nabi, Dr. Premchand

  • EP ID EP115027
  • DOI 10.14569/IJACSA.2013.040713

How To Cite

Sk. Nabi, Dr. Premchand (2013). Effective Performance of Information Retrieval by using Domain Based Crawler. International Journal of Advanced Computer Science & Applications, 4(7), 88-92. https://europub.co.uk/articles/-A-115027