Realizing Peer-to-Peer and Distributed Web Crawler 

Abstract

The tremendous growth of the World Wide Web has made tools such as search engines and information retrieval systems essential. In this dissertation, we propose a fully distributed, peer-to-peer architecture for web crawling. The main goal behind the development of such a system is to provide an efficient, easily implementable, and decentralized alternative for crawling, indexing, caching, and querying web pages. The main function of a web crawler is to visit a web page, extract all URLs from the page, parse the page for keywords, and then visit the extracted URLs recursively. We propose an architecture that can be easily implemented on a local (campus) network and that follows a fully distributed, peer-to-peer design. The architecture specifications, implementation details, requirements to be met, and an analysis of such a system are discussed.
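The crawl loop described in the abstract (visit a page, extract its URLs, parse it for keywords, then follow the extracted URLs) can be illustrated with a minimal single-node sketch. It is not the authors' implementation: the names `parse_page`, `crawl`, `fetch`, and `max_pages` are hypothetical, the recursion is replaced by an explicit queue, keywords are approximated as lowercase alphabetic words, and the distribution of work among peers is not modeled at all.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
import re


class LinkAndTextParser(HTMLParser):
    """Collects href targets of <a> tags and the text content of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


def parse_page(base_url, html):
    """Return (absolute_links, keywords) for one page (hypothetical helper)."""
    parser = LinkAndTextParser()
    parser.feed(html)
    # Resolve relative links against the page URL and drop #fragments.
    links = [urldefrag(urljoin(base_url, href)).url for href in parser.links]
    # Assumption: a "keyword" is any lowercase alphabetic word in the text.
    words = re.findall(r"[a-z]+", " ".join(parser.text_parts).lower())
    return links, set(words)


def crawl(seed, fetch, max_pages=100):
    """Visit pages reachable from seed; fetch(url) returns HTML or None."""
    index = {}            # url -> set of keywords found on that page
    queue = deque([seed])
    seen = {seed}         # avoids visiting the same URL twice
    while queue and len(index) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # unreachable or unknown page: skip it
            continue
        links, keywords = parse_page(url, html)
        index[url] = keywords
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index
```

Passing `fetch` as a parameter keeps the sketch testable without network access: a dictionary's `get` method can stand in for an HTTP client, for example `crawl("http://a.test/", pages.get)` where `pages` maps URLs to HTML strings.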

Authors and Affiliations

Anup A. Garje, Prof. Bhavesh Patel, Dr. B. B. Meshram

How To Cite

Anup A. Garje, Prof. Bhavesh Patel, Dr. B. B. Meshram (2012). Realizing Peer-to-Peer and Distributed Web Crawler. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 1(4), 353-357. https://europub.co.uk/articles/-A-136148