Web Crawler: A Review

Abstract

In a large distributed system like the Web, users find resources by following hypertext links from one document to another. When the system is small and its resources share the same fundamental purpose, users can find resources of interest with relative ease. However, now that the Web encompasses millions of sites with many different purposes, navigation is difficult. WebCrawler, the Web’s first comprehensive full-text search engine, is a tool that assists users in their Web navigation by automating the task of link traversal, creating a searchable index of the Web, and fulfilling searchers’ queries from the index. Conceptually, WebCrawler is a node in the Web graph that contains links to many sites on the Web, shortening the path between users and their destinations.
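The three tasks named in the abstract (link traversal, index creation, and query fulfilment) can be illustrated with a minimal sketch. It is not WebCrawler's actual implementation: the in-memory `PAGES` graph, the `.example` URLs, and the function names `crawl`, `build_index`, and `search` are all hypothetical, and breadth-first traversal with all-words matching is only one simple choice among many.

```python
from collections import deque

# Hypothetical in-memory "Web": URL -> (page text, outgoing links).
# A real crawler would fetch pages over HTTP; this is an assumption for illustration.
PAGES = {
    "a.example": ("web crawler review", ["b.example", "c.example"]),
    "b.example": ("search engine index", ["c.example"]),
    "c.example": ("full text search of the web", ["a.example"]),
    "d.example": ("unreachable page", []),
}


def crawl(seed, pages):
    """Follow links breadth-first from seed; return URLs in visit order."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        url = queue.popleft()
        if url not in pages:
            continue  # dangling link: nothing to fetch
        order.append(url)
        _, links = pages[url]
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


def build_index(urls, pages):
    """Build an inverted index mapping each word to the set of URLs containing it."""
    index = {}
    for url in urls:
        text, _ = pages[url]
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


visited = crawl("a.example", PAGES)
index = build_index(visited, PAGES)
```

In this sketch, `d.example` is never indexed because no crawled page links to it, which mirrors the point that a crawler discovers only what is reachable by links from its starting pages.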

Authors and Affiliations

Dhiraj Khurana, Satish Kumar

How To Cite

Dhiraj Khurana, Satish Kumar (2012). Web Crawler: A Review. International Journal of Computer Science and Management Studies (IJCSMS) www.ijcsms.com, 12(1), 401-405. https://europub.co.uk/articles/-A-130272