Web Crawler Used in Search Engine

Abstract

The World Wide Web (WWW) is a collection of billions of documents formatted using HTML. Web search engines are used to find desired information on the Web. When a user enters a query, the search is performed against the search engine's database of stored pages. The repository of a search engine is not large enough to accommodate every page available on the Web, so it is desirable to store only the most relevant pages, and a better approach is needed to select them. The software that traverses the Web to retrieve pages is called a “crawler” or “spider”. A specialized crawler, called a focused crawler, traverses the Web and selects pages relevant to a defined topic rather than exploring all regions of the Web. Such a crawler does not collect all web pages; it retrieves only the relevant ones. The major problem, therefore, is how to retrieve relevant, high-quality web pages.
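The focused-crawling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the keyword-overlap score, the threshold, the best-first ordering, and the in-memory toy web (TOY_WEB, with placeholder page names) are all assumptions made for illustration, and no real network access is performed.

```python
import heapq


def relevance(text, topic_terms):
    """Toy relevance score: fraction of topic terms present in the page text."""
    words = set(text.lower().split())
    return sum(1 for term in topic_terms if term in words) / len(topic_terms)


def focused_crawl(web, seeds, topic_terms, threshold=0.5, max_pages=10):
    """Best-first crawl over `web`, a dict mapping url -> (text, outgoing urls).

    Pages scoring below `threshold` are neither stored nor expanded, so the
    crawl stays close to the topic instead of exploring every region.
    """
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    stored = []
    fetched = 0
    while frontier and fetched < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in web:
            continue  # link with no known target in the toy web
        text, links = web[url]
        fetched += 1
        score = relevance(text, topic_terms)
        if score < threshold:
            continue  # irrelevant page: skip it and do not follow its links
        stored.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the score of the page it was found on.
                heapq.heappush(frontier, (-score, link))
    return stored


# Hypothetical five-page web used only for this illustration.
TOY_WEB = {
    "a": ("web crawler search engine basics", ["b", "c"]),
    "b": ("cooking recipes pasta", ["d"]),
    "c": ("focused crawler for search engine index", ["d", "e"]),
    "d": ("football scores", []),
    "e": ("search engine crawler frontier", []),
}
TOPIC = ["crawler", "search", "engine"]

if __name__ == "__main__":
    for url, score in focused_crawl(TOY_WEB, ["a"], TOPIC):
        print(url, score)
```

In this toy run only pages "a", "c", and "e" are stored; "b" and "d" are fetched, scored as off-topic, and dropped. A practical crawler would replace the dictionary lookup with HTTP fetching and HTML link extraction, and would likely use a stronger relevance measure than keyword overlap.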

Authors and Affiliations

Harshali Kshirsagar, Pratibha Rewaskar, Komal Ramteke

How To Cite

Harshali Kshirsagar, Pratibha Rewaskar, Komal Ramteke (2015). Web Crawler Used in Search Engine. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 3(4), -. https://europub.co.uk/articles/-A-20109