Web Crawler Used in Search Engine

Abstract

The World Wide Web (WWW) is a collection of billions of documents formatted using HTML. Web search engines are used to find desired information on the Web. A search engine keeps a repository (database) of downloaded pages, and whenever a user submits a query, the search is performed against that database. The repository is not large enough to accommodate every page available on the Web, so only the most relevant pages should be stored in it. Storing only those pages requires a better approach to collecting them. The software that traverses the Web to fetch pages is called a “crawler” or “spider”. A specialized crawler, called a focused crawler, traverses the Web and selects pages relevant to a defined topic rather than exploring all regions of the Web. It does not collect every web page; it retrieves only the relevant ones. The major problem is therefore how to retrieve relevant, high-quality web pages.
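The focused-crawling idea described above can be illustrated with a minimal sketch. It is not the authors' method: the keyword-overlap scoring rule, the `threshold` and `max_pages` parameters, the function names, and the in-memory `pages` dictionary (which stands in for real HTTP fetches) are all assumptions made for illustration. The sketch uses a best-first strategy, in which links found on a relevant page are visited before links found on an irrelevant one, and only pages scoring at or above the threshold are stored.

```python
import heapq


def relevance(text, topic_terms):
    """Fraction of topic terms that occur in the page text (assumed, crude scoring rule)."""
    words = set(text.lower().split())
    return sum(1 for term in topic_terms if term in words) / len(topic_terms)


def focused_crawl(pages, seed, topic_terms, threshold=0.5, max_pages=10):
    """Best-first crawl over a hypothetical in-memory web.

    pages maps a URL to (text, outgoing_links) and replaces real downloads.
    Returns the URLs of the stored (relevant) pages in visiting order.
    """
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    stored = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in pages:
            continue  # dead link: nothing to fetch
        visited += 1
        text, links = pages[url]
        score = relevance(text, topic_terms)
        if score >= threshold:
            stored.append(url)  # only relevant pages enter the repository
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the score of the page it was found on.
                heapq.heappush(frontier, (-score, link))
    return stored


# Hypothetical five-page web: "b" and "d" are off-topic.
pages = {
    "a": ("web crawler search engine overview", ["b", "c"]),
    "b": ("cooking recipes and kitchen tips", ["d"]),
    "c": ("focused crawler for search engine indexing", ["e"]),
    "d": ("more recipes", []),
    "e": ("crawler frontier and search ranking", []),
}
result = focused_crawl(pages, "a", ["crawler", "search"])
print(result)
```

In this toy web the crawler stores pages "a", "c", and "e" and discards the two off-topic pages; the link to "d", found on an irrelevant page, is visited last. A real crawler would additionally need fetching, HTML parsing, politeness rules, and a stronger relevance measure.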

Authors and Affiliations

Harshali Kshirsagar, Pratibha Rewaskar, Komal Ramteke

How To Cite

Harshali Kshirsagar, Pratibha Rewaskar, Komal Ramteke (2015). Web Crawler Used in Search Engine. International Journal for Research in Applied Science and Engineering Technology (IJRASET), 3(4), -. https://europub.co.uk/articles/-A-20109