Topic-specific Web Crawler using Probability Method

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2013, Vol 13, Issue 1

Abstract

The web has become an integral part of our lives, and search engines play an important role in helping users find online content on a specific topic. The web is a huge and highly dynamic environment that is growing exponentially in content and developing fast in structure. No search engine can cover the whole web, so each has to focus on the most valuable pages for crawling. Many methods based on link and text content analysis have been developed for retrieving pages. A topic-specific web crawler collects from the web the pages relevant to the user's topics of interest. In this paper, we present an algorithm that analyzes link and text content using Levenshtein distance and a probability method to fetch a larger number of relevant pages for the topic specified by the user. Evaluation illustrates that the proposed web crawler collects the best web pages matching user interests during the early period of crawling.
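The abstract names Levenshtein distance as one ingredient of the relevance analysis but does not say how it is applied. The sketch below is only an illustration of the general idea, not the paper's algorithm: it computes the standard edit distance and uses it to rank link anchor texts against a topic string. The helper names `similarity` and `rank_links`, the length normalization, and the sample anchor texts are all assumptions made for this example; the probability method is not shown because the abstract does not describe it.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard dynamic-programming edit distance
    (insertions, deletions, and substitutions each cost 1)."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical helper: case-insensitive similarity in [0, 1],
    obtained by normalizing the distance by the longer length."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def rank_links(topic: str, anchor_texts: list[str]) -> list[str]:
    """Hypothetical helper: order anchor texts from most to least
    similar to the topic, as a crawler frontier might prioritize them."""
    return sorted(anchor_texts, key=lambda t: similarity(topic, t), reverse=True)


if __name__ == "__main__":
    anchors = ["cooking recipes", "web crawlers", "web crawler"]
    for text in rank_links("web crawler", anchors):
        print(f"{similarity('web crawler', text):.3f}  {text}")
```

Normalizing by the longer string's length is one common way to make distances comparable across anchor texts of different lengths; a crawler following the paper's approach would combine such a score with its probability estimate, the details of which are in the full text.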

Authors and Affiliations

S Subatra Devi

How To Cite

S Subatra Devi (2013). Topic-specific Web Crawler using Probability Method. IOSR Journals (IOSR Journal of Computer Engineering), 13(1), 102-106. https://europub.co.uk/articles/-A-104170