Improved Focused Crawler Using Inverted WAH Bitmap Index
Journal Title: International Journal of Advanced Research in Computer Engineering & Technology (IJARCET) - Year 2012, Vol 1, Issue 4
Abstract
Focused crawlers are programs that traverse the web by following hyperlinks and retrieve pages on a specific topic. Traditional web crawlers do not retrieve relevant pages effectively. A focused crawler is a special-purpose search engine that selectively seeks out relevant pages: it does not need to collect all web pages, only the relevant ones. The major problem is therefore how to retrieve the maximal set of relevant, high-quality pages. To address this problem, we have designed an Interactive Focused Crawler that calculates the relevancy of a web page. It calculates a URL score to decide whether a URL is relevant to a specific topic. The Interactive Focused Crawler proceeds by gathering pages related to the seed set using techniques such as keyword extraction, search engine queries, and link neighbourhood expansion. The collected pages are then presented to the user in a ranked order that facilitates quick elimination of negatives. The user provides feedback, which helps the baseline classifier to be progressively induced using active learning techniques. Once the classifier is in place, the crawler can be started on its task of resource discovery.
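The abstract does not give the formula behind the URL score, so the following is only an illustrative sketch of how such a score could be computed and used to rank candidate links. It assumes a simple keyword-overlap measure over the URL and its anchor text; the names `url_score`, `rank_candidates`, and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
import re


def url_score(url, anchor_text, topic_keywords):
    """Fraction of topic keywords that occur in the URL or its anchor text.

    Assumption: keyword overlap stands in for the paper's unspecified score.
    """
    tokens = set(re.findall(r"[a-z0-9]+", (url + " " + anchor_text).lower()))
    keywords = {k.lower() for k in topic_keywords}
    if not keywords:
        return 0.0
    return len(tokens & keywords) / len(keywords)


def rank_candidates(candidates, topic_keywords, threshold=0.0):
    """Score (url, anchor_text) pairs and return those above the threshold.

    Results are (score, url) pairs, highest score first; ties are broken
    alphabetically by URL so the ordering is deterministic.
    """
    scored = [(url_score(u, a, topic_keywords), u) for u, a in candidates]
    kept = [(s, u) for s, u in scored if s > threshold]
    return sorted(kept, key=lambda pair: (-pair[0], pair[1]))
```

In a crawler built along these lines, the ranked list would be the order in which pages are shown to the user for feedback, and links scoring at or below the threshold would simply not be fetched.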
Authors and Affiliations
Sanjay Kumar Singh, Sonu Agrawal