A New Hidden Web Crawling Approach

Abstract

Traditional search engines deal with the Surface Web, the set of Web pages directly accessible through hyperlinks, and ignore a large part of the Web called the hidden Web: the great amount of valuable information in online databases that is “hidden” behind query forms. To access this information, a crawler has to fill in the forms with valid data. For this reason, we propose a new approach that uses the SQLI technique to find the most promising keywords of a specific domain for automatic form submission. The effectiveness of the proposed framework has been evaluated through experiments on real web sites, and encouraging preliminary results were obtained.
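
As a rough illustration of what keyword-driven form submission involves, the following Python sketch ranks candidate keywords by how many previously unseen records each one retrieves and builds the GET URL a crawler would request. It does not reproduce the SQLI step of the proposed approach; the greedy coverage heuristic is a swapped-in stand-in, and the toy records, `query`, `build_submission_url`, `select_keywords`, and the example URL are all hypothetical names assumed for illustration.

```python
from urllib.parse import urlencode

# Toy stand-in for a hidden-web database: record id -> text (hypothetical data).
RECORDS = {
    1: "used toyota sedan",
    2: "used honda sedan",
    3: "new toyota truck",
    4: "new ford truck",
}


def query(keyword):
    """Simulate one form submission: ids of records containing the keyword."""
    return {rid for rid, text in RECORDS.items() if keyword in text.split()}


def build_submission_url(action_url, field, keyword):
    """Build the GET URL a crawler would request to submit one keyword."""
    return action_url + "?" + urlencode({field: keyword})


def select_keywords(candidates, budget):
    """Greedily pick up to `budget` keywords, each adding the most unseen records."""
    covered, chosen = set(), []
    remaining = list(candidates)
    for _ in range(budget):
        # Keyword whose result set adds the most records not yet covered.
        best = max(remaining, key=lambda k: len(query(k) - covered), default=None)
        if best is None or not (query(best) - covered):
            break  # nothing left that would retrieve new records
        chosen.append(best)
        covered |= query(best)
        remaining.remove(best)
    return chosen, covered
```

For example, `select_keywords(["used", "truck", "ford"], 2)` picks the two keywords that together retrieve every record in the toy database. A real crawler would replace `query` with an actual HTTP request to the form's action URL and parse the result page.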

Authors and Affiliations

L. Saoudi, A. Boukerram, S. Mhamedi

  • EP ID EP106530
  • DOI 10.14569/IJACSA.2015.061039

How To Cite

L. Saoudi, A. Boukerram, S. Mhamedi (2015). A New Hidden Web Crawling Approach. International Journal of Advanced Computer Science & Applications, 6(10), 293-297. https://europub.co.uk/articles/-A-106530