I-ViDE: An Improved Vision-Based Approach for Deep Web Data Extraction
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 4
Abstract
Deep Web content is accessed through queries submitted to Web databases, and the returned data records are wrapped in dynamically generated Web pages (called deep Web pages in this paper). Extracting structured data from deep Web pages is a challenging problem because of the intricate underlying structure of such pages. A large number of techniques have been proposed to address this problem, but all of them have inherent limitations because they depend on the HTML language: visual features are not taken into consideration, and previous methods depend mostly on table tags. A vision-based approach to Web data extraction overcomes these limitations by exploiting common visual features of Web pages. This approach still has one drawback: it can process only Web pages that contain a single data region, which reduces precision and recall. Precision is the fraction of extracted data records that are correct, and recall is the fraction of the relevant data records on the page that are extracted. The proposed Improved-ViDE (I-ViDE) approach handles multiple data regions in deep Web pages, which can improve both precision and recall.
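The two evaluation measures named in the abstract can be made concrete with a small sketch. This is not the authors' evaluation code: the function name `precision_recall`, the record identifiers, and the two-region page are hypothetical, and the definitions used are the standard ones (correct extractions divided by all extractions, and correct extractions divided by all relevant records).

```python
def precision_recall(extracted, relevant):
    """Return (precision, recall) for a collection of extracted records.

    Standard definitions (assumed here, not taken from the paper's code):
      precision = correct extracted / all extracted
      recall    = correct extracted / all relevant records on the page
    """
    extracted, relevant = set(extracted), set(relevant)
    correct = extracted & relevant
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical page with two data regions of three records each.
relevant = {"r1", "r2", "r3", "r4", "r5", "r6"}

# An extractor that only sees the first data region misses r4..r6.
single_region = {"r1", "r2", "r3"}

# An extractor that covers both regions, with one miss and one wrong record.
multi_region = {"r1", "r2", "r3", "r4", "r5", "x1"}

print(precision_recall(single_region, relevant))  # (1.0, 0.5)
print(precision_recall(multi_region, relevant))
```

In this made-up case the single-region extractor recovers only half of the relevant records, which illustrates why handling multiple data regions matters most for recall.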
Authors and Affiliations
Mrudula Varade , Vimla Jethani
A Survey on Enhancing Routing using NCPR in Mobile Ad hoc Networks
Abstract: A Mobile Ad hoc Network (MANET) is a collection of wireless mobile hosts (or nodes) that are free to move in any direction at any speed. This nature of MANETs leads to periodic link breakage, which in turn...
Enhancement Caesar Cipher for Better Security
Abstract: Cryptography is the art and science of converting an original message into a non-readable form. With the fast progression of electronic data exchange, information security is becoming much more important...
Analysis and evaluation of probabilistic routing protocol for intermittently connected network
An intermittently connected network, often referred to as a Delay/Disruption Tolerant Network, is an infrastructure-less network that suffers from intermittent connectivity, i.e. a connected path from the source to the de...
Enhancement in Weighted PageRank Algorithm Using VOL
There are billions of web pages available on the World Wide Web (WWW). A user's query therefore returns many search results, of which only some are relevant. The relevancy of a web page is cal...
Web Penetration Testing using Nessus and Metasploit Tool
Abstract: Web Penetration Testing is a tool that is widely used to see how a website reacts when a vulnerability attack is carried out. Nowadays many ethical hackers use web penetration tools to predict the vulner...