Content Evocation Using Web Scraping and Semantic Illustration
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 3
Abstract
Web scraping is the process of automatically collecting information from the World Wide Web. It is a field under active development that shares a common goal with the semantic web vision, an ambitious initiative that still requires breakthroughs in text processing, artificial intelligence, and human-computer interaction. This work concerns the extraction of content from different web pages using web scraping and semantic illustration. Web scraping extracts content from HTML pages and is related to web indexing. A commonly used measure of tree similarity is the tree edit distance, which can easily be extended to measure how well a pattern can be matched in a tree. An obstacle for this approach is its time complexity, so we consider whether faster algorithms for constrained tree edit distances are usable for web scraping, and how to reduce the size of the tree representing the web page. Several forms of web scraping are used in the current market to obtain the best output, such as web data extraction, data collection, and screen scraping. Many different algorithms are used for web scraping, such as “tree pattern matching”, “tree mapping”, and “approximate tree matching”, but in general the “tree edit distance” algorithm is used. With this algorithm, however, problems of incorrect data, low efficiency, and high time complexity have been observed. In this research I focus on improving the performance of the tree edit distance computation, and in particular on the upper bound of the algorithm's time complexity.
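To make the tree edit distance mentioned above concrete, the following is a minimal sketch in Python. It is not the algorithm proposed in the paper. It is the plain memoized recursion for ordered, labeled trees with unit costs for insert, delete, and relabel. The tuple-based tree representation and the names `size`, `forest_distance`, and `tree_edit_distance` are assumptions made only for illustration.

```python
from functools import lru_cache

# Assumed representation: a tree is (label, children), where children is a
# tuple of trees. A forest is a tuple of trees. Tuples are hashable, so
# lru_cache can memoize subproblems.

def size(forest):
    """Number of nodes in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests."""
    if not f1 and not f2:
        return 0
    if not f1:
        return size(f2)          # insert every node of f2
    if not f2:
        return size(f1)          # delete every node of f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    # Delete the rightmost root of f1; its children take its place.
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    # Insert the rightmost root of f2; its children take its place.
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    # Match the two rightmost roots, relabeling if the labels differ.
    match = (forest_distance(kids1, kids2)
             + forest_distance(f1[:-1], f2[:-1])
             + (label1 != label2))
    return min(delete, insert, match)

def tree_edit_distance(t1, t2):
    """Unit-cost edit distance between two ordered labeled trees."""
    return forest_distance((t1,), (t2,))

# Hypothetical DOM fragments, written by hand rather than parsed from HTML.
page = ("div", (("p", ()), ("a", ())))
relabeled = ("div", (("p", ()), ("span", ())))
wrapped = ("div", (("ul", (("p", ()), ("a", ()))),))

print(tree_edit_distance(page, relabeled))
print(tree_edit_distance(page, wrapped))
```

The recursion considers every pair of sub-forests it reaches, which illustrates why running time becomes an obstacle on full web pages and why constrained variants and smaller trees are of interest.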
Authors and Affiliations
Vasani Krunal A