Content Evocation Using Web Scraping and Semantic Illustration

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 3

Abstract

  Web scraping is the process of automatically collecting information from the World Wide Web. It is a field under active development that shares a common goal with the semantic web vision, an ambitious initiative that still requires breakthroughs in text processing, artificial intelligence and human-computer interaction. Content evocation here means extracting content from different web pages using web scraping and semantic illustration. Web scraping extracts content from HTML pages and is related to web indexing. A commonly used measure of tree similarity is the tree edit distance, which can easily be extended to measure how well a pattern can be matched in a tree. An obstacle to this approach is its time complexity, so we consider whether faster algorithms for constrained tree edit distances are usable for web scraping, and how to reduce the size of the tree representing the web page. Applications of web scraping currently in use include web data extraction, data collection and screen scraping. Many different algorithms are used for web scraping, such as “tree pattern matching”, “tree mapping” and “approximate tree matching”, but in general the “tree edit distance” algorithm is used. With this algorithm, however, problems of incorrect data, low efficiency and high time complexity have been observed. This research focuses on improving the performance of the tree edit distance algorithm, and in particular on the upper bound of its time complexity.
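
To make the tree edit distance mentioned in the abstract concrete, the following is a minimal sketch of the textbook recursion for ordered, labeled trees. It is not the paper's algorithm. The unit cost for every insert, delete and relabel, the helper names (`node`, `size`, `forest_distance`, `tree_edit_distance`) and the two toy page trees are all assumptions made for illustration.

```python
from functools import lru_cache

# A tree is a hashable pair (label, children); a forest is a tuple of trees.
# node() is a hypothetical helper, not taken from the paper.
def node(label, *children):
    return (label, tuple(children))

def size(forest):
    """Count every node in a forest."""
    return sum(1 + size(children) for _, children in forest)

@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Edit distance between two ordered forests (assumed unit costs)."""
    if not f:
        return size(g)          # insert everything that is left in g
    if not g:
        return size(f)          # delete everything that is left in f
    (v_label, v_children) = f[-1]   # rightmost root of f
    (w_label, w_children) = g[-1]   # rightmost root of g
    # Delete v: its children are promoted into the forest.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert w: the symmetric case on the other forest.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Map v onto w: match their children, then the remaining forests.
    relabel = (forest_distance(v_children, w_children)
               + forest_distance(f[:-1], g[:-1])
               + (0 if v_label == w_label else 1))
    return min(delete, insert, relabel)

def tree_edit_distance(t1, t2):
    return forest_distance((t1,), (t2,))

# Two toy DOM-like trees (made up for this example).
page_a = node("html", node("body",
                           node("div", node("p")),
                           node("ul", node("li"), node("li"))))
page_b = node("html", node("body",
                           node("p"),
                           node("ul", node("li"))))

print(tree_edit_distance(page_a, page_b))  # prints 2: drop the div and one li
```

Memoizing on whole subforests keeps the sketch short but does nothing for the running time that the abstract identifies as the obstacle. Restricting which mappings are allowed (a constrained distance) or pruning the page tree before comparison are the kinds of speed-ups the abstract refers to.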

Authors and Affiliations

Vasani Krunal A

  • EP ID EP116095
  • DOI 10.9790/0661-16395460

How To Cite

Vasani Krunal A (2014). Content Evocation Using Web Scraping and Semantic Illustration. IOSR Journals (IOSR Journal of Computer Engineering), 16(3), 54-60. https://europub.co.uk/articles/-A-116095