A Review of Various Techniques of Web Content Mining For HTML and XML Contents
Journal Title: International Journal of Research in Computer and Communication Technology - Year 2014, Vol 3, Issue 6
Abstract
The World Wide Web is the largest source of information. Most of the data on the web is dynamic and unstructured, which makes it increasingly difficult to retrieve relevant data. Data mining is the field of computer science concerned with extracting knowledge from very large amounts of data. Web mining is the application of data mining techniques to web data in order to extract useful knowledge from it. This paper presents an overview of various techniques that have been used for web content mining, covering images, audio, video and semi-structured contents such as HTML and XML. Since HTML has many limitations, such as a limited set of tags, case insensitivity and a design aimed only at displaying data, web developers have started to build web pages with emerging web technologies such as XML and Flash. XML was designed to describe data and to focus on what the data is. XML also plays the role of a meta-language, allowing document authors to create customized markup languages for a limitless variety of document types, which has made it a standard data format for online data exchange.
Authors and Affiliations
Rupinder Kaur, Kamaljit Kaur