A Review of Various Techniques of Web Content Mining For HTML and XML Contents
Journal Title: International Journal of Research in Computer and Communication Technology - Year 2014, Vol 3, Issue 6
Abstract
The World Wide Web is the largest source of information. Most data on the web is dynamic and unstructured, which makes it increasingly difficult to retrieve relevant data. Data mining is the field of computer science concerned with extracting knowledge from very large amounts of data. Web mining is the application of data mining techniques to extract useful knowledge from web data. This paper presents an overview of the techniques that have been used for web content mining, covering images, audio, video, and semi-structured content such as HTML and XML. Because HTML has many limitations, such as a limited set of tags, case insensitivity, and a design intended only for displaying data, web developers have started to build web pages with emerging web technologies such as XML and Flash. XML was designed to describe data and to focus on what the data is. XML also serves as a meta-language that allows document authors to create customized markup languages for a limitless variety of document types, making it a standard data format for online data exchange.
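The contrast between display-oriented HTML and data-describing XML can be illustrated with a minimal sketch. It is not taken from the paper: the two sample documents, the element names (`catalog`, `book`, `title`, `price`), and the helper names `mine_html` and `mine_xml` are all assumptions made for illustration, and only the Python standard library is used.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample documents, invented for illustration only.
HTML_DOC = "<html><body><h1>Catalog</h1><p><b>Data Mining</b> - 450</p></body></html>"
XML_DOC = "<catalog><book><title>Data Mining</title><price>450</price></book></catalog>"


class TextCollector(HTMLParser):
    """Collects visible text; HTML tags describe display, not meaning."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Keep only non-empty text fragments found between tags.
        text = data.strip()
        if text:
            self.chunks.append(text)


def mine_html(doc):
    """Return the text fragments of an HTML document, without field names."""
    parser = TextCollector()
    parser.feed(doc)
    parser.close()
    return parser.chunks


def mine_xml(doc):
    """Return one record per <book>, using the tag names as field names."""
    root = ET.fromstring(doc)
    return [
        {"title": book.findtext("title"), "price": book.findtext("price")}
        for book in root.iter("book")
    ]


if __name__ == "__main__":
    print(mine_html(HTML_DOC))
    print(mine_xml(XML_DOC))
```

In this sketch the HTML side yields only loose text fragments whose meaning must be inferred afterwards, whereas the XML side yields labelled fields directly, because the author-defined tags state what each value is.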
Authors and Affiliations
Rupinder Kaur, Kamaljit Kaur