Semi-Structured Data Structured Data Conversion Using Data Mining Methods
Journal Title: International Journal of Emerging Trends in Science and Technology - Year 2017, Vol 4, Issue 10
Abstract
Emerging technologies for semi-structured data have attracted wide attention in areas such as networks, e-commerce, information retrieval, and databases. In these applications, the data are modeled not as static collections but as transient data streams, where the data source is an unbounded stream of individual data items. It is becoming increasingly common to send heterogeneous and ill-structured data through networks. Since traditional database technologies are not directly applicable to such data streams, it is important to study efficient information extraction methods for semi-structured data. Hence there is increasing demand for automatic methods that extract useful information, in particular methods that discover rules or patterns in large collections of semi-structured data; this task is known as semi-structured data mining. We introduce a class of simple combinatorial patterns over texts, such as proximity phrase association patterns and ordered and unordered tree patterns, which model unstructured texts and semi-structured data on the Web. In addition, we consider the problem of finding the patterns that optimize a given statistical measure within the whole class of patterns in a large collection of unstructured texts. For these classes of patterns, we develop fast and robust text mining algorithms based on techniques from computational geometry, string matching, and combinatorial optimization. We implemented the resulting text and semi-structured data mining algorithms and ran experiments on interactive document browsing in a large text database and on keyword and common structure discovery from the Web.
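As an illustration only, the idea of a proximity phrase association pattern can be sketched in a few lines of Python. The abstract does not define the pattern class, so the sketch assumes one plausible reading: a pattern is an ordered list of phrases plus a proximity bound k, and it matches a text when the phrases occur in order with at most k words between the end of one phrase and the start of the next. The function names (`phrase_positions`, `matches`, `support`), the whitespace tokenization, and the use of document support as the statistical measure are all assumptions made for this example, not the authors' algorithm.

```python
from typing import List, Sequence


def phrase_positions(words: Sequence[str], phrase: Sequence[str]) -> List[int]:
    """Return every index at which `phrase` starts in `words`."""
    n, m = len(words), len(phrase)
    return [i for i in range(n - m + 1) if list(words[i:i + m]) == list(phrase)]


def matches(words: Sequence[str], phrases: Sequence[Sequence[str]], k: int) -> bool:
    """Assumed semantics: phrases occur in order, and each phrase starts
    at most k words after the previous phrase ends."""
    ends = None  # positions just past each valid occurrence of the previous phrase
    for phrase in phrases:
        starts = phrase_positions(words, phrase)
        if ends is not None:
            # Keep only occurrences close enough to some earlier occurrence.
            starts = [s for s in starts if any(0 <= s - e <= k for e in ends)]
        if not starts:
            return False
        ends = [s + len(phrase) for s in starts]
    return True


def support(docs: Sequence[str], phrases: Sequence[Sequence[str]], k: int) -> float:
    """Fraction of documents matched by the pattern (a simple stand-in
    for the statistical measure mentioned in the abstract)."""
    tokenized = [d.lower().split() for d in docs]
    hits = sum(matches(w, phrases, k) for w in tokenized)
    return hits / len(docs)


docs = [
    "data mining finds patterns in semi-structured data",
    "semi-structured data needs new mining methods",
    "traditional databases store structured records",
]
pattern = [("semi-structured", "data"), ("mining",)]
print(support(docs, pattern, k=3))
```

This brute-force matcher scans every document for every phrase, which is fine for a toy corpus; the fast algorithms the abstract refers to rely on string matching and computational geometry techniques that are not reproduced here.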
Authors and Affiliations
B. Suchitra