Semi-Structured Data Structured Data Conversion Using Data Mining Methods
Journal Title: International Journal of Emerging Trends in Science and Technology - Year 2017, Vol 4, Issue 10
Abstract
Emerging semi-structured data technologies have attracted wide attention in areas such as networks, e-commerce, information retrieval, and databases. In these applications, the data are modeled not as static collections but as transient data streams, where the data source is an unbounded stream of individual data items. It is becoming increasingly common to send heterogeneous and ill-structured data through networks. Since traditional database technologies are not directly applicable to such data streams, it is important to study efficient information extraction methods for semi-structured data. Hence there has been increasing demand for automatic methods for extracting useful information, particularly for discovering rules or patterns from large collections of semi-structured data, namely semi-structured data mining. We introduce a class of simple combinatorial patterns over texts, such as proximity phrase association patterns and ordered and unordered tree patterns, that model unstructured texts and semi-structured data on the Web. In addition, we consider the problem of finding the patterns that optimize a given statistical measure within the whole class of patterns in a large collection of unstructured texts. For these classes of patterns, we develop fast and robust text mining algorithms based on techniques from computational geometry, string matching, and combinatorial optimization. We implemented the developed text and semi-structured data mining algorithms and ran experiments on interactive document browsing in a large text database and on keyword and common structure discovery from the Web.
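To make the idea of an optimized pattern more concrete, the following is a minimal sketch, not the algorithm developed in the paper. It assumes a simplified two-word proximity pattern (the second word must occur at most k tokens after the first) and uses the difference in match rates between two document collections as a stand-in for the statistical measure. The function names `matches` and `best_pattern` are hypothetical, and the search is brute force rather than the fast methods the abstract refers to.

```python
from itertools import product


def matches(tokens, first, second, k):
    """Return True if `second` occurs at most k tokens after some `first`."""
    for i, tok in enumerate(tokens):
        if tok == first and second in tokens[i + 1 : i + 1 + k]:
            return True
    return False


def best_pattern(positive, negative, k):
    """Brute-force search over word pairs drawn from the positive documents.

    Assumption: the score is the match rate on `positive` minus the match
    rate on `negative`; any other statistical measure could be swapped in.
    Both collections must be non-empty lists of token lists.
    """
    vocab = sorted({tok for doc in positive for tok in doc})
    best, best_score = None, float("-inf")
    for first, second in product(vocab, repeat=2):
        pos_rate = sum(matches(d, first, second, k) for d in positive) / len(positive)
        neg_rate = sum(matches(d, first, second, k) for d in negative) / len(negative)
        score = pos_rate - neg_rate
        if score > best_score:  # strict comparison keeps the first best pair
            best, best_score = (first, second), score
    return best, best_score
```

The exhaustive loop is quadratic in the vocabulary size, which is only workable for small collections; it is meant to show what is being optimized, not how to do it efficiently.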
Authors and Affiliations
B. Suchitra