An Effective Solution to Adequate and Operative Duplicate Detection in Stratified Data
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2014, Vol 16, Issue 2
Abstract
Data mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. Although there is a long line of work on identifying duplicates in relational data, only a few solutions focus on duplicate detection in more complex hierarchical structures, such as XML data. This paper presents a novel method for XML duplicate detection, called XMLDup. XMLDup uses a Bayesian network to determine the probability of two XML elements being duplicates, considering not only the information within the elements but also the way that information is structured. In addition, to improve the efficiency of the network evaluation, a novel pruning strategy is presented, capable of significant gains over the unoptimized version of the algorithm. Experiments show that the Bayesian network approach achieves high precision and recall scores on several data sets. XMLDup also outperforms another state-of-the-art duplicate detection solution, both in efficiency (by up to 80%) and in effectiveness.
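The idea of scoring two XML elements by both their values and their structure, and of abandoning a comparison early, can be illustrated with a minimal sketch. This is not the XMLDup network itself: the function names `text_similarity` and `duplicate_probability`, the use of `difflib` string similarity as a stand-in for per-value probabilities, the product and average used to combine scores, and the treatment of missing children as neutral are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher


def text_similarity(a, b):
    """String similarity in [0, 1]; a stand-in for a per-value probability."""
    a = (a or "").strip().lower()
    b = (b or "").strip().lower()
    return SequenceMatcher(None, a, b).ratio()


def duplicate_probability(e1, e2, threshold=0.0):
    """Hypothetical, simplified duplicate score for two XML elements.

    The score is a product of factors in [0, 1], so it can only shrink.
    Once it falls below `threshold` it cannot recover, and the comparison
    is pruned by returning 0.0 immediately.
    """
    if e1.tag != e2.tag:
        return 0.0
    score = 1.0
    # Value part: compare the elements' own text when either has any.
    if (e1.text or "").strip() or (e2.text or "").strip():
        score *= text_similarity(e1.text, e2.text)
        if score < threshold:
            return 0.0  # pruned
    # Structure part: group children by tag; each child takes its best match.
    tags = {child.tag for child in e1} | {child.tag for child in e2}
    for tag in sorted(tags):
        left, right = e1.findall(tag), e2.findall(tag)
        if not left or not right:
            continue  # assumption: missing data is treated as neutral
        best = [max(duplicate_probability(a, b) for b in right) for a in left]
        score *= sum(best) / len(best)
        if score < threshold:
            return 0.0  # pruned
    return score
```

Calling `duplicate_probability(a, b, threshold=0.9)` on two parsed elements returns a score in [0, 1], or 0.0 as soon as the running score can no longer reach the threshold, which is where the efficiency gain of pruning would come from in this simplified setting.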
Authors and Affiliations
A. Baladhandayutham, S. Roselin Mary
Content Based Image Retrieval for Unlabelled Images
Recently, content-based image retrieval has become a hot topic, and its techniques have developed considerably. Content-based image retrieval systems were introduced to...
Energetic Hybrid Routing Protocol
Abstract: Sensor networks are characterized by limited capacity, especially in energy, since the components that make up the network, namely the sensors, are powered by batteries, which influences th...
Brief Study of Performance of Routing Protocols for Mobile Ad Hoc Networking in Ns-2
Ad hoc networking allows portable devices to establish communication independent of a central infrastructure. However, the fact that there is no central infrastructure and that the devices can move randomly gives...
Advanced Client Repudiation Diverge Auditor in Public Cloud
Abstract: With data storage and sharing services provided by the cloud, people work together by sharing data as a group. After a group is created and data is shared in the cloud, a user in the group is able to access and modifi...
The Application of Model Predictive Control (MPC) to Fast Systems such as Autonomous Ground Vehicles (AGV)
Abstract: This paper investigates the application of Model Predictive Control (MPC) to fast systems such as Autonomous Ground Vehicles (AGVs) or mobile robots. The control of autonomous ground vehicles is chal...