Duplicate Detection in Hierarchical Data Using XPath
Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 6
Abstract
Many techniques exist for identifying duplicates in relational data, but only a few solutions address duplicates in data with a complex hierarchical structure, such as XML. In this paper, we present a new technique for identifying XML duplicates, called XML duplication using XPath. The technique uses a Bayesian network to determine the probability that two XML elements are duplicates, based on the information within the elements and on how that information is organized in the XML. In addition, to improve the efficiency of web usage, a new pruning strategy was created. This pruning strategy yields substantial gains over the algorithm without pruning. The technique can be used to identify duplicates more efficiently and remove them, so that no duplicate records remain. In experiments on several XML datasets, our algorithm achieves high accuracy and recall. XML duplication using XPath also outperforms another duplicate detection technique in both efficiency and effectiveness.
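The idea of scoring two XML elements as likely duplicates, with pruning to skip hopeless pairs early, can be illustrated with a small sketch. This is not the paper's Bayesian network: it assumes a much simpler model in which the duplicate score is the product of per-field string similarities. The sample document, the `FIELDS` paths, the function names, and the 0.5 threshold are all hypothetical choices made for illustration. Paths use the XPath subset supported by Python's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Hypothetical sample data: the first two records describe the same movie.
DOC = """<db>
  <movie><title>The Matrix</title><year>1999</year>
    <cast><actor>Keanu Reeves</actor></cast></movie>
  <movie><title>The Matrx</title><year>1999</year>
    <cast><actor>Keanu Reeves</actor></cast></movie>
  <movie><title>Titanic</title><year>1997</year>
    <cast><actor>Leonardo DiCaprio</actor></cast></movie>
</db>"""

# Hypothetical fields to compare, given as paths relative to each record.
FIELDS = ["title", "year", "cast/actor"]


def text_at(elem, path):
    """Return the normalized text of the first node matching path, or ''."""
    return (elem.findtext(path) or "").strip().lower()


def field_similarity(a, b, path):
    """String similarity in [0, 1] between the same field of two records."""
    return SequenceMatcher(None, text_at(a, path), text_at(b, path)).ratio()


def duplicate_score(a, b, threshold):
    """Multiply per-field similarities, stopping early when pruned.

    Every factor is at most 1, so the running product can only shrink.
    Once it drops below the threshold, the pair cannot recover, and the
    remaining fields are skipped. Returns None for a pruned pair.
    """
    score = 1.0
    for path in FIELDS:
        score *= field_similarity(a, b, path)
        if score < threshold:
            return None
    return score


def find_duplicates(root, record_path, threshold=0.5):
    """Compare all record pairs and return (i, j, score) for likely duplicates."""
    records = root.findall(record_path)
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            score = duplicate_score(records[i], records[j], threshold)
            if score is not None:
                pairs.append((i, j, score))
    return pairs


root = ET.fromstring(DOC)
pairs = find_duplicates(root, "movie")
for i, j, score in pairs:
    print(f"records {i} and {j} look like duplicates (score {score:.2f})")
```

On this sample, only the first two records are reported as a candidate pair. The pairs involving the third record are pruned after the title comparison alone. A probabilistic model such as the Bayesian network described in the abstract would replace the simple product with conditional probabilities that also reflect the element hierarchy.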
Authors and Affiliations
Akash R. Petkar, Vijay B. Patil
Friend Recommendation System for Social Networks
Abstract: Current social networking services suggest friends based on the respective individual’s network. This may not be the best way to recommend friends to a user, as friend suggestion should be more...
Security Evaluation of Google Chrome Operating System
Abstract: Due to the increasing nature of computer threats and attacks, the security of the operating system is paramount in the computing world today. Every modern computer system, from network servers, workstation de...
Comparative Analysis of Various Grid Based Scheduling Algorithms
Abstract: Grid computing provides access to heterogeneous resources which are distributed geographically, which makes resource scheduling a complex problem. Hence, scheduling algorithms are necessary which allocate j...
Speech Synthesis for Punjabi Language Using Festival
Processing of digital speech plays an important role in modern speech communication research and applications. The main purpose of digital speech is communication, which means transmission of messages between human and co...
A Survey on Secure Key Policy Attribute-Based Encryption Policy for Data Sharing Among Dynamic Groups in the Cloud
Benefiting from distributed computing, clients can achieve an effective and economical approach to information sharing among group members in the cloud, with the characteristics of low support and little administration...