Duplicate Detection in Hierarchical Data Using XPath

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 6

Abstract

Many techniques exist for identifying duplicates in relational data, but only a few solutions focus on identifying duplicates in data with a complex hierarchical structure, such as XML. In this paper, we present a new technique for identifying XML duplicates, called XML duplication using XPath. The technique uses a Bayesian network to determine the probability that two XML elements are duplicates, based on the information within the elements and on other information organized in the XML document. In addition, to improve efficiency, a new pruning strategy is introduced, which yields substantial gains over the unoptimized algorithm. The technique can be used to identify duplicates more efficiently and remove them, so that no duplicate records remain. In experiments, our algorithm achieves high accuracy and recall on several XML datasets. XML duplication using XPath also outperforms another duplicate-detection technique in both efficiency and effectiveness.
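
As a rough illustration of the idea described in the abstract, the Python sketch below scores a pair of XML elements by combining per-field similarities and stops early once the pair can no longer reach a chosen threshold. It is not the authors' algorithm: a plain product of string similarities stands in for the Bayesian network, and the field paths, the threshold, and the sample movie records are all hypothetical. The paths use the limited XPath subset supported by Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Hypothetical field paths (ElementTree's XPath subset); not taken from the paper.
FIELD_PATHS = ["title", "year", "cast/actor"]


def field_similarity(a, b, path):
    """Return a similarity in [0, 1] for the text values selected by `path`."""
    va = " ".join(sorted((e.text or "").strip().lower() for e in a.findall(path)))
    vb = " ".join(sorted((e.text or "").strip().lower() for e in b.findall(path)))
    if not va and not vb:
        return 1.0  # field absent on both sides: no evidence against a match
    return SequenceMatcher(None, va, vb).ratio()


def duplicate_score(a, b, threshold=0.5):
    """Multiply per-field similarities, pruning as soon as the pair cannot pass.

    Every factor is at most 1, so the running product is an upper bound on the
    final score. Once it drops below `threshold`, the remaining fields are skipped.
    Returns (score, number_of_fields_evaluated).
    """
    score = 1.0
    for evaluated, path in enumerate(FIELD_PATHS, start=1):
        score *= field_similarity(a, b, path)
        if score < threshold:
            return score, evaluated  # pruned: cannot recover above threshold
    return score, len(FIELD_PATHS)


# Hypothetical sample data for illustration only.
SAMPLE = ET.fromstring("""
<movies>
  <movie><title>The Matrix</title><year>1999</year>
    <cast><actor>Keanu Reeves</actor></cast></movie>
  <movie><title>The Matrix</title><year>1999</year>
    <cast><actor>Keanu Reeves</actor></cast></movie>
  <movie><title>Casablanca</title><year>1942</year>
    <cast><actor>Humphrey Bogart</actor></cast></movie>
</movies>
""")

if __name__ == "__main__":
    movies = SAMPLE.findall("movie")
    print(duplicate_score(movies[0], movies[1]))  # identical records
    print(duplicate_score(movies[0], movies[2]))  # different records, pruned early
```

In this sketch the second comparison stops after the first field, because the titles alone already push the running product below the threshold. That early exit is the kind of saving a pruning strategy aims for.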

Authors and Affiliations

Akash R. Petkar, Vijay B. Patil

How To Cite

Akash R. Petkar, Vijay B. Patil (2015). Duplicate Detection in Hierarchical Data Using XPath. IOSR Journals (IOSR Journal of Computer Engineering), 17(6), 39-76. https://europub.co.uk/articles/-A-111887