Duplicate Detection in Hierarchical Data Using XPath

Journal Title: IOSR Journals (IOSR Journal of Computer Engineering) - Year 2015, Vol 17, Issue 6

Abstract

Many techniques exist for identifying duplicates in relational data, but only a few solutions focus on identifying duplicates in data with a complex hierarchical structure, such as XML. In this paper, we present a new technique for identifying XML duplicates, called XML duplication using XPath. The technique uses a Bayesian network to determine the probability that two XML elements are duplicates, based on the information within the elements and other information organized in the XML. In addition, to increase the efficiency of web usage, a new pruning strategy was created. This pruning strategy yields significant gains over the algorithm without pruning. The technique can be used to identify duplicates more efficiently and remove them, so that no duplicate records remain. In experiments on several XML datasets, our algorithm achieves high accuracy and recall. XML duplication using XPath also outperforms another duplicate detection technique in both efficiency and effectiveness.
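
As a rough illustration of the idea in the abstract, and not the authors' implementation, the sketch below compares two XML elements field by field and stops early when the pair can no longer qualify as a duplicate. It replaces the Bayesian network with a much simpler stand-in: a product of per-field string similarities, which assumes the fields are independent. The field paths, the threshold, and the function names are all hypothetical.

```python
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

# Hypothetical field paths (ElementTree's limited XPath subset); not from the paper.
FIELD_PATHS = ["title", "year", "cast/actor"]


def field_text(elem, path):
    """Join the text of all nodes matched by `path`, normalized for comparison."""
    return " ".join(sorted((n.text or "").strip().lower() for n in elem.findall(path)))


def duplicate_score(a, b, paths=FIELD_PATHS, threshold=0.5):
    """Multiply per-field similarities (a naive stand-in for a Bayesian network).

    Pruning: every factor is at most 1, so once the running product drops
    below `threshold` the remaining fields cannot raise it, and we stop.
    Returns (score, is_duplicate).
    """
    score = 1.0
    for path in paths:
        score *= SequenceMatcher(None, field_text(a, path), field_text(b, path)).ratio()
        if score < threshold:
            return score, False  # pruned: remaining fields are never compared
    return score, True
```

Ordering the paths so that the most discriminating field comes first lets the pruning step reject most non-duplicate pairs after a single comparison.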

Authors and Affiliations

Akash R. Petkar, Vijay B. Patil

How To Cite

Akash R. Petkar, Vijay B. Patil (2015). Duplicate Detection in Hierarchical Data Using XPath. IOSR Journals (IOSR Journal of Computer Engineering), 17(6), 39-76. https://europub.co.uk/articles/-A-111887