Detection of Similar Identities in XML Documents

Abstract

Duplicate detection is an important part of data cleaning: it is the process of detecting multiple representations of the same real-world object in data sources. A number of solutions are available for detecting duplicates in XML data. One novel method for XML duplicate detection is XMLDup, which uses a Bayesian network to evaluate the probability that two XML elements are duplicates. In addition, a network pruning strategy is used to improve the evaluation of the Bayesian network. This work proposes a DOM tree construction algorithm that builds the tree for the input XML data. Using the DOM tree construction algorithm is observed to yield higher efficiency in detecting similar identities in XML documents.
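
The idea in the abstract can be illustrated with a minimal sketch. It is not the XMLDup algorithm or the authors' DOM tree construction algorithm: it assumes Python's standard `xml.dom.minidom` parser for building the DOM tree, replaces the Bayesian network with a simple product of per-field string similarities, and imitates pruning by abandoning a comparison once the score can no longer reach a threshold. The function names, the `person` records, and the threshold of 0.5 are all hypothetical.

```python
from difflib import SequenceMatcher
from xml.dom import minidom


def leaf_values(node):
    """Map each child element tag of `node` to its lower-cased text content."""
    values = {}
    for child in node.childNodes:
        if child.nodeType == child.ELEMENT_NODE:
            text = "".join(
                c.data for c in child.childNodes if c.nodeType == c.TEXT_NODE
            )
            values[child.tagName] = text.strip().lower()
    return values


def duplicate_score(a, b, threshold=0.5):
    """Product of per-field similarities (a stand-in for a Bayesian network).

    Returns 0.0 as soon as the running product falls below `threshold`.
    """
    va, vb = leaf_values(a), leaf_values(b)
    shared = sorted(set(va) & set(vb))
    if not shared:
        return 0.0
    score = 1.0
    for tag in shared:
        score *= SequenceMatcher(None, va[tag], vb[tag]).ratio()
        if score < threshold:
            # Pruning: every remaining factor is at most 1,
            # so the product cannot climb back above the threshold.
            return 0.0
    return score


# Hypothetical input data: three person records in one document.
doc = minidom.parseString(
    "<db>"
    "<person><name>John Smith</name><city>Pune</city></person>"
    "<person><name>Jon Smith</name><city>Pune</city></person>"
    "<person><name>Mary Jones</name><city>Nashik</city></person>"
    "</db>"
)
people = doc.getElementsByTagName("person")
likely_duplicate = duplicate_score(people[0], people[1])
unlikely_duplicate = duplicate_score(people[0], people[2])
print(likely_duplicate, unlikely_duplicate)
```

In this sketch the first two records score close to 1 because only one character of the name differs, while the third record is pruned after its first mismatching field. A real Bayesian network would combine field probabilities with learned or configured weights rather than a plain product.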

Authors and Affiliations

Miss Amita Fulsundar, Dr. K. V. Metre

How To Cite

Miss Amita Fulsundar, Dr. K. V. Metre (2015). Detection of Similar Identities in XML Documents. International Journal of Innovative Research in Computer Science and Technology, 3(3), -. https://europub.co.uk/articles/-A-748811