Clustering Tree based Implementation of Record Linkage on Many-to-Many Relation

Journal Title: International Journal of Science and Research (IJSR) - Year 2015, Vol 4, Issue 3

Abstract

Record linkage, also called entity resolution, is an emerging strategy for avoiding duplication, among other purposes. Recommender systems use linkage methods to improve the accuracy of their results. This paper introduces a new Many-to-Many Record Linkage (MMRL) algorithm, which links records from one table with a set of records from another table. The MMRL algorithm is based on a clustering tree, which groups the records of each table to be linked separately. A hierarchical structure such as a tree makes the linkage process easy to understand and execute. Intermediate nodes have lower similarity values than end nodes. Each node of the clustering tree contains a cluster instead of a single classification, and prediction accuracy depends on the end node. Jaccard similarity and metaphone similarity are used as the distance measures. The prediction result indicates whether the records match, and this result demonstrates the efficiency of the MMRL algorithm. The algorithm was evaluated on a data set from the movie recommender domain, where it gives better performance and results.
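To make the Jaccard measure mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the MMRL algorithm: it does not build a clustering tree and it omits metaphone similarity. It only illustrates how a token-set Jaccard score could decide whether a record from one table matches records in another, so that one record may link to many. The function names, the whitespace tokenization, the 0.5 threshold, and the sample titles are all assumptions made for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |A & B| / |A | B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def tokens(text: str) -> set:
    """Lowercased whitespace tokens (an assumed, deliberately simple tokenizer)."""
    return set(text.lower().split())


def link_many_to_many(left, right, threshold=0.5):
    """Return every index pair (i, j) whose Jaccard similarity meets the threshold.

    Brute-force pairwise comparison, used here only for clarity; the paper
    instead groups records with a clustering tree before linking.
    """
    links = []
    for i, left_record in enumerate(left):
        for j, right_record in enumerate(right):
            if jaccard(tokens(left_record), tokens(right_record)) >= threshold:
                links.append((i, j))
    return links


# Hypothetical sample records, not taken from the paper's data set.
movies = ["the dark knight", "dark knight rises"]
reviews = ["dark knight", "the dark knight rises", "toy story"]
links = link_many_to_many(movies, reviews)
```

With these sample records, each movie title links to the first two review titles and to neither's "toy story" entry, which is the many-to-many outcome the abstract describes: a single record on one side can match a set of records on the other.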




How To Cite

(2015). Clustering Tree based Implementation of Record Linkage on Many-to-Many Relation. International Journal of Science and Research (IJSR), 4(3), -. https://europub.co.uk/articles/-A-357676