A Method for Implementing Probabilistic Entity Resolution

Abstract

Deterministic and probabilistic matching are two approaches commonly used in Entity Resolution (ER) systems. While many users are familiar with writing and using Boolean rules for deterministic matching, fewer are familiar with the scoring rule configuration used to support probabilistic matching. This paper describes a method that uses deterministic matching to “bootstrap” probabilistic matching. It also examines the effectiveness of three commonly used strategies for mitigating the effect of missing values in probabilistic matching. The results are based on experiments using different sets of synthetically generated data processed with the OYSTER open source entity resolution system.

Authors and Affiliations

Awaad Alsarkhi, John R. Talburt

Download PDF file
  • EP ID EP417568
  • DOI 10.14569/IJACSA.2018.091102

How To Cite

Awaad Alsarkhi, John R. Talburt (2018). A Method for Implementing Probabilistic Entity Resolution. International Journal of Advanced Computer Science & Applications, 9(11), 7-15. https://europub.co.uk/articles/-A-417568