Reconciling Schema Matching Networks Through Crowdsourcing
Journal Title: EAI Endorsed Transactions on Collaborative Computing - Year 2015, Vol 1, Issue 2
Abstract
Schema matching is the process of establishing correspondences between the attributes of database schemas for data integration purposes. Although several automatic schema matching tools have been developed, their results are often incomplete or erroneous. To obtain a correct set of correspondences, human effort is usually required to validate the generated correspondences. This validation process is often costly, as it is performed by highly skilled experts. Our paper analyzes how to leverage crowdsourcing techniques so that a large group of non-experts can validate the generated correspondences. In our work we assume that one needs to establish attribute correspondences not only between two schemas but across a network of schemas. We also assume that the matching is realized in a pairwise fashion, in the presence of consistency expectations about the network of attribute correspondences. We demonstrate that formulating these expectations as integrity constraints can improve the reconciliation process. Because user input is unreliable in crowdsourcing, specific aggregation techniques are needed to obtain good-quality results. We demonstrate that consistency constraints not only improve the quality of aggregated answers but also enable us to estimate the quality of individual workers' answers more reliably and to detect spammers. Moreover, these constraints make it possible to minimize the human effort needed for the same expected quality of results.
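The interplay between answer aggregation and consistency constraints can be illustrated with a small sketch. This is not the authors' algorithm: it assumes plain majority voting as the aggregation step and a one-to-one constraint (an attribute matches at most one attribute of any other schema) as the only integrity constraint. The schema names, attribute names, worker IDs, and function names are all hypothetical.

```python
from collections import defaultdict

def majority_vote(votes):
    """Accept a correspondence when a strict majority of workers voted True."""
    accepted = {}
    for corr, worker_votes in votes.items():
        yes = sum(1 for v in worker_votes.values() if v)
        accepted[corr] = yes * 2 > len(worker_votes)
    return accepted

def one_to_one_violations(accepted):
    """Group accepted correspondences that map one attribute to several
    attributes of the same other schema (assumed one-to-one constraint)."""
    by_endpoint = defaultdict(list)
    for (a, b), ok in accepted.items():
        if ok:
            # Key: an attribute together with the schema it is matched into.
            by_endpoint[(a, b[0])].append((a, b))
            by_endpoint[(b, a[0])].append((a, b))
    return [tuple(group) for group in by_endpoint.values() if len(group) > 1]

def worker_agreement(votes, accepted):
    """Fraction of each worker's votes that agree with the aggregated decision."""
    agree, total = defaultdict(int), defaultdict(int)
    for corr, worker_votes in votes.items():
        for worker, vote in worker_votes.items():
            total[worker] += 1
            agree[worker] += int(vote == accepted[corr])
    return {w: agree[w] / total[w] for w in total}

# Hypothetical candidate correspondences between schemas S1 and S2.
c1 = (("S1", "name"), ("S2", "full_name"))
c2 = (("S1", "name"), ("S2", "nickname"))
c3 = (("S1", "dob"), ("S2", "birth_date"))

# Hypothetical worker votes: True means "this correspondence is valid".
votes = {
    c1: {"w1": True, "w2": True, "w3": False},
    c2: {"w1": False, "w2": True, "w3": True},
    c3: {"w1": True, "w2": True, "w3": True},
}

accepted = majority_vote(votes)
violations = one_to_one_violations(accepted)
agreement = worker_agreement(votes, accepted)
```

In this toy input, majority voting accepts both `c1` and `c2`, which map `S1.name` to two different attributes of `S2`. The constraint check flags that pair, which is the kind of signal that could be used to request more answers or to reweight workers instead of trusting the raw vote.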
Authors and Affiliations
Nguyen Quoc Viet Hung, Nguyen Thanh Tam, Zoltán Miklós, Karl Aberer
A Framework for Performance Evaluation of Decentralized Eventual Consistency Algorithms
The Eventual Consistency (EC) model is adopted by numerous large-scale distributed systems. To ensure performance and scalability, this model allows any replica to accept updates without remote synchronization. Nowadays, man...
Assessing the Use of Communication Robots for Recreational Activities at Nursing Homes
We are using information communication technology and communication robots (hereafter referred to as "robots") to develop a service to assist recreational activities at nursing homes. The service relies on visual content...
A Novel Stackelberg-Bertrand Game Model for Pricing Content Provider
With the popularity of smart devices such as smartphones and tablets, content that was traditionally viewed on a personal computer can also be viewed on these devices. The demand for content is thus increasing year by...
Effects of Cohesion-Based Feedback on the Collaborations in Global Software Development Teams
This paper describes a study that examines the effect of cohesion-based feedback on a team member’s behaviors in a global software development project. Chat messages and forum posts were collected from a software develop...
Collaborating with executable content across space and time
Executable content is of growing importance in many domains. How does one share and archive such content at Internet-scale for spatial and temporal collaboration? Spatial collaboration refers to the classic concept of us...