Modeling Topic-Based Human Expertise for Crowd Entity Resolution
-
Abstract
Entity resolution (ER) aims to identify whether two entities in an ER task refer to the same real-world thing. Crowd ER uses humans, in addition to machine algorithms, to obtain the truths of ER tasks. However, inaccurate or erroneous results are likely when humans give unreliable judgments. Previous studies have found that correctly estimating human accuracy or expertise in crowd ER is crucial to truth inference. Many of them, however, assume that humans have consistent expertise across all tasks, ignoring the fact that a person's expertise may vary across topics (e.g., music versus sport). In this paper, we address crowd ER in the Semantic Web area. We identify multiple topics of ER tasks and model human expertise on different topics. Furthermore, we leverage clustering of similar tasks to enhance topic modeling and expertise estimation. We propose a probabilistic graphical model that computes ER task similarity, estimates human expertise, and infers the task truths in a unified framework. Our evaluation on real-world and synthetic datasets shows that, compared with several state-of-the-art approaches, our model achieves higher accuracy on task truth inference and yields expertise estimates that are more consistent with the humans' real expertise.