COSSET+: Crowdsourced Missing Value Imputation Optimized by Knowledge Base
Journal of Computer Science and Technology
Journal of Computer Science and Technology, 2017, Vol. 32, Issue 5: 845-857. DOI: 10.1007/s11390-017-1768-1
Special Section on Crowdsourced Data Management
Hong-Zhi Wang, Member, CCF, ACM, IEEE, Zhi-Xin Qi, Ruo-Xi Shi, Jian-Zhong Li, Fellow, CCF, Member, ACM, Hong Gao, Senior Member, CCF, Member, ACM
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China

Abstract: Missing value imputation with crowdsourcing is a novel data cleaning method for filling in missing values that automatic approaches can hardly handle. However, crowdsourcing incurs high time cost and overhead, so the cost of crowdsourced imputation must be reduced while its accuracy is guaranteed. To achieve this optimization goal, we present COSSET+, a crowdsourcing framework optimized by a knowledge base. It combines the advantages of a knowledge-based filter and a crowdsourcing platform to fill in missing values. Since the number of crowdsourced values affects the cost of COSSET+, we select only a subset of the missing values to be crowdsourced. We prove that this crowd value selection problem is NP-hard and develop an approximation algorithm for it. Extensive experimental results demonstrate the efficiency and effectiveness of the proposed approaches.
Keywords: crowdsourcing; missing value; imputation; knowledge base; optimization
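The two-stage idea in the abstract (filter with a knowledge base first, then crowdsource only a selected subset of what remains) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not give the paper's problem formulation or its approximation algorithm. The sketch assumes a hypothetical `infers` map in which answering one cell lets some other cells be derived, and uses a plain greedy cover heuristic as a stand-in. The names `kb_filter`, `select_crowd_cells`, and `Cell` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[str, str]  # (entity, attribute): hypothetical cell identifier


def kb_filter(missing: List[Cell], kb: Dict[Cell, str]) -> Tuple[Dict[Cell, str], List[Cell]]:
    """Fill the cells the knowledge base can answer; return (filled, still_missing)."""
    filled = {cell: kb[cell] for cell in missing if cell in kb}
    remaining = [cell for cell in missing if cell not in kb]
    return filled, remaining


def select_crowd_cells(remaining: List[Cell], infers: Dict[Cell, Set[Cell]]) -> List[Cell]:
    """Greedy cover (illustrative stand-in, not the paper's algorithm).

    Repeatedly pick the cell whose crowd answer resolves the most
    still-unresolved cells: itself plus any cells assumed inferable from it.
    """
    unresolved = set(remaining)
    chosen: List[Cell] = []
    while unresolved:
        def gain(cell: Cell) -> int:
            return len(({cell} | infers.get(cell, set())) & unresolved)

        best = max(remaining, key=gain)  # ties broken by list order
        chosen.append(best)
        unresolved -= {best} | infers.get(best, set())
    return chosen


if __name__ == "__main__":
    missing = [("p1", "city"), ("p1", "zip"), ("p2", "city"), ("p2", "zip")]
    kb = {("p1", "city"): "Harbin"}            # toy knowledge base
    infers = {("p2", "zip"): {("p2", "city")}}  # assumed: zip determines city
    filled, remaining = kb_filter(missing, kb)
    print(filled)
    print(select_crowd_cells(remaining, infers))
```

In this toy run, the knowledge base answers one cell, and the greedy step sends two of the remaining three cells to the crowd because the third is assumed inferable from one of them.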
Received: 2017-04-01
Fund:

This work was supported by the National Natural Science Foundation of China under Grant Nos. U1509216 and 61472099, the National Key Technology Research and Development Program of the Ministry of Science and Technology of China under Grant No. 2015BAH10F01, the Scientific Research Foundation for the Returned Overseas Chinese Scholars of Heilongjiang Province of China under Grant No. LC2016026, and MOE-Microsoft Key Laboratory of Natural Language Processing and Speech of Harbin Institute of Technology.

About the author: Hong-Zhi Wang is a professor and doctoral supervisor at Harbin Institute of Technology, Harbin. He received his Ph.D. degree in computer science and technology from Harbin Institute of Technology, Harbin, in 2008. He was awarded the Microsoft Fellowship, Chinese Excellent Database Engineer, and the IBM Ph.D. Fellowship. His research interests include big data management, data quality, and graph data management.
Cite this article:   
Hong-Zhi Wang, Zhi-Xin Qi, Ruo-Xi Shi, Jian-Zhong Li, Hong Gao. COSSET+: Crowdsourced Missing Value Imputation Optimized by Knowledge Base[J]. Journal of Computer Science and Technology, 2017, V32(5): 845-857
URL:  
http://jcst.ict.ac.cn:8080/jcst/EN/10.1007/s11390-017-1768-1