2018, Vol. 33, Issue (2): 351-365. doi: 10.1007/s11390-018-1823-6
Special Topic: Data Management and Data Mining
Peng-Peng Chen1,2, Student Member, CCF, ACM, Hai-Long Sun1,2*, Member, CCF, ACM, IEEE, Yi-Li Fang1,2*, Member, CCF, ACM, Jin-Peng Huai1,2, Fellow, CCF, Member, ACM, IEEE
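In crowdsourcing, workers are usually assumed to process tasks and submit answers independently, which ensures the diversity of answers. In fact, recent studies show that implicit collaboration exists among workers on general-purpose platforms. To earn more reward for less effort, workers may collude by submitting repeated answers. Such behavior can seriously degrade the quality of the final crowdsourcing results. However, none of the existing crowdsourcing methods considers the effect of collusion on result inference. This paper therefore proposes a result inference method based on collusion detection. Using the rate of change of worker performance, it detects the repeated answers produced by collusion by computing the difference between the workers' average performance before and after the repeated answers are removed, and it takes the detected answers into account during result inference to safeguard the quality of the aggregated results. Extensive experiments were conducted on real data from crowdsourcing platforms and on simulated data. The experimental results demonstrate the superiority of the proposed method.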
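The detection idea in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the abstract does not give the exact formulas, so the sketch assumes that answers are aggregated by majority voting, that a worker's performance is the fraction of their answers agreeing with the majority, and that a candidate group of colluders is already given. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def majority_vote(answers):
    """Aggregate (worker, task, label) triples into one label per task."""
    votes = defaultdict(Counter)
    for worker, task, label in answers:
        votes[task][label] += 1
    return {task: c.most_common(1)[0][0] for task, c in votes.items()}

def average_performance(answers):
    """Mean, over workers, of the share of answers matching the majority.
    Assumption: agreement with the majority stands in for performance."""
    truth = majority_vote(answers)
    hits, total = Counter(), Counter()
    for worker, task, label in answers:
        total[worker] += 1
        hits[worker] += (label == truth[task])
    return sum(hits[w] / total[w] for w in total) / len(total)

def drop_repeats(answers, group):
    """Keep only the first copy of each (task, label) submitted by the group;
    answers from workers outside the group are left untouched."""
    seen, kept = set(), []
    for worker, task, label in answers:
        if worker in group:
            if (task, label) in seen:
                continue
            seen.add((task, label))
        kept.append((worker, task, label))
    return kept

def performance_change(answers, group):
    """Average performance after removing repeats minus the value before."""
    before = average_performance(answers)
    after = average_performance(drop_repeats(answers, group))
    return after - before

# Toy data: two independent workers answer "A"; three colluders copy "B".
tasks = ["t1", "t2", "t3"]
answers = [(w, t, "A") for w in ("w1", "w2") for t in tasks]
answers += [(c, t, "B") for c in ("c1", "c2", "c3") for t in tasks]
group = {"c1", "c2", "c3"}

print(majority_vote(answers))                       # aggregate with repeats
print(majority_vote(drop_repeats(answers, group)))  # aggregate without repeats
print(performance_change(answers, group))
```

In this toy data the copied answers outvote the independent workers until the repeats are collapsed into a single answer. A nonzero change in average performance is what would flag the group for closer inspection; how large a change counts as collusion, and how candidate groups are found, is left open here.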