Journal of Computer Science and Technology


When Crowdsourcing Meets Data Markets: A Fair Data Value Metric for Data Trading

Yang-Su Liu (刘洋溯), Zhen-Zhe Zheng (郑臻哲), Member, CCF, IEEE, Fan Wu (吴帆), Member, CCF, IEEE, and Gui-Hai Chen (陈贵海), Member, CCF, Fellow, IEEE   

  1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
  • Received: 2022-05-21; Revised: 2023-02-13; Accepted: 2023-03-15
  • Contact: Zhen-Zhe Zheng
  • About the author: Zhen-Zhe Zheng is an assistant professor in the Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai. He received the B.E. degree in Software Engineering from Xidian University in 2012, and the M.S. and Ph.D. degrees in Computer Science from Shanghai Jiao Tong University, Shanghai, in 2015 and 2018, respectively. He was a Visiting Scholar at the University of Illinois at Urbana-Champaign (UIUC) from 2016 to 2018 and then a Postdoctoral Research Associate from 2018 to 2019. His research interests include game theory, networking and mobile computing, and online marketplaces. He is a recipient of the China Computer Federation (CCF) Excellent Doctoral Dissertation Award 2018, the Google Ph.D. Fellowship 2015, and the Microsoft Research Asia Ph.D. Fellowship 2015. He has served as a member of the technical program committees of several academic conferences, including MobiHoc, AAAI, IoTDI, and MSN. He is a member of the ACM, IEEE, and CCF. For more information, please visit

Large quantities of high-quality data are critical to the success of machine learning in diverse applications. Faced with the dilemma of data silos, where data is difficult to circulate, emerging data markets attempt to break the dilemma by facilitating data exchange on the Internet. Crowdsourcing, meanwhile, is one of the important methods for efficiently collecting large amounts of high-value data in data markets. In this paper, we investigate the joint problem of efficient data acquisition and fair budget distribution across crowdsourcing and the data market. We propose a new metric of data value: the reduction in the uncertainty of a Bayesian machine learning model obtained by integrating the data into model training. Guided by this data value metric, we design a mechanism called Shapley Value mechanism with Individual Rationality (SV-IR), which consists of a greedy algorithm with a constant approximation ratio that selects the most cost-efficient data brokers, and a fair compensation determination rule based on the Shapley value that respects the individual rationality constraints. We further propose a fair reward distribution method for the data holders with various effort levels under the charge of a data broker. We demonstrate the fairness of the compensation determination rule and the reward distribution rule by evaluating our mechanisms on two real-world datasets. The evaluation results also show that the selection algorithm in SV-IR approaches the optimal solution and outperforms existing methods.
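
The ideas named in the abstract can be illustrated with a minimal sketch. It is not the SV-IR mechanism itself, and everything in it is an assumption made for illustration: the Bayesian model is taken to be linear regression with a Gaussian prior, "uncertainty reduction" is taken to be the drop in posterior entropy of the weights, the greedy rule is a generic marginal-value-per-cost heuristic, and the individual rationality constraints are not modeled. All function names (`uncertainty_reduction`, `coalition_value`, `shapley_values`, `greedy_select`) are hypothetical.

```python
import itertools
import math

import numpy as np


def uncertainty_reduction(X, prior_precision=1.0, noise_var=1.0):
    """Entropy drop of the Gaussian weight posterior after observing rows of X.

    Assumed model: Bayesian linear regression with prior N(0, I / prior_precision).
    """
    d = X.shape[1]
    prior = prior_precision * np.eye(d)
    posterior = prior + X.T @ X / noise_var  # posterior precision matrix
    _, logdet_post = np.linalg.slogdet(posterior)
    _, logdet_prior = np.linalg.slogdet(prior)
    return 0.5 * (logdet_post - logdet_prior)


def coalition_value(datasets, members):
    """Value of a set of brokers: uncertainty reduction from their pooled data."""
    if not members:
        return 0.0
    pooled = np.vstack([datasets[i] for i in members])
    return uncertainty_reduction(pooled)


def shapley_values(n, value):
    """Exact Shapley values by enumerating all coalitions (exponential in n)."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for subset in itertools.combinations(others, k):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


def greedy_select(costs, budget, value):
    """Repeatedly take the broker with the best marginal value per unit cost.

    A broker that does not fit in the remaining budget is skipped.
    """
    selected, spent = [], 0.0
    remaining = set(range(len(costs)))
    while remaining:
        base = value(tuple(selected))
        best = max(
            remaining,
            key=lambda i: (value(tuple(selected) + (i,)) - base) / costs[i],
        )
        remaining.discard(best)
        if spent + costs[best] <= budget:
            selected.append(best)
            spent += costs[best]
    return selected


# Toy usage: four hypothetical brokers holding random feature matrices.
rng = np.random.default_rng(0)
datasets = [rng.normal(size=(m, 3)) for m in (5, 10, 20, 40)]
costs = [1.0, 2.0, 3.0, 6.0]


def value(members):
    return coalition_value(datasets, members)


chosen = greedy_select(costs, budget=6.0, value=value)
shares = shapley_values(len(chosen), lambda s: value(tuple(chosen[i] for i in s)))
print("selected brokers:", chosen)
print("Shapley shares of the selected brokers:", shares)
```

In this sketch the Shapley shares of the selected brokers sum to the value of the selected set, which is the efficiency property that makes the Shapley value a natural basis for dividing a budget. Exact enumeration is only feasible for a handful of brokers; larger instances would need sampling-based approximations.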



Key words: data trading; crowdsourcing; mechanism design; Shapley value


ISSN 1000-9000(Print)

CN 11-2296/TP
