Yang-Su Liu, Zhen-Zhe Zheng, Fan Wu, Gui-Hai Chen. When Crowdsourcing Meets Data Markets: A Fair Data Value Metric for Data Trading[J]. Journal of Computer Science and Technology. DOI: 10.1007/s11390-023-2519-0

When Crowdsourcing Meets Data Markets: A Fair Data Value Metric for Data Trading

  • Large quantities of high-quality data are critical to the success of machine learning in diverse applications. Faced with data silos, in which data is difficult to circulate, emerging data markets attempt to break the impasse by facilitating data exchange on the Internet. Crowdsourcing, in turn, is an important method for efficiently collecting large amounts of high-value data in data markets. In this paper, we investigate the joint problem of efficient data acquisition and fair budget distribution across crowdsourcing and the data market. We propose a new metric of data value, defined as the reduction in uncertainty of a Bayesian machine learning model when the data is integrated into model training. Guided by this metric, we design a mechanism called Shapley Value mechanism with Individual Rationality (SV-IR), which comprises a greedy algorithm with a constant approximation ratio that selects the most cost-efficient data brokers, and a fair compensation determination rule based on the Shapley value that respects the individual rationality constraints. We further propose a fair reward distribution method for data holders with various effort levels who are managed by a data broker. We demonstrate the fairness of the compensation determination rule and the reward distribution rule by evaluating our mechanisms on two real-world datasets. The evaluation results also show that the selection algorithm in SV-IR can approach the optimal solution and outperforms existing methods.
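
The three ingredients named in the abstract (an uncertainty-reduction value metric, greedy cost-efficient broker selection, and Shapley-value compensation) can be illustrated with a toy sketch. It is not the paper's SV-IR mechanism. The abstract does not give the model, the exact metric, or the individual rationality adjustment, so everything below is an assumption made for illustration. The "Bayesian model" is a Gaussian posterior over a single unknown mean, each broker is a hypothetical pair (sample count, noise variance), and the names `data_value`, `greedy_select`, and `shapley_values` are invented here. The individual rationality constraint is not modeled.

```python
import itertools
import math

PRIOR_PRECISION = 1.0  # assumed prior precision of the unknown mean


def data_value(brokers, coalition):
    """Entropy reduction (in nats) of a Gaussian posterior over an unknown mean.

    brokers maps a name to (n_samples, noise_variance); both are illustrative.
    Each sample adds 1 / noise_variance to the posterior precision.
    """
    added = sum(brokers[b][0] / brokers[b][1] for b in coalition)
    return 0.5 * math.log((PRIOR_PRECISION + added) / PRIOR_PRECISION)


def greedy_select(brokers, costs, budget):
    """Repeatedly pick the affordable broker with the best marginal value per cost."""
    selected, spent = [], 0.0
    remaining = set(brokers)
    while remaining:
        base = data_value(brokers, selected)
        best, best_ratio = None, 0.0
        for b in sorted(remaining):  # sorted for deterministic tie-breaking
            if spent + costs[b] > budget:
                continue
            ratio = (data_value(brokers, selected + [b]) - base) / costs[b]
            if ratio > best_ratio:
                best, best_ratio = b, ratio
        if best is None:  # nothing affordable adds value
            break
        selected.append(best)
        spent += costs[best]
        remaining.remove(best)
    return selected


def shapley_values(brokers, players):
    """Exact Shapley values by averaging marginal gains over all orderings.

    Exponential in len(players); only suitable for a handful of brokers.
    """
    phi = {p: 0.0 for p in players}
    for order in itertools.permutations(players):
        coalition, prev = [], 0.0
        for p in order:
            coalition.append(p)
            cur = data_value(brokers, coalition)
            phi[p] += cur - prev
            prev = cur
    n_orders = math.factorial(len(players))
    return {p: v / n_orders for p, v in phi.items()}


# Hypothetical brokers: (n_samples, noise_variance), with made-up costs.
brokers = {"a": (100, 1.0), "b": (100, 1.0), "c": (10, 4.0)}
costs = {"a": 2.0, "b": 2.0, "c": 1.0}
chosen = greedy_select(brokers, costs, budget=4.0)
shares = shapley_values(brokers, chosen)
```

In this toy setting the Shapley shares sum to the total value of the selected set, and brokers with identical data receive identical shares, which is the kind of fairness property the abstract appeals to. How the paper turns such shares into payments that cover each broker's cost is not described in the abstract and is left out here.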
