Journal of Computer Science and Technology


Hadamard Encoding based Frequent Itemset Mining under Local Differential Privacy Constraints

Dan Zhao1,2(赵丹), Su-yun Zhao2(赵素云), Member, CCF, Hong Chen2,∗(陈红), Distinguished Member, CCF, Rui-xuan Liu2(刘睿瑄), Cui-ping Li2(李翠平), Distinguished Member, CCF, and Xiao-ying Zhang2(张晓莹)   

  1. 1Institute of Scientific and Technical Information of China, Beijing 100038, China
    2Key Laboratory of Data Engineering and Knowledge Engineering, Ministry of Education School of Information, Renmin University of China, Beijing 100872, China
  • Received:2021-01-31 Revised:2023-02-11 Accepted:2023-02-21
  • Contact: Hong Chen
  • About author:Hong Chen received the B.S., M.S., and Ph.D. degrees, professor, doctoral supervisor, distinguished member of CCF. She is currently with the School of Information, Renmin University of China. Her current research interests include high-performance database system, data warehouse and data mining, stream data management, and data management in wireless sensor networks.

Local differential privacy (LDP) approaches to collect sensitive information for frequent itemset mining (FIM) can reliably guarantee privacy. Most current approaches to FIM under LDP add “padding and sampling” steps to obtain frequent itemsets and their frequencies because each user transaction represents a set of items. The current state-of-the-art approach, namely set-value itemset mining (SVSM), must balance variance and bias to achieve accurate results. Thus, an unbiased FIM approach with lower variance is highly promising. To narrow this gap, we propose an item-level LDP frequency oracle approach, named the integrated-with-Hadamard-transform-based frequency oracle (IHFO). For the first time, Hadamard encoding is introduced to a set of values to encode all items into a fixed vector, allowing perturbation adding in IHFO. For this purpose, an FIM approach, called optimized united itemset mining (O-UISM), is proposed. O-UISM combines the padding-and-sampling-based frequency oracle (PSFO) and the IHFO into a framework to acquire accuracy frequent itemsets with their frequencies. Finally, we theoretically and experimentally demonstrate that O-UISM significantly outperforms the extant approaches in finding frequent itemsets and estimating their frequencies under the same privacy guarantee.



Key words: Local Differential Privacy; Frequent Itemset Mining; Frequency Oracle;

[1] Yu-Geng Song, Hui-Min Cui, Xiao-Bing Feng. Parallel Incremental Frequent Itemset Mining for Large Data [J]. , 2017, 32(2): 368-385.
Full text



No Suggested Reading articles found!

ISSN 1000-9000(Print)

CN 11-2296/TP

Editorial Board
Author Guidelines
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
  Copyright ©2015 JCST, All Rights Reserved