DSBP: Data-Free and Swift Backdoor Purification for Trustworthy Federated Learning via Multi-Teacher Adversarial Distillation
Abstract: Background: Federated learning (FL) faces severe backdoor threats. Because clean samples are inaccessible, the parameter server cannot remove poisoned features in real time even after it has detected them. Existing backdoor defense methods always require sacrificing model accuracy or increasing communication delay in exchange for better FL trustworthiness, which is impractical in real-world scenarios. Objective: This paper proposes DSBP, a novel data-free and swift backdoor purification scheme based on multi-teacher adversarial distillation, to effectively erase various backdoor variants in FL. Methods: DSBP treats the purification task as an adversarial game between knowledge inheritance and backdoor inhibition: the student model is forced to learn the ensemble results of multiple teacher models on reconstructed clean samples while remaining insensitive to synthetic poisoned samples. DSBP uses the self-similarity of poisoned features to optimize the trigger generator, which is essential for accelerating convergence during adversarial distillation. Results: DSBP is compared with four state-of-the-art defense methods against three backdoor variants on three datasets. The average attack success rate is reduced from 96.6% to 2.3% within only 300 rounds. Conclusion: In Model-as-a-Service (MaaS) scenarios, users can apply the proposed method to identify and erase multiple classes of backdoor variants in a model without access to the original clean samples. With growing awareness of privacy and copyright protection, the method has significant practical value and good prospects for wider adoption.
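The student-side objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the uniform averaging of teacher outputs, the use of KL divergence for both terms, the weight `lam`, and all function names are assumptions made only for illustration. The sample generator and trigger generator that form the adversarial side of the game are omitted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def ensemble_teacher_probs(teacher_logits):
    """Average the teachers' softmax outputs (uniform weights: an assumption)."""
    probs = [softmax(l) for l in teacher_logits]
    n = len(probs)
    return [sum(p[k] for p in probs) / n for k in range(len(probs[0]))]

def purification_loss(student_clean, student_poisoned, teacher_clean, lam=1.0):
    """Hypothetical student loss for one sample.

    inherit:  the student matches the teacher ensemble on a reconstructed
              clean sample (knowledge inheritance).
    suppress: the student's output on the triggered copy of that sample
              stays close to its clean output (backdoor inhibition).
    """
    target = ensemble_teacher_probs(teacher_clean)
    s_clean = softmax(student_clean)
    inherit = kl_divergence(target, s_clean)
    suppress = kl_divergence(s_clean, softmax(student_poisoned))
    return inherit + lam * suppress

# Two teachers and one student on a 3-class toy problem (made-up logits).
teachers = [[2.0, 0.5, 0.1], [1.8, 0.7, 0.2]]
student_clean = [1.9, 0.6, 0.1]
stable = purification_loss(student_clean, [1.9, 0.6, 0.1], teachers)
hijacked = purification_loss(student_clean, [0.1, 0.6, 3.0], teachers)
```

In this toy setup, `hijacked` exceeds `stable` because the trigger flips the student's prediction, so minimizing the loss pushes the student toward ignoring the trigger while still following the teacher ensemble on clean inputs.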