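Privacy Preserving Method for Vertical Federated Learning Based on Max-Min Strategy

LI Rong-Chang, LIU Tao, ZHENG Hai-Bin, CHEN Jin-Yin, LIU Zhen-Guang, JI Shou-Ling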

Li Rong-Chang, Liu Tao, Zheng Hai-Bin, Chen Jin-Yin, Liu Zhen-Guang, Ji Shou-Ling. Privacy preserving method for vertical federated learning based on max-min strategy. Acta Automatica Sinica, 2022, 45(x): 1−16 doi: 10.16383/j.aas.c211233
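Funds: Supported by the National Natural Science Foundation of China (62072406), the Foundation of the Key Laboratory of Science and Technology on Information System Security (61421110502), the Open Project of the Key Laboratory of Public Security Informatization Application Based on Big Data Architecture, Ministry of Public Security (2020DSJSYS003), and the Natural Science Foundation of Zhejiang Province (LGF21F020006, LGF20F020016)
    Author Bio:

    LI Rong-Chang: Master student at the School of Information Engineering, Zhejiang University of Technology. His research interests cover federated learning, graph neural networks, and artificial intelligence security. E-mail: lrcgnn@163.com

    LIU Tao: Master student at the School of Information Engineering, Zhejiang University of Technology. His research interests cover federated learning and artificial intelligence security. E-mail: leonliu022@163.com

    ZHENG Hai-Bin: Lecturer at the Institute of Cyberspace Security, Zhejiang University of Technology. He received his bachelor and Ph.D. degrees from Zhejiang University of Technology in 2017 and 2022, respectively. His research interests cover deep learning, artificial intelligence security, and fairness algorithms. Corresponding author of this paper. E-mail: haibinzheng320@gmail.com

    CHEN Jin-Yin: Professor at the School of Information Engineering, Zhejiang University of Technology. She received her bachelor and Ph.D. degrees from Zhejiang University of Technology in 2004 and 2009, respectively. Her research interests cover artificial intelligence security, graph data mining, and evolutionary computing. E-mail: chenjinyin@zjut.edu.cn

    LIU Zhen-Guang: Professor at the School of Cyberspace, Zhejiang University. His research interests cover data mining and blockchain security. E-mail: liuzhenguang2008@gmail.com

    JI Shou-Ling: Professor at Zhejiang University. He received his Ph.D. degree in computer science from Georgia State University in 2013, and his Ph.D. degree in electrical and computer engineering from Georgia Institute of Technology in 2015. His research interests cover data-driven security and privacy, artificial intelligence security, and big data analysis. E-mail: sji@zju.edu.cn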


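  • Abstract: Vertical federated learning (VFL) is an emerging distributed machine learning technique that jointly trains a machine learning model on data scattered across multiple institutions while preserving privacy. Because VFL is widely used in fields such as the industrial Internet, financial lending, and medical diagnosis, guaranteeing its privacy and security is of great importance. This paper first studies a general property inference attack launched by the collaborator, targeting the privacy leakage risk caused by the embedding representations that participants exchange under the VFL protocol. The attacker trains an attack model on auxiliary data and embedding representations, and then uses the trained attack model to steal the participants' private attributes. Experimental results show that the embedding representations produced by VFL in both the training and inference phases readily leak data privacy. To counter this risk, the paper further proposes a privacy-preserving method for VFL based on a max-min strategy, which introduces a gradient regularization component to maintain the prediction performance of the main task during training, and a reconstruction component to conceal the private attribute information contained in the participants' embedding representations. Finally, experiments in an industrial steel plate defect diagnosis scenario show that, compared with VFL without any defense, the method reduces the attack inference accuracy from 95% to below 55%, close to the level of random guessing, while the main task prediction accuracy drops by only 2%.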
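The attack pipeline described in the abstract (fit an attack model on auxiliary embeddings with known attributes, then apply it to the embeddings observed during VFL) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the embeddings are synthetic Gaussian vectors, the `leak` parameter and the helper names `make_embeddings`, `train_attack_model`, and `attack_accuracy` are hypothetical, and a plain logistic regression stands in for whatever attack model the authors use.

```python
import math
import random


def make_embeddings(n, rng, leak=2.0, dim=4):
    """Synthetic stand-in for exchanged embeddings (assumption, not real VFL output).

    Each sample carries a binary private attribute; the first coordinate is
    shifted by +leak or -leak depending on it, mimicking a leaky embedding.
    """
    data = []
    for _ in range(n):
        attr = rng.randint(0, 1)
        emb = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        emb[0] += leak if attr == 1 else -leak
        data.append((emb, attr))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_attack_model(aux, epochs=200, lr=0.1):
    """Fit a logistic-regression attack model with batch gradient descent."""
    dim = len(aux[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * dim, 0.0
        for emb, attr in aux:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, emb)) + b) - attr
            for j in range(dim):
                gw[j] += err * emb[j]
            gb += err
        w = [wi - lr * g / len(aux) for wi, g in zip(w, gw)]
        b -= lr * gb / len(aux)
    return w, b


def attack_accuracy(model, targets):
    """Fraction of target samples whose private attribute is guessed correctly."""
    w, b = model
    hits = sum(
        int((sum(wi * xi for wi, xi in zip(w, emb)) + b > 0) == (attr == 1))
        for emb, attr in targets
    )
    return hits / len(targets)


rng = random.Random(0)
aux_data = make_embeddings(200, rng)      # attacker's auxiliary data (attributes known)
victim_data = make_embeddings(1000, rng)  # embeddings observed during VFL
attack_model = train_attack_model(aux_data)
acc = attack_accuracy(attack_model, victim_data)
print(f"attack accuracy: {acc:.2f}")
```

In this toy setting, lowering `leak` toward 0 pushes the attack accuracy toward 0.5, which is the random-guess level that the defense in the abstract aims for.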
  • Fig. 1  Examples of VFL privacy leaks

    Fig. 2  VFL framework

    Fig. 3  Illustration of the attack in VFL

    Fig. 4  Attack pipeline of the collaborator in VFL

    Fig. 5  Illustration of the PPVFL pipeline

    Fig. 6  Illustration of the defense method

    Fig. 7  Performance of the property inference attack with different proportions of background knowledge

    Fig. 8  Performance of the property inference attack at different rounds of VFL

    Fig. 9  Privacy preservation performance of PPVFL on training data

    Fig. 10  Privacy preservation performance of PPVFL on testing data

    Fig. 11  Defense performance in multi-party scenarios

    Fig. 12  Effect of the PPVFL privacy decoder on performance

    Fig. 13  Performance of PPVFL against different attack models

    Fig. 14  t-SNE visualization of the Adults dataset before and after defense

    Fig. 15  t-SNE visualization of the Rochester dataset before and after defense

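    Table 1  Comparison of advantages and disadvantages of VFL privacy protection techniques

    | Strategy | Method | Advantage | Disadvantage |
    |---|---|---|---|
    | Encryption-based | Homomorphic encryption [14] | Highly scalable | Limited support for nonlinear functions |
    | Encryption-based | MPC | High accuracy | High time cost |
    | Perturbation-based | Differential privacy | Theoretical guarantees | Performance loss |
    | Perturbation-based | Gradient compression [23] | Low communication cost | Weak protection |
    | System-based | Trusted execution environment [24, 25] | Also defends against hardware-based attacks | High economic cost |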

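    Table 2  Basic statistics of the VFL datasets

    | Dataset | Samples | Edges | Label classes | Features | Private attribute |
    |---|---|---|---|---|---|
    | Adults | 48 842 | - | 2 | 12 | Marital status |
    | Rochester | 4 563 | 167 653 | 6 | 236 | Education |
    | Yale | 8 578 | 405 450 | 2 | 188 | Race |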

    Table 3  Model architectures

    | Dataset | Local model | Top model |
    |---|---|---|
    | Adults | FCNN-1 | FCNN-2 |
    | Rochester | GCN-2 | FCNN-2 |
    | Yale | SGC-2 | FCNN-2 |

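    Table 4  Privacy protection effect on an actual industrial Internet dataset

    The two column groups (columns 2 to 6 and columns 7 to 11) correspond to two private attributes; the attribute names are not given in the header row.

    | Defense | Training | Trade-off | Testing | Trade-off | Main task | Training | Trade-off | Testing | Trade-off | Main task |
    |---|---|---|---|---|---|---|---|---|---|---|
    | No_defense | 0.95 | 0.82 | 0.96 | 0.81 | 0.78 | 0.74 | 1.00 | 0.72 | 1.03 | 0.74 |
    | Noisy ($\sigma=1$) | 0.66 | 1.00 | 0.84 | 0.79 | 0.66 | 0.63 | 0.95 | 0.62 | 0.97 | 0.60 |
    | Noisy ($\sigma=5$) | 0.60 | 0.93 | 0.55 | 1.02 | 0.56 | 0.60 | 0.83 | 0.59 | 0.85 | 0.50 |
    | Dropout ($\eta=0.5$) | 0.91 | 0.88 | 0.91 | 0.88 | 0.80 | 0.70 | 1.03 | 0.64 | 1.13 | 0.72 |
    | Dropout ($\eta=0.8$) | 0.86 | 0.86 | 0.86 | 0.86 | 0.74 | 0.70 | 0.96 | 0.64 | 1.05 | 0.67 |
    | DP ($\sigma=0.1$) | 0.56 | 1.21 | 0.56 | 1.21 | 0.68 | 0.67 | 1.06 | 0.65 | 1.09 | 0.71 |
    | DP ($\sigma=0.2$) | 0.90 | 0.79 | 0.89 | 0.80 | 0.71 | 0.68 | 1.06 | 0.67 | 1.07 | 0.72 |
    | DR ($d=8$) | 0.87 | 0.85 | 0.86 | 0.86 | 0.74 | 0.69 | 0.80 | 0.67 | 0.82 | 0.55 |
    | DR ($d=4$) | 0.66 | 0.97 | 0.65 | 0.98 | 0.64 | 0.68 | 0.79 | 0.64 | 0.84 | 0.54 |
    | PPVFL ($\lambda=0.1$) | 0.55 | 1.38 | 0.57 | 1.33 | 0.76 | 0.60 | 1.20 | 0.62 | 1.16 | 0.72 |
    | PPVFL ($\lambda=0.5$) | 0.55 | 1.36 | 0.54 | 1.39 | 0.75 | 0.59 | 1.20 | 0.61 | 1.16 | 0.71 |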
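The trade-off values in Table 4 are numerically consistent with the main-task accuracy divided by the attack accuracy in the adjacent column, so a higher value means more utility retained per unit of leakage. That reading is inferred from the tabulated numbers, not from a definition given on this page, and the function name `trade_off` is hypothetical. A small check on four rows of the first column group (training-phase attack accuracy):

```python
def trade_off(main_task_acc, attack_acc):
    """Assumed definition: main-task accuracy divided by attack accuracy."""
    return main_task_acc / attack_acc


# (training-phase attack accuracy, main-task accuracy) copied from Table 4,
# first column group.
rows = {
    "No_defense": (0.95, 0.78),
    "Noisy (sigma=1)": (0.66, 0.66),
    "DP (sigma=0.1)": (0.56, 0.68),
    "PPVFL (lambda=0.1)": (0.55, 0.76),
}

scores = {
    name: round(trade_off(main, attack), 2)
    for name, (attack, main) in rows.items()
}
for name, score in scores.items():
    print(f"{name}: {score}")
```

Under this reading, the rounded ratios reproduce the tabulated trade-off entries for these rows (0.82, 1.00, 1.21, and 1.38).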
  • [1] Luckow A, Cook M, Ashcraft N, Weill E, Djerekarov E, Vorster B. Deep learning in the automotive industry: Applications and tools. In: Proceedings of the IEEE International Conference on Big Data. Washington, USA: IEEE, 2016. 3759−3768
    [2] Schneider S, Taylor G W, Kremer S C. Deep learning object detection methods for ecological camera trap data. In: Proceedings of the 15th Conference on Computer and Robot Vision. Toronto, Canada: IEEE, 2018. 321−328
    [3] Sangineto E, Nabi M, Culibrk D, Sebe N. Self paced deep learning for weakly supervised object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(3): 712−725
    [4] Scoon C, Ko R K. The data privacy matrix project: Towards a global alignment of data privacy laws. In: Proceedings of the IEEE International Conference on Trust, Security and Privacy in Computing and Communications. Tianjin, China: IEEE, 2016. 1998−2005
    [5] Goddard M. The EU general data protection regulation: European regulation that has a global impact. International Journal of Market Research, 2017, 59(6): 703−705 doi: 10.2501/IJMR-2017-050
    [6] Yang Q, Liu Y, Chen T J, Tong Y X. Federated machine learning: Concept and applications. ACM Transactions on Intelligent Systems and Technology, 2019, 10(2): 1−19
    [7] Zhang Ze-Hui, Fu Yao, Gao Tie-Gang. Research on federated deep neural network model for data privacy protection. Acta Automatica Sinica, 2022, 48(5): 1−12
    [8] Zhang Ze-Hui, Li Qing-Dan, Fu Yao, He Ning-Xin, Gao Tie-Gang. Adaptive federated deep learning with non-iid data. Acta Automatica Sinica, 2021, doi: 10.16383/j.aas.c201018, to be published
    [9] Nasr M, Shokri R, Houmansadr A. Comprehensive privacy analysis of deep learning: Passive and active white-box inference attacks against centralized and federated learning. In: Proceedings of the IEEE Symposium on Security and Privacy. San Francisco, USA: IEEE, 2019. 739−753
    [10] Melis L, Song C, De Cristofaro E, Shmatikov V. Exploiting unintended feature leakage in collaborative learning. In: Proceedings of the IEEE Symposium on Security and Privacy. San Francisco, USA: IEEE, 2019. 691−706
    [11] Zhu L, Liu Z, Han S. Deep leakage from gradients. In: Proceedings of the Advances in Neural Information Processing Systems. Vancouver, Canada: 2019. 1−11
    [12] Zhou Chun-Yi, Chen Da-Wei, Wang Shang, Fu An-Min, Gao Yan-Song. Research and challenge of distributed deep learning privacy and security attack. Journal of Computer Research and Development, 2021, 58(5): 927−943 doi: 10.7544/issn1000-1239.2021.20200966
    [13] Fu C, Zhang X, Ji S, Chen J Y, Wu J Z, Guo S Q, et al. Label inference attacks against vertical federated learning. In: Proceedings of the USENIX Security Symposium. Boston, USA: 2022. 1−18
    [14] Ou W, Zeng J H, Guo Z J, Yan W Q, Liu D W, Fuentes S. A homomorphic-encryption-based vertical federated learning scheme for risk management. Computer Science and Information Systems, 2020, 17(3): 819−834 doi: 10.2298/CSIS190923022O
    [15] Liu W, Cheng J H, Wang X L, Lu X J, Yin J W. Hybrid differential privacy based federated learning for Internet of Things. Journal of Systems Architecture, 2022, 124: 1−15
    [16] Mohammadi M, Al-Fuqaha A. Enabling cognitive smart cities using big data and machine learning: Approaches and challenges. IEEE Communications Magazine, 2018, 56(2): 94−101 doi: 10.1109/MCOM.2018.1700298
    [17] Lu Y, Huang X H, Zhang K, Maharjan S, Zhang Y. Blockchain empowered asynchronous federated learning for secure data sharing in Internet of vehicles. IEEE Transactions on Vehicular Technology, 2020, 69(4): 4298−4311 doi: 10.1109/TVT.2020.2973651
    [18] Nguyen D C, Pathirana P N, Ding M, Seneviratne A. Blockchain for 5G and beyond networks: A state of the art survey. Journal of Network and Computer Applications, 2020, 166: 1−45
    [19] Han Xuan, Yuan Yong, Wang Fei-Yue. Security problems on blockchain: The state of the art and future trends. Acta Automatica Sinica, 2019, 45(1): 206−225
    [20] Sun H, Wang Z Y, Huang Y J, Ye J D. Privacy-preserving vertical federated logistic regression without trusted third-party coordinator. In: Proceedings of the 6th International Conference on Machine Learning and Soft Computing. Haikou, China: 2022. 132−138
    [21] Cheng K, Fan T, Jin Y, Liu Y, Chen T J, Papadopoulos D, et al. SecureBoost: A lossless federated learning framework. IEEE Intelligent Systems, 2021, 36(6): 1−9 doi: 10.1109/MIS.2021.3132250
    [22] Luo X, Wu Y, Xiao X, Ooi B C. Feature inference attack on model predictions in vertical federated learning. In: Proceedings of the IEEE 37th International Conference on Data Engineering. Chania, Greece: 2021. 181−192
    [23] Yang K, Song Z, Zhang Y, Zhou Y F, Sun X H, Wang J X. Model optimization method based on vertical federated learning. In: Proceedings of the IEEE International Symposium on Circuits and Systems. Daegu, South Korea: IEEE, 2021. 1−5
    [24] Subramanyan P, Sinha R, Lebedev I, Devadas S, Seshia S A. A formal foundation for secure remote execution of enclaves. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. Dallas, USA: 2017. 2435−2450
    [25] Tramèr F, Boneh D. Slalom: Fast, verifiable and private execution of neural networks in trusted hardware. In: Proceedings of the 7th International Conference on Learning Representations. New Orleans, USA: 2019. 1−19
    [26] Ganin Y, Lempitsky V. Unsupervised domain adaptation by backpropagation. In: Proceedings of the 32nd International Conference on Machine Learning. Lille, France: 2015. 1180−1189
    [27] Li K, Luo G C, Ye Y, Li W, Ji S H, Cai Z P. Adversarial privacy-preserving graph embedding against inference attack. IEEE Internet of Things Journal, 2020, 8(8): 6904−6915
    [28] Duddu V, Boutet A, Shejwalkar V. Quantifying privacy leakage in graph embedding. In: Proceedings of the 17th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services. Darmstadt, Germany: 2020. 76−85
    [29] Zhang Z, Chen M, Backes M, Shen Y, Zhang Y. Inference attacks against graph neural networks. In: Proceedings of the USENIX Security Symposium. Boston, USA: 2022. 1−18
    [30] Liao P, Zhao H, Xu K, Jaakkola T, Gordon G J, Jegelka S, et al. Information obfuscation of graph neural networks. In: Proceedings of the 38th International Conference on Machine Learning. Virtual Event: 2021. 6600−6610
    [31] Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks. In: Proceedings of the 5th International Conference on Learning Representations. Toulon, France: 2017. 1−14
    [32] Wu F, Zhang T Y, Souza A H, Fifty C, Yu T, Weinberger K Q. Simplifying graph convolutional networks. In: Proceedings of the 36th International Conference on Machine Learning. California, USA: 2019. 6861−6871
    [33] Wang Jie-Ting, Qian Yu-Hua, Li Fei-Jiang, Liu Guo-Qing. Support vector machine with eliminating the random consistency. Journal of Computer Research and Development, 2020, 57(8): 1581−1593 doi: 10.7544/issn1000-1239.2020.20200127
    [34] Dou Nuo, Zhao Rui-Zhen, Cen Yi-Gang, Hu Shao-Hai, Zhang Yong-Dong. Noisy image super-resolution reconstruction based on sparse representation. Journal of Computer Research and Development, 2015, 52(4): 943−951 doi: 10.7544/issn1000-1239.2015.20140047
Publication history
  • Received: 2021-12-26
  • Accepted: 2022-06-12
  • Published online: 2022-10-21
