Research on Federated Deep Neural Network Model for Data Privacy Preserving

ZHANG Ze-Hui; FU Yao; GAO Tie-Gang

doi: 10.16383/j.aas.c200236

1. College of Software, Nankai University, Tianjin 300071

Funds: Supported by National Science and Technology Major Project of China (2018YFB0204304) and Tianjin Research Innovation Project for Postgraduate Students (2019YJSB067)

Author Bio:
ZHANG Ze-Hui　Ph.D. candidate at the College of Software, Nankai University. He received his master's degree from Wuhan University of Technology in 2019. His research interests cover federated learning, fault diagnosis, and intelligent ship control. E-mail: zhangtianxia918@163.com

FU Yao　Master's student at the College of Software, Nankai University. Her research interests cover cloud data integrity verification and information security. E-mail: FuYao_TJ@163.com

GAO Tie-Gang　Professor at the College of Software, Nankai University. He received his master's degree in applied mathematics from Huazhong University of Science and Technology in 1991, and his Ph.D. degree from Nankai University in 2005. His research interests cover federated learning, image watermarking, information hiding, and cloud data security. Corresponding author of this paper. E-mail: gaotiegang@nankai.edu.cn

Publication history
- Received: 2020-04-21
- Accepted: 2020-07-27
- Published online: 2022-03-18
- Issue date: 2022-05-13

Abstract: In recent years, artificial intelligence technology has been widely used in image classification, object detection, semantic segmentation, intelligent control, and fault diagnosis. However, in some industries, such as healthcare, data privacy concerns make it difficult for multiple research institutions or organizations to share data to train federated learning models. This paper therefore introduces homomorphic encryption (HE) into federated learning and proposes a privacy-preserving federated deep neural network (PFDNN) model. The model preserves data privacy by homomorphically encrypting its weight parameters, and it greatly reduces the amount of encryption and decryption computation required during training. Theoretical analysis and experimental results show that the proposed model provides good security while maintaining high accuracy.

Keywords: federated learning / deep learning / data privacy / homomorphic encryption (HE) / neural network
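The additive homomorphism that the abstract relies on can be illustrated with a toy Paillier scheme. This is a minimal sketch under stated assumptions, not the authors' implementation: the fixed primes, the fixed-point factor `SCALE`, the helper names (`keygen`, `encrypt`, `add`, `decrypt`), and the choice to divide by the number of participants after decryption are all introduced here for illustration, and the key is far too small to be secure.

```python
import math
import random

SCALE = 10**6  # assumed fixed-point factor for encoding float weights as integers


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two fixed Mersenne primes (demo only, insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, w):
    """Encrypt one float weight with the public key n."""
    m = round(w * SCALE) % n  # negative values wrap around modulo n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    # g^m * r^n mod n^2, using (n + 1)^m = 1 + m*n (mod n^2)
    return (1 + m * n) % n2 * pow(r, n, n2) % n2


def add(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return c1 * c2 % (n * n)


def decrypt(n, priv, c):
    """Recover the float weight with the private key (lam, mu)."""
    lam, mu = priv
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    if m > n // 2:  # map the upper half of the range back to negatives
        m -= n
    return m / SCALE


n, priv = keygen()
w_par_1 = [0.25, -1.5, 0.003]  # hypothetical weights of participant 1
w_par_2 = [0.75, 0.5, -0.001]  # hypothetical weights of participant 2

enc_1 = [encrypt(n, w) for w in w_par_1]
enc_2 = [encrypt(n, w) for w in w_par_2]

# The aggregator combines ciphertexts without holding the private key.
enc_sum = [add(n, a, b) for a, b in zip(enc_1, enc_2)]

# A key holder decrypts the sum and averages over the two participants.
w_global = [decrypt(n, priv, c) / 2 for c in enc_sum]
print(w_global)
```

Because ciphertext multiplication corresponds to plaintext addition, the party that combines the encrypted weights only ever handles values it cannot read.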

Fig. 1 Federated learning structure

Fig. 2 Neural network structure

Fig. 3 Different proportions of data information leakage

Fig. 4 Data information leakage for different bias values

Fig. 5 Interaction in the training process

Fig. 6 Training process of the data privacy-preserving federated learning

Fig. 7 Training process curves of each model

Fig. 8 Confusion matrix of the prediction results on the test dataset

Table 1 Data information obtained by the participant and parameter server

Name	Participant i	Parameter server
Intermediate data	Enc(W_global)	Enc(W_global)
	Prediction results	Enc(W_par,1)
	Loss_i	Enc(W_par,2)
	G_i	…
	W_par,i	Enc(W_par,n)

Table 2 Execution time of the algorithms (column headings give the number of parameters)

Scheme	Operation	10	16	32	64	160	512	2048	50176
Paillier scheme	Encryption	0.07 s	0.11 s	0.23 s	0.48 s	1.19 s	3.91 s	15.05 s	381.51 s
Paillier scheme	Decryption	0.02 s	0.03 s	0.07 s	0.13 s	0.34 s	1.10 s	4.31 s	107.77 s
Paillier scheme	Addition	0.10 ms	0.19 ms	0.49 ms	1.09 ms	5.59 ms	8.07 ms	30.82 ms	0.81 s
Direct addition		0.96 μs	0.99 μs	1.01 μs	1.03 μs	1.10 μs	1.49 μs	2.83 μs	33.30 μs
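The timings in Table 2 are easier to compare once they are normalized. The sketch below transcribes the table values (converted to seconds) and computes the cost per parameter and the slowdown of Paillier addition relative to direct addition. The variable names are introduced here for illustration; only the numbers come from the table.

```python
# Values transcribed from Table 2, converted to seconds.
params = [10, 16, 32, 64, 160, 512, 2048, 50176]
encrypt_s = [0.07, 0.11, 0.23, 0.48, 1.19, 3.91, 15.05, 381.51]
decrypt_s = [0.02, 0.03, 0.07, 0.13, 0.34, 1.10, 4.31, 107.77]
he_add_s = [0.10e-3, 0.19e-3, 0.49e-3, 1.09e-3, 5.59e-3, 8.07e-3, 30.82e-3, 0.81]
plain_add_s = [0.96e-6, 0.99e-6, 1.01e-6, 1.03e-6, 1.10e-6, 1.49e-6, 2.83e-6, 33.30e-6]

# Milliseconds spent per parameter for encryption and decryption.
enc_ms_per_param = [1000 * t / p for t, p in zip(encrypt_s, params)]
dec_ms_per_param = [1000 * t / p for t, p in zip(decrypt_s, params)]

# How many times slower ciphertext addition is than direct addition.
add_slowdown = [h / d for h, d in zip(he_add_s, plain_add_s)]

for p, e, d, s in zip(params, enc_ms_per_param, dec_ms_per_param, add_slowdown):
    print(f"{p:>6} params: enc {e:.2f} ms/param, "
          f"dec {d:.2f} ms/param, HE add {s:,.0f}x slower")
```

In these measurements encryption stays at roughly 7 ms per parameter and decryption at roughly 2 ms per parameter across all sizes, so both costs grow approximately linearly with the number of encrypted parameters.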

Table 3 Deep neural network model structure

Layer	Input	Output
L1	784	64
L2	64	32
L3	32	16
L4	16	10
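The layer sizes in Table 3 determine how many values would have to be encrypted. The sketch below counts them, assuming each layer is fully connected with one bias per output unit; Table 3 lists only input and output sizes, so that layer type is an assumption.

```python
# (input, output) sizes of layers L1 to L4, as listed in Table 3.
layers = [(784, 64), (64, 32), (32, 16), (16, 10)]

# Assumption: fully connected layers, so weights = input * output
# and there is one bias per output unit.
weight_counts = [n_in * n_out for n_in, n_out in layers]
bias_counts = [n_out for _, n_out in layers]
total_params = sum(weight_counts) + sum(bias_counts)

print("weights per layer:", weight_counts)
print("biases per layer: ", bias_counts)
print("total parameters: ", total_params)
```

Under this assumption the per-layer weight and bias counts (10, 16, 32, 64, 160, 512, 2048, 50176 when sorted) coincide with the parameter counts used as column headings in Table 2, which suggests, although the page does not state it, that those timings were measured per weight matrix and per bias vector.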

Table 4 Deviation results of the different models

Model	Metric	mini-batch = 32	mini-batch = 64	mini-batch = 128	mini-batch = 256
DNN-1	dev_avg	2.04%	2.62%	2.68%	6.25%
PFDNN-1	dev_max	0.47%	0.67%	1.07%	2.82%
DNN-2	dev_avg	2.30%	3.00%	4.39%	5.26%
PFDNN-2	dev_max	0.36%	0.57%	0.03%	0.05%

Table 5 Prediction results of the different items

Method	DNN-1 (mini-batch = 128, lr = 0.005, epoch = 300)	DNN-1 (mini-batch = 128, lr = 0.005, epoch = 300)	DNN-1 (mini-batch = 128, lr = 0.01, epoch = 300)	DNN-1 (mini-batch = 128, lr = 0.01, epoch = 300)	FDNN-1 (mini-batch = 128, lr = 0.005, epoch = 300)	FDNN-1 (mini-batch = 128, lr = 0.005, epoch = 300)	FDNN-2 (mini-batch = 128, lr = 0.01, epoch = 300)	FDNN-2 (mini-batch = 128, lr = 0.01, epoch = 300)
Type	Recall	Precision	Recall	Precision	Recall	Precision	Recall	Precision
T-shirt	82.00%	83.42%	86.00%	78.18%	78.60%	84.24%	76.50%	86.50%
Trouser	96.40%	98.07%	98.20%	92.99%	95.70%	98.76%	96.10%	99.17%
Pullover	78.00%	78.23%	80.40%	76.14%	67.60%	83.25%	76.90%	78.47%
Dress	87.70%	87.96%	80.10%	91.23%	88.70%	85.62%	89.30%	86.03%
Coat	82.70%	77.80%	86.10%	70.40%	82.60%	74.08%	85.80%	75.59%
Sandal	95.50%	95.50%	94.70%	95.75%	93.80%	95.81%	94.90%	96.25%
Shirt	68.10%	71.16%	51.80%	80.06%	71.40%	64.09%	68.30%	69.48%
Sneaker	94.00%	95.05%	95.30%	92.08%	97.10%	90.58%	97.00%	92.21%
Bag	96.50%	95.64%	96.50%	93.78%	96.20%	95.82%	96.80%	96.80%
Boot	95.90%	93.84%	94.30%	95.54%	93.10%	96.38%	93.70%	96.80%
Average	87.68%	87.67%	86.34%	86.62%	86.48%	86.86%	87.53%	87.69%
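The page does not define recall and precision, so the sketch below assumes the standard per-class definitions computed from a confusion matrix whose rows are true classes and whose columns are predicted classes. The function name `per_class_metrics` and the 3-class matrix `toy_cm` are hypothetical; only the `dnn1_recall` values are taken from Table 5 (first Recall column).

```python
def per_class_metrics(cm):
    """Return (recall, precision) lists for a square confusion matrix.

    Assumed convention: cm[i][j] counts samples of true class i
    that were predicted as class j.
    """
    k = len(cm)
    recall = [cm[i][i] / sum(cm[i]) for i in range(k)]
    precision = [cm[j][j] / sum(row[j] for row in cm) for j in range(k)]
    return recall, precision


# Hypothetical 3-class confusion matrix, for illustration only.
toy_cm = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
recall, precision = per_class_metrics(toy_cm)
print("recall:   ", recall)
print("precision:", precision)

# First Recall column of Table 5 (percent), T-shirt through Boot.
dnn1_recall = [82.00, 96.40, 78.00, 87.70, 82.70,
               95.50, 68.10, 94.00, 96.50, 95.90]
macro_recall = sum(dnn1_recall) / len(dnn1_recall)
print("unweighted mean recall:", round(macro_recall, 2))
```

The unweighted mean of the ten per-class recall values reproduces the 87.68% reported in the Average row for that column, which indicates that the row is a simple average over classes.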
