Abstract: In recent years, artificial intelligence has been widely applied to image classification, object detection, semantic segmentation, intelligent control, and fault diagnosis. In some industries, however, such as healthcare, data privacy concerns make it difficult for multiple research institutions or organizations to share data for model training. This paper therefore introduces homomorphic encryption (HE) into federated learning and proposes a privacy-preserving federated deep neural network (PFDNN). The model protects data privacy by homomorphically encrypting its weight parameters, and it greatly reduces the amount of encryption and decryption computation required during training. Theoretical analysis and experimental results show that the proposed model offers good security while maintaining high accuracy.
Key words: federated learning / deep learning / data privacy / homomorphic encryption (HE) / neural network
Table 1 Data information obtained by the participant and parameter server

| Name | Participant i | Parameter server |
|---|---|---|
| Intermediate data | Enc(W_global) | Enc(W_global) |
| | Prediction results | Enc(W_par,1) |
| | Loss_i | Enc(W_par,2) |
| | G_i | … |
| | W_par,i | Enc(W_par,n) |
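Table 1 indicates that the parameter server only ever holds encrypted weights. The following is a minimal sketch of how such an aggregation could work with an additively homomorphic scheme, not the authors' implementation. It uses a toy textbook Paillier construction with small, insecure primes; the fixed-point `SCALE`, the helper names, the sample weights, and the choice to average after decryption are all assumptions made for illustration.

```python
import math
import random

SCALE = 10**6  # assumed fixed-point scale for turning float weights into integers


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier keys from two known primes. Far too small to be secure."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (negative values are stored modulo n)."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2


def add(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, priv, c):
    """Decrypt and map the result back to a signed integer."""
    lam, mu = priv
    m = ((pow(c, lam, n * n) - 1) // n) * mu % n
    return m - n if m > n // 2 else m


def encrypt_weights(n, weights):
    """Participant side: produce Enc(W_par,i) from local float weights."""
    return [encrypt(n, round(w * SCALE)) for w in weights]


def aggregate(n, encrypted_updates):
    """Server side: sum ciphertexts element-wise without decrypting anything."""
    total = encrypted_updates[0]
    for update in encrypted_updates[1:]:
        total = [add(n, a, b) for a, b in zip(total, update)]
    return total


def decrypt_average(n, priv, enc_sum, count):
    """Participant side: decrypt the aggregated sum and average it."""
    return [decrypt(n, priv, c) / SCALE / count for c in enc_sum]


n, priv = keygen()
# Hypothetical local weights of three participants.
local_weights = [[0.12, -0.40, 0.33], [0.10, -0.38, 0.30], [0.14, -0.45, 0.36]]
enc_sum = aggregate(n, [encrypt_weights(n, w) for w in local_weights])
w_global = decrypt_average(n, priv, enc_sum, len(local_weights))
print(w_global)
```

In this sketch only the participants hold the private key, so the server works purely on ciphertexts. A real deployment would use a vetted library and key sizes of at least 2048 bits.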
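Table 2 Execution time of the algorithms

| Operation | 10 | 16 | 32 | 64 | 160 | 512 | 2048 | 50176 |
|---|---|---|---|---|---|---|---|---|
| Paillier encryption | 0.07 s | 0.11 s | 0.23 s | 0.48 s | 1.19 s | 3.91 s | 15.05 s | 381.51 s |
| Paillier decryption | 0.02 s | 0.03 s | 0.07 s | 0.13 s | 0.34 s | 1.10 s | 4.31 s | 107.77 s |
| Paillier addition | 0.10 ms | 0.19 ms | 0.49 ms | 1.09 ms | 5.59 ms | 8.07 ms | 30.82 ms | 0.81 s |
| Direct addition | 0.96 μs | 0.99 μs | 1.01 μs | 1.03 μs | 1.10 μs | 1.49 μs | 2.83 μs | 33.30 μs |

Column headers give the number of parameters.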
Table 3 Deep neural network model structure

| Layer | Structure |
|---|---|
| L1 | input = 784, output = 64 |
| L2 | input = 64, output = 32 |
| L3 | input = 32, output = 16 |
| L4 | input = 16, output = 10 |
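As a quick check on the layer sizes in Table 3, the sketch below counts the weights of each fully connected layer as input × output. Leaving out bias terms is an assumption, since the table does not mention them. The resulting counts (50176, 2048, 512, 160) coincide with the four largest parameter counts benchmarked in Table 2.

```python
# Layer sizes taken from Table 3: (name, input width, output width).
layers = [("L1", 784, 64), ("L2", 64, 32), ("L3", 32, 16), ("L4", 16, 10)]

# Weight-matrix size per layer; bias terms are left out (assumption).
weight_counts = {name: n_in * n_out for name, n_in, n_out in layers}
total_weights = sum(weight_counts.values())

for name, count in weight_counts.items():
    print(f"{name}: {count} weights")
print(f"total: {total_weights} weights")
```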
Table 4 Deviation results of the different models

| Model | Metric | mini-batch = 32 | mini-batch = 64 | mini-batch = 128 | mini-batch = 256 |
|---|---|---|---|---|---|
| DNN-1 | dev_avg | 2.04% | 2.62% | 2.68% | 6.25% |
| PFDNN-1 | dev_max | 0.47% | 0.67% | 1.07% | 2.82% |
| DNN-2 | dev_avg | 2.30% | 3.00% | 4.39% | 5.26% |
| PFDNN-2 | dev_max | 0.36% | 0.57% | 0.03% | 0.05% |
Table 5 Prediction results of the different items

All four configurations use mini-batch = 128 and epoch = 300. Each column pair gives Recall / Precision for the method and learning rate (lr) named in the header.

| Type | DNN-1, lr = 0.005: Recall | Precision | DNN-1, lr = 0.01: Recall | Precision | FDNN-1, lr = 0.005: Recall | Precision | FDNN-2, lr = 0.01: Recall | Precision |
|---|---|---|---|---|---|---|---|---|
| T-shirt | 82.00% | 83.42% | 86.00% | 78.18% | 78.60% | 84.24% | 76.50% | 86.50% |
| Trouser | 96.40% | 98.07% | 98.20% | 92.99% | 95.70% | 98.76% | 96.10% | 99.17% |
| Pullover | 78.00% | 78.23% | 80.40% | 76.14% | 67.60% | 83.25% | 76.90% | 78.47% |
| Dress | 87.70% | 87.96% | 80.10% | 91.23% | 88.70% | 85.62% | 89.30% | 86.03% |
| Coat | 82.70% | 77.80% | 86.10% | 70.40% | 82.60% | 74.08% | 85.80% | 75.59% |
| Sandal | 95.50% | 95.50% | 94.70% | 95.75% | 93.80% | 95.81% | 94.90% | 96.25% |
| Shirt | 68.10% | 71.16% | 51.80% | 80.06% | 71.40% | 64.09% | 68.30% | 69.48% |
| Sneaker | 94.00% | 95.05% | 95.30% | 92.08% | 97.10% | 90.58% | 97.00% | 92.21% |
| Bag | 96.50% | 95.64% | 96.50% | 93.78% | 96.20% | 95.82% | 96.80% | 96.80% |
| Boot | 95.90% | 93.84% | 94.30% | 95.54% | 93.10% | 96.38% | 93.70% | 96.80% |
| Average | 87.68% | 87.67% | 86.34% | 86.62% | 86.48% | 86.86% | 87.53% | 87.69% |
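The Average row of Table 5 is consistent with an unweighted (macro) mean over the ten classes. The sketch below recomputes it for the first configuration (DNN-1, lr = 0.005) from the per-class values in the table; the variable names are illustrative only.

```python
# Per-class recall and precision (%) for DNN-1, lr = 0.005, copied from Table 5.
recall = {
    "T-shirt": 82.00, "Trouser": 96.40, "Pullover": 78.00, "Dress": 87.70,
    "Coat": 82.70, "Sandal": 95.50, "Shirt": 68.10, "Sneaker": 94.00,
    "Bag": 96.50, "Boot": 95.90,
}
precision = {
    "T-shirt": 83.42, "Trouser": 98.07, "Pullover": 78.23, "Dress": 87.96,
    "Coat": 77.80, "Sandal": 95.50, "Shirt": 71.16, "Sneaker": 95.05,
    "Bag": 95.64, "Boot": 93.84,
}

# Macro average: every class contributes equally, regardless of its sample count.
avg_recall = sum(recall.values()) / len(recall)
avg_precision = sum(precision.values()) / len(precision)

print(f"average recall:    {avg_recall:.2f}%")
print(f"average precision: {avg_precision:.2f}%")
```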