Enhancing Adversarial Robustness of Deep Learning-Based Soft Sensors: A Defense Method Integrating Adversarial Training and Attack Detection
doi: 10.16383/j.aas.c250725 cstr: 32138.14.j.aas.c250725
Abstract: Deep learning-based soft sensors (DLSS) have become an effective way to measure key variables in complex industrial processes. However, they are highly vulnerable to adversarial attacks that are imperceptible to the naked eye, which cause them to output false predictions and thereby endanger production safety. Existing adversarial attack detection methods generally fail to detect small perturbation adversarial samples (SPAS), and adversarial training defenses still widely suffer from adversarial robust overfitting. To address these problems, an adversarial training and attack detection integration (ATADI) defense method is proposed. First, it is proved theoretically that adversarial training with SPAS alone better mitigates adversarial robust overfitting. Second, on this basis, a SPAS-based feature-anchored adversarial training (FAAT) scheme, namely SPAS-FAAT, is proposed. During defense, the attack detector in ATADI detects only large perturbation adversarial samples (LPAS), which sidesteps its difficulty in detecting SPAS, while SPAS are fed into the DLSS trained with SPAS-FAAT to obtain the final soft sensing result. Finally, experiments are conducted on an adversarial attack-defense case of rotor thermal deformation soft sensing. Structural ablation results show that integrating adversarial training with attack detection is an effective means of adversarial defense for DLSS: LPAS generated by every type of attack are detected with an accuracy above 98%, and the DLSS trained with SPAS-FAAT meets the accuracy requirement on both SPAS and normal samples. The ATADI method significantly enhances the adversarial robustness of DLSS and offers a new direction for research on adversarial defense.
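The routing described in the abstract (reject inputs the detector flags as LPAS, pass everything else to the adversarially trained DLSS) can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: `atadi_infer`, `DefenseResult`, the scalar detector score, the threshold, and the toy detector and model are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class DefenseResult:
    flagged_as_lpas: bool        # True when the detector rejects the input
    prediction: Optional[float]  # soft sensing output, None when rejected


def atadi_infer(
    x: Sequence[float],
    detector_score: Callable[[Sequence[float]], float],
    robust_dlss: Callable[[Sequence[float]], float],
    threshold: float,
) -> DefenseResult:
    """Route one input through a detect-then-predict pipeline.

    Assumption: the detector returns a scalar anomaly score and inputs
    scoring above `threshold` are treated as LPAS. Inputs that pass
    (normal samples and any undetected SPAS) go to the robust model.
    """
    if detector_score(x) > threshold:
        return DefenseResult(flagged_as_lpas=True, prediction=None)
    return DefenseResult(flagged_as_lpas=False, prediction=robust_dlss(x))


# Toy stand-ins (hypothetical): the "detector" measures the largest
# deviation from a reference operating point, the "model" is a mean.
REFERENCE = [0.5, 0.5, 0.5]


def toy_score(x: Sequence[float]) -> float:
    return max(abs(a - b) for a, b in zip(x, REFERENCE))


def toy_model(x: Sequence[float]) -> float:
    return sum(x) / len(x)


clean_like = atadi_infer([0.5, 0.52, 0.49], toy_score, toy_model, threshold=0.1)
attacked = atadi_infer([0.5, 0.9, 0.49], toy_score, toy_model, threshold=0.1)
print(clean_like)
print(attacked)
```

The point of the split is that the detector only has to be reliable on large perturbations, while robustness to small ones is delegated to the model itself.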
Key words:
- Soft sensors /
- deep learning /
- adversarial defense /
- adversarial training /
- adversarial attack detection
Table 1 Detection results of the BCD detector for LPAS generated by four types of attacks (%)

| Attack method | Accuracy | Precision | Recall |
| --- | --- | --- | --- |
| IDAO | 99.63 | 98.95 | 99.45 |
| HGAA | 98.76 | 97.55 | 98.30 |
| KGAA | 99.21 | 98.40 | 98.95 |
| KAGAN | 98.42 | 97.10 | 97.95 |
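For readers unfamiliar with the three detection metrics reported in Table 1, the sketch below shows how accuracy, precision, and recall are conventionally computed from confusion counts. It assumes LPAS is the positive class, which the page does not state, and the counts in the usage example are made up for illustration; they are not the paper's data.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, precision and recall in percent.

    Assumption: "positive" means "flagged as LPAS".
    tp: LPAS correctly flagged      fp: normal samples wrongly flagged
    tn: normal samples passed       fn: LPAS that slipped through
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("no samples")
    accuracy = (tp + tn) / total
    # Guard against empty denominators when nothing is flagged or
    # when the test set contains no LPAS.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": 100.0 * accuracy,
        "precision": 100.0 * precision,
        "recall": 100.0 * recall,
    }


# Hypothetical counts, NOT taken from the paper.
metrics = detection_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}%")
```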
Table 2 DLSS adversarial robustness test metrics under four different adversarial attacks (SPAS input)

| Test sample | MAE of ${\rm{DLSS}}_{{\rm{SPAS}}-{\rm{FAAT}}}$ | MAE of ${\rm{DLSS}}_{{\rm{Original}}}$ | MAE of ${\rm{DLSS}}_{{\rm{SAT}}}$ |
| --- | --- | --- | --- |
| $x_{{\rm{SPAS}}}^{{\rm{IDAO}}}$ | 0.0282 | 0.0421 | 0.0391 |
| $x_{{\rm{SPAS}}}^{{\rm{KGAA}}}$ | 0.0197 | 0.0415 | 0.0310 |
| $x_{{\rm{SPAS}}}^{{\rm{HGAA}}}$ | 0.0136 | 0.0446 | 0.0267 |
| $x_{{\rm{SPAS}}}^{{\rm{KAGAN}}}$ | 0.0263 | 0.0375 | 0.0287 |
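To make the size of the gains in Table 2 easier to read, the sketch below computes the relative MAE reduction of the SPAS-FAAT model against the original model for each attack. The MAE values are copied from Table 2; the relative-reduction measure and the helper name `relative_reduction` are illustrative choices, not something the paper reports.

```python
# MAE values on SPAS inputs, copied from Table 2.
table2_mae = {
    "IDAO":  {"SPAS-FAAT": 0.0282, "Original": 0.0421, "SAT": 0.0391},
    "KGAA":  {"SPAS-FAAT": 0.0197, "Original": 0.0415, "SAT": 0.0310},
    "HGAA":  {"SPAS-FAAT": 0.0136, "Original": 0.0446, "SAT": 0.0267},
    "KAGAN": {"SPAS-FAAT": 0.0263, "Original": 0.0375, "SAT": 0.0287},
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` lowers the error of `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Relative MAE reduction of SPAS-FAAT versus the undefended model.
reductions = {
    attack: relative_reduction(row["Original"], row["SPAS-FAAT"])
    for attack, row in table2_mae.items()
}

for attack, pct in reductions.items():
    print(f"{attack}: {pct:.1f}% lower MAE than the original DLSS")
```

On these figures the reduction is roughly 30% to 70% depending on the attack, and SPAS-FAAT has the lowest MAE in every row.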
Table 3 DLSS adversarial robustness test metrics under four different adversarial attacks (normal sample input)

| Test sample | MAE of ${\rm{DLSS}}_{{\rm{SPAS}}-{\rm{FAAT}}}$ | MAE of ${\rm{DLSS}}_{{\rm{Original}}}$ | MAE of ${\rm{DLSS}}_{{\rm{SAT}}}$ |
| --- | --- | --- | --- |
| $x_{{\rm{normal}}}^{{\rm{IDAO}}}$ | 0.0386 | 0.0391 | 0.0410 |
| $x_{{\rm{normal}}}^{{\rm{KGAA}}}$ | 0.0349 | 0.0341 | 0.0361 |
| $x_{{\rm{normal}}}^{{\rm{HGAA}}}$ | 0.0338 | 0.0329 | 0.0358 |
| $x_{{\rm{normal}}}^{{\rm{KAGAN}}}$ | 0.0328 | 0.0326 | 0.0349 |
Table 4 Adversarial robustness test results of DLSS with different adversarial training methods

| Test sample | MAE of ${\rm{DLSS}}_{{\rm{SPAS}}-{\rm{SAT}}}$ | MAE of ${\rm{DLSS}}_{{\rm{SPAS}}-{\rm{DAAT}}}$ | MAE of ${\rm{DLSS}}_{{\rm{SPAS}}-{\rm{FAAT}}}$ |
| --- | --- | --- | --- |
| SPAS | 0.0238 | 0.0217 | 0.0197 |
| Normal samples | 0.0361 | 0.0351 | 0.0349 |
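A1 List of abbreviations

| Abbreviation | Full name / description |
| --- | --- |
| DLSS | Deep learning-based soft sensor |
| SPAS | Small perturbation adversarial sample |
| LPAS | Large perturbation adversarial sample |
| BCD | Bidirectional consistency discrimination |
| FAAT | Feature-anchored adversarial training |
| ATADI | Adversarial training and attack detection integration |
| SAT | Classical adversarial training |
| DAAT | Domain-adaptation adversarial training |
| KGAA | Knowledge-guided adversarial attack |
| SPAS-FAAT | SPAS-based feature-anchored adversarial training |
| ${\rm{DLSS}}_{{\rm{Original}}}$ | Original DLSS model |
| ${\rm{DLSS}}_{{\rm{SAT}}}$ | DLSS model obtained after classical adversarial training |
| ${\rm{DLSS}}_{{\rm{DAAT}}}$ | DLSS model obtained after domain-adaptation adversarial training |
| ${\rm{DLSS}}_{{\rm{SPAS}}-{\rm{FAAT}}}$ | DLSS model obtained after SPAS-based feature-anchored adversarial training |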