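Two-stage Multi-teacher Knowledge Distillation for Industrial Process Fault Detection

CHEN Guang-Jie^1, ZHANG Hong-Hai^1, LIU Yi^2, ZHOU Le^1

doi: 10.16383/j.aas.c250617 cstr: 32138.14.j.aas.c260617

1. School of Automation and Electrical Engineering, Zhejiang University of Science and Technology, Hangzhou 310023
2. College of Mechanical Engineering, Zhejiang University of Technology, Hangzhou 310023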

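Funds: Supported by the National Key Research and Development Program of China (2025YFE0204600), the Zhejiang Provincial Natural Science Foundation of China (LR26F030005), and the National Natural Science Foundation of China (U23A20328)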

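Author Bio:
CHEN Guang-Jie: Lecturer at the School of Automation and Electrical Engineering, Zhejiang University of Science and Technology. His research interests include industrial process monitoring, data-driven modeling, and fault detection and diagnosis. E-mail: chenguangjie@zust.edu.cn

ZHANG Hong-Hai: Master's student at the School of Automation and Electrical Engineering, Zhejiang University of Science and Technology. His research interests include industrial process monitoring, data-driven modeling, knowledge distillation, and fault detection and diagnosis. E-mail: zhanghonghai@zust.edu.cn

LIU Yi: Professor at the College of Mechanical Engineering, Zhejiang University of Technology. His research interests include data intelligence and its applications to modeling, control, and optimization of industrial processes. E-mail: yliuzju@zjut.edu.cn

ZHOU Le: Professor at the School of Automation and Electrical Engineering, Zhejiang University of Science and Technology. His research interests include industrial process modeling, monitoring and fault diagnosis, soft sensor modeling, and deep learning. Corresponding author of this paper. E-mail: zhoule@zust.edu.cn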

Publication history
- Received: 2025-11-10
- Accepted: 2026-03-19
- Published online: 2026-05-20


Abstract: Modern industrial process data are large in volume, high-dimensional, and intricately correlated, so a single multivariate statistical monitoring method can rarely capture every type of feature that monitoring requires. Multi-model fusion methods and deep learning techniques can improve fault detection performance, but the former depend on building a model library and do not yield a unified model, while the latter suffer from complex structures and redundant parameters. To address these issues, a two-stage multi-teacher knowledge distillation method for industrial process modeling and fault detection is proposed. Within the distillation framework, heterogeneous knowledge extracted by kernel principal component analysis (KPCA) and independent component analysis (ICA) is internalized into a student autoencoder (AE), so that nonlinear and non-Gaussian features are modeled in a unified way, and the feature space and reconstruction space are optimized jointly through two distillation stages. In the first stage, feature-level distillation guides the student model to learn the feature distributions of the teacher models. In the second stage, reconstruction-level distillation improves the model's ability to represent and reconstruct process variations. Experiments on the Tennessee Eastman (TE) simulation process and a real ammonia synthesis process show that the proposed method improves the accuracy and robustness of fault detection, and that distilling knowledge offline allows unified modeling and efficient monitoring in the online stage.

Key words:
- independent component analysis
- kernel principal component analysis
- knowledge distillation
- multi-teacher knowledge distillation
- fault detection
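
As a rough illustration of the two-stage idea described in the abstract, the sketch below uses a linear student whose encoder is first fitted to the teacher features (stage 1) and whose decoder is then fitted to a blend of the input and the teacher reconstructions (stage 2). Everything here is an assumption made for illustration: the names `two_stage_distill` and `pca_teacher`, the blending weight `alpha`, the least-squares solver, and the plain-PCA stand-in teachers. The paper's actual loss functions, network architecture, and KPCA/ICA teachers are not given on this page.

```python
import numpy as np

def two_stage_distill(X, teacher_feats, teacher_recons, alpha=0.5):
    """Hypothetical linear student autoencoder trained in two stages.

    X              : (n, m) normalized training data from normal operation.
    teacher_feats  : list of (n, k_i) feature matrices, one per teacher.
    teacher_recons : list of (n, m) reconstructions, one per teacher.
    alpha          : assumed weight on the raw input in the stage-2 target.
    """
    # Stage 1 (feature level): the encoder maps X onto the concatenated
    # teacher features; least squares minimizes ||X @ W_enc - Z_t||^2.
    Z_t = np.hstack(teacher_feats)
    W_enc, *_ = np.linalg.lstsq(X, Z_t, rcond=None)
    Z_s = X @ W_enc

    # Stage 2 (reconstruction level): with the encoder frozen, the decoder
    # maps student features onto a blend of X and the mean teacher output.
    target = alpha * X + (1.0 - alpha) * np.mean(teacher_recons, axis=0)
    W_dec, *_ = np.linalg.lstsq(Z_s, target, rcond=None)
    return W_enc, W_dec

def pca_teacher(X, k):
    """Stand-in teacher (plain PCA via SVD), used only to make this runnable."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:k].T
    Z = X @ P
    return Z, Z @ P.T

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Two stand-in teachers with different feature dimensions.
feats, recons = zip(*(pca_teacher(X, k) for k in (2, 3)))
W_enc, W_dec = two_stage_distill(X, list(feats), list(recons))
X_hat = (X @ W_enc) @ W_dec
print(W_enc.shape, W_dec.shape, X_hat.shape)
```

A nonlinear student trained by gradient descent would replace the two least-squares solves with two optimization phases over the same pair of targets; the closed-form version is used here only to keep the sketch short and deterministic.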


Fig. 1 Framework of two-stage multi-teacher knowledge distillation for process monitoring

Fig. 2 Flowchart of the TE process

Fig. 3 AE monitoring results before and after distillation for fault 5 in the TE process ((a) Student AE model; (b) MTDM)

Fig. 4 AE monitoring results before and after distillation for fault 10 in the TE process ((a) Student AE model; (b) MTDM)

Fig. 5 AE monitoring results before and after distillation for fault 19 in the TE process ((a) Student AE model; (b) MTDM)

Fig. 6 Flowchart of the primary reformer

Fig. 7 AE monitoring results before and after distillation for Case1 of the primary reformer in the ammonia synthesis process ((a) Student AE model; (b) MTDM)

Fig. 8 AE monitoring results before and after distillation for Case2 of the primary reformer in the ammonia synthesis process ((a) Student AE model; (b) MTDM)

Fig. 9 Average fault detection rates of the four methods on the TE process

Table 1 FDR of 21 TE process faults detected by ICA, KPCA, AE, and multi-teacher distillation methods (%)

Fault No.	ICA $I^{2(\tau_1)}$	ICA $Q^{(\tau_1)}$	KPCA $T^{2(\tau_2)}$	KPCA $Q^{(\tau_2)}$	AE $T^{2(S)}$	AE $Q^{(S)}$	MTDM $T^{2(S)}$	MTDM $Q^{(S)}$
F1	99.75	99.75	99.75	99.25	99.50	99.88	99.88	99.50
F2	98.62	98.88	98.75	98.12	98.75	98.88	98.88	98.38
F3	8.75	8.50	7.88	9.62	10.12	5.62	7.88	6.25
F4	100.00	100.00	100.00	9.62	72.88	100.00	99.50	96.38
F5	100.00	100.00	28.62	31.25	34.88	34.50	100.00	100.00
F6	100.00	100.00	99.50	99.50	99.25	100.00	100.00	100.00
F7	100.00	100.00	100.00	66.75	100.00	100.00	100.00	100.00
F8	98.38	98.12	98.12	97.25	97.50	95.12	98.38	97.50
F9	8.75	6.50	6.38	7.88	9.38	5.50	6.00	7.75
F10	90.50	90.62	54.75	49.25	49.12	56.38	88.00	81.50
F11	78.38	78.25	79.25	27.62	64.38	61.38	73.25	66.88
F12	99.88	99.88	99.12	97.25	99.00	97.12	99.75	99.62
F13	95.38	95.50	95.50	94.25	95.62	95.25	95.62	94.62
F14	100.00	100.00	100.00	93.00	100.00	97.88	100.00	99.88
F15	19.88	17.50	13.25	15.62	12.38	8.75	14.00	8.75
F16	92.12	93.88	37.00	34.88	32.00	51.88	90.62	83.38
F17	96.25	96.25	96.00	76.12	85.50	96.50	95.38	91.62
F18	90.38	91.00	91.25	89.50	90.38	90.38	91.00	91.50
F19	86.50	90.75	18.88	1.75	25.50	25.37	86.62	73.88
F20	90.38	90.75	68.38	41.38	50.25	61.00	78.38	75.88
F21	64.88	61.62	54.37	30.12	48.38	54.87	57.63	37.00
Mean	81.85	81.80	68.89	55.71	65.47	68.39	80.04	76.68
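
The column headers in Table 1 refer to two kinds of monitoring statistic: a $T^2$ statistic (or $I^2$ for ICA) computed in the feature space, and a $Q$ statistic computed on the reconstruction residual. A minimal sketch under the usual textbook definitions follows. The helper names, the empirical-quantile control limit, the 99% confidence level, and the PCA projection used to generate features are all assumptions, not details taken from the paper.

```python
import numpy as np

def t2_stat(Z, cov_inv):
    """T^2 = z^T S^{-1} z for each row z (features assumed zero-mean)."""
    return np.einsum("ij,jk,ik->i", Z, cov_inv, Z)

def q_stat(X, X_hat):
    """Q (squared prediction error) = ||x - x_hat||^2 for each row."""
    return np.sum((X - X_hat) ** 2, axis=1)

def fit_limits(Z_train, X_train, X_hat_train, conf=0.99):
    """Estimate the feature covariance and control limits from normal data."""
    cov_inv = np.linalg.pinv(np.cov(Z_train, rowvar=False))
    t2_lim = np.quantile(t2_stat(Z_train, cov_inv), conf)
    q_lim = np.quantile(q_stat(X_train, X_hat_train), conf)
    return cov_inv, t2_lim, q_lim

# Synthetic normal data and a two-component linear projection as the model.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 4))
X -= X.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
P = Vt[:2].T
Z, X_hat = X @ P, X @ P @ P.T
cov_inv, t2_lim, q_lim = fit_limits(Z, X, X_hat)

# A large step change on every variable should trip at least one statistic.
X_new = X[:5] + 10.0
Z_new = X_new @ P
alarms = (t2_stat(Z_new, cov_inv) > t2_lim) | (q_stat(X_new, Z_new @ P.T) > q_lim)
print(alarms)
```

A sample is flagged when either statistic exceeds its limit, which is why each method in the table reports two detection rates.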

Table 2 Data division settings for the ammonia synthesis experimental cases

Case No.	Training set	Validation set	Test set	Operating-condition switching point
Case1	1–300	301–560	561–2000	140
Case2	1–550	551–800	801–2000	200
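
The sample ranges in Table 2 are 1-based and inclusive, so they translate to Python slices as sketched below. The `CASES` dictionary and the `split` helper are illustrative names. This page does not say how the switching point is indexed (for example, from the start of the test set), so the sketch only slices the three subsets and carries the switching point along as metadata.

```python
# Settings copied from Table 2; ranges are (first, last), 1-based inclusive.
CASES = {
    "Case1": {"train": (1, 300), "val": (301, 560), "test": (561, 2000), "switch": 140},
    "Case2": {"train": (1, 550), "val": (551, 800), "test": (801, 2000), "switch": 200},
}

def split(samples, case):
    """Slice a sequence of samples into train/validation/test subsets."""
    cfg = CASES[case]
    out = {}
    for name in ("train", "val", "test"):
        first, last = cfg[name]
        out[name] = samples[first - 1:last]  # convert to 0-based, end-exclusive
    return out

parts = split(list(range(1, 2001)), "Case1")
sizes = {name: len(rows) for name, rows in parts.items()}
print(sizes)  # {'train': 300, 'val': 260, 'test': 1440}
```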

Table 3 FDR of 2 primary reformer cases in the ammonia synthesis process detected by ICA, KPCA, AE, and multi-teacher distillation methods (%)

Case No.	ICA $I^{2(\tau_1)}$	ICA $Q^{(\tau_1)}$	KPCA $T^{2(\tau_2)}$	KPCA $Q^{(\tau_2)}$	AE $T^{2(S)}$	AE $Q^{(S)}$	MTDM $T^{2(S)}$	MTDM $Q^{(S)}$
Case1	72.77	91.92	90.77	96.46	100.00	14.69	89.54	86.46
Case2	82.50	83.60	86.70	92.80	100.00	82.70	84.10	85.20
Mean	77.64	87.76	88.74	94.63	100.00	48.70	86.82	85.83

Table 4 FAR of 2 primary reformer cases in the ammonia synthesis process detected by ICA, KPCA, AE, and multi-teacher distillation methods (%)

Case No.	ICA $I^{2(\tau_1)}$	ICA $Q^{(\tau_1)}$	KPCA $T^{2(\tau_2)}$	KPCA $Q^{(\tau_2)}$	AE $T^{2(S)}$	AE $Q^{(S)}$	MTDM $T^{2(S)}$	MTDM $Q^{(S)}$
Case1	0	0	0.70	0	99.28	0	0	0
Case2	0	0	10.00	4.00	97.00	5.00	5.00	5.50
Mean	0	0	5.35	2.00	98.14	2.50	2.50	2.75
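
Tables 3 and 4 report the fault detection rate (FDR) and the false alarm rate (FAR). Under the usual definitions, FDR is the share of faulty samples that raise an alarm and FAR is the share of normal samples that do. The function below is a sketch with an assumed name and made-up toy data; it does not reproduce the paper's numbers.

```python
def fdr_far(alarms, is_fault):
    """Return (FDR, FAR) in percent from per-sample boolean sequences."""
    faulty = [a for a, f in zip(alarms, is_fault) if f]
    normal = [a for a, f in zip(alarms, is_fault) if not f]
    fdr = 100.0 * sum(faulty) / len(faulty) if faulty else float("nan")
    far = 100.0 * sum(normal) / len(normal) if normal else float("nan")
    return fdr, far

# Toy data: 4 normal samples followed by 6 faulty ones.
is_fault = [False] * 4 + [True] * 6
alarms = [False, True, False, False, True, True, False, True, True, True]
fdr, far = fdr_far(alarms, is_fault)
print(round(fdr, 2), round(far, 2))  # 83.33 25.0
```

Reading the two tables together matters: the AE $T^2$ column reaches 100% FDR in Table 3 only alongside a FAR above 97% in Table 4, so that statistic is alarming on nearly every sample rather than detecting the fault.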

Table 5 FDR comparison between MTDM and MMSF on representative TE process faults (%)

Fault No.	MTDM $T^{2(S)}$	MTDM $Q^{(S)}$	MMSF $T^{2(F)}$	MMSF $Q^{(F)}$
F4	99.50	96.38	99.88	64.62
F10	88.00	81.50	86.00	81.88
F19	86.62	73.88	73.00	51.50
