A Complete Spatio-temporal Information Guided Approach for Robust KPI Forecasting Under Endogenous Variable Missingness
Abstract: Key performance indicator (KPI) forecasting is vital to industrial process optimization and safety. In real industrial environments, however, sensor failures often leave the endogenous variable (the prediction target) missing at inference time, causing information asymmetry. Lacking the historical autoregressive information of the endogenous variable at inference time, existing methods struggle to build robust spatio-temporal feature mappings, which severely degrades multi-step prediction. To address this challenge, the Complete Spatio-Temporal information Guided Network (CSTG-Net) is proposed. CSTG-Net adopts a dual-stream architecture with “complete-variable guidance” and “exogenous-variable learning” branches and, based on variational Bayesian theory, reformulates prediction under endogenous-variable missingness as a feature alignment task. Distribution constraints enable the network to learn spatio-temporal representations that approximate those extracted from the complete variables even when variables are missing. In addition, a multi-scale spatio-temporal aggregation module combines graph structure learning with attention mechanisms to dynamically model the couplings among variables and to compress and refine the feature space, capturing complex spatio-temporal correlations related to the KPI. Experiments on the electric power transformer datasets and an alumina rotary kiln dataset show that CSTG-Net achieves good generalization and robust multi-step prediction performance under endogenous-variable missingness.
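The distribution constraint described in the abstract can be illustrated with a closed-form KL divergence between two diagonal Gaussian latent distributions, one produced by the exogenous-variable branch and one by the complete-variable branch. This is only a minimal sketch: the Gaussian parameterization, the direction of the KL term, and all names (`gaussian_kl`, `mu_exo`, `mu_full`, and so on) are assumptions made for illustration, not details confirmed by the paper.

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians, summed over the last axis.

    q is taken (as an assumption) to be the latent distribution of the
    exogenous-only branch, p that of the complete-variable branch.
    """
    var_q = np.exp(logvar_q)
    var_p = np.exp(logvar_p)
    kl = 0.5 * (logvar_p - logvar_q + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)
    return kl.sum(axis=-1)

# Hypothetical latent statistics: 4 samples, 8 latent dimensions.
rng = np.random.default_rng(0)
mu_exo, logvar_exo = rng.normal(size=(4, 8)), 0.1 * rng.normal(size=(4, 8))
mu_full, logvar_full = rng.normal(size=(4, 8)), 0.1 * rng.normal(size=(4, 8))

# Alignment penalty per sample; it is zero only when both distributions match.
kl_per_sample = gaussian_kl(mu_exo, logvar_exo, mu_full, logvar_full)
kl_self = gaussian_kl(mu_full, logvar_full, mu_full, logvar_full)
```

Minimizing such a term pulls the representation learned without the endogenous variable toward the one learned with it, which is the sense in which the forecasting problem becomes a feature alignment task.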
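Table 1 Description of the rotary kiln dataset

| Variable | Description | Unit | Mean |
|---|---|---|---|
| $x_{1}$ | Kiln head temperature | ℃ | 659.5 |
| $x_{2}$ | Kiln tail temperature | ℃ | 258.1 |
| $x_{3}$ | Main motor current | A | 191.2 |
| $x_{4}$ | Cooler motor current | A | 287.7 |
| $x_{5}$ | Coal feed rate | rad/s | 506.0 |
| $x_{6}$ | Feed amount | t/h | 69.6 |
| $y$ | Sintering temperature | ℃ | 950.6 |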
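Table 2 Summary of baseline models

| Paradigm | Model | Characteristics |
|---|---|---|
| Recurrent neural network | LSTM | Inherently sequential structure; good at capturing temporal dependencies in sequences |
| Transformer | Crossformer | Two-stage attention mechanism captures cross-time and cross-variable dependencies |
| Transformer | iTransformer | Treats the variable dimension as tokens; efficiently extracts spatial couplings among variables |
| Separate modeling of endogenous and exogenous variables | CrossLinear | Explicitly separates the representation-learning paths of endogenous and exogenous variables and models their interaction |
| Forecasting network for missing series | GinAR | Reconstructs representations of missing variables from observable variables via an interpolation attention mechanism |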
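Table 3 Hyperparameter settings of CSTG-Net

| Hyperparameter | Value | Hyperparameter | Value |
|---|---|---|---|
| Optimizer | Adam | Early-stopping patience | 15 |
| Learning rate | 0.001 | Warm-up epochs | 30 |
| Batch size | 512 | KLD weight | $5\times10^{-4}$ |
| Training epochs | 100 | Adversarial loss weight | 0.05 |
| Number of downsampling steps | 3 | Hidden dimension | 128 |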
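To make the role of the two loss weights in Table 3 concrete, the sketch below stores the table's values in a configuration object and combines three loss terms as a weighted sum. The values come from Table 3; the additive form of the objective, the class name `CSTGNetConfig`, and the function `total_loss` are assumptions for illustration only. How the warm-up epochs are used is not stated in this excerpt, so that field is stored but not applied.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CSTGNetConfig:
    # Values taken from Table 3; field names are hypothetical.
    optimizer: str = "Adam"
    learning_rate: float = 1e-3
    batch_size: int = 512
    epochs: int = 100
    downsampling_steps: int = 3
    early_stopping_patience: int = 15
    warmup_epochs: int = 30          # usage not specified in this excerpt
    kld_weight: float = 5e-4
    adversarial_weight: float = 0.05
    hidden_dim: int = 128

def total_loss(pred_loss: float, kld: float, adv_loss: float,
               cfg: CSTGNetConfig) -> float:
    """Assumed objective: prediction loss plus weighted KLD and adversarial terms."""
    return pred_loss + cfg.kld_weight * kld + cfg.adversarial_weight * adv_loss

cfg = CSTGNetConfig()
# With these weights, a KLD of 200 and an adversarial loss of 2 each add 0.1.
example = total_loss(pred_loss=1.0, kld=200.0, adv_loss=2.0, cfg=cfg)
```

Under this assumed form, the small KLD weight keeps the alignment term from dominating the prediction loss even when the raw divergence is large.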
Table 4 Performance comparison of different models on the rotary kiln dataset at various forecasting horizons

| Model | Metric | Horizon 1 | Horizon 3 | Horizon 6 | Horizon 12 | Horizon 24 |
|---|---|---|---|---|---|---|
| LSTM | MAE | 35.571 | 35.583 | 35.592 | 35.609 | 35.649 |
| LSTM | RMSE | 45.766 | 45.782 | 45.798 | 45.828 | 45.884 |
| Crossformer | MAE | 29.057 | 27.108 | 29.141 | 28.749 | 30.058 |
| Crossformer | RMSE | 34.545 | 35.413 | 34.772 | 34.969 | 37.993 |
| iTransformer | MAE | 35.574 | 35.586 | 35.585 | 35.609 | 35.641 |
| iTransformer | RMSE | 45.768 | 45.783 | 45.789 | 45.825 | 45.874 |
| CrossLinear | MAE | 30.490 | 28.321 | 26.666 | 28.433 | 29.411 |
| CrossLinear | RMSE | 36.107 | 34.390 | 34.301 | 34.485 | 36.164 |
| GinAR | MAE | 36.748 | 37.237 | 36.857 | 37.537 | 36.521 |
| GinAR | RMSE | 47.000 | 47.455 | 46.938 | 47.685 | 46.695 |
| CSTG-Net | MAE | 26.434 | 26.993 | 26.317 | 26.957 | 27.657 |
| CSTG-Net | RMSE | 34.431 | 32.323 | 32.290 | 33.163 | 33.536 |
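The MAE and RMSE values reported per forecasting horizon can be computed as in the following sketch. The array layout (rows are test samples, columns are horizons) and the function name `horizon_metrics` are assumptions, and the numbers are toy data rather than values from the paper.

```python
import numpy as np

def horizon_metrics(y_true, y_pred):
    """Return per-horizon MAE and RMSE for arrays of shape (samples, horizons)."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    mae = np.abs(err).mean(axis=0)
    rmse = np.sqrt((err ** 2).mean(axis=0))
    return mae, rmse

# Toy example: 2 samples, 2 horizons.
y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.0, 4.0], [5.0, 4.0]])
mae, rmse = horizon_metrics(y_true, y_pred)
```

RMSE is never smaller than MAE on the same errors, and the gap widens when a few errors are much larger than the rest, which is why the two metrics are reported side by side.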
Table 5 Performance comparison of different models on the ETT datasets

| Model | Metric | ETTh1 | ETTh2 | ETTm1 | ETTm2 |
|---|---|---|---|---|---|
| LSTM | MAE | 9.291 | 14.413 | 12.283 | 15.817 |
| LSTM | RMSE | 9.836 | 16.043 | 12.679 | 17.686 |
| Crossformer | MAE | 6.486 | 7.025 | 6.162 | 12.791 |
| Crossformer | RMSE | 7.344 | 8.288 | 7.039 | 14.345 |
| iTransformer | MAE | 8.930 | 14.416 | 12.282 | 12.790 |
| iTransformer | RMSE | 9.461 | 16.049 | 12.677 | 14.343 |
| CrossLinear | MAE | 7.049 | 6.628 | 7.032 | 14.126 |
| CrossLinear | RMSE | 8.089 | 8.064 | 7.937 | 15.794 |
| GinAR | MAE | 8.705 | 12.542 | 9.742 | 10.693 |
| GinAR | RMSE | 9.339 | 15.080 | 10.762 | 13.376 |
| CSTG-Net | MAE | 3.156 | 4.182 | 3.288 | 6.044 |
| CSTG-Net | RMSE | 3.779 | 5.118 | 3.902 | 7.495 |
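Table 6 Ablation study of adversarial training on the ETTh1 dataset

| Model | Metric | Horizon 1 | Horizon 3 | Horizon 6 | Horizon 12 | Horizon 24 |
|---|---|---|---|---|---|---|
| CSTG-Net | MAE | 2.657 | 2.973 | 3.027 | 3.272 | 3.851 |
| CSTG-Net | RMSE | 3.288 | 3.584 | 3.64 | 3.872 | 4.509 |
| Without adversarial loss | MAE | 3.603 | 3.623 | 4.934 | 5.311 | 6.437 |
| Without adversarial loss | RMSE | 4.165 | 4.180 | 5.575 | 5.933 | 7.028 |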