Study of Missing Value Imputation in Wind Turbine Data Based on Multivariate Spatiotemporal Integration Network

ZHAN Zhao-Kang; HU Xu-Guang; ZHAO Hao-Ran; ZHANG Si-Qi; ZHANG Jun-Kai; MA Da-Zhong

doi: 10.16383/j.aas.c230534

1. College of Information Science and Engineering, Northeastern University, Shenyang 110819
2. School of Electrical Engineering, Shandong University, Jinan 250100

Funds: Supported by National Natural Science Foundation of China (U22A20221, 62303103, 62073064), Fundamental Research Funds for the Central Universities in China (N2304017, N2204007), and Natural Science Foundation of Liaoning Province (2022-KF-11-02)

Author Bio:
ZHAN Zhao-Kang: Master student at the College of Information Science and Engineering, Northeastern University. Her research interests cover neural networks and data-driven data imputation. E-mail: 2200758@stu.neu.edu.cn

HU Xu-Guang: Lecturer at the College of Information Science and Engineering, Northeastern University. His research interests cover intelligent modelling, integrated and efficient utilization, and optimal regulation of energy systems driven by hybrid data-model methods. Corresponding author of this paper. E-mail: huxuguang@mail.neu.edu.cn

ZHAO Hao-Ran: Professor at the School of Electrical Engineering, Shandong University. His research interests cover new energy generation and grid connection, modeling and simulation of new power systems, and optimal operation and control of integrated energy systems. E-mail: hzhao@sdu.edu.cn

ZHANG Si-Qi: Master student at the College of Information Science and Engineering, Northeastern University. Her main research interest is machine learning-based data prediction. E-mail: 2270967@stu.neu.edu.cn

ZHANG Jun-Kai: Master student at the College of Information Science and Engineering, Northeastern University. His research interests cover data prediction and partition recovery of energy systems. E-mail: 2100687@stu.neu.edu.cn

MA Da-Zhong: Professor at the College of Information Science and Engineering, Northeastern University. His research interests cover fault diagnosis, fault-tolerant control, energy management systems, and control and optimization of distributed generation systems, microgrids, and the energy internet. E-mail: madazhong@ise.neu.edu.cn

Publication history
- Received: 2023-08-30
- Accepted: 2024-02-09
- Published online: 2024-04-28
- Issue date: 2024-06-27

Abstract: The integrity of wind farm data can be compromised by severe weather, input signal loss, sensor failure, and similar causes, and large-scale data loss poses a serious challenge to the operation and maintenance of wind turbine equipment. This paper therefore proposes a multivariate spatiotemporal integration network (MSIN) to address the missing data problem. First, an MSIN structure containing a missing-value localization-guidance mechanism is proposed, which reveals the latent information in the missing part of the data and ensures that the imputed data conform to the true distribution. Second, a multi-view spatiotemporal convolution module is designed in the network to capture the local spatial and global temporal correlations among multiple variables of the same wind turbine and among the same variable of multiple wind turbines, which improves the realism of the imputed data. Third, a real-time self-updating mechanism is proposed that adjusts the network online as wind farm conditions change; this improves the generalization ability of the network and avoids the high time and space costs of retraining the model. Finally, the effectiveness and superiority of the proposed network are verified on real wind turbine data. The results show that, compared with traditional data imputation methods such as MissForest, the mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE) are reduced by more than 18.54%, 41.00%, and 3.15%, respectively.
Keywords:
- Wind turbine data
- data imputation
- spatiotemporal characteristics
- generative adversarial networks
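The three metrics named in the abstract can be computed as in the following minimal sketch. It assumes (the abstract does not say) that errors are evaluated only at the artificially removed positions, marked by a hypothetical binary array `missing` (1 = value was removed). The function names and the `eps` guard against division by zero are illustrative, and MAPE is returned as a fraction because the reporting unit is not specified here.

```python
import numpy as np

def imputation_errors(truth, imputed, missing, eps=1e-8):
    """Return (MAE, MAPE, RMSE) over the entries where missing == 1."""
    truth = np.asarray(truth, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    idx = np.asarray(missing).astype(bool)
    err = imputed[idx] - truth[idx]
    mae = np.mean(np.abs(err))
    # eps avoids dividing by zero when a true value is exactly 0.
    mape = np.mean(np.abs(err) / np.maximum(np.abs(truth[idx]), eps))
    rmse = np.sqrt(np.mean(err ** 2))
    return mae, mape, rmse

def relative_reduction(baseline, proposed):
    """Percentage by which `proposed` is lower than `baseline`."""
    return 100.0 * (baseline - proposed) / baseline

# Toy data: two of the four entries were removed and then imputed.
truth = np.array([[1.0, 2.0], [4.0, 5.0]])
imputed = np.array([[1.0, 2.5], [3.0, 5.0]])
missing = np.array([[0, 1], [1, 0]])
mae, mape, rmse = imputation_errors(truth, imputed, missing)
print(mae, mape, rmse)
```

`relative_reduction` expresses the kind of comparison quoted in the abstract: a baseline error of 0.20 against a proposed error of 0.15 is a 25% reduction.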

Fig. 1 Schematic diagram of spatiotemporal correlation analysis of wind turbines

Fig. 2 The architecture of MSIN

Fig. 3 Multi-view spatiotemporal convolution module
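The module in Fig. 3 works on the two kinds of correlation named in the abstract: among the variables of one turbine, and among the turbines for one variable. The sketch below only shows how those two views could be sliced from a data tensor. The `(turbine, time, variable)` layout, the array sizes, the random placeholder data, and the helper names are all assumptions for illustration; the convolution module itself is not reproduced.

```python
import numpy as np

# Hypothetical layout: (number of turbines, time steps, variables).
n_turbines, n_steps, n_vars = 4, 96, 6
rng = np.random.default_rng(0)
data = rng.normal(size=(n_turbines, n_steps, n_vars))  # placeholder data

def variable_view(data, turbine):
    """All variables of one turbine over time: shape (time, variables)."""
    return data[turbine]

def turbine_view(data, variable):
    """One variable across all turbines over time: shape (time, turbines)."""
    return data[:, :, variable].T

view_a = variable_view(data, turbine=0)
view_b = turbine_view(data, variable=2)

# Pairwise Pearson correlations inside each view (columns are the series).
corr_vars = np.corrcoef(view_a, rowvar=False)      # (variables, variables)
corr_turbines = np.corrcoef(view_b, rowvar=False)  # (turbines, turbines)
print(corr_vars.shape, corr_turbines.shape)
```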

Fig. 4 Network training flowchart

Fig. 5 Results of incomplete data imputation of the proposed method for different wind turbines with the same missing rate ((a) Sample 1; (b) Sample 2; (c) Sample 3; (d) Sample 4)

Fig. 6 Incomplete data imputation results for the same wind turbine sample at different missing rates ((a) 0.1; (b) 0.2; (c) 0.3; (d) 0.4; (e) 0.5; (f) 0.6; (g) 0.7; (h) 0.8)
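Experiments such as the one in Fig. 6 need incomplete inputs at a chosen missing rate. A minimal sketch follows, assuming values are removed completely at random (this section does not state the missing pattern). `make_missing`, the NaN encoding of hidden values, and the fixed seed are illustrative choices.

```python
import numpy as np

def make_missing(data, missing_rate, seed=0):
    """Hide round(missing_rate * data.size) randomly chosen entries.

    Returns (corrupted, mask): mask is 1 for observed entries and 0 for
    hidden ones, and hidden entries of `corrupted` are NaN.
    """
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    n_hidden = int(round(missing_rate * data.size))
    hidden = rng.choice(data.size, size=n_hidden, replace=False)
    mask = np.ones(data.size, dtype=int)
    mask[hidden] = 0
    mask = mask.reshape(data.shape)
    corrupted = np.where(mask == 1, data, np.nan)
    return corrupted, mask

# Toy complete sample: 20 time steps of 10 variables.
complete = np.arange(200, dtype=float).reshape(20, 10)
for rate in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8):
    corrupted, mask = make_missing(complete, rate)
    print(rate, int(np.isnan(corrupted).sum()), "of", complete.size, "hidden")
```

Returning the mask alongside the corrupted array keeps the ground truth positions available for later error evaluation.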

Fig. 7 The average results of evaluation metrics for ablation experiments ((a) MAE; (b) MAPE; (c) RMSE)

Fig. 8 CPU usage at runtime for seven imputation methods

Fig. 9 Comparative experimental results of different methods ((a) MAE; (b) MAPE; (c) RMSE)

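Table 1 The variables of wind turbine

No.	Variable	No.	Variable
1	Hub rotational speed	15	Generator stator temperature 1
2	Blade pitch angle 1	16	Generator stator temperature 2
3	Blade pitch angle 2	17	Generator stator temperature 3
4	Blade pitch angle 3	18	Generator stator temperature 4
5	Node vibration in X direction	19	Generator stator temperature 5
6	Node vibration in Y direction	20	Generator stator temperature 6
7	Grid-side output power	21	Generator output power
8	Wind direction offset angle	22	Hub angle
9	Speed sensor	23	Generator torque
10	ISU temperature	24	INU RMIO temperature
11	Generator ambient temperature 1	25	Gearbox front bearing temperature
12	Generator ambient temperature 2	26	Gearbox rear bearing temperature
13	Nacelle temperature	27	INU temperature
14	Wind speed	28	Wind direction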

Table 2 Evaluation results under different hint rates

Hint rate	MAE	MAPE	RMSE
0.10	0.1549	3.0010	0.2396
0.20	0.1552	2.9599	0.2398
0.30	0.1557	2.3107	0.2384
0.40	0.1564	2.2437	0.2401
0.50	0.1552	3.3131	0.2390
0.60	0.1555	2.2019	0.2400
0.70	0.1577	2.2831	0.2398
0.80	0.1543	2.8454	0.2397
0.90	0.1541	1.1783	0.2381
0.95	0.1561	1.9770	0.2391
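Table 2 varies a hint rate, a term used in GAIN-style adversarial imputation, where the discriminator is shown part of the true missing-value mask. Whether MSIN builds its hint in exactly this way is not stated in this section, so the following is only a sketch of the GAIN-style construction under that assumption. `make_hint` and its arguments are hypothetical names.

```python
import numpy as np

def make_hint(mask, hint_rate, seed=0):
    """GAIN-style hint: reveal each mask entry with probability hint_rate.

    mask: 1 = observed, 0 = missing. Revealed entries copy the mask;
    unrevealed entries are set to 0.5, meaning "not disclosed".
    """
    mask = np.asarray(mask, dtype=float)
    rng = np.random.default_rng(seed)
    reveal = (rng.random(mask.shape) < hint_rate).astype(float)
    return reveal * mask + 0.5 * (1.0 - reveal)

mask = np.array([[1, 0, 1, 1], [0, 1, 1, 0]])
hint = make_hint(mask, hint_rate=0.9)
print(hint)
```

With a hint rate of 1 the hint equals the mask, and with a hint rate of 0 every entry is 0.5, so the rate controls how much of the mask is disclosed.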

Table 3 Evaluation results under different $ \alpha $

$ \alpha $	MAE	MAPE	RMSE
0.0001	0.6231	27135.3668	0.4956
0.0010	0.4983	128671.0614	0.6251
0.0100	0.4963	42939.8706	0.6236
0.1000	0.4967	167721.3201	0.6238
1	0.3625	229.8665	0.4843
10	0.1805	23.6173	0.2644
100	0.1539	5.4836	0.2321
1000	0.1518	5.7790	0.2488
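Read metric by metric, Table 3 shows that the preferred $ \alpha $ depends on which error is minimized. The snippet below simply transcribes the table and picks the minimizer of each column; `table3` and `best_setting` are illustrative names.

```python
# Rows of Table 3: alpha -> (MAE, MAPE, RMSE).
table3 = {
    0.0001: (0.6231, 27135.3668, 0.4956),
    0.001: (0.4983, 128671.0614, 0.6251),
    0.01: (0.4963, 42939.8706, 0.6236),
    0.1: (0.4967, 167721.3201, 0.6238),
    1: (0.3625, 229.8665, 0.4843),
    10: (0.1805, 23.6173, 0.2644),
    100: (0.1539, 5.4836, 0.2321),
    1000: (0.1518, 5.7790, 0.2488),
}

def best_setting(table, column):
    """Return the key whose row has the smallest value in `column`."""
    return min(table, key=lambda key: table[key][column])

best = {}
for column, name in enumerate(("MAE", "MAPE", "RMSE")):
    best[name] = best_setting(table3, column)
print(best)  # {'MAE': 1000, 'MAPE': 100, 'RMSE': 100}
```

In this table MAE is lowest at $ \alpha $ = 1000, while MAPE and RMSE are lowest at $ \alpha $ = 100.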

Table 4 Evaluation results under different $ \beta $

$ \beta $	MAE	MAPE	RMSE
0.0001	0.1532	1.2270	0.2320
0.0010	0.1505	2.3903	0.2290
0.0100	0.1507	2.3558	0.2274
0.1000	0.1499	1.9291	0.2268
1	0.1530	4.0830	0.2319
10	0.1801	23.7244	0.2641
100	0.3652	237.1457	0.4874
1000	0.4970	35792.8434	0.6240

Table 5 Evaluation results under different learning rates

Learning rate	MAE	MAPE	RMSE
0.0001	0.2121	1.7066	0.2941
0.0010	0.1521	1.4009	0.2295
0.0100	0.4272	4.2201	0.5652
0.1000	0.4264	7.0552	0.5648
1	0.4302	5.2400	0.5676
10	0.4269	7.8907	0.5646
100	0.4272	9.6068	0.5657
1000	0.4298	6.7900	0.5674

Table 6 Results of evaluation metrics for wind turbine data with different missing rates

Missing rate	MAE max	MAE min	MAE avg	MAPE max	MAPE min	MAPE avg	RMSE max	RMSE min	RMSE avg
0.1	0.1653	0.0822	0.1179	3.8283	1.2530	2.3968	0.2432	0.1556	0.1877
0.2	0.1768	0.1052	0.1298	3.7203	1.1687	2.4970	0.2656	0.1724	0.2032
0.3	0.1914	0.1127	0.1409	3.7355	1.2704	2.6702	0.2768	0.1884	0.2186
0.4	0.1791	0.1079	0.1356	3.5158	1.2851	2.6973	0.2841	0.1920	0.2244
0.5	0.1881	0.1217	0.1418	3.6810	1.2905	2.7583	0.2654	0.2068	0.2269
0.6	0.1968	0.1386	0.1544	3.7117	1.2130	2.7925	0.2823	0.2239	0.2753
0.7	0.1994	0.1789	0.1629	3.8964	1.2025	2.8347	0.2833	0.2353	0.2538
0.8	0.1999	0.1625	0.1787	3.9935	1.2148	2.8559	0.3004	0.2465	0.2734
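To quantify how the averages in Table 6 degrade as more data go missing, one can compare the lowest and highest missing rates. The figures below are transcribed from the `avg` columns at missing rates 0.1 and 0.8; the helper name is illustrative.

```python
# Average metrics from Table 6 at the two extreme missing rates.
avg_at_0_1 = {"MAE": 0.1179, "MAPE": 2.3968, "RMSE": 0.1877}
avg_at_0_8 = {"MAE": 0.1787, "MAPE": 2.8559, "RMSE": 0.2734}

def percent_increase(low, high):
    """Relative growth from `low` to `high`, in percent."""
    return 100.0 * (high - low) / low

growth = {}
for name in avg_at_0_1:
    growth[name] = percent_increase(avg_at_0_1[name], avg_at_0_8[name])
    print(f"{name}: +{growth[name]:.1f}%")
```

Going from 10% to 80% missing data raises the average MAE by roughly 52%, the average MAPE by roughly 19%, and the average RMSE by roughly 46%.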

Table 7 Running time of the seven imputation methods for one iteration (s)

Imputation method \ Missing rate	0.1	0.2	0.3	0.4	0.5	0.6	0.7	0.8
MSIN	4.3156	4.7167	4.9595	5.1400	5.1159	4.9905	5.1656	5.0997
TimeGAN [28]	6.5895	6.6172	7.3519	8.8907	7.7120	8.4728	7.8757	8.3546
M-RNN [29]	81.1218	70.8649	69.9753	67.5593	69.0319	68.2631	71.2586	68.9668
MIRACLE [30]	0.2554	0.3761	0.3752	0.3925	0.3879	0.3692	0.3712	0.3941
MICE [31]	2.5963	2.1705	2.1164	2.7042	2.2922	2.3221	2.6145	2.5653
MissForest [32]	0.5963	0.5771	0.7897	0.7921	0.8396	0.9587	0.9132	0.8527
LGDI [33]	15.6514	14.0879	15.8731	16.3439	14.9822	17.3042	15.9346	17.8468
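The per-iteration times in Table 7 are easier to compare once averaged over the eight missing rates. The snippet below transcribes the table, averages each row, and ranks the methods; the variable and function names are illustrative.

```python
# Per-iteration running times (s) from Table 7, missing rates 0.1 to 0.8.
runtime = {
    "MSIN": [4.3156, 4.7167, 4.9595, 5.1400, 5.1159, 4.9905, 5.1656, 5.0997],
    "TimeGAN": [6.5895, 6.6172, 7.3519, 8.8907, 7.7120, 8.4728, 7.8757, 8.3546],
    "M-RNN": [81.1218, 70.8649, 69.9753, 67.5593, 69.0319, 68.2631, 71.2586, 68.9668],
    "MIRACLE": [0.2554, 0.3761, 0.3752, 0.3925, 0.3879, 0.3692, 0.3712, 0.3941],
    "MICE": [2.5963, 2.1705, 2.1164, 2.7042, 2.2922, 2.3221, 2.6145, 2.5653],
    "MissForest": [0.5963, 0.5771, 0.7897, 0.7921, 0.8396, 0.9587, 0.9132, 0.8527],
    "LGDI": [15.6514, 14.0879, 15.8731, 16.3439, 14.9822, 17.3042, 15.9346, 17.8468],
}

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

means = {}
for method, times in runtime.items():
    means[method] = mean(times)

# Fastest first; the last column is the time relative to MSIN.
ranking = sorted(means, key=means.get)
for method in ranking:
    ratio = means[method] / means["MSIN"]
    print(f"{method:<10} {means[method]:8.4f} s  ({ratio:.2f}x MSIN)")
```

On these averages MSIN is slower than MIRACLE, MissForest, and MICE, and faster than TimeGAN, LGDI, and M-RNN.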

References (33)

[1]	Hu Xu-Guang, Ma Da-Zhong, Zheng Jun, Zhang Hua-Guang, Wang Rui. An operation state analysis method for integrated energy system based on correlation information adversarial learning. Acta Automatica Sinica, 2020, 46(9): 1783−1797
[2]	Wang Rui, Sun Qiu-Ye, Zhang Hua-Guang. Research on current sharing/voltage recovery based adaptive dynamic programming control strategy of microgrids. Acta Automatica Sinica, 2022, 48(2): 479−491
[3]	Li Yuan-Zheng, Ni Zhi-Xian, Duan Jun-Tao, Xu Lei, Yang Tao, Zeng Zhi-Gang. Demand response scheduling of major energy-consuming enterprises based on a high proportion of renewable energy power grid. Acta Automatica Sinica, 2023, 49(4): 754−768
[4]	Hu X G, Zhang H G, Ma D Z, Wang R. Hierarchical pressure data recovery for pipeline network via generative adversarial networks. IEEE Transactions on Automation Science and Engineering, 2022, 19(3): 1960−1970 doi: 10.1109/TASE.2021.3069003
[5]	Zhang Bo-Wei, Zheng Jian-Fei, Hu Chang-Hua, Pei Hong, Dong Qing. Missing data generation method based on flow model and its application in remaining life prediction. Acta Automatica Sinica, 2023, 49(1): 185−196
[6]	Du Dang-Bo, Zhang Wei, Hu Chang-Hua, Zhou Zhi-Jie, Si Xiao-Sheng, Zhang Jian-Xun. A failure prognosis method based on wavelet-Kalman filtering with missing data. Acta Automatica Sinica, 2014, 40(10): 2115−2125
[7]	Jin X H, Wang H, Kong Z Q, Xu Z W, Qiao W. Condition monitoring of wind turbine generators using SCADA data analysis. IEEE Transactions on Sustainable Energy, 2021, 12(1): 202−210 doi: 10.1109/TSTE.2020.2989220
[8]	Liu Z P, Wang X F, Zhang L. Fault diagnosis of industrial wind turbine blade bearing using acoustic emission analysis. IEEE Transactions on Instrumentation and Measurement, 2020, 69(9): 6630−6639 doi: 10.1109/TIM.2020.2969062
[9]	Liu Chang, Lang Jin. Wind power prediction method using hybrid kernel LSSVM with batch feature. Acta Automatica Sinica, 2020, 46(6): 1264−1273
[10]	Kong Xiao-Bing, Liu Xiang-Jie. Nonlinear model predictive control for DFIG-based wind power generation. Acta Automatica Sinica, 2013, 39(5): 636−643
[11]	Peng Y Y, Qiao W, Qu L Y. Compressive sensing-based missing-data-tolerant fault detection for remote condition monitoring of wind turbines. IEEE Transactions on Industrial Electronics, 2022, 69(2): 1937−1947 doi: 10.1109/TIE.2021.3057039
[12]	Coville A, Siddiqui A, Vogstad K O. The effect of missing data on wind resource estimation. Energy, 2011, 36(7): 4505−4517 doi: 10.1016/j.energy.2011.03.067
[13]	Liu X, Zhang Z J. A two-stage deep autoencoder-based missing data imputation method for wind farm SCADA data. IEEE Sensors Journal, 2021, 21(9): 10933−10945 doi: 10.1109/JSEN.2021.3061109
[14]	Xu Mei-Ling, Xing Tong, Han Min. Spatial-temporal data interpolation based on spatial-temporal Kriging method. Acta Automatica Sinica, 2020, 46(8): 1681−1688
[15]	Ma D Z, Hu X G, Zhang H G, Sun Q Y, Xie X P. A hierarchical event detection method based on spectral theory of multidimensional matrix for power system. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2021, 51(4): 2173−2186 doi: 10.1109/TSMC.2019.2931316
[16]	Hu X G, Zhang H G, Ma D Z, Wang R. A tnGAN-based leak detection method for pipeline network considering incomplete sensor data. IEEE Transactions on Instrumentation and Measurement, 2020, 70: Article No. 3510610
[17]	Mostafa S M. Imputing missing values using cumulative linear regression. CAAI Transactions on Intelligence Technology, 2019, 4(3): 182−200 doi: 10.1049/trit.2019.0032
[18]	Razavi-Far R, Cheng B Y, Saif M, Ahmadi M. Similarity-learning information-fusion schemes for missing data imputation. Knowledge-based Systems, 2020, 187: Article No. 104805 doi: 10.1016/j.knosys.2019.06.013
[19]	Ye C, Wang H Z, Lu W B, Li J Z. Effective Bayesian-network-based missing value imputation enhanced by crowdsourcing. Knowledge-based Systems, 2020, 190: Article No. 105199 doi: 10.1016/j.knosys.2019.105199
[20]	Zhang Z H. Multiple imputation with multivariate imputation by chained equation (MICE) package. Annals of Translational Medicine, 2016, 4(2): Article No. 30
[21]	Wen Cheng-Lin, Lv Fei-Ya, Bao Zhe-Jing, Liu Mei-Qin. A review of data driven-based incipient fault diagnosis. Acta Automatica Sinica, 2016, 42(9): 1285−1299
[22]	Tak S, Woo S, Yeo H. Data-driven imputation method for traffic data in sectional units of road links. IEEE Transactions on Intelligent Transportation Systems, 2016, 17(6): 1762−1771 doi: 10.1109/TITS.2016.2530312
[23]	Folguera L, Zupan J, Cicerone D, Magallanes J F. Self-organizing maps for imputation of missing data in incomplete data matrices. Chemometrics and Intelligent Laboratory Systems, 2015, 143: 146−151 doi: 10.1016/j.chemolab.2015.03.002
[24]	Pan H, Ye Z, He Q Y, Yan C Y, Yuan J Y, Lai X D, et al. Discrete missing data imputation using multilayer perceptron and momentum gradient descent. Sensors, 2022, 22(15): Article No. 5645 doi: 10.3390/s22155645
[25]	Khan H, Wang X Z, Liu H. Handling missing data through deep convolutional neural network. Information Sciences, 2022, 595: 278−293 doi: 10.1016/j.ins.2022.02.051
[26]	Yu B, Yin H T, Zhu Z X. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. arXiv preprint arXiv: 1709.04875, 2018.
[27]	Zhang J B, Zheng Y, Qi D K. Deep spatio-temporal residual networks for citywide crowd flows prediction. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, USA: AAAI Press, 2017. 1655−1661
[28]	Yoon J, Jarrett D, Schaar M V D. Time-series generative adversarial networks. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2019. 5508−5518
[29]	Yoon J, Zame W R, Schaar M V D. Estimating missing data in temporal data streams using multi-directional recurrent neural networks. IEEE Transactions on Biomedical Engineering, 2019, 66(5): 1477−1490 doi: 10.1109/TBME.2018.2874712
[30]	Kyono T, Zhang Y, Bellot A, Schaar M V D. MIRACLE: Causally-aware imputation via learning missing data mechanisms. arXiv preprint arXiv: 2111.03187, 2021.
[31]	Zhang Y F, Thorburn P J, Xiang W, Fitch P. SSIM——A deep learning approach for recovering missing time series sensor data. IEEE Internet of Things Journal, 2019, 6(4): 6618−6628 doi: 10.1109/JIOT.2019.2909038
[32]	Li Z G, He Q. Prediction of railcar remaining useful life by multiple data source fusion. IEEE Transactions on Intelligent Transportation Systems, 2015, 16(4): 2226−2235 doi: 10.1109/TITS.2015.2400424
[33]	Wu R, Hamshaw S D, Yang L, Kincaid D W, Etheridge R, Ghasemkhani A. Data imputation for multivariate time series sensor data with large gaps of missing data. IEEE Sensors Journal, 2022, 22(11): 10671−10683 doi: 10.1109/JSEN.2022.3166643