Abstract: In wastewater treatment, effluent quality indices are the most important indicators of treatment performance and must be monitored strictly, but existing sensing technology struggles to measure them online in real time with adequate accuracy. This paper therefore proposes a novel sparse robust modeling method, based on random vector functional-link networks (RVFLNs) and Schweppe-type generalized M-estimation (GM-estimation), for online robust prediction of effluent quality indices. First, because multicollinearity in the hidden layer matrix of conventional RVFLNs causes least squares estimation to fail, sparse partial least squares (SPLS) replaces least squares in solving for the RVFLNs output weights, yielding the SPLS-RVFLNs algorithm. This algorithm not only resolves the multicollinearity problem of conventional RVFLNs but also selects modeling variables, which improves model interpretability and final prediction accuracy. Second, because the output weights of SPLS-RVFLNs are affected by outliers in both the hidden layer matrix and the output layer matrix, Schweppe-type GM-estimation is used to make SPLS-RVFLNs robust, yielding the GM-SPLS-RVFLNs algorithm, which significantly improves the sparse robustness of the model. Finally, GM-SPLS-RVFLNs is applied to prediction modeling of effluent quality indices in a wastewater treatment process. Experiments on data show that the proposed method overcomes the multicollinearity and poor robustness of conventional RVFLNs and achieves good prediction accuracy and generalization performance.
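The baseline that the abstract starts from can be sketched as follows. This is a minimal illustration of a conventional RVFLN with least squares output weights, not the authors' implementation: the function names, the sigmoid activation, the uniform weight range, and the synthetic data are all assumptions made for the example, and neither SPLS nor GM-estimation is implemented here. The condition number of the hidden layer matrix is printed because a large value is the usual numerical symptom of the multicollinearity the abstract refers to.

```python
import numpy as np

def rvfln_hidden(X, W, b):
    """Build the RVFLN feature matrix: direct input links plus random sigmoid nodes."""
    G = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # random hidden nodes (assumed sigmoid)
    return np.hstack([X, G])                # direct links are concatenated with G

def fit_rvfln(X, Y, n_hidden=30, seed=0):
    """Draw fixed random hidden weights, then solve the output weights by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = rvfln_hidden(X, W, b)
    beta, *_ = np.linalg.lstsq(H, Y, rcond=None)  # the step SPLS would replace
    return W, b, beta, np.linalg.cond(H)

def predict_rvfln(X, W, b, beta):
    return rvfln_hidden(X, W, b) @ beta

# Synthetic stand-in data (hypothetical; not the wastewater data set).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
Y = np.column_stack([np.sin(X[:, 0]) + 0.5 * X[:, 1], X[:, 2] * X[:, 3]])

W, b, beta, cond_H = fit_rvfln(X, Y, n_hidden=30)
Y_hat = predict_rvfln(X, W, b, beta)
train_rmse = np.sqrt(np.mean((Y - Y_hat) ** 2, axis=0))

print("condition number of H:", cond_H)
print("training RMSE per output:", train_rmse)
```

Because the hidden weights are random and fixed, only the linear solve for `beta` changes between the variants named in the abstract, which is why swapping that one step for a different estimator is enough to define SPLS-RVFLNs and GM-SPLS-RVFLNs.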

Table 1 RMSE versus the number of hidden layer nodes with 10 latent variables
Hidden layer nodes   RMSE (BOD)   RMSE (COD)   RMSE (TSS)
10                   0.0372       0.2321       0.1733
15                   0.0290       0.1781       0.1569
20                   0.0290       0.1574       0.1350
25                   0.0258       0.1496       0.1211
30                   0.0247       0.1440       0.1150
35                   0.0223       0.1345       0.1043
40                   0.0225       0.1343       0.1032
50                   0.0224       0.1337       0.1054
100                  0.0221       0.1340       0.1022
200                  0.0227       0.1325       0.1015

Table 2 Performance comparison of different modeling methods for the effluent quality indices when both input and output samples contain 25% outliers
Model            RMSE                      MAPE                      R square
                 BOD     COD     TSS       BOD     COD     TSS       BOD     COD     TSS
RVFLNs           0.1689  1.2691  0.8442    0.0532  0.0215  0.0581    0.7550  0.7068  0.7817
Robust RVFLNs    0.0931  0.6572  0.4539    0.0242  0.0100  0.0242    0.9303  0.9285  0.9413
PRM RVFLNs       0.0893  0.5389  0.4015    0.0216  0.0078  0.0200    0.9330  0.9501  0.9522
GM-SPLS-RVFLNs   0.0301  0.1765  0.1259    0.0056  0.0016  0.0045    0.9959  0.9976  0.9974
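The three performance indexes reported in the tables can be computed as in the sketch below. The definitions are not given in this section, so the code assumes the standard ones: root mean square error, mean absolute percentage error expressed as a fraction rather than a percentage (which matches the magnitude of the tabulated values), and the coefficient of determination. The function names and the toy vectors are illustrative only.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mape(y, y_hat):
    """Mean absolute percentage error as a fraction; assumes no zero targets."""
    return float(np.mean(np.abs((y - y_hat) / y)))

def r_square(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Toy values (hypothetical), with a single prediction error on the last sample.
y_true = np.array([2.0, 4.0, 5.0, 8.0])
y_pred = np.array([2.0, 4.0, 5.0, 10.0])

print("RMSE:", rmse(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
print("R square:", r_square(y_true, y_pred))
```

In a multi-output setting such as BOD, COD, and TSS, each function would be applied to one output column at a time, which gives one value per index as in the tables.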