Stochastic Variational Bayesian Learning of Wiener Model in the Presence of Uncertainty
Abstract: Nonlinear system identification in the presence of multiple uncertainties is an open problem. Bayesian learning has significant advantages in describing and handling uncertainty and has been widely applied to linear system identification, but its use for nonlinear system identification has received far less attention and is hampered by the complexity of the probability estimation and the high computational cost. Motivated by these problems, this paper takes the typical Wiener nonlinear process as its object and proposes a nonlinear system identification method based on stochastic variational Bayesian inference. First, the process noise, the measurement noise and the parameter uncertainty are described probabilistically. Then, the posterior distributions of the model parameters are estimated with the stochastic variational Bayesian approach. Following the idea of stochastic optimization, only the probabilistic information of a subset of the intermediate variables is used at each step to estimate the expected natural gradient of the evidence lower bound; compared with the classical variational Bayesian approach, in which every parameter update depends on all the intermediate variables, the computational complexity is significantly reduced. To the best of our knowledge, this is the first application of stochastic variational Bayesian inference to system identification. Finally, a numerical example and a benchmark Wiener-model problem demonstrate the effectiveness of the proposed method for nonlinear system identification with large-scale data.
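To make the mini-batch natural-gradient idea in the abstract concrete, the following is a minimal sketch, assuming a linear-in-parameters FIR sub-model with known noise precision and a conjugate Gaussian variational posterior. It is not the paper's full Wiener-model algorithm: the static nonlinearity and the per-sample updates of the latent intermediate signals are omitted, and the batch size and step-size constants (`tau`, `kappa`) are illustrative choices, not the authors' settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data from an FIR "linear block": y_t = phi_t^T theta + e_t.
N, order = 10000, 5
theta_true = np.array([1.0, -0.5, 0.25, -0.125, 0.0625])    # illustrative FIR coefficients
u = rng.normal(size=N + order)
Phi = np.stack([u[order - k: order - k + N] for k in range(order)], axis=1)
lam = 1.0                                                    # known noise precision (assumption)
y = Phi @ theta_true + rng.normal(scale=lam ** -0.5, size=N)

# Variational posterior q(theta) = N(m, P^{-1}), stored in natural parameters.
alpha = 1e-2                          # prior precision, theta ~ N(0, alpha^{-1} I)
eta1 = np.zeros(order)                # eta1 = P m
eta2 = -0.5 * alpha * np.eye(order)   # eta2 = -P / 2

batch, tau, kappa = 100, 10.0, 0.7    # mini-batch size and Robbins-Monro schedule (assumptions)
for t in range(1, 201):
    idx = rng.choice(N, size=batch, replace=False)
    Phi_b, y_b = Phi[idx], y[idx]
    scale = N / batch                 # re-weight the mini-batch to stand in for the full data set

    # Noisy estimate of the optimal natural parameters (closed form for this conjugate model).
    eta1_hat = lam * scale * (Phi_b.T @ y_b)
    eta2_hat = -0.5 * (alpha * np.eye(order) + lam * scale * (Phi_b.T @ Phi_b))

    rho = (t + tau) ** (-kappa)       # decaying step size satisfying the Robbins-Monro conditions
    eta1 = (1 - rho) * eta1 + rho * eta1_hat
    eta2 = (1 - rho) * eta2 + rho * eta2_hat

# Recover mean and precision of q(theta) from the natural parameters.
P = -2.0 * eta2
m = np.linalg.solve(P, eta1)
print("posterior mean of theta:", np.round(m, 4))
```

At each iteration the randomly chosen mini-batch is re-weighted by N/|B| to give an unbiased estimate of the optimal natural parameters, and that estimate is blended into the current ones with a decaying Robbins-Monro step size, so the per-iteration cost depends on the batch size rather than on the full data length.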
Table 1 Parameter identification with different numbers of sub-sampled data points

| | $\langle \theta_0 \rangle$ | $\langle \theta_1 \rangle$ | $\langle \theta_2 \rangle$ | $\langle \theta_3 \rangle$ | $\langle \theta_4 \rangle$ | $\langle \lambda_0 \rangle$ | $\langle \lambda_1 \rangle$ | $\langle \lambda_2 \rangle$ | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| True value | 1 | −0.5000 | 0.2500 | −0.1250 | 0.0625 | 0 | 1 | 1 | — |
| 1 point sampled | 1±0 | −0.5463±0.3604 | 0.2507±0.2471 | −0.2446±0.2655 | 0.0358±0.2882 | 0.5434±0.4180 | 0.6625±0.2907 | 0.3803±0.2185 | 0.6005 |
| 5% sampled | 1±0 | −0.5060±0.0330 | 0.2693±0.0497 | −0.1252±0.0323 | 0.0633±0.0323 | 0.0908±0.2707 | 0.9871±0.1480 | 0.9103±0.1246 | 3.1829 |
| 10% sampled | 1±0 | −0.5055±0.0248 | 0.2571±0.0257 | −0.1341±0.0255 | 0.0594±0.0256 | 0.0631±0.0504 | 0.9684±0.0498 | 0.9499±0.0459 | 7.7402 |
| 20% sampled | 1±0 | −0.5077±0.0204 | 0.2544±0.0202 | −0.1287±0.0289 | 0.0659±0.0291 | 0.0575±0.0540 | 0.9813±0.0518 | 0.9574±0.0451 | 11.4620 |
| All points sampled | 1±0 | −0.5078±0.0278 | 0.2541±0.0283 | −0.1299±0.0271 | 0.0685±0.0246 | 0.0777±0.0726 | 0.9439±0.1183 | 0.9252±0.1326 | 9.0772 |
Table 2 Parameter identification with different proportions of outliers

| | $\langle \theta_0 \rangle$ | $\langle \theta_1 \rangle$ | $\langle \theta_2 \rangle$ | $\langle \theta_3 \rangle$ | $\langle \theta_4 \rangle$ | $\langle \theta_5 \rangle$ | Time (s) |
|---|---|---|---|---|---|---|---|
| True value | 1 | −0.5000 | 0.2500 | −0.1250 | 0.0625 | −0.03125 | — |
| No outliers | 1±0 | −0.4989±0.0292 | 0.2495±0.0293 | −0.1254±0.0223 | 0.0611±0.0257 | −0.0338±0.0262 | 2.9369 |
| 2% outliers | 1±0 | −0.5097±0.0389 | 0.2672±0.0497 | −0.1305±0.0426 | 0.0652±0.0452 | −0.0291±0.0494 | 2.9480 |
| 5% outliers | 1±0 | −0.5060±0.0330 | 0.2693±0.0497 | −0.1252±0.0323 | 0.0633±0.0323 | −0.0314±0.0523 | 3.1829 |
| 10% outliers | 1±0 | −0.5349±0.0325 | 0.2627±0.0323 | −0.1314±0.0330 | 0.0685±0.0389 | −0.0377±0.0355 | 2.9057 |
Table 3 Performance comparison of different identification methods

| | Method | $b_0$ | $a_1$ | $\langle \lambda_0 \rangle\,(\lambda_0)$ | $\langle \lambda_1 \rangle\,(\lambda_1)$ | $\langle \lambda_2 \rangle\,(\lambda_2)$ | MSE | Time (s) |
|---|---|---|---|---|---|---|---|---|
| True value | — | 1 | 0.5 | 0 | 1 | 1 | — | — |
| No outliers | SVBI | — | — | 0.0648±0.0620 | 0.9633±0.0509 | 0.9766±0.0626 | 0.9136 | 2.9369 |
| | VBEM | — | — | 0.0503±0.0346 | 0.9411±0.0393 | 0.9655±0.0459 | 0.8978 | 9.7046 |
| | MLE | 1±0 | 0.5102±0.0136 | 0.1054±0.0405 | 1.0154±0.0464 | 0.9490±0.0411 | 0.9130 | 9.0350 |
| | PEM | 1±0 | 0.4948±0.0172 | 0.0828±0.0524 | 0.9905±0.0373 | 1.0072±0.0449 | 0.9132 | 0.6474 |
| 5% outliers | SVBI | — | — | 0.0575±0.0540 | 0.9813±0.0520 | 0.9573±0.0450 | 5.4540 | 2.9352 |
| | VBEM | — | — | 0.0503±0.0411 | 0.9770±0.0532 | 0.9748±0.0518 | 3.8695 | 9.7709 |
| | MLE | 1±0 | 0.4150±0.0711 | −0.9407±0.1253 | 1.0019±0.1839 | 1.3715±0.1895 | 3.9574 | 9.6693 |
| | PEM | 1±0 | 0.4999±0.0549 | 0.1072±0.1871 | 0.9646±0.1926 | 0.9878±0.1558 | 3.8374 | 0.6580 |
| 10% outliers | SVBI | — | — | 0.1439±0.1065 | 0.9163±0.0924 | 0.8416±0.0924 | 7.5364 | 2.9057 |
| | VBEM | — | — | 0.0556±0.0468 | 0.9711±0.0538 | 0.9568±0.0553 | 5.5110 | 9.9245 |
| | MLE | — | — | — | — | — | — | — |
| | PEM | 1±0 | 0.4723±0.2004 | 0.1458±0.5211 | 0.9746±0.3091 | 1.0030±0.3253 | 5.4992 | 0.6620 |
Table 4 Identification results of part of the parameters of the process in Eq. (52)

| Parameter | $\theta_0$ | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_4$ | $\theta_5$ | $\theta_6$ | $\theta_7$ | $\theta_8$ | $\theta_9$ | $c_0$ | $c_1$ | $c_2$ | $Q$ | $R$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Estimate | −0.0390 | 0.0648 | −0.0547 | 0.0856 | −0.0462 | 0.2613 | 0.0501 | 0.2041 | 0.3396 | 0.4154 | −0.0188 | 0.1035 | −0.0030 | 0.0034 | 0.0014 |
Table 5 Performance comparison of different methods

| Number of data points | Method | MSE (V) | Number of parameters | Time (s) |
|---|---|---|---|---|
| 2000 | SVBI | 0.05695 | 25 | 256.12 |
| 2000 | VBEM | 0.06283 | 25 | 1211.27 |
| 2000 | SVBI | 0.03407 | 40 | 264.27 |
| 2000 | VBEM | 0.03425 | 40 | 1214.55 |
| 10000 | SVBI | 0.06179 | 25 | 1299.99 |
| 10000 | VBEM | 0.09334 | 25 | 6347.28 |
| 10000 | SVBI | 0.03385 | 40 | 1332.31 |
| 10000 | VBEM | 0.03404 | 40 | 6442.98 |