Adaptive Optimal Control for a Class of Nonlinear Systems With Dead Zone Input and Prescribed Performance
Abstract: This paper develops an adaptive fuzzy optimal control method for a class of strict-feedback nonlinear systems with dead zone input and prescribed performance. Fuzzy logic systems are used to approximate the unknown nonlinear functions and the cost function. A feedforward controller is designed by combining the backstepping method with the command filter technique. An optimal feedback controller is then derived via adaptive dynamic programming to stabilize the tracking error dynamics, which are in affine form. The prescribed performance control method keeps the tracking error within a prescribed region, and the dead zone input is handled by using the slope information of the dead zone. Using Lyapunov stability theory, all signals in the closed-loop system are proved to be uniformly ultimately bounded. Finally, simulation results verify the feasibility and effectiveness of the proposed method.
Key words:
- Adaptive fuzzy optimal control /
- adaptive dynamic programming /
- backstepping method /
- dead zone input /
- prescribed performance
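The control of uncertain nonlinear systems has long been an active and difficult topic in the control community [1-8]. Adaptive control design methods based on fuzzy logic systems and neural networks removed the earlier requirement that the nonlinear functions of the system satisfy certain restrictive conditions or be linearly parameterized [1], and solved the controller design problem for uncertain nonlinear systems that do not satisfy the matching condition. These methods have been widely applied to single-input single-output systems in pure-feedback and strict-feedback form [2−3], multi-input multi-output systems [4−5], and multi-agent systems [6]. Notably, backstepping-based control design requires repeated differentiation of the previously designed virtual controllers at every step, which causes the "explosion of complexity" problem. Dynamic surface control avoids this problem by introducing a first-order low-pass filter at each step of the backstepping design [9]. Building on this, [10] proposed a command filter method that uses an error compensation mechanism to remove the effect of the filtering error of dynamic surface control on system performance. These works provide a simple and structured approach to the control of uncertain nonlinear systems, but none of them considers the optimal control problem.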
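Optimal control is a class of control strategies that takes both control performance and energy saving into account [11]. Classical dynamic programming (DP) solves the optimal control problem by backward recursion over time stages [12], but its backward-in-time solution procedure often leads to the "curse of dimensionality" [13]. Adaptive dynamic programming (ADP), an approximate solution method for DP, overcomes this shortcoming and offers a new way to solve optimal control problems for complex nonlinear systems [14]. Murray et al. [15] first proposed an iterative ADP algorithm for continuous-time systems and proved its feasibility mathematically. Vamvoudakis et al. [16] proposed an online ADP method based on policy iteration, which overcomes the inability of iterative ADP algorithms to adapt to system changes. These results are milestones in the development of ADP theory. To guarantee stability during operation, the methods in [15−16] require an initial stabilizing control policy. To address this, Zargarzadeh et al. [17] proposed an online ADP algorithm based on a single-network critic technique and, with a new parameter training method, removed the need for an initial stabilizing control policy. In recent years ADP has received wide attention from researchers in China [18-22] and has become an important optimal control method.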
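A number of results have been obtained on the optimal control of nonlinear systems using ADP, but few studies address nonlinear systems subject to both input dead zone and prescribed performance constraints. In fact, the dead zone, a non-smooth nonlinearity, appears frequently in practical engineering systems such as mechanical connections, hydraulic brakes and sensors; it severely degrades system performance and can even cause instability [23]. To deal with this, [24−25] use the slopes of the dead zone to solve the input dead zone problem. On the other hand, in engineering practice the controller is expected not only to guarantee stability but also to make the tracking error converge under given conditions. By specifying a tracking performance function in advance, [26] proposed a prescribed performance method that keeps the tracking error within a bounded region formed by two prescribed performance functions, which eases the difficulty of tuning the controller design parameters.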
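Motivated by the above discussion, this paper proposes an adaptive fuzzy optimal control method for a class of strict-feedback nonlinear systems with prescribed performance and input dead zone constraints. The main contributions are as follows: 1) A feedforward controller is designed by combining the command filter technique with the backstepping method. Compared with the methods in [19, 27], the command filter technique used here not only overcomes the "explosion of complexity" problem but also compensates for the filter error, yielding better control performance. 2) A new ADP structure is designed to optimize the error system, and a single-network online approximator is used to obtain the approximate optimal controller. 3) The optimal control problem is solved for a class of nonlinear strict-feedback systems with input dead zone and prescribed performance constraints. Finally, a simulation example verifies the effectiveness of the proposed control method.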
1. Problem Formulation
Consider a class of strict-feedback systems:
$ \begin{aligned} &\dot x_i = x_{i + 1} + f_i (\bar x_i ), \quad 1 \le i \le n - 1\\ &\dot x_n = u + f_n (\bar x_n )\\ & y = x_1 \end{aligned} $
(1)
where $ \bar x_i = [x_1 , x_2 , \cdots , x_i ]^{{\rm T}} \in {\bf R}^i $ and $ y \in {\bf R} $ denote the state vector and the output of the system, respectively, $ {f_i}({\bar x_i}) $ is an unknown smooth nonlinear function, and $ u = D(v)\in {\bf R} $ describes the effect of the input dead zone on the actuator and is given by
$ u = D(v) = \left\{ {\begin{aligned} &{{M_r}(v - {a_r}), }{v \ge {a_r}}\\ &{0, }\quad\quad\quad\quad\;{ - {a_l} < v < {a_r}}\\ &{{M_l}(v + {a_l}), }\;{v \le - {a_l}} \end{aligned}} \right. $
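(2)
where $ v \in {\bf R} $ is the input signal of the dead zone, $ M_r $ and $ M_l $ are the slopes of the dead zone, $ a_l $ and $ a_r $ are the breakpoints, and $ M_r $, $ M_l $, $ a_r $, $ a_l $ are all positive constants.
Assumption 1 [24]. There exists a positive constant $ \varpi $ such that $ \left| v \right| \le \varpi $.
Assumption 2 [25]. The given reference signal $ x_{1d} $ and its first derivative $ \dot x_{1d} $ are smooth, known and bounded.
The dead zone input (2) can be rewritten as
$ u = K(t)v(t) + d(t) $
where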
$ \begin{array}{l} K(t) = \left\{ {\begin{aligned} &{{M_r}, {\mkern 1mu} v > 0}\\ &{{M_l}, {\mkern 1mu} v \le 0} \end{aligned}} \right.\\ d(t) = \left\{ {\begin{aligned} &{ - {M_r}{a_r}, \;}\quad{v \ge {a_r}}\\ &{ - K(t)v(t), }{ - {a_l} < v < {a_r}}\\ &{{M_l}{a_l}, }\quad\quad\;{v \le - {a_l}} \end{aligned}} \right. \end{array} $
Moreover, $ \left| d(t) \right| \le \bar d $ with $ \bar d = \max \{ M_l a_l , M_r a_r \} $. Defining $ m_0 = \min \{ M_l , M_r \} $ and $ m_1 = \max \{ M_l , M_r \} $, we obtain
$ \frac{{K(t)}}{{m_0 }} = 1 + \rho (t) $
where $ \rho (t) $ is a piecewise and bounded function satisfying $ 0 \le \rho (t) \le \frac{{m_1 }}{{m_0 }} - 1 $.
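The decomposition above can be checked numerically. The following is a minimal sketch, not the authors' code: the function names `dead_zone` and `decompose` are hypothetical, and the parameter values are the dead zone parameters quoted in the simulation section ($ M_r = 3 $, $ M_l = 1 $, $ a_r = 1.5 $, $ a_l = 3 $).

```python
def dead_zone(v, m_r, m_l, a_r, a_l):
    """Dead zone output u = D(v) as in Eq. (2)."""
    if v >= a_r:
        return m_r * (v - a_r)
    if v <= -a_l:
        return m_l * (v + a_l)
    return 0.0


def decompose(v, m_r, m_l, a_r, a_l):
    """Return (K, d, rho) with u = K*v + d and K/m0 = 1 + rho."""
    k = m_r if v > 0 else m_l          # slope selected by the sign of v
    if v >= a_r:
        d = -m_r * a_r
    elif v <= -a_l:
        d = m_l * a_l
    else:
        d = -k * v                     # inside the dead band the output is zero
    m0 = min(m_r, m_l)
    rho = k / m0 - 1.0
    return k, d, rho


if __name__ == "__main__":
    params = dict(m_r=3.0, m_l=1.0, a_r=1.5, a_l=3.0)
    for v in (-5.0, -1.0, 0.0, 1.0, 4.0):
        k, d, rho = decompose(v, **params)
        # Columns: v, D(v), K*v + d, rho
        print(v, dead_zone(v, **params), k * v + d, rho)
```

For every input the second and third printed columns should agree, which is the content of $ u = K(t)v(t) + d(t) $.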
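From the above discussion,
$ u = m_0(1+\rho(t))v(t)+d(t) $
(3)
According to [26], the transient performance of the system is said to meet the prescribed performance requirement if the error $ \tilde z_1 $ (defined in (6)) satisfies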
$ - \delta _{ \min } \mu (t) < \tilde z_1 (t) < \delta _{\max } \mu (t)\, , \;\forall t \ge 0 $
(4)
where $ {\delta _{\min }}, {\delta _{\max }}>0 $ are adjustable parameters and the prescribed performance function is chosen as $ \mu (t) = (\mu _{0} - \mu _{\infty } ){\rm{e}}^{ - n t} + \mu _{\infty } $, which is strictly decreasing, with decay rate $ n>0 $, $ \mu _{0} = \mu (0) $ and $ \mu _{\infty } = \mathop {\lim }\nolimits_{t \to \infty } \mu (t) $, so that $ \mu _{0} > \mu _{\infty }>0 $. The initial error is required to satisfy $ - \delta _{\min } \mu (0) <\tilde z_1 (0) < \delta _{\max } \mu (0) $. The inequality above is equivalent to the equality
$ \tilde z_1 (t) = \mu (t)S(\chi_1 (t)), \forall t \ge 0 $
where $ S(\chi _1 (t)) = \dfrac{{\delta _{\max } {\rm{e}}^{\chi _1 } - \delta _{\min } {\rm{e}}^{ - \chi _1 } }}{{{\rm{e}}^{\chi _1 } + {\rm{e}}^{ - \chi _1 } }} $ is a strictly increasing smooth function, so that
$ \begin{array}{l} \chi _1 (t) = S^{ - 1} \left(\dfrac{{\tilde z_1 (t)}}{{\mu (t)}}\right) = \dfrac{1}{2}\ln \frac{{S + \delta _{\min } }}{{\delta _{\max } - S}}\\ \dot \chi _1 (t) = p\left(\dot {\tilde z}_1 (t) - \dfrac{{\dot \mu (t)\tilde z_1 (t)}}{{\mu (t)}}\right) \end{array} $
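where $ p = \frac{1}{{2\mu }}(\frac{1}{{S + \delta _{\min } }} - \frac{1}{{S - \delta _{\max } }}) $. For the feedforward controller design, the following error transformation is used:
$ z_1 (t) = \chi _1 (t) - \frac{1}{2}\ln \frac{{\delta _{\min } }}{{\delta _{\max } }} $
whose derivative is
$ \dot z_1 (t) = p(\dot {\tilde z}_1 (t) - \frac{{\dot \mu (t)\tilde z_1 (t)}}{{\mu (t)}}) $
(5)
Control objective: for a class of nonlinear strict-feedback systems with prescribed performance and input dead zone constraints, design an adaptive fuzzy optimal controller such that all signals in the closed-loop system are uniformly ultimately bounded, the error signals converge to a neighborhood of zero, the prescribed performance requirement is met, and the cost function is minimized.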
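The error transformation of this section can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and the default values ($ \mu_0 = 2.55 $, $ \mu_\infty = 0.05 $, decay rate $ 0.5 $, $ \delta_{\min} = 0.6 $, $ \delta_{\max} = 0.8 $) are taken from the performance function quoted in the simulation section.

```python
import math


def mu(t, mu0=2.55, mu_inf=0.05, rate=0.5):
    """Prescribed performance function mu(t) = (mu0 - mu_inf) e^{-rate t} + mu_inf."""
    return (mu0 - mu_inf) * math.exp(-rate * t) + mu_inf


def S(chi, d_min, d_max):
    """Strictly increasing map from the real line onto (-d_min, d_max)."""
    e_p, e_m = math.exp(chi), math.exp(-chi)
    return (d_max * e_p - d_min * e_m) / (e_p + e_m)


def chi_from_error(z_tilde, t, d_min, d_max):
    """Inverse map chi_1 = S^{-1}(z_tilde / mu(t))."""
    s = z_tilde / mu(t)
    if not -d_min < s < d_max:
        raise ValueError("tracking error is outside the prescribed bounds")
    return 0.5 * math.log((s + d_min) / (d_max - s))


def z1(z_tilde, t, d_min, d_max):
    """Transformed error z_1 = chi_1 - (1/2) ln(d_min / d_max)."""
    return chi_from_error(z_tilde, t, d_min, d_max) - 0.5 * math.log(d_min / d_max)


if __name__ == "__main__":
    # z1 vanishes when the tracking error is zero and grows without bound
    # as the error approaches either edge of the prescribed region.
    for err in (0.0, 1.0, 2.0):
        print(err, z1(err, 0.0, 0.6, 0.8))
```

The shift by $ \frac{1}{2}\ln(\delta_{\min}/\delta_{\max}) $ is what makes $ z_1 = 0 $ correspond to $ \tilde z_1 = 0 $.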
Lemma 1 [19]. For any given accuracy $ {{\mathit{\boldsymbol{\varsigma}}}} > 0 $, there exists a fuzzy logic system $ {{\mathit{\boldsymbol{w}}}} ^{{\rm T}}\phi ({\cal Z}) $ that approximates any continuous nonlinear function $ {\mathit{\boldsymbol{F}}}({\cal Z}) $ defined on a compact set $ {{{{\mathit{\boldsymbol{\Omega}}}}_{\cal Z} }} \subset {{\bf R}^q} $, such that $ {\mathit{\boldsymbol{F}}}({\cal Z}) = {\mathit{\boldsymbol{w}}} ^{{\rm T}} \phi ({\cal Z}) + \vartheta ({\cal Z}) $ with $ \left\vert \vartheta {({\cal Z})}\right\vert \leq {{\mathit{\boldsymbol{\varsigma}}}} $, where $ \mathit{\boldsymbol{w}} $ is the ideal weight vector defined as
$ {{\mathit{\boldsymbol{w}}}} = \arg \, \mathop {\min }\limits_{{\mathit{\boldsymbol{w}}} \in {\bf R}^N } \, \left\{ \mathop {\sup }\limits_{{\cal Z} \in {{{\Omega}}_{\cal Z} } } \left| {{\mathit{\boldsymbol{F}}}({\cal Z}) - {{\mathit{\boldsymbol{w}}}} ^{{\rm T}}\phi ({\cal Z})} \right| \right\} $
Lemma 2 [19] (Young's inequality). For any $ \mathit{\boldsymbol{x}}, \mathit{\boldsymbol{y}}\in {\bf R}^n $, the following inequality holds:
$ \mathit{\boldsymbol{x}}^{{\rm T}} \mathit{\boldsymbol{y}} \le \frac{{a^b }}{b}\left\| \mathit{\boldsymbol{x}} \right\|^b + \frac{1}{{qa^q }}\left\| \mathit{\boldsymbol{y}} \right\|^q $
where $ a>0 $, $ b>1 $, $ q>1 $ and $ (b-1)(q-1) = 1 $.
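As a worked instance of Lemma 2 (the choice $ a = 1 $, $ b = q = 2 $ is an illustrative assumption that satisfies $ (b-1)(q-1) = 1 $), the inequality reduces to the familiar quadratic bound
$ \mathit{\boldsymbol{x}}^{{\rm T}} \mathit{\boldsymbol{y}} \le \frac{1}{2}\left\| \mathit{\boldsymbol{x}} \right\|^2 + \frac{1}{2}\left\| \mathit{\boldsymbol{y}} \right\|^2 $
For two scalars $ \alpha $ and $ \eta $ this reads $ \alpha \eta \le \frac{\alpha^2}{2} + \frac{\eta^2}{2} $, which is equivalent to $ (\alpha - \eta)^2 \ge 0 $.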
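2. Controller Design
In this section, a feedforward controller $ U^{a} $ is first designed by combining the backstepping method with the command filter technique. An optimal feedback controller $ U^* $ is then designed using adaptive dynamic programming. The overall control input is $ U_w = U^a + U^* $.
2.1 Feedforward Controller Design
First, introduce the coordinate transformation
$ \tilde z_1 = x_1 - x_{1d} , \, \, z_i = x_i - \lambda _i , \, \, 2 \le i \le n $
(6)
where $ x_{1d} $ is the reference signal and $ \lambda _i $ is the output of a first-order command filter driven by the virtual control input $ x_{id} $. Here $ x_{id} = x_{id}^a + x_{id}^ * $, where $ x_{id}^a $ is the feedforward virtual control input and $ x_{id}^ * $ is the optimal feedback virtual control input. In the last step, $ v = v^a+v^* $, where $ v^a $ is the actual feedforward control input and $ v^* $ is the actual optimal feedback control input. The first-order command filter is
$ \tau _i {\dot \lambda} _i + \lambda _i = x_{id}{ , }\; \; \lambda _i (0) = x_{id} (0) $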
(7)
where $ \tau _i $ is the time constant. To remove the effect of the filter error $ \lambda_i-x_{id} $, the error compensation signals $ \zeta _i $, $ i = 1, \cdots , n $, are designed as
$ \dot \zeta _1 = - c_1\zeta _1 + p(\lambda _{2} - x_{2d} + \zeta _{2} ) $
(8)
$ \dot \zeta_i = -c_i \zeta_i+(\lambda_{i+1}-x_{(i+1)d}+\zeta_{i+1}), \quad 2 \le i \le n-1 $
(9)
$ \dot \zeta _n = - c_n \zeta _n $
(10)
where $ c_i > 0 $ are design parameters and $ \zeta_i(0) = 0 $.
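To make the role of the time constant $ \tau_i $ concrete, the first-order command filter (7) can be integrated numerically. This is a minimal sketch only: the function name `simulate_command_filter`, the forward-Euler scheme, the step size and the test input $ \sin t $ are all assumptions made for illustration, and the compensation signals (8)-(10) are not simulated here.

```python
import math


def simulate_command_filter(x_d, tau, t_end=5.0, dt=1e-3):
    """Integrate tau * lam_dot + lam = x_d(t), lam(0) = x_d(0), by forward Euler.

    Returns the final filter output and the largest filter error
    |lam - x_d| observed on the time grid.
    """
    lam = x_d(0.0)
    worst = 0.0
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        lam += dt * (x_d(t) - lam) / tau       # lam_dot = (x_d - lam) / tau
        worst = max(worst, abs(lam - x_d(t + dt)))
    return lam, worst


if __name__ == "__main__":
    for tau in (0.5, 0.05):
        _, err = simulate_command_filter(math.sin, tau)
        print("tau =", tau, "max filter error =", err)
```

A smaller $ \tau_i $ gives a smaller filter error $ \lambda_i - x_{id} $; the compensation signals $ \zeta_i $ are introduced to account for the error that remains.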
Define the compensated tracking errors as
$ {\bar z}_i = z_i-\zeta_i, i = 1, \cdots , n $
(11)
Combining (5)-(11) and differentiating $ {\bar z}_i $ gives
$ \begin{array}{l} \dot {{\bar z}}_1 = p \left({\bar z}_{2} + x_{2d}^a + x_{2d}^ *+ h_1 ( {\mathit{\boldsymbol{Z}}}_1 )+ f_1 ( x_{1d} ) - \right. \\ \;\;\;\;\;\;\;\;\; \left. \dot x_{1d}- \frac{{\dot \mu \tilde z_1 }}{{\mu }}\right)+c_1\zeta_1 \end{array} $
(12) $\begin{array}{l} \dot {{\bar z}}_i = {\bar z}_{i + 1} + x_{(i + 1)d}^a + x_{(i + 1)d}^* + h_i ({\mathit{\boldsymbol{Z}}}_i) + \\ \;\;\;\;\;\; f_i (\bar x_{id} )- \dot \lambda _i + c_i \zeta _i, \; \; i = 2, \cdots, n-1 \end{array} $
(13) $ \begin{array}{l}\dot {{\bar z}}_n = m_0(v^a+v^*)+m_0\rho(t)v+d(t) + \\ \;\;\;\;\; h_n ({\mathit{\boldsymbol{Z}}}_n )+ f_n (\bar x_{nd} )- \dot \lambda _n+c_n \zeta_n \end{array} $
(14)
where $ \dot \lambda_1 = \dot x_{1d} $, $ \dot \lambda _i = - \frac{\lambda_{i}-x_{id}}{{\tau _i }} $, $ h_i ({\mathit{\boldsymbol{Z}}}_i ) = f_i (\bar x_i ) - f_i (\bar x_{id} ) $, $ {\mathit{\boldsymbol{Z}}}_i = [\bar x_{i}; \bar x_{id}] $, $ 1 \le i \le n $. Since $ f_i (\bar x_i ) $ and $ f_i (\bar x_{id} ) $ are unknown functions, they cannot be used directly in the controller design at each step. By Lemma 1, $ f_i (\bar x_{id}) $ is approximated by a fuzzy logic system:
$ f_i (\bar x_{id} ) = w_{i}^{{\rm T}} \varphi _i (\bar x_{id} ) + \vartheta _i (\bar x_{id} ) $
By Young's inequality, $ p{\bar z}_1 \vartheta _1 (x_{1d} ) \le\dfrac{{p^2 {\bar z}_1^2 }}{2} + \dfrac{{\varsigma _1^{'2} }}{2} $ and $ {{\bar z}_i \vartheta _i (\bar x_{id})} \le \dfrac{{{\bar z}_i ^2 }}{2} + \dfrac{{\varsigma _i^{'2} }}{2} $, $ 2 \le i \le n $, where $ \varsigma _i ^{'} $ is the upper bound of $ \left| \vartheta _i (\bar x_{id} ) \right| $. The following notation is used: $ \hat {{\mathit{\boldsymbol{w}}}}_i $ is the estimate of $ {\mathit{\boldsymbol{w}}}_i $, $ \tilde {{\mathit{\boldsymbol{w}}}}_i = {{\mathit{\boldsymbol{w}}}}_i - \hat {{\mathit{\boldsymbol{w}}}}_i $ is the parameter error, so that $ \dot {\tilde {{\mathit{\boldsymbol{w}}}}}_i = -\dot {\hat {{\mathit{\boldsymbol{w}}}}}_i $, and $ {\gamma _i }>0 $, $ c_i > 0 $, $ \sigma _i > 0 $ are design parameters, $ 1 \le i \le n $.
Step 1: Consider the Lyapunov function
$ V_1 = \frac{1}{2}{\bar z}_1^2 + \frac{1}{{2\gamma _1 }}\tilde {{\mathit{\boldsymbol{w}}}}_1^{{\rm T}} \tilde {{\mathit{\boldsymbol{w}}}}_1 $
The feedforward virtual controller $ x_{2d}^a $ and the adaptive law $ \dot {\hat {{\mathit{\boldsymbol{w}}}}}_1 $ are designed as
$ x_{2d}^a = - \frac{{c_1z_1 }}{{p }} -\frac{{p{\bar z}_1 }}{2}- \hat {{\mathit{\boldsymbol{w}}}}_1^{{\rm T}} \varphi _1 (x_{1d} ) + \dot x_{1d} + \frac{{\dot \mu \tilde z_1 }}{{\mu }} $
(15)
$ \dot {\hat {{\mathit{\boldsymbol{w}}}}}_1 = \gamma _1 p{\bar z}_1 \varphi _1 (x_{1d} ) - \sigma _1 \hat {{\mathit{\boldsymbol{w}}}}_1 \left\| {\hat {{\mathit{\boldsymbol{w}}}}_1 } \right\|^2 $
(16)
Using (15) and (16), the derivative of $ V_1 $ satisfies
$ \begin{array}{l} \dot V_1 \le - c_1{\bar z}_1^2 + p{\bar z}_1{\bar z}_2 + p{\bar z}_1 x_{2d}^ *+ p {\bar z}_1 h_1 ({\mathit{\boldsymbol{Z}}}_1 )+\\ \;\;\;\;\; \frac{{\varsigma _1^{'2} }}{2}+ \frac{{\sigma _1 }}{{\gamma _1 }}\tilde {{\mathit{\boldsymbol{w}}}}_1^ {\rm T }\hat {{\mathit{\boldsymbol{w}}}}_1 \left\| {\hat {{\mathit{\boldsymbol{w}}}}_1 } \right\|^2 \end{array} $
Step $ i $ $ (2\leq i\leq n-1) $: Consider the Lyapunov function
$ V_i = V_{i - 1} + \frac{1}{2}{\bar z}_i^2 + \frac{1}{{2\gamma _i }}\tilde {{\mathit{\boldsymbol{w}}}}_i^{{\rm T}} \tilde {{\mathit{\boldsymbol{w}}}}_i $
The feedforward virtual controller $ x_{(i+1)d}^a $ and the adaptive law $ \dot {\hat {{\mathit{\boldsymbol{w}}}}}_i $ are designed as
$ x_{3d}^a = - c_2 z_2 -\frac{{{\bar z}_2 }}{2} - p{\bar z}_{1} - \hat {{\mathit{\boldsymbol{w}}}}_2^{{\rm T}} \varphi _2 (\bar x_{2d})+\dot \lambda _2 $
(17)
$ \begin{array}{l} x_{(j+1)d}^a = - c_jz_j - {\bar z}_{j-1} -\frac{{{\bar z}_j }}{2} - \hat {{\mathit{\boldsymbol{w}}}}_j^{{\rm T}} \varphi _j (\bar x_{jd})+\\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \dot \lambda _j, \; \; j = 3, \cdots, n-1 \end{array} $
(18)
$ \dot {\hat {{\mathit{\boldsymbol{w}}}}}_i = \gamma _i {\bar z}_i \varphi _i (\bar x_{id} ) - \sigma _i \hat {{\mathit{\boldsymbol{w}}}}_i \left\| {\hat {\mathit{\boldsymbol{w}}}_i } \right\|^2 $
(19)
Using (17)-(19), the derivative of $ V_i $ satisfies
$ \begin{array}{l} \dot V_i \le \sum\limits_{j = 1}^i \left( { - c_{j}{\bar z}_{j}^2 + \frac{{\sigma _j }}{{\gamma _j }}\tilde {{\mathit{\boldsymbol{w}}}}_j^{{\rm T}} \hat {{\mathit{\boldsymbol{w}}}}_j \left\| {\hat {{\mathit{\boldsymbol{w}}}}_j } \right\|^2 + \frac{\varsigma_j ^{'2}}{2} } \right)+\\ \;\;\;\;\;\;\;{\bar z}_i {\bar z}_{i + 1}+p{\bar z}_1 (h_1 ({\mathit{\boldsymbol{Z}}}_1) + x_{2d}^* ) + \\ \;\;\;\;\;\;\;\sum\limits_{j = 2}^i {\bar z}_j \left(x_{(j + 1)d}^*+ h_j ({\mathit{\boldsymbol{Z}}}_j ) \right) \end{array} $
Step $ n $: Consider the Lyapunov function
$ V_n = V_{n - 1} + \frac{1}{2}{\bar z}_n^2 + \frac{1}{{2\gamma _n }}\tilde {{\mathit{\boldsymbol{w}}}}_n^ {\rm T} \tilde {{\mathit{\boldsymbol{w}}}}_n $
The feedforward controller $ v^a $ and the adaptive law $ \dot {\hat {{\mathit{\boldsymbol{w}}}}}_n $ are designed as
$ v^a = \frac{1}{{m_0 }}( - c_n z_n -{\bar z}_{n - 1} -\frac{{{\bar z}_n }}{2} - \hat {{\mathit{\boldsymbol{w}}}}_n^{{\rm T}} \varphi _n (\bar x_{nd} )+ \dot \lambda _n ) $
(20)
$ \dot{\hat {{\mathit{\boldsymbol{w}}}}}_n = \gamma _n{\bar z}_n \varphi _n (\bar x_{nd} ) - \sigma _n \hat {{\mathit{\boldsymbol{w}}}}_n \left\| {\hat {{\mathit{\boldsymbol{w}}}}_n } \right\|^2 $
(21)
Using (20) and (21), the derivative of $ V_n $ satisfies
$ \begin{array}{l} \dot V_n \le \sum\limits_{i = 1}^n \left( { -c_{i} {\bar z}_{i}^2 + \frac{{\sigma _i }}{{\gamma _i }}\tilde {{\mathit{\boldsymbol{w}}}}_i^{{\rm T}} \hat {{\mathit{\boldsymbol{w}}}}_i \left\| {\hat {{\mathit{\boldsymbol{w}}}}_i } \right\|^2 +\frac{\varsigma _i ^{'2}}{2} } \right)+ \\ \;\;\;\;\;\;\;p{\bar z}_1(h_1 ({\mathit{\boldsymbol{Z}}}_1 )+ x_{2d}^* ) +\sum\limits_{j = 2}^{n-1} {\bar z}_j (x_{(j + 1)d}^* + h_j ({\mathit{\boldsymbol{Z}}}_j ))+ \\ \;\;\;\;\;\;\;{\bar z}_n (m_0 v^*+h_n({\mathit{\boldsymbol{Z}}}_n)+ m_0\rho(t)v+d(t)) \end{array} $
(22)
By Young's inequality, the adaptive-weight terms and the dead zone term $ {\bar z}_n (m_0\rho(t)v+d(t)) $ on the right-hand side of (22) are bounded as follows:
$ \sum\limits_{i = 1}^n {\tilde {{\mathit{\boldsymbol{w}}}}_i^{{\rm T}}\hat{{\mathit{\boldsymbol{w}}}}_i \left\| {\hat{{\mathit{\boldsymbol{w}}}}_i } \right\|^2 } \le - \frac{1}{{10}}\left\| {\tilde {{\mathit{\boldsymbol{w}}}}} \right\|^4 + \frac{1}{{2}}\left\| {\mathit{\boldsymbol{w}}} \right\|^4 $
(23)
$ {\bar z}_n (m_0\rho(t)v+d(t)) \le {\bar z}_n ^2 + \frac{(m_1-m_0)^2\varpi^2}{2}+\frac{1}{2}{\bar d} ^{2} $
(24)
Substituting (23) and (24) into (22) gives
$ \begin{array}{l} \dot V_n \le - \kappa _1 \left\| Z \right\|^2 - \kappa _2 \left\| {\tilde {{\mathit{\boldsymbol{w}}}}} \right\|^4 + D +\\ \;\;\;\;\;\;\; Z^ {\rm T}P(\bar h({\mathit{\boldsymbol{Z}}}) + GU^ *) \end{array} $
(25)
where $ \bar h({\mathit{\boldsymbol{Z}}}) = \left[ {h_1 ({\mathit{\boldsymbol{Z}}}_1 ), \cdots , h_n ({\mathit{\boldsymbol{Z}}}_n )} \right]^{{\rm T}} $, $ G = {\rm diag}\left\{ 1, \cdots, 1, m_0 \right\} $, $ Z = [{\bar z}_1 , \cdots , {\bar z}_n ]^{{\rm T}} $, $ U^ * = [x_{2d}^* , \cdots , x_{nd}^* , v^* ]^ {\rm T} $, $ P ={\rm diag}\{p , 1, \cdots , 1 \} $, $ D = \dfrac{(m_1-m_0)^2\varpi^2}{2}+ \dfrac{1}{2}{\bar d} ^{2}+ \dfrac{1}{2}\sum_{i = 1}^ n {\varsigma _i ^{'2} } + \dfrac{\sigma_{\max} }{{2\gamma_{\min} }}\left\| w \right\|^4 $, $ \sigma_{\max} = \max\{\sigma_i\} $, $ \gamma_{\min} = \min\{\gamma_i\} $, $ \kappa _1 = \min \left\{ c_i-1 \right\} $, $ \kappa _2 = \min \{ \frac{{\sigma _i }}{{10\gamma _i }} \mid 1 \le i \le n \} $.
As stated above, the control input $ U_w = [x_{2d} , \ldots , x_{nd} , v]^{{\rm T}} $ of system (1) consists of two parts, $ U^a $ and $ U^* $. The feedforward controller $ U^a = [x_{2d}^a , \cdots , x_{nd}^a , v^a ]^{{\rm T}} $ is given by (15), (17), (18) and (20). As (25) shows, $ U^a $ alone cannot guarantee the stability of the whole closed-loop system. An optimal feedback controller $ U^* = [x_{2d}^* , \ldots , x_{nd}^* , v^*]^{{\rm T}} $ is therefore needed so that $ U_w $ stabilizes the controlled system (1).
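2.2 Optimal Feedback Controller Design
In this section, the optimal feedback controller $ U^* $ is designed to stabilize the following affine error system while minimizing the cost function.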
$ \dot Z = P(\bar h({\mathit{\boldsymbol{Z}}}) + GU^ *) $
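(26)
The cost function of system (26) is defined as
$ V(Z) = \int_0^\infty {\left( Q(Z) + U^{ * {{\rm T}}} RU^ * \right) {\rm d}t} $
(27)
where $ Q(Z) $ is a positive semidefinite penalty function and $ R = R^ {\rm T} > 0 $.
Based on the cost function (27), the Hamiltonian is defined as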
$ H(Z, U) = Q(Z) + U^{{\rm T}} RU + \nabla _z^ {\rm T} V(Z) P(\bar h({\mathit{\boldsymbol{Z}}}) + GU) $
(28)
where $ \nabla_z V(Z) $ is the partial derivative of $ V(Z) $ with respect to $ Z $. Solving $ \frac{{\partial H}}{{\partial U}} = 0 $ gives the optimal control input
$ U^ * = - \frac{1}{2}R^{ - 1} G^{{\rm T}} P^{{\rm T}} \nabla_z V^ * (Z) $
(29)
Substituting (29) into (28) gives the necessary and sufficient condition for the optimal control input: $ H(Z, U^ * ) = Q(Z) + \nabla_z^{{\rm T}} V^ * (Z) P\bar h({\mathit{\boldsymbol{Z}}}) - \dfrac{1}{4}\nabla_z^{{\rm T}}V^ * (Z) E\nabla_z V^ * (Z) = 0 $, at which the Hamiltonian attains its minimum. Here $ E = PGR^{ - 1} G^{{\rm T}}P^{{\rm T}} $ and $ V^ * (0) = 0 $.
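The optimality condition can be checked by hand in a scalar special case. The sketch below is an illustration only, under loudly stated assumptions that are not in the source: a scalar error system with $ P = 1 $, a linear drift $ \bar h(Z) = aZ $, input gain $ G = g $, $ Q(Z) = qZ^2 $, $ R = r $, and a quadratic value function $ V^*(Z) = kZ^2 $. Under these assumptions the condition $ H(Z, U^*) = 0 $ reduces to $ q + 2ak - k^2 g^2 / r = 0 $. All function names are hypothetical.

```python
import math


def riccati_gain(a, g, q, r):
    """Positive root k of q + 2*a*k - k**2 * g**2 / r = 0."""
    return r * (a + math.sqrt(a * a + q * g * g / r)) / (g * g)


def optimal_input(z, a, g, q, r):
    """U* = -(1/2) R^{-1} G^T P^T grad V*(Z), with P = 1 and V* = k z^2."""
    grad_v = 2.0 * riccati_gain(a, g, q, r) * z
    return -0.5 * (1.0 / r) * g * grad_v


def hamiltonian(z, u, a, g, q, r):
    """H(Z, U) = Q(Z) + U^T R U + grad V*^T P (h(Z) + G U) for the scalar case."""
    grad_v = 2.0 * riccati_gain(a, g, q, r) * z
    return q * z * z + r * u * u + grad_v * (a * z + g * u)


if __name__ == "__main__":
    a, g, q, r = -1.0, 1.0, 1.0, 0.2   # illustrative values only
    z = 0.5
    u_star = optimal_input(z, a, g, q, r)
    # The Hamiltonian is (numerically) zero at U* and larger for perturbed inputs.
    print(hamiltonian(z, u_star, a, g, q, r))
    print(hamiltonian(z, u_star + 0.1, a, g, q, r))
```

Because the Hamiltonian is quadratic in $ U $ with $ R > 0 $, the stationary point given by (29) is its minimizer, which the perturbed evaluation illustrates.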
Lemma 3 [27]. For system (26) with the cost function (27) and the optimal controller (29), let $ J(Z_s) $, $ J(Z) $ be radially unbounded and continuously differentiable Lyapunov functions such that $ \nabla _{z_s}^{{\rm T}}J(Z_s)\dot Z< 0 $, where $ \nabla _{z_s }J(Z_s ) = \dfrac{{\partial J(Z_s )}}{{\partial Z_s }} $. In addition, let $ \Lambda(Z) \ge 0 $ be a positive semidefinite matrix function such that $ \left\| \Lambda(Z) \right\| = 0 $ when $ \left\| Z \right\| = 0 $; $ \Lambda _{\min } \le \left\| {\Lambda (Z)} \right\| \le \Lambda _{\max } $ when $ \ell _{\min } \le \left\| Z \right\| \le \ell _{\max } $, where $ \ell _{\min } $, $ \ell_{\max } $, $ \Lambda _{\min } $, $ \Lambda _{\max } $ are positive constants; $ \mathop {\lim }\nolimits_{Z \to \infty } \Lambda(Z) = \infty $; and the equality $ Q(Z) + U^{*{{\rm T}}} RU^ * = \nabla_z^{{\rm T}} J(Z)\Lambda(Z)\nabla _{z_s} J(Z_s) $ holds, where $ \nabla _{z }J(Z ) = \dfrac{{\partial J(Z )}}{{\partial Z }} $. Then $ \nabla _{z_s}^{{\rm T}}J(Z_s)\dot Z = - \nabla _{z_s}^{{\rm T}} J(Z_s)\Lambda(Z)\nabla _{z_s}J(Z_s) $.
By Lemma 1, the optimal cost function is approximated by a fuzzy logic system:
$ V^ * (Z) = {\mathit{\boldsymbol{w}}}_c ^{{\rm T}} \phi (Z) + \varepsilon (Z) $
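where $ {\mathit{\boldsymbol{w}}}_c $ is the ideal weight vector, $ \phi (Z) $ is the fuzzy basis function vector and $ \varepsilon (Z) $ is the approximation error. The gradient of the optimal cost function is then
$ \nabla_z V^ * (Z) = \nabla _z^{{\rm T}} \phi (Z){\mathit{\boldsymbol{w}}}_c + \nabla _z \varepsilon (Z) $
(30)
Substituting (30) into (29) and (28), respectively, gives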
$ U^ * = - \frac{1}{2}R^{ - 1} G^{{\rm T}} P^{{\rm T}}(\nabla _z^{{\rm T}}\phi (Z){\mathit{\boldsymbol{w}}}_c + \nabla _z \varepsilon (Z)) $
(31) $ \begin{array}{l} H(Z, U^ * ) = Q(Z) + {\mathit{\boldsymbol{w}}}_c ^{{\rm T}} \nabla _z \phi (Z)P\bar h({\mathit{\boldsymbol{Z}}})- \\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\; \dfrac{1}{4}{\mathit{\boldsymbol{w}}}_c ^{{\rm T}} \Pi {\mathit{\boldsymbol{w}}}_c+ \varepsilon _{HJB} = 0 \end{array} $
(32)
where $ \Pi = \nabla _z \phi (Z)E\nabla _z^{{\rm T}}\phi (Z) $ and $ \varepsilon _{HJB} $ is the residual caused by the online fuzzy logic system reconstruction, $ \varepsilon _{HJB} = \nabla _z^{{\rm T}} \varepsilon (Z)P(\bar h({\mathit{\boldsymbol{Z}}}) + GU^ * ) + \dfrac{1}{4}\nabla _z^{{\rm T}} \varepsilon (Z)E\nabla _z \varepsilon (Z) $.
The cost function is estimated by a fuzzy logic system as
$ \hat V(Z) = \hat {{\mathit{\boldsymbol{w}}}}_c ^{{\rm T}} \phi (Z) $
(33)
where $ \hat {{\mathit{\boldsymbol{w}}}}_c $ is the estimate of $ {{\mathit{\boldsymbol{w}}}}_c $. The estimate of the optimal controller is then
$ \hat U^ * = - \frac{1}{2}R^{ - 1} G^{{\rm T}}P^{{\rm T}} \nabla _z^{{\rm T}} \phi (Z)\hat {{\mathit{\boldsymbol{w}}}}_c $
(34)
Substituting (34) into (28) gives the estimated Hamiltonian
$ \hat H(Z, \hat U^ *) = Q(Z) + \hat w_c ^{{\rm T}} \nabla_z \phi (Z)P\hat {\bar h}({\mathit{\boldsymbol{Z}}}) - \frac{1}{4}\hat {{\mathit{\boldsymbol{w}}}}_c ^{{\rm T}}\Pi \hat{{\mathit{\boldsymbol{w}}}}_c $
(35)
where $ \hat {\bar h}({\mathit{\boldsymbol{Z}}}) = [\hat h_1 ({\mathit{\boldsymbol{Z}}}_1 |\hat {{\mathit{\boldsymbol{w}}}}_1 ), \cdots , \hat h_n ( {\mathit{\boldsymbol{Z}}}_n |\hat {{\mathit{\boldsymbol{w}}}}_n )]^{{\rm T}} $ and $ \hat h_i ( {\mathit{\boldsymbol{Z}}}_i | \hat {{\mathit{\boldsymbol{w}}}}_i ) =\hat f_i (\bar x_i |\hat {{\mathit{\boldsymbol{w}}}}_i ) - \hat f_i (\bar x_{id} |\hat {{\mathit{\boldsymbol{w}}}}_i ), i = 1, \cdots , n $.
To minimize $ \hat H(Z, \hat U^ *) $, the update law $ \dot {\hat {{\mathit{\boldsymbol{w}}}}}_c $ is designed by gradient descent as
$ \begin{array}{l} \dot {\hat {{\mathit{\boldsymbol{w}}}}}_c = - \beta _1 \hat \sigma (Q(Z) + \hat {{\mathit{\boldsymbol{w}}}}_c ^{{\rm T}} \nabla_z \phi (Z)P\hat {\bar h}({\mathit{\boldsymbol{Z}}}) - \\ \;\;\;\;\;\;\;\dfrac{1}{4}\hat {{\mathit{\boldsymbol{w}}}}_c ^{{\rm T}} \Pi \hat {{\mathit{\boldsymbol{w}}}}_c)+ \frac{1}{2}\beta _2 \sum {(Z, \hat U^ *)} \times \\ \;\;\;\;\;\;\;\nabla_z \phi (Z)E\nabla_{z_s} J(Z_s) \end{array} $
(36)
where $ \hat \sigma = \nabla_z \phi (Z)P(\hat {\bar h}({\mathit{\boldsymbol{Z}}}) + G\hat U^ *) $, $ \beta _1 > 0 $ and $ \beta _2 > 0 $ are design parameters, and $ J(Z_s ) = (Z_s^{\rm{T}} Z_s )^\frac{5}{2} /5 $ is chosen here. To guarantee system stability, $ \sum {(Z, \hat U^ *)} $ is defined as
$ \sum {(Z, {{\hat U}^*})} = \left\{ {\begin{aligned} 0\text{, }&{\nabla _{{z_s}}^{\rm{T}}J({Z_s}){{\dot Z}_s} < 0}\\ 1\text{, }&\text{otherwise} \end{aligned}} \right. $
where $ \dot Z_s = P(\hat {\bar h}({\mathit{\boldsymbol{Z}}}) + G\hat U^ *) $.
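The switching term can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `grad_J` and `switch` are hypothetical, vectors are plain Python lists, and the gradient $ \nabla J(Z_s) = (Z_s^{\rm T} Z_s)^{3/2} Z_s $ is obtained by differentiating the chosen $ J(Z_s) = (Z_s^{\rm T} Z_s)^{5/2}/5 $.

```python
def grad_J(z):
    """Gradient of J(Z) = (Z^T Z)^(5/2) / 5, which equals (Z^T Z)^(3/2) * Z."""
    sq = sum(zi * zi for zi in z)
    scale = sq ** 1.5
    return [scale * zi for zi in z]


def switch(z, z_dot):
    """Indicator: 0 if grad_J(z)^T z_dot < 0 (J is decreasing), otherwise 1."""
    g = grad_J(z)
    inner = sum(gi * di for gi, di in zip(g, z_dot))
    return 0 if inner < 0 else 1


if __name__ == "__main__":
    # Error moving toward the origin: the extra stabilizing term is switched off.
    print(switch([1.0, 0.0], [-1.0, 0.0]))
    # Error moving away from the origin: the extra term is switched on.
    print(switch([1.0, 0.0], [1.0, 0.0]))
```

The indicator activates the $ \beta_2 $ term in the update law only when $ J $ is not decreasing along the estimated error dynamics $ \dot Z_s $.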
From $ Q(Z) = - {\mathit{\boldsymbol{w}}}_c^{{\rm T}} \nabla _z \phi (Z)P\bar h({\mathit{\boldsymbol{Z}}}) + \dfrac{1}{4}{\mathit{\boldsymbol{w}}}_c^ {\rm T} \Pi {\mathit{\boldsymbol{w}}}_c - \varepsilon _{HJB} $ and $ \dot {\hat {{\mathit{\boldsymbol{w}}}}}_c = - \dot {\tilde {{\mathit{\boldsymbol{w}}}}}_c $, we obtain
$ \begin{array}{l} \dot {\tilde {{\mathit{\boldsymbol{w}}}}}_c = - \beta _1 \left(\nabla _z \phi (Z)\left(\dot Z - P\tilde {\bar h}({\mathit{\boldsymbol{Z}}}) + \dfrac{{E\nabla _z \varepsilon (Z)}}{2}\right)+ \right.\\ \;\;\;\;\;\left. \dfrac{1}{2}\Pi \tilde {{\mathit{\boldsymbol{w}}}}_c \right)\times \left(\tilde {{\mathit{\boldsymbol{w}}}}_c^{{\rm T}} \nabla _z \phi (Z)\left(\dot Z - P\tilde {\bar h}({\mathit{\boldsymbol{Z}}}) + \right.\right.\\ \;\;\;\;\; \left. \dfrac{{E\nabla _z \varepsilon (Z)}}{2}\right)+ \dfrac{1}{4}\tilde {{\mathit{\boldsymbol{w}}}}_c ^ {\rm T} \Pi \tilde {{\mathit{\boldsymbol{w}}}}_c + \varepsilon _{HJB} + \\ \;\;\;\;\; \left.{\mathit{\boldsymbol{w}}}_c ^{{\rm T}} \nabla _z \phi (Z)P\tilde {\bar h}({\mathit{\boldsymbol{Z}}})\right)- \dfrac{1}{2}\beta _2 \sum {(Z, \hat U^ *)}\times \\ \;\;\;\;\; \nabla _z \phi (Z)E\nabla_{z_s} J(Z_s) \end{array} $
(37)
where $ \tilde{\bar h}({\mathit{\boldsymbol{Z}}}) = \bar h({\mathit{\boldsymbol{Z}}})-\hat{\bar h}({\mathit{\boldsymbol{Z}}}) $.
Based on the adaptive laws (16), (19) and (21), and introducing an additional term, we obtain
$ \begin{array}{l} \dot {\hat {{\mathit{\boldsymbol{w}}}}} = \sum\limits_{i = 1}^n {PB_i\gamma _i {\bar z}_i \varphi _i (\bar x_{id} ) - \sum\limits_{i = 1}^n {B_i \sigma _i \hat {{\mathit{\boldsymbol{w}}}}_i \left\| {\hat {{\mathit{\boldsymbol{w}}}}_i } \right\|^2 }}- \\ \;\;\;\;\;\;\beta _2 \sum ( Z, \hat U^* )P\bar \varphi({\mathit{\boldsymbol{Z}}})\nabla _{z_s } J(Z_s ) \end{array}$
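(38)
where $ B_i = [\underbrace {0, \cdots , 0}_{i - 1}, 1, 0, \cdots , 0]_{n \times 1}^{{\rm T}} $, $ \bar \varphi ({\mathit{\boldsymbol{Z}}}) ={{\rm diag}} \{\varphi _1 (x_1 ) -\varphi _1 (x_{1d} ), \cdots , \varphi _n (\bar x_n ) - \varphi _n (\bar x_{nd} )\} $, $ \dot {\hat {{\mathit{\boldsymbol{w}}}}} = \left[\dot {\hat {{\mathit{\boldsymbol{w}}}}}_1^{{\rm T}} , \cdots, \dot {\hat {{\mathit{\boldsymbol{w}}}}}_n^{{\rm T}}\right]^{{\rm T}} $.
Theorem 1. Consider the class of strict-feedback nonlinear systems (1) with prescribed performance and input dead zone constraints. With the feedforward virtual controllers (15), (17), (18), the actual feedforward controller (20), the optimal feedback controller (34) and the adaptive laws (36) and (38), suitable parameter choices ensure that all signals in the closed-loop system are uniformly ultimately bounded, and that the tracking error converges in an optimal manner while meeting the prescribed performance requirement.
Proof. See Appendix A.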
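3. Simulation Example
This section verifies the effectiveness and feasibility of the proposed adaptive fuzzy optimal control method on a robotic manipulator system. The dynamics of the manipulator with the input dead zone constraint are: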
$ \left\{ {\begin{aligned} &{{{\dot x}_1} = {x_2}}\\ &{{{\dot x}_2} = - \frac{{Mgl}}{J}\sin {x_1} - \frac{D}{J}{x_2} + u}\\ &{y = {x_1}} \end{aligned}} \right. $
where $ x_1 $ and $ x_2 $ are the angular position and angular velocity of the link, respectively, $ M = 1\;{\rm kg} $ is the total mass of the link, $ g = 9.8\;{\rm m/s^2} $ is the gravitational acceleration, $ l = 1\;{\rm m} $ is the distance from the center of mass of the link to its center of rotation, $ D = 2\;{\rm N·m·s/rad} $ is the viscous friction coefficient of the link rotation, and $ J = 1\;{\rm kg·m^2} $ is the moment of inertia of the link.
The reference signal is $ x_{1d} = \sin (t) $. The dead zone parameters are $ M_r = 3 $, $ M_l = 1 $, $ a_r = 1.5 $, $ a_l = 3 $. The fuzzy membership functions are $ \mu _{F_i^l } (x_i ) =\exp \left[ - \dfrac{(x_i - 6 + 2l)^2 }{4} \right] $, $ \mu _{F_i^l } (x_{id} ) = \exp \left[ - \dfrac{(x_{id} - 6 + 2l)^2 }{4} \right] $, $ \mu _{F_i^l } ({\bar z}_i ) = \exp \left[ - \dfrac{({\bar z}_i - 3 + l)^2 }{3}\right] $, $ i = 1, 2 $, $ l = 1, 2, 3, 4, 5 $. The initial values are $ x_1 (0) = 1.4 $, $ x_2 (0) = -0.2 $ and $ \hat w_c = [1, 1, 1, -1, -1]^{{\rm T}} $. The performance function is $ \mu = 2.5{\rm{e}}^{ - 0.5t} + 0.05 $ with $ \delta _{\min } = 0.6 $ and $ \delta_{\max} = 0.8 $. The design parameters are $ c_1 = 10 $, $ c_2 = 50 $, $ \gamma _1 = 1 $, $ \gamma _2 = 1 $, $ \sigma _1 = 50 $, $ \sigma _2 = 50 $, $ \beta _1 = 0.01 $, $ \beta _2 = 0.01 $. In the cost function (27), $ R = {\rm diag}\{0.2, 0.01\} $ and $ Q(Z) = {\bar z}_1^2+{\bar z}_2^2 $. The initial values of all remaining parameters are 0.
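The membership functions above can be turned into a fuzzy basis vector as sketched below. This is an assumption-laden illustration: the source lists only the Gaussian membership functions, so the normalized weighted-average form $ \varphi_l = \mu_l / \sum_k \mu_k $ and the restriction to a single scalar input are assumptions, and all function names are hypothetical.

```python
import math


def membership(x, l):
    """Gaussian membership exp(-(x - 6 + 2l)^2 / 4), centered at 6 - 2l."""
    return math.exp(-((x - 6 + 2 * l) ** 2) / 4.0)


def fuzzy_basis(x):
    """Assumed normalized basis phi_l(x) = mu_l(x) / sum_k mu_k(x), l = 1..5."""
    mus = [membership(x, l) for l in range(1, 6)]
    total = sum(mus)
    return [m / total for m in mus]


def fls_output(weights, x):
    """Fuzzy logic system output w^T phi(x) for a scalar input."""
    return sum(w * p for w, p in zip(weights, fuzzy_basis(x)))


if __name__ == "__main__":
    # Centers are 4, 2, 0, -2, -4; the basis entries are positive and sum to one.
    print(fuzzy_basis(0.3))
    print(fls_output([1.0, 1.0, 1.0, -1.0, -1.0], 0.3))
```

With this normalization the basis entries always sum to one, so the system output is a convex combination of the weights.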
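The simulation results are shown in Figs. 1-4. Fig. 1 shows the trajectories of the reference signal $ x_{1d} $ and the system output $ y $; the output tracks the reference within 5 s, indicating good tracking performance. Fig. 2 shows the tracking error $ \tilde z_{1} $; it converges to a bounded neighborhood of the origin, satisfies the prescribed performance requirement, and has a steady-state error smaller than 0.01. Fig. 3 shows the cost function weights $ \hat w_{ci} $ and the estimated Hamiltonian $ \hat H(Z, \hat U^ *) $; the weights converge quickly to their target values and the Hamiltonian tends to 0. Fig. 4 shows the responses of the actuator input $ v $ and the actuator output $ u $. The simulation results show that the proposed control scheme keeps all signals of the closed-loop system bounded and thus guarantees stability.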
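4. Conclusion
This paper proposed an adaptive fuzzy optimal control method for a class of strict-feedback nonlinear systems with unknown parameters, subject to input dead zone and prescribed performance constraints. First, a feedforward controller was designed on the basis of the backstepping method and the command filter technique, using the dead zone slope information and the performance function. A single-network ADP method was then used to design the optimal feedback controller. Finally, Lyapunov stability theory was used to prove the stability of the closed-loop system. The simulation results show that the proposed design solves the optimal control problem for strict-feedback systems with dead zone and prescribed performance.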
Appendix A
Choose the Lyapunov function
$ V_{HJB} = V_n + \frac{1}{2}\tilde {{\boldsymbol{w}}}_c ^{{\rm T}} \tilde {{\boldsymbol{w}}}_c + \beta_2 J(Z_s) $
(A1)
Combining (25) and (37), the derivative of $ V_{HJB} $ satisfies
$ \begin{array}{l} \dot V_{HJB} \le - \kappa _1 \left\| Z \right\|^2 - \kappa _2 \left\| {\tilde {{\boldsymbol{w}}}} \right\|^4 + D+ \\ \;\;\;\;\;\;\;\; Z^{{\rm T}} P (\bar h(\mathit{\boldsymbol{Z}})+ GU^ * ) - \beta _1 \tilde {{\boldsymbol{w}}}_c ^ {\rm T}( \nabla _z \phi (Z)\times \\ \;\;\;\;\;\;\;\; (\dot Z- P\tilde {\bar h}(\mathit{\boldsymbol{Z}})+ \frac{{E\nabla _z \varepsilon (Z)}}{2}) + \frac{1}{2}\Pi \tilde {{\boldsymbol{w}}}_c )\times \\ \;\;\;\;\;\;\;\; (\tilde {{\boldsymbol{w}}}_c^{{\rm T}} \nabla _z \phi (Z)(\dot Z-P\tilde {\bar h}(\mathit{\boldsymbol{Z}}) + \frac{{E\nabla _z \varepsilon (Z)}}{2}) + \\ \;\;\;\;\;\;\;\;\frac{1}{4}\tilde {{\boldsymbol{w}}}_c ^{{\rm T}} \Pi \tilde {{\boldsymbol{w}}}_c+ \varepsilon _{HJB} + {\boldsymbol{w}}_c ^{{\rm T}} \nabla _z \phi (Z)P\tilde {\bar h}(\mathit{\boldsymbol{Z}})) - \\ \;\;\;\;\;\;\;\;\beta _2 \sum {(Z,\hat U^ *)} (\frac{1}{2}\tilde {{\boldsymbol{w}}}_c^{{\rm T}} \nabla _z \phi (Z)E-\tilde {{\boldsymbol{w}}}^{{\rm T}} P\times\\ \;\;\;\;\;\;\;\; \bar \varphi (\mathit{\boldsymbol{Z}}))\nabla _{z_s} J(Z_s)+ \beta_2 \nabla _{z_s}^{{\rm T}} J(Z_s)\dot Z_s \end{array} $
(A2)
Following [27], assume that $ \left\| {\nabla_z \varepsilon (Z)} \right\| \le b_M $, $ \left\| {\nabla_z \phi (Z)} \right\| \le \phi _M $, $ \left\| {\varepsilon _{HJB} } \right\| \le \lambda _\varepsilon $, $ \Pi_m \leq \left\| \Pi \right\| \leq \Pi _M $, $ \left\|P(\bar h(\mathit{\boldsymbol{Z}}) + GU^*)\right\| \le I\sqrt {\left\| Z \right\|} $, and take $ \left\| {\nabla_z \phi (Z)E\nabla _z \varepsilon (Z)} \right\| \le \lambda _M $, where $ I $, $ b_M $, $ \phi _M $, $ \lambda _\varepsilon $, $ \Pi _m $, $ \Pi _M $, $ \lambda _M $ are positive parameters. Each term is expanded and bounded using Young's inequality and the Cauchy inequality $ (\sum\nolimits_{i = 1}^n a_i )^2 \le n\sum\nolimits_{i = 1}^n a_i^2 $. The bounding procedure is illustrated on one of the terms:
$ \begin{array}{l}- \beta _1 (\tilde {{\boldsymbol{w}}}_c^{{\rm T}} \nabla _z \phi (Z)P\tilde {\bar h}(\mathit{\boldsymbol{Z}}))(\tilde {{\boldsymbol{w}}}_c^{{\rm T}} \nabla _z \phi (Z)P\tilde {\bar h}(\mathit{\boldsymbol{Z}}))\le \notag\\ \beta _1 [\frac{{{\bf{\pi }} _1\left\| P \right\|^4}}{2}(8\left\| {\tilde {{\boldsymbol{w}}}} \right\|^4 +(\sum\limits_{i = 1}^n {4\varsigma _i^2 + 4\varsigma _i^{'2} } )^2)+\notag\\ \;\;\;\;\; \frac{9}{{2{\bf{\pi }} _1 }}\phi _M^4 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^4 ] \end{array} $
where $ \pi_1 $ is a positive constant, and $ \varsigma_i $, $ \varsigma_i^{'} $ are the upper bounds of $ \vartheta _i (\bar x_{i} ) $ and $ \vartheta _i (\bar x_{id} ) $, respectively. Treating the remaining terms in the same way gives
$ \begin{array}{l} \dot V_{HJB} \le - k_1 \left\| Z \right\|^2 + k_2 \left\| Z \right\| - k_3 \left\| {\tilde{\boldsymbol w}} \right\|^4 -\\ \;\;\;\;\;\;\; k_4 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^4 + k_5 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^2+ k_6 -\\ \;\;\;\;\;\;\; \beta _2 \sum {(Z,\hat U^ *)} (\frac{1}{2}\tilde {{\boldsymbol{w}}}_c^{{\rm T}} \nabla _z \phi (Z)E-\tilde {{\boldsymbol{w}}}^{{\rm T}} P\times\\ \;\;\;\;\;\;\; \bar \varphi (\mathit{\boldsymbol{Z}}))\nabla _{z_s} J(Z_s)+\beta_2 \nabla _{z_s}^{{\rm T}}J(Z_s)\dot Z_s \end{array} $
(A3)
where $ k_1 = \kappa _1 - ( \frac{1}{{2{\bf{\pi }} _2 }} + \frac{1}{{2{\bf{\pi }} _3 }} + \frac{3}{{16{\bf{\pi }} _5 }} + \frac{1}{{4{\bf{\pi }} _8 }}+ \frac{1}{{4{\bf{\pi }} _9 }}+\frac{1}{2{\bf{\pi }}_{16}})\beta _1I^4 - \frac{1}{2} $, $ k_2 = \frac{1}{2}I^2 $, $ k_3 = \kappa _2 - \beta _1 \left\| P \right\|^4[4({\bf{\pi }} _1 + {\bf{\pi }} _2 ) + 2({\bf{\pi }} _4 + {\bf{\pi }} _9 + {\bf{\pi }} _{11} )+ \frac{{3{\bf{\pi }} _6 }}{2} + \frac{2}{{{\bf{\pi }} _{10} }} + \frac{2}{{{\bf{\pi }} _{11} }}+ {\bf{\pi }} _{13} + {\bf{\pi }} _{15} ] $, $ k_4 = \beta _1 [\frac{{\Pi _m^2 }}{8} - (\frac{9}{{2{\bf{\pi }} _1 }} + \frac{1}{{2{\bf{\pi }} _2 }} + \frac{1}{{2{\bf{\pi }} _3 }} + \frac{{9{\bf{\pi }} _2 }}{2} + \frac{{9{\bf{\pi }} _4 }}{4} + \frac{3}{{16{\bf{\pi }} _5 }} + \frac{{27{\bf{\pi }} _6 }}{{16}} + \frac{1}{{4{\bf{\pi }} _8 }} + \frac{1}{{4{\bf{\pi }} _9 }} + \frac{9}{{4{\bf{\pi }} _{10} }} + \frac{9}{{4{\bf{\pi }} _{11} }}+\frac{{\bf{\pi }}_{16}}{2})\phi _M^4- (\frac{{3{\bf{\pi }} _5 }}{8} + \frac{3}{{8{\bf{\pi }} _6 }} + \frac{{3{\bf{\pi }} _7 }}{{16}} +\frac{1}{{4{\bf{\pi }} _{14} }}+\frac{1}{{4{\bf{\pi }} _{15} }})\Pi _M^2] $, $ k_5 = \beta _1[ (\frac{{{\bf{\pi }} _3+1 }}{4} + \frac{1}{{2{\bf{\pi }} _4 }} + \frac{3}{{16{\bf{\pi }} _7 }} + \frac{1}{{4{\bf{\pi }} _{13} }}+ \frac{1}{{4{\bf{\pi }} _{12} }})\lambda _M^2 ] $, $ k_6 = D+\beta _1 [(\frac{{{\bf{\pi }} _1 }}{2} + \frac{{{\bf{\pi }} _2 }}{2} + \frac{{{\bf{\pi }} _4 }}{4} + \frac{{3{\bf{\pi }} _6 }}{{16}} + \frac{{{\bf{\pi }} _9 }}{4} + \frac{1}{{4{\bf{\pi }} _{10} }} + \frac{{{\bf{\pi }} _{11} }}{4} + \frac{{{\bf{\pi }} _{13} }}{8} + \frac{{{\bf{\pi }} _{15} }}{8})\left\| P \right\|^4\left(\sum\limits_{i = 1}^n {4\varsigma _i^2 + 4\varsigma _i^{'2} }\right)^2+ (\frac{{{\bf{\pi }} _8}}{2} + \frac{{{\bf{\pi }} _{10} }}{2} + \frac{{{\bf{\pi }} _{12} }}{4} + \frac{{{\bf{\pi }} _{14} }}{4})\lambda _\varepsilon ^2 + (\frac{{9{\bf{\pi }} _9 }}{4} + \frac{{9{\bf{\pi }} _{11} }}{4} + \frac{9{{\bf{\pi }} _{13} }}{8}+\frac{{9{\bf{\pi }} _{15} }}{8})\phi _M^4 \left\| {{\boldsymbol{w}}_c } \right\|^4 ] $, and $ {{\bf{\pi }} _{i} } $, $ i = 1, \ldots ,16 $, are positive constants. With suitable parameter choices, $ k_1 ,k_3 ,k_4 > 0 $.
When $ \sum {(Z,\hat U^ *)} = 0 $, since $ \nabla _{z_s}^{{\rm T}} J(Z_s)\dot Z_s < 0 $, we have $ - \nabla _{z_s}^{{\rm T}} J(Z_s)\dot Z_s > 0 $, so there exists $ k_7 > 0 $ such that $ 0 < k_7 \left\| {\nabla _{z_s} J(Z_s)} \right\| \le - \nabla _{z_s}^{{\rm T}} J(Z_s)\dot Z_s $, i.e., $ \nabla _{z_s}^{{\rm T}} J(Z_s)\dot Z_s \le - k_7 \left\| {\nabla _{z_s} J(Z_s)} \right\| < 0 $. Combining this with (A3) gives
$ \begin{array}{l} \dot V_{HJB} \le - k_1 \left\| Z \right\|^2 + k_2 \left\| Z \right\| - k_3 \left\| {\tilde {{\boldsymbol{w}}}} \right\|^4 - \notag\\ \;\;\;\;\;\;\;\; k_4 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^4+ k_5 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^2+ k_6 - \notag\\ \;\;\;\;\;\;\;\; k_7 \beta _2 \left\| {\nabla _{z_s} J(Z_s)} \right\| \end{array} $
If any of the following conditions holds
$ \begin{array}{l} \left\| {\nabla _{z_s} J(Z_s)} \right\| > A_1, {\text{ or }}\left\| Z \right\| >B_1,\\ \;\;\;\;\;\;\;\;\; {\text{or }} \left\| {\tilde {{\boldsymbol{w}}}_c } \right\| >C_1, {\text{ or }}\left\| {\tilde {{\boldsymbol{w}}}} \right\| > D_1 \end{array} $
where $ A_1 = \frac{{k_6 }}{{k_7 \beta _2 }} $, $ B_1 = \frac{{k_2 + \sqrt {k_2 ^2 + 4k_1 k_6 } }}{{2k_1 }} $, $ C_1 = \sqrt {\frac{{k_5 + \sqrt {k_5 ^2 + 4k_4 k_6 } }}{{2k_4 }}} $, $ D_1 = \sqrt[4]{{\frac{{k_6 }}{{k_3 }}}} $, then $ \dot V_{HJB} < 0 $.
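The thresholds can be read as sign-change points. The following is a sketch of where the constants come from, taking one group of terms at a time together with the constant $ k_6 $ and assuming $ k_1, k_3, k_4, k_6, k_7 \beta_2 > 0 $; it is not a replacement for the full argument. With $ s = \left\| Z \right\| \ge 0 $ and $ r = \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^2 \ge 0 $:
$ - k_1 s^2 + k_2 s + k_6 < 0 \iff s > \frac{{k_2 + \sqrt {k_2 ^2 + 4k_1 k_6 } }}{{2k_1 }} = B_1 $
$ - k_4 r^2 + k_5 r + k_6 < 0 \iff r > \frac{{k_5 + \sqrt {k_5 ^2 + 4k_4 k_6 } }}{{2k_4 }} = C_1^2 $
$ - k_3 \left\| {\tilde {{\boldsymbol{w}}}} \right\|^4 + k_6 < 0 \iff \left\| {\tilde {{\boldsymbol{w}}}} \right\| > \sqrt[4]{{\frac{{k_6 }}{{k_3 }}}} = D_1 $
$ - k_7 \beta _2 \left\| {\nabla _{z_s} J(Z_s)} \right\| + k_6 < 0 \iff \left\| {\nabla _{z_s} J(Z_s)} \right\| > \frac{{k_6 }}{{k_7 \beta _2 }} = A_1 $
In the first two cases the threshold is the positive root of the corresponding quadratic, since the other root is nonpositive.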
When $ \sum {(Z,\hat U^ *)} = 1 $, adding the terms $ \beta _2 \nabla _{z_s }^{{\rm T}} J(Z_s )\bar \varsigma $ with $ \bar \varsigma = [\varsigma_1,\ldots,\varsigma_n]^{{\rm T}} $, $ \beta _2 \nabla _{z_s }^{{\rm T}} J(Z_s )\bar \varsigma ^{'} $ with $ \bar\varsigma^{'} = [\varsigma_1^{'},\ldots,\varsigma_n^{'}]^{{\rm T}} $, and $ \frac{\beta_2}{2}\nabla _{z_s}^{{\rm T}} J(Z_s)E\nabla_z \varepsilon (Z) $ to the right-hand side of (A3), and using Lemma 3 and Young's inequality, gives
$\begin{array}{l} \dot V_{HJB} \le - k_1 \left\| Z \right\|^2 + k_2 \left\| Z \right\| - k_3 \left\| {\tilde{\boldsymbol w}} \right\|^4- \notag\\ \;\;\;\;\;\;\;\; k_4 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^4+ k_5 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^2 + k_6 + \beta_2\nabla _{z_s}^{{\rm T}}J(Z_s)\times\notag\\ \;\;\;\;\;\;\;\; P(\bar h(\mathit{\boldsymbol{Z}}) + GU^ * )+ \beta _2\nabla _{z_s}^{{\rm T}} J(Z_s)\times \notag\\ \;\;\;\;\;\;\;\;(\frac{1}{2}E\nabla _z \varepsilon (Z) - \bar \varsigma+\bar \varsigma ^{'}) \le\notag \\ \;\;\;\;\;\;\;\; - k_1 \left\| Z \right\|^2 + k_2 \left\| Z \right\| - k_3 \left\| {\tilde {{\boldsymbol{w}}}} \right\|^4 - \notag\\ \;\;\;\;\;\;\;\; k_4 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^4 + k_5 \left\| {\tilde {{\boldsymbol{w}}}_c } \right\|^2 + k_6- \notag\\ \;\;\;\;\;\;\;\;\lambda _{5} \left\| {\nabla _{z_s} J(Z_s)} \right\|^2+\lambda _{7}\left\| {\nabla _{z_s} J(Z_s)} \right\| \end{array} $
where $ \lambda_7 = \frac{\beta_2}{2} b_M \left\| P \right\|^2 \left\| {R^{ - 1} } \right\| \left\| G \right\|^2+\beta _2\left\| \bar \varsigma \right\|+\beta _2\left\| \bar \varsigma^{'} \right\| $, $ \lambda _{5} = \beta_2 \lambda_{\min }(\Lambda(Z) ) $, and $ \lambda_{\min }(\Lambda(Z) ) $ is the minimum eigenvalue of $ \Lambda(Z) $. If any one of the following conditions holds
$ \begin{array}{l} \left\| {\nabla_{z_s} J(Z_s )} \right\| > A_2, {\text{ or }}\left\| Z \right\| > B_1 , \\ \;\;\;\;\;\;\;{\text{or }}\left\| {\tilde {{\boldsymbol{w}}}_c } \right\| >C_1 , {\text{ or }}\left\| {\tilde {{\boldsymbol{w}}}} \right\| > D_1 \end{array} $
where $ A_2 = \frac{{\lambda _7 + \sqrt {\lambda _7^2 + 4\lambda _5 k_6 } }}{{2\lambda _5 }} $, then $ \dot V_{HJB} < 0 $.
In summary, if $ \left\| {\nabla_{z_s} J(Z_s )} \right\| > \max \{A_1 ,A_2 \} $, or $ \left\| Z\right\| > B_1 $, or $ \left\| {\tilde {{\boldsymbol{w}}}_c } \right\| >C_1 $, or $ \left\| {\tilde {{\boldsymbol{w}}}} \right\| >D_1 $, then $ \dot V_{HJB} < 0 $.
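To make the thresholds concrete, the following minimal Python sketch evaluates $ A_1 $, $ A_2 $, $ B_1 $, $ C_1 $ and $ D_1 $ from their closed-form definitions and checks that each grouped polynomial vanishes at its threshold. The function names and all numerical constants are hypothetical and chosen only for illustration; they are not taken from the paper or its simulation.

```python
import math


def ultimate_bounds(k1, k2, k3, k4, k5, k6, k7, beta2, lam5, lam7):
    """Evaluate the thresholds A1, A2, B1, C1, D1 from their closed forms."""
    a1 = k6 / (k7 * beta2)
    a2 = (lam7 + math.sqrt(lam7**2 + 4 * lam5 * k6)) / (2 * lam5)
    b1 = (k2 + math.sqrt(k2**2 + 4 * k1 * k6)) / (2 * k1)
    c1 = math.sqrt((k5 + math.sqrt(k5**2 + 4 * k4 * k6)) / (2 * k4))
    d1 = (k6 / k3) ** 0.25
    return {"A1": a1, "A2": a2, "B1": b1, "C1": c1, "D1": d1}


def z_terms(x, k1, k2, k6):
    """Group -k1*x^2 + k2*x + k6, with x standing for ||Z||."""
    return -k1 * x**2 + k2 * x + k6


def wc_terms(y, k4, k5, k6):
    """Group -k4*y^4 + k5*y^2 + k6, with y standing for ||w_c tilde||."""
    return -k4 * y**4 + k5 * y**2 + k6


def grad_terms(g, lam5, lam7, k6):
    """Group -lam5*g^2 + lam7*g + k6, with g standing for ||grad J||."""
    return -lam5 * g**2 + lam7 * g + k6


# Hypothetical constants, assumed positive; NOT values from the paper.
K = dict(k1=2.0, k2=1.0, k3=1.0, k4=1.0, k5=1.0, k6=0.5,
         k7=1.0, beta2=1.0, lam5=1.0, lam7=1.0)

BOUNDS = ultimate_bounds(**K)

# The gradient threshold used in the summary statement is max{A1, A2}.
GRAD_BOUND = max(BOUNDS["A1"], BOUNDS["A2"])

if __name__ == "__main__":
    for name, value in BOUNDS.items():
        print(f"{name} = {value:.4f}")
    print(f"max(A1, A2) = {GRAD_BOUND:.4f}")
```

Each helper returns zero at its threshold and a negative value just beyond it, which is the sign change the conditions above rely on.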
[1] Lee H, Tomizuka M. Robust adaptive control using a universal approximator for SISO nonlinear systems. IEEE Transactions on Fuzzy Systems, 2000, 8(1):95-106 doi: 10.1109/91.824777
[2] Ge S S, Wang C. Adaptive NN control of uncertain nonlinear pure-feedback systems. Automatica, 2002, 38(4):671-682 doi: 10.1016/S0005-1098(01)00254-0
[3] Ge S S, Wang C. Direct adaptive NN control of a class of nonlinear systems. IEEE Transactions on Neural Networks, 2002, 13(1):214-221 doi: 10.1109/72.977306
[4] Hu X, Wei X J, Zhang H F, Han J, Liu X H. Robust adaptive tracking control for a class of mechanical systems with unknown disturbances under actuator saturation. International Journal of Robust and Nonlinear Control, 2019, 29(6):1893-1908 doi: 10.1002/rnc.4465
[5] Chen M, Shao S Y, Jiang B. Adaptive neural control of uncertain nonlinear systems using disturbance observer. IEEE Transactions on Cybernetics, 2017, 47(10):3110-3123 doi: 10.1109/TCYB.2017.2667680
[6] Chen C L P, Wen G X, Liu Y J, Liu Z. Observer-based adaptive backstepping consensus tracking control for high-order nonlinear semi-strict-feedback multiagent systems. IEEE Transactions on Cybernetics, 2016, 46(7):1591-1601 doi: 10.1109/TCYB.2015.2452217
[7] Qian W, Gao Y S, Yang Y. Global consensus of multiagent systems with internal delays and communication delays. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2018
[8] Qian W, Wang L, Chen M Z Q. Local consensus of nonlinear multiagent systems with varying delay coupling. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2018, 48(12):2462-2469 doi: 10.1109/TSMC.2017.2684911
[9] Li T S, Wang D, Feng G, Tong S C. A DSC approach to robust adaptive NN tracking control for strict-feedback nonlinear systems. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2010, 40(3):915-927 doi: 10.1109/TSMCB.2009.2033563
[10] Dong W J, Farrell J A, Polycarpou M M, Djapic V, Sharma M. Command filtered adaptive backstepping. IEEE Transactions on Control Systems Technology, 2012, 20(3):566-580 doi: 10.1109/TCST.2011.2121907
[11] Bai W W, Zhou Q, Li T S, Li H Y. Adaptive reinforcement learning neural network control for uncertain nonlinear system with input saturation. IEEE Transactions on Cybernetics, DOI: 10.1109/TCYB.2019.292105
[12] Bellman R E. Dynamic Programming. Princeton: Princeton University Press, 1957
[13] Wang Ding, Mu Chao-Xu, Liu De-Rong. Data-driven nonlinear near-optimal regulation based on iterative neural dynamic programming. Acta Automatica Sinica, 2017, 43(3):366-375 (in Chinese) http://www.aas.net.cn/CN/abstract/abstract19015.shtml
[14] Zhang Hua-Guang, Zhang Xin, Luo Yan-Hong, Yang Jun. An overview of research on adaptive dynamic programming. Acta Automatica Sinica, 2013, 39(4):303-311 (in Chinese) http://www.aas.net.cn/CN/abstract/abstract17916.shtml
[15] Murray J J, Cox C J, Lendaris G G, Saeks R. Adaptive dynamic programming. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2002, 32(2):140-153 doi: 10.1109/TSMCC.2002.801727
[16] Vamvoudakis K G, Lewis F L. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica, 2010, 46(5):878-888 doi: 10.1016/j.automatica.2010.02.018
[17] Zargarzadeh H, Dierks T, Jagannathan S. Optimal control of nonlinear continuous-time systems in strict-feedback form. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(10):2535-2549 doi: 10.1109/TNNLS.2015.2441712
[18] Wang D, He H B, Zhao B, Liu D R. Adaptive near optimal controllers for nonlinear decentralised feedback stabilisation problems. IET Control Theory and Applications, 2017, 11(6):799-806 doi: 10.1049/iet-cta.2016.1383
[19] Li Y M, Sun K K, Tong S C. Observer-based adaptive fuzzy fault-tolerant optimal control for SISO nonlinear systems. IEEE Transactions on Cybernetics, 2019, 49(2):649-661 doi: 10.1109/TCYB.2017.2785801
[20] Sun J L, Liu C S. Distributed fuzzy adaptive backstepping optimal control for nonlinear multimissile guidance systems with input saturation. IEEE Transactions on Fuzzy Systems, 2019, 27(3):447-461 doi: 10.1109/TFUZZ.2018.2859904
[21] Wei Q L, Liu D R, Lin Q, Song R Z. Adaptive dynamic programming for discrete-time zero-sum games. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(4):957-969 doi: 10.1109/TNNLS.2016.2638863
[22] Fan B, Yang Q M, Tang X Y, Sun Y X. Robust ADP design for continuous-time nonlinear systems with output constraints. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(6):2127-2138 doi: 10.1109/TNNLS.2018.2806347
[23] Chen M, Tao G. Adaptive fault-tolerant control of uncertain nonlinear large-scale systems with unknown dead-zone. IEEE Transactions on Cybernetics, 2016, 46(8):1851-1862 doi: 10.1109/TCYB.2015.2456028
[24] Tong S C, Li Y M. Adaptive fuzzy output feedback control of MIMO nonlinear systems with unknown dead-zone inputs. IEEE Transactions on Fuzzy Systems, 2013, 21(1):134-146 doi: 10.1109/TFUZZ.2012.2204065
[25] Yu J P, Shi P, Dong W J, Lin C. Adaptive fuzzy control of nonlinear systems with unknown dead zones based on command filtering. IEEE Transactions on Fuzzy Systems, 2018, 26(1):46-55 doi: 10.1109/TFUZZ.2016.2634162
[26] Zhang L L, Yang G H. Adaptive fuzzy prescribed performance control of nonlinear systems with hysteretic actuator nonlinearity and faults. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2017, 48(12):2349-2358 doi: 10.1109/TSMC.2017.2707241
[27] Tong S C, Sun K K, Sui S. Observer-based adaptive fuzzy decentralized optimal control design for strict-feedback nonlinear large-scale systems. IEEE Transactions on Fuzzy Systems, 2018, 26(2):569-584 doi: 10.1109/TFUZZ.2017.2686373