Nearly Optimal Control Scheme Using Adaptive Dynamic Programming Based on Generalized Fuzzy Hyperbolic Model

ZHANG Ji-Lie, ZHANG Hua-Guang, LUO Yan-Hong, LIANG Hong-Jing

Citation: ZHANG Ji-Lie, ZHANG Hua-Guang, LUO Yan-Hong, LIANG Hong-Jing. Nearly Optimal Control Scheme Using Adaptive Dynamic Programming Based on Generalized Fuzzy Hyperbolic Model. Acta Automatica Sinica, 2013, 39(2): 142-149. doi: 10.3724/SP.J.1004.2013.00142

doi: 10.3724/SP.J.1004.2013.00142

Corresponding author: ZHANG Hua-Guang

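  • Abstract: An effective optimal control design method is proposed for continuous-time nonlinear systems. The generalized fuzzy hyperbolic model (GFHM) is used for the first time as an approximator to estimate the solution of the Hamilton-Jacobi-Bellman (HJB) equation, that is, the value function, which maps the state to the cost. The approximate solution is then used to obtain the optimal control. The method requires only a single GFHM to estimate the value function. First, the optimal control design procedure for continuous-time nonlinear systems is described. Next, the approximation error is proved to be uniformly ultimately bounded (UUB). Finally, a numerical example verifies the effectiveness of the method, and a second example demonstrates its advantages through comparison with a neural-network-based adaptive dynamic programming method.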
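The abstract's step from an approximate HJB solution to a control law can be illustrated with a minimal sketch. It assumes control-affine dynamics dx/dt = f(x) + g(x)u and a cost integrand Q(x) + u'Ru, for which the minimizing control is u = -(1/2) R^{-1} g(x)' dV/dx. These assumptions are not stated in the abstract. In place of a GFHM approximation, the sketch uses the exact quadratic value function V(x) = p x^2 of a scalar linear system, so the HJB residual can be checked directly. All function names and numerical values are hypothetical and not taken from the paper.

```python
import math


def riccati_scalar(a, b, q, r):
    """Positive root p of 2*a*p - (b**2 / r) * p**2 + q = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


def value_gradient(p, x):
    """Gradient of the quadratic value function V(x) = p * x**2."""
    return 2.0 * p * x


def control_from_value(b, r, dV):
    """u = -(1/2) * R^{-1} * g(x)' * dV/dx with scalar g(x) = b and R = r."""
    return -0.5 * b * dV / r


def hjb_residual(a, b, q, r, p, x):
    """Hamiltonian q*x^2 + r*u^2 + dV*(a*x + b*u) at the control derived from V."""
    dV = value_gradient(p, x)
    u = control_from_value(b, r, dV)
    return q * x * x + r * u * u + dV * (a * x + b * u)


# Illustrative values only (assumed, not from the paper).
a, b, q, r = -1.0, 1.0, 1.0, 1.0
p = riccati_scalar(a, b, q, r)
residuals = [hjb_residual(a, b, q, r, p, x) for x in (-2.0, 0.5, 3.0)]
closed_loop_pole = a - b * b * p / r  # negative means the closed loop is stable
```

With an exact value function the residual vanishes at every state. With an approximator such as the GFHM, the residual would generally be nonzero, and it is the quantity a weight-tuning law would aim to drive down.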
Publication history
  • Received: 2012-04-10
  • Revised: 2012-08-20
  • Published: 2013-02-20