执行器饱和的离散时间多智能体系统有限时域一致性控制

王巍 王珂 黄自鑫 王乐君 穆朝絮

引用本文: 王巍, 王珂, 黄自鑫, 王乐君, 穆朝絮. 执行器饱和的离散时间多智能体系统有限时域一致性控制. 自动化学报, 2025, 51(3): 1−14 doi: 10.16383/j.aas.c240446
Citation: Wang Wei, Wang Ke, Huang Zi-Xin, Wang Le-Jun, Mu Chao-Xu. Finite-horizon consensus control of discrete-time multi-agent systems with actuator saturation. Acta Automatica Sinica, 2025, 51(3): 1−14 doi: 10.16383/j.aas.c240446

执行器饱和的离散时间多智能体系统有限时域一致性控制

doi: 10.16383/j.aas.c240446 cstr: 32138.14.j.aas.c240446
基金项目: 湖北省自然科学基金(2023AFB561)资助
详细信息
    作者简介:

    王巍:中南财经政法大学副教授. 天津大学博士后. 2019年获得中国地质大学(武汉)控制科学与工程博士学位. 主要研究方向为强化学习与自适应动态规划, 多智能体系统, 有限时域最优控制. E-mail: imagef@zuel.edu.cn

    王珂:天津大学助理研究员. 2023年获得天津大学控制科学与工程博士学位. 主要研究方向为强化学习与自适应动态规划, 微分博弈与应用, 事件触发方法. E-mail: walker_wang@tju.edu.cn

    黄自鑫:武汉工程大学副教授. 上海交通大学博士后, 南开大学博士后. 2020年获得中国地质大学(武汉)控制科学与工程博士学位. 主要研究方向为软体机器人, 强化学习. E-mail: huangzx@wit.edu.cn

    王乐君:重庆邮电大学讲师. 天津大学博士后. 2022年获得中国地质大学(武汉)控制科学与工程博士学位. 主要研究方向为机器人智能控制技术. E-mail: wanglj@cqupt.edu.cn

    穆朝絮:天津大学教授. 主要研究方向为强化学习, 自适应学习系统, 无人优化与控制. 本文通信作者. E-mail: cxmu@tju.edu.cn

Finite-horizon Consensus Control of Discrete-time Multi-agent Systems with Actuator Saturation

Funds: Supported by Hubei Provincial Natural Science Foundation of China (2023AFB561)
More Information
    Author Bio:

    WANG Wei Associate professor at Zhongnan University of Economics and Law. Postdoctoral fellow at Tianjin University. He received his Ph.D. degree in control science and engineering from China University of Geosciences (Wuhan) in 2019. His main research interest covers reinforcement learning and adaptive dynamic programming, multi-agent systems, and finite-horizon optimal control

    WANG Ke Assistant researcher at Tianjin University. He received his Ph.D. degree in control science and engineering from Tianjin University in 2023. His main research interest covers reinforcement learning and adaptive dynamic programming, differential games and applications, and event-triggered methods

    HUANG Zi-Xin Associate professor at Wuhan Institute of Technology. Postdoctoral fellow at Shanghai Jiao Tong University and Nankai University. He received his Ph.D. degree in control science and engineering from China University of Geosciences (Wuhan) in 2020. His main research interest covers soft robotics and reinforcement learning

    WANG Le-Jun Lecturer at Chongqing University of Posts and Telecommunications. Postdoctoral fellow at Tianjin University. He received his Ph.D. degree in control science and engineering from China University of Geosciences (Wuhan) in 2022. His main research interest covers intelligent control technologies for robots

    MU Chao-Xu Professor at Tianjin University. Her main research interest covers reinforcement learning, adaptive learning systems, and optimization and control of unmanned systems. Corresponding author of this paper

  • Abstract: For the finite-horizon consensus control problem of discrete-time linear multi-agent systems with actuator saturation, a model-free control method using backward-in-time iteration is proposed by combining the low-gain feedback method with Q-learning. First, the finite-horizon consensus control problem under actuator saturation is transformed into a finite-horizon optimal control problem for a single agent with actuator saturation, and it is proved that finite-horizon optimal control can be achieved by solving a modified time-varying Riccati equation (MTVRE). Then, a parameterized time-varying Q-function is introduced and a model-free, Q-learning-based backward-in-time iterative algorithm is proposed, which updates the low-gain parameter while approximately solving the MTVRE. Furthermore, it is proved that the low-gain feedback control matrix obtained by the proposed iterative algorithm converges to the optimal solution of the MTVRE and that global finite-horizon consensus is achieved. Finally, simulation results verify the effectiveness of the proposed method.
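To make the backward-in-time structure concrete, the sketch below is a minimal, model-based illustration of finite-horizon low-gain control for a single saturated discrete-time agent: a time-varying Riccati recursion run backward from the terminal step with a small state weight ε, followed by a forward rollout with the input clipped at the saturation level. It is only an illustration of the low-gain and backward-iteration ideas, not the paper's model-free Q-learning algorithm or its modified time-varying Riccati equation; the matrices A and B, the weight ε, the horizon N, and the saturation level u_max are assumed values.

```python
# A minimal, model-based sketch of finite-horizon low-gain control for one
# saturated discrete-time linear agent. The paper's method is model-free
# (Q-learning, backward-in-time iteration) and solves a modified time-varying
# Riccati equation (MTVRE); the recursion below is only the classical
# time-varying Riccati recursion with an epsilon-scaled state weight, used to
# illustrate the backward-in-time structure and the low-gain idea.
# A, B, epsilon, the horizon N, and u_max are illustrative assumptions.
import numpy as np

def backward_riccati_gains(A, B, N, eps, R=None):
    """Run the Riccati recursion backward in time and return K_0, ..., K_{N-1}."""
    n, m = B.shape
    R = np.eye(m) if R is None else R
    Q = eps * np.eye(n)              # low gain: small state weight gives small feedback gains
    P = Q.copy()                     # terminal condition P_N
    gains = [None] * N
    for k in range(N - 1, -1, -1):   # backward-in-time iteration
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ A - A.T @ P @ B @ K
        gains[k] = K
    return gains

def rollout(A, B, gains, x0, u_max):
    """Forward simulation of u_k = sat(-K_k x_k) with hard input saturation."""
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for K in gains:
        u = np.clip(-K @ x, -u_max, u_max)   # actuator saturation
        x = A @ x + B @ u
        traj.append(x.copy())
    return np.array(traj)

if __name__ == "__main__":
    A = np.array([[1.0, 0.1], [0.0, 1.0]])   # marginally stable agent dynamics (assumed)
    B = np.array([[0.0], [0.1]])
    gains = backward_riccati_gains(A, B, N=120, eps=1e-3)
    traj = rollout(A, B, gains, x0=[2.0, -1.0], u_max=0.5)
    print("state at the end of the horizon:", traj[-1])
```

With a sufficiently small ε the time-varying gains stay small enough that the clipping in the rollout is never active for bounded initial states, which is the essence of the low-gain feedback design the paper builds on.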
  • 图  1  仿真1中MAS的通信拓扑

    Fig.  1  MAS communication topology in simulation 1

    图  2  例1中智能体的状态

    Fig.  2  The states of agents in example 1

    图  3  例1中智能体的控制输入

    Fig.  3  The control inputs of agents in example 1

    图  4  例2中智能体的状态

    Fig.  4  The states of agents in example 2

    图  5  例2中智能体的控制输入

    Fig.  5  The control inputs of agents in example 2

    图  6  例3中智能体的状态

    Fig.  6  The states of agents in example 3

    图  7  例3中智能体的控制输入

    Fig.  7  The control inputs of agents in example 3

    图  8  仿真2中MAS的通信拓扑

    Fig.  8  MAS communication topology in simulation 2

    图  9  例1中有限时域方法获得的一致性误差

    Fig.  9  Consensus error obtained by finite-horizon method in example 1

    图  10  例1中无限时域方法获得的一致性误差

    Fig.  10  Consensus error obtained by infinite-horizon method in example 1

    图  11  例2中有限时域方法获得的一致性误差

    Fig.  11  Consensus error obtained by finite-horizon method in example 2

    图  12  例2中无限时域方法获得的一致性误差

    Fig.  12  Consensus error obtained by infinite-horizon method in example 2

    表  1  对比实验评价指标

    Table  1  Performance indices of the comparison experiments

    $100\le k \le 120$                    IAE        MSE
    Example 1, finite-horizon method      0.6377     0.0054
    Example 1, infinite-horizon method    10.2649    2.1169
    Example 2, finite-horizon method      1.0748     0.0147
    Example 2, infinite-horizon method    5.1869     0.5109
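For readers who want to reproduce the kind of indices reported in Table 1, the following is a small sketch of how IAE (sum of absolute error) and MSE (mean squared error) could be evaluated on a consensus-error sequence over the window 100 ≤ k ≤ 120. The function name and the reduction of vector-valued errors to a per-step magnitude are assumptions; the error data itself is not reproduced here.

```python
# Hypothetical helper for the kind of indices in Table 1: integral (sum) of
# absolute error (IAE) and mean squared error (MSE) of a consensus-error
# sequence over the window 100 <= k <= 120. Vector-valued errors are reduced
# to a per-step magnitude via the Euclidean norm (an assumption).
import numpy as np

def window_metrics(error, k_start=100, k_end=120):
    """error: consensus error indexed by time step k; scalar or vector per step."""
    e = np.asarray(error, dtype=float)[k_start:k_end + 1]
    e_mag = np.linalg.norm(e.reshape(len(e), -1), axis=1)   # |e_k| per step
    iae = float(np.sum(e_mag))        # integral of absolute error
    mse = float(np.mean(e_mag ** 2))  # mean squared error
    return iae, mse
```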

    表  2  例1中一致性误差调节时间

    Table  2  Consensus error settling time in example 1

    Example 1, settling time    Finite-horizon method    Infinite-horizon method
    Agent 1                     109                      137
    Agent 2                     119                      161
    Agent 3                     104                      127
    Agent 4                     109                      137
    Agent 5                     90                       110

    表  3  例2中一致性误差调节时间

    Table  3  Consensus error settling time in example 2

    Example 2, settling time    Finite-horizon method    Infinite-horizon method
    Agent 1                     108                      131
    Agent 2                     116                      158
    Agent 3                     120                      183
    Agent 4                     108                      131
    Agent 5                     84                       93
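Tables 2 and 3 report settling times in discrete time steps. One common convention, used here purely as an illustration since the paper's exact criterion is not quoted on this page, is the first step after which the consensus-error magnitude stays inside a small band around zero; the threshold below is an assumed value.

```python
# Illustrative settling-time computation for Tables 2 and 3: the first step k
# after which the consensus-error magnitude stays below a threshold for the
# remainder of the run. The threshold (0.02 band) is an assumed value.
import numpy as np

def settling_time(error, threshold=0.02):
    e = np.abs(np.asarray(error, dtype=float))
    for k in range(len(e)):
        if np.all(e[k:] < threshold):
            return k
    return None   # the error never settles within the recorded horizon
```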
  • [1] Huang Y, Fang W, Chen Z, Li Y, Yang C. Flocking of multiagent systems with nonuniform and nonconvex input constraints. IEEE Transactions on Automatic Control, 2023, 68(7): 4329−4335
    [2] Okine A A, Adam N, Naeem F, Kaddoum G. Multi-agent deep reinforcement learning for packet routing in tactical mobile sensor networks. IEEE Transactions on Network and Service Management, 2024, 21(2): 2155−2169 doi: 10.1109/TNSM.2024.3352014
    [3] Mu C, Liu Z, Yan J, Jia H, Zhang X. Graph multi-agent reinforcement learning for inverter-based active voltage control. IEEE Transactions on Smart Grid, 2024, 15(2): 1399−1409 doi: 10.1109/TSG.2023.3298807
    [4] Zhao Y, Niu B, Zong G, Alharbi K H. Neural network-based adaptive optimal containment control for non-affine nonlinear multi-agent systems within an identifier-actor-critic framework. Journal of the Franklin Institute, 2023, 360(12): 8118−8143 doi: 10.1016/j.jfranklin.2023.06.014
    [5] Fan S, Wang T, Qin C, Qiu J, Li M. Optimized backstepping attitude containment control for multiple spacecrafts. IEEE Transactions on Fuzzy Systems, 2024, 32(9): 5248−5258 doi: 10.1109/TFUZZ.2024.3418577
    [6] An L, Yang G H, Deng C, Wen C. Event-triggered reference governors for collisions-free leader-following coordination under unreliable communication topologies. IEEE Transactions on Automatic Control, 2024, 69(4): 2116−2130 doi: 10.1109/TAC.2023.3291654
    [7] Wang W, Chen X. Model-free optimal containment control of multi-agent systems based on actor-critic framework. Neurocomputing, 2018, 314: 242−250 doi: 10.1016/j.neucom.2018.06.011
    [8] Wang W, Chen X, Fu H, Wu M. Model-free distributed consensus control based on actor-critic framework for discrete-time nonlinear multiagent systems. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020, 50(11): 4123−4134 doi: 10.1109/TSMC.2018.2883801
    [9] Su H, Miao S. Consensus on directed matrix-weighted networks. IEEE Transactions on Automatic Control, 2023, 68(4): 2529−2535 doi: 10.1109/TAC.2022.3184630
    [10] 杨洪勇, 郭雷, 张玉玲, 姚秀明. 离散时间分数阶多自主体系统的时延一致性. 自动化学报, 2014, 40(9): 2022−2028

    Yang Hong-Yong, Guo Lei, Zhang Yu-Ling, Yao Xiu-Ming. Delay consensus of fractional-order multi-agent systems with sampling delays. Acta Automatica Sinica, 2014, 40(9): 2022−2028
    [11] 马煜文, 李贤伟, 李少远. 无控制器间通信的线性多智能体一致性的降阶协议. 自动化学报, 2023, 49(9): 1836−1844

    Ma Yu-Wen, Li Xian-Wei, Li Shao-Yuan. A reduced-order protocol for linear multi-agent consensus without inter-controller communication. Acta Automatica Sinica, 2023, 49(9): 1836−1844
    [12] He W, Chen X, Zhang M, Sun Y, Sekiguchi A, She J. Data-driven optimal consensus control for switching multiagent systems via joint communication graph. IEEE Transactions on Industrial Informatics, 2024, 20(4): 5959−5968 doi: 10.1109/TII.2023.3342881
    [13] Zhang H, Yue D, Dou C, Zhao W, Xie X. Data-driven distributed optimal consensus control for unknown multiagent systems with input-delay. IEEE Transactions on Cybernetics, 2019, 49(6): 2095−2105 doi: 10.1109/TCYB.2018.2819695
    [14] Ji J W, Zhang Z C, Wang Y J, Zuo Z Q. Event-triggered consensus of discrete-time double-integrator multi-agent systems with asymmetric input saturation. Nonlinear Dynamics, 2024, 112: 13321−13334 doi: 10.1007/s11071-024-09761-y
    [15] Liu C, Liu L, Cao J, Abdel-Aty M. Intermittent event-triggered optimal leader-following consensus for nonlinear multi-agent systems via actor-critic algorithm. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(8): 3992−4006 doi: 10.1109/TNNLS.2021.3122458
    [16] Wang J, Zhang Z, Tian B, Zong Q. Event-based robust optimal consensus control for nonlinear multiagent system with local adaptive dynamic programming. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(1): 1073−1086 doi: 10.1109/TNNLS.2022.3180054
    [17] Zattoni E. Structural invariant subspaces of singular Hamiltonian systems and nonrecursive solutions of finite-horizon optimal control problems. IEEE Transactions on Automatic Control, 2008, 53(5): 1279−1284 doi: 10.1109/TAC.2008.921040
    [18] Ferrari-Trecate G, Galbusera L, Marciandi M P E, Scattolini R. A model predictive control scheme for consensus in multi-agent systems with single-integrator dynamics and input constraints. In: Proceedings of the 46th IEEE Conference on Decision and Control, New Orleans, LA, USA: IEEE, 2007. 1492−1497
    [19] Aditya P, Werner H. A distributed linear-quadratic discrete-time game approach to multi-agent consensus. In: Proceedings of the 61st IEEE Conference on Decision and Control, Cancun, Mexico: IEEE, 2022. 6169−6174
    [20] Han F, Wei G, Ding D, Song Y. Finite-horizon H∞ consensus control for multi-agent systems with random parameters: The local condition case. Journal of the Franklin Institute, 2017, 354(14): 607−6097
    [21] Li J, Wei G, Ding D. Finite-horizon H∞ consensus control for multi-agent systems under energy constraint. Journal of the Franklin Institute, 2019, 356(6): 3762−3780 doi: 10.1016/j.jfranklin.2019.01.016
    [22] Chen W, Ding D, Dong H, Wei G, Ge X. Finite-horizon H∞ bipartite consensus control of cooperation-competition multiagent systems with round-robin protocols. IEEE Transactions on Cybernetics, 2021, 51(7): 3699−3709 doi: 10.1109/TCYB.2020.2977468
    [23] Li X M, Yao D, Li P, Meng W, Li H, Lu R. Secure finite-horizon consensus control of multiagent systems against cyber attacks. IEEE Transactions on Cybernetics, 2022, 52(9): 9230−9239 doi: 10.1109/TCYB.2021.3052467
    [24] Powell W B. Approximate Dynamic Programming: Solving the Curses of Dimensionality. John Wiley & Sons, 2007.
    [25] Sutton R S, Barto A G. Reinforcement Learning: An Introduction. MIT Press, 2018.
    [26] 庞文砚, 范家璐, 姜艺, Lewis Frank L. 基于强化学习的部分线性离散时间系统的最优输出调节. 自动化学报, 2022, 48(9): 2242−2253

    Pang Wen-Yan, Fan Jia-Lu, Jiang Yi, Lewis Frank Leroy. Optimal output regulation of partially linear discrete-time systems using reinforcement learning. Acta Automatica Sinica, 2022, 48(9): 2242−2253
    [27] Watkins C J, Dayan P. Q-learning. Machine Learning, 1992, 8: 279−292
    [28] Mu C, Zhao Q, Gao Z, Sun C. Q-learning solution for optimal consensus control of discrete-time multiagent systems using reinforcement learning. Journal of the Franklin Institute, 2019, 356(13): 6946−6967 doi: 10.1016/j.jfranklin.2019.06.007
    [29] Liu J, Dong Y, Gu Z, Xie X, Tian E. Security consensus control for multi-agent systems under DoS attacks via reinforcement learning method. Journal of the Franklin Institute, 2024, 361(1): 164−176 doi: 10.1016/j.jfranklin.2023.11.032
    [30] Feng T, Zhang J, Tong Y, Zhang H. Q-learning algorithm in solving consensusability problem of discrete-time multi-agent systems. Automatica, 2021, 128: 109576 doi: 10.1016/j.automatica.2021.109576
    [31] Long M, Su H, Zeng Z. Output-feedback global consensus of discrete-time multiagent systems subject to input saturation via Q-learning method. IEEE Transactions on Cybernetics, 2022, 52(3): 1661−1670 doi: 10.1109/TCYB.2020.2987385
    [32] Zhang H, Park J H, Yue D, Xie X. Finite-horizon optimal consensus control for unknown multiagent state-delay systems. IEEE Transactions on Cybernetics, 2020, 50(2): 402−413 doi: 10.1109/TCYB.2018.2856510
    [33] Liu C, Liu L. Finite-horizon robust event-triggered control for nonlinear multi-agent systems with state delay. Neural Processing Letters, 2023, 55(4): 5167−5191 doi: 10.1007/s11063-022-11085-0
    [34] Guzey H, Xu H, Sarangapani J. Neural network-based finite horizon optimal adaptive consensus control of mobile robot formations. Optimal Control Applications and Methods, 2016, 37(5): 1014−1034 doi: 10.1002/oca.2222
    [35] Yu D, Ge S S, Li D, Wang P. Finite-horizon robust formation-containment control of multi-agent networks with unknown dynamics. Neurocomputing, 2021, 458: 403−415 doi: 10.1016/j.neucom.2021.01.063
    [36] Shi J, Yue D, Xie X. Optimal leader-follower consensus for constrained-input multiagent systems with completely unknown dynamics. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2022, 52(2): 1182−1191 doi: 10.1109/TSMC.2020.3011184
    [37] Qin J, Li M, Shi Y, Ma Q, Zheng W X. Optimal synchronization control of multiagent systems with input saturation via off-policy reinforcement learning. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(1): 85−96 doi: 10.1109/TNNLS.2018.2832025
    [38] Long M, Su H, Wang X, Jiang G P, Wang X. An iterative Q-learning based global consensus of discrete-time saturated multi-agent systems. Chaos, 2019, 29(10): 103127 doi: 10.1063/1.5120106
    [39] Long M, Su H, Zeng Z. Model-free algorithms for containment control of saturated discrete-time multiagent systems via Q-learning method. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2022, 52(2): 1308−1316 doi: 10.1109/TSMC.2020.3019504
    [40] Wang B, Xu L, Yi X, Jia Y, Yang T. Semiglobal suboptimal output regulation for heterogeneous multi-agent systems with input saturation via adaptive dynamic programming. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(3): 3242−3250 doi: 10.1109/TNNLS.2022.3191673
    [41] Wang L, Xu J, Liu Y, Chen C L P. Dynamic event-driven finite-horizon optimal consensus control for constrained multiagent systems. IEEE Transactions on Neural Networks and Learning Systems, doi: 10.1109/TNNLS.2023.3292154
    [42] Lin Z. Low Gain Feedback. Springer, 1999.
    [43] Calafiore G C, Possieri C. Output feedback Q-learning for linear-quadratic discrete-time finite-horizon control problems. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(7): 3274−3281 doi: 10.1109/TNNLS.2020.3010304
    [44] Wang W, Xie X, Feng C. Model-free finite-horizon optimal tracking control of discrete-time linear systems. Applied Mathematics and Computation, 2022, 433: 127400 doi: 10.1016/j.amc.2022.127400
    [45] Wu H, Su H. Discrete-time positive edge-consensus for undirected and directed nodal networks. IEEE Transactions on Circuits and Systems II: Express Briefs, 2018, 65(2): 221−225
    [46] Lewis F L, Vrabie D, Syrmos V L. Optimal Control. John Wiley & Sons, 2012.
    [47] Jiang Y, Kiumarsi B, Fan J L, Chai T Y, Li J N, Lewis F L. Optimal output regulation of linear discrete-time system with unknown dynamics using reinforcement learning. IEEE Transactions on Cybernetics, 2020, 50(4): 3147−3156
    [48] 姜艺, 范家璐, 贾瑶, 柴天佑. 数据驱动的浮选过程运行反馈解耦控制方法. 自动化学报, 2019, 45(4): 759−770

    Jiang Yi, Fan Jia-Lu, Jia Yao, Chai Tian-You. Data-driven flotation process operational feedback decoupling control. Acta Automatica Sinica, 2019, 45(4): 759−770
Publication history
  • Received: 2024-06-30
  • Accepted: 2024-10-28
  • Available online: 2024-11-28
