Recent Advances on Multi-agent Collaboration: A Cross-perspective of Game and Control Theory

Qin Jia-Hu, Ma Qi-Chao, Li Man, Zhang Cong, Fu Wei-Ming, Liu Qing-Chen, Zheng Wei-Xing

Citation: Qin Jia-Hu, Ma Qi-Chao, Li Man, Zhang Cong, Fu Wei-Ming, Liu Qing-Chen, Zheng Wei-Xing. Recent advances on multi-agent collaboration: A cross-perspective of game and control theory. Acta Automatica Sinica, 2025, 51(3): 489−509 doi: 10.16383/j.aas.c240508

doi: 10.16383/j.aas.c240508 cstr: 32138.14.j.aas.c240508


Funds: Supported by National Natural Science Foundation of China (U23A20323, 62373341, 62203418, 62303435, 62403444)
More Information
    Author Bio:

    QIN Jia-Hu Professor in the Department of Automation, University of Science and Technology of China. His research interest covers networked control systems, autonomous intelligent systems, and human-robot interaction. Corresponding author of this paper

    MA Qi-Chao Research associate professor in the Department of Automation, University of Science and Technology of China. His research interest covers collaborative decision and control of multi-agent systems, with applications to robotics

    LI Man Research associate professor in the Department of Automation, University of Science and Technology of China. Her research interest covers multi-agent games, reinforcement learning, and human-robot interaction

    ZHANG Cong Postdoctoral fellow in the Department of Automation, University of Science and Technology of China. Her research interest covers multi-agent cooperation, distributed state estimation, and simultaneous localization and mapping (SLAM) for mobile robots

    FU Wei-Ming Research associate professor in the Department of Automation, University of Science and Technology of China. His research interest covers collaboration in multi-agent systems and energy management in smart grids

    LIU Qing-Chen Professor in the Department of Automation, University of Science and Technology of China. His research interest covers networked systems, multi-robot systems, and learning-based control

    ZHENG Wei-Xing Distinguished professor at Western Sydney University, Australia. His research interest covers system identification, networked control systems, multi-agent systems, neural networks, and signal processing

  • Abstract: Multi-agent collaboration finds wide application and has been identified as one of the fundamental theoretical topics of next-generation artificial intelligence (AI) in urgent need of breakthroughs, so research on it carries distinct scientific value and engineering significance. With the advance of AI technology, multi-agent collaboration under the traditional, purely control-theoretic perspective can no longer meet the demands of executing large-scale complex tasks, and multi-agent collaboration that integrates game theory with control theory has emerged in response. Under this framework, multi-agent collaboration enjoys greater flexibility, adaptability, and scalability, opening up more possibilities for the development of multi-agent systems. In view of this, this survey first reviews, from the collaboration perspective, progress in multi-agent cooperative control and estimation. Then, centered on the integration of games and control, it introduces the basic concepts of the game-theoretic framework, with emphasis on the modeling and analysis of multi-agent collaboration problems under differential games, and briefly summarizes how reinforcement learning algorithms are applied to seek game equilibria. Two typical multi-agent collaboration scenarios, multi-robot navigation and electric vehicle charging scheduling, are selected to show how the idea of integrating games and control can be used to tackle difficult problems in these fields. Finally, multi-agent collaboration under the integrated framework of games and control is summarized and future directions are discussed.
    1)  1 The interaction here refers to information flow; for example, an agent acquires other agents' information through communication or sensing devices. 2 Unless otherwise stated, the continuous-time dynamics introduced here are the object of discussion in the sequel.
    2)  3 A matrix $ A $ is said to be marginally stable if every eigenvalue of $ A $ has real part less than or equal to zero, and every eigenvalue with zero real part has algebraic multiplicity equal to its geometric multiplicity. 4 A matrix $ A $ is said to be neutrally stable if every eigenvalue of $ A $ has zero real part and algebraic multiplicity equal to geometric multiplicity.
    3)  5 The definition of asymptotic null controllability with bounded inputs is given in reference [42].
    4)  6 Unless otherwise specified, the time in this part satisfies $ t\in[t_k^i,\;t_{k+1}^i) $.
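Footnotes 3 and 4 above characterize marginal and neutral stability of a matrix through its eigenvalues and their multiplicities. As a minimal numerical sketch of the marginal-stability check (assuming NumPy; the function name and tolerances are illustrative, not from the paper):

```python
import numpy as np

def is_marginally_stable(A, tol=1e-9):
    """Footnote-style check: all eigenvalues of A have real part <= 0, and
    each eigenvalue on the imaginary axis has equal algebraic and geometric
    multiplicity (i.e., it is non-defective)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    eigvals = np.linalg.eigvals(A)
    # Any eigenvalue strictly in the right half-plane rules out stability.
    if np.any(eigvals.real > tol):
        return False
    # Eigenvalues on the imaginary axis must be non-defective.
    for lam in {complex(np.round(v, 6)) for v in eigvals if abs(v.real) <= tol}:
        alg_mult = int(np.sum(np.abs(eigvals - lam) <= 1e-6))
        geo_mult = n - np.linalg.matrix_rank(A - lam * np.eye(n))
        if alg_mult != geo_mult:
            return False
    return True
```

For instance, the harmonic oscillator matrix [[0, 1], [-1, 0]] passes the check (it is in fact neutrally stable), while the Jordan block [[0, 1], [0, 0]] fails because its zero eigenvalue is defective.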
  • Fig. 1  General structure of the paper

    Fig. 2  General structure of Section 2

  • [1] Yang Tao, Yang Bo, Yin Yun-Qiang, Yu Wen-Wu, Xia Yuan-Qing, Hong Yi-Guang. Guest editorial of special issue on cooperative control and optimization for multi-agent systems. Control and Decision, 2023, 38(5): 1153−1158
    [2] Moreau L. Stability of multiagent systems with time-dependent communication links. IEEE Transactions on Automatic Control, 2005, 50(2): 169−182 doi: 10.1109/TAC.2004.841888
    [3] Cao M, Morse A S, Anderson B D O. Reaching a consensus in a dynamically changing environment: Convergence rates, measurement delays, and asynchronous events. SIAM Journal on Control and Optimization, 2008, 47(2): 601−623 doi: 10.1137/060657029
    [4] Shi G D, Johansson K H. The role of persistent graphs in the agreement seeking of social networks. IEEE Journal on Selected Areas in Communications, 2013, 31(9): 595−606 doi: 10.1109/JSAC.2013.SUP.0513052
    [5] Qin J H, Gao H J. A sufficient condition for convergence of sampled-data consensus for double-integrator dynamics with nonuniform and time-varying communication delays. IEEE Transactions on Automatic Control, 2012, 57(9): 2417−2422 doi: 10.1109/TAC.2012.2188425
    [6] Qin J H, Zheng W X, Gao H J. Consensus of multiple second-order vehicles with a time-varying reference signal under directed topology. Automatica, 2011, 47(9): 1983−1991 doi: 10.1016/j.automatica.2011.05.014
    [7] Qin J H, Gao H J, Zheng W X. Exponential synchronization of complex networks of linear systems and nonlinear oscillators: A unified analysis. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(3): 510−521 doi: 10.1109/TNNLS.2014.2316245
    [8] Lin Z L. Low Gain Feedback. London: Springer, 1999.
    [9] Qin J H, Fu W M, Zheng W X, Gao H J. On the bipartite consensus for generic linear multiagent systems with input saturation. IEEE Transactions on Cybernetics, 2017, 47(8): 1948−1958 doi: 10.1109/TCYB.2016.2612482
    [10] Meskin N, Khorasani K. Actuator fault detection and isolation for a network of unmanned vehicles. IEEE Transactions on Automatic Control, 2009, 54(4): 835−840 doi: 10.1109/TAC.2008.2009675
    [11] Dimarogonas D V, Frazzoli E, Johansson K H. Distributed event-triggered control for multi-agent systems. IEEE Transactions on Automatic Control, 2012, 57(5): 1291−1297 doi: 10.1109/TAC.2011.2174666
    [12] Qin J H, Ma Q C, Shi Y, Wang L. Recent advances in consensus of multi-agent systems: A brief survey. IEEE Transactions on Industrial Electronics, 2017, 64(6): 4972−4983 doi: 10.1109/TIE.2016.2636810
    [13] Qin J H, Yu C B, Gao H J. Coordination for linear multiagent systems with dynamic interaction topology in the leader-following framework. IEEE Transactions on Industrial Electronics, 2014, 61(5): 2412−2422 doi: 10.1109/TIE.2013.2273480
    [14] Zhang Y Y, Li S. Distributed biased min-consensus with applications to shortest path planning. IEEE Transactions on Automatic Control, 2017, 62(10): 5429−5436 doi: 10.1109/TAC.2017.2694547
    [15] Qin J H, Gao H J, Zheng W X. Second-order consensus for multi-agent systems with switching topology and communication delay. Systems and Control Letters, 2011, 60(6): 390−397
    [16] Zhang J F. Preface to special topic on games in control systems. National Science Review, 2020, 7(7): 1115−1115 doi: 10.1093/nsr/nwaa118
    [17] Shamma J S. Game theory, learning, and control systems. National Science Review, 2020, 7(7): 1118−1119 doi: 10.1093/nsr/nwz163
    [18] Wang Long, Huang Feng. An interdisciplinary survey of multi-agent games, learning, and control. Acta Automatica Sinica, 2023, 49(3): 580−613
    [19] Marden J R, Shamma J S. Game theory and control. Annual Review of Control, Robotics, and Autonomous Systems, 2018, 1: 105−134 doi: 10.1146/annurev-control-060117-105102
    [20] Riehl J, Ramazi P, Cao M. A survey on the analysis and control of evolutionary matrix games. Annual Reviews in Control, 2018, 45: 87−106
    [21] Zhang R R, Guo L. Controllability of Nash equilibrium in game-based control systems. IEEE Transactions on Automatic Control, 2019, 64(10): 4180−4187 doi: 10.1109/TAC.2019.2893150
    [22] Huo W, Tsang K F E, Yan Y, Johansson K H, Shi L. Distributed Nash equilibrium seeking with stochastic event-triggered mechanism. Automatica, 2024, 162: Article No. 111486
    [23] Oh K K, Park M C, Ahn H S. A survey of multi-agent formation control. Automatica, 2015, 53: 424−440 doi: 10.1016/j.automatica.2014.10.022
    [24] Anderson B D O, Shi G D, Trumpf J. Convergence and state reconstruction of time-varying multi-agent systems from complete observability theory. IEEE Transactions on Automatic Control, 2017, 62(5): 2519−2523 doi: 10.1109/TAC.2016.2599274
    [25] Xiao F, Wang L. Asynchronous consensus in continuous-time multi-agent systems with switching topology and time-varying delays. IEEE Transactions on Automatic Control, 2008, 53(8): 1804−1816 doi: 10.1109/TAC.2008.929381
    [26] Kim H, Shim H, Back J, Seo J H. Consensus of output-coupled linear multi-agent systems under fast switching network: Averaging approach. Automatica, 2013, 49(1): 267−272 doi: 10.1016/j.automatica.2012.09.025
    [27] Back J, Kim J S. Output feedback practical coordinated tracking of uncertain heterogeneous multi-agent systems under switching network topology. IEEE Transactions on Automatic Control, 2017, 62(12): 6399−6406 doi: 10.1109/TAC.2017.2651166
    [28] Valcher M E, Zorzan I. On the consensus of homogeneous multi-agent systems with arbitrarily switching topology. Automatica, 2017, 84: 79−85 doi: 10.1016/j.automatica.2017.07.011
    [29] Qin J H, Gao H J, Yu C B. On discrete-time convergence for general linear multi-agent systems under dynamic topology. IEEE Transactions on Automatic Control, 2014, 59(4): 1054−1059 doi: 10.1109/TAC.2013.2285777
    [30] Yang T, Meng Z Y, Shi G D, Hong Y G, Johansson K H. Network synchronization with nonlinear dynamics and switching interactions. IEEE Transactions on Automatic Control, 2016, 61(10): 3103−3108 doi: 10.1109/TAC.2015.2497907
    [31] Lu M B, Liu L. Distributed feedforward approach to cooperative output regulation subject to communication delays and switching networks. IEEE Transactions on Automatic Control, 2017, 62(4): 1999−2005 doi: 10.1109/TAC.2016.2594151
    [32] Meng H F, Chen Z Y, Middleton R. Consensus of multiagents in switching networks using input-to-state stability of switched systems. IEEE Transactions on Automatic Control, 2018, 63(11): 3964−3971 doi: 10.1109/TAC.2018.2809454
    [33] Liu T, Huang J. Leader-following attitude consensus of multiple rigid body systems subject to jointly connected switching networks. Automatica, 2018, 92: 63−71 doi: 10.1016/j.automatica.2018.02.012
    [34] Meng Z Y, Yang T, Li G Q, Ren W, Wu D. Synchronization of coupled dynamical systems: Tolerance to weak connectivity and arbitrarily bounded time-varying delays. IEEE Transactions on Automatic Control, 2018, 63(6): 1791−1797 doi: 10.1109/TAC.2017.2754219
    [35] Abdessameud A. Consensus of nonidentical Euler-Lagrange systems under switching directed graphs. IEEE Transactions on Automatic Control, 2019, 64(5): 2108−2114 doi: 10.1109/TAC.2018.2867347
    [36] Su Y F, Huang J. Stability of a class of linear switching systems with applications to two consensus problems. IEEE Transactions on Automatic Control, 2012, 57(6): 1420−1430 doi: 10.1109/TAC.2011.2176391
    [37] Wang X P, Zhu J D, Feng J E. A new characteristic of switching topology and synchronization of linear multiagent systems. IEEE Transactions on Automatic Control, 2019, 64(7): 2697−2711 doi: 10.1109/TAC.2018.2869478
    [38] Ma Q C, Qin J H, Zheng W X, Shi Y, Kang Y. Exponential consensus of linear systems over switching network: A subspace method to establish necessity and sufficiency. IEEE Transactions on Cybernetics, 2022, 52(3): 1565−1574 doi: 10.1109/TCYB.2020.2991540
    [39] Ma Q C, Qin J H, Yu X H, Wang L. On necessary and sufficient conditions for exponential consensus in dynamic networks via uniform complete observability theory. IEEE Transactions on Automatic Control, 2021, 66(10): 4975−4981 doi: 10.1109/TAC.2020.3046606
    [40] Ma Q C, Qin J H, Anderson B D O, Wang L. Exponential consensus of multiple agents over dynamic network topology: Controllability, connectivity, and compactness. IEEE Transactions on Automatic Control, 2023, 68(12): 7104−7119 doi: 10.1109/TAC.2023.3245021
    [41] Bernstein D S, Michel A N. A chronological bibliography on saturating actuators. International Journal of Robust and Nonlinear Control, 1995, 5(5): 375−380 doi: 10.1002/rnc.4590050502
    [42] Zhou B, Duan G R, Lin Z L. A parametric Lyapunov equation approach to the design of low gain feedback. IEEE Transactions on Automatic Control, 2008, 53(6): 1548−1554 doi: 10.1109/TAC.2008.921036
    [43] Su H S, Chen M Z Q, Lam J, Lin Z L. Semi-global leader-following consensus of linear multi-agent systems with input saturation via low gain feedback. IEEE Transactions on Circuits and Systems I: Regular Papers, 2013, 60(7): 1881−1889 doi: 10.1109/TCSI.2012.2226490
    [44] Li Y, Xiang J, Wei W. Consensus problems for linear time-invariant multi-agent systems with saturation constraints. IET Control Theory and Applications, 2011, 5(6): 823−829
    [45] Meng Z Y, Zhao Z Y, Lin Z L. On global leader-following consensus of identical linear dynamic systems subject to actuator saturation. Systems and Control Letters, 2013, 62(2): 132−142
    [46] Ren W, Beard R W. Consensus algorithms for double-integrator dynamics. Distributed Consensus in Multi-vehicle Cooperative Control: Theory and Applications. London: Springer, 2008. 77−104
    [47] Zhao Z Y, Lin Z L. Global leader-following consensus of a group of general linear systems using bounded controls. Automatica, 2016, 68: 294−304 doi: 10.1016/j.automatica.2016.01.027
    [48] Zhang Y M, Jiang J. Bibliographical review on reconfigurable fault-tolerant control systems. Annual Reviews in Control, 2008, 32(2): 229−252 doi: 10.1016/j.arcontrol.2008.03.008
    [49] Davoodi M R, Khorasani K, Talebi H A, Momeni H R. Distributed fault detection and isolation filter design for a network of heterogeneous multiagent systems. IEEE Transactions on Control Systems Technology, 2014, 22(3): 1061−1069 doi: 10.1109/TCST.2013.2264507
    [50] Kashyap N, Yang C W, Sierla S, Flikkema P G. Automated fault location and isolation in distribution grids with distributed control and unreliable communication. IEEE Transactions on Industrial Electronics, 2015, 62(4): 2612−2619 doi: 10.1109/TIE.2014.2387093
    [51] Teixeira A, Shames I, Sandberg H, Johansson K H. Distributed fault detection and isolation resilient to network model uncertainties. IEEE Transactions on Cybernetics, 2014, 44(11): 2024−2037 doi: 10.1109/TCYB.2014.2350335
    [52] Wang Y J, Song Y D, Lewis F L. Robust adaptive fault-tolerant control of multiagent systems with uncertain nonidentical dynamics and undetectable actuation failures. IEEE Transactions on Industrial Electronics, 2015, 62(6): 3978−3988
    [53] Chen S, Ho D W C, Li L L, Liu M. Fault-tolerant consensus of multi-agent system with distributed adaptive protocol. IEEE Transactions on Cybernetics, 2015, 45(10): 2142−2155 doi: 10.1109/TCYB.2014.2366204
    [54] Tabuada P. Event-triggered real-time scheduling of stabilizing control tasks. IEEE Transactions on Automatic Control, 2007, 52(9): 1680−1685 doi: 10.1109/TAC.2007.904277
    [55] Cao M T, Xiao F, Wang L. Event-based second-order consensus control for multi-agent systems via synchronous periodic event detection. IEEE Transactions on Automatic Control, 2015, 60(9): 2452−2457 doi: 10.1109/TAC.2015.2390553
    [56] Lu W L, Han Y J, Chen T P. Synchronization in networks of linearly coupled dynamical systems via event-triggered diffusions. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(12): 3060−3069 doi: 10.1109/TNNLS.2015.2402691
    [57] Fan Y, Feng G, Wang Y, Song C. Distributed event-triggered control of multi-agent systems with combinational measurements. Automatica, 2013, 49(2): 671−675 doi: 10.1016/j.automatica.2012.11.010
    [58] Garcia E, Cao Y C, Casbeer D W. Decentralized event-triggered consensus with general linear dynamics. Automatica, 2014, 50(10): 2633−2640 doi: 10.1016/j.automatica.2014.08.024
    [59] Seyboth G S, Dimarogonas D V, Johansson K H. Event-based broadcasting for multi-agent average consensus. Automatica, 2013, 49(1): 245−252 doi: 10.1016/j.automatica.2012.08.042
    [60] Zhu W, Jiang Z P. Event-based leader-following consensus of multi-agent systems with input time delay. IEEE Transactions on Automatic Control, 2015, 60(5): 1362−1367 doi: 10.1109/TAC.2014.2357131
    [61] Cheng Y, Ugrinovskii V. Event-triggered leader-following tracking control for multivariable multi-agent systems. Automatica, 2016, 70: 204−210 doi: 10.1016/j.automatica.2016.04.003
    [62] Mu N K, Liao X F, Huang T W. Event-based consensus control for a linear directed multiagent system with time delay. IEEE Transactions on Circuits and Systems II: Express Briefs, 2015, 62(3): 281−285 doi: 10.1109/TCSII.2014.2368991
    [63] Altafini C. Consensus problems on networks with antagonistic interactions. IEEE Transactions on Automatic Control, 2013, 58(4): 935−946 doi: 10.1109/TAC.2012.2224251
    [64] Cartwright D, Harary F. Structural balance: A generalization of Heider's theory. Psychological Review, 1956, 63(5): 277−293 doi: 10.1037/h0046049
    [65] Meng Z Y, Shi G D, Johansson K H, Cao M, Hong Y G. Behaviors of networks with antagonistic interactions and switching topologies. Automatica, 2016, 73: 110−116 doi: 10.1016/j.automatica.2016.06.022
    [66] Qin J H, Yu C B, Anderson B D O. On leaderless and leader-following consensus for interacting clusters of second-order multi-agent systems. Automatica, 2016, 74: 214−221 doi: 10.1016/j.automatica.2016.07.008
    [67] Qin J H, Yu C B. Cluster consensus control of generic linear multi-agent systems under directed topology with acyclic partition. Automatica, 2013, 49(9): 2898−2905 doi: 10.1016/j.automatica.2013.06.017
    [68] Ren L, Li M, Sun C Y. Semiglobal cluster consensus for heterogeneous systems with input saturation. IEEE Transactions on Cybernetics, 2021, 51(9): 4685−4694 doi: 10.1109/TCYB.2019.2942735
    [69] Qin J H, Ma Q C, Gao H J, Shi Y, Kang Y. On group synchronization for interacting clusters of heterogeneous systems. IEEE Transactions on Cybernetics, 2017, 47(12): 4122−4133 doi: 10.1109/TCYB.2016.2600753
    [70] Xia W G, Cao M. Clustering in diffusively coupled networks. Automatica, 2011, 47(11): 2395−2405 doi: 10.1016/j.automatica.2011.08.043
    [71] Battistelli G, Chisci L, Mugnai G, Farina A, Graziano A. Consensus-based linear and nonlinear filtering. IEEE Transactions on Automatic Control, 2015, 60(5): 1410−1415 doi: 10.1109/TAC.2014.2357135
    [72] Battistelli G, Chisci L. Stability of consensus extended Kalman filter for distributed state estimation. Automatica, 2016, 68: 169−178 doi: 10.1016/j.automatica.2016.01.071
    [73] Zhang C, Qin J H, Li H, Wang Y N, Wang S, Zheng W X. Consensus-based distributed two-target tracking over wireless sensor networks. Automatica, 2022, 146: Article No. 110593 doi: 10.1016/j.automatica.2022.110593
    [74] Chen Q, Yin C, Zhou J, Wang Y, Wang X Y, Chen C Y. Hybrid consensus-based cubature Kalman filtering for distributed state estimation in sensor networks. IEEE Sensors Journal, 2018, 18(11): 4561−4569 doi: 10.1109/JSEN.2018.2823908
    [75] Guo M, Jayawardhana B. Simultaneous distributed localization, formation, and group motion control: A distributed filter approach. IEEE Transactions on Control of Network Systems, 2024, 11(4): 1867−1878 doi: 10.1109/TCNS.2024.3367448
    [76] Sun W W, Lv X Y, Qiu M Y. Distributed estimation for stochastic Hamiltonian systems with fading wireless channels. IEEE Transactions on Cybernetics, 2022, 52(6): 4897−4906 doi: 10.1109/TCYB.2020.3023547
    [77] Chen W, Wang Z D, Ding D R, Yi X J, Han Q L. Distributed state estimation over wireless sensor networks with energy harvesting sensors. IEEE Transactions on Cybernetics, 2023, 53(5): 3311−3324 doi: 10.1109/TCYB.2022.3179280
    [78] Kalman R E. A new approach to linear filtering and prediction problems. Journal of Basic Engineering, 1960, 82(1): 35−45 doi: 10.1115/1.3662552
    [79] Ljung L. Asymptotic behavior of the extended Kalman filter as a parameter estimator for linear systems. IEEE Transactions on Automatic Control, 1979, 24(1): 36−50 doi: 10.1109/TAC.1979.1101943
    [80] Wan E A, van der Merwe R. The unscented Kalman filter. Kalman Filtering and Neural Networks. New York: Wiley, 2001. 221−280
    [81] Julier S J, Uhlmann J K. Reduced sigma point filters for the propagation of means and covariances through nonlinear transformations. In: Proceedings of the American Control Conference (IEEE Cat. No. CH37301). Anchorage, USA: IEEE, 2002. 887−892
    [82] Arasaratnam I, Haykin S. Cubature Kalman filters. IEEE Transactions on Automatic Control, 2009, 54(6): 1254−1269 doi: 10.1109/TAC.2009.2019800
    [83] Chen B, Hu G Q, Ho D W C, Yu L. Distributed covariance intersection fusion estimation for cyber-physical systems with communication constraints. IEEE Transactions on Automatic Control, 2016, 61(12): 4020−4026 doi: 10.1109/TAC.2016.2539221
    [84] Yu D D, Xia Y Q, Li L, Zhai D H. Event-triggered distributed state estimation over wireless sensor networks. Automatica, 2020, 118: Article No. 109039 doi: 10.1016/j.automatica.2020.109039
    [85] Peng H, Zeng B R, Yang L X, Xu Y, Lu R Q. Distributed extended state estimation for complex networks with nonlinear uncertainty. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(9): 5952−5960 doi: 10.1109/TNNLS.2021.3131661
    [86] Wang S C, Ren W, Chen J. Fully distributed dynamic state estimation with uncertain process models. IEEE Transactions on Control of Network Systems, 2018, 5(4): 1841−1851 doi: 10.1109/TCNS.2017.2763756
    [87] Yu F, Dutta R G, Zhang T, Hu Y D, Jin Y E. Fast attack-resilient distributed state estimator for cyber-physical systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2020, 39(11): 3555−3565 doi: 10.1109/TCAD.2020.3013072
    [88] Zhang C, Qin J H, Yan C Z, Shi Y, Wang Y N, Li M. Towards invariant extended Kalman filter-based resilient distributed state estimation for moving robots over mobile sensor networks under deception attacks. Automatica, 2024, 159: Article No. 111408 doi: 10.1016/j.automatica.2023.111408
    [89] Xie L, Choi D H, Kar S, Poor H V. Fully distributed state estimation for wide-area monitoring systems. IEEE Transactions on Smart Grid, 2012, 3(3): 1154−1169 doi: 10.1109/TSG.2012.2197764
    [90] Qian J C, Duan P H, Duan Z S, Shi L. Event-triggered distributed state estimation: A conditional expectation method. IEEE Transactions on Automatic Control, 2023, 68(10): 6361−6368 doi: 10.1109/TAC.2023.3234453
    [91] Duan P H, Wang Q S, Duan Z S, Chen G R. A distributed optimization scheme for state estimation of nonlinear networks with norm-bounded uncertainties. IEEE Transactions on Automatic Control, 2022, 67(5): 2582−2589 doi: 10.1109/TAC.2021.3091182
    [92] Zhang C, Qin J H, Ma Q C, Shi Y, Li M L. Resilient distributed state estimation for LTI systems under time-varying deception attacks. IEEE Transactions on Control of Network Systems, 2023, 10(1): 381−393 doi: 10.1109/TCNS.2022.3203360
    [93] Wang H J, Liu K, Han D Y, Xia Y Q. Vulnerability analysis of distributed state estimation under joint deception attacks. Automatica, 2023, 157: Article No. 111274 doi: 10.1016/j.automatica.2023.111274
    [94] Facchinei F, Kanzow C. Generalized Nash equilibrium problems. Annals of Operations Research, 2010, 175(1): 177−211 doi: 10.1007/s10479-009-0653-x
    [95] Ye M J, Hu G Q. Adaptive approaches for fully distributed Nash equilibrium seeking in networked games. Automatica, 2021, 129: Article No. 109661
    [96] Meng Q, Nian X H, Chen Y, Chen Z. Attack-resilient distributed Nash equilibrium seeking of uncertain multiagent systems over unreliable communication networks. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(5): 6365−6379 doi: 10.1109/TNNLS.2022.3209313
    [97] Ye M J, Han Q L, Ding L, Xu S Y, Jia G B. Distributed Nash equilibrium seeking strategies under quantized communication. IEEE/CAA Journal of Automatica Sinica, 2024, 11(1): 103−112 doi: 10.1109/JAS.2022.105857
    [98] Zhong Y F, Yuan Y, Yuan H H. Nash equilibrium seeking for multi-agent systems under DoS attacks and disturbances. IEEE Transactions on Industrial Informatics, 2024, 20(4): 5395−5405 doi: 10.1109/TII.2023.3332951
    [99] Gadjov D, Pavel L. A passivity-based approach to Nash equilibrium seeking over networks. IEEE Transactions on Automatic Control, 2019, 64(3): 1077−1092 doi: 10.1109/TAC.2018.2833140
    [100] Romano A R, Pavel L. Dynamic gradient play for NE seeking with disturbance rejection. In: Proceedings of the IEEE Conference on Decision and Control (CDC). Miami, USA: IEEE, 2018. 346−351
    [101] Lou Y C, Hong Y G, Xie L H, Shi G D, Johansson K H. Nash equilibrium computation in subnetwork zero-sum games with switching communications. IEEE Transactions on Automatic Control, 2016, 61(10): 2920−2935 doi: 10.1109/TAC.2015.2504962
    [102] Lu K H, Jing G S, Wang L. Distributed algorithms for searching generalized Nash equilibrium of noncooperative games. IEEE Transactions on Cybernetics, 2019, 49(6): 2362−2371 doi: 10.1109/TCYB.2018.2828118
    [103] Chen S B, Cheng R S. Operating reserves provision from residential users through load aggregators in smart grid: A game theoretic approach. IEEE Transactions on Smart Grid, 2019, 10(2): 1588−1598 doi: 10.1109/TSG.2017.2773145
    [104] Zhu Y N, Yu W W, Wen G H, Chen G R. Distributed Nash equilibrium seeking in an aggregative game on a directed graph. IEEE Transactions on Automatic Control, 2021, 66(6): 2746−2753 doi: 10.1109/TAC.2020.3008113
    [105] Carnevale G, Fabiani F, Fele F, Margellos K, Notarstefano G. Tracking-based distributed equilibrium seeking for aggregative games. IEEE Transactions on Automatic Control, 2024, 69(9): 6026−6041 doi: 10.1109/TAC.2024.3368967
    [106] Shi Xia-Sheng, Ren Lu, Sun Chang-Yin. Distributed adaptive generalized Nash equilibrium algorithm for aggregative games. Acta Automatica Sinica, 2024, 50(6): 1210−1220
    [107] Zhang Y Y, Sun J, Wu C Y. Vehicle-to-grid coordination via mean field game. IEEE Control Systems Letters, 2022, 6: 2084−2089 doi: 10.1109/LCSYS.2021.3139266
    [108] Alasseur C, ben Taher I, Matoussi A. An extended mean field game for storage in smart grids. Journal of Optimization Theory and Applications, 2020, 184(2): 644−670 doi: 10.1007/s10957-019-01619-3
    [109] Martinez-Piazuelo J, Quijano N, Ocampo-Martinez C. Nash equilibrium seeking in full-potential population games under capacity and migration constraints. Automatica, 2022, 141: Article No. 110285 doi: 10.1016/j.automatica.2022.110285
    [110] Zhang J, Lu J Q, Cao J D, Huang W, Guo J H, Wei Y. Traffic congestion pricing via network congestion game approach. Discrete and Continuous Dynamical Systems——Series S, 2021, 14(4): 1553−1567 doi: 10.3934/dcdss.2020378
    [111] Zeng J, Wang Q Q, Liu J F, Chen J L, Chen H Y. A potential game approach to distributed operational optimization for microgrid energy management with renewable energy and demand response. IEEE Transactions on Industrial Electronics, 2019, 66(6): 4479−4489 doi: 10.1109/TIE.2018.2864714
    [112] Deng Z H, Luo J. Distributed algorithm for nonsmooth multi-coalition games and its application in electricity markets. Automatica, 2024, 161: Article No. 111494 doi: 10.1016/j.automatica.2023.111494
    [113] Meng M, Li X X. On the linear convergence of distributed Nash equilibrium seeking for multi-cluster games under partial-decision information. Automatica, 2023, 151: Article No. 110919 doi: 10.1016/j.automatica.2023.110919
    [114] Bašar T, Olsder G J. Dynamic Noncooperative Game Theory (2nd edition). Philadelphia: SIAM, 1999.
    [115] Modares H, Lewis F L, Jiang Z P. ${ H_{\infty}}$ tracking control of completely unknown continuous-time systems via off-policy reinforcement learning. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(10): 2550−2562 doi: 10.1109/TNNLS.2015.2441749
    [116] Song R Z, Lewis F L, Wei Q L. Off-policy integral reinforcement learning method to solve nonlinear continuous-time multiplayer nonzero-sum games. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(3): 704−713 doi: 10.1109/TNNLS.2016.2582849
    [117] Odekunle A, Gao W N, Davari M, Jiang Z P. Reinforcement learning and non-zero-sum game output regulation for multi-player linear uncertain systems. Automatica, 2020, 112: Article No. 108672 doi: 10.1016/j.automatica.2019.108672
    [118] Li M, Qin J H, Freris N M, Ho D W C. Multiplayer Stackelberg-Nash game for nonlinear system via value iteration-based integral reinforcement learning. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(4): 1429−1440 doi: 10.1109/TNNLS.2020.3042331
    [119] Mukaidani H, Xu H. Stackelberg strategies for stochastic systems with multiple followers. Automatica, 2015, 53: 53−59 doi: 10.1016/j.automatica.2014.12.021
    [120] Li Man, Qin Jia-Hu, Wang Long. Seeking equilibrium for linear-quadratic two-player Stackelberg game: A Q-learning approach. Scientia Sinica Informationis, 2022, 52(6): 1083−1097 doi: 10.1360/SSI-2021-0016
    [121] Lin Y N. Necessary/sufficient conditions for Pareto optimality in finite horizon mean-field type stochastic differential game. Automatica, 2020, 119: Article No. 108951 doi: 10.1016/j.automatica.2020.108951
    [122] Vamvoudakis K G, Lewis F L, Hudas G R. Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality. Automatica, 2012, 48(8): 1598−1611 doi: 10.1016/j.automatica.2012.05.074
    [123] Jiao Q, Modares H, Xu S Y, Lewis F L, Vamvoudakis K G. Multi-agent zero-sum differential graphical games for disturbance rejection in distributed control. Automatica, 2016, 69: 24−34 doi: 10.1016/j.automatica.2016.02.002
    [124] Li M, Qin J H, Ma Q C, Zheng W X, Kang Y. Hierarchical optimal synchronization for linear systems via reinforcement learning: A Stackelberg-Nash game perspective. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(4): 1600−1611 doi: 10.1109/TNNLS.2020.2985738
    [125] Li M, Qin J H, Wang Y N, Kang Y. Bio-inspired dynamic collective choice in large-population systems: A robust mean-field game perspective. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(5): 1914−1924 doi: 10.1109/TNNLS.2020.3027428
    [126] Kamalapurkar R, Klotz J R, Walters P, Dixon W E. Model-based reinforcement learning in differential graphical games. IEEE Transactions on Control of Network Systems, 2018, 5(1): 423−433 doi: 10.1109/TCNS.2016.2617622
    [127] Li J N, Modares H, Chai T Y, Lewis F L, Xie L H. Off-policy reinforcement learning for synchronization in multiagent graphical games. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(10): 2434−2445 doi: 10.1109/TNNLS.2016.2609500
    [128] Qin J H, Li M, Shi Y, Ma Q C, Zheng W X. Optimal synchronization control of multiagent systems with input saturation via off-policy reinforcement learning. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(1): 85−96 doi: 10.1109/TNNLS.2018.2832025
    [129] 孙长银, 穆朝絮. 多智能体深度强化学习的若干关键科学问题. 自动化学报, 2020, 46(7): 1301−1312
    Sun Chang-Yin, Mu Chao-Xu. Important scientific problems of multi-agent deep reinforcement learning. Acta Automatica Sinica, 2020, 46(7): 1301−1312
    [130] Arslan G, Yüksel S. Decentralized Q-learning for stochastic teams and games. IEEE Transactions on Automatic Control, 2017, 62(4): 1545−1558 doi: 10.1109/TAC.2016.2598476
    [131] Shao J Z, Lou Z Q, Zhang H C, Jiang Y H, He S C, Ji X Y. Self-organized group for cooperative multi-agent reinforcement learning. In: Proceedings of the 36th International Conference on Neural Information Processing Systems. New Orleans, USA: Curran Associates Inc., 2022. Article No. 413
    [132] Wang L, Zhang Y P, Hu Y J, Wang W X, Zhang C J, Gao Y, et al. Individual reward assisted multi-agent reinforcement learning. In: Proceedings of the 39th International Conference on Machine Learning. Baltimore, USA: PMLR, 2022. 23417−23432
    [133] Leonardos S, Overman W, Panageas I, Piliouras G. Global convergence of multi-agent policy gradient in Markov potential games. In: Proceedings of the 10th International Conference on Learning Representations. Virtual Event: ICLR, 2022.
    [134] Zhang K Q, Hu B, Bašar T. On the stability and convergence of robust adversarial reinforcement learning: A case study on linear quadratic systems. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2020. Article No. 1850
    [135] Yang Y D, Luo R, Li M N, Zhou M, Zhang W N, Wang J. Mean field multi-agent reinforcement learning. In: Proceedings of the 35th International Conference on Machine Learning. Stockholm, Sweden: PMLR, 2018. 5571−5580
    [136] Ben-Porath E. Rationality, Nash equilibrium and backwards induction in perfect-information games. The Review of Economic Studies, 1997, 64(1): 23−46 doi: 10.2307/2971739
    [137] Brown N, Sandholm T. Reduced space and faster convergence in imperfect-information games via pruning. In: Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: PMLR, 2017. 596−604
    [138] Lowe R, Wu Y, Tamar A, Harb J, Abbeel P, Mordatch I. Multi-agent actor-critic for mixed cooperative-competitive environments. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach, USA: Curran Associates Inc., 2017. 6382−6393
    [139] Sunehag P, Lever G, Gruslys A, Czarnecki W M, Zambaldi V, Jaderberg M, et al. Value-decomposition networks for cooperative multi-agent learning based on team reward. In: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems. Stockholm, Sweden: International Foundation for Autonomous Agents and Multiagent Systems, 2018. 2085−2087
    [140] Rashid T, Samvelyan M, de Witt C S, Farquhar G, Foerster J, Whiteson S. Monotonic value function factorisation for deep multi-agent reinforcement learning. The Journal of Machine Learning Research, 2020, 21(1): Article No. 178
    [141] Ruan J Q, Du Y L, Xiong X T, Xing D P, Li X Y, Meng L H, et al. GCS: Graph-based coordination strategy for multi-agent reinforcement learning. In: Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems. Auckland, New Zealand: International Foundation for Autonomous Agents and Multiagent Systems, 2022. 1128−1136
    [142] Li X S, Li J C, Shi H B, Hwang K S. A decentralized communication framework based on dual-level recurrence for multiagent reinforcement learning. IEEE Transactions on Cognitive and Developmental Systems, 2024, 16(2): 640−649 doi: 10.1109/TCDS.2023.3281878
    [143] Jiang H B, Ding Z L, Lu Z Q. Settling decentralized multi-agent coordinated exploration by novelty sharing. In: Proceedings of the 38th AAAI Conference on Artificial Intelligence. Vancouver, Canada: AAAI, 2024. 17444−17452
    [144] Wang H, Yu Y, Jiang Y. Fully decentralized multiagent communication via causal inference. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(12): 10193−10202 doi: 10.1109/TNNLS.2022.3165114
    [145] van Goor P, Mahony R. EqVIO: An equivariant filter for visual-inertial odometry. IEEE Transactions on Robotics, 2023, 39(5): 3567−3585 doi: 10.1109/TRO.2023.3289587
    [146] Shan T X, Englot B, Meyers D, Wang W, Ratti C, Rus D. LIO-SAM: Tightly-coupled lidar inertial odometry via smoothing and mapping. In: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Las Vegas, USA: IEEE, 2020. 5135−5142
    [147] Shan T X, Englot B, Ratti C, Rus D. LVI-SAM: Tightly-coupled lidar-visual-inertial odometry via smoothing and mapping. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Xi'an, China: IEEE, 2021. 5692−5698
    [148] Zhang Z Y, Wang L, Zhou L P, Koniusz P. Learning spatial-context-aware global visual feature representation for instance image retrieval. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Paris, France: IEEE, 2023. 11216−11225
    [149] Harris C, Stephens M. A combined corner and edge detector. In: Proceedings of the Alvey Vision Conference. Manchester, UK: Alvey Vision Club, 1988. 1−6
    [150] Fang S, Li H. Multi-vehicle cooperative simultaneous LiDAR SLAM and object tracking in dynamic environments. IEEE Transactions on Intelligent Transportation Systems, 2024, 25(9): 11411−11421 doi: 10.1109/TITS.2024.3360259
    [151] Zhu F C, Ren Y F, Kong F Z, Wu H J, Liang S Q, Chen N, et al. Swarm-LIO: Decentralized swarm LiDAR-inertial odometry. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). London, UK: IEEE, 2023. 3254–3260
    [152] Zhang Z J, Wang S, Hong Y C, Zhou L K, Hao Q. Distributed dynamic map fusion via federated learning for intelligent networked vehicles. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Xi'an, China: IEEE, 2021. 953−959
    [153] Khamis A, Hussein A, Elmogy A. Multi-robot task allocation: A review of the state-of-the-art. In: Cooperative Robots and Sensor Networks 2015. Cham: Springer, 2015. 31−51
    [154] Choi H L, Brunet L, How J P. Consensus-based decentralized auctions for robust task allocation. IEEE Transactions on Robotics, 2009, 25(4): 912−926 doi: 10.1109/TRO.2009.2022423
    [155] Bai X S, Fielbaum A, Kronmüller M, Knoedler L, Alonso-Mora J. Group-based distributed auction algorithms for multi-robot task assignment. IEEE Transactions on Automation Science and Engineering, 2023, 20(2): 1292−1303 doi: 10.1109/TASE.2022.3175040
    [156] Shorinwa O, Haksar R N, Washington P, Schwager M. Distributed multirobot task assignment via consensus ADMM. IEEE Transactions on Robotics, 2023, 39(3): 1781−1800 doi: 10.1109/TRO.2022.3228132
    [157] Park S, Zhong Y D, Leonard N E. Multi-robot task allocation games in dynamically changing environments. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Xi'an, China: IEEE, 2021. 8678−8684
    [158] Soria E, Schiano F, Floreano D. Predictive control of aerial swarms in cluttered environments. Nature Machine Intelligence, 2021, 3(6): 545−554 doi: 10.1038/s42256-021-00341-y
    [159] Saravanos A D, Aoyama Y, Zhu H C, Theodorou E A. Distributed differential dynamic programming architectures for large-scale multiagent control. IEEE Transactions on Robotics, 2023, 39(6): 4387−4407 doi: 10.1109/TRO.2023.3319894
    [160] Yao W J, de Marina H G, Sun Z Y, Cao M. Guiding vector fields for the distributed motion coordination of mobile robots. IEEE Transactions on Robotics, 2023, 39(2): 1119−1135 doi: 10.1109/TRO.2022.3224257
    [161] Chen Y D, Guo M, Li Z K. Deadlock resolution and recursive feasibility in MPC-based multirobot trajectory generation. IEEE Transactions on Automatic Control, 2024, 69(9): 6058−6073 doi: 10.1109/TAC.2024.3393126
    [162] Spica R, Cristofalo E, Wang Z J, Montijano E, Schwager M. A real-time game theoretic planner for autonomous two-player drone racing. IEEE Transactions on Robotics, 2020, 36(5): 1389−1403 doi: 10.1109/TRO.2020.2994881
    [163] Williams Z, Chen J S, Mehr N. Distributed potential iLQR: Scalable game-theoretic trajectory planning for multi-agent interactions. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). London, UK: IEEE, 2023. 1−7
    [164] Chen M, Shih J C, Tomlin C J. Multi-vehicle collision avoidance via Hamilton-Jacobi reachability and mixed integer programming. In: Proceedings of the IEEE 55th Conference on Decision and Control (CDC). Las Vegas, USA: IEEE, 2016. 1695−1700
    [165] Li M, Qin J H, Li J C, Liu Q C, Shi Y, Kang Y. Game-based approximate optimal motion planning for safe human-swarm interaction. IEEE Transactions on Cybernetics, 2024, 54(10): 5649−5660 doi: 10.1109/TCYB.2023.3340659
    [166] Zhu K, Zhang T. Deep reinforcement learning based mobile robot navigation: A review. Tsinghua Science and Technology, 2021, 26(5): 674−691 doi: 10.26599/TST.2021.9010012
    [167] He Z C, Dong L, Song C W, Sun C Y. Multiagent soft actor-critic based hybrid motion planner for mobile robots. IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(12): 10980−10992 doi: 10.1109/TNNLS.2022.3172168
    [168] Fan T X, Long P X, Liu W X, Pan J. Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios. The International Journal of Robotics Research, 2020, 39(7): 856−892 doi: 10.1177/0278364920916531
    [169] Brito B, Everett M, How J P, Alonso-Mora J. Where to go next: Learning a subgoal recommendation policy for navigation in dynamic environments. IEEE Robotics and Automation Letters, 2021, 6(3): 4616−4623 doi: 10.1109/LRA.2021.3068662
    [170] Xie Z T, Dames P. DRL-VO: Learning to navigate through crowded dynamic scenes using velocity obstacles. IEEE Transactions on Robotics, 2023, 39(4): 2700−2719 doi: 10.1109/TRO.2023.3257549
    [171] Han R H, Chen S D, Wang S J, Zhang Z Q, Gao R, Hao Q, et al. Reinforcement learned distributed multi-robot navigation with reciprocal velocity obstacle shaped rewards. IEEE Robotics and Automation Letters, 2022, 7(3): 5896−5903 doi: 10.1109/LRA.2022.3161699
    [172] Chen L, Wang Y N, Miao Z Q, Feng M T, Zhou Z, Wang H S, et al. Reciprocal velocity obstacle spatial-temporal network for distributed multirobot navigation. IEEE Transactions on Industrial Electronics, 2024, 71(11): 14470−14480 doi: 10.1109/TIE.2024.3379630
    [173] Qin J M, Qin J H, Qiu J X, Liu Q C, Li M, Ma Q C. SRL-ORCA: A socially aware multi-agent mapless navigation algorithm in complex dynamic scenes. IEEE Robotics and Automation Letters, 2024, 9(1): 143−150 doi: 10.1109/LRA.2023.3331621
    [174] Li Y, Davis C, Lukszo Z, Weijnen M. Electric vehicle charging in China's power system: Energy, economic and environmental trade-offs and policy implications. Applied Energy, 2016, 173: 535−554 doi: 10.1016/j.apenergy.2016.04.040
    [175] Chandra I, Singh N K, Samuel P. A comprehensive review on coordinated charging of electric vehicles in distribution networks. Journal of Energy Storage, 2024, 89: Article No. 111659 doi: 10.1016/j.est.2024.111659
    [176] Franco J F, Rider M J, Romero R. A mixed-integer linear programming model for the electric vehicle charging coordination problem in unbalanced electrical distribution systems. IEEE Transactions on Smart Grid, 2015, 6(5): 2200−2210 doi: 10.1109/TSG.2015.2394489
    [177] Das R, Wang Y, Busawon K, Putrus G, Neaimeh M. Real-time multi-objective optimisation for electric vehicle charging management. Journal of Cleaner Production, 2021, 292: Article No. 126066 doi: 10.1016/j.jclepro.2021.126066
    [178] Wan Y N, Qin J H, Yu X H, Yang T, Kang Y. Price-based residential demand response management in smart grids: A reinforcement learning-based approach. IEEE/CAA Journal of Automatica Sinica, 2022, 9(1): 123−134 doi: 10.1109/JAS.2021.1004287
    [179] Zhang P, Qian K J, Zhou C K, Stewart B G, Hepburn D M. A methodology for optimization of power systems demand due to electric vehicle charging load. IEEE Transactions on Power Systems, 2012, 27(3): 1628−1636 doi: 10.1109/TPWRS.2012.2186595
    [180] Ioakimidis C S, Thomas D, Rycerski P, Genikomsakis K N. Peak shaving and valley filling of power consumption profile in non-residential buildings using an electric vehicle parking lot. Energy, 2018, 148: 148−158 doi: 10.1016/j.energy.2018.01.128
    [181] van Kriekinge G, de Cauwer C, Sapountzoglou N, Coosemans T, Messagie M. Peak shaving and cost minimization using model predictive control for uni- and bi-directional charging of electric vehicles. Energy Reports, 2021, 7: 8760−8771 doi: 10.1016/j.egyr.2021.11.207
    [182] Gong J B, Fu W M, Kang Y, Qin J H, Xiao F. Multi-agent deep reinforcement learning based multi-objective charging control for electric vehicle charging station. In: Proceedings of the 7th Chinese Conference on Swarm Intelligence and Cooperative Control. Nanjing, China: Springer, 2023. 266−277
    [183] Tu R, Gai Y J, Farooq B, Posen D, Hatzopoulou M. Electric vehicle charging optimization to minimize marginal greenhouse gas emissions from power generation. Applied Energy, 2020, 277: Article No. 115517 doi: 10.1016/j.apenergy.2020.115517
    [184] Adetunji K E, Hofsajer I W, Abu-Mahfouz A M, Cheng L. An optimization planning framework for allocating multiple distributed energy resources and electric vehicle charging stations in distribution networks. Applied Energy, 2022, 322: Article No. 119513 doi: 10.1016/j.apenergy.2022.119513
    [185] Ran L L, Qin J H, Wan Y N, Fu W M, Yu W W, Xiao F. Fast charging navigation strategy of EVs in power-transportation networks: A coupled network weighted pricing perspective. IEEE Transactions on Smart Grid, 2024, 15(4): 3864−3875 doi: 10.1109/TSG.2024.3354300
    [186] Wan Y N, Qin J H, Li F Y, Yu X H, Kang Y. Game theoretic-based distributed charging strategy for PEVs in a smart charging station. IEEE Transactions on Smart Grid, 2021, 12(1): 538−547 doi: 10.1109/TSG.2020.3020466
    [187] Zhang L, Li Y Y. A game-theoretic approach to optimal scheduling of parking-lot electric vehicle charging. IEEE Transactions on Vehicular Technology, 2016, 65(6): 4068−4078 doi: 10.1109/TVT.2015.2487515
    [188] Kabir M E, Assi C, Tushar M H K, Yan J. Optimal scheduling of EV charging at a solar power-based charging station. IEEE Systems Journal, 2020, 14(3): 4221−4231 doi: 10.1109/JSYST.2020.2968270
    [189] Zavvos E, Gerding E H, Brede M. A comprehensive game-theoretic model for electric vehicle charging station competition. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(8): 12239−12250 doi: 10.1109/TITS.2021.3111765
    [190] Chen J, Huang X Q, Cao Y J, Li L Y, Yan K, Wu L, et al. Electric vehicle charging schedule considering shared charging pile based on generalized Nash game. International Journal of Electrical Power and Energy Systems, 2022, 136: Article No. 107579
    [191] Yan D X, Yin H, Li T, Ma C B. A two-stage scheme for both power allocation and EV charging coordination in a grid-tied PV-battery charging station. IEEE Transactions on Industrial Informatics, 2021, 17(10): 6994−7004 doi: 10.1109/TII.2021.3054417
    [192] Liu Z X, Wu Q W, Huang S J, Wang L F, Shahidehpour M, Xue Y S. Optimal day-ahead charging scheduling of electric vehicles through an aggregative game model. IEEE Transactions on Smart Grid, 2018, 9(5): 5173−5184 doi: 10.1109/TSG.2017.2682340
    [193] Lin R Z, Chu H Q, Gao J W, Chen H. Charging management and pricing strategy of electric vehicle charging station based on mean field game theory. Asian Journal of Control, 2024, 26(2): 803−813 doi: 10.1002/asjc.3173
    [194] Wang Y F, Wang X L, Shao C C, Gong N W. Distributed energy trading for an integrated energy system and electric vehicle charging stations: A Nash bargaining game approach. Renewable Energy, 2020, 155: 513−530 doi: 10.1016/j.renene.2020.03.006
    [195] Pahlavanhoseini A, Sepasian M S. Optimal planning of PEV fast charging stations using Nash bargaining theory. Journal of Energy Storage, 2019, 25: Article No. 100831 doi: 10.1016/j.est.2019.100831
    [196] Ran L L, Wan Y N, Qin J H, Fu W M, Zhang D F, Kang Y. A game-based battery swapping station recommendation approach for electric vehicles. IEEE Transactions on Intelligent Transportation Systems, 2023, 24(9): 9849−9860 doi: 10.1109/TITS.2023.3269570
    [197] Zeng H T, Sheng Y J, Sun H B, Zhou Y Z, Xue Y X, Guo Q L. A conic relaxation approach for solving Stackelberg pricing game of electric vehicle charging station considering traffic equilibrium. IEEE Transactions on Smart Grid, 2024, 15(3): 3080−3097 doi: 10.1109/TSG.2023.3329651
    [198] Wan Y N, Qin J H, Ma Q C, Fu W M, Wang S. Multi-agent DRL-based data-driven approach for PEVs charging/discharging scheduling in smart grid. Journal of the Franklin Institute, 2022, 359(4): 1747−1767 doi: 10.1016/j.jfranklin.2022.01.016
    [199] Zhang Z L, Wan Y N, Qin J H, Fu W M, Kang Y. A deep RL-based algorithm for coordinated charging of electric vehicles. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(10): 18774−18784 doi: 10.1109/TITS.2022.3170000
    [200] Park K, Moon I. Multi-agent deep reinforcement learning approach for EV charging scheduling in a smart grid. Applied Energy, 2022, 328: Article No. 120111 doi: 10.1016/j.apenergy.2022.120111
    [201] Zhang Y, Yang Q Y, An D, Li D H, Wu Z Z. Multistep multiagent reinforcement learning for optimal energy schedule strategy of charging stations in smart grid. IEEE Transactions on Cybernetics, 2023, 53(7): 4292−4305 doi: 10.1109/TCYB.2022.3165074
    [202] Liang Y C, Ding Z H, Zhao T Y, Lee W J. Real-time operation management for battery swapping-charging system via multi-agent deep reinforcement learning. IEEE Transactions on Smart Grid, 2023, 14(1): 559−571 doi: 10.1109/TSG.2022.3186931
    [203] Wang L, Liu S X, Wang P F, Xu L M, Hou L Y, Fei A G. QMIX-based multi-agent reinforcement learning for electric vehicle-facilitated peak shaving. In: Proceedings of the IEEE Global Communications Conference. Kuala Lumpur, Malaysia: IEEE, 2023. 1693−1698
Publication history
  • Received:  2024-07-16
  • Accepted:  2024-11-06
  • Published online:  2024-12-13
  • Issue date:  2025-03-18