摘要: 作为多智能体对抗博弈问题的重要分支, 追逃博弈(Pursuit-evasion, PE)问题在控制和机器人领域得到了广泛应用, 受到众多研究者的密切关注. 追逃博弈问题主要聚焦于追逐者和逃跑者双方为实现各自目标而展开的动态博弈: 追逐者试图在最短时间内抓到逃跑者, 逃跑者的目标则是避免被捕获. 本文概述追逃博弈问题的相关研究进展, 从空间环境、信息获取等五个方面介绍追逃博弈问题的各类设定; 简述理论求解、数值求解等四种当下主流的追逃博弈问题求解方法. 通过对现有研究的总结和分析, 给出几点研究建议, 对未来追逃博弈问题的发展具有一定指导意义.

Abstract: As an important branch of multi-agent adversarial games, pursuit-evasion (PE) games have found widespread applications in the fields of control and robotics, attracting considerable attention from researchers. PE games primarily focus on the dynamic games between pursuer and evader, each striving to achieve their respective objectives: The pursuer aims to capture the evader as quickly as possible, while the evader's goal is to avoid capture. This article provides an overview of the research progress in PE games, and introduces various settings of PE games across five key dimensions, including spatial environment, information acquisition, and so on. It briefly describes four mainstream methods for solving PE games, including theoretical approaches, numerical approaches, and so on. By summarizing and analyzing existing research, this article offers several research suggestions, which are expected to provide significant guidance for future developments in PE games.
Key words: Pursuit-evasion (PE) games / multi-agent / adversarial games / differential games
多个独立的智能体通过信息交互等多种方式组成多智能体系统[1−6], 旨在解决单个智能体无法解决的大规模复杂性任务. 多智能体系统能够完成在复杂网络下的信息传递, 已广泛应用在诸多重要的实际场景, 如航天器领域[7]、无人驾驶飞行器[8]和水下车辆系统[9]等. 特别地, 多智能体系统的跟踪控制策略[10−11]一直是进一步探索的热点研究方向. 例如, 文献[12]研究针对多智能体系统网络化预测PID的控制问题来达到输出信号一致性. 文献[13]针对带有非线性扰动的多智能体系统, 提出基于神经网络的自适应控制策略来实现共识控制目标. 然而, 智能体之间的信息交互严格依赖网络环境, 特别是存在大量智能体时将同时占用多个网络通信渠道, 不可避免地导致沉重的网络负担, 这一问题亟需解决.
近年来, 为缓解信息传递渠道上的通讯压力, 学者们提出事件触发策略[14−18]使控制器以非周期的方式更新. 文献[19]提出针对多智能体系统的事件触发分布式控制策略, 包括固定阈值策略、相对阈值策略和切换阈值策略. 根据这一概念, 相继研究出多种新颖的事件触发机制, 例如动态事件触发机制、状态触发机制和记忆事件触发机制等. 其中, 状态触发机制[20−21]引起广泛关注. 特别地, 文献[21]首次提出基于状态触发机制的非线性多智能体系统自适应一致性控制方案, 设计的事件触发机制首先由采样信号值与系统真实值构造出采样误差, 随后转换得到带有触发信号的同步误差, 并将之设计到控制输入信号中来缓解控制器−传感器渠道上的通讯压力. 值得注意的是, 现存文献中状态触发机制的阈值条件是动态变化的, 并且会采用分解的方式将采样误差的平方项放缩为常数项, 以此判断下一次采样时刻. 需要进一步指出的是, 该阈值条件的设计方法严格依赖于稳定性条件, 从而限制阈值条件设计的灵活性. 此外, 在实际系统的真实状况中, 状态信息通常不可测量, 这会降低控制方案实施的准确性. 因此, 需要设计状态观测器来实现对原系统的重构, 从而满足反馈控制的需要. 所以, 在大规模实际系统的控制运行中, 首先需要解决在未知状态不可测情况下系统通讯渠道负担重的问题, 并且在信息传递过程中多个通讯链路(例如控制器−执行器环节、传感器−控制器环节、智能体与智能体之间的通讯渠道等)上的资源节约问题同样值得注意. 若能同时在多个通讯渠道上节约通讯资源, 势必会大幅度降低整体控制系统的通讯带宽占用率.
值得关注的是, 一些非线性因素[22−28]可能会导致系统性能下降或系统抖振现象发生. 模糊逻辑系统或神经网络是处理非线性项的常用近似工具. 然而在大多数情况下, 模糊逻辑系统或神经网络的逼近能力是有限的, 并且其逼近效果取决于模糊规则或神经网络节点的数量. 因此, 系统中非线性因素的存在可能导致系统不稳定. 为解决上述情况并保证系统的实时性能, 文献[29]提出规定性能控制方法, 通过将跟踪误差约束在预定的范围内实现对系统瞬态性能和稳态性能的保证. 文献[30]针对非线性多智能体系统, 设计新颖的规定性能转换函数. 文献[31]利用误差转换方法和规定性能控制策略设计分布式自适应控制器来保证控制目标的有效实现. 由此可以看出, 规定性能控制方法在确保系统性能方面十分有效, 并且当发现性能指标无法满足时, 该机制可以及时采取相应的容错措施以提高系统的可靠性和安全性.
因此, 考虑到非线性多智能体系统中通讯资源负担重以及现存状态事件触发机制的阈值设计条件具有一定的局限性的双重问题, 本文展开基于混合双端事件触发机制的协同控制策略研究, 旨在提升现存状态事件触发机制阈值设计的灵活性且改善多通讯渠道的通讯压力状况, 进一步拓展多智能体系统一致性控制策略的多样性.
本文主要贡献如下:
1) 与现存结果相比[21], 所提出的状态触发机制的阈值条件可以在不使用杨氏不等式的情况下直接设计, 从而减少现有控制方案的放缩次数和保守性. 并且首次使用估计状态进行采样, 扩展了状态触发机制的应用范围. 基于新的状态触发机制与控制器触发机制, 构造新的双端分布式触发框架, 在较少的参数设计限制下, 带来更小的通讯压力.
2) 基于规定性能技术特性, 提出的自适应控制方案既减少系统不稳定情况发生的概率, 又保证多智能体一致性任务的精确度. 此外, 所设计的分布式观测器仅依赖于相对输出信息进行反馈调节来解决状态不可测问题, 具有良好的可扩展性和灵活性.
本文的组织结构安排如下. 第1节给出本文工作所需的预备知识; 第2节介绍模糊状态观测器的设计过程; 为获得预期的控制目标, 第3节提出自适应分布式控制器的设计方法、稳定性及芝诺行为分析; 第4节通过一个实际仿真例子证明所提出策略的有效性; 第5节给出本文的结论, 并对未来研究方向进行展望.
1. 预备知识
1.1 图论
针对非线性多智能体系统, 智能体之间的信息传输关系需要清晰描述. 考虑一个有向图来表明多个智能体之间的关系. 首先, 定义有向图为 $ \bar{{\cal{G}}}=({{\cal{V}}},\;\acute{{\cal{E}}},\;\bar{{\cal{A}}}) $, 其中, $ {{\cal{V}}}=\left\{ {1},\;\cdots,\;{N}\right\} $ 为节点集合, $ \acute{{\cal{E}}}\subseteq{{\cal{V}}} \times {{\cal{V}}} $ 为边集合. 定义邻接矩阵为 $ {\bar{{\cal{A}}}}=[a_{h,\; \ell}] \in \bf{R}^{N\times N} $. 与此同时, 节点$ \ell $到节点 $ h $形成的边表示为 $ \left({{\cal{V}}}_{h},\;{{\cal{V}}}_{\ell}\right) \in \acute{{\cal{E}}} $. 当节点$ \ell $可以传输信息到节点$ h $时, $ a_{h,\;\ell}>0 $; 否则, $ a_{h,\;\ell}=0 $. 此外, $ \check{{\cal{D}}} $ 表示度矩阵, 其中 $ \check{{\cal{D}}}=\hbox{diag}\left\{d_{1},\;\cdots,\;d_{N}\right\} $ 以及 $ d_{h} = {\sum\nolimits^N_{\ell=1}}a_{h,\;\ell} $. $ {{\cal{L}}} $ 表示拉普拉斯矩阵, 其定义为$ {{\cal{L}}}=\check{{\cal{D}}}-\bar{{\cal{A}}} $.
假设 1[32]. 有向图$ \bar{{\cal{G}}} $包含一个以领导者节点$ {\bf{0}} $为根的生成树, 即从根节点到任意一个跟随者节点都至少存在一条有效传输路径.
引理1[33]. 定义$ {\cal{B}}=\hbox{diag}\left\{b_{h}\right\}\in \bf{R}^{N\times N} $, 并且使得$ b_{h}>0 $. 随后, 可以得到 $ {\cal{L}}+{\cal{B}} $是非奇异的.
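上述图论概念可用如下 Python 片段作一个简短示意: 由邻接矩阵构造度矩阵与拉普拉斯矩阵, 并数值检查 $ {\cal{L}}+{\cal{B}} $ 的非奇异性. 其中邻接矩阵与 $ {\cal{B}} $ 的取值均为假设的演示值:

```python
import numpy as np

# 假设的 4 智能体有向图邻接矩阵 (仅作演示)
A_bar = np.array([[0, 0, 0, 1],
                  [1, 0, 0, 0],
                  [0, 1, 0, 0],
                  [0, 1, 1, 0]], dtype=float)

D_check = np.diag(A_bar.sum(axis=1))   # 度矩阵, d_h = sum_l a_{h,l}
L = D_check - A_bar                    # 拉普拉斯矩阵 L = D - A

# 领导者接入矩阵 B = diag{b_h} (假设仅智能体 1 可直接获取领导者信息)
B = np.diag([1.0, 0.0, 0.0, 0.0])
```

在该拓扑满足假设1的生成树条件时, 可以数值验证 $ {\cal{L}}+{\cal{B}} $ 的行列式非零.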
1.2 问题形成
1) 智能体的动态模型. 对于$ h=1,\;2,\;\cdots,\;M $和 $ g=1,\;2,\;\cdots,\;n-1 $, 定义第$ h $个智能体的动态模型为
$$ \begin{aligned} \left\{ \begin{aligned} &\dot{x}_{h,\;g}=x_{h,\;g+1}+\zeta_{h,\;g}(\bar{x}_{h,\;g})\\ &\dot{x}_{h,\;n}=u_{h}+\zeta_{h,\;n}(\bar{x}_{h,\;n})\\ &y_{h}=x_{h,\;1} \end{aligned} \right. \end{aligned} $$ (1) 其中, $ {\bar x_{h,\;g}} = [x_{h,\;1},\;x_{h,\;2},\;\cdots,\;x_{h,\;g}]^\text{T}\in {{\bf{R}}}^{g} $ 和 $ {\bar x_{h,\;n}}= [x_{h,\;1},\;x_{h,\;2},\;\cdots,\;x_{h,\;n}]^\text{T}\in {\bf{R}}^{n} $ 代表状态向量. $ u_{h} $和$ y_{h} $分别代表控制输入信号和输出信号. $ \zeta_{h,\;g}(\bar{x}_{h,\;g}) $和$ \zeta_{h,\;n}(\bar{x}_{h,\;n}) $表示未知的光滑非线性函数.
2)控制目标. 针对非线性多智能体系统, 设计一个带有规定性能机制的自适应混合双端事件触发跟踪控制策略来保证如下的两个控制目标:
a) 保证闭环内所有信号都是半全局一致最终有界的;
b) 使跟随者的输出轨迹和领导者的输出轨迹保持一致.
引理2[33]. 定义$ \bar{s}_{.1}=({s}_{1,\;1},\;{s}_{2,\;1},\;\cdots,\;{s}_{N,\;1})^\text{T} $, $ \bar{y}=(y_{1},\;y_{2},\;\cdots,\;y_{N})^\text{T} $ 和 $ \bar{y}_{r}=({y_{r},\;y_{r},\;\cdots,\; y_{r}})^\text{T} $. 三者满足如下关系:
$$ \begin{align} \begin{aligned} ||\bar{y}-\bar{y}_{r}||\le\frac{||\bar{s}_{.1}||}{{\bar{\sigma}}({\cal{L}}+{\cal{B}})} \end{aligned} \end{align} $$ (2) 其中, $ {\bar{\sigma}}({\cal{L}}+{\cal{B}}) $为矩阵$ {\cal{L}}+{\cal{B}} $的最小奇异值.
1.3 模糊逻辑系统
使用模糊逻辑系统理论近似严格反馈多智能体系统中存在的未知非线性函数. 考虑如下模糊逻辑系统:
$$ \begin{aligned} \tilde{y}(x)=\frac{{\sum\limits_{\flat=1}^r}\bar{y}_{\flat}{\prod\limits_{\hbar=1}^{\check{n}}}{\bar{\mu}_{F^{\flat}_{\hbar}}}(x_{\hbar})}{{\sum\limits_{\flat=1}^r}[\prod\limits_{\hbar=1}^{\check{n}}{\bar{\mu}_{F^{\flat}_{\hbar}}}(x_{\hbar})]} \end{aligned} $$ (3) 其中, $ \bar{y}_{\flat}=\max_{y\in{\bf{R}}}{\bar{\mu}_{G^{\flat}}}(\tilde{y}) $.
定义模糊基函数为
$$ \begin{aligned} \psi_{\flat}(x)=\frac{{\prod\limits_{\hbar=1}^{\check{n}}}{\bar{\mu}_{F^{\flat}_{\hbar}}}(x_{\hbar})}{{\sum\limits_{\flat=1}^{r}}[\prod\limits_{\hbar=1}^{\check{n}}{\bar{\mu}_{F^{\flat}_{\hbar}}}(x_{\hbar})]} \end{aligned} $$ (4) 并且, 定义向量 $ \zeta^\text{T}=[\bar{y}_{1},\;\bar{y}_{2},\;\cdots,\;\bar{y}_{r}]= [\zeta_{1},\; \zeta_{2},\;\cdots,\; \zeta_{r}] $ 和 $ \psi(x)=[\psi_{1}(x),\;\cdots,\;\psi_{r}(x)]^\text{T} $. 随后, 进一步表示模糊逻辑系统为 $ \tilde{y}(x)=\zeta^\text{T}\psi(x) $.
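式(3)、(4)的模糊逻辑系统可用高斯型隶属度函数给出一个简短的数值示意(规则中心、宽度与权重均为假设值):

```python
import numpy as np

def fuzzy_basis(x, centers, width):
    """式(4)的模糊基函数向量 psi(x): 高斯隶属度的乘积并归一化.
    x: (n,); centers: (r, n), 即 r 条规则的中心; width: 高斯宽度."""
    mu = np.exp(-(x[None, :] - centers) ** 2 / (2 * width ** 2))
    w = mu.prod(axis=1)          # 每条规则的隶属度乘积
    return w / w.sum()           # 归一化, 各分量之和为 1

def fls_output(x, centers, width, eta):
    """式(3)的模糊逻辑系统输出: y = eta^T psi(x)."""
    return eta @ fuzzy_basis(x, centers, width)

# 假设取 r = 5 条规则近似一维非线性函数
centers = np.linspace(-2, 2, 5).reshape(5, 1)
eta = np.sin(centers[:, 0])      # 假设的权重向量取值
y = fls_output(np.array([0.3]), centers, width=0.8, eta=eta)
```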
引理3[33]. 对于任意$ \varepsilon>0 $, 存在模糊逻辑系统$ {\eta}^\text{T}\psi(x) $使得
$$ \begin{align} \underset{x\in\varpi}\sup\lvert \zeta(x)-{\eta}^\text{T}\psi(x) \rvert \le\varepsilon \end{align} $$ (5) 其中, $ \zeta(x) $为定义在紧集$ \varpi $上的连续函数.
1.4 规定性能函数
定义规定性能机制[30]为
$$ \begin{align} \begin{aligned} -e_{h,\;1,\;{\mathrm{min}}}(t)<e_{h,\;1}(t)<e_{h,\;1,\;{\mathrm{max}}}(t) \end{aligned} \end{align} $$ (6) 接下来, 定义规定性能上下边界为
$$ \begin{aligned} e_{h,\;1,\;{\mathrm{min}}}(t)=\;&(e_{h,\;1,\;0,\;{\mathrm{min}}}-e_{h,\;1,\;\infty,\;{\mathrm{min}}})\text{e}^{-o_{h}t}\;+\\ & e_{h,\;1,\;\infty,\;{\mathrm{min}}}\nonumber\\ e_{h,\;1,\;{\mathrm{max}}}(t)=\;&(e_{h,\;1,\;0,\;{\mathrm{max}}}-e_{h,\;1,\;\infty,\;{\mathrm{max}}})\text{e}^{-o_{h}t}\;+\\ & e_{h,\;1,\;\infty,\;{\mathrm{max}}}\nonumber \end{aligned} $$ 其中, $ -e_{h,\;1,\;{\mathrm{min}}}\;(t) $表示设计的规定范围下界, $ e_{h,\;1,\;{\mathrm{max}}}(t) $ 表示设计的规定范围上界. $ o_{h} $ 是一个可设计的常数. $ e_{h,\;1,\;0,\;{\mathrm{min}}} $, $ e_{h,\;1,\;0,\;{\mathrm{max}}} $, $ e_{h,\;1,\;\infty,\;{\mathrm{min}}} $ 和 $ e_{h,\;1,\;\infty,\;{\mathrm{max}}} $ 是正的参数. 并且, 参数需要满足$ e_{h,\;1,\;0,\;{\mathrm{min}}} > e_{h,\;1,\;\infty,\;{\mathrm{min}}} $ 和 $ e_{h,\;1,\;0,\;{\mathrm{max}}} > e_{h,\;1,\;\infty,\;{\mathrm{max}}} $.
假设2[30]. 对于智能体$ h $, 初始同步误差必须满足受限不等式$ -e_{h,\;1,\;{\mathrm{min}}}(0) < e_{h,\;1}(0) < e_{h,\;1,\;{\mathrm{max}}}(0) $.
根据所考虑的规定性能方法, 得到误差转换机制为
$$ \begin{align} \begin{aligned} e_{h,\;1}=e_{h,\;1,\;{\mathrm{max}}}\Re_{h}(s_{h,\;1}) \end{aligned} \end{align} $$ (7) 其中, $ s_{h,\;1} $ 是转换后的误差. $ \Re_{h}(s_{h,\;1}) $ 表示误差转换函数, 其是光滑且严格单调递增的, 同时满足 $ \Re_{h}(s_{h,\;1})\in(-\kappa_{h},\;1) $ 且 $ \kappa_{h}={e_{h,\;1,\;{\mathrm{min}}}(t)}/{e_{h,\;1,\;{\mathrm{max}}}(t)} $.
转换函数的表达式为
$$ \begin{align} \begin{aligned} \Re_{h}(s_{h,\;1})=\frac{\text{e}^{s_{h,\;1}}-\text{e}^{-s_{h,\;1}}}{\text{e}^{s_{h,\;1}}+{\kappa}^{-1}_{h}\text{e}^{-s_{h,\;1}}} \end{aligned} \end{align} $$ (8) 将式(8)代入式(7), 可得
$$ \begin{align} s_{h,\;1}=\frac{1}{2}\ln\left(1+\frac{e_{h,\;1}}{e_{h,\;1,\;{\mathrm{min}}}}\right)-\frac{1}{2}\ln\left(1-\frac{e_{h,\;1}}{e_{h,\;1,\;{\mathrm{max}}}}\right) \end{align} $$ (9) 注1. 由于本文考虑状态触发机制, 该机制会导致系统出现阶跃现象或抖振现象. 为避免这一现象发生, 本文采用规定性能方法来约束系统的同步误差以减少系统性能下降的情况发生.
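式(6) ~ (9)的规定性能边界与误差转换可用如下 Python 片段作数值验证: 先由式(9)将同步误差映射为转换误差, 再按式(7)、(8)(取 $ \kappa_{h}=e_{h,1,\min}/e_{h,1,\max} $)映射回去, 二者应互为逆映射. 参数取值均为假设的演示值:

```python
import numpy as np

def envelope(t, e0, einf, o):
    """规定性能边界: (e0 - einf) * exp(-o*t) + einf."""
    return (e0 - einf) * np.exp(-o * t) + einf

def transform(e, e_min, e_max):
    """误差转换式(9): 由同步误差 e 得到转换误差 s."""
    return 0.5 * np.log(1 + e / e_min) - 0.5 * np.log(1 - e / e_max)

def inverse_transform(s, e_min, e_max):
    """式(7)、(8)的正向映射: e = e_max * Re(s), kappa = e_min / e_max."""
    kappa = e_min / e_max
    Re = (np.exp(s) - np.exp(-s)) / (np.exp(s) + np.exp(-s) / kappa)
    return e_max * Re

# 假设的参数取值, 仅作演示
t = 1.0
e_min = envelope(t, e0=0.5, einf=0.3, o=1.0)
e_max = envelope(t, e0=0.5, einf=0.5, o=1.0)
s = transform(0.2, e_min, e_max)
e_back = inverse_transform(s, e_min, e_max)   # 应还原出 0.2
```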
1.5 混合双端事件触发机制
针对非线性多智能体系统, 如何有效节省通信资源是十分重要的问题, 事件触发机制可以减少通信带宽的占用. 同时, 在设计事件触发机制时, 重要的是在设计相应的阈值条件时要考虑到通信资源和跟踪性能之间的平衡.
在网络环境中进行信息交换时, 多个信息传输通道会同时进行数据传输. 基于这一考虑, 本文设计混合双端事件触发机制来同时释放控制器−执行器环节和传感器−控制器环节中通信渠道上的压力. 因此, 提出如下的混合双端事件触发机制:
$$ \left\{\begin{split} &\breve{\hat{x}}_{h,\;g}=\hat{x}_{h,\;g}(t^{x}_{h,\;k}),\;t\in[t^{x}_{h,\;k},\;t^{x}_{h,\;k+1})\\ &t^{x}_{h,\;k+1}=\inf\Bigr\{t>t^{x}_{h,\;k}:|\hat{x}_{h,\;g}(t)-\breve{\hat{x}}_{h,\;g}|\ge \\ &\qquad\qquad \nu_{h}+m_{h}\text{e}^{-b_{h}t}\Bigr\} \end{split}\right. $$ (10) 随后可得
$$ \left\{\begin{split} &\breve{u}_{h}(t)=u_{h}(t^{u}_{h,\;k}),\;t\in[t^{u}_{h,\;k},\;t^{u}_{h,\;k+1})\\ &t^{u}_{h,\;k+1}=\inf\Bigr\{t>t^{u}_{h,\;k}:|u_{h}(t)-\breve{u}_{h}(t)|\ge\\ &\qquad\qquad \rho_{h}+\mu_{h}\text{e}^{-\tau_{h}t}\Bigr\} \end{split} \right.$$ (11) 其中, $ t^{x}_{h,\;k} $ 表示系统状态的触发时刻. $ t^{u}_{h,\;k} $ 表示控制输入信号的触发时刻. $ \nu_{h} $, $ m_{h} $, $ b_{h} $, $ \rho_{h} $, $ \mu_{h} $ 和 $ \tau_{h} $ 是正的常数. 同时, 通常假设第一个触发发生在系统运行的初始时刻.
注2. 由于本文考虑未知不可测量状态问题, 提出的状态触发机制首次使用估计状态作为采样信号并构成触发误差, 拓宽了状态触发机制的应用范围. 并且, 设计的阈值条件会随着系统运行时间的变化而变化, 从而更好地平衡了系统性能和资源节约之间的关系.
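式(10)、(11)形式的事件触发采样逻辑可用如下 Python 草图示意: 仅当采样误差超过随时间衰减的阈值 $ \nu_{h}+m_{h}\text{e}^{-b_{h}t} $ 时才更新保持值. 被采样信号与参数均为假设值:

```python
import numpy as np

def event_trigger(signal, t_grid, nu, m, b):
    """模拟式(10)形式的事件触发采样:
    当 |x(t) - x_hat| >= nu + m*exp(-b*t) 时触发并更新保持值."""
    held = signal[0]                   # 假设第一次触发发生在初始时刻
    trigger_times = [t_grid[0]]
    held_values = [held]
    for t, xv in zip(t_grid[1:], signal[1:]):
        threshold = nu + m * np.exp(-b * t)
        if abs(xv - held) >= threshold:
            held = xv                  # 触发: 更新保持值
            trigger_times.append(t)
        held_values.append(held)
    return np.array(held_values), trigger_times

t_grid = np.linspace(0, 10, 1001)
signal = np.sin(t_grid)                # 假设的被采样信号
held, times = event_trigger(signal, t_grid, nu=0.05, m=0.1, b=0.1)
```

触发次数远少于采样网格点数, 且保持误差始终不超过阈值上界, 体现了通讯资源节约与采样精度之间的平衡.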
2. 模糊状态观测器的设计
本节通过构造模糊状态观测器解决未知状态不可测量问题, 该观测器仅使用相对输出分布式误差信息进行反馈. 首先, 定义$ \zeta_{h}(x_{h},u_{h})=\zeta_{h,\;n}(\bar{x}_{h,\;n}) $ 且要求 $ |\zeta_{h}(x_{h},\;u_{h})|\le \bar{\zeta}_{h}(x_{h},\;u_{h}) $. 重新构造系统模型为
$$ \begin{aligned} \left\{ \begin{aligned} &\dot{x}_{h,\;g}=x_{h,\;g+1}+\zeta_{h,\;g}(\hat{\bar{x}}_{h,\;g})\\ &\dot{x}_{h,\;n}=u_{h}+\zeta_{h}(\hat{x}_{h},\;u_{h})\\ & y_{h}=x_{h,\;1} \end{aligned} \right. \end{aligned} $$ (12) 基于模糊逻辑系统理论, 考虑的多智能体系统包含非线性函数$ \zeta_{h,\;g}(\hat{\bar{x}}_{h,\;g}) $ 和 $ \zeta_{h}(\hat{x}_{h},\;u_{h}) $, 近似这两项可得:
$$ \begin{align} \begin{aligned} \hat{\zeta}_{h,\;g}(\hat{\bar{x}}_{h,\;g}|\eta_{h,\;g})=\eta^{*\text{T}}_{h,\;g}\psi_{h,\;g}(\hat{\bar{x}}_{h,\;g}) \end{aligned} \end{align} $$ (13) 其中, $ \hat{\bar{x}}_{h,\;g} $ 表示$ {\bar{x}}_{h,\;g} $的估计值.
$ \eta^{*}_{h,\;g} $ 是最优参数向量, 其可以表示为
$$ \begin{split} \eta^{*}_{h,\;g}=&\;\arg\underset{\eta_{h,\;g}\in\Omega_{h,\;g}}\min[\underset{(\bar{x}_{h,\;g},\;\hat{\bar{x}}_{h,\;g})\in U}\sup| \hat{\zeta}_{h,\;g}(\hat{\bar{x}}_{h,\;g}|{\eta}_{h,\;g})\;-\\ &\zeta_{h,\;g}(\hat{\bar{x}}_{h,\;g})|] \\[-1pt]\end{split} $$ (14) 其中, $ U $ 和 $ \Omega_{h,\;g} $ 分别为 $ \hat{\bar{x}}_{h,\;g} $ 和 $ \eta_{h,\;g} $ 对应的紧集. $ \hat{\eta}_{h,\;g} $ 表示 $ \eta^{*}_{h,\;g} $ 的估计值. $ \varepsilon_{h,\;g} $ 表示模糊最小化近似误差且 $ \varepsilon_{h,\;g}=\zeta_{h,\;g}(\bar{x}_{h,\;g})-\hat{\zeta}_{h,\;g}(\hat{\bar{x}}_{h,\;g}|\eta^{*}_{h,\;g}) $. 接下来, 构建分布式状态观测器为
$$ \left\{\begin{aligned} &\dot{\hat x}_{h,\;g}=\hat{x}_{h,\;g+1}+\hat{\eta}^\text{T}_{h,\;g}\psi_{h,\;g}(\hat{\bar{x}}_{h,\;g})\;+\\ &\quad k^{*}_{h,\;g}\left(\sum_{\ell=1}^N a_{h,\;\ell}(y_{h}-y_{\ell})-\sum_{\ell=1}^N a_{h,\;\ell}(\hat{y}_{h}-y_{\ell})\right)\\ &\dot{\hat x}_{h,\;n}=u_{h}+\hat{\eta}^\text{T}_{h,\;n}\psi_{h,\;n}(\hat{\bar{x}}_{h,\;n})\;+\\ &\quad k^{*}_{h,\;n}\left(\sum_{\ell=1}^N a_{h,\;\ell}(y_{h}-y_{\ell})-\sum_{\ell=1}^N a_{h,\;\ell}(\hat{y}_{h}-y_{\ell})\right)\\ &\hat{y}_{h}=\hat{x}_{h,\;1} \\[-1pt] \end{aligned}\right. $$ (15) 其中, $ k^{*}_{h,\;g} $ 和 $ k^{*}_{h,\;n} $ 是正的常数. 此外, 相对输出误差表示为 $ \sum_{\ell=1}^N a_{h,\;\ell}(y_{h} - y_{\ell}) $. 定义 $ \Delta_{h} = \bar{x}_{h,\;n} - \hat{\bar{x}}_{h,\;n} $ 和 $ \hat{\bar{x}}_{h,\;n}=[\hat{x}_{h,\;1},\;\cdots,\;\hat{x}_{h,\;n}]^\text{T} $. 经过上述分析, 可得
$$ \begin{align} \begin{aligned} \dot{\Delta}_{h}=&A_{h}\Delta_{h}+\varepsilon_{h}+\sum_{\eth=1}^{n} A_{h,\;\eth} \tilde{\eta}^\text{T}_{h,\;\eth}\psi_{h,\;\eth}(\hat{\bar{x}}_{h,\;\eth}) \end{aligned} \end{align} $$ (16) 其中, $ \varepsilon_{h} = [\varepsilon_{h,\,1},\,\cdots,\,\varepsilon_{h,\,n}]^\text{T} $, $ A_{h,\,\eth} = [0\cdots1\cdots0]_{n \times 1} $. 为了确保 $ A_{h} $ 是一个严格的赫尔维兹矩阵, 选择观测器增益向量 $ K_{h} $ 且 $ K_{h}=[k^{*}_{h,\;1},\;\cdots,\;k^{*}_{h,\;n}]^\text{T} $.
并且,
$$ \begin{align} \begin{aligned} A_{h}= \begin{bmatrix} -k^{*}_{h,\;1}\displaystyle\sum\limits_{\ell=1}^N a_{h,\;\ell}& &\\ \vdots&I_{n-1}&\\ -k^{*}_{h,\;n}\displaystyle\sum\limits_{\ell=1}^N a_{h,\;\ell}&\cdots&0\nonumber \end{bmatrix} \end{aligned} \end{align} $$ 对于矩阵 $ Q_{h}=Q^\text{T}_{h}>0 $ 和矩阵 $ Y_{h}=Y^\text{T}_{h}>0 $, 满足如下关系
$$ \begin{align} \begin{aligned} A^\text{T}_{h}Y_{h}+Y_{h}A_{h}=-2Q_{h} \end{aligned} \end{align} $$ (17) 挑选如下Lyapunov函数 $ V_{h,\;0} $:
$$ \begin{align} \begin{aligned} V_{h,\;0}=\frac{1}{2}\Delta^\text{T}_{h}Y_{h}\Delta_{h} \end{aligned} \end{align} $$ (18) 基于上述分析, 计算$ V_{h,\;0} $的导数可得
$$ \begin{split} \dot{V}_{h,\;0}=\;&-\Delta^\text{T}_{h}Q_{h}\Delta_{h}+\Delta^\text{T}_{h}Y_{h}\varepsilon_{h}\;+\\ &\Delta^\text{T}_{h}Y_{h}\sum_{\eth=1}^{n} A_{h,\;\eth} \tilde{\eta}^\text{T}_{h,\;\eth}\psi_{h,\;\eth} \end{split} $$ (19) 利用杨氏不等式, 可得
$$ \begin{align} \Delta^\text{T}_{h}Y_{h}\varepsilon_{h}\le\frac{1}{2}||Y_{h}||^{2}||\varepsilon^{*}_{h}||^{2}+\frac{1}{2}||\Delta_{h}||^{2} \end{align} $$ (20) $$ \begin{split} &e^\text{T}_{h}Y_{h}\sum_{\eth=1}^{n} A_{h,\;\eth} \tilde{\eta}^\text{T}_{h,\;\eth}\psi_{h,\;\eth}(\hat{\bar{x}}_{h,\;\eth})\le \\ &\qquad\frac{n}{2}||\Delta_{h}||^{2}+ \frac{1}{2}||Y_{h}||^{2}\sum_{\eth=1}^{n} A_{h,\;\eth} \tilde{\eta}^\text{T}_{h,\;\eth}\tilde{\eta}_{h,\;\eth} \end{split} $$ (21) 其中, $ \varepsilon^{*}_{h}=[\varepsilon^{*}_{h,\;1},\;\varepsilon^{*}_{h,\;2},\;\cdots,\;\varepsilon^{*}_{h,\;n}]^\text{T} $.
随后, 可得
$$ \begin{split} \dot{V}_{h,\;0}\le\;&\frac{1+n}{2}||\Delta_{h}||^{2}-\Delta^\text{T}_{h}Q_{h}\Delta_{h}+\frac{1}{2}||Y_{h}||^{2}||\varepsilon^{*}_{h}||^{2}\;+\\ &\frac{1}{2}||Y_{h}||^{2}\sum_{\eth=1}^{n} A_{h,\;\eth} \tilde{\eta}^\text{T}_{h,\;\eth}\tilde{\eta}_{h,\;\eth}\le\frac{1}{2}||Y_{h}||^{2}\sum_{\eth=1}^{n}\tilde{\eta}^\text{T}_{h,\;\eth}\tilde{\eta}_{h,\;\eth}\;+\\ & \frac{1}{2}||Y_{h}||^{2}||\varepsilon^{*}_{h}||^{2}- \xi_{0}||\Delta_{h}||^{2}\\[-1pt] \end{split} $$ (22) 其中, $ \xi_{0} = \tau_{{\mathrm{min}}}(Q_{h}) - \frac{1+n}{2} > 0 $. $ \tau_{{\mathrm{min}}}(Q_{h}) $ 是矩阵 $ Q_{h} $ 的最小特征值.
注3. 本文设计的模糊状态观测器仅依赖于智能体的相对输出信息进行反馈, 表明仅使用部分的分布式信息就可以解决未知不可测状态问题. 此外, 设计的观测器可识别严格反馈多智能体系统中的未知非线性函数.
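式(17)的矩阵方程 $ A^\text{T}_{h}Y_{h}+Y_{h}A_{h}=-2Q_{h} $ 可通过 Kronecker 积向量化直接求解. 下面给出一个仅依赖 numpy 的二阶数值示意(观测器增益与 $ Q_{h} $ 均为假设取值):

```python
import numpy as np

def solve_lyapunov(A, Q):
    """求解 A^T Y + Y A = -2Q (A 需为赫尔维兹矩阵).
    行主序向量化: (kron(A^T, I) + kron(I, A^T)) vec(Y) = vec(-2Q)."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.kron(A.T, I) + np.kron(I, A.T)
    Y = np.linalg.solve(M, (-2 * Q).reshape(-1)).reshape(n, n)
    return 0.5 * (Y + Y.T)   # 对称化以消除数值误差

# 假设的二阶观测器误差矩阵: 第一列为 -k_i * d_h, 右上块为单位阵结构
k1, k2, d_h = 2.0, 1.0, 1.0
A = np.array([[-k1 * d_h, 1.0],
              [-k2 * d_h, 0.0]])
Q = np.eye(2)
Y = solve_lyapunov(A, Q)
```

求得的 $ Y_{h} $ 应为正定矩阵, 可通过特征值检验.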
3. 主要内容
3.1 自适应分布式控制器设计
本节给出自适应控制器的设计过程且解决非线性多智能体系统的自适应模糊跟踪控制问题. 定义局部的同步误差 $ e_{h,\;1} $ 和 $ e_{h,\;g} $ ($ g\;=\; 2,\;\cdots,\;n $) 为
$$ \left\{\begin{split} &e_{h,\;1}=\sum_{\ell=1}^Na_{h,\;\ell}(y_{h}-y_{\ell})+b_{h}(y_{h}-y_{r})\\ &e_{h,\;g}=\hat{x}_{h,\;g}-\alpha_{hf,\;g-1} \end{split}\right.$$ (23) 其中, $ \alpha_{hf,\;g-1} $ 表示滤波后的虚拟控制器.
在传统的反步法框架下, 为避免“复杂性爆炸”问题, 引入了一阶滤波器:
$$ \left\{\begin{split} &\Phi_{h,\;g-1}\dot{\alpha}_{hf,\;g-1}+{\alpha}_{hf,\;g-1}={\alpha}_{h,\;g-1}\\ &{\alpha}_{hf,\;g-1}(0)={\alpha}_{h,\;g-1}(0) \end{split}\right. $$ (24) 其中, $ \alpha_{h,\;g - 1} $ 是虚拟控制信号. 虚拟控制信号$ \alpha_{h,\;g - 1} $ 通过一阶滤波器 $ \Phi_{h,\;g-1}>0 $ 会产生一个新的信号 $ {\alpha}_{hf,\;g-1} $.
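式(24)的一阶滤波器可用前向欧拉法离散实现, 如下 Python 片段所示(步长、时间常数与输入信号均为假设值):

```python
import numpy as np

def first_order_filter(alpha, dt, Phi):
    """离散化 Phi * d(alpha_f)/dt + alpha_f = alpha, 初值 alpha_f(0)=alpha(0)."""
    alpha_f = np.empty_like(alpha)
    alpha_f[0] = alpha[0]
    for k in range(len(alpha) - 1):
        # 前向欧拉: alpha_f' = (alpha - alpha_f) / Phi
        alpha_f[k + 1] = alpha_f[k] + dt * (alpha[k] - alpha_f[k]) / Phi
    return alpha_f

dt = 0.001
t = np.arange(0, 5, dt)
alpha = np.sin(t)                      # 假设的虚拟控制信号
alpha_f = first_order_filter(alpha, dt, Phi=0.05)
```

滤波输出以微小相位滞后跟踪虚拟控制信号, 避免了对其直接求导带来的"复杂性爆炸"问题.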
随后, 表示一阶滤波器的误差为
$$ \begin{align} \begin{aligned} \vartheta_{h,\;g-1}={\alpha}_{hf,\;g-1}-{\alpha}_{h,\;g-1} \end{aligned} \end{align} $$ (25) 基于式 (9), 计算 $ s_{h,\;1} $ 的导数为
$$ \begin{align} \begin{aligned} \dot{s}_{h,\;1}=\rho_{h}\dot{e}_{h,\;1}-\phi_{h}e_{h,\;1} \end{aligned} \end{align} $$ (26) 其中,
$$ \begin{align} \begin{aligned} \rho_{h}=\;&\frac{1}{2}\bigg(\frac{1}{e_{h,\;1,\;{\mathrm{min}}}+e_{h,\;1}}+\frac{1}{e_{h,\;1,\;{\mathrm{max}}}-e_{h,\;1}}\bigg)\\ \phi_{h}=\;&\frac{1}{2}\bigg(\frac{\dot{e}_{h,\;1,\;{\mathrm{min}}}}{e_{h,\;1,\;{\mathrm{min}}}(e_{h,\;1,\;{\mathrm{min}}}+e_{h,\;1})}\;+\\ &\frac{\dot{e}_{h,\;1,\;{\mathrm{max}}}}{e_{h,\;1,\;{\mathrm{max}}}(e_{h,\;1,\;{\mathrm{max}}}-e_{h,\;1})}\bigg)\nonumber \end{aligned} \end{align} $$ 步骤 1. 选择如下的Lyapunov函数:
$$ \begin{align} \begin{aligned} V_{h,\;1}=\frac{1}{2}s^{2}_{h,\;1}+\frac{1}{2}\eta^\text{T}_{h,\;1}\eta_{h,\;1}+\frac{1}{2}\vartheta^{2}_{h,\;1} \end{aligned} \end{align} $$ (27) 计算$ V_{h,\;1} $的导数为
$$ \begin{split} \dot{V}_{h,\;1}&=s_{h,\;1}\dot{s}_{h,\;1}-\tilde{\eta}_{h,\;1}\dot{\hat{\eta}}_{h,\;1}+\vartheta_{h,\;1}\dot{\vartheta}_{h,\;1}=\\ &s_{h,\;1}(\rho_{h}\dot{e}_{h,\;1}-\phi_{h}e_{h,\;1})-\tilde{\eta}_{h,\;1}\dot{\hat{\eta}}_{h,\;1}+\vartheta_{h,\;1}\dot{\vartheta}_{h,\;1}=\\ &s_{h,\;1}\rho_{h}\left(\sum_{\ell=1}^Na_{h,\;\ell}(\dot{y}_{h}-\dot{y}_{\ell})+b_{h}(\dot{y}_{h}-\dot{y}_{r})\right)-\\ &\phi_{h}e_{h,\;1}s_{h,\;1}+\vartheta_{h,\;1}\dot{\vartheta}_{h,\;1}-\tilde{\eta}_{h,\;1}\dot{\hat{\eta}}_{h,\;1}\\[-1pt] \end{split} $$ (28) 根据式 (15), 进一步可得
$$ \begin{split} \dot{V}_{h,\;1}=\;&s_{h,\;1}\rho_{h}\big((b_{h}+d_{h})(\hat{x}_{h,\;2}+\zeta_{h,\;1})\;-\\ & d_{h}(\hat{x}_{\ell,\;2}+\zeta_{\ell,\;1})-b_{h}\dot{y}_{r}\big)-\phi_{h}e_{h,\;1}s_{h,\;1}\;-\\ &\tilde{\eta}_{h,\;1}\dot{\hat{\eta}}_{h,\;1}+\vartheta_{h,\;1}\dot{\vartheta}_{h,\;1}=s_{h,\;1}\rho_{h}\big((b_{h}\;+\\ & d_{h})(e_{h,\;2}+\alpha_{h,\;1}+\vartheta_{h,\;1}+\zeta_{h,\;1})\;-\\ & d_{h}(\hat{x}_{\ell,\;2}+\zeta_{\ell,\;1})-b_{h}\dot{y}_{r}-\phi_{h}e_{h,\;1}\rho^{-1}_{h}\big)\;-\\ & \tilde{\eta}_{h,\;1}\dot{\hat{\eta}}_{h,\;1}+\vartheta_{h,\;1}\dot{\vartheta}_{h,\;1} \end{split} $$ (29) 其中, $ e_{h,\;2}=\hat{x}_{h,\;2}-\alpha_{hf,\;1} $ 且 $ \vartheta_{h,\;1}={\alpha}_{hf,\;1}-{\alpha}_{h,\;1} $. $ \sum_{\ell=1}^Na_{h,\;\ell} $ 可以由 $ d_{h} $ 来表示.
随后, 根据引理3, 可得
$$ \begin{split} &\bar{F}_{h,\;1}({x}_{h,\;1},\;{x}_{\ell,\;1})=(b_{h}+d_{h})\zeta_{h,\;1}-d_{h}(\hat{x}_{\ell,\;2}\;+\\ &\quad \zeta_{\ell,\;1})\nonumber-\phi_{h}e_{h,\;1}\rho^{-1}_{h}\nonumber={\eta}^{\mathrm{T}}_{h,\;1}\psi_{h,\;1}+\varepsilon_{h,\;1} \end{split} $$ 式 (29) 可进一步表示为
$$ \begin{split} \dot{V}_{h,\;1}=\;&s_{h,\;1}\rho_{h}\big((b_{h}+d_{h})(e_{h,\;2}+\alpha_{h,\;1}+\vartheta_{h,\;1})\nonumber\;-\\ & b_{h}\dot{y}_{r}+\bar{F}_{h,\;1}\big)-\tilde{\eta}_{h,\;1}\dot{\hat{\eta}}_{h,\;1}+\vartheta_{h,\;1}\dot{\vartheta}_{h,\;1}\nonumber=\\ & s_{h,\;1}\rho_{h}\big((b_{h}+d_{h})(e_{h,\;2}+\alpha_{h,\;1}+\vartheta_{h,\;1})\nonumber\;+\\ & {\eta}^{\mathrm{T}}_{h,\;1}\psi_{h,\;1}+\varepsilon_{h,\;1}-b_{h}\dot{y}_{r})\big)\nonumber\;-\\ &\tilde{\eta}_{h,\;1}\dot{\hat{\eta}}_{h,\;1}+\vartheta_{h,\;1}\dot{\vartheta}_{h,\;1} \end{split} $$ 基于杨氏不等式, 可得
$$ \begin{align} \left\{\begin{aligned} &(b_{h}+d_{h})s_{h,\;1}e_{h,\;2}\le\frac{(b_{h}+d_{h})^{2}}{2}s^{2}_{h,\;1}+\frac{1}{2}e^{2}_{h,\;2}\nonumber\\&(b_{h}+d_{h})s_{h,\;1}\vartheta_{h,\;1}\le\frac{(b_{h}+d_{h})^{2}}{2}s^{2}_{h,\;1}+\frac{1}{2}\vartheta^{2}_{h,\;1}\nonumber\\ &s_{h,\;1}\varepsilon_{h,\;1}\le\frac{1}{2}s^{2}_{h,\;1}+\frac{1}{2}\varepsilon^{2}_{h,\;1} \end{aligned}\right. \end{align} $$ 设计虚拟控制器为
$$ \begin{split} \alpha_{h,\;1}=\; &\rho^{-1}_{h}\bigg(-(b_{h}+d_{h})s_{h,\;1}-\frac{1}{2(b_{h}+d_{h})}s_{h,\;1}\;-\\ & c_{h,\;1}s_{h,\;1}+\frac{b_{h}}{b_{h}+d_{h}}\dot{y}_{r}-\frac{1}{b_{h}+d_{h}}\hat{\eta}_{h,\;1}\psi_{h,\;1}\bigg) \end{split} $$ (30) 其中, $ c_{h,\;1} $ 是设计参数. $ \hat{\eta}_{h,\;1} $ 是 $ {\eta}_{h,\;1} $的估计值.
在触发时刻, 系统会更新自适应律, 并且在触发间隔区间中保持不变. 因此, 设计自适应律为
$$ \left\{\begin{aligned} \hat{\eta}^+_{h,\;1}&=-N_{h,\;1}\hat{\eta}_{h,\;1}+s_{h,\;1}\psi_{h,\;1},&& t=t_{h,\;k}\\ \dot{\hat{\eta}}_{h,\;1}&=0,&&t\in[t_{h,\;k},\;t_{h,\;k+1}) \end{aligned} \right. $$ (31) 其中, $ N_{h,\;1} $ 是设计参数.
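式(31)的自适应律在触发时刻跳变更新、在触发间隔内保持不变, 其实现逻辑可用如下 Python 草图示意(触发时刻与信号取值均为假设的演示值):

```python
import numpy as np

def adaptive_law(t_grid, trigger_times, s, psi, N, eta0=0.0):
    """式(31)形式的自适应律: 触发时刻跳变更新 eta^+ = -N*eta + s*psi,
    触发间隔内保持 d(eta)/dt = 0."""
    eta = np.empty_like(t_grid)
    eta[0] = eta0
    triggers = set(trigger_times)
    for k in range(1, len(t_grid)):
        if float(t_grid[k]) in triggers:
            eta[k] = -N * eta[k - 1] + s[k] * psi[k]   # 跳变更新
        else:
            eta[k] = eta[k - 1]                        # 保持不变
    return eta

# 假设的触发时刻与信号取值, 仅作演示
t_grid = np.round(np.arange(0, 1.0, 0.01), 2)
s = np.sin(2 * np.pi * t_grid)
psi = np.ones_like(t_grid)
eta = adaptive_law(t_grid, [0.25, 0.5, 0.75], s, psi, N=0.1)
```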
基于上述分析, 可知
$$ \begin{split} \dot{V}_{h,\;1}&\le-{c}_{h,\;1}s^{2}_{h,\;1}-\frac{1}{2}N_{h,\;1}\tilde{\eta}^{2}_{h,\;1}+\frac{1}{2}N_{h,\;1}{\eta}^{2}_{h,\;1}\;+\nonumber\\ &\quad\frac{1}{2}\rho_{h}\vartheta^{2}_{h,\;1}+\frac{1}{2}\varepsilon^{2}_{h,\;1}\rho_{h}+\frac{1}{2}\rho_{h}e^{2}_{h,\;2}+\vartheta_{h,\;1}\dot{\vartheta}_{h,\;1} \end{split} $$ 随后, 可得
$$ \begin{align} \begin{aligned} \dot{\vartheta}_{h,\;1}=-\frac{1}{\upsilon_{h,\;1}}{\vartheta}_{h,\;1}-\dot{\alpha}_{h,\;1} \end{aligned} \end{align} $$ (32) 其中, $ \upsilon_{h,\;1} $ 是设计常数.
根据上述分析, $ \dot{V}_{h,\;1} $ 满足如下不等式:
$$ \begin{split} \dot{V}_{h,\;1}\le &-c_{h,\;1}s^{2}_{h,\;1}-\frac{1}{2}N_{h,\;1}\tilde{\eta}^{2}_{h,\;1}+\frac{1}{2}N_{h,\;1}{\eta}^{2}_{h,\;1}\;+\\ &\frac{1}{2}\varepsilon^{2}_{h,\;1}\rho_{h}+\frac{1}{2}\rho_{h}e^{2}_{h,\;2}-\frac{1}{\upsilon_{h,\;1}}{\vartheta}^{2}_{h,\;1}\;+\\ &\frac{1}{2}\dot{\alpha}^{2}_{h,\;1}+\frac{1}{2}\rho_{h}\vartheta^{2}_{h,\;1}\\[-1pt] \end{split} $$ (33) 步骤 $ g $ ($ g=2,\;\cdots,\;n-1 $). 选择如下的Lyapunov函数:
$$ \begin{align} \begin{aligned} V_{h,\;g}=\frac{1}{2}e^{2}_{h,\;g}+\frac{1}{2}\eta^\text{T}_{h,\;g}\eta_{h,\;g}+\frac{1}{2}\vartheta^{2}_{h,\;g} \end{aligned} \end{align} $$ (34) 计算$ V_{h,\;g} $的导数为
$$ \begin{split} \dot{V}_{h,\;g}=\;&e_{h,\;g}\dot{e}_{h,\;g}-\tilde{\eta}_{h,\;g}\dot{\hat{\eta}}_{h,\;g}+\vartheta_{h,\;g}\dot{\vartheta}_{h,\;g}=\\ &e_{h,\;g}(\dot{\hat{x}}_{h,\;g}-\dot{\alpha}_{hf,\;g-1})\;-\\ & \tilde{\eta}_{h,\;g}\dot{\hat{\eta}}_{h,\;g}+\vartheta_{h,\;g}\dot{\vartheta}_{h,\;g} \end{split} $$ (35) 基于式 (15), 式 (35)重新表示为
$$ \begin{split} \dot{V}_{h,\;g}=\;&e_{h,\;g}\big(\hat{x}_{h,\;g+1}+\eta_{h,\;g}\psi_{h,\;g}+k^{*}_{h,\;g}\big(d_{h}(y_{h}-y_{\ell})\nonumber\;-\\ & d_{h}(\hat{y}_{h}-y_{\ell}) \big) -\dot{\alpha}_{hf,\;g-1} \big) -\tilde{\eta}_{h,\;g}\dot{\hat{\eta}}_{h,\;g}\nonumber + \vartheta_{h,\;g}\dot{\vartheta}_{h,\;g} \end{split} $$ 其中, $ \hat{x}_{h,\;g+1}=e_{h,\;g+1}+\alpha_{hf,\;g} $.
随后, 可得
$$\begin{aligned} \dot{V}_{h,\;g}=\;&e_{h,\;g}\big(e_{h,\;g+1}+\alpha_{hf,\;g}+\eta_{h,\;g}\psi_{h,\;g}\;+\\ & k^{*}_{h,\;g}\big(d_{h}(y_{h}\nonumber-y_{\ell})-d_{h}(\hat{y}_{h}-y_{\ell})\big)-\dot{\alpha}_{hf,\;g-1}\big)\;-\\ & \tilde{\eta}_{h,\;g}\dot{\hat{\eta}}_{h,\;g}\nonumber+\vartheta_{h,\;g}\dot{\vartheta}_{h,\;g}\nonumber=e_{h,\;g}\big(e_{h,\;g+1}+\alpha_{h,\;g}\;+\\ & {\vartheta}_{h,\;g}+\eta_{h,\;g}\psi_{h,\;g}\nonumber+k^{*}_{h,\;g}\big(d_{h}(y_{h}-y_{\ell})\;-\\ &d_{h}(\hat{y}_{h} - y_{\ell}) \big)\nonumber-\dot{\alpha}_{hf,\;g-1}\big) -\tilde{\eta}_{h,\;g}\dot{\hat{\eta}}_{h,\;g} +\vartheta_{h,\;g}\dot{\vartheta}_{h,\;g} \end{aligned} $$ 基于杨氏不等式, 可知
$$ \begin{align} \begin{aligned} e_{h,\;g}\vartheta_{h,\;g}\le\frac{1}{2}e^{2}_{h,\;g}+\frac{1}{2}\vartheta^{2}_{h,\;g} \end{aligned} \end{align} $$ (36) 设计虚拟控制器为
$$ \begin{split} \alpha_{h,\;g}=\; &-k^{*}_{h,\;g}\big(d_{h}(y_{h}-y_{\ell})-d_{h}(\hat{y}_{h}-y_{\ell})\big)\;-\\ &\hat{\eta}_{h,\;g}\psi_{h,\;g}-\left(\frac{1}{2}+c_{h,\;g}\right)e_{h,\;g}+\dot{\alpha}_{hf,\;g-1} \end{split} $$ (37) 其中, $ c_{h,\;g} $ 是设计常数.
设计自适应律为
$$ \begin{align} \left\{ \begin{aligned} \hat{\eta}^+_{h,\;g}&=-N_{h,\;g}\hat{\eta}_{h,\;g}+e_{h,\;g}\psi_{h,\;g},&& t=t_{h,\;k}\\ \dot{\hat{\eta}}_{h,\;g}&=0,&& t\in[t_{h,\;k},\;t_{h,\;k+1}) \end{aligned} \right. \end{align} $$ (38) 并且, 可知
$$ \begin{align} \begin{aligned} \dot{\vartheta}_{h,\;g}=-\frac{1}{\upsilon_{h,\;g}}{\vartheta}_{h,\;g}-\dot{\alpha}_{h,\;g} \end{aligned} \end{align} $$ (39) 其中, $ \upsilon_{h,\;g} $ 是一个设计常数.
接下来, $ \dot{V}_{h,\;g} $ 满足如下不等式:
$$ \begin{split} \dot{V}_{h,\;g}\le&-{c}_{h,\;g}e^{2}_{h,\;g}-\frac{1}{2}N_{h,\;g}\tilde{\eta}^{2}_{h,\;g}+\frac{1}{2}N_{h,\;g}{\eta}^{2}_{h,\;g}\;+\\ &\frac{1}{2}\vartheta^{2}_{h,\;g}-\frac{1}{\upsilon_{h,\;g}}{\vartheta}^{2}_{h,\;g}+\frac{1}{2}\dot{\alpha}^{2}_{h,\;g} \\[-1pt]\end{split} $$ (40) 步骤 $ n $. 应用一阶滤波器, 可得
$$\left\{\begin{split} &\Phi_{h,\;n-1}\dot{\alpha}_{hf,\;n-1}+{\alpha}_{hf,\;n-1}={\alpha}_{h,\;n-1}\\ &{\alpha}_{hf,\;n-1}(0)={\alpha}_{h,\;n-1}(0) \end{split}\right. $$ (41) 其中, $ \alpha_{h,\;n-1} $ 是虚拟控制信号. 虚拟控制信号$ \alpha_{h,\;n-1} $ 通过一阶滤波器 $ \Phi_{h,\;n-1}>0 $ 会产生一个新的信号 $ {\alpha}_{hf,\;n-1} $.
挑选如下的Lyapunov函数:
$$ \begin{align} \begin{aligned} V_{h,\;n}=\frac{1}{2}e^{2}_{h,\;n}+\frac{1}{2}\eta^\text{T}_{h,\;n}\eta_{h,\;n} \end{aligned} \end{align} $$ (42) 计算$ V_{h,\;n} $的导数为
$$ \begin{split} \label{lin} \begin{aligned} \dot{V}_{h,\;n}=\;&e_{h,\;n}\dot{e}_{h,\;n}-\tilde{\eta}_{h,\;n}\dot{\hat{\eta}}_{h,\;n}\nonumber=\\ &e_{h,\;n}(\dot{\hat{x}}_{h,\;n}-\dot{\alpha}_{hf,\;n-1})-\tilde{\eta}_{h,\;n}\dot{\hat{\eta}}_{h,\;n} \end{aligned} \end{split} $$ 随后, 可得
$$ \begin{split} \dot{V}_{h,\;n}=\;&e_{h,\;n}\Bigg(u_{h}+{\eta}_{h,\;n}\psi_{h,\;n}(\hat{\bar{x}}_{h,\;n})\;+\\ &k^{*}_{h,\;n}\Biggr(\sum_{\ell=1}^N a_{h,\;\ell}(y_{h}-y_{\ell})-\sum_{\ell=1}^N a_{h,\;\ell}(\hat{y}_{h}\nonumber\;-\\ &y_{\ell})\Biggr)-\dot{\alpha}_{hf,\;n-1}\Bigg)-\tilde{\eta}_{h,\;n}\dot{\hat{\eta}}_{h,\;n}\nonumber=\\ &e_{h,\;n}\Bigg(\breve{u}_{h}+(u_{h}-\breve{u}_{h})+\eta^\text{T}_{h,\;n}\psi_{h,\;n}(\hat{\bar{x}}_{h,\;n})\;+\\ & k^{*}_{h,\;n}\Biggr(\sum_{\ell=1}^N a_{h,\;\ell}(y_{h}-y_{\ell})-\sum_{\ell=1}^N a_{h,\;\ell}(\hat{y}_{h}\nonumber\;-\\ & y_{\ell})\Biggr)-\dot{\alpha}_{hf,\;n-1}\Bigg)-\tilde{\eta}_{h,\;n}\dot{\hat{\eta}}_{h,\;n} \end{split} $$ 设计自适应事件触发控制器 $ \breve{u}_{h} $ 为
$$ \begin{split} \breve{u}_{h}=\;&-c_{h,\;n}\breve{e}_{h,\;n}-\hat{\eta}_{h,\;n}\psi_{h,\;n}\;+\\ &\frac{\breve{\alpha}_{h,\;n-1}-\breve{\alpha}_{hf,\;n-1}}{\Phi_{h,\;n-1}}- k^{*}_{h,\;n}\Biggr(\sum_{\ell=1}^Na_{h,\;\ell}(y_{h}-y_{\ell})\;-\\ &\sum_{\ell=1}^N a_{h,\;\ell}(\hat{y}_{h}-y_{\ell})\Biggr)\\[-1pt]\end{split} $$ (43) 其中,
$$ \begin{split} &\breve{e}_{h,\;n}(t)=\breve{\hat{x}}_{h,\;n}(t)-\breve{\alpha}_{hf,\;n-1}(t)\nonumber\\ &\breve{\alpha}_{hf,\;n-1}(t)={\alpha}_{hf,\;n-1}(t^{\alpha_{f}}_{h,\;k}),\quad t\in[t^{\alpha_{f}}_{h,\;k},\;t^{\alpha_{f}}_{h,\;k+1})\nonumber \end{split} $$ $$ \begin{split} t^{\alpha_{f}}_{h,\;k+1}=\;&\;\inf\Bigr\{t>t^{\alpha_{f}}_{h,\;k}:|{\dot{\alpha}}_{hf,\;n-1}(t)\;-\\ &{\dot{\breve{\alpha}}}_{hf,\;n-1}(t)|\ge\Theta_{h,\;\alpha_{f}}\Bigr\} \end{split} $$ (44) $ \breve{\alpha}_{h,\;n-1} $ 是 $ \alpha_{h,\;n-1} $ 触发后的信号. $ \breve{\alpha}_{hf,\;n-1} $ 是 $ \alpha_{hf,\;n-1} $ 触发后的信号. 在本文考虑的状态触发机制中, $ {\alpha}_{hf,\;n-1} $ 和 $ {\alpha}_{h,\;n-1} $ 均依赖于系统的状态值.
选择自适应律为
$$ \left\{\begin{split}& \hat{\eta}^+_{h,\;n}=-N_{h,\;n}\hat{\eta}_{h,\;n}+e_{h,\;n}\psi_{h,\;n}, \;\; t=t_{h,\;k}\\ &\dot{\hat{\eta}}_{h,\;n}=0,\qquad\qquad\qquad\qquad\qquad\; t\in[t_{h,\;k},\;t_{h,\;k+1}) \end{split} \right.$$ (45) 其中, $ N_{h,\;n} $ 是设计参数.
随后, 可知
$$ \begin{split} \dot{V}_{h,\;n}=\;&e_{h,\;n}\big((u_{h}-\breve{u}_{h})+{c}_{h,\;n}(e_{h,\;n}-\breve{e}_{h,\;n})\;+\\ & \tilde{\eta}_{h,\;n}\psi_{h,\;n}\nonumber+\frac{\breve{\alpha}_{h,\;n-1}-\breve{\alpha}_{hf,\;n-1}}{\Phi_{h,\;n-1}}\;-\\&\frac{{\alpha}_{h,\;n-1}-{\alpha}_{hf,\;n-1}}{\Phi_{h,\;n-1}}\nonumber\;-\\&{c}_{h,\;n}e_{h,\;n}\big)-\tilde{\eta}_{h,\;n}\dot{\hat{\eta}}_{h,\;n} \end{split} $$ 引理4[34]. 触发误差的上界可以表示为
$$ \begin{split} &|e_{h,\;n}-\breve{e}_{h,\;n}|\le\bar{\Theta}_{h}\nonumber\\ &\qquad\quad\left|\frac{\breve{\alpha}_{h,\;n-1}-\breve{\alpha}_{hf,\;n-1}}{\Phi_{h,\;n-1}}-\frac{{\alpha}_{h,\;n-1}-{\alpha}_{hf,\;n-1}}{\Phi_{h,\;n-1}}\right|\le \\ &\qquad\quad \frac{\Theta_{h,\;\alpha_{f}}+\Theta_{hf,\;n-1}}{\Phi_{h,\;n-1}} \end{split} $$ 接下来, 可得
$$ \begin{split} \dot{V}_{h,\;n}\le\;&-c_{h,\;n}e^{2}_{h,\;n}-\frac{1}{2}N_{h,\;n}\tilde{\eta}^{2}_{h,\;n}\;+\\ &\frac{1}{2}N_{h,\;n}{\eta}^{2}_{h,\;n}+\varsigma_{h,\;n} \end{split} $$ (46) 其中, $ \varsigma_{h,\;n}=\bar{\Theta}^{2}_{h}+\frac{{(\Theta_{h,\;\alpha_{f}}+\Theta_{hf,\;n-1})}^{2}}{\Phi^{2}_{h,\;n-1}}+p^{2}_{h}+\rho^{2}_{h} $.
注 4. 针对控制器−执行器环节和传感器−控制器环节, 本文设计混合双端事件触发机制, 可同时缓解双信道的通讯负担. 首先, 考虑在传感器−控制器环节上设置事件触发机制. 因为每次传输闭环系统的信息时, 输入信号都是根据输出反馈结果设置的, 所以输出信号的有效更新和更新次数是需要考虑的重要环节. 另外, 考虑控制器−执行器环节的资源节约. 通过在输入信号中设置触发项并进一步设置事件采样所需的条件, 实现节省通信资源的目的. 在多智能体系统或分布式系统中, 通信资源通常是有限的. 双端事件触发机制需要在有限的通信条件下, 确保信息的及时传输和系统的协调运行. 因此, 可以通过调整参数$ \nu_{h} $, $ m_{h} $, $ b_{h} $, $ \rho_{h} $, $ \mu_{h} $, $ \tau_{h} $ 和 $ \Theta_{h,\;\alpha_{f}} $实现对系统实时性能和通讯带宽占用率的有效平衡.
3.2 稳定性分析
定理1. 在假设1和假设 2下, 针对非线性多智能体系统 (1), 考虑混合双端事件触发机制 (10), (11), (44) 和模糊状态观测器(15), 设计自适应律 (31), (38) 和 (45), 虚拟控制器 (30), (37) 和分布式控制器 (43)可以使得闭环系统内的所有信号是半全局一致最终有界的.
此外, 对于 $ \forall \chi>0 $, 设计的参数满足
$$ \begin{align} \begin{aligned} \underset{t\to\infty}\lim||y-{y}_{r}||\leq\chi \end{aligned} \end{align} $$ (47) 证明. 为证明整体闭环系统的稳定性, 选择总Lyapunov函数为
$$ \begin{align} \begin{aligned} V_{h}=\sum_{m=1}^n V_{h,\;m}+V_{h,\;0} \end{aligned} \end{align} $$ (48) 根据式 (22), (33), (40) 和 (46), 可得
$$ \begin{split} \dot{V}_{h}\le\;&-\xi_{0}||\Delta_{h}||^{2} + \frac{1}{2}||Y_{h}||^{2}\tilde{\eta}^\text{T}_{h,\;n}\tilde{\eta}_{h,\;n} + \frac{1}{2}||Y_{h}||^{2}||\varepsilon^{*}_{h}||^{2}\;+\\ &\frac{1}{2}||Y_{h}||^{2}\sum_{m=1}^{n} A_{h,\;m} \tilde{\eta}^\text{T}_{h,\;m}\tilde{\eta}_{h,\;m}+\frac{1}{2}\varepsilon^{2}_{h,\;1}\rho_{h}\;-\\ &\sum_{m=1}^{n}\frac{1}{2}N_{h,\;m}\tilde{\eta}^{2}_{h,\;m}+\sum_{m=1}^{n}\frac{1}{2}N_{h,\;m}{\eta}^{2}_{h,\;m}\;+\\ &\sum_{m=1}^{n-1}\frac{1}{2}\dot{\alpha}^{2}_{h,\;m}-\sum_{m=2}^{n-1}\left(\frac{1}{\upsilon_{h,\;m}}-\frac{1}{2}\right){\vartheta}^{2}_{h,\;m}\;-\\ & {c}_{h,\;1}s^{2}_{h,\;1}-\sum_{m=2}^{n-1} {c}_{h,\;m}e^{2}_{h,\;m}-{c}_{h,\;n}e^{2}_{h,\;n}+\varsigma_{h,\;n}\;-\\ & \left(\frac{1}{\upsilon_{h,\;1}}-\frac{1}{2}\rho_{h}\right){\vartheta}^{2}_{h,\;1}\le-\partial_{h}V_{h}+\omega_{h}\\[-1pt] \end{split} $$ (49) 其中,
$$ \begin{split} \partial_{h}=\;&\;\min\bigg\{ \bar{c}_{h,\;1},\; \bar{c}_{h,\;m},\; {c}_{h,\;n},\; \frac{N_{h,\;m}}{2},\;\\& \left(\frac{1}{\upsilon_{h,\;1}}-\frac{1}{2}\right),\; \left(\frac{1}{\upsilon_{h,\;1}}-\frac{\rho_{h}}{2}\right) \bigg\} \end{split} $$ 且
$$ \begin{split} \omega_{h}=\;&\;\sum_{m=1}^{n}\frac{1}{2}N_{h,\;m}{\eta}^{2}_{h,\;m}+\frac{1}{2}\varepsilon^{2}_{h,\;1}\rho_{h}\;+\\ &\varsigma_{h,\;n}+\sum_{m=1}^{n-1}\frac{1}{2}\dot{\alpha}^{2}_{h,\;m}\nonumber \end{split} $$ 在不等式(49)两边同时乘以 $ \text{e}^{\partial_{h}t} $, 并在定义域 $ [0,\;t] $ 上同时积分, 可得
$$ \begin{align} \begin{aligned} 0\le V(t)\le \text{e}^{-\partial_{h}t}V(0)+\frac{\omega_{h}}{\partial_{h}}(1-\text{e}^{-\partial_{h}t}) \end{aligned} \end{align} $$ (50) 根据 $ V $ 的定义和式(50), 可知
$$ \begin{align} \begin{aligned} ||\bar{s}_{.1}||^2\le2\text{e}^{-\partial_{h}t}V(0)+\frac{2\omega_{h}}{\partial_{h}}(1-\text{e}^{-\partial_{h}t}) \end{aligned} \end{align} $$ (51) 针对 $ \forall\chi>0 $, 基于 $ \partial_{h} $ 和 $ \omega_{h} $ 的定义, 选择合适的参数, 可得如下关系式
$$ \begin{align} \frac{\omega_{h}}{\partial_{h}}\le\frac{\chi^{2}}{2}(\sigma({\cal{L}}+{\cal{B}}))^2 \end{align} $$ (52) 并且, 根据引理2, 当 $ t\to\infty $ 时, 不等式 (47) 成立.
□

3.3 芝诺行为分析
本节需要证明提出的事件触发机制的间隔时间是有下界的, 即同时排除芝诺行为发生的可能性. 对于 $ \varkappa=1,\;\cdots,\;n $, 定义
$$\left\{\begin{split} &\sigma_{h,\;\varkappa}(t)=\hat{x}_{h,\;\varkappa}(t)-\breve{\hat{x}}_{h,\;\varkappa}(t),\; \quad t\in[t^{x}_{h,\;k},\;t^{x}_{h,\;k+1})\nonumber\\ &\sigma_{h,\;u}(t)=u_{h}(t)-\breve{u}_{h}(t),\; \quad\quad\quad t\in[t^{u}_{h,\;k},\;t^{u}_{h,\;k+1})\nonumber\\ &\sigma_{hf,\;\varkappa-1}(t)={\dot{\alpha}}_{hf,\;\varkappa-1}(t)-{\dot{\breve{\alpha}}}_{hf,\;\varkappa-1}(t),\;\\ &\qquad\qquad\qquad\qquad\qquad\qquad\quad\quad t\in[t^{\alpha_{f}}_{h,\;k},\;t^{\alpha_{f}}_{h,\;k+1})\nonumber \end{split}\right. $$ 计算以上变量的导数值为
$$ \begin{align} \begin{aligned} &\frac{{\rm{d}}}{{\rm{d}}t}|\sigma_{h,\;\varkappa}|=\frac{{\rm{d}}}{{\rm{d}}t}(\sigma_{h,\;\varkappa}\sigma_{h,\;\varkappa})^{\frac{1}{2}}\nonumber=\\ &\qquad\hbox{sign}(\hat{x}_{h,\;\varkappa}-\breve{\hat{x}}_{h,\;\varkappa})\dot{\hat{x}}_{h,\;\varkappa}\le|\dot{\hat{x}}_{h,\;\varkappa}|\nonumber\\& \frac{{\rm{d}}}{{\rm{d}}t}|\sigma_{h,\;u}|=\frac{{\rm{d}}}{{\rm{d}}t}(\sigma_{h,\;u}\sigma_{h,\;u})^{\frac{1}{2}}\nonumber=\\ &\qquad\hbox{sign}(u_{h}-\breve{u}_{h})\dot{u}_{h}\le|\dot{u}_{h}|\nonumber\\ &\frac{{\rm{d}}}{{\rm{d}}t}|\sigma_{hf,\;\varkappa-1}|=\frac{{\rm{d}}}{{\rm{d}}t}(\sigma_{hf,\;\varkappa-1}\sigma_{hf,\;\varkappa-1})^{\frac{1}{2}}\nonumber=\\ &\qquad\hbox{sign}(\dot{\alpha}_{hf,\;\varkappa-1}-\dot{\breve{\alpha}}_{hf,\;\varkappa-1})\ddot{\alpha}_{hf,\;\varkappa-1}\nonumber \le\\ &\qquad|\ddot{\alpha}_{hf,\;\varkappa-1}| \end{aligned} \end{align} $$ 其中, $ \dot{u}_{h} $和$ \ddot{\alpha}_{hf,\;\varkappa-1} $与$ \hat{x}_{h,\;g} $, $ e_{h,\;g} $, $ \tilde{\eta}_{h,\;g} $和$ \dot{\alpha}_{h,\;g} $信号有关.
以上变量满足如下的关系式: $ |\dot{\hat{x}}_{h,\;\varkappa}|\;\le\;\digamma_{h} $, $ |\dot{u}_{h}|\le\digamma_{\breve{u}_{h}} $ 和 $ |\ddot{\alpha}_{hf,\;\varkappa-1}|\le\digamma_{h,\;\alpha_{f}} $. 值得注意的是, 在 $ t^{x}_{h,\;k} $, $ t^{u}_{h,\;k} $ 和 $ t^{\alpha_{f}}_{h,\;k} $ 时刻, 三个变量满足 $ \sigma_{h,\,\varkappa}(t^{x}_{h,\,k}) = 0 $, $ \sigma_{h,\;u}(t^{u}_{h,\;k})=0 $ 和 $ \sigma_{hf,\;\varkappa-1}(t^{\alpha_{f}}_{h,\;k})=0 $.
随后, 可得
$$\begin{split} &\lim_{t\rightarrow t^{x-}_{h,\;k+1}}\sigma_{h,\;\varkappa}(t)=\nu_{h}+ m_{h}\text{e}^{-b_{h}t^{x}_{h,\;k}} \\ &\lim_{t\rightarrow t^{u-}_{h,\;k+1}} \sigma_{h,\;u}(t)=\rho_{h}+\mu_{h}\text{e}^{-\tau_{h}t^{u}_{h,\;k}}\\ &\lim_{t\rightarrow t^{\alpha_{f}-}_{h,\;k+1}} \sigma_{hf,\;\varkappa-1}(t)=\Theta_{h,\;\alpha_{f}} \end{split} $$ 因此, 结合上述导数的上界, 得到本文提出的事件触发机制间隔的最小界限值为
$$ \begin{align} t^{x}_{h,\;k+1}-t^{x}_{h,\;k}\ge\frac{\nu_{h}+m_{h}\text{e}^{-b_{h}t^{x}_{h,\;k}}}{\digamma_{h}} \end{align} $$ (53) $$ \begin{align} t^{u}_{h,\;k+1}-t^{u}_{h,\;k}\ge\frac{\rho_{h}+\mu_{h}\text{e}^{-\tau_{h}t^{u}_{h,\;k}}}{\digamma_{\breve{u}_{h}}} \end{align} $$ (54) $$ \begin{align} t^{\alpha_{f}}_{h,\;k+1}-t^{\alpha_{f}}_{h,\;k}\ge\frac{\Theta_{h,\;\alpha_{f}}}{\digamma_{h,\;\alpha_{f}}} \end{align} $$ (55) 通过以上分析可以得出, 本文提出的事件触发机制不会发生芝诺行为.
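上述排除芝诺行为的论证可以用一段简短的数值示例加以核验. 以下 Python 片段为示意性草图 (阈值参数 $\nu_h,\;m_h,\;b_h$ 仿照第4节的仿真取值, 误差最大增长率 $F$ 为假设值, 并非本文控制器的真实界): 令采样误差在最坏情况下以速率 $F$ 增长, 当其达到阈值 $\nu_h+m_h{\rm e}^{-b_h t^{x}_{h,k}}$ 时触发并清零, 统计得到的触发间隔均不小于式(53)型的下界.

```python
import math

# 示意性参数 (假设值): 阈值 nu + m*exp(-b*t_last), 误差最大增长率 F
nu, m, b, F = 0.1, 0.1, 0.1, 2.0
dt = 1e-4                 # 数值积分步长
t, sigma, t_last = 0.0, 0.0, 0.0
intervals = []            # 记录相邻两次触发之间的间隔
while t < 10.0:
    sigma += F * dt       # 最坏情况: 采样误差以最大速率 F 增长
    t += dt
    if sigma >= nu + m * math.exp(-b * t_last):  # 事件触发条件
        intervals.append(t - t_last)
        t_last, sigma = t, 0.0                   # 触发后误差清零
# 由于指数项非负, nu/F 是所有触发间隔的一致正下界 (对应式(53))
lower_bound = nu / F
assert intervals and all(tau >= lower_bound for tau in intervals)
```

该示例表明: 只要误差增长率有界且阈值下界为正常数 $\nu_h$, 触发间隔就被 $\nu_h/F$ 一致地隔离在零之外, 与正文结论一致.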
4. 仿真结果
本节通过若干仿真结果验证所提控制方案的有效性.
图1是本文所考虑的通讯拓扑结构, 其表明4个跟随者与1个领导者之间的信息传输关系. 基于图1, 邻接矩阵 $ \bar{{\cal{A}}} $ 和拉普拉斯矩阵 $ {\cal{L}} $ 表示如下:
$$ \begin{align} \begin{aligned} \bar{{\cal{A}}}= \begin{bmatrix} 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0\\ 0&1&1&0 \end{bmatrix} ,\; {\cal{L}}= \begin{bmatrix} 1&0&0&-1\\ -1&1&0&0\\ 0&-1&1&0\\ 0&-1&-1&2 \end{bmatrix} \end{aligned} \end{align} $$ 对于 $ h=1,\;2,\;3,\;4 $, 挑选一组强阻尼系统, 其动态模型为
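上述两个矩阵的关系可以程序化地核验: 由图1对应的邻接矩阵 $\bar{\cal A}$, 按 ${\cal L}={\cal D}-\bar{\cal A}$ (其中 ${\cal D}$ 为各跟随者入度构成的对角阵) 即可得到文中的拉普拉斯矩阵. 以下 Python 片段仅作核验用途:

```python
# 图1对应的邻接矩阵: A[i][j] = 1 表示跟随者 i+1 能收到智能体 j+1 的信息
A = [[0, 0, 0, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 1, 1, 0]]
n = len(A)
deg = [sum(row) for row in A]                       # 每个跟随者的入度
Lap = [[(deg[i] if i == j else 0) - A[i][j]         # L = D - A
        for j in range(n)] for i in range(n)]
expected = [[1, 0, 0, -1],
            [-1, 1, 0, 0],
            [0, -1, 1, 0],
            [0, -1, -1, 2]]
assert Lap == expected                              # 与文中矩阵一致
assert all(sum(row) == 0 for row in Lap)            # 拉普拉斯矩阵行和为零
```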
$$ \begin{align} \begin{aligned} \ddot{\Lambda}_{h}=-\frac{\pi_{h}}{M_{h}L^{2}_{h}}\dot{\Lambda}_{h}-\frac{v_{h}\bar{g}}{L_{h}}\sin(\Lambda_{h})+u_{h} \end{aligned} \end{align} $$ (56) 其中, $ L_{h} $ 是摆长; $ \Lambda_{h} $ 是从垂直向下的位置逆时针测量的杆的角度; $ u_{h} $ 表示传动转矩; $ \bar{g} $ 为重力加速度; $ v_{h} $ 定义为恢复转矩系数; $ \pi_{h} $ 为阻尼系数; $ M_{h} $ 为钟摆的质量.
针对$ h = 1,\;2,\;3,\;4 $, 设$ x_{h,\;1} = \Lambda_{h} $ 和 $ {x}_{h,\;2}=\dot{\Lambda}_{h} $, 其动力学方程可转化为
$$ \left\{\begin{aligned} &\dot{x}_{h,\;1}=x_{h,\;2}\\ &\dot{x}_{h,\;2}=u_{h}-\frac{\pi_{h}}{M_{h}L^{2}_{h}}x_{h,\;2}-\frac{v_{h}\bar{g}}{L_{h}}\sin(x_{h,\;1})\\ & y_{h}=x_{h,\;1} \end{aligned} \right. $$ 强阻尼系统的初始值选择为 $ x_{h}(0)=[0.1,\;0.1]^\text{T} $. 自适应参数的初始值为 $ \hat{\eta}_{h,\;1}(0)=0.1 $ 和 $ \hat{\eta}_{h,\;2}(0)=0.1 $. 参数值分别选取为 $ e_{h,\;1,\;0,\;\min}=0.5 $, $ e_{h,\;1,\;\infty,\;\max}=0.5 $, $ e_{h,\;1,\;\infty,\;\min}=0.3 $, $ e_{h,\;1,\;0,\;\max}=0.5 $, $ o_{h}=-1 $, $ c_{h,\;1}=c_{h,\;2}=50 $, $ \phi_{h,\;2}=2 $, $ N_{h,\;1}=N_{h,\;2}=0.1 $, $ p_{h}=0.1 $, $ m_{h}=0.1 $, $ b_{h}=0.1 $, $ \rho_{h}=10 $, $ \mu_{h}=2 $, $ \tau_{h}=2 $, $ M_{h}=1 $, $ L_{h}=1 $, $ \bar{g}=9.8 $, $ v_{h}=\frac{1}{9.8} $ 和 $ \pi_{h}=-0.25 $.
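为直观核对上述状态空间模型的实现, 下面给出一段示意性的 Python 仿真 (采用简单的欧拉离散化; 其中状态反馈 $u=-k_1x_1-k_2x_2$ 及增益 $k_1,\;k_2$ 均为假设的占位控制器, 并非本文的自适应模糊控制方案, 仅用于说明模型在镇定输入下状态有界并收敛):

```python
import math

# 第4节给定的模型参数
M, L, g_bar = 1.0, 1.0, 9.8
v, pi_h = 1 / 9.8, -0.25
# 假设的反馈增益 (占位控制器, 非本文方案)
k1, k2 = 5.0, 5.0
x1, x2 = 0.1, 0.1          # 初始值 x_h(0) = [0.1, 0.1]^T
dt = 1e-3
for _ in range(10000):     # 欧拉法仿真 10 s
    u = -k1 * x1 - k2 * x2
    dx1 = x2
    dx2 = u - pi_h / (M * L ** 2) * x2 - v * g_bar / L * math.sin(x1)
    x1 += dx1 * dt
    x2 += dx2 * dt
assert abs(x1) < 0.01 and abs(x2) < 0.01   # 状态收敛到原点附近
```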
图2表明4个跟随者的输出轨迹与领导者的既定轨迹保持一致, 设计的控制算法可使强阻尼系统的输出稳定地跟踪给定的参考信号. 由此可见, 所考虑的控制目标得以实现, 控制策略有效地达成分布式强阻尼系统的一致性目标.
在图3中, 一致性误差被限制在预设的范围内, 误差输出轨迹始终处于设计的规定性能边界$ (-0.5,\;0.3) $之内, 表明良好的规定性能效果. 图4为双端触发框架下的控制输入曲线, 从中可以看出输入信号$ u_{h} $是有界的.
图5和图6展示了模糊逻辑系统权重参数的变化过程. 其中, $ \hat{\eta}_{h,\;g} $ 表示模糊逻辑系统的权重, 用来调整模糊逻辑系统对非线性函数$ \zeta_{h,\;g}(\bar{x}_{h,\;g}) $的逼近效果. 结果表明本文所有自适应律参数均有界; 结合上文展示的控制性能, 说明本文控制框架对实际系统具备有效性与适用性.
图7为观测误差的数值变化轨迹. 同时, 依据前文的设计过程, 可知所考虑的观测器仅使用相对输出信息进行反馈, 能够大幅度提高观测器的实用性. 图8展示了所提出事件触发机制的触发间隔, 同时也说明所提出的混合双端事件触发机制具有节省控制器环节通讯资源的优势. 以智能体1为例, 正常迭代次数为3000次, 经过本文事件触发机制的设计后, 控制器的更新次数为908次, 节省了69.7%的通讯资源. 针对多智能体系统, 智能体之间在通讯网络下进行信息传输, 当智能体数量增多时, 必然造成一定程度的通讯压力, 由此可见本文设计事件触发机制的重要性. 并且, 针对多渠道通讯网络, 本文同时降低了控制器−执行器环节和传感器−控制器环节的通讯负担.
5. 结束语
本文研究了双端事件触发自适应模糊跟踪控制问题. 针对控制器−执行器和传感器−控制器环节, 提出基于状态触发机制和控制器触发机制的混合双端分布式事件触发机制, 并设计一种改进的状态触发机制, 首次将估计的状态信号作为触发信号, 以达到节约通讯资源的目的. 最后, 仿真结果验证了所提控制方案的有效性. 在未来的研究工作中, 我们将致力于探索电力系统的控制需求, 并将多种事件触发控制策略与实际系统的需要相融合, 以满足智能化、高效化与绿色化的能源转型目标.
-
表 A1 代表性追逃博弈文献分类
Table A1 Classification of representative literature on PE games
求解方法 文献 动态模型 追逃双方数量 维度 信息完整程度 状态空间 一阶积分器 二阶积分器 独轮车 其他 一对一 多对一 多对多 二维平面 三维空间 完全信息 不完全信息 连续 离散 理论求解法 [60] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [57] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [169] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [96] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ 数值求解法 [73] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [6] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [86] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [49] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [44] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ 积分强化学习法 [147] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [141] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [46] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [15] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [40] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [146] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ 几何法 [157] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [95] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [155] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [161] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [105] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [23] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [31] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ [39] $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ $\checkmark $ -
[1] Domenici P, Blagburn J M, Bacon J P. Animal escapology Ⅱ: Escape trajectory case studies. Journal of Experimental Biology, 2011, 214(15): 2474−2494 doi: 10.1242/jeb.053801 [2] Tay N E, Warburton N M, Moseby K E, Fleming P A. Predator escape behaviour in threatened marsupials. Animal Conservation, 2023, 26(4): 587−601 doi: 10.1111/acv.12847 [3] FitzGibbon C D. The costs and benefits of predator inspection behaviour in Thomson's gazelles. Behavioral Ecology and Sociobiology, 1994, 34(2): 139−148 doi: 10.1007/BF00164184 [4] Scheel D, Packer C. Group hunting behaviour of lions: A search for cooperation. Animal Behaviour, 1991, 41(4): 697−709 doi: 10.1016/S0003-3472(05)80907-8 [5] Wang J N, Li G L, Liang L, Wang C Y, Deng F. Pursuit-evasion games of multiple cooperative pursuers and an evader: A biological-inspired perspective. Communications in Nonlinear Science and Numerical Simulation, 2022, 110: Article No. 106386 doi: 10.1016/j.cnsns.2022.106386 [6] Li W. A dynamics perspective of pursuit-evasion: Capturing and escaping when the pursuer runs faster than the agile evader. IEEE Transactions on Automatic Control, 2017, 62(1): 451−457 doi: 10.1109/TAC.2016.2575008 [7] Li W. The confinement-escape problem of a defender against an evader escaping from a circular region. IEEE Transactions on Cybernetics, 2016, 46(4): 1028−1039 doi: 10.1109/TCYB.2015.2503285 [8] Weintraub I E, Pachter M, Garcia E. An introduction to pursuit-evasion differential games. In: Proceedings of the American Control Conference (ACC). Denver, USA: IEEE, 2020. 1049−1066 [9] Mu Z X, Pan J, Zhou Z Y, Yu J Z, Cao L. A survey of the pursuit-evasion problem in swarm intelligence. Frontiers of Information Technology & Electronic Engineering, 2023, 24(8): 1093−1116 [10] Isaacs R. Differential Games: A Mathematical Theory With Applications to Warfare and Pursuit, Control and Optimization. New York: John Wiley & Sons, Inc., 1965. [11] Starr A W, Ho Y C. 
Further properties of nonzero-sum differential games. Journal of Optimization Theory and Applications, 1969, 3(4): 207−219 doi: 10.1007/BF00926523 [12] Ho Y C. Differential games, dynamic optimization, and generalized control theory. Journal of Optimization Theory and Applications, 1970, 6(3): 179−209 doi: 10.1007/BF00926600 [13] 张嗣瀛. 微分对策. 北京: 科学出版社, 1987.Zhang Si-Ying. Differential Games. Beijing: Science Press, 1987. [14] 刘坤, 郑晓帅, 林业茗, 韩乐, 夏元清. 基于微分博弈的追逃问题最优策略设计. 自动化学报, 2021, 47(8): 1840−1854Liu Kun, Zheng Xiao-Shuai, Lin Ye-Ming, Han Le, Xia Yuan-Qing. Design of optimal strategies for the pursuit-evasion problem based on differential game. Acta Automatica Sinica, 2021, 47(8): 1840−1854 [15] 耿远卓, 袁利, 黄煌, 汤亮. 基于终端诱导强化学习的航天器轨道追逃博弈. 自动化学报, 2023, 49(5): 974−984Geng Yuan-Zhuo, Yuan Li, Huang Huang, Tang Liang. Terminal-guidance based reinforcement-learning for orbital pursuit-evasion game of the spacecraft. Acta Automatica Sinica, 2023, 49(5): 974−984 [16] Pontryagin L S. On the theory of differential games. Russian Mathematical Surveys, 1966, 21(4): 193−246 doi: 10.1070/RM1966v021n04ABEH004171 [17] Dobbie J M. Solution of some surveillance-evasion problems by the methods of differential games. In: Proceedings of the International Conference on Operational Research. New York, USA: John Wiley & Sons, 1966. [18] Tan M. Multi-agent reinforcement learning: Independent versus cooperative agents. In: Proceedings of the 10th International Conference on Machine Learning. Amherst, USA: Morgan Kaufmann Publishers Inc., 1993. 330−337 [19] Guy R K. Unsolved problems in combinatorial games. Combinatorics Advances. Boston: Springer, 1995. 161−179 [20] Vamvoudakis K G, Lewis F L. Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica, 2011, 47(8): 1556−1569 doi: 10.1016/j.automatica.2011.03.005 [21] Flynn J. Lion and man: The general case. 
SIAM Journal on Control, 1974, 12(4): 581−597 doi: 10.1137/0312043 [22] Oyler D W, Kabamba P T, Girard A R. Pursuit-evasion games in the presence of a line segment obstacle. In: Proceedings of the 53rd IEEE Conference on Decision and Control (CDC). Los Angeles, USA: IEEE, 2014. 1149−1154 [23] Garcia E, Casbeer D W, Pachter M. Optimal strategies for a class of multi-player reach-avoid differential games in 3D space. IEEE Robotics and Automation Letters, 2020, 5(3): 4257−4264 doi: 10.1109/LRA.2020.2994023 [24] Nath S, Ghose D. A two-phase evasive strategy for a pursuit-evasion problem involving two non-holonomic agents with incomplete information. European Journal of Control, 2022, 68: Article No. 100677 doi: 10.1016/j.ejcon.2022.100677 [25] Oyler D W, Girard A R. Dominance regions in the homicidal chauffeur problem. In: Proceedings of the American Control Conference (ACC). Boston, USA: IEEE, 2016. 2494−2499 [26] Pachter M, Moll A V, Garcia E, Casbeer D, Milutinović D. Cooperative pursuit by multiple pursuers of a single evader. Journal of Aerospace Information Systems, 2020, 17(8): 371−389 doi: 10.2514/1.I010739 [27] Bakolas E, Tsiotras P. Optimal pursuit of moving targets using dynamic Voronoi diagrams. In: Proceedings of the 49th IEEE Conference on Decision and Control (CDC). Atlanta, USA: IEEE, 2010. 7431−7436 [28] Ramana M V, Kothari M. Pursuit-evasion games of high speed evader. Journal of Intelligent & Robotic Systems, 2017, 85: 293−306 [29] Shishika D, Kumar V. Local-game decomposition for multiplayer perimeter-defense problem. In: Proceedings of the IEEE Conference on Decision and Control (CDC). Miami, USA: IEEE, 2018. 2093−2100 [30] Liang L, Deng F, Lu M B, Chen J. Analysis of role switch for cooperative target defense differential game. IEEE Transactions on Automatic Control, 2021, 66(2): 902−909 doi: 10.1109/TAC.2020.2987701 [31] Yan R, Shi Z Y, Zhong Y S. Task assignment for multiplayer reach-avoid games in convex domains via analytical barriers. 
IEEE Transactions on Robotics, 2020, 36(1): 107−124 doi: 10.1109/TRO.2019.2935345 [32] Zhao Y, Tao Q L, Xian C X, Li Z K, Duan Z S. Prescribed-time distributed Nash equilibrium seeking for noncooperation games. Automatica, 2023, 151: Article No. 110933 doi: 10.1016/j.automatica.2023.110933 [33] Xue L, Ye J F, Wu Y B, Liu J, Wunsch D C. Prescribed-time Nash equilibrium seeking for pursuit-evasion game. IEEE/CAA Journal of Automatica Sinica, 2024, 11(6): 1518−1520 doi: 10.1109/JAS.2023.124077 [34] Bakolas E. Evasion from a group of pursuers with double integrator kinematics. In: Proceedings of the 52nd IEEE Conference on Decision and Control. Firenze, Italy: IEEE, 2013. 1472−1477 [35] Selvakumar J, Bakolas E. Evasion from a group of pursuers with a prescribed target set for the evader. In: Proceedings of the American Control Conference (ACC). Boston, USA: IEEE, 2016. 155−160 [36] Coon M, Panagou D. Control strategies for multiplayer target-attacker-defender differential games with double integrator dynamics. In: Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC). Melbourne, Australia: IEEE, 2017. 1496−1502 [37] Chipade V S, Panagou D. IDCAIS: Inter-defender collision-aware interception strategy against multiple attackers. arXiv preprint arXiv: 2112.12098, 2021. [38] Chipade V S, Panagou D. Multiagent planning and control for swarm herding in 2-D obstacle environments under bounded inputs. IEEE Transactions on Robotics, 2021, 37(6): 1956−1972 doi: 10.1109/TRO.2021.3072026 [39] Li S, Wang C, Xie G M. Optimal strategies for pursuit-evasion differential games of players with damped double integrator dynamics. IEEE Transactions on Automatic Control, 2024, 69(8): 5278−5293 doi: 10.1109/TAC.2023.3346815 [40] Kokolakis N M T, Vamvoudakis K G. Bounded rational Dubins vehicle coordination for target tracking using reinforcement learning. Automatica, 2023, 149: Article No. 110732 doi: 10.1016/j.automatica.2022.110732 [41] Patsko V S, Turova V L. 
Homicidal chauffeur game: History and modern studies. Advances in Dynamic Games: Theory, Applications, and Numerical Methods for Differential and Stochastic Games. Boston: Birkhäuser, 2011. 227−251 [42] Pachter M, Coates S. The classical homicidal chauffeur game. Dynamic Games and Applications, 2019, 9(3): 800−850 doi: 10.1007/s13235-018-0264-8 [43] Exarchos I, Tsiotras P, Pachter M. On the suicidal pedestrian differential game. Dynamic Games and Applications, 2015, 5(3): 297−317 doi: 10.1007/s13235-014-0130-2 [44] Nath S, Ghose D. Worst-case scenario evasive strategies in a two-on-one engagement between Dubins' vehicles with partial information. IEEE Control Systems Letters, 2023, 7: 25−30 doi: 10.1109/LCSYS.2022.3186179 [45] Sani M, Robu B, Hably A. Pursuit-evasion game for nonholonomic mobile robots with obstacle avoidance using NMPC. In: Proceedings of the 28th Mediterranean Conference on Control and Automation (MED). Saint-Rapha, France: IEEE, 2020. 978−983 [46] Manoharan A, Thakur P, Singh A K. Multi-agent target defense game with learned defender to attacker assignment. In: Proceedings of the International Conference on Unmanned Aircraft Systems (ICUAS). Warsaw, Poland: IEEE, 2023. 297−304 [47] 祝海. 基于微分对策的航天器轨道追逃最优控制策略 [硕士学位论文], 国防科技大学, 中国, 2017.Zhu Hai. Optimal Control of Spacecraft Orbital Pursuit-evasion Based on Differential Game [Master thesis], National University of Defense Technology, China, 2017. [48] Clohessy W H, Wiltshire R S. Terminal guidance system for satellite rendezvous. Journal of the Aerospace Sciences, 1960, 27(9): 653−658 doi: 10.2514/8.8704 [49] Zhang C M, Zhu Y W, Yang L P, Zeng X. An optimal guidance method for free-time orbital pursuit-evasion game. Journal of Systems Engineering and Electronics, 2022, 33(6): 1294−1308 [50] Venigalla C, Scheeres D. Spacecraft rendezvous and pursuit/evasion analysis using reachable sets. In: Proceedings of the Space Flight Mechanics Meeting. 
Kissimmee, USA: American Institute of Aeronautics and Astronautics, Inc., 2018. [51] Li Z Y, Zhu H, Yang Z, Luo Y Z. A dimension-reduction solution of free-time differential games for spacecraft pursuit-evasion. Acta Astronautica, 2019, 163: 201−210 doi: 10.1016/j.actaastro.2019.01.011 [52] Lin B, Qiao L, Jia Z H, Sun Z J, Sun M, Zhang W D. Control strategies for target-attacker-defender games of USVs. In: Proceedings of the 6th International Conference on Automation, Control and Robotics Engineering (CACRE). Dalian, China: IEEE, 2021. 191−198 [53] Ho Y, Bryson A, Baron S. Differential games and optimal pursuit-evasion strategies. IEEE Transactions on Automatic Control, 1965, 10(4): 385−389 doi: 10.1109/TAC.1965.1098197 [54] Kothari M, Manathara J G, Postlethwaite I. Cooperative multiple pursuers against a single evader. Journal of Intelligent & Robotic Systems, 2017, 86(3): 551−567 [55] Yufereva O. Lion and man game in compact spaces. Dynamic Games and Applications, 2019, 9(1): 281−292 doi: 10.1007/s13235-018-0239-9 [56] Oyler D W, Kabamba P T, Girard A R. Pursuit-evasion games in the presence of obstacles. Automatica, 2016, 65: 1−11 doi: 10.1016/j.automatica.2015.11.018 [57] Exarchos I, Tsiotras P. An asymmetric version of the two car pursuit-evasion game. In: Proceedings of the 53rd IEEE Conference on Decision and Control (CDC). Los Angeles, USA: IEEE, 2014. 4272−4277 [58] Das G, Dorothy M, Bell Z I, Shishika D. Guarding a non-maneuverable translating line with an attached defender. arXiv preprint arXiv: 2209.09318, 2022. [59] Liang L, Deng F, Wang J N, Lu M B, Chen J. A reconnaissance penetration game with territorial-constrained defender. IEEE Transactions on Automatic Control, 2022, 67(11): 6295−6302 doi: 10.1109/TAC.2022.3183034 [60] Levchenkov A Y, Pashkov A G. Differential game of optimal approach of two inertial pursuers to a noninertial evader. 
Journal of Optimization Theory and Applications, 1990, 65(3): 501−518 doi: 10.1007/BF00939563 [61] Chen J, Zha W, Peng Z H, Gu D B. Multi-player pursuit-evasion games with one superior evader. Automatica, 2016, 71: 24−32 doi: 10.1016/j.automatica.2016.04.012 [62] Yan R, Shi Z Y, Zhong Y S. Cooperative strategies for two-evader-one-pursuer reach-avoid differential games. International Journal of Systems Science, 2021, 52(9): 1894−1912 doi: 10.1080/00207721.2021.1872116 [63] Liu S Y, Zhou Z Y, Tomlin C, Hedrick K. Evasion as a team against a faster pursuer. In: Proceedings of the American Control Conference (ACC). Washington, USA: IEEE, 2013. 5368−5373 [64] Wang D, Peng Z H. Pursuit-evasion games of multi-players with a single faster player. In: Proceedings of the 35th Chinese Control Conference (CCC). Chengdu, China: IEEE, 2016. 2583−2588 [65] Scott W L, Leonard N E. Optimal evasive strategies for multiple interacting agents with motion constraints. Automatica, 2018, 94: 26−34 doi: 10.1016/j.automatica.2018.04.008 [66] Garcia E, Casbeer D W, von Moll A, Pachter M. Multiple pursuer multiple evader differential games. IEEE Transactions on Automatic Control, 2021, 66(5): 2345−2350 doi: 10.1109/TAC.2020.3003840 [67] Wei M, Chen G S, Cruz J B, Haynes L S, Pham K, Blasch E. Multi-pursuer multi-evader pursuit-evasion games with jamming confrontation. Journal of Aerospace Computing, Information, and Communication, 2007, 4(3): 693−706 doi: 10.2514/1.25329 [68] Wei M, Chen G S, Cruz J B, Haynes L S, Chang M H, Blasch E. A decentralized approach to pursuer-evader games with multiple superior evaders in noisy environments. In: Proceedings of the IEEE Aerospace Conference. Big Sky, USA: IEEE, 2007. 1−10 [69] Xu L, Hu B, Guan Z H, Cheng X M, Li T, Xiao J W. Multi-agent deep reinforcement learning for pursuit-evasion game scalability. In: Proceedings of the Chinese Intelligent Systems Conference. Haikou, China: Springer, 2019. 
658−669 [70] Li D X, Cruz J B, Chen G S, Kwan C, Chang M H. A hierarchical approach to multi-player pursuit-evasion differential games. In: Proceedings of the 44th IEEE Conference on Decision and Control (CDC). Seville, Spain: IEEE, 2005. 5674−5679 [71] Li D X, Cruz Jr J B, Schumacher C J. Stochastic multi-player pursuit-evasion differential games. International Journal of Robust and Nonlinear Control, 2008, 18(2): 218−247 doi: 10.1002/rnc.1193 [72] Yan R, Deng R L, Lai H W, Zhang W X, Shi Z Y, Zhong Y S. Multiplayer homicidal chauffeur reach-avoid games via guaranteed winning strategies. arXiv preprint arXiv: 2107.04709, 2021. [73] LaValle S M, Lin D, Guibas L J, Latombe J C, Motwani R. Finding an unpredictable target in a workspace with obstacles. In: Proceedings of the International Conference on Robotics and Automation. Albuquerque, USA: IEEE, 1997. 737−742 [74] Razali S, Meng Q G, Yang S H. A refined immune systems inspired model for multi-robot shepherding. In: Proceedings of the Second World Congress on Nature and Biologically Inspired Computing (NaBIC). Kitakyushu, Japan: IEEE, 2010. 473−478 [75] Bhadauria D, Gosse S, Pipp J. Capturing an evader in a polygonal environment with obstacles [Online], available: https://www.conservancy.umn.edu/items/256a22f6-ba8c-4b12-a80d-300b7cba947c, June 15, 2024 [76] de Souza C, Newbury R, Cosgun A, Castillo P, Vidolov B, Kulić D. Decentralized multi-agent pursuit using deep reinforcement learning. IEEE Robotics and Automation Letters, 2021, 6(3): 4552−4559 doi: 10.1109/LRA.2021.3068952 [77] Zhang R L, Zong Q, Zhang X Y, Dou L Q, Tian B L. Game of drones: Multi-UAV pursuit-evasion game with online motion planning by deep reinforcement learning. 
IEEE Transactions on Neural Networks and Learning Systems, 2023, 34(10): 7900−7909 doi: 10.1109/TNNLS.2022.3146976 [78] Liang X, Zhou B R, Jiang L P, Meng G L, Xiu Y. Collaborative pursuit-evasion game of multi-UAVs based on Apollonius circle in the environment with obstacle. Connection Science, 2023, 35(1): Article No. 2168253 doi: 10.1080/09540091.2023.2168253 [79] Garcia E, Casbeer D W, Pachter M. Optimal strategies of the differential game in a circular region. IEEE Control Systems Letters, 2020, 4(2): 492−497 doi: 10.1109/LCSYS.2019.2963173 [80] Casini M, Garulli A. A novel family of pursuit strategies for the lion and man problem. In: Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC). Melbourne, Australia: IEEE, 2017. 6436−6441 [81] Yan R, Mi S, Duan X M, Chen J T, Ji X Y. Pursuit winning strategies for Reach-Avoid games with polygonal obstacles. IEEE Transactions on Automatic Control, DOI: 10.1109/TAC.2024.3438806 [82] Flynn J O. Lion and man: The boundary constraint. SIAM Journal on Control, 1973, 11(3): 397−411 doi: 10.1137/0311032 [83] Yan R, Shi Z Y, Zhong Y S. Defense game in a circular region. In: Proceedings of the 56th IEEE Annual Conference on Decision and Control (CDC). Melbourne, Australia: IEEE, 2017. 5590−5595 [84] Ruiz U, Isler V. Capturing an omnidirectional evader in convex environments using a differential drive robot. IEEE Robotics and Automation Letters, 2016, 1(2): 1007−1013 doi: 10.1109/LRA.2016.2530854 [85] Okabe A, Boots B, Sugihara K. Spatial Tessellations: Concepts and Applications of Voronoi Diagrams. New York: John Wiley, 1992. [86] Pierson A, Wang Z J, Schwager M. Intercepting rogue robots: An algorithm for capturing multiple evaders with multiple pursuers. IEEE Robotics and Automation Letters, 2017, 2(2): 530−537 doi: 10.1109/LRA.2016.2645516 [87] Li S, Wang C, Xie G M. Pursuit-evasion differential games of players with different speeds in spaces of different dimensions. 
In: Proceedings of the American Control Conference (ACC). Atlanta, USA: IEEE, 2022. 1299−1304 [88] Zhang R Q, Li S, Wang C, Xie G M. Optimal strategies for the game with two faster 3D pursuers and one slower 2D evader. In: Proceedings of the 41st Chinese Control Conference (CCC). Hefei, China: IEEE, 2022. 1767−1772 [89] Zhi J X, Hao Y, Vo C, Morales M, Lien J M. Computing 3-D from-region visibility using visibility integrity. IEEE Robotics and Automation Letters, 2019, 4(4): 4286−4291 doi: 10.1109/LRA.2019.2931280 [90] Chen N, Li L J, Mao W J. Equilibrium strategy of the pursuit-evasion game in three-dimensional space. IEEE/CAA Journal of Automatica Sinica, 2024, 11(2): 446−458 doi: 10.1109/JAS.2023.123996 [91] Shen H X, Casalino L. Revisit of the three-dimensional orbital pursuit-evasion game. Journal of Guidance, Control, and Dynamics, 2018, 41(8): 1823−1831 doi: 10.2514/1.G003127 [92] Yan R, Duan X M, Shi Z Y, Zhong Y S, Bullo F. Matching-based capture strategies for 3D heterogeneous multiplayer reach-avoid differential games. Automatica, 2022, 140: Article No. 110207 doi: 10.1016/j.automatica.2022.110207 [93] Lewin J, Breakwell J V. The surveillance-evasion game of degree. Journal of Optimization Theory and Applications, 1975, 16(3): 339−353 [94] Greenfeld I. A differential game of surveillance evasion of two identical cars. Journal of Optimization Theory and Applications, 1987, 52(1): 53−79 doi: 10.1007/BF00938464 [95] Bopardikar S D, Bullo F, Hespanha J P. Sensing limitations in the lion and man problem. In: Proceedings of the American Control Conference (ACC). New York, USA: IEEE, 2007. 5958−5963 [96] Lopez V G, Lewis F L, Wan Y, Sanchez E N, Fan L L. Solutions for multiagent pursuit-evasion games on communication graphs: Finite-time capture and asymptotic behaviors. IEEE Transactions on Automatic Control, 2020, 65(5): 1911−1923 doi: 10.1109/TAC.2019.2926554 [97] Zemskov K A, Pashkow A G. 
Construction of optimal position strategies in a differential pursuit-evasion game with one pursuer and two evaders. Journal of Applied Mathematics and Mechanics, 1997, 61(3): 391−399 doi: 10.1016/S0021-8928(97)00050-6 [98] Liu S Y, Zhou Z Y, Tomlin C, Hedrick J K. Evasion of a team of dubins vehicles from a hidden pursuer. In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). Hong Kong, China: IEEE, 2014. 6771−6776 [99] von Moll A, Garcia E, Casbeer D, Suresh M, Swar S C. Multiple-pursuer, single-evader border defense differential game. Journal of Aerospace Information Systems, 2020, 17(8): 407−416 doi: 10.2514/1.I010740 [100] Kartal Y, Subbarao K, Dogan A, Lewis F. Optimal game theoretic solution of the pursuit-evasion intercept problem using on-policy reinforcement learning. International Journal of Robust and Nonlinear Control, 2021, 31(16): 7886−7903 doi: 10.1002/rnc.5719 [101] Littlewood J E. A Mathematician's Miscellany. Cambridge: Cambridge University Press, 1953. [102] Kohlenbach U, López-Acedo G, Nicolae A. A uniform betweenness property in metric spaces and its role in the quantitative analysis of the “Lion-Man” game. Pacific Journal of Mathematics, 2021, 310(1): 181−212 doi: 10.2140/pjm.2021.310.181 [103] Sgall J. Solution of David Gale's lion and man problem. Theoretical Computer Science, 2001, 259(1−2): 663−670 doi: 10.1016/S0304-3975(00)00411-4 [104] Casini M, Garulli A. An improved lion strategy for the lion and man problem. IEEE Control Systems Letters, 2017, 1(1): 38−43 doi: 10.1109/LCSYS.2017.2702652 [105] Casini M, Garulli A. A new class of pursuer strategies for the discrete-time lion and man problem. Automatica, 2019, 100: 162−170 doi: 10.1016/j.automatica.2018.11.015 [106] Casini M, Criscuoli M, Garulli A. A discrete-time pursuit-evasion game in convex polygonal environments. Systems & Control Letters, 2019, 125: 22−28 [107] Bopardikar S D, Bullo F, Hespanha J P. Cooperative pursuit with sensing limitations. 
In: Proceedings of the American Control Conference (ACC). New York, USA: IEEE, 2007. 5394−5399 [108] Li D X, Cruz J B. Graph-based strategies for multi-player pursuit evasion games. In: Proceedings of the 46th IEEE Conference on Decision and Control. New Orleans, USA: IEEE, 2007. 4063−4068 [109] Chen H, Kalyanam K, Zhang W, Casbeer D. Intruder isolation on a general road network under partial information. IEEE Transactions on Control Systems Technology, 2017, 25(1): 222−234 doi: 10.1109/TCST.2016.2550423 [110] Kalyanam K, Casbeer D, Pachter M. Graph search of a moving ground target by a UAV aided by ground sensors with local information. Autonomous Robots, 2020, 44(5): 831−843 doi: 10.1007/s10514-019-09900-0 [111] Sundaram S, Kalyanam K, Casbeer D W. Pursuit on a graph under partial information from sensors. In: Proceedings of the American Control Conference (ACC). Seattle, USA: IEEE, 2017. 4279−4284 [112] Dong X, Zhang H G, Ming Z Y. Adaptive optimal control via Q-learning for multi-agent pursuit-evasion games. IEEE Transactions on Circuits and Systems Ⅱ: Express Briefs, 2024, 71(6): 3056−3060 doi: 10.1109/TCSII.2024.3354120 [113] Sugihara K, Suzuki I. Optimal algorithms for a pursuit-evasion problem in grids. SIAM Journal on Discrete Mathematics, 1989, 2(1): 126−143 doi: 10.1137/0402013 [114] Dawes R W. Some pursuit-evasion problems on grids. Information Processing Letters, 1992, 43(5): 241−247 doi: 10.1016/0020-0190(92)90218-K [115] Bhattacharya S, Banerjee A, Bandyopadhyay S. CORBA-based analysis of multi agent behavior. Journal of Computer Science and Technology, 2005, 20(1): 118−124 doi: 10.1007/s11390-005-0013-5 [116] Bhattacharya S, Paul G, Sanyal S. A cops and robber game in multidimensional grids. Discrete Applied Mathematics, 2010, 158(16): 1745−1751 doi: 10.1016/j.dam.2010.06.014 [117] Das S, Gahlawat H. Variations of cops and robbers game on grids. 
Discrete Applied Mathematics, 2021, 305: 340−349 doi: 10.1016/j.dam.2020.02.004 [118] Lewin J, Olsder G J. The isotropic rocket——A surveillance evasion game. Computers & Mathematics With Applications, 1989, 18(1−3): 15−34 [119] Altaher M, Elmougy S, Nomir O. Intercepting a superior missile: A reachability analysis of an Apollonius circle-based multiplayer differential game. International Journal of Innovative Computing, Information and Control, 2019, 15(1): 369−381 [120] Jang J S, Tomlin C. Control strategies in multi-player pursuit and evasion game. In: Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit. San Francisco, USA: AIAA, 2005. Article No. 6239 [121] Osborne M J, Rubinstein A. A Course in Game Theory. Cambridge: MIT Press, 1994. [122] Başar T, Olsder G J. Dynamic Noncooperative Game Theory (Second edition). New York: Academic Press Inc., 1998. [123] Krasovskii N N, Subbotin A I, Kotz S. Game-theoretical Control Problems. New York: Springer, 1988. [124] Mizukami K, Eguchi K. A geometrical approach to problems of pursuit-evasion games. Journal of the Franklin Institute, 1977, 303(4): 371−384 doi: 10.1016/0016-0032(77)90118-1 [125] Samatov B T, Soyibboev U B. Differential game with a lifeline for the inertial movements of players. Ural Mathematical Journal, 2021, 7(2): 94−109 doi: 10.15826/umj.2021.2.007 [126] Petrov N N. “Soft” capture in pontryagin's example with many participants. Journal of Applied Mathematics and Mechanics, 2003, 67(5): 671−680 doi: 10.1016/S0021-8928(03)90040-2 [127] Blagodatskikh A I. On group pursuit problem in Pontryagin's nonstationary example. Vestn. Udmurt. Gos. Univ. Ser. Mat, 2007, 1: 17−24 [128] Petrov N N, Solov'eva N A. Multiple capture in Pontryagin's recurrent example. Automation and Remote Control, 2016, 77(5): 855−861 doi: 10.1134/S0005117916050088 [129] Lewin J. Differential Games: Theory and Methods for Solving Game Problems With Singular Surfaces. London: Springer, 2012. 
[130] Wise K A, Sedwick J L. Successive approximation solution of the HJI equation. In: Proceedings of the 33rd IEEE Conference on Decision and Control. Lake Buena Vista, USA: IEEE, 1994. 1387−1391 [131] Yang X, Liu D R, Ma H W, Xu Y C. Online approximate solution of HJI equation for unknown constrained-input nonlinear continuous-time systems. Information Sciences, 2016, 328: 435−454 doi: 10.1016/j.ins.2015.09.001 [132] Earl M G, D'Andrea R. Modeling and control of a multi-agent system using mixed integer linear programming. In: Proceedings of the 41st IEEE Conference on Decision and Control. Las Vegas, USA: IEEE, 2002. 107−111 [133] Ni Y J, Gao S H, Huang S N, Xiang C, Ren Q Y, Lee T H. Multi-agent cooperative pursuit-evasion control using gene expression programming. In: Proceedings of the 47th Annual Conference of the IEEE Industrial Electronics Society. Toronto, Canada: IEEE, 2021. 1−6 [134] Asgharnia A, Schwartz H M, Atia M. Multi-invader multi-defender differential game using reinforcement learning. In: Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). Padua, Italy: IEEE, 2022. 1−8 [135] Kachroo P, Shedied S A, Bay J S, Vanlandingham H. Dynamic programming solution for a class of pursuit evasion problems: The herding problem. IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 2001, 31(1): 35−41 doi: 10.1109/5326.923266 [136] Hespanha J P, Pappas G J, Prandini M. Greedy control for hybrid pursuit-evasion games. In: Proceedings of the European Control Conference (ECC). Los Angeles, USA: 2001. 2621−2626 [137] Cristiani E, Falcone M. Fully-discrete schemes for the value function of pursuit-evasion games with state constraints. Advances in Dynamic Games and Their Applications: Analytical and Numerical Developments. 
Boston: Birkhäuser, 2009. 1−30 [138] Anderson G. Feedback control for pursuit-evasion problems between two spacecraftbased on differential dynamic programming. In: Proceedings of the 15th Aerospace Sciences Meeting. Los Angeles, USA: AIAA, 1977. Article No. 34 [139] Alexopoulos A, Schmidt T, Badreddin E. A pursuit-evasion game between unmanned aerial vehicles. In: Proceedings of the 11th International Conference on Informatics in Control, Automation and Robotics (ICINCO). Vienna, Austria: IEEE, 2014. 74−81 [140] Xing J B, Zeng X L. A deep reinforcement learning method for lion and man problem. In: Proceedings of the 40th Chinese Control Conference (CCC). Shanghai, China: IEEE, 2021. 8366−8371 [141] Qu X Q, Gan W H, Song D L, Zhou L Q. Pursuit-evasion game strategy of USV based on deep reinforcement learning in complex multi-obstacle environment. Ocean Engineering, 2023, 273: Article No. 114016 doi: 10.1016/j.oceaneng.2023.114016 [142] Wan K F, Wu D W, Zhai Y W, Li B, Gao X G, Hu Z J. An improved approach towards multi-agent pursuit-evasion game decision-making using deep reinforcement learning. Entropy, 2021, 23(11): Article No. 1433 doi: 10.3390/e23111433 [143] Li B, Wang J M, Song C, Yang Z P, Wan K F, Zhang Q F. Multi-UAV roundup strategy method based on deep reinforcement learning CEL-MADDPG algorithm. Expert Systems With Applications, 2024, 245: Article No. 123018 doi: 10.1016/j.eswa.2023.123018 [144] Li X L, Li Z, Zheng X L, Yang X B, Yu X H. The study of crash-tolerant, multi-agent offensive and defensive games using deep reinforcement learning. Electronics, 2023, 12(2): Article No. 327 doi: 10.3390/electronics12020327 [145] Kokolakis N M T, Kanellopoulos A, Vamvoudakis K G. Bounded rational unmanned aerial vehicle coordination for adversarial target tracking. In: Proceedings of the American Control Conference (ACC). Denver, USA: IEEE, 2020. 2508−2513 [146] Xiong H, Zhang Y. 
Reinforcement learning-based formation-surrounding control for multiple quadrotor UAVs pursuit-evasion games. ISA Transactions, 2024, 145: 205−224 doi: 10.1016/j.isatra.2023.12.006 [147] Zhou Z J, Xu H. Decentralized optimal large scale multi-player pursuit-evasion strategies: A mean field game approach with reinforcement learning. Neurocomputing, 2022, 484: 46−58 doi: 10.1016/j.neucom.2021.01.141 [148] Vrabie D. Online Adaptive Optimal Control for Continuous-time Systems [Ph.D. dissertation], The University of Texas at Arlington, USA, 2010. [149] Gong Z F, He B, Liu G, Zhang X B. Solution for pursuit-evasion game of agents by adaptive dynamic programming. Electronics, 2023, 12(12): Article No. 2595 doi: 10.3390/electronics12122595 [150] Gong Z F, He B, Hu C, Zhang X B, Kang W J. Online adaptive dynamic programming-based solution of networked multiple-pursuer and single-evader game. Electronics, 2022, 11(21): Article No. 3583 doi: 10.3390/electronics11213583 [151] Sun J L, Liu C S. Finite-horizon differential games for missile-target interception system using adaptive dynamic programming with input constraints. International Journal of Systems Science, 2018, 49(2): 264−283 doi: 10.1080/00207721.2017.1401153 [152] Zhang Z X, Zhang K, Xie X P, Sun J Y. Fixed-time zero-sum pursuit-evasion game control of multisatellite via adaptive dynamic programming. IEEE Transactions on Aerospace and Electronic Systems, 2024, 60(2): 2224−2235 doi: 10.1109/TAES.2024.3351810 [153] Vamvoudakis K G, Lewis F L, Hudas G R. Multi-agent differential graphical games: Online adaptive learning solution for synchronization with optimality. Automatica, 2012, 48(8): 1598−1611 doi: 10.1016/j.automatica.2012.05.074 [154] Peng C, Liu X M, Ma J J. Design of safe optimal guidance with obstacle avoidance using control barrier function-based actor-critic reinforcement learning. 
IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2023, 53(11): 6861−6873 doi: 10.1109/TSMC.2023.3288826 [155] Oyler D W, Kabamba P T, Girard A R. Dominance in pursuit-evasion games with uncertainty. In: Proceedings of the 54th IEEE Conference on Decision and Control (CDC). Osaka, Japan: IEEE, 2015. 5859−5864 [156] Agasti A, Reddy P V, Bhikkaji B. Optimal role assignment for multiplayer reach-avoid differential games in 3D space. arXiv preprint arXiv: 2303.07885, 2023. [157] Meier L. A new technique for solving pursuit-evasion differential games. IEEE Transactions on Automatic Control, 1969, 14(4): 352−359 doi: 10.1109/TAC.1969.1099226 [158] Getz W M, Pachter M. Capturability in a two-target “game of two cars”. Journal of Guidance and Control, 1981, 4(1): 15−21 doi: 10.2514/3.19715 [159] Scott W, Leonard N E. Dynamics of pursuit and evasion in a heterogeneous herd. In: Proceedings of the 53rd IEEE Conference on Decision and Control (CDC). Los Angeles, USA: IEEE, 2014. 2920−2925 [160] Garcia E, Fuchs Z E, Milutinovic D, Casbeer D W, Pachter M. A geometric approach for the cooperative two-pursuer one-evader differential game. IFAC-PapersOnLine, 2017, 50(1): 15209−15214 doi: 10.1016/j.ifacol.2017.08.2366 [161] Zhou Z Y, Zhang W, Ding J, Huang H M, Stipanović D M, Tomlin C J. Cooperative pursuit with Voronoi partitions. Automatica, 2016, 72: 64−72 doi: 10.1016/j.automatica.2016.05.007 [162] Zhou S B, Li H P, Chen Z Y. Optimal containment strategies on high-speed evader using multiple pursuers with point-capture. In: Proceedings of the 42nd Chinese Control Conference (CCC). Tianjin, China: IEEE, 2023. 1−6 [163] Zhang Z, Zhang D Y, Zhang Q R, Pan W, Hu T J. DACOOP-A: Decentralized adaptive cooperative pursuit via attention. IEEE Robotics and Automation Letters, 2024, 9(6): 5504−5511 doi: 10.1109/LRA.2023.3331886 [164] Sun Z Y, Sun H B, Li P, Zou J. Cooperative strategy for pursuit-evasion problem with collision avoidance. 
Ocean Engineering, 2022, 266: Article No. 112742 doi: 10.1016/j.oceaneng.2022.112742 [165] Isler V, Sun D F, Sastry S. Roadmap based pursuit-evasion and collision avoidance. Robotics: Science and Systems. Cambridge: MIT Press, 2005. 257−264 [166] Selvakumar J, Bakolas E. Feedback strategies for a reach-avoid game with a single evader and multiple pursuers. IEEE Transactions on Cybernetics, 2021, 51(2): 696−707 doi: 10.1109/TCYB.2019.2914869 [167] Shishika D, Paulos J, Kumar V. Cooperative team strategies for multi-player perimeter-defense games. IEEE Robotics and Automation Letters, 2020, 5(2): 2738−2745 doi: 10.1109/LRA.2020.2972818 [168] Garcia E, Casbeer D W, Pachter M. Active target defense using first order missile models. Automatica, 2017, 78: 139−143 doi: 10.1016/j.automatica.2016.12.032 [169] Bera R, Makkapati V R, Kothari M. A comprehensive differential game theoretic solution to a game of two cars. Journal of Optimization Theory and Applications, 2017, 174(3): 818−836 doi: 10.1007/s10957-017-1134-z 期刊类型引用(1)