Abstract: Deep learning has recently achieved breakthroughs in many sub-fields of computer vision, yet relatively few deep learning approaches address video multiple object tracking (MOT). Because a robust affinity model is the key component of detection-based MOT, this paper proposes an affinity model based on deep neural networks and metric learning. Metric learning, a technique widely used in person re-identification (Re-ID), is combined with convolutional neural networks (CNNs) to design the object appearance model. Specifically, a three-channel CNN is trained with a triplet loss function to extract discriminative appearance features and compute the appearance similarity between objects. The appearance affinity is then combined with a motion model to estimate the association probability between tracklets. For the association strategy, the Hungarian algorithm is applied hierarchically: at the low level, a set of short but reliable tracklets is generated frame by frame; at the higher levels, these tracklets are associated into longer trajectories through an adaptive sliding-window mechanism, which outputs the final trajectory of each object. Experimental results on the public 2DMOT2015 and MOT16 datasets demonstrate the validity of the proposed method, which achieves competitive or superior performance compared with several state-of-the-art approaches.
Key words: Multiple object tracking (MOT), deep learning, metric learning, affinity model, multi-level association

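Iterative learning control (ILC) learns from the experience of previous iterations to refine the controller and improve control performance, and ultimately achieves perfect tracking of a desired trajectory over a finite interval. It has been widely applied to plants that perform repetitive tasks, such as robot manipulators [1-5]. Classical ILC methods, including D-type, P-type and PD-type learning algorithms, analyze stability mainly through the contraction mapping approach [6-9]. More recently, adaptive iterative learning control (AILC) methods based on Lyapunov theory [10-14] have been proposed; they improve the controller and its performance indirectly by learning the uncertain system parameters adaptively along the iteration axis.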
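Classical ILC methods generally require the initial state to be reset exactly to the initial point of the desired trajectory, i.e., the initial state must coincide with the initial value of the desired trajectory at every iteration [15-17]. In practical systems such as robot manipulators, however, this identical initial condition is usually hard to satisfy because of environmental factors and limited positioning accuracy. Relaxing the identical initial condition is therefore an active topic in ILC research; existing approaches include the time-varying boundary layer method [18] and state correction methods [19-20]. Reference [19] analyzes the convergence of ILC systems under five types of initial conditions and constructs a new desired trajectory from the initial state and the original desired trajectory. Reference [20] uses trigonometric functions to build a new desired trajectory that relaxes the identical initial condition, with a transition trajectory connecting the initial point of each iteration to the desired trajectory. However, because the position and derivative of the transition trajectory at the joining point depend on the desired trajectory, state correction methods often need a different transition trajectory at every iteration. Building on this, reference [21] proposes an error-tracking method that designs a desired error trajectory and an iterative learning controller so that the error converges along the preset desired error trajectory. Compared with state correction methods, the desired error trajectory in error tracking does not depend on the desired state trajectory, and its value and derivative at the joining point can simply be set to zero [22-23].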
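In addition, practical systems such as robot manipulators are often subject to system constraints, safety limits and information dropout, so that the length of each iteration may change; this is called the non-uniform iteration length problem in ILC. For example, a rehabilitation manipulator may stop before reaching the specified iteration length because the patient lacks stamina or strength. This problem has been studied by a number of researchers. For discrete-time linear systems, references [24-25] construct an iteration-average operator that updates the control signal with information from past iterations and prove that the expectation of the tracking error converges to zero; the controller design, however, requires the probability distribution of the iteration length to be known, and the variance of the tracking error is not discussed. Reference [26] considers an unknown probability distribution of the iteration length, gives a design method for a P-type learning law under varying iteration lengths, and proves convergence of the tracking error in the mean-square sense, but it does not account for external disturbances. Reference [27] considers a class of discrete-time linear systems with disturbances and measurement noise, proposes an ILC method based on a modified iteration-average operator, and validates it on a 2-degree-of-freedom manipulator test bench. The systems in [24-27] are all discrete-time linear systems, the controllers are generally designed by the contraction mapping approach, and the information of the current iteration is not fully exploited.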
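For the non-uniform iteration length problem of a class of nonlinear continuous-time systems, reference [28] designs a virtual error variable to compensate for the information of the unexecuted part and, by redefining the composite energy function, proves that the system output perfectly tracks the desired trajectory as the number of iterations tends to infinity. Reference [29] introduces an indicator function so that the current iteration learns only from the most recent information available at the same time instant, and constructs a modified composite energy function to prove convergence of the system state under varying iteration lengths. References [28-29] effectively solve the non-uniform iteration length problem for nonlinear continuous-time systems, but their controller designs still require the identical initial condition. Because the initial-condition problem and the non-uniform length problem coexist in many practical systems, the results in [24-29] cannot be applied directly to trajectory tracking from arbitrary initial states. Recently, reference [30] addressed both problems for manipulator trajectory tracking, using a state correction method to relax the identical initial condition and proving ${\rm{L}}_2$-norm convergence of the system error under varying iteration lengths. However, the state correction method often has to redesign the transition trajectory at every iteration, which leads to a heavy computational load.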
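Motivated by the above discussion, this paper studies manipulator trajectory tracking from arbitrary initial states and proposes a variable-length error-tracking iterative learning control method. For the initial-condition problem, a hyperbolic-cosine desired error trajectory that is independent of the desired trajectory is constructed, so that the initial value of each iteration can be set arbitrarily and the identical initial condition of classical ILC is relaxed. Compared with existing state correction methods, correcting the desired error trajectory requires only two known quantities, the initial value of the actual error and its derivative, and the expression of the desired error trajectory does not need to be redesigned at each iteration. Unlike existing error-tracking methods [21-23], the desired error trajectory designed here requires setting only one constant, which simplifies the design. For the non-uniform iteration length problem, a virtual error variable is constructed as an error compensation mechanism that supplies the error information of the unexecuted interval, relaxing the requirement of a fixed iteration length. Compared with [28-30], this paper proposes a fully saturated iterative learning control method, which avoids the problem that pointwise convergence leaves the parameter estimates without fixed upper and lower bounds, and ensures that the joint position error tracks the desired error trajectory over the whole iteration interval.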
1. Problem Formulation and Preliminaries
Consider an $n$-degree-of-freedom rigid robot manipulator whose dynamics are given by [30]
$$ \left\{\begin{aligned} &\dot{\boldsymbol{q}}_{1,k} = {\boldsymbol{q}}_{2,k} \\ &{\boldsymbol {M}}({\boldsymbol{q}}_{1,k})\dot {\boldsymbol{q}}_{2,k}+{\boldsymbol {C}}({\boldsymbol{q}}_{1,k},{\boldsymbol{q}}_{2,k}){\boldsymbol{q}}_{2,k}\;+\\ &\qquad{\boldsymbol{G}}({\boldsymbol{q}}_{1,k}) = {\boldsymbol{\tau}}_k+{\boldsymbol{d}}_k \end{aligned}\right.$$ (1)
where $ k = 1, 2, 3, \cdots $ is the iteration index; $ {\boldsymbol{q}}_{1,k}\in {\bf{R}}^n $, $ {\boldsymbol{q}}_{2,k}\in {\bf{R}}^n $ and $ \dot {\boldsymbol{q}}_{2,k}\in {\bf{R}}^n $ are the joint position, joint velocity and joint acceleration, respectively; $ {\boldsymbol{M}}({{\boldsymbol{q}}}_{1,k})\in {{\bf{R}}^{n\times n}} $ is the symmetric positive-definite inertia matrix; $ {\boldsymbol{C}}({\boldsymbol{q}}_{1,k},{\boldsymbol{q}}_{2,k})\in {{\bf{R}}^{n\times n}} $ is the centripetal-Coriolis matrix; $ {{\boldsymbol{G}}}({{\boldsymbol{q}}}_{1,k})\in {\bf{R}}^n $ is the gravity vector; $ {\boldsymbol{d}}_k\in {\bf{R}}^n $ is a bounded disturbance that lumps together model uncertainty and external disturbances and satisfies $ ||{\boldsymbol{d}}_k||\le\bar{d} $, where $ \bar{d} $ is an unknown positive constant; and $ {\boldsymbol{\tau}}_k\in {\bf{R}}^n $ is the control input.
The manipulator system (1) has the following properties:
Property 1 [2]. The matrix $ \dot{\boldsymbol{M}}({\boldsymbol{q}}_{1,k})-2{\boldsymbol{C}}({\boldsymbol{q}}_{1,k},{\boldsymbol{q}}_{2,k}) $ is skew-symmetric, i.e., ${\boldsymbol{x}}^{\rm{T}}[\dot{\boldsymbol{M}}({\boldsymbol{q}}_{1,k})- 2{\boldsymbol{C}}({\boldsymbol{q}}_{1,k},{\boldsymbol{q}}_{2,k})]{\boldsymbol{x}} = 0$ holds for any vector $ {\boldsymbol{x}}\in {\bf{R}}^n $.
Property 2 [2]. For any vectors $ {\boldsymbol{q}}, \dot {\boldsymbol{q}}\in {\bf{R}}^n $ and any known vectors $ {\boldsymbol{v}}, \dot {\boldsymbol{v}}\in {\bf{R}}^n $, there exists an unknown time-varying parameter vector $ {\boldsymbol{\theta}}\in {\bf{R}}^m $ such that
$$ {\boldsymbol{M}}( {\boldsymbol{q}})\dot {\boldsymbol{v}}+{\boldsymbol{C}}({\boldsymbol{q}}, \dot {\boldsymbol{q}}){\boldsymbol{v}}+{\boldsymbol{G}}({\boldsymbol{q}}) = {\boldsymbol{W}}({\boldsymbol{q}}, \dot {\boldsymbol{q}}, {\boldsymbol{v}}, \dot {\boldsymbol{v}}){\boldsymbol{\theta}} $$ (2)
where $ {{\boldsymbol{W}}}({\boldsymbol{q}}, \dot {\boldsymbol{q}}, {\boldsymbol{v}}, \dot {\boldsymbol{v}})\in {{\bf{R}}}^{n\times m} $ is a regression matrix.
Before the controller design and stability analysis, Lemma 1 is stated.
Lemma 1 [13]. Given scalars $ a $ and $ b $ with $ \underline b<a<\bar b $, the following inequality holds:
$$ [{\rm{sat}}(b)-a][{\rm{sat}}(b)-b]\le0 $$ (3)
where $ \bar b $ and $ \underline b $ are the upper and lower saturation bounds applied to $ b $, and $ {\rm{sat}}(\cdot) $ is the saturation function defined by
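$$ {\rm{sat}}(b) = \left\{ {\begin{array}{*{20}{l}} {\bar b,}&{\bar b < b}\\ {b,}&{\underline b \le b \le \bar b}\\ {\underline b ,}&{b < \underline b } \end{array}} \right.$$ (4)
The control objective of this paper is to design, for the manipulator system (1) with arbitrary initial states, an iterative learning controller ${\boldsymbol{\tau}}_k$ such that, as the iteration index $ k $ tends to infinity, the joint position $ {\boldsymbol{q}}_{1,k}(t) $ tracks the desired trajectory $ {\boldsymbol{q}}_{d}(t) $ on the specified interval $ [\Delta, T_k] $, i.e., $\tilde {\boldsymbol{q}}_{1,k}(t)\rightarrow 0$ for $t\in[\Delta, T_k]$ as $ k\rightarrow \infty $.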
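The saturation function (4) and the inequality (3) of Lemma 1 can be checked numerically. The following is a minimal sketch in plain Python; the bounds `-1.0` and `2.0` and the sampling ranges are illustrative assumptions, not values from the paper.

```python
import random


def sat(b, lower, upper):
    """Saturation function (4): clip b to the interval [lower, upper]."""
    if b > upper:
        return upper
    if b < lower:
        return lower
    return b


def lemma1_product(a, b, lower, upper):
    """Left-hand side of inequality (3): [sat(b) - a][sat(b) - b]."""
    s = sat(b, lower, upper)
    return (s - a) * (s - b)


# Assumed bounds, for illustration only.
LOWER, UPPER = -1.0, 2.0

rng = random.Random(0)
# a is drawn inside the bounds, b is drawn from a wider range.
worst_case = max(
    lemma1_product(rng.uniform(LOWER, UPPER), rng.uniform(-10.0, 10.0), LOWER, UPPER)
    for _ in range(10000)
)
```

When $b$ lies inside the bounds the product is exactly zero; when it lies outside, the two factors have opposite signs, so `worst_case` never becomes positive.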
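2. Design of the Desired Error Function
Classical iterative learning controller design generally requires the system to satisfy the identical initial condition [15-17]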
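$$ \left\{ \begin{aligned} &{\boldsymbol{q}}_{1,k}(0) = {\boldsymbol{q}}_{d}(0)\\ &{\boldsymbol{q}}_{2,k}(0) = \dot {\boldsymbol{q}}_{d}(0) \end{aligned} \right. $$ (5)
for all $ k $. In practice, however, condition (5) is usually hard to satisfy because of manipulator positioning errors and external disturbances. This paper therefore constructs a desired error trajectory $ \tilde {\boldsymbol{q}}_{1,k}^*(t) $ that does not depend on the desired trajectory, in order to relax the identical initial condition (5).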
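As shown in Fig. 1, the desired error trajectory consists of an error transition trajectory joined to a trajectory that is identically zero, and its design must satisfy the following conditions [21]: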
$$ \tilde {\boldsymbol{q}}_{1,k}^*(0) = \tilde {\boldsymbol{q}}_{1,k}(0),\;\; \tilde {\boldsymbol{q}}_{1,k}^*(\Delta) = 0 $$ (6)
$$ \dot {\tilde {\boldsymbol{q}}}_{1,k}^*(0) = \tilde {\boldsymbol{q}}_{2,k}(0),\;\; \dot {\tilde {\boldsymbol{q}}}_{1,k}^*(\Delta) = 0 $$ (7)
where $\tilde {\boldsymbol{q}}_{1,k}(t) = {\boldsymbol{q}}_{1,k}(t)-{\boldsymbol{q}}_{d}(t)\in {\bf{R}}^n$ and $\tilde {\boldsymbol{q}}_{2,k}(t) = {\boldsymbol{q}}_{2,k}(t)- \dot {\boldsymbol{q}}_{d}(t) \in {\bf{R}}^n$ are the tracking errors of $ {\boldsymbol{q}}_{1,k}(t) $ and $ {\boldsymbol{q}}_{2,k}(t) $, respectively; $ \tilde{\boldsymbol{q}}_{1,k}^*(0) $ and $ \dot {\tilde {\boldsymbol{q}}}_{1,k}^*(0) $ are the initial values of the desired error trajectory and of its derivative; and $ \tilde{\boldsymbol{q}}_{1,k}^*(\Delta) $ and $ \dot {\tilde {\boldsymbol{q}}}_{1,k}^*(\Delta) $ are the values of the desired error trajectory and of its derivative at the joining point.
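This design of the desired error trajectory does not require the system state to satisfy the identical initial condition (5); it only requires the designed transition trajectory to satisfy conditions (6) and (7). Condition (6) ensures that the desired error trajectory connects the initial error to the joining point: $ \tilde {\boldsymbol{q}}_{1,k}^*(0) = \tilde {\boldsymbol{q}}_{1,k}(0) $ means that the initial value of the desired error trajectory equals the actual initial error. Condition (7) ensures that the desired error trajectory is smooth and differentiable at both ends: $ \dot {\tilde {\boldsymbol{q}}}_{1,k}^*(0) = \tilde {\boldsymbol{q}}_{2,k}(0) $ guarantees smoothness at the initial point, and $ \dot {\tilde {\boldsymbol{q}}}_{1,k}^*(\Delta) = 0 $ guarantees smoothness at the joining time $ t = \Delta $.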
Based on (6) and (7), this paper constructs a new hyperbolic-cosine desired error trajectory of the form
$$ \tilde{\boldsymbol{q}}_{1,k}^*(t) =\left\{ {\begin{array}{*{20}{l}} {\tilde {\boldsymbol{q}}_{ir,k}(t),}&{t\in[0,\Delta]}\\ {0,}&{{\text{otherwise}}} \end{array}} \right.$$ (8)
where $ \Delta $ is the joining time and $ \tilde {\boldsymbol{q}}_{ir,k}(t) $ is the error transition trajectory, given by
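$$ \tilde {\boldsymbol{q}}_{ir,k}(t) = [\tilde{\boldsymbol{q}}_{1,k}(0)+{\tilde {\boldsymbol{q}}}_{2,k}(0){\rm{sin}}(t)](2-{\rm{cosh}}(at))^3 $$ (9)
For the desired error trajectory to satisfy condition (6), the constant $ a $ is determined from $\tilde{\boldsymbol{q}}_{1,k}^*(\Delta) = 0$, i.e., ${\rm{cosh}}(a\Delta) = 2$, which gives $ a = \frac{1}{\Delta}{\rm{ln}}(2\pm\sqrt3) $. The value of $ a $ therefore depends only on the preset joining time $ \Delta $. Because the factor $(2-{\rm{cosh}}(at))^3$ equals 1 with zero slope at $t = 0$ and has a triple zero at $t = \Delta$, conditions (6) and (7) are both met. Equation (9) thus constructs a smooth, continuous error transition trajectory $ \tilde {\boldsymbol{q}}_{ir,k}(t) $ between the initial error and the zero-error point. This amounts to arranging an ideal error convergence transient for manipulator trajectory tracking; the controller is then designed so that the actual position error tracks this arranged transient, and the manipulator position finally tracks the desired trajectory quickly and accurately.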
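The boundary conditions (6) and (7) can be verified for the transition trajectory (9) with a few lines of code. The sketch below treats a single joint (scalar error); the function names and the sample values of `delta`, `e0` and `de0` are hypothetical and chosen only for illustration.

```python
import math


def cosh_gain(delta):
    """Positive root a of cosh(a * delta) = 2, i.e. a = ln(2 + sqrt(3)) / delta."""
    return math.log(2.0 + math.sqrt(3.0)) / delta


def desired_error(t, e0, de0, delta):
    """Desired error trajectory (8)-(9) for one joint.

    e0  : initial tracking error, q~_{1,k}(0)
    de0 : initial tracking-error rate, q~_{2,k}(0)
    """
    if t < 0.0 or t > delta:
        return 0.0
    a = cosh_gain(delta)
    return (e0 + de0 * math.sin(t)) * (2.0 - math.cosh(a * t)) ** 3


def desired_error_rate(t, e0, de0, delta):
    """Analytic time derivative of desired_error (product rule on (9))."""
    if t < 0.0 or t > delta:
        return 0.0
    a = cosh_gain(delta)
    g = 2.0 - math.cosh(a * t)
    return (de0 * math.cos(t) * g ** 3
            - 3.0 * a * math.sinh(a * t) * (e0 + de0 * math.sin(t)) * g ** 2)


# Assumed sample values: joining time and initial error of one iteration.
delta, e0, de0 = 0.5, 0.8, -0.3
start = (desired_error(0.0, e0, de0, delta), desired_error_rate(0.0, e0, de0, delta))
end = (desired_error(delta, e0, de0, delta), desired_error_rate(delta, e0, de0, delta))
```

Only `e0` and `de0` change from one iteration to the next; the gain returned by `cosh_gain` is fixed once `delta` is chosen, which is the design simplification the text describes.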
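From (8) and Fig. 1, the desired error trajectory is continuously differentiable over the whole desired interval $ [0,T] $. Once the state error perfectly tracks the desired error trajectory, the joint position $ {\boldsymbol{q}}_{1,k} $ perfectly tracks the given desired trajectory $ {\boldsymbol{q}}_d $ on the specified interval $ [\Delta,T] $.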
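The state correction method [19-20, 30] is another way to handle the initial-condition problem. To track from an arbitrary initial value, it corrects the desired trajectory by designing a transition trajectory that connects the initial position of each iteration to the desired trajectory, as shown in Fig. 2. The corrected desired trajectory is [19]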
$$ {\boldsymbol{q}}_{1,k}^*(t) =\left\{ {\begin{array}{*{20}{l}} {{\boldsymbol{q}}_{r,k}(t),}&\;{t\in[0,\Delta]}\\ {{\boldsymbol{q}}_d(t),}&{{\text{otherwise}}} \end{array}} \right.$$ (10)
where $ {\boldsymbol{q}}_{r,k}(t) = {\boldsymbol{A}}_0 t^3+{\boldsymbol{A}}_1 t^2+{\boldsymbol{A}}_2 t+{\boldsymbol{A}}_3 $ is the state transition trajectory with
$$\begin{split} &{{\boldsymbol{A}}_0 = \frac{\Delta \dot {\boldsymbol{q}}_d(\Delta)-2{\boldsymbol{q}}_d(\Delta)+\Delta \dot {\boldsymbol{q}}_{1,k}(0)+2{\boldsymbol{q}}_{1,k}(0)}{\Delta^3}}\\ &{{\boldsymbol{A}}_1 = -\frac{\Delta \dot {\boldsymbol{q}}_d(\Delta)-3{\boldsymbol{q}}_d(\Delta)+2\Delta \dot {\boldsymbol{q}}_{1,k}(0)+3{\boldsymbol{q}}_{1,k}(0)}{\Delta^2}}\\ &{{\boldsymbol{A}}_2 = \dot {\boldsymbol{q}}_{1,k}(0) , \;{\boldsymbol{A}}_3 = {\boldsymbol{q}}_{1,k}(0)} \end{split} $$
Trajectory (10) guarantees that the transition trajectory joins the desired state trajectory smoothly. However, the parameters $ {\boldsymbol{A}}_0 $ and $ {\boldsymbol{A}}_1 $ depend on the desired state trajectory at the joining point, $ {\boldsymbol{q}}_d(\Delta) $ and $ \dot{\boldsymbol{q}}_d(\Delta) $. Because the initial values $ {\boldsymbol{q}}_{1,k}(0) $ and $ \dot {\boldsymbol{q}}_{1,k}(0) $ change with the iteration index, and because the parameters ${\boldsymbol{A}}_0 \sim{\boldsymbol{A}}_3$ must also be recomputed whenever the desired state trajectory changes, the computational load is heavy.
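For comparison, the cubic coefficients of the state correction trajectory (10) can be computed as follows. This is a sketch for a single joint; the numerical inputs (joining time `0.5` and the first component of the desired trajectory used later in the simulation section) are assumptions made for illustration.

```python
import math


def cubic_transition_coeffs(q0, dq0, qd_delta, dqd_delta, delta):
    """Coefficients A0..A3 of q_r(t) = A0 t^3 + A1 t^2 + A2 t + A3 in (10).

    q0, dq0             : initial position and velocity of the current iteration
    qd_delta, dqd_delta : desired position and velocity at the joining time
    """
    a0 = (delta * dqd_delta - 2.0 * qd_delta + delta * dq0 + 2.0 * q0) / delta ** 3
    a1 = -(delta * dqd_delta - 3.0 * qd_delta + 2.0 * delta * dq0 + 3.0 * q0) / delta ** 2
    return a0, a1, dq0, q0


def cubic(t, coeffs):
    """Evaluate the cubic transition trajectory (Horner form)."""
    a0, a1, a2, a3 = coeffs
    return ((a0 * t + a1) * t + a2) * t + a3


def cubic_rate(t, coeffs):
    """Evaluate the time derivative of the cubic transition trajectory."""
    a0, a1, a2, _ = coeffs
    return (3.0 * a0 * t + 2.0 * a1) * t + a2


# Assumed example: q_d(t) = 0.2 cos(0.5 pi t), joining time 0.5 s.
delta = 0.5
qd_delta = 0.2 * math.cos(0.5 * math.pi * delta)
dqd_delta = -0.1 * math.pi * math.sin(0.5 * math.pi * delta)
coeffs = cubic_transition_coeffs(1.2, 0.0, qd_delta, dqd_delta, delta)
```

Unlike the single constant $a$ of the hyperbolic-cosine design, these coefficients take `qd_delta` and `dqd_delta` as inputs, so they have to be recomputed whenever the desired trajectory or the initial state changes.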
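Whereas the state correction method (10) corrects the desired trajectory $ {\boldsymbol{q}}_{1,k}^*(t) $ directly, this paper corrects the desired error trajectory $ \tilde{\boldsymbol{q}}_{1,k}^*(t) $ by constructing a new hyperbolic-cosine transition trajectory $ \tilde {\boldsymbol{q}}_{ir,k}(t) $. Only two known quantities are needed, the actual initial error $ \tilde {\boldsymbol{q}}_{1,k}(0) $ and its derivative $ \tilde {\boldsymbol{q}}_{2,k}(0) $, and the function value and its derivative at the joining time $ t = \Delta $ are set to zero. Because the desired error trajectory is independent of the desired trajectory, its expression does not need to be redesigned at each iteration for different desired trajectories. In summary, the proposed hyperbolic-cosine desired error trajectory (8) avoids the dependence on $ {\boldsymbol{q}}_d(\Delta) $ and $ \dot{\boldsymbol{q}}_d(\Delta) $ and requires only one constant $ a $ that is independent of the desired trajectory, so the design is simpler and the computational load is lower.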
Remark 1. Reference [23] uses a polynomial error transition trajectory of the form
$$ e_{ir,k}^\prime(t) = e_{1,k}(0)({\boldsymbol{A}}_0^\prime t^3+{\boldsymbol{A}}_1^\prime t^2+{\boldsymbol{A}}_2^\prime t+{\boldsymbol{A}}_3^\prime) $$ (11)
where ${\boldsymbol{A}}_0^\prime = ({\Delta \dot{e}_{1,k}(0)+2e_{1,k}(0)})/{\Delta^3}$, ${\boldsymbol{A}}_1^\prime = (-2\Delta \dot{e}_{1,k}(0)- 3e_{1,k}(0))/{\Delta^2}$, ${\boldsymbol{A}}_2^\prime = \dot e_{1,k}(0)$, ${\boldsymbol{A}}_3^\prime = e_{1,k}(0)$, and $ e_{1,k}(0) $ and $ \dot{e}_{1,k}(0) $ are the initial value of the state error and of its derivative, respectively.
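Comparing the error transition trajectories (9) and (11) shows that, although the parameters of both forms are independent of the desired trajectory, the hyperbolic-cosine transition function designed in this paper contains only one constant $ a $ and is therefore simpler to design.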
For brevity, and where no ambiguity arises, let $ {\boldsymbol{M}}_k = {\boldsymbol{M}}({{\boldsymbol{q}}}_{1,k}) $, $ {\boldsymbol{C}}_k = {\boldsymbol C}({{\boldsymbol{q}}}_{1,k},{{\boldsymbol{q}}}_{2,k}) $, $ {{\boldsymbol{G}}}_k = {{\boldsymbol{G}}}({{\boldsymbol{q}}}_{1,k}) $, $ {{\boldsymbol{\theta}}} = {{\boldsymbol{\theta}}}(t) $, $ \tilde {\boldsymbol{q}}_{1,k}^* = \tilde {\boldsymbol{q}}_{1,k}^*(t) $ and $ \dot {\tilde {\boldsymbol{q}}}_{1,k}^* = \dot {\tilde {\boldsymbol{q}}}_{1,k}^*(t) $; $ \|\cdot\| $ denotes the ${\rm{L}}_2$ norm.
3. Design of the Variable-Length Error-Tracking Iterative Learning Controller
3.1 Varying Iteration Length
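This paper considers the varying-length problem of the manipulator, in which the actual running time $ T_k $ changes with the iteration index. Two cases can occur: $ T_k>T $ and $ T_k\le T $. When $ T_k>T $, the data on the desired iteration interval $ [0,T] $ already provide everything the next iteration needs, while the data on $ (T,T_k] $ exceed the desired iteration length, take no part in the next update, and can simply be discarded. This paper therefore concentrates on the case $ T_k\le T $, in which the actual iteration length lies within the desired iteration length. For $ t\in(T_k, T] $ the system has finished the current iteration and the controller no longer acts, but the parameter update law still has to carry the information of the previous iteration over the unexecuted interval, so that each iteration always learns from the most recent valid information. Let $ T_{\rm{min}} $ and $ T_{\rm{max }} $ denote the minimum and maximum iteration lengths among the $ T_k $; the controller is designed for the case $ \Delta<T_{\rm{min}}\le T_k\le T_{\rm{max}} = T $.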
Assumption 1 [28]. $ T_k $ is a random variable with probability distribution function
$$ F_{T_k}(t) = P[T_k<t] = \left\{ {\begin{array}{*{20}{l}} {0,}&{t \in [0,{T_{{\rm{min}}}}]}\\ {p(t),}&{t \in ({T_{{\rm{min}}}},{T_{{\rm{max}}}}]}\\ {1,} & {t \in ({T_{{\rm{max}}}},\infty )} \end{array}} \right.$$ (12)
where $ P[\cdot] $ denotes probability and $ 0\le p(t)\le 1 $ is a continuous function.
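Remark 2. Assumption 1 describes the distribution of $ T_k $. From (12), $ F_{T_k}(T_{\rm{min}}) = 0 $ and $ F_{T_k}(T_{\rm{max}})<1 $. The former means that the system keeps running for $t\in[0,T_{\min})$, since $ T_k\geq T_{\rm{min}} $. The latter means that $ T_k = T_{\rm{max}} $ occurs with positive probability, i.e., the system runs to the maximum iteration length with positive probability, which guarantees that the case $ T_k = T_{\rm{max}} $ occurs infinitely often as the number of iterations tends to infinity. Moreover, because the controller and the parameter learning laws do not depend on the probability distribution function of $ T_k $, this distribution does not need to be known in advance, and Assumption 1 applies to most practical systems.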
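The two facts used in Remark 2 can be illustrated with sampled iteration lengths. The sketch below assumes, purely for illustration, that $T_k$ is drawn uniformly from a time grid on $[T_{\rm{min}}, T_{\rm{max}}] = [4, 5]$ s, so that $T_k = T_{\rm{max}}$ has positive probability; the grid resolution and the function names are not from the paper.

```python
import random


def sample_iteration_lengths(num_iters, t_min, t_max, n_steps, seed=0):
    """Draw T_k uniformly from the grid t_min, ..., t_max with n_steps intervals.

    The grid is an assumption of this sketch; the controller itself never
    needs to know the distribution of T_k.
    """
    rng = random.Random(seed)
    span = t_max - t_min
    return [t_min + span * rng.randint(0, n_steps) / n_steps for _ in range(num_iters)]


def empirical_cdf(lengths, t):
    """Estimate F(t) = P[T_k < t] from the sampled lengths, as in (12)."""
    return sum(1 for tk in lengths if tk < t) / len(lengths)


def running_probability(lengths, t):
    """Estimate 1 - F(t), the probability that an iteration is still running at t."""
    return 1.0 - empirical_cdf(lengths, t)


lengths = sample_iteration_lengths(2000, 4.0, 5.0, 10)
f_at_min = empirical_cdf(lengths, 4.0)            # no iteration stops before T_min
reach_max = running_probability(lengths, 5.0)     # share of iterations reaching T_max
```

Here `f_at_min` corresponds to $F_{T_k}(T_{\rm{min}})$ and `reach_max` to $1 - F_{T_k}(T_{\rm{max}})$.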
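Because the system returns to its initial state for the next iteration once it reaches $ t = T_k $, the current iteration contains no tracking information on $ (T_k, T] $. Unlike existing work that pads the missing information with zeros [24-25], this paper uses the error value at time $ T_k $ to compensate for the error information of the unexecuted part, and defines the following virtual error variables: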
$$ {\boldsymbol{z}}_{1,k} = \left\{ {\begin{array}{*{20}{l}} {\tilde {\boldsymbol{q}}_{1,k}(t)-\tilde {\boldsymbol{q}}_{1,k}^*(t),}&{t\in[0,T_k\,]}\\ {\tilde {\boldsymbol{q}}_{1,k}(T_k)-\tilde {\boldsymbol{q}}_{1,k}^*(T_k),}&{t\in(T_k,T\,]} \end{array}} \right. $$ (13)
$$ {\boldsymbol{z}}_{2,k} = \left\{ {\begin{array}{*{20}{l}} { \tilde {\boldsymbol{q}}_{2,k}(t)-{\boldsymbol{\alpha}}_{k}(t),}&{t\in[0,T_k]}\\ {\tilde {\boldsymbol{q}}_{2,k}(T_k)-{\boldsymbol{\alpha}}_{k}(T_k),}&{t\in(T_k,T] } \end{array}} \right.$$ (14)
where $ {\boldsymbol{\alpha}}_{k} = {\boldsymbol{\alpha}}_{k}(t) $ is the virtual controller, whose design is described in Section 3.2.
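Remark 3. For $ t\in(T_k,T] $, the tracking information in $ {\boldsymbol{z}}_{1,k} $ and $ {\boldsymbol{z}}_{2,k} $ is filled in artificially by the error compensation mechanism and is not an error signal produced by the running system. This supplementary information is used only in the stability analysis, not in the design of the controller or of the parameter learning laws.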
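On a sampled time grid, the compensation in (13) amounts to holding the last recorded error over the unexecuted samples. The following sketch handles one joint; the list-based data layout and the name `k_stop` (index of the sample at $T_k$) are assumptions of this example.

```python
def virtual_error(tracking_error, desired_error, k_stop):
    """Virtual error z_1 of (13) on a uniform time grid over [0, T].

    tracking_error : recorded samples of q~_1 on [0, T_k] (k_stop + 1 values)
    desired_error  : samples of q~_1^* on the whole grid [0, T]
    k_stop         : grid index of T_k; later samples hold the value at T_k
    """
    z1 = []
    for i in range(len(desired_error)):
        j = min(i, k_stop)  # freeze the index once the iteration has stopped
        z1.append(tracking_error[j] - desired_error[j])
    return z1


# Hypothetical data: the iteration stops at the third of five grid points.
recorded = [0.5, 0.3, 0.1]
reference = [0.5, 0.2, 0.0, 0.0, 0.0]
z1 = virtual_error(recorded, reference, k_stop=2)
```

As Remark 3 notes, the held samples matter only for the analysis; a controller implementation never reads them.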
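3.2 Controller Design
To guarantee stability under varying iteration lengths, the controller and learning laws are designed separately for the two cases $ t\in[0, T_k] $ and $ t\in(T_k, T] $.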
1) Case 1: $ t\in[0,T_k] $
Step 1. Construct the Lyapunov function
$$ V_{1,k} = \frac{1}{2}{\boldsymbol{z}}_{1,k}^{\rm{T}}{\boldsymbol{z}}_{1,k} $$ (15)
Differentiating (15) gives
$$ \begin{split} \dot V_{1,k} =\;& {\boldsymbol{z}}_{1,k}^{\rm{T}}\dot {\boldsymbol{z}}_{1,k}=\\ &{\boldsymbol{z}}_{1,k}^{\rm{T}}(\dot {\boldsymbol{q}}_{1,k}-\dot {\boldsymbol{q}}_{d}-\dot {\tilde {\boldsymbol{q}}}_{1,k}^*)=\\ &{\boldsymbol{z}}_{1,k}^{\rm{T}}({\boldsymbol{z}}_{2,k}+{{\boldsymbol{\alpha}}_{{k}}}-\dot {\tilde {\boldsymbol{q}}}_{1,k}^*) \end{split} $$ (16)
The virtual controller is designed as
$$ {\boldsymbol{\alpha}}_{k} = -c_1{\boldsymbol{z}}_{1,k}+\dot {\tilde {\boldsymbol{q}}}_{1,k}^* $$ (17)
where $ c_1>0 $ is a constant.
Substituting the virtual controller (17) into (16) yields
$$ \dot V_{1,k} = {\boldsymbol{z}}_{1,k}^{\rm{T}}{\boldsymbol{z}}_{2,k}-c_1{\boldsymbol{z}}_{1,k}^{\rm{T}}{\boldsymbol{z}}_{1,k} $$ (18)
Step 2. Construct the Lyapunov function
$$ V_{2,k} = \frac{1}{2}{\boldsymbol{z}}_{2,k}^{\rm{T}} {\boldsymbol{M}}_k {\boldsymbol{z}}_{2,k} $$ (19)
Differentiating (19) gives
$$ \begin{split} \dot V_{2,k} =\;& {\boldsymbol{z}}_{2,k}^{\rm{T}} {{\boldsymbol{M}}_k} \dot {\boldsymbol{z}}_{2,k}+\frac{1}{2}{\boldsymbol{z}}_{2,k}^{\rm{T}}\dot{ {\boldsymbol{M}}}_k{\boldsymbol{z}}_{2,k}=\\ &{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{M}}_k(\dot {\boldsymbol{q}}_{2,k}-\ddot {\boldsymbol{q}}_{d}-\dot {\boldsymbol{\alpha}}_{k})+\frac{1}{2}{\boldsymbol{z}}_{2,k}^{\rm{T}}\dot {\boldsymbol{M}}_k {\boldsymbol{z}}_{2,k}=\\ &{\boldsymbol{z}}_{2,k}^{\rm{T}}({{\boldsymbol{\tau}}}_k+{{\boldsymbol{d}}}_k-{\boldsymbol{C}}_k{{\boldsymbol{q}}}_{2,k}-{{\boldsymbol{G}}}_k)\;-\\ &{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{M}}_k(\ddot {\boldsymbol{q}}_{d}+\dot {\boldsymbol{\alpha}}_{k})+\frac{1}{2}{\boldsymbol{z}}_{2,k}^{\rm{T}}\dot{ {\boldsymbol{M}}}_k{\boldsymbol{z}}_{2,k}=\\ &{\boldsymbol{z}}_{2,k}^{\rm{T}}[{{\boldsymbol{\tau}}}_k+{{\boldsymbol{d}}}_k-{\boldsymbol{C}}_k(\dot {\boldsymbol{q}}_d+{\boldsymbol{\alpha}}_{k})-{{\boldsymbol{G}}}_k]\;+\\ &\frac{1}{2}{\boldsymbol{z}}_{2,k}^{\rm{T}}(\dot{ {\boldsymbol{M}}}_k-2{\boldsymbol{C}}_k){\boldsymbol{z}}_{2,k}\;-\\ &{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{M}}_k(\ddot {\boldsymbol{q}}_{d}+\dot {{\boldsymbol{\alpha}}}_{k})\\[-10pt] \end{split} $$ (20)
By Property 1, and by Property 2 applied with ${\boldsymbol{v}} = \dot {\boldsymbol{q}}_d+{\boldsymbol{\alpha}}_{k}$, (20) simplifies to
$$ \dot V_{2,k} = {\boldsymbol{z}}_{2,k}^{\rm{T}}({\boldsymbol{\tau}}_k+{\boldsymbol{d}}_k-{{\boldsymbol{W}}}_k{\boldsymbol{\theta}}) $$ (21)
where ${{\boldsymbol{W}}}_k\in {\bf{R}}^{n\times m}$ is the regression matrix at the $ k $th iteration, $ n $ is the number of degrees of freedom of the manipulator, and $ m $ is the dimension of the time-varying parameter vector $ {\boldsymbol{\theta}} $.
The actual controller $ {\boldsymbol{\tau}}_k $ is therefore designed as
$$ {\boldsymbol{\tau}}_k = -{\boldsymbol{z}}_{1,k}-c_2{\boldsymbol{z}}_{2,k}+{\boldsymbol{W}}_k\hat{\boldsymbol{\theta}}_k-\hat {d}_k{\rm{sgn}}({\boldsymbol{z}}_{2,k}) $$ (22)
where $ \hat {\boldsymbol{\theta}}_k $ is the estimate of the parameter $ {\boldsymbol{\theta}} $, $ {\rm{sgn}}(\cdot) $ is the sign function applied elementwise so that ${\rm{sgn}}({\boldsymbol{z}}_{2,k}) \in {\bf{R}}^n$, $ c_2>0 $ is a constant, and $ \hat {d}_k $ is the estimate of $ \bar{d} $.
For $ t\in[0,T_k] $, the fully saturated learning laws for $ \hat{\boldsymbol{\theta}}_k $ and $ \hat {d}_k $ are designed as
$$ \left\{\begin{aligned} &\hat{\boldsymbol{\theta}}_k(t) = {\rm{sat}}(\hat{\boldsymbol{\theta}}_k^*(t))\\ &\hat{\boldsymbol{\theta}}_k^*(t) = {\rm{sat}}(\hat{\boldsymbol{\theta}}_{k-1}^*(t))-\eta {{\boldsymbol{W}}}_k^{\rm{T}} {\boldsymbol{z}}_{2,k} \end{aligned}\right. $$ (23)
$$ \left\{ \begin{aligned} &\hat {{d}}_k(t) = {\rm{sat}}(\hat{d}_k^*(t))\\ &\hat{d}_k^*(t) = {\rm{sat}}(\hat{d}_{k-1}^*(t))+\gamma \| {\boldsymbol{z}}_{2,k}\| \end{aligned} \right. $$ (24)
where $ \eta $ and $ \gamma $ are positive constants, $ \hat{\boldsymbol{\theta}}_{-1}^*(t) = 0 $ and $ \hat d_{-1}^*(t) = 0 $.
Substituting the controller (22) into (21) gives
$$ \begin{split} \dot V_{2,k} = \;&-{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{z}}_{1,k}-c_2{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{z}}_{2,k}\;+\\ &\;{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{W}}_k\tilde{\boldsymbol{\theta}}_k+{\boldsymbol{z}}_{2,k}^{\rm{T}} {\boldsymbol{d}}_k-\hat d_k\|{\boldsymbol{z}}_{2,k}\|\le\\ &-{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{z}}_{1,k}-c_2{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{z}}_{2,k}+{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{W}}_k\tilde{\boldsymbol{\theta}}_k\;+\\ &\;\bar d\|{\boldsymbol{z}}_{2,k}\|-\hat d_k\|{\boldsymbol{z}}_{2,k}\|\le\\ &-{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{z}}_{1,k}-c_2{\boldsymbol{z}}_{2,k}^{\rm{T}} {\boldsymbol{z}}_{2,k}+{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{W}}_k\tilde{\boldsymbol{\theta}}_k\;-\\ &\;\tilde d_k\|{\boldsymbol{z}}_{2,k}\|\\[-10pt] \end{split} $$ (25)
where $ \tilde{\boldsymbol{\theta}}_k = \hat{\boldsymbol{\theta}}_k-{\boldsymbol{\theta}} $ and $ \tilde d_k = \hat d_k-\bar d $.
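At a single time instant, the virtual controller (17) and the control law (22) reduce to a few vector operations. The sketch below uses plain Python lists; the function names and the numerical values in the example are illustrative assumptions.

```python
def sign(x):
    """Elementwise sign function: -1, 0 or 1."""
    return (x > 0) - (x < 0)


def virtual_control(z1, desired_error_rate, c1):
    """Virtual controller (17): alpha = -c1 * z1 + d/dt(q~_1^*)."""
    return [-c1 * z + r for z, r in zip(z1, desired_error_rate)]


def control_torque(z1, z2, W, theta_hat, d_hat, c2):
    """Control law (22): tau = -z1 - c2 z2 + W theta_hat - d_hat sgn(z2).

    W         : n x m regression matrix as a list of rows
    theta_hat : saturated parameter estimate (length m)
    d_hat     : saturated estimate of the disturbance bound
    """
    tau = []
    for i in range(len(z1)):
        model_term = sum(W[i][j] * theta_hat[j] for j in range(len(theta_hat)))
        tau.append(-z1[i] - c2 * z2[i] + model_term - d_hat * sign(z2[i]))
    return tau


# Hypothetical values for a 2-joint arm with m = 2 parameters.
alpha = virtual_control([0.1, -0.2], [0.0, 0.05], c1=2.0)
tau = control_torque(
    z1=[0.1, -0.2], z2=[0.5, -0.4],
    W=[[1.0, 2.0], [0.0, 3.0]], theta_hat=[1.0, 0.5], d_hat=0.2, c2=2.0,
)
```

In practice the discontinuous `sign` term is often smoothed to reduce chattering; the sketch keeps it exactly as written in (22).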
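2) Case 2: $ t\in(T_k, T] $
For $ t\in(T_k,T] $ the actual system has already finished the current iteration, so no control input is applied. The parameter learning laws, however, must still retain the information of the previous iteration, and the fully saturated learning laws are designed as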
$$ \left\{\begin{aligned} &\hat{\boldsymbol{\theta}}_k(t) = {\rm{sat}}(\hat{\boldsymbol{\theta}}_k^*(t))\\ &\hat{\boldsymbol{\theta}}_k^*(t) = {\rm{sat}}(\hat{\boldsymbol{\theta}}_{k-1}^*(t)) \end{aligned}\right. $$ (26)
$$ \left\{\begin{aligned} &\hat d_k(t) = {\rm{sat}}(\hat{d}_k^*(t))\\ &\hat{d}_k^*(t) = {\rm{sat}}(\hat{d}_{k-1}^*(t)) \end{aligned} \right. $$ (27)
From the above, no controller needs to be designed for $ t\in(T_k,T] $, and the learning laws (26) and (27) contain no information about $ {\boldsymbol{z}}_{1,k} $ or $ {\boldsymbol{z}}_{2,k} $. The virtual error variables $ {\boldsymbol{z}}_{1,k} $ and $ {\boldsymbol{z}}_{2,k} $ on $ t\in (T_k,T] $ therefore do not affect the design of the controller or of the parameter learning laws.
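Remark 4. Compared with the existing iteration-average operator method [24], the learning laws (26) and (27) keep carrying forward the parameter information of the previous iteration, storing for later learning the most recent information available at the same time instant. Every learning update is thus based on the nearest previous iteration that reached that instant, so the information of past iterations is fully exploited.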
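Remark 5. This paper uses the fully saturated learning laws (23), (24), (26) and (27), which avoid the situation where pointwise convergence leaves the parameter estimates without fixed upper and lower bounds, so that the estimates $\hat{\boldsymbol{\theta}}_k(t)$ and $ \hat d_k(t) $ are confined by saturation limits of fixed size. Partially saturated learning laws [10] can also guarantee boundedness of the parameter estimates, but because they contain an unsaturated term, the bound on the estimates is not a fixed value.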
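The two cases of the fully saturated learning law can be combined into one update over a time grid. The sketch below treats a single scalar parameter; for (23) the per-sample increment would be one component of $-\eta {\boldsymbol{W}}_k^{\rm{T}}{\boldsymbol{z}}_{2,k}$ and for (24) it would be $\gamma\|{\boldsymbol{z}}_{2,k}\|$. The saturation bounds and all numerical values are assumptions made for illustration.

```python
def sat(x, lower, upper):
    """Saturation function (4)."""
    return max(lower, min(upper, x))


def learning_update(star_prev, increment, k_stop, lower, upper):
    """One iteration of the fully saturated learning law on a time grid.

    star_prev : unsaturated estimate of the previous iteration at every sample
    increment : learning increments for samples 0..k_stop (the part that ran)
    k_stop    : grid index of T_k
    Returns (saturated estimate, unsaturated estimate) for this iteration.
    """
    star = []
    for i, prev in enumerate(star_prev):
        base = sat(prev, lower, upper)
        if i <= k_stop:
            star.append(base + increment[i])   # laws (23)/(24) on [0, T_k]
        else:
            star.append(base)                  # laws (26)/(27) on (T_k, T]
    estimate = [sat(v, lower, upper) for v in star]
    return estimate, star


# Hypothetical two-iteration run on a 4-sample grid with bounds [0, 2].
hat_0, star_0 = learning_update([0.0] * 4, [0.5, 3.0], k_stop=1, lower=0.0, upper=2.0)
hat_1, star_1 = learning_update(star_0, [0.1] * 4, k_stop=3, lower=0.0, upper=2.0)
```

Samples beyond `k_stop` keep the value learned by the last iteration that reached them, which is the behavior described in Remark 4, and every returned estimate stays within the fixed bounds, as stated in Remark 5.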
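4. Stability Analysis
Theorem 1. For the $ n $-degree-of-freedom manipulator (1) with arbitrary initial states and under Assumption 1, the controller (22) together with the parameter learning laws (23), (24), (26) and (27) ensures that, as the number of iterations tends to infinity, the joint position error $\tilde {\boldsymbol{q}}_{1,k}$ converges to zero on $ [\Delta, T_k] $ with probability 1.
Proof. Define the composite energy function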
$$ E_k = V_{1,k}+V_{2,k}+\frac{1}{2\eta}\int_0^t\tilde{\boldsymbol{\theta}}_k^{\rm{T}}\tilde{\boldsymbol{\theta}}_k{\rm{d}}\tau+\frac{1}{2\gamma}\int_0^t\tilde d_k^2{\rm{d}}\tau $$ (28)
1) Proof that $ E_k $ is non-increasing along the iteration axis.
For $ t\in[0,T_k] $, define $ \delta E_k = E_k-E_{k-1} $; then
$$ \begin{split} \delta E_k =\;& V_{1,k}+V_{2,k}-V_{1,k-1}-V_{2,k-1}\;+\\ &\frac{1}{2\eta}\int_0^t(\tilde{\boldsymbol{\theta}}_k^{\rm{T}}\tilde{\boldsymbol{\theta}}_k-\tilde{\boldsymbol{\theta}}_{k-1}^{\rm{T}}\tilde{\boldsymbol{\theta}}_{k-1}){\rm{d}}\tau\;+\\ &\frac{1}{2\gamma}\int_0^t(\tilde d_k^2-\tilde d_{k-1}^2){\rm{d}}\tau \end{split} $$ (29)
In (29), the term $ V_{1,k}+V_{2,k}-V_{1,k-1}-V_{2,k-1} $ can be bounded as
$$ \begin{split} &V_{1,k}+V_{2,k}-V_{1,k-1}-V_{2,k-1} \le\\ &\;\;\;\;V_{1,k}(0)+V_{2,k}(0)+\int_0^t(\dot V_{1,k}+\dot V_{2,k}){\rm{d}}\tau-V_{1,k-1}\le\\ &\;\;\;\;\int_0^t(-c_1{\boldsymbol{z}}_{1,k}^{\rm{T}} {\boldsymbol{z}}_{1,k}-c_2{\boldsymbol{z}}_{2,k}^{\rm{T}} {\boldsymbol{z}}_{2,k}+{\boldsymbol{z}}_{2,k}^{\rm{T}}{\boldsymbol{W}}_k\tilde{\boldsymbol{\theta}}_k\;-\\ &\;\;\;\;\tilde d_k\|{\boldsymbol{z}}_{2,k}\|){\rm{d}}\tau-\frac{1}{2}{\boldsymbol{z}}_{1,k-1}^{\rm{T}} {\boldsymbol{z}}_{1,k-1}\\[-15pt] \end{split} $$ (30)
where $V_{1,k}(0) = \frac{1}{2}{\boldsymbol{z}}_{1,k}^{\rm{T}}(0){\boldsymbol{z}}_{1,k}(0) = 0$ and $V_{2,k}(0) = \frac{1}{2}{\boldsymbol{z}}_{2,k}^{\rm{T}} (0){\boldsymbol{M}}_k (0){\boldsymbol{z}}_{2,k}(0) = 0$, because ${\boldsymbol{z}}_{1,k}(0) = 0$ and ${\boldsymbol{z}}_{2,k}(0) = 0$ follow from conditions (6) and (7) together with (17).
Using the identity $(b-a)^{\rm{T}}(b-a)-(c-a)^{\rm{T}}(c-a) = 2(b- c)^{\rm{T}}(b-a)-(b-c)^{\rm{T}}(b-c)$ and Lemma 1, one obtains
$$ \begin{split} \tilde{\boldsymbol{\theta}}_k^{\rm{T}}\tilde{\boldsymbol{\theta}}_k&-\tilde{\boldsymbol{\theta}}_{k-1}^{\rm{T}}\tilde{\boldsymbol{\theta}}_{k-1}=\\ &2(\hat{\boldsymbol{\theta}}_{k}-\hat{\boldsymbol{\theta}}_{k-1})^{\rm{T}}\tilde{\boldsymbol{\theta}}_{k}-(\hat{\boldsymbol{\theta}}_{k}-\hat{\boldsymbol{\theta}}_{k-1})^{\rm{T}}(\hat{\boldsymbol{\theta}}_{k}-\hat{\boldsymbol{\theta}}_{k-1}) \le\\ &2(\hat{\boldsymbol{\theta}}_{k}+\hat{\boldsymbol{\theta}}_{k}^*-\hat{\boldsymbol{\theta}}_{k}^*-\hat{\boldsymbol{\theta}}_{k-1})^{\rm{T}}\tilde{\boldsymbol{\theta}}_{k}=\\ &2({\rm{sat}}(\hat{\boldsymbol{\theta}}_{k}^*)-\hat{\boldsymbol{\theta}}_{k}^*)^{\rm{T}}({\rm{sat}}(\hat{\boldsymbol{\theta}}_{k}^*)-{{\boldsymbol{\theta}}})\;+\\ &2(\hat{\boldsymbol{\theta}}_{k}^*-\hat{\boldsymbol{\theta}}_{k-1})^{\rm{T}}\tilde{\boldsymbol{\theta}}_{k} \le\\ &-2\eta {\boldsymbol{z}}_{2,k}^{\rm{T}} {\boldsymbol{W}}_k \tilde{\boldsymbol{\theta}}_{k}\\[-13pt] \end{split} $$ (31)
Similarly,
$$ \tilde d_k^2-\tilde d_{k-1}^2\le2\gamma \|{\boldsymbol{z}}_{2,k}\|\tilde d_k $$ (32)
Substituting (30) ~ (32) into (29) gives
$$ \begin{split} \delta E_k\le \;&-\int_0^t(c_1{\boldsymbol{z}}_{1,k}^{\rm{T}} {\boldsymbol{z}}_{1,k}+c_2{\boldsymbol{z}}_{2,k}^{\rm{T}} {\boldsymbol{z}}_{2,k}){\rm{d}}\tau\;-\\ &\frac{1}{2}{\boldsymbol{z}}_{1,k-1}^{\rm{T}} {\boldsymbol{z}}_{1,k-1}\le0 \end{split} $$ (33)
Hence $ E_k $ is non-increasing along the iteration axis for $ t\in[0,T_k] $. For $ t\in(T_k,T] $, the additivity of the integral over subintervals allows (28) to be rewritten as
$$ E_k(t) = \phi_{1,k}(t)+\phi_{2,k}(t) $$ (34)
where
$$ \begin{split} \phi_{1,k}(t) =\;& V_{1,k}(t)+V_{2,k}(t)+\frac{1}{2\eta}\int_0^{T_k}\tilde{\boldsymbol{\theta}}_k^{\rm{T}}\tilde{\boldsymbol{\theta}}_k{\rm{d}}\tau\;+\\ &\frac{1}{2\gamma}\int_0^{T_k}\tilde d_k^2{\rm{d}}\tau\\ \phi_{2,k}(t) =\;& \frac{1}{2\eta}\int_{T_k}^t\tilde{\boldsymbol{\theta}}_k^{\rm{T}}\tilde{\boldsymbol{\theta}}_k{\rm{d}}\tau+\frac{1}{2\gamma}\int_{T_k}^t\tilde d_k^2{\rm{d}}\tau\end{split} $$
Define $ \delta \phi_{1,k} = \phi_{1,k}(t)-\phi_{1,k-1}(t) $. From (13) ~ (15) and (19),
$$ V_{1,k}(t) = V_{1,k}(T_k) $$ (35)
$$ V_{2,k}(t) = V_{2,k}(T_k) $$ (36)
Following the same steps as in (29) ~ (33),
$$ \begin{split} \delta \phi_{1,k} =\;& V_{1,k}(T_k) +V_{2,k}(T_k) -V_{1,k-1}(t) -V_{2,k-1}(t)\;+\\ &\frac{1}{2\eta}\int_0^{T_k}(\tilde{\boldsymbol{\theta}}_k^{\rm{T}}\tilde{\boldsymbol{\theta}}_k-\tilde{\boldsymbol{\theta}}_{k-1}^{\rm{T}}\tilde{\boldsymbol{\theta}}_{k-1}){\rm{d}}\tau\;+\\ &\frac{1}{2\gamma}\int_0^{T_k}(\tilde d_k^2-\tilde d_{k-1}^2){\rm{d}}\tau\le\\ &-\int_0^{T_k}(c_1{\boldsymbol{z}}_{1,k}^{\rm{T}} {\boldsymbol{z}}_{1,k}+c_2{\boldsymbol{z}}_{2,k}^{\rm{T}} {\boldsymbol{z}}_{2,k}){\rm{d}}\tau\;-\\ &\frac{1}{2}{\boldsymbol{z}}^{\rm{T}}_{1,k-1}(t) {\boldsymbol{z}}_{1,k-1}(t)\le 0\\[-15pt] \end{split} $$ (37)
By (37), to prove that $ E_k $ is non-increasing along the iteration axis on $ t\in[0,T] $ it remains only to show that $\delta \phi_{2,k} = \phi_{2,k}(t)- \phi_{2,k-1}(t)\le0$. From the fully saturated learning laws (26) and (27),
$$ \begin{split} \phi_{2,k}(t) = \;& \frac{1}{2\eta}\int_{T_k}^t({\rm{sat}}(\hat{\boldsymbol{\theta}}_{k-1}^*) -{{\boldsymbol{\theta}}})^{\rm{T}}({\rm{sat}}(\hat{\boldsymbol{\theta}}_{k-1}^*) -{{\boldsymbol{\theta}}}){\rm{d}}\tau\;+\\ &\frac{1}{2\gamma}\int_{T_k}^t({\rm{sat}}(\hat d_{k-1}^*)-\bar d)^2{\rm{d}}\tau=\\ & \frac{1}{2\eta}\int_{T_k}^t(\hat{\boldsymbol{\theta}}_{k-1}-{{\boldsymbol{\theta}}})^{\rm{T}}(\hat{\boldsymbol{\theta}}_{k-1}-{{\boldsymbol{\theta}}}){\rm{d}}\tau\;+\\ &\frac{1}{2\gamma}\int_{T_k}^t(\hat d_{k-1}-\bar d)^2{\rm{d}}\tau =\phi_{2,k-1}(t)\\[-15pt] \end{split} $$ (38)
Equation (38) shows that $ \delta \phi_{2,k} = 0 $. Consequently, for $ t\in(T_k,T] $ both $ \delta \phi_{1,k}\le0 $ and $ \delta \phi_{2,k} = 0 $ hold, so $ \delta E_k(t)\le0 $ still holds. In summary, the value of $ E_k $ at every instant of $ [0,T] $ is non-increasing as the number of iterations grows.
2) Proof that $ E_0(t) $ is bounded on $ [0,T] $ for $ k = 0 $.
$$ E_0 = V_{1,0}+V_{2,0}+\frac{1}{2\eta}\int_0^t\tilde{\boldsymbol{\theta}}_0^{\rm{T}}\tilde{\boldsymbol{\theta}}_0{\rm{d}}\tau+\frac{1}{2\gamma}\int_0^t\tilde d_0^2{\rm{d}}\tau $$ (39)
For $ t\in[0, T_k] $, differentiating (39) gives
$$ \begin{split} \dot E_0 = \;& \dot V_{1,0}+\dot V_{2,0}+\frac{1}{2\eta}\tilde{\boldsymbol{\theta}}_0^{\rm{T}}\tilde{\boldsymbol{\theta}}_0-\frac{1}{2\eta}\tilde{\boldsymbol{\theta}}_{-1}^{\rm{T}}\tilde{\boldsymbol{\theta}}_{-1}\;+\\ &\frac{1}{2\eta}\tilde{\boldsymbol{\theta}}_{-1}^{\rm{T}}\tilde{\boldsymbol{\theta}}_{-1}+\frac{1}{2\gamma}\tilde d_0^2-\frac{1}{2\gamma}\tilde d_{-1}^2+\frac{1}{2\gamma}\tilde d_{-1}^2 =\\ &-c_1{\boldsymbol{z}}_{1,0}^{\rm{T}}{\boldsymbol{z}}_{1,0}- c_2{\boldsymbol{z}}_{2,0}^{\rm{T}}{\boldsymbol{z}}_{2,0}\;+\\ &{\boldsymbol{z}}_{2,0}^{\rm{T}}{\boldsymbol{W}}_0\tilde{\boldsymbol{\theta}}_0-\tilde d_0\|{\boldsymbol{z}}_{2,0}\|\;+\\ &\frac{1}{2\eta}{{\boldsymbol{\theta}}}^{\rm{T}}{{\boldsymbol{\theta}}}+\frac{1}{\eta}(\hat{\boldsymbol{\theta}}_{0}-\hat{\boldsymbol{\theta}}_{0}^*)^{\rm{T}}\tilde{\boldsymbol{\theta}}_{0}+\frac{1}{\eta}(\hat{\boldsymbol{\theta}}_{0}^*-\hat{\boldsymbol{\theta}}_{-1})^{\rm{T}}\tilde{\boldsymbol{\theta}}_{0}\;+\\ &\frac{1}{2\gamma} \bar d^2+\frac{1}{\gamma}(\hat d_0-\hat d_{0}^*)\tilde d_{0}+\frac{1}{\gamma}(\hat d_{0}^*-\hat d_{-1})\tilde d_{0}\le L \end{split} $$ (40)
where $L = \frac{1}{2\eta}{{\boldsymbol{\theta}}}^{\rm{T}}{{\boldsymbol{\theta}}}+\frac{1}{2\gamma} \bar d^2 > 0$ is a constant.
Therefore $ E_0(t)\le E_0(0)+T_kL<\infty $, i.e., $ E_0(t) $ is bounded on $ [0,T_k] $.
For $ t\in(T_k, T] $, since $V_{1,0}(t) = V_{1,0}(T_k)$ and $V_{2,0}(t) =V_{2,0}(T_k)$, and since $\frac{1}{2\eta}\int_{0}^{T_k}\tilde{\boldsymbol{\theta}}_0^{\rm{T}}\tilde{\boldsymbol{\theta}}_0{\rm{d}}\tau+\frac{1}{2\gamma}\int_0^{T_k}\tilde d_0^2{\rm{d}}\tau$ has already been shown to be bounded, it remains only to show that $\frac{1}{2\eta}\int_{T_k}^t\tilde{\boldsymbol{\theta}}_0^{\rm{T}}\tilde{\boldsymbol{\theta}}_0{\rm{d}}\tau+\frac{1}{2\gamma}\int_{T_k}^t\tilde d_0^2{\rm{d}}\tau$ is bounded in order to conclude that $ E_0(t) $ is bounded over the whole iteration length $ [0,T] $:
$$ \begin{split} &\frac{1}{2\eta}\int_{T_k}^t\tilde{\boldsymbol{\theta}}_0^{\rm{T}}\tilde{\boldsymbol{\theta}}_0{\rm{d}}\tau+\frac{1}{2\gamma}\int_{T_k}^{t}\tilde d_0^2{\rm{d}}\tau=\\ &\;\;\frac{1}{2\eta}\int_{T_k}^t(\hat{\boldsymbol{\theta}}_0-{{\boldsymbol{\theta}}})^{\rm{T}}(\hat{\boldsymbol{\theta}}_0-{{\boldsymbol{\theta}}}){\rm{d}}\tau+\frac{1}{2\gamma}\int_{T_k}^t(\hat d_0-\bar d)^2{\rm{d}}\tau=\\ &\;\;\frac{1}{2\eta}\int_{T_k}^t(\hat{\boldsymbol{\theta}}_{-1}-{{\boldsymbol{\theta}}})^{\rm{T}}(\hat{\boldsymbol{\theta}}_{-1}-{{\boldsymbol{\theta}}}){\rm{d}}\tau\;+\\ &\;\;\frac{1}{2\gamma}\int_{T_k}^t(\hat d_{-1}-\bar d)^2{\rm{d}}\tau \le\\ &\;\;\frac{T-T_k}{2\eta}{{\boldsymbol{\theta}}}^{\rm{T}}{{\boldsymbol{\theta}}}+\frac{T-T_k}{2\gamma}\bar d^2<\infty \\[-15pt]\end{split} $$ (41)
In summary, $ E_0(t) $ is bounded on $ [0,T] $. Combining parts 1) and 2) yields
$$ \begin{split} E_k(t)= \;&E_0(t)+\sum\limits_{j = 1}^k\delta E_j(t)\le\\ &E_0(t)-\sum\limits_{j = 0}^{k-1}\frac{1}{2}{\boldsymbol{z}}_{1,j}^{\rm{T}} {\boldsymbol{z}}_{1,j}\;-\\ &\sum\limits_{j = 1}^k\int_0^{{\rm{min}}\{t,T_j\}}(c_1{\boldsymbol{z}}_{1,j}^{\rm{T}} {\boldsymbol{z}}_{1,j}+c_2{\boldsymbol{z}}_{2,j}^{\rm{T}} {\boldsymbol{z}}_{2,j}){\rm{d}}\tau \end{split} $$ (42)
Because $ E_0(t) $ is bounded and $ 0\le E_k(t)\le E_0(t) $, it follows that
$$ \begin{split} &\sum\limits_{j = 1}^k\int_0^{{\rm{min}}\{t,T_j\}}(c_1{\boldsymbol{z}}_{1,j}^{\rm{T}} {\boldsymbol{z}}_{1,j}+c_2{\boldsymbol{z}}_{2,j}^{\rm{T}} {\boldsymbol{z}}_{2,j}){\rm{d}}\tau\;+\\ &\qquad\sum\limits_{j = 0}^{k-1}\frac{1}{2}{\boldsymbol{z}}_{1,j}^{\rm{T}} {\boldsymbol{z}}_{1,j} \le E_0(t)-E_k(t)<\infty \end{split} $$ (43)
To show more directly that $ {\boldsymbol{z}}_{1,k} $ converges to 0 in the ${\rm{L}}_2$-norm sense with probability 1, introduce a Bernoulli random variable $ \gamma_k(t) $ taking the value 0 or 1. $ \gamma_k(t) = 1 $ means that the system is still running at time $ t $ of the $ k $th iteration, i.e., $ t\le T_k $; $ \gamma_k(t) = 0 $ means that the system has already stopped at time $ t $ of the $ k $th iteration, i.e., $ T_k<t $. By Assumption 1, the probability of $ \gamma_k(t) = 1 $ is $q(t) = P(\gamma_k(t) = 1) = P(t\le T_k) = 1-P(T_k < t) = 1-F_{T_k}(t)$, and $ q(T)>0 $ means that the system reaches the maximum iteration length with positive probability. Accordingly, $ {\boldsymbol{z}}_{1,k} $ can be written in the alternative form
$$ \begin{split} {\boldsymbol{z}}_{1,k}(t) =\;& \gamma_k(t)[\tilde {\boldsymbol{q}}_{1,k}(t)-\tilde {\boldsymbol{q}}_{1,k}^*(t)]\;+\\ &(1-\gamma_k(t))[\tilde {\boldsymbol{q}}_{1,k}(T_k)-\tilde {\boldsymbol{q}}_{1,k}^*(T_k)] \end{split} $$ (44)
As the number of iterations tends to infinity, (43) and (44) give
$$ \begin{split} &\sum\limits_{j = 0}^\infty\gamma_j(t)[(\tilde {\boldsymbol{q}}_{1,j}(t)-\tilde {\boldsymbol{q}}_{1,j}^*(t))^{\rm{T}}(\tilde {\boldsymbol{q}}_{1,j}(t)-\tilde {\boldsymbol{q}}_{1,j}^*(t))] \le\\ &\qquad\sum\limits_{j = 0}^\infty\gamma_j(t){\boldsymbol{z}}_{1,j}^{\rm{T}}{\boldsymbol{z}}_{1,j}\le\sum\limits_{j = 0}^\infty{\boldsymbol{z}}_{1,j}^{\rm{T}}{\boldsymbol{z}}_{1,j}<\infty\\[-18pt] \end{split} $$ (45)
As $ k\to\infty $, for $t\in[0, T_{{\rm{min}}})$ the equality $ \gamma_j(t) = 1 $ always holds, so $[\tilde {\boldsymbol{q}}_{1,k}(t)- \tilde {\boldsymbol{q}}_{1,k}^*(t)]$ converges to $ 0 $. For $ t\in[T_{{\rm{min}}},T] $, Assumption 1 implies that $ \gamma_j(t) = 1 $ still holds infinitely often, so $[\tilde {\boldsymbol{q}}_{1,k}(t)-\tilde {\boldsymbol{q}}_{1,k}^*(t)]$ still converges to $ 0 $ with probability $ 1 $. In summary, as the number of iterations tends to infinity, the joint position error $ \tilde {\boldsymbol{q}}_{1,k} $ tracks the desired error trajectory $ \tilde {\boldsymbol{q}}_{1,k}^* $. By the definition of the desired error trajectory (8), the joint position error converges to 0 with probability 1 for $ t\in[\Delta,T_k] $, i.e., the joint position $ {\boldsymbol{q}}_{1,k} $ perfectly tracks the desired trajectory on $ t\in[\Delta,T_k] $.
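□
Remark 6. From (17) and (22), choosing the controller parameters $ c_1 $ and $ c_2 $ too large leads to high-gain control, whereas choosing them too small slows down error convergence. Choosing the learning gains $ \eta $ and $ \gamma $ of the learning laws (23) and (24) too small reduces the learning rate of the iterative learning controller, whereas choosing them too large may cause unnecessary oscillation or even divergence of the system state.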
5. Simulation Analysis
Consider a $ 2 $-degree-of-freedom manipulator system described by [30]
$$ \left\{\begin{aligned} &\dot {{{\boldsymbol{q}}}}_{1,k} = {{\boldsymbol{q}}}_{2,k}\\ &{\boldsymbol{M}}({{\boldsymbol{q}}}_{1,k})\dot {\boldsymbol{q}}_{2,k}+{\boldsymbol{C}}({{\boldsymbol{q}}}_{1,k},{{\boldsymbol{q}}}_{2,k}){{\boldsymbol{q}}}_{2,k}\;+\\ &\qquad{{\boldsymbol{G}}}({{\boldsymbol{q}}}_{1,k}) = {{\boldsymbol{\tau}}}_k+{{\boldsymbol{d}}}_k \end{aligned} \right. $$ (46)
where
$$ \begin{split} &{\boldsymbol{q}}_{1,\,k} = [q_{11,\,k},q_{12,\,k}]^{\rm{T}},{{\boldsymbol{q}}}_{2,\,k} = [q_{21,\,k},q_{22,\,k}]^{\rm{T}}\\ & {\boldsymbol{M}}({\boldsymbol{q}}_{1,k}) = [M_{11}, M_{12}; M_{21}, M_{22}] \\ &M_{11} = m_1L_1^2+ m_2 (L_1^2+L_2^2+2L_1L_2{\rm{cos}}(q_{12,k}))\\ &M_{12} = M_{21} = m_2 (L_2^2+ L_1L_2{\rm{cos}}(q_{12,k}))\\ &M_{22} = m_2L_2^2,C_r = m_2L_1L_2{\rm{sin}}(q_{12,k})\\ &{\boldsymbol{C}}({\boldsymbol{q}}_{1,k}, {\boldsymbol{q}}_{2,k}) = [-C_rq_{22,k}, -C_r(q_{21,k}\;+\\ &\;\;\;\qquad q_{22,k}); C_r q_{21,k}, 0],\;\;{\boldsymbol{G}}({\boldsymbol{q}}_{1,k}) =[G_1,G_2]^{\rm{T}} \\ &G_1 = (m_1+m_2)L_1g{\rm{cos}}(q_{11,k})\;+\\ &\;\;\;\qquad m_2L_2g{\rm{cos}}(q_{11,k}+q_{12,k})\\ &G_2 = m_2L_2g{\rm{cos}}(q_{11,k}+q_{12,k})\end{split} $$
Method 1 (M1). The error-tracking iterative learning control method proposed in this paper, consisting of the controllers (17) and (22) and the learning laws (23) and (24).
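The dynamics terms defined under (46) can be written out directly, which also makes it possible to check Property 1 numerically for this model. In the sketch below the link masses, link lengths and gravitational acceleration are placeholder values chosen for illustration, since they are not specified in this section.

```python
import math

# Assumed physical parameters (placeholders, not taken from the paper).
MASS_1, MASS_2 = 1.0, 1.0      # link masses m1, m2
LEN_1, LEN_2 = 1.0, 1.0        # link lengths L1, L2
GRAVITY = 9.8                  # gravitational acceleration g


def inertia(q1):
    """Inertia matrix M(q1) of the two-link arm."""
    c2 = math.cos(q1[1])
    m11 = MASS_1 * LEN_1 ** 2 + MASS_2 * (LEN_1 ** 2 + LEN_2 ** 2 + 2.0 * LEN_1 * LEN_2 * c2)
    m12 = MASS_2 * (LEN_2 ** 2 + LEN_1 * LEN_2 * c2)
    return [[m11, m12], [m12, MASS_2 * LEN_2 ** 2]]


def inertia_rate(q1, q2):
    """Time derivative of M(q1) along the motion (q1' = q2)."""
    cr = MASS_2 * LEN_1 * LEN_2 * math.sin(q1[1])
    return [[-2.0 * cr * q2[1], -cr * q2[1]], [-cr * q2[1], 0.0]]


def coriolis(q1, q2):
    """Centripetal-Coriolis matrix C(q1, q2)."""
    cr = MASS_2 * LEN_1 * LEN_2 * math.sin(q1[1])
    return [[-cr * q2[1], -cr * (q2[0] + q2[1])], [cr * q2[0], 0.0]]


def gravity(q1):
    """Gravity vector G(q1)."""
    g2 = MASS_2 * LEN_2 * GRAVITY * math.cos(q1[0] + q1[1])
    g1 = (MASS_1 + MASS_2) * LEN_1 * GRAVITY * math.cos(q1[0]) + g2
    return [g1, g2]


def skew_residual(q1, q2):
    """Largest deviation of N = dM/dt - 2C from skew symmetry (Property 1)."""
    md, c = inertia_rate(q1, q2), coriolis(q1, q2)
    n = [[md[i][j] - 2.0 * c[i][j] for j in range(2)] for i in range(2)]
    return max(abs(n[0][0]), abs(n[1][1]), abs(n[0][1] + n[1][0]))


residual = skew_residual([0.3, -0.7], [1.2, 0.4])
```

For this model the residual vanishes for any state, because the diagonal of $\dot{\boldsymbol{M}}-2{\boldsymbol{C}}$ is zero and its off-diagonal entries are $\pm C_r(2q_{21,k}+q_{22,k})$.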
The regressor matrix ${\boldsymbol{W}}_k $ in controller (22) is given by
${{\boldsymbol{W}}}_k = [W_{11}, W_{12}; W_{21}, W_{22}]$
$W_{11}= L_1^2 (\ddot q_{d1}+ \dot \alpha_{1,k})+L_1g{\rm{cos}}(q_{11,k})$, $ W_{21} = 0 $
$$ \begin{split} W_{12} =\;& (L_1^2+L_2^2+2L_1L_2{\rm{cos}}(q_{12,k}))(\ddot q_{d1}+\dot \alpha_{1,k})\;-\\ &L_1L_2{\rm{sin}}(q_{12,k})[(q_{21,k}+q_{22,k})(\dot q_{d2}+\alpha_{2,k})+q_{22,k}(\dot q_{d1}+\alpha_{1,k})]\;+\\ &L_1g{\rm{cos}}(q_{11,k})+L_2g{\rm{cos}}(q_{11,k}+q_{12,k}) \end{split} $$
$$ \begin{split} W_{22} = \;& (L_2^2+L_1L_2{\rm{cos}}(q_{12,k}))(\ddot q_{d1}+\dot \alpha_{1,k})\;+\\ &L_1L_2q_{21,k}{\rm{sin}}(q_{12,k})(\dot q_{d1}+\alpha_{1,k})\;+\\ &L_2^2(\ddot q_{d2}+\dot \alpha_{2,k})+L_2g{\rm{cos}}(q_{11,k}+q_{12,k}) \end{split} $$
Method 2 (M2). An adaptive control method for the manipulator, whose virtual controller and actual controller are designed, respectively, as
$$ {\boldsymbol{\alpha}} = -c_1 {\boldsymbol{z}}_1+{\boldsymbol{q}}_d $$ (47)
$$ {\boldsymbol{\tau}} = -{\boldsymbol{z}}_1-c_2 {\boldsymbol{z}}_2+{\boldsymbol{W}}\hat{\boldsymbol{\theta}}-\bar{d}{\rm{sgn}}({\boldsymbol{z}}_2) $$ (48)
where the adaptive update law for the estimate $\hat{{\boldsymbol{\theta}}}$ of the unknown parameters is designed as
$$ \dot{\hat{ {\boldsymbol{\theta}}}} = -{\boldsymbol{W}}^{\rm{T}} {\boldsymbol{z}}_2+\varepsilon \hat{\boldsymbol{\theta}} $$ (49)
where
$$ \begin{split} & W' = [W'_{11}, W'_{12}; W'_{21}, W'_{22}]\\ &W'_{11} = L_1^2\dot \alpha_{1,k}+ L_1g{\rm{cos}}(q_{11}) ,\quad W'_{21} = 0\\ &W'_{12} = (L_1^2+L_2^2+2L_1L_2{\rm{cos}}(q_{12}))\dot \alpha_{1,k}\;-\\ &\qquad\quad L_1L_2{\rm{sin}}(q_{12})[(q_{21}+q_{22})\alpha_{2}+q_{22}\alpha_{1,k}]\;+\\ &\qquad\quad L_1g{\rm{cos}}(q_{11})+L_2g{\rm{cos}}(q_{11}+q_{12})\\ &W'_{22} = (L_2^2 + L_1L_2{\rm{cos}}(q_{12}))\dot \alpha_{1,k} + L_1L_2q_{21}{\rm{sin}}(q_{12})\alpha_{1}\;+\\ &\qquad\quad L_2^2\dot \alpha_{2}+L_2g{\rm{cos}}(q_{11}+q_{12}) \end{split} $$
In the simulation, the initial state is set as the random variable ${\boldsymbol{q}}_{1,k}(0) = [1+0.5{\rm{rand}}(1),0.5+0.4{\rm{rand}}(1)]^{\rm{T}}$, the trial length $ T_k $ of each iteration is uniformly distributed on $ [4,5] $ s, the desired trial length is $ T_{\rm{max}} = 5\;{\rm{s}} $, and the desired trajectory is given as ${\boldsymbol{q}}_d = [0.2{\rm{cos}}(0.5\pi t), 0.1{\rm{sin}}(\pi t)+ 0.1{\rm{cos}}(\pi t)]^{\rm{T}}$.
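The randomized setup described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the names `desired_trajectory` and `sample_trial` are hypothetical, and NumPy's `Generator.random()` (uniform on $[0, 1)$) is used as a stand-in for ${\rm{rand}}(1)$.

```python
import numpy as np

T_MAX = 5.0  # desired trial length in seconds


def desired_trajectory(t):
    """Desired joint trajectory q_d(t) = [q_d1, q_d2]."""
    return np.array([
        0.2 * np.cos(0.5 * np.pi * t),
        0.1 * np.sin(np.pi * t) + 0.1 * np.cos(np.pi * t),
    ])


def sample_trial(rng):
    """Draw the random initial joint position and trial length of one iteration."""
    # q_{1,k}(0) = [1 + 0.5 rand(1), 0.5 + 0.4 rand(1)]^T
    q0 = np.array([1.0 + 0.5 * rng.random(), 0.5 + 0.4 * rng.random()])
    # T_k uniformly distributed on [4, 5] s
    T_k = rng.uniform(4.0, T_MAX)
    return q0, T_k


rng = np.random.default_rng(0)
q0, T_k = sample_trial(rng)
```

Since $ {\boldsymbol{q}}_d(0) = [0.2, 0.1]^{\rm{T}} $ while the first sampled joint position lies in $[1, 1.5)$ and the second in $[0.5, 0.9)$, every iteration starts with a nonzero initial error, which is the situation the desired error trajectory is meant to handle.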
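To ensure a fair comparison, the manipulator parameters are identical for both methods: $L_1 = L_2 = 0.5\;{\rm{m}}$, $g = 9.81\;{\rm{m/s}}^2$, $m_1 = m_2 = 1\;{\rm{kg}}$, $ {\boldsymbol{d}}_k = [0.3\times{\rm{rand}}(1){\rm{sin}}(t), 0.2\times{\rm{rand}}(1){\rm{cos}}(t)]^{\rm{T}} $. The controller parameters of M1 and M2 are set to $ c_1 = 3 $, $ c_2 = 3 $, $ \bar{d} = 0.3 $; the learning-law parameters of M1 are set to $ \eta = 0.08 $, $ \gamma = 0.01 $; and the adaptive-law parameter of M2 is set to $ \varepsilon = 0.01 $. The access point of the desired error trajectory is set to $ \Delta = 0.5\;{\rm{s}} $, and its parameter is chosen as $a = 2{\rm{ln}}(2- \sqrt3)$. The performance indices ${{\rm{avg}}(\|{\boldsymbol{z}}_{1,k}(t)\|)} = \frac{\sum_{i = 1}^{T_k/tc}\|{\boldsymbol{z}}_{1,k}(i\cdot tc)\|}{T_k/tc}$ and $J_{\max} = \max\nolimits_{t\in[0,T_k]}\|{\boldsymbol{z}}_{1,k}(t)\|$ are defined to show how the tracking performance changes with the iteration number. Here, $ {{\rm{avg}}(\|{\boldsymbol{z}}_{1,k}(t)\|)} $ is obtained, for each iteration, by summing the values at all sampling instants and dividing by the number of samples in that iteration; $ T_k $ is the trial length, $ tc $ is the sampling interval, and $ \|{\boldsymbol{z}}_{1,k}(i\cdot tc)\| $ is the Euclidean norm of $ {\boldsymbol{z}}_{1,k} $ at the $ i $-th sampling instant.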
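The two performance indices defined above reduce to a mean and a maximum over the sampled norms. The sketch below assumes the caller has already stored $ {\boldsymbol{z}}_{1,k}(i\cdot tc) $ for $ i = 1, \ldots, T_k/tc $ as the rows of an array; the function name `tracking_metrics` is hypothetical, and $ J_{\max} $ is evaluated on the samples rather than on continuous time.

```python
import numpy as np


def tracking_metrics(z1_samples):
    """Return (avg, J_max) for one iteration.

    z1_samples: array of shape (N, 2), row i holding z_{1,k} at the i-th
    sampling instant, with N = T_k / tc.
    """
    # Euclidean norm of z_{1,k} at every sampling instant
    norms = np.linalg.norm(np.asarray(z1_samples, dtype=float), axis=1)
    avg = norms.sum() / norms.size  # sum over samples divided by T_k / tc
    j_max = norms.max()             # largest sampled norm in the trial
    return avg, j_max


# Two samples with norms 5 and 0
avg, j_max = tracking_metrics([[3.0, 4.0], [0.0, 0.0]])
```

Because $ T_k $ varies between iterations, dividing by the number of samples in each trial keeps the average comparable across trials of different length.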
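The simulation results are shown in Figs. 3 to 11. Figs. 3 and 4 show how the joint positions $ q_{11,k} $ and $ q_{12,k} $ track the desired trajectories $ q_{d1} $ and $ q_{d2} $, respectively. Here, $ q_{11,1} $, $ q_{11,10} $ and $ q_{11,30} $ denote the position output of the first joint after the 1st, 10th and 30th iterations, and $ q_{12,1} $, $ q_{12,10} $ and $ q_{12,30} $ denote the position output of the second joint after the 1st, 10th and 30th iterations. Figs. 3 and 4 show that, starting from arbitrary initial states and after sufficiently many iterations, the proposed variable-trial-length error-tracking iterative learning control method makes the joint positions track the desired trajectories on the specified interval, and that after 30 iterations the tracking accuracy of both joints is better than after the 1st and 10th iterations. Figs. 5 and 6 show how the two joint position errors track the desired error trajectories; both errors converge along the desired error trajectory over the whole trial length. Moreover, after $ t = 0.5\;{\rm{s}} $ the proposed method ensures that both joint position errors converge to a neighborhood of zero, which is consistent with the preceding theoretical analysis. Fig. 7 shows how the performance indices $ {{\rm{avg}}(\|{\boldsymbol{z}}_{1,k}(t)\|)} $ and $ J_{\max} $ evolve with the iteration number. As the iteration number increases, the proposed method effectively improves the convergence performance of the tracking error.
Figs. 8 and 9 show how the joint positions $ q_{11,k} $ and $ q_{12,k} $ track the desired trajectories $ q_{d1} $ and $ q_{d2} $ under the two control methods M1 and M2. Figs. 10 and 11 show the convergence of the joint position errors under the two methods. Figs. 8 to 11 show that, under the proposed method M1, the joint positions $ q_{11,k} $ and $ q_{12,k} $ track faster, the tracking performance is considerably improved, and the errors converge faster, so that the errors track the given desired error trajectory on the specified interval. The simulation results in Figs. 3 to 11 indicate that, for manipulator trajectory tracking from arbitrary initial states, the proposed variable-trial-length error-tracking iterative learning control method makes the joint position errors converge to zero on the specified interval and ensures that the joint positions track the given desired trajectory on that interval.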
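6. Conclusion
To address the initial-condition and non-uniform trial length problems in iterative learning control of robotic manipulators, this paper proposes a variable-trial-length error-tracking iterative learning control method. To relax the identical initial condition, a desired error trajectory is constructed using the hyperbolic cosine function. It requires the design of only one constant term that is independent of the desired trajectory, which keeps the error trajectory simple and intuitive. For the non-uniform trial length problem of ILC, a virtual tracking variable is defined to build an error compensation mechanism that compensates for the error information on the interval that was not run. An iterative learning controller is then designed on this basis to ensure that the joint positions track the given desired trajectory on the specified interval. In addition, fully saturated learning laws are designed to guarantee the boundedness of the parameter estimates. The simulation results verify the effectiveness of the proposed control method.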
Table 1 Results of ablation study

| Trackers | MOTA ($\uparrow$) | MOTP ($\uparrow$) | MT ($\uparrow$) (%) | ML ($\downarrow$) (%) | FP ($\downarrow$) | FN ($\downarrow$) | IDS ($\downarrow$) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| A + T | 19.5 | 74.6 | 7.41 | 66.70 | 109 | 14 202 | 43 |
| M + T | 17.6 | 74.6 | 7.40 | 64.80 | 307 | 14 326 | 70 |
| A + M + T | 21.0 | 74.3 | 9.26 | 70.40 | 175 | 13 893 | 16 |
| A + M + V | 14.7 | 75.1 | 1.85 | 67.00 | 60 | 14 804 | 339 |

Table 2 Results of MOT16 test set

| Trackers | Mode | MOTA ($\uparrow$) | MOTP ($\uparrow$) | MT ($\uparrow$) (%) | ML ($\downarrow$) (%) | FP ($\downarrow$) | FN ($\downarrow$) | IDS ($\downarrow$) | HZ ($\uparrow$) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AMIR[20] | Online | 47.2 | 75.8 | 14.0 | 41.6 | 2 681 | 92 856 | 774 | 1.0 |
| CDA[45] | Online | 43.9 | 74.7 | 10.7 | 44.4 | 6 450 | 95 175 | 676 | 0.5 |
| Ours | Online | 43.1 | 74.2 | 12.4 | 47.7 | 4 228 | 99 057 | 495 | 0.7 |
| EAMTT[41] | Online | 38.8 | 75.1 | 7.9 | 49.1 | 8 114 | 102 452 | 965 | 11.8 |
| OVBT[42] | Online | 38.4 | 75.4 | 7.5 | 47.3 | 11 517 | 99 463 | 1 321 | 0.3 |
| Quad-CNN[17] | Batch | 44.1 | 76.4 | 14.6 | 44.9 | 6 388 | 94 775 | 745 | 1.8 |
| LIN1[43] | Batch | 41.0 | 74.8 | 11.6 | 51.3 | 7 896 | 99 224 | 430 | 4.2 |
| CEM[44] | Batch | 33.2 | 75.8 | 7.8 | 54.4 | 6 837 | 114 322 | 642 | 0.3 |

Table 3 Results of 2DMOT2015 test set

| Trackers | Mode | MOTA ($\uparrow$) | MOTP ($\uparrow$) | MT ($\uparrow$) (%) | ML ($\downarrow$) (%) | FP ($\downarrow$) | FN ($\downarrow$) | IDS ($\downarrow$) | HZ ($\uparrow$) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AMIR[20] | Online | 37.6 | 71.7 | 15.8 | 26.8 | 7 933 | 29 397 | 1 026 | 1.9 |
| Ours | Online | 34.2 | 71.9 | 8.9 | 40.6 | 7 965 | 31 665 | 794 | 0.7 |
| CDA[45] | Online | 32.8 | 70.7 | 9.7 | 42.2 | 4 983 | 35 690 | 614 | 2.3 |
| RNN_LSTM[24] | Online | 19.0 | 71.0 | 5.5 | 45.6 | 11 578 | 36 706 | 1 490 | 165.2 |
| Quad-CNN[17] | Batch | 33.8 | 73.4 | 12.9 | 36.9 | 7 879 | 32 061 | 703 | 3.7 |
| MHT_DAM[46] | Batch | 32.4 | 71.8 | 16.0 | 43.8 | 9 064 | 32 060 | 435 | 0.7 |
| CNNTCM[22] | Batch | 29.6 | 71.8 | 11.2 | 44.0 | 7 786 | 34 733 | 712 | 1.7 |
| Siamese CNN[19] | Batch | 29.0 | 71.2 | 8.5 | 48.4 | 5 160 | 37 798 | 639 | 52.8 |
| LIN1[43] | Batch | 24.5 | 71.3 | 5.5 | 64.6 | 5 864 | 40 207 | 298 | 7.5 |

Table 4 Tracking results of UA-DETRAC dataset

| | MOTA ($\uparrow$) | MOTP ($\uparrow$) | MT ($\uparrow$) (%) | ML ($\downarrow$) (%) | FP ($\downarrow$) | FN ($\downarrow$) | IDS ($\downarrow$) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Vehicle tracking | 65.3 | 78.5 | 75.0 | 8.3 | 1 069 | 481 | 27 |
[1] Luo W H, Xing J L, Milan A, Zhang X Q, Liu W, Zhao X W, et al. Multiple object tracking: A literature review. arXiv preprint arXiv: 1409.7618, 2014.
[2] Felzenszwalb P F, Girshick R B, McAllester D, Ramanan D. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1627-1645 doi: 10.1109/TPAMI.2009.167
[3] Girshick R. Fast R-CNN. In: Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015. 1440-1448
[4] Yin Hong-Peng, Chen Bo, Chai Yi, Liu Zhao-Dong. Vision-based object detection and tracking: A review. Acta Automatica Sinica, 2016, 42(10): 1466-1489 doi: 10.16383/j.aas.2016.c150823
[5] Xiang J, Sang N, Hou J H, Huang R, Gao C X. Hough forest-based association framework with occlusion handling for multi-target tracking. IEEE Signal Processing Letters, 2016, 23(2): 257-261 doi: 10.1109/LSP.2015.2512878
[6] Yang B, Nevatia R. Multi-target tracking by online learning of non-linear motion patterns and robust appearance models. In: Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012. 1918-1925
[7] Nummiaro K, Koller-Meier E, Van Gool L. An adaptive color-based particle filter. Image and Vision Computing, 2003, 21(1): 99-110 doi: 10.1016/S0262-8856(02)00129-4
[8] Dalal N, Triggs B. Histograms of oriented gradients for human detection. In: Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, USA: IEEE, 2005. 886-893
[9] Tuzel O, Porikli F, Meer P. Region covariance: A fast descriptor for detection and classification. In: Proceedings of the 2006 European Conference on Computer Vision. Graz, Austria: Springer, 2006. 589-600
[10] Xiang J, Sang N, Hou J H, Huang R, Gao C X. Multitarget tracking using Hough forest random field. IEEE Transactions on Circuits and Systems for Video Technology, 2016, 26(11): 2028-2042 doi: 10.1109/TCSVT.2015.2489438
[11] Milan A, Leal-Taixé L, Reid I, Roth S, Schindler K. MOT16: A benchmark for multi-object tracking. arXiv preprint arXiv: 1603.00831, 2016.
[12] Leal-Taixé L, Milan A, Reid I, Roth S, Schindler K. MOTChallenge 2015: Towards a benchmark for multi-target tracking. arXiv preprint arXiv: 1504.01942, 2015.
[13] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: MIT, 2012. 1097-1105
[14] Guan Hao, Xue Xiang-Yang, An Zhi-Yong. Advances on application of deep learning for video object tracking. Acta Automatica Sinica, 2016, 42(6): 834-847 doi: 10.16383/j.aas.2016.c150705
[15] Bertinetto L, Valmadre J, Henriques J F, Vedaldi A, Torr P H S. Fully-convolutional siamese networks for object tracking. In: Proceedings of the 2016 European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 850-865
[16] Danelljan M, Robinson A, Khan F S, Felsberg M. Beyond correlation filters: Learning continuous convolution operators for visual tracking. In: Proceedings of the 2016 European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 472-488
[17] Son J, Baek M, Cho M, Han B. Multi-object tracking with quadruplet convolutional neural networks. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 3786-3795
[18] Emami P, Pardalos P M, Elefteriadou L, Ranka S. Machine learning methods for solving assignment problems in multi-target tracking. arXiv preprint arXiv: 1802.06897, 2018.
[19] Leal-Taixé L, Canton-Ferrer C, Schindler K. Learning by tracking: Siamese CNN for robust target association. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Las Vegas, USA: IEEE, 2016. 418-425
[20] Sadeghian A, Alahi A, Savarese S. Tracking the untrackable: Learning to track multiple cues with long-term dependencies. In: Proceedings of the 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 300-311
[21] Tang S Y, Andriluka M, Andres B, Schiele B. Multiple people tracking by lifted multicut and person re-identification. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 3701-3710
[22] Wang B, Wang L, Shuai B, Zuo Z, Liu T, Chan K L, Wang G. Joint learning of convolutional neural networks and temporally constrained metrics for tracklet association. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops. Las Vegas, USA: IEEE, 2016. 368-393
[23] Gold S, Rangarajan A. Softmax to softassign: Neural network algorithms for combinatorial optimization. Journal of Artificial Neural Networks, 1996, 2(4): 381-399 http://dl.acm.org/citation.cfm?id=235919
[24] Milan A, Rezatofighi S H, Dick A, Schindler K, Reid I. Online multi-target tracking using recurrent neural networks. In: Proceedings of the 2017 AAAI Conference on Artificial Intelligence. San Francisco, USA: AAAI, 2017. 2-4
[25] Beyer L, Breuers S, Kurin V, Leibe B. Towards a principled integration of multi-camera re-identification and tracking through optimal Bayes filters. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Honolulu, USA: IEEE, 2017. 1444-1453
[26] Farazi H, Behnke S. Online visual robot tracking and identification using deep LSTM networks. In: Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems. Vancouver, Canada: IEEE, 2017. 6118-6125
[27] Kuo C H, Nevatia R. How does person identity recognition help multi-person tracking? In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI, USA: IEEE, 2011. 1217-1224
[28] Xiao Q Q, Luo H, Zhang C. Margin sample mining loss: A deep learning based method for person re-identification. arXiv preprint arXiv: 1710.00478, 2017.
[29] Huang C, Wu B, Nevatia R. Robust object tracking by hierarchical association of detection responses. In: Proceedings of the 2008 European Conference on Computer Vision. Marseille, France: Springer, 2008. 788-801
[30] Cheng D, Gong Y H, Zhou S P, Wang J J, Zheng N N. Person re-identification by multi-channel parts-based CNN with improved triplet loss function. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 1335-1344
[31] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770-778
[32] Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on Machine Learning. Lille, France: ACM, 2015. 448-456
[33] Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. Fort Lauderdale, USA: JMLR, 2011. 315-323
[34] Schroff F, Kalenichenko D, Philbin J. FaceNet: A unified embedding for face recognition and clustering. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE, 2015. 815-823
[35] Zheng L, Shen L Y, Tian L, Wang S J, Wang J D, Tian Q. Scalable person re-identification: A benchmark. In: Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015. 1116-1124
[36] Li W, Zhao R, Xiao T, Wang X G. DeepReID: Deep filter pairing neural network for person re-identification. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE, 2014. 152-159
[37] Hermans A, Beyer L, Leibe B. In defense of the triplet loss for person re-identification. arXiv preprint arXiv: 1703.07737, 2017.
[38] Kingma D P, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv: 1412.6980, 2014.
[39] Yang B, Nevatia R. Multi-target tracking by online learning a CRF model of appearance and motion patterns. International Journal of Computer Vision, 2014, 107(2): 203-217 doi: 10.1007/s11263-013-0666-4
[40] Bernardin K, Stiefelhagen R. Evaluating multiple object tracking performance: The CLEAR MOT metrics. EURASIP Journal on Image and Video Processing, 2008, 2008: 246309 http://dl.acm.org/citation.cfm?id=1453688
[41] Sanchez-Matilla R, Poiesi F, Cavallaro A. Online multi-target tracking with strong and weak detections. In: Proceedings of the 2016 European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 84-99
[42] Ban Y T, Ba S, Alameda-Pineda X, Horaud R. Tracking multiple persons based on a variational Bayesian model. In: Proceedings of the 2016 European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 52-67
[43] Fagot-Bouquet L, Audigier R, Dhome Y, Lerasle F. Improving multi-frame data association with sparse representations for robust near-online multi-object tracking. In: Proceedings of the 2016 European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 774-790
[44] Milan A, Roth S, Schindler K. Continuous energy minimization for multitarget tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(1): 58-72 doi: 10.1109/TPAMI.2013.103
[45] Bae S H, Yoon K J. Confidence-based data association and discriminative deep appearance learning for robust online multi-object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(3): 595-610 doi: 10.1109/TPAMI.2017.2691769
[46] Kim C, Li F X, Ciptadi A, Rehg J M. Multiple hypothesis tracking revisited. In: Proceedings of the 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015. 4696-4704
[47] Girshick R, Donahue J, Darrell T, Malik J. Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE, 2014. 580-587