基于视觉的人体动作质量评价研究综述

沈媛媛 张燕明 沈燕飞

李云, 孙书利, 郝钢. 基于Gauss-Hermite逼近的非线性加权观测融合无迹Kalman滤波器. 自动化学报, 2019, 45(3): 593-603. doi: 10.16383/j.aas.c170534
引用本文: 沈媛媛, 张燕明, 沈燕飞. 基于视觉的人体动作质量评价研究综述. 自动化学报, 2025, 51(2): 404−426 doi: 10.16383/j.aas.c230551
LI Yun, SUN Shu-Li, HAO Gang. Weighted Measurement Fusion Unscented Kalman Filter Using Gauss-Hermite Approximation for Nonlinear Systems. ACTA AUTOMATICA SINICA, 2019, 45(3): 593-603. doi: 10.16383/j.aas.c170534
Citation: Shen Yuan-Yuan, Zhang Yan-Ming, Shen Yan-Fei. A survey of vision-based motion quality assessment. Acta Automatica Sinica, 2025, 51(2): 404−426 doi: 10.16383/j.aas.c230551


doi: 10.16383/j.aas.c230551 cstr: 32138.14.j.aas.c230551
基金项目: 北京市自然科学基金(9234029), 国家自然科学基金(72071018), 中央高校基本科研业务费专项资金(2024JCYJ004)资助
详细信息
    作者简介:

    沈媛媛:北京体育大学体育工程学院讲师. 2020年获得中国科学院自动化研究所博士学位. 主要研究方向为智能体育与运动表现分析. 本文通信作者. E-mail: shenyuanyuan@bsu.edu.cn

    张燕明:中国科学院自动化研究所副研究员. 2011年获得中国科学院自动化研究所博士学位. 主要研究方向为结构预测方法, 图神经网络, 概率图模型. E-mail: ymzhang@nlpr.ia.ac.cn

    沈燕飞:北京体育大学体育工程学院教授. 2014年获得中国科学院大学博士学位. 主要研究方向为智能视频分析, 体育大数据, 智能体育装备. E-mail: syf@bsu.edu.cn

A Survey of Vision-based Motion Quality Assessment

Funds: Supported by Natural Science Foundation of Beijing (9234029), National Natural Science Foundation of China (72071018), and Fundamental Research Funds for the Central Universities (2024JCYJ004)
More Information
    Author Bio:

    SHEN Yuan-Yuan Lecturer at School of Sport Engineering, Beijing Sport University. She received her Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2020. Her research interest covers intelligent sports and sports performance analysis. Corresponding author of this paper

    ZHANG Yan-Ming Associate professor at Institute of Automation, Chinese Academy of Sciences. He received his Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2011. His research interest covers structural prediction methods, graph neural networks and probabilistic graphical models

    SHEN Yan-Fei Professor at School of Sport Engineering, Beijing Sport University. He received his Ph.D. degree from University of Chinese Academy of Sciences in 2014. His research interest covers intelligent video analysis, sports big data and intelligent sports equipment

  • 摘要: 基于视觉的人体动作质量评价利用计算机视觉相关技术自动分析个体运动完成情况, 并为其提供相应的动作质量评价结果. 这已成为运动科学和人工智能交叉领域的一个热点研究问题, 在竞技体育、运动员选材、健身锻炼、运动康复等领域具有深远的理论研究意义和很强的实用价值. 本文将从数据获取及标注、动作特征表示、动作质量评价3个方面对涉及到的技术进行回顾分析, 对相关方法进行分类, 并比较分析不同方法在AQA-7、JIGSAWS、EPIC-Skills 2018三个数据集上的性能. 最后讨论未来可能的研究方向.
  • 滤波算法在定位、目标跟踪、导航和故障诊断等方面发挥着重要作用[1-3].然而, 单个传感器难以满足高精度、高容错性等要求, 因此, 多传感器融合估计技术应运而生.在过去的几十年里, 线性系统的融合估计理论已经有了一系列完整的理论基础[3].目前常用的信息融合估计方法主要包括两个基本的结构:集中式融合估计和分布式融合估计.集中式融合估计将所有传感器信息进行增广, 并基于增广的观测设计融合状态估计[4-5].该算法没有信息丢失, 当所有传感器没有故障时, 估计精度具有全局最优性, 可作为其他融合算法在精度上的衡量标准, 也是现在多传感器系统经常采用的融合方式之一[6-7].然而, 由于集中式融合算法计算量大, 在传感器数量较多的情况下, 集中式融合算法会导致整个系统实时性差.特别是当存在故障传感器时可能导致滤波器发散.分布式融合算法是把各个局部状态估计送入融合中心, 根据一定的融合准则进行加权得到融合估计[3, 8-9].分布式融合方式具有良好的鲁棒性, 计算量小且容错性强, 估计精度是局部最优、全局次优的.

    加权观测融合算法根据加权最小二乘准则, 将集中式融合系统增广的高维观测进行压缩处理, 得到降维的观测, 基于降维观测设计的滤波器可以明显地减小计算负担.对于线性系统, 加权观测融合算法在最小方差意义下和集中式融合算法具有数值等价性, 因而具有重要的应用价值[10].然而, 绝大多数系统具有非线性特性, 例如, 大多数定位系统观测方程是在球面坐标系下建立的, 而估计和分析状态时往往又是在笛卡尔坐标系下进行的, 这使得观测方程具有某种非线性特性[6-7].

    近些年, 基于贝叶斯估计框架和采样逼近的非线性滤波算法得到了广泛研究, 例如无迹Kalman滤波器(Unscented Kalman filter, UKF) [11-12]、容积Kalman滤波器(Cubature Kalman filter, CKF) [13-14]、粒子滤波器(Particle filter, PF) [15], 以及其他一些非线性滤波器[16].这些非线性滤波器都可以统一处理非线性滤波问题, 但各具优缺点. UKF与CKF具有相近的滤波精度, 区别在于采样点权值的计算方式. PF在有充足粒子条件下具有较高的滤波精度, 普遍高于UKF与CKF, 但较大的计算负担是PF的一大缺点.事实上, 以上提到的滤波器都可以与本文提出的加权观测融合算法相结合, 形成加权观测融合滤波算法.本文将以UKF为例, 给出一种非线性加权观测融合滤波算法.

    非线性滤波算法的大量涌现表明了学者们对非线性问题的关注.涉及到非线性系统的融合方法也层出不穷[17-20].近年来, 有学者通过随机集、人工神经网络、模糊逻辑、粗糙集、D-S证据理论等非概率方法提出了非线性融合方法[21-23].这些方法可实现非线性系统的信息融合以及决策级融合, 但这些方法普遍存在信息丢失等情况, 所以这些算法不具有最优性或渐近最优性.文献[24]提出了一种在线性最小方差意义下最优非线性加权观测融合UKF滤波器.该算法要求传感器观测方程是相同的, 因此具有较大的局限性.文献[25]中, 基于Taylor级数和UKF, 提出了加权观测融合无迹Kalman滤波器.该算法可以统一处理非线性融合估计问题, 但该算法需要实时计算Taylor级数展开项系数, 这将带来一定的在线计算负担, 而且在展开点(状态预报)偏离过大, 或者Taylor级数展开项较少的时候, 滤波精度难以保证.

    Gauss-Hermite逼近方法[26-28]可以通过固定点采样、Gauss函数和Hermite多项式逼近任意初等函数, 且具有较好的拟合效果.为了降低该逼近方法的计算负担, 本文采用了分段处理方法, 即将状态区间进行分段逼近, 并离线计算每段的加权系数矩阵.本文主要创新点及工作如下:首先, 利用分段的Gauss-Hermite逼近方法将系统观测方程统一处理, 得到近似的中介函数以及系数矩阵.进而基于此中介函数、系数矩阵以及加权最小二乘法, 提出了非线性加权观测融合算法.该融合算法可对增广的高维观测进行压缩降维, 为后续滤波等工作降低计算负担.最后, 结合UKF滤波算法, 提出了非线性加权观测融合UKF滤波算法(Weighted measurement fusion UKF, WMF-UKF).该算法可以处理非线性多传感器系统的融合估计问题.与集中式融合UKF (Centralized measurement fusion UKF, CMF-UKF)算法相比, WMF-UKF具有与之逼近的估计精度, 但计算量明显降低, 并且随着传感器数量的增加, 该算法在计算量上的优势将更加明显.本文为非线性多传感器系统信息融合估计提供了一个有效途径.在定位、导航、目标跟踪、通信和大数据处理等领域具有潜在应用价值[29-31].

    考虑一个非线性多传感器系统

    $ \mathit{\boldsymbol{x}}(k + 1) = \mathit{\boldsymbol{f}}(\mathit{\boldsymbol{x}}(k),k) + \mathit{\boldsymbol{w}}(k) $

    (1)

    $ \mathit{\boldsymbol{z}}^{(j)}(k)=\mathit{\boldsymbol{h}}^{(j)}(\mathit{\boldsymbol{x}}(k), k)+\mathit{\boldsymbol{v}}^{(j)}(k), j=1, 2..., L $

    (2)

    其中, $\boldsymbol{f}(\cdot, \cdot)\in{\bf R}^{n}$为已知的非线性函数, $\boldsymbol{x}(k)\in{\bf R}^{n}$为$k$时刻系统状态, $\boldsymbol{h}^{(j)}(\cdot, \cdot)\in{\bf R}^{m_j}$为已知的第$j$个传感器的观测函数, $\boldsymbol{z}^{(j)}(k)\in{\bf R}^{m_j}$为第$j$个传感器的观测, $\boldsymbol{w}(k)\sim p_{w_k}(\cdot)$为状态噪声, $\boldsymbol{v}^{(j)}(k)\sim p_{v_k^{(j)}}(\cdot)$为第$j$个传感器的观测噪声.假设$\boldsymbol{w}(k)$和$\boldsymbol{v}^{(j)}(k)$是零均值、方差阵分别为$Q_w$和$R^{(j)}$且相互独立的白噪声, 即

    $ \begin{array}{*{35}{l}} \text{E}\left\{ \left[ \begin{matrix} \mathit{\boldsymbol{w}}(\mathit{t}) \\ {{\mathit{\boldsymbol{v}}}^{(\mathit{j})}}(\mathit{t}) \\ \end{matrix} \right]\left[ \begin{matrix} {{\mathit{\boldsymbol{w}}}^{\text{T}}}(\mathit{k}) & {{\left( {{\mathit{\boldsymbol{v}}}^{(\mathit{l})}}(\mathit{k}) \right)}^{\text{T}}} \\ \end{matrix} \right] \right\}\text{=} \\ \left[ \begin{matrix} {{Q}_{\mathit{w}}} & \text{0} \\ \text{0} & {{R}^{(\mathit{j})}}{{\delta }_{\mathit{jl}}} \\ \end{matrix} \right]{{\delta }_{\mathit{tk}}} \\ \end{array} $

    (3)

    其中, E为均值号, 上标T为转置号, $\delta_{tt}=1$, $\delta_{tk}=0~(t\neq k)$.

    在传感器网络中, 传感器的能量是有限的, 为了节省能量, 假设分布在空间上的传感器之间没有通信, 传感器的观测数据通过网络传输给融合中心, 在融合中心对数据进行压缩和滤波处理.而在工程中经常遇到的未知参数问题[32-33]、相关性问题[34-35]、传感器分布及管理[36]等问题, 本文没有涉及.

    本文将从集中式融合结构入手, 引出本文所提出的基于Gauss-Hermite逼近的加权观测融合方法.该融合方法将观测函数分解成Gauss函数和Hermite多项式的组合形式, 利用其系数矩阵对集中式融合系统观测方程进行降维, 得到一个维数较低的加权融合观测方程.对加权融合观测方程与状态方程形成的加权观测融合系统进行滤波器设计, 可获得与集中式融合逼近的估计精度, 并降低了集中式融合估计算法的计算量.

    引理1 [4-5].对系统式(1)和式(2), 全局最优集中式融合系统的观测方程为:

    $ \mathit{\boldsymbol{z}}^{(0)}(k)=\mathit{\boldsymbol{h}}^{(0)}(\mathit{\boldsymbol{x}}(k), k)+\mathit{\boldsymbol{v}}^{(0)}(k) $

    (4)

    其中

    $ {{\mathit{\boldsymbol{z}}}^{(0)}}(k)={{[{{\mathit{\boldsymbol{z}}}^{(1)\text{T}}}(k),{{\mathit{\boldsymbol{z}}}^{(2)\text{T}}}(k),...,{{\mathit{\boldsymbol{z}}}^{(L)\text{T}}}(k)]}^{\text{T}}} $

    (5)

    $ \begin{align} & {{\mathit{\boldsymbol{h}}}^{(0)}}(\mathit{\boldsymbol{x}}(k),k)=[{{\mathit{\boldsymbol{h}}}^{(1)\text{T}}}(\mathit{\boldsymbol{x}}(k),k),{{\mathit{\boldsymbol{h}}}^{(2)\text{T}}}(\mathit{\boldsymbol{x}}(k),k),..., \\ & \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ {{\mathit{\boldsymbol{h}}}^{(L)\text{T}}}(\mathit{\boldsymbol{x}}(k),k){{]}^{\text{T}}} \\ \end{align} $

    (6)

    $ {{\mathit{\boldsymbol{v}}}^{(0)}}(k)={{[{{\mathit{\boldsymbol{v}}}^{(1)\text{T}}}(k),{{\mathit{\boldsymbol{v}}}^{(2)\text{T}}}(k),...,{{\mathit{\boldsymbol{v}}}^{(L)\text{T}}}(k)]}^{\text{T}}} $

    (7)

    并且v(0)(k)的协方差矩阵由下式给出:

    $ {{R}^{(0)}}=\text{diag}\left\{ {{R}^{(\text{1})}},{{R}^{(\text{2})}},...,{{R}^{(\mathit{L})}} \right\} $

    (8)

    其中, $\Lambda^{(*){\rm T}}(k)=\left(\Lambda^{(*)}(k)\right)^{\rm T}~(\Lambda=\boldsymbol{z}, \boldsymbol{h}, \boldsymbol{v})$, ${\rm diag}\{\cdot\}$表示对角阵.

    对系统式(1)和式(4), 应用非线性滤波算法(例如扩展Kalman滤波器(Extended Kalman filter, EKF), UKF, CKF, PF等), 可得到相应的全局最优集中式融合非线性滤波器.但由于集中式融合的观测方程式(4)是观测增广扩维形成的, 使得基于该高维观测的估计算法的计算负担随着传感器数量的增加而迅速增加.因此, 找到等效的或者近似的融合方法来降低计算量是十分必要的.下面本文将解决非线性系统增广观测的降维问题.
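式(4)~式(8)的观测增广过程可用如下Python代码示意(传感器个数、观测函数与噪声方差均为假设的示例, 并非文中具体系统):

```python
import numpy as np

# 假设的 3 个传感器观测函数 h^(j)(x), 仅作结构示意
h_list = [
    lambda x: np.array([x[0] + x[1]**2]),        # 传感器1: 1 维观测
    lambda x: np.array([np.sin(x[0]), x[1]]),    # 传感器2: 2 维观测
    lambda x: np.array([x[0] * x[1]]),           # 传感器3: 1 维观测
]
R_list = [np.eye(1) * 0.04, np.eye(2) * 0.09, np.eye(1) * 0.01]

def centralized_obs(x, zs):
    """按式(5)~式(8)构造集中式融合观测: 增广观测 z^(0) 与对角阵 R^(0)."""
    z0 = np.concatenate(zs)                       # 式(5): 观测增广
    h0 = np.concatenate([h(x) for h in h_list])   # 式(6): 观测函数增广
    R0 = np.zeros((z0.size, z0.size))             # 式(8): R^(0)=diag{R^(1),...,R^(L)}
    i = 0
    for R in R_list:
        m = R.shape[0]
        R0[i:i+m, i:i+m] = R
        i += m
    return z0, h0, R0

x = np.array([0.5, 1.0])
zs = [h(x) for h in h_list]        # 无噪声观测, 仅验证维数与结构
z0, h0, R0 = centralized_obs(x, zs)
print(z0.shape, R0.shape)          # (4,) (4, 4)
```

可见增广观测的维数为各传感器观测维数之和, 这正是集中式融合计算量随传感器数量迅速增长的原因.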

    定理1. 对系统式(1)和式(2), 若存在一个中介函数$\boldsymbol{\psi}(\boldsymbol{x}(k), k)\in{\bf R}^{\psi}$, 使得局部观测函数$\boldsymbol{h}^{(j)}(\boldsymbol{x}(k), k)~(j=1, 2, \cdots, L)$满足$\boldsymbol{h}^{(j)}(\boldsymbol{x}(k), k)=H^{(j)}\boldsymbol{\psi}(\boldsymbol{x}(k), k)$, 其中矩阵$H^{(j)}\in{\bf R}^{m_j\times\psi}$, 则加权观测融合系统的观测方程可由下式给出:

    $ {{\mathit{\boldsymbol{z}}}^{(\text{I})}}(k)={{H}^{(\text{I})}}\psi (\mathit{\boldsymbol{x}}(k),k)+{{\mathit{\boldsymbol{v}}}^{(\text{I})}}(k) $

    (9)

    其中

    $ {{\mathit{\boldsymbol{z}}}^{(\text{I})}}(k)={{({{M}^{\text{T}}}{{R}^{(0)-1}}M)}^{-1}}{{M}^{\text{T}}}{{R}^{(0)-1}}{{\mathit{\boldsymbol{z}}}^{(0)}}(k) $

    (10)

    $ {\mathit{\boldsymbol{v}}^{({\rm{I}})}}(k) = {({M^{\rm{T}}}{R^{(0) - 1}}M)^{ - 1}}{M^{\rm{T}}}{R^{(0) - 1}}{\mathit{\boldsymbol{v}}^{(0)}}(k) $

    (11)

    其中, $R^{(0)-1}=\left(R^{(0)}\right)^{-1}$, 并且$\boldsymbol{v}^{(\rm I)}(k)$的协方差矩阵为:

    $ \textit{R}^{(\rm{I})}=(M^{\mathit{\boldsymbol{T}}}\textit{R}^{(0)-1}M)^{-1} $

    (12)

    其中, $M$ (列满秩)和$H^{(\rm I)}$ (行满秩)是$H^{(0)}=[H^{(1){\rm T}}, H^{(2){\rm T}}, \cdots, H^{(L){\rm T}}]^{\rm T}$ $\left(H^{(*){\rm T}}=(H^{(*)})^{\rm T}\right)$的满秩分解矩阵:

    $ \textit{H}^{(0)}=M\textit{H}^{\rm{(I)}} $

    (13)

    其中, $M$和$H^{(\rm I)}$可以用Hermite规范形得到[25].

    证明. 由于MH(I)H(0)的满秩分解, 则有:

    $ \begin{array}{*{20}{l}} {{\mathit{\boldsymbol{z}}^{(0)}}(k) = {H^{(0)}}\mathit{\boldsymbol{\psi }}(\mathit{\boldsymbol{x}}(k),k) + {\mathit{\boldsymbol{v}}^{(0)}}(k) = }\\ {\;\;\;\;\;\;\;\;\;\;\;\;M{H^{({\rm{I}})}}\mathit{\boldsymbol{\psi }}(\mathit{\boldsymbol{x}}(k),k) + {\mathit{\boldsymbol{v}}^{(0)}}(k)} \end{array} $

    (14)

    由于$M$为列满秩, 因而$M^{\rm T}R^{(0)-1}M$为非奇异矩阵.令$H^{(\rm I)}\boldsymbol{\psi}(\boldsymbol{x}(k), k)$为观测对象, 应用加权最小二乘法, 则$H^{(\rm I)}\boldsymbol{\psi}(\boldsymbol{x}(k), k)$的最优Gauss-Markov估计为式(9)所示.

    对加权观测融合系统式(1)和式(9), 应用非线性滤波算法, 可得到全局最优加权观测融合非线性滤波算法.
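定理1的压缩降维过程可由如下数值示例验证(其中$H^{(0)}$与$R^{(0)}$为假设的示例矩阵; 满秩分解用SVD实现, 文中采用的是Hermite规范形, 两者的分解因子不同但乘积相同):

```python
import numpy as np

rng = np.random.default_rng(0)

# 假设的增广系数矩阵 H^(0) (4x2, 秩为 2)与观测噪声方差阵 R^(0)
H0 = np.array([[1.0, 0.0],
               [2.0, 1.0],
               [3.0, 1.0],
               [0.0, 2.0]])
R0 = np.diag([0.04, 0.09, 0.01, 0.16])

# 满秩分解 H^(0) = M H^(I): 这里用 SVD 构造
U, s, Vt = np.linalg.svd(H0, full_matrices=False)
r = int(np.sum(s > 1e-10))
M = U[:, :r] * s[:r]                    # 列满秩 (4 x r)
HI = Vt[:r, :]                          # 行满秩 (r x 2)
assert np.allclose(M @ HI, H0)          # 式(13)

# 式(10)~式(12): 加权最小二乘压缩
Ri = np.linalg.inv(R0)
W = np.linalg.inv(M.T @ Ri @ M) @ M.T @ Ri   # 压缩算子
psi = np.array([0.7, -1.2])                  # 假设的中介函数取值 ψ(x(k),k)
v0 = rng.normal(0, np.sqrt(np.diag(R0)))     # 增广观测噪声
z0 = H0 @ psi + v0                           # 式(4)
zI = W @ z0                                  # 式(10): 降维观测 (r 维)
RI = np.linalg.inv(M.T @ Ri @ M)             # 式(12)

# 无噪声情形下压缩观测应精确等于 H^(I) ψ, 即压缩不引入系统误差
assert np.allclose(W @ (H0 @ psi), HI @ psi)
print(zI.shape, RI.shape)   # (2,) (2, 2)
```

由$W H^{(0)}=H^{(\rm I)}$可见, 压缩后的观测仍服从式(9)的模型, 只是维数由4降为$r=2$.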

    本节将引入一种函数逼近方法, 该方法借由Gauss函数和Hermite多项式的组合形式逼近任意初等函数.通过此逼近方法, 可得到$\boldsymbol{h}^{(j)}(\boldsymbol{x}(k), k)$的近似函数$\overline{\boldsymbol{h}}^{(j)}(\boldsymbol{x}(k), k)$, 进而可将其统一转化为$\overline{\boldsymbol{h}}^{(j)}(\boldsymbol{x}(k), k)=\overline{H}^{(j)}\overline{\boldsymbol{\psi}}(\boldsymbol{x}(k), k)$的形式, 其中, $\overline{\boldsymbol{\psi}}(\boldsymbol{x}(k), k)$由Gauss函数和Hermite多项式构成, $\overline{H}^{(j)}$为系数矩阵.非线性多传感器系统观测函数经过转换, 将满足定理1中的要求.

    引理2[26].

    设在区间[a, b]中存在一个点集$\{x'_i, i=1, \cdots, S\} $, 对于任意点$x'_i $存在$y_{i} $, 满足$y_{i}=y(x'_i) $, 其中$y(x) $是一个确定的函数.进而$y(x) $的近似函数$\overline{y}(x) $可由Gauss-Hermite折叠函数得出:

    $ \begin{align} \overline{y}(x)=\,&\frac{1}{\gamma\sqrt{\pi}}\sum_{i=1}^Sy_{i}\Delta x_{i}\exp\left\{-\left(\frac{x-x'_i}{\gamma}\right)^{2}\right\} \cdot\notag\\ & f_{p}\left(\frac{x-x'_i}{\gamma}\right) \end{align} $

    (15)

    其中, $ \gamma$是一个与$\Delta x_{i}~(i=1, \cdots, S) $有关的常系数, $f_{p}(u)~(p=0, 2, 4, \cdots) $为一系列Hermite多项式的组合:

    $ f_{p}(u)=\sum\limits_{\rho=0}^pC_{\rho}H_{\rho}(u) $

    (16)

    $ C_{\rho}=\frac{1}{2^{\rho}\rho!}H_{\rho}(0) $

    (17)

    其中, $H_{\rho}(u) $是Hermite多项式[30].因此, $H_{\rho}(0) $为:

    $ \begin{array}{l} {H_\rho }(0) = \left\{ {\begin{array}{*{20}{l}} {1,}&{\rho = 0}\\ {{2^q}{{( - 1)}^q}(2q - 1)!!,}&{\rho = 2q}\\ {0,}&{\rho = 2q + 1} \end{array},} \right.\\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;q = 1,2, \cdots {\rm{ }} \end{array} $

    (18)

    由式(17)和式(18)有:

    $ \begin{array}{*{20}{l}} {{C_\rho } = \left\{ {\begin{array}{*{20}{l}} {1,}&{\rho = 0}\\ {{{( - 1)}^q}\frac{{(2q - 1)!!}}{{{2^q}\left( {2q} \right)!}},}&{\rho = 2q}\\ {0,}&{\rho = 2q + 1} \end{array},} \right.}\\ {\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;q = 1,2, \cdots } \end{array} $

    (19)

    其中, '!'表示阶乘, 双阶乘'm!!'表示不超过自然数m且与m有相同奇偶性的所有正整数的乘积.

    注1. 对于多维情况, 假设${\rm{\{ }}{\mathit{\boldsymbol{X}}}'_i\in{\bf R}^{\textit{n}}\}\,(i=1,\cdots,S)$是一个采样集合, 对于集合中每一个点$\mathit{\boldsymbol{X}}'_i=[x'_{i_{1}},x'_{i_{2}},\cdots,x'_{i_{n}}] \,(a\leq x_{i_{\mu}}\leq x_{i+1_{\mu}}\leq b,\,\mu=1,\cdots,n)$存在点$\mathit{\boldsymbol{Y}}'_i(x'_{i_{1}},x'_{i_{2}},\cdots,x'_{i_{n}})=[y_{i_{1}},y_{i_{2}},\cdots,y_{i_{\xi}}]\,(\xi\geq1)$满足 $\mathit{\boldsymbol{Y}}'_i=\mathit{\boldsymbol{Y}}(\mathit{\boldsymbol{X}}'_i)$, 其中$\mathit{\boldsymbol{Y}}(\cdot)$ 是确定的多维函数.那么Gauss-Hermite折叠函数如下:

    $ \begin{align} &\overline{\pmb Y}(x_{1},x_{2},\cdots,x_{n})=\sum_{i_{1}=1}^S\Delta x_{i_{1}}\sum_{i_{2}=1}^S\Delta x_{i_{2}}\cdots\notag\\&\quad\sum_{i_{n}=1}^S\Delta x_{i_{n}}\cdot \mathit{\boldsymbol{Y}}(x'_{i_{1}},x'_{i_{2}},\cdots,x'_{i_{n}})\prod_{\mu=1}^n \frac{1}{\gamma_{\mu}\sqrt{\pi}}\cdot\notag\\&\quad \exp\left\{-\left(\frac{x_{\mu}-x'_{i_{\mu}}}{\gamma_{\mu}}\right)^{2}\right\} f_{p}\left(\frac{x_{\mu}-x'_{i_{\mu}}}{\gamma_{\mu}}\right) \end{align} $

    (20)

    其中, $n$维函数$\overline{\pmb Y}(\cdot)$为函数$\mathit{\boldsymbol{Y}}(\cdot)$的近似函数.引理2给出了一种利用Gauss函数和Hermite多项式组合的逼近方法, 该方法可以利用较少的函数项获得很好的逼近效果.如果将引理2中的 $\sum {_{{i_1} = 1}^S} \Delta {x_{{i_1}}}\sum {_{{i_2} = 1}^S} \Delta {x_{{i_2}}} \cdots \sum {_{{i_n} = 1}^S} \Delta {x_{{i_n}}}\prod_{\mu=1}^n(1/({\gamma _\mu }\sqrt \pi ))\exp \{ - {(({x_\mu } - {x'_{{i_\mu }}})/{\gamma _\mu })^2}\} {f_p}(({x_\mu } - {x'_{{i_\mu }}})/{\gamma _\mu })~(i = 1, \cdots ,S;\mu = 1, \cdots ,n)$的各分量视为定理1中的中介函数$\mathit{\boldsymbol{\psi }}(\mathit{\boldsymbol{x}}(k),k)$, 将$\mathit{\boldsymbol{Y}}(x'_{i_{1}},x'_{i_{2}},\cdots,x'_{i_{n}})$视为$\textit{H}^{(j)}$, 则定理1可以得以实施.

    文献[26]和大量仿真试验表明, 在$p=0, 2, 4 $等情况下, 合理地选择采样点$x'_{i_{\mu}}$和$\gamma_{\mu}~(i=1, \cdots, S;\mu=1, \cdots, n) $即可很好地逼近任意初等连续函数.本文选取$p=2$, 则由式(18)和式(19)有$C_{2}=-1/4, \, H_{2}(u)=4u^{2}-2 $, 进而有$f_{2}(u)=1.5-u^{2} $.令

    $ \varphi(\zeta)=\exp\{-\zeta^{2}\}f_{2}(\zeta) $

    (21)

    则有$\mathit{\boldsymbol{h}}^{(j)}\left(\mathit{\boldsymbol{x}}(k), k\right) $的近似函数$\overline{\mathit{\boldsymbol{h}}}^{(j)}\left(\mathit{\boldsymbol{x}}(k), k\right) $为:

    $ \begin{align} &\overline{\mathit{\boldsymbol{h}}}^{(j)}(x_{1},x_{2},\cdots,x_{n})=\notag\\& (\pi)^{-\frac{n}{2}}(\gamma)^{-n}\sum_{i_{1}=1}^S \sum_{i_{2}=1}^S\cdots\sum_{i_{n}=1}^S\mathit{\boldsymbol{h}}^{(j)}(x'_{i_{1}}, x'_{i_{2}},\cdots,x'_{i_{n}})\cdot\notag\\& \prod_{\mu=1}^n\varphi\left(\frac{x_{\mu}-x'_{i_{\mu}}}{\gamma}\right) \end{align} $

    (22)
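式(15)和式(22)的一维情形($n=1$, $p=2$, 即式(21)的核$\varphi(\zeta)={\rm e}^{-\zeta^{2}}(1.5-\zeta^{2})$)可用如下代码示意, 其中被逼近函数$y(x)={\rm e}^{x/3}$、采样点集与$\gamma=1$均为示例假设:

```python
import numpy as np

def phi(zeta):
    """式(21): Gauss-Hermite 核, p=2 时 f_2(u) = 1.5 - u^2."""
    return np.exp(-zeta**2) * (1.5 - zeta**2)

def gh_approx(y, samples, gamma, x):
    """式(15)/式(22)的一维 Gauss-Hermite 折叠逼近, 采样间隔 Δx=1."""
    x = np.atleast_1d(x).astype(float)
    out = np.zeros_like(x)
    for xi in samples:
        out += y(xi) * phi((x - xi) / gamma)
    return out / (gamma * np.sqrt(np.pi))

y = lambda t: np.exp(t / 3.0)            # 假设的被逼近初等函数
samples = np.arange(-2.0, 6.0)           # 拟合采样点 {-2, -1, ..., 5}
xs = np.linspace(0.0, 3.0, 61)
err = np.max(np.abs(gh_approx(y, samples, 1.0, xs) - y(xs)) / y(xs))
print(f"最大相对误差: {err:.2e}")
```

在采样点覆盖范围的内部, 该逼近的相对误差通常很小(此例约在千分之一量级以内), 靠近采样区间边缘时误差会增大, 这也是注3中采用区间分段并外扩采样点的原因.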

    定理2. 对系统式(1)和式(2), 基于Gauss-Hermite逼近的近似加权观测融合方程为:

    $ {\mathit{\boldsymbol{\overline z}} ^{({\rm{I}})}}(k) = {\overline H ^{({\rm{I}})}}\mathit{\boldsymbol{\overline \psi }} (\mathit{\boldsymbol{x}}(k),k) + {\mathit{\boldsymbol{\overline v}} ^{({\rm{I}})}}(k) $

    (23)

    其中, $\mathit{\boldsymbol{\overline \psi }} (\mathit{\boldsymbol{x}}(k), k) $如式(29)所示, $x_{\mu}~(\mu=1, \cdots, n) $是第$\mu $个状态变量, $x'_{i_{\mu}}~(i=1, \cdots, S;\mu=1, \cdots, n) $是第$\mu $个状态变量的第$i $个采样点. $\overline{H}^{(0)} $如式(30)所示, 其中$\mathit{\boldsymbol{h}}^{(m)}(x'_{i_{1}}, x'_{i_{2}}, \cdots, x'_{i_{n}})$是第$m $个观测函数在Gauss-Hermite拟合采样点处的取值, $S $是采样点的数量. $\overline{M} $和$\overline{H}^{(\rm{I})} $是$\overline{H}^{(0)} $的满秩分解矩阵, 其中$\overline{M}$是列满秩, $\overline{H}^{(\rm{I})}$是行满秩, 且有$\overline{H}^{(0)}=\overline{M}\,\overline{H}^{(\rm{I})}$.则有:

    $ {\mathit{\boldsymbol{\overline z}} ^{({\rm{I}})}}(k) = {({\bar M^{\rm{T}}}{R^{(0) - 1}}\bar M)^{ - 1}}{\bar M^{\rm{T}}}{R^{(0) - 1}}{\mathit{\boldsymbol{z}}^{(0)}}(k) $

    (24)

    $ {\mathit{\boldsymbol{\overline v}} ^{({\rm{I}})}}(k) = {({\bar M^{\rm{T}}}{R^{(0) - 1}}\bar M)^{ - 1}}{\bar M^{\rm{T}}}{R^{(0) - 1}}{\mathit{\boldsymbol{v}}^{(0)}}(k) $

    (25)

    $\overline{\mathit{\boldsymbol{v}}}^{(\rm{I})}(k) $的协方差矩阵为:

    $ {\overline R ^{({\rm{I}})}} = {({\bar M^{\rm{T}}}{R^{(0) - 1}}\bar M)^{ - 1}} $

    (26)

    证明. 利用式(22)将集中式融合系统观测方程式(6)进行近似, 得到近似的集中式融合观测方程:

    $ \mathit{\boldsymbol{z}}^{(0)}(k)\approx \overline{\mathit{\boldsymbol{h}}}^{(0)}(\mathit{\boldsymbol{x}}(k), k)+\mathit{\boldsymbol{v}}^{(0)}(k) $

    (27)

    其中

    $ \begin{array}{l} {\mathit{\boldsymbol{\overline h}} ^{(0)}}(\mathit{\boldsymbol{x}}(k),k) = \\ \qquad {\left[ {{{\mathit{\boldsymbol{\overline h}} }^{(1){\rm{T}}}}(\mathit{\boldsymbol{x}}(k),k), \cdots ,{{\mathit{\boldsymbol{\overline h}} }^{(L){\rm{T}}}}(\mathit{\boldsymbol{x}}(k),k)} \right]^{\rm{T}}} \end{array} $

    (28)

    $ \overline{\mathit{\boldsymbol{h}}}^{(j)}(\cdot, \cdot)(j=1, \cdots, L)$如式(22)所示, 且${\mathit{\boldsymbol{\overline h}} ^{(j){\rm{T}}}}( \cdot , \cdot ) = {\left( {{{\mathit{\boldsymbol{\overline h}} }^{(j)}}( \cdot , \cdot )} \right)^{\rm{T}}} $.

    将式(28)中的系数$\mathit{\boldsymbol{h}}^{(j)}(x'_{i_{1}}, x'_{i_{2}}, \cdots, x'_{i_{n}}) $与Gauss-Hermite函数$\varphi\big((x_{\mu}-x'_{i_{\mu}})/\gamma\big) $分离, 得到式(29)和式(30).利用定理1得到式(24)~式(26).

    $ \mathit{\boldsymbol{\overline \psi }}(\mathit{\boldsymbol{x}}(k), k)=(\pi)^{-\frac{n}{2}}(\gamma)^{-n} \left[ \begin{array}{c} \prod\limits_{\mu=1}^{n} \varphi\left(\dfrac{x_{\mu}-x_{1_{\mu}}'}{\gamma}\right) \\ \prod\limits_{\mu=1}^{n-1}\varphi\left(\dfrac{x_{\mu}-x_{1_{\mu}}'}{\gamma}\right)\cdot \varphi\left(\dfrac{x_{n}-x_{2_{n}}'}{\gamma}\right)\\ \vdots \\ \prod\limits_{\mu=1}^{n-1}\varphi\left(\dfrac{x_{\mu}-x_{1_{\mu}}'}{\gamma}\right)\cdot \varphi\left(\dfrac{x_{n}-x_{S_{n}}'}{\gamma}\right)\\ \prod\limits_{\mu=1}^{n-2} \varphi\left(\dfrac{x_{\mu}-x_{1_{\mu}}'}{\gamma}\right)\cdot \varphi\left(\dfrac{x_{n-1}-x_{2_{n-1}}'}{\gamma}\right) \varphi\left(\dfrac{x_{n}-x_{1_{n}}'}{\gamma}\right)\\ \prod\limits_{\mu=1}^{n-2} \varphi\left(\dfrac{x_{\mu}-x_{1_{\mu}}'}{\gamma}\right)\cdot \varphi\left(\dfrac{x_{n-1}-x_{2_{n-1}}'}{\gamma}\right) \varphi\left(\dfrac{x_{n}-x_{2_{n}}'}{\gamma}\right)\\ \vdots \\ \prod\limits_{\mu=1}^{n-2} \varphi\left(\dfrac{x_{\mu}-x_{1_{\mu}}'}{\gamma}\right)\cdot \varphi\left(\dfrac{x_{n-1}-x_{2_{n-1}}'}{\gamma}\right) \varphi\left(\dfrac{x_{n}-x_{S_{n}}'}{\gamma}\right)\\ \vdots \\ \prod\limits_{\mu=1}^{n-1}\varphi\left(\dfrac{x_{\mu}-x_{S_{\mu}}'}{\gamma}\right)\cdot \varphi\left(\dfrac{x_{n}-x_{1_{n}}'}{\gamma}\right)\\ \prod\limits_{\mu=1}^{n-1}\varphi\left(\dfrac{x_{\mu}-x_{S_{\mu}}'}{\gamma}\right)\cdot \varphi\left(\dfrac{x_{n}-x_{2_{n}}'}{\gamma}\right)\\ \vdots \\ \prod\limits_{\mu=1}^{n} \varphi\left(\dfrac{x_{\mu}-x_{S_{\mu}}'}{\gamma}\right) \end{array} \right]_{S^{n}\times1} $

    (29)

    $ \begin{array}{l} {{\bar H}^{(0)}} = \left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{h}}^{(1)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{1_n}^\prime }})}&{{\mathit{\boldsymbol{h}}^{(1)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{2_n}^\prime }})}& \cdots &{{\mathit{\boldsymbol{h}}^{(1)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{S_n}^\prime }})}\\ {{\mathit{\boldsymbol{h}}^{(2)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{1_n}^\prime }})}&{{\mathit{\boldsymbol{h}}^{(2)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{2_n}^\prime }})}& \cdots &{{\mathit{\boldsymbol{h}}^{(2)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{S_n}^\prime }})}\\ \vdots & \vdots & \ddots & \vdots \\ {{\mathit{\boldsymbol{h}}^{(L)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{1_n}^\prime }})}&{{\mathit{\boldsymbol{h}}^{(L)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{2_n}^\prime }})}& \cdots &{{\mathit{\boldsymbol{h}}^{(L)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{S_n}^\prime }})} \end{array}} \right.\\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;{\left. 
{\qquad \begin{array}{*{20}{c}} {{\mathit{\boldsymbol{h}}^{(1)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{2_{n - 1}}^\prime }},{x_{{1_n}^\prime }})}& \cdots &{{\mathit{\boldsymbol{h}}^{(1)}}({x_{{S_1}^\prime }},{x_{{S_2}^\prime }}, \cdots ,{x_{{S_n}^\prime }})}\\ {{\mathit{\boldsymbol{h}}^{(2)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{2_{n - 1}}^\prime }},{x_{{1_n}^\prime }})}& \cdots &{{\mathit{\boldsymbol{h}}^{(2)}}({x_{{S_1}^\prime }},{x_{{S_2}^\prime }}, \cdots ,{x_{{S_n}^\prime }})}\\ \vdots & \ddots & \vdots \\ {{\mathit{\boldsymbol{h}}^{(L)}}({x_{{1_1}^\prime }},{x_{{1_2}^\prime }}, \cdots ,{x_{{2_{n - 1}}^\prime }},{x_{{1_n}^\prime }})}& \cdots &{{\mathit{\boldsymbol{h}}^{(L)}}({x_{{S_1}^\prime }},{x_{{S_2}^\prime }}, \cdots ,{x_{{S_n}^\prime }})} \end{array}} \right]_{\sum {_{i = 1}^L{m_i} \times {S^n}} }} \end{array} $

    (30)

    注2. 定理2通过Gauss-Hermite逼近构建了一个近似的中介函数$\overline{{\psi}}(\mathit{\boldsymbol{x}}(k), k)$.它使得形如式(1)和式(2)的任意非线性多传感器系统的局部观测函数具有了定理1中所阐述的关系, 可使定理1得以实施.

    注3. 如果状态范围过大, 拟合采样点数量会急剧增加, 导致计算量增加, 因此本文采取分段的处理方法.例如, 对一维状态系统, 可以将状态的范围划分成多个区间, 对二维状态系统, 可以将状态的范围分成若干小的区域.在每个区间或区域分别进行Gauss-Hermite逼近.逼近过程中形成的中介函数$\mathit{\boldsymbol{\overline \psi }} (\mathit{\boldsymbol{x}}(k), k)$, $\overline{H}^{(0)}$及其满秩分解矩阵$\overline{M}$和$\overline{H}^{(\rm{I})}$可离线计算, 在线调用, 减少了在线计算负担.

    对加权观测融合系统式(1)和式(23), 应用非线性滤波算法(EKF、UKF、PF、CKF等), 可得加权观测融合非线性滤波算法.本文将以UKF为例, 给出一种基于Gauss-Hermite逼近和UKF滤波算法的非线性加权观测融合估计算法.

    本文UKF采样策略选用比例对称抽样, 即Sigma采样点可由式(31)计算.

    $ \{ {\mathit{\boldsymbol{\chi }}_i}\}=\left[{\mathit{\boldsymbol{\overline x}} }, {\mathit{\boldsymbol{\overline x}} }+\sqrt{(n+\kappa)\textit{P}_{xx}}, {\mathit{\boldsymbol{\overline x}} }-\sqrt{(n+\kappa)\textit{P}_{xx}}\right], \notag\\ \qquad \qquad \qquad \qquad \qquad \qquad i=0, \cdots, 2n $

    (31)

    且有粒子权值如式(32)和式(33)所示.

    $ W_i^m = \left\{ \begin{array}{l} \frac{\lambda }{{n + \lambda }},\;\;\;i = 0\\ \frac{1}{{2(n + \lambda )}},\;\;\;i \ne 0 \end{array} \right. $

    (32)

    $ W_i^c = \left\{ \begin{array}{l} \frac{\lambda }{{n + \lambda }} + (1 - {\alpha ^2} + \beta ),\;\;\;\;i = 0\\ \frac{1}{{2(n + \lambda )}},\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;i \ne 0 \end{array} \right. $

    (33)

    其中, $\alpha>0$是比例因子, $\lambda=\alpha^{2}(n+\kappa)-n$, $\kappa$是比例参数, 通常设置$\kappa=0$或者$\kappa=3-n$, 并取$\beta=2$.下面给出WMF-UKF算法.
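式(31)~式(33)的采样与权值计算可示意如下(按标准UKF取$\lambda=\alpha^{2}(n+\kappa)-n$, 参数$\alpha=1$, $\beta=2$为常用的假设取值):

```python
import numpy as np

def sigma_points(xbar, Pxx, kappa=0.0, alpha=1.0, beta=2.0):
    """式(31)的比例对称抽样及式(32)、式(33)的权值."""
    n = xbar.size
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * Pxx)        # 矩阵平方根
    chi = np.column_stack([xbar] + [xbar + S[:, i] for i in range(n)]
                          + [xbar - S[:, i] for i in range(n)])
    Wm = np.full(2 * n + 1, 1.0 / (2 * (n + lam)))
    Wc = Wm.copy()
    Wm[0] = lam / (n + lam)
    Wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    return chi, Wm, Wc

xbar = np.array([1.0, -0.5])
Pxx = np.array([[0.5, 0.1], [0.1, 0.3]])
chi, Wm, Wc = sigma_points(xbar, Pxx, kappa=1.0)
# Sigma 点的加权均值与加权方差应精确还原 x̄ 与 P_xx
mean = chi @ Wm
cov = (chi - mean[:, None]) * Wc @ (chi - mean[:, None]).T
print(np.allclose(mean, xbar), np.allclose(cov, Pxx))   # True True
```

加权统计量能否精确还原$\overline{\boldsymbol x}$与$P_{xx}$, 是检验Sigma采样实现正确性的常用方法.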

    WMF-UKF算法. 对非线性系统式(1)和式(2), 基于定理2的WMF-UKF算法如下:

    步骤1. 设置初始值

    基于多传感器的观测数据$\mathit{\boldsymbol{z}}^{(j)}(0)\sim \mathit{\boldsymbol{z}}^{(j)}(k)~(j=1, 2, \cdots, L), $加权观测融合系统Sigma采样点可以计算为:

    $ \begin{array}{l} \{ \mathit{\boldsymbol{\chi }}_i^{({\rm{I}})}(k|k)\} = [{\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(k|k),{\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(k|k) + \\ \sqrt {(n + \kappa )P_{xx}^{({\rm{I}})}(k|k)} ,{\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(k|k) - \sqrt {(n + \kappa )P_{xx}^{({\rm{I}})}(k|k)} {\rm{]}},\\ {\mkern 1mu} \qquad \qquad \qquad \qquad \qquad \qquad i = 0, \cdots ,2n \end{array} $

    (34)

    其中初值条件为:

    $ {\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(0|0) = {\rm{E}}\left\{ {\mathit{\boldsymbol{x}}({\rm{0}})} \right\} $

    (35)

    $ \begin{array}{l} P_{xx}^{({\rm{I}})}(0|0) = \\ {\rm{E}}\left\{ {\left( {\mathit{\boldsymbol{x}}({\rm{0}}){\rm{ - }}{{\mathit{\boldsymbol{\widehat x}}}^{({\rm{I}})}}({\rm{0|0}})} \right){{\left( {\mathit{\boldsymbol{x}}({\rm{0}}){\rm{ - }}{{\mathit{\boldsymbol{\widehat x}}}^{({\rm{I}})}}({\rm{0|0}})} \right)}^{\rm{T}}}} \right\} \end{array} $

    (36)

    步骤2. 预测方程

    预测Sigma采样点:

    $ \mathit{\boldsymbol{\chi }}_i^{({\rm{I}})}(k+1|k)=\mathit{\boldsymbol{f}}(\mathit{\boldsymbol{\chi }}_i^{({\rm{I}})}(k|k), k), \, i=0, \cdots, 2n $

    (37)

    状态预报:

    $ {\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(k + 1|k) = \sum\limits_{i = 0}^{2n} {W_i^m} \mathit{\boldsymbol{\chi }}_i^{({\rm{I}})}(k + 1|k) $

    (38)

    状态预测误差方差阵:

    $ \begin{array}{l} {P^{({\rm{I}})}}(k + 1|k) = \sum\limits_{i = 0}^{2n} {W_i^c} (\mathit{\boldsymbol{\chi }}_i^{({\rm{I}})}(k + 1|k) - \\ \qquad {\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(k + 1|k))(\mathit{\boldsymbol{\chi }}_i^{({\rm{I}})}(k + 1|k) - \\ \qquad {\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(k + 1|k){)^{\rm{T}}} + {Q_w} \end{array} $

    (39)

    观测预报Sigma采样点:

    $ \mathit{\boldsymbol{z}}_i^{(\rm{I})}(k+1|k)=\overline{H}^{(\rm{I})}\mathit{\boldsymbol{\overline \psi }} (\mathit{\boldsymbol{\chi }}_i^{({\rm{I}})}(k+1|k), k+1), \notag\\ \qquad \qquad \qquad \qquad \qquad \qquad i=0, \cdots, 2n $

    (40)

    观测预报:

    $ \hat{\mathit{\boldsymbol{z}}}^{(\rm{I})}(k+1|k)=\sum\limits_{i=0}^{2n}W_{i}^{m}\mathit{\boldsymbol{z}}_{i}^{(\rm{I})}(k+1|k) $

    (41)

    观测预报误差方差阵:

    $ \begin{align} &\qquad{P}_{zz}^{(\rm{I})}(k+1|k)=\sum_{i=0}^{2n}W_{i}^{c} \left(\mathit{\boldsymbol{z}}_{i}^{(\rm{I})}(k+1|k)-\right.\notag\\&\qquad \left.\hat{\mathit{\boldsymbol{z}}}^{(\rm{I})}(k+1|k)\right) \left(\mathit{\boldsymbol{z}}_{i}^{(\rm{I})}(k+1|k)- \hat{\mathit{\boldsymbol{z}}}^{(\rm{I})}(k+1|k)\right)^{\mathrm{T}} \end{align} $

    (42)

    $ \textit{P}_{vv}^{(\rm{I})}(k+1|k)=\textit{P}_{zz}^{(\rm{I})}(k+1|k)+\overline{\textit{R}}^{(\rm{I})} $

    (43)

    其中, $\overline{\textit{R}}^{(\rm{I})}$由式(26)定义.

    协方差矩阵由下式计算:

    $ \begin{array}{l} P_{xz}^{({\rm{I}})}(k + 1|k) = \sum\limits_{i = 0}^{2n} {W_i^c} \left( {\mathit{\boldsymbol{\chi }}_i^{({\rm{I}})}(k + 1|k) - } \right.\\ \quad \left. {{{\mathit{\boldsymbol{\widehat x}}}^{({\rm{I}})}}(k + 1|k)} \right){\left( {\mathit{\boldsymbol{z}}_i^{({\rm{I}})}(k + 1|k) - {{\mathit{\boldsymbol{\widehat z}}}^{({\rm{I}})}}(k + 1|k)} \right)^{\rm{T}}} \end{array} $

    (44)

    步骤3. 更新方程

    滤波增益由下式计算:

    $ \textit{W}^{(\rm{I})}(k+1)=\textit{P}_{xz}^{(\rm{I})}(k+1|k)\textit{P}_{vv}^{(\rm{I})-1}(k+1|k) $

    (45)

    其中, $\textit{P}_{vv}^{(\rm{I})-1}(\cdot|\cdot)=\left(\textit{P}_{vv}^{(\rm{I})}(\cdot|\cdot)\right)^{-1}$, 且$k+1$时刻的状态估计为:

    $ \begin{array}{l} {\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(k + 1|k + 1) = {\mathit{\boldsymbol{\widehat x}}^{({\rm{I}})}}(k + 1|k) + {W^{({\rm{I}})}}(k + 1) \cdot \\ \left( {{{\mathit{\boldsymbol{\overline z}} }^{({\rm{I}})}}(k + 1) - {{\mathit{\boldsymbol{\widehat z}}}^{({\rm{I}})}}(k + 1|k)} \right) \end{array} $

    (46)

    滤波误差协方差矩阵为:

    $ \begin{array}{l} {P^{({\rm{I}})}}(k + 1|k + 1) = {P^{({\rm{I}})}}(k + 1|k) - {W^{({\rm{I}})}}(k + 1) \cdot \\ P_{vv}^{({\rm{I}})}(k + 1|k){W^{({\rm{I}}){\rm{T}}}}(k + 1) \end{array} $

    (47)

    其中, ${W^{({\rm{I}}){\rm{T}}}}( \cdot ) = {\left( {{W^{({\rm{I}})}}( \cdot )} \right)^{\rm{T}}} $.
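步骤1~步骤3可组织为如下一次完整的预报-更新递推(Python示意, 标量系统、具体的$f$与融合观测函数均为假设的示例; 为简化, 协方差权值取$W^{c}=W^{m}$, 相当于$\alpha=1$, $\beta=0$):

```python
import numpy as np

def wmf_ukf_step(xhat, P, zI, f, hI, Qw, RI, kappa=2.0):
    """WMF-UKF 一步递推(式(34)~式(47)), 标量状态 n=1, α=1 时 λ=κ."""
    n = 1
    s = np.sqrt((n + kappa) * P)                    # 式(34): Sigma 采样点
    chi = np.array([xhat, xhat + s, xhat - s])
    Wm = np.array([kappa, 0.5, 0.5]) / (n + kappa)  # 式(32)
    chi_p = f(chi)                                  # 式(37): 预测 Sigma 点
    x_pred = Wm @ chi_p                             # 式(38): 状态预报
    P_pred = Wm @ (chi_p - x_pred)**2 + Qw          # 式(39)
    z_sig = hI(chi_p)                               # 式(40): 观测预报 Sigma 点 (r x 3)
    z_pred = z_sig @ Wm                             # 式(41): 观测预报
    dz = z_sig - z_pred[:, None]
    Pvv = dz * Wm @ dz.T + RI                       # 式(42)、式(43)
    Pxz = Wm @ ((chi_p - x_pred)[:, None] * dz.T)   # 式(44)
    K = Pxz @ np.linalg.inv(Pvv)                    # 式(45): 滤波增益
    xhat_new = x_pred + K @ (zI - z_pred)           # 式(46)
    P_new = P_pred - K @ Pvv @ K                    # 式(47)
    return float(xhat_new), float(P_new)

# 假设的状态方程与 r=2 维融合观测函数, 仅用于演示
f = lambda x: 0.5 * x + x / (1 + x**2)
hI = lambda x: np.vstack([x, 0.5 * x**2])

x_next = f(np.array([1.2]))[0]                      # 无噪声的下一时刻真实状态
zI = hI(np.array([x_next]))[:, 0]                   # 对应的融合观测
xhat1, P1 = wmf_ukf_step(0.0, 1.0, zI, f, hI, Qw=0.01, RI=0.04 * np.eye(2))
print(xhat1, P1)
```

一步更新后, 估计值应明显向真实状态靠拢, 且滤波误差方差较预报方差显著减小.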

    上述WMF-UKF算法中的式(45)出现了矩阵求逆运算, 因此该算法的时间复杂度由$P_{vv}^{({\rm{I}}) - 1}(k + 1|k) $的计算决定[37], 即WMF-UKF的时间复杂度为${\rm O}(r^{3})$, 其中$r$为压缩后融合观测$\overline{\mathit{\boldsymbol{z}}}^{(\rm{I})}(k)$的维数(即$\overline{H}^{(0)}$的秩); 而CMF-UKF的时间复杂度为 ${\rm{O}}\left( {{{(\sum {_{i = 1}^L} {\mathit{m}_\mathit{i}})}^{\rm{3}}}} \right) $.由定理2知 $r \le \sum {_{i = 1}^L{m_i}} $, 所以WMF-UKF的时间复杂度小于CMF-UKF.

    另外, 随着传感器数量$L$的增加, $\sum_{i=1}^{L}m_{i}$将不断增加.而在拟合采样点数$S$不改变的情况下, 由于$r\leq\min(\sum_{i=1}^{L}m_{i}, S^{n})$, 故$r$将保持在$S^{n}$ (或者更小)不改变.因此随着传感器数量的增加, WMF-UKF较CMF-UKF在计算量上的优势将更加明显.

    本文提出的WMF-UKF所需要的融合参数矩阵$\overline{M}$和$\overline{H}^{(\rm{I})}$可事先离线计算备用, 不必在线计算.而文献[25]所用的Taylor级数方法需要根据预报值在线实时计算融合参数矩阵, 这将带来一定的在线计算负担.相比较之下, 本文提出的WMF-UKF在计算量上具有一定的优势.

    例1. 考虑一个带有4传感器的非线性系统[38]

    $ \begin{array}{l} x\left( k \right) = \frac{{x\left( {k - 1} \right)}}{2} + \frac{{x\left( {k - 1} \right)}}{{\left( {1 + x{{\left( {k - 1} \right)}^2}} \right)}} + \\ \;\;\;\;\;\;\;\;\;\cos \left( {\frac{{k - 1}}{2}} \right) + w\left( k \right) \end{array} $

    (48)

    $ z^{(j)}(k)=h^{(j)}(x(k), k)+v^{(j)}(k), \quad j=1, \cdots, 4 $

    (49)

    其中

    $ \begin{array}{l} {h^{(1)}}(x(k),k) = \frac{4}{5}x(k) + \frac{1}{2}{x^2}(k) + \frac{3}{{10}}\exp\left( {\frac{{x(k)}}{3}} \right)\\ {h^{(2)}}(x(k),k) = \frac{7}{{10}}x(k) + \frac{3}{5}{x^2}(k)\\ {h^{(3)}}(x(k),k) = 2x(k) + \frac{7}{{10}}\exp\left( {\frac{{x(k)}}{3}} \right)\\ {h^{(4)}}(x(k),k) = \frac{3}{5}{x^2}(k) + \frac{4}{5}\exp\left( {\frac{{x(k)}}{3}} \right) \end{array} $

    (50)

    $w(k)$和$v^{(j)}(k)~(j=1, \cdots, 4)$是相互独立的白噪声, 方差分别为: $\sigma^{2}_{w}=1^{2}$, $\sigma^{2}_{v1}=0.09^{2}$, $\sigma^{2}_{v2}=0.1^{2}$, $\sigma^{2}_{v3}=0.12^{2}$, $\sigma^{2}_{v4}=0.13^{2}$.状态初值为$x(0)=0$.由于状态$x(k)$介于$-1\sim4, $因此选取拟合采样点集为: $\{-2, -1, \cdots, 5\}$ (8个等间隔点), 相应的系数选取为: $\gamma=1$.选择$p=2$, 则中介函数为:

    $ \begin{array}{l} \mathit{\boldsymbol{\overline \psi }} (x(k),k) = \left[ {{{\rm{e}}^{ - {{(x - {x_1})}^2}}}\left( {1.5 - {{(x - {x_1})}^2}} \right),} \right. \cdots ,\\ {\left. {{{\rm{e}}^{ - {{(x - {x_8})}^2}}}\left( {1.5 - {{(x - {x_8})}^2}} \right)} \right]^{\rm{T}}} \end{array} $

    (51)

    系数矩阵$H^{(0)}$, $M$和$H^{(\rm{I})}$分别为:

    $ \begin{array}{l} {H^{(0)}} = \left[ {\begin{array}{*{20}{c}} {0.3126}&{ - 0.0480}&{0.1693}&{0.9697}\\ {0.5642}&{ - 0.0564}&0&{0.7334}\\ { - 2.0540}&{ - 0.8454}&{0.3949}&{1.6796}\\ {0.9088}&{0.4927}&{0.4514}&{0.7992} \end{array}} \right.\\ \left. {\begin{array}{*{20}{c}} {2.3607}&{4.3530}&{6.9610}&{10.2053}\\ {2.1439}&{4.2314}&{6.9960}&{10.4375}\\ {3.0260}&{4.4587}&{6.0118}&{7.7329}\\ {1.5561}&{2.7502}&{4.4204}&{6.6211} \end{array}} \right] \end{array} $

    (52)

    $ \begin{equation} M=\left[ \begin{array}{cccc} 0.3126 & -0.0480 & 0.1693\\ 0.5642 & -0.0564 & 0 \\ -2.0540 & -0.8454 & 0.3949 \\ 0.9088 & 0.4927 & 0.4514 \end{array}\right] \end{equation} $

    (53)

    $ \begin{array}{l} {H^{({\rm{I}})}} = \left[ {\begin{array}{*{20}{c}} {1.0000}&0&0&{1.0000}\\ 0&{1.0000}&0&{ - 3.0000}\\ 0&0&{1.0000}&{3.0318} \end{array}} \right.\\ \left. {\begin{array}{*{20}{c}} {3.0000}&{6.0000}&{10.0000}&{15.0000}\\ { - 8.0000}&{ - 15.0000}&{ - 24.0000}&{ - 35.0000}\\ {6.1397}&{10.3857}&{15.8562}&{22.6718} \end{array}} \right] \end{array} $

    (54)

    最后得到基于Gauss-Hermite逼近的WMF-UKF估计曲线和真实曲线如图 1所示.

    图 1  真实状态及WMF-UKF估计曲线
    Fig. 1  Curves of the true state and the WMF-UKF estimate

    本例采用$k$时刻累积均方误差(Accumulated mean square error, AMSE)[24, 39]作为衡量估计准确性的指标函数, 如式(55)所示.

    $ {\rm{AMSE}}(k) = \sum\limits_{t = 0}^{k} \frac{1}{N} \sum\limits_{i = 1}^{N} \left( x^{i}(t) - \hat x^{i}(t|t) \right)^{2} $

    (55)
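式(55)可直接按定义实现(示例数据为随机生成, 仅演示计算方式):

```python
import numpy as np

def amse(x_true, x_est):
    """式(55): k 时刻累积均方误差. 输入形状均为 (N 次 Monte Carlo 实验, 时刻数)."""
    mse_t = np.mean((x_true - x_est)**2, axis=0)   # 每时刻对 N 次实验取均值
    return np.cumsum(mse_t)                        # 对 t=0..k 累积

rng = np.random.default_rng(1)
x_true = rng.normal(size=(20, 100))                # N=20 次实验, 100 个时刻
x_est = x_true + rng.normal(scale=0.1, size=(20, 100))
curve = amse(x_true, x_est)
print(curve.shape)                                  # (100,)
```

由于逐时刻累积, AMSE曲线单调不减, 其增长速率反映了滤波器在各时刻的瞬时精度.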

    其中, $x^{i}(t)$是$t$时刻第$i$次Monte Carlo实验的真实值, $\hat{x}^{i}(t|t)$是$t$时刻第$i$次Monte Carlo实验的估计值.独立进行20次Monte Carlo实验, 得到的AMSE曲线如图 2所示, 其中本例选取局部UKF估计AMSE曲线(Local filter 1~4, LF 1~4)、集中式融合UKF估计AMSE曲线(CMF-UKF)以及本文提出的加权观测融合UKF估计AMSE曲线(WMF-UKF)进行对比.由图 2可以看出CMF-UKF与WMF-UKF具有接近的估计精度, 且均高于局部UKF.在计算量方面, 由于本文压缩后的观测为3维, 因此WMF-UKF滤波过程中的时间复杂度为$\rm O(3^{3})$; 而集中式融合系统观测方程为4维, 时间复杂度为$\rm O(4^{3})$.因此, WMF-UKF计算量要低于CMF-UKF.

    图 2  局部UKF, WMF-UKF以及CMF-UKF的AMSE曲线
    Fig. 2  AMSE curves of local UKF, WMF-UKF and CMF-UKF

    例2. 考虑一个带有8传感器的平面跟踪系统, 在笛卡尔坐标下的状态方程和观测方程如下:

    $ {\pmb x}(k+1)=\Phi{\pmb{x}}(k)+ \Gamma {\pmb w}(k) $

    (56)

    $ \begin{array}{l} {\mathit{\boldsymbol{z}}^{(j)}}(k) = {\mathit{\boldsymbol{h}}^{(j)}}(\mathit{\boldsymbol{x}}(k),k) + \mathit{\boldsymbol{v}}_k^{(j)} = \\ \quad \left[ {\begin{array}{*{20}{c}} {\sqrt {{{(x(k) - {x_j})}^2} + {{(y(k) - {y_j})}^2}} }\\ {\arctan \left( {\frac{{y(k) - {y_j}}}{{x(k) - {x_j}}}} \right)} \end{array}} \right] + {\mathit{\boldsymbol{v}}^{(j)}}(k),{\mkern 1mu} \\ \qquad \qquad \qquad \qquad \quad j = 1, \cdots ,8 \end{array} $

    (57)

    其中, $\mathit{\boldsymbol{x}}(k)={{\left[ x(k)~~\dot{x}(k)~~y(k)~~\dot{y}(k) \right]}^{\text{T}}} $为状态变量, ${\pmb w}(k)$为零均值、方差为$\textit{Q}_{w}={\rm{diag}}\{0.1^{2}, 0.1^{2}\}$的过程噪声.设8个传感器分别放置在4个地点, 其中$l_{1, 2}(5.5, 5)$, $l_{3, 4}(-5, 5.5)$, $l_{5, 6}(-5, -5)$, $l_{7, 8}(5.5, -5.5)$. ${\pmb v}^{(i)}(k)$与${\pmb v}^{(j)}(k)\ (i\neq j)$互不相关.在仿真中, 设采样周期为$T=200\, \rm{ms}$, 状态初值为 $\mathit{\boldsymbol{x}}(0)={{[0\quad 0\quad 0\quad 0]}^{\text{T}}} $.
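式(57)的距离-方位观测函数可示意如下(实现中用`arctan2`代替$\arctan$以避免象限歧义; 传感器位置取文中给出的4个地点, 每处放置2个传感器):

```python
import numpy as np

# 文中 8 个传感器所在的 4 个地点: l_{1,2}, l_{3,4}, l_{5,6}, l_{7,8}
locs = [(5.5, 5.0), (-5.0, 5.5), (-5.0, -5.0), (5.5, -5.5)]
sensor_pos = [p for p in locs for _ in range(2)]    # 每个地点放置 2 个传感器

def h_j(state, pos):
    """式(57): 第 j 个传感器的距离与方位观测."""
    x, y = state[0], state[2]                       # 状态为 [x, ẋ, y, ẏ]^T
    dx, dy = x - pos[0], y - pos[1]
    return np.array([np.hypot(dx, dy), np.arctan2(dy, dx)])

state = np.array([1.0, 0.1, 2.0, -0.1])
h0 = np.concatenate([h_j(state, p) for p in sensor_pos])   # 16 维增广观测函数
print(h0.shape)                                     # (16,)
# 同一地点的两个传感器观测函数完全相同, H̄^(0) 中对应行成对重复,
# 这正是例2中 16 维集中式观测至少可压缩为 8 维的原因
print(np.allclose(h0[0:2], h0[2:4]))                # True
```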

    经测试, 本例选取Gauss-Hermite系数$\gamma=1.04$.为了减少计算量, 本例将目标移动范围划分成了16个1平方公里的子区域, 如图 3(a)所示.每个子区域采用以该区域为中心, 外扩2点的方法避免边缘拟合效果不良.以子区域7为例, 以点(0, 0), (0, 1), (1, 1)和(1, 0)所围区域为中心, 外扩2点得到该子区域的拟合采样点如图 3(b)所示.计算该区域的系数矩阵$\overline{H}^{(0)}$, $\overline{M}$和$\overline{H}^{(\rm{I})}$, 如图 3(c)所示.不难看出, 由于8个传感器位于4个点, 这里至少可以将16维的集中式融合观测方程 ${{\mathit{\boldsymbol{h}}}^{(0)}}(\mathit{\boldsymbol{x}}(k),k) $压缩成8维的加权观测融合方程.将16个区域对应的$\overline{M}$和$\overline{H}^{(\rm{I})}$与中介函数$\mathit{\boldsymbol{\overline \psi }} (\mathit{\boldsymbol{x}}(k), k)$离线计算存储并形成数据库.根据每时刻状态预报, 在数据库中选取相应的$\overline{M}$, $\overline{H}^{(\rm{I})}$以及$\mathit{\boldsymbol{\overline \psi }} (\mathit{\boldsymbol{x}}(k), k)$可减少在线计算负担.

    图 3  加权系数矩阵$\overline{M}$和$\overline{H}^{(\rm{I})}$的计算
    Fig. 3  Calculation of the weighted matrices $\overline{M}$ and $\overline{H}^{(\rm{I})}$

    为了对比分析WMF-UKF的精度和计算量, 本文选取了8传感器集中式融合UKF(8-CMF-UKF), 5传感器集中式融合UKF(5-CMF-UKF)以及3传感器集中式融合UKF(3-CMF-UKF).传感器的选择原则是尽量的分散, 例如, 3-CMF-UKF选择的是1, 3和5传感器, 5-CMF-UKF选择的是1, 3, 5, 7和8传感器.各种融合系统的滤波跟踪轨迹曲线如图 4所示.

    图 4  真实轨迹和WMF-UKF, 8-CMF-UKF和5-CMF-UKF的估计曲线
    Fig. 4  True and estimated tracks using WMF-UKF, 8-CMF-UKF and 5-CMF-UKF

    本例采用$k$时刻位置$(x(k), y(k))$的累积均方误差(AMSE)作为指标函数, 如式(58)所示.

    $ \begin{align} \rm{AMSE}(k)=\,&\sum_{t=0}^{k}\frac{1}{N}\sum_{i=1}^{N}\Big(\left(x^{i}(t)-\hat{x}^{i}(t|t)\right)^{2}+\notag\\ &\left(y^{i}(t)-\hat{y}^{i}(t|t)\right)^{2}\Big) \end{align} $

    (58)

    其中, $(x^{i}(t), y^{i}(t))$是$t$时刻第$i$次Monte Carlo实验的真实值, $(\hat{x}^{i}(t|t), \hat{y}^{i}(t|t))$是$t$时刻第$i$次Monte Carlo实验的估计值.独立进行20次Monte Carlo实验, 得到的AMSE曲线如图 5所示.

    图 5  位置融合估计的AMSE曲线
    Fig. 5  AMSE curves of position fusion estimates

    In terms of accuracy, Fig. 5 shows that the AMSE, from lowest to highest, is 8-CMF-UKF, WMF-UKF, 5-CMF-UKF and 3-CMF-UKF. The experiment shows that the accuracy of centralized fusion improves as the number of sensors increases, and that the accuracy of the proposed WMF-UKF is close to that of the full-measurement centralized fusion 8-CMF-UKF.

    In terms of computational cost, the measurement equation of the weighted measurement fusion system is 8-dimensional, so its time complexity is $\rm O(8^{3})$. The measurement equations of the 3-, 5- and 8-sensor centralized fusion systems are 6-, 10- and 16-dimensional, with time complexities $\rm O(6^{3})$, $\rm O(10^{3})$ and $\rm O(16^{3})$, respectively. Hence the time complexity, from highest to lowest, is 8-CMF-UKF, 5-CMF-UKF, WMF-UKF and 3-CMF-UKF.
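The cubic scaling makes the saving easy to quantify: compressing the measurement from 16 to 8 dimensions cuts the dominant per-step cost by a factor of $16^3/8^3 = 8$:

```python
# Dominant per-step cost of the UKF measurement update scales with the
# cube of the measurement dimension.
dims = {"WMF-UKF": 8, "3-CMF-UKF": 6, "5-CMF-UKF": 10, "8-CMF-UKF": 16}
cost = {name: d ** 3 for name, d in dims.items()}

# Order the filters from most to least expensive in this dominant term
ranking = sorted(cost, key=cost.get, reverse=True)

# Weighted fusion vs. full centralized fusion: 16^3 / 8^3 = 8x cheaper
ratio = cost["8-CMF-UKF"] / cost["WMF-UKF"]
```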

    In addition, for comparison, the AMSE curve of the WMF-UKF obtained with the Taylor-series approximation of [25], using a second-order expansion, is also plotted in Fig. 5. Owing to the truncation order of the Taylor expansion and the choice of expansion point, its accuracy is lower than that of the other fusion algorithms. Moreover, unlike the proposed WMF-UKF, which requires no online computation of the fusion matrices, the WMF-UKF (2nd-order Taylor) of [25] must recompute the fusion parameter matrices at every step from the online prediction, and therefore carries a heavier online computational burden.

    Simulations were also performed with Hermite polynomials of different orders $(p=0, 2, 4)$. Offline testing gives the Gauss-Hermite coefficients $\gamma=0.83\, (p=0)$, $\gamma=1.04\, (p=2)$ and $\gamma=1\, (p=4)$, with all other parameters unchanged. The resulting Monte Carlo AMSE curves are shown in Fig. 6. As Fig. 6 shows, the number of Hermite polynomials is not directly related to the quality of the function approximation, and the resulting fusion estimates exhibit no asymptotic optimality in accuracy. Therefore, testing the approximation offline according to the form of the function to be approximated is crucial to the accuracy of the proposed WMF-UKF.
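Such offline testing of the approximation quality for different orders $p$ can be sketched with a least-squares Hermite fit. The nonlinearity below is a stand-in, not the paper's actual measurement function:

```python
import numpy as np
from numpy.polynomial import hermite as H

def h(x):
    """Stand-in nonlinearity; the real h(x) is the sensor measurement model."""
    return np.sqrt(1.0 + x ** 2)

xs = np.linspace(-1.0, 2.0, 25)  # fitting sample points of one subregion
residuals = {}
for p in (0, 2, 4):
    coef = H.hermfit(xs, h(xs), p)   # least-squares Hermite-series fit
    residuals[p] = np.abs(H.hermval(xs, coef) - h(xs)).max()
# On a bounded region the worst-case residual need not shrink
# monotonically in p, which is why the order and the coefficient gamma
# are chosen by offline testing rather than by taking p as large as possible.
```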

    Fig. 6  AMSE curves of WMF-UKFs with different Hermite polynomials for position

    In summary, with a properly chosen Gauss-Hermite approximating function and coefficient $\gamma$, the proposed WMF-UKF approaches the accuracy of centralized fusion while reducing the computational cost.

    This paper first proposed a generally applicable nonlinear weighted measurement fusion algorithm based on the Gauss-Hermite approximation and weighted least squares, and then, combining it with the UKF, proposed the nonlinear weighted measurement fusion UKF (WMF-UKF). Compared with CMF-UKF, WMF-UKF achieves comparable estimation accuracy at a markedly lower computational cost, and its computational advantage grows as the number of sensors increases. Simulations comparing the proposed algorithm with existing related algorithms demonstrate its effectiveness.

  • Fig.  1  Different methods summarized in this article and the main issues they address

    Fig.  2  A CNN framework for motion quality assessment

    Fig.  3  The schematic diagram of the human skeleton

    Fig.  4  The method based on ranking prediction

    Table  1  Main tasks and open problems in the different stages of vision-based motion quality assessment

    Stage | Main tasks | Open problems
    Motion data acquisition | Collect and record motion-related data (RGB, depth maps, skeleton sequences) with visual sensors | How to choose a suitable data modality for a given application scenario? How to ensure the quality of expert scoring?
    Motion feature representation | Combine static image and human motion information to design discriminative feature vectors that describe the movement process | How to learn strongly discriminative motion features tailored to the quality assessment task, so as to effectively extract and represent the subtle differences between performers executing the same action?
    Motion quality assessment | Design a feature mapping that relates the extracted features to the scoring, grading or ranking target | How to account for annotation uncertainty (e.g., scoring differences among experts) and score variation within the same action when designing the loss function?

    Table  2  Brief overview of mainstream motion quality assessment datasets

    Dataset | Action classes | Samples (subjects) | Annotation type | Application scenario | Data modality | Year
    Heian Shodan[25] 1 14 Grade Fitness 3D skeleton 2003
    FINA09 Dive[26] 1 68 Score Sports events RGB video 2010
    MIT-Dive[8] 1 159 Score, feedback Sports events RGB video 2014
    MIT-Skate[8] 1 150 Score Sports events RGB video 2014
    SPHERE-Staircase2014[10] 1 48 Grade Rehabilitation 3D skeleton 2014
    JIGSAWS[9] 3 103 Grade Skill training RGB video, kinematic data 2014
    SPHERE-Walking2015[16] 1 40 Grade Rehabilitation 3D skeleton 2016
    SPHERE-SitStand2015[16] 1 109 Grade Rehabilitation 3D skeleton 2016
    LAM Exercise Dataset[23] 5 125 Grade Rehabilitation 3D skeleton 2016
    First-Person Basketball[27] 1 48 Ranking Fitness RGB video 2017
    UNLV-Dive[28] 1 370 Score Sports events RGB video 2017
    UNLV-Vault[28] 1 176 Score Sports events RGB video 2017
    UI-PRMD[20] 10 100 Grade Rehabilitation 3D skeleton 2018
    EPIC-Skills 2018[24] 4 216 Ranking Skill training RGB video 2018
    Infant Grasp[29] 1 94 Ranking Skill training RGB video 2019
    AQA-7[30] 7 1189 Score Sports events RGB video 2019
    MTL-AQA[31] 1 1412 Score Sports events RGB video 2019
    FSD-10[32] 10 1484 Score Sports events RGB video 2019
    BEST 2019[32] 5 500 Ranking Skill training RGB video 2019
    KIMORE[22] 5 78 Score Rehabilitation RGB, depth video, 3D skeleton 2019
    Fis-V[33] 1 500 Score Sports events RGB video 2020
    TASD-2(SyncDiving-3m)[34] 1 238 Score Sports events RGB video 2020
    TASD-2(SyncDiving-10m)[34] 1 368 Score Sports events RGB video 2020
    RG[35] 4 1000 Score Sports events RGB video 2020
    QMAR[36] 6 38 Grade Rehabilitation RGB video 2020
    PISA[37] 1 992 Grade Skill training RGB video, audio 2021
    FR-FS[38] 1 417 Score Sports events RGB video 2021
    SMART[39] 8 640 Score Sports events, fitness RGB video 2021
    Fitness-AQA[40] 3 1000 Feedback Fitness RGB video 2022
    Finediving[41] 1 3000 Score Sports events RGB video 2022
    LOGO[42] 1 200 Score Sports events RGB video 2023
    RFSJ[43] 23 1304 Score Sports events RGB video 2023
    FineFS[44] 2 1167 Score Sports events RGB video, skeleton data 2023
    AGF-Olympics[45] 1 500 Score Sports events RGB video, skeleton data 2024

    Table  3  Comparison of the advantages and disadvantages of the two types of motion feature representation methods

    Method category | Advantages | Disadvantages
    RGB-based motion representation learning[11, 29, 47] | Easy data acquisition; rich visual information about the action; low environmental requirements; broad applicability | Large data volume with high storage and processing costs; susceptible to irrelevant environmental factors such as lighting and cluttered backgrounds
    Skeleton-based motion representation learning[48−50] | Little redundant data; low computational overhead; strong resistance to external interference | Requires highly accurate skeleton sequences; cannot capture the performer's interaction with the environment

    Table  4  Comparison of the advantages and disadvantages of RGB-based deep motion feature methods

    Method category | Advantages | Disadvantages
    CNN-based motion feature representation[12, 24, 28, 30−33, 48, 54, 59] | Simple and easy to implement | Cannot fully capture the complexity of motion features
    Siamese-network-based motion feature representation[24, 62−64] | Convenient for modeling subtle differences between actions | High computational complexity; requires the construction of effective sample pairs
    Temporal-segmentation-based motion feature representation[44, 48, 59, 65−68] | Reduces noise interference; better captures the details and variations of the action | Requires extra segmentation annotation; inaccurate segment division strongly degrades performance
    Attention-based motion feature representation[29, 32−35, 38, 41, 43−44, 68−72] | Good adaptivity; strong ability to capture important features; good interpretability | High computational complexity and memory consumption

    Table  5  Comparison of the advantages and disadvantages of skeleton-based deep motion feature methods

    Method category | Advantages | Disadvantages
    ST-GCN[93] | Simple model structure, easy to implement | Difficult to model long-term dependencies; limited ability to model fine-grained features
    ST-GCN + LSTM[94−95] | Better temporal modeling than ST-GCN | Increased computational complexity; LSTM hyperparameters need careful tuning
    Improved spatio-temporal GCNs[49, 97] | Can model fine-grained features in a targeted way | Poor generalization performance
    Multi-modal two-stream networks[98] | Richer feature representations; better overall robustness | Harder data acquisition; increased computational complexity; requires an effective modality fusion strategy

    Table  6  Performance comparison of different methods on the sports scoring dataset AQA-7

    Method Diving Gym Vault Skiing Snowboard Sync. 3m Sync. 10m AQA-7 Traditional/Deep Year
    Pose+DCT+SVR[8] 0.5300 0.1000 Traditional 2014
    C3D+SVR[28] 0.7902 0.6824 0.5209 0.4006 0.5937 0.9120 0.6937 Deep 2017
    C3D+LSTM[28] 0.6047 0.5636 0.4593 0.5029 0.7912 0.6927 0.6165 Deep 2017
    Li et al.[11] 0.8009 0.7028 Deep 2018
    S3D[59] 0.8600 Deep 2018
    All-action C3D+LSTM[30] 0.6177 0.6746 0.4955 0.3648 0.8410 0.7343 0.6478 Deep 2019
    C3D-AVG-MTL[30] 0.8808 Deep 2019
    JRG[49] 0.7630 0.7358 0.6006 0.5405 0.9013 0.9254 0.7849 Deep 2019
    USDL[12] 0.8099 0.7570 0.6538 0.7109 0.9166 0.8878 0.8102 Deep 2020
    AIM[36] 0.7419 0.7296 0.5890 0.4960 0.9298 0.9043 0.7789 Deep 2020
    DML[62] 0.6900 0.4400 Deep 2021
    CoRe[63] 0.8824 0.7746 0.7115 0.6624 0.9442 0.9078 0.8401 Deep 2021
    Lei et al.[69] 0.8649 0.7858 Deep 2021
    EAGLE-EYE[98] 0.8331 0.7411 0.6635 0.6447 0.9143 0.9158 0.8140 Deep 2021
    TSA-Net[38] 0.8379 0.8004 0.6657 0.6962 0.9493 0.9334 0.8476 Deep 2021
    Adaptive[97] 0.8306 0.7593 0.7208 0.6940 0.9588 0.9298 0.8500 Deep 2022
    PCLN[64] 0.8697 0.8759 0.7754 0.5778 0.9629 0.9541 0.8795 Deep 2022
    TPT[70] 0.8969 0.8043 0.7336 0.6965 0.9456 0.9545 0.8715 Deep 2022

    Table  7  Performance comparison of different methods on JIGSAWS

    Method | Data modality | Skill label | Skill-level split | Cross-validation | Metric | SU | KT | NP | Year
    k-NN[110] Motion features GRS Two classes LOSO Accuracy 0.897 0.821 2018
    LOUO Accuracy 0.719 0.729 2018
    LR[110] Motion features GRS Two classes LOSO Accuracy 0.899 0.823 2018
    LOUO Accuracy 0.744 0.702 2018
    SVM[110] Motion features GRS Two classes LOSO Accuracy 0.754 0.754 2018
    LOUO Accuracy 0.798 0.779 2018
    SMT[111] Motion features Self-proclaimed Three classes LOSO Accuracy 0.990 0.996 0.999 2018
    LOUO Accuracy 0.353 0.323 0.571 2018
    DCT[111] Motion features Self-proclaimed Three classes LOSO Accuracy 1.000 0.997 0.999 2018
    LOUO Accuracy 0.647 0.548 0.357 2018
    DFT[111] Motion features Self-proclaimed Three classes LOSO Accuracy 1.000 0.999 0.999 2018
    LOUO Accuracy 0.647 0.516 0.464 2018
    ApEn[111] Motion features Self-proclaimed Three classes LOSO Accuracy 1.000 0.999 1.000 2018
    LOUO Accuracy 0.882 0.774 0.857 2018
    CNN[102] Motion features Self-proclaimed Three classes LOSO Accuracy 0.934 0.898 0.849 2018
    CNN[102] Motion features GRS Three classes LOSO Accuracy 0.925 0.954 0.913 2018
    CNN[105] Motion features Self-proclaimed Three classes LOSO Micro F1 1.000 0.921 1.000 2018
    Macro F1 1.000 0.932 1.000 2018
    Forestier et al.[112] Motion features GRS Three classes LOSO Micro F1 0.897 0.611 0.963 2018
    Macro F1 0.867 0.533 0.958 2018
    S3D[59] Video data GRS Three classes LOSO SRC 0.680 0.640 0.570 2018
    LOUO SRC 0.030 0.140 0.350 2018
    FCN[99] Motion features Self-proclaimed Three classes LOSO Micro F1 1.000 0.921 1.000 2019
    Macro F1 1.000 0.932 1.000 2019
    3D ConvNet (RGB)[103] Video data Self-proclaimed Three classes LOSO Accuracy 1.000 0.958 0.964 2019
    3D ConvNet (OF)[103] Video data Self-proclaimed Three classes LOSO Accuracy 1.000 0.951 1.000 2019
    JRG[49] Video data GRS Three classes LOUO SRC 0.350 0.190 0.670 2019
    USDL[12] Video data GRS Three classes 4-fold cross validation SRC 0.710 0.710 0.690 2020
    AIM[34] Video data + motion features GRS Three classes LOUO SRC 0.450 0.610 0.340 2020
    MTL-VF (ResNet)[113] Video data GRS Three classes LOSO SRC 0.790 0.630 0.730 2020
    LOUO SRC 0.680 0.720 0.480 2020
    MTL-VF (C3D)[113] Video data GRS Three classes LOSO SRC 0.770 0.890 0.750 2020
    LOUO SRC 0.690 0.830 0.860 2020
    CoRe[63] Video data GRS Three classes 4-fold cross validation SRC 0.840 0.860 0.860 2021
    VTPE[106] Video data + motion features GRS Three classes LOUO SRC 0.450 0.590 0.650 2021
    4-fold cross validation SRC 0.830 0.820 0.760 2021
    ViSA[107] Video data GRS Three classes LOSO SRC 0.840 0.920 0.930 2022
    LOUO SRC 0.720 0.760 0.900 2022
    4-fold cross validation SRC 0.790 0.840 0.860 2022
    Gao et al.[108] Video data + motion features GRS Three classes LOUO SRC 0.600 0.690 0.660 2023
    4-fold cross validation SRC 0.830 0.950 0.830 2023
    Contra-Sformer[109] Video data GRS Three classes LOSO SRC 0.860 0.890 0.710 2023
    LOUO SRC 0.650 0.690 0.710 2023
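Several entries in Tables 6 and 7 are evaluated with Spearman's rank correlation (SRC) between predicted and ground-truth scores. A minimal self-contained implementation, assuming distinct score values (no tie handling):

```python
def spearman(pred, truth):
    """Spearman's rank correlation between two score lists (distinct values)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, idx in enumerate(order):
            r[idx] = rank
        return r

    rp, rt = ranks(pred), ranks(truth)
    n = len(pred)
    d2 = sum((a - b) ** 2 for a, b in zip(rp, rt))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# A model that ranks performances in the same order as the judges scores
# SRC = 1.0 even if its absolute score predictions are off.
rho = spearman([55.1, 60.4, 72.0, 80.5], [60.0, 65.5, 70.0, 90.0])
```

This rank-only behavior is why SRC is the standard metric for score regression in action quality assessment: it rewards correct ordering rather than calibrated absolute scores.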

    Table  8  Performance comparison of different methods on EPIC-Skills 2018

    Method | Chopstick-Using | Surgery | Drawing | Dough-Rolling | Year
    Siamese TSN with $L_{rank3}$[24] 71.5% 70.2% 83.2% 79.4% 2018
    Rank-aware Attention[32] 84.7% 68.5% 82.3% 86.9% 2019
    RNN-based Spatial Attention[29] 85.5% 73.1% 85.3% 82.7% 2019
    Adaptive[97] 87.7% 71.9% 88.2% 88.5% 2021
  • [1] Zhu Yu, Zhao Jiang-Kun, Wang Yi-Ning, Zheng Bing-Bing. A review of human action recognition based on deep learning. Acta Automatica Sinica, 2016, 42(6): 848−857
    [2] Lei Q, Du J X, Zhang H B, Ye S, Chen D S. A survey of vision-based human action evaluation methods. Sensors, 2019, 19(19): Article No. 4129 doi: 10.3390/s19194129
    [3] Ahad M A R, Antar A D, Shahid O. Vision-based action understanding for assistive healthcare: A short review. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. Long Beach, USA: IEEE, 2019. 1−11
    [4] Voulodimos A, Doulamis N, Doulamis A, Protopapadakis E. Deep learning for computer vision: A brief review. Computational Intelligence and Neuroscience, 2018, 2018(1): Article No. 7068349
    [5] Zheng Tai-Xiong, Huang Shuai, Li Yong-Fu, Feng Ming-Chi. Key techniques for vision based 3D reconstruction: A review. Acta Automatica Sinica, 2020, 46(4): 631−652
    [6] Lin Jing-Dong, Wu Xin-Yi, Chai Yi, Yin Hong-Peng. Structure optimization of convolutional neural networks: A survey. Acta Automatica Sinica, 2020, 46(1): 24−37
    [7] Zhang Chong-Sheng, Chen Jie, Li Qi-Long, Deng Bin-Quan, Wang Jie, Chen Cheng-Gong. Deep contrastive learning: A survey. Acta Automatica Sinica, 2023, 49(1): 15−39
    [8] Pirsiavash H, Vondrick C, Torralba A. Assessing the quality of actions. In: Proceedings of the 13th European Conference on Computer Vision (ECCV 2014). Zurich, Switzerland: Springer, 2014. 556−571
    [9] Gao Y, Vedula S S, Reiley C E, Ahmidi N, Varadarajan B, Lin H C, et al. JHU-ISI gesture and skill assessment working set (JIGSAWS): A surgical activity dataset for human motion modeling. In: Proceedings of the Modeling and Monitoring of Computer Assisted Interventions (M2CAI)-MICCAI Workshop. 2014.
    [10] Paiement A, Tao L L, Hannuna S, Camplani M, Damen D, Mirmehdi M. Online quality assessment of human movement from skeleton data. In: Proceedings of the British Machine Vision Conference. Nottingham, UK: BMVA Press, 2014. 153−166
    [11] Li Y J, Chai X J, Chen X L. End-to-end learning for action quality assessment. In: Proceedings of the 19th Pacific-Rim Conference on Multimedia, Advances in Multimedia Information Processing (PCM 2018). Hefei, China: Springer, 2018. 125−134
    [12] Tang Y S, Ni Z L, Zhou J H, Zhang D Y, Lu J W, Wu Y, et al. Uncertainty-aware score distribution learning for action quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2020. 9839−9848
    [13] Xu J L, Yin S B, Zhao G H, Wang Z S, Peng Y X. FineParser: A fine-grained spatio-temporal action parser for human-centric action quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE, 2024. 14628−14637
    [14] Morgulev E, Azar O H, Lidor R. Sports analytics and the big-data era. International Journal of Data Science and Analytics, 2018, 5(4): 213−222 doi: 10.1007/s41060-017-0093-7
    [15] Butepage J, Black M J, Kragic D, Kjellström H. Deep representation learning for human motion prediction and classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 1591−1599
    [16] Tao L L, Paiement A, Damen D, Mirmehdi M, Hannuna S, Camplani M, et al. A comparative study of pose representation and dynamics modelling for online motion quality assessment. Computer Vision and Image Understanding, 2016, 148: 136−152 doi: 10.1016/j.cviu.2015.11.016
    [17] Khalid S, Goldenberg M, Grantcharov T, Taati B, Rudzicz F. Evaluation of deep learning models for identifying surgical actions and measuring performance. JAMA Network Open, 2020, 3(3): Article No. e201664 doi: 10.1001/jamanetworkopen.2020.1664
    [18] Qiu Y H, Wang J P, Jin Z, Chen H H, Zhang M L, Guo L Q. Pose-guided matching based on deep learning for assessing quality of action on rehabilitation training. Biomedical Signal Processing and Control, 2022, 72: Article No. 103323 doi: 10.1016/j.bspc.2021.103323
    [19] Niewiadomski R, Kolykhalova K, Piana S, Alborno P, Volpe G, Camurri A. Analysis of movement quality in full-body physical activities. ACM Transactions on Interactive Intelligent Systems (TiiS), 2019, 9(1): Article No. 1
    [20] Vakanski A, Jun H P, Paul D, Baker R. A data set of human body movements for physical rehabilitation exercises. Data, 2018, 3(1): Article No. 2 doi: 10.3390/data3010002
    [21] Alexiadis D S, Kelly P, Daras P, O'Connor N E, Boubekeur T, Moussa M B. Evaluating a dancer's performance using Kinect-based skeleton tracking. In: Proceedings of the 19th ACM International Conference on Multimedia. Scottsdale, USA: ACM, 2011. 659−662
    [22] Capecci M, Ceravolo M G, Ferracuti F, Iarlori S, Monteriù A, Romeo L, et al. The KIMORE dataset: Kinematic assessment of movement and clinical scores for remote monitoring of physical rehabilitation. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2019, 27(7): 1436−1448 doi: 10.1109/TNSRE.2019.2923060
    [23] Parmar P, Morris B T. Measuring the quality of exercises. In: Proceedings of the 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Orlando, USA: IEEE, 2016. 2241−2244
    [24] Doughty H, Damen D, Mayol-Cuevas W. Who's better? Who's best? Pairwise deep ranking for skill determination. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 6057−6066
    [25] Ilg W, Mezger J, Giese M. Estimation of skill levels in sports based on hierarchical spatio-temporal correspondences. In: Proceedings of the 25th DAGM Symposium on Pattern Recognition. Springer, 2003. 523−531
    [26] Wnuk K, Soatto S. Analyzing diving: A dataset for judging action quality. In: Proceedings of the ACCV 2010 International Workshops on Computer Vision. Queenstown, New Zealand: Springer, 2010. 266−276
    [27] Bertasius G, Park H S, Yu S X, Shi J B. Am I a baller? Basketball performance assessment from first-person videos. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 2196−2204
    [28] Parmar P, Morris B T. Learning to score Olympic events. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Honolulu, USA: IEEE, 2017. 76−84
    [29] Li Z Q, Huang Y F, Cai M J, Sato Y. Manipulation-skill assessment from videos with spatial attention network. In: Proceedings of the IEEE/CVF International Conference on Computer Vision Workshop (ICCVW). Seoul, Korea: IEEE, 2019. 4385−4395
    [30] Parmar P, Morris B. Action quality assessment across multiple actions. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV). Waikoloa Village, USA: IEEE, 2019. 1468−1476
    [31] Parmar P, Morris B T. What and how well you performed? A multitask learning approach to action quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 304−313
    [32] Doughty H, Mayol-Cuevas W, Damen D. The pros and cons: Rank-aware temporal attention for skill determination in long videos. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 7854−7863
    [33] Xu C M, Fu Y W, Zhang B, Chen Z T, Jiang Y G, Xue X Y. Learning to score figure skating sport videos. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(12): 4578−4590 doi: 10.1109/TCSVT.2019.2927118
    [34] Gao J B, Zheng W S, Pan J H, Gao C Y, Wang Y W, Zeng W, et al. An asymmetric modeling for action assessment. In: Proceedings of the 16th European Conference on Computer Vision (ECCV 2020). Glasgow, UK: Springer, 2020. 222−238
    [35] Zeng L A, Hong F T, Zheng W S, Yu Q Z, Zeng W, Wang Y W, et al. Hybrid dynamic-static context-aware attention network for action assessment in long videos. In: Proceedings of the 28th ACM International Conference on Multimedia. Seattle, USA: ACM, 2020. 2526−2534
    [36] Sardari F, Paiement A, Hannuna S, Mirmehdi M. VI-Net: View-invariant quality of human movement assessment. Sensors, 2020, 20(18): Article No. 5258 doi: 10.3390/s20185258
    [37] Parmar P, Reddy J, Morris B. Piano skills assessment. In: Proceedings of the 23rd International Workshop on Multimedia Signal Processing (MMSP). Tampere, Finland: IEEE, 2021. 1−5
    [38] Wang S L, Yang D K, Zhai P, Chen C X, Zhang L H. TSA-Net: Tube self-attention network for action quality assessment. In: Proceedings of the 29th ACM International Conference on Multimedia. Virtual Event: ACM, 2021. 4902−4910
    [39] Chen X, Pang A Q, Yang W, Ma Y X, Xu L, Yu J Y. SportsCap: Monocular 3D human motion capture and fine-grained understanding in challenging sports videos. International Journal of Computer Vision, 2021, 129(10): 2846−2864 doi: 10.1007/s11263-021-01486-4
    [40] Parmar P, Gharat A, Rhodin H. Domain knowledge-informed self-supervised representations for workout form assessment. In: Proceedings of the 17th European Conference on Computer Vision (ECCV 2022). Tel Aviv, Israel: Springer, 2022. 105−123
    [41] Xu J L, Rao Y M, Yu X M, Chen G Y, Zhou J, Lu J W. FineDiving: A fine-grained dataset for procedure-aware action quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, USA: IEEE, 2022. 2939−2948
    [42] Zhang S Y, Dai W X, Wang S J, Shen X W, Lu J W, Zhou J, et al. LOGO: A long-form video dataset for group action quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE, 2023. 2405−2414
    [43] Liu Y C, Cheng X N, Ikenaga T. A figure skating jumping dataset for replay-guided action quality assessment. In: Proceedings of the 31st ACM International Conference on Multimedia. Ottawa, Canada: ACM, 2023. 2437−2445
    [44] Ji Y L, Ye L F, Huang H L, Mao L J, Zhou Y, Gao L L. Localization-assisted uncertainty score disentanglement network for action quality assessment. In: Proceedings of the 31st ACM International Conference on Multimedia. Ottawa, Canada: ACM, 2023. 8590−8597
    [45] Zahan S, Hassan G M, Mian A. Learning sparse temporal video mapping for action quality assessment in floor gymnastics. IEEE Transactions on Instrumentation and Measurement, 2024, 73: Article No. 5020311
    [46] Ahmidi N, Tao L L, Sefati S, Gao Y X, Lea C, Haro B, et al. A dataset and benchmarks for segmentation and recognition of gestures in robotic surgery. IEEE Transactions on Biomedical Engineering, 2017, 64(9): 2025−2041 doi: 10.1109/TBME.2016.2647680
    [47] Liao Y L, Vakanski A, Xian M. A deep learning framework for assessing physical rehabilitation exercises. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2020, 28(2): 468−477 doi: 10.1109/TNSRE.2020.2966249
    [48] Li Y J, Chai X J, Chen X L. ScoringNet: Learning key fragment for action quality assessment with ranking loss in skilled sports. In: Proceedings of the 14th Asian Conference on Computer Vision (ACCV 2018). Perth, Australia: Springer, 2018. 149−164
    [49] Pan J H, Gao J B, Zheng W S. Action assessment by joint relation graphs. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, Korea: IEEE, 2019. 6330−6339
    [50] Lei Q, Zhang H B, Du J X, Hsiao T, Chen C C. Learning effective skeletal representations on RGB video for fine-grained human action quality assessment. Electronics, 2020, 9(4): Article No. 568 doi: 10.3390/electronics9040568
    [51] Gordon A S. Automated video assessment of human performance. In: Proceedings of the AI-ED-World Conference on Artificial Intelligence in Education. Washington, USA: AACE Press, 1995. 541−546
    [52] Venkataraman V, Vlachos I, Turaga P. Dynamical regularity for action analysis. In: Proceedings of the British Machine Vision Conference. Swansea, UK: BMVA Press, 2015. 67−78
    [53] Zia A, Sharma Y, Bettadapura V, Sarin E L, Ploetz T, Clements M A, et al. Automated video-based assessment of surgical skills for training and evaluation in medical schools. International Journal of Computer Assisted Radiology and Surgery, 2016, 11(9): 1623−1636 doi: 10.1007/s11548-016-1468-2
    [54] Parmar P. On Action Quality Assessment [Ph.D. dissertation], University of Nevada, USA, 2019.
    [55] Simonyan K, Zisserman A. Two-stream convolutional networks for action recognition in videos. In: Proceedings of the 27th International Conference on Neural Information Processing Systems. Cambridge, US: ACM, 2014. 568−576
    [56] Tran D, Bourdev L, Fergus R, Torresani L, Paluri M. Learning spatiotemporal features with 3D convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Santiago, Chile: IEEE, 2015. 4489−4497
    [57] Carreira J, Zisserman A. Quo Vadis, action recognition? A new model and the kinetics dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 4724−4733
    [58] Qiu Z F, Yao T, Mei T. Learning spatio-temporal representation with pseudo-3D residual networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 5534−5542
    [59] Xiang X, Tian Y, Reiter A, Hager G D, Tran T D. S3D: Stacking segmental P3D for action quality assessment. In: Proceedings of the IEEE International Conference on Image Processing (ICIP). Athens, Greece: IEEE, 2018. 928−932
    [60] Yu F, Koltun V. Multi-scale context aggregation by dilated convolutions. In: Proceedings of the 4th International Conference on Learning Representations. San Juan, Puerto Rico: ICLR, 2016. 928−932
    [61] Bromley J, Bentz J W, Bottou L, Guyon I, LeCun Y, Moore C, et al. Signature verification using a "siamese" time delay neural network. International Journal of Pattern Recognition and Artificial Intelligence, 1993, 7(4): 669−688 doi: 10.1142/S0218001493000339
    [62] Jain H, Harit G, Sharma A. Action quality assessment using siamese network-based deep metric learning. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(6): 2260−2273 doi: 10.1109/TCSVT.2020.3017727
    [63] Yu X M, Rao Y M, Zhao W L, Lu J W, Zhou J. Group-aware contrastive regression for action quality assessment. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). Montreal, Canada: IEEE, 2021. 7899−7908
    [64] Li M Z, Zhang H B, Lei Q, Fan Z W, Liu J H, Du J X. Pairwise contrastive learning network for action quality assessment. In: Proceedings of the 17th European Conference on Computer Vision (ECCV 2022). Tel Aviv, Israel: Springer, 2022. 457−473
    [65] Dong L J, Zhang H B, Shi Q H Y, Lei Q, Du J X, Gao S C. Learning and fusing multiple hidden substages for action quality assessment. Knowledge-Based Systems, 2021, 229: Article No. 107388 doi: 10.1016/j.knosys.2021.107388
    [66] Lea C, Flynn M D, Vidal R, Reiter A, Hager G D. Temporal convolutional networks for action segmentation and detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 1003−1012
    [67] Liu L X, Zhai P J, Zheng D L, Fang Y. Multi-stage action quality assessment method. In: Proceedings of the 4th International Conference on Control, Robotics and Intelligent System. Guangzhou, China: ACM, 2023. 116−122
    [68] Gedamu K, Ji Y L, Yang Y, Shao J, Shen H T. Fine-grained spatio-temporal parsing network for action quality assessment. IEEE Transactions on Image Processing, 2023, 32: 6386−6400 doi: 10.1109/TIP.2023.3331212
    [69] Lei Q, Zhang H B, Du J X. Temporal attention learning for action quality assessment in sports video. Signal, Image and Video Processing, 2021, 15(7): 1575−1583 doi: 10.1007/s11760-021-01890-w
    [70] Bai Y, Zhou D S, Zhang S Y, Wang J, Ding E R, Guan Y, et al. Action quality assessment with temporal parsing transformer. In: Proceedings of the 17th European Conference on Computer Vision (ECCV 2022). Tel Aviv, Israel: Springer, 2022. 422−438
    [71] Xu A, Zeng L A, Zheng W S. Likert scoring with grade decoupling for long-term action assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 3222−3231
    [72] Du Z X, He D, Wang X, Wang Q. Learning semantics-guided representations for scoring figure skating. IEEE Transactions on Multimedia, 2024, 26: 4987−4997 doi: 10.1109/TMM.2023.3328180
    [73] Yan S J, Xiong Y J, Lin D H. Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. New Orleans, USA: AAAI, 2018. 7444−7452
    [74] Gao X, Hu W, Tang J X, Liu J Y, Guo Z M. Optimized skeleton-based action recognition via sparsified graph regression. In: Proceedings of the ACM International Conference on Multimedia. Nice, France: ACM, 2019. 601−610
    [75] Patrona F, Chatzitofis A, Zarpalas D, Daras P. Motion analysis: Action detection, recognition and evaluation based on motion capture data. Pattern Recognition, 2018, 76: 612−622 doi: 10.1016/j.patcog.2017.12.007
    [76] Microsoft Development Team. Azure Kinect body tracking joints [Online], available: https://learn.microsoft.com/en-us/previous-versions/azure/kinect-dk/body-joints, December 12, 2024
    [77] Yang Y, Ramanan D. Articulated pose estimation with flexible mixtures-of-parts. In: Proceedings of the CVPR 2011. Colorado Springs, USA: IEEE, 2011. 1385−1392
    [78] Felzenszwalb P F, Girshick R B, McAllester D, Ramanan D. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010, 32(9): 1627−1645 doi: 10.1109/TPAMI.2009.167
    [79] Tian Y, Sukthankar R, Shah M. Spatiotemporal deformable part models for action detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Portland, USA: IEEE, 2013. 2642−2649
    [80] Cao Z, Simon T, Wei S E, Sheikh Y. Realtime multi-person 2D pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 7291−7299
    [81] Fang H S, Xie S Q, Tai Y W, Lu C W. RMPE: Regional multi-person pose estimation. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 2353−2362
    [82] He K, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 2980−2988
    [83] Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, et al. Real-time human pose recognition in parts from single depth images. In: Proceedings of the CVPR 2011. Colorado Springs, USA: IEEE, 2011. 1297−1304
    [84] Rhodin H, Meyer F, Spörri J, Müller E, Constantin V, Fua P, et al. Learning monocular 3D human pose estimation from multi-view images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018. 8437−8446
    [85] Dong J T, Jiang W, Huang Q X, Bao H J, Zhou X W. Fast and robust multi-person 3D pose estimation from multiple views. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE, 2019. 7784−7793
    [86] Celiktutan O, Akgul C B, Wolf C, Sankur B. Graph-based analysis of physical exercise actions. In: Proceedings of the 1st ACM International Workshop on Multimedia Indexing and Information Retrieval for Healthcare. Barcelona, Spain: ACM, 2013. 23−32
    [87] Liu J, Wang G, Hu P, Duan L Y, Kot A C. Global context-aware attention LSTM networks for 3D action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE, 2017. 3671−3680
    [88] Lee I, Kim D, Kang S, Lee S. Ensemble deep learning for skeleton-based action recognition using temporal sliding LSTM networks. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 1012−1020
    [89] Li C, Zhong Q Y, Xie D, Pu S L. Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation. In: Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, Sweden: AAAI Press, 2018. 786−792
    [90] Li Y S, Xia R J, Liu X, Huang Q H. Learning shape-motion representations from geometric algebra spatio-temporal model for skeleton-based action recognition. In: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME). Shanghai, China: IEEE, 2019. 1066−1071
    [91] Li M S, Chen S H, Chen X, Zhang Y, Wang Y F, Tian Q. Actional-structural graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE, 2019. 3590−3598
    [92] Shi L, Zhang Y F, Cheng J, Lu H Q. Two-stream adaptive graph convolutional networks for skeleton-based action recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE, 2019. 12018−12027
    [93] Yu B X B, Liu Y, Chan K C C. Skeleton-based detection of abnormalities in human actions using graph convolutional networks. In: Proceedings of the 2nd International Conference on Transdisciplinary AI (TransAI). Irvine, USA: IEEE, 2020. 131−137
    [94] Chowdhury S H, Al Amin M, Rahman A K M M, Amin M A, Ali A A. Assessment of rehabilitation exercises from depth sensor data. In: Proceedings of the 24th International Conference on Computer and Information Technology. Dhaka, Bangladesh: IEEE, 2021. 1−7
    [95] Deb S, Islam M F, Rahman S, Rahman S. Graph convolutional networks for assessment of physical rehabilitation exercises. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2022, 30: 410−419 doi: 10.1109/TNSRE.2022.3150392
    [96] Li H Y, Lei Q, Zhang H B, Du J X, Gao S C. Skeleton-based deep pose feature learning for action quality assessment on figure skating videos. Journal of Visual Communication and Image Representation, 2022, 89: Article No. 103625 doi: 10.1016/j.jvcir.2022.103625
    [97] Pan J H, Gao J B, Zheng W S. Adaptive action assessment. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(12): 8779−8795 doi: 10.1109/TPAMI.2021.3126534
    [98] Nekoui M, Cruz F O T, Cheng L. Eagle-eye: Extreme-pose action grader using detail bird's-eye view. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). Waikoloa, USA: IEEE, 2021. 394−402
    [99] Fawaz H I, Forestier G, Weber J, Idoumghar L, Muller P A. Accurate and interpretable evaluation of surgical skills from kinematic data using fully convolutional neural networks. International Journal of Computer Assisted Radiology and Surgery, 2019, 14(9): 1611−1617 doi: 10.1007/s11548-019-02039-4
    [100] Roditakis K, Makris A, Argyros A. Towards improved and interpretable action quality assessment with self-supervised alignment. In: Proceedings of the 14th PErvasive Technologies Related to Assistive Environments Conference. Corfu, Greece: IEEE, 2021. 507−513
    [101] Li M Z, Zhang H B, Dong L J, Lei Q, Du J X. Gaussian guided frame sequence encoder network for action quality assessment. Complex & Intelligent Systems, 2023, 9(2): 1963−1974
    [102] Wang Z, Fey A M. Deep learning with convolutional neural network for objective skill evaluation in robot-assisted surgery. International Journal of Computer Assisted Radiology and Surgery, 2018, 13(12): 1959−1970 doi: 10.1007/s11548-018-1860-1
    [103] Funke I, Mees S T, Weitz J, Speidel S. Video-based surgical skill assessment using 3D convolutional neural networks. International Journal of Computer Assisted Radiology and Surgery, 2019, 14(7): 1217−1225 doi: 10.1007/s11548-019-01995-1
    [104] Wang Z, Fey A M. SATR-DL: Improving surgical skill assessment and task recognition in robot-assisted surgery with deep neural networks. In: Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). Honolulu, USA: IEEE, 2018. 1793−1796
    [105] Fawaz H I, Forestier G, Weber J, Idoumghar L, Muller P A. Evaluating surgical skills from kinematic data using convolutional neural networks. In: Proceedings of the 21st International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2018). Granada, Spain: Springer, 2018. 214−221
    [106] Liu D C, Li Q Y, Jiang T T, Wang Y Z, Miao R L, Shan F, et al. Towards unified surgical skill assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, USA: IEEE, 2021. 9517−9526
    [107] Li Z Q, Gu L, Wang W M, Nakamura R, Sato Y. Surgical skill assessment via video semantic aggregation. In: Proceedings of the 25th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2022). Singapore: Springer, 2022. 410−420
    [108] Gao J B, Pan J H, Zhang S J, Zheng W S. Automatic modelling for interactive action assessment. International Journal of Computer Vision, 2023, 131(3): 659−679 doi: 10.1007/s11263-022-01695-5
    [109] Anastasiou D, Jin Y M, Stoyanov D, Mazomenos E. Keep your eye on the best: Contrastive regression transformer for skill assessment in robotic surgery. IEEE Robotics and Automation Letters, 2023, 8(3): 1755−1762 doi: 10.1109/LRA.2023.3242466
    [110] Fard M J, Ameri S, Ellis R D, Chinnam R B, Pandya A K, Klein M D. Automated robot-assisted surgical skill evaluation: Predictive analytics approach. The International Journal of Medical Robotics and Computer Assisted Surgery, 2018, 14(1): Article No. e1850
    [111] Zia A, Essa I. Automated surgical skill assessment in RMIS training. International Journal of Computer Assisted Radiology and Surgery, 2018, 13(5): 731−739 doi: 10.1007/s11548-018-1735-5
    [112] Forestier G, Petitjean F, Senin P, Despinoy F, Huaulmé A, Fawaz H I, et al. Surgical motion analysis using discriminative interpretable patterns. Artificial Intelligence in Medicine, 2018, 91: 3−11 doi: 10.1016/j.artmed.2018.08.002
    [113] Wang T Y, Wang Y J, Li M. Towards accurate and interpretable surgical skill assessment: A video-based method incorporating recognized surgical gestures and skill levels. In: Proceedings of the 23rd International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2020). Lima, Peru: Springer, 2020. 668−678
    [114] Okamoto L, Parmar P. Hierarchical NeuroSymbolic approach for comprehensive and explainable action quality assessment. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Seattle, USA: IEEE, 2024. 3204−3213
Figures (4) / Tables (8)
Publication history
  • Received: 2023-09-05
  • Published online: 2024-11-19
  • Issue date: 2025-02-17