A Medium Granularity Model for Human Pose Estimation in Video

SHI Qing-Xuan, DI Hui-Jun, LU Yao, TIAN Xue-Dong

Citation: SHI Qing-Xuan, DI Hui-Jun, LU Yao, TIAN Xue-Dong. A Medium Granularity Model for Human Pose Estimation in Video. ACTA AUTOMATICA SINICA, 2018, 44(4): 646-655. doi: 10.16383/j.aas.2018.c160847

doi: 10.16383/j.aas.2018.c160847
Funds: 

Key Project of the Science and Technology Research Program in Universities of Hebei Province, China ZD2017208

National Natural Science Foundation of China 61375075

National Natural Science Foundation of China 9142020013

National Natural Science Foundation of China 61273273

More Information
    Author Bio:

    SHI Qing-Xuan  Lecturer at the School of Computer Science and Technology, Hebei University. Ph.D. candidate at the School of Computer Science, Beijing Institute of Technology. Her research interests cover computer vision and pattern recognition. E-mail: shiqingxuan@bit.edu.cn

    DI Hui-Jun  Lecturer at the School of Computer Science, Beijing Institute of Technology. His research interests cover computer vision, pattern recognition, and machine learning. E-mail: ajon@bit.edu.cn

    TIAN Xue-Dong  Professor at the School of Computer Science and Technology, Hebei University. His research interests cover pattern recognition and image processing. E-mail: txd@hbu.edu.cn

    Corresponding author:

    LU Yao  Professor at the School of Computer Science, Beijing Institute of Technology. His research interests cover neural networks, image and signal processing, and pattern recognition. Corresponding author of this paper. E-mail: vis_yl@bit.edu.cn
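  • Abstract: Human pose estimation is an active research topic in computer vision, with wide applications in action recognition, human-computer interaction, and related fields. This paper combines the advantages of coarse-grained and fine-grained models. It builds a medium granularity spatio-temporal model whose entities are body-part tracklets, performs approximate inference by iteratively alternating between temporal and spatial parsing, selects the optimal tracklet for each body part, and stitches and fuses the selected tracklets into the final pose sequence. To prepare high-quality tracklet candidates, global motion information is used to propagate the best pose detections from single frames across the whole video to form trajectories, which are then cut into overlapping tracklets of fixed length. To resolve the confusion between symmetric parts, the symmetric parts in the model are conceptually merged, which eliminates the loops in the spatial model while preserving the constraints between symmetric parts. Comparative experiments on three datasets show that the proposed method achieves higher estimation accuracy than other video pose estimation methods.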
    1)  Recommended by Associate Editor WANG Liang
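The candidate-preparation step described in the abstract, cutting a part trajectory into overlapping tracklets of fixed length, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cut_into_tracklets`, the `length` and `stride` parameters, and the toy trajectory are all hypothetical, and the tracklet length and overlap actually used in the paper are not given on this page.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position of one body part in one frame


def cut_into_tracklets(trajectory: Sequence[Point],
                       length: int,
                       stride: int) -> List[List[Point]]:
    """Cut one part trajectory into fixed-length tracklets.

    A stride smaller than the length makes consecutive tracklets overlap.
    Both parameters are illustrative assumptions, not values from the paper.
    """
    if length <= 0 or stride <= 0:
        raise ValueError("length and stride must be positive")
    if len(trajectory) < length:
        return []
    last_start = len(trajectory) - length
    starts = list(range(0, last_start + 1, stride))
    # Add a final window so the end of the trajectory is always covered.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [list(trajectory[s:s + length]) for s in starts]


# Toy trajectory over 10 frames (hypothetical data).
trajectory = [(float(t), 2.0 * t) for t in range(10)]
tracklets = cut_into_tracklets(trajectory, length=4, stride=2)
```

With `length=4` and `stride=2`, neighbouring tracklets share two frames, which is what would later allow selected tracklets to be stitched and fused in their overlapping regions.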
  • Fig. 1  The models used in video pose estimation

    Fig. 2  The medium granularity model

    Fig. 3  Short-term performances of different motion estimation approaches

    Fig. 4  Long-term performances of different motion estimation approaches

    Fig. 5  Overview of the video pose estimation method based on the medium granularity model

    Fig. 6  Sub-models of the full graphical model: the spatial sub-model and the temporal sub-model

    Fig. 7  Examination of key modules

    Fig. 8  Qualitative comparison on UnusualPose dataset

    Fig. 9  Sample results on FYDP dataset

    Fig. 10  Sample results on Sub_Nbest dataset

    Table 1  PCK (%) on UnusualPose dataset

    Method  Head  Shld.  Elbow  Wrist  Hip   Knee  Ankle  Avg
    Nbest   99.8  99.4   76.2   65.0   87.8  70.8  71.5   81.5
    UVA     99.4  93.8   72.7   56.2   89.3  66.3  62.4   77.2
    PE_GM   98.7  98.3   89.9   73.8   91.0  76.4  88.9   88.1
    Ours    98.7  98.1   90.1   75.1   95.9  88.4  89.5   90.8
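The PCK (percentage of correct keypoints) scores in the table above can be read with a small sketch in mind. The block below is only an illustration, not the evaluation code used in the paper. It assumes a keypoint counts as correct when it lies within a fixed pixel threshold of the ground truth (the paper's exact threshold convention is not stated on this page), and the function name `pck` and the toy coordinates are hypothetical. It also checks that the Avg entry in the "Ours" row of Table 1 is consistent with the unweighted mean of the seven per-joint scores.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def pck(pred: Sequence[Point], gt: Sequence[Point], threshold: float) -> float:
    """Percentage of predicted keypoints within `threshold` pixels of ground truth.

    The fixed pixel threshold is an assumption made for this sketch.
    """
    hits = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return 100.0 * hits / len(gt)


# Toy keypoints (hypothetical, not from any dataset in the paper).
gt = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
pred = [(11.0, 10.0), (20.0, 26.0), (30.0, 31.0), (49.0, 40.0)]
score = pck(pred, gt, threshold=5.0)  # two of four keypoints fall within 5 px

# Per-joint PCK of the "Ours" row in Table 1 (Head ... Ankle).
ours_unusualpose = [98.7, 98.1, 90.1, 75.1, 95.9, 88.4, 89.5]
ours_avg = round(sum(ours_unusualpose) / len(ours_unusualpose), 1)
```

Here `ours_avg` reproduces the 90.8 reported in the Avg column, which suggests the column is a plain mean over the seven joint types.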

    Table 2  PCK (%) on FYDP dataset

    Method  Head  Shld.  Elbow  Wrist  Hip   Knee  Ankle  Avg
    Nbest   95.7  89.7   75.2   59.1   83.3  81.4  79.5   80.6
    UVA     96.2  91.7   78.4   60.3   85.4  83.8  79.2   82.1
    PE_GM   98.4  89.2   80.9   60.5   84.4  89.3  83.7   83.8
    Ours    97.9  93.4   84.0   63.1   88.4  88.9  84.4   85.7

    Table 3  PCP (%) on Sub_Nbest dataset

    Method  Head  Torso  U.A.  L.A.  U.L.  L.L.
    Nbest   100   61.0   66.0  41.0  86.0  84.0
    SYM     100   69.0   85.0  42.0  91.0  89.0
    PE_GM   100   97.9   97.9  67.0  94.7  86.2
    HPEV    100   100    93.0  65.0  92.0  94.0
    Ours    100   98.1   96.6  58.6  95.1  94.8
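PCP (percentage of correct parts) scores limb segments rather than single keypoints. As a hedged sketch only: the block below assumes the commonly used strict variant, in which a part is correct when both predicted endpoints lie within half the ground-truth part length of their ground-truth counterparts. Whether the paper uses this variant is not stated on this page, and the function names and toy segments are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # (proximal endpoint, distal endpoint)


def part_is_correct(pred: Segment, gt: Segment, ratio: float = 0.5) -> bool:
    """Strict PCP test: both endpoints within ratio * ground-truth part length."""
    (p0, p1), (g0, g1) = pred, gt
    limit = ratio * math.dist(g0, g1)
    return math.dist(p0, g0) <= limit and math.dist(p1, g1) <= limit


def pcp(preds: Sequence[Segment], gts: Sequence[Segment], ratio: float = 0.5) -> float:
    """Percentage of parts that pass the strict PCP test."""
    correct = sum(part_is_correct(p, g, ratio) for p, g in zip(preds, gts))
    return 100.0 * correct / len(gts)


# Two toy lower-arm segments (hypothetical data); each ground-truth part is 10 px long.
gt_parts = [((0.0, 0.0), (0.0, 10.0)), ((0.0, 0.0), (0.0, 10.0))]
pred_parts = [((1.0, 0.0), (0.0, 12.0)),   # endpoint errors 1 px and 2 px: correct
              ((0.0, 0.0), (0.0, 17.0))]   # distal endpoint off by 7 px: incorrect
pcp_score = pcp(pred_parts, gt_parts)
```

Because the tolerance scales with part length, short parts such as lower arms (L.A.) are harder to score well on, which is one plausible reading of why that column is the lowest for every method in Table 3.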
Publication history
  • Received:  2016-12-27
  • Accepted:  2017-07-12
  • Published:  2018-04-20
