Recognizing Action Using Multi-center Subspace Learning-based Spatial-temporal Information Fusion

Yang Tian-Jin, Hou Zhen-Jie, Li Xing, Liang Jiu-Zhen, Huan Juan, Zheng Ji-Xiang

Citation: Yang Tian-Jin, Hou Zhen-Jie, Li Xing, Liang Jiu-Zhen, Huan Juan, Zheng Ji-Xiang. Recognizing action using multi-center subspace learning-based spatial-temporal information fusion. Acta Automatica Sinica, 2020, 46(x): 1−12 doi: 10.16383/j.aas.c190327

doi: 10.16383/j.aas.c190327
Funds: Supported by National Natural Science Foundation of China (61803050, 61063021), Jiangsu Province Networking and Mobile Internet Technology Engineering Key Laboratory Open Research Fund Project (JSWLW-2017-013), and Zhejiang Public Welfare Technology Research Social Development Project (2017C33223)
    Author biographies:

    Yang Tian-Jin: Master's student at the School of Information Science and Engineering, Changzhou University. Research interests include action recognition and machine learning. E-mail: yangtianjin128@163.com

    Hou Zhen-Jie: Professor at the School of Information Science and Engineering, Changzhou University. Received a Ph.D. degree in mechanical engineering from Inner Mongolia Agricultural University in 2015. Research interests include action recognition and machine learning. Corresponding author of this paper. E-mail: houzj@cczu.edu.cn

    Li Xing: Master's student at the School of Information Science and Engineering, Changzhou University. Research interests include action recognition and machine learning. E-mail: lixing03201012@163.com

    Liang Jiu-Zhen: Professor at the School of Information Science and Engineering, Changzhou University. Received a Ph.D. degree in computer software and theory from Beihang University in 2001. Research interest is machine learning. E-mail: jzliang@cczu.edu.cn

    Huan Juan: Associate professor at the School of Information Science and Engineering, Changzhou University. Received a Ph.D. degree in agricultural electrification and automation from Jiangsu University in 2019. Her research interest is intelligent information processing. E-mail: huanjuan@cczu.edu.cn

    Zheng Ji-Xiang: Student of computer science and technology at the School of Information Science and Engineering, Changzhou University. E-mail: zjx991031@163.com
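  • Abstract: Human action recognition based on depth map sequences generally improves recognition accuracy by extracting feature maps, but such feature maps usually lack temporal information. To address this problem, this paper proposes a new representation of depth map sequences, Depth Space Time Maps (DSTM), which reduces the redundancy of the feature maps and compensates for the missing temporal information. Human action recognition is then carried out with high accuracy by fusing Depth Motion Maps (DMM), in which spatial information dominates, with DSTM, in which temporal information dominates. A multimodal data fusion algorithm named Multi-Center Subspace Learning (MCSL) is also proposed. The algorithm constructs multiple projection centers for the data of each class, which increases the inter-class distance of the samples and reduces the dimensionality of the projection target region. Human action recognition experiments on the MSR-Action3D and UTD-MHAD depth datasets show that the proposed method achieves a comparatively high recognition rate relative to existing human action recognition methods.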
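To make the "missing temporal information" point concrete, the following is a minimal Python sketch of one common DMM formulation: the sum of absolute differences between consecutive depth frames, for a single (front) view. It is an illustration under stated assumptions, not the authors' implementation. The function name `depth_motion_map` and the `(T, H, W)` array layout are hypothetical, and the exact DMM variant used in the paper is not given on this page.

```python
import numpy as np


def depth_motion_map(frames):
    """Sum absolute differences between consecutive depth frames (front view only).

    `frames` is assumed to have shape (T, H, W); this layout is an assumption
    made for the sketch, not something specified by the paper.
    """
    frames = np.asarray(frames, dtype=np.float64)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected an array of shape (T, H, W) with T >= 2")
    # Motion energy between each pair of consecutive frames: shape (T-1, H, W).
    motion = np.abs(np.diff(frames, axis=0))
    # Collapsing the time axis keeps where motion happened but not when.
    return motion.sum(axis=0)


# Toy sequence standing in for a depth clip (random integers, not real data).
rng = np.random.default_rng(0)
clip = rng.integers(0, 10, size=(5, 4, 4))

forward = depth_motion_map(clip)
backward = depth_motion_map(clip[::-1])  # same clip played in reverse
print(np.allclose(forward, backward))
```

Because the time axis is summed away, a clip and its reversed copy give the same map in this sketch, which is the kind of ambiguity a time-aware representation such as DSTM is meant to resolve.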
  • Fig.  1  DSTM flowchart

    Fig.  2  Single-center subspace learning

    Fig.  3  Multi-center subspace learning

    Fig.  4  Forward and reverse high-throw actions

    Fig.  5  Parameter selection

    Fig.  6  DSTM recognition results with different classifiers

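    Table  1  Human actions in MSR

    | Action | Samples | Action | Samples |
    | --- | --- | --- | --- |
    | High arm wave (A01) | 27 | Two-hand wave (A11) | 30 |
    | Horizontal arm wave (A02) | 26 | Side boxing (A12) | 30 |
    | Hammer (A03) | 27 | Bend (A13) | 27 |
    | Hand catch (A04) | 25 | Forward kick (A14) | 29 |
    | Punch (A05) | 26 | Side kick (A15) | 20 |
    | High throw (A06) | 26 | Jogging (A16) | 30 |
    | Draw X (A07) | 27 | Tennis swing (A17) | 30 |
    | Draw tick (A08) | 30 | Tennis serve (A18) | 30 |
    | Draw circle (A09) | 30 | Golf swing (A19) | 30 |
    | Hand clap (A10) | 30 | Pick up and throw (A20) | 27 |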

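    Table  2  Human actions in UTD-MHAD

    | Action | Samples | Action | Samples |
    | --- | --- | --- | --- |
    | Swipe left (B01) | 32 | Tennis swing (B15) | 32 |
    | Swipe right (B02) | 32 | Arm curl (B16) | 32 |
    | Wave (B03) | 32 | Tennis serve (B17) | 32 |
    | Clap (B04) | 32 | Push (B18) | 32 |
    | Throw (B05) | 32 | Knock (B19) | 32 |
    | Cross arms (B06) | 32 | Catch (B20) | 32 |
    | Bounce basketball (B07) | 32 | Pick up and throw (B21) | 32 |
    | Draw X (B08) | 31 | Jog (B22) | 31 |
    | Draw circle (B09) | 32 | Walk (B23) | 32 |
    | Draw circle continuously (B10) | 32 | Sit down (B24) | 32 |
    | Draw triangle (B11) | 32 | Stand up (B25) | 32 |
    | Bowling (B12) | 32 | Lunge (B26) | 32 |
    | Punch (B13) | 32 | Squat (B27) | 32 |
    | Badminton swing (B14) | 32 | | |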

    Table  3  Human action subsets in MSR (AS1, AS2, AS3)

    | AS1 | AS2 | AS3 |
    | --- | --- | --- |
    | A02 | A01 | A06 |
    | A03 | A04 | A14 |
    | A05 | A07 | A15 |
    | A06 | A08 | A16 |
    | A10 | A09 | A17 |
    | A13 | A11 | A18 |
    | A18 | A14 | A19 |
    | A20 | A12 | A20 |

    Table  4  Recognition rates (%) of different features on MSR

    | Method | Test One AS1 | Test One AS2 | Test One AS3 | Test One avg | Test Two AS1 | Test Two AS2 | Test Two AS3 | Test Two avg | Test Three AS1 | Test Three AS2 | Test Three AS3 | Test Three avg |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | MEI-HOG | 69.79 | 77.63 | 79.72 | 75.71 | 84.00 | 89.58 | 93.24 | 88.94 | 86.95 | 86.95 | 95.45 | 89.78 |
    | MEI-LBP | 57.05 | 56.58 | 64.19 | 59.27 | 66.66 | 69.79 | 78.37 | 71.61 | 69.56 | 73.91 | 77.27 | 73.58 |
    | DSTM-HOG | 83.22 | 71.71 | 87.83 | 80.92 | 94.66 | 84.37 | 88.23 | 89.80 | 91.30 | 82.61 | 95.95 | 89.95 |
    | DSTM-LBP | 84.56 | 71.71 | 87.83 | 81.37 | 88.00 | 82.29 | 95.94 | 88.74 | 86.96 | 82.61 | 95.45 | 88.34 |
    | MHI-HOG | 69.79 | 72.36 | 70.95 | 71.03 | 88.00 | 84.37 | 89.19 | 87.19 | 95.65 | 82.60 | 95.45 | 91.23 |
    | MHI-LBP | 51.67 | 60.52 | 54.05 | 55.41 | 73.33 | 70.83 | 78.37 | 74.18 | 82.60 | 65.21 | 72.72 | 73.51 |
    | DMM-HOG | 88.00 | 87.78 | 87.16 | 87.65 | 94.66 | 87.78 | 100.00 | 94.15 | 100.00 | 88.23 | 95.45 | 94.56 |
    | DMM-LBP | 89.52 | 87.78 | 93.20 | 90.17 | 93.11 | 85.19 | 100.00 | 92.77 | 94.03 | 88.98 | 92.38 | 91.80 |
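The avg columns of Table 4 appear to be the arithmetic mean of the AS1, AS2 and AS3 rates, rounded to two decimals. That rule is inferred from the numbers rather than stated on this page, so the short Python check below is only a sketch, and the helper name `subset_average` is hypothetical. One entry, DSTM-HOG under Test Two (listed as 89.80), does not match the mean of its three subset values (about 89.09), so the rule should be read as an observation, not a definition.

```python
def subset_average(as1, as2, as3):
    """Arithmetic mean of the three subset recognition rates, rounded to two decimals."""
    return round((as1 + as2 + as3) / 3, 2)


# MEI-HOG, Test One row of Table 4 (listed avg: 75.71).
mei_hog_test_one = subset_average(69.79, 77.63, 79.72)

# MHI-HOG, Test Three row of Table 4 (listed avg: 91.23).
mhi_hog_test_three = subset_average(95.65, 82.60, 95.45)

print(mei_hog_test_one, mhi_hog_test_three)
```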

    Table  5  Recognition rates (%) of different features on UTD

    | Method | Test One | Test Two | Test Three |
    | --- | --- | --- | --- |
    | MEI-HOG | 69.51 | 65.42 | 68.20 |
    | MEI-LBP | 45.12 | 51.97 | 52.61 |
    | DSTM-HOG | 71.08 | 80.28 | 89.54 |
    | DSTM-LBP | 68.81 | 80.97 | 86.06 |
    | MHI-HOG | 56.44 | 66.58 | 73.14 |
    | MHI-LBP | 49.82 | 53.82 | 57.40 |
    | DMM-HOG | 78.39 | 75.40 | 87.94 |
    | DMM-LBP | 68.98 | 74.94 | 86.75 |

    Table  6  Comparative experimental results of DMM and DSTM

    | Method | D1 | D2 |
    | --- | --- | --- |
    | DSTM | 62.83 | 81.53 |
    | DMM | 32.17 | 63.93 |

    Table  7  Average processing time of DMM and DSTM

    | Method | D1 (s) | D2 (s) |
    | --- | --- | --- |
    | DSTM | 2.1059 | 3.4376 |
    | DMM | 5.6014 | 8.6583 |
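As a small worked example, the average processing times in Table 7 can be turned into a relative speed figure by dividing the DMM time by the DSTM time on each dataset. The Python sketch below uses only the four values listed in the table; the dictionary layout and the name `speedup` are hypothetical.

```python
# Average processing times in seconds, copied from Table 7.
times = {
    "D1": {"DSTM": 2.1059, "DMM": 5.6014},
    "D2": {"DSTM": 3.4376, "DMM": 8.6583},
}

# Ratio above 1 means DSTM needed less time than DMM on that dataset.
speedup = {name: t["DMM"] / t["DSTM"] for name, t in times.items()}

for name, ratio in speedup.items():
    print(f"{name}: DMM time / DSTM time = {ratio:.2f}")
```

On these figures DSTM takes well under half the DMM processing time on both D1 and D2.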

    Table  8  Experimental results on MSR-Action3D$^{1}$

    | Method | Recognition rate (%) | Method | Recognition rate (%) |
    | --- | --- | --- | --- |
    | Method of [13] | 86.50 | Method of [38] | 81.7 |
    | Method of [34] | 91.45 | Method of [39] | 90.01 |
    | Method of [35] | 90.01 | Method of [40] | 89.48 |
    | Method of [36] | 89.40 | Proposed method | 90.32 |
    | Method of [37] | 77.47 | | |

    Note: MSR-Action3D$^{1}$ uses Setting 2, Test 2.

    Table  9  Experimental results on MSR-Action3D$^{2}$

    | Method | Recognition rate (%) | Method | Recognition rate (%) |
    | --- | --- | --- | --- |
    | MHI-LBP | 68.75 | MCSL+DMM | 89.28 |
    | MEI-LBP | 71.43 | MCSL+DSTM | 91.96 |
    | DCA[23] | 94.64 | CCA[22] | 83.05 |
    | DSTM-LBP | 87.50 | Subspace learning | 92.85 |
    | DSTM-HOG | 89.28 | Proposed method | 98.21 |

    Note: MSR-Action3D$^{2}$ uses Setting 2, Test 4. MCSL is the abbreviation of Multi-Center Subspace Learning.

    Table  10  Experimental results on UTD-MHAD (Setting 2, Test 4)

    | Method | Recognition rate (%) | Method | Recognition rate (%) |
    | --- | --- | --- | --- |
    | MHI-LBP | 62.40 | MCSL+DMM | 93.64 |
    | MEI-LBP | 57.80 | MCSL+DSTM | 95.37 |
    | DCA[23] | 92.48 | CCA[22] | 87.28 |
    | DSTM-LBP | 89.59 | Subspace learning | 93.64 |
    | DSTM-HOG | 91.90 | Proposed method | 98.84 |
  • [1] Yousefi S, Narui H, Dayal S, Ermon S, Valaee S. A Survey on Behavior Recognition Using WiFi Channel State Information. IEEE Communications Magazine, 2017, 55(10): 98−104 doi: 10.1109/MCOM.2017.1700082
    [2] Mabrouk A B, Zagrouba E. Abnormal behavior recognition for intelligent video surveillance systems: A review. Expert Systems with Applications, 2018, 91: 480−491 doi: 10.1016/j.eswa.2017.09.029
    [3] Fang C C, Mou T C, Sun S W, Chang P C. Machine-Learning Based Fitness Behavior Recognition from Camera and Sensor Modalities//2018 IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR). IEEE, 2018: 249−250
    [4] Chen C, Liu K, Jafari R, Kehtarnavaz N. Home-based Senior Fitness Test measurement system using collaborative inertial and depth sensors//Engineering in Medicine and Biology Society. IEEE, 2014: 4135−4138
    [5] Laver K E, Lange B, George S, Deutsch J E, Saposnik G, Crotty M. Virtual reality for stroke rehabilitation. Cochrane database of systematic reviews, 2017, (11)
    [6] Sun J, Wu X, Yan S, Cheong L F, Chua T S, Li J. Hierarchical spatio-temporal context modeling for action recognition. CVPR, 2009: 2004−2011
    [7] Hu Jianfang, Wang Xionghui, Zheng Weishi, Lai Jianhuang. RGB-D Action Recognition: Recent Advances and Future Perspectives. Acta Automatica Sinica, 2019, 45(5): 829−840
    [8] Bobick A F, Davis J W. The Recognition of Human Movement Using Temporal Templates. Pattern Analysis & Machine Intelligence IEEE Transactions on, 2001, 23(3): 257−267
    [9] Su Benyue, Jiang Jing, Tang Qingfeng, Sheng Min. Human Dynamic Action Recognition Based on Functional Data Analysis. Acta Automatica Sinica, 2017, 43(5): 866−876
    [10] Anderson D, Luke R H, Keller J M, Skubic M, Rantz M J, Aud M A. Modeling human activity from voxel person using fuzzy logic. IEEE Transactions on Fuzzy Systems, 2009, 17(1): 39−49 doi: 10.1109/TFUZZ.2008.2004498
    [11] Zhu Honglei, Zhu Changsheng, Xu Zhigang. Research Advances on Human Activity Recognition Datasets. Acta Automatica Sinica, 2018, 44(6): 978−1004
    [12] Wu Y, Jia Z, Ming Y, Sun J, Cao L. Human behavior recognition based on 3D features and hidden markov models. Signal, Image and Video Processing, 2016, 10(3): 495−502 doi: 10.1007/s11760-015-0756-6
    [13] Wang J, Liu Z, Chorowski J, Chen Z, Wu Y. Robust 3d action recognition with random occupancy patterns//Computer vision-ECCV 2012. Springer, Berlin, Heidelberg, 2012: 872−885
    [14] Zhang H, Zhong P, He J, Xia C. Combining depth-skeleton feature with sparse coding for action recognition. Neurocomputing, 2017, 230: 417−426 doi: 10.1016/j.neucom.2016.12.041
    [15] Zhang S, Chen E, Qi C, Liang C. Action Recognition Based on Sub-action Motion History Image and Static History Image//MATEC Web of Conferences. EDP Sciences, 2016, 56: 02006.
    [16] Liu Z, Zhang C, Tian Y. 3D-based deep convolutional neural network for action recognition with depth sequences. Image and Vision Computing, 2016, 55: 93−100 doi: 10.1016/j.imavis.2016.04.004
    [17] Xu Y, Hou Z, Liang J, Chen C, Jia L, Song Y. Action recognition using weighted fusion of depth images and skeleton's key frames. Multimedia Tools and Applications, 2019: 1−16
    [18] Wang P, Li W, Li C, Hou Y. Action recognition based on joint trajectory maps with convolutional neural networks. Knowledge-Based Systems, 2018, 158: 43−53 doi: 10.1016/j.knosys.2018.05.029
    [19] Kamel A, Sheng B, Yang P, Li P, Shen R, Feng D D. Deep convolutional neural networks for human action recognition using depth maps and postures. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2018
    [20] Li C, Hou Y, Wang P, Li W. Joint distance maps based action recognition with convolutional neural networks. IEEE Signal Processing Letters, 2017, 24(5): 624−628 doi: 10.1109/LSP.2017.2678539
    [21] Yang X, Zhang C, Tian Y L. Recognizing actions using depth motion maps-based histograms of oriented gradient//Proceedings of the 20th ACM international conference on Multimedia. ACM, 2012: 1057−1060
    [22] Li A, Shan S, Chen X, Gao W. Face recognition based on non-corresponding region matching//2011 International Conference on Computer Vision. IEEE, 2011: 1060−1067
    [23] Haghighat M, Abdel-Mottaleb M, Alhalabi W. Discriminant correlation analysis: Real-time feature level fusion for multimodal biometric recognition. IEEE Transactions on Information Forensics and Security, 2016, 11(9): 1984−1996 doi: 10.1109/TIFS.2016.2569061
    [24] Rosipal R, Krämer N. Overview and recent advances in partial least squares//International Statistical and Optimization Perspectives Workshop "Subspace, Latent Structure and Feature Selection". Springer, Berlin, Heidelberg, 2005: 34−51
    [25] Liu H, Sun F. Material identification using tactile perception: A semantics-regularized dictionary learning method. IEEE/ASME Transactions on Mechatronics, 2018, 23(3): 1050−1058 doi: 10.1109/TMECH.2017.2775208
    [26] Zhuang Y T, Yang Y, Wu F. Mining Semantic Correlation of Heterogeneous Multimedia Data for Cross-Media Retrieval. IEEE Transactions on Multimedia, 2008, 10(2): 221−229 doi: 10.1109/TMM.2007.911822
    [27] Chen C, Jafari R, Kehtarnavaz N. Utd-mhad: A multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor//2015 IEEE International conference on image processing (ICIP). IEEE, 2015: 168−172
    [28] Sharma A, Kumar A, Daume H, Jacobs D W. Generalized multiview analysis: A discriminative latent space//2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012: 2160−2167
    [29] Wang K, He R, Wang L, Wang W, Tan T. Joint feature selection and subspace learning for cross-modal retrieval. IEEE transactions on pattern analysis and machine intelligence, 2016, 38(10): 2010−2023 doi: 10.1109/TPAMI.2015.2505311
    [30] Shotton J, Fitzgibbon A, Cook M, Sharp T, Finocchio M, Moore R, et al. Real-Time Pose Recognition in Parts from Single Depth Images//Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2013: 1297−1304
    [31] Chen C, Jafari R, Kehtarnavaz N. Action recognition from depth sequences using depth motion maps-based local binary patterns//2015 IEEE Winter Conference on Applications of Computer Vision. IEEE, 2015: 1092−1099
    [32] Nie F, Huang H, Cai X, Ding C H. Efficient and robust feature selection via joint ℓ2,1-norms minimization//Advances in neural information processing systems. 2010: 1813−1821
    [33] He R, Tan T, Wang L, Zheng W S. ℓ2,1 regularized correntropy for robust feature selection//2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2012: 2504−2511
    [34] Koniusz P, Cherian A, Porikli F. Tensor representations via kernel linearization for action recognition from 3d skeletons//European Conference on Computer Vision. Springer, Cham, 2016: 37−53
    [35] Ben Tanfous A, Drira H, Ben Amor B. Coding Kendall's Shape Trajectories for 3D Action Recognition//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 2840−2849
    [36] Vemulapalli R, Chellappa R. Rolling rotations for recognizing human actions from 3d skeletal data//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 4471−4479
    [37] Wang L, Huynh D Q, Koniusz P. A Comparative Review of Recent Kinect-based Action Recognition Algorithms. arXiv preprint arXiv: 1906.09955, 2019.
    [38] Rahmani H, Mian A. 3D action recognition from novel viewpoints//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 1506−1515
    [39] Tanfous A B, Drira H, Amor B B. Sparse Coding of Shape Trajectories for Facial Expression and Action Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019
    [40] Amor B B, Su J, Srivastava A. Action recognition using rate-invariant analysis of skeletal shape trajectories. IEEE transactions on pattern analysis and machine intelligence, 2015, 38(1): 1−13
Publication history
  • Received:  2019-04-29
  • Accepted:  2019-11-15
