
AFST: Anchor-free Fully Convolutional Siamese Tracker With Searching Center Point

Tan Jian-Hao, Zheng Ying-Shuai, Wang Yao-Nan, Ma Xiao-Ping

Citation: Tan Jian-Hao, Zheng Ying-Shuai, Wang Yao-Nan, Ma Xiao-Ping. AFST: Anchor-free fully convolutional siamese tracker with searching center point. Acta Automatica Sinica, 2021, 47(4): 801−812. doi: 10.16383/j.aas.c200469

doi: 10.16383/j.aas.c200469


Funds: Supported by National Natural Science Foundation of China (61433016)
    Author Bio:

    TAN Jian-Hao Professor at the School of Electrical and Information Engineering, Hunan University. His research interest covers intelligent robots, data mining, and pattern recognition. E-mail: tanjianhao96@sina.com

    ZHENG Ying-Shuai Master student at the School of Electrical and Information Engineering, Hunan University. His research interest covers computer vision and machine learning. Corresponding author of this paper. E-mail: zheng_ys415@163.com

    WANG Yao-Nan Professor at the School of Electrical and Information Engineering, Hunan University. His research interest covers intelligent control theory, robot systems, and computer vision. E-mail: yaonan@hnu.edu.cn

    MA Xiao-Ping Master student at the School of Electrical and Information Engineering, Hunan University. Her research interest covers machine vision and UAV control technology. E-mail: maxiaoping@hnu.edu.cn

  • Abstract: To address the poor robustness of Siamese network trackers, this work redesigns the classification and regression branches of the Siamese tracker and proposes a highly robust tracking algorithm based on direct per-pixel prediction: the anchor-free fully convolutional Siamese tracker (AFST). Current high-performance tracking algorithms such as SiamRPN, SiamRPN++, and CRPN all perform classification and bounding-box regression on predefined anchor boxes. In contrast, the proposed AFST classifies and predicts the target box directly at each pixel. Removing the anchor boxes greatly simplifies both the classification and the regression task and eliminates the mismatch between anchor boxes and targets. During training, image pairs of different instances of the same category are additionally included, introducing semantically similar distractors and making the training of the network more thorough. Experiments on three public benchmark datasets, VOT2016, GOT-10k, and OTB2015, show that AFST achieves state-of-the-art performance compared with existing tracking algorithms.
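As a concrete illustration of the per-pixel prediction the abstract describes, below is a minimal PyTorch-style sketch of an anchor-free head. The layer names and channel sizes (`in_channels`, `cls_tower`, `reg_tower`) are hypothetical, not the paper's exact architecture: each location of the correlation feature map receives a foreground/background score and four distances (l, t, r, b) to the sides of the target box, so no anchor matching step is involved.

```python
import torch
import torch.nn as nn


class AnchorFreeHead(nn.Module):
    """Sketch of an FCOS-style anchor-free prediction head.

    Hypothetical channel sizes and layer names, not the paper's exact design.
    Input: fused cross-correlation features of shape (N, C, H, W).
    Output: a per-pixel class map and per-pixel (l, t, r, b) box offsets.
    """

    def __init__(self, in_channels: int = 256):
        super().__init__()
        self.cls_tower = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1),
            nn.ReLU(inplace=True))
        self.reg_tower = nn.Sequential(
            nn.Conv2d(in_channels, in_channels, 3, padding=1),
            nn.ReLU(inplace=True))
        # 2 channels: foreground / background score at every pixel.
        self.cls_pred = nn.Conv2d(in_channels, 2, 3, padding=1)
        # 4 channels: distances (l, t, r, b) from the pixel to the box sides.
        self.reg_pred = nn.Conv2d(in_channels, 4, 3, padding=1)

    def forward(self, x: torch.Tensor):
        cls_map = self.cls_pred(self.cls_tower(x))            # (N, 2, H, W)
        # exp() keeps the predicted distances positive.
        ltrb = torch.exp(self.reg_pred(self.reg_tower(x)))    # (N, 4, H, W)
        return cls_map, ltrb
```

At inference, a tracker of this kind would take the location with the best (classification × quality) score and decode its (l, t, r, b) offsets into a box; the quality term is where the center score of Fig. 4 below comes in.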
  • Fig.  1  AFST network flow diagram

    Fig.  2  Multistage feature fusion

    Fig.  3  Regression approach

    Fig.  4  Two ways to calculate the center score (CS; see the sketch after this list)

    Fig.  5  A search process graph based on the center score

    Fig.  6  Sampling strategy comparison diagram

    Fig.  7  Accuracy-robustness curves for different challenges

    Fig.  8  Tracking results for different video sequences

    Fig.  9  Comparison chart of results on OTB2015

    Fig.  10  Success rate comparison graph on GOT-10k

    Fig.  11  The anchor box is mismatched with the target box

    Fig.  12  Anchor box distribution map
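The paper's two variants of the center score (Fig. 4) are not reproduced on this page; as a reference point only, a common form of such a score, modeled on the centerness measure of the FCOS detector that this family of anchor-free trackers builds on, is

$$\mathrm{CS}=\sqrt{\frac{\min(l,\,r)}{\max(l,\,r)}\times\frac{\min(t,\,b)}{\max(t,\,b)}}$$

where $(l, t, r, b)$ are the distances from a location to the left, top, right, and bottom sides of the target box. CS equals 1 at the box center and decays toward the borders, so weighting classification scores by it suppresses low-quality boxes predicted far from the object center, consistent with the center-score-based search of Fig. 5.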

    Table  1  Ablation experiments

    No.  Backbone  Sub-network  Quality score  A      R      EAO    Fusion  New sampling strategy
    1    Alex      cls          none           0.530  0.466  0.235  none    none
    2    ResNet50  cls          none           0.579  0.386  0.280  none    none
    3    ResNet50  cls + reg    none           0.592  0.333  0.345  none    none
    4    ResNet50  cls + reg    none           0.602  0.302  0.355  sum     none
    5    ResNet50  cls + reg    none           0.607  0.242  0.382  sum     yes
    6    ResNet50  cls + reg    CS             0.610  0.224  0.415  concat  yes
    7    ResNet50  cls + reg    CS             0.614  0.238  0.397  sum     yes
    8    ResNet50  cls + reg    CS             0.624  0.205  0.412  msf     yes

    Table  2  Comparison with multiple trackers on VOT2016

           CCOT   ECO    MDNet  DeepSRDCF  SiamRPN  DaSiamRPN  Ours   SiamRPN++
    A      0.541  0.550  0.542  0.529      0.560    0.609      0.651  0.642
    R      0.238  0.200  0.337  0.326      0.260    0.224      0.149  0.196
    EAO    0.331  0.375  0.257  0.276      0.344    0.411      0.485  0.464
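For context on Table 2's rows: in the VOT protocol, accuracy (A) is the average overlap (IoU) on frames where the tracker has not failed, robustness (R) is derived from the number of failures (frames where overlap drops to zero and the tracker is re-initialized), and EAO combines both as the expected average overlap over typical sequence lengths. A minimal sketch of the per-sequence bookkeeping, simplified relative to the official toolkit (which also re-initializes the tracker a few frames after each failure and skips burn-in frames):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def vot_accuracy_and_failures(pred_boxes, gt_boxes):
    """Per-sequence accuracy (mean IoU on non-failed frames) and failure count."""
    overlaps, failures = [], 0
    for pred, gt in zip(pred_boxes, gt_boxes):
        o = iou(pred, gt)
        if o == 0.0:          # losing the target entirely counts as one failure
            failures += 1
        else:
            overlaps.append(o)
    accuracy = sum(overlaps) / max(len(overlaps), 1)
    return accuracy, failures
```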

    Table  3  Failure rates under different challenge factors

                 Camera motion  Target loss  Illumination change  Object motion  Occlusion  Size change  Mean  Weighted
    CCOT         24             11           2                    20             14         13           14.0  16.6
    Ours         20             3            2                    9              11         7            8.7   10.2
    DaSiamRPN    26             4            2                    15             16         10           12.2  14.2
    SiamRPN      33             13           1                    22             20         11           16.7  20.1
    SiamRPN++    20             7            1                    12             15         9            10.7  12.4
    MDNet        33             18           4                    21             13         12           17.0  21.1
    DeepSRDCF    28             17           3                    23             25         11           17.9  20.3

    Table  4  Comparison with multiple trackers on GOT-10k

            SiamFC  ECO    MDNet  DeepSRDCF  SiamRPN++  Ours
    AO      0.348   0.316  0.299  0.451      0.507      0.529
    SR0.75  0.098   0.111  0.099  0.216      0.311      0.370
    SR0.5   0.353   0.303  0.303  0.543      0.605      0.617
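Table 4's metrics follow the GOT-10k protocol: AO is the average overlap (mean per-frame IoU between predicted and ground-truth boxes), and SR0.5 / SR0.75 are success rates, the fractions of frames whose overlap exceeds 0.5 and 0.75. A minimal sketch, assuming the per-frame IoU values have already been computed (e.g., with an `iou` helper like the one above):

```python
def got10k_metrics(ious):
    """AO and success rates from a non-empty list of per-frame IoU values."""
    n = len(ious)
    ao = sum(ious) / n                        # average overlap
    sr50 = sum(o > 0.50 for o in ious) / n    # success rate @ 0.50
    sr75 = sum(o > 0.75 for o in ious) / n    # success rate @ 0.75
    return ao, sr50, sr75
```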
Publication history
  • Received:  2020-06-28
  • Accepted:  2020-11-18
  • Available online:  2021-01-14
  • Issue published:  2021-04-23
