Anchor-free Based Object Detection Methods and Its Application Progress in Complex Scenes

LIU Xiao-Bo, XIAO Xiao, WANG Ling, CAI Zhi-Hua, GONG Xin, ZHENG Ke-Xin

doi: 10.16383/j.aas.c220115
Funds: Supported by National Natural Science Foundation of China (61973285, 62076226, 61873249, 61773355) and Opening Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education (GLAB2023ZR08)

    LIU Xiao-Bo  Associate professor at the School of Automation, China University of Geosciences (Wuhan). He received his master's degree in computer software and theory from the School of Computer Science, China University of Geosciences (Wuhan) in 2008, and his Ph.D. degree in geoinformation engineering from the same school in 2012. His research interests cover machine learning, evolutionary computation, and hyperspectral remote sensing image processing. Corresponding author of this paper. E-mail: xbliu@cug.edu.cn

    XIAO Xiao  Master's student at the School of Automation, China University of Geosciences (Wuhan). She received her bachelor's degree from the School of Physics and Information Engineering, Jianghan University in 2020. Her research interests cover remote sensing image processing and object detection. E-mail: xxiao@cug.edu.cn

    WANG Ling  Professor in the Department of Automation, Tsinghua University. He received his bachelor's degree from the Department of Automation, Tsinghua University in 1995, and his Ph.D. degree in control theory and control engineering from the same department in 1999. His research interests cover intelligent optimization theory, methods, and applications, and the modeling, optimization, and scheduling of complex production processes. E-mail: wangling@tsinghua.edu.cn

    CAI Zhi-Hua  Professor at the School of Computer Science, China University of Geosciences (Wuhan). He received his bachelor's degree from Wuhan University in 1986, his master's degree from Beijing University of Technology in 1992, and his Ph.D. degree from China University of Geosciences (Wuhan) in 2003. His research interests cover data mining, machine learning, and evolutionary computation. E-mail: zhcai@cug.edu.cn

    GONG Xin  Master's student at the School of Automation, China University of Geosciences (Wuhan). He received his bachelor's degree from the School of Physics and Information Engineering, Jianghan University in 2020. His research interests cover remote sensing image processing and neural architecture search. E-mail: xgong@cug.edu.cn

    ZHENG Ke-Xin  Master's student at the School of Automation, China University of Geosciences (Wuhan). He received his bachelor's degree from the School of Physics and Optoelectronic Engineering, Yangtze University in 2019. His main research interest is remote sensing image processing. E-mail: zhengkexin@cug.edu.cn

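  • Abstract: Object detection methods based on deep learning are a current research focus in computer vision and play an important role in object recognition, tracking, and related fields. As research has progressed, deep-learning-based object detection methods have come to be divided mainly into anchor-based and anchor-free methods. Anchor-free methods do not require a large number of predefined anchor boxes, have lower model complexity and more stable detection performance, and are among the more advanced methods in object detection today. Based on a survey of the domestic and international literature, this paper reviews anchor-free object detection methods and the datasets commonly used in different scenarios. According to how samples are assigned, the overall structures of these methods are analyzed and summarized from four aspects: keypoint combination, center point regression, Transformer, and the fusion of anchor-based and anchor-free methods. The methods are further compared using performance metrics on the COCO (Common Objects in Context) dataset. On this basis, the paper introduces applications of anchor-free object detection methods in complex scenes involving overlapping objects, small objects, and rotated objects, focusing on the key problems of object occlusion, overly small object size, and multiple orientations, and reviews the advantages, disadvantages, and difficulties of existing methods. Finally, the remaining problems of anchor-free object detection methods are summarized, and future development and application trends are discussed.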
  • Fig. 1  The overall framework of anchor-based object detection methods

    Fig. 2  The overall framework of anchor-free object detection methods

    Fig. 3  The CornerNet object detection framework based on corner point combination

    Fig. 4  Sampling methods for prediction boxes

    Fig. 5  The overall framework of anchor-free object detection methods based on center point regression

    Fig. 6  The overall architecture of DETR

    Fig. 7  The relationship between label assignment optimization algorithms

    Fig. 8  Detection problems with overlapping objects

    Fig. 9  Examples of small objects

    Fig. 10  Examples of point set representations in the RepPoints series

    Fig. 11  Detection results for objects at arbitrary rotation angles

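    Table 1  Comparison of public datasets for object detection

    Dataset | Categories | Images | Instances | Image size (pixels) | Annotation | Scenario | Year
    Pascal VOC[10] | 20 | ~23 k | ~55 k | 800 × 800 | Horizontal box | General | 2010
    COCO[11] | 80 | ~123 k | ~896 k | - | Horizontal box | General | 2014
    DOTA[12] | 15 | ~2.8 k | ~188 k | 800 ~ 4000 | Horizontal box / rotated box | General | 2018
    UCAS-AOD[13] | 2 | ~1 k | ~6 k | 1280 × 1280 | Rotated box | Cars, airplanes | 2015
    ICDAR2015[14] | 1 | 1.5 k | - | 720 × 1280 | Rotated box | Text | 2015
    CUHK-SYSU[15] | 1 | ~18 k | ~96 k | 50 ~ 4000 | Horizontal box | Pedestrians | 2017
    PRW[16] | 1 | ~12 k | ~43 k | - | Horizontal box | Pedestrians | 2017
    CrowdHuman[17] | 1 | ~24 k | ~470 k | 608 × 608 | Horizontal box | Pedestrians | 2018
    HRSC2016[18] | 1 | ~1.1 k | ~3 k | ~1000 × 1000 | Rotated box | Ships | 2017
    SSDD[19] | 1 | 1.16 k | ~2.5 k | 500 × 500 | Horizontal box | Ships | 2017
    HRSID[20] | 1 | ~5.6 k | ~17 k | 800 × 800 | Horizontal box | Ships | 2020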
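    Table 2  Comparison of anchor-free object detection methods

    Motivation | No need to design anchor boxes; fewer anchor-related hyperparameters; a simpler model
    Advantages | Makes full use of boundary and interior information | Reduces the number of regression hyperparameters | Achieves end-to-end detection and simplifies the pipeline | Alleviates the imbalance between positive and negative samples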
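    Table 3  Comparison of the performance, advantages, and disadvantages of keypoint-combination-based anchor-free object detection methods on the COCO dataset

    Method | Backbone | Input size | Processor configuration and detection speed (frames/s) | mAP (%) | Advantages | Disadvantages | Source | Year
    PLN[21] | Inception-V2 | 512 × 512 | GTX 1080 | - | - | - | - | -
    CornerNet[22] | Hourglass-104 | 511 × 511 | TitanX × 10 | - | - | - | - | -
    CornerNet-Saccade[23] | Hourglass-54 | 255 × 255 | GTX 1080Ti × 4 | - | - | - | - | -
    CornerNet-Squeeze[23] | Hourglass-54 | 255 × 255 | GTX 1080Ti × 4 | - | - | - | - | -
    ExtremeNet[24] | Hourglass-104 | 511 × 511 | TitanX × 10 | - | - | - | - | -
    CenterNet-Triplets[25] | Hourglass-104 | 511 × 511 | Tesla V100 × 8 | - | - | - | - | -
    CentripetalNet[26] | Hourglass-104 | 511 × 511 | Tesla V100 × 16 | - | - | - | - | -
    SaccadeNet[27] | DLA-34-DCN | 512 × 512 | RTX 2080Ti | 40.4 | Captures local and global features, improving feature utilization | Needs to balance detection accuracy and speed | CVPR | 2020
    CPNDet[28] | Hourglass-104 | 511 × 511 | Tesla V100 × 8 | - | - | - | - | -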
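    Table 4  Comparison of the performance, advantages, and disadvantages of center-point-regression-based anchor-free object detection methods on the COCO dataset

    Method | Backbone | Input size | Processor configuration and detection speed (frames/s) | mAP (%) | Advantages | Disadvantages | Source | Year
    YOLO v1[31] | - | - | - | - | Uses grid division to improve the efficiency of center point search | Object center points in the same … | - | -
    FCOS[33] | ResNet-101 | 800 × $\le 1333$ | - | - | - | - | - | -
    CenterNet[35] | Hourglass-104 | 511 × 511 | Titan X | - | - | - | - | -
    Grid R-CNN[40] | ResNet-101 | 800 × 800 | Titan Xp × 32 | - | - | - | - | -
    Grid R-CNN Plus[41] | ResNet-101 | 800 × 800 | Titan Xp × 32 | 42.0 | Shrinks the feature representation region, reducing computation | Non-representative features … | - | -
    HoughNet[37] | Hourglass-104 | 512 × 512 | Tesla V100 × 4 | - | - | - | - | -
    YOLOX[32] | Darknet53 | 640 × 640 | Tesla V100 × 8 | 47.4 | Decouples the classification and regression branches, speeding up convergence | Hard-to-classify samples … | - | -
    OneNet[34] | ResNet-101 | 512 × $\le 853$ | Tesla V100 × 8 | - | - | … objects, causing missed detections | - | -
    CenterNet2[36] | Res2Net-101-DCN-BiFPN | 1280 × 1280 | Titan Xp | - | - | - | - | -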
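    Table 5  Comparison of the performance, advantages, and disadvantages of Transformer-based anchor-free object detection methods on the COCO dataset

    Method | Backbone | Input size | Processor | mAP (%) | FLOPs (G) | Advantages | Disadvantages | Source
    DETR[42] | ResNet-50 | (480, 800) × (800, 1333) | Tesla V100 × 16 | 42.0 | 86 | Uses a Transformer to reduce the number of hand-designed parameters | Slow convergence; small … | -
    TSP-FCOS[43] | ResNet-50 | (640, 800) × (800, 1333) | Tesla V100 × 8 | - | - | - | - | -
    Deformable DETR[44] | ResNet-50 | (480, 800) × (800, 1333) | Tesla V100 | - | - | - | - | -
    Dynamic DETR[45] | ResNet-50 | - | Tesla V100 × 8 | - | - | - | - | -
    YOLOS[47] | DeiT-base | (480, 800) × (800, 1333) | - | 42.0 | 538 | Does not rely on a convolutional backbone; good performance | Low detection speed, … | -
    SAM-DETR[46] | ResNet-50 | (480, 800) × (800, 1333) | Tesla V100 × 8 | - | - | - | - | -
    ViDT[49] | Swin-base | (480, 800) × (800, 1333) | Tesla V100 × 8 | - | - | - | - | -
    DN-DETR[50] | ResNet-50 | - | Tesla A100 × 8 | - | - | - | - | -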
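    Table 6  Comparison of the performance, advantages, and disadvantages of object detection methods that fuse anchor-based and anchor-free approaches on the COCO dataset

    Method | Backbone | Input size | Processor configuration and detection speed (frames/s) | mAP (%) | Advantages | Disadvantages | Source | Year
    FSAF[52] | ResNeXt-101 | 800 × 800 | Tesla V100 × 8 | - | - | - | - | -
    SAPD[54] | ResNeXt-101 | 800 × 800 | GTX 1080Ti | - | - | - | - | -
    ATSS[56] | ResNeXt-101 | 800 × (800, 1333) | Tesla V100 | - | - | - | - | -
    AutoAssign[57] | ResNeXt-101 | 800 × 800 | - | 52.1 | Dynamic sample assignment that needs no manual tuning | Weight assignment of samples … | - | -
    LSNet[58] | ResNeXt-101 | 800 × (800, 1333) | Tesla V100 × 8 | - | - | - | - | -
    DW[59] | ResNeXt-101 | 800 × 800 | GPU × 8 | - | - | - | - | -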
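    Table 7  Performance comparison of detection methods for overlapping (densely arranged) objects

    Problem | Method | Dataset | Input size | Backbone | Processor | Speed (frames/s) | mAP (%) | Source | Year
    Overlapping objects | VarifocalNet[75] | COCO | (480, 960) × … | ResNeXt-101 | Tesla V100 × 8 | 6.7 | 50.8 | TMI | 2019
    - | FCOS v2[73] | COCO, CrowdHuman | 800 × $\le$1333 | ResNeXt-101, ResNet-50 | GTX 1080Ti | - | 50.4 | - | -
    - | BorderDet[76] | COCO | 800 × $\le$1333 | ResNeXt-101 | GPU × 8 | - | 50.3 | ECCV | 2020
    - | - | - | 900 × 1500 | ResNet-50 | Tesla V100 | 16.4 | 94.0 | - | -
    - | OTA-FCOS[78] | COCO, CrowdHuman | (640, 800) × $\le$… | - | GPU × 8 | - | 51.5 | - | -
    - | LLA-FCOS[79] | CrowdHuman | 800 × $\le$1400 | ResNet-50 | GPU × 8 | - | 88.1 | Neuro-… | -
    - | LTM[80] | COCO | 800 × $\le$1333 | ResNeXt-101 | Tesla V100 × 8 | 1.7 | 46.3 | TPAMI | 2022
    - | Efficient DETR[81] | COCO, CrowdHuman | - | ResNet-101 | - | - | - | - | -
    - | - | - | 900 × 1500 | ResNet-50 | Tesla V100 | - | 94.2 | - | -
    - | - | - | 900 × 1500 | ResNet-50 | Tesla A100 | 11.1 | 94.2 | - | -
    - | - | COCO, CrowdHuman | (480, 800) × $\le$1333 | ResNet-50 | GPU × 8 | - | 46.7 | - | -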
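    Table 8  Comparison of the advantages and disadvantages of detection methods for overlapping (densely arranged) objects

    Method | Key idea | Advantages | Disadvantages
    VarifocalNet[75] | Predicts the IACS classification score and proposes the Varifocal Loss | Effectively suppresses overlapping boxes on the same object | Small-object detection needs improvement
    FCOS v2[73] | Moves the center-ness sub-branch into the regression branch and revises how center-ness is computed | Reduces the number of category misclassifications | Uses the same detection head for features at all scales, which limits model performance
    Efficient DETR[81] | Initializes with dense prior knowledge to simplify the model structure | Reduces the number of encoders and decoders | Detection accuracy needs further improvement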
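
The FCOS v2 entry in Table 8 refers to a center-ness branch. As a minimal sketch, the Python below computes center-ness as defined in the original FCOS paper, from the distances (l, t, r, b) between a location and the four sides of its ground-truth box. The revised computation that FCOS v2 uses is not given in this text and is not reproduced here. The function names `ltrb_targets` and `centerness` are illustrative, not taken from any library.

```python
import math


def ltrb_targets(x, y, box):
    """Distances from location (x, y) to the left, top, right, bottom sides of box.

    box is (x1, y1, x2, y2) in pixel coordinates (an assumed convention).
    """
    x1, y1, x2, y2 = box
    return x - x1, y - y1, x2 - x, y2 - y


def centerness(l, t, r, b):
    """Center-ness from the original FCOS definition.

    Equals 1 at the box center and decays toward 0 near the box border.
    Locations on or outside the box get 0 (an assumption for this sketch).
    """
    if min(l, t, r, b) <= 0:
        return 0.0
    return math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))


box = (10.0, 20.0, 110.0, 80.0)
at_center = ltrb_targets(60.0, 50.0, box)   # (50, 30, 50, 30)
near_edge = ltrb_targets(20.0, 50.0, box)   # (10, 30, 90, 30)

print(centerness(*at_center))   # 1.0
print(centerness(*near_edge))   # about 0.333
```

Low center-ness values are typically used to down-weight predictions made far from an object's center, so that they are more likely to be filtered out in post-processing.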
    Table 9  Performance comparison of detection methods for small objects

    Problem | Method | Dataset | Input size | Backbone | Processor | Speed (frames/s) | mAP (%) | Source | Year
    Small objects | RepPoints[89] | COCO | (480, 960) × $\le$960 | ResNet-101 | GPU × 4 | - | 46.5 | ICCV | 2019
    - | DuBox[92] | COCO, VOC 2012 | 800 × 800, 500 × 500 | ResNet-101, VGG-16 | NVIDIA P40 × 8 | - | 39.5 | - | -
    - | PPDet[87] | COCO | 800 × 1300 | ResNet-101 | Tesla V100 × 4 | - | 45.2 | BMVC | 2020
    - | RepPoints v2[90] | COCO | (800, 1333) × $\le$1333 | ResNet-101 | GPU × 8 | - | 48.1 | NeurIPS | 2020
    - | - | VOC 2012 | 800 × 800 | ResNet-101 | GPU × 4 | - | - | - | -
    - | FBR-Net[94] | SSDD | 448 × 448 | ResNet-50 | RTX 2080Ti | 25.0 | 92.8 | TGRS | 2021
    - | - | - | 800 × 800 | ResNet-50 | NVIDIA Titan Xp | 15.2 | - | Remote Sensing | 2022
    - | Oriented RepPoints[91] | DOTA, HRSC2016 | 1024 × 1024, (300, 900) × (300, 1500) | - | RTX 2080Ti × 4 | - | 76.5 | - | -
    - | QueryDet[95] | COCO | - | ResNet-50 | RTX 2080Ti × 8 | 14.4 | 39.5 | CVPR | 2022
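    Table 10  Comparison of the advantages and disadvantages of detection methods for small objects

    Method | Key idea | Advantages | Disadvantages
    RepPoints v2[90] | Adds a corner verification branch to judge feature mapping points | Obtains features that carry more interior and edge information about the object | Low localization accuracy of predicted boxes
    FCOS (AFE-GDH)[88] | Uses an adaptive feature encoding (AFE) strategy and constructs a Gaussian-guided detection head | Effectively strengthens the representation of small objects | Effectiveness is demonstrated only for ship targets
    Oriented RepPoints[91] | Proposes quality assessment, a sample assignment scheme, and spatial constraints | Improves the ability to capture features of non-axis-aligned small objects | Covers only aerial small-object detection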
    Table 11  Performance comparison of detection methods for objects with varying orientation

    Method | Dataset | Input size | Backbone | Processor | Speed (frames/s) | mAP (%) | Source | Year
    SARD[102] | DOTA, HRSC2016 | 800 × 800 | ResNet-101 | Tesla P100 | - | - | IEEE Access | 2019
    - | - | 512 × 512 | ResNet-101 | Tesla V100 × 2 | - | 72.3 | IEEE Access | 2020
    O2-DNet[101] | DOTA, ICDAR2015 | 800 × 800 | ResNet-101 | Tesla V100 × 2 | - | 71.0 | - | -
    DRN[106] | DOTA, HRSC2016 | 1024 × 1024, 768 × 768 | Hourglass-104 | Tesla V100 | - | 73.2 | - | -
    BBAVectors[96] | DOTA, HRSC2016 | 608 × 608 | ResNet-101 | GTX 1080Ti × 4 | - | - | - | -
    FCOSR[98] | DOTA, HRSC2016 | 1024 × 1024, 800 × 800 | ResNeXt-101 | Tesla V100 × 4 | 7.9 | - | - | -
    DARDet[103] | DOTA, HRSC2016 | 1024 × 1024 | ResNet-50 | RTX 2080Ti | 12.6 | - | - | -
    DAFNe[105] | DOTA, HRSC2016 | 1024 × 1024 | ResNet-101 | Tesla V100 × 4 | - | 76.9 | - | -
    CHPDet[107] | UCAS-AOD, HRSC2016 | 1024 × 1024 | DLA-34 | RTX 2080Ti | - | 89.6 | - | -
    AOPG[99] | DOTA, HRSC2016 | 1024 × 1024, (800, 1333) × (800, 1333) | - | RTX 2080Ti | 10.8 | - | - | -
    GGHL[100] | DOTA, SSDD+ | 800 × 800 | Darknet53 | RTX 3090 × 2 | 42.3 | - | - | -
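    Table 12  Comparison of the advantages and disadvantages of detection methods for objects with varying orientation

    Method | Key idea | Advantages | Disadvantages
    - | - | Avoids angle periodicity and the vertex-ordering problem of predicted boxes | Post-processing in polar coordinates is relatively complex
    - | … method, learning discriminative features | - | -
    BBAVectors[96] | Uses boundary-aware vectors in place of the original regression parameters | Regresses all parameters in the same coordinate system, reducing computation | Converting between vector types is relatively complex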
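
Table 12 mentions angle periodicity as a problem that some rotated-box detectors try to avoid. The sketch below illustrates one common form of that problem under an assumed (cx, cy, w, h, θ) parameterization with θ in degrees: swapping width and height while adding 90° describes exactly the same rectangle, yet the parameter vectors are far apart, so a naive parameter-wise regression loss penalizes a correct prediction. The helper names `rbox_corners`, `same_box`, and `param_l1` are hypothetical and serve only this illustration.

```python
import math


def rbox_corners(cx, cy, w, h, theta_deg):
    """Corner points of a rotated box centered at (cx, cy), rotated by theta_deg."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    offsets = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


def same_box(a, b, tol=1e-6):
    """True if every corner of a coincides with some corner of b (order ignored)."""
    return all(
        any(math.hypot(ax - bx, ay - by) < tol for bx, by in b)
        for ax, ay in a
    )


def param_l1(p, q):
    """Naive L1 distance between two (cx, cy, w, h, theta) parameter tuples."""
    return sum(abs(u - v) for u, v in zip(p, q))


box_a = (50, 50, 40, 20, 0)    # 40 wide, 20 high, no rotation
box_b = (50, 50, 20, 40, 90)   # width and height swapped, rotated by 90 degrees

print(same_box(rbox_corners(*box_a), rbox_corners(*box_b)))  # True
print(param_l1(box_a, box_b))                                # 130
```

The two parameter tuples describe one and the same rectangle but differ by 130 under the naive distance. The corner order also differs between the two representations, which is the vertex-ordering issue the table refers to.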
