Anchor-free Based Object Detection Methods and Its Application Progress in Complex Scenes

LIU Xiao-Bo, XIAO Xiao, WANG Ling, CAI Zhi-Hua, GONG Xin, ZHENG Ke-Xin

Citation: Liu Xiao-Bo, Xiao Xiao, Wang Ling, Cai Zhi-Hua, Gong Xin, Zheng Ke-Xin. Anchor-free based object detection methods and its application progress in complex scenes. Acta Automatica Sinica, 2023, 49(7): 1369−1392 doi: 10.16383/j.aas.c220115

Funds: Supported by National Natural Science Foundation of China (61973285, 62076226, 61873249, 61773355) and Opening Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education (GLAB2023ZR08)

Author Bio:

    LIU Xiao-Bo  Associate professor at the School of Automation, China University of Geosciences (Wuhan). He received his master's degree in computer software and theory from the School of Computer Science, China University of Geosciences (Wuhan) in 2008, and his Ph.D. degree in geoinformation engineering from the same school in 2012. His research interests cover machine learning, evolutionary computation, and hyperspectral remote sensing image processing. Corresponding author of this paper. E-mail: xbliu@cug.edu.cn

    XIAO Xiao  Master's student at the School of Automation, China University of Geosciences (Wuhan). She received her bachelor's degree from the School of Physics and Information Engineering, Jianghan University in 2020. Her research interests cover remote sensing image processing and object detection. E-mail: xxiao@cug.edu.cn

    WANG Ling  Professor in the Department of Automation, Tsinghua University. He received his bachelor's degree from the Department of Automation, Tsinghua University in 1995, and his Ph.D. degree in control theory and control engineering from the same department in 1999. His research interests cover intelligent optimization theory, methods and applications, and the modeling, optimization and scheduling of complex production processes. E-mail: wangling@tsinghua.edu.cn

    CAI Zhi-Hua  Professor at the School of Computer Science, China University of Geosciences (Wuhan). He received his bachelor's degree from Wuhan University in 1986, his master's degree from Beijing University of Technology in 1992, and his Ph.D. degree from China University of Geosciences (Wuhan) in 2003. His research interests cover data mining, machine learning, and evolutionary computation. E-mail: zhcai@cug.edu.cn

    GONG Xin  Master's student at the School of Automation, China University of Geosciences (Wuhan). He received his bachelor's degree from the School of Physics and Information Engineering, Jianghan University in 2020. His research interests cover remote sensing image processing and neural architecture search. E-mail: xgong@cug.edu.cn

    ZHENG Ke-Xin  Master's student at the School of Automation, China University of Geosciences (Wuhan). He received his bachelor's degree from the School of Physics and Optoelectronic Engineering, Yangtze University in 2019. His main research interest is remote sensing image processing. E-mail: zhengkexin@cug.edu.cn

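  • Abstract: Deep-learning-based object detection is currently a hot topic in computer vision and plays an important role in fields such as object recognition and tracking. As research has progressed, deep-learning-based object detection methods have come to be divided into anchor-based and anchor-free methods. Anchor-free methods do not need a large number of predefined anchor boxes, have lower model complexity and more stable detection performance, and are among the more advanced methods in object detection today. Based on a survey of the domestic and international literature, this paper reviews anchor-free object detection methods and the datasets commonly used in each scenario. According to how samples are assigned, the methods are analyzed and summarized in terms of overall structure from four aspects: keypoint combination, center point regression, Transformer, and anchor-based and anchor-free fusion. Their performance metrics on the COCO (Common objects in context) dataset are then compared. On this basis, the paper introduces applications of anchor-free object detection in complex scenes involving overlapping objects, small objects, and rotated objects, focusing on the key problems of occlusion, very small object size, and many orientations, and reviews the advantages, disadvantages, and difficulties of existing methods. Finally, the problems that remain in anchor-free object detection are summarized and future application trends are discussed.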
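To make the contrast with predefined anchor boxes concrete, the following minimal sketch decodes a box directly from a single location and four predicted side distances, which is the kind of per-location regression used by FCOS-style detectors. The function name and the sample values are illustrative assumptions and do not come from the paper.

```python
def decode_ltrb(x, y, left, top, right, bottom):
    """Turn a location (x, y) and its distances to the four box sides
    into corner coordinates (x1, y1, x2, y2). No anchor box is involved."""
    return (x - left, y - top, x + right, y + bottom)


# Hypothetical location and predicted distances, in pixels.
box = decode_ltrb(100.0, 80.0, 30.0, 20.0, 50.0, 40.0)
print(box)  # (70.0, 60.0, 150.0, 120.0)
```

Because the box is produced from the location itself, there are no anchor scales or aspect ratios to tune, which is the reduction in hyperparameters that the abstract refers to.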
  • Fig. 1  The overall framework of anchor-based object detection methods
    Fig. 2  The overall framework of anchor-free object detection methods
    Fig. 3  The CornerNet object detection method based on corner point combination
    Fig. 4  Sampling methods for prediction boxes
    Fig. 5  The overall framework of anchor-free object detection based on center point regression
    Fig. 6  The overall architecture of DETR
    Fig. 7  Relationships among label assignment optimization algorithms
    Fig. 8  Detection problems with overlapping objects
    Fig. 9  Examples of small objects
    Fig. 10  Examples of point-set representations in the RepPoints series
    Fig. 11  Example detection results for objects at multiple angles

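    Table 1  Comparison of public datasets for object detection

    Dataset | Classes | Images | Instances | Image size (pixels) | Annotation | Scenario | Year
    Pascal VOC[10] | 20 | ~23 k | ~55 k | 800 × 800 | Horizontal box | General | 2010
    COCO[11] | 80 | ~123 k | ~896 k | - | Horizontal box | General | 2014
    DOTA[12] | 15 | ~2.8 k | ~188 k | 800 ~ 4000 | Horizontal box / rotated box | General | 2018
    UCAS-AOD[13] | 2 | ~1 k | ~6 k | 1280 × 1280 | Rotated box | Cars, aircraft | 2015
    ICDAR2015[14] | 1 | 1.5 k | - | 720 × 1280 | Rotated box | Text | 2015
    CUHK-SYSU[15] | 1 | ~18 k | ~96 k | 50 ~ 4000 | Horizontal box | Pedestrians | 2017
    PRW[16] | 1 | ~12 k | ~43 k | - | Horizontal box | Pedestrians | 2017
    CrowdHuman[17] | 1 | ~24 k | ~470 k | 608 × 608 | Horizontal box | Pedestrians | 2018
    HRSC2016[18] | 1 | ~1.1 k | ~3 k | ~1000 × 1000 | Rotated box | Ships | 2017
    SSDD[19] | 1 | 1.16 k | ~2.5 k | 500 × 500 | Horizontal box | Ships | 2017
    HRSID[20] | 1 | ~5.6 k | ~17 k | 800 × 800 | Horizontal box | Ships | 2020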
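Table 1 distinguishes horizontal-box from rotated-box annotation. As a minimal sketch of how the two relate, the code below converts a rotated box into its four corner points and then into the horizontal box that encloses it. The (cx, cy, w, h, angle) parameterization and the function names are assumptions made for illustration; individual datasets store rotated boxes in their own formats.

```python
import math


def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a box of size w x h centered at (cx, cy)
    and rotated by angle_deg around its center."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
            for dx, dy in half]


def enclosing_horizontal_box(corners):
    """Axis-aligned (x1, y1, x2, y2) box that encloses all corner points."""
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (min(xs), min(ys), max(xs), max(ys))


corners = rotated_box_corners(0.0, 0.0, 2.0, 1.0, 90.0)
print(enclosing_horizontal_box(corners))
```

For a long, thin object rotated away from the axes, the enclosing horizontal box contains far more background than the rotated box, which is one reason rotated annotation is used for ships and aerial targets.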

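    Table 2  Comparison of anchor-free object detection methods

    Method type | Keypoint combination | Center point regression | Transformer | Anchor-based and anchor-free fusion
    Motivation (shared by all four types) | No anchor boxes to design; fewer anchor-related hyperparameters; simpler model
    Idea | Combine keypoints and detect | Regress the box position from the center point | Predict directly through Transformer encoding and decoding | Optimize the sample label assignment strategy
    Advantage | Makes full use of boundary and interior information | Fewer regression hyperparameters | End-to-end, with a simpler pipeline | Alleviates the imbalance between positive and negative samples
    Difficulty | Keypoints of different classes are mismatched | Objects with overlapping center points are missed | Poor small-object detection | Adaptive label assignment is discontinuous
    Speed | Relatively slow detection | Relatively fast detection | Relatively slow convergence | Relatively slow detection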

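    Table 3  Comparison of keypoint-combination-based anchor-free object detection methods on the COCO dataset

    Algorithm | Backbone | Input size (pixels) | Processor | Speed (frames/s) | mAP (%) | Advantage | Disadvantage | Venue | Year
    PLN[21] | Inception-V2 | 512 × 512 | GTX 1080 | - | 28.9 | Works well on overlapping and unusually shaped objects | Small receptive field | arXiv | 2017
    CornerNet[22] | Hourglass-104 | 511 × 511 | TitanX × 10 | 4.1 | 42.1 | Corner pooling localizes objects precisely | Corners of the same class are easily mismatched | ECCV | 2018
    CornerNet-Saccade[23] | Hourglass-54 | 255 × 255 | GTX 1080Ti × 4 | 5.2 | 42.6 | No need to run class detection at every pixel | High false-detection rate on small objects | BMVC | 2020
    CornerNet-Squeeze[23] | Hourglass-54 | 255 × 255 | GTX 1080Ti × 4 | 33 | 34.4 | Much faster detection | Corner class prediction is error-prone | BMVC | 2020
    ExtremeNet[24] | Hourglass-104 | 511 × 511 | TitanX × 10 | 3.1 | 43.7 | Extreme points and the center point capture object information fully | Prone to false positives | CVPR | 2019
    CenterNet-Triplets[25] | Hourglass-104 | 511 × 511 | Tesla V100 × 8 | 2.94 | 47.0 | Corners and the center point capture sufficient object information | Large position offset when the center point is missed | ICCV | 2019
    CentripetalNet[26] | Hourglass-104 | 511 × 511 | Tesla V100 × 16 | - | 48.0 | Reduces the corner mismatches of CornerNet | Scaling of the central region depends on hyperparameters | CVPR | 2020
    SaccadeNet[27] | DLA-34-DCN | 512 × 512 | RTX 2080Ti | 28 | 40.4 | Captures local and global features, improving feature utilization | Accuracy and speed must be balanced | CVPR | 2020
    CPNDet[28] | Hourglass-104 | 511 × 511 | Tesla V100 × 8 | - | 49.2 | Multiple classifiers improve corner class accuracy | Detection head is computationally inefficient | ECCV | 2020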
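Table 3 credits CornerNet with corner pooling for precise localization. The sketch below illustrates top-left corner pooling on small nested lists: at each position it takes the maximum of one feature map over everything below and the maximum of a second feature map over everything to the right, then adds the two. The pure-Python form and the function name are illustrative assumptions; real implementations operate on batched tensors.

```python
def top_left_corner_pool(f_t, f_l):
    """Top-left corner pooling on two H x W feature maps (lists of lists).

    t[i][j] = max of f_t over rows i..H-1 in column j (looking downward)
    l[i][j] = max of f_l over columns j..W-1 in row i (looking rightward)
    The output is t + l, so a top-left corner sees evidence from the box
    edges that extend below it and to its right.
    """
    h, w = len(f_t), len(f_t[0])
    t = [row[:] for row in f_t]
    l = [row[:] for row in f_l]
    for i in range(h - 2, -1, -1):          # scan from bottom row upward
        for j in range(w):
            t[i][j] = max(t[i][j], t[i + 1][j])
    for i in range(h):
        for j in range(w - 2, -1, -1):      # scan from right column leftward
            l[i][j] = max(l[i][j], l[i][j + 1])
    return [[t[i][j] + l[i][j] for j in range(w)] for i in range(h)]


# Toy feature map used for both inputs, purely for illustration.
f = [[0, 0, 0],
     [0, 0, 2],
     [0, 3, 0]]
print(top_left_corner_pool(f, f))
```

The running maximum makes each scan linear in the number of pixels, so the operation adds little cost on top of the backbone.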

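    Table 4  Comparison of center-point-regression-based anchor-free object detection methods on the COCO dataset

    Algorithm | Backbone | Input size (pixels) | Processor | Speed (frames/s) | mAP (%) | Advantage | Disadvantage | Venue | Year
    YOLO v1[31] | - | - | - | - | - | Grid partitioning makes the center point search more efficient | Misses objects whose centers fall in the same grid cell | CVPR | 2016
    FCOS[33] | ResNet-101 | 800 × $\le 1333$ | - | 9.3 | 41.5 | Centerness lowers the scores of boxes predicted far from the center | False detections within a same-scale feature level | ICCV | 2019
    CenterNet[35] | Hourglass-104 | 511 × 511 | Titan X | 7.8 | 45.1 | Locates objects by center point, removing corner matching | Misses objects whose center points coincide | arXiv | 2019
    Grid R-CNN[40] | ResNet-101 | 800 × 800 | Titan Xp × 32 | 3.45 | 41.5 | Grid-guided localization gives precise bounding boxes | Feature sampling region is too broad | CVPR | 2019
    Grid R-CNN Plus[41] | ResNet-101 | 800 × 800 | Titan Xp × 32 | 7.69 | 42.0 | Smaller feature representation region, less computation | Some non-representative feature regions are missed | arXiv | 2019
    HoughNet[37] | Hourglass-104 | 512 × 512 | Tesla V100 × 4 | - | 46.4 | Voting compensates for missing global information | Voting increases computation | ECCV | 2020
    YOLOX[32] | Darknet53 | 640 × 640 | Tesla V100 × 8 | 90.1 | 47.4 | Decoupled classification and regression branches speed up convergence | Low accuracy on hard-to-classify samples | arXiv | 2021
    OneNet[34] | ResNet-101 | 512 × $\le 853$ | Tesla V100 × 8 | 50 | 37.7 | Minimum matching cost improves the matching of predicted boxes to labels | One pixel detects one object, causing misses | ICML | 2021
    CenterNet2[36] | Res2Net-101-DCN-BiFPN | 1280 × 1280 | Titan Xp | - | 56.4 | Clearly separates object features from background features | Step-wise classification and regression is inefficient | arXiv | 2021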
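Table 4 notes that FCOS uses centerness to lower the scores of boxes predicted far from the object center. A minimal sketch of that score follows, using the centerness definition from the FCOS paper: the square root of the product of the left/right and top/bottom min-to-max ratios. The function name and sample values are illustrative, and the code assumes all four distances are positive.

```python
import math


def centerness(left, top, right, bottom):
    """Centerness of a location given its distances to the four box sides.

    Returns 1.0 at the exact box center and decays toward 0 near the edges.
    Assumes all distances are strictly positive.
    """
    lr = min(left, right) / max(left, right)
    tb = min(top, bottom) / max(top, bottom)
    return math.sqrt(lr * tb)


print(centerness(10, 10, 10, 10))  # 1.0, location is the box center
print(centerness(5, 10, 20, 10))   # 0.5, location is shifted to the left
```

At inference the classification score is multiplied by this value, so off-center predictions rank lower before non-maximum suppression.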

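    Table 5  Comparison of Transformer-based anchor-free object detection methods on the COCO dataset

    Algorithm | Backbone | Input size (pixels) | Processor | Speed (frames/s) | mAP (%) | FLOPs (G) | Advantage | Disadvantage | Venue | Year
    DETR[42] | ResNet-50 | (480, 800) × (800, 1333) | Tesla V100 × 16 | 28 | 42.0 | 86 | Transformer reduces the number of hand-designed parameters | Slow convergence; poor small-object detection | ECCV | 2020
    TSP-FCOS[43] | ResNet-50 | (640, 800) × (800, 1333) | Tesla V100 × 8 | 15 | 43.1 | 189 | Auxiliary subnetwork improves multi-scale feature extraction | High computation and model complexity | ICCV | 2021
    Deformable DETR[44] | ResNet-50 | (480, 800) × (800, 1333) | Tesla V100 | 19 | 43.8 | 173 | Attends effectively to sparse spatial object locations | High computation and model complexity | ICLR | 2021
    Dynamic DETR[45] | ResNet-50 | - | Tesla V100 × 8 | - | 47.2 | - | Dynamic attention accelerates convergence | Computation and complexity are not reported | ICCV | 2021
    YOLOS[47] | DeiT-base | (480, 800) × (800, 1333) | - | 2.7 | 42.0 | 538 | Good performance without a convolutional backbone | Low detection speed, high computation | NeurIPS | 2021
    SAM-DETR[46] | ResNet-50 | (480, 800) × (800, 1333) | Tesla V100 × 8 | - | 41.8 | 100 | Semantic alignment accelerates convergence | Accuracy needs further improvement | CVPR | 2022
    ViDT[49] | Swin-base | (480, 800) × (800, 1333) | Tesla V100 × 8 | 11.6 | 49.2 | - | New backbone and detection neck reduce computational cost | Shallow layers cannot directly obtain useful object information | ICLR | 2022
    DN-DETR[50] | ResNet-50 | - | Tesla A100 × 8 | - | 44.1 | 94 | Denoising training greatly improves detection performance | Uses only uniformly distributed noise | CVPR | 2022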

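    Table 6  Comparison of anchor-based and anchor-free fusion object detection methods on the COCO dataset

    Algorithm | Backbone | Input size (pixels) | Processor | Speed (frames/s) | mAP (%) | Advantage | Disadvantage | Venue | Year
    FSAF[52] | ResNeXt-101 | 800 × 800 | Tesla V100 × 8 | 2.76 | 44.6 | Dynamically selects the most suitable feature level for each object | Does not weight different features differently | CVPR | 2019
    SAPD[54] | ResNeXt-101 | 800 × 800 | GTX 1080Ti | 4.5 | 47.4 | Selects representative object features | Does not truly fuse the anchor-based and anchor-free branches | ECCV | 2020
    ATSS[56] | ResNeXt-101 | 800 × (800, 1333) | Tesla V100 | - | 50.7 | Selects training samples automatically from statistical characteristics | Sample assignment is not fully free of parameter tuning | CVPR | 2020
    AutoAssign[57] | ResNeXt-101 | 800 × 800 | - | - | 52.1 | Dynamic sample assignment without manual tuning | Sample weighting mechanism is relatively complex | arXiv | 2020
    LSNet[58] | ResNeXt-101 | 800 × (800, 1333) | Tesla V100 × 8 | 5.1 | 50.4 | Location-sensitive network greatly improves localization accuracy | Low localization and classification accuracy on small objects | arXiv | 2021
    DW[59] | ResNeXt-101 | 800 × 800 | GPU × 8 | - | 49.8 | Effectively obtains boxes with high classification and regression confidence | Small-object detection needs further improvement | CVPR | 2022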
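Table 6 describes ATSS as selecting training samples from statistical characteristics. In the ATSS paper the IoU threshold for each ground-truth box is the mean plus the standard deviation of the IoUs of its candidate samples. The sketch below shows only that thresholding step. The candidate IoU values are made up, the use of the population standard deviation is an assumption, and the earlier candidate selection and the center-inside-box check are omitted.

```python
import statistics


def adaptive_iou_threshold(candidate_ious):
    """Mean plus standard deviation of the candidates' IoUs with one
    ground-truth box (population standard deviation, by assumption)."""
    return statistics.mean(candidate_ious) + statistics.pstdev(candidate_ious)


def select_positives(candidate_ious):
    """Indices of candidates whose IoU reaches the adaptive threshold."""
    thr = adaptive_iou_threshold(candidate_ious)
    return [i for i, v in enumerate(candidate_ious) if v >= thr]


# Hypothetical IoUs between five candidate samples and one ground-truth box.
ious = [0.1, 0.2, 0.3, 0.6, 0.8]
print(select_positives(ious))  # [4]
```

Because the threshold adapts to each object's own IoU distribution, no fixed positive/negative IoU cut-off has to be hand-tuned per dataset.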

    Table 7  Performance comparison of detection methods for densely arranged (overlapping) objects

    Where two datasets are listed, paired values are given in the same order.

    Algorithm | Dataset | Input size (pixels) | Backbone | Processor | Speed (frames/s) | mAP (%) | Venue | Year
    VarifocalNet[75] | COCO | (480, 960) × 1333 | ResNeXt-101 | Tesla V100 × 8 | 6.7 | 50.8 | TMI | 2019
    WSMA-Seg[77] | COCO | - | MSP-Seg | - | - | 38.1 | arXiv | 2019
    FCOS v2[73] | COCO / CrowdHuman | 800 × $\le 1333$ | ResNeXt-101 / ResNet-50 | GTX 1080Ti | - | 50.4 / 87.3 | TPAMI | 2022
    BorderDet[76] | COCO | 800 × $\le 1333$ | ResNeXt-101 | GPU × 8 | - | 50.3 | ECCV | 2020
    AlignPS[71] | CUHK-SYSU / PRW | 900 × 1500 | ResNet-50 | Tesla V100 | 16.4 | 94.0 / 46.1 | CVPR | 2021
    OTA-FCOS[78] | COCO / CrowdHuman | (640, 800) × $\le 1333$ | ResNeXt-101 / ResNet-50 | GPU × 8 | - | 51.5 / 88.4 | CVPR | 2021
    LLA-FCOS[79] | CrowdHuman | 800 × $\le 1400$ | ResNet-50 | GPU × 8 | - | 88.1 | Neurocomputing | 2021
    LTM[80] | COCO | 800 × $\le 1333$ | ResNeXt-101 | Tesla V100 × 8 | 1.7 | 46.3 | TPAMI | 2022
    Efficient DETR[81] | COCO / CrowdHuman | - | ResNet-101 / ResNet-50 | - | - | 45.7 / 90.8 | arXiv | 2021
    PSTR[72] | CUHK-SYSU / PRW | 900 × 1500 | ResNet-50 | Tesla V100 | - | 94.2 / 50.1 | CVPR | 2022
    COAT[82] | CUHK-SYSU / PRW | 900 × 1500 | ResNet-50 | Tesla A100 | 11.1 | 94.2 / 53.3 | CVPR | 2022
    Progressive DETR[83] | COCO / CrowdHuman | (480, 800) × $\le 1333$ | ResNet-50 | GPU × 8 | - | 46.7 / 92.1 | CVPR | 2022

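    Table 8  Advantages and disadvantages of detection methods for densely arranged (overlapping) objects

    Algorithm | Method | Advantage | Disadvantage / difficulty
    CSP[70] | Adds a center offset prediction branch and a scale prediction branch | Addresses missed detections in pedestrian detection | Weak association between features and boxes
    VarifocalNet[75] | Predicts the IACS classification score and proposes the Varifocal Loss | Effectively suppresses overlapping boxes of the same object | Small-object detection needs improvement
    WSMA-Seg[77] | Builds a detector from a segmentation model, with no NMS post-processing | Makes accurate use of the edge features of overlapping objects | High model complexity of the segmentation algorithm
    FCOS v2[73] | Moves the centerness sub-branch into the regression branch and revises the centerness computation | Fewer classification errors | The same detection head is used for features of all scales, which limits performance
    BorderDet[76] | Extracts border features adaptively with a border-aligned feature extraction operation | Obtains the position of the predicted box efficiently | Many border points have to be selected
    AlignPS[71] | Uses a feature alignment and aggregation module | Resolves region and scale misalignment | Not extended to general object detection
    OTA-FCOS[78] | Uses optimal transport theory to find a globally high-confidence sample assignment | Helps select information-rich regions | High computational complexity
    LLA-FCOS[79] | Uses a loss-aware sample assignment strategy | Better correspondence between anchor points and ground-truth boxes | Works well only in dense crowds
    LTM[80] | Formulates object-feature matching as a maximum likelihood estimation problem | Improves accuracy under occlusion and misalignment | Detection speed needs further improvement
    Efficient DETR[81] | Simplifies the model structure by initializing with dense prior knowledge | Fewer encoders and decoders | Detection accuracy needs further improvement
    PSTR[72] | Builds the first person search network based on Transformers | Improves feature discriminability and association | Not extended to general object detection
    COAT[82] | Uses a three-stage cascade to detect objects and refine detection and re-identification | Separates object and background features more clearly | Some stages focus too much on ReID features at the cost of detection performance
    Progressive DETR[83] | Designs a relation information extraction module and a queue update module | Strengthens the reuse of low-confidence points | Detection accuracy needs further improvement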
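Several methods in Table 8 aim to reduce or remove NMS post-processing. The sketch below shows standard greedy NMS on horizontal boxes to illustrate why overlapping objects are a problem for it: a genuine neighboring object whose box overlaps a higher-scoring box above the IoU threshold is discarded. The boxes, scores, and threshold are made-up values, and boxes are assumed to have positive area.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, iou_thr=0.5):
    """Keep boxes in descending score order, dropping any box whose IoU
    with an already kept box exceeds iou_thr. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thr for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes (as in a crowd) and one isolated box.
boxes = [(0, 0, 10, 10), (2, 0, 12, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # [0, 2]
```

In the example the second box is removed even if it belongs to a different object, which is the missed-detection behavior in crowded scenes that the methods in the table try to avoid.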

    Table 9  Performance comparison of detection methods for very small objects

    Where two datasets are listed, paired values are given in the same order.

    Algorithm | Dataset | Input size (pixels) | Backbone | Processor | Speed (frames/s) | mAP (%) | Venue | Year
    RepPoints[89] | COCO | (480, 960) × $\le 960$ | ResNet-101 | GPU × 4 | - | 46.5 | ICCV | 2019
    DuBox[92] | COCO / VOC 2012 | 800 × 800 / 500 × 500 | ResNet-101 / VGG-16 | NVIDIA P40 × 8 | - | 39.5 / 82.0 | arXiv | 2019
    PPDet[87] | COCO | 800 × 1300 | ResNet-101 | Tesla V100 × 4 | - | 45.2 | BMVC | 2020
    RepPoints v2[90] | COCO | (800, 1333) × $\le 1333$ | ResNet-101 | GPU × 8 | - | 48.1 | NeurIPS | 2020
    FoveaBox[93] | COCO / VOC 2012 | 800 × 800 | ResNet-101 / ResNet-50 | GPU × 4 | 16.4 | 42.1 / 76.6 | TIP | 2020
    FBR-Net[94] | SSDD | 448 × 448 | ResNet-50 | RTX 2080Ti | 25.0 | 92.8 | TGRS | 2021
    FCOS (AFE-GDH)[88] | HRSID / SSDD | 800 × 800 | ResNet-50 | NVIDIA Titan Xp | 15.2 / 28.5 | 67.4 / 56.2 | Remote Sensing | 2022
    Oriented RepPoints[91] | DOTA / HRSC2016 | 1024 × 1024 / (300, 900) × (300, 1500) | ResNet-101 / ResNet-50 | RTX 2080Ti × 4 | - | 76.5 / 97.3 | CVPR | 2022
    QueryDet[95] | COCO | - | ResNet-50 | RTX 2080Ti × 8 | 14.4 | 39.5 | CVPR | 2022

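    Table 10  Advantages and disadvantages of detection methods for very small objects

    Algorithm | Method | Advantage | Disadvantage / difficulty
    RepPoints[89] | Represents object features as a point set | Adaptively learns extreme points and semantic information | Relies too heavily on the regression branch
    DuBox[92] | Uses dual-scale residual units with multi-scale properties | Fewer misses of small-object edge and interior information | High complexity of the segmentation model
    PPDet[87] | Uses a new labeling strategy that treats points inside the box as positive samples | Increases the contribution of discriminative object features | Insufficient feature information for small objects
    RepPoints v2[90] | Adds a corner verification branch to judge feature map points | Obtains features richer in object interior and edge information | Low localization accuracy of predicted boxes
    FoveaBox[93] | Detects multi-scale object features on multiple feature map levels | Adapts well to object shape and distribution | Hard to separate objects from background regions
    FBR-Net[94] | Uses multi-scale attention to select feature importance | Weakens the strong association between background regions and small objects | Detection accuracy needs further improvement
    FCOS (AFE-GDH)[88] | Uses an adaptive feature encoding (AFE) strategy and builds a Gaussian-guided detection head | Effectively strengthens small-object representation | Effectiveness shown only for ship targets
    Oriented RepPoints[91] | Proposes quality assessment, a sample assignment scheme, and spatial constraints | Better captures the features of non-axis-aligned small objects | Covers only small-object detection in aerial scenes
    QueryDet[95] | Uses a cascaded sparse query mechanism for dynamic prediction | Lower detection head cost and higher small-object localization accuracy | Higher resolution raises the probability of misjudgment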

    Table 11  Performance comparison of detection methods for objects with varying orientation

    Where two datasets are listed, paired values are given in the same order.

    Algorithm | Dataset | Input size (pixels) | Backbone | Processor | Speed (frames/s) | mAP (%) | Venue | Year
    SARD[102] | DOTA / HRSC2016 | 800 × 800 | ResNet-101 | Tesla P100 | 1.5 | 72.9 / 85.4 | IEEE Access | 2019
    P-RSDet[97] | DOTA / UCAS-AOD | 512 × 512 | ResNet-101 | Tesla V100 × 2 | - | 72.3 / 90.0 | IEEE Access | 2020
    O2-DNet[101] | DOTA / ICDAR2015 | 800 × 800 | ResNet-101 | Tesla V100 × 2 | - | 71.0 / 85.6 | P&RS | 2020
    DRN[106] | DOTA / HRSC2016 | 1024 × 1024 / 768 × 768 | Hourglass-104 | Tesla V100 | - | 73.2 / 92.7 | CVPR | 2020
    BBAVectors[96] | DOTA / HRSC2016 | 608 × 608 | ResNet-101 | GTX 1080Ti × 4 | 11.7 | 75.4 / 88.6 | WACV | 2021
    FCOSR[98] | DOTA / HRSC2016 | 1024 × 1024 / 800 × 800 | ResNeXt-101 | Tesla V100 × 4 | 7.9 | 77.4 / 95.7 | arXiv | 2021
    DARDet[103] | DOTA / HRSC2016 | 1024 × 1024 | ResNet-50 | RTX 2080Ti | 12.6 | 71.2 / 78.9 | GRSL | 2021
    DAFNe[105] | DOTA / HRSC2016 | 1024 × 1024 | ResNet-101 | Tesla V100 × 4 | - | 76.9 / 89.5 | arXiv | 2021
    CHPDet[107] | UCAS-AOD / HRSC2016 | 1024 × 1024 | DLA-34 | RTX 2080Ti | - | 89.6 / 88.8 | TGRS | 2021
    AOPG[99] | DOTA / HRSC2016 | 1024 × 1024 / (800, 1333) × (800, 1333) | ResNet-101 / ResNet-50 | RTX 2080Ti | 10.8 | 80.2 / 96.2 | TGRS | 2022
    GGHL[100] | DOTA / SSDD+ | 800 × 800 | Darknet53 | RTX 3090 × 2 | 42.3 / 44.1 | 76.9 / 90.2 | TIP | 2022

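    Table 12  Advantages and disadvantages of detection methods for objects with varying orientation

    Algorithm | Method | Advantage | Disadvantage / difficulty
    SARD[102] | Fuses deep and shallow feature information with a scale-aware method | Adapts well to large scale changes and multi-angle objects | Low overall detection efficiency
    P-RSDet[97] | Regresses one polar radius and two polar angles to detect multi-angle objects | Avoids angle periodicity and the vertex ordering problem of predicted boxes | Post-processing in polar coordinates is relatively complex
    O2-DNet[101] | Learns discriminative features with an aspect-ratio-aware orientation centerness | The network learns more discriminative features from complex backgrounds | The feature fusion method fuses poorly in practice
    DRN[106] | Uses an adaptive feature selection module and a dynamically refined detection head | Alleviates misalignment between object features and the coordinate axes | Detection accuracy needs further improvement
    BBAVectors[96] | Replaces the original regression parameters with box boundary-aware vectors | Regresses all parameters in one coordinate system, reducing computation | Conversion between vector types is relatively complex
    FCOSR[98] | Uses an elliptical center sampling strategy based on a Gaussian distribution | Corrects the missed detections of sample assignment in aerial scenes | Label assignment is not adaptive
    DARDet[103] | Designs an efficient alignment convolution module to extract aligned features | Predicts all box parameters in one pass | Large angle prediction offset in the loss function
    DAFNe[105] | Uses an orientation-aware bounding box centerness function | Down-weights low-quality boxes and improves localization accuracy | The loss function still has some rotation sensitivity error
    CHPDet[107] | Uses an orientation-invariant module (OIM) to generate orientation-invariant feature maps | Determines the heading of rotated objects (such as the front of a car or the bow of a ship) | Position offsets remain in the predicted boxes
    AOPG[99] | Uses a region label assignment module and a coarse localization module | Alleviates unbalanced label assignment and object feature misalignment | Label assignment is not adaptive
    GGHL[100] | Assigns oriented-box labels with a 2D oriented Gaussian heatmap | Dynamic label assignment that aligns the regression and classification tasks | Detection accuracy needs further improvement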
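Table 12 lists Gaussian-based strategies for rotated objects, such as the elliptical center sampling of FCOSR and the 2D oriented Gaussian heatmap of GGHL. The sketch below evaluates a generic oriented Gaussian weight at one pixel by rotating the offset from the box center into the box's own frame. It illustrates the general idea only and is not the exact formulation of either paper: the (cx, cy, w, h, angle) box format, the function name, and the `sigma_ratio` parameter are assumptions.

```python
import math


def oriented_gaussian(x, y, cx, cy, w, h, angle_deg, sigma_ratio=0.25):
    """Weight in (0, 1] of pixel (x, y) under a Gaussian aligned with a
    rotated box. sigma_ratio (hypothetical) sets each standard deviation
    as a fraction of the box side along that axis."""
    a = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    # Rotate the offset into the box frame: u along width, v along height.
    u = dx * math.cos(a) + dy * math.sin(a)
    v = -dx * math.sin(a) + dy * math.cos(a)
    su, sv = sigma_ratio * w, sigma_ratio * h
    return math.exp(-0.5 * ((u / su) ** 2 + (v / sv) ** 2))


# Same distance from the center, along the long axis vs. across it.
print(oriented_gaussian(6, 5, 5, 5, 4, 2, 0))
print(oriented_gaussian(5, 6, 5, 5, 4, 2, 0))
```

Pixels along the long axis of the box keep a higher weight than pixels at the same distance across it, so positive samples follow the object's orientation rather than an axis-aligned region.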
  • [1] Nie Guang-Tao, Huang Hua. A survey of object detection in optical remote sensing images. Acta Automatica Sinica, 2021, 47(8): 1749-1768 doi: 10.16383/j.aas.c200596
    [2] Neubeck A, Van Gool L. Efficient non-maximum suppression. In: Proceedings of the 18th International Conference on Pattern Recognition (ICPR'06). Hong Kong, China: IEEE, 2006. 850−855
    [3] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y, et al. SSD: Single shot multibox detector. In: Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer, 2016. 21−37
    [4] Girshick R. Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015. 1440−1448
    [5] Ren S Q, He K M, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. In: Proceedings of the 29th International Conference on Neural Information Processing Systems. Montreal, Canada: 2015. 91−99
    [6] Redmon J, Farhadi A. YOLO9000: Better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 6517−6525
    [7] Redmon J, Farhadi A. YOLOv3: An incremental improvement. arXiv preprint arXiv: 1804.02767, 2018.
    [8] Xiao Yu-Qing, Yang Hui-Min. Research on application of object detection algorithm in traffic scene. Computer Engineering and Applications, 2021, 57(6): 30-41 doi: 10.3778/j.issn.1002-8331.2011-0361
    [9] Huang L C, Yang Y, Deng Y F, Yu Y N. DenseBox: Unifying landmark localization with end to end object detection. arXiv preprint arXiv: 1509.04874, 2015.
    [10] Everingham M, Van Gool L, Williams C K I, Winn J, Zisserman A. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 2010, 88(2): 303-338 doi: 10.1007/s11263-009-0275-4
    [11] Lin T Y, Maire M, Belongie S, Hays J, Perona P, Ramanan D, et al. Microsoft COCO: Common objects in context. In: Proceedings of the 13th European Conference on Computer Vision. Zurich, Switzerland: Springer, 2014. 740−755
    [12] Xia G S, Bai X, Ding J, Zhu Z, Belongie S, Luo J B, et al. DOTA: A large-scale dataset for object detection in aerial images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 3974−3983
    [13] Zhu H G, Chen X G, Dai W Q, Fu K, Ye Q X, Jiao J B. Orientation robust object detection in aerial images using deep convolutional neural network. In: Proceedings of the IEEE International Conference on Image Processing (ICIP). Quebec City, Canada: IEEE, 2015. 3735−3739
    [14] Karatzas D, Gomez-Bigorda L, Nicolaou A, Ghosh S, Bagdanov A, Iwamura M, et al. ICDAR 2015 competition on robust reading. In: Proceedings of the 13th International Conference on Document Analysis and Recognition (ICDAR). Tunis, Tunisia: IEEE, 2015. 1156−1160
    [15] Xiao T, Li S, Wang B C, Lin L, Wang X G. Joint detection and identification feature learning for person search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 3376−3385
    [16] Zheng L, Zhang H H, Sun S Y, Chandraker M, Yang Y, Tian Q. Person re-identification in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 3346−3355
    [17] Shao S, Zhao Z J, Li B X, Xiao T T, Yu G, Zhang X Y, et al. CrowdHuman: A benchmark for detecting human in a crowd. arXiv preprint arXiv: 1805.00123, 2018.
    [18] Liu Z K, Yuan L, Weng L B, Yang Y P. A high resolution optical satellite image dataset for ship recognition and some new baselines. In: Proceedings of the 6th International Conference on Pattern Recognition Applications and Methods. Porto, Portugal: SciTePress, 2017. 324−331
    [19] Li J W, Qu C W, Shao J Q. Ship detection in SAR images based on an improved faster R-CNN. In: Proceedings of the SAR in Big Data Era: Models, Methods and Applications (BIGSARDATA). Beijing, China: IEEE, 2017. 1−6
    [20] Wei S J, Zeng X F, Qu Q Z, Wang M, Su H, Shi J. HRSID: A high-resolution SAR images dataset for ship detection and instance segmentation. IEEE Access, 2020, 8: 120234-120254 doi: 10.1109/ACCESS.2020.3005861
    [21] Wang X G, Chen K B, Huang Z L, Yao C, Liu W Y. Point linking network for object detection. arXiv preprint arXiv: 1706.03646, 2017.
    [22] Law H, Deng J. CornerNet: Detecting objects as paired keypoints. In: Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer, 2018. 765−781
    [23] Law H, Teng Y, Russakovsky O, Deng J. CornerNet-Lite: Efficient keypoint based object detection. In: Proceedings of the 31st British Machine Vision Conference. BMVC, 2020.
    [24] Zhou X Y, Zhuo J C, Krähenbühl P. Bottom-up object detection by grouping extreme and center points. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 850−859
    [25] Duan K W, Bai S, Xie L X, Qi H G, Huang Q M, Tian Q. CenterNet: Keypoint triplets for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 6568−6577
    [26] Dong Z W, Li G X, Liao Y, Wang F, Ren P J, Qian C. CentripetalNet: Pursuing high-quality keypoint pairs for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 10516−10525
    [27] Lan S Y, Ren Z, Wu Y, Davis L S, Hua G. SaccadeNet: A fast and accurate object detector. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 10394−10403
    [28] Duan K W, Xie L X, Qi H G, Bai S, Huang Q M, Tian Q. Corner proposal network for anchor-free, two-stage object detection. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 399−416
    [29] Wang Yan-Qing, Ma Lei, Tian Yuan. State-of-the-art of ship detection and recognition in optical remotely sensed imagery. Acta Automatica Sinica, 2011, 37(9): 1029-1039
    [30] Yu J H, Jiang Y N, Wang Z Y, Cao Z M, Huang T. UnitBox: An advanced object detection network. In: Proceedings of the 24th ACM International Conference on Multimedia. Amsterdam, The Netherlands: ACM, 2016. 516−520
    [31] Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 779−788
    [32] Ge Z, Liu S T, Wang F, Li Z M, Sun J. YOLOX: Exceeding YOLO series in 2021. arXiv preprint arXiv: 2107.08430, 2021.
    [33] Tian Z, Shen C H, Chen H, He T. FCOS: Fully convolutional one-stage object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 9626−9635
    [34] Sun P Z, Jiang Y, Xie E Z, Shao W Q, Yuan Z H, Wang C H, et al. What makes for end-to-end object detection? In: Proceedings of the 38th International Conference on Machine Learning. PMLR, 2021. 9934−9944
    [35] Zhou X Y, Wang D Q, Krähenbühl P. Objects as points. arXiv preprint arXiv: 1904.07850, 2019.
    [36] Zhou X Y, Koltun V, Krähenbühl P. Probabilistic two-stage detection. arXiv preprint arXiv: 2103.07461, 2021.
    [37] Samet N, Hicsonmez S, Akbas E. HoughNet: Integrating near and long-range evidence for bottom-up object detection. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 406−423
    [38] Chu C, Zhmoginov A, Sandler M. CycleGAN, a master of steganography. arXiv preprint arXiv: 1712.02950, 2017.
    [39] Isola P, Zhu J Y, Zhou T H, Efros A A. Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 5967−5976
    [40] Lu X, Li B Y, Yue Y X, Li Q Q, Yan J J. Grid R-CNN. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 7355−7364
    [41] Lu X, Li B Y, Yue Y X, Li Q Q, Yan J J. Grid R-CNN plus: Faster and better. arXiv preprint arXiv: 1906.05688, 2019.
    [42] Carion N, Massa F, Synnaeve G, Usunier N, Kirillov A, Zagoruyko S. End-to-end object detection with transformers. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 213−229
    [43] Sun Z Q, Cao S C, Yang Y M, Kitani K. Rethinking transformer-based set prediction for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021. 3591−3600
    [44] Zhu X Z, Su W J, Lu L W, Li B, Wang X G, Dai J F. Deformable DETR: Deformable transformers for end-to-end object detection. In: Proceedings of the 9th International Conference on Learning Representations. ICLR, 2021.
    [45] Dai X Y, Chen Y P, Yang J W, Zhang P C, Yuan L, Zhang L. Dynamic DETR: End-to-end object detection with dynamic attention. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Montreal, Canada: IEEE, 2021. 2968−2977
    [46] Zhang G J, Luo Z P, Yu Y C, Cui K W, Lu S J. Accelerating DETR convergence via semantic-aligned matching. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 939−948
    [47] Fang Y X, Liao B C, Wang X G, Fang J M, Qi J Y, Wu R, et al. You only look at one sequence: Rethinking transformer in vision through object detection. In: Proceedings of the 35th International Conference on Neural Information Processing Systems. 2021. 26183−26197
    [48] Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X H, Unterthiner T, et al. An image is worth 16×16 words: Transformers for image recognition at scale. In: Proceedings of the 9th International Conference on Learning Representations. ICLR, 2021.
    [49] Song H, Sun D Q, Chun S, Jampani V, Han D, Heo B, et al. ViDT: An efficient and effective fully transformer-based object detector. In: Proceedings of the 10th International Conference on Learning Representations. ICLR, 2022.
    [50] Li F, Zhang H, Liu S L, Guo J, Ni L M, Zhang L. DN-DETR: Accelerate DETR training by introducing query DeNoising. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 13609−13617
    [51] Wang J F, Yuan Y, Li B X, Yu G, Jian S. SFace: An efficient network for face detection in large scale variations. arXiv preprint arXiv: 1804.06559, 2018.
    [52] Zhu C C, He Y H, Savvides M. Feature selective anchor-free module for single-shot object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 840−849
    [53] Lin T Y, Goyal P, Girshick R, He K M, Dollár P. Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 2999−3001
    [54] Zhu C C, Chen F Y, Shen Z Q, Savvides M. Soft anchor-point object detection. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 91−107
    [55] Zhang X S, Wan F, Liu C, Ji R R, Ye Q X. FreeAnchor: Learning to match anchors for visual object detection. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc., 2019. Article No. 14
    [56] Zhang S F, Chi C, Yao Y Q, Lei Z, Li S Z. Bridging the gap between anchor-based and anchor-free detection via adaptive training sample selection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 9756−9765
    [57] Zhu B J, Wang J F, Jiang Z K, Zong F H, Liu S T, Li Z M, et al. AutoAssign: Differentiable label assignment for dense object detection. arXiv preprint arXiv: 2007.03496, 2020.
    [58] Duan K W, Xie L X, Qi H G, Bai S, Huang Q M, Tian Q. Location-sensitive visual recognition with cross-IOU loss. arXiv preprint arXiv: 2104.04899, 2021.
    [59] Li S, He C H, Li R H, Zhang L. A dual weighting label assignment scheme for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 9377−9386
    [60] Liu Xiao-Bo, Liu Peng, Cai Zhi-Hua, Qiao Yu-Lin, Wang Ling, Wang Min. Research progress of optical remote sensing image object detection based on deep learning. Acta Automatica Sinica, 2021, 47(9): 2078-2089 doi: 10.16383/j.aas.c190455
    [61] Gong Hao-Tian, Zhang Meng. Lightweight anchor-free object detection algorithm based on KeyPoint detection. Computer Science, 2021, 48(8): 106-110 doi: 10.11896/jsjkx.200700161
    [62] Shao Xiao-Wen, Shuai Hui, Liu Qing-Shan. Person re-identification based on fused attribute features. Acta Automatica Sinica, 2022, 48(2): 564-571
    [63] Liu Yang, Zhan Yin-Wei. Survey of small object detection algorithms based on deep learning. Computer Engineering and Applications, 2021, 57(2): 37-48 doi: 10.3778/j.issn.1002-8331.2009-0047
    [64] Bodla N, Singh B, Chellappa R, Davis L S. Soft-NMS: Improving object detection with one line of code. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 5562−5570
    [65] Liu S T, Huang D, Wang Y H. Adaptive NMS: Refining pedestrian detection in a crowd. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 6452−6461
    [66] Huang X, Ge Z, Jie Z Q, Yoshie O. NMS by representative region: Towards crowded pedestrian detection by proposal pairing. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 10747−10756
    [67] Zhang S F, Wen L Y, Bian X, Lei Z, Li S Z. Occlusion-aware R-CNN: Detecting pedestrians in a crowd. In: Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich, Germany: Springer, 2018. 657−674
    [68] 阳珊, 王建, 胡莉, 刘波, 赵皓. 改进RetinaNet的遮挡目标检测算法研究. 计算机工程与应用, 2022, 58(11): 209-214 doi: 10.3778/j.issn.1002-8331.2107-0277

    Yang Shan, Wang Jian, Hu Li, Liu Bo, Zhao Hao. Research on occluded object detection by improved RetinaNet. Computer Engineering and Applications, 2022, 58(11): 209-214 doi: 10.3778/j.issn.1002-8331.2107-0277
    [69] Luo Z K, Fang Z, Zheng S X, Wang Y B, Fu Y W. NMS-Loss: Learning with non-maximum suppression for crowded pedestrian detection. In: Proceedings of the International Conference on Multimedia Retrieval. Taipei, China: ACM, 2021. 481−485
    [70] Liu W, Hasan I, Liao S C. Center and scale prediction: A box-free approach for pedestrian and face detection. arXiv preprint arXiv: 1904.02948, 2019.
    [71] Yan Y C, Li J P, Qin J, Bai S, Liao S C, Liu L, et al. Anchor-free person search. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE, 2021. 7686−7695
    [72] Cao J L, Pang Y W, Anwer R M, Cholakkal H, Xie J, Shah M, et al. PSTR: End-to-end one-step person search with transformers. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 9448−9457
    [73] Tian Z, Shen C H, Chen H, He T. FCOS: A simple and strong anchor-free object detector. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(4): 1922-1933
    [74] Rezatofighi H, Tsoi N, Gwak J Y, Sadeghian A, Reid I, Savarese S. Generalized intersection over union: A metric and a loss for bounding box regression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 658−666
    [75] Qin Y L, Wen J, Zheng H, Huang X L, Yang J, Song N, et al. Varifocal-Net: A chromosome classification approach using deep convolutional networks. IEEE Transactions on Medical Imaging, 2019, 38(11): 2569-2581 doi: 10.1109/TMI.2019.2905841
    [76] Qiu H, Ma Y C, Li Z M, Liu S T, Sun J. BorderDet: Border feature for dense object detection. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 549−564
    [77] Cheng Z H, Wu Y X, Xu Z H, Lukasiewicz T, Wang W Y. Segmentation is all you need. arXiv preprint arXiv: 1904.13300, 2019.
    [78] Ge Z, Liu S T, Li Z M, Yoshie O, Sun J. OTA: Optimal transport assignment for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Nashville, USA: IEEE, 2021. 303−312
    [79] Ge Z, Wang J F, Huang X, Liu S T, Yoshie O. LLA: Loss-aware label assignment for dense pedestrian detection. Neurocomputing, 2021, 462: 272-281 doi: 10.1016/j.neucom.2021.07.094
    [80] Zhang X S, Wan F, Liu C, Ji X Y, Ye Q X. Learning to match anchors for visual object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44(6): 3096-3109 doi: 10.1109/TPAMI.2021.3050494
    [81] Yao Z Y, Ai J B, Li B X, Zhang C. Efficient DETR: Improving end-to-end object detector with dense prior. arXiv preprint arXiv: 2104.01318, 2021.
    [82] Yu R, Du D W, LaLonde R, Davila D, Funk C, Hoogs A, et al. Cascade transformers for end-to-end person search. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 7257−7266
    [83] Zheng A L, Zhang Y, Zhang X Y, Qi X J, Sun J. Progressive end-to-end object detection in crowded scenes. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 847−856
    [84] Zhu Y S, Zhao C Y, Wang J Q, Zhao X, Wu Y, Lu H Q. CoupleNet: Coupling global structure with local parts for object detection. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 4146−4154
    [85] Li Y Z, Pang Y W, Shen J B, Cao J L, Shao L. NETNet: Neighbor erasing and transferring network for better single shot object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 13346−13355
    [86] Zhong Z Y, Sun L, Huo Q. An anchor-free region proposal network for Faster R-CNN-based text detection approaches. International Journal on Document Analysis and Recognition (IJDAR), 2019, 22(3): 315-327 doi: 10.1007/s10032-019-00335-y
    [87] Samet N, Hicsonmez S, Akbas E. Reducing label noise in anchor-free object detection. In: Proceedings of the 31st British Machine Vision Conference. BMVC, 2020.
    [88] He B K, Zhang Q Y, Tong M, He C. An anchor-free method based on adaptive feature encoding and Gaussian-guided sampling optimization for ship detection in SAR imagery. Remote Sensing, 2022, 14(7): 1738 doi: 10.3390/rs14071738
    [89] Yang Z, Liu S H, Hu H, Wang L W, Lin S. RepPoints: Point set representation for object detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 9656−9665
    [90] Chen Y H, Zhang Z, Cao Y, Wang L W, Lin S, Hu H. RepPoints v2: Verification meets regression for object detection. In: Proceedings of the 34th International Conference on Neural Information Processing Systems. 2020. Article No. 33
    [91] Li W T, Chen Y J, Hu K X, Zhu J K. Oriented RepPoints for aerial object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 1819−1828
    [92] Chen S, Li J P, Yao C Q, Hou W B, Qin S, Jin W Y, et al. DuBox: No-prior box objection detection via residual dual scale detectors. arXiv preprint arXiv: 1904.06883, 2019.
    [93] Kong T, Sun F C, Liu H P, Jiang Y N, Li L, Shi J B. FoveaBox: Beyound anchor-based object detection. IEEE Transactions on Image Processing, 2020, 29: 7389-7398 doi: 10.1109/TIP.2020.3002345
    [94] Fu J M, Sun X, Wang Z R, Fu K. An anchor-free method based on feature balancing and refinement network for multiscale ship detection in SAR images. IEEE Transactions on Geoscience and Remote Sensing, 2021, 59(2): 1331-1344 doi: 10.1109/TGRS.2020.3005151
    [95] Yang C, Huang Z H, Wang N Y. QueryDet: Cascaded sparse query for accelerating high-resolution small object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New Orleans, USA: IEEE, 2022. 13658−13667
    [96] Yi J R, Wu P X, Liu B, Huang Q Y, Qu H, Metaxas D. Oriented object detection in aerial images with box boundary-aware vectors. In: Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Waikoloa, USA: IEEE, 2021. 2149−2158
    [97] Zhou L, Wei H R, Li H, Zhao W Z, Zhang Y, Zhang Y. Arbitrary-oriented object detection in remote sensing images based on polar coordinates. IEEE Access, 2020, 8: 223373-223384 doi: 10.1109/ACCESS.2020.3041025
    [98] Li Z H, Hou B, Wu Z T, Jiao L C, Ren B, Yang C. FCOSR: A simple anchor-free rotated detector for aerial object detection. arXiv preprint arXiv: 2111.10780, 2021.
    [99] Cheng G, Wang J B, Li K, Xie X X, Lang C B, Yao Y Q, et al. Anchor-free oriented proposal generator for object detection. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: Article No. 5625411
    [100] Huang Z C, Li W, Xia X G, Tao R. A general Gaussian heatmap label assignment for arbitrary-oriented object detection. IEEE Transactions on Image Processing, 2022, 31: 1895-1910 doi: 10.1109/TIP.2022.3148874
    [101] Wei H R, Zhang Y, Chang Z H, Li H, Wang H Q, Sun X. Oriented objects as pairs of middle lines. ISPRS Journal of Photogrammetry and Remote Sensing, 2020, 169: 268-279 doi: 10.1016/j.isprsjprs.2020.09.022
    [102] Wang Y S, Zhang Y, Zhang Y, Zhao L J, Sun X, Guo Z. SARD: Towards scale-aware rotated object detection in aerial imagery. IEEE Access, 2019, 7: 173855-173865 doi: 10.1109/ACCESS.2019.2956569
    [103] Zhang F, Wang X Y, Zhou S L, Wang Y Q. DARDet: A dense anchor-free rotated object detector in aerial images. IEEE Geoscience and Remote Sensing Letters, 2021, 19: Article No. 8024305
    [104] Chen Z M, Chen K, Lin W Y, See J, Yu H, Ke Y, et al. PIoU loss: Towards accurate oriented object detection in complex environments. In: Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer, 2020. 195−211
    [105] Lang S, Ventola F, Kersting K. DAFNe: A one-stage anchor-free approach for oriented object detection. arXiv preprint arXiv: 2109.06148, 2021.
    [106] Pan X J, Ren Y Q, Sheng K K, Dong W M, Yuan H L, Guo X W, et al. Dynamic refinement network for oriented and densely packed object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 11204−11213
    [107] Zhang F, Wang X Y, Zhou S L, Wang Y Q, Hou Y. Arbitrary-oriented ship detection through center-head point extraction. IEEE Transactions on Geoscience and Remote Sensing, 2021, 60: Article No. 5612414
    [108] Yang X, Yan J C, Ming Q, Wang W T, Zhang X P, Tian Q. Rethinking rotated object detection with Gaussian wasserstein distance loss. In: Proceedings of the 38th International Conference on Machine Learning. PMLR, 2021. 11830−11841
    [109] Yang X, Yang X J, Yang J R, Ming Q, Wang W T, Tian Q, et al. Learning high-precision bounding box for rotated object detection via Kullback-Leibler divergence. In: Proceedings of the 35th International Conference on Neural Information Processing Systems. 2021. 18381−18394
    [110] Qian W, Yang X, Peng S L, Zhang X J, Yan J C. RSDet++: Point-based modulated loss for more accurate rotated object detection. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(11): 7869-7879 doi: 10.1109/TCSVT.2022.3186070
    [111] Yang X, Yang J R, Yan J C, Zhang Y, Zhang T F, Guo Z, et al. SCRDet: Towards more robust detection for small, cluttered and rotated objects. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 8231−8240
    [112] Milan A, Leal-Taixé L, Reid I, Roth S, Schindler K. MOT16: A benchmark for multi-object tracking. arXiv preprint arXiv: 1603.00831, 2016.
    [113] Zhang Y F, Wang C Y, Wang X G, Zeng W J, Liu W Y. FairMOT: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision, 2021, 129(11): 3069-3087 doi: 10.1007/s11263-021-01513-4
    [114] Howard A G, Zhu M L, Chen B, Kalenichenko D, Wang W J, Weyand T, et al. MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv: 1704.04861, 2017.
    [115] Zhang X Y, Zhou X Y, Lin M X, Sun J. ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 6848−6856
    [116] Wang R J, Li X, Ling C X. Pelee: A real-time object detection system on mobile devices. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montréal, Canada: Curran Associates Inc., 2018. 1967−1976
Publication history
  • Received: 2022-02-24
  • Accepted: 2022-11-03
  • Published online: 2022-12-22
  • Issue date: 2023-07-20
