Research on Detection Method of Refined Rotated Boxes in Remote Sensing

Zhu Yu, Fang Guan-Shou, Zheng Bing-Bing, Han Fei

Citation: Zhu Yu, Fang Guan-Shou, Zheng Bing-Bing, Han Fei. Research on detection method of refined rotated boxes in remote sensing. Acta Automatica Sinica, 2020, 45(x): 1−10 doi: 10.16383/j.aas.c200261


doi: 10.16383/j.aas.c200261
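Funding: Supported by the Scientific Research Program of the Science and Technology Commission of Shanghai Municipality (17DZ1100808)
    Author biographies:

    Zhu Yu: Doctoral studies at the School of Science and Technology, Nanjing University; currently a professor in the Department of Electronic and Communication Engineering, East China University of Science and Technology. Main research interests: image processing, computer vision, multimedia communication, and deep learning. Corresponding author of this paper. E-mail: zhuyu@ecust.edu.cn

    Fang Guan-Shou: Master's student at the School of Information Science and Engineering, East China University of Science and Technology. Main research interests: object detection and deep learning. E-mail: y30180616@mail.ecust.edu.cn

    Zheng Bing-Bing: Master's graduate of the School of Information Science and Engineering, East China University of Science and Technology, where he is now pursuing a Ph.D. Main research interests: medical image processing, deep learning, and computer vision. E-mail: bostonkg@outlook.com

    Han Fei: Master's student at the School of Information Science and Engineering, East China University of Science and Technology. Main research interests: object detection, computer vision, and deep learning. E-mail: fei-han_huali@163.com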

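  • Abstract: Objects in remote sensing images often appear in arbitrary orientations, whereas common object detection algorithms use horizontal bounding boxes and cannot meet the needs of such scenes. This paper therefore proposes a rotated-box detection network, R2-FRCNN. The network detects rotated boxes in two stages, coarse adjustment and fine adjustment: the coarse stage converts horizontal boxes into rotated boxes, and the fine stage further refines the localization of the rotated boxes. Because remote sensing images contain many small objects, the paper proposes a pixel-recombination pyramid structure that fuses deep and shallow features to improve the detection accuracy of small objects against complex backgrounds. In addition, to extract more effective features from each pyramid level, the paper designs a region-of-interest feature extraction method for the coarse stage that combines integration with area interpolation, and a rotated-box region feature extraction method for the fine stage. Finally, both stages use a prediction branch that combines fully connected and convolutional layers, with SmoothLn as the regression loss of the network, which further improves performance. The proposed network is evaluated on the large-scale remote sensing dataset DOTA and reaches an mAP of 0.7602. Comparative experiments demonstrate the effectiveness of the proposed R2-FRCNN.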
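
The coarse stage described in the abstract turns a horizontal box into a rotated one. As a minimal sketch of what that representation involves, the Python below converts a box given by center, size, and angle into its four corner points. The (cx, cy, w, h, theta) parameterization and the function name `rotated_box_corners` are assumptions made for illustration; this page does not state the encoding R2-FRCNN actually uses. A horizontal box is simply the case theta = 0.

```python
import math


def rotated_box_corners(cx, cy, w, h, theta):
    """Return the four corners of a w-by-h box centered at (cx, cy).

    theta is the rotation angle in radians. Whether a positive angle looks
    clockwise or counter-clockwise depends on the axis convention (image
    coordinates usually have y pointing down). Illustrative helper only.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the axis-aligned box, relative to its center.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    corners = []
    for dx, dy in offsets:
        # Standard 2-D rotation of the offset, then shift back to the center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners


# A horizontal box and the same box rotated by 90 degrees.
horizontal = rotated_box_corners(10.0, 20.0, 4.0, 2.0, 0.0)
rotated = rotated_box_corners(10.0, 20.0, 4.0, 2.0, math.pi / 2)
```

With theta = 0 the corners are those of the ordinary horizontal box, so a detector that regresses the extra angle term can start from a horizontal proposal and adjust it.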
  • Fig. 1  Visualization of the object detection problem in remote sensing images

    Fig. 2  The structure of R2-FRCNN

    Fig. 3  The structure of the pixel-recombination pyramid

    Fig. 4  The structure of feature fusion

    Fig. 5  Schematic diagram of common RoI feature extraction

    Fig. 6  Diagram of IRoIPool feature extraction

    Fig. 7  Diagram of rotated RoI feature extraction

    Fig. 8  Diagram of the prediction branch

    Fig. 9  Training loss curve on DOTA

    Fig. 10  Visualization of detection results for each category
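
Fig. 3 names a pixel-recombination pyramid, but this page does not show its layers. One common way to recombine pixels for upsampling is the depth-to-space (pixel shuffle) rearrangement, and the sketch below assumes that operation; it is not confirmed to be the layer used in R2-FRCNN. The function name `pixel_shuffle` and the nested-list tensor layout (channels, height, width) are illustrative choices.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list.

    Each group of r*r input channels supplies the r-by-r block of output
    pixels at every spatial location, so resolution grows by r while the
    channel count shrinks by r*r. Assumed operation, for illustration only.
    """
    channels_in = len(x)
    height, width = len(x[0]), len(x[0][0])
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    channels_out = channels_in // (r * r)

    out = [[[0] * (width * r) for _ in range(height * r)] for _ in range(channels_out)]
    for c in range(channels_out):
        for i in range(r):          # row offset inside each r-by-r block
            for j in range(r):      # column offset inside each r-by-r block
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Four 1x1 channels become one 2x2 channel.
low_res = [[[1]], [[2]], [[3]], [[4]]]
high_res = pixel_shuffle(low_res, 2)
```

Unlike interpolation, this rearrangement only moves values that already exist in the deep feature map, which is one reason it is used when fusing deep and shallow pyramid levels.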

    Table 1  Comparison of detection accuracy (AP, %) of different methods on the DOTA dataset

    Category  R2CNN[10]  RT[12]  CADNet[13]  SCRDet[15]  R3Det[16]  GV[17]  Ours
    PL        80.94      88.64   87.80       89.98       89.24      89.64   89.10
    BD        65.67      78.52   82.40       80.65       80.81      85.00   81.22
    BR        35.34      43.44   49.40       52.09       51.11      52.26   54.47
    GTF       67.44      75.92   73.50       68.36       65.62      77.34   72.97
    SV        59.92      68.81   71.10       68.36       70.67      73.01   79.99
    LV        50.91      73.68   64.50       60.32       76.03      73.14   82.28
    SH        55.81      83.59   76.60       72.41       78.32      86.82   87.64
    TC        90.67      90.74   90.90       90.85       90.83      90.74   90.54
    BC        66.92      77.27   79.20       87.94       84.89      79.02   87.31
    ST        72.39      81.46   73.30       86.86       84.42      86.81   86.33
    SBF       55.06      58.39   48.40       65.02       65.10      59.55   54.20
    RA        52.23      53.54   60.90       66.68       57.18      70.91   68.18
    HA        55.14      62.83   62.00       66.25       68.10      72.94   76.12
    SP        53.35      58.93   67.00       68.24       68.98      70.86   70.83
    HC        48.22      47.67   62.20       65.21       60.88      57.32   59.19
    mAP(%)    60.67      69.56   69.90       72.61       72.81      75.02   76.02
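
The mAP row of Table 1 is the unweighted mean of the fifteen per-category AP values. As a quick check, the Python below recomputes it for the final column (the proposed method) using the numbers exactly as listed in the table; the variable names are illustrative.

```python
# Per-category AP (%) for the proposed method, copied from the last column of Table 1.
ap_ours = {
    "PL": 89.10, "BD": 81.22, "BR": 54.47, "GTF": 72.97, "SV": 79.99,
    "LV": 82.28, "SH": 87.64, "TC": 90.54, "BC": 87.31, "ST": 86.33,
    "SBF": 54.20, "RA": 68.18, "HA": 76.12, "SP": 70.83, "HC": 59.19,
}

# mAP is the plain average over categories (no weighting by instance count).
map_ours = sum(ap_ours.values()) / len(ap_ours)
print(f"mAP = {map_ours:.2f}%")  # prints "mAP = 76.02%"
```

The result agrees with the 76.02 in the table and the 0.7602 quoted in the abstract.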

    Table 2  Ablation results for the R2-FRCNN modules

    Configuration         mAP(%)
    Baseline              69.52
    + Fine adjustment     73.62
    + IRoIPool            73.99
    + RRoIPool            74.31
    + PFPN                74.97
    + SmoothLn            75.13
    + ConvFc              75.96

    Table 3  Experimental results of different feature extraction methods for horizontal boxes

    Configuration: Baseline + fine adjustment
    Method    RoI Pooling  RoI Align  IRoIPool
    mAP(%)    71.21        73.62      73.99

    Table 4  Experimental results of different feature extraction methods for rotated boxes

    Configuration: Baseline + fine adjustment + IRoIPool
    Method    RRoI A-Pooling  RRoI Align  RRoIPool
    mAP(%)    73.38           73.99       74.31
  • [1] Ya, Ying, et al. Fusion object detection of satellite imagery with arbitrary-oriented region convolutional neural network. Aerospace Systems, 2019, 2(2): 163−174 doi: 10.1007/s42401-019-00033-x
    [2] Wang Yan-Qing, Ma Lei, Tian Yuan. State-of-the-art of ship detection and recognition in optical remotely sensed imagery. Acta Automatica Sinica, 2011, 37(9): 1029−1039
    [3] Zhang Hui, Wang Kun-Feng, Wang Fei-Yue. Advances and perspectives on applications of deep learning in visual object detection. Acta Automatica Sinica, 2017, 43(8): 1289−1305
    [4] Ren S Q, He K M, Girshick R, Sun J. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137−1149 doi: 10.1109/TPAMI.2016.2577031
    [5] Dai J F, Li Y, He K M, Sun J. R-FCN: object detection via region-based fully convolutional networks. In: Proceedings of the 2016 Advances in Neural Information Processing Systems (NIPS). Barcelona, Spain: MIT Press: IEEE, 2016. 379−387.
    [6] Cai, Zhaowei, and Nuno Vasconcelos. Cascade R-CNN: Delving into high quality object detection. In: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, UT: IEEE, 2018. 6154−6162.
    [7] Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: unified, real-time object detection. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016. 779−788.
    [8] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S E, Fu C Y, Berg A C. SSD: single shot multibox detector. In: Proceedings of the 14th European Conference on Computer Vision (ECCV). Amsterdam, Netherlands: Springer, 2016. 21−37.
    [9] Lin, Tsung-Yi, et al. Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318−327
    [10] Jiang Y, Zhu X, Wang X, et al. R2CNN: rotational region CNN for orientation robust scene text detection [Online], available: https://arxiv.org/abs/1706.09579, 29 Jun, 2017.
    [11] Ma J, Shao W, Ye H, et al. Arbitrary-oriented scene text detection via rotation proposals. IEEE Transactions on Multimedia, 2018, 20(11): 3111−3122 doi: 10.1109/TMM.2018.2818020
    [12] Ding, Jian, et al. Learning RoI transformer for detecting oriented objects in aerial images. In: Proceedings of the 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE, 2019. 2844−2853.
    [13] Zhang, Gongjie, Shijian Lu, and Wei Zhang. CAD-Net: A context-aware detection network for objects in remote sensing imagery. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(12): 10015−10024 doi: 10.1109/TGRS.2019.2930982
    [14] Azimi, Seyed Majid, Vig, Eleonora, Bahmanyar, Reza, et al. Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery. Cham: Springer International Publishing, 2019. 150−165.
    [15] Yang, Xue, et al. SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Korea (South): IEEE, 2019. 8231−8240.
    [16] Yang, Xue, et al. R3DET: Refined single-stage detector with feature refinement for rotating object[Online], available: https://arxiv.org/abs/1908.05612, 15 Aug, 2019.
    [17] Xu, Yongchao, et al. Gliding vertex on the horizontal bounding box for multi-oriented object detection[Online], available: https://arxiv.org/abs/1911.09358, 21 Nov, 2019.
    [18] Wei, Haoran, et al. Oriented Objects as pairs of Middle Lines[Online], available: https://arxiv.org/abs/1912.10694, 23 Dec, 2019.
    [19] Li, Yangyang, et al. RADet: Refine Feature Pyramid Network and Multi-Layer Attention Network for Arbitrary-Oriented Object Detection of Remote Sensing Images. Remote Sensing, 2020, 12(3): 389−409 doi: 10.3390/rs12030389
    [20] Wang, Jinwang, et al. Mask OBB: A Semantic Attention-Based Mask Oriented Bounding Box Representation for Multi-Category Object Detection in Aerial Images. Remote Sensing, 2019, 11(24): 2930−2951 doi: 10.3390/rs11242930
    [21] Xia, Gui-Song, et al. DOTA: A large-scale dataset for object detection in aerial images. In: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, UT: IEEE, 2018. 3974−3983.
    [22] He, Kaiming, et al. Deep residual learning for image recognition. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016. 770−778.
    [23] Ma, Jianqi, et al. Arbitrary-oriented scene text detection via rotation proposals. IEEE Transactions on Multimedia, 2018, 20(11): 3111−3122 doi: 10.1109/TMM.2018.2818020
    [24] T. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan and S. Belongie. Feature Pyramid Networks for Object Detection. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI: IEEE, 2017. 936−944.
    [25] Yi, Jingru, Pengxiang Wu, and Dimitris N. Metaxas. ASSD: Attentive single shot multibox detector. Computer Vision and Image Understanding, 2019, 189: 102827−102836.
    [26] Zeiler M D, Krishnan D, Taylor G W, et al. Deconvolutional networks. In: Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). San Francisco, CA: IEEE, 2010. 2528−2535.
    [27] Wang J, Chen K, Xu R, et al. CARAFE: Content-Aware ReAssembly of Features [Online], available: https://arxiv.org/abs/1905.02188, 6 May, 2019.
    [28] Zhou, Peng, et al. Scale-transferrable object detection. In: Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, UT: IEEE, 2018. 528−537.
    [29] Bridle, John S. Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. Neurocomputing. Springer, Berlin, Heidelberg, 1990, 68: 227−236
    [30] K. He, G. Gkioxari, P. Dollár and R. Girshick. Mask R-CNN. In: Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice: IEEE, 2017. 2980−2988.
    [31] Jiang, Borui, et al. Acquisition of localization confidence for accurate object detection [Online], available: https://arxiv.org/abs/1807.11590, 30 Jul, 2018.
    [32] Wu Y, Chen Y, Yuan L, et al. Rethinking Classification and Localization for Object Detection[Online], available: https://arxiv.org/abs/1904.06493, 13 Apr, 2019.
    [33] Liu, Yuliang, and Lianwen Jin. Deep matching prior network: Toward tighter multi-oriented text detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, UT: IEEE, 2018. 8759−8768.
    [34] Dai J, Qi H, Xiong Y, et al. Deformable convolutional networks. In: Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV). Venice: IEEE, 2017. 3454−3461.
Publication history
  • Received:  2020-04-29
  • Accepted:  2020-09-07
