Research on Detection Method of Refined Rotated Boxes in Remote Sensing

ZHU Yu, FANG Guan-Shou, ZHENG Bing-Bing, HAN Fei

Citation: Zhu Yu, Fang Guan-Shou, Zheng Bing-Bing, Han Fei. Research on detection method of refined rotated boxes in remote sensing. Acta Automatica Sinica, 2023, 49(2): 415−424 doi: 10.16383/j.aas.c200261


doi: 10.16383/j.aas.c200261
Funds: Supported by Shanghai Science and Technology Committee (17DZ1100808)
    Author Bio:

    ZHU Yu  Professor at the School of Information Science and Engineering, East China University of Science and Technology. She received her Ph.D. degree from Nanjing University in 1999. Her research interests cover image processing, computer vision, multimedia communication, and deep learning. Corresponding author of this paper. E-mail: zhuyu@ecust.edu.cn

    FANG Guan-Shou  Master student at the School of Information Science and Engineering, East China University of Science and Technology. His research interests cover object detection and deep learning. E-mail: y30180616@mail.ecust.edu.cn

    ZHENG Bing-Bing  Ph.D. candidate at the School of Information Science and Engineering, East China University of Science and Technology. His research interests cover medical image processing, deep learning, and computer vision. E-mail: bostonkg@outlook.com

    HAN Fei  Master student at the School of Information Science and Engineering, East China University of Science and Technology. His research interests cover object detection, computer vision, and deep learning. E-mail: fei-han_huali@163.com

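  • Abstract: Objects in remote sensing images often appear in arbitrary orientations, whereas common object detection algorithms predict horizontal boxes and cannot meet the needs of such scenes. This paper therefore proposes R2-FRCNN, a rotated-box detection network. The network detects rotated boxes in two stages, coarse adjustment and fine adjustment: the coarse stage converts horizontal boxes into rotated boxes, and the fine stage further refines the localization of the rotated boxes. Because remote sensing images contain many small objects, a pixel-recombination pyramid structure is proposed that fuses deep and shallow features and improves the detection accuracy of small objects against complex backgrounds. In addition, to extract more effective features from each pyramid level, a region-of-interest (RoI) feature extraction method that combines integration with area interpolation is designed for the coarse stage, and a rotated-box region feature extraction method is designed for the fine stage. Finally, both stages use a prediction branch that combines fully connected and convolutional layers, and SmoothLn is used as the regression loss of the network, which further improves performance. Evaluated on the large-scale remote sensing dataset DOTA, the proposed network achieves a mean average precision (mAP) of 0.7602. Comparative experiments demonstrate the effectiveness of R2-FRCNN.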
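As an illustration of the regression loss named in the abstract, the sketch below assumes that SmoothLn takes the form given in reference [32], namely (|d| + 1) ln(|d| + 1) − |d| for a residual d; the abstract itself does not state the formula. The five-parameter box encoding (x, y, w, h, theta) and all function names are hypothetical and chosen only for this example.

```python
import math


def smooth_ln(d: float) -> float:
    """Smooth-Ln penalty for one regression residual d.

    Assumed form (from reference [32]): (|d| + 1) * ln(|d| + 1) - |d|.
    """
    a = abs(d)
    return (a + 1.0) * math.log(a + 1.0) - a


def smooth_ln_grad(d: float) -> float:
    """Derivative of smooth_ln with respect to d: sign(d) * ln(|d| + 1)."""
    return math.copysign(math.log(abs(d) + 1.0), d)


def box_regression_loss(pred, target):
    """Sum of Smooth-Ln penalties over the offsets of one box.

    pred and target are equal-length sequences of regression offsets,
    for example (x, y, w, h, theta) for a rotated box (hypothetical encoding).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(smooth_ln(p - t) for p, t in zip(pred, target))


if __name__ == "__main__":
    pred = (0.10, -0.20, 0.05, 0.00, 0.30)
    target = (0.00, 0.00, 0.00, 0.00, 0.00)
    print(box_regression_loss(pred, target))
```

Under this assumed form the penalty is zero at d = 0 and differentiable everywhere, and its gradient grows only logarithmically with the size of the residual.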
  • Fig. 1  Visualization of the object detection problem in remote sensing images

    Fig. 2  The structure of R2-FRCNN

    Fig. 3  The structure of the pixel-recombination pyramid

    Fig. 4  The structure of feature fusion

    Fig. 5  Schematic diagram of common RoI feature extraction

    Fig. 6  Diagram of IRoIPool feature extraction

    Fig. 7  Diagram of rotated RoI feature extraction

    Fig. 8  Diagram of the prediction branch

    Fig. 9  Training loss on DOTA

    Fig. 10  Visualization of detection results for each category

    Table 1  Comparison of detection accuracy of different methods on DOTA (%)

    Category            R2CNN[10]   RT[12]      CADNet[13]  SCRDet[15]  R3Det[16]   GV[17]      Ours
    Plane               80.94       88.64       87.80       89.98       89.24       89.64       89.10
    Baseball diamond    65.67       78.52       82.40       80.65       80.81       85.00       81.22
    Bridge              35.34       43.44       49.40       52.09       51.11       52.26       54.47
    Ground track field  67.44       75.92       73.50       68.36       65.62       77.34       72.97
    Small vehicle       59.92       68.81       71.10       68.36       70.67       73.01       79.99
    Large vehicle       50.91       73.68       64.50       60.32       76.03       73.14       82.28
    Ship                55.81       83.59       76.60       72.41       78.32       86.82       87.64
    Tennis court        90.67       90.74       90.90       90.85       90.83       90.74       90.54
    Basketball court    66.92       77.27       79.20       87.94       84.89       79.02       87.31
    Storage tank        72.39       81.46       73.30       86.86       84.42       86.81       86.33
    Soccer-ball field   55.06       58.39       48.40       65.02       65.10       59.55       54.20
    Roundabout          52.23       53.54       60.90       66.68       57.18       70.91       68.18
    Harbor              55.14       62.83       62.00       66.25       68.10       72.94       76.12
    Swimming pool       53.35       58.93       67.00       68.24       68.98       70.86       70.83
    Helicopter          48.22       47.67       62.20       65.21       60.88       57.32       59.19
    mAP                 60.67       69.56       69.90       72.61       72.81       75.02       76.02
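The last row of Table 1 can be reproduced from the per-class values: the reported figure is the unweighted mean of the 15 per-class average precisions. A minimal Python check for the column of the proposed method follows; the English category names and the function name `mean_ap` are labels chosen for this sketch.

```python
# Per-class AP (%) of the proposed method, copied from the last column of Table 1.
ap_ours = {
    "plane": 89.10,
    "baseball diamond": 81.22,
    "bridge": 54.47,
    "ground track field": 72.97,
    "small vehicle": 79.99,
    "large vehicle": 82.28,
    "ship": 87.64,
    "tennis court": 90.54,
    "basketball court": 87.31,
    "storage tank": 86.33,
    "soccer-ball field": 54.20,
    "roundabout": 68.18,
    "harbor": 76.12,
    "swimming pool": 70.83,
    "helicopter": 59.19,
}


def mean_ap(ap_by_class):
    """Unweighted mean of the per-class average precisions."""
    return sum(ap_by_class.values()) / len(ap_by_class)


if __name__ == "__main__":
    print(f"mAP = {mean_ap(ap_ours):.2f}%")  # prints "mAP = 76.02%"
```

Because the mean is unweighted, rare categories such as helicopter count as much as frequent ones such as small vehicle.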

    Table 2  Detection results of R2-FRCNN with its modules evaluated separately

    Modules of R2-FRCNN: Baseline, Refinement, IRoIPool, RRoIPool, PFPN, SmoothLn, ConvFc
    mAP (%) of the seven configurations: 69.52, 73.62, 73.99, 74.31, 74.97, 75.13, 75.96
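The step-by-step gains in Table 2 can be read off with simple arithmetic. The sketch below assumes that the seven mAP values are listed in the order in which the modules were added (baseline first, ConvFc last); this ordering is an assumption, although Tables 3 and 4 are consistent with it for the first four values. The variable names are chosen for this example only.

```python
# Module names and mAP (%) values copied from Table 2.
# Pairing them by position is an assumption about the table layout.
modules = ["Baseline", "Refinement", "IRoIPool", "RRoIPool", "PFPN", "SmoothLn", "ConvFc"]
map_by_config = [69.52, 73.62, 73.99, 74.31, 74.97, 75.13, 75.96]

# Gain of each configuration over the previous one, in mAP percentage points.
gains = [round(b - a, 2) for a, b in zip(map_by_config, map_by_config[1:])]
total_gain = round(map_by_config[-1] - map_by_config[0], 2)

if __name__ == "__main__":
    for name, gain in zip(modules[1:], gains):
        print(f"+ {name:<10} {gain:+.2f}")
    print(f"total gain over baseline: {total_gain:+.2f}")
```

Under this reading, the refinement stage accounts for the largest single step, and the remaining modules each add less than one percentage point.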

    Table 3  Experimental results of different feature extraction methods for horizontal boxes

    Configuration: Baseline + Refinement
    Method      RoIPooling  RoI Align   IRoIPool
    mAP (%)     71.21       73.62       73.99

    Table 4  Experimental results of different feature extraction methods for rotated boxes

    Configuration: Baseline + Refinement + IRoIPool
    Method      RRoI A-Pooling  RRoI Align  RRoIPool
    mAP (%)     73.38           73.99       74.31
  • [1] Ya Y, Pan H, Jing Z L, Ren X G, Qiao L F. Fusion object detection of satellite imagery with arbitrary-oriented region convolutional neural network. Aerospace Systems, 2019, 2(2): 163-174 doi: 10.1007/s42401-019-00033-x
    [2] Wang Yan-Qing, Ma Lei, Tian Yuan. State-of-the-art of ship detection and recognition in optical remotely sensed imagery. Acta Automatica Sinica, 2011, 37(9): 1029-1039
    [3] Zhang Hui, Wang Kun-Feng, Wang Fei-Yue. Advances and perspectives on applications of deep learning in visual object detection. Acta Automatica Sinica, 2017, 43(8): 1289-1305
    [4] Ren S Q, He K M, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149 doi: 10.1109/TPAMI.2016.2577031
    [5] Dai J F, Li Y, He K M, Sun J. R-FCN: Object detection via region-based fully convolutional networks. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: 2016. 379−387
    [6] Cai Z W, Vasconcelos N. Cascade R-CNN: Delving into high quality object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 6154−6162
    [7] Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 779−788
    [8] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y, et al. SSD: Single shot MultiBox detector. In: Proceedings of the 14th European Conference on Computer Vision. Amsterdam, Netherlands: 2016. 21−37
    [9] Lin T Y, Goyal P, Girshick R, He K M, Dollár P. Focal loss for dense object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(2): 318-327 doi: 10.1109/TPAMI.2018.2858826
    [10] Jiang Y Y, Zhu X Y, Wang X B, Yang S L, Li W, Wang H, et al. R2CNN: Rotational region CNN for orientation robust scene text detection [Online], available: https://arxiv.org/abs/1706.09579, June 29, 2017
    [11] Ma J Q, Shao W Y, Ye H, Wang L, Wang H, Zheng Y B, et al. Arbitrary-oriented scene text detection via rotation proposals. IEEE Transactions on Multimedia, 2018, 20(11): 3111-3122 doi: 10.1109/TMM.2018.2818020
    [12] Ding J, Xue N, Long Y, Xia G S, Lu Q K. Learning RoI transformer for oriented object detection in aerial images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE, 2019. 2844−2853
    [13] Zhang G J, Lu S J, Zhang W. CAD-Net: A context-aware detection network for objects in remote sensing imagery. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(12): 10015-10024 doi: 10.1109/TGRS.2019.2930982
    [14] Azimi S M, Vig E, Bahmanyar R, Körner M, Reinartz P. Towards multi-class object detection in unconstrained remote sensing imagery. In: Proceedings of the 14th Asian Conference on Computer Vision. Perth, Australia: 2019. 150−165
    [15] Yang X, Yang J R, Yan J C, Zhang Y, Zhang T F, Guo Z, et al. SCRDet: Towards more robust detection for small, cluttered and rotated objects. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 8231−8240
    [16] Yang X, Yan J C, Feng Z N, He T. R3Det: Refined single-stage detector with feature refinement for rotating object. In: Proceedings of the 35th AAAI Conference on Artificial Intelligence. Virtual Event: 2021. 3163−3171
    [17] Xu Y C, Fu M T, Wang Q M, Wang Y K, Chen K, Xia G S, et al. Gliding vertex on the horizontal bounding box for multi-oriented object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, 43(4): 1452-1459
    [18] Wei H R, Zhang Y, Cheng Z H, Li H, Wang H Q, Sun X. Oriented objects as pairs of middle lines [Online], available: https://arxiv.org/abs/1912.10694, December 23, 2019
    [19] Li Y Y, Huang Q, Pei X, Jiao L C, Shang R H. RADet: Refine feature pyramid network and multi-layer attention network for arbitrary-oriented object detection of remote sensing images. Remote Sensing, 2020, 12(3): Article No. 389 doi: 10.3390/rs12030389
    [20] Wang J W, Ding J, Guo H W, Cheng W S, Pan T, Yang W. Mask OBB: A semantic attention-based mask oriented bounding box representation for multi-category object detection in aerial images. Remote Sensing, 2019, 11(24): Article No. 2930 doi: 10.3390/rs11242930
    [21] Xia G S, Bai X, Ding J, Zhu Z, Belongie S, Luo J B, et al. DOTA: A large-scale dataset for object detection in aerial images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 3974−3983
    [22] He K M, Zhang X Y, Ren S Q, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE, 2016. 770−778
    [23] Lin T Y, Dollár P, Girshick R, He K M, Hariharan B, Belongie S. Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 936−944
    [24] Yi J R, Wu P X, Metaxas D N. ASSD: Attentive single shot multibox detector. Computer Vision and Image Understanding, 2019, 189: Article No. 102827 doi: 10.1016/j.cviu.2019.102827
    [25] Zeiler M D, Krishnan D, Taylor G W, Fergus R. Deconvolutional networks. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, USA: IEEE, 2010. 2528−2535
    [26] Wang J Q, Chen K, Xu R, Liu Z W, Loy C C, Lin D. CARAFE: Content-aware reassembly of features. In: Proceedings of the IEEE/CVF International Conference on Computer Vision. Seoul, South Korea: IEEE, 2019. 3007−3016
    [27] Zhou P, Ni B B, Geng C, Hu J G, Xu Y. Scale-transferrable object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE, 2018. 528−537
    [28] Bridle J S. Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. Neurocomputing: Algorithms, Architectures and Applications, 1990: 227−236
    [29] He K M, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 2980−2988
    [30] Jiang B R, Luo R X, Mao J Y, Xiao T T, Jiang Y N. Acquisition of localization confidence for accurate object detection. In: Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: 2018. 816−832
    [31] Wu Y, Chen Y P, Yuan L, Liu Z C, Wang L J, Li H Z, et al. Rethinking classification and localization for object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA: IEEE, 2020. 10183−10192
    [32] Liu Y L, Jin L W. Deep matching prior network: Toward tighter multi-oriented text detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, USA: IEEE, 2017. 3454−3461
    [33] Dai J F, Qi H Z, Xiong Y W, Li Y, Zhang G D, Hu H, et al. Deformable convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017. 764−773
Publication history
  • Received:  2020-04-29
  • Accepted:  2020-09-07
  • Published online:  2023-01-06
  • Issue date:  2023-02-20
