Research Progress of Optical Remote Sensing Image Object Detection Based on Deep Learning
-
Abstract: Optical remote sensing image object detection (ORSIOD) is a fundamental but challenging problem in aerial and satellite image analysis, and it has received extensive attention in recent years. This paper reviews the current state of research on deep learning based ORSIOD from the following aspects. First, the main difficulties of object detection in optical remote sensing images are introduced. Next, existing deep learning based object detection algorithms are summarized, and, taking the difficulties of ORSIOD as the organizing thread, the advantages and disadvantages of different deep learning based ORSIOD methods are analyzed and compared. Finally, future development trends are analyzed in detail.

1) An algorithm that has not been formally published; see: Adam V. You only look twice: Rapid multi-scale object detection in satellite imagery. arXiv:1805.09512v1 [cs.CV]. 2018.
-
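Table 1 Comparison of different methods to solve the problem of high image resolution

| Problem addressed | Source | Method | Advantages | Disadvantages / difficulties |
|---|---|---|---|---|
| Image resolution too high | [21-27] | Cut the large image into small patches | Lets the network process each patch, which improves detection | Objects at patch edges are cut apart |
| | [28] | Cut with a certain overlap ratio | Avoids cutting objects at patch edges | Introduces too much redundant information |
| | [29] | Cut with a certain overlap ratio, then detect with RCNN | The first-stage network of RCNN can filter out redundant background information | The first-stage network is still affected by the redundant information |
| | YOLT | After sliding-window patch extraction, apply non-maximum suppression to prevent multiple detections in the overlapping parts | Reduces duplicate detections in overlap regions | Does not solve the slow detection speed caused by the sliding-window approach |
| | [30] | Use a fully convolutional network to map the large image to a smaller feature map, where each pixel corresponds to a fixed-size bounding box | Can process the whole large image directly | Can only handle relatively large objects |

Note: YOLT is an algorithm that has not been formally published; see: Adam V. You only look twice: Rapid multi-scale object detection in satellite imagery. arXiv:1805.09512v1 [cs.CV]. 2018.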
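The overlapped cutting and non-maximum suppression rows of Table 1 can be illustrated with a small sketch. It is not the code of any cited paper: the function names (`tile_origins`, `to_global`, `nms`), the tile size, the overlap, and the IoU threshold are all hypothetical choices made for illustration, and only one image axis is tiled to keep the example short.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of tiles of size `tile` along an axis of size `length`,
    with consecutive tiles sharing `overlap` pixels (hypothetical helper)."""
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile")
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the image border
    return starts


def to_global(det, x0, y0):
    """Shift a tile-local detection (x1, y1, x2, y2, score) into image coordinates."""
    x1, y1, x2, y2, score = det
    return (x1 + x0, y1 + y0, x2 + x0, y2 + y0, score)


def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thresh=0.5):
    """Greedy non-maximum suppression: keep the highest-scoring box,
    drop any later box that overlaps a kept box by more than iou_thresh."""
    kept = []
    for d in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(d, k) <= iou_thresh for k in kept):
            kept.append(d)
    return kept


# A 1000-pixel-wide image cut into 400-pixel tiles with 100 pixels of overlap.
starts = tile_origins(1000, 400, 100)  # [0, 300, 600]

# The same object falls in the overlap of the first two tiles and is found twice.
dets = [
    to_global((350, 50, 390, 90, 0.9), starts[0], 0),
    to_global((50, 50, 90, 90, 0.8), starts[1], 0),
]
merged = nms(dets)  # only the 0.9-score detection survives
```

The sketch also shows the drawback noted in the table: every tile still has to be run through the detector, so suppression removes duplicates but does not reduce the number of forward passes.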
Table 2 Comparison of different methods to solve the problem of too few object pixels
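| Problem addressed | Source | Method | Advantages | Disadvantages / difficulties |
|---|---|---|---|---|
| Objects too small | YOLT | Increase the number of grid cells in the YOLOv3 network | Improves small-object detection | Reduces detection speed |
| | [31] | Add a deconvolution layer | Enlarges small objects | Cannot enlarge small objects that have already vanished before the deconvolution layer |
| | [25] | Use deconvolution layers to combine information from different layers | Handles both large and small objects | Shallow layers introduce too much noise |
| | [24] | Use a balance coefficient to reduce the noise in shallow layers | Improves small-object detection while reducing the influence of the background | The balance coefficient is hard to define |
| | [32] | Add dilated convolution to the YOLOv2 network | Enlarges the receptive field while reducing parameters | Dilated convolution loses local information |
| | [33] | Use a pixel-level attention mechanism | Compensates for the shortcomings of dilated convolution | The attention is obtained through several pooling layers and is therefore insensitive to small objects |

Note: YOLT is an algorithm that has not been formally published; see: Adam V. You only look twice: Rapid multi-scale object detection in satellite imagery. arXiv:1805.09512v1 [cs.CV]. 2018.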
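The trade-off listed for dilated convolution in Table 2 (larger receptive field, but loss of local information) can be seen in a one-dimensional toy example. This is a minimal sketch in plain Python, not the layer used in [32]; the helper names `effective_kernel` and `dilated_conv1d` are hypothetical.

```python
def effective_kernel(k, d):
    """Number of input positions spanned by a k-tap kernel with dilation d."""
    return k + (k - 1) * (d - 1)


def dilated_conv1d(x, w, dilation=1):
    """'Valid' 1-D correlation whose taps are spaced `dilation` samples apart."""
    span = effective_kernel(len(w), dilation)
    return [
        sum(w[j] * x[i + j * dilation] for j in range(len(w)))
        for i in range(len(x) - span + 1)
    ]


x = [1, 2, 3, 4, 5, 6, 7]
w = [1, 1, 1]

dense = dilated_conv1d(x, w, dilation=1)    # each output sees 3 adjacent samples
dilated = dilated_conv1d(x, w, dilation=2)  # each output spans 5 samples: [9, 12, 15]

# Same three weights, wider span, but the samples between the taps are skipped:
# changing x[1] leaves the first dilated output untouched.
x_changed = list(x)
x_changed[1] = 100
unchanged_first = dilated_conv1d(x_changed, w, dilation=2)[0]  # still 9
```

With the same three weights the span grows from 3 to 5 input samples, which matches the "larger receptive field without extra parameters" entry, while the skipped samples correspond to the lost local information.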
Table 3 Comparison of different methods to solve the problem of object direction change
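| Problem addressed | Source | Method | Advantages | Disadvantages / difficulties |
|---|---|---|---|---|
| Object orientation changes | [22-26, 34] | Rotation data augmentation at several different angles | Easy to implement | Limited effect |
| | [35] | Data augmentation with homography transformations | Works better than ordinary rotation augmentation | Still a form of data augmentation, so the effect is limited |
| | [36] | Add a rotation-invariant layer and introduce a regularization constraint | Addresses the problem by strengthening the network itself | The regularization term is hard to define |
| | [37] | Add multi-angle anchors at the prediction stage | Localizes objects with varying orientation more accurately | The anchor angles are fixed and do not adapt well to real cases |
| | [38] | Use fully connected layers to enhance rotation invariance | Can handle orientation changes | Fully connected layers fix the input size of the network |
| | [25] | Add an orientation prediction branch | Bounding boxes can localize objects at an angle | Does not address how the network itself handles orientation changes |
| | [39] | Use rotated ROI pooling | Solves the background noise introduced by ordinary ROI pooling | Does not address how the network itself handles orientation changes |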
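To make the multi-angle anchor and oriented bounding box rows of Table 3 more concrete, the following sketch generates anchors at a fixed set of angles and converts one of them to corner points. It is an illustration under stated assumptions, not the formulation of [37] or [25]: the (cx, cy, w, h, angle) parameterization, the angle set, and both function names are hypothetical.

```python
import math


def obb_corners(cx, cy, w, h, theta_deg):
    """Corner points of a w-by-h box centred at (cx, cy) and rotated by
    theta_deg about its centre. The rotation is counter-clockwise in
    mathematical axes (y up); in image axes (y down) it appears clockwise."""
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


def multi_angle_anchors(cx, cy, w, h, angles=(0, 30, 60, 90, 120, 150)):
    """One anchor per angle at the same location and size.
    The angle set is an assumed example, not taken from the cited work."""
    return [(cx, cy, w, h, a) for a in angles]


anchors = multi_angle_anchors(0, 0, 4, 2)

# Corners of the 90-degree anchor: the 4x2 box becomes a 2x4 box.
corners_90 = [(round(x, 6), round(y, 6)) for x, y in obb_corners(*anchors[3])]
```

Because the angles come from a fixed list, an object lying between two of them (for example at 45 degrees here) is only approximated, which is the limitation the table notes for fixed-angle anchors.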
Table 4 Comparison of different methods to solve the problem of object size change
Table 5 Comparison of different methods to solve the problem of densely arranged objects
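| Problem addressed | Source | Method | Advantages | Disadvantages / difficulties |
|---|---|---|---|---|
| Densely arranged objects | [22, 39] | Predict oriented bounding boxes | Solves the localization of dense objects | Does not solve the difficulty of extracting features of densely arranged objects |
| | [41] | Use a local re-identification mechanism that runs the network for repeated detection | Reduces the number of objects missed because of dense arrangement | Not effective enough for heavily packed objects |
| | YOLT | Use upsampling to enlarge the gaps between dense objects | Separates densely arranged objects | Increases the image resolution |

Note: YOLT is an algorithm that has not been formally published; see: Adam V. You only look twice: Rapid multi-scale object detection in satellite imagery. arXiv:1805.09512v1 [cs.CV]. 2018.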
Table 6 Comparison of different methods for solving complex background problems
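| Problem addressed | Source | Method | Advantages | Disadvantages / difficulties |
|---|---|---|---|---|
| Complex background | [42] | Use semantic segmentation to separate sea from land | Achieves sea-land separation and avoids the influence of the land background | Densely docked ships have features similar to land and are hard to segment |
| | [23] | Train with a large amount of land imagery containing no objects as negative samples, so that the network performs sea-land separation implicitly during detection | Needs no separate sea-land separation step and reduces the influence of the land background on nearshore ship detection | The quality of the sea-land separation is easily affected by the chosen background samples |
| | [43] | Use a multi-scale visual attention mechanism | Reduces the influence of complex backgrounds when predicting objects of different sizes | The multi-scale design increases the computation of the network |
| | [44] | Add position-sensitive score map prediction to the Faster R-CNN network | Determines the object class by combining predictions at different local positions | Background classes with high similarity are hard to distinguish |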
Table 7 Comparison of different methods to solve the problem of insufficient samples
-
[1] Lin Yu-Dong. Target Detection in Optical Remote Sensing Images with Complicated Background [Ph.D. dissertation], Southwest Jiaotong University, China, 2017.
[2] Liu Huan. Research on Detection Algorithm for Specific Object in High-Resolution Optical Remote Sensing Images [Master thesis], Harbin Institute of Technology, China, 2016.
[3] Cao Xiao-Ming. Vehicle Detection based on a Multi-Channel Image Feature Pyramid [Master thesis], Beijing Jiaotong University, China, 2016.
[4] Zhang Gui-Mei, Zhang Song, Chu Jun. A new object detection algorithm using local contour features. Acta Automatica Sinica, 2014, 40(10): 2346-2355
[5] Yin Hong-Peng, Chen Bo, Chai Yi, Liu Zhao-Dong. Vision-based object detection and tracking: A review. Acta Automatica Sinica, 2016, 42(10): 1466-1489
[6] Ju Yu-Cui. Research on Key Algorithm of Object Detection and Tracking based on Visual [Master thesis], Tianjin University of Technology, China, 2014.
[7] Wang Yan-Qing, Ma Lei, Tian Yuan. State-of-the-art of ship detection and recognition in optical remotely sensed imagery. Acta Automatica Sinica, 2011, 37(9): 1029-1039
[8] Girshick R, Donahue J, Darrell T, Malik J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(1): 142-158 doi: 10.1109/TPAMI.2015.2437384
[9] Zhang Hao-Kui, Li Ying, Jiang Ye-Nan. Deep learning for hyperspectral imagery classification: The state of the art and prospects. Acta Automatica Sinica, 2018, 44(6): 961-977
[10] Chen Yu-Si. Research on Moving Objects Detection and Tracking based on Background Subtraction with Illumination Robustness [Master thesis], Southwest Jiaotong University, China, 2011.
[11] Wang Bin. Research and Implementation of Object Detection System Based on Optimized ViBE and HOG [Master thesis], Shenyang University of Technology, China, 2016.
[12] Felzenszwalb P, McAllester D, Ramanan D. A discriminatively trained, multiscale, deformable part model. In: Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition. Anchorage, AK, USA: IEEE, 2008. 1-8
[13] Sarikaya D, Corso J J, Guru K A. Detection and localization of robotic tools in robot-assisted surgery videos using deep neural networks for region proposal and detection. IEEE Transactions on Medical Imaging, 2017, 36(7): 1542-1549
[14] Girshick R. Fast R-CNN. In: Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV). Santiago, Chile: IEEE, 2015. 1440-1448
[15] Ren S Q, He K M, Girshick R, Sun J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(6): 1137-1149 doi: 10.1109/TPAMI.2016.2577031
[16] He K M, Gkioxari G, Dollár P, Girshick R. Mask R-CNN. In: Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017. 2980-2988
[17] Redmon J, Divvala S, Girshick R, Farhadi A. You only look once: Unified, real-time object detection. In: Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016. 779-788
[18] Redmon J, Farhadi A. YOLO9000: Better, faster, stronger. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: IEEE, 2017. 6517-6525
[19] Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, et al. Going deeper with convolutions. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, MA, USA: IEEE, 2015. 1-9
[20] Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y, et al. SSD: Single shot MultiBox detector. In: Proceedings of Computer Vision – ECCV 2016. Amsterdam, The Netherlands: Springer, 2016. 21-37
[21] Yan Z G, Song X, Zhong H Y, Zhu X Z. Object detection in optical remote sensing images based on transfer learning convolutional neural networks. In: Proceedings of the 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS). Nanjing, China: IEEE, 2018. 935-942
[22] Yang X, Sun H, Sun X, Yan M L, Guo Z, Fu K. Position detection and direction prediction for arbitrary-oriented ships via multitask rotation region convolutional neural network. IEEE Access, 2018, 6: 50839-50849 doi: 10.1109/ACCESS.2018.2869884
[23] He Y Q, Sun X, Gao L R, Zhang B. Ship detection without sea-land segmentation for large-scale high-resolution optical satellite images. In: Proceedings of the 2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia, Spain: IEEE, 2018. 717-720
[24] Fu Y M, Wu F G, Zhao J S. Context-aware and depthwise-based detection on orbit for remote sensing image. In: Proceedings of the 24th International Conference on Pattern Recognition (ICPR). Beijing, China: IEEE, 2018. 1725-1730
[25] Li M J, Guo W W, Zhang Z H, Yu W X, Zhang T. Rotated region based fully convolutional network for ship detection. In: Proceedings of the 2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia, Spain: IEEE, 2018. 673-676
[26] Schilling H, Bulatov D, Niessner R, Middelmann W, Soergel U. Detection of vehicles in multisensor data via multibranch convolutional neural networks. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2018, 11(11): 4299-4316 doi: 10.1109/JSTARS.2018.2825099
[27] Li X B, Wang S J. Object detection using convolutional neural networks in a coarse-to-fine manner. IEEE Geoscience and Remote Sensing Letters, 2017, 14(11): 2037-2041 doi: 10.1109/LGRS.2017.2749478
[28] Wang C, Bai X, Wang S, Zhou J, Ren P. Multiscale visual attention networks for object detection in VHR remote sensing images. IEEE Geoscience and Remote Sensing Letters, 2019, 16(2): 310-314 doi: 10.1109/LGRS.2018.2872355
[29] Pang J M, Li C, Shi J P, Xu Z H, Feng H J. R2-CNN: Fast tiny object detection in large-scale remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(8): 5512-5524 doi: 10.1109/TGRS.2019.2899955
[30] Zhang F, Du B, Zhang L P, Xu M Z. Weakly supervised learning based on coupled convolutional neural networks for aircraft detection. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(9): 5553-5563 doi: 10.1109/TGRS.2016.2569141
[31] Zhang W, Wang S H, Thachan S, Chen J Z, Qian Y T. Deconv R-CNN for small object detection on remote sensing images. In: Proceedings of the 2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia, Spain: IEEE, 2018. 2483-2486
[32] Liu W C, Ma L, Wang J, Chen H. Detection of multiclass objects in optical remote sensing images. IEEE Geoscience and Remote Sensing Letters, 2019, 16(5): 791-795 doi: 10.1109/LGRS.2018.2882778
[33] Ying X, Wang Q, Li X W, Yu M, Jiang H, Gao J, et al. Multi-attention object detection model in remote sensing images based on multi-scale. IEEE Access, 2019, 7: 94508-94519 doi: 10.1109/ACCESS.2019.2928522
[34] Deng Z P, Sun H, Zhou S L, Zhao J P, Lei L, Zou H X. Fast multiclass object detection in optical remote sensing images using region based convolutional neural networks. In: Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS). Fort Worth, TX, USA: IEEE, 2017. 858-861
[35] Ji H, Gao Z, Mei T C, Li Y F. Improved faster R-CNN with multiscale feature fusion and homography augmentation for vehicle detection in remote sensing images. IEEE Geoscience and Remote Sensing Letters, 2019, 16(11): 1761-1765
[36] Cheng G, Zhou P C, Han J W. Learning rotation-invariant convolutional neural networks for object detection in VHR optical remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 2016, 54(12): 7405-7415 doi: 10.1109/TGRS.2016.2601622
[37] Li K, Cheng G, Bu S H, You X. Rotation-insensitive and context-augmented object detection in remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(4): 2337-2348 doi: 10.1109/TGRS.2017.2778300
[38] Zhang Y L, Yuan Y, Feng Y C, Lu X Q. Hierarchical and robust convolutional neural network for very high-resolution remote sensing object detection. IEEE Transactions on Geoscience and Remote Sensing, 2019, 57(8): 5535-5548 doi: 10.1109/TGRS.2019.2900302
[39] Liu Z K, Hu J G, Weng L B, Yang Y P. Rotated region based CNN for ship detection. In: Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP). Beijing, China: IEEE, 2017. 900-904
[40] Zhang S, He G H, Chen H B, Jing N F, Wang Q. Scale adaptive proposal network for object detection in remote sensing images. IEEE Geoscience and Remote Sensing Letters, 2019, 16(6): 864-868 doi: 10.1109/LGRS.2018.2888887
[41] Zhao W, Ma W P, Jiao L C, Chen P H, Yang S Y, Hou B. Multi-scale image block-level F-CNN for remote sensing images object detection. IEEE Access, 2019, 7: 43607-43621 doi: 10.1109/ACCESS.2019.2908016
[42] Zhang Y K, You Y, Wang R, Liu F, Liu J. Nearshore vessel detection based on scene-mask R-CNN in remote sensing image. In: Proceedings of the 2018 International Conference on Network Infrastructure and Digital Content (IC-NIDC). Guiyang, China: IEEE, 2018. 76-80
[43] Li Q P, Mou L C, Jiang K Y, Liu Q J, Wang Y H, Zhu X X. Hierarchical region based convolution neural network for multiscale object detection in remote sensing images. In: Proceedings of the 2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia, Spain: IEEE, 2018. 4355-4358
[44] Xie H N, Wang T, Qiao M N, Zhang M Y, Shan G C, Snoussi H. Robust object detection for tiny and dense targets in VHR aerial images. In: Proceedings of the 2017 Chinese Automation Congress (CAC). Ji'nan, China: IEEE, 2017. 6397-6401
[45] Wu Z H, Gao Y M, Li L, Fan J L. Research on object detection technique in high resolution remote sensing images based on U-Net. In: Proceedings of the 2018 Chinese Control and Decision Conference (CCDC). Shenyang, China: IEEE, 2018. 2849-2853
[46] Cao Y S, Niu X, Dou Y. Region-based convolutional neural networks for object detection in very high resolution remote sensing images. In: Proceedings of the 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD). Changsha, China: IEEE, 2016. 548-554
[47] Chen G W, Liu L, Hu W L, Pan Z X. Semi-supervised object detection in remote sensing images using generative adversarial networks. In: Proceedings of the 2018 IEEE International Geoscience and Remote Sensing Symposium. Valencia, Spain: IEEE, 2018. 2503-2506