A Dual Deep Network Based on the Improved YOLO for Fast Bridge Surface Defect Detection
Abstract: Surface defect detection is a critical step in ensuring bridge safety. However, bridge surface defects come in many types, vary widely in appearance, and often overlap one another, and existing algorithms cannot detect such multiple defects both quickly and accurately. To solve this problem, we improve YOLO (You only look once) and propose two networks, YOLO-lump and YOLO-crack, which together form a dual deep network for fast bridge surface defect detection. On the one hand, YOLO-lump detects lump defects on larger sliding-window sub-images. It introduces a hybrid dilated pyramid module, which combines hybrid dilated convolution with spatial pyramid pooling to extract sparsely expressed multi-scale features while avoiding the loss of local information caused by dilated convolution. On the other hand, YOLO-crack detects crack defects on smaller sliding-window sub-images. It introduces a downsampling attention module, which uses a 1×1 convolution and a 3×3 group convolution to decouple the cross-channel and spatial correlations of features, respectively, enhancing the foreground response of cracks in the downsampling stage and reducing the loss of spatial information. Experimental results show that the proposed algorithm improves the detection accuracy of bridge surface defects while achieving real-time detection.

Manuscript received March 6, 2021; accepted November 17, 2021. Supported by the National Natural Science Foundation of China (61771189, 62073126, 62027810), the Natural Science Foundation for Distinguished Young Scholars of Hunan Province (2020JJ2008), and the Science and Technology Progress and Innovation Plan of the Department of Transportation of Hunan Province (201734, 202138). Recommended by Associate Editor HU Qing-Hua.
1. College of Electrical and Information Engineering, Hunan University, Changsha 410082
2. National Engineering Research Center for Robot Visual Perception and Control Technology, Hunan University, Changsha 410082
3. Hunan Qiaokang Intelligent Technology Company Limited, Changsha 410021
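The hybrid dilated pyramid module in YOLO-lump builds on hybrid dilated convolution (HDC), whose point is to choose dilation rates for a stack of 3×3 dilated convolutions so that the enlarged receptive field has no untouched "gridding" holes. A minimal 1-D sketch of that coverage check (function names and example rates are illustrative, not from the paper):

```python
def covered_offsets(rates):
    """Input offsets reachable by stacking 1-D dilated 3-tap convolutions
    with the given dilation rates."""
    offsets = {0}
    for r in rates:
        offsets = {o + t * r for o in offsets for t in (-1, 0, 1)}
    return offsets

def has_gridding(rates):
    """True if the receptive field has holes: some positions inside its
    extent are never touched by any tap."""
    cov = covered_offsets(rates)
    lo, hi = min(cov), max(cov)
    return any(o not in cov for o in range(lo, hi + 1))
```

A constant rate such as [2, 2, 2] only ever touches even offsets and so exhibits gridding, while an HDC-style increasing sequence such as [1, 2, 3] covers its receptive field densely.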
Table 1 Dataset of the bridge surface images

Collection date | Bridge name | Images | Data size
2018-09 | Donglinlu Bridge | 2126 | 6.4 GB
2018-09 | Hongqi No. 2 Bridge | 10470 | 31.4 GB
2018-09 | Huangtangting Bridge | 6090 | 18.2 GB
2018-09 | Majiahe Bridge | 1402 | 4.2 GB
2018-10 | Nanchuanhe Bridge | 4055 | 13.6 GB
2018-10 | Ningjiachong Bridge | 17119 | 48.4 GB
2018-10 | Tianma Bridge | 3614 | 11.9 GB
2018-11 | Tongling Yangtze River Bridge | 19961 | 116.0 GB
2019-07 | Xinqing Bridge | 25784 | 225.5 GB
2019-04 | Guangdong Chaoshan Bridge | 79000 | 317.1 GB
Total | 10 bridges | 169621 | 792.7 GB
Table 2 Training/validation/testing datasets
Defect type | Training set (pos/neg) | Validation set (pos/neg) | Test set (pos/neg)
Lump defects | 7668 (2611/5057) | 2924 (978/1946) | 51231 (345/50886)
Crack defects | 5643 (3283/2360) | 1453 (873/580) | 51079 (193/50886)
Table 3 Results of lump defect detection with different input sizes
Input size | Recall | Precision | F1 | mAP | Detection time
704 × 704 | 80.6% | 79.3% | 79.9% | 85.6% | 37.8 ms
608 × 608 | 86.5% | 83.9% | 85.2% | 89.2% | 24.7 ms
512 × 512 | 85.8% | 84.5% | 85.1% | 88.6% | 18.8 ms
416 × 416 | 80.7% | 77.4% | 79.0% | 82.1% | 16.6 ms
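The F1 column in the table above is the harmonic mean of recall and precision; a one-line helper (the naming is mine) reproduces, for example, the 608 × 608 row:

```python
def f1_score(recall, precision):
    """Harmonic mean of recall and precision, as used in the F1 columns."""
    return 2 * recall * precision / (recall + precision)
```

For the 608 × 608 input, f1_score(0.865, 0.839) evaluates to about 0.852, matching the reported 85.2%.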
Table 4 Ablation experiment on the YOLO-lump
Model | Recall | Precision | F1 | mAP | Detection time
YOLOv4 | 85.8% | 84.5% | 85.1% | 88.6% | 18.8 ms
YOLO-lump-A | 86.1% | 84.8% | 85.4% | 89.3% | 18.8 ms
YOLO-lump-B | 87.2% | 84.4% | 85.8% | 90.7% | 18.8 ms
YOLO-lump-C | 79.3% | 68.1% | 73.3% | 74.9% | 26.7 ms
YOLO-lump-D | 84.4% | 83.3% | 83.8% | 87.7% | 20.1 ms
YOLO-lump | 86.4% | 89.7% | 88.0% | 92.7% | 20.4 ms
YOLO-lump-E | 88.7% | 89.5% | 89.1% | 93.5% | 24.3 ms
Table 5 Comparison of different detectors on the lump dataset
Model | Backbone | mAP | Detection time
SSD | VGG-16 | 85.1% | 30.3 ms
Faster-RCNN | ResNet-101 | 86.9% | 34.9 ms
RetinaNet | ResNet-101 | 89.5% | 41.5 ms
FCOS | ResNet-101 | 87.9% | 28.8 ms
EfficientDet | EfficientNet | 89.6% | 22.3 ms
YOLOv3 | Darknet-53 | 87.6% | 15.4 ms
Improved-YOLOv3 | Darknet-53 | 89.3% | 15.4 ms
YOLOv4 | CSPDarknet-53 | 88.6% | 18.8 ms
YOLO-lump | CSPDarknet-53 | 92.7% | 20.4 ms
Table 6 Ablation experiment on the YOLO-crack
Model | Recall | Precision | F1 | mAP | Detection time
YOLOv4 | 80.8% | 79.4% | 80.2% | 84.5% | 29.7 ms
YOLO-crack-A | 77.5% | 82.6% | 80.0% | 85.0% | 29.7 ms
YOLO-crack-B | 85.6% | 76.7% | 81.0% | 85.7% | 29.7 ms
YOLO-crack-C | 78.7% | 79.1% | 78.9% | 83.8% | 17.1 ms
YOLO-crack-D | 79.0% | 79.5% | 79.2% | 84.6% | 16.5 ms
YOLO-crack | 80.2% | 81.2% | 80.7% | 86.2% | 17.6 ms
YOLO-crack-E | 77.9% | 80.9% | 79.4% | 82.5% | 17.7 ms
Table 7 Comparison of different attention modules
Model | mAP | Detection time
YOLO-crack-D | 84.6% | 16.5 ms
YOLO-crack-D + SE attention module | 84.9% | 16.9 ms
YOLO-crack-D + CBAM attention module | 85.7% | 17.4 ms
YOLO-crack-D + downsampling attention module | 86.2% | 17.6 ms
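The downsampling attention module compared above uses a 1×1 convolution for cross-channel correlation and a 3×3 group convolution for spatial correlation within channel groups. The module's exact wiring is not given in this excerpt, so the following NumPy sketch is a toy reconstruction under my own assumptions (a sigmoid gate applied to strided features; all names are mine):

```python
import numpy as np

def pointwise_conv(x, w):
    """1x1 convolution: mixes channels only. x: (C_in, H, W), w: (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def group_conv3x3_s2(x, w, groups):
    """3x3 group convolution, stride 2, zero padding 1.
    x: (C, H, W), w: (C, C // groups, 3, 3)."""
    C, H, W = x.shape
    cg = C // groups
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    Ho, Wo = (H + 1) // 2, (W + 1) // 2
    out = np.zeros((C, Ho, Wo))
    for o in range(C):                      # each output channel sees only its own group
        g = o // cg
        xg = xp[g * cg:(g + 1) * cg]
        for i in range(Ho):
            for j in range(Wo):
                out[o, i, j] = np.sum(xg[:, 2 * i:2 * i + 3, 2 * j:2 * j + 3] * w[o])
    return out

def downsample_attention(x, w_pw, w_gc, groups=4):
    """Hypothetical wiring: channel mixing, grouped spatial conv, then a
    sigmoid gate multiplying a strided copy of the input features."""
    y = pointwise_conv(x, w_pw)                                    # channel correlation
    a = 1.0 / (1.0 + np.exp(-group_conv3x3_s2(y, w_gc, groups)))   # spatial correlation
    return a * x[:, ::2, ::2]                                      # gated downsampling
```

With the gate near 1 on crack pixels and near 0 elsewhere, this would boost the foreground response while halving spatial resolution, which is the stated intent of the module.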
Table 8 Comparison of different detectors on the crack dataset
Model | Backbone | mAP | Detection time
SSD | VGG-16 | 79.8% | 45.2 ms
Faster-RCNN | ResNet-101 | 81.2% | 54.7 ms
RetinaNet | ResNet-101 | 82.9% | 58.4 ms
FCOS | ResNet-101 | 83.4% | 42.9 ms
EfficientDet | EfficientNet | 83.5% | 27.4 ms
YOLOv3 | Darknet-53 | 82.3% | 23.8 ms
Improved-YOLOv3 | Darknet-53 | 84.1% | 23.8 ms
YOLOv4 | CSPDarknet-53 | 84.5% | 29.7 ms
YOLO-crack | CSPDarknet-39 | 86.2% | 17.6 ms
Table 9 Results of the practical application
Test dataset | Images | Lump detection (GT/TP/FN/FP, Recall) | Crack detection (GT/TP/FN/FP, Recall) | Detection time
Donglinlu Bridge | 872 | 8/8/0/907, 100% | 6/5/1/1478, 83.3% | 995 ms/image
Hongqi No. 2 Bridge | 3265 | 26/25/1/2132, 96.2% | 17/17/0/13582, 100% | 995 ms/image
Huangtangting Bridge | 2929 | 22/19/3/3295, 86.4% | 26/25/1/10115, 96.2% | 994 ms/image
Majiahe Bridge | 836 | 11/9/2/1041, 81.8% | 7/7/0/3569, 100% | 993 ms/image
Nanchuanhe Bridge | 2617 | 20/20/0/4238, 100% | 23/21/2/5331, 91.3% | 996 ms/image
Ningjiachong Bridge | 2453 | 28/27/1/3145, 96.4% | 28/26/2/9504, 92.9% | 997 ms/image
Tianma Bridge | 7107 | 65/62/3/7294, 95.4% | 90/86/4/22383, 95.6% | 996 ms/image
Tongling Yangtze River Bridge | 5962 | 46/45/1/8505, 97.8% | 57/55/2/26237, 96.5% | 996 ms/image
Xinqing Bridge | 6194 | 63/61/2/5598, 96.8% | 46/45/1/15394, 97.8% | 995 ms/image
Guangdong Chaoshan Bridge | 19189 | 130/124/6/19869, 95.4% | 186/179/7/41908, 96.2% | 995 ms/image
Total | 51424 | 419/400/19/56024, 95.5% | 486/466/20/149501, 95.9% | 995 ms/image
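The recall figures in Table 9 follow directly from the GT and TP counts (recall = TP / GT), and the totals row pools the per-bridge counts before dividing rather than averaging the per-bridge recalls. Small helpers (naming mine) make the two computations explicit:

```python
def recall(tp, gt):
    """Fraction of ground-truth defects that were found."""
    return tp / gt

def pooled_recall(rows):
    """Overall recall from per-bridge (gt, tp) pairs: pool counts, then divide."""
    total_gt = sum(gt for gt, _ in rows)
    total_tp = sum(tp for _, tp in rows)
    return total_tp / total_gt
```

For the lump-defect totals, the pooled counts give recall(400, 419) ≈ 95.5%, matching the last row of the table.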
Table 10 Comparison of performance between the dual deep network and the single network
Strategy | Honeycomb (Recall/Precision/F1/mAP) | Exposed rebar (Recall/Precision/F1/mAP) | Hole (Recall/Precision/F1/mAP) | Crack (Recall/Precision/F1/mAP) | Detection time
Dual network | 85.2%/84.6%/84.9%/86.7% | 87.1%/86.5%/86.8%/89.8% | 87.3%/85.2%/86.2%/89.3% | 80.8%/79.4%/80.1%/84.5% | 34.8 ms
Single network | 78.5%/77.2%/77.8%/80.6% | 83.8%/84.3%/84.0%/84.4% | 84.4%/83.1%/83.7%/85.0% | 78.7%/76.8%/77.7%/80.1% | 30.3 ms
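The dual-network strategy compared above runs YOLO-lump on larger sliding-window sub-images and YOLO-crack on smaller ones. A sketch of how the two window grids over a bridge image might be generated (window sizes, strides, and function names are illustrative; the paper's actual values are not in this excerpt):

```python
def window_starts(size, win, stride):
    """Start offsets along one axis, keeping the last window flush with the border."""
    if size <= win:
        return [0]
    starts = list(range(0, size - win + 1, stride))
    if starts[-1] != size - win:
        starts.append(size - win)
    return starts

def dual_window_grids(width, height, lump_win=608, crack_win=416):
    """Top-left corners of the sub-images fed to YOLO-lump and YOLO-crack.
    Non-overlapping windows (stride == window size) are assumed here."""
    def grid(win):
        return [(x, y) for y in window_starts(height, win, win)
                       for x in window_starts(width, win, win)]
    return grid(lump_win), grid(crack_win)
```

Because the crack grid is denser, it preserves fine structure for thin cracks, while the coarser lump grid gives each sub-image enough context for large block-shaped defects.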