Bone Ultrasound Segmentation Network Based on Sequential Attention and Local Phase Guidance
Abstract: In ultrasound-assisted orthopaedic surgical navigation, bone structures must be segmented accurately from the acquired ultrasound image sequence and displayed to the surgeon to support intraoperative decision-making. However, imaging noise, shadow artifacts, and blurred bone boundaries make it difficult to segment bone structures accurately from ultrasound image sequences. To address this problem, this paper proposes a bone ultrasound image segmentation network based on sequential attention and local phase guidance. On the one hand, the network adaptively exploits the relationship between the frames of an ultrasound sequence, that is, sequential attention, to assist the semantic segmentation of bone structures. On the other hand, a local phase guidance module highlights bone edge information and further improves segmentation accuracy. Cross-validation experiments, ablation experiments, and comparisons with state-of-the-art ultrasound bone segmentation methods were carried out on a bone ultrasound dataset of 19 050 images. The experimental results show that the proposed method segments bone structures with high accuracy and outperforms the existing ultrasound bone segmentation methods.
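The idea of weighting neighbouring frames by their relevance to the current frame can be illustrated with a small sketch. This is not the paper's module, whose internals are not given in this excerpt: it assumes plain scaled dot-product attention over per-frame feature vectors, and the names `sequence_attention`, `current`, and `neighbours` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sequence_attention(current, neighbours):
    """Fuse neighbouring-frame features into the current frame (assumed form).

    current:    feature vector of the frame being segmented
    neighbours: feature vectors of adjacent frames in the sequence
    Returns (fused_feature, attention_weights).
    """
    scale = math.sqrt(len(current))
    # Similarity of each neighbouring frame to the current frame.
    scores = [sum(c * n for c, n in zip(current, nb)) / scale for nb in neighbours]
    weights = softmax(scores)
    # Attention-weighted sum of neighbour features, added as a residual.
    fused = list(current)
    for w, nb in zip(weights, neighbours):
        for i, value in enumerate(nb):
            fused[i] += w * value
    return fused, weights


# Toy features: the first neighbour matches the current frame most closely.
current = [1.0, 0.0, 1.0]
neighbours = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.8]]
fused, weights = sequence_attention(current, neighbours)
print(weights)
```

In this toy case the frame most similar to the current one receives the largest weight, which is the behaviour "adaptive" use of inter-frame relationships suggests.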
Fig. 1 Framework of the bone ultrasound segmentation network based on sequential attention and local phase guidance. ConvA denotes a convolution with a 1×1 kernel and a stride of 1; ConvB denotes a convolution with a 3×3 kernel and a stride of 1
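To make the two convolution settings concrete, the standard output-size formula for a convolution can be evaluated for both kernels. The input size of 256 and the padding values are assumptions for illustration only; the caption states the kernel size and stride but not the padding.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


INPUT_SIZE = 256  # hypothetical feature-map width, not taken from the paper

# ConvA: 1x1 kernel, stride 1. Spatial size is preserved without padding.
conv_a = conv_output_size(INPUT_SIZE, kernel=1, stride=1)

# ConvB: 3x3 kernel, stride 1. Without padding the map shrinks by 2;
# a padding of 1 (an assumption here) would preserve the size.
conv_b_no_pad = conv_output_size(INPUT_SIZE, kernel=3, stride=1)
conv_b_pad_1 = conv_output_size(INPUT_SIZE, kernel=3, stride=1, padding=1)

print(conv_a, conv_b_no_pad, conv_b_pad_1)
```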
Fig. 5 Bone segmentation results obtained with the proposed segmentation network (Row 1: bone ultrasound images to be segmented; Row 2: bone structures manually delineated by experts; Row 3: bone structures segmented automatically by the proposed method)
Fig. 7 Example ablation results for the local phase guidance module (Panel 1: ultrasound image frame; Panel 2: manually delineated bone structures; Panel 3: segmentation result of the model with the local phase guidance module; Panel 4: segmentation result of the model without the local phase guidance module)
Table 1 Information on the ultrasound image sequence dataset collected in this study

Volunteer ID    Number of frames    Image resolution (mm/pixel)
1               1 900               0.19 ~ 0.21
2               2 010               0.17 ~ 0.21
3               1 740               0.19 ~ 0.21
4               1 870               0.21 ~ 0.23
5               2 010               0.19 ~ 0.23
6               1 980               0.17 ~ 0.21
7               1 560               0.18 ~ 0.23
8               1 980               0.19 ~ 0.21
9               1 900               0.21 ~ 0.23
10              2 100               0.17 ~ 0.23
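As a quick consistency check, the per-volunteer frame counts in Table 1 can be tallied against the 19 050 images quoted in the abstract. The split shown is only a sketch: it assumes that experiment Exp_K holds out volunteer K for testing and trains on the other nine, which this excerpt does not state explicitly. The function name `leave_one_volunteer_out` is hypothetical.

```python
# Frame counts per volunteer, copied from Table 1.
FRAMES = {1: 1900, 2: 2010, 3: 1740, 4: 1870, 5: 2010,
          6: 1980, 7: 1560, 8: 1980, 9: 1900, 10: 2100}


def leave_one_volunteer_out(frames):
    """Yield (held_out_id, n_train, n_test) for each volunteer.

    Assumption: one volunteer is held out per experiment.
    """
    total = sum(frames.values())
    for held_out, n_test in frames.items():
        yield held_out, total - n_test, n_test


total_frames = sum(FRAMES.values())
splits = list(leave_one_volunteer_out(FRAMES))

print("total frames:", total_frames)
for held_out, n_train, n_test in splits:
    print(f"Exp_{held_out}: train on {n_train} frames, test on {n_test} frames")
```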
Table 2 Segmentation results of the proposed network on the ultrasound image sequences acquired from ten volunteers (IoU: intersection over union; AED: average Euclidean distance)

Experiment    IoU            AED
Exp_1         0.91 ± 0.08    0.41 ± 0.06
Exp_2         0.90 ± 0.08    0.39 ± 0.07
Exp_3         0.91 ± 0.07    0.39 ± 0.06
Exp_4         0.92 ± 0.07    0.37 ± 0.05
Exp_5         0.90 ± 0.07    0.41 ± 0.08
Exp_6         0.89 ± 0.08    0.43 ± 0.08
Exp_7         0.91 ± 0.06    0.42 ± 0.07
Exp_8         0.90 ± 0.07    0.41 ± 0.09
Exp_9         0.92 ± 0.06    0.40 ± 0.07
Exp_10        0.91 ± 0.08    0.39 ± 0.09
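The two metrics can be sketched as follows. IoU on binary masks is standard. The AED function is an assumption: this excerpt does not define AED or state its unit, so the sketch takes it to be the mean distance from each predicted bone-surface point to its nearest annotated point. The names `iou` and `average_euclidean_distance` are hypothetical.

```python
import math


def iou(pred, truth):
    """Intersection over union of two binary masks given as nested lists."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly by convention.
    return inter / union if union else 1.0


def average_euclidean_distance(pred_points, truth_points):
    """Mean distance from each predicted point to its nearest annotated point.

    Assumed definition; the paper's exact formula is not in this excerpt.
    """
    nearest = [
        min(math.hypot(px - tx, py - ty) for tx, ty in truth_points)
        for px, py in pred_points
    ]
    return sum(nearest) / len(nearest)


# Toy masks: 2 overlapping pixels out of 4 pixels in the union.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
mask_iou = iou(pred_mask, true_mask)

# Toy surface points: nearest distances are 1 and 3.
aed = average_euclidean_distance([(0, 0), (3, 4)], [(0, 1), (3, 1)])

print(mask_iou, aed)
```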
Table 3 Comparison of different backbone networks

Backbone     Mean IoU of bone segmentation in ultrasound images
ResNet18     0.89 ± 0.10
ResNet34     0.90 ± 0.08
ResNet50     0.91 ± 0.07
ResNet101    0.89 ± 0.09
VGGNet       0.89 ± 0.09
Table 4 Comparison of the experimental results (IoU values) of the proposed ultrasound image segmentation network with other state-of-the-art segmentation methods
Methods: LP-CNN = local-phase-guided CNN; BoneNet = BoneNet model; FL-CNN = filter-layer-guided CNN; ST-CNN = spatiotemporal CNN; AGNet = attention-guided network; TriANet = triple attention network; Proposed = the method of this paper

Experiment    LP-CNN         BoneNet        FL-CNN         ST-CNN         AGNet          TriANet        Proposed
Exp_1         0.88           0.88           0.87           0.85           0.85           0.86           0.91
Exp_2         0.87           0.88           0.86           0.84           0.85           0.85           0.90
Exp_3         0.87           0.89           0.87           0.86           0.86           0.88           0.91
Exp_4         0.89           0.89           0.87           0.85           0.86           0.87           0.92
Exp_5         0.87           0.89           0.85           0.86           0.87           0.87           0.90
Exp_6         0.86           0.88           0.85           0.84           0.85           0.85           0.89
Exp_7         0.86           0.90           0.86           0.84           0.86           0.87           0.91
Exp_8         0.87           0.88           0.84           0.85           0.86           0.86           0.90
Exp_9         0.89           0.91           0.89           0.86           0.87           0.88           0.92
Exp_10        0.87           0.89           0.87           0.87           0.87           0.89           0.91
Mean          0.87 ± 0.13    0.88 ± 0.08    0.86 ± 0.14    0.87 ± 0.12    0.86 ± 0.09    0.88 ± 0.10    0.91 ± 0.07
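The per-experiment margin of the proposed method over the best competing method can be read off Table 4 with a few lines of arithmetic. The IoU values below are copied from the table; the short method keys and variable names are illustrative only.

```python
# IoU per experiment (Exp_1 .. Exp_10), copied from Table 4.
BASELINES = {
    "LP-CNN":  [0.88, 0.87, 0.87, 0.89, 0.87, 0.86, 0.86, 0.87, 0.89, 0.87],
    "BoneNet": [0.88, 0.88, 0.89, 0.89, 0.89, 0.88, 0.90, 0.88, 0.91, 0.89],
    "FL-CNN":  [0.87, 0.86, 0.87, 0.87, 0.85, 0.85, 0.86, 0.84, 0.89, 0.87],
    "ST-CNN":  [0.85, 0.84, 0.86, 0.85, 0.86, 0.84, 0.84, 0.85, 0.86, 0.87],
    "AGNet":   [0.85, 0.85, 0.86, 0.86, 0.87, 0.85, 0.86, 0.86, 0.87, 0.87],
    "TriANet": [0.86, 0.85, 0.88, 0.87, 0.87, 0.85, 0.87, 0.86, 0.88, 0.89],
}
PROPOSED = [0.91, 0.90, 0.91, 0.92, 0.90, 0.89, 0.91, 0.90, 0.92, 0.91]

gains = []
for k, ours in enumerate(PROPOSED):
    # Best competing IoU in experiment k + 1.
    best = max(scores[k] for scores in BASELINES.values())
    # Round to the table's two-decimal precision to hide float noise.
    gains.append(round(ours - best, 2))

for k, gain in enumerate(gains, start=1):
    print(f"Exp_{k}: +{gain:.2f} IoU over the best baseline")
```

With the tabulated values, the proposed method is ahead of every baseline in all ten experiments, by 0.01 to 0.03 IoU.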