方超伟 李雪 李钟毓 焦李成 张鼎文

方超伟, 李雪, 李钟毓, 焦李成, 张鼎文. 基于双模型交互学习的半监督医学图像分割. 自动化学报, 2023, 49(4): 805−819 doi: 10.16383/j.aas.c210667
Fang Chao-Wei, Li Xue, Li Zhong-Yu, Jiao Li-Cheng, Zhang Ding-Wen. Interactive dual-model learning for semi-supervised medical image segmentation. Acta Automatica Sinica, 2023, 49(4): 805−819 doi: 10.16383/j.aas.c210667
Interactive Dual-model Learning for Semi-supervised Medical Image Segmentation

Funds: Supported by the National Natural Science Foundation of China (62003256, 61876140, U21B2048)

    Author Bio:

    FANG Chao-Wei  Lecturer at the School of Artificial Intelligence, Xidian University. He received his Ph.D. degree from the University of Hong Kong in 2019 and his bachelor's degree from Xi'an Jiaotong University in 2013. His research interests cover image processing, medical image analysis, computer vision, and machine learning. E-mail: chaoweifang@outlook.com

    LI Xue  Master's student at the School of Mechano-Electronic Engineering, Xidian University. She received her bachelor's degree from the School of Automation, Xi'an University of Technology in 2020. Her research interests cover medical image analysis and computer vision. E-mail: lixue@stu.xidian.edu.cn

    LI Zhong-Yu  Associate professor at the School of Software Engineering, Xi'an Jiaotong University. He received his Ph.D. degree from the University of North Carolina at Charlotte, USA in 2018, and his master's and bachelor's degrees from Xi'an Jiaotong University in 2015 and 2012, respectively. His research interests cover computer vision and medical image analysis. E-mail: zhongyuli@xjtu.edu.cn

    JIAO Li-Cheng  Professor at the Key Laboratory of Intelligent Perception and Image Understanding, Ministry of Education, Xidian University. He received his bachelor's degree from Shanghai Jiao Tong University in 1982, and his master's and Ph.D. degrees from Xi'an Jiaotong University in 1984 and 1990, respectively. His research interests cover image processing, natural computation, machine learning, and intelligent information processing. E-mail: lchjiao@mail.xidian.edu.cn

    ZHANG Ding-Wen  Professor at the Brain and Artificial Intelligence Laboratory, Northwestern Polytechnical University. He received his Ph.D. degree from Northwestern Polytechnical University in 2018. His research interests cover computer vision and multimedia processing, especially saliency detection, video object segmentation, and weakly supervised learning. Corresponding author of this paper. E-mail: zhangdingwen2006yyy@gmail.com

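  • Abstract: In medical images, accurate segmentation of organs or lesion regions is crucial for clinical applications such as disease diagnosis, yet training segmentation models depends on large amounts of annotated data. To reduce the need for annotated data, this paper studies semi-supervised learning for medical image segmentation. Existing semi-supervised methods widely adopt the mean teacher model, whose drawback is that the parameter update based on the exponential moving average (EMA) causes the teacher model to accumulate the erroneous knowledge of the student model. To avoid this problem, a dual-model interactive learning method is proposed. It introduces a pixel stability judgment mechanism and uses the pixels with more stable predictions in one model to supervise the learning of the other model, thereby alleviating the accumulation and propagation of a single model's erroneous experience. The proposed method outperforms state-of-the-art semi-supervised methods on three datasets: cardiac structure segmentation, liver tumor segmentation, and brain tumor segmentation. With only 30% of the training data annotated, the method attains Dice similarity coefficients (DSC) of 89.13%, 94.15%, and 87.02% on the three datasets, respectively.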
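
The EMA drawback described in the abstract can be illustrated with a minimal sketch. It assumes the usual mean-teacher update, teacher ← α·teacher + (1 − α)·student; the decay value `alpha`, the function name `ema_update`, and the toy parameter vectors are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def ema_update(teacher, student, alpha=0.99):
    """Return EMA-updated teacher parameters (assumed mean-teacher rule)."""
    return alpha * teacher + (1.0 - alpha) * student

# Toy setup (hypothetical): the "correct" parameters are all zeros, the teacher
# starts there, and the student carries a persistent error of +1.
teacher = np.zeros(3)
student = np.ones(3)

for step in range(200):
    teacher = ema_update(teacher, student, alpha=0.99)

# After many steps the teacher has drifted most of the way toward the student,
# i.e. it has absorbed the student's error.
drift = float(teacher[0])
```

Because the teacher has no source of information other than the student, nothing in this update can correct a systematic student error; this is the accumulation effect that the dual-model design is meant to avoid.
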
  • Fig. 1  Comparison of model frameworks ((a) Semi-supervised segmentation framework based on dual-model interactive learning; (b) Semi-supervised segmentation framework based on the mean teacher model[22]; (c) Semi-supervised segmentation framework based on a single model with consistency constraints. Solid arrows represent the propagation of training data and the update of models. Dashed arrows indicate the source of the supervision on unlabeled images)

    Fig. 2  Framework of dual-model interactive learning. MSE, CE, and DICE denote the mean squared error, cross-entropy, and Dice functions, respectively. Solid single-directional arrows represent the forward pass of the original images ($\boldsymbol{I}^{l}$ and $\boldsymbol{I}^{u}$) through each model; dashed single-directional arrows represent the forward pass of the noisy images ($\bar{\boldsymbol{I}}^{l}$ and $\bar{\boldsymbol{I}}^{u}$) through each model

    Fig. 3  Segmentation results of our dual-model method and other semi-supervised methods on the CSS dataset. The black, dark gray, light gray, and white regions represent the background, left ventricle cavity (LV Cavity), left ventricular myocardium (LV Myo), and right ventricle cavity (RV Cavity), respectively

    Fig. 4  Comparison of the outputs of the mean teacher method and our proposed dual-model learning method during training

    Fig. 5  Liver segmentation results of our method and other semi-supervised methods on the LiTS dataset. The white region is the liver

    Fig. 6  Whole-tumor segmentation results of our method and other semi-supervised methods on the BraTS dataset. The white region is the whole tumor

    Fig. 7  Trend of the model's segmentation performance on the validation set without and with the adjoint variable Q

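    Table 1  Comparison of our dual-model method with other dual-model methods

    Proposes a dual model in which two small networks learn interactively; the KL divergence measures the difference between the predictions of the two networks
    Proposes a dual model that, building on DML, introduces adversarial learning between the predictions output by the two networks
    Proposes a dual model in which each model extracts features and makes predictions through an auxiliary classifier; the features extracted by the two branches are also fused, and a fusion classifier produces the overall classification result
    U-shaped network
    Dense U-shaped network
    3D U-shaped network
    Proposes a dual model that introduces a stable pseudo-label judgment mechanism, using the stable pixels of one model to constrain the unstable pixels of the other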
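
The last entry above, stable pixels of one model constraining unstable pixels of the other, can be sketched as follows. The stability criterion is an assumption made for illustration: a pixel is treated as stable when a model predicts the same class for the original and the noise-perturbed image and its confidence exceeds a hypothetical threshold `tau`. The use of a mean squared error on these pixels is likewise assumed; the exact criterion and loss are not given on this page.

```python
import numpy as np

def stable_mask(prob_clean, prob_noisy, tau=0.8):
    """Boolean (H, W) mask of pixels deemed stable (assumed criterion).

    prob_clean, prob_noisy: (C, H, W) class probabilities produced by the
    same model for the original and the noise-perturbed input.
    """
    same_label = prob_clean.argmax(axis=0) == prob_noisy.argmax(axis=0)
    confident = prob_clean.max(axis=0) > tau
    return same_label & confident

def cross_supervision_loss(prob_a, stable_a, prob_b, stable_b):
    """MSE pulling model B toward model A where A is stable and B is not."""
    teach = stable_a & ~stable_b          # (H, W) pixels where A supervises B
    if not teach.any():
        return 0.0
    diff = (prob_b - prob_a) ** 2         # (C, H, W)
    return float(diff[:, teach].mean())

# Toy example: 2 classes, a 1 x 2 image, two models A and B.
prob_a_clean = np.array([[[0.90, 0.60]], [[0.10, 0.40]]])
prob_a_noisy = np.array([[[0.85, 0.40]], [[0.15, 0.60]]])
prob_b_clean = np.array([[[0.55, 0.10]], [[0.45, 0.90]]])
prob_b_noisy = np.array([[[0.45, 0.05]], [[0.55, 0.95]]])

mask_a = stable_mask(prob_a_clean, prob_a_noisy)   # A is stable on pixel 0 only
mask_b = stable_mask(prob_b_clean, prob_b_noisy)   # B is stable on pixel 1 only

loss_b = cross_supervision_loss(prob_a_clean, mask_a, prob_b_clean, mask_b)
loss_a = cross_supervision_loss(prob_b_clean, mask_b, prob_a_clean, mask_a)
```

In this toy case each model is supervised only on the pixel where the other model is the stable one, so neither model acts as a fixed teacher.
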
    Table 2  Comparison with other methods on the CSS dataset under different proportions of annotated training images. The baseline segmentation network is U-Net or DenseU-Net

    Table 3  Comparison with other methods on the LiTS dataset when 30% of the training images are annotated. The baseline segmentation network is U-Net or DenseU-Net

    Network    Method    DSC (%)    HD95    ASD
    Table 4  Comparison with other methods on the BraTS dataset when 30% of the training images are annotated. The baseline network is 3D U-Net

    Method    DSC (%)    HD95    ASD
    Table 5  Performance of different variants of our method on the CSS dataset when 10% of the training images are annotated. The baseline segmentation network is U-Net

    No.    Supervised constraint    Unsupervised consistency    Interactive learning    Stability selection strategy    Without adjoint variable Q    DSC (%)    HD95    ASD
    Table 6  Effect of the number of models on the CSS dataset when 10% of the training images are annotated. The baseline network is U-Net

    Number of students    DSC (%)
    2    87.53
    4    87.32
    6    87.46
    Table 7  Effect of different loss functions in our method on the CSS dataset when 10% of the training images are annotated. The baseline network is U-Net

    Loss function    DSC (%)    HD95    ASD
    $L_{\rm{dice}}$    75.00    10.94    3.63
    $L_{\rm{seg}}$    76.41    10.46    3.12
    $L_{\rm{seg}}+L_{{\rm{con}}\_P}$    80.62    7.35    2.66
    $L_{\rm{seg}}+L_{\rm{con}}$    83.42    5.84    1.64
    $L_{\rm{seg}}+L_{\rm{con}}+L_{\rm{sta}}$    87.53    3.20    1.61
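
For reference, the DSC column and the Dice-based loss term can be sketched as below. The sketch assumes the standard definition DSC = 2|A ∩ B| / (|A| + |B|) and takes the Dice loss as 1 − DSC, a common convention; the smoothing constant `eps` and the function names are illustrative, and how the paper combines Dice with other terms is not stated on this page.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """DSC = 2|A ∩ B| / (|A| + |B|) for binary masks or soft probabilities."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    intersection = (pred * target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def dice_loss(pred, target):
    """Soft Dice loss, taken here as 1 - DSC (assumed convention)."""
    return 1.0 - dice_coefficient(pred, target)

# Toy masks: 3 foreground pixels each, 2 of them overlapping.
pred = np.array([[1, 1, 0], [0, 1, 0]])
target = np.array([[1, 0, 0], [0, 1, 1]])
dsc = dice_coefficient(pred, target)   # 2 * 2 / (3 + 3), about 0.667
```
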
    Table 8  Effect of network sharing in our method on the CSS dataset when 10% of the training images are annotated. The baseline network is U-Net

    Shared network    DSC (%)    HD95    ASD
