Semantic Guided Feature Aggregation Network for Salient Object Detection

Wang Zheng-Wen, Song Hui-Hui, Fan Jia-Qing, Liu Qing-Shan

Citation: Wang Zheng-Wen, Song Hui-Hui, Fan Jia-Qing, Liu Qing-Shan. Semantic guided feature aggregation network for salient object detection. Acta Automatica Sinica, 2023, 49(11): 2386−2395 doi: 10.16383/j.aas.c210425

doi: 10.16383/j.aas.c210425
Funds: Supported by National Natural Science Foundation of China (61872189, 61532009), Natural Science Foundation of Jiangsu Province (BK20191397), and “Six Talent Peaks” Project of Jiangsu Province (XYDXX-015)
More Information
    Author Bio:

    WANG Zheng-Wen  Master student at the School of Automation, Nanjing University of Information Science and Technology. His research interests cover salient object detection and deep learning. E-mail: 20191223064@nuist.edu.cn

    SONG Hui-Hui  Professor at the School of Automation, Nanjing University of Information Science and Technology. Her research interests cover video object segmentation and image super-resolution. Corresponding author of this paper. E-mail: songhuihui@nuist.edu.cn

    FAN Jia-Qing  Master student at the School of Automation, Nanjing University of Information Science and Technology. His main research interest is video object segmentation. E-mail: jqfan@nuaa.edu.cn

    LIU Qing-Shan  Professor at the School of Automation, Nanjing University of Information Science and Technology. His research interests cover video content analysis and understanding. E-mail: qsliu@nuist.edu.cn

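  • Abstract: U-shaped structures are widely used in the design of salient object detection networks. However, U-shaped saliency detection methods commonly lose spatial location details and have difficulty refining edges. To address these problems, a salient object detection network based on semantic guided feature aggregation is proposed, which obtains fine saliency maps through efficient feature aggregation. The network consists of three parts: a mixing attention module (MAM), an enlarged receptive field module (ERFM), and a multi-level aggregation module (MLAM). First, the ERFM processes the low-level features extracted by the feature extraction network, enlarging their receptive field while preserving the original edge details, so as to obtain richer spatial context information. Then, the MAM processes the last-layer features of the feature extraction network to enhance their representational power; the result serves as semantic guidance during decoding and continuously guides feature aggregation. Finally, the MLAM aggregates features from different levels to produce the final fine saliency map. Experiments on six benchmark datasets verify that the method locates salient features effectively and is also effective at refining edge details.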
  • Fig.  1  Network structure diagram

    Fig.  2  Mixing attention module

    Fig.  3  Enlarged receptive field module

    Fig.  4  Multi-level aggregation module

    Fig.  5  Comparison of precision−recall curves of different methods

    Fig.  6  Saliency maps of different methods

    Table  1  Comparison of ${F_\beta }$ values of different models

    Dataset    Ours     PAGR     RAS      DGRL     CPD      MLMS     PoolNet  AFNet    BASNet   U2Net    ITSD
    ECSSD      0.951    0.924    0.921    0.921    0.936    0.930    0.944    0.935    0.942    0.951    0.947
    DUT-OMRON  0.827    0.771    0.786    0.774    0.794    0.793    0.808    0.797    0.805    0.823    0.824
    PASCAL-S   0.873    0.847    0.837    0.844    0.866    0.858    0.869    0.868    0.854    0.859    0.871
    HKU-IS     0.937    0.919    0.913    0.910    0.924    0.922    0.933    0.923    0.928    0.935    0.934
    DUTS-TE    0.888    0.855    0.831    0.828    0.864    0.854    0.880    0.862    0.860    0.873    0.883
    SOD (values in source order; the source gives 10 values for the 11 methods, so they are not assigned to columns): 0.873, 0.838, 0.810, 0.843, 0.850, 0.862, 0.867, 0.851, 0.861, 0.880
    Note: Larger ${F_\beta }$ is better. Bold numbers are the best results and underlined numbers are the second-best results.
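The ${F_\beta }$ score reported in Table 1 combines precision and recall of a binarized saliency map. The sketch below illustrates the standard formula $F_\beta = (1+\beta^2)\,P\,R / (\beta^2 P + R)$. The function name `f_beta`, the fixed `threshold` of 0.5, and the default `beta2 = 0.3` (a value commonly used in the salient object detection literature) are assumptions for illustration; this page does not state the thresholding scheme or $\beta^2$ used in the paper.

```python
def f_beta(pred, gt, threshold=0.5, beta2=0.3):
    """F-measure of a saliency map against a binary ground-truth mask.

    pred: 2-D list of floats in [0, 1]; gt: 2-D list of 0/1 values.
    threshold and beta2 are illustrative defaults, not values taken
    from the paper.
    """
    tp = fp = fn = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            hit = p >= threshold          # binarize the prediction
            if hit and g:
                tp += 1                   # salient pixel found
            elif hit and not g:
                fp += 1                   # background marked salient
            elif not hit and g:
                fn += 1                   # salient pixel missed
    if tp == 0:
        return 0.0                        # avoid division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


# Toy 2x2 example: two true positives, one false positive, no misses.
score = f_beta([[0.9, 0.8], [0.2, 0.6]], [[1, 0], [0, 1]])
```

With $\beta^2 < 1$ the score weights precision more heavily than recall, so a false positive costs more than a miss of the same size.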

    Table  2  Comparison of MAE values of different models

    Dataset    Ours     PAGR     RAS      DGRL     CPD      MLMS     PoolNet  AFNet    BASNet   U2Net    ITSD
    ECSSD      0.034    0.064    0.056    0.043    0.040    0.038    0.039    0.042    0.037    0.034    0.035
    DUT-OMRON  0.058    0.071    0.062    0.062    0.056    0.060    0.056    0.057    0.056    0.054    0.061
    PASCAL-S   0.065    0.089    0.104    0.072    0.074    0.069    0.075    0.069    0.076    0.074    0.072
    HKU-IS     0.032    0.047    0.045    0.036    0.033    0.034    0.033    0.036    0.032    0.031    0.031
    DUTS-TE    0.042    0.053    0.060    0.049    0.043    0.045    0.040    0.046    0.047    0.044    0.041
    SOD (values in source order; the source gives 10 values for the 11 methods, so they are not assigned to columns): 0.093, 0.145, 0.124, 0.103, 0.112, 0.106, 0.100, 0.114, 0.108, 0.095
    Note: Smaller MAE is better.
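The MAE values in Table 2 measure the mean absolute pixel-wise difference between a predicted saliency map and the ground-truth mask. A minimal sketch follows, assuming both maps have the same size and are scaled to [0, 1]; the function name `mae` and the nested-list input format are illustrative choices, not taken from the paper.

```python
def mae(pred, gt):
    """Mean absolute error between a saliency map and its ground truth.

    pred and gt are 2-D lists of equal shape with values in [0, 1].
    """
    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)   # per-pixel absolute difference
            count += 1
    return total / count


# Toy 2x2 example: errors of 0.1, 0.8, 0.2 and 0.4 average to 0.375.
error = mae([[0.9, 0.8], [0.2, 0.6]], [[1, 0], [0, 1]])
```

Unlike ${F_\beta }$, MAE needs no threshold and also penalizes low-confidence responses on background pixels.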

    Table  3  Comparison of ${S_m}$ values of different models

    Dataset    Ours     PAGR     RAS      DGRL     CPD      MLMS     PoolNet  AFNet    BASNet   U2Net    ITSD
    ECSSD      0.932    0.889    0.893    0.906    0.915    0.911    0.921    0.914    0.916    0.928    0.925
    DUT-OMRON  0.847    0.775    0.814    0.810    0.818    0.817    0.836    0.826    0.836    0.847    0.840
    PASCAL-S   0.865    0.749    0.795    0.869    0.844    0.849    0.845    0.850    0.838    0.844    0.859
    HKU-IS     0.930    0.887    0.887    0.897    0.904    0.901    0.917    0.905    0.909    0.916    0.917
    DUTS-TE    0.873    0.838    0.839    0.842    0.867    0.856    0.883    0.866    0.853    0.861    0.872
    SOD (values in source order; the source gives 10 values for the 11 methods, so they are not assigned to columns): 0.808, 0.720, 0.764, 0.771, 0.771, 0.780, 0.795, 0.772, 0.786, 0.809
    Note: Larger ${S_m}$ is better.

    Table  4  Results of ablation experiment

    MAM    ERFM    MLAM    MAE/${F_\beta }$
                           0.049/0.935
                           0.045/0.937
                           0.042/0.942
                           0.039/0.944
                           0.034/0.951
    Note: Smaller MAE is better. Bold marks the best result, and “✓” indicates that the given module is used.

    Table  5  Comparative experiment of different dilation rate configurations in ERFM

    Dilation rate configuration                    MAE/${F_\beta }$
    (1, 3, 5), (1, 3, 5), (1, 3, 5), (1, 3, 5)     0.039/0.946
    (1, 3, 5), (1, 3, 5), (3, 5, 7), (1, 3, 5)     0.037/0.948
    (1, 3, 5), (4, 6, 8), (3, 5, 7), (1, 3, 5)     0.036/0.950
    (5, 8, 11), (4, 6, 8), (3, 5, 7), (1, 3, 5)    0.034/0.951
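To give a sense of what the dilation rates in Table 5 mean, the sketch below computes the spatial extent covered by a single dilated convolution kernel, using the standard relation $k_{\text{eff}} = k + (k-1)(d-1)$. The 3×3 kernel size is an assumption, since this page does not state the kernel size used in ERFM, and the helper name `effective_kernel` is hypothetical. The code only reports per-convolution extents and does not model how ERFM combines its branches.

```python
def effective_kernel(kernel_size, dilation):
    """Spatial extent of one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Best-performing configuration from Table 5 (last row).
best_config = [(5, 8, 11), (4, 6, 8), (3, 5, 7), (1, 3, 5)]

# Assumed 3x3 kernels; each inner list holds the extents for one group.
extents = [[effective_kernel(3, d) for d in rates] for rates in best_config]

for rates, group in zip(best_config, extents):
    print(rates, "->", group)
```

Under the 3×3 assumption, a dilation rate of 11 lets a single kernel span 23 pixels with the same nine weights as an ordinary convolution, which is how dilation enlarges the receptive field without adding parameters.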

    Table  6  Ablation experiment of two branches in MLAM

    Bottom-up branch    Top-down branch    MAE/${F_\beta }$
                                           0.041/0.940
                                           0.040/0.946
                                           0.034/0.951

    Table  7  Ablation experiment on the position relationship of attention modules in MAM

    Position relationship of attention modules    MAE/${F_\beta }$
    Channel attention first                       0.036/0.947
    Spatial attention first                       0.038/0.944
    Parallel placement (ours)                     0.034/0.951
Publication history
  • Received:  2021-05-17
  • Accepted:  2020-10-18
  • Published online:  2021-11-15
  • Issue date:  2023-11-22
