Multi-scale Contextual Image Labeling

ZHOU Quan, WANG Lei, ZHOU Liang, ZHENG Bao-Yu

Citation: ZHOU Quan, WANG Lei, ZHOU Liang, ZHENG Bao-Yu. Multi-scale Contextual Image Labeling. ACTA AUTOMATICA SINICA, 2014, 40(12): 2944-2949. doi: 10.3724/SP.J.1004.2014.02944


doi: 10.3724/SP.J.1004.2014.02944

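Details
    Author biography:

    WANG Lei  Associate professor at the College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Ph.D. Main research interest: wireless communications. E-mail: wanglei@njupt.edu.cn

    Corresponding author:

    ZHOU Quan  Lecturer at the College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Ph.D. Main research interests: image/video processing, computer vision, machine learning, and pattern recognition. Corresponding author of this paper. E-mail: quan.zhou@njupt.edu.cn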


Funds:

Supported by the National Natural Science Foundation of China (61201165, 61271240, 61201164), the Specialized Research Fund for the Doctoral Program of Higher Education (20113223120002), the China Postdoctoral Science Foundation (2013M531392), the Priority Academic Program Development of Jiangsu Higher Education Institutions, the Open Research Fund of the National Mobile Communications Research Laboratory, Southeast University (2011D05), and the Scientific Research Foundation of Nanjing University of Posts and Telecommunications (NY210072, NY213067)

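  • Abstract: This paper proposes a new algorithm for automatic semantic image labeling that, within a hierarchical segmentation framework, combines low-level local features of the image with high-level contextual features. The core idea is that the recognition result for a larger image region helps to recognize the smaller regions it contains. The algorithm first recognizes the image regions at each segmentation level, then uses Bayes' theorem to fuse the per-level recognition results by linear weighting, and thereby labels the whole image automatically. In simulation experiments, the proposed algorithm achieves the best labeling accuracy and the fastest labeling speed among the existing image labeling algorithms it is compared with.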
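The abstract describes fusing per-level recognition results by linear weighting. The following is a minimal sketch of that idea only, not the authors' implementation: it assumes each segmentation level yields a class posterior for the region that contains a given small region, and that the level weights are given. The function names, class names, posteriors, and weights are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def fuse_levels(level_posteriors: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Linearly combine per-level class posteriors for one small region.

    level_posteriors[l][c] is the (assumed) posterior of class c given the
    region that contains the small region at segmentation level l.
    """
    if len(level_posteriors) != len(weights):
        raise ValueError("need exactly one weight per segmentation level")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(level_posteriors[0])
    fused = [0.0] * n_classes
    for posterior, w in zip(level_posteriors, weights):
        if len(posterior) != n_classes:
            raise ValueError("all levels must score the same classes")
        for c, p in enumerate(posterior):
            # Normalizing the weights keeps the fused scores a distribution.
            fused[c] += (w / total) * p
    return fused


def label_region(level_posteriors: Sequence[Sequence[float]],
                 weights: Sequence[float],
                 class_names: Sequence[str]) -> str:
    """Return the class with the highest fused score."""
    fused = fuse_levels(level_posteriors, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return class_names[best]


# Hypothetical numbers: the finest level alone slightly prefers "sky",
# while the larger enclosing regions look like "water".
CLASSES = ["sky", "water", "grass"]
posteriors = [
    [0.45, 0.40, 0.15],  # finest level (local appearance only)
    [0.30, 0.60, 0.10],  # middle level
    [0.10, 0.80, 0.10],  # coarsest level (largest enclosing region)
]
weights = [0.5, 0.3, 0.2]  # assumed, not taken from the paper

print(label_region(posteriors, weights, CLASSES))
```

With these illustrative numbers the coarser levels outvote the ambiguous local evidence, which mirrors the abstract's point that recognizing a large region helps recognize the smaller regions inside it.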
Publication history
  • Received:  2013-12-03
  • Revised:  2014-05-23
  • Published:  2014-12-20
