Multi-scale Contextual Image Labeling
Abstract: This paper proposes a new method for automatic semantic image labeling that combines low-level local features with high-level contextual cues in a hierarchical (multiple) segmentation framework. The main insight is that recognizing a larger image region provides strong evidence for classifying the smaller regions it contains. The method first classifies the image regions at each segmentation level with a series of learned discriminative models based on bag of features, then uses Bayes' rule to fuse the per-level classification results by linear weighting, which yields a semantic labeling of the whole image. The multiple segmentation framework provides a robust representation that allows a wide variety of cues to contribute to the confidence in each semantic label. Compared with previous methods, the algorithm achieves state-of-the-art labeling accuracy and the fastest running speed on the benchmark dataset.
Key words:
- Image labeling and understanding
- segmentation
- feature selection
- classification
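The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes that each small region already has one class-posterior vector per segmentation level (the region itself plus the larger regions that contain it) and that the level weights are given. The function names, the example weights, and the example posteriors are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_levels(posteriors_per_level: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Linearly combine per-level class posteriors for one small region.

    posteriors_per_level[l][k] is the (assumed) probability of class k
    given the region that covers this location at segmentation level l.
    Weights are normalized so the fused vector still sums to one.
    """
    if not posteriors_per_level:
        raise ValueError("at least one level is required")
    if len(posteriors_per_level) != len(weights):
        raise ValueError("need exactly one weight per level")
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(posteriors_per_level[0])
    fused = [0.0] * n_classes
    for probs, w in zip(posteriors_per_level, weights):
        if len(probs) != n_classes:
            raise ValueError("all levels must score the same classes")
        for k, p in enumerate(probs):
            fused[k] += (w / total) * p
    return fused


def label_region(posteriors_per_level: Sequence[Sequence[float]],
                 weights: Sequence[float],
                 class_names: Sequence[str]) -> str:
    """Return the class name with the highest fused posterior."""
    fused = fuse_levels(posteriors_per_level, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return class_names[best]


# Hypothetical example: the fine-level region is ambiguous and leans
# towards "water", but the larger region containing it is clearly "sky".
classes = ["sky", "water"]
fine_level = [0.45, 0.55]
coarse_level = [0.90, 0.10]
print(label_region([fine_level, coarse_level], [0.5, 0.5], classes))
```

In this toy case the coarse-level evidence overrides the ambiguous fine-level prediction, which is the effect the abstract attributes to recognizing larger regions first. How the weights are actually learned or derived is not specified in the abstract, so the equal weights here are only a placeholder.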