A Multiple Layer Abstract Semantic Decision Method for Image Classification

LIU Peng, YE Zhi-Peng, ZHAO Wei, TANG Xiang-Long

Citation: LIU Peng, YE Zhi-Peng, ZHAO Wei, TANG Xiang-Long. A Multiple Layer Abstract Semantic Decision Method for Image Classification. ACTA AUTOMATICA SINICA, 2015, 41(5): 960-969. doi: 10.16383/j.aas.2015.c140238

doi: 10.16383/j.aas.2015.c140238
Funds:

Supported by National Natural Science Foundation of China (61171184, 61201309, 61440025)

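Details
    About the author:

    YE Zhi-Peng  Ph.D. candidate at the School of Computer Science and Technology, Harbin Institute of Technology. Received a master's degree in computer application technology from Harbin Institute of Technology in 2013. Main research interests: pattern recognition and machine learning. E-mail: yezhipeng@hit.edu.cn

    Corresponding author:

    TANG Xiang-Long  Professor at the School of Computer Science and Technology, Harbin Institute of Technology. Received a Ph.D. degree in computer application technology from Harbin Institute of Technology in 1995. Main research interests: pattern recognition, image processing, and machine learning. E-mail: tangxl@hit.edu.cn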

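  • Abstract: The bag-of-visual-words (BoVW) model is an effective approach to image classification. This paper proposes a multiple layer decision (MLD) method based on semantic abstraction. The method extends BoVW to multiple layers by introducing abstract semantics, and uses a semantics-preserving method to generate a visual vocabulary that carries semantics. During training, semantics are passed upward layer by layer in a bottom-up manner to train the upper-layer semantic classifiers; during classification, the class of a test sample is decided layer by layer in a top-down manner. The classification performance of the method is evaluated on standard datasets. The results show that the proposed method achieves better classification performance than mainstream classification methods.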
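The abstract gives only the outline of the MLD pipeline, so the following is a minimal sketch of that outline and not the authors' implementation. It assumes a two-layer hierarchy (abstract groups over concrete classes), a fixed toy vocabulary, and a nearest-centroid rule as a stand-in for the classifiers, none of which are specified in the abstract. All names (`bovw_histogram`, `train_bottom_up`, `classify_top_down`, the `hierarchy` mapping and the toy data) are hypothetical.

```python
import math
from collections import defaultdict


def nearest(vec, candidates):
    """Return the index of the candidate closest to vec (Euclidean distance)."""
    return min(range(len(candidates)), key=lambda i: math.dist(vec, candidates[i]))


def bovw_histogram(descriptors, vocabulary):
    """Assign each local descriptor to its nearest visual word; return an L1-normalized histogram."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        hist[nearest(d, vocabulary)] += 1.0
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def mean(vectors):
    """Component-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train_bottom_up(histograms, labels, hierarchy):
    """Fit class-level centroids first, then build each abstract group's centroid from its classes."""
    by_class = defaultdict(list)
    for h, y in zip(histograms, labels):
        by_class[y].append(h)
    class_centroids = {y: mean(hs) for y, hs in by_class.items()}
    group_centroids = {
        g: mean([class_centroids[c] for c in classes])
        for g, classes in hierarchy.items()
    }
    return class_centroids, group_centroids


def classify_top_down(hist, class_centroids, group_centroids, hierarchy):
    """Decide the abstract group first, then the concrete class inside that group."""
    groups = list(group_centroids)
    g = groups[nearest(hist, [group_centroids[k] for k in groups])]
    classes = hierarchy[g]
    return classes[nearest(hist, [class_centroids[c] for c in classes])]


# Toy data (assumed for illustration only): 4 visual words in a 2-D descriptor space.
vocabulary = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
hierarchy = {"animal": ["cat", "dog"], "vehicle": ["car", "bus"]}
train_hists = [
    [0.7, 0.3, 0.0, 0.0],
    [0.3, 0.7, 0.0, 0.0],
    [0.0, 0.0, 0.7, 0.3],
    [0.0, 0.0, 0.3, 0.7],
]
train_labels = ["cat", "dog", "car", "bus"]

class_c, group_c = train_bottom_up(train_hists, train_labels, hierarchy)
test_hist = bovw_histogram([[0.1, 0.0], [0.0, 0.1], [0.1, 0.9]], vocabulary)
predicted = classify_top_down(test_hist, class_c, group_c, hierarchy)
print(predicted)
```

The top-down step only compares the test histogram against classes inside the chosen group, which is the property the abstract attributes to layer-by-layer decision; a real system would replace the centroid rule with trained classifiers and a learned vocabulary.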
Publication history
  • Received:  2014-04-23
  • Revised:  2015-01-06
  • Published:  2015-05-20
