An Image Retrieval Method with Sparse Coding Based on Riemannian Manifold

WANG Rui-Xia, PENG Guo-Hua

Citation: WANG Rui-Xia, PENG Guo-Hua. An Image Retrieval Method with Sparse Coding Based on Riemannian Manifold. ACTA AUTOMATICA SINICA, 2017, 43(5): 778-788. doi: 10.16383/j.aas.2017.c150838

Funds:

National Natural Science Foundation of China 61201323

More Information
    Author Bio:

    PENG Guo-Hua Professor at the School of Natural and Applied Sciences, Northwestern Polytechnical University. He received his Ph. D. degree from Northwestern Polytechnical University in 1993. His research interests cover computer graphics, computer aided geometric processing, image processing, and computer vision. E-mail: penggh@nwpu.edu.cn

    Corresponding author: WANG Rui-Xia Ph. D. candidate at the School of Natural and Applied Sciences, Northwestern Polytechnical University. She received her master's degree from Northwestern Polytechnical University in 2009. Her main research interest is content-based image retrieval technology. E-mail: wangruixia921@163.com
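  • Abstract: To address the large histogram quantization error of the bag-of-visual-words (BOVW) model, an image retrieval algorithm based on sparse coding is proposed. Because most image features lie on nonlinear manifolds, measuring them in a vector space, as conventional sparse coding does, inevitably yields inaccurate sparse representations. Taking the manifold structure of the image feature space into account, symmetric positive definite matrices are chosen as feature descriptors to construct a Riemannian manifold. A kernel technique maps the Riemannian manifold into a reproducing kernel Hilbert space, so that sparse coding on the nonlinear manifold becomes a linear sparse coding problem and a more accurate sparse representation of the image is obtained. Experiments were conducted on the Corel1000 and Caltech101 datasets. Compared with existing image retrieval algorithms, the proposed algorithm not only improves retrieval accuracy but also achieves better retrieval performance.
    1)  Associate editor responsible for this paper: JIA Yun-De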
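The pipeline outlined in the abstract (SPD descriptors, a kernel into a reproducing kernel Hilbert space, then linear sparse coding) can be illustrated with a minimal sketch. The abstract does not say which descriptor, kernel, or solver the authors use, so everything below is an assumption for illustration: a region covariance descriptor, a Gaussian kernel on the log-Euclidean distance, and an ISTA solver. The function names, `gamma`, `lam`, and the random data are hypothetical.

```python
import numpy as np

def region_covariance(features):
    """Covariance descriptor of an (n_pixels, d) feature array (assumed descriptor)."""
    cov = np.cov(features, rowvar=False)
    return cov + 1e-6 * np.eye(cov.shape[0])  # small ridge keeps all eigenvalues positive

def spd_log(mat):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(mat)
    return (v * np.log(w)) @ v.T  # V diag(log w) V^T

def log_euclidean_kernel(xs, ys, gamma=0.1):
    """Gaussian kernel exp(-gamma * ||log X - log Y||_F^2) (assumed kernel choice)."""
    lx = np.stack([spd_log(x).ravel() for x in xs])
    ly = np.stack([spd_log(y).ravel() for y in ys])
    sq = ((lx[:, None, :] - ly[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq)

def kernel_sparse_code(k_dd, k_dx, lam=0.05, n_iter=500):
    """Minimize a^T K_DD a - 2 a^T k_Dx + lam * ||a||_1 with ISTA.

    Up to the constant k(x, x), this is the RKHS objective
    ||phi(x) - sum_j a_j phi(d_j)||^2 + lam * ||a||_1.
    """
    step = 1.0 / (2.0 * np.linalg.eigvalsh(k_dd)[-1])  # 1 / Lipschitz constant of the gradient
    a = np.zeros(k_dd.shape[0])
    for _ in range(n_iter):
        z = a - step * 2.0 * (k_dd @ a - k_dx)         # gradient step on the smooth part
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return a

# Hypothetical data: 8 dictionary atoms and 1 query, each built from 200 pixels x 5 features.
rng = np.random.default_rng(0)
atoms = [region_covariance(rng.normal(size=(200, 5))) for _ in range(8)]
query = region_covariance(rng.normal(size=(200, 5)))
k_dd = log_euclidean_kernel(atoms, atoms)          # atom-to-atom Gram matrix
k_dx = log_euclidean_kernel(atoms, [query])[:, 0]  # atom-to-query kernel values
code = kernel_sparse_code(k_dd, k_dx)              # sparse code of the query
```

Only kernel evaluations enter the solver, which is why the nonlinear manifold problem reduces to a linear one in the feature space. A learned dictionary would replace the random `atoms` in practice.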
  • Fig. 1  Influence of dictionary size on retrieval accuracy on the Corel1000 database

    Fig. 2  Influence of dictionary size on retrieval accuracy on the Caltech101 database

    Fig. 3  Retrieval accuracy comparison of the two types of algorithms on the Corel1000 database

    Fig. 4  Retrieval accuracy comparison of the two types of algorithms on the Caltech101 database

    Fig. 5  $F_{1}$-measure comparison of different algorithms on the Corel1000 database

    Fig. 6  Retrieval examples on the Corel1000 database for different algorithms

    Fig. 7  $F_{1}$-measure comparison of different algorithms on the Caltech101 database

    Fig. 8  Retrieval examples on the Caltech101 database for different algorithms

    Table 1  MAP comparison of different algorithms on the Corel1000 database

    Algorithm    MAP (%)    Error
    n-Grams      42.31      ±0.0729
    LTrPs        54.25      ±0.0533
    RMSC         54.25      ±0.0468

    Table 2  MAP comparison of different algorithms on the Caltech101 database

    Algorithm    MAP (%)    Error
    n-Grams      28.32      ±0.0898
    LTrPs        43.81      ±0.0732
    RMSC         51.31      ±0.0539
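    For reference, MAP (Tables 1 and 2) and the $F_{1}$-measure (Figs. 5 and 7) are standard retrieval metrics. The page does not give the paper's exact evaluation protocol, so the following is only a sketch of the common textbook definitions, assuming each query returns a full ranking with binary relevance labels; the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """AP of one query: mean of precision@k over the ranks k that hold a relevant item.

    Assumes the ranking covers every relevant item in the database.
    """
    hits, total = 0, 0.0
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / k  # precision at rank k
    return total / hits if hits else 0.0

def mean_average_precision(all_rankings):
    """MAP: average of the per-query AP values."""
    return sum(average_precision(r) for r in all_rankings) / len(all_rankings)

def f1_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Two toy queries: 1 marks a relevant image at that rank, 0 an irrelevant one.
rankings = [[1, 0, 1, 0], [0, 1]]
map_value = mean_average_precision(rankings)
```

    In this toy case the first query has AP = (1/1 + 2/3) / 2 and the second has AP = 1/2, so `map_value` is their mean.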

    Table 3  Image categories of the Caltech101 database

    Classes 1-17     Classes 18-34      Classes 35-51       Classes 52-68    Classes 69-85   Classes 86-101
    1 faces          18 camera          35 dragonfly        52 ibis          69 okapi        86 stapler
    2 faces_easy     19 cannon          36 electric_guitar  53 inline_skate  70 pagoda       87 starfish
    3 leopards       20 car_side        37 elephant         54 joshua_tree   71 panda        88 stegosaurus
    4 motorbikes     21 ceiling_fan     38 emu              55 kangaroo      72 pigeon       89 stop_sign
    5 accordion      22 cellphone       39 euphonium        56 ketch         73 pizza        90 strawberry
    6 airplanes      23 chair           40 ewer             57 lamp          74 platypus     91 sunflower
    7 anchor         24 chandelier      41 ferry            58 laptop        75 pyramid      92 tick
    8 ant            25 cougar_body     42 flamingo         59 llama         76 revolver     93 trilobite
    9 barrel         26 cougar_face     43 flamingo_head    60 lobster       77 rhino        94 umbrella
    10 bass          27 crab            44 garfield         61 lotus         78 rooster      95 watch
    11 beaver        28 crayfish        45 gerenuk          62 mandolin      79 saxophone    96 water_lilly
    12 binocular     29 crocodile       46 gramophone       63 mayfly        80 schooner     97 wheelchair
    13 bonsai        30 crocodile_head  47 grand_piano      64 menorah       81 scissors     98 wild_cat
    14 brain         31 cup             48 hawksbill        65 metronome     82 scorpion     99 windsor_chair
    15 brontosaurus  32 dalmatian       49 headphone        66 minaret       83 sea_horse    100 wrench
    16 buddha        33 dollar_bill     50 hedgehog         67 nautilus      84 snoopy       101 yin_yang
    17 butterfly     34 dolphin         51 helicopter       68 octopus       85 soccer_ball
Publication history
  • Received:  2015-12-11
  • Accepted:  2016-05-17
  • Published:  2017-05-01
