An Image Retrieval Method with Sparse Coding Based on Riemannian Manifold

WANG Rui-Xia^{1,2}, PENG Guo-Hua^1

1. School of Natural and Applied Sciences, Northwestern Polytechnical University, Xi'an 710129
2. Arts and Sciences College, Shaanxi University of Science and Technology, Xi'an 710021

doi: 10.16383/j.aas.2017.c150838

Funds:

National Natural Science Foundation of China (61201323)

Author Bio:
PENG Guo-Hua Professor at the School of Natural and Applied Sciences, Northwestern Polytechnical University. He received his Ph.D. degree from Northwestern Polytechnical University in 1993. His research interests cover computer graphics, computer aided geometric processing, image processing, and computer vision. E-mail: penggh@nwpu.edu.cn

Corresponding author:
WANG Rui-Xia Ph.D. candidate at the School of Natural and Applied Sciences, Northwestern Polytechnical University. She received her master's degree from Northwestern Polytechnical University in 2009. Her main research interest is content-based image retrieval technology. E-mail: wangruixia921@163.com

Publication history
- Received: 2015-12-11
- Accepted: 2016-05-17
- Published: 2017-05-01

Abstract: The bag-of-visual-words (BOVW) model suffers from large histogram quantization error. To address this shortcoming, an image retrieval algorithm based on sparse coding is proposed. Most image features lie on nonlinear manifolds, so traditional sparse coding, which measures them in a vector space, inevitably produces an inaccurate sparse representation. To account for the manifold structure of the image feature space, symmetric positive definite matrices are selected as feature descriptors, which form a Riemannian manifold. A kernel method maps the Riemannian manifold into a reproducing kernel Hilbert space, where sparse coding on the nonlinear manifold becomes a linear sparse coding problem, so a more accurate sparse representation of the image is obtained. Experiments are performed on the Corel1000 and Caltech101 datasets. Compared with existing image retrieval algorithms, the proposed algorithm improves retrieval accuracy and achieves better retrieval performance.

Keywords:
- Sparse coding
- Riemannian geometry
- manifold structure
- symmetric positive definite matrix
- Hilbert space
- image retrieval
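The pipeline outlined in the abstract (SPD descriptors, a kernel into a reproducing kernel Hilbert space, then linear sparse coding) can be illustrated with a minimal NumPy sketch. Several choices here are assumptions made only for illustration and are not confirmed by this page: the five per-pixel features in `region_covariance`, the Log-Euclidean Gaussian kernel, the ISTA solver, and all function and parameter names (`sigma`, `lam`, `n_iter`). The paper may use a different kernel, dictionary learning step, or optimizer.

```python
import numpy as np

def region_covariance(patch, eps=1e-6):
    """Covariance of per-pixel features (x, y, I, |Ix|, |Iy|); eps*I keeps it SPD.
    The feature set is an illustrative assumption."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gy, gx = np.gradient(patch.astype(float))  # derivatives along rows, columns
    feats = np.stack([xs, ys, patch, np.abs(gx), np.abs(gy)], axis=-1)
    feats = feats.reshape(-1, 5).astype(float)
    return np.cov(feats, rowvar=False) + eps * np.eye(5)

def spd_log(mat):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(mat)
    return (v * np.log(w)) @ v.T

def log_euclidean_kernel(a_logs, b_logs, sigma):
    """Gaussian kernel on matrix logs: exp(-||log X - log Y||_F^2 / (2 sigma^2)).
    a_logs has shape (m, d, d), b_logs has shape (n, d, d); returns (m, n)."""
    diff = a_logs[:, None] - b_logs[None]
    sq_dist = np.sum(diff ** 2, axis=(2, 3))
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_sparse_code(k_xd, k_dd, lam=0.1, n_iter=500):
    """ISTA for min_a  -2 a^T k_xd + a^T K_dd a + lam * ||a||_1,
    i.e. the RKHS reconstruction error (up to the constant k(x, x)) plus an L1 penalty."""
    step = 1.0 / (2.0 * np.linalg.eigvalsh(k_dd)[-1])  # 1 / Lipschitz constant
    alpha = np.zeros_like(k_xd)
    for _ in range(n_iter):
        z = alpha - step * 2.0 * (k_dd @ alpha - k_xd)   # gradient step
        alpha = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return alpha

# Toy usage: random patches stand in for image regions and dictionary atoms.
rng = np.random.default_rng(0)
atoms = np.stack([spd_log(region_covariance(rng.random((16, 16)))) for _ in range(8)])
query = spd_log(region_covariance(rng.random((16, 16))))[None]
k_dd = log_euclidean_kernel(atoms, atoms, sigma=1.0)
k_xd = log_euclidean_kernel(query, atoms, sigma=1.0)[0]
alpha = kernel_sparse_code(k_xd, k_dd)
print(alpha)
```

Because the kernel only needs inner products in the feature space, the sparse code is computed from the kernel matrix `k_dd` and the vector `k_xd` alone; the SPD matrices are never treated as plain Euclidean vectors.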
Notes:

1) Responsible associate editor for this paper: JIA Yun-De

Fig. 1 Influence of dictionary size on retrieval accuracy on the Corel1000 database

Fig. 2 Influence of dictionary size on retrieval accuracy on the Caltech101 database

Fig. 3 Comparison of retrieval accuracy of the two algorithms on the Corel1000 database

Fig. 4 Comparison of retrieval accuracy of the two algorithms on the Caltech101 database

Fig. 5 Comparison of $F_{1}$-measure of different algorithms on the Corel1000 database

Fig. 6 Retrieval examples of different algorithms on the Corel1000 database

Fig. 7 Comparison of $F_{1}$-measure of different algorithms on the Caltech101 database

Fig. 8 Retrieval examples of different algorithms on the Caltech101 database

Table 1 Comparison of MAP values of different algorithms on the Corel1000 database

Algorithm	MAP (%)	Error
n-Grams	42.31	±0.0729
LTrPs	54.25	±0.0533
RMSC	54.25	±0.0468


Table 2 Comparison of MAP values of different algorithms on the Caltech101 database

Algorithm	MAP (%)	Error
n-Grams	28.32	±0.0898
LTrPs	43.81	±0.0732
RMSC	51.31	±0.0539
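The figures report $F_{1}$-measure and the tables above report MAP, but this page does not state how the paper computes them. The following is a sketch of the standard definitions, assuming binary relevance and average precision normalized by the number of relevant items found in the ranked list; the function names are hypothetical and the paper's exact protocol may differ.

```python
def f1_measure(n_relevant_retrieved, n_retrieved, n_relevant):
    """Harmonic mean of precision and recall for one query."""
    precision = n_relevant_retrieved / n_retrieved
    recall = n_relevant_retrieved / n_relevant
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def average_precision(relevance):
    """relevance: ranked list of 0/1 flags, best match first."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at each relevant position
    return total / hits if hits else 0.0

def mean_average_precision(rankings):
    """Mean of per-query average precision over all queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)

# Toy usage with two queries.
print(f1_measure(5, 10, 20))
print(mean_average_precision([[1, 0, 1, 0], [0, 1, 0, 0]]))
```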


Table 3 Image categories of the Caltech101 database

Classes 1~17	Classes 18~34	Classes 35~51	Classes 52~68	Classes 69~85	Classes 86~101
1 faces	18 camera	35 dragonfly	52 ibis	69 okapi	86 stapler
2 faces_easy	19 cannon	36 electric_guitar	53 inline_skate	70 pagoda	87 starfish
3 leopards	20 car_side	37 elephant	54 joshua_tree	71 panda	88 stegosaurus
4 motorbikes	21 ceiling_fan	38 emu	55 kangaroo	72 pigeon	89 stop_sign
5 accordion	22 cellphone	39 euphonium	56 ketch	73 pizza	90 strawberry
6 airplanes	23 chair	40 ewer	57 lamp	74 platypus	91 sunflower
7 anchor	24 chandelier	41 ferry	58 laptop	75 pyramid	92 tick
8 ant	25 cougar_body	42 flamingo	59 llama	76 revolver	93 trilobite
9 barrel	26 cougar_face	43 flamingo_head	60 lobster	77 rhino	94 umbrella
10 bass	27 crab	44 garfield	61 lotus	78 rooster	95 watch
11 beaver	28 crayfish	45 gerenuk	62 mandolin	79 saxophone	96 water_lilly
12 binocular	29 crocodile	46 gramophone	63 mayfly	80 schooner	97 wheelchair
13 bonsai	30 crocodile_head	47 grand_piano	64 menorah	81 scissors	98 wild_cat
14 brain	31 cup	48 hawksbill	65 metronome	82 scorpion	99 windsor_chair
15 brontosaurus	32 dalmatian	49 headphone	66 minaret	83 sea_horse	100 wrench
16 buddha	33 dollar_bill	50 hedgehog	67 nautilus	84 snoopy	101 yin_yang
17 butterfly	34 dolphin	51 helicopter	68 octopus	85 soccer_ball


