Image Retrieval with Enhanced Visual Dictionary and Query Expansion

KE Sheng-Cai; LI Bi-Cheng; CHEN Gang; ZHAO Yong-Wei; WEI Han

doi:10.16383/j.aas.2018.c160041

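KE Sheng-Cai^{1,2}, LI Bi-Cheng^{3}, CHEN Gang^{1}, ZHAO Yong-Wei^{1}, WEI Han^{1}

1. Institute of Information System Engineering, PLA Information Engineering University, Zhengzhou 450001
2. Unit 75830, Guangzhou 510000
3. College of Computer Science and Technology, Huaqiao University, Xiamen 361021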

Funding: National Natural Science Foundation of China (60872142)

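Author biographies:
KE Sheng-Cai  Master's student at the Institute of Information System Engineering, PLA Information Engineering University, and assistant engineer at PLA Unit 75830. His research interests cover image processing and computer vision. E-mail: keshengcai0705@163.com

CHEN Gang  Lecturer at the Institute of Information System Engineering, PLA Information Engineering University. His research interests cover natural language processing and image/video processing and recognition. E-mail: maplechen111@gmail.com

ZHAO Yong-Wei  Ph.D. candidate at the Institute of Information System Engineering, PLA Information Engineering University. His research interests cover image/video processing and recognition. E-mail: zhaoyongwei369@163.com

WEI Han  Lecturer at the Institute of Information System Engineering, PLA Information Engineering University. Her research interests cover computer vision and image/video processing and recognition. E-mail: weihan0627@126.com

Corresponding author:
LI Bi-Cheng  Professor at the College of Computer Science and Technology, Huaqiao University. His research interests cover text analysis and understanding, speech processing and recognition, image/video processing and recognition, and information fusion. Corresponding author of this paper. E-mail: lbclm@163.com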

Publication history
- Received: 2016-01-29
- Accepted: 2016-08-15
- Published: 2018-01-20

Abstract: The bag of visual words (BoVW) model is the dominant approach in image retrieval. However, conventional BoVW methods suffer from high computational cost, weakly discriminative visual words, and poor robustness to interference, which makes them hard to apply to large-scale data. To address these problems, this paper proposes an image retrieval method based on an enhanced visual dictionary and query expansion. First, SIFT features are clustered with a density-based method (clustering by fast search and find of density peaks) to generate the visual dictionary, which improves both the efficiency and the quality of dictionary generation. Second, a chi-square model is used to analyze the correlation between visual words and image object categories, and visual words that carry no object information are removed to strengthen the discriminative power of the dictionary. Finally, a graph-based query expansion method reranks the initial retrieval results. Experiments on the Oxford5K and Paris6K datasets indicate that the method improves the quality and semantic discriminative power of the visual dictionary to some extent and outperforms current mainstream methods.
Keywords:
- bag of visual words (BoVW)
- density-based clustering
- chi-square model
- query expansion
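
The dictionary-generation step named in the abstract can be illustrated with a minimal sketch of density-peaks clustering. This is not the authors' implementation: the cutoff-kernel density, the rule of taking the `k` points with the largest density-times-distance score as centers, and the names `density_peaks`, `d_c`, and `k` are assumptions made for illustration. In the paper's setting the rows of `X` would be SIFT descriptors and the resulting cluster centers would serve as visual words; the toy data here are 2-D points.

```python
import numpy as np

def density_peaks(X, d_c, k):
    """Toy density-peaks clustering; returns (center_indices, labels)."""
    n = len(X)
    # Pairwise Euclidean distances (O(n^2) memory, so toy sizes only).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Local density: number of other points closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from densest to sparsest (ties broken by index).
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)            # distance to the nearest denser point
    parent = np.full(n, -1)        # index of that nearest denser point
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Score each point; the densest point is always kept as a center.
    gamma = rho * delta
    gamma[order[0]] = np.inf
    centers = np.argsort(-gamma, kind="stable")[:k]
    labels = np.full(n, -1)
    labels[centers] = np.arange(k)
    # Every other point inherits the label of its nearest denser point.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return centers, labels

# Two well-separated toy blobs, each with a dense middle point.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
              [10, 10], [10, 11], [11, 10], [11, 11], [10.5, 10.5]])
centers, labels = density_peaks(X, d_c=1.0, k=2)
print(centers, labels)
```

Unlike k-means, this procedure needs no iterative refinement: one pass over the points in density order assigns every label, which is one plausible reading of the efficiency gain the abstract claims.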
Notes:

1) Associate editor responsible for this paper: LIU Yue-Hu

Fig. 1 Flow chart of the image retrieval method based on an enhanced visual dictionary and query expansion

Fig. 2 Flow chart of the graph-based query expansion method

Fig. 3 Effect of the distance threshold $d_c$ on retrieval MAP

Fig. 4 Effect of visual dictionary size on retrieval MAP

Fig. 5 Effect of the number of removed stop words on retrieval MAP

Fig. 6 Retrieval MAP of different methods on the Oxford5K and Oxford5K+Paris6K databases

Fig. 7 Retrieval results of EVD+GBQE on the Oxford5K+Paris6K database

Table 1 Relation between visual word $w_i$ and the object categories

	$C_1$	$C_2$	$\cdots$	$C_m$	Total
Images containing $w_i$	$n_{11}$	$n_{12}$	$\cdots$	$n_{1m}$	$n_{1+}$
Images not containing $w_i$	$n_{21}$	$n_{22}$	$\cdots$	$n_{2m}$	$n_{2+}$
Total	$n_{+1}$	$n_{+2}$	$\cdots$	$n_{+m}$	$n_{++}$
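
A contingency table of the shape in Table 1 is the usual input to a Pearson chi-square test of independence. The sketch below computes that standard statistic for one visual word. The paper's exact scoring formula is not reproduced on this page, so treating the plain Pearson statistic as the word score is an assumption, as are the function name `chi_square` and the example counts. The sketch also assumes every row and column total is positive.

```python
import numpy as np

def chi_square(table):
    """Pearson chi-square statistic for a 2 x m contingency table.

    Row 0: images containing the word, per category.
    Row 1: images not containing the word, per category.
    Assumes all row and column totals are positive.
    """
    table = np.asarray(table, dtype=float)
    row_totals = table.sum(axis=1, keepdims=True)   # n_{1+}, n_{2+}
    col_totals = table.sum(axis=0, keepdims=True)   # n_{+1}, ..., n_{+m}
    # Expected counts if word presence were independent of category.
    expected = row_totals * col_totals / table.sum()
    return float(((table - expected) ** 2 / expected).sum())

# Hypothetical counts: a word spread evenly over three categories
# versus a word concentrated in the first category.
uninformative = [[10, 10, 10], [30, 30, 30]]
informative = [[30, 2, 1], [10, 38, 39]]
print(chi_square(uninformative), chi_square(informative))
```

A word whose occurrence is independent of the category scores near zero, so under this reading the lowest-scoring words are the candidates for removal as stop words.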

Table 2 Retrieval MAP (%) of different query expansion methods on the Oxford5K database

Landmark	Initial	AQE	KNNR	DQE	GBQE
All Souls	71.4	79.3	81.8	81.4	83.6
Ashmolean	76.5	81.2	83.1	85.1	87.4
Balliol	73.8	78.4	79.3	80.6	82.5
Bodleian	67.2	70.5	73.4	74.5	74.8
Christ Church	74.1	78.3	81.5	82.4	83.2
Cornmarket	77.4	82.1	81.8	83.2	84.3
Hertford	85.7	89.2	90.9	91.6	93.2
Keble	86.5	91.6	92.2	93.8	94.4
Magdalen	54.6	61.6	63.8	62.9	63.7
Pitt Rivers	92.4	95.6	95.3	95.1	97.6
Radcliffe Camera	74.4	80.8	82.6	84.7	86.1
Average	75.82	80.78	82.34	83.21	84.62
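
The MAP values reported above are means of per-query average precision. The sketch below shows one simple way to compute average precision and MAP from ranked result lists. It is a simplified illustration, not the official Oxford5K evaluation protocol, which additionally handles "junk" images; the function names and the toy image identifiers are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision of one ranked list against its relevant set."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            # Precision measured at the rank of each relevant hit.
            precision_sum += hits / rank
    return precision_sum / len(relevant)

def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Toy queries with hypothetical image identifiers.
runs = [
    (["a", "x", "b", "y"], {"a", "b"}),
    (["x", "a"], {"a"}),
]
print(mean_average_precision(runs))
```

Reranking raises average precision when it moves relevant images toward the top of the list, which is the effect the query expansion columns in Table 2 measure. The Average row of the table is the arithmetic mean of the eleven per-landmark values; for example, the Initial column averages to 75.82.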
