YAO Tao, KONG Xiang-Wei, FU Hai-Yan, TIAN Qi

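    YAO Tao  Ph. D. candidate at the School of Information and Communication Engineering, Dalian University of Technology. His research interest covers multimedia retrieval, computer vision, and pattern recognition. E-mail: yaotaoedu@mail.dlut.edu.cn

    FU Hai-Yan  Associate professor at the School of Information and Communication Engineering, Dalian University of Technology. She received her Ph. D. degree from Dalian University of Technology in 2014. Her research interest covers image retrieval and computer vision. E-mail: fuhy@dlut.edu.cn

    TIAN Qi  Professor in the Department of Computer Science at the University of Texas at San Antonio, USA. IEEE Fellow. He received his Ph. D. degree in electrical and computer engineering from the University of Illinois at Urbana-Champaign in 2002. His research interest covers multimedia information retrieval, pattern recognition, and computer vision. E-mail: qitian@cs.utsa.edu

    KONG Xiang-Wei  Professor at the Department of Data Science and Engineering Management, Zhejiang University. She received her Ph. D. degree in management science and engineering from Dalian University of Technology in 2003. She was a visiting scholar at Purdue University, USA, from 2006 to 2007. Her research interest covers artificial intelligence and business analysis, big data analysis, cross-modal retrieval and security. Corresponding author of this paper. E-mail: kongxiangwei@zju.edu.cn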

Projective Dictionary Learning Hashing for Cross-modal Retrieval


National Natural Science Foundation of China 71421001

Open Projects Program of National Laboratory of Pattern Recognition 201407349

National Natural Science Foundation of China 61429201

National Natural Science Foundation of China 61172109

National Natural Science Foundation of China 61502073

    YAO Tao  Ph. D. candidate at the School of Information and Communication Engineering, Dalian University of Technology. His research interest covers multimedia retrieval, computer vision, and machine learning.

    FU Hai-Yan  Associate professor at the School of Information and Communication Engineering, Dalian University of Technology. She received her Ph. D. degree from Dalian University of Technology in 2014. Her research interest covers image retrieval and computer vision.

    TIAN Qi  Professor in the Department of Computer Science at the University of Texas at San Antonio, USA. IEEE Fellow. He received his Ph. D. degree in electrical and computer engineering from the University of Illinois at Urbana-Champaign in 2002. His research interest covers multimedia information retrieval, machine learning, and computer vision.

    Corresponding author: KONG Xiang-Wei  Professor at the Department of Data Science and Engineering Management, Zhejiang University. She received her Ph. D. degree in management science and engineering from Dalian University of Technology in 2003. She was a visiting researcher at Purdue University, USA, from 2006 to 2007. Her research interest covers artificial intelligence and business analysis, big data analysis, cross-modal retrieval and security.
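  • Abstract: With the rapid growth of multimodal data on the Internet, retrieving data across modalities from massive collections has become a new challenge. Hashing methods map data into Hamming space, which greatly reduces computational complexity and offers an effective route to cross-modal retrieval over large-scale data. However, the hash codes generated by most existing methods carry no semantic information, which degrades retrieval performance. To address this problem, this paper proposes a cross-modal hashing retrieval algorithm based on projective dictionary learning. First, projective dictionaries are used to learn a shared semantic subspace in which the inter-modal similarity of the data is preserved. Then, an efficient iterative optimization algorithm is proposed to obtain the hash functions; however, it can be shown that the solution of this problem is not unique. Therefore, this paper proposes learning an orthogonal rotation matrix that minimizes the quantization error, which yields better-performing hash functions. Finally, experimental results on two public datasets show that the proposed algorithm outperforms other existing methods.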
  • Fig. 1  Convergence analysis of the proposed optimization algorithm

    Fig. 2  PR curves on the Wiki dataset with the code length fixed to 16 bits

    Fig. 3  PR curves on the Wiki dataset with the code length fixed to 32 bits

    Fig. 4  PR curves on the NUS-WIDE dataset with the code length fixed to 16 bits

    Fig. 5  PR curves on the NUS-WIDE dataset with the code length fixed to 32 bits

    Table 1  MAP@200 results on the Wiki dataset for the image-to-text and text-to-image retrieval tasks

    Table 2  MAP@200 results on the NUS-WIDE dataset for the image-to-text and text-to-image retrieval tasks

    Table 3  Training time (s) and MAP results with different numbers of training samples

    Training samples    Training time (s)    MAP       MAP
    10 000              30.25                0.4839    0.4603
    20 000              58.75                0.5466    0.4973
    50 000              750.77               0.5643    0.5520
    100 000             325.90               0.5719    0.5584
    150 000             504.59               0.6028    0.5603
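
The rotation step mentioned in the abstract (learning an orthogonal rotation matrix that minimizes the quantization error) can be illustrated with a minimal sketch. This is not the authors' exact procedure, which is not given in this section: it assumes an ITQ-style alternating scheme in which the binary codes and the rotation are updated in turn, with the rotation obtained from an orthogonal Procrustes problem. The function names `learn_rotation`, `quantization_error` and `hamming_distances`, and the random matrix `V` standing in for the learned shared-subspace representation, are hypothetical.

```python
import numpy as np


def quantization_error(V, R):
    """Squared Frobenius distance between V @ R and its sign codes."""
    Z = V @ R
    B = np.where(Z >= 0, 1.0, -1.0)
    return float(np.sum((B - Z) ** 2))


def learn_rotation(V, n_iter=50):
    """Assumed ITQ-style alternating minimisation of ||B - V R||_F^2.

    V : (n, k) real-valued embeddings (a stand-in for the shared subspace).
    Returns an orthogonal (k, k) matrix R and the (n, k) codes in {-1, +1}.
    """
    k = V.shape[1]
    R = np.eye(k)  # start from "no rotation"
    for _ in range(n_iter):
        # Fix R: the best binary codes are the signs of the rotated data.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B: orthogonal Procrustes problem, solved by an SVD of V^T B.
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    B = np.where(V @ R >= 0, 1.0, -1.0)
    return R, B


def hamming_distances(query_code, database_codes):
    """Hamming distances between one +/-1 code and each row of a code matrix."""
    k = database_codes.shape[1]
    return ((k - database_codes @ query_code) / 2).astype(int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    V = rng.standard_normal((200, 8))
    V -= V.mean(axis=0)  # zero-centre, as is usual before sign quantisation
    R, B = learn_rotation(V)
    print("error without rotation:", quantization_error(V, np.eye(8)))
    print("error with rotation:   ", quantization_error(V, R))
    print("nearest items to item 0:", np.argsort(hamming_distances(B[0], B))[:5])
```

Because each alternating update can only lower the objective, starting from the identity guarantees that the learned rotation never quantizes worse than the unrotated embeddings in this sketch. Retrieval then reduces to ranking database codes by Hamming distance to the query code.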
