基于字典学习的非线性降维方法

郑思龙; 李元祥; 魏宪; 彭希帅

doi:10.16383/j.aas.2016.c150557

基于字典学习的非线性降维方法

doi: 10.16383/j.aas.2016.c150557 cstr: 32138.14.j.aas.2016.c150557

郑思龙^1,,
李元祥^1, ,,
魏宪^2,,
彭希帅^1,

1.
上海交通大学航空航天学院上海 200240 中国
2.
慕尼黑工业大学电气与计算机工程系慕尼黑 D-80333 德国

基金项目:

国家自然科学基金 U1406404

国家自然科学基金 61331015

国家自然科学基金 41174164

详细信息

作者简介:
郑思龙上海交通大学航空航天学院硕士研究生.主要研究方向为图像处理, 计算机视觉与模式识别.E-mail:zhengsilong@sjtu.edu.cn

魏宪慕尼黑工业大学电气与计算机工程系博士研究生.主要研究方向为几何优化, 数据表达与图像处理.E-mail:xianweich@gmail.com

彭希帅上海交通大学航空航天学院博士研究生.主要研究方向为图像处理, 计算机视觉, 机器学习.E-mail:xishuaipeng@sjtu.edu.cn

通讯作者:
李元祥上海交通大学航空航天学院副教授.2001年获得清华大学电子工程系博士学位.主要研究方向为遥感图像解译, 图像识别, 图像重构与评估, 台风云图信息提取与汉字信息处理.本文通信作者.E-mail:yuanxli@sjtu.edu.cn

计量
- 文章访问数: 3496
- HTML全文浏览量: 595
- PDF下载量: 1616
- 被引次数: 0
出版历程
- 收稿日期: 2015-09-02
- 录用日期: 2016-01-13
- 刊出日期: 2016-07-01

Nonlinear Dimensionality Reduction Based on Dictionary Learning

1.
School of Aeronautics and Astronautics, Shanghai Jiao Tong University, Shanghai 200240, China
2.
Department of Electrical and Computer Engineering, Technische Universitaet München, Munich D-80333, Germany

Funds:

National Natural Science Foundation of China U1406404

National Natural Science Foundation of China 61331015

National Natural Science Foundation of China 41174164

More Information

Author Bio:
Master student at the School of Aeronautics and Astronautics, Shanghai Jiao Tong University.His research interest covers image processing, computer vision, and pattern recognition

Ph.D.candidate in the Department of Electrical and Computer Engineering, Technische Univesitaet M¨unchen, Germany.His research interest covers geometry optimization, data representation, and image processing

Ph.D.candidate at the School of Aeronautics and Astronautics, Shanghai Jiao Tong University.His research interest covers image processing, computer vision, and machine learning

Corresponding author: LI Yuan-Xiang Associate professor at the School of Aeronautics and Astronautics, Shanghai Jiao Tong University.He received his Ph.D.degree from the Department of Electronic Engineering, Tsinghua University in 2001.His research interest covers remote sensing image interpretation, image recognition, image reconstruction and evaluation, typhoon image information extraction, and Chinese information processing.Corresponding author of this paper

摘要

摘要: 目前，众多的数据降维（Dimensionality reduction, DR）方法（如经典的PCA（Principle component analysis）, ISOMAP（Isometric mapping））能够使降维后的数据保留原始信号的重要特征，但是从降维后的数据中很好地恢复出原始信号仍旧是一个挑战.近年来，稀疏表示（Sparse representation, SR）在信号重构研究中受到广泛关注，信号可以利用过完备字典中少数原子的线性组合来描述.本文提出一种基于字典学习的非线性降维方法.从高维输入信号到低维特征的降维过程中，期望一些重要的几何特征（内积、距离和夹角）得以保留，同时又能够从低维数据中恢复出原始信号.为达此目的，本文采用CDL（Concentrated dictionary learning）算法训练一个字典对（高维字典D和低维字典P），使高维原始信号的能量能够聚集于低维子空间中.字典D用来获取稀疏表示系数，字典P是D的直接降维采样，CDL算法能够保证P聚集D中的大部分能量.这样，信号的降维与恢复问题就转变为字典对的训练问题，信号的降维即为从D到P的能量保留过程.实验表明:CDL可在RIP（Restricted isomery property）条件的限制之外具有一定的信号重建能力，能在更低的维度条件下恢复图像，优于传统的压缩感知方法.此外，在噪声较大的情况下，CDL图像压缩效果优于JPEG2000.
- 数据降维 /
- 稀疏表示 /
- 压缩感知 /
- 字典学习
Abstract: Most classic dimensionality reduction (DR) algorithms (such as principle component analysis (PCA) and isometric mapping (ISOMAP)) focus on finding a low-dimensional embedding of original data, which are often not reversible. It is still challenging to make DR processes reversible in many applications. Sparse representation (SR) has shown its power on signal reconstruction and denoising. To tackle the problem of large scale dataset processing, this work focuses on developing a differentiable model for invertible DR based on SR. From high-dimensional input signal to the low-dimensional feature, we expect to preserve some important geometric features (such as inner product, distance and angle) such that the reliable reconstruction from the low dimensional space back to the original high dimensional space is possible. We employ the algorithm called concentrated dictionary learning (CDL) to train the high dimensional dictionary to concentrate the energy in its low dimensional subspace. Then we design a paired dictionaries: D and P, where D is used to obtain the sparse representation and P is a direct down-sampling of D. CDL can ensure P to capture the most energy of D. Then, the problem about signal reconstruction is transformed into how to train dictionaries D and P, so the process of input signal X to feature Y is transformed into the process of energy retention from D to P. Experimental results show that without the restrictions of linear projection using restricted isometry property (RIP), CDL can reconstruct the image at a lower dimensional space and outperform state-of-the-art DR methods (such as Gaussian random compressive sensing). In addition, for noise-corrupted images, CDL can obtain better compression performance than JPEG2000.
- Dimensionality reduction (DR) /
- sparse representation (SR) /
- compressed sensing (CS) /
- dictionary learning

HTML全文

图 1 CS算法示意图

Fig. 1 CS algorithm schematic

下载: 全尺寸图片幻灯片

图 4 不同字典奇异值的分布 (当选取90%主成分 $(t_d=0.9)$ 时, 图中的点代表所需的字典维数. DCT字典: 207, K-SVD字典: 148, CDL字典: 16

Fig. 4 Distribution of singular values from different dictionaries (When selecting 90% component of dictionaries $(t_d=0.9)$ , the dots represent the required dimensions for different dictionaries. DCT: 207, K-SVD: 148, CDL: 16)

下载: 全尺寸图片幻灯片

图 3 由CDL算法训练得到的字典对

Fig. 3 Coupled dictionaries trained by CDL

下载: 全尺寸图片幻灯片

图 2 自然图像示例

Fig. 2 Natural image examples

下载: 全尺寸图片幻灯片

图 5 Lena图像重构对比(原始图像、CS+K-SVD (GAU)、CS+K-SVD (PCA)、PCA、LPP、CDL, $d=16$ )

Fig. 5 Reconstructed images of Lena (Original image, CS+K-SVD (GAU), CS+K-SVD(PCA), PCA, LPP, CDL, $d=16$ )

下载: 全尺寸图片幻灯片

图 6 Boat图像重构对比(原始图像、CS+K-SVD (GAU)、CS+K-SVD (PCA)、PCA、LPP、CDL, $d=32$ )

Fig. 6 Reconstructed images of Boat (Original image, CS+K-SVD (GAU), CS+K-SVD (PCA), PCA, LPP, CDL, $d=32$ )

下载: 全尺寸图片幻灯片

图 7 USPS手写数字重构对比(从上到下: 原始图像、CS+K-SVD (GAU)、CS+K-SVD (PCA)、PCA、CDL， $d=16$ )

Fig. 7 Reconstructed images of USPS (From top to bottom:original image, CS+K-SVD (GAU), CS+K-SVD (PCA), PCA, CDL, $d=16$ )

下载: 全尺寸图片幻灯片

图 8 USPS手写数字三类样本(0、3、4)可视化结果( $d=3$ )

Fig. 8 The visualization results of USPS (0, 3, 4) ( $d=3$ )

下载: 全尺寸图片幻灯片

图 9 USPS手写数字三类样本(0、3、4)图像重构对比(从上到下:原始图像、CDL、CS+K-SVD (GAU))

Fig. 9 Reconstructed images of USPS (0, 3, 4) (From top to bottom: original image, CDL, CS+K-SVD (GAU))

下载: 全尺寸图片幻灯片

图 10 PIE人脸图像重构对比(从上到下: 原始图像、CS+K-SVD (GAU)、CS+K-SVD (PCA)、PCA、CDL, $d=16$ )

Fig. 10 Reconstructed images of PIE (From top to bottom:original image, CS+K-SVD (GAU), CS+K-SVD (PCA), PCA, CDL, $d=16$ )

下载: 全尺寸图片幻灯片

图 11 PIE人脸图像重构对比(从上到下: 原始图像、CS+PCA、CDL) ( $d=64$ )

Fig. 11 Reconstructed images of PIE (From top to bottom:original image, CS+PCA, CDL) ( $d=64$ ))

下载: 全尺寸图片幻灯片

图 12 Lena图像和Barara图像重构对比( $d=4$ )

Fig. 12 Reconstructed images of Lena and Barara ( $d=4$ )

下载: 全尺寸图片幻灯片

图 13 Lena图像压缩重建(从左到右: 原始图像、JPEG2000 (38.45 dB)、CDL (30.56 dB)) (无噪声, CR = 16)

Fig. 13 Reconstructed images of Lena (From left to right:original image, JPEG2000 (38.45 dB), CDL (30.56 dB)) (No noise, CR= 16)

下载: 全尺寸图片幻灯片

图 14 Lena图像压缩重建(从左到右: 原始图像、加噪图像(28.15 dB)、JPEG2000 (32.04 dB)、CDL (30.24 dB) ( $\sigma = 10$ , CR = 16))

Fig. 14 Reconstructed images of Lena (From left to right: original image, noisy image (28.15 dB), JPEG2000 (32.04 dB), CDL(30.24 dB) ( $\sigma = 10$ , CR = 16))

下载: 全尺寸图片幻灯片

图 15 Lena图像压缩重建(从左到右: 原始图像、加噪图像(22.09 dB)、JPEG2000 (25.93 dB)、CDL (29.34 dB) ( $\sigma = 20$ , CR = 16)

Fig. 15 Reconstructed images of Lena (From left to right:original image, noisy image (22.09 dB), JPEG2000 (25.93 dB), CDL (29.34 dB) ( $\sigma = 20$ , CR = 16))

下载: 全尺寸图片幻灯片

图 16 Lena图像压缩重建(从左到右: 原始图像、加噪图像(16.09 dB)、JPEG2000 (19.76 dB)、CDL (26.96 dB) ( $\sigma =40$ , CR = 16))

Fig. 16 Reconstructed images of Lena (From left to right:original image, noisy image (16.09 dB), JPEG2000 (19.76 dB),CDL (26.96 dB) ( $\sigma =40$ , CR = 16))

下载: 全尺寸图片幻灯片

表 1 Lena图像重构对比

Table 1 Comparison of reconstructed Lena images

维数		8	16	32	64
PSNR (dB)	CS+K-SVD (GAU)	25.38	27.62	29.35	34.16
	CS+K-SVD (PCA)	24.25	25.97	28.73	33.55
	PCA	23.28	26.11	27.66	32.67
	LPP	23.06	24.56	27.89	33.21
	CDL	29.14	31.13	33.86	37.25

下载: 导出CSV

表 2 Boat图像重构对比

Table 2 Comparison of reconstructed Boat images

维数		8	16	32	64
PSNR (dB)	CS+K-SVD (GAU)	22.16	23.35	26.25	30.01
	CS+K-SVD (PCA)	24.25	25.97	28.73	31.73
	PCA	23.74	26.51	28.22	31.52
	LPP	24.28	25.35	27.89	30.69
	CDL	25.08	26.98	29.52	32.59

下载: 导出CSV

表 3 USPS手写数字重构对比（d=16)

Table 3 Comparison of reconstructed USPS (d = 16)

数字	PSNR (dB)
数字	CS+K-SVD (GAU)	CS+K-SVD (PCA)	PCA	CDL
0	12.53	14.87	13.34	15.51
1	15.07	13.77	15.02	18.09
2	9.54	9.28	9.73	12.23
3	10.54	14.39	14.43	17.43
4	8.93	15.3	14.13	15.63
5	12.61	13.08	13.19	18.3
6	11.06	9.89	18.4	21.59
7	11.39	15.53	19.64	22.49
8	22.23	25.22	21.65	28.11
9	11.42	20.76	18.63	21.7

下载: 导出CSV

表 4 USPS手写数字重构对比（d = 8、16、32、64)

Table 4 Comparison of reconstructed USPS (d = 8, 16, 32, 64)

维数		8	16	32	64
PSNR (dB)	CS+K-SVD (GAU)	13.53	15.87	19.65	21.43
	CS+K-SVD (PCA)	15.21	18.71	22.81	25.97
	PCA	15.82	19.23	23.18	26.38
	CDL	19.11	22.75	25.57	27.25

下载: 导出CSV

表 5 PIE人脸图像重构对比(d = 8、16、32、64)

Table 5 Comparison of reconstructed PIE(d = 8, 16, 32, 64)

维数		8	16	32	64
PSNR (dB)	CS+K-SVD (GAU)	21.29	25.87	27.65	32.43
	CS+K-SVD (PCA)	25.35	28.71	32.81	37.97
	PCA	20.27	22.1	25.18	28.38
	CDL	27.14	30.75	36.57	40.25

下载: 导出CSV

表 6 USPS手写数字三类样本（0、3、4)图像重构对比

Table 6 Comparison of reconstructed USPS (0, 3, 4)

维数数字		d = 3			d =12
维数数字		0	3	4	0	3	4
PSNR (dB)	CS+K-SVD (GAU)	3.40	2.31	3.10	12.66	10.38	10.56
	CDL	12.43	10.22	11.10	18.05	19.49	18.69

下载: 导出CSV

表 7 Lena图像压缩重建（无噪声）

Table 7 Comparison of reconstructed Lena (No noise)

CR		4	16	32	64
PSNR(dB)	JPEG2000	46.32	41.82	38.45	35.19
PSNR(dB)	CDL	35.22	32.46	30.56	28.23

下载: 导出CSV

表 8 Lena图像压缩重建（σ = 10)

Table 8 Comparison of reconstructed Lena (σ=10)

CR		4	16	32	64
PSNR(dB)	JPEG2000	27.95	29.04	32.04	32.8
PSNR(dB)	CDL	32.37	31.53	30.24	28.14

下载: 导出CSV

表 9 Lena图像压缩重建(σ = 20)

Table 9 Comparison of reconstructed Lena (σ = 20)

CR		4	16	32	64
PSNR(dB)	JPEG2000	22.07	23.18	25.93	28.8
PSNR(dB)	CDL	28.36	29.52	29.34	27.85

下载: 导出CSV

表 10 Lena图像压缩重建( $\sigma =40$ )

Table 10 Comparison of reconstructed Lena ( $\sigma =40$ )

CR		4	16	32	64
PSNR(dB)	JPEG2000	28.36	29.52	26.96	27.85
PSNR(dB)	CDL	16.38	17.5	19.76	22.86

下载: 导出CSV

参考文献(25)

[1]	Van der Maaten L J P, Postma E O, Van den Herik H J. Dimensionality reduction:a comparative review. Journal of Machine Learning Research, 2009, 10:66-71 http://cn.bing.com/academic/profile?id=2137570937&encoded=0&v=paper_preview&mkt=zh-cn
[2]	Donoho D L. Compressed sensing. IEEE Transactions on Information Theory, 2006, 52(4):1289-1306 doi: 10.1109/TIT.2006.871582
[3]	Candés E J, Romberg J, Tao T. Robust uncertainty principles:exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 2006, 52(2):489-509 doi: 10.1109/TIT.2005.862083
[4]	Tropp J A, Gilbert A C. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on Information Theory, 2007, 53(12):4655-4666 doi: 10.1109/TIT.2007.909108
[5]	Mairal J, Bach F, Ponce J. Task-driven dictionary learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(4):791-804 doi: 10.1109/TPAMI.2011.156
[6]	Gao J B, Shi Q F, Caetano T S. Dimensionality reduction via compressive sensing. Pattern Recognition Letters, 2012, 33(9):1163-1170 doi: 10.1016/j.patrec.2012.02.007
[7]	Gkioulekas I A, Zickler T. Dimensionality reduction using the sparse linear model. In:Proceedings of the 2011 Advances in Neural Information Processing Systems. Granada, Spain:Morgan Kaufmann, 2011. 271-279
[8]	Calderbank R, Jafarpour S, Schapire R. Compressed Learning:Universal Sparse Dimensionality Reduction and Learning in the Measurement Domain. Technical Report, Princeton University, USA, 2009
[9]	Baraniuk R G, Wakin M B. Random projections of smooth manifolds. Foundations of Computational Mathematics, 2009, 9(1):51-77 doi: 10.1007/s10208-007-9011-z
[10]	Hegde C, Wakin M B, Baraniuk R G. Random projections for manifold learning. In:Proceedings of the 2008 Advances in Neural Information Processing Systems. Vancouver, Canada:Morgan Kaufmann, 2008. 641-648
[11]	Chen M H, Silva J, Paisley J, Wang C P, Dunson D, Carin L. Compressive sensing on manifolds using a nonparametric mixture of factor analyzers:algorithm and performance bounds. IEEE Transactions on Signal Processing, 2010, 58(12):6140-6155 doi: 10.1109/TSP.2010.2070796
[12]	Baraniuk R, Davenport M, DeVore R, Wakin M. A simple proof of the restricted isometry property for random matrices. Constructive Approximation, 2008, 28(3):253-263 doi: 10.1007/s00365-007-9003-x
[13]	Weiss Y, Chang H S, Freeman W T. Learning compressed sensing. In:Proceedings of the 24th Annual Allerton Conference on Communications, Control and Computing. Illinois, USA:Citeseer, 2007. 535-541
[14]	Baraniuk R G, Cevher V, Duarte M F, Hegde C. Model-based compressive sensing. IEEE Transactions on Information Theory, 2010, 56(4):1982-2001 doi: 10.1109/TIT.2010.2040894
[15]	Gleichman S, Eldar Y C. Blind compressed sensing. IEEE Transactions on Information Theory, 2011, 57(10):6958-6975 doi: 10.1109/TIT.2011.2165821
[16]	Silva J, Chen M H, Eldar Y C, Sapiro G, Carin L. Blind compressed sensing over a structured union of subspaces. eprint arXiv:1103.2469, 2011.
[17]	Zeyde R, Elad M, Protter M. On single image scale-up using sparse-representations. In:Proceedings of the 7th International Conference on Curves and Surfaces. Avignon, France:Springer, 2012. 711-730 http://cn.bing.com/academic/profile?id=1791560514&encoded=0&v=paper_preview&mkt=zh-cn
[18]	Yang J C, Wang Z W, Lin Z, Cohen S, Huang T. Coupled dictionary training for image super-resolution. IEEE Transactions on Image Processing, 2012, 21(8):3467-3478 doi: 10.1109/TIP.2012.2192127
[19]	Yang J C, Wright J, Huang T S, Ma Y. Image super-resolution via sparse representation. IEEE Transactions on Image Processing, 2010, 19(11):2861-2873 doi: 10.1109/TIP.2010.2050625
[20]	Adler A, Hel-Or Y, Elad M. A shrinkage learning approach for single image super-resolution with overcomplete representations. In:Proceedings of the 11th European Conference on Computer Vision. Heraklion, Crete, Greece:Springer, 2010. 622-635 http://cn.bing.com/academic/profile?id=2129696493&encoded=0&v=paper_preview&mkt=zh-cn
[21]	Wei X, Kleinsteuber M, Shen H. Invertible nonlinear dimensionality reduction via joint dictionary learning. In:Proceedings of the 12th International Conference on Latent Variable Analysis and Signal Separation. Liberec, Czech Republic:Springer, 2015. 279-286 doi: 10.1007/978-3-319-22482-4_32
[22]	Aharon M, Elad M, Bruckstein A. K-SVD:an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 2006, 54(11):4311-4322 doi: 10.1109/TSP.2006.881199
[23]	Elad M, Aharon M. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 2006, 15(12):3736-3745 doi: 10.1109/TIP.2006.881969
[24]	Pati Y C, Rezaiifar R, Krishnaprasad P S. Orthogonal matching pursuit:recursive function approximation with applications to wavelet decomposition. In:Proceedings of the 27th Asilomar Conference on Signals, Systems and Computers. Pacific Grove, USA:IEEE, 1993. 40-44 http://www.oalib.com/paper/4238169
[25]	Tropp J A. Greed is good:algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 2004, 50(10):2231-2242 doi: 10.1109/TIT.2004.834793