Generative Adversarial Network for Generating Low-rank Images

ZHAO Shu-Yang^1, LI Jian-Wu^1

doi: 10.16383/j.aas.2018.c170473

1. Beijing Key Laboratory of Intelligent Information Technology, School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081

Funds:

National Natural Science Foundation of China 61271374

Author Bio:
ZHAO Shu-Yang Master student at the School of Computer Science and Technology, Beijing Institute of Technology. Her research interest covers computer vision, image processing, and machine intelligence. E-mail: zsyprich@bit.edu.cn

Corresponding author: LI Jian-Wu Ph. D., associate professor at the School of Computer Science and Technology, Beijing Institute of Technology. His research interest covers computer vision, image processing, and super-resolution image reconstruction. Corresponding author of this paper. E-mail: ljw@bit.edu.cn

Publication history
- Received: 2017-08-29
- Accepted: 2017-12-14
- Published: 2018-05-20

Abstract: Low-rank texture is a structure of important geometric significance in image processing. By extracting low-rank textures, images disturbed by various transformations can be rectified effectively. For the rectification of such low-rank images, this paper uses a generative framework to alleviate the poor rectification results in regions without obvious low-rank properties, and proposes a low-rank texture generative adversarial network (LR-GAN), an unsupervised image-to-image algorithm. Firstly, the traditional unsupervised low-rank texture mapping algorithm, transform invariant low-rank textures (TILT), is added to the network as guidance to assist the discriminator, so that the whole network learns without supervision and both the generator and the discriminator learn a structured low-rank representation. Secondly, to ensure that the generated images have both high image quality and relatively low rank, and considering that optimization under a low-rank constraint is hard to solve (an NP-hard problem), a low-rank gradient filter layer is designed and added after a certain stage of TILT guidance to approach the low-rank optimal solution of the network. Experiments are conducted on three datasets, MNIST, SVHN and FG-NET, and the quality of the generated low-rank images is evaluated with a classification algorithm. The results show that the proposed LR-GAN achieves good generation quality and recognition performance.
- Generative adversarial network (GAN) /
- low-rank texture generative adversarial network (LR-GAN) /
- structured low-rank representation /
- low-rank constraint
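The abstract notes that optimizing directly under a low-rank constraint is NP-hard. A common workaround in the low-rank literature is to replace the rank with the nuclear norm (the sum of singular values), whose proximal operator is singular value soft-thresholding. The sketch below illustrates only that generic relaxation on a toy matrix. It is not the low-rank gradient filter layer of LR-GAN, whose details are not given on this page; the function names, the threshold `tau`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def singular_value_threshold(image, tau):
    """Soft-threshold the singular values of a 2-D array by tau."""
    u, s, vt = np.linalg.svd(image, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # singular values below tau become zero
    return (u * s_shrunk) @ vt


def nuclear_norm(image):
    """Sum of singular values, a convex surrogate for the rank."""
    return float(np.linalg.svd(image, compute_uv=False).sum())


rng = np.random.default_rng(0)
# Hypothetical data: a rank-2 "texture" plus small dense noise.
clean = rng.standard_normal((32, 2)) @ rng.standard_normal((2, 32))
noisy = clean + 0.01 * rng.standard_normal((32, 32))

low_rank = singular_value_threshold(noisy, tau=1.0)
print("rank before:", np.linalg.matrix_rank(noisy))
print("rank after: ", np.linalg.matrix_rank(low_rank))
```

Thresholding removes the many small singular values contributed by the noise while keeping the two dominant ones, which is why the nuclear norm is a convenient stand-in when the rank itself cannot be optimized directly.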
Note:

1) Recommended by Associate Editor WANG Kun-Feng

Fig. 1 Examples of image correction using TILT

Fig. 2 The structure of LR-GAN ((a) The overall algorithm flow of LR-GAN; (b) The generator produces the low-rank texture image of the original image; (c) The discriminator is trained adversarially on the images produced by the generator and the images transformed by TILT; (d) The low-rank gradient filter layer added in the late stage of training)

Fig. 3 Training and fine-tuning of the network

Fig. 4 The generative process on the MNIST dataset

Fig. 5 The loss of both the generator and the discriminator on MNIST during the iterations

Fig. 6 The changes of the image rank during the generator iterations on MNIST

Fig. 7 The generative process on the SVHN dataset

Fig. 8 The loss of both the generator and the discriminator on SVHN during the iterations

Fig. 9 The changes of the image rank during the generator iterations on SVHN

Fig. 10 The generative process on the FG-NET dataset

Fig. 11 The loss of both the generator and the discriminator on FG-NET during the iterations

Fig. 12 The changes of the image rank during the generator iterations on FG-NET

Table 1 The average rank on MNIST and SVHN datasets

Method	MNIST	SVHN
TILT	31	46
LR-GAN	29	40
LR-GAN + Filter	27	37
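An average rank such as the one reported in Table 1 can be computed by taking the numerical rank of each image and averaging over the set. The following is a minimal sketch of that computation using NumPy. The helper name `average_rank`, the toy batch, and the default tolerance are assumptions; the page does not state which tolerance or image size the authors used.

```python
import numpy as np


def average_rank(images, tol=None):
    """Mean numerical rank over a collection of 2-D grayscale images.

    tol is passed to numpy.linalg.matrix_rank; None selects NumPy's default
    tolerance, which is an assumption and not a setting taken from the paper.
    """
    ranks = [
        np.linalg.matrix_rank(np.asarray(img, dtype=float), tol=tol)
        for img in images
    ]
    return float(np.mean(ranks))


# Toy batch: one rank-1 image (identical columns) and one full-rank 4x4 image.
batch = [np.outer(np.arange(1, 5), np.ones(4)), np.eye(4)]
print(average_rank(batch))
```

For color images such as SVHN, the rank would have to be computed per channel or on a grayscale conversion; which of these the authors used is not stated here.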

Table 2 The classification performance on distorted MNIST

Database	Method	mAP
MNIST	no	0.5701
MNIST	TILT	0.6303
MNIST	ours	0.6497

Table 3 The classification performance on SVHN

Database	Method	mAP
SVHN	no	0.9609
SVHN	TILT	0.9701
SVHN	ours	0.9756

