Design and Application of Deep Belief Network with Adaptive Learning Rate

QIAO Jun-Fei^{1,2}, WANG Gong-Ming^{1,2}, LI Xiao-Li^{1}, HAN Hong-Gui^{1,2}, CHAI Wei^{1,2}

doi: 10.16383/j.aas.2017.c160389

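1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124
2. Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124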

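Funding:

National Natural Science Fund for Distinguished Young Scholars 61225016

National Natural Science Foundation of China 61533002

National Natural Science Foundation of China 61473034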

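Author biographies:
QIAO Jun-Fei    Professor at the Faculty of Information Technology, Beijing University of Technology. His research interests cover intelligent control and the analysis and design of neural networks. E-mail: junfeq@bjut.edu.cn

LI Xiao-Li    Professor at the Faculty of Information Technology, Beijing University of Technology. He received his master's degree in control theory and control engineering from Dalian University of Technology in 1997 and his Ph.D. degree from Northeastern University in 2000. His research interests cover multiple model adaptive control and neural network control. E-mail: lixiaolibjut@bjut.edu.cn

HAN Hong-Gui    Professor at the Faculty of Information Technology, Beijing University of Technology. His research interests cover modelling and control of wastewater treatment processes and the analysis and design of neural networks. E-mail: rechardhan@sina.com

CHAI Wei    Lecturer at the Faculty of Information Technology, Beijing University of Technology. His research interests cover system identification and state estimation. E-mail: chaiwei@bjut.edu.cn

Corresponding author:
WANG Gong-Ming    Ph.D. candidate at the Faculty of Information Technology, Beijing University of Technology. His research interests cover deep learning and the structural design and optimization of neural networks. Corresponding author of this paper. E-mail: xiaowangqsd@163.com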

Publication history
- Received: 2016-05-10
- Accepted: 2016-10-09
- Published: 2017-08-20

Abstract: To address the long pre-training time of the deep belief network (DBN), a DBN with adaptive learning rate (ALRDBN) is proposed. ALRDBN introduces an adaptive learning rate into the contrastive divergence (CD) algorithm, so that the learning step size is adjusted automatically and CD converges faster. A weight training method based on the adaptive learning rate is then designed, and the admissible range of the learning-rate change coefficients is determined through network performance analysis. Finally, ALRDBN is tested in a series of experiments. The simulation results show that ALRDBN converges faster and that its prediction accuracy also improves.

Keywords: deep belief network; adaptive learning rate; contrastive divergence; convergence rate; performance analysis
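
The idea summarized in the abstract, adjusting the CD learning step automatically, can be illustrated with a minimal NumPy sketch. It is not the authors' algorithm: this page does not give the update rule, so the sketch assumes a common sign-agreement scheme in which each weight's learning rate is multiplied by a hypothetical factor `alpha` (greater than 1) when two successive CD-1 gradient estimates agree in sign, and by a hypothetical factor `beta` (less than 1) when they disagree. The function names, the default values, and the clipping bounds `lo` and `hi` are likewise assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def adapt_rates(rates, grad, prev_grad, alpha=1.2, beta=0.5, lo=1e-4, hi=1.0):
    """Assumed per-weight rule: grow the rate by alpha where successive
    gradient estimates agree in sign, shrink it by beta where they disagree,
    and leave it unchanged where either estimate is zero."""
    agreement = np.sign(grad) * np.sign(prev_grad)
    factor = np.where(agreement > 0, alpha, np.where(agreement < 0, beta, 1.0))
    return np.clip(rates * factor, lo, hi)


def cd1_step(v0, W, b, c, rng):
    """One CD-1 gradient estimate for a binary RBM on the batch v0."""
    h0_prob = sigmoid(v0 @ W + c)                      # positive phase
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    v1_prob = sigmoid(h0 @ W.T + b)                    # reconstruction
    h1_prob = sigmoid(v1_prob @ W + c)                 # negative phase
    n = v0.shape[0]
    grad_W = (v0.T @ h0_prob - v1_prob.T @ h1_prob) / n
    grad_b = (v0 - v1_prob).mean(axis=0)
    grad_c = (h0_prob - h1_prob).mean(axis=0)
    recon_error = float(np.mean((v0 - v1_prob) ** 2))
    return grad_W, grad_b, grad_c, recon_error


def train_rbm(data, n_hidden, epochs=50, init_rate=0.05, bias_rate=0.05, seed=0):
    """Full-batch CD-1 training with adaptive per-weight rates.
    Biases use a fixed rate to keep the sketch short."""
    rng = np.random.default_rng(seed)
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b = np.zeros(n_visible)
    c = np.zeros(n_hidden)
    rates = np.full_like(W, init_rate)
    prev_grad = np.zeros_like(W)   # zero on the first epoch, so rates start unchanged
    errors = []
    for _ in range(epochs):
        grad_W, grad_b, grad_c, err = cd1_step(data, W, b, c, rng)
        rates = adapt_rates(rates, grad_W, prev_grad)
        W += rates * grad_W
        b += bias_rate * grad_b
        c += bias_rate * grad_c
        prev_grad = grad_W
        errors.append(err)
    return W, rates, errors
```

Calling `train_rbm(data, n_hidden=4)` on a binary matrix `data` of shape (samples, features) returns the weights, the final per-weight rates, and the per-epoch reconstruction error. Clipping the rates is a defensive choice in this sketch, since repeated multiplication by `alpha` would otherwise grow without bound.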
Note:

1) Associate editor in charge of this paper: WANG Zhan-Shan


Fig. 1 The structure of ALRDBN

Fig. 2 Restricted Boltzmann machine

Fig. 3 The CD-k algorithm

Fig. 4 Hierarchical representation structure of ALRDBN

Fig. 5 Error back-propagated from top layer of ALRDBN

Fig. 6 The reconstruction error of top RBM

Fig. 7 The original images misclassified by ALRDBN

Fig. 8 The images misclassified by ALRDBN

Fig. 9 Effect of the number of hidden neurons on convergence time

Fig. 10 Influence of $\alpha$ and $\beta$ on convergence time

Fig. 11 The training results of ALRDBN

Fig. 12 The test results of ALRDBN

Fig. 13 The training RMSE of ALRDBN

Fig. 14 Effect of the number of hidden neurons on convergence time

Fig. 15 Influence of $\alpha$ and $\beta$ on convergence time

Fig. 16 The training results of ALRDBN

Fig. 17 The test results of ALRDBN

Fig. 18 The training RMSE of ALRDBN

Fig. 19 Effect of the number of hidden neurons on convergence time

Fig. 20 Influence of $\alpha$ and $\beta$ on convergence time

Table 1 Comparison of results for the MNIST handwritten digit experiment

Method	Hidden layers	Nodes per layer	Recognition accuracy	Running time (s)
ALRDBN	2	100	93.1%	20.0
CDBN	2	100	93.0%	34.3
DBN [21]	2	100	92.6%	32.9

Table 2 Comparison of results for the CO₂ concentration forecasting experiment

Method	Network structure	RMSE (training)	RMSE (test)	Running time (s)
ALRDBN	3-20-40-1	0.9164	1.1671	7.6
DBN	3-20-40-1	0.9487	1.2830	11.9
CDBN [22]	3-20-40-1	0.9133	1.1507	11.5
BP	3-60-1	>0.1	1.3 ~ 6.6	15.8

Table 3 Comparison of results for the Lorenz time series forecasting experiment

Method	Network structure	RMSE (training)	RMSE (test)	Running time (s)
ALRDBN	3-3-3-	0.0210	0.0225	2.9
DBN	3-3-3-1	0.0371	0.0388	3.6
CDBN	3-3-3-1	0.0208	0.0223	3.2
BPNN [23]	3-6-1	0.0700	0.0835	>10
SRNN [24]	3-6-1	0.0232	0.0302	6.7

References (24)

[1]	Bengio Y H, Delalleau O. On the expressive power of deep architectures. In: Proceedings of the 22nd International Conference. Berlin Heidelberg, Germany: Springer-Verlag, 2011. 18-36
[2]	Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks. Science, 2006, 313(5786): 504-507 doi: 10.1126/science.1127647
[3]	Guo Xiao-Xiao, Li Cheng, Mei Qiao-Zhu. Deep learning applied to games. Acta Automatica Sinica, 2016, 42(5): 676-684 (in Chinese) http://www.aas.net.cn/CN/abstract/abstract18857.shtml
[4]	LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436-444 doi: 10.1038/nature14539
[5]	He Yu-Yao, Li Bao-Qi. A combinatory form learning rate scheduling for deep learning model. Acta Automatica Sinica, 2016, 42(6): 953-958 (in Chinese) http://www.aas.net.cn/CN/abstract/abstract18886.shtml
[6]	Ma Shuai, Shen Tao, Wang Rui-Qi, Lai Hua, Yu Zheng-Tao. Terahertz spectroscopic identification with deep belief network. Spectroscopy and Spectral Analysis, 2015, 35(12): 3325-3329 (in Chinese) http://cdmd.cnki.com.cn/Article/CDMD-10674-1015636153.htm
[7]	Geng Zhi-Qiang, Zhang Yi-Kang. An improved deep belief network inspired by glia chains. Acta Automatica Sinica, 2016, 42(6): 943-952 (in Chinese) http://www.aas.net.cn/CN/abstract/abstract18885.shtml
[8]	Abdel-Zaher A M, Eldeib A M. Breast cancer classification using deep belief networks. Expert Systems with Applications, 2016, 46: 139-144 doi: 10.1016/j.eswa.2015.10.015
[9]	Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors. Nature, 1986, 323(6088): 533-536 doi: 10.1038/323533a0
[10]	Mohamed A R, Dahl G E, Hinton G. Acoustic modeling using deep belief networks. IEEE Transactions on Audio, Speech, and Language Processing, 2012, 20(1): 14-22 doi: 10.1109/TASL.2011.2109382
[11]	Bengio Y. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2009, 2(1): 1-127 doi: 10.1561/2200000006
[12]	Lopes N, Ribeiro B. Improving convergence of restricted Boltzmann machines via a learning adaptive step size. Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, Lecture Notes in Computer Science. Berlin Heidelberg: Springer, 2012. 511-518
[13]	Raina R, Madhavan A, Ng A Y. Large-scale deep unsupervised learning using graphics processors. In: Proceedings of the 26th Annual International Conference on Machine Learning. New York, NY, USA: ACM, 2009. 873-880
[14]	Ly D L, Paprotski V, Danny Y. Neural Networks on GPUs: Restricted Boltzmann Machines, Technical Report, University of Toronto, Canada, 2009.
[15]	Lopes N, Ribeiro B. Towards adaptive learning with improved convergence of deep belief networks on graphics processing units. Pattern Recognition, 2014, 47(1): 114-127 doi: 10.1016/j.patcog.2013.06.029
[16]	Le Roux N, Bengio Y. Representational power of restricted Boltzmann machines and deep belief networks. Neural Computation, 2008, 20(6): 1631-1649 doi: 10.1162/neco.2008.04-07-510
[17]	Hinton G E. Training products of experts by minimizing contrastive divergence. Neural Computation, 2002, 14(8): 1771-1800 doi: 10.1162/089976602760128018
[18]	Yu X H, Chen G A, Cheng S X. Dynamic learning rate optimization of the backpropagation algorithm. IEEE Transactions on Neural Networks, 1995, 6(3): 669-677 doi: 10.1109/72.377972
[19]	Magoulas G D, Vrahatis M N, Androulakis G S. Improving the convergence of the backpropagation algorithm using learning rate adaptation methods. Neural Computation, 1999, 11(7): 1769-1796 doi: 10.1162/089976699300016223
[20]	Lee H, Ekanadham C, Ng A. Sparse deep belief net model for visual area V2. In: Proceedings of the 2008 Advances in Neural Information Processing Systems. Cambridge, MA: MIT Press, 2008. 873-880
[21]	Ji N N, Zhang J S, Zhang C X. A sparse-response deep belief network based on rate distortion theory. Pattern Recognition, 2014, 47(9): 3179-3191 doi: 10.1016/j.patcog.2014.03.025
[22]	Qiao Jun-Fei, Pan Guang-Yuan, Han Hong-Gui. Design and application of continuous deep belief network. Acta Automatica Sinica, 2015, 41(12): 2138-2146 (in Chinese) http://www.aas.net.cn/CN/abstract/abstract18786.shtml
[23]	Chang L C, Chen P A, Chang F J. Reinforced two-step-ahead weight adjustment technique for online training of recurrent neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2012, 23(8): 1269-1278 doi: 10.1109/TNNLS.2012.2200695
[24]	Chen Q L, Chai W, Qiao J F. A stable online self-constructing recurrent neural network. Advances in Neural Networks-ISNN 2011. Berlin Heidelberg: Springer, 2011, 6677: 122-131