Reinforcement Learning-Based Online Control Algorithm for Thickener Underflow Concentration

Yuan Zhao-Lin; He Run-Zi; Yao Chao; Li Jia; Ban Xiao-Juan

doi:10.16383/j.aas.c190348
