[1]
|
Zhang H G, Zhang X, Luo Y H, Yang J. An overview of research on adaptive dynamic programming. Acta Automatica Sinica, 2013, 39(4):303-311 doi: 10.1016/S1874-1029(13)60031-2
|
[2]
|
Liu D R, Li H L, Wang D. Data-based self-learning optimal control:research progress and prospects. Acta Automatica Sinica, 2013, 39(11):1858-1870 doi: 10.3724/SP.J.1004.2013.01858
|
[3]
|
Werbos P. Beyond Regression:New Tools for Prediction and Analysis in the Behavioral Sciences[Ph., D. dissertation], Harvard University, USA, 1974.
|
[4]
|
Prokhorov D V, Wunsch D C. Adaptive critic designs. IEEE Transactions on Neural Networks, 1997, 8(5):997-1007 doi: 10.1109/72.623201
|
[5]
|
Padhi R, Unnikrishnan N, Wang X H, Balakrishnan S N. A single network adaptive critic (SNAC) architecture for optimal control synthesis for a class of nonlinear systems. Neural Networks, 2006, 19(10):1648-1660 doi: 10.1016/j.neunet.2006.08.010
|
[6]
|
Wang Y, O'Donoghue B, Boyd S. Approximate dynamic programming via iterated Bellman inequalities. International Journal of Robust and Nonlinear Control, 2015, 25(10):1472-1496 doi: 10.1002/rnc.v25.10
|
[7]
|
Bertsekas D P, Tsitsiklis J N. Neuro-dynamic programming:an overview. In:Proceedings of the 34th IEEE Conference on Decision and Control. New Orleans, LA, USA:IEEE, 1995. 560-564
|
[8]
|
Zhu L M, Modares H, Peen G O, Lewis F L, Yue B Z. Adaptive suboptimal output-feedback control for linear systems using integral reinforcement learning. IEEE Transactions on Control Systems Technology, 2015, 23(1):264-273 doi: 10.1109/TCST.2014.2322778
|
[9]
|
Bhasin S. Reinforcement Learning and Optimal Control Methods for Uncertain Nonlinear Systems[Ph., D. dissertation], University of Florida, USA, 2011.
|
[10]
|
Vrabie D, Vamvoudakis K G, Lewis F L. Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles. London:IET, 2012.
|
[11]
|
Zhang H, Liu D, Luo Y, Wang D. Adaptive Dynamic Programming for Control:Algorithms and Stability. London:Springer-Verlag, 2013.
|
[12]
|
Lewis F L, Liu D R. Reinforcement Learning and Approximate Dynamic Programming for Feedback Control. New Jersey:IEEE Press, 2013.
|
[13]
|
Jiang Z P, Jiang Y. Robust adaptive dynamic programming for linear and nonlinear systems:an overview. European Journal of Control, 2013, 19(5):417-425 doi: 10.1016/j.ejcon.2013.05.017
|
[14]
|
Khan S G, Herrmann G, Lewis F L, Pipe T, Melhuish C. Reinforcement learning and optimal adaptive control:an overview and implementation examples. Annual Reviews in Control, 2012, 36(1):42-59 doi: 10.1016/j.arcontrol.2012.03.004
|
[15]
|
Buşoniu L, Ernst D, De Schutter B, Babuška R. Approximate reinforcement learning:an overview. In:Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning. Paris, France:IEEE, 2011. 1-8
|
[16]
|
刁兆师. 导弹精确高效末制导与控制若干关键技术研究[博士学位论文], 北京理工大学, 中国, 2015. http://cdmd.cnki.com.cn/Article/CDMD-10007-1015801403.htmDiao Zhao-Shi. Research on High-precision and High-efficiency Terminal Guidance and Control Key Technologies for Missiles[Ph., D. dissertation], Beijing Institute of Technology, China, 2015. http://cdmd.cnki.com.cn/Article/CDMD-10007-1015801403.htm
|
[17]
|
李运迁. 大气层内拦截弹制导控制及一体化研究[博士学位论文], 哈尔滨工业大学, 中国, 2011. http://cdmd.cnki.com.cn/Article/CDMD-10213-1012000340.htmLi Yun-Qian. Integrated Guidance and Control for Endo-Atmospheric Interceptors[Ph., D. dissertation], Harbin Institute of Technology, China, 2011. http://cdmd.cnki.com.cn/Article/CDMD-10213-1012000340.htm
|
[18]
|
孙传鹏. 基于博弈论的拦截制导问题研究[博士学位论文], 哈尔滨工业大学, 中国, 2014. http://cdmd.cnki.com.cn/Article/CDMD-10213-1014081874.htmSun Chuan-Peng. Research on Interception Guidance Based on Game Theory[Ph., D. dissertation], Harbin Institute of Technology, China, 2014. http://cdmd.cnki.com.cn/Article/CDMD-10213-1014081874.htm
|
[19]
|
Liu D R, Wei Q L. Policy iteration adaptive dynamic programming algorithm for discrete-time nonlinear systems. IEEE Transactions on Neural Networks and Learning Systems, 2014, 25(3):621-634 doi: 10.1109/TNNLS.2013.2281663
|
[20]
|
Wei Q L, Liu D R, Lin Q, Song R Z. Discrete-time optimal control via local policy iteration adaptive dynamic programming. IEEE Transactions on Cybernetics, 2016, DOI: 10.1109/TCYB.2016.2586082
|
[21]
|
Zhang H G, Song R Z, Wei Q L, Zhang T Y. Optimal tracking control for a class of nonlinear discrete-time systems with time delays based on heuristic dynamic programming. IEEE Transactions on Neural Networks, 2011, 22(12):1851-1862 doi: 10.1109/TNN.2011.2172628
|
[22]
|
Song R Z, Xiao W D, Zhang H G, Sun C Y. Adaptive dynamic programming for a class of complex-valued nonlinear systems. IEEE Transactions on Neural Networks and Learning Systems, 2014, 25(9):1733-1739 doi: 10.1109/TNNLS.2014.2306201
|
[23]
|
Wei Q L, Liu D R, Yang X. Infinite horizon self-learning optimal control of nonaffine discrete-time nonlinear systems. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(4):866-879 doi: 10.1109/TNNLS.2015.2401334
|
[24]
|
Wei Q L, Liu D R, Lewis F L, Liu Y, Zhang J. Mixed iterative adaptive dynamic programming for optimal battery energy control in smart residential microgrids. IEEE Transactions on Industrial Electronics, 2017, DOI:10. 1109/TIE.2017.265087
|
[25]
|
Wei Q L, Liu D R, Lin Q, Song R Z. Adaptive dynamic programming for discrete-time zero-sum games. IEEE Transactions on Neural Networks and Learning Systems, 2017, DOI: 10.1109/TNNLS.2016.2638863
|
[26]
|
Wei Q L, Liu D R. A novel policy iteration based deterministic Q-learning for discrete-time nonlinear systems. Science China Information Sciences, 2015, 58(12):1-15 doi: 10.1007%2F978-981-10-4080-1_4
|
[27]
|
Kiumarsi B, Lewis F L, Modares H, Karimpour A, Naghibi-Sistani M B. Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics. Automatica, 2014, 50(4):1167-1175 doi: 10.1016/j.automatica.2014.02.015
|
[28]
|
Vamvoudakis K G. Non-zero sum Nash Q-learning for unknown deterministic continuous-time linear systems. Automatica, 2015, 61:274-281 doi: 10.1016/j.automatica.2015.08.017
|
[29]
|
Murray J J, Cox C J, Lendaris G G, Saeks R. Adaptive dynamic programming. IEEE Transactions on Systems, Man, and Cybernetics, Part C:Applications and Reviews, 2002, 32(2):140-153 doi: 10.1109/TSMCC.2002.801727
|
[30]
|
Al-Tamimi A, Lewis F L, Abu-Khalaf M. Discrete-time nonlinear HJB solution using approximate dynamic programming:convergence proof. IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics, 2008, 38(4):943-949 doi: 10.1109/TSMCB.2008.926614
|
[31]
|
Wang F Y, Jin N, Liu D R, Wei Q L. Adaptive dynamic programming for finite-horizon optimal control of discrete-time nonlinear systems with şvarepsilon-error bound. IEEE Transactions on Neural Networks, 2011, 22(1):24-36 doi: 10.1109/TNN.2010.2076370
|
[32]
|
Heydari A, Balakrishnan S N. Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics. IEEE Transactions on Neural Networks and Learning Systems, 2013, 24(1):145-157 doi: 10.1109/TNNLS.2012.2227339
|
[33]
|
Wei Q L, Liu D R, Xu Y C. Neuro-optimal tracking control for a class of discrete-time nonlinear systems via generalized value iteration adaptive dynamic programming approach. Soft Computing, 2016, 20(2):697-706 doi: 10.1007/s00500-014-1533-0
|
[34]
|
Wei Q L, Liu D R, Lin H Q. Value iteration adaptive dynamic programming for optimal control of discrete-time nonlinear systems. IEEE Transactions on Cybernetics, 2016, 46(3):840-853 doi: 10.1109/TCYB.2015.2492242
|
[35]
|
Wei Q L, Lewis F L, Sun Q Y, Yan P F, Song R Z. Discrete-time deterministic Q-learning:a novel convergence analysis. IEEE Transactions on Cybernetics, 2017, 47(5):1024-0237 http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7450633&
|
[36]
|
Wei Q L, Song R Z, Sun Q Y. Nonlinear neuro-optimal tracking control via stable iterative Q-learning algorithm. Neurocomputing, 2015, 168:520-528 doi: 10.1016/j.neucom.2015.05.075
|
[37]
|
Wei Q L, Liu D R. Stable iterative adaptive dynamic programming algorithm with approximation errors for discrete-time nonlinear systems. Neural Computing and Applications, 2014, 24(6):1355-1367 doi: 10.1007/s00521-013-1361-7
|
[38]
|
Wei Q L, Liu D R. Adaptive dynamic programming for optimal tracking control of unknown nonlinear systems with application to coal gasification. IEEE Transactions on Automation Science and Engineering, 2014, 11(4):1020-1036 doi: 10.1109/TASE.2013.2284545
|
[39]
|
Wei Q L, Liu D R. Numerical adaptive learning control scheme for discrete-time non-linear systems. IET Control Theory and Applications, 2013, 7(11):1472-1486 doi: 10.1049/iet-cta.2012.0486
|
[40]
|
Wei Q L, Liu D R, Lin Q. Discrete-time local value iteration adaptive dynamic programming:admissibility and termination analysis. IEEE Transactions on Neural Networks and Learning Systems, 2017, DOI:10.1109/TNNLS. 2016.2593743
|
[41]
|
Wei Q L, Wang F Y, Liu D R, Yang X. Finite-approximation-error-based discrete-time iterative adaptive dynamic programming. IEEE Transactions on Cybernetics, 2014, 44(12):2820-2833 doi: 10.1109/TCYB.2014.2354377
|
[42]
|
Zhang H G, Luo Y H, Liu D R. Neural-network-based near-optimal control for a class of discrete-time affine nonlinear systems with control constraints. IEEE Transactions on Neural Networks, 2009, 20(9):1490-1503 doi: 10.1109/TNN.2009.2027233
|
[43]
|
Wei Q L, Zhang H G, Liu D R, Zhao Y. An optimal control scheme for a class of discrete-time nonlinear systems with time delays using adaptive dynamic programming. Acta Automatica Sinica, 2010, 36(1):121-129 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.669.1013&rep=rep1&type=pdf
|
[44]
|
Song R Z, Wei Q L, Sun Q Y. Nearly finite-horizon optimal control for a class of nonaffine time-delay nonlinear systems based on adaptive dynamic programming. Neurocomputing, 2015, 156:166-175 doi: 10.1016/j.neucom.2014.12.066
|
[45]
|
Wang D, Liu D R, Wei Q L. Finite-horizon neuro-optimal tracking control for a class of discrete-time nonlinear systems using adaptive dynamic programming approach. Neurocomputing, 2012, 78(1):14-22 doi: 10.1016/j.neucom.2011.03.058
|
[46]
|
Abu-Khalaf M, Lewis F L. Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica, 2005, 41(5):779-791 doi: 10.1016/j.automatica.2004.11.034
|
[47]
|
Tassa Y, Erez T. Least squares solutions of the HJB equation with neural network value-function approximators. IEEE Transactions on Neural Networks, 2007, 18(4):1031-1041 doi: 10.1109/TNN.2007.899249
|
[48]
|
Song R Z, Lewis F L, Wei Q L, Zhang H G. Off-policy actor-critic structure for optimal control of unknown systems with disturbances. IEEE Transactions on Cybernetics, 2016, 46(5):1041-1050 doi: 10.1109/TCYB.2015.2421338
|
[49]
|
Song R Z, Lewis F L, Wei Q L. Off-policy integral reinforcement learning method to solve nonlinear continuous-time multiplayer nonzero-sum games. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(3):704-713 doi: 10.1109/TNNLS.2016.2582849
|
[50]
|
Vrabie D, Pastravanu O, Abu-Khalaf M, Lewis F L. Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica, 2009, 45(2):477-484 doi: 10.1016/j.automatica.2008.08.017
|
[51]
|
Vrabie D, Lewis F. Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems. Neural Networks, 2009, 22(3):237-246 doi: 10.1016/j.neunet.2009.03.008
|
[52]
|
Vamvoudakis K G, Lewis F L. Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica, 2010, 46(5):878-888 doi: 10.1016/j.automatica.2010.02.018
|
[53]
|
Vamvoudakis K G, Vrabie D, Lewis F L. Online adaptive learning of optimal control solutions using integral reinforcement learning. In:Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning. Paris, France:IEEE, 2011. 250-257
|
[54]
|
Zhang H G, Cui L L, Zhang X, Luo Y H. Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Transactions on Neural Networks, 2011, 22(12):2226-2236 doi: 10.1109/TNN.2011.2168538
|
[55]
|
Liu D R, Yang X, Wang D, Wei Q L. Reinforcement-learning-based robust controller design for continuous-time uncertain nonlinear systems subject to input constraints. IEEE Transactions on Cybernetics, 2015, 45(7):1372-1385 doi: 10.1109/TCYB.2015.2417170
|
[56]
|
Yang X, Liu D R, Huang Y Z. Neural-network-based online optimal control for uncertain non-linear continuous-time systems with control constraints. IET Control Theory and Applications, 2013, 7(17):2037-2047 doi: 10.1049/iet-cta.2013.0472
|
[57]
|
Wang D, Liu D R, Zhang Q C, Zhao D B. Data-based adaptive critic designs for nonlinear robust optimal control with uncertain dynamics. IEEE Transactions on Systems, Man, and Cybernetics:Systems, 2016, 46(11):1544-1555 doi: 10.1109/TSMC.2015.2492941
|
[58]
|
Wang D, Liu D R, Li H L. Policy iteration algorithm for online design of robust control for a class of continuous-time nonlinear systems. IEEE Transactions on Automation Science and Engineering, 2014, 11(2):627-632 doi: 10.1109/TASE.2013.2296206
|
[59]
|
Wang D, Liu D R, Li H L, Ma H W. Neural-network-based robust optimal control design for a class of uncertain nonlinear systems via adaptive dynamic programming. Information Sciences, 2014, 282:167-179 doi: 10.1016/j.ins.2014.05.050
|
[60]
|
Vamvoudakis K, Vrabie D, Lewis F. Online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem. In:Proceedings of the 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning. Nashville, TN, USA:IEEE, 2009.
|
[61]
|
Yang X, Liu D R, Wei Q L. Online approximate optimal control for affine non-linear systems with unknown internal dynamics using adaptive dynamic programming. IET Control Theory and Applications, 2014, 8(16):1676-1688 doi: 10.1049/iet-cta.2014.0186
|
[62]
|
Wang Z, Liu X P, Liu K F, Li S, Wang H Q. Backstepping-based Lyapunov function construction using approximate dynamic programming and sum of square techniques. IEEE Transactions on Cybernetics, 2016, DOI: 10.1109/TCYB.2016.2574747
|
[63]
|
Vamvoudakis K G, Miranda M F, Hespanha J P. Asymptotically stable adaptive-optimal control algorithm with saturating actuators and relaxed persistence of excitation. IEEE Transactions on Neural Networks and Learning Systems, 2016, 27(11):2386-2398 doi: 10.1109/TNNLS.2015.2487972
|
[64]
|
Yang X, Liu D R, Wei Q L, Wang D. Guaranteed cost neural tracking control for a class of uncertain nonlinear systems using adaptive dynamic programming. Neurocomputing, 2016, 198:80-90 doi: 10.1016/j.neucom.2015.08.119
|
[65]
|
Zargarzadeh H, Dierks T, Jagannathan S. State and output feedback-based adaptive optimal control of nonlinear continuous-time systems in strict feedback form. In:Proceedings of the 2012 American Control Conference. Montréal, Canada:IEEE, 2012. 6412-6417
|
[66]
|
Zargarzadeh H, Dierks T, Jagannathan S. Optimal control of nonlinear continuous-time systems in strict-feedback form. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(10):2535-2549 doi: 10.1109/TNNLS.2015.2441712
|
[67]
|
Modares H, Lewis F L. Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning. Automatica, 2014, 50(7):1780-1792 doi: 10.1016/j.automatica.2014.05.011
|
[68]
|
Kamalapurkar R, Dinh H, Bhasin S, Dixon W E. Approximate optimal trajectory tracking for continuous-time nonlinear systems. Automatica, 2015, 51:40-48 doi: 10.1016/j.automatica.2014.10.103
|
[69]
|
Zhou Q, Shi P, Tian Y, Wang M Y. Approximation-based adaptive tracking control for mimo nonlinear systems with input saturation. IEEE Transactions on Cybernetics, 2015, 45(10):2119-2128 doi: 10.1109/TCYB.2014.2365778
|
[70]
|
Modares H, Lewis F L, Sistani M B N. Online solution of nonquadratic two-player zero-sum games arising in the H∞ control of constrained input systems. International Journal of Adaptive Control and Signal Processing, 2014, 28(3-5):232-254 doi: 10.1002/acs.v28.3-5
|
[71]
|
Modares H, Sistani M B N, Lewis F L. A policy iteration approach to online optimal control of continuous-time constrained-input systems. ISA Transactions, 2013, 52(5):611-621 doi: 10.1016/j.isatra.2013.04.004
|
[72]
|
Abu-Khalaf M, Lewis F L, Huang J. Neurodynamic programming and zero-sum games for constrained control systems. IEEE Transactions on Neural Networks, 2008, 19(7):1243-1252 doi: 10.1109/TNN.2008.2000204
|
[73]
|
Yang X, Liu D R, Wang D. Reinforcement learning for adaptive optimal control of unknown continuous-time nonlinear systems with input constraints. International Journal of Control, 2014, 87(3):553-566 doi: 10.1080/00207179.2013.848292
|
[74]
|
Jiang Y, Jiang Z P. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica, 2012, 48(10):2699-2704 doi: 10.1016/j.automatica.2012.06.096
|
[75]
|
Jiang Y, Jiang Z P. Robust approximate dynamic programming and global stabilization with nonlinear dynamic uncertainties. In:Proceedings of the 50th IEEE Conference on Decision and Control and European Control Conference. Orlando, FL, USA:IEEE, 2011. 115-120
|
[76]
|
Jiang Y, Jiang Z P. Robust adaptive dynamic programming and feedback stabilization of nonlinear systems. IEEE Transactions on Neural Networks and Learning Systems, 2014, 25(5):882-893 doi: 10.1109/TNNLS.2013.2294968
|
[77]
|
Jiang Y, Jiang Z P. Global adaptive dynamic programming for continuous-time nonlinear systems. IEEE Transactions on Automatic Control, 2015, 60(11):2917-2929 doi: 10.1109/TAC.2015.2414811
|
[78]
|
Wang D, Liu D R, Li H L, Ma H W. Adaptive dynamic programming for infinite horizon optimal robust guaranteed cost control of a class of uncertain nonlinear systems. In:Proceedings of the 2015 American Control Conference. Chicago, IL, USA:IEEE, 2015. 2900-2905
|
[79]
|
Luo Y H, Sun Q Y, Zhang H G, Cui L L. Adaptive critic design-based robust neural network control for nonlinear distributed parameter systems with unknown dynamics. Neurocomputing, 2015, 148:200-208 doi: 10.1016/j.neucom.2013.08.049
|
[80]
|
Fan Q Y, Yang G H. Adaptive actor-critic design-based integral sliding-mode control for partially unknown nonlinear systems with input disturbances. IEEE Transactions on Neural Networks and Learning Systems, 2016, 27(1):165-177 doi: 10.1109/TNNLS.2015.2472974
|
[81]
|
Liu D R, Huang Y Z, Wang D, Wei Q L. Neural-network-observer-based optimal control for unknown nonlinear systems using adaptive dynamic programming. International Journal of Control, 2013, 86(9):1554-1566 doi: 10.1080/00207179.2013.790562
|
[82]
|
Lv Y F, Na J, Yang Q M, Wu X, Guo Y. Online adaptive optimal control for continuous-time nonlinear systems with completely unknown dynamics. International Journal of Control, 2016, 89(1):99-112 doi: 10.1080/00207179.2015.1060362
|
[83]
|
Zhu Y H, Zhao D B, Li X J. Iterative adaptive dynamic programming for solving unknown nonlinear zero-sum game based on online data. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(3):714-724 doi: 10.1109/TNNLS.2016.2561300
|
[84]
|
Song R Z, Lewis F L, Wei Q L, Zhang H G, Jiang Z P, Levine D. Multiple actor-critic structures for continuous-time optimal control using input-output data. IEEE Transactions on Neural Networks and Learning Systems, 2015, 26(4):851-865 doi: 10.1109/TNNLS.2015.2399020
|
[85]
|
Vamvoudakis K, Vrabie D, Lewis F L. Adaptive optimal control algorithm for zero-sum Nash games with integral reinforcement learning. In:Proceedings of the 2012 AIAA Guidance, Navigation, and Control Conference. Minneapolis, Minnesota, USA:AIAA, 2012.
|
[86]
|
Wei Q L, Song R Z, Yan P F. Data-driven zero-sum neuro-optimal control for a class of continuous-time unknown nonlinear systems with disturbance using ADP. IEEE Transactions on Neural Networks and Learning Systems, 2016, 27(2):444-458 doi: 10.1109/TNNLS.2015.2464080
|
[87]
|
Vamvoudakis K G, Lewis F L. Multi-player non-zero-sum games:online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica, 2011, 47(8):1556-1569 doi: 10.1016/j.automatica.2011.03.005
|
[88]
|
Vamvoudakis K G, Lewis F L, Hudas G R. Multi-agent differential graphical games:online adaptive learning solution for synchronization with optimality. Automatica, 2012, 48(8):1598-1611 doi: 10.1016/j.automatica.2012.05.074
|
[89]
|
Wei Q L, Liu D R, Lewis F L. Optimal distributed synchronization control for continuous-time heterogeneous multi-agent differential graphical games. Information Sciences, 2015, 317:96-113 doi: 10.1016/j.ins.2015.04.044
|
[90]
|
Zhang H G, Zhang J L, Yang G H, Luo Y H. Leader-based optimal coordination control for the consensus problem of multiagent differential games via fuzzy adaptive dynamic programming. IEEE Transactions on Fuzzy Systems, 2015, 23(1):152-163 doi: 10.1109/TFUZZ.2014.2310238
|
[91]
|
Nguyen T L. Adaptive dynamic programming-based design of integrated neural network structure for cooperative control of multiple MIMO nonlinear systems. Neurocomputing, 2017, 237:12-24 doi: 10.1016/j.neucom.2016.05.044
|
[92]
|
Jiao Q, Modares H, Xu S Y, Lewis F L, Vamvoudakis K G. Multi-agent zero-sum differential graphical games for disturbance rejection in distributed control. Automatica, 2016, 69:24-34 doi: 10.1016/j.automatica.2016.02.002
|
[93]
|
Jiao Q, Modares H, Lewis F L, Xu S Y, Xie L H. Distributed L2-gain output-feedback control of homogeneous and heterogeneous systems. Automatica, 2016, 71:361-368 doi: 10.1016/j.automatica.2016.04.025
|
[94]
|
Adib Yaghmaie F, Lewis F L, Su R. Output regulation of linear heterogeneous multi-agent systems via output and state feedback. Automatica, 2016, 67:157-164 doi: 10.1016/j.automatica.2016.01.040
|
[95]
|
Zhang H G, Jiang H, Luo Y H, Xiao G Y. Data-driven optimal consensus control for discrete-time multi-agent systems with unknown dynamics using reinforcement learning method. IEEE Transactions on Industrial Electronics, 2017, 64(5):4091-4100 doi: 10.1109/TIE.2016.2542134
|
[96]
|
Venayagamoorthy G K, Harley R G, Wunsch D C. Dual heuristic programming excitation neurocontrol for generators in a multimachine power system. IEEE Transactions on Industry Applications, 2003, 39(2):382-394 doi: 10.1109/TIA.2003.809438
|
[97]
|
Park J W, Harley R G, Venayagamoorthy G K. Adaptive-critic-based optimal neurocontrol for synchronous generators in a power system using MLP/RBF neural networks. IEEE Transactions on Industry Applications, 2003, 39(5):1529-1540 doi: 10.1109/TIA.2003.816493
|
[98]
|
Wei Q L, Liu D R, Shi G, Liu Y. Multibattery optimal coordination control for home energy management systems via distributed iterative adaptive dynamic programming. IEEE Transactions on Industrial Electronics, 2015, 62(7):4203-4214 doi: 10.1109/TIE.2014.2388198
|
[99]
|
Wei Q L, Liu D R, Shi G. A novel dual iterative Q-learning method for optimal battery management in smart residential environments. IEEE Transactions on Industrial Electronics, 2015, 62(4):2509-2518 doi: 10.1109/TIE.2014.2361485
|
[100]
|
Cai C, Wong C K, Heydecker B G. Adaptive traffic signal control using approximate dynamic programming. Transportation Research Part C:Emerging Technologies, 2009, 17(5):456-474 doi: 10.1016/j.trc.2009.04.005
|
[101]
|
赵冬斌, 刘德荣, 易建强.基于自适应动态规划的城市交通信号优化控制方法综述.自动化学报, 2009, 35(6):676-681 http://www.aas.net.cn/CN/abstract/abstract13331.shtmlZhao Dong-Bin, Liu De-Rong, Yi Jian-Qiang. An overview on the adaptive dynamic programming based urban city traffic signal optimal control. Acta Automatica Sinica, 2009, 35(6):676-681 http://www.aas.net.cn/CN/abstract/abstract13331.shtml
|
[102]
|
Wang F Y. Agent-based control for networked traffic management systems. IEEE Intelligent Systems, 2005, 20(5):92-96 doi: 10.1109/MIS.2005.80
|
[103]
|
Lee J M, Lee J H. An approximate dynamic programming based approach to dual adaptive control. Journal of Process Control, 2009, 19(5):859-864 doi: 10.1016/j.jprocont.2008.11.009
|
[104]
|
Wei Q L, Liu D R. Data-driven neuro-optimal temperature control of water-gas shift reaction using stable iterative adaptive dynamic programming. IEEE Transactions on Industrial Electronics, 2014, 61(11):6399-6408 doi: 10.1109/TIE.2014.2301770
|
[105]
|
林小峰, 黄元君, 宋春宁.带şvarepsilon误差限的近似最优控制.控制理论与应用, 2012, 29(1):104-108 http://www.cnki.com.cn/Article/CJFDTOTAL-KZLY201201017.htmLin Xiao-Feng, Huang Yuan-Jun, Song Chun-Ning. Approximate optimal control with şvarepsilon-error bound. Control Theory and Applications, 2012, 29(1):104-108 http://www.cnki.com.cn/Article/CJFDTOTAL-KZLY201201017.htm
|
[106]
|
Nodland D, Zargarzadeh H, Jagannathan S. Neural network-based optimal adaptive output feedback control of a helicopter UAV. IEEE Transactions on Neural Networks and Learning Systems, 2013, 24(7):1061-1073 doi: 10.1109/TNNLS.2013.2251747
|
[107]
|
Stingu E, Lewis F L. An approximate dynamic programming based controller for an underactuated 6DOF quadrotor. In:Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning. Paris, France:IEEE, 2011.
|
[108]
|
Xie Q Q, Luo B, Tan F X, Guan X P. Optimal control for vertical take-off and landing aircraft non-linear system by online kernel-based dual heuristic programming learning. IET Control Theory and Applications, 2015, 9(6):981-987 doi: 10.1049/iet-cta.2013.0889
|
[109]
|
Mu C X, Ni Z, Sun C Y, He H B. Air-breathing hypersonic vehicle tracking control based on adaptive dynamic programming. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(3):584-598 doi: 10.1109/TNNLS.2016.2516948
|
[110]
|
Balakrishnan S N, Biega V. Adaptive-critic-based neural networks for aircraft optimal control. Journal of Guidance, Control, and Dynamics, 1996, 19(4):893-898 doi: 10.2514/3.21715
|
[111]
|
Enns R, Si J. Apache helicopter stabilization using neural dynamic programming. Journal of Guidance, Control, and Dynamics, 2002, 25(1):19-25 doi: 10.2514/2.4870
|
[112]
|
Enns R, Si J. Helicopter trimming and tracking control using direct neural dynamic programming. IEEE Transactions on Neural Networks, 2003, 14(4):929-939 doi: 10.1109/TNN.2003.813839
|
[113]
|
Ferrari S, Stengel R F. Online adaptive critic flight control. Journal of Guidance, Control, and Dynamics, 2004, 27(5):777-786 doi: 10.2514/1.12597
|
[114]
|
Valasek J, Doebbler J, Tandale M D, Meade A J. Improved adaptive-reinforcement learning control for morphing unmanned air vehicles. IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics, 2008, 38(4):1014-1020 doi: 10.1109/TSMCB.2008.922018
|
[115]
|
Guo C, Wu H N, Luo B, Guo L. H∞ control for air-breathing hypersonic vehicle based on online simultaneous policy update algorithm. International Journal of Intelligent Computing and Cybernetics, 2013, 6(2):126-143 doi: 10.1108/IJICC-Jun-2012-0031
|
[116]
|
Luo X, Chen Y, Si J, Feng L. Longitudinal control of hypersonic vehicles based on direct heuristic dynamic programming using ANFIS. In:Proceedings of the 2014 International Joint Conference on Neural Networks. Beijing, China:IEEE, 2014. 3685-3692
|
[117]
|
Furfaro R, Wibben D R, Gaudet B, Simo J. Terminal multiple surface sliding guidance for planetary landing:development, tuning and optimization via reinforcement learning. The Journal of the Astronautical Sciences, 2015, 62(1):73-99 doi: 10.1007/s40295-015-0045-1
|
[118]
|
Zhou Y, Van Kampen E J, Chu Q P. Nonlinear adaptive flight control using incremental approximate dynamic programming and output feedback. Journal of Guidance, Control, and Dynamics, 2017, 40(S):493-500 https://www.researchgate.net/publication/310781464_Nonlinear_Adaptive_Flight_Control_Using_Incremental_Approximate_Dynamic_Programming_and_Output_Feedback
|
[119]
|
Zhou Y, Van Kampen E J, Chu Q P. An incremental approximate dynamic programming flight controller based on output feedback. In:Proceedings of the 2016 AIAA Guidance, Navigation, and Control Conference. San Diego, California, USA:AIAA, 2016.
|
[120]
|
Ghosh S, Ghose D, Raha S. Capturability of augmented pure proportional navigation guidance against time-varying target maneuvers. Journal of Guidance, Control, and Dynamics, 2014, 37(5):1446-1461 doi: 10.2514/1.G000561
|
[121]
|
Shaferman V, Shima T. Linear quadratic guidance laws for imposing a terminal intercept angle. Journal of Guidance, Control, and Dynamics, 2008, 31(5):1400-1412 doi: 10.2514/1.32836
|
[122]
|
Lee Y, Kim Y, Moon G, Jun B E. Sliding-mode-based missile-integrated attitude control schemes considering velocity change. Journal of Guidance, Control, and Dynamics, 2016, 39(3):423-436 doi: 10.2514/1.G001416
|
[123]
|
Kumar S R, Rao S, Ghose D. Nonsingular terminal sliding mode guidance with impact angle constraints. Journal of Guidance, Control, and Dynamics, 2014, 37(4):1114-1130 doi: 10.2514/1.62737
|
[124]
|
周慧波. 基于有限时间和滑模理论的导引律及多导弹协同制导研究[博士学位论文], 哈尔滨工业大学, 中国, 2015. http://cdmd.cnki.com.cn/Article/CDMD-10213-1015957301.htmZhou Hui-Bo. Study on Guidance Law and Cooperative Guidance for Multi-missiles Based on Finite-time and Sliding Mode Theory[Ph., D. dissertation], Harbin Institute of Technology, China, 2015. http://cdmd.cnki.com.cn/Article/CDMD-10213-1015957301.htm
|
[125]
|
张友安, 黄诘, 王丽英.约束条件下的末制导律研究进展.海军航空工程学院学报, 2013, 28(6):581-586 http://www.cnki.com.cn/Article/CJFDTOTAL-HJHK201306001.htmZhang You-An, Huang Jie, Wang Li-Ying. Research progress of terminal guidance law with constraint. Journal of Naval Aeronautical and Astronautical, 2013, 28(6):581-586 http://www.cnki.com.cn/Article/CJFDTOTAL-HJHK201306001.htm
|
[126]
|
Imado F, Kuroda T, Miwa S. Optimal midcourse guidance for medium-range air-to-air missiles. Journal of Guidance, Control, and Dynamics, 1990, 13(4):603-608 doi: 10.2514/3.25376
|
[127]
|
Balakrishnan S N, Xin M. Robust state dependent Riccati equation based guidance laws. In:Proceedings of the 2001 American Control Conference. Arlington, VA, USA:IEEE, 2001. 3352-3357
|
[128]
|
Indig N, Ben-Asher J Z, Sigal E. Near-optimal minimum-time guidance under spatial angular constraint in atmospheric flight. Journal of Guidance, Control, and Dynamics, 2016, 39(7):1563-1577 doi: 10.2514/1.G001485
|
[129]
|
Taub I, Shima T. Intercept angle missile guidance under time varying acceleration bounds. Journal of Guidance, Control, and Dynamics, 2013, 36(3):686-699 doi: 10.2514/1.59139
|
[130]
|
陈克俊, 赵汉元.一种适用于攻击地面固定目标的最优再入机动制导律.宇航学报, 1994, 15(1):1-7, 94 http://www.cnki.com.cn/Article/CJFDTOTAL-YHXB401.000.htmChen Ke-Jun, Zhao Han-Yuan. An optimal reentry maneuver guidance law applying to attack the ground fixed target. Journal of Astronautics, 1994, 15(1):1-7, 94 http://www.cnki.com.cn/Article/CJFDTOTAL-YHXB401.000.htm
|
[131]
|
赵汉元.飞行器再入动力学和制导.北京:国防科技大学出版社, 1997.Zhao Han-Yuan. Reentry Vehicle Dynamics and Guidance. Beijing:National University of Defense Technology, 1997.
|
[132]
|
Lee Y I, Ryoo C K, Kim E. Optimal guidance with constraints on impact angle and terminal acceleration. In:Proceedings of the 2003 AIAA Guidance, Navigation, and Control Conference and Exhibit. Austin, Texas, USA:AIAA, 2003.
|
[133]
|
Lee J I, Jeon I S, Tahk M J. Guidance law to control impact time and angle. IEEE Transactions on Aerospace and Electronic Systems, 2007, 43(1):301-310 doi: 10.1109/TAES.2007.357135
|
[134]
|
胡正东, 郭才发, 蔡洪.带落角约束的再入机动弹头的复合导引律.国防科技大学学报, 2008, 30(3):21-26 http://www.cnki.com.cn/Article/CJFDTOTAL-GFKJ200803004.htmHu Zheng-Dong, Guo Cai-Fa, Cai Hong. Integrated guidance law of reentry maneuvering warhead with terminal angular constraint. Journal of National University of Defense Technology, 2008, 30(3):21-26 http://www.cnki.com.cn/Article/CJFDTOTAL-GFKJ200803004.htm
|
[135]
|
Bardhan R, Ghose D. Nonlinear differential games-based impact-angle-constrained guidance law. Journal of Guidance, Control, and Dynamics, 2015, 38(3):384-402 doi: 10.2514/1.G000940
|
[136]
|
方绍琨, 李登峰.微分对策及其在军事领域的研究进展.指挥控制与仿真, 2008, 30(1):114-117 http://www.cnki.com.cn/Article/CJFDTOTAL-QBZH200801032.htmFang Shao-Kun, Li Deng-Feng. Research advances on differential games and applications to military field. Command Control and Simulation, 2008, 30(1):114-117 http://www.cnki.com.cn/Article/CJFDTOTAL-QBZH200801032.htm
|
[137]
|
Yang C D, Chen H Y. Nonlinear H∞ robust guidance law for homing missiles. Journal of Guidance, Control, and Dynamics, 1998, 21(6):882-890 doi: 10.2514/2.4321
|
[138]
|
Dalton J, Balakrishnan S N. A neighboring optimal adaptive critic for missile guidance. Mathematical and Computer Modelling, 1996, 23(1-2):175-188 doi: 10.1016/0895-7177(95)00226-X
|
[139]
|
Han D C, Balakrishnan S N. Adaptive critic based neural networks for control-constrained agile missile control. In:Proceedings of the 1999 American Control Conference. San Diego, California, USA:IEEE, 1999. 2600-2604
|
[140]
|
Si J, Barto A, Powell W, Wunsch D. Adaptive Critic Based Neural Network for Control-Constrained Agile Missile. New Jersey:John Wiley and Sons, Inc., 2012.
|
[141]
|
Han D C, Balakrishnan S. Midcourse guidance law with neural networks. In:Proceedings of the 2000 AIAA Guidance, Navigation, and Control Conference and Exhibit. Denver, CO, USA:AIAA, 2000.
|
[142]
|
Han D C, Balakrishnan S N. State-constrained agile missile control with adaptive-critic-based neural networks. IEEE Transactions on Control Systems Technology, 2002, 10(4):481-489 doi: 10.1109/TCST.2002.1014669
|
[143]
|
Bertsekas D P, Homer M L, Logan D A, Patek S D, Sandell N R. Missile defense and interceptor allocation by neuro-dynamic programming. IEEE Transactions on Systems, Man, and Cybernetics, Part A:Systems and Humans, 2000, 30(1):42-51 doi: 10.1109/3468.823480
|
[144]
|
Davis M T, Robbins M J, Lunday B J. Approximate dynamic programming for missile defense interceptor fire control. European Journal of Operational Research, 2017, 259(3):873-886 doi: 10.1016/j.ejor.2016.11.023
|
[145]
|
Lin C K. Adaptive critic autopilot design of bank-to-turn missiles using fuzzy basis function networks. IEEE Transactions on Systems, Man, and Cybernetics, Part B:Cybernetics, 2005, 35(2):197-207 doi: 10.1109/TSMCB.2004.842246
|
[146]
|
卢超群, 江加和, 任章.基于增强学习的空空导弹智能精确制导律研究.战术导弹控制技术, 2006, (4):19-22, 76 http://d.wanfangdata.com.cn/Periodical/zsddkzjs200604007Lu Chao-Qun, Jiang Jia-He, Ren Zhang. Research of precision guidance law based on Q-learning for air-to-air missile. Control Technology of Tactical Missile, 2006, (4):19-22, 76 http://d.wanfangdata.com.cn/Periodical/zsddkzjs200604007
|
[147]
|
McGrew J S, How J P, Bush L, Williams B, Roy N. Air combat strategy using approximate dynamic programming. In:Proceedings of the 2008 AIAA Guidance, Navigation and Control Conference and Exhibit. Honolulu, Hawaii, USA:AIAA, 2008.
|
[148]
|
Gaudet B, Furfaro R. Missile homing-phase guidance law design using reinforcement learning. In:Proceedings of the 2012 AIAA Guidance, Navigation, and Control Conference. Minneapolis, Minnesota, USA:AIAA, 2012.
|
[149]
|
Lee D, Bang H. Planar evasive aircrafts maneuvers using reinforcement learning. Intelligent Autonomous Systems 12:Advances in Intelligent Systems and Computing. Berlin Heidelberg:Springer, 2013. 533-542
|
[150]
|
Sun J L, Liu C S, Ye Q. Robust differential game guidance laws design for uncertain interceptor-target engagement via adaptive dynamic programming. International Journal of Control, 2017, 64(5):4091-4100 https://www.researchgate.net/publication/283513965_Robust_Adaptive_Dynamic_Programming_of_Two-Player_Zero-Sum_Games_for_Continuous-Time_Linear_Systems
|
[151]
|
姚郁, 郑天宇, 贺风华, 王龙, 汪洋, 张曦, 朱柏羊, 杨宝庆.飞行器末制导中的几个热点问题与挑战.航空学报, 2015, 36(8):2696 http://www.cnki.com.cn/Article/CJFDTOTAL-HKXB201508020.htmYao Yu, Zheng Tian-Yu, He Feng-Hua, Wang Long, Wang Yang, Zhang Xi, Zhu Bai-Yang, Yang Bao-Qing. Several hot issues and challenges in terminal guidance of flight vehicles. Acta Aeronautica et Astronautica Sinica, 2015, 36(8):2696-2716 http://www.cnki.com.cn/Article/CJFDTOTAL-HKXB201508020.htm
|