Multi-rate Layered Learning Control of Mineral Grinding Process Subject to Unreliable Communication

REN Peng-Xu; DAI Wei; ZHANG Qi-Rui; YANG Chun-Yu

doi: 10.16383/j.aas.c250584 cstr: 32138.14.j.aas.c250584

1.
School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116

Funds: Supported by Natural Science Foundation of Jiangsu Province (BK20240102, BK20231062), National Natural Science Foundation of China (62373361, 62403469), Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX25_2846)

More Information

Author Bio:
REN Peng-Xu: Ph.D. candidate at the School of Information and Control Engineering, China University of Mining and Technology. His research interests include operational optimization and control of industrial processes, and reinforcement learning. px.ren@cumt.edu.cn

DAI Wei: Professor at the School of Information and Control Engineering, China University of Mining and Technology. His research interests include modeling, operational optimization, and control of complex industrial processes. Corresponding author of this paper. weidai@cumt.edu.cn

ZHANG Qi-Rui: Associate professor at the School of Information and Control Engineering, China University of Mining and Technology. His research interests include secure control and vulnerability analysis of complex industrial processes. qiruizhang@cumt.edu.cn

YANG Chun-Yu: Professor at the School of Information and Control Engineering, China University of Mining and Technology. His research interests include singularly perturbed systems, industrial process operational control, cyber-physical systems, and robust control. chunyuyang@cumt.edu.cn

Publication history
- Received: 2025-10-30
- Accepted: 2026-04-12
- Published online: 2026-05-25

Abstract

Operational optimal control of the mineral grinding process typically adopts a two-layer structure consisting of a basic loop layer and an operational layer, which involve controlled objects on different time scales. The dynamics of the operational layer are complex and difficult to model, and the layers use different sampling rates and communicate over links subject to packet loss, which further complicates control design. To address the multi-rate and unreliable-communication problems in operational optimization of the mineral grinding process, a multi-rate layered learning control method with communication compensation is proposed. At the basic loop layer, the lifting technique and model predictive control are used to achieve setpoint tracking under multi-rate sampling. Building on this, recursive lifting introduces the loop dynamics into the operational layer, where reinforcement learning, combined with the idea of the Smith predictor, is used to design a fast-slow coupled inverse learning control algorithm with communication compensation. Because the weight parameters of the performance index otherwise depend on manual, experience-based setting and are difficult to tune, the algorithm learns them inversely from demonstration operation data while updating the loop setpoints online, thereby achieving optimal control of the operational indices. Theoretical analysis and industrial applications verify the effectiveness of the proposed method.
- mineral grinding process
- operational optimal control
- multi-rate
- unreliable communication
- reinforcement learning
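
The lifting step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear time-invariant fast-rate model x[k+1] = A x[k] + B u[k] and an integer ratio N between the slow and fast sampling periods. The matrices, the ratio, and the function name `lift` are hypothetical and are not taken from the paper. Stacking N consecutive fast-rate inputs gives an equivalent slow-rate model x[k+N] = A^N x[k] + [A^(N-1)B, ..., AB, B] U, which the code checks against a step-by-step simulation.

```python
import numpy as np

def lift(A, B, N):
    """Lift a fast-rate model x[k+1] = A x[k] + B u[k] over N fast steps.

    Returns (A_lift, B_lift) such that
    x[k+N] = A_lift @ x[k] + B_lift @ [u[k]; u[k+1]; ...; u[k+N-1]].
    """
    A_lift = np.linalg.matrix_power(A, N)
    # Input u[k+i] is propagated through the remaining N-1-i fast steps.
    blocks = [np.linalg.matrix_power(A, N - 1 - i) @ B for i in range(N)]
    B_lift = np.hstack(blocks)
    return A_lift, B_lift

# Hypothetical two-state, one-input fast-rate model (illustration only).
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [0.5]])
N = 4  # assumed ratio of slow to fast sampling periods

A_lift, B_lift = lift(A, B, N)

# Check the lifted model against a step-by-step fast-rate simulation.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(2)
u_seq = rng.standard_normal((N, 1))
x = x0.copy()
for u in u_seq:
    x = A @ x + B @ u
x_lifted = A_lift @ x0 + B_lift @ u_seq.reshape(-1)
lift_error = float(np.max(np.abs(x - x_lifted)))
```

Here `lift_error` should be at the level of floating-point round-off, since the lifted model is an exact reformulation of the fast-rate model over one slow period rather than an approximation.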

HTML Full Text

Fig. 1 Block diagram of multi-rate layered learning control of the mineral grinding process subject to unreliable communication

Fig. 2 State dropout frequency of the demonstration system and the controlled system

Fig. 3 Control curves of the operational indices

Fig. 4 Tracking curves for the setpoints of the basic loop layer

Fig. 5 Control input curves of the basic loop layer

Fig. 6 Convergence curves of the parameters

Fig. 7 Convergence curves of the parameters after the model change

Fig. 8 State dropout frequency of the demonstration system and the controlled system under continuous packet loss

Fig. 9 Control curves of the operational indices under continuous packet loss

Fig. 10 Convergence curves of the parameters under continuous packet loss

Fig. 11 Convergence curves of the parameters after the model change under continuous packet loss

Fig. 12 Control curves of the operational indices under the state-holding strategy

Fig. 13 Convergence curves of the parameters under the state-holding strategy

Fig. 14 Control curves of the operational indices under the multi-rate layered MPC method

Fig. 15 Control curves of the operational indices under the multi-rate layered Q-learning method

Table 1 Evaluation indices of the proposed method under different packet loss rates

Packet loss rate	$r_1$ (IAE)	$r_2$ (IAE)	$r_1$ (MSE)	$r_2$ (MSE)	Code run time (s)
0%	0.0002	0.0011	2.8763e-08	1.2812e-06	0.1123
10%	0.0023	0.0030	5.3894e-06	8.7299e-06	0.1124
30%	0.0036	0.0024	1.3208e-05	5.8225e-06	0.1245
50%	0.0034	0.0050	1.1482e-05	2.2387e-05	0.1305
70%	0.0053	0.0152	2.7620e-05	2.3004e-04	0.1527
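
The IAE and MSE columns of Table 1 are tracking-error metrics. The sketch below shows one common way to compute them from sampled data, assuming that IAE is approximated by a rectangle-rule sum of the absolute error and that MSE is the mean of the squared error. The page does not state the exact definitions or sampling period used in the paper, so the function names, the `dt` parameter, and the sample data are hypothetical.

```python
import numpy as np

def iae(reference, output, dt=1.0):
    """Integral of absolute error, approximated by a rectangle-rule sum."""
    e = np.asarray(reference, dtype=float) - np.asarray(output, dtype=float)
    return float(np.sum(np.abs(e)) * dt)

def mse(reference, output):
    """Mean squared error over all samples."""
    e = np.asarray(reference, dtype=float) - np.asarray(output, dtype=float)
    return float(np.mean(e ** 2))

# Hypothetical tracking record (illustration only, not data from the paper).
reference = np.array([1.0, 1.0, 1.0, 1.0])
output = np.array([0.0, 0.5, 1.0, 1.0])

iae_value = iae(reference, output, dt=0.5)  # (1.0 + 0.5) * 0.5 = 0.75
mse_value = mse(reference, output)          # (1.0 + 0.25) / 4 = 0.3125
```

IAE weights all errors linearly, while MSE penalizes large deviations more heavily, which is one reason the two columns need not rank the packet loss rates in the same order.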

Table 2 Evaluation indices of different communication compensation schemes

Packet loss handling method	$r_1$ (IAE)	$r_2$ (IAE)	$r_1$ (MSE)	$r_2$ (MSE)
This paper	0.0036	0.0024	1.3208e-05	5.8225e-06
Ref. [24]	38.2981	49.8683	0.0723	0.1932
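
Table 2 compares packet loss handling schemes, and Figs. 12 and 13 refer to a state-holding strategy. The sketch below contrasts two generic ways of reconstructing a state when packets are lost: holding the last received value, and propagating the previous estimate through a nominal model. It is not the compensation algorithm of the paper. The model matrices, the dropout pattern, and the function name `estimate_states` are hypothetical, and the nominal model is assumed to match the plant exactly with no disturbances.

```python
import numpy as np

def estimate_states(A, B, x0, inputs, received, mode):
    """Reconstruct the state at the receiver under packet loss.

    received[k] is True when the measurement of x[k+1] arrives.
    mode="hold" reuses the last received state; mode="predict" propagates
    the previous estimate through the nominal model (A, B).
    Returns the true and the estimated state trajectories.
    """
    x = np.asarray(x0, dtype=float)
    x_hat = x.copy()  # the initial state is assumed to be known
    xs, x_hats = [x.copy()], [x_hat.copy()]
    for k, u in enumerate(inputs):
        x = A @ x + B @ u                  # true plant update
        if received[k]:
            x_hat = x.copy()               # packet arrived
        elif mode == "predict":
            x_hat = A @ x_hat + B @ u      # model-based compensation
        # mode == "hold": x_hat is left unchanged
        xs.append(x.copy())
        x_hats.append(x_hat.copy())
    return np.array(xs), np.array(x_hats)

# Hypothetical model and a fixed dropout pattern (every third packet lost).
A = np.array([[0.95, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
steps = 50
inputs = np.ones((steps, 1))
received = np.array([k % 3 != 0 for k in range(steps)])
x0 = np.zeros(2)

xs, hold = estimate_states(A, B, x0, inputs, received, "hold")
_, pred = estimate_states(A, B, x0, inputs, received, "predict")
hold_err = float(np.mean(np.abs(xs - hold)))
pred_err = float(np.mean(np.abs(xs - pred)))
```

Under these idealized assumptions the prediction error vanishes while the holding error does not. With model mismatch or disturbances, the prediction error would grow over consecutive losses, so the gap between the two strategies in a real plant depends on model quality.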

Table 3 Evaluation indices of different methods

Method	$r_1$ (IAE)	$r_2$ (IAE)	$r_1$ (MSE)	$r_2$ (MSE)
Proposed method	0.0036	0.0024	1.3208e-05	5.8225e-06
Multi-rate layered MPC	438.7964	629.7690	0.6073	0.8281
Multi-rate layered Q-learning	47.2822	107.7708	0.0047	0.0189

References (36)

[1]	Chai Tian-You, Zheng Rui, Xing Fang-Xin, Jia Yao, Zheng Xiu-Ping. Intelligence for industrial process control: Development and prospects. SCIENCE CHINA Information Sciences, 2025, 55: 1555−1570 doi: 10.1360/SSI-2025-0108
[2]	Fan Jia-Lu, Zhang Ye-Wei, Chai Tian-You. Optimal operational feedback control for a class of industrial processes. Acta Automatica Sinica, 2015, 41(10): 1754−1761 doi: 10.16383/j.aas.2015.c150061
[3]	Zhou P, Chai T Y, Wang H. Intelligent optimal-setting control for grinding circuits of mineral processing process. IEEE Transactions on Automation Science and Engineering, 2009, 6(4): 730−743 doi: 10.1109/TASE.2008.2011562
[4]	Aguila-Camacho N, le Roux J D, Duarte-Mermoud M A, Orchard M E. Control of a grinding mill circuit using fractional order controllers. Journal of Process Control, 2017, 53: 80−94 doi: 10.1016/j.jprocont.2017.02.012
[5]	Lestage R, Pomerleau A, Hodouin D. Constrained real-time optimization of a grinding circuit using steady-state linear programming supervisory control. Powder Technology, 2002, 124(3): 254−263 doi: 10.1016/S0032-5910(02)00028-1
[6]	le Roux J D, Padhi R, Craig I K. Optimal control of grinding mill circuit using model predictive static programming: A new nonlinear MPC paradigm. Journal of Process Control, 2014, 24(12): 29−40 doi: 10.1016/j.jprocont.2014.10.007
[7]	Minchala L I, Zhang Y, Garza-Castanón L. Predictive control of a closed grinding circuit system in cement industry. IEEE Transactions on Industrial Electronics, 2017, 65(5): 4070−4079 doi: 10.1109/tie.2017.2762635
[8]	Yamashita A S, Martins W T, Pinto T V B, Raffo G V, Euzébio T A M. Multiobjective tuning technique for MPC in grinding circuits. IEEE Access, 2023, 11: 43041−43054 doi: 10.1109/ACCESS.2023.3269559
[9]	Chen X S, Yang J, Li S H, Li Q. Disturbance observer based multi-variable control of ball mill grinding circuits. Journal of Process Control, 2009, 19(7): 1205−1213 doi: 10.1016/j.jprocont.2009.02.004
[10]	Zhou P, Dai W, Chai T Y. Multivariable disturbance observer based advanced feedback control design and its application to a grinding circuit. IEEE Transactions on Control Systems Technology, 2013, 22(4): 1474−1485 doi: 10.1109/tcst.2013.2283239
[11]	Engell S. Feedback control for optimal process operation. Journal of Process Control, 2007, 17(3): 203−219 doi: 10.1016/j.jprocont.2006.10.011
[12]	Baraheni M, Azarhoushang B, Daneshi A, Kadivar M, Amini S. Development of an expert system for optimal design of the grinding process. The International Journal of Advanced Manufacturing Technology, 2021, 116(9): 2823−2833
[13]	Zhao D, Chai T Y, Wang H, Fu J. Hybrid intelligent control for regrinding process in hematite beneficiation. Control Engineering Practice, 2014, 22: 217−230 doi: 10.1016/j.conengprac.2013.02.015
[14]	Dai Wei, Wang Xian-Wei, Lu Xing-Long, Chai Tian-You. Case-based reasoning and reinforcement learning integrated set-point optimization method for grinding process. Control Theory & Applications, 2019, 36(1): 53−64 doi: 10.7641/CTA.2018.70719
[15]	Zhou P, Chai T Y, Sun J. Intelligence-based supervisory control for optimal operation of a DCS-controlled grinding system. IEEE Transactions on Control Systems Technology, 2012, 21(1): 162−175 doi: 10.1109/tcst.2012.2182996
[16]	Li Jin-Na, Gao Xi-Ze, Chai Tian-You, Fan Jia-Lu. Data-driven operational optimization control of industrial processes. Control Theory & Applications, 2016, 33(12): 1584−1592
[17]	Dai W, Chai T Y, Yang S X. Data-driven optimization control for safety operation of hematite grinding process. IEEE Transactions on Industrial Electronics, 2014, 62(5): 2930−2941 doi: 10.1109/tie.2014.2362093
[18]	Lu X L, Kiumarsi B, Chai T Y, Lewis F L. Data-driven optimal control of operational indices for a class of industrial processes. IET Control Theory & Applications, 2016, 10(12): 1348−1356 doi: 10.1049/iet-cta.2015.0798
[19]	Lu X L, Kiumarsi B, Chai T Y, Jiang Y, Lewis F L. Operational control of mineral grinding processes using adaptive dynamic programming and reference governor. IEEE Transactions on Industrial Informatics, 2018, 15(4): 2210−2221 doi: 10.1109/tii.2018.2868473
[20]	Li J, Yang M, Lewis F L. Optimal operational self-learning control for multi-time scale industrial processes with signal compensations. Engineering Applications of Artificial Intelligence, 2023, 126: Article No. 107065 doi: 10.1016/j.engappai.2023.107065
[21]	Dai Wei, Lu Wen-Jie, Fu Jun, Ma Xiao-Ping. Multi-rate layered optimal operational control of industrial processes. Acta Automatica Sinica, 2019, 45(10): 1946−1959 doi: 10.16383/j.aas.2018.c180300
[22]	Liu J X, Shen H, Wang J, Cao J D, Rutkowski L. $H_\infty$ control for interconnected systems with unknown system dynamics: A two-stage reinforcement learning method. IEEE Transactions on Automation Science and Engineering, 2024, 22: 6388−6397
[23]	Chai Tian-You. Operational optimization and feedback control for complex industrial processes. Acta Automatica Sinica, 2013, 39(11): 1744−1757
[24]	Chai T Y. Industrial process control systems: research status and development direction. SCIENTIA SINICA Informationis, 2016, 46(8): 1003−1015 doi: 10.1360/n112016-00062
[25]	Chai T Y, Zhao L, Qiu J, Fan J. Integrated network-based model predictive control for setpoints compensation in industrial processes. IEEE Transactions on Industrial Informatics, 2012, 9(1): 417−426 doi: 10.1109/tii.2012.2217750
[26]	Fan Jia-Lu, Jiang Yi, Chai Tian-You. Operational feedback control of industrial processes in a wireless network environment. Acta Automatica Sinica, 2016, 42(8): 1166−1174
[27]	Jiang Y, Liu L, Feng G. Adaptive optimal control of networked nonlinear systems with stochastic sensor and actuator dropouts based on reinforcement learning. IEEE Transactions on Neural Networks and Learning Systems, 2022, 35(3): 3107−3120 doi: 10.1109/tnnls.2022.3183020
[28]	Zhang X M, Han Q L, Ge X. A novel approach to $ H_\infty $ performance analysis of discrete-time networked systems subject to network-induced delays and malicious packet dropouts. Automatica, 2022, 136: Article No. 110010 doi: 10.1016/j.automatica.2021.110010
[29]	Jiang Y, Yang T, Gao W, Wu J, Chai T Y, Lewis F L. Off-policy reinforcement learning for $ H_\infty $ control of linear discrete-time systems with network induced dropouts. IEEE Transactions on Automatic Control, 2025, 70(12): 8000−8015 doi: 10.1109/TAC.2025.3582529
[30]	Lian B, Xue W, Lewis F L, Davoudi A. Inverse value iteration and Q-learning: Algorithms, stability, and robustness. IEEE Transactions on Neural Networks and Learning Systems, 2024, 36(4): 6970−6980 doi: 10.1109/tnnls.2024.3409182
[31]	Wu H, Hu Q, Zheng J, Dong F, Ouyang Z, Li D. Discounted inverse reinforcement learning for linear quadratic control. IEEE Transactions on Cybernetics, 2025, 55(4): 1995−2007 doi: 10.1109/TCYB.2025.3540967
[32]	Chen T, Francis B A. Optimal Sampled-data Control Systems. London: Springer Science & Business Media, 2012.
[33]	Astrom K J, Hang C C, Lim B C. A new Smith predictor for controlling a process with an integrator and long dead-time. IEEE Transactions on Automatic Control, 1994, 39(2): 343−345 doi: 10.1109/9.272329
[34]	Xue W Q, Lian B S, Fan J L, Kolaric P, Chai T Y, Lewis F L. Inverse reinforcement Q-learning through expert imitation for discrete-time systems. IEEE Transactions on Neural Networks and Learning Systems, 2021, 34(5): 2386−2399 doi: 10.1080/00207721.2025.2521013
[35]	Lewis F L, Vrabie D, Syrmos V L. Optimal Control. Hoboken: John Wiley & Sons, 2012.
[36]	Jiang Y, Li Y N, Fan J L. Model predictive control-based setpoint regulation in industrial processes. Control Engineering of China, 2018, 25(6): 980−984