A Method for Reducing Over-dried Tobacco at Head Stage of Drying Process Based on Reinforcement Learning

BI Su-Huan^{1,2}, JIANG Yi-Xiang^{3}, YU Shu-Song^{1}, DING Xiang-Qian^{1}, MU Liang-Liang^{1}, WANG Bin^{4}

doi: 10.16383/j.aas.c190367 cstr: 32138.14.j.aas.c190367

1. College of Information Science and Engineering, Ocean University of China, Qingdao 266000
2. School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520
3. China Tobacco Zhejiang Industrial Co., Ltd., Hangzhou 310000
4. School of Continuing Education, Ocean University of China, Qingdao 266000

Funds: Supported by National Key Research and Development Program of China (2017YFA0700601)

Author Bio:
BI Su-Huan  Ph.D. at the College of Information Science and Engineering, Ocean University of China. She is a lecturer at the School of Information and Control Engineering, Qingdao University of Technology. Her research interest covers machine learning and intelligent control. E-mail: bisuhuan2016@163.com

JIANG Yi-Xiang  Engineer at China Tobacco Zhejiang Industrial Co., Ltd. His research interest covers information system application and information security management. E-mail: jiangyxlunwen@sina.com

YU Shu-Song  Associate professor at the College of Information Science and Engineering, Ocean University of China. His research interest covers artificial intelligence and intelligent control. Corresponding author of this paper. E-mail: yushusong@ouc.edu.cn

DING Xiang-Qian  Professor at the College of Information Science and Engineering, Ocean University of China. His research interest covers artificial intelligence and intelligent control. E-mail: dingxq1995@vip.sina.com

MU Liang-Liang  Ph.D. candidate at the College of Information Science and Engineering, Ocean University of China. His research interest covers deep learning and data mining. E-mail: merlin_mu@163.com

WANG Bin  Lecturer at the School of Continuing Education, Ocean University of China. His research interest covers machine learning and data mining. E-mail: wangbin@ouc.edu.cn

Publication history
- Received: 2019-05-14
- Accepted: 2019-09-02
- Published online: 2023-07-28
- Issue date: 2023-08-21

Abstract

To address the large overshoot of the drying temperature and the excess of over-dried cut tobacco at the head (start-up) stage of the drying process, a method based on reinforcement learning (RL) for reducing the amount of over-dried ("dry head") tobacco is proposed. The method uses real-time production data as the input feature vector to perceive state changes in the drying process, evaluates and optimizes the drying-temperature control policy according to the measured moisture content of the cut tobacco, and corrects the temperature setpoint of the dryer online. This optimizes temperature control at the start of drying and effectively reduces over-drying. The proposed method is compared with the automatic control mode and the manual intervention mode of the dryer: the standard deviation of the moisture content of the dried tobacco is reduced by 44.7% relative to automatic control and by 14.3% relative to manual intervention. The experimental results show that the stability of the moisture content is considerably improved and the amount of over-dried tobacco is markedly reduced, which verifies the effectiveness and feasibility of the proposed method.

Key words:
- moisture content of cut tobacco
- over-dried tobacco
- reinforcement learning (RL)
- overshoot
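The abstract describes a loop in which the state is perceived from production data, the control policy is evaluated from the measured moisture content, and the temperature setpoint is corrected online; Table 2 names the controller "Actor-Critic". The following is a minimal, generic one-step actor-critic sketch of that kind of loop, not the authors' implementation. The three-element feature vector, the first-order `toy_step` plant, the negative squared deviation reward, and all step sizes are assumptions made purely for illustration.

```python
import random
from typing import List, Tuple


def features(error: float) -> List[float]:
    """Hypothetical state features built from the moisture deviation."""
    return [error, error * error, 1.0]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


class ActorCritic:
    """One-step actor-critic: linear critic, Gaussian policy with linear mean."""

    def __init__(self, n_features: int, gamma: float = 0.9,
                 alpha_actor: float = 0.01, alpha_critic: float = 0.05,
                 sigma: float = 0.2) -> None:
        self.gamma = gamma                # discount factor (assumed value)
        self.alpha_actor = alpha_actor    # actor step size (assumed value)
        self.alpha_critic = alpha_critic  # critic step size (assumed value)
        self.sigma = sigma                # fixed exploration std. dev. (assumed)
        self.theta = [0.0] * n_features   # actor weights
        self.w = [0.0] * n_features       # critic weights

    def mean_action(self, phi: List[float]) -> float:
        return dot(self.theta, phi)

    def act(self, phi: List[float], rng: random.Random) -> float:
        # Sample a setpoint correction around the current policy mean.
        return rng.gauss(self.mean_action(phi), self.sigma)

    def value(self, phi: List[float]) -> float:
        return dot(self.w, phi)

    def update(self, phi: List[float], action: float, reward: float,
               phi_next: List[float]) -> float:
        # TD error: the critic's evaluation of the action just taken.
        delta = reward + self.gamma * self.value(phi_next) - self.value(phi)
        # Gradient of log N(action; mean, sigma^2) with respect to the mean.
        score = (action - self.mean_action(phi)) / self.sigma ** 2
        for i, x in enumerate(phi):
            self.w[i] += self.alpha_critic * delta * x
            self.theta[i] += self.alpha_actor * delta * score * x
        return delta


def clip(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    return max(lo, min(hi, x))


def toy_step(error: float, correction: float,
             rng: random.Random) -> Tuple[float, float]:
    """Made-up first-order response; NOT a model of a real dryer."""
    next_error = clip(0.8 * error - 0.5 * clip(correction)
                      + rng.gauss(0.0, 0.01))
    return next_error, -next_error ** 2  # reward penalizes moisture deviation


def train(episodes: int = 200, steps: int = 20, seed: int = 0) -> ActorCritic:
    rng = random.Random(seed)
    agent = ActorCritic(n_features=3)
    for _ in range(episodes):
        error = rng.uniform(-1.0, 1.0)  # initial moisture deviation
        for _ in range(steps):
            phi = features(error)
            correction = agent.act(phi, rng)
            error, reward = toy_step(error, correction, rng)
            agent.update(phi, correction, reward, features(error))
    return agent


if __name__ == "__main__":
    trained = train()
    print("actor weights:", trained.theta)
    print("critic weights:", trained.w)
```

In this sketch the critic's temporal-difference error plays the role of the evaluation signal, and the actor shifts its mean correction in whichever direction that error favors. A real deployment would replace `features` with the production state vector and `toy_step` with the plant itself.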

Fig. 1 Standard deviation of tobacco moisture content in manual intervention mode

Fig. 2 Cut tobacco processing section

Fig. 3 Drying process and temperature control

Fig. 4 Optimal control strategy for drying temperature

Fig. 5 Flow of drying-temperature optimization control

Fig. 6 State perception of the tobacco drying production system

Fig. 7 Input feature vector of the proposed model

Fig. 8 Three commonly used activation functions and the designed reward function

Fig. 9 Dryer temperature and tobacco moisture content curves under the three control modes

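Table 1 State features of the tobacco drying production system

Feature category	Production state features	Number of features
Raw cut tobacco	KLD pre-drying moisture, KLD drying flow rate, cumulative cut-tobacco mass	3
Batch number	Year, month, day, production line number, shift team, production sequence number	6
Recipe parameters	KLD water removal amount, target moisture content, drying capacity, drying factor	4
Process measurements	KLD drying-section steam flow, SIROX steam flow, SIROX steam distribution header pressure, SIROX steam distribution header temperature, SIROX moisture-exhaust fan negative pressure, post-SIROX temperature, SIROX post-valve steam temperature, SIROX post-valve steam pressure, SIROX pre-valve steam temperature, SIROX pre-valve steam pressure, KLD steam pressure after primary pressure reduction, KLD post-drying moisture, KLD post-drying temperature, KLD moisture-exhaust temperature, zone I working steam pressure, zone II working steam pressure, zone I return-water temperature, zone II return-water temperature, zone I cylinder-wall temperature, zone II cylinder-wall temperature, hot-air velocity, hot-air temperature, moisture-exhaust negative pressure, winnowing-cooling moisture-exhaust negative pressure, cooling temperature, cooling moisture	26
Equipment parameters	SIROX steam valve opening, KLD cylinder rotation speed, KLD zone I steam diaphragm valve opening, KLD zone II steam diaphragm valve opening, zone I cylinder-wall temperature setpoint, zone II cylinder-wall temperature setpoint, hot-air steam valve opening, air damper opening, moisture-exhaust opening, winnowing-cooling moisture-exhaust opening	10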
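The five categories in Table 1 add up to 3 + 6 + 4 + 26 + 10 = 49 state features. As a rough sketch of how such readings might be assembled into a single state vector, the snippet below concatenates per-category values in table order and checks their lengths. The category keys, the function name `build_state_vector`, and the assumption that the model input is a plain concatenation of all 49 features are illustrative and not taken from the paper.

```python
from typing import Dict, List, Sequence

# Feature counts per category, taken from Table 1 (keys are hypothetical names).
FEATURE_COUNTS: Dict[str, int] = {
    "raw_cut_tobacco": 3,
    "batch_number": 6,
    "recipe_parameters": 4,
    "process_measurements": 26,
    "equipment_parameters": 10,
}

STATE_DIM = sum(FEATURE_COUNTS.values())  # 49 features in total


def build_state_vector(sample: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate the per-category readings in Table 1 order.

    Raises ValueError if a category is missing or has the wrong length.
    """
    state: List[float] = []
    for category, expected in FEATURE_COUNTS.items():
        values = sample.get(category)
        if values is None or len(values) != expected:
            raise ValueError(f"{category}: expected {expected} values")
        state.extend(float(v) for v in values)
    return state


if __name__ == "__main__":
    # Placeholder readings only; real values would come from the production line.
    dummy = {name: [0.0] * n for name, n in FEATURE_COUNTS.items()}
    print(len(build_state_vector(dummy)))
```

Validating the length of each category up front makes a missing sensor reading fail loudly instead of silently shifting every later feature by one position.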

Table 2 Standard deviation of moisture content level in tobacco

Control mode	Std. dev., 20 min after start-up	Std. dev., 30 min after start-up	Std. dev., 40 min after start-up
Automatic control	0.097	0.082	0.076
Manual intervention	0.056	0.053	0.049
Actor-Critic optimized control	0.051	0.045	0.042
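The percentage reductions quoted in the abstract can be checked against Table 2 with a short calculation. The sketch below computes the relative reduction of the Actor-Critic standard deviation against each baseline; the dictionary layout and function names are illustrative. The 40-minute column gives 44.7% and 14.3%, so the abstract's figures appear to correspond to that window.

```python
from typing import Dict

# Standard deviations of cut-tobacco moisture content, copied from Table 2.
# Inner keys are minutes after start-up.
TABLE_2: Dict[str, Dict[int, float]] = {
    "automatic": {20: 0.097, 30: 0.082, 40: 0.076},
    "manual": {20: 0.056, 30: 0.053, 40: 0.049},
    "actor_critic": {20: 0.051, 30: 0.045, 40: 0.042},
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


def reductions_at(minutes: int) -> Dict[str, float]:
    """Reduction of the Actor-Critic std. dev. versus each baseline mode."""
    ac = TABLE_2["actor_critic"][minutes]
    return {
        mode: round(relative_reduction(TABLE_2[mode][minutes], ac), 1)
        for mode in ("automatic", "manual")
    }


if __name__ == "__main__":
    for minutes in (20, 30, 40):
        print(minutes, reductions_at(minutes))
```

The same calculation on the 20-minute and 30-minute columns gives roughly 47.4% and 8.9%, and 45.1% and 15.1%, respectively.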
