2.845

2023影响因子

(CJCR)

  • 中文核心
  • EI
  • 中国科技核心
  • Scopus
  • CSCD
  • 英国科学文摘

留言板

尊敬的读者、作者、审稿人, 关于本刊的投稿、审稿、编辑和出版的任何问题, 您可以本页添加留言。我们将尽快给您答复。谢谢您的支持!

姓名
邮箱
手机号码
标题
留言内容
验证码

基于深度强化学习的四向协同三维装箱方法

尹昊 陈帆 和红杰

尹昊, 陈帆, 和红杰. 基于深度强化学习的四向协同三维装箱方法. 自动化学报, 2024, 50(12): 1−12 doi: 10.16383/j.aas.c240124
引用本文: 尹昊, 陈帆, 和红杰. 基于深度强化学习的四向协同三维装箱方法. 自动化学报, 2024, 50(12): 1−12 doi: 10.16383/j.aas.c240124
Yin Hao, Chen Fan, He Hong-Jie. A four directional cooperative three-dimensional packing method based on deep reinforcement learning. Acta Automatica Sinica, 2024, 50(12): 1−12 doi: 10.16383/j.aas.c240124
Citation: Yin Hao, Chen Fan, He Hong-Jie. A four directional cooperative three-dimensional packing method based on deep reinforcement learning. Acta Automatica Sinica, 2024, 50(12): 1−12 doi: 10.16383/j.aas.c240124

基于深度强化学习的四向协同三维装箱方法

doi: 10.16383/j.aas.c240124 cstr: 32138.14.j.aas.c240124
详细信息
    作者简介:

    尹昊:西南交通大学信息科学与技术学院博士研究生. 2020年获得西南交通大学学士学位. 主要研究方向为强化学习, 人工智能. E-mail: haoyin@my.swjtu.edu.cn

    陈帆:西南交通大学计算机与人工智能学院副教授. 主要研究方向为机器学习, 多媒体安全和计算机应用. E-mail: fchen@swjtu.edu.cn

    和红杰:西南交通大学信息科学与技术学院教授. 主要研究方向为深度学习, 图像处理和信息安全. 本文通信作者. E-mail: hjhe@swjtu.edu.cn

A Four Directional Cooperative Three-dimensional Packing Method Based on Deep Reinforcement Learning

More Information
    Author Bio:

    YIN Hao Ph.D. candidate at the School of Information Science and Technology, Southwest Jiaotong University. He received his bachelor degree from Southwest Jiaotong University in 2020. His research interest covers reinforcement learning and artificial intelligence

    CHEN Fan Associate professor at the School of Computing and Artificial Intelligence, Southwest Jiaotong University. His research interest covers machine learning, multimedia security, and computer applications

    HE Hong-Jie Professor at the School of Information Science and Technology, Southwest Jiaotong University. Her research interest covers deep learning, image processing, and information security. Corresponding author of this paper

  • 摘要: 物流作为现代经济的重要组成部分, 在国民经济和社会发展中发挥着重要作用. 物流中的三维装箱问题(Three-dimensional bin packing problem, 3D-BPP)是提高物流运作效率必须解决的关键难题之一. 深度强化学习(Deep reinforcement learning, DRL)具有强大的学习与决策能力, 基于DRL的三维装箱方法(Three-dimensional bin packing method based on DRL, DRL-3DBP)已成为智能物流领域的研究热点之一. 现有DRL-3DBP面对大尺寸容器3D-BPP时难以达成动作空间、计算复杂性与探索能力之间的平衡. 为此, 提出一种四向协同装箱(Four directional cooperative packing, FDCP)方法: 两阶段策略网络接收旋转后的容器状态, 生成4个方向的装箱策略; 根据由4个策略采样而得的动作更新对应的4个状态, 选取其中价值最大的对应动作为装箱动作. FDCP在压缩动作空间、减小计算复杂性的同时, 鼓励智能体对4个方向合理装箱位置的探索. 实验结果表明, FDCP在100 × 100大尺寸容器以及20、30、50箱子数量的装箱问题上实现了1.2% ~ 2.9%的空间利用率提升.
  • 图  1  3D-BPP在物流中的应用场景

    Fig.  1  Application scenarios of 3D-BPP in logistics

    图  2  笛卡尔坐标系与箱子属性

    Fig.  2  Cartesian coordinate system and item properties

    图  3  容器状态表示

    Fig.  3  Representation of the bin state

    图  4  各编码器和解码器的结构

    Fig.  4  Structure of each encoder and decoder

    图  5  4种方向的装箱策略

    Fig.  5  The packing policy for the four direction

    图  6  四向协同装箱方法结构

    Fig.  6  Structure of the four directional cooperative packing method

    图  7  FDCP求解3D-SPP流程图

    Fig.  7  Flowchart of FDCP for solving 3D-SPP

    图  8  装箱结果可视化

    Fig.  8  Visualization of packing results

    图  9  FDCP在不同箱子数量算例上的测试结果

    Fig.  9  Test results of FDCP on instances with different number of items

    图  10  容器各区域放置次数热力图

    Fig.  10  Heat map of the number of placements in each area of the bin

    表  1  $ 100 \times 100 $容器装箱算例上的对比结果

    Table  1  Comparative results on packing instances with $ 100 \times 100 $ bin

    方法 $N=20$ $N=30$ $N=50$
    UR (%) Time (s) UR (%) Time (s) UR (%) Time (s)
    Heuristic-3DBPGA+DBLF70.217.569.436.366.371.9
    EP62.7<1.063.8<1.066.3<1.0
    LAFF58.6<1.059.1<1.061.9<1.0
    EBAFIT65.4<1.065.9<1.066.11.5
    DRL-3DBPMTSL62.44.860.110.255.323.0
    CQL67.01.069.31.273.63.3
    JIANG73.52.376.93.282.010.9
    QUE76.51.479.32.182.43.5
    FDCP79.41.981.73.183.65.2
    下载: 导出CSV

    表  2  $ 200 \times 200 $和$ 400 \times 200 $容器算例上各方法的空间利用率UR (%)

    Table  2  Space utilization (UR) of each method on instances with $ 200 \times 200 $ and $ 400 \times 200 $ bins (%)

    容器尺寸GA+DBLFEPLAFFEBAFITMTSLCQLJIANGQUEFDCP
    200$\times$20061.463.358.062.850.858.775.280.581.2
    400$\times$20058.760.155.460.546.947.570.576.776.8
    下载: 导出CSV

    表  3  消融实验结果 (%)

    Table  3  Results of ablation experiment (%)

    方法$N=20$$N=30$$N=50$
    FDCP79.481.783.6
    −CO75.979.281.4
    −FD75.979.081.5
    −PN76.579.581.9
    下载: 导出CSV
  • [1] Bortfeldt A, Wascher G. Constraints in container loading——A state-of-the-art review. European Journal of Operational Research, 2013, 229(1): 1−20 doi: 10.1016/j.ejor.2012.12.006
    [2] Jiao G S, Huang M, Song Y, Liu H B, Wang X W. Container loading problem based on robotic loader system: An optimization approach. Expert Systems With Applications, 2024, 236: Article No. 121222 doi: 10.1016/j.eswa.2023.121222
    [3] Consolini L, Laurini M, Locatelli M. A dynamic programming approach for cooperative pallet-loading manipulators. IEEE Transactions on Automation Science and Engineering, DOI: 10.1109/TASE.2023.3310007
    [4] Martello S, Pisinger D, Vigo D. The three-dimensional bin packing problem. Operations Research, 2000, 48(2): 256−267 doi: 10.1287/opre.48.2.256.12386
    [5] Hifi M. Exact algorithms for unconstrained three-dimensional cutting problems: A comparative study. Computers & Operations Research, 2004, 31(5): 657−674
    [6] Chen C S, Lee S M, Shen Q S. An analytical model for the container loading problem. European Journal of Operational Research, 1995, 80(1): 68−76 doi: 10.1016/0377-2217(94)00002-T
    [7] Dósa G, Sgall J. First Fit bin packing: A tight analysis. In: Proceedings of the 30th International Symposium on Theoretical Aspects of Computer Science (STACS 2013). Dagstuhl, Germany: Schloss Dagstuhl——Leibniz-Zentrum für Informatik, 2013. 538−549
    [8] Crainic T G, Perboli G, Tadei R. Extreme point-based heuristics for three-dimensional bin packing. INFORMS Journal on Computing, 2008, 20(3): 368−384 doi: 10.1287/ijoc.1070.0250
    [9] Karabulut K, Inceoglu M M. A hybrid genetic algorithm for packing in 3D with deepest bottom left with fill method. In: Proceedings of the Third International Conference on Advances in Information Systems. Berlin, Heidelberg: Springer, 2004. 441−450
    [10] Gurbuz M Z, Akyokus S, Emiroglu I, Guran A. An efficient algorithm for 3D rectangular box packing. In: Proceedings of the Selected AAS 2009 Papers. Skopje, Macedonia: Society for ETAI of Republic of Macedonia, 2009. 131−134
    [11] Parreño F, Alvarez-Valdés R, Tamarit J M, Oliveira J F. A maximal-space algorithm for the container loading problem. INFORMS Journal on Computing, 2008, 20(3): 412−422 doi: 10.1287/ijoc.1070.0254
    [12] Hasan J, Kaabi J, Harrath Y. Multi-objective 3D bin-packing problem. In: Proceedings of the 8th International Conference on Modeling Simulation and Applied Optimization (ICMSAO). Manama, Bahrain: IEEE, 2019. 1−5
    [13] 张德富, 魏丽军, 陈青山, 陈火旺. 三维装箱问题的组合启发式算法. 软件学报, 2007, 18(9): 2083−2089 doi: 10.1360/jos182083

    Zhang De-Fu, Wei Li-Jun, Chen Qing-Shan, Chen Huo-Wang. A combinational heuristic algorithm for the three-dimensional packing problem. Journal of Software, 2007, 18(9): 2083−2089 doi: 10.1360/jos182083
    [14] 何琨, 黄文奇. 基于动作空间的三维装箱问题的确定性高效率求解算法. 计算机学报, 2014, 37(8): 1786−1793

    He Kun, Huang Wen-Qi. An action space based deterministic efficient algorithm for solving the three-dimensional container loading. Chinese Journal of Computers, 2014, 37(8): 1786−1793
    [15] Wu Y, Li W K, Goh M, de Souza R. Three-dimensional bin packing problem with variable bin height. European Journal of Operational Research, 2010, 202(2): 347−355 doi: 10.1016/j.ejor.2009.05.040
    [16] Yang J L, Liu H W, Liang K B, Zhou L, Zhao J H. Variable neighborhood genetic algorithm for multi-order multi-bin open packing optimization. Applied Soft Computing, 2024, 163: Article No. 111890 doi: 10.1016/j.asoc.2024.111890
    [17] Silveira M E, Vieira S M, Sousa J M D C. An ACO algorithm for the 3D bin packing problem in the steel industry. In: Proceedings of the 26th International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems. Berlin, Heidelberg: Springer, 2013. 535−544
    [18] 张德富, 彭煜, 朱文兴, 陈火旺. 求解三维装箱问题的混合模拟退火算法. 计算机学报, 2009, 32(11): 2147−2156

    Zhang De-Fu, Peng Yu, Zhu Wen-Xing, Chen Huo-Wang. A hybrid simulated annealing algorithm for the three-dimensional packing problem. Chinese Journal of Computers, 2009, 32(11): 2147−2156
    [19] Tsao Y C, Tai J Y, Vu T L, Chen T H. Multiple bin-size bin packing problem considering incompatible product categories. Expert Systems With Applications, 2024, 247: Article No. 123340 doi: 10.1016/j.eswa.2024.123340
    [20] Bortfeldt A, Gehring H. Applying tabu search to container loading problems. In: Proceedings of the Operations Research Proceedings 1997. Berlin, Heidelberg: Springer, 1998. 533−538
    [21] 刘胜, 沈大勇, 商秀芹, 赵红霞, 董西松, 王飞跃. 求解三维装箱问题的多层树搜索算法. 自动化学报, 2020, 46(6): 1178−1187

    Liu Sheng, Shen Da-Yong, Shang Xiu-Qin, Zhao Hong-Xia, Dong Xi-Song, Wang Fei-Yue. A multi-level tree search algorithm for three dimensional container loading problem. Acta Automatica Sinica, 2020, 46(6): 1178−1187
    [22] Ren J D, Tian Y J, Sawaragi T. A tree search method for the container loading problem with shipment priority. European Journal of Operational Research, 2011, 214(3): 526−535 doi: 10.1016/j.ejor.2011.04.025
    [23] Zhu W B, Lim A. A new iterative-doubling Greedy-Lookahead algorithm for the single container loading problem. European Journal of Operational Research, 2012, 222(3): 408−417 doi: 10.1016/j.ejor.2012.04.036
    [24] Araya I, Moyano M, Sanchez C. A beam search algorithm for the biobjective container loading problem. European Journal of Operational Research, 2020, 286(2): 417−431 doi: 10.1016/j.ejor.2020.03.040
    [25] Vinyals O, Fortunato M, Jaitly N. Pointer networks. In: Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press, 2015. 2692−2700
    [26] Bello I, Pham H, Le Q V, Norouzi M, Bengio S. Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv: 1611.09940, 2017.
    [27] Jin Y, Ding Y D, Pan X H, He K, Zhao L, Qin T, et al. Pointerformer: Deep reinforced multi-pointer Transformer for the traveling salesman problem. In: Proceedings of the 37th AAAI Conference on Artificial Intelligence. Washington, USA: AAAI Press, 2023. 8132−8140
    [28] Tang Y H, Agrawal S, Faenza Y. Reinforcement learning for integer programming: Learning to cut. In: Proceedings of the 37th International Conference on Machine Learning. Vienna, Austria: PMLR, 2020. 9367−9376
    [29] Theodoropoulos T, Makris A, Psomakelis E, Carlini E, Mordacchini M, Dazzi P, et al. GNOSIS: Proactive image placement using graph neural networks & deep reinforcement learning. In: Proceedings of the IEEE 16th International Conference on Cloud Computing (CLOUD). Chicago, USA: IEEE, 2023. 120−128
    [30] Ahn S, Seo Y, Shin J. Deep auto-deferring policy for combinatorial optimization [Online], avaliable: https://openreview.net/forum?id=Hkexw1BtDr, March 12, 2024
    [31] Hu H Y, Zhang X D, Yan X W, Wang L F, Xu Y H. Solving a new 3D bin packing problem with deep reinforcement learning method. arXiv preprint arXiv: 1708.05930, 2017.
    [32] Duan L, Hu H Y, Qian Y, Gong Y, Zhang X D, Xu Y H, et al. A multi-task selected learning approach for solving 3D flexible bin packing problem. arXiv preprint arXiv: 1804.06896, 2019.
    [33] Verma R, Singhal A, Khadilkar H, Basumatary A, Nayak S, Singh H V, et al. A generalized reinforcement learning algorithm for online 3D bin-packing. arXiv preprint arXiv: 2007.00463, 2020.
    [34] Liu H W, Zhou L, Yang J L, Zhao J H. The 3D bin packing problem for multiple boxes and irregular items based on deep Q-network. Applied Intelligence, 2023, 53(20): 23398−23425 doi: 10.1007/s10489-023-04604-6
    [35] Li D D, Ren C W, Gu Z Q, Wang Y X, Lau F. Solving packing problems by conditional query learning [Online], avaliable: https://openreview.net/forum?id=BkgTwRNtPB, March 12, 2024
    [36] Jiang Y, Cao Z G, Zhang J. Learning to solve 3-D bin packing problem via deep reinforcement learning and constraint programming. IEEE Transactions on Cybernetics, 2023, 53(5): 2864−2875 doi: 10.1109/TCYB.2021.3121542
    [37] Zhang J W, Zi B, Ge X Y. Attend2Pack: Bin packing through deep reinforcement learning with attention. arXiv preprint arXiv: 2107.04333, 2021.
    [38] Que Q Q, Yang F, Zhang D F. Solving 3D packing problem using Transformer network and reinforcement learning. Expert Systems With Applications, 2022, 214: Article No. 119153
    [39] Parker J A, Kenyon R V, Troxel D E. Comparison of interpolating methods for image resampling. IEEE Transactions on Medical Imaging, 1983, 2(1): 31−39 doi: 10.1109/TMI.1983.4307610
  • 加载中
计量
  • 文章访问数:  363
  • HTML全文浏览量:  25
  • 被引次数: 0
出版历程
  • 收稿日期:  2024-03-12
  • 录用日期:  2024-07-04
  • 网络出版日期:  2024-10-29

目录

    /

    返回文章
    返回