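Reinforcement Learning Control for Bionic Robots Driven by Redundant Artificial Muscles

Niu Peng-Jun, Cheng Yi-Tao, Zhu Yan-Chen, Li Kan, Liu Ke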

Citation: Niu Peng-Jun, Cheng Yi-Tao, Zhu Yan-Chen, Li Kan, Liu Ke. Reinforcement learning control for bionic robots driven by redundant artificial muscles. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c250508

doi: 10.16383/j.aas.c250508 cstr: 32138.14.j.aas.c250508
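Funds: Supported by the National Key Research and Development Program of China (2022YFB4701900)
    Author Bio:

    NIU Peng-Jun  Ph.D. candidate at the School of Advanced Manufacturing and Robotics, Peking University. He received his bachelor's degree from the School of Mechanical Engineering and Automation, Beihang University in 2025. His research interests cover robot simulation and control. E-mail: pjniu25@stu.pku.edu.cn

    CHENG Yi-Tao  Ph.D. candidate at the School of Advanced Manufacturing and Robotics, Peking University. He received his bachelor's degree from the College of Engineering, Peking University in 2023. His research interests cover soft robots, robot perception and control, and human-machine interaction. E-mail: chengyitao@pku.edu.cn

    ZHU Yan-Chen  Ph.D. student at Huazhong University of Science and Technology. He received his bachelor's degree from the School of Mechanical Engineering, Sichuan University in 2023. His research interests include flexible electronics and soft robotics. E-mail: yanchenshizhu@hust.edu.cn

    LI Kan  Research Fellow at Huazhong University of Science and Technology. He received his Ph.D. degree in theoretical and applied mechanics from Northwestern University, USA, in 2019. His research interests include three-dimensional flexible micro aerial vehicles and three-dimensional flexible and stretchable electronic devices. E-mail: kanli@hust.edu.cn

    LIU Ke  Research Fellow at the School of Advanced Manufacturing and Robotics, Peking University. He received his Ph.D. degree from Georgia Institute of Technology, USA, in 2019. His research interests include the design, analysis, and application of flexible structures and soft machines. Corresponding author of this paper. E-mail: liuke@pku.edu.cn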

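  • Abstract: Artificial muscles are the core actuation components of bionic robots. However, current applications of artificial muscles remain far from real organisms and lack the redundant, multi-muscle coordination found in biology. Focusing on the actuation and coordination of complex artificial muscle systems in bionic robots, this paper proposes a soft robot design driven in parallel by multiple strands of artificial muscle and develops a reinforcement-learning-based motion control strategy for this design. A prototype was built around a flexible cross-shaped circuit board that integrates six liquid crystal elastomer artificial muscles with their drive circuits, and its strain characteristics and response performance were measured. Based on the deformation and motion characteristics of the prototype, a simplified tendon-driven model was constructed in a simulation environment. With suitably designed state space, action space, and reward function, parallel reinforcement learning training with the Soft Actor-Critic algorithm yielded muscle coordination policies for translation and rotation. The stable periodic segments of these policies were then used to drive the physical prototype offline, achieving effective multi-directional translation and rotation and verifying the feasibility of using reinforcement learning to control complex artificial muscle systems.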
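The offline deployment step mentioned in the abstract, replaying a stable periodic segment of a learned policy on the physical prototype, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: the helper names `estimate_period` and `extract_stable_cycle`, the autocorrelation-based period estimate, the choice to analyse only the latter half of the rollout, and the synthetic six-channel signal that stands in for a recorded policy rollout.

```python
import numpy as np


def estimate_period(actions: np.ndarray) -> int:
    """Estimate the dominant period, in control steps, of a (T, n) action array.

    Assumed method: sum the autocorrelation over all channels and return the
    first local maximum after the curve has first dropped below zero.
    """
    centered = actions - actions.mean(axis=0)
    n_steps = centered.shape[0]
    max_lag = n_steps // 2
    acf = np.array([
        np.sum(centered[: n_steps - lag] * centered[lag:]) / (n_steps - lag)
        for lag in range(max_lag)
    ])
    negative = np.flatnonzero(acf < 0)
    if negative.size == 0:
        raise ValueError("no oscillation detected in the action sequence")
    for lag in range(int(negative[0]) + 1, max_lag - 1):
        if acf[lag] > 0 and acf[lag - 1] <= acf[lag] >= acf[lag + 1]:
            return lag
    raise ValueError("no repeating cycle found within the analysed window")


def extract_stable_cycle(actions: np.ndarray, stable_fraction: float = 0.5) -> np.ndarray:
    """Return one period of actions taken from the end of the rollout.

    Only the last `stable_fraction` of the rollout is analysed, on the
    assumption that start-up transients have died out by then.
    """
    n_stable = max(1, int(len(actions) * stable_fraction))
    stable = actions[-n_stable:]
    period = estimate_period(stable)
    return stable[-period:]


# Synthetic stand-in for a recorded rollout: six muscle commands in [0, 1],
# phase-shifted sinusoids with a 20-step period and a ramp-up transient.
steps = np.arange(200)
phases = 2 * np.pi * np.arange(6) / 6
ramp = np.clip(steps / 50.0, 0.0, 1.0)[:, None]
rollout = 0.5 + 0.5 * ramp * np.sin(2 * np.pi * steps[:, None] / 20 + phases)

cycle = extract_stable_cycle(rollout)
replay = np.tile(cycle, (3, 1))  # open-loop command table: three repeated cycles
print(cycle.shape, replay.shape)
```

Because the replay is open loop, a table like `replay` could be sent to the muscle drivers without running the policy network on the robot; how the commands are actually timed and transmitted depends on the hardware and is outside this sketch.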
  • Fig. 1  The prototype

    Fig. 2  Fabrication and encapsulation of artificial muscles: (a) Initial state (b) Start of actuation (c) End of actuation (d) Soldering copper sheets (e) Different specifications (f) Embedding in the circuit

    Fig. 3  Deformation characteristics

    Fig. 4  Tension-compression mechanical testing: (a) Force output experiment on an artificial muscle (b) Force output curve of the 2.9 cm artificial muscle (c) Force output curve of the 4.2 cm artificial muscle (d) FPCB compression deformation experiment, before compression (e) FPCB compression deformation experiment, after compression (f) FPCB compression

    Fig. 5  Artificial muscle actuation response

    Fig. 6  Component selection and circuit design: (a) Selection of some circuit components (b) Top-surface circuit (c) Bottom-surface circuit

    Fig. 7  Artificial muscle actuation scheme: (a) "One-to-many" control relationship (b) "Control-driver" implementation scheme (c) "Driver-feedback" implementation scheme

    Fig. 8  Simplified model structure

    Fig. 9  Physical-simulation deformation correspondence

    Fig. 10  SAC network structure

    Fig. 11  Parallel training

    Fig. 12  Reinforcement learning training results

    Fig. 13  Offline translation along the x-axis

    Fig. 14  Offline translation along the y-axis

    Fig. 15  Offline rotation about the yaw axis

Publication history
  • Published online: 2026-04-24
