Wang Wen-Sheng, Tan Ning, Huang Kai, Zhang Yu-Nong, Zheng Wei-Shi, Sun Fu-Chun

Citation: Wang Wen-Sheng, Tan Ning, Huang Kai, Zhang Yu-Nong, Zheng Wei-Shi, Sun Fu-Chun. Embodied intelligence systems based on large models: a survey. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c240542


doi: 10.16383/j.aas.c240542
Embodied Intelligence Systems Based on Large Models: A Survey

Funds: Supported by the General Program of the National Natural Science Foundation of China (62173352) and the Distinguished Young Scholars Fund of the Guangdong Basic and Applied Basic Research Foundation (2024B1515020104)

    Author Bio:

    WANG Wen-Sheng Master's student in Computer Technology at Sun Yat-sen University. He received his bachelor's degree in Measurement and Control Technology and Instruments from the School of Automation, University of Science and Technology Beijing, in 2023. His main research interest is embodied intelligence based on large models. E-mail: wangwsh23@mail2.sysu.edu.cn

    TAN Ning Associate professor at the School of Computer Science and Engineering, Sun Yat-sen University. He received his Ph.D. degree from the University of Franche-Comté, France, in 2013. His research interests include the modeling, design, simulation, optimization, planning, and control of various robotic systems, spanning both fundamental research and application development. Corresponding author of this paper. E-mail: tann5@mail.sysu.edu.cn

    HUANG Kai Professor at the School of Computer Science and Engineering, Sun Yat-sen University. He received his Ph.D. degree in computer science from ETH Zürich, Zürich, Switzerland, in 2010. His research interests include techniques for the analysis, design, and optimization of embedded systems, particularly in the automotive and robotic domains. E-mail: huangk36@mail.sysu.edu.cn

    ZHANG Yu-Nong Professor at the School of Computer Science and Engineering, Sun Yat-sen University. He received his Ph.D. degree from the Chinese University of Hong Kong in 2002. He is a Distinguished Scholar of the Pearl River Scholars Program in Guangdong Province and an Elsevier Highly Cited Researcher in China. His research interests include redundant robots, recurrent neural networks, Gaussian processes, scientific computing, and software-hardware development. E-mail: zhynong@mail.sysu.edu.cn

    ZHENG Wei-Shi Cheung Kong Scholar Distinguished Professor of the Ministry of Education, recipient of the Excellent Young Scientists Fund of the National Natural Science Foundation of China, recipient of the Royal Society-Newton Advanced Fellowship of the United Kingdom, and IAPR Fellow. His research interests include theories and methods of collaborative and interactive analysis, addressing visual computing problems in human behavior modeling and artificial intelligence (AI) robotic learning. E-mail: zhwshi@mail.sysu.edu.cn

    SUN Fu-Chun Professor in the Department of Computer Science and Technology, Tsinghua University. He received his Ph.D. degree from Tsinghua University in 1997. He is a recipient of the National Science Fund for Distinguished Young Scholars, vice director of the Chinese Association for Artificial Intelligence, and an IEEE Fellow. His main research interests are intelligent control, intelligent robotics, and embodied intelligence. E-mail: fcsun@tsinghua.edu.cn

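  • Abstract: Driven by the recent rapid development of large-scale pre-trained models that carry world knowledge, embodied intelligence based on large models has achieved good results across a variety of tasks, demonstrating strong generalization ability and broad application prospects in many fields. This article surveys work on embodied intelligence based on large models. It first introduces the perception and understanding roles that large models play in embodied intelligence systems. It then gives a fairly comprehensive summary of the four levels of control in which large models take part: the demand level, the task level, the planning level, and the action level. Next, it introduces different architectures of embodied intelligence systems and summarizes the current data sources for embodied intelligence models, including simulators, imitation learning, and video learning. Finally, it discusses and summarizes the challenges and future directions facing embodied intelligence systems based on large language models.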
  • Fig. 1  Overview of embodied intelligence work based on large models

    Fig. 2  Semantic feature scene representation based on NeRF[108]

    Fig. 3  Control hierarchy of embodied intelligence systems

    Fig. 4  VoxPoser plans a motion trajectory based on value maps[52]

    Fig. 5  Examples of different architectures in embodied AI

    Fig. 6  Diverse data collected by RT-X[55]

