Abstract: As an important branch of robot skill learning, imitation learning has been widely applied in robotic systems in recent years. It offers a relatively direct way to transfer human skills to robots: motion features are first extracted from a small number of demonstrations and then generalized to new situations. This paper reviews the literature on imitation learning of robot motion trajectories. The basic problems in imitation learning, such as skill adaptation, convergence and extrapolation, are first described in detail. The principles of the main imitation learning algorithms are then introduced, including dynamical movement primitives, probabilistic movement primitives and kernelized movement primitives. Next, several key problems are discussed in depth, e.g., the learning of orientations and stiffness matrices, synergy and uncertainty prediction, and imitation learning in human-robot interaction. Finally, possible future directions, for instance the combination of imitation learning and causal inference, are discussed.
Key words: Robot learning, imitation learning, movement primitive, trajectory learning
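A modality is a specific way in which people receive information. Because multimedia data usually carry several kinds of information at once, multimodal learning has gradually become the main approach to multimedia content analysis and understanding. Multimodal learning has also been applied in medicine. For Alzheimer's disease, Han et al.[1] proposed fusing feature information from magnetic resonance imaging (MRI) and positron emission computed tomography (PET) images, and their experiments showed that the method achieves good accuracy. To overcome the limitations of conventional single-modality medical images, Zhang et al.[2] proposed a free-form deformation method for fusing multimodal medical images. However, most researchers fuse only medical images of different modalities and do not include text-modality data such as electronic medical records. Lung cancer is one of the diseases with the highest incidence and mortality worldwide[3]. A patient being diagnosed for lung disease needs a CT examination, and a radiologist writes an examination description of the CT images. In actual diagnosis and treatment, however, the attending physician usually makes a further judgment based on both the examination description and the CT images. This process not only increases the attending physician's workload but also leads to inefficient use of medical resources.
For this reason, in addition to the CT images, this paper incorporates the radiologist's textual description of the CT images and other test results (e.g., carcinoembryonic antigen and squamous cell carcinoma antigen assays) and builds a deep learning model to predict lung disease. The CT images, the examination description and the other test results are fed into the model, which classifies the disease and outputs the probability of illness. Patients with a high probability are referred to the attending physician for further diagnosis and treatment, which reduces the attending physician's workload and improves efficiency.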
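1. Data Preprocessing
The electronic medical record data collected in this paper consist of three parts: examination descriptions, CT images and test results.
Although the examination descriptions are written by different physicians, medical terms are written in the same way. The main problems, introduced when the records were typed in, are wrong characters and homophone substitutions. Consider the record “双肺实质未见明显异常密度, 双肺门不大, 纵膈内未见明确肿大淋巴结 ··· 肺癌不除外纵隔淋巴结增大, 肝脏内见斑片状高密度影, 门静脉周围间隙增宽.” Besides uncommon medical terms such as “纵隔淋巴结” (mediastinal lymph nodes) and “斑片状高密度影” (patchy high-density shadow), it contains a character inconsistency: the same word appears as both “纵膈” and “纵隔”. This paper uses a predefined lexicon to handle the segmentation of common medical abbreviations, and then encodes the text with Multi-head attention and Bi-LSTM to reduce the comprehension problems caused by homophone substitutions or grammatical errors.
CT images are acquired by imaging devices, but differences in device parameters and interference from the environment cause variations in the acquired images, which affect model accuracy. This paper processes the CT images with denoising and normalization.
The other test results mainly comprise sputum cytology, pleural fluid examination, routine blood tests and tumor marker screening. Sputum and pleural fluid cytology mainly determine whether tumor cells are present in the sputum or pleural fluid. Routine blood tests cover white blood cells, red blood cells, platelets and the acid-base properties of cells. The tumor markers used for lung cancer screening mainly include carcinoembryonic antigen (CEA), cancer antigen 125 (CA125) and cytokeratin 19 fragment (CYFRA21-1).
Because the data consist of text and images, the two parts are processed separately.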
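1.1 Text Data Preprocessing
1.1.1 Preprocessing of Examination Descriptions
Since the advent of deep learning, neural word-embedding models have become mainstream. GloVe[4] learns broader co-occurrence probabilities from a word co-occurrence matrix. CoVe[5] adds context-aware representation vectors from a neural machine translation encoder to the word embeddings so that the model learns contextualized semantics. BERT (Bidirectional encoder representations from transformers) uses a multi-layer Transformer[6] encoder to learn the semantic dependencies on both sides of a word, and uses a masked language model (MLM) to solve the "mirror problem", in which the input of the model can see itself inside the multi-layer Transformer structure. ERNIE[7] proposed knowledge integration and a dialogue language model, and optimized BERT for general-domain Chinese natural language processing tasks.
This paper uses jieba for word segmentation. Because short medical texts contain many domain-specific terms and abbreviations, a medical lexicon is added during segmentation. The lexicon is built from medical vocabulary crawled from the web and from common lung CT description terms summarized by radiologists. Many words in the text occur frequently but do not help classification. For example, words such as “无” (none), “可” (may) and “检查” (examination) often appear in the examination descriptions, do not reflect differences between records during training, and only add to the learner's burden. Such words are generally called stop words and are removed during segmentation. The segmented text must then be vectorized: this paper trains word vectors with the word2vec (Word to vector) model, and adds position embeddings and Multi-head attention to the model to better represent text semantics.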
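1.1.2 Preprocessing of Test Results
The test results mainly comprise sputum cytology, pleural fluid examination, routine blood tests and tumor marker screening. The test items are shown in Table 1. For each item, the electronic medical record gives the reference range, the test name, the status and the result value. Because different items have different units and scales, the result values differ greatly. This paper therefore uses the status as the model input, mapping a normal status to 0 and an abnormal status (high or low) to 1.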
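The status mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the status strings, the item names and the choice to treat a missing item as normal are all assumptions made for the example.

```python
# Hypothetical status label; the text only states: normal -> 0, high or low -> 1.
NORMAL = "normal"


def encode_status(status: str) -> int:
    """Map a test status to a binary feature: 0 if normal, 1 otherwise."""
    return 0 if status.strip().lower() == NORMAL else 1


def encode_record(record: dict, item_order: list) -> list:
    """Encode one patient's statuses as a fixed-order 0/1 vector.

    Items missing from the record are treated as normal (an assumption).
    """
    return [encode_status(record.get(item, NORMAL)) for item in item_order]


# Illustrative subset of the items in Table 1 (English names are assumptions).
items = ["CEA", "CA125", "CYFRA21-1", "glucose", "total cholesterol"]
patient = {
    "CEA": "normal",
    "CA125": "normal",
    "CYFRA21-1": "high",
    "glucose": "high",
    "total cholesterol": "low",
}
features = encode_record(patient, items)
print(features)
```

Using the status rather than the raw value sidesteps the unit and scale differences between items, at the cost of discarding how far a value lies outside its reference range.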
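Table 1 Test items

| Category | Reference range | Test name | Status | Result |
| --- | --- | --- | --- | --- |
| Routine blood test | 0 ~ 0.1 | Basophils | Normal | 0.01 |
| Routine blood test | 0.05 ~ 0.5 | Eosinophils | Normal | 0.07 |
| Routine blood test | 0 ~ 1 | Basophil ratio | Normal | 0.20 % |
| Routine blood test | 110 ~ 160 | Hemoglobin | Normal | 128 g/L |
| Routine blood test | 100 ~ 300 | Platelets | Normal | $135 \times 10^{9}/{\rm{L}}$ |
| Routine blood test | 3.5 ~ 5.5 | Red blood cells | Normal | 4.25 |
| Routine blood test | 37 ~ 50 | Red cell distribution width | Normal | 43.90 % |
| Routine blood test | 4 ~ 10 | White blood cells | Normal | $6.18 \times 10^{9}/{\rm{L}}$ |
| Routine blood test | 86 ~ 100 | Mean corpuscular volume | Normal | 88.2 fL |
| Sputum examination | No tumor cells | Sputum cells | Normal | No tumor cells |
| Tumor markers | 5 μg/ml | CEA (Carcinoembryonic antigen) | Normal | 2.31 |
| Tumor markers | 30 U/ml | CA125 (Cancer antigen 125) | Normal | 13.70 U/ml |
| Tumor markers | 8.20 U/ml | CA72-4 (Cancer antigen 72-4) | Normal | 1.34 U/ml |
| Tumor markers | 16.3 ng/ml | NSE (Neuron-specific enolase) | Normal | 15.18 ng/ml |
| Tumor markers | 1.5 ng/ml | SCC (Squamous cell carcinoma) | Normal | 0.8 ng/ml |
| Tumor markers | 2.0 ng/ml | CYFRA21-1 (Cytokeratin fragment 19) | High | 7.31 ng/ml |
| Pleural fluid test | 0.38 ~ 2.1 | Triglycerides | Normal | 0.74 mmol/L |
| Pleural fluid test | 0.8 ~ 1.95 | High-density lipoprotein | Normal | 1.31 mmol/L |
| Pleural fluid test | 3.8 ~ 6.1 | Glucose | High | 10.11 mmol/L |
| Pleural fluid test | 2 ~ 4 | Low-density lipoprotein | Normal | 2.02 mmol/L |
| Pleural fluid test | 109 ~ 271 | Lactate dehydrogenase | Normal | 205.2 U/L |
| Pleural fluid test | 0 ~ 6.8 | Direct bilirubin | Normal | 3.49 μmol/L |
| Pleural fluid test | 3.6 ~ 5.9 | Total cholesterol | Low | 3.54 mmol/L |
| Pleural fluid test | 20 ~ 45 | Globulin | Normal | 31.7 g/L |

1.2 Image Data Preprocessing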
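In computer-aided diagnosis, most work focuses on classifying lung cancer as benign or malignant from lung CT images. Sun et al.[8] used a single-layer CNN (Convolutional neural network), an SDAE (Stacked denoising autoencoder) composed of three DAEs (Denoising autoencoders) and a DBN (Deep belief network) with four RBM (Restricted Boltzmann machine) layers to classify lung nodules as benign or malignant. Xiao et al.[9] added a convolutional layer and used a CNN (two convolutional layers, two pooling layers and two fully connected layers) and a DBN (two RBM layers) for the same task, with clearly improved results. Cheng et al.[10] fed several parameters of the nodule region of interest, together with the region itself, into an SDAE model, and compared a Single model, which uses only the middle slice of a nodule, with an All model, which uses all slices. The All model improved accuracy by about 11 % and AUC by about 5 % over the Single model. Nibali et al.[11] applied deep residual networks and transfer learning to lung cancer classification. Because a deep residual network reduces the risk of vanishing gradients as depth increases, transfer learning with ImageNet as the source domain reached a classification accuracy of 89.9 % and an AUC (Area under curve) of 0.946. Shen et al.[12] proposed a CNN with a multi-level cropping structure that captures image features at different scales and thereby strengthens classification. It reached an accuracy of 87.1 % and an AUC of 0.93.
A comparison of the existing methods shows that classification accuracy has improved markedly but is still not high enough. On the one hand, the models are too simple. On the other hand, they are not tuned to the target data, so there is considerable room for improvement.
Because CT images are produced with different scanning and reconstruction methods, they contain unwanted artifacts and noise, such as nodule-like spherical structures, which resemble the regions of interest to some degree. If the noise is not removed, the quality of subsequent feature extraction suffers severely, which in turn reduces model accuracy. Our experiments show that a Gaussian filter denoises better than filters such as the mean filter and also preserves edge information better. In addition, to speed up convergence, image pixels are normalized or standardized. In this paper, the pixel values of the denoised images are normalized to integers from 0 to 255. The processed images are fed to a model built on a residual neural network, which is described in the image model section of the experiments.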
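The two preprocessing steps named above, Gaussian denoising followed by normalization to integers in 0 to 255, can be sketched as follows. This is a dependency-free illustration under stated assumptions: the kernel radius, the value of sigma, the border handling (clamping) and min-max scaling as the normalization rule are choices made for the example, since the text does not specify them.

```python
import math


def gaussian_kernel(radius: int, sigma: float) -> list:
    """1-D Gaussian weights on [-radius, radius], normalized to sum to 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image: list, kernel: list) -> list:
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xx]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def transpose(image: list) -> list:
    return [list(col) for col in zip(*image)]


def gaussian_blur(image: list, radius: int = 1, sigma: float = 1.0) -> list:
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(radius, sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def normalize_to_uint8(image: list) -> list:
    """Min-max scale pixel values to integers in [0, 255]."""
    lo = min(min(row) for row in image)
    hi = max(max(row) for row in image)
    if hi == lo:  # constant image: avoid division by zero
        return [[0 for _ in row] for row in image]
    scale = 255.0 / (hi - lo)
    return [[int(round((v - lo) * scale)) for v in row] for row in image]


# Toy 3x3 "slice" with one bright pixel (values are illustrative only).
ct_slice = [[-1000, -1000, -1000],
            [-1000, 400, -1000],
            [-1000, -1000, -1000]]
denoised = gaussian_blur(ct_slice)
normalized = normalize_to_uint8(denoised)
```

In practice an image library would do the filtering; the point here is only the order of operations, denoise first and rescale afterwards, so that isolated noise spikes do not determine the normalization range.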
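2. Experiments
The model structure is shown in Fig. 1. The model consists of three parts: a text part, an image part and a multilayer perceptron (MLP). The text part takes the text of the electronic medical record (the CT description written by the radiologist), the image part takes the CT images, and the MLP takes the other test results. The outputs of the three parts are concatenated and passed through a fully connected layer to produce the final output. The loss function is the cross-entropy:
$$ L = -\frac{1}{n}\sum\limits_{i=1}^{n}\left[y_i \ln(a_i) + (1-y_i) \ln(1 - a_i)\right] $$ (1)
where $ y $ is the true label, $ a $ is the predicted value and $ n $ is the number of samples.
2.1 Text Model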
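The text is modeled mainly with Bi-LSTM and Multi-head attention. The input layer is the word vector plus the position vector, and Multi-head attention is applied right after the input layer. Finally, multiple features are concatenated and fused to further improve the feature representation ability of the model.
2.1.1 Word Embedding
This paper uses word-level word vectors. Because the text corpus is relatively small, word vectors trained on it would be semantically poor. The Tencent pre-trained word vectors cover more than 8 million Chinese words and, compared with other public pre-trained word vectors, offer good coverage and freshness, so this paper uses them.
Words in a medical record carry different meanings at different positions. Adding a position vector to the word vector lets the model distinguish words at different positions, so the position embedding is fed to the model as an auxiliary input alongside the word vector. Relative position is crucial in a language sequence, whereas a position embedding by itself encodes absolute position. This paper therefore defines the position embedding as follows: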
$$ \begin{split} & {\boldsymbol{PE}}_{2 i}(p) = \sin \left(\frac{p} {10\;000^{2 i / d_{pos}}} \right)\\ &{\boldsymbol{PE}}_{2 i+1}(p) = \cos \left(\frac{p} {10\;000^{2 i / d_{pos}}}\right) \end{split}$$ (2)
where ${\boldsymbol{PE}}$ denotes the position embedding, $ p $ is the position of the word and $ d_{pos} $ is the dimension. The formula maps the position of a word onto $ d_{pos} $ dimensions using trigonometric functions.
2.1.2 Multi-head Attention
Multi-head attention essentially performs Self-attention several times. It lets the model obtain features from different representation subspaces and thus capture more contextual information in a sentence.
Self-attention is essentially a way of encoding information, similar to convolution in a CNN. It is defined as:
$$\begin{split} {\rm{Attention}}({\boldsymbol{Q}},{\boldsymbol{K}},{\boldsymbol{V}}) =\; & {\mathop{\rm softmax}\nolimits} \left( \left[ \begin{array}{c} v_1\\ v_2\\ \vdots \\ v_n \end{array} \right] \left[ v_1^{\rm{T}}, v_2^{\rm{T}}, \cdots, v_n^{\rm{T}} \right] \right) \left[ \begin{array}{c} v_1\\ v_2\\ \vdots \\ v_n \end{array} \right] \\ =\; & {\mathop{\rm softmax}\nolimits} ({\boldsymbol{Q}}{\boldsymbol{K}}^{\rm{T}}){\boldsymbol{V}} \end{split}$$ (3)
where $ {\boldsymbol{Q}} $ is the Query vector, $ {\boldsymbol{K}} $ is the Key vector and $ {\boldsymbol{V}} $ is the Value vector. The matrices $ W_{q} $, $ W_{k} $ and $ W_{v} $ map the input word vectors to $ {\boldsymbol{Q}} $, $ {\boldsymbol{K}} $ and $ {\boldsymbol{V}} $, which are then combined by the weighted sum in the formula to encode the text. Running Self-attention $ k $ times and concatenating the results gives Multi-head attention.
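Equations (2) and (3) can be made concrete with a small, dependency-free sketch. It follows the formulas as written, so the attention scores are not scaled, and it makes several assumptions for illustration: the position embedding is added element-wise to the word vector, the projection matrices are random rather than learned, and all sizes are toy values.

```python
import math
import random


def position_embedding(p: int, d_pos: int) -> list:
    """Eq. (2): sine on even indices 2i, cosine on odd indices 2i+1."""
    pe = []
    for j in range(d_pos):
        i = j // 2
        angle = p / (10000 ** (2 * i / d_pos))
        pe.append(math.sin(angle) if j % 2 == 0 else math.cos(angle))
    return pe


def matmul(a: list, b: list) -> list:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row: list) -> list:
    m = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x: list, w_q: list, w_k: list, w_v: list) -> list:
    """Eq. (3): softmax(Q K^T) V, with Q = x W_q, K = x W_k, V = x W_v."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scores = matmul(q, [list(col) for col in zip(*k)])  # Q K^T
    weights = [softmax(r) for r in scores]              # row-wise softmax
    return matmul(weights, v)


def multi_head_attention(x: list, heads: list) -> list:
    """Run self-attention once per head and concatenate along features."""
    outputs = [self_attention(x, *head) for head in heads]
    return [sum((out[t] for out in outputs), []) for t in range(len(x))]


def rand_matrix(rows: int, cols: int) -> list:
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)]
            for _ in range(rows)]


random.seed(0)
seq_len, d_model, d_head, n_heads = 3, 4, 2, 2  # toy sizes (assumptions)
words = rand_matrix(seq_len, d_model)           # stand-in word vectors
x = [[w + e for w, e in zip(words[p], position_embedding(p, d_model))]
     for p in range(seq_len)]
heads = [(rand_matrix(d_model, d_head), rand_matrix(d_model, d_head),
          rand_matrix(d_model, d_head)) for _ in range(n_heads)]
encoded = multi_head_attention(x, heads)
```

Each output row is a weighted mixture of all value vectors, which is why the encoding is insensitive to word order unless position information is injected at the input.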
2.1.3 Bi-LSTM
When word vectors pass through Multi-head attention, Self-attention encodes information from the context vectors of the input without considering word order. For this reason, position embeddings are added at the input layer and, in addition, a Bi-LSTM is placed after Multi-head attention. LSTM (Long short-term memory)[13] was proposed to alleviate the vanishing gradient problem of RNNs. An LSTM unit has three gates: the forget gate ${\boldsymbol{f}}_{t}$, the input gate ${\boldsymbol{i}}_{t}$ and the output gate ${\boldsymbol{o}}_{t}$[14]. Suppose that at time $ t $ the input is ${\boldsymbol{x}}_{t}$, the hidden-layer output at time $ t-1 $ is ${\boldsymbol{h}}_{t-1}$ and the cell state at time $ t-1 $ is ${\boldsymbol{C}}_{t-1}$. The LSTM states at time $ t $ are then:
$$ \begin{split} {\boldsymbol{f}}_{t} =\;& \sigma\left({\boldsymbol{W}}_{f} \times\left[{\boldsymbol{h}}_{t-1}, {\boldsymbol{x}}_{t}\right]+{\boldsymbol{b}}_{f}\right) \\ {\boldsymbol{i}}_{t} =\; & \sigma\left({\boldsymbol{W}}_{i} \times\left[{\boldsymbol{h}}_{t-1}, {\boldsymbol{x}}_{t}\right]+{\boldsymbol{b}}_{i}\right) \\ \tilde{{\boldsymbol{C}}}_{t} =\; & \tanh \left({\boldsymbol{W}}_{C} \times\left[{\boldsymbol{h}}_{t-1}, {\boldsymbol{x}}_{t}\right]+{\boldsymbol{b}}_{C}\right) \\ {\boldsymbol{C}}_{t} =\;& {\boldsymbol{f}}_{t} \times {\boldsymbol{C}}_{t-1}+{\boldsymbol{i}}_{t} \times \tilde{{\boldsymbol{C}}}_{t} \\ {\boldsymbol{o}}_{t} =\; & \sigma\left({\boldsymbol{W}}_{o} \times\left[{\boldsymbol{h}}_{t-1}, {\boldsymbol{x}}_{t}\right]+{\boldsymbol{b}}_{o}\right) \\ {\boldsymbol{h}}_{t} =\;& {\boldsymbol{o}}_{t} \times \tanh \left({\boldsymbol{C}}_{t}\right) \end{split} $$ (4)
These computations yield the LSTM hidden state at time $ t $. Because an LSTM models a sentence only from front to back, it cannot encode information from back to front. This paper therefore uses a Bi-LSTM (bidirectional LSTM), which better captures semantic information in both directions.
2.1.4 Soft Attention
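Soft attention is the conventional attention mechanism. It keeps the intermediate outputs of the Bi-LSTM encoder over the input sequence, computes the dot product of each intermediate result with the others, and finally takes a weighted sum:
$$ \begin{split} {\boldsymbol{M}} =\; &\tanh ({\boldsymbol{H}})\\ {\boldsymbol{\alpha}} =\;&{\mathop{\rm softmax}\nolimits} \left( {{{\boldsymbol{w}}^{\rm{T}}}{\boldsymbol{M}}} \right)\\ {\boldsymbol{r}} =\; &{\boldsymbol{H}}{{\boldsymbol{\alpha}} ^{\rm{T}}} \end{split}$$ (5)
where ${\boldsymbol{H}}$ is the output of the Bi-LSTM hidden layer and ${\boldsymbol{w}}$ is a parameter to be learned. This attention layer captures the alignment between each word in the sequence and other words in the input sequence. This paper uses multiplicative attention implemented with highly optimized matrix multiplication, so the overall computational cost stays close to that of a single attention computation while the feature representation ability of the model improves.
2.2 Multilayer Perceptron (MLP)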
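The third part of the model is a multilayer perceptron (MLP), which consists of an input layer, hidden layers and an output layer. Experiments show that the number of hidden layers should not be too large: more layers mean more parameters and a higher risk of overfitting, and beyond a certain depth additional hidden layers bring little improvement and sometimes reduce classification performance. The MLP therefore uses three hidden layers, with the parameters shown in Table 2.
Table 2 Parameters of the MLP

| Name | Number of nodes | Activation function |
| --- | --- | --- |
| Hidden1 | 65 | Sigmoid |
| Hidden2 | 131 | Sigmoid |
| Hidden3 | 263 | Sigmoid |

2.3 Image Model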
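The image convolution part is built on the ResNet-50 architecture, pre-trained on the ImageNet dataset and then fine-tuned. The model structure is shown in Fig. 2. ResNet has two basic blocks. One is the identity block, whose input and output dimensions are the same, so several can be connected in series. The other is the convolutional block (ConvBlock), whose input and output dimensions differ, so it cannot be stacked consecutively. Its purpose is to change the dimension of the feature vector.
Sufficient discriminative information in the image is an important condition for a convolutional neural network to learn the features of different lung cancers[15]. Image size affects the network's ability to distinguish features: if the image is too small, subtle features cannot be extracted, and if it is too large, computer memory becomes a limit. A suitable image size must therefore be chosen. Because this paper uses the ResNet-50 (Residual neural network) network, the input images are resized to $ 224 \times 224 $.
2.4 Experimental Setup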
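The experiments were run on a machine with the CentOS operating system, an Intel(R) Xeon(R) E5-2630 CPU and an NVIDIA Tesla M4 GPU. The deep learning framework is Keras 2.2.4 with a TensorFlow 1.13 backend.
This paper reports two experiments. The first tests the effect of the Multi-head attention, Bi-LSTM and Soft attention layers in the deep text model. The second compares the deep text model, the deep image model, the MLP and the combined text-image model.
To verify the advantages of the model and compare performance, the second experiment implements the following models: a baseline, VGG-19 (Visual geometry group) pre-trained on ImageNet; three single-modality models, namely the deep image model (Img-net), the multilayer perceptron (MLP) and the deep text model (Text-net); and the multimodal models Img+Text, Img+MLP and MLP+Text. Text-net removes the image convolution part and adds a fully connected layer followed by an output layer with a cross-entropy loss. Img-net removes the deep text model and adds a fully connected layer followed by an output layer with a cross-entropy cost function. The MLP is a multilayer perceptron network that predicts from the test results only. TI-net is the combined text-image model: its inputs are images, text and the other numerical values. The data pass through their respective sub-models, and the outputs are concatenated and passed through a fully connected layer to produce the output. To reduce perturbation between models, the single models Text-net, Img-net and MLP are each pre-trained on their own inputs, and the multimodal models are initialized with the pre-trained single-model weights and then fine-tuned.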
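The fusion step of the combined model, concatenating the outputs of the three branches and scoring the result with a cross-entropy loss, can be sketched as below. This is a simplified illustration, not the authors' Keras model: the fully connected layer is collapsed into a single sigmoid unit, and the feature values, weights and bias are made-up numbers.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fuse_and_predict(text_feat, image_feat, mlp_feat, weights, bias):
    """Concatenate the branch outputs and apply one dense sigmoid unit."""
    fused = list(text_feat) + list(image_feat) + list(mlp_feat)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    return sigmoid(sum(w * f for w, f in zip(weights, fused)) + bias)


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Eq. (1): mean of -[y ln(a) + (1 - y) ln(1 - a)] over the samples."""
    total = 0.0
    for y, a in zip(labels, probs):
        a = min(max(a, eps), 1.0 - eps)  # keep the logarithms finite
        total += y * math.log(a) + (1 - y) * math.log(1 - a)
    return -total / len(labels)


# Hypothetical branch outputs for one patient (values are illustrative).
text_feat, image_feat, mlp_feat = [0.2, -0.1], [0.7, 0.3, 0.5], [1.0]
weights = [0.5, -0.4, 0.8, 0.1, 0.3, 0.6]
prob = fuse_and_predict(text_feat, image_feat, mlp_feat, weights, bias=-0.5)
loss = binary_cross_entropy([1], [prob])
```

Dropping one branch from the concatenation gives the two-modality variants (for example Img+Text), which is what makes the ablation comparison straightforward.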
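The experimental data contain 3 785 samples. This paper studies a binary classification problem, namely deciding whether a patient has lung cancer. Unlike general classification problems, datasets for disease diagnosis are often imbalanced, so the imbalanced samples need to be handled. Because the amount of data in this paper is relatively large, sampling is used to balance the dataset: the full data are sampled at a positive-to-negative ratio of 1:2. The resulting distribution is shown in Table 3.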
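One way to obtain a 1:2 ratio is to randomly undersample the majority class. The sketch below assumes that negatives are the majority and that sampling is done without replacement; the text does not say which sampling scheme was actually used, so this is an illustration rather than the authors' procedure.

```python
import random


def undersample(samples, labels, neg_per_pos=2, seed=0):
    """Keep all positives and at most neg_per_pos negatives per positive."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), neg_per_pos * len(pos)))
    keep = sorted(pos + keep_neg)  # preserve the original record order
    return [samples[i] for i in keep], [labels[i] for i in keep]


# Toy data: 10 positive and 50 negative records.
labels = [1] * 10 + [0] * 50
samples = list(range(len(labels)))
balanced_samples, balanced_labels = undersample(samples, labels)
```

Fixing the seed keeps the selected subset reproducible across runs, which matters when several models are compared on the same data.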
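Table 3 Positive and negative sample ratio

| Class | Number of samples |
| --- | --- |
| Positive samples | 1 262 |
| Negative samples | 2 523 |

To evaluate the model, the original data are split into a training set and a validation set at a ratio of 8:2. The three models are trained on the training set and then evaluated on the validation set. To guard against chance results, k-fold cross-validation is used during training, and the experiments show that k = 7 works relatively well. For both the training and validation sets, the maximum text length is set to 80 and the word vector dimension to 200. The optimizer is Adam with an initial learning rate of 0.01 and a decay factor of 0.0001. Training runs for up to 2 000 epochs, and EarlyStopping is used to stop training early and prevent overfitting. The evaluation metrics are accuracy, precision and recall.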
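The evaluation protocol can be sketched without any library as follows. The fold construction (shuffle, then deal indices round-robin into k folds) is one common choice and an assumption here; how the 8:2 split and the 7-fold cross-validation were combined is not stated in the text.

```python
import random


def k_fold_indices(n_samples: int, k: int = 7, seed: int = 0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


splits = list(k_fold_indices(3785, k=7))
metrics = classification_metrics([1, 1, 0, 0], [1, 0, 0, 1])
```

Reporting the mean and spread of each metric over the folds gives figures in the same "value ± deviation" form used in the result tables.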
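2.5 Experimental Results
The results of Experiment 1, which tests the effect of the Multi-head attention, Bi-LSTM and Soft attention layers, are shown in Table 4. Text-net uses all layers, Text-net1 removes the Multi-head attention layer, Text-net2 removes the Bi-LSTM layer and Text-net3 removes the Soft attention layer. Text-net outperforms the other three models. Comparing Text-net with Text-net1 and Text-net2, adding Multi-head attention raises test accuracy by about 6 percentage points (81.21 % vs. 74.91 %) and adding Bi-LSTM raises it by about 3 percentage points (81.21 % vs. 78.43 %), so the Multi-head attention layer contributes more than the Bi-LSTM layer. Comparing Text-net with Text-net3, adding the Soft attention layer raises test accuracy by about 3 percentage points (81.21 % vs. 78.19 %). The reason is that the Bi-LSTM layer models the text only as a sequence and lacks hierarchical information, whereas the subsequent Soft attention layer models hierarchical information over the Bi-LSTM encoding.
Table 4 The result of experiment 1

| Model name | Train accuracy (%) | Train precision (%) | Train recall (%) | Test accuracy (%) | Test precision (%) | Test recall (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Text-net | 83.12 ± 0.02 | 80.10 ± 0.05 | 81.12 ± 0.02 | 81.21 ± 0.01 | 79.82 ± 0.03 | 80.15 ± 0.01 |
| Text-net1 | 76.87 ± 0.02 | 75.29 ± 0.01 | 75.11 ± 0.03 | 74.91 ± 0.02 | 73.41 ± 0.02 | 74.07 ± 0.03 |
| Text-net2 | 80.49 ± 0.03 | 78.16 ± 0.04 | 78.82 ± 0.03 | 78.43 ± 0.02 | 77.15 ± 0.01 | 78.59 ± 0.02 |
| Text-net3 | 79.73 ± 0.02 | 77.19 ± 0.02 | 76.92 ± 0.01 | 78.19 ± 0.02 | 76.79 ± 0.03 | 75.57 ± 0.02 |

The results of Experiment 2 are shown in Table 5. On the training set, the baseline VGG-19 reaches an accuracy of 92.53 % and Img-Net (ResNet-50) reaches 93.85 %, so among the deep image models ResNet-50 clearly performs better. Comparing the single-modality and multimodal models Img-net, Img+Text, Img+MLP and TI-net, adding the CT examination description improves accuracy by about 1 %, adding the test results improves it by about 2 %, and adding both improves accuracy by about 3.2 %, precision by about 4 % and recall by about 4 %. The models based on multimodal data therefore outperform the single-modality models. Among the single-modality models, Img-net performs far better than Text-net and MLP, which indicates that CT images remain the main source of information for lung cancer diagnosis, while the examination descriptions and test results, added as supplementary information, further improve the precision of the model.
Table 5 The result of experiment 2

| Model name | Train accuracy (%) | Train precision (%) | Train recall (%) | Test accuracy (%) | Test precision (%) | Test recall (%) |
| --- | --- | --- | --- | --- | --- | --- |
| TI-Net | 97.08 ± 0.03 | 95.69 ± 0.01 | 94.37 ± 0.02 | 96.90 ± 0.04 | 95.17 ± 0.03 | 93.71 ± 0.01 |
| Img+MLP | 95.15 ± 0.03 | 93.90 ± 0.02 | 93.17 ± 0.03 | 94.76 ± 0.02 | 92.89 ± 0.03 | 91.78 ± 0.01 |
| Img+Text | 94.71 ± 0.02 | 92.13 ± 0.03 | 91.26 ± 0.04 | 93.17 ± 0.04 | 90.88 ± 0.03 | 89.99 ± 0.03 |
| MLP+Text | 89.88 ± 0.04 | 87.67 ± 0.01 | 86.92 ± 0.02 | 87.78 ± 0.03 | 84.23 ± 0.03 | 84.57 ± 0.04 |
| Img-Net | 93.85 ± 0.03 | 91.84 ± 0.02 | 90.83 ± 0.03 | 92.67 ± 0.02 | 89.77 ± 0.03 | 88.93 ± 0.01 |
| VGG-19 | 92.53 ± 0.02 | 89.16 ± 0.03 | 88.57 ± 0.01 | 90.94 ± 0.02 | 87.10 ± 0.03 | 87.04 ± 0.02 |
| MLP | 86.75 ± 0.03 | 85.21 ± 0.02 | 85.12 ± 0.03 | 84.86 ± 0.02 | 82.37 ± 0.03 | 81.59 ± 0.01 |
| Text-Net | 83.12 ± 0.04 | 80.10 ± 0.05 | 81.12 ± 0.02 | 81.21 ± 0.03 | 79.82 ± 0.03 | 80.15 ± 0.02 |

3. Conclusion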
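This paper proposes a lung disease classification algorithm based on text and images and describes the proposed combined text-image deep model in detail. Starting from deep-learning-based lung cancer image classification, it introduces the CT image descriptions and the test items of the electronic medical record, and models the text with Multi-head attention and Bi-LSTM to extract textual information. The experimental results show that, after the text and test information are introduced, the proposed algorithm achieves better recognition performance and stronger generalization than a conventional image-only model.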
Fig. 2 Demonstrations (a) ~ (b) and adapted trajectories (c) ~ (f) in painting tasks, where (c) ~ (d) and (e) ~ (f) correspond to adaptations to different situations[30]. $[p_x \ p_y \ p_z]^{\rm{T}}$ and $[q_s \ q_x \ q_y \ q_z]^{\rm{T}}$ denote the Cartesian position and the quaternion orientation of the robot end-effector, respectively. Circles depict the desired via-points used for adaptation.
Fig. 3 The application of DMP in writing tasks. (a) corresponds to skill reproduction, and (b) ~ (c) represent skill adaptations with different desired points. Solid curves are generated via DMP, while the dashed curves denote the demonstration, with ‘*’ and ‘+’ marking its starting and ending points, respectively. Circles depict the desired points that the adapted trajectories should pass through.
Fig. 4 The application of KMP in a writing task. (a) plots the corresponding 2D trajectories, while (b) ~ (e) show the $x,$ $y,$ $\dot{x}$ and $\dot{y}$ components of the trajectories, respectively. Solid curves are planned via KMP, while the dashed curves are the means retrieved by GMR after modelling the demonstrations. Circles denote the various desired points.
Fig. 5 The modeling of multiple demonstrations using GMM and GMR. (a) ~ (b) plot the $x$ and $y$ components of the demonstrations. (c) ~ (d) depict the probabilistic features obtained via GMM and GMR, where the ellipses in (c) denote the Gaussian components in GMM, and the solid curve and shaded area in (d) represent the mean and covariance of the demonstrations, respectively.
Table 1 Comparison among the state-of-the-art approaches in imitation learning
Table 2 Comparison among the state-of-the-art approaches in orientation learning
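Unit norm, Probabilistic modeling of multiple trajectories, Via orientation, Target orientation, Convergence, Time input, Multi-dimensional input, Single primitive, Orientation, Angular velocity, Orientation, Angular velocity
Pastor et al.[62] — — — — — √ — √ √ —
Silverio et al.[63] — √ — — — √ — — √ √
Ude et al.[64] √ — — — — √ — √ √ —
Abu-Dakka et al.[65] √ — — — — √ — √ √ —
Ravichandar et al.[66] √ √ — — — √ — √ — √
Zeestraten et al.[67] √ √ — — — √ — — √ √
Huang et al.[34] √ √ √ √ √ √ √ — √ √
Saveriano et al.[68] √ — — √ √ √ √ √ √ —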
[1] Schaal S. Is imitation learning the route to humanoid robots? Trends in Cognitive Sciences, 1999, 3(6): 233-242 doi: 10.1016/S1364-6613(99)01327-3 [2] Ijspeert A J, Nakanishi J, Hoffmann H, Pastor P, Schaal S. Dynamical movement primitives: Learning attractor models for motor behaviors. Neural Computation, 2013, 25(2): 328-373 doi: 10.1162/NECO_a_00393 [3] Khansari-Zadeh S M, Billard A. Learning stable nonlinear dynamical systems with gaussian mixture models. IEEE Transactions on Robotics, 2011, 27(5): 943-957 doi: 10.1109/TRO.2011.2159412 [4] Paraschos A, Daniel C, Peters J, Neumann G. Probabilistic movement primitives. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. Nevada, USA: NIPS, 2013. 2616−2624 [5] Calinon S, Bruno D, Caldwell D G. A task-parameterized probabilistic model with minimal intervention control. In: Proceedings of the 2014 IEEE International Conference on Robotics and Automation. Hong Kong, China: IEEE, 2014. 3339−3344 [6] Huang Y L, Rozo L, Silverio J, Caldwell D G. Kernelized movement primitives. The International Journal of Robotics Research, 2019, 38(7): 833-852 doi: 10.1177/0278364919846363 [7] Muhlig M, Gienger M, Hellbach S, Steil J J, Goerick C. Task-level imitation learning using variance-based movement optimization. In: Proceedings of the 2009 IEEE International Conference on Robotics and Automation. Kobe, Japan: IEEE, 2009. 1177−1184 [8] Huang Y L, Buchler D, Koc O, Scholkopf B, Peters J. Jointly learning trajectory generation and hitting point prediction in robot table tennis. In: Proceedings of the 2016 IEEE-RAS 16th International Conference on Humanoid Robots. Cancun, Mexico: IEEE, 2016. 650−655 [9] Huang Y L, Silverio J, Rozo L, Caldwell D G. Hybrid probabilistic trajectory optimization using null-space exploration. In: Proceedings of the 2018 IEEE International Conference on Robotics and Automation. Brisbane, Australia: IEEE, 2018. 
7226−7232 [10] Stulp F, Theodorou E, Buchli J, Schaal S. Learning to grasp under uncertainty. In: Proceedings of the 2011 IEEE International Conference on Robotics and Automation. Shanghai, China: IEEE, 2011. 5703−5708 [11] Mylonas G P, Giataganas P, Chaudery M, Vitiello V, Darzi A, Yang G Z. Autonomous eFAST ultrasound scanning by a robotic manipulator using learning from demonstrations. In: Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems. Tokyo, Japan: IEEE, 2013. 3251−3256 [12] Reiley C E, Plaku E, Hager G D. Motion generation of robotic surgical tasks: Learning from expert demonstrations. In: Proceedings of the 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology. Buenos Aires, Argentina: IEEE, 2010. 967−970 [13] Colome A, Torras C. Dimensionality reduction in learning Gaussian mixture models of movement primitives for contextualized action selection and adaptation. IEEE Robotics and Automation Letters, 2018, 3(4): 3922-3929 doi: 10.1109/LRA.2018.2857921 [14] Canal G, Pignat E, Alenya G, Calinon S, Torras C. Joining high-level symbolic planning with low-level motion primitives in adaptive HRI: Application to dressing assistance. In: Proceedings of the 2018 IEEE International Conference on Robotics and Automation. Brisbane, Australia: IEEE, 2018. 3273−3278 [15] Joshi R P, Koganti N, Shibata T. A framework for robotic clothing assistance by imitation learning. Advanced Robotics, 2019, 33(22): 1156-1174 doi: 10.1080/01691864.2019.1636715 [16] Motokura K, Takahashi M, Ewerton M, Peters J. Plucking motions for tea harvesting robots using probabilistic movement primitives. IEEE Robotics and Automation Letters, 2020, 5(2): 3275-3282 doi: 10.1109/LRA.2020.2976314 [17] Ding J T, Xiao X H, Tsagarakis N, Huang Y L. Robust gait synthesis combining constrained optimization and imitation learning. In: Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems. 
Las Vegas, USA: IEEE, 2020. 3473−3480 [18] Zou C B, Huang R, Cheng H, Qiu J. Learning gait models with varying walking speeds. IEEE Robotics and Automation Letters, 2020, 6(1): 183-190 [19] Huang R, Cheng H, Guo H L, Chen Q M, Lin X C. Hierarchical interactive learning for a human-powered augmentation lower exoskeleton. In: Proceedings of the 2016 IEEE International Conference on Robotics and Automation. Stockholm, Sweden: IEEE, 2016. 257−263 [20] Maeda G, Ewerton M, Neumann G, Lioutikov R, Peters J. Phase estimation for fast action recognition and trajectory generation in human–robot collaboration. The International Journal of Robotics Research, 2017, 36(13-14): 1579-1594 doi: 10.1177/0278364917693927 [21] Silverio J, Huang Y L, Abu-Dakka F J, Rozo L, Caldwell D G. Uncertainty-aware imitation learning using kernelized movement primitives. In: Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems. Macau, China: IEEE, 2019. 90−97 [22] Pomerleau D A. ALVINN: An autonomous land vehicle in a neural network. In: Proceedings of the 1st International Conference on Neural Information Processing Systems. Denver, USA: NIPS, 1989. 305−313 [23] Ross S, Gordon G J, Bagnell D. A reduction of imitation learning and structured prediction to no-regret online learning. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. Fort Lauderdale, USA: JMLR.org, 2011. 627−635 [24] Abbeel P, Ng A Y. Apprenticeship learning via inverse reinforcement learning. In: Proceedings of the 21st International Conference on Machine Learning. Banff, Canada: 2004. 1−8 [25] Ho J, Ermon S. Generative adversarial imitation learning. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: NIPS, 2016. 4572−4580 [26] Liu Nai-Jun, Lu Tao, Cai Ying-Hao, Wang Shuo. A review of robot manipulation skills learning methods. 
Acta Automatica Sinica, 2019, 45(3): 458-470 [27] Qin Fang-Bo, Xu De. Review of robot manipulation skill models. Acta Automatica Sinica, 2019, 45(8): 1401-1418 [28] Billard A, Epars Y, Cheng G, Schaal S. Discovering imitation strategies through categorization of multi-dimensional data. In: Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems. Las Vegas, USA: IEEE, 2003. 2398−2403 [29] Calinon S, Guenter F, Billard A. On learning, representing, and generalizing a task in a humanoid robot. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2007, 37(2): 286-298 doi: 10.1109/TSMCB.2006.886952 [30] Huang Y L, Abu-Dakka F J, Silverio J, Caldwell D G. Generalized orientation learning in robot task space. In: Proceedings of the 2019 International Conference on Robotics and Automation. Montreal, Canada: IEEE, 2019. 2531−2537 [31] Matsubara T, Hyon S H, Morimoto J. Learning stylistic dynamic movement primitives from multiple demonstrations. In: Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems. Taipei, China: IEEE, 2010. 1277−1283 [32] Giusti A, Zeestraten M J A, Icer E, Pereira A, Caldwell D G, Calinon S, et al. Flexible automation driven by demonstration: Leveraging strategies that simplify robotics. IEEE Robotics & Automation Magazine, 2018, 25(2): 18-27 [33] Huang Y L, Scholkopf B, Peters J. Learning optimal striking points for a ping-pong playing robot. In: Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems. Hamburg, Germany: IEEE, 2015. 4587−4592 [34] Huang Y L, Abu-Dakka F J, Silverio J, Caldwell D G. Toward orientation learning and adaptation in Cartesian space. IEEE Transactions on Robotics, 2021, 37(1): 82-98 doi: 10.1109/TRO.2020.3010633 [35] Bishop C M. Pattern Recognition and Machine Learning. Heidelberg: Springer, 2006. [36] Cohn D A, Ghahramani Z, Jordan M I. Active learning with statistical models. 
Journal of Artificial Intelligence Research, 1996, 4: 129-145 doi: 10.1613/jair.295 [37] Calinon S. A tutorial on task-parameterized movement learning and retrieval. Intelligent Service Robotics, 2016, 9(1): 1-29 doi: 10.1007/s11370-015-0187-9 [38] Guenter F, Hersch M, Calinon S, Billard A. Reinforcement learning for imitating constrained reaching movements. Advanced Robotics, 2007, 21(13): 1521-1544 doi: 10.1163/156855307782148550 [39] Peters J, Vijayakumar S, Schaal S. Natural actor-critic. In: Proceedings of the 16th European Conference on Machine Learning. Porto, Portugal: Springer, 2005. 280−291 [40] Rabiner L R. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 1989, 77(2):257-286 doi: 10.1109/5.18626 [41] Yu S Z. Hidden semi-Markov models. Artificial Intelligence, 2010, 174(2): 215-243 doi: 10.1016/j.artint.2009.11.011 [42] Calinon S, D’halluin F, Sauser E L, Caldwell D G, Billard A G. Learning and reproduction of gestures by imitation. IEEE Robotics & Automation Magazine, 2010, 17(2): 44-54 [43] Osa T, Pajarinen J, Neumann G, Bagnell J A, Abbeel P, Peters J. An algorithmic perspective on imitation learning. Foundations and Trends® in Robotics, 2018, 7(1-2): 1-79 doi: 10.1561/2300000053 [44] Zeestraten M J A, Calinon S, Caldwell D G. Variable duration movement encoding with minimal intervention control. In: Proceedings of the 2016 IEEE International Conference on Robotics and Automation. Stockholm, Sweden: IEEE, 2016. 497−503 [45] Rasmussen C E, Williams C K I. Gaussian Processes for Machine Learning. Cambridge: MIT Press, 2006. [46] Hofmann T, Scholkopf B, Smola A J. Kernel methods in machine learning. The Annals of Statistics, 2008, 36(3): 1171-1220 [47] Alvarez M A, Rosasco L, Lawrence N D. Kernels for vector-valued functions: A review. Foundations and Trends® in Machine Learning, 2012, 4(3): 195-266 doi: 10.1561/2200000036 [48] Solak E, Murray-Smith R, Leithead W E, Leith D J, Rasmussen C E. 
Derivative observations in Gaussian process models of dynamic systems. In: Proceedings of the 15th International Conference on Neural Information Processing Systems. Vancouver, Canada: MIT Press, 2002. 1057−1064 [49] Atkeson C G, Moore A W, Schaal S. Locally weighted learning. Artificial Intelligence Review, 1997, 11(1-5): 11-73 [50] Kober J, Mulling K, Kromer O, Lampert C H, Scholkopf B, Peters J. Movement templates for learning of hitting and batting. In: Proceedings of the 2010 IEEE International Conference on Robotics and Automation. Anchorage, USA: IEEE, 2010. 853−858 [51] Fanger Y, Umlauft J, Hirche S. Gaussian processes for dynamic movement primitives with application in knowledge-based cooperation. In: Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. Daejeon, Korea : IEEE, 2016. 3913−3919 [52] Calinon S, Li Z B, Alizadeh T, Tsagarakis N G, Caldwell D G. Statistical dynamical systems for skills acquisition in humanoids. In: Proceedings of the 12th IEEE-RAS International Conference on Humanoid Robots. Osaka, Japan: IEEE, 2012. 323−329 [53] Stulp F, Sigaud O. Robot skill learning: From reinforcement learning to evolution strategies. Paladyn, Journal of Behavioral Robotics, 2013, 4(1): 49-61 [54] Kober J, Oztop E, Peters J. Reinforcement learning to adjust robot movements to new situations. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence. Barcelona, Spain: IJCAI/AAAI, 2011. 2650−2655 [55] Zhao T, Deng M D, Li Z J, Hu Y B. 2018. Cooperative manipulation for a mobile dual-arm robot using sequences of dynamic movement primitives. IEEE Transactions on Cognitive and Developmental Systems, 2020, 12(1): 18−29 [56] Li Z J, Zhao T, Chen F, Hu Y B, Su C Y, Fukuda T. Reinforcement learning of manipulation and grasping using dynamical movement primitives for a humanoidlike mobile manipulator. 