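An Algorithm for Affordance Parts Detection of Household Tools Based on Joint Learning

WU Pei-Liang, XI Xiao-Jun, YANG Xiao, KONG Ling-Fu, HOU Zeng-Guang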

Citation: WU Pei-Liang, XI Xiao-Jun, YANG Xiao, KONG Ling-Fu, HOU Zeng-Guang. An Algorithm for Affordance Parts Detection of Household Tools Based on Joint Learning. ACTA AUTOMATICA SINICA, 2019, 45(5): 985-992. doi: 10.16383/j.aas.c170423

doi: 10.16383/j.aas.c170423
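Funds:

National Natural Science Foundation of China 61305113

Doctoral Fund of Yanshan University BL18007

National Key Research and Development Program 2018YFB1308305

Postdoctoral Science Foundation of China 2018M631620

Natural Science Foundation of Hebei Province F2016203358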

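More Information
    Author Bio:

    XI Xiao-Jun  Master student at the School of Information Science and Engineering, Yanshan University. Her research interest covers RGB-D data processing and tool affordance cognition. E-mail: xixiaojun@ysu.edu.cn

    YANG Xiao  Master student at the School of Information Science and Engineering, Yanshan University. His research interest covers RGB-D data processing, behavior modeling and learning. E-mail: yangxiao@ysu.edu.cn

    KONG Ling-Fu  Professor at Yanshan University. He received his Ph.D. degree from Harbin Institute of Technology in 1995. His research interest covers home service robots, machine vision, intelligent information processing, parallel robots, and automatic control. E-mail: lfkong@ysu.edu.cn

    HOU Zeng-Guang  Professor at the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences. His research interest covers robotics and intelligent systems, rehabilitation robots, and minimally invasive interventional surgical robots. E-mail: zengguang.hou@ia.ac.cn

    Corresponding author:

    WU Pei-Liang  Associate professor at Yanshan University. He received his Ph.D. degree from Yanshan University in 2010. His research interest covers intelligence improvement of home service robots, affordance cognition, and SLAM. Corresponding author of this paper. E-mail: peiliangwu@ysu.edu.cn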

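  • Abstract: Cognition of tools and their affordance parts is an important research direction for improving the intelligence of coexisting-cooperative-cognitive robots. This paper studies the modeling and detection of the affordance parts of household tools and proposes a detection algorithm based on joint learning of a conditional random field (CRF) and sparse coding. First, geometric features that characterize the affordance parts are extracted from depth images of the tools. Then, the coupling between the CRF and sparse coding is analyzed and formulated, the sparsified features are used as latent variables to build an initial CRF model, and the sparse dictionary and the CRF are optimized cooperatively: on the one hand, the sparse representation of the features serves as the conditioning random variables of the CRF and as the selector of its weight parameters; on the other hand, the sparse dictionary is updated under the regulation of the CRF. Next, adaptive moment estimation (Adam) is used to decouple and solve the model. Finally, an offline algorithm for building the affordance part model by joint learning and an online detection method based on this model are given. Experimental results show that, compared with traditional feature extraction and model construction methods, the proposed method improves both the accuracy and the efficiency of affordance part detection, and that it can meet the tool affordance cognition needs of robots with ordinary configurations.
    1)  Recommended by Associate Editor HU Qing-Hua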
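The sparse-coding step named in the abstract can be illustrated in isolation. The following is a minimal sketch, assuming a fixed dictionary `D` and the standard L1-regularized least-squares objective. It uses ISTA (iterative soft-thresholding) purely for illustration; it is not the Adam-based joint CRF and dictionary solver the paper describes, and all names (`sparse_code_ista`, `lam`, the random toy data) are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(x, D, a, lam):
    """Value of 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

def sparse_code_ista(x, D, lam=0.1, n_iter=200):
    """Approximately minimize the objective above over a, with D held fixed."""
    # Step size 1/L, where L is the squared spectral norm of D
    # (the Lipschitz constant of the gradient of the quadratic term).
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)            # gradient of the quadratic term
        a = soft_threshold(a - grad / L, lam / L)
    return a

# Toy data (made up): a 16-dimensional feature built from 2 of 32 atoms.
rng = np.random.default_rng(0)
D = rng.standard_normal((16, 32))
D /= np.linalg.norm(D, axis=0)              # normalize atoms to unit length
a_true = np.zeros(32)
a_true[[3, 17]] = [1.5, -2.0]
x = D @ a_true

a_hat = sparse_code_ista(x, D, lam=0.1)
```

In a joint scheme of the kind the abstract outlines, codes such as `a_hat` would feed the CRF, and the dictionary would in turn be updated under the CRF objective; that coupling is outside the scope of this sketch.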
  • Fig. 1  Tools in the RGB-D data set

    Fig. 2  Target affordance parts of the tools

    Fig. 3  Tools with the affordance part "contain" and the corresponding binary labels

    Fig. 4  Comparison of detection results between our method and others ((a) Tools to be detected in a single scene, from top to bottom: bowl, cup, ladle and turner; (b) Ground truth of the target affordance parts, from top to bottom: contain, wrap-grasp, scoop and support; (c) Detection result with SIFT + Ref. [15]; (d) Detection result with depth features + Ref. [15]; (e) Detection result with SIFT + Ref. [16]; (f) Detection result with depth features + Ref. [16]; (g) Detection result with depth features + Ref. [7]; (h) Detection result with depth features + Ref. [13]; (i) Detection result with our method)

    Fig. 5  Comparison of precision-recall curves between our method and others
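For readers unfamiliar with how a precision-recall curve such as the one in Fig. 5 is built, here is a minimal sketch. It assumes the detector outputs a per-pixel probability map and that the ground truth is a binary mask; the function name, the thresholds, and the toy values are hypothetical and do not come from the paper.

```python
import numpy as np

def precision_recall(prob_map, gt_mask, thresholds):
    """Return (precision, recall) arrays with one entry per threshold."""
    gt = gt_mask.astype(bool)
    precisions, recalls = [], []
    for t in thresholds:
        pred = prob_map >= t                      # binarize the prediction
        tp = np.count_nonzero(pred & gt)          # true positives
        fp = np.count_nonzero(pred & ~gt)         # false positives
        fn = np.count_nonzero(~pred & gt)         # false negatives
        # Convention assumed here: precision is 1.0 when nothing is predicted.
        precisions.append(tp / (tp + fp) if tp + fp else 1.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    return np.array(precisions), np.array(recalls)

# Toy 2x3 probability map and ground-truth mask (made-up values).
prob = np.array([[0.9, 0.8, 0.2],
                 [0.6, 0.1, 0.4]])
gt = np.array([[1, 1, 0],
               [0, 0, 1]])
p, r = precision_recall(prob, gt, thresholds=[0.5, 0.3])
```

Sweeping the threshold from high to low and plotting `r` against `p` traces one curve; lowering the threshold raises recall and usually lowers precision.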

    Table 1  Comparison of efficiency between our method and others (s)

    Affordance part  SIFT + [15]  SIFT + [16]  Depth + [15]  Depth + [16]  Depth + [13]  Depth + [7]  Ours
    (unlabeled)      6.46         8.00         9.41          10.95         1.25          16.29        1.13
    (unlabeled)      6.09         7.09         8.60          10.67         1.18          16.34        1.33
    Support          5.94         6.93         10.40         10.98         1.53          16.28        1.56
    Wrap-grasp       5.93         6.99         10.65         11.73         1.27          15.52        1.24
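The per-part times in Table 1 can be summarized with a short calculation. This sketch transcribes the table values and computes the mean time per method and its ratio to the proposed method; the mean over the four affordance parts is a summary statistic chosen here for illustration and is not reported in the paper.

```python
# Run times in seconds, transcribed from Table 1 (one value per affordance part,
# in the row order of the table).
times = {
    "SIFT + [15]":  [6.46, 6.09, 5.94, 5.93],
    "SIFT + [16]":  [8.00, 7.09, 6.93, 6.99],
    "Depth + [15]": [9.41, 8.60, 10.40, 10.65],
    "Depth + [16]": [10.95, 10.67, 10.98, 11.73],
    "Depth + [13]": [1.25, 1.18, 1.53, 1.27],
    "Depth + [7]":  [16.29, 16.34, 16.28, 15.52],
    "Ours":         [1.13, 1.33, 1.56, 1.24],
}

# Mean time per method, and how many times slower each method is than "Ours".
mean_time = {m: sum(v) / len(v) for m, v in times.items()}
ratio_to_ours = {m: mean_time[m] / mean_time["Ours"] for m in times}

for m in times:
    print(f"{m:13s} mean {mean_time[m]:7.4f} s   ratio to Ours {ratio_to_ours[m]:6.2f}")
```

On these numbers the proposed method and the structured random forest method of [13] have nearly equal mean times, while the other baselines are several times slower.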
  • [1] Aly A, Griffiths S, Stramandinoli F. Towards intelligent social robots: current advances in cognitive robotics. Cognitive Systems Research, 2017, 43: 153-156 doi: 10.1016/j.cogsys.2016.11.005
    [2] Min H Q, Yi C A, Luo R H, Zhu J H, Bi S. Affordance research in developmental robotics: a survey. IEEE Transactions on Cognitive and Developmental Systems, 2016, 8(4): 237-255 https://ieeexplore.ieee.org/document/7582380
    [3] Lenz I, Lee H, Saxena A. Deep learning for detecting robotic grasps. The International Journal of Robotics Research, 2015, 34(4-5): 705-724 doi: 10.1177/0278364914549607
    [4] Kjellström H, Romero J, Kragić D. Visual object-action recognition: inferring object affordances from human demonstration. Computer Vision and Image Understanding, 2011, 115(1): 81-90 http://d.old.wanfangdata.com.cn/NSTLQK/NSTL_QKJJ0220084270/
    [5] Grabner H, Gall J, Van Gool L. What makes a chair a chair? In: Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition. Providence, RI: IEEE, 2011. 1529-1536
    [6] Koppula H S, Gupta R, Saxena A. Learning human activities and object affordances from RGB-D videos. The International Journal of Robotics Research, 2013, 32(8): 951-970 doi: 10.1177/0278364913478446
    [7] Myers A, Teo C L, Fermüller C, Aloimonos Y. Affordance detection of tool parts from geometric features. In: Proceedings of the 2015 IEEE International Conference on Robotics and Automation. Seattle, WA: IEEE, 2015. 1374-1381
    [8] Lin Yu-Dong, He Hong-Jie, Chen Fan, Yin Zhong-Ke. A rigid object detection model based on geometric sparse representation of profile and its hierarchical detection algorithm. Acta Automatica Sinica, 2015, 41(4): 843-853 http://www.aas.net.cn/CN/abstract/abstract18658.shtml
    [9] Redmon J, Angelova A. Real-time grasp detection using convolutional neural networks. In: Proceedings of the 2015 IEEE International Conference on Robotics and Automation. Seattle, WA: IEEE, 2015. 1316-1322
    [10] Zhong Xun-Gao, Xu Min, Zhong Xun-Yu, Peng Xia-Fu. Multimodal features deep learning for robotic potential grasp recognition. Acta Automatica Sinica, 2016, 42(7): 1022-1029 http://www.aas.net.cn/CN/abstract/abstract18893.shtml
    [11] Myers A O. From form to function: detecting the affordance of tool parts using geometric features and material cues [Ph.D. dissertation], University of Maryland, 2016
    [12] Nguyen A, Kanoulas D, Caldwell D G, Tsagarakis N G. Detecting object affordances with convolutional neural networks. In: Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. Daejeon: IEEE, 2016. 2765-2770
    [13] Wu Pei-Liang, Fu Wei-Xing, Kong Ling-Fu. A fast algorithm for affordance detection of household tool parts based on structured random forest. Acta Optica Sinica, 2017, 37(2): 0215001 http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=gxxb201702020
    [14] Thogersen M, Escalera S, González J, Moeslund T B. Segmentation of RGB-D indoor scenes by stacking random forests and conditional random fields. Pattern Recognition Letters, 2016, 80: 208-215 doi: 10.1016/j.patrec.2016.06.024
    [15] Bao C L, Ji H, Quan Y H, Shen Z W. Dictionary learning for sparse coding: algorithms and convergence analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(7): 1356-1369 doi: 10.1109/TPAMI.2015.2487966
    [16] Yang J M, Yang M H. Top-down visual saliency via joint CRF and dictionary learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(3): 576-588 doi: 10.1109/TPAMI.2016.2547384
    [17] Yang E, Gwak J, Jeon M. Conditional random field (CRF)-boosting: constructing a robust online hybrid boosting multiple object tracker facilitated by CRF learning. Sensors, 2017, 17(3): 617 doi: 10.3390/s17030617
    [18] Liu T, Huang X T, Ma J S. Conditional random fields for image labeling. Mathematical Problems in Engineering, 2016, 2016: Article ID 3846125
    [19] Lv P Y, Zhong Y F, Zhao J, Jiao H Z, Zhang L P. Change detection based on a multifeature probabilistic ensemble conditional random field model for high spatial resolution remote sensing imagery. IEEE Geoscience and Remote Sensing Letters, 2016, 13(12): 1965-1969 https://ieeexplore.ieee.org/document/7731208
    [20] Qian Sheng, Chen Zong-Hai, Lin Ming-Qiang, Zhang Chen-Bin. Saliency detection based on conditional random field and image segmentation. Acta Automatica Sinica, 2015, 41(4): 711-724 http://www.aas.net.cn/CN/abstract/abstract18647.shtml
    [21] Wang Z, Zhu S Q, Li Y H, Cui Z Z. Convolutional neural network based deep conditional random fields for stereo matching. Journal of Visual Communication and Image Representation, 2016, 40: 739-750 http://www.wanfangdata.com.cn/details/detail.do?_type=perio&id=c784f7eb1578e1bfa06238c4fb50b4ea
    [22] Szummer M, Kohli P, Hoiem D. Learning CRFs using graph cuts. In: Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, vol.5303. Berlin, Heidelberg: Springer, 2008. 582-595
    [23] Kolmogorov V, Zabih R. What energy functions can be minimized via graph cuts? IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004, 26(2): 147-159 http://d.old.wanfangdata.com.cn/NSTLQK/NSTL_QKJJ0214863408/
    [24] Kingma D P, Ba J. Adam: a method for stochastic optimization. In: Proceedings of the 3rd International Conference for Learning Representations. San Diego, 2015
    [25] Mairal J, Bach F, Ponce J. Task-driven dictionary learning. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(4): 791-804 http://d.old.wanfangdata.com.cn/NSTLQK/NSTL_QKJJ0225672733/
Publication history
  • Received:  2017-07-31
  • Accepted:  2018-03-24
  • Published:  2019-05-20
