Using Reinforce Learning to Train Multi-attention Model

LIU Chang, LIU Qin-Rang

Citation: LIU Chang, LIU Qin-Rang. Using Reinforce Learning to Train Multi-attention Model. Acta Automatica Sinica, 2017, 43(9): 1563-1570. doi: 10.16383/j.aas.2017.c160643

doi: 10.16383/j.aas.2017.c160643
Funds: 

National High Technology Research and Development Program of China (863 Program) 2014AA01A

National Natural Science Foundation of China 61572520

More Information
    Author Bio:

    LIU Qin-Rang Researcher at China National Digital Switching System Engineering and Technological Research and Development Center. His main research interest is network-on-chip design. E-mail: qinrangliu@sina.com

    Corresponding author:

    LIU Chang Master student at China National Digital Switching System Engineering and Technological Research and Development Center. His research interests cover artificial intelligence and chip technology. Corresponding author of this paper. E-mail: liunux1992@gmail.com

  • Recommended by Associate Editor YUAN Yong
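  • Abstract: An attention model (AM) concentrates computational resources on specific regions of the input data. Compared with convolutional neural networks, an AM has fewer parameters, a computational cost that is independent of the input size, and higher accuracy under heavy noise. The attended region is usually small relative to the input image and the target to be recognized. If it is too small, too many iterations are needed, which lowers efficiency and makes it hard to find multiple targets in the same input. This paper therefore proposes a multi-attention model that attends to several regions in parallel. The model is trained with reinforcement learning (RL), and the actions of all attention points are scored and trained jointly. Compared with the single-attention model, training speed and recognition speed improve by 25%. The model also generalizes well.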
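
    The two ideas in the abstract, attending to several regions in parallel and scoring all attention points with one shared reward, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `extract_glimpses` and `shared_reward_gradient`, the Gaussian location policy, the patch size, and the 0/1 reward are assumptions made only for this example.

```python
import numpy as np


def extract_glimpses(image, centers, size):
    """Crop one size x size patch per attention point (hypothetical helper).

    centers holds integer (row, col) pixel coordinates inside the image.
    The image is zero-padded so patches near the border stay full-sized.
    """
    half = size // 2
    padded = np.pad(image, half, mode="constant")
    # After padding, original pixel (r, c) sits at (r + half, c + half),
    # so the patch centred on (r, c) starts at padded index (r, c).
    return np.stack([padded[r:r + size, c:c + size] for r, c in centers])


def shared_reward_gradient(means, samples, sigma, reward, baseline=0.0):
    """REINFORCE gradient of log N(samples; means, sigma^2) w.r.t. means,
    scaled by a single reward shared by all attention points (assumption)."""
    return (reward - baseline) * (samples - means) / sigma ** 2


rng = np.random.default_rng(0)
image = rng.random((60, 60))  # 60 x 60 input, the image size named in Fig. 4
means = np.array([[15.0, 15.0], [30.0, 30.0], [45.0, 45.0]])  # 3 attention points
sigma = 2.0  # assumed standard deviation of the location policy

# Sample one location per attention point, then crop all patches in one pass.
samples = means + sigma * rng.standard_normal(means.shape)
centers = np.clip(np.rint(samples), 0, 59).astype(int)
glimpses = extract_glimpses(image, centers, size=8)  # shape (3, 8, 8)

# One scalar reward (assumed: 1 if the final classification is correct,
# 0 otherwise) scales the update for every attention point.
reward = 1.0
grad = shared_reward_gradient(means, samples, sigma, reward)
```

    In this sketch the shared scalar reward is what ties the attention points together: each point's location update is scaled by the outcome of the joint decision rather than by a per-point score.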
  • Fig. 1  Recognition process of the single-attention model (RAM)

    Fig. 2  Model structure

    Fig. 3  Recognition process of the multi-attention model

    Fig. 4  Recognition results of the multi-attention model on 60 × 60 pixel images

    Fig. 5  Comparison experiment between RAM and the multi-attention model

    Fig. 6  Comparison of convergence speed during training

    Fig. 7  Relationship between the number of attention points and accuracy

    Fig. 8  Relationship between the number of attention points and running speed

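    Table 1  Multi-attention model error rate

    Model                                      Error rate (%)
    RAM, 2 iterations                          8.11
    RAM, 4 iterations                          3.28
    RAM, 6 iterations                          2.11
    RAM, 8 iterations                          1.55
    RAM, 10 iterations                         1.26
    Multi-attention model, 2 iterations        4.17
    Multi-attention model, 4 iterations        2.59
    Multi-attention model, 6 iterations        1.58
    Multi-attention model, 8 iterations        1.19
    Multi-attention model, 10 iterations       1.19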

    Table 2  Random position error rate

    Model                                      Error rate (%)
    RAM, 2 iterations                          1.51
    RAM, 4 iterations                          1.29
    RAM, 6 iterations                          1.22
    Multi-attention model, 2 iterations        2.81
    Multi-attention model, 4 iterations        1.55
    Multi-attention model, 6 iterations        1.01

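    Table 3  Error rates of RAM and the multi-attention model on the noisy dataset

    Model                                      Error rate (%)
    RAM, 2 iterations                          4.96
    RAM, 4 iterations                          4.08
    RAM, 6 iterations                          4.04
    Multi-attention model, 2 iterations        5.25
    Multi-attention model, 4 iterations        4.47
    Multi-attention model, 6 iterations        3.43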
Publication history
  • Received:  2016-09-08
  • Accepted:  2017-03-21
  • Published:  2017-09-20
