Facial Expression Recognition Using ROI-KNN Deep Convolutional Neural Networks

SUN Xiao, PAN Ting, REN Fu-Ji

Citation: SUN Xiao, PAN Ting, REN Fu-Ji. Facial Expression Recognition Using ROI-KNN Deep Convolutional Neural Networks. ACTA AUTOMATICA SINICA, 2016, 42(6): 883-891. doi: 10.16383/j.aas.2016.c150638

doi: 10.16383/j.aas.2016.c150638 cstr: 32138.14.j.aas.2016.c150638
Funds:

Natural Science Foundation of Anhui Province 1508085QF119

National Training Program of Innovation and Entrepreneurship for HFUT Undergraduates 2015cxcys109

Key Program of the National Natural Science Foundation of China 61432004

China Postdoctoral Science Foundation 2015M580532

Open Project Program of the National Laboratory of Pattern Recognition NLPR201407345

More Information
    Author Bio:

    PAN Ting Bachelor student at the School of Computer Science and Information, Hefei University of Technology. His research interests cover deep learning, Bayesian learning theory, and their applications in computer vision and natural language processing.

    REN Fu-Ji Professor at the Institute of Affective Computing, School of Computer Science and Information, Hefei University of Technology, and professor at Tokushima University. His research interests cover artificial intelligence, affective computing, natural language processing, machine learning, and human-machine interaction. E-mail: ren@is.tokushima-u.ac.jp

    Corresponding author: SUN Xiao Associate professor at the Institute of Affective Computing, School of Computer Science and Information, Hefei University of Technology. His research interests cover natural language processing, affective computing, machine learning, and human-machine interaction. Corresponding author of this paper. E-mail: sunx@hfut.edu.cn
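  • Abstract: Deep neural networks have been shown to be capable of mining deep, latent distributed representations of data in the image, speech, and text domains. By training two deep learning models, a deep convolutional neural network and a deep sparse rectifier neural network, on several facial emotion datasets, this paper presents a comparative evaluation of deep neural networks for facial emotion classification. Furthermore, by introducing prior knowledge of facial structure and combining regions of interest (ROI) with the K-nearest neighbors (KNN) algorithm, it proposes ROI-KNN, a fast and simple improved training scheme for deep learning in facial expression classification. The scheme mitigates the poor generalization of deep neural network models caused by the scarcity of facial expression training data, improves the robustness of deep learning in facial expression classification, and significantly reduces the test error rate.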
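    The abstract does not spell out how ROI and KNN are combined. One plausible reading, sketched below, is that several ROI views of a test face are classified separately and the final label is decided by a majority vote over those predictions. Everything in this sketch is an assumption made for illustration: the function names `roi_views`, `resize_nearest`, and `roi_vote`, the exact nine regions, the 32-pixel input size, and the stand-in classifier `predict_fn`. Only the four operation types (cut, flip, cover, center focus) come from the Fig. 7 caption.

```python
from collections import Counter

import numpy as np


def roi_views(face):
    """Return nine illustrative views of a 2-D grayscale face image.

    The specific regions are hypothetical, not the paper's exact ROIs.
    """
    h, w = face.shape
    covered_top = face.copy()
    covered_top[: h // 2, :] = 0          # cover: mask the upper half
    covered_bottom = face.copy()
    covered_bottom[h // 2:, :] = 0        # cover: mask the lower half
    return [
        face,                             # original image
        face[:, ::-1],                    # flip: horizontal mirror
        face[: h // 2, :],                # cut: upper half (eye area)
        face[h // 2:, :],                 # cut: lower half (mouth area)
        face[:, : w // 2],                # cut: left half
        face[:, w // 2:],                 # cut: right half
        covered_top,
        covered_bottom,
        face[h // 4: h - h // 4, w // 4: w - w // 4],  # center focus
    ]


def resize_nearest(img, size):
    """Nearest-neighbor resize to (size, size) so all views share one shape."""
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[np.ix_(rows, cols)]


def roi_vote(predict_fn, face, size=32):
    """Classify every ROI view and return the most frequent label."""
    labels = [predict_fn(resize_nearest(v, size)) for v in roi_views(face)]
    return Counter(labels).most_common(1)[0][0]


# Stand-in classifier (assumption): label 1 for bright inputs, else 0.
def toy_predict(img):
    return int(img.mean() > 0.5)


face = np.ones((64, 64))
print(roi_vote(toy_predict, face))
```

    In practice `predict_fn` would be the trained network's prediction function. The same views could also be added to the training set as extra samples, which is one way to address the small-data problem the abstract mentions.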
  • Fig. 1  Samples from the CK+ and Wild datasets

    Fig. 2  Manifold surface of the input space

    Fig. 3  Local block connections and basic structure of a convolutional neural network (CNN)

    Fig. 4  Graphs of different activation functions (from Glorot et al. [11])

    Fig. 5  Structure of the deep convolutional neural network

    ? denotes an undetermined hyperparameter with several candidate settings.

    Fig. 6  Structure of the deep sparse rectifier network

    Fig. 7  Nine ROI regions (cut, flip, cover, center focus)
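
    Fig. 4 compares activation functions, but the plotted set is not listed on this page. As a rough sketch, the snippet below evaluates four common choices (sigmoid, tanh, rectifier, softplus); which of these actually appear in the figure is an assumption. It also shows why rectifier units give sparse activations: every non-positive input maps to exactly zero.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def relu(x):
    """Rectifier: max(0, x), exactly zero for non-positive inputs."""
    return np.maximum(0.0, x)


def softplus(x):
    """Smooth approximation of the rectifier: log(1 + exp(x))."""
    return np.logaddexp(0.0, x)


x = np.linspace(-3.0, 3.0, 7)   # sample points -3, -2, ..., 3
activations = {
    "sigmoid": sigmoid(x),
    "tanh": np.tanh(x),
    "rectifier": relu(x),
    "softplus": softplus(x),
}

# Fraction of sample points where the rectifier output is exactly zero.
sparsity = float(np.mean(relu(x) == 0.0))

for name, values in activations.items():
    print(name, np.round(values, 3))
print("rectifier zero fraction:", round(sparsity, 3))
```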

    Table 1  Test set error rate for the ROI-assisted evaluation (%)

    Model       Neutral  Happy  Sad   Surprise  Angry  Overall
    CNN-64      4.7      32.7   54.3  33        40.3   33.3
    CNN-64*     5.6      36.3   59.3  20.0      31.7   30.6
    CNN-96*     5.0      36.7   53.3  20.7      24.7   28.6
    CNN-128     3.3      32.0   51.0  27.0      37.7   30.2
    CNN-128*    3.0      31.0   55.7  18.7      24.3   26.6
    DNN-1000    3.0      37.7   65.3  38.3      36.7   36.2
    DNN-1000*   2.3      39.0   52.0  30.0      31.7   31.0
    DNN-2000*   2.0      43.3   55.0  24.7      32.7   31.5

    Table 2  Test set error rate with rotation-generated samples (%)

    Model       Neutral  Happy  Sad   Surprise  Angry  Overall
    CNN-128     3.3      32.0   51.0  27.0      37.7   30.2
    CNN-128*    4.7      41.3   52.7  32.7      35.0   33.2
    CNN-128+    3.0      37.0   51.7  15.7      24.0   26.3
    CNN-128^    0.0      30.0   54.0  13.0      26.7   24.7
    DNN-1000    3.0      37.7   65.3  38.3      36.7   36.2
    DNN-1000*   1.3      39.7   62.0  37.3      42.0   36.5
    DNN-1000+   2.3      41.3   57.0  30.0      35.7   33.3
    DNN-1000^   1.3      43.0   67.7  31.0      33.7   35.3

    Table 3  Test set error rate for the ROI-KNN-assisted evaluation (%)

    Model       Neutral  Happy  Sad   Surprise  Angry  Overall
    CNN-64      5.6      36.3   59.3  20.0      31.7   30.6
    CNN-64*     1.0      29.7   56.0  17.0      30.0   26.7
    CNN-96      5.0      36.7   53.3  20.7      24.7   28.6
    CNN-96*     0.3      26.0   56.3  16.0      26.7   25.8
    CNN-128     3.0      31.0   55.7  18.7      24.3   26.6
    CNN-128*    0.6      22.7   57.0  12.0      26.3   23.7
    DNN-1000    2.3      39.0   52.0  30.0      31.7   31.0
    DNN-1000*   0.3      37.3   61.0  31.7      31.0   32.2
    DNN-2000    2.0      43.3   55.0  24.7      32.7   31.5
    DNN-2000*   0.3      40.0   68.0  26.3      33.3   33.6
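
    To make the effect in Table 3 easier to read, the short script below computes the change in overall error rate for each model pair. It assumes that each starred row is the ROI-KNN variant of the unstarred row above it, which the row labels suggest but this page does not state. The helper name `error_change` is hypothetical.

```python
# Overall test error rates (%) copied from Table 3: (unstarred, starred).
overall = {
    "CNN-64": (30.6, 26.7),
    "CNN-96": (28.6, 25.8),
    "CNN-128": (26.6, 23.7),
    "DNN-1000": (31.0, 32.2),
    "DNN-2000": (31.5, 33.6),
}


def error_change(base, starred):
    """Return (absolute change in points, relative change in percent)."""
    diff = starred - base
    return round(diff, 1), round(100.0 * diff / base, 1)


changes = {model: error_change(*pair) for model, pair in overall.items()}

for model, (points, percent) in changes.items():
    print(f"{model:9s} {points:+.1f} points ({percent:+.1f}% relative)")
```

    Under this reading, the three CNN rows improve by roughly 3 to 4 points in overall error, while the two DNN rows get slightly worse.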

    Table 4  Model comparison on JAFFE

    Source                Method         Error rate (%)
    Kumbhar et al. [20]   Image feature  30-40
    Lekshmi et al. [21]   SVM            13.1
    Zhao et al. [22]      PCA and NMF    6.28
    Zhi et al. [23]       2D-DLPP        4.09
    Lee et al. [24]       RDA            3.3
    This paper            ROI-KNN+CNN    2.81
  • [1] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25. Lake Tahoe, Nevada, USA: Curran Associates, Inc., 2012. 1097-1105
    [2] Lopes A T, de Aguiar E, Oliveira-Santos T. A facial expression recognition system using convolutional networks. In: Proceedings of the 28th SIBGRAPI Conference on Graphics, Patterns and Images. Salvador: IEEE, 2015. 273-280
    [3] Lucey P, Cohn J F, Kanade T, Saragih J, Ambadar Z, Matthews I. The extended Cohn-Kanade dataset (CK+): a complete dataset for action unit and emotion-specified expression. In: Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). San Francisco, CA: IEEE, 2010. 94-101
    [4] Bishop C M. Pattern Recognition and Machine Learning. New York: Springer, 2007.
    [5] Bengio Y. Learning deep architectures for AI. Foundations and Trends in Machine Learning. Hanover, MA, USA: Now Publishers Inc., 2009. 1-127
    [6] LeCun Y, Boser B, Denker J S, Howard R E, Hubbard W, Jackel L D, Henderson D. Handwritten digit recognition with a back-propagation network. In: Proceedings of Advances in Neural Information Processing Systems 2. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1990. 396-404
    [7] Fukushima K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 1980, 36(4): 193-202
    [8] Rumelhart D E, Hinton G E, Williams R J. Learning representations by back-propagating errors. Nature, 1986, 323(6088): 533-536
    [9] LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 1998, 86(11): 2278-2324
    [10] Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A. Going deeper with convolutions. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, MA: IEEE, 2015. 1-9
    [11] Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks. In: Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AISTATS). Fort Lauderdale, FL, USA, 2011, 15: 315-323
    [12] Barron A R. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Transactions on Information Theory, 1993, 39(3): 930-945
    [13] Hubel D H, Wiesel T N, LeVay S. Visual-field representation in layer IV C of monkey striate cortex. In: Proceedings of the 4th Annual Meeting, Society for Neuroscience. St. Louis, US, 1974. 264
    [14] Dayan P, Abbott L F. Theoretical Neuroscience. Cambridge: MIT Press, 2001.
    [15] Attwell D, Laughlin S B. An energy budget for signaling in the grey matter of the brain. Journal of Cerebral Blood Flow and Metabolism, 2001, 21(10): 1133-1145
    [16] Hinton G E, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov R R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv: 1207.0580, 2012.
    [17] Darwin C. On the Origin of Species. London: John Murray, Albemarle Street, 1859.
    [18] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010). Chia Laguna Resort, Sardinia, Italy, 2010, 9: 249-256
    [19] Sun Y, Wang X, Tang X. Deep learning face representation from predicting 10000 classes. In: Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Columbus, OH: IEEE, 2014. 1891-1898
    [20] Kumbhar M, Jadhav A, Patil M. Facial expression recognition based on image feature. International Journal of Computer and Communication Engineering, 2012, 1(2): 117-119
    [21] Lekshmi V P, Sasikumar M. Analysis of facial expression using Gabor and SVM. International Journal of Recent Trends in Engineering, 2009, 1(2): 47-50
    [22] Zhao L H, Zhuang G B, Xu X H. Facial expression recognition based on PCA and NMF. In: Proceedings of the 7th World Congress on Intelligent Control and Automation. Chongqing, China: IEEE, 2008. 6826-6829
    [23] Zhi R C, Ruan Q Q. Facial expression recognition based on two-dimensional discriminant locality preserving projections. Neurocomputing, 2008, 71(7-9): 1730-1734
    [24] Lee C C, Huang S S, Shih C Y. Facial affect recognition using regularized discriminant analysis-based algorithms. EURASIP Journal on Advances in Signal Processing, 2010, article ID 596842(doi: 10.1155/2010/596842)
    [25] Bastien F, Lamblin P, Pascanu R, Bergstra J, Goodfellow I J, Bergeron A, Bouchard N, Warde-Farley D, Bengio Y. Theano: new features and speed improvements. In: Conference on Neural Information Processing Systems (NIPS) Workshop on Deep Learning and Unsupervised Feature Learning. Lake Tahoe, US, 2012.
Publication history
  • Received:  2015-10-12
  • Accepted:  2016-04-01
  • Published:  2016-06-20
