Abstract: Deep neural networks have been shown to learn deep, latent distributed representations of image, speech, and text data. We train two deep learning models, a deep convolutional neural network and a deep sparse rectifier neural network, on several facial expression datasets and compare their performance on facial expression classification. Building on prior knowledge of facial structure, we then combine region of interest (ROI) extraction with the K-nearest neighbors (KNN) algorithm and propose ROI-KNN, a fast and simple improved training scheme for deep learning in facial expression classification. The scheme mitigates the poor generalization that deep neural networks show when facial expression training data are scarce, improves the robustness of deep learning in facial expression classification, and significantly reduces the test error rate.
Table 1 Test set error rate with ROI assistance (%)

| Model | Neutral | Happy | Sad | Surprise | Angry | Overall |
|---|---|---|---|---|---|---|
| CNN-64 | 4.7 | 32.7 | 54.3 | 33.0 | 40.3 | 33.3 |
| CNN-64* | 5.6 | 36.3 | 59.3 | 20.0 | 31.7 | 30.6 |
| CNN-96* | 5.0 | 36.7 | 53.3 | 20.7 | 24.7 | 28.6 |
| CNN-128 | 3.3 | 32.0 | 51.0 | 27.0 | 37.7 | 30.2 |
| CNN-128* | 3.0 | 31.0 | 55.7 | 18.7 | 24.3 | 26.6 |
| DNN-1000 | 3.0 | 37.7 | 65.3 | 38.3 | 36.7 | 36.2 |
| DNN-1000* | 2.3 | 39.0 | 52.0 | 30.0 | 31.7 | 31.0 |
| DNN-2000* | 2.0 | 43.3 | 55.0 | 24.7 | 32.7 | 31.5 |
Table 2 Test set error rate with rotation-generated samples (%)

| Model | Neutral | Happy | Sad | Surprise | Angry | Overall |
|---|---|---|---|---|---|---|
| CNN-128 | 3.3 | 32.0 | 51.0 | 27.0 | 37.7 | 30.2 |
| CNN-128* | 4.7 | 41.3 | 52.7 | 32.7 | 35.0 | 33.2 |
| CNN-128+ | 3.0 | 37.0 | 51.7 | 15.7 | 24.0 | 26.3 |
| CNN-128^ | 0.0 | 30.0 | 54.0 | 13.0 | 26.7 | 24.7 |
| DNN-1000 | 3.0 | 37.7 | 65.3 | 38.3 | 36.7 | 36.2 |
| DNN-1000* | 1.3 | 39.7 | 62.0 | 37.3 | 42.0 | 36.5 |
| DNN-1000+ | 2.3 | 41.3 | 57.0 | 30.0 | 35.7 | 33.3 |
| DNN-1000^ | 1.3 | 43.0 | 67.7 | 31.0 | 33.7 | 35.3 |
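Table 3 Test set error rate with ROI-KNN (%)

| Model | Neutral | Happy | Sad | Surprise | Angry | Overall |
|---|---|---|---|---|---|---|
| CNN-64 | 5.6 | 36.3 | 59.3 | 20.0 | 31.7 | 30.6 |
| CNN-64* | 1.0 | 29.7 | 56.0 | 17.0 | 30.0 | 26.7 |
| CNN-96 | 5.0 | 36.7 | 53.3 | 20.7 | 24.7 | 28.6 |
| CNN-96* | 0.3 | 26.0 | 56.3 | 16.0 | 26.7 | 25.8 |
| CNN-128 | 3.0 | 31.0 | 55.7 | 18.7 | 24.3 | 26.6 |
| CNN-128* | 0.6 | 22.7 | 57.0 | 12.0 | 26.3 | 23.7 |
| DNN-1000 | 2.3 | 39.0 | 52.0 | 30.0 | 31.7 | 31.0 |
| DNN-1000* | 0.3 | 37.3 | 61.0 | 31.7 | 31.0 | 32.2 |
| DNN-2000 | 2.0 | 43.3 | 55.0 | 24.7 | 32.7 | 31.5 |
| DNN-2000* | 0.3 | 40.0 | 68.0 | 26.3 | 33.3 | 33.6 |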
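To make the Table 3 comparison easier to read, the following minimal sketch computes the change in overall test error for each model pair. It assumes that a starred row is the ROI-KNN variant of the unstarred model with the same name; the table itself does not define the marker, so this pairing is an interpretation, and the names `overall_error` and `error_change` are illustrative only.

```python
# Overall test error rates (%) copied from the "Overall" column of Table 3.
# ASSUMPTION: each starred row is the ROI-KNN variant of the unstarred model
# with the same name. Tuples are (unstarred, starred).
overall_error = {
    "CNN-64": (30.6, 26.7),
    "CNN-96": (28.6, 25.8),
    "CNN-128": (26.6, 23.7),
    "DNN-1000": (31.0, 32.2),
    "DNN-2000": (31.5, 33.6),
}


def error_change(baseline: float, variant: float) -> float:
    """Return variant minus baseline in percentage points (negative = fewer errors)."""
    return round(variant - baseline, 1)


# Change in overall error for every model pair.
changes = {
    name: error_change(base, starred)
    for name, (base, starred) in overall_error.items()
}

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f} percentage points")
```

Under that pairing assumption, the overall error of the CNN rows changes by -3.9, -2.8, and -2.9 percentage points, while the DNN rows change by +1.2 and +2.1 percentage points.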