Journal of Computer Science and Technology ›› 2020, Vol. 35 ›› Issue (5): 1127-1146. doi: 10.1007/s11390-020-9665-4

Special Topic: Artificial Intelligence and Pattern Recognition

Comparison Between Deep Learning Models and Traditional Machine Learning Approaches for Facial Expression Recognition in Ageing Adults

Andrea Caroppo, Alessandro Leone, and Pietro Siciliano

  1. Institute for Microelectronics and Microsystems, National Research Council of Italy, Lecce 73100, Italy
  • Received: 2019-04-23 Revised: 2020-03-06 Online: 2020-09-20 Published: 2020-09-30
  • About the author: Andrea Caroppo received his Master's degree in computer science engineering in 2004 from the University of Lecce, Lecce. From 2004 to 2006 he was a research fellow at the Italian National Research Council (CNR), Institute of Study of Intelligent Systems for Automation (ISSIA) in Bari (Italy). His research interests were in the area of image and video processing/coding, neural networks, motion estimation in video sequences and multidimensional signal processing. From 2012 to 2017 he was a research fellow at the National Research Council (CNR) of Italy, Institute for Microelectronics and Microsystems (IMM) in Lecce (Italy). Since 2018 he has been a researcher in the same institute. He is interested in signal and image processing, pattern recognition, computer vision and the development of enabling technologies for healthcare, with particular focus on the new Ambient Assisted Living (AAL) technologies. He is the author of more than 30 papers in national and international journals and conference proceedings.

Abstract: Facial expression recognition is one of the most active areas of research in computer vision, since facial expression is one of the non-verbal communication channels through which the mood or mental state of a person is understood. It has therefore been used in various fields such as human-robot interaction, security, computer graphics animation, and ambient assistance. Nevertheless, it remains a challenging task, since existing approaches lack generalizability and almost all studies ignore the effects of facial attributes, such as age, on expression recognition, even though research indicates that the manifestation of facial expressions varies with age. Recently, much progress has been made on this topic, and great improvements in the classification task were achieved with the emergence of deep learning methods. Such approaches have shown how hierarchies of features can be learned directly from the original data, thus avoiding classical hand-designed feature extraction methods that generally rely on manual operations with labelled data. However, research papers systematically exploring the performance of existing deep architectures for the task of classifying the expressions of ageing adults are absent from the literature. The present work attempts to fill this gap by considering the performance of three recent deep convolutional neural network models (VGG-16, AlexNet and GoogLeNet/Inception V1) and evaluating it on four different benchmark datasets (FACES, Lifespan, CIFE, and FER2013), which also contain facial expressions performed by elderly subjects. As a baseline, and for the purpose of comparison, two traditional machine learning approaches based on handcrafted feature extraction are evaluated on the same datasets. An exhaustive and rigorous experimentation was carried out, focused on the concept of “transfer learning”, which here consists of replacing the output level of the deep architectures considered with new output levels appropriate to the number of classes (facial expressions), and training three different classifiers (i.e., Random Forest, Support Vector Machine and Linear Regression). The VGG-16 deep architecture in combination with the Random Forest classifier was found to be the best in terms of accuracy for each dataset and for each considered age group. Moreover, the experimentation stage showed that the deep learning approach significantly improves on the baseline approaches considered, and that the most noticeable improvement was obtained when considering facial expressions of ageing adults.
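The transfer learning idea described in the abstract (keep a pretrained network fixed, replace its output level with one sized to the number of expression classes, and train only the new part) can be illustrated with a minimal, dependency-free Python sketch. Everything in it is an assumption made for illustration and none of it is the authors' code: `FrozenBackbone` is a hypothetical stand-in that uses fixed random weights where a real pipeline would load VGG-16, AlexNet or GoogLeNet; the label set, the toy "images" and the class names are invented; and a simple softmax output layer is swapped in for the Random Forest, Support Vector Machine and Linear Regression classifiers named above, which would be trained on the same extracted features.

```python
import math
import random

EXPRESSIONS = ["happy", "sad", "angry"]  # hypothetical label set, not from the paper


class FrozenBackbone:
    """Hypothetical stand-in for a pretrained CNN with its output layer removed.

    Assumption: fixed random weights replace real pretrained weights so that
    the sketch runs without any external library or download.
    """

    def __init__(self, in_dim, feat_dim, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-1, 1) for _ in range(in_dim)]
                        for _ in range(feat_dim)]

    def features(self, x):
        # The weights are only read, never updated: the backbone stays frozen.
        return [math.tanh(sum(w * v for w, v in zip(row, x)))
                for row in self.weights]


class SoftmaxHead:
    """New output layer with one unit per facial expression class."""

    def __init__(self, feat_dim, num_classes):
        self.w = [[0.0] * feat_dim for _ in range(num_classes)]
        self.b = [0.0] * num_classes

    def probs(self, f):
        logits = [sum(wi * fi for wi, fi in zip(row, f)) + b
                  for row, b in zip(self.w, self.b)]
        m = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def loss(self, feats, labels):
        # Mean cross-entropy over the training set.
        return -sum(math.log(self.probs(f)[y])
                    for f, y in zip(feats, labels)) / len(feats)

    def fit(self, feats, labels, lr=0.1, epochs=200):
        # Full-batch gradient descent on the head parameters only.
        n = len(feats)
        for _ in range(epochs):
            gw = [[0.0] * len(feats[0]) for _ in self.w]
            gb = [0.0] * len(self.b)
            for f, y in zip(feats, labels):
                p = self.probs(f)
                for k in range(len(self.b)):
                    err = p[k] - (1.0 if k == y else 0.0)
                    gb[k] += err / n
                    for j, fj in enumerate(f):
                        gw[k][j] += err * fj / n
            for k in range(len(self.b)):
                self.b[k] -= lr * gb[k]
                for j in range(len(self.w[k])):
                    self.w[k][j] -= lr * gw[k][j]

    def predict(self, f):
        p = self.probs(f)
        return p.index(max(p))


# Toy data: 30 random 16-value "images" with invented, imbalanced labels.
rng = random.Random(1)
images = [[rng.random() for _ in range(16)] for _ in range(30)]
labels = [0] * 15 + [1] * 10 + [2] * 5

backbone = FrozenBackbone(in_dim=16, feat_dim=8)
frozen_copy = [row[:] for row in backbone.weights]  # snapshot to check freezing

# Step 1: extract features once with the frozen backbone.
feats = [backbone.features(x) for x in images]

# Step 2: train only the new output layer on those features.
head = SoftmaxHead(feat_dim=8, num_classes=len(EXPRESSIONS))
loss_before = head.loss(feats, labels)
head.fit(feats, labels)
loss_after = head.loss(feats, labels)

print("loss before/after training the new head:", loss_before, loss_after)
print("predicted expression for first sample:", EXPRESSIONS[head.predict(feats[0])])
```

The design point the sketch is meant to show is the separation of roles: features are computed once by the unchanged backbone, so any classifier can then be trained and compared on the same feature vectors, which is what makes a side-by-side evaluation of several classifiers inexpensive.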

Key words: computer vision, deep learning, facial expression, machine learning, ageing adult
