Comparison Between Deep Learning Models and Traditional Machine Learning Approaches for Facial Expression Recognition in Ageing Adults

doi:10.1007/s11390-020-9665-4

Abstract

Facial expression recognition is one of the most active research areas in computer vision, because facial expression is one of the main non-verbal channels through which a person's mood or mental state can be understood. It has been applied in fields such as human-robot interaction, security, computer graphics animation, and ambient assistance. Nevertheless, it remains a challenging task: existing approaches lack generalizability, and almost all studies ignore the effect of facial attributes such as age on expression recognition, even though research indicates that the manifestation of facial expressions varies with age. Recently, the emergence of deep learning methods has brought considerable progress and large improvements in classification performance. Such approaches have shown that hierarchies of features can be learned directly from raw data, avoiding classical hand-designed feature extraction methods that generally rely on manual operations with labelled data. However, the literature lacks studies that systematically explore the performance of existing deep architectures on classifying the expressions of ageing adults. The present work attempts to fill this gap by evaluating three recent deep convolutional neural network models (VGG-16, AlexNet, and GoogLeNet/Inception V1) on four benchmark datasets (FACES, Lifespan, CIFE, and FER2013) that also contain facial expressions performed by elderly subjects. As a baseline for comparison, two traditional machine learning approaches based on handcrafted feature extraction are evaluated on the same datasets. The experimentation, exhaustive and rigorous, focuses on "transfer learning", which here consists of replacing the output layer of each deep architecture with a new output layer matching the number of classes (facial expressions) and training three different classifiers (Random Forest, Support Vector Machine, and Linear Regression). VGG-16 in combination with the Random Forest classifier was found to be the most accurate for each dataset and for each age group considered. Moreover, the experiments showed that the deep learning approach significantly improves on the baseline approaches, with the most noticeable improvement obtained on the facial expressions of ageing adults.
