Publication | Closed Access
Deep learning for robust feature generation in audiovisual emotion recognition
Citations: 391 · References: 33 · Year: 2013 · Venue: Unknown
Topics: Music, Engineering, Machine Learning, Data Science, Feature Learning, Pattern Recognition, Facial Expression Recognition, Autoencoders, Affective Computing, Deep Learning Techniques, Speech Processing, Multimodal Signal Processing, Social Sciences, Multimodal Sentiment Analysis, Deep Learning, Emotion, Emotion Recognition, Speech Recognition
Automatic emotion recognition systems predict affective content from low-level human-centered cues, yet most feature-selection methods capture only linear relationships between features or require labeled data. To overcome these limitations, the authors use deep learning to explicitly model high-order non-linear feature interactions in multimodal signals. Their Deep Belief Network models outperform baselines that do not use deep learning, indicating that the learned high-order non-linear relationships are effective for emotion classification.
Automatic emotion recognition systems predict high-level affective content from low-level human-centered signal cues. These systems have seen great improvements in classification accuracy, due in part to advances in feature selection methods. However, many of these feature selection methods capture only linear relationships between features or alternatively require the use of labeled data. In this paper we focus on deep learning techniques, which can overcome these limitations by explicitly capturing complex non-linear feature interactions in multimodal data. We propose and evaluate a suite of Deep Belief Network models, and demonstrate that these models show improvement in emotion classification performance over baselines that do not employ deep learning. This suggests that the learned high-order non-linear relationships are effective for emotion recognition.
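The abstract describes learning high-order non-linear feature interactions with Deep Belief Networks, which are built by greedily stacking Restricted Boltzmann Machines (RBMs). As a rough illustration only, not the paper's actual architecture or data, the following sketch trains a single Bernoulli RBM with one-step contrastive divergence (CD-1) on toy binary features and uses the hidden-unit activations as learned features; all sizes and hyperparameters here are assumptions.

```python
import numpy as np

# Illustrative Bernoulli RBM trained with CD-1. The paper's real models,
# multimodal inputs, and hyperparameters are not given here, so n_visible,
# n_hidden, and the learning rate below are arbitrary choices.

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)  # visible biases
        self.b_h = np.zeros(n_hidden)   # hidden biases
        self.lr = lr

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        # Negative phase: one Gibbs step (sample hidden, reconstruct visible).
        h_sample = (rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h_sample)
        h1 = self.hidden_probs(v1)
        # Gradient approximation: <v h>_data minus <v h>_model.
        batch = v0.shape[0]
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / batch
        self.b_v += self.lr * (v0 - v1).mean(axis=0)
        self.b_h += self.lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))  # reconstruction error

# Toy stand-in for concatenated binary audio/video features.
data = (rng.random((64, 12)) < 0.5).astype(float)
rbm = RBM(n_visible=12, n_hidden=6)
errs = [rbm.cd1_step(data) for _ in range(200)]
features = rbm.hidden_probs(data)  # learned non-linear features
print(features.shape)
```

In a DBN, the `features` produced by one trained RBM would become the input for training the next RBM in the stack, and a classifier would be trained on the top-level representation.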