Publication | Closed Access
Exploring Multimodal Visual Features for Continuous Affect Recognition
Citations: 31
References: 29
Year: 2016
Unknown Venue
Engineering, Machine Learning, Emotion Sub-challenge, Affective Neuroscience, Multimodal Visual Features, Multimodal Learning, Multimodal Sentiment Analysis, Psychology, Social Sciences, Speech Recognition, AVEC 2016, Image Analysis, Data Science, Pattern Recognition, Affective Computing, Cognitive Science, Multimodal Signal Processing, Deep Learning, Computer Vision, Facial Expression Recognition, Speech Processing, Emotion, Emotion Recognition
This paper presents our work in the Emotion Sub-Challenge of the 6th Audio/Visual Emotion Challenge and Workshop (AVEC 2016), whose goal is to explore the use of audio, visual, and physiological signals to continuously predict the values of the emotion dimensions (arousal and valence). As visual features are very important in emotion recognition, we try a variety of handcrafted and deep visual features. For each video clip, in addition to the baseline features, we extract multi-scale Dense SIFT features (MSDF) and several types of Convolutional Neural Network (CNN) features to recognize the expression phase of the current frame. We train a linear Support Vector Regression (SVR) model for each type of feature on the RECOLA dataset. Multimodal fusion of these modalities is then performed with a multiple linear regression model. The final Concordance Correlation Coefficients (CCC) we achieved on the development set are 0.824 for arousal and 0.718 for valence; on the test set they are 0.683 for arousal and 0.642 for valence.
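The results above are reported as Concordance Correlation Coefficients. As a minimal sketch of how that metric is commonly computed (the standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population statistics), the following may help. The function name `ccc` and the toy inputs are illustrative assumptions, not taken from the paper or the challenge's evaluation scripts.

```python
from statistics import fmean


def ccc(pred, gold):
    """Concordance Correlation Coefficient of two equal-length sequences.

    Illustrative helper (hypothetical name): uses population (biased)
    variance and covariance, as in the standard definition of the metric.
    """
    pred, gold = list(pred), list(gold)
    if len(pred) != len(gold) or len(pred) < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_p, mean_g = fmean(pred), fmean(gold)
    var_p = fmean([(p - mean_p) ** 2 for p in pred])
    var_g = fmean([(g - mean_g) ** 2 for g in gold])
    cov = fmean([(p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)])

    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for identical constant sequences")
    return 2 * cov / denom


if __name__ == "__main__":
    gold = [1.0, 2.0, 3.0]
    print(ccc([1.0, 2.0, 3.0], gold))  # perfect agreement
    print(ccc([2.0, 3.0, 4.0], gold))  # same shape, shifted by a constant
```

Unlike Pearson correlation, the squared mean difference in the denominator penalizes predictions that track the gold trace but are offset from it, which is why the shifted example scores below 1.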