Publication | Closed Access
Rethinking the Learning Paradigm for Dynamic Facial Expression Recognition
Citations: 72
References: 47
Year: 2023
Venue: Unknown
Keywords: Multiple Instance Learning, Engineering, Machine Learning, Feature Extraction, Vanilla R3D18 Backbone, Video Interpretation, Social Sciences, Facial Recognition System, Image Analysis, Data Science, Facial Expressions, Pattern Recognition, Affective Computing, Video Transformer, Learning Paradigm, Cognitive Science, Machine Vision, Video Understanding, Deep Learning, Computer Vision, Facial Expression Recognition, Facial Animation
Dynamic Facial Expression Recognition (DFER) is a rapidly developing field that focuses on recognizing facial expressions in videos. Previous research has treated non-target frames as noisy frames, but we propose that DFER should instead be treated as a weakly supervised problem. We also identify an imbalance between short- and long-term temporal relationships in DFER. We therefore introduce the Multi-3D Dynamic Facial Expression Learning (M3DFEL) framework, which uses Multi-Instance Learning (MIL) to handle inexact labels. M3DFEL generates 3D instances to model the strong short-term temporal relationships and uses 3D CNNs for feature extraction. The Dynamic Long-term Instance Aggregation Module (DLIAM) then learns the long-term temporal relationships and dynamically aggregates the instances. Our experiments on the DFEW and FERV39K datasets show that M3DFEL, using only a vanilla R3D18 backbone, outperforms existing state-of-the-art approaches. The source code is available at https://github.com/faceeyes/M3DFEL.
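The MIL view described in the abstract can be illustrated with a minimal sketch: a video is a bag, short clips of consecutive frames are its instances, and instance features are pooled into one bag-level representation. This is not the authors' implementation (see the repository linked above for that). The helper names `split_into_instances` and `aggregate_bag`, the clip length of 4, and the use of plain softmax attention pooling in place of DLIAM are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def split_into_instances(frames: Sequence, instance_len: int) -> List[list]:
    """Group consecutive frames into non-overlapping clips (the bag's instances).

    Trailing frames that do not fill a whole clip are dropped.
    """
    if instance_len <= 0:
        raise ValueError("instance_len must be positive")
    usable = len(frames) - len(frames) % instance_len
    return [list(frames[i:i + instance_len]) for i in range(0, usable, instance_len)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_bag(instance_features: Sequence[Sequence[float]],
                  attention_scores: Sequence[float]) -> List[float]:
    """Attention-weighted mean of instance features (generic MIL pooling).

    Assumption: a stand-in for a learned aggregation module, not DLIAM itself.
    """
    if len(instance_features) != len(attention_scores):
        raise ValueError("one attention score is required per instance")
    weights = softmax(attention_scores)
    dim = len(instance_features[0])
    return [sum(w * f[d] for w, f in zip(weights, instance_features))
            for d in range(dim)]


# Hypothetical usage: 16 frame indices, clips of 4 frames each.
frames = list(range(16))
instances = split_into_instances(frames, instance_len=4)

# Toy per-instance features; in practice these would come from a 3D CNN.
features = [[1.0, 0.0], [3.0, 2.0]]
bag_feature = aggregate_bag(features, attention_scores=[0.0, 0.0])
```

In this sketch only the bag-level feature would be compared against the video label, which is how MIL accommodates inexact labels: no individual clip has to match the labelled expression.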