Concepedia

Publication | Closed Access

ABAW5 Challenge: A Facial Affect Recognition Approach Utilizing Transformer Encoder and Audiovisual Fusion

22

Citations

45

References

2023

Year

Abstract

In this paper, we present our approach to tackling the 5th Workshop and Competition on Affective Behavior Analysis in-the-wild (ABAW). The competition comprises four sub-challenges, namely Valence-Arousal (VA) Estimation, Expression (Expr) Classification, Action Unit (AU) Detection, and Emotional Reaction Intensity (ERI) Estimation. To address theμse challenges, we leverage state-of-the-art (sota) models to extract robust audio and visual features. Subsequently, these features are fused using a Transformer Encoder for the VA, Expr, and AU sub-challenges, and TEMMA for the ERI sub-challenge. To mitigate the effect of disparate feature dimensions, we introduce an Affine Module to align the features to the same dimension. Overall, our results outperform the baseline by a substantial margin across all four sub-challenges. Specifically, for the VA Estimation sub-challenge, our method attains a mean Concordance Correlation Coefficient (CCC) of 0.5342, ranking fifth overall. For the Expression Classification sub-challenge, our approach achieves an average F1 Score of 0.3337, placing fourth overall. For the AU Detection sub-challenge, our method obtains an average F1 Score of 0.4752. Lastly, for the Emotional Reaction Intensity Estimation sub-challenge, our approach yields an average Pearson’s correlation coefficient of 0.3968.

References

YearCitations

Page 1