Automatic Engagement Prediction with GAP Feature

TLDR

We propose an automatic engagement prediction method for the Engagement in the Wild sub‑challenge of EmotiW 2018. We design a Gaze‑AU‑Pose (GAP) feature that fuses gaze, action units, and head pose, extract it from overlapping video clips, and feed the sequence into a GRU‑based deep model followed by mean pooling to predict engagement. Experimental results show the approach achieves an MSE of 0.0724 on the EmotiW 2018 test set, demonstrating its effectiveness.

Abstract

In this paper, we propose an automatic engagement prediction method for the Engagement in the Wild sub-challenge of EmotiW 2018. We first design a novel Gaze-AU-Pose (GAP) feature that takes into account the gaze, action units, and head pose of a subject. The GAP feature is then used for the subsequent engagement level prediction. To efficiently predict the engagement level of a long video, we divide the video into multiple overlapping clips and extract a GAP feature for each clip. A deep model consisting of a Gated Recurrent Unit (GRU) layer and a fully connected layer is used as the engagement predictor. Finally, a mean pooling layer is applied to the per-clip estimates to obtain the final engagement level of the whole video. Experimental results on the validation set and test set show the effectiveness of the proposed approach. In particular, our approach achieves an MSE of 0.0724 on the test set of the Engagement Prediction Challenge of EmotiW 2018.
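
The clip-level pipeline described above (overlapping clips, one estimate per clip, mean pooling over the video) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the clip length, the stride, the per-frame feature layout, and the names `split_into_clips` and `predict_video` are all hypothetical, and `clip_predictor` is a stand-in for the GRU plus fully connected model, whose configuration the abstract does not specify.

```python
from statistics import fmean
from typing import Callable, List, Sequence

# One per-frame feature vector. The actual GAP layout (gaze, AU, head pose
# dimensions) is not given in the abstract, so this type is an assumption.
Frame = Sequence[float]


def split_into_clips(frames: Sequence[Frame], clip_len: int, stride: int) -> List[List[Frame]]:
    """Return windows of `clip_len` frames, advancing by `stride` frames.

    Consecutive windows overlap whenever stride < clip_len. Trailing frames
    that do not fill a complete window are dropped; a video shorter than one
    window is returned as a single clip.
    """
    if clip_len <= 0 or stride <= 0:
        raise ValueError("clip_len and stride must be positive")
    if not frames:
        raise ValueError("video has no frames")
    if len(frames) <= clip_len:
        return [list(frames)]
    starts = range(0, len(frames) - clip_len + 1, stride)
    return [list(frames[s:s + clip_len]) for s in starts]


def predict_video(
    frames: Sequence[Frame],
    clip_predictor: Callable[[List[Frame]], float],
    clip_len: int,
    stride: int,
) -> float:
    """Mean-pool the per-clip engagement estimates into one video-level score."""
    clips = split_into_clips(frames, clip_len, stride)
    per_clip = [clip_predictor(clip) for clip in clips]
    return fmean(per_clip)
```

Any callable that maps a clip to a scalar can be passed as `clip_predictor`, so a trained recurrent model could be plugged in without changing the windowing or pooling logic.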
