Human‐centered attention‐aware networks for action recognition

Abstract

Action recognition in video is an active research topic in computer vision. Learning important cues from the video context significantly benefits interaction prediction and gesture recognition. Most existing methods infer the interactions between actor and context through relational reasoning. Although these relational features help make the performed action more salient, errors occur when the salient region is irrelevant to the action being recognized. This paper therefore establishes a human-centered attention mechanism that dynamically highlights action-relevant regions according to the appearance of the target, so that human-object interaction actions are recognized selectively. The effectiveness of the proposed mechanism is verified on the AVA2.2 dataset, and the visualized attention maps further show that the mechanism can effectively recognize actions that are strongly correlated with the human actor.
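The idea of highlighting context regions according to the target's appearance can be illustrated with a generic dot-product attention over a grid of context features. This is only a minimal sketch under stated assumptions, not the architecture of the paper: the function name `actor_conditioned_attention`, the feature dimension, the scaling by the square root of the dimension, and the toy feature values are all hypothetical and chosen for illustration.

```python
import math


def actor_conditioned_attention(actor_feat, context_feats):
    """Weight each context region by its similarity to the actor feature.

    actor_feat:    list of d floats (hypothetical appearance feature of the person).
    context_feats: h x w grid, each cell a list of d floats (hypothetical scene features).
    Returns (attention map as an h x w grid summing to 1, attended feature of length d).
    """
    d = len(actor_feat)
    h, w = len(context_feats), len(context_feats[0])

    # Scaled dot-product score between the actor and every region.
    scores = [
        [sum(a * c for a, c in zip(actor_feat, cell)) / math.sqrt(d) for cell in row]
        for row in context_feats
    ]

    # Softmax over all regions (subtract the maximum for numerical stability).
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    attn = [[e / total for e in row] for row in exps]

    # Attention-weighted sum of the context features.
    attended = [
        sum(attn[i][j] * context_feats[i][j][k] for i in range(h) for j in range(w))
        for k in range(d)
    ]
    return attn, attended


# Toy 3x3 scene with 4-dimensional features (assumed values).
context = [[[0.0, 0.0, 0.0, 0.0] for _ in range(3)] for _ in range(3)]
context[1][2] = [1.0, 0.0, 0.0, 0.0]  # region related to the actor's action
context[0][0] = [0.0, 1.0, 0.0, 0.0]  # salient region unrelated to the actor
actor = [4.0, 0.0, 0.0, 0.0]

attn_map, attended = actor_conditioned_attention(actor, context)
```

In this toy setup the region at row 1, column 2 receives the largest weight, while the unrelated region at row 0, column 0 is weighted no higher than empty background. That is the behaviour the abstract asks for when a salient region is irrelevant to the recognized action.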
