Structure Inference Machines: Recurrent Neural Networks for Analyzing Relations in Group Activity Recognition

TLDR

Rich semantic relations, such as those in group activity recognition, are crucial for visual understanding, yet bridging the low-level outputs of deep learning models to interpretations of high-level compositional scenes remains a challenge, one that graphical models are a standard tool for addressing. The study proposes integrating graphical models and deep neural networks into a joint framework. Inference is performed sequentially by a recurrent neural network, and the structure used for inference is learned by imposing gates on the edges between nodes. Experiments on group activity recognition suggest that the model has the potential to handle highly structured learning tasks.

Abstract

Rich semantic relations are important in a variety of visual recognition problems. As a concrete example, group activity recognition involves the interactions and relative spatial relations of a set of people in a scene. State-of-the-art recognition methods center on deep learning approaches for training highly effective, complex classifiers for interpreting images. However, bridging the relatively low-level concepts output by these methods to interpret higher-level compositional scenes remains a challenge. Graphical models are a standard tool for this task. In this paper, we propose a method to integrate graphical models and deep neural networks into a joint framework. Instead of using a traditional inference method, we use sequential inference modeled by a recurrent neural network. Beyond this, the appropriate structure for inference can be learned by imposing gates on edges between nodes. Empirical results on group activity recognition demonstrate the potential of this model to handle highly structured learning tasks.
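
To make the idea of gated, sequential inference concrete, the following is a minimal sketch and not the paper's implementation. It assumes a fully connected graph over people, per-person class scores from some upstream classifier, a class-compatibility matrix, and a scalar sigmoid gate on each directed edge. The function name `gated_message_passing`, the parameters `compat`, `gate_w`, and `gate_b`, and the agreement-based form of the gate are all hypothetical choices made for illustration; in a trained model these quantities would be learned.

```python
import math


def softmax(scores):
    """Convert raw class scores to a probability vector."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_message_passing(unary, compat, gate_w, gate_b, steps=3):
    """Run `steps` rounds of gated message passing on a fully connected graph.

    unary  : per-node class-score lists (assumed to come from a classifier)
    compat : K x K matrix; compat[a][b] says how much a neighbor's class b
             supports class a (hypothetical parameterization)
    gate_w, gate_b : parameters of the scalar edge gate (hypothetical form)
    """
    n, k = len(unary), len(unary[0])
    scores = [row[:] for row in unary]
    gates = [[0.0] * n for _ in range(n)]
    # The same parameters are reused at every step, as in a recurrent network.
    for _ in range(steps):
        beliefs = [softmax(s) for s in scores]
        new_scores = []
        for i in range(n):
            total = unary[i][:]
            for j in range(n):
                if i == j:
                    continue
                # Gate in (0, 1): how strongly edge j -> i takes part in inference.
                agreement = sum(p * q for p, q in zip(beliefs[i], beliefs[j]))
                gates[i][j] = sigmoid(gate_w * agreement + gate_b)
                # Message: the neighbor's belief mapped through `compat`.
                for a in range(k):
                    msg = sum(compat[a][b] * beliefs[j][b] for b in range(k))
                    total[a] += gates[i][j] * msg
            new_scores.append(total)
        scores = new_scores
    return [softmax(s) for s in scores], gates


# Toy scene: two people clearly show class 0, a third is ambiguous.
unary = [[2.0, 0.0], [1.5, 0.0], [0.1, 0.0]]
compat = [[1.0, 0.0], [0.0, 1.0]]  # neighbors support their own class
beliefs, gates = gated_message_passing(unary, compat, gate_w=4.0, gate_b=-2.0)
```

In this toy scene the ambiguous third person is pulled toward the class favored by the other two, and the gate values show how much each edge contributed. Driving a gate toward zero effectively removes that edge, which is the sense in which gating lets the inference structure itself be learned.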
