Actions as Space-Time Shapes

TLDR

Human actions in video can be viewed as silhouettes of a moving torso and limbs, which together form three-dimensional shapes in the space-time volume. The method generalizes a recent 2D shape analysis approach to these volumetric space-time shapes and uses properties of the solution to the Poisson equation to extract space-time features: local saliency, action dynamics, shape structure, and orientation. These features support action recognition, detection, and clustering. The method is fast, requires no video alignment, and is robust to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, irregular performance of an action, and low-quality video.

Abstract

Human action in video sequences can be seen as silhouettes of a moving torso and protruding limbs undergoing articulated motion. We regard human actions as three-dimensional shapes induced by the silhouettes in the space-time volume. We adopt a recent approach for analyzing 2D shapes and generalize it to deal with volumetric space-time action shapes. Our method utilizes properties of the solution to the Poisson equation to extract space-time features such as local space-time saliency, action dynamics, shape structure and orientation. We show that these features are useful for action recognition, detection and clustering. The method is fast, does not require video alignment and is applicable in (but not limited to) many scenarios where the background is known. Moreover, we demonstrate the robustness of our method to partial occlusions, non-rigid deformations, significant changes in scale and viewpoint, high irregularities in the performance of an action, and low quality video.
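The abstract does not state the exact equation or boundary conditions, so the following is only a minimal sketch under stated assumptions: the space-time shape is a binary volume of stacked silhouettes, the equation is taken to be ΔU = -1 inside the shape, and U is fixed at 0 outside it. The function name `poisson_interior` and the plain Jacobi iteration are illustrative choices, not the authors' solver. The resulting U grows with distance from the shape boundary, which is the kind of quantity from which saliency-like features could be derived.

```python
import numpy as np

def poisson_interior(mask, iters=500):
    """Approximate U with Laplacian(U) = -1 inside `mask` and U = 0 outside.

    `mask` is a boolean (t, y, x) volume of stacked silhouettes.
    Assumption: zero Dirichlet values everywhere outside the shape.
    """
    mask = np.asarray(mask, dtype=bool)
    # Pad with background so np.roll only ever wraps zeros around.
    m = np.pad(mask, 1, constant_values=False)
    u = np.zeros(m.shape, dtype=float)
    for _ in range(iters):
        # Sum of the six axis-aligned neighbours (unit grid spacing).
        nb = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
              np.roll(u, 1, 1) + np.roll(u, -1, 1) +
              np.roll(u, 1, 2) + np.roll(u, -1, 2))
        # Discrete Poisson update: nb - 6u = -1  =>  u = (nb + 1) / 6.
        u = np.where(m, (nb + 1.0) / 6.0, 0.0)
    return u[1:-1, 1:-1, 1:-1]

# Toy space-time shape: a 5x5 square silhouette held for 5 frames.
toy = np.zeros((5, 7, 7), dtype=bool)
toy[:, 1:6, 1:6] = True
u_toy = poisson_interior(toy)
```

In this toy volume, voxels deep inside the shape receive larger values than voxels near its surface, while background voxels stay at zero. A real pipeline would need a faster solver (for example multigrid) for full-length videos.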
