Efficient visual event detection using volumetric features

TLDR

The paper investigates volumetric features as an alternative to local descriptors for video event detection. The authors extend 2D box features to 3D spatio-temporal volumetric features and, for each action of interest, learn a cascade of filters that scans video sequences in space and time in real time. The detector recognizes actions that are difficult for interest-point methods, such as smooth motions, is robust to changes in viewpoint, scale, and action speed, and performs comparably to a current interest-point-based recognizer on a standard human activity database.

Abstract

This paper studies the use of volumetric features as an alternative to popular local descriptor approaches for event detection in video sequences. Motivated by the recent success of similar ideas in object detection on static images, we generalize the notion of 2D box features to 3D spatio-temporal volumetric features. This general framework enables us to do real-time video analysis. We construct a real-time event detector for each action of interest by learning a cascade of filters based on volumetric features that efficiently scans video sequences in space and time. This event detector recognizes actions that are traditionally problematic for interest-point methods, such as smooth motions where insufficient space-time interest points are available. Our experiments demonstrate that the technique accurately detects actions on real-world sequences and is robust to changes in viewpoint, scale and action speed. We also adapt our technique to the related task of human action classification and confirm that it achieves performance comparable to a current interest-point-based human activity recognizer on a standard database of human activities.
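
The generalization from 2D box features to 3D volumetric features can be illustrated with a small sketch. It assumes the standard summed-area-table trick from static-image box features, extended by one dimension so that the sum over any space-time box costs eight lookups. The abstract does not spell out this construction, so the function names (`integral_video`, `box_sum`, `temporal_two_box_feature`), the `video[t][y][x]` layout, and the choice of a two-box temporal feature are all illustrative assumptions, not the authors' implementation.

```python
def integral_video(video):
    """Build a 3D summed-volume table with a zero border (assumed layout video[t][y][x]).

    The result ii has shape (T+1, H+1, W+1); ii[t][y][x] holds the sum of
    all values with frame index < t, row index < y, and column index < x.
    """
    T, H, W = len(video), len(video[0]), len(video[0][0])
    ii = [[[0] * (W + 1) for _ in range(H + 1)] for _ in range(T + 1)]
    for t in range(1, T + 1):
        for y in range(1, H + 1):
            for x in range(1, W + 1):
                # 3D inclusion-exclusion over the seven already-filled neighbours.
                ii[t][y][x] = (
                    video[t - 1][y - 1][x - 1]
                    + ii[t - 1][y][x] + ii[t][y - 1][x] + ii[t][y][x - 1]
                    - ii[t - 1][y - 1][x] - ii[t - 1][y][x - 1] - ii[t][y - 1][x - 1]
                    + ii[t - 1][y - 1][x - 1]
                )
    return ii


def box_sum(ii, t0, y0, x0, t1, y1, x1):
    """Sum over the half-open box [t0, t1) x [y0, y1) x [x0, x1) using 8 lookups."""
    return (
        ii[t1][y1][x1]
        - ii[t0][y1][x1] - ii[t1][y0][x1] - ii[t1][y1][x0]
        + ii[t0][y0][x1] + ii[t0][y1][x0] + ii[t1][y0][x0]
        - ii[t0][y0][x0]
    )


def temporal_two_box_feature(ii, t0, y0, x0, t1, y1, x1):
    """Hypothetical two-box feature: later half of the volume minus earlier half."""
    t_mid = (t0 + t1) // 2
    later = box_sum(ii, t_mid, y0, x0, t1, y1, x1)
    earlier = box_sum(ii, t0, y0, x0, t_mid, y1, x1)
    return later - earlier
```

Because the table is built once per sequence, a detector that scans many positions, scales, and durations pays only a constant cost per feature evaluation, which is what makes exhaustive scanning in space and time plausible at real-time rates. The per-pixel values can be any quantity defined on each frame; this sketch does not assume a particular one.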
