Concepedia

Publication | Closed Access

Recognizing action at a distance

Citations: 1.2K

References: 18

Year: 2003

Alexei A. Efros, Alexander C. Berg, Greg Mori, Jitendra Malik

IEEE International Conference on Computer Vision (ICCV)

TLDR

The study aims to recognize human actions from low‑resolution, distant views where a person may be only ~30 pixels tall. The authors propose a spatiotemporal motion descriptor derived from smoothed optical‑flow patterns, used in a nearest‑neighbor framework to classify actions and transfer skeletons or synthesize new actions. The method is validated on ballet, tennis, and football datasets.

Abstract

Our goal is to recognize human action at a distance, at resolutions where a whole person may be, say, 30 pixels tall. We introduce a novel motion descriptor based on optical flow measurements in a spatiotemporal volume for each stabilized human figure, and an associated similarity measure to be used in a nearest-neighbor framework. Making use of noisy optical flow measurements is the key challenge, which is addressed by treating optical flow not as precise pixel displacements, but rather as a spatial pattern of noisy measurements which are carefully smoothed and aggregated to form our spatiotemporal motion descriptor. To classify the action being performed by a human figure in a query sequence, we retrieve nearest neighbor(s) from a database of stored, annotated video sequences. We can also use these retrieved exemplars to transfer 2D/3D skeletons onto the figures in the query sequence, as well as for two forms of data-based action synthesis: "do as I do" and "do as I say". Results are demonstrated on ballet, tennis, and football datasets.
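
The pipeline in the abstract (smooth noisy flow, aggregate it into a spatiotemporal descriptor, then retrieve the nearest annotated exemplar) can be illustrated with a minimal Python sketch. Several details here are assumptions, not statements from this page: the flow is split into four half-wave rectified channels before blurring, the similarity is a normalized correlation over the whole volume, and every function name and the default `sigma` are hypothetical. Optical flow is assumed to be precomputed for an already stabilized figure.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1D Gaussian kernel, truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=float)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def blur(channel, sigma):
    """Separable Gaussian blur of a 2D array, with edge replication at the borders."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2

    def blur_1d(v):
        # Pad by the kernel radius so the 'valid' output keeps the input length.
        return np.convolve(np.pad(v, r, mode="edge"), k, mode="valid")

    rows = np.apply_along_axis(blur_1d, 1, channel)
    return np.apply_along_axis(blur_1d, 0, rows)

def motion_descriptor(flow, sigma=1.0):
    """Build a descriptor from flow of shape (T, H, W, 2) for a stabilized figure.

    Assumption of this sketch: each flow component is half-wave rectified into
    a positive and a negative channel, and each channel is blurred so that the
    descriptor captures a spatial pattern rather than exact displacements.
    Returns an array of shape (T, 4, H, W).
    """
    fx, fy = flow[..., 0], flow[..., 1]
    channels = np.stack(
        [np.maximum(fx, 0), np.maximum(-fx, 0),
         np.maximum(fy, 0), np.maximum(-fy, 0)],
        axis=1,
    )
    return np.array([[blur(c, sigma) for c in frame] for frame in channels])

def similarity(a, b):
    """Normalized correlation between two descriptors of equal shape."""
    a, b = a.ravel(), b.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(query_flow, database):
    """Return the label of the stored sequence most similar to the query.

    `database` is a list of (label, flow) pairs whose flows share the
    query's shape.
    """
    q = motion_descriptor(query_flow)
    scores = [(similarity(q, motion_descriptor(f)), label) for label, f in database]
    return max(scores)[1]
```

In use, `classify(query_flow, [("walk", walk_flow), ("run", run_flow)])` would return the label of the closest stored sequence, where the labels and flow arrays are placeholders. Comparing whole volumes of equal length is a simplification; aligning sequences of different lengths or speeds would need an additional temporal matching step.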
