Publication | Closed Access
Action snippets: How many frames does human action recognition require?
Citations: 528
References: 25
Year: 2008
Unknown Venue
Video Clips, Machine Learning, Engineering, Video Processing, Biometrics, Social Sciences, Video Interpretation, Image Analysis, Motion Capture, Pattern Recognition, Affective Computing, Video Content Analysis, Human Actions, Cognitive Science, Machine Vision, Action Pattern, Computer Science, Video Understanding, Computer Vision, Eye Tracking, Action Snippets, Visual Recognition, Activity Recognition, Motion Analysis
Visual recognition of human actions in video clips has been an active field of research in recent years. However, most published methods either analyse an entire video and assign it a single action label, or use relatively large look-ahead to classify each frame. Contrary to these strategies, human vision proves that simple actions can be recognised almost instantaneously. In this paper, we present a system for action recognition from very short sequences ("snippets") of 1-10 frames, and systematically evaluate it on standard data sets. It turns out that even local shape and optic flow for a single frame are enough to achieve ≈90% correct recognitions, and snippets of 5-7 frames (0.3-0.5 seconds of video) are enough to achieve a performance similar to the one obtainable with the entire video sequence.
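To make the notion of a snippet concrete, the following is a minimal sketch of how a video could be cut into short, overlapping frame windows before each window is classified. It is an illustration only, not the authors' implementation: the function name `make_snippets` and the `stride` parameter are hypothetical, and frames are stood in for by integers.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def make_snippets(frames: Sequence[T], length: int, stride: int = 1) -> List[Sequence[T]]:
    """Split a frame sequence into windows ("snippets") of `length` frames.

    Hypothetical helper for illustration; the abstract only states that
    snippets contain 1-10 frames, not how they are extracted.
    """
    if length < 1 or stride < 1:
        raise ValueError("length and stride must be positive")
    # The last valid start index is len(frames) - length; shorter videos yield no snippets.
    last_start = len(frames) - length
    return [frames[i:i + length] for i in range(0, last_start + 1, stride)]


if __name__ == "__main__":
    video = list(range(20))            # stand-in for 20 decoded frames
    snippets = make_snippets(video, length=5)
    print(len(snippets))               # 16 overlapping 5-frame snippets
    print(snippets[0])                 # [0, 1, 2, 3, 4]
```

With `length=1` each frame becomes its own snippet, which corresponds to the single-frame case mentioned in the abstract; a per-snippet classifier would then be applied to each window independently.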