Concepedia

Abstract

Recognizing human actions in complex scenes is a challenging problem due to background clutter, camera motion, occlusions, and illumination variations. Markov models are widely used to model temporal statistical relationships among elementary actions for human action recognition. However, traditional Markov models cannot model long-range temporal relations in complex activities, and the states of elementary actions may be unstable because of unwanted local features from the background. In this paper, we propose a multiple-instance Markov model for human action recognition to address these issues. Our contributions are twofold. First, we propose a novel representation for elementary actions that encodes the movements of local parts. Based on this representation, our method uses its multiple-instance formulation to select elementary actions with stable states. Second, we build multiple Markov chains, which encode both local and long-range temporal information among elementary actions, to represent each video. The multiple-instance formulation allows our model to capture the most discriminative Markov chain for action representation. We evaluate the proposed model on a variety of data sets. Experimental results demonstrate its effectiveness for human action recognition.
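The idea of representing a video by several Markov chains and keeping the best one can be illustrated with a minimal sketch. This is not the paper's formulation: it assumes a plain first-order Markov model, builds the chains by sampling a sequence of elementary-action states at different strides (stride 1 for local relations, larger strides for long-range ones), and applies the standard multiple-instance rule of scoring a bag by its highest-scoring instance. All names, the stride scheme, and the toy probabilities are hypothetical.

```python
import math


def chain_log_likelihood(chain, log_init, log_trans):
    """Log-probability of a state sequence under a first-order Markov model."""
    score = log_init[chain[0]]
    for prev, cur in zip(chain, chain[1:]):
        score += log_trans[prev][cur]
    return score


def build_chains(states, strides):
    """Hypothetical chain construction: one chain per temporal stride."""
    return [states[::s] for s in strides]


def best_chain(chains, log_init, log_trans):
    """Multiple-instance max rule: the bag (video) is scored by its best chain."""
    scores = [chain_log_likelihood(c, log_init, log_trans) for c in chains]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy model over two assumed elementary-action states, 0 and 1.
log_init = [math.log(0.5), math.log(0.5)]
log_trans = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.2), math.log(0.8)],
]

# Assumed per-segment states for one video; state 1 plays the role of
# an unstable state caused by background features.
segment_states = [0, 0, 1, 0, 0, 1, 0]

chains = build_chains(segment_states, strides=(1, 2, 3))
best_index, best_score = best_chain(chains, log_init, log_trans)
```

In this toy setting the stride-3 chain avoids the unstable segments and is selected. Raw log-likelihoods favor shorter chains, so a real model would likely need length normalization or a learned discriminative score rather than this direct comparison.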
