Publication | Open Access

Action Recognition with Improved Trajectories

Citations: 3.5K

References: 31

Year: 2013

TLDR

The paper improves dense trajectories, a state-of-the-art video representation for action recognition, by estimating camera motion and correcting for it. Feature points are matched between frames using SURF descriptors and dense optical flow, and a homography is fitted to these matches with RANSAC. Because human motion generally differs from camera motion and produces inconsistent matches, a human detector is used to discard those matches before the fit. Trajectories consistent with the estimated camera motion are removed, and the camera motion is cancelled out of the optical flow. This markedly improves the motion-based descriptors HOF and MBH, and the method outperforms the previous state of the art on Hollywood2, HMDB51, Olympic Sports, and UCF50.
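The detector-based filtering step in the summary above can be illustrated with a minimal sketch. It assumes that detections arrive as axis-aligned boxes `(x0, y0, x1, y1)` and that each match is a pair of points `(p, q)` in consecutive frames; the function names and the choice to test only the first-frame point are illustrative assumptions, not the authors' implementation.

```python
def inside_any_box(point, boxes):
    """Return True if the point lies inside at least one (x0, y0, x1, y1) box."""
    x, y = point
    return any(x0 <= x <= x1 and y0 <= y <= y1 for (x0, y0, x1, y1) in boxes)


def drop_human_matches(matches, human_boxes):
    """Keep only matches whose first-frame point is outside every human box.

    matches: list of ((x, y), (x', y')) point pairs between two frames.
    human_boxes: hypothetical detector output for the first frame.
    """
    return [(p, q) for (p, q) in matches if not inside_any_box(p, human_boxes)]


# Two matches: one on the background, one on a detected person.
matches = [((10, 10), (12, 10)), ((50, 50), (55, 58))]
human_boxes = [(40, 40, 60, 60)]
background_matches = drop_human_matches(matches, human_boxes)
```

Only the surviving background matches would then be passed to the RANSAC homography fit, so that motion on people does not bias the camera-motion estimate.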

Abstract

Recently, dense trajectories were shown to be an efficient video representation for action recognition and achieved state-of-the-art results on a variety of datasets. This paper improves their performance by taking into account camera motion to correct them. To estimate camera motion, we match feature points between frames using SURF descriptors and dense optical flow, which are shown to be complementary. These matches are then used to robustly estimate a homography with RANSAC. Human motion is in general different from camera motion and generates inconsistent matches. To improve the estimation, a human detector is employed to remove these matches. Given the estimated camera motion, we remove trajectories consistent with it. We also use this estimation to cancel out camera motion from the optical flow. This significantly improves motion-based descriptors, such as HOF and MBH. Experimental results on four challenging action datasets (i.e., Hollywood2, HMDB51, Olympic Sports and UCF50) significantly outperform the current state of the art.
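As a rough illustration of how an estimated homography can be used to cancel camera motion and to flag trajectories consistent with it, the sketch below works on point tracks in pure Python. It assumes the per-frame homographies have already been estimated, represents each as a 3x3 nested list mapping frame i to frame i+1, and uses a hypothetical threshold `min_motion` (in pixels) on the residual displacement; none of these names or values come from the abstract, and the sketch simplifies the dense-flow correction to individual points.

```python
import math


def apply_homography(H, point):
    """Map a point (x, y) through a 3x3 homography given as nested lists."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def residual_displacement(H, p, q):
    """Displacement from p to q that remains after removing camera motion.

    H predicts where p would land in the next frame if only the camera moved;
    the residual is the observed position q minus that prediction.
    """
    cx, cy = apply_homography(H, p)
    return (q[0] - cx, q[1] - cy)


def is_camera_trajectory(track, homographies, min_motion=1.0):
    """Flag a track whose residual motion never reaches min_motion pixels.

    track: list of (x, y) points over consecutive frames.
    homographies: homographies[i] maps frame i to frame i + 1.
    min_motion: hypothetical threshold, chosen here only for illustration.
    """
    largest = 0.0
    for i in range(len(track) - 1):
        dx, dy = residual_displacement(homographies[i], track[i], track[i + 1])
        largest = max(largest, math.hypot(dx, dy))
    return largest < min_motion


# Camera pans so that the background shifts 5 pixels right per frame.
pan = [[1, 0, 5], [0, 1, 0], [0, 0, 1]]
background_track = [(0, 0), (5, 0), (10, 0)]
actor_track = [(0, 0), (8, 2), (16, 4)]
```

With this pan, `background_track` moves exactly as the homography predicts and would be discarded, while `actor_track` keeps a residual of 3 pixels horizontally and 2 vertically per frame and would be retained. Motion descriptors such as HOF and MBH would then be computed from the residual motion rather than the raw flow.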
