Publication | Open Access
I3D-LSTM: A New Model for Human Action Recognition
Year: 2019 | Citations: 73 | References: 3
Convolutional Neural Network, Engineering, Machine Learning, Action Recognition (Movement Science), Action Recognition (Computer Vision), Abstract Action Recognition, Video Retrieval, Video Interpretation, Image Analysis, Pattern Recognition, Robot Learning, Human Action Recognition, Kinetics-pretrained 3D, Health Sciences, Machine Vision, Computer Science, Video Understanding, Deep Learning, Computer Vision, Convolution Neural Network, Video Analysis, Activity Recognition
Abstract: Action recognition, which aims to classify human actions in videos, has recently become an active research topic. Current mainstream methods generally use an ImageNet-pretrained model as the feature extractor; however, pretraining a video classification model on a large still-image dataset is not the optimal choice. Moreover, very few works note that a 3D convolutional neural network (3D CNN) is better suited to extracting low-level spatio-temporal features, while a recurrent neural network (RNN) is better suited to modelling high-level temporal feature sequences. We therefore propose a novel model that addresses both problems. First, we pretrain a 3D CNN on Kinetics, a large video action recognition dataset, to improve the generality of the model. Then, long short-term memory (LSTM) is introduced to model the high-level temporal features produced by the Kinetics-pretrained 3D CNN. Our experimental results show that the Kinetics-pretrained model generally outperforms the ImageNet-pretrained model, and our proposed network achieves leading performance on the UCF-101 dataset.
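The second stage described in the abstract, an LSTM run over the feature sequence produced by the 3D CNN, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 3D CNN is assumed to have already produced one feature vector per temporal step, the features and weights here are random stand-ins, and the names `LSTMCell`, `classify_clip`, `FEATURE_DIM`, `HIDDEN`, and `STEPS` are hypothetical. Classifying from the last hidden state is likewise an assumption made for illustration. The only dataset fact used is that UCF-101 has 101 action classes.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_matrix(rows, cols, rng):
    # Small random weights; a trained model would load learned values instead.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """Standard LSTM cell with input, forget, output gates and a candidate."""

    GATES = ("i", "f", "o", "g")

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        # One weight matrix per gate, applied to the concatenation [x; h].
        self.W = {name: init_matrix(hidden_size, input_size + hidden_size, rng)
                  for name in self.GATES}
        self.b = {name: [0.0] * hidden_size for name in self.GATES}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        pre = {name: [a + b for a, b in zip(matvec(self.W[name], z), self.b[name])]
               for name in self.GATES}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new


def classify_clip(features, cell, W_out):
    # Run the LSTM over the per-step features, then classify from the
    # last hidden state (an assumption, not confirmed by the abstract).
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    for x in features:
        h, c = cell.step(x, h, c)
    return softmax(matvec(W_out, h))


rng = random.Random(0)
FEATURE_DIM, HIDDEN, NUM_CLASSES, STEPS = 8, 4, 101, 5  # toy sizes, except 101 classes
# Stand-in for the 3D CNN output: one feature vector per temporal step.
features = [[rng.random() for _ in range(FEATURE_DIM)] for _ in range(STEPS)]
cell = LSTMCell(FEATURE_DIM, HIDDEN, rng)
W_out = init_matrix(NUM_CLASSES, HIDDEN, rng)
probs = classify_clip(features, cell, W_out)
print(len(probs), round(sum(probs), 6))
```

In a real system the random `features` would be replaced by activations from a Kinetics-pretrained 3D CNN, and the LSTM and classifier weights would be trained on the target dataset.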