Publication | Closed Access
Spatiotemporal Multiplier Networks for Video Action Recognition
Citations: 693
References: 35
Year: 2017
Unknown Venue
Convolutional Neural Network, Image Analysis, Machine Vision, Data Science, Machine Learning, Pattern Recognition, Engineering, Video Action Recognition, Computer Science, Video Understanding, General Convnet Architecture, Identity Mapping Kernels, Deep Learning, Video Transformer, Activity Recognition, Video Interpretation, Computer Vision
This paper presents a general ConvNet architecture for video action recognition based on multiplicative interactions of spacetime features. Our model combines the appearance and motion pathways of a two-stream architecture by motion gating and is trained end-to-end. We theoretically motivate multiplicative gating functions for residual networks and empirically study their effect on classification accuracy. To capture long-term dependencies, we inject identity mapping kernels for learning temporal relationships. Our architecture is fully convolutional in spacetime and can evaluate a video in a single forward pass. Empirical investigation reveals that our model produces state-of-the-art results on two standard action recognition datasets.
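The two mechanisms named in the abstract, multiplicative motion gating of a residual unit and identity-initialized temporal kernels, can be illustrated with a small dependency-free sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: features are reduced to a list of frames with a few channels each (spatial dimensions omitted), `relu_branch` is a hypothetical stand-in for the learned residual branch, and the temporal filter is applied per channel with a three-tap kernel.

```python
def relu_branch(x):
    # Hypothetical stand-in for a learned residual branch: elementwise ReLU only.
    return [[max(v, 0.0) for v in frame] for frame in x]


def gated_residual_unit(x_app, x_mot, branch=relu_branch):
    """Residual unit whose input is gated by motion features.

    Computes x_app + branch(x_app * x_mot), where * is elementwise.
    x_app, x_mot: lists of frames, each frame a list of channel values.
    """
    gated = [[a * m for a, m in zip(fa, fm)] for fa, fm in zip(x_app, x_mot)]
    residual = branch(gated)
    return [[a + r for a, r in zip(fa, fr)] for fa, fr in zip(x_app, residual)]


def identity_temporal_taps(num_taps=3):
    # Temporal kernel that is 1 at the centre tap and 0 elsewhere, so the
    # filter passes its input through unchanged until training alters the taps.
    taps = [0.0] * num_taps
    taps[num_taps // 2] = 1.0
    return taps


def temporal_filter(x, taps):
    """Apply a 1D filter along time to each channel, with zero padding."""
    pad = len(taps) // 2
    channels = len(x[0])
    zero = [0.0] * channels
    padded = [zero] * pad + x + [zero] * pad
    return [
        [sum(w * padded[t + i][c] for i, w in enumerate(taps)) for c in range(channels)]
        for t in range(len(x))
    ]


# Usage: three frames, two channels.
x_app = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x_mot = [[1.0, 1.0], [0.0, 0.0], [1.0, 0.0]]
out = gated_residual_unit(x_app, x_mot)
same = temporal_filter(x_app, identity_temporal_taps())
```

Where the motion features are zero, the gated branch contributes nothing and the appearance features pass through the skip connection unchanged. The identity-initialized temporal filter likewise returns its input as-is, which is one way a temporal layer could be inserted into a network without disturbing its existing behaviour at initialization.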