Publication | Closed Access
Skeleton-Based Action Recognition With Shift Graph Convolutional Network
Citations: 916
References: 32
Year: 2020
Unknown Venue
Geometric Learning, Engineering, Machine Learning, Human Pose Estimation, Skeleton Data, Video Interpretation, Kinesiology, Image Analysis, Pattern Recognition, Health Sciences, Machine Vision, Skeleton-based Action Recognition, Computer Science, Video Understanding, Deep Learning, Computer Vision, Temporal Graph, Human Movement, Graph Neural Network, Activity Recognition
Skeleton‑based action recognition has attracted attention, but graph convolutional networks suffer from high computational cost and inflexible receptive fields. This work introduces Shift‑GCN to address both the heavy computation and limited expressiveness of conventional GCNs. Shift‑GCN replaces costly regular graph convolutions with shift graph operations and lightweight point‑wise convolutions, enabling flexible spatial and temporal receptive fields. Across three benchmark datasets, Shift‑GCN surpasses state‑of‑the‑art methods while reducing computational complexity by more than tenfold.
Action recognition with skeleton data is attracting more attention in computer vision. Recently, graph convolutional networks (GCNs), which model the human body skeletons as spatiotemporal graphs, have obtained remarkable performance. However, the computational complexity of GCN-based methods is quite heavy, typically over 15 GFLOPs for one action sample. Recent works even reach about 100 GFLOPs. Another shortcoming is that the receptive fields of both the spatial graph and the temporal graph are inflexible. Although some works enhance the expressiveness of the spatial graph by introducing incremental adaptive modules, their performance is still limited by regular GCN structures. In this paper, we propose a novel shift graph convolutional network (Shift-GCN) to overcome both shortcomings. Instead of using heavy regular graph convolutions, our Shift-GCN is composed of novel shift graph operations and lightweight point-wise convolutions, where the shift graph operations provide flexible receptive fields for both the spatial graph and the temporal graph. On three datasets for skeleton-based action recognition, the proposed Shift-GCN notably exceeds the state-of-the-art methods with more than 10 times less computational complexity.
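The combination of a shift operation and a point-wise convolution described in the abstract can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the shift pattern, so the rule used here (channel `c` of node `v` is read from node `(v + c) % num_nodes`) is an assumption, and the names `shift_nodes`, `pointwise_conv`, and `shift_block` are hypothetical.

```python
import numpy as np


def shift_nodes(features):
    """Shift each channel along the node axis by a channel-dependent offset.

    features has shape (num_nodes, num_channels). The offset rule is an
    assumption made for illustration: channel c of node v is read from
    node (v + c) % num_nodes. The shift itself has no learnable weights.
    """
    num_nodes, num_channels = features.shape
    # node_idx[v, c] is the source node for output position (v, c).
    node_idx = (np.arange(num_nodes)[:, None]
                + np.arange(num_channels)[None, :]) % num_nodes
    chan_idx = np.arange(num_channels)[None, :]
    return features[node_idx, chan_idx]


def pointwise_conv(features, weight):
    """Point-wise (1x1) convolution: mix channels independently per node.

    weight has shape (in_channels, out_channels).
    """
    return features @ weight


def shift_block(features, weight):
    """Shift across nodes, then mix channels with a point-wise convolution."""
    return pointwise_conv(shift_nodes(features), weight)


if __name__ == "__main__":
    x = np.arange(12, dtype=float).reshape(4, 3)  # 4 nodes, 3 channels
    w = np.ones((3, 2))                           # 3 -> 2 channels
    print(shift_nodes(x))
    print(shift_block(x, w).shape)
```

In this sketch the shift only rearranges existing values, so each node gathers information from other nodes without any multiplications; all learnable parameters sit in the point-wise weight matrix.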