Publication | Closed Access
Cross-Modality Compensation Convolutional Neural Networks for RGB-D Action Recognition
Citations: 52
References: 54
Year: 2021
Convolutional Neural Network, Engineering, Machine Learning, Human Pose Estimation, Single Modality, Video Interpretation, Image Analysis, Data Science, Pattern Recognition, Depth Modalities, Robot Learning, Human Action Recognition, Video Transformer, Machine Vision, Feature Learning, Computer Science, Video Understanding, Deep Learning, RGB-D Action Recognition, Computer Vision
RGB-D-based human action recognition has attracted much attention recently because combining the RGB and depth modalities provides complementary information that a single modality cannot. However, it is difficult for the two modalities to effectively learn spatial-temporal information from each other. To facilitate information interaction between the modalities, a cross-modality compensation convolutional neural network (ConvNet) is proposed for human action recognition, which enhances discriminative ability by jointly learning compensation features from the RGB and depth modalities. To extract these compensation features, we design a cross-modality compensation block (CMCB). The CMCB is incorporated into two typical network architectures, ResNet and VGG, to verify that it improves model performance. The proposed architecture has been evaluated on three challenging datasets: NTU RGB+D 120, THU-READ and PKU-MMD. We experimentally verify that the proposed model with CMCB is effective for different input types, such as pairs of raw images and dynamic images constructed from the entire RGB-D sequence. The experimental results show that the proposed framework achieves state-of-the-art performance on all three datasets.
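The abstract does not specify how the CMCB is built internally, so the following is only a minimal sketch of the general idea of exchanging features between an RGB stream and a depth stream. Everything in it is an assumption made for illustration: the function names (`conv1x1`, `compensation_block`), the additive exchange, the 1x1 projections, and the random weights are hypothetical and are not taken from the paper.

```python
import numpy as np


def conv1x1(x, weight):
    """Pointwise (1x1) convolution: mixes channels at every spatial position.

    x: feature map of shape (C_in, H, W); weight: (C_out, C_in).
    Returns a feature map of shape (C_out, H, W).
    """
    return np.einsum("oc,chw->ohw", weight, x)


def relu(x):
    return np.maximum(x, 0.0)


def compensation_block(rgb_feat, depth_feat, w_d2r, w_r2d):
    """Hypothetical cross-modality exchange, NOT the paper's exact CMCB.

    Each stream keeps its own features and adds a projected copy of the
    other stream's features, so both outputs depend on both modalities.
    """
    rgb_out = relu(rgb_feat + conv1x1(depth_feat, w_d2r))
    depth_out = relu(depth_feat + conv1x1(rgb_feat, w_r2d))
    return rgb_out, depth_out


# Toy feature maps standing in for intermediate ResNet/VGG activations.
rng = np.random.default_rng(0)
channels, height, width = 8, 4, 4
rgb_feat = rng.standard_normal((channels, height, width))
depth_feat = rng.standard_normal((channels, height, width))

# Assumed projection weights; a real network would learn these.
w_d2r = 0.1 * rng.standard_normal((channels, channels))
w_r2d = 0.1 * rng.standard_normal((channels, channels))

rgb_out, depth_out = compensation_block(rgb_feat, depth_feat, w_d2r, w_r2d)
print(rgb_out.shape, depth_out.shape)
```

Because the block in this sketch preserves the shape of each stream, it could in principle be placed between stages of a two-stream backbone without changing the surrounding layers. Whether the paper's CMCB has that property is not stated in the abstract.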