Publication | Open Access
Multimodal Gesture Recognition Using 3-D Convolution and Convolutional LSTM
Citations: 273 | References: 41 | Year: 2017
3-D Convolution, Convolutional LSTM, Engineering, Machine Learning, Human Pose Estimation, 3D Pose Estimation, Image Analysis, Data Science, Pattern Recognition, Robot Learning, Multimodal Human Computer Interface, Health Sciences, Convolutional LSTM Networks, Dance, Machine Vision, Deep Learning, Computer Vision, Gesture Recognition, Human Movement, Activity Recognition
Gesture recognition aims to recognize meaningful movements of the human body and is of great importance in intelligent human-computer and human-robot interaction. In this paper, we present a multimodal gesture recognition method based on 3-D convolution and convolutional long short-term memory (LSTM) networks. The proposed method first learns short-term spatiotemporal features of gestures with a 3-D convolutional neural network, and then learns long-term spatiotemporal features with convolutional LSTM networks that operate on the extracted short-term features. In addition, fine-tuning among multimodal data is evaluated, and we find that it can serve as an optional technique for preventing overfitting when no pre-trained models exist. The proposed method is verified on the ChaLearn LAP large-scale isolated gesture data set (IsoGD) and the Sheffield Kinect gesture (SKIG) data set. The results show that the proposed method achieves state-of-the-art recognition accuracy (51.02% on the validation set of IsoGD and 98.89% on SKIG).
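The two-stage design described in the abstract can be illustrated by tracing feature-map shapes through a pipeline of this kind. The sketch below is only shape bookkeeping in plain Python, not the authors' network: the clip size, channel counts, layer count, and pooling choices are all hypothetical, and the helper names (`FeatureShape`, `conv3d`, `pool3d`, `conv_lstm`) are invented for this example. It assumes that 3-D convolutions with small temporal kernels capture short-term motion while leaving a temporal axis in place, and that the convolutional LSTM layers then recur over that axis while preserving the spatial layout.

```python
from typing import NamedTuple, Tuple


class FeatureShape(NamedTuple):
    """Shape of a video feature map: (channels, frames, height, width)."""
    channels: int
    frames: int
    height: int
    width: int


def conv_len(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard output-length formula for a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv3d(x: FeatureShape, out_channels: int,
           kernel: Tuple[int, int, int] = (3, 3, 3),
           stride: Tuple[int, int, int] = (1, 1, 1),
           padding: Tuple[int, int, int] = (1, 1, 1)) -> FeatureShape:
    """3-D convolution: the kernel spans a few frames, so it sees short-term motion."""
    return FeatureShape(
        out_channels,
        conv_len(x.frames, kernel[0], stride[0], padding[0]),
        conv_len(x.height, kernel[1], stride[1], padding[1]),
        conv_len(x.width, kernel[2], stride[2], padding[2]),
    )


def pool3d(x: FeatureShape, kernel: Tuple[int, int, int]) -> FeatureShape:
    """Non-overlapping pooling over (frames, height, width)."""
    return FeatureShape(x.channels, x.frames // kernel[0],
                        x.height // kernel[1], x.width // kernel[2])


def conv_lstm(x: FeatureShape, hidden_channels: int,
              return_sequence: bool) -> FeatureShape:
    """Convolutional LSTM with 'same' padding: recurs over frames and keeps
    the spatial size; optionally returns only the final hidden state."""
    frames = x.frames if return_sequence else 1
    return FeatureShape(hidden_channels, frames, x.height, x.width)


# Hypothetical 32-frame RGB clip at 112x112 (sizes are assumptions).
clip = FeatureShape(channels=3, frames=32, height=112, width=112)

# Stage 1: short-term spatiotemporal features from 3-D convolutions.
x = pool3d(conv3d(clip, 64), (1, 2, 2))        # (64, 32, 56, 56)
short_term = pool3d(conv3d(x, 128), (2, 2, 2))  # (128, 16, 28, 28)

# Stage 2: long-term features from stacked convolutional LSTM layers.
seq = conv_lstm(short_term, 256, return_sequence=True)   # (256, 16, 28, 28)
long_term = conv_lstm(seq, 256, return_sequence=False)   # (256, 1, 28, 28)

print(short_term)
print(long_term)
```

Note that in this sketch the output of the first stage still has 16 time steps, so the recurrent stage has a sequence to integrate, and the final state remains a 2-D feature map rather than a flat vector.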