Publication | Open Access
A 3DCNN-LSTM Multi-Class Temporal Segmentation for Hand Gesture Recognition
Citations: 13
References: 32
Year: 2022
Topics: Engineering, Machine Learning, Human Pose Estimation, Action Recognition (Movement Science), Biometrics, Action Recognition (Computer Vision), Image Frames, Video Interpretation, Speech Recognition, Image Analysis, Data Science, Hand Gesture Sequences, Pattern Recognition, Video Transformer, Gesture Processing, Health Sciences, Gesture Studies, Machine Vision, Video Understanding, Deep Learning, Gesture Recognition, Computer Vision, Video Analysis, Consecutive Frames, Hand Gesture Recognition
This paper introduces a multi-class hand gesture recognition model that identifies a set of hand gesture sequences from two-dimensional RGB video recordings, using both the appearance and the spatiotemporal parameters of consecutive frames. The classifier combines a convolutional network with a long short-term memory (LSTM) unit. To address the need for a large-scale dataset, the model is first trained on a public dataset and then fine-tuned, through transfer learning, on the hand gestures of relevance. Validation curves obtained with a batch size of 64 indicate an accuracy of 93.95% (±0.37) with a mean Jaccard index of 0.812 (±0.105) for 22 participants. The fine-tuned architecture illustrates the possibility of refining a model with a small set of data (113,410 fully labelled image frames) to cover previously unknown hand gestures. The main contribution of this work is a custom hand gesture recognition network, driven by monocular RGB video sequences, that outperforms previous temporal segmentation models while keeping a small-sized architecture that facilitates wide adoption.
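For readers unfamiliar with the Jaccard index reported above, the following is a minimal sketch of how such a score can be computed for frame-level temporal segmentation. It is not the authors' evaluation code: the function name `mean_jaccard`, the toy label sequences, and the choice to average the index over classes are all assumptions made for illustration, since the abstract does not say how the mean was taken.

```python
def mean_jaccard(true_labels, pred_labels):
    """Mean per-class Jaccard index for two equal-length frame-label sequences.

    Assumption: the mean is taken over gesture classes; the paper may instead
    average over participants or sequences.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")

    # Every class that appears in either sequence contributes one score.
    classes = sorted(set(true_labels) | set(pred_labels))
    scores = []
    for c in classes:
        # Frame indices assigned to class c in the ground truth and prediction.
        true_frames = {i for i, y in enumerate(true_labels) if y == c}
        pred_frames = {i for i, y in enumerate(pred_labels) if y == c}
        intersection = true_frames & pred_frames
        union = true_frames | pred_frames  # never empty for a listed class
        scores.append(len(intersection) / len(union))
    return sum(scores) / len(scores)


# Hypothetical toy example: 8 frames, 3 gesture classes (0, 1, 2).
truth = [0, 0, 1, 1, 1, 2, 2, 0]
pred = [0, 1, 1, 1, 2, 2, 2, 0]
score = mean_jaccard(truth, pred)
print(f"mean Jaccard index: {score:.3f}")
```

A score of 1.0 means the predicted gesture segments overlap the ground truth exactly. Unlike plain frame accuracy, the index also penalizes segment boundaries that are shifted in time.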