Publication | Open Access
Investigating the use of recurrent motion modelling for speech gesture generation
Citations: 133
References: 29
Year: 2018
Unknown Venue
Artificial Intelligence, Engineering, Machine Learning, Recurrent Motion, Speech Recognition, Natural Language Processing, Multimodal LLM, Gesture Motion, Robot Learning, American Sign Language, Health Sciences, Dance, Realistic Behavior, Speech Synthesis, Motion Synthesis, Speech Output, Computer Science, Deep Learning, Gesture Recognition, Speech Communication, Speech Technology, Speech Processing, Recurrent Network, Speech Perception, Speech Gesture Generation, Linguistics
The growing use of virtual humans demands increasingly realistic behavior from them at minimal cost and time. Gestures are a key ingredient of realistic and engaging virtual agents, and automated gesture generation has consequently been a popular area of research. So far, good gesture generation has relied on explicitly formulated if-then rules and probabilistic modelling of annotated features. Machine learning approaches have yielded only marginal success, which indicates the high complexity of the speech-to-motion learning task. In this work, we explore transfer learning from previous motion modelling research to improve learning outcomes for gesture generation from speech. We use a recurrent network with an encoder-decoder structure that takes in prosodic speech features and generates a short sequence of gesture motion. We pre-train the network on a motion modelling task. For this work, we recorded a large multimodal database of conversational speech.
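The encoder-decoder data flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the plain tanh recurrent cell, the feature and pose dimensions, the random untrained weights, and the choice to reuse only the decoder from the motion modelling stage are all assumptions made for illustration, and every function name here is hypothetical.

```python
import copy
import math
import random

FEAT_DIM = 4    # assumed number of prosodic features per speech frame
POSE_DIM = 6    # assumed number of values in one gesture pose frame
HIDDEN = 8      # assumed recurrent state size

def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rnn_step(w_in, w_rec, x, h):
    """One tanh recurrent step: h' = tanh(W_in x + W_rec h)."""
    a = matvec(w_in, x)
    b = matvec(w_rec, h)
    return [math.tanh(ai + bi) for ai, bi in zip(a, b)]

def make_encoder(rng):
    return {"w_in": make_matrix(HIDDEN, FEAT_DIM, rng),
            "w_rec": make_matrix(HIDDEN, HIDDEN, rng)}

def make_decoder(rng):
    return {"w_in": make_matrix(HIDDEN, POSE_DIM, rng),
            "w_rec": make_matrix(HIDDEN, HIDDEN, rng),
            "w_out": make_matrix(POSE_DIM, HIDDEN, rng)}

def encode(enc, frames):
    """Summarize a sequence of prosodic feature frames in one hidden state."""
    h = [0.0] * HIDDEN
    for x in frames:
        h = rnn_step(enc["w_in"], enc["w_rec"], x, h)
    return h

def decode(dec, h, start_pose, n_frames):
    """Generate n_frames poses, feeding each predicted pose back as input."""
    pose = list(start_pose)
    poses = []
    for _ in range(n_frames):
        h = rnn_step(dec["w_in"], dec["w_rec"], pose, h)
        pose = matvec(dec["w_out"], h)
        poses.append(pose)
    return poses

rng = random.Random(0)

# Stand-in for a decoder trained on a motion modelling task (untrained here).
pretrained_decoder = make_decoder(rng)

# Transfer step (assumption): start the speech-to-gesture decoder from the
# motion modelling weights, then pair it with a fresh speech encoder.
speech_encoder = make_encoder(rng)
gesture_decoder = copy.deepcopy(pretrained_decoder)

# Toy input: 20 frames of made-up prosodic features.
speech = [[rng.random() for _ in range(FEAT_DIM)] for _ in range(20)]

context = encode(speech_encoder, speech)
gestures = decode(gesture_decoder, context, [0.0] * POSE_DIM, n_frames=5)
```

In a real system the weights would be trained rather than random, and fine-tuning on paired speech and motion data would follow the transfer step; the sketch only shows how a speech feature sequence is condensed into a state that seeds a short pose sequence.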