Publication | Closed Access
Synthesizing multimodal utterances for conversational agents
Citations: 243 | References: 17 | Year: 2004
Topics: Speech Kinematics, Education, Spoken Dialog System, Speech Recognition, Computational Linguistics, Conversational Agents, Multimodal Interaction, Speech Interface, Conversation Analysis, Gesture Processing, Hand Gestures, Gesture Studies, Health Sciences, Multimodal Utterances, Speech Synthesis, Gesture Synthesis, Incremental Production Model, Speech Output, Shape Specifications, Gesture Recognition, Speech Communication, Speech Acoustics, Speech Processing, Human-computer Interaction, Speech Perception, Linguistics
Abstract

Conversational agents are supposed to combine speech with non‐verbal modalities for intelligible multimodal utterances. In this paper, we focus on the generation of gesture and speech from XML‐based descriptions of their overt form. An incremental production model is presented that combines the synthesis of synchronized gestural, verbal, and facial behaviors with mechanisms for linking them in fluent utterances with natural co‐articulation and transition effects. In particular, an efficient kinematic approach for animating hand gestures from shape specifications is presented, which provides fine adaptation to temporal constraints that are imposed by cross‐modal synchrony. Copyright © 2004 John Wiley & Sons, Ltd.
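The abstract names two technical ingredients: gestures described by their overt form (shape specifications such as wrist positions) and a kinematic animation that adapts its timing to constraints imposed by the synchronized speech. As a rough illustration of the second ingredient only, below is a minimal Python sketch that retimes a keypoint-defined wrist trajectory so a gesture stroke completes at a speech-derived deadline. Everything in it (the minimum-jerk profile, the uniform time scaling, all function and parameter names) is an illustrative assumption, not the paper's actual production model.

```python
import numpy as np

def minimum_jerk(s):
    """Smooth 0->1 ease-in/ease-out profile, a common model of
    point-to-point hand movement (an assumption here, not taken
    from the paper)."""
    return 10 * s**3 - 15 * s**4 + 6 * s**5

def retime_gesture(phase_durations, stroke_end_target):
    """Scale the movement phases uniformly so the stroke ends exactly
    at stroke_end_target, e.g. the offset of the affiliated word as
    reported by the speech synthesizer (hypothetical interface).

    phase_durations  : preferred duration of each movement phase (n-1,)
    stroke_end_target: time (from gesture onset) the stroke must end by
    Returns absolute keypoint times of shape (n,).
    """
    preferred_total = float(sum(phase_durations))
    scale = stroke_end_target / preferred_total
    return np.concatenate([[0.0], np.cumsum(np.asarray(phase_durations) * scale)])

def sample_trajectory(keypoints, times, t):
    """Wrist position at time t: piecewise minimum-jerk interpolation
    between consecutive keypoints."""
    keypoints = np.asarray(keypoints, dtype=float)
    t = np.clip(t, times[0], times[-1])
    i = min(np.searchsorted(times, t, side="right") - 1, len(times) - 2)
    s = (t - times[i]) / (times[i + 1] - times[i])
    return keypoints[i] + minimum_jerk(s) * (keypoints[i + 1] - keypoints[i])

# Example: rest -> preparation -> stroke apex, retimed so the stroke
# finishes when the affiliated word ends at 0.9 s.
keypoints = [(0.0, 0.0, 0.0), (0.1, 0.3, 0.1), (0.4, 0.5, 0.3)]
times = retime_gesture(phase_durations=[0.4, 0.3], stroke_end_target=0.9)
print(sample_trajectory(keypoints, times, t=0.45))
```

Uniform scaling is the crudest way to satisfy such a cross-modal constraint; the paper's contribution is precisely a finer-grained treatment of this timing adaptation, including natural co-articulation and transition effects between successive gestures.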