Concepedia

TLDR

Gesture and speech jointly form a rich basis for human conversational interaction, grounded in the psycholinguistic idea that both modalities are generated as coequals from the same semantic intent. The study examines how gesture and speech interact to support communication, and proposes a framework for gesture research along with work on cross-modal cues for discourse segmentation in natural conversation. The authors conduct a gesture-speech elicitation experiment in which a subject describes her living space to an interlocutor. They run two independent analyses: one extracts segmentation cues from the video and audio data, and the other is an expert psycholinguistic transcription of the speech and gesture. Comparing the two identifies the cues that correlate well with the expert analysis. The study finds that handedness and the kind of symmetry in two-handed gestures provide effective supersegmental discourse cues.

Abstract

Gesture and speech combine to form a rich basis for human conversational interaction. To exploit these modalities in HCI, we need to understand the interplay between them and the way in which they support communication. We propose a framework for the gesture research done to date, and present our work on the cross-modal cues for discourse segmentation in free-form gesticulation accompanying speech in natural conversation as a new paradigm for such multimodal interaction. The basis for this integration is the psycholinguistic concept of the coequal generation of gesture and speech from the same semantic intent. We present a detailed case study of a gesture and speech elicitation experiment in which a subject describes her living space to an interlocutor. We perform two independent sets of analyses on the video and audio data: video and audio analysis to extract segmentation cues, and expert transcription of the speech and gesture data, produced by microanalyzing the videotape with a frame-accurate video player to correlate the speech with the gestural entities. We compare the results of both analyses to identify the cues accessible in the gestural and audio data that correlate well with the expert psycholinguistic analysis. We show that "handedness" and the kind of symmetry in two-handed gestures provide effective supersegmental discourse cues.
