Disentangling Chat with Local Coherence Models

Abstract

We evaluate several popular models of local discourse coherence for domain and task generality by applying them to chat disentanglement. Using experiments on synthetic multiparty conversations, we show that most models transfer well from text to dialogue. Coherence models improve results overall when good parses and topic models are available, and on a constrained task for real chat data.

One property of a well-written document is coherence, the way each sentence fits into its context: sentences should be interpretable in light of what has come before, and in turn make it possible to interpret what comes after. Models of coherence have primarily been used for text-based generation tasks: ordering units of text for multidocument summarization or inserting new text into an existing article. In general, the corpora used consist of informative writing, and the tasks used for evaluation consider different ways of reordering the same set of textual units. But the theoretical concept of coherence goes beyond both this domain and this task setting, and so should coherence models.

This paper evaluates a variety of local coherence models on the task of chat disentanglement or “threading”: separating a transcript of a multiparty interaction into independent conversations.¹ Such simultaneous conversations occur in internet chat rooms, and on shared voice channels such as push-to-talk radio. In these situations, a single, correctly

¹ A public implementation is available via
