Language model adaptation using latent Dirichlet allocation and an efficient topic inference algorithm

Year: 2007
Citations: 42
References: 8

Abstract

We present an effort to perform topic mixture-based language model adaptation using latent Dirichlet allocation (LDA). We use probabilistic latent semantic analysis (PLSA) to automatically cluster a heterogeneous training corpus, and train an LDA model using the resultant topic-document assignments. Using this LDA model, we then construct topic-specific corpora at the utterance level for interpolation with a background language model during language model adaptation. We also present a novel iterative algorithm for LDA topic inference. Very encouraging results were obtained in preliminary experiments with broadcast news in Mandarin Chinese.
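
The interpolation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify their inference algorithm, so the weight estimate here is a generic EM update with the topic word distributions held fixed, and unigram models stand in for full n-gram models. The function names (`infer_topic_weights`, `adapt`), the interpolation weight `lam`, and the probability floor are all assumptions made for illustration.

```python
from typing import Dict, List

Unigram = Dict[str, float]  # word -> probability (toy stand-in for an n-gram LM)


def infer_topic_weights(
    words: List[str],
    topic_models: List[Unigram],
    iterations: int = 20,
    floor: float = 1e-12,
) -> List[float]:
    """Estimate topic mixture weights for one utterance.

    Generic EM with fixed topic word distributions (an assumption,
    not the inference algorithm proposed in the paper).
    """
    k = len(topic_models)
    weights = [1.0 / k] * k  # start from a uniform mixture
    if not words:
        return weights
    for _ in range(iterations):
        expected = [0.0] * k
        for w in words:
            # E-step: posterior over topics for this word
            joint = [weights[t] * topic_models[t].get(w, floor) for t in range(k)]
            total = sum(joint)
            for t in range(k):
                expected[t] += joint[t] / total
        # M-step: renormalise expected counts into mixture weights
        weights = [c / len(words) for c in expected]
    return weights


def adapt(
    background: Unigram,
    topic_models: List[Unigram],
    weights: List[float],
    lam: float = 0.5,
) -> Unigram:
    """Linearly interpolate the background model with the topic mixture."""
    vocab = set(background)
    for model in topic_models:
        vocab.update(model)
    adapted = {}
    for w in vocab:
        mixture = sum(g * m.get(w, 0.0) for g, m in zip(weights, topic_models))
        adapted[w] = lam * background.get(w, 0.0) + (1.0 - lam) * mixture
    return adapted
```

With two toy topics (say, finance and sports), an utterance made up of finance words pulls the mixture weight toward the finance model, so the adapted model assigns those words more probability than the background model does. The value of `lam` would normally be tuned on held-out data.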
