Corpus-based techniques in the AT&T NextGen synthesis system.

Abstract

The AT&T text-to-speech (TTS) synthesis system has been used as a framework for experimenting with a perceptually guided, data-driven approach to speech synthesis, with primary focus on data-driven elements in the "back end". Statistical training techniques applied to a large corpus are used to make decisions about predicted speech events and selected speech inventory units. Our recent advances in automatic phonetic and prosodic labeling, together with new, faster implementations of the harmonic plus noise model (HNM) and of unit preselection, have significantly improved TTS quality and reduced both development time and runtime.
