
Abstract

The AT&T text-to-speech (TTS) synthesis system has been used as a framework for experimenting with a perceptually guided, data-driven approach to speech synthesis, with primary focus on data-driven elements in the "back end". Statistical training techniques applied to a large corpus are used to make decisions about predicted speech events and selected speech inventory units. Our recent advances in automatic phonetic and prosodic labeling, together with new, faster implementations of the harmonic plus noise model (HNM) and of unit preselection, have significantly improved TTS quality and reduced both development time and runtime.
