Publication | Closed Access
Probability based prosody model for unit selection
Year: 2004 | Citations: 12 | References: 9 | Venue: unknown
Topics: Engineering; Spoken Language Processing; Phonology; Acoustic Modeling; Speech Recognition; Natural Language Processing; Unit Selection; Phonetics; Computational Linguistics; Language Studies; Prosody Values; Speech Synthesis; Speech Output; Prosody (Linguistics); Text-to-speech; Speech Communication; Speech Technology; Speech Processing; Speech Perception; Prosody Model; Linguistics
Most modern text-to-speech (TTS) systems are based on unit selection. In such systems, the predicted prosody values for each synthesis unit, such as pitch, duration and energy, are important factors in selecting units. We present a probability-based prosody model in which the distribution of prosody values within a given context-equivalent cluster is described by a Gaussian mixture model (GMM), and the distance between a candidate unit and the context-equivalent cluster is defined by the GMM probability output. A novel framework for unit-selection TTS systems is derived from the model, and a series of experiments is carried out on the framework.
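The idea of scoring a candidate unit by the GMM probability of its prosody values can be illustrated with a minimal sketch. The abstract does not give the exact form of the distance, so the negative log-likelihood used here is an assumption, as are the diagonal covariances, the function names, and all numeric values (a made-up cluster over pitch, duration and energy).

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-density of vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        # log(weight) plus the sum of per-dimension Gaussian log-densities
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # log-sum-exp over components for numerical stability
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))


def prosody_distance(x, gmm):
    """Assumed distance: negative log-likelihood (lower means a better fit)."""
    weights, means, variances = gmm
    return -gmm_log_likelihood(x, weights, means, variances)


# Hypothetical context-equivalent cluster over (pitch Hz, duration ms, energy dB).
cluster_gmm = (
    [0.6, 0.4],                                    # mixture weights
    [[120.0, 80.0, 60.0], [180.0, 110.0, 65.0]],   # component means
    [[100.0, 225.0, 9.0], [400.0, 400.0, 16.0]],   # component variances
)

# Hypothetical candidate units with their measured prosody values.
candidates = {
    "unit_a": [122.0, 85.0, 61.0],
    "unit_b": [250.0, 40.0, 50.0],
}

# Pick the candidate whose prosody is most probable under the cluster's GMM.
best = min(candidates, key=lambda name: prosody_distance(candidates[name], cluster_gmm))
print(best)
```

In a full unit-selection system a score of this kind would typically be combined with a concatenation cost between adjacent units; how the framework in this paper combines them is not stated in the abstract.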