Publication | Closed Access
A probabilistic approach to unit selection for corpus-based speech synthesis
Citations: 19
References: 18
Year: 2005
Unknown Venue
Engineering, Machine Learning, Speech Corpus, Gaussian Mixtures, Spoken Language Processing, Corpus Linguistics, Speech Recognition, Natural Language Processing, Unit Selection, Phonetics, Computational Linguistics, Language Studies, Machine Translation, F0 Contour, Speech Synthesis, Speech Output, Computer Science, Text-to-speech, Speech Communication, Speech Technology, Speech Processing, Spectral Characteristics, Speech Perception, Linguistics
In this paper, we present a novel statistical approach to corpus-based speech synthesis. Unit selection is directed by probabilistic models of the F0 contour, duration, and spectral characteristics of the synthesis units. The F0 targets for units are modeled by statistical additive models, and the duration targets are modeled by regression trees. The spectral targets for a unit are modeled by Gaussian mixtures on MFCC-based features. The goodness of concatenation between two units is modeled by conditional Gaussian models on MFCC-based features. Although the system is at an early stage of development, we implemented an English speech synthesizer with the CMU Arctic corpora and confirmed the effectiveness of this new framework.
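The abstract names the models but not how their scores are combined, so the following is only a minimal sketch of how probabilistic target and concatenation scores could drive unit selection. It makes several assumptions that are not in the paper: a Viterbi search over candidates, scalar features standing in for MFCC-based vectors, a conditional Gaussian centred on the previous unit's end feature, and no F0 or duration terms. The names `gaussian_logpdf`, `gmm_logpdf`, and `select_units` are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def gmm_logpdf(x, components):
    """Log density of a 1-D Gaussian mixture.

    components is a list of (weight, mean, var) tuples. The log-sum-exp
    form avoids underflow when all component densities are tiny.
    """
    logs = [math.log(w) + gaussian_logpdf(x, m, v) for w, m, v in components]
    peak = max(logs)
    return peak + math.log(sum(math.exp(value - peak) for value in logs))


def select_units(candidates, target_gmms, join_var=1.0):
    """Return the unit ids on the path with the highest total log-probability.

    candidates[t]  : list of (unit_id, start_feat, mid_feat, end_feat)
    target_gmms[t] : mixture components for the spectral target at position t
    join_var       : variance of the assumed conditional Gaussian
                     p(next start | previous end), centred on the previous end
    """
    # best[t][j] = (score of the best path ending in candidate j, back-pointer)
    best = [[(gmm_logpdf(c[2], target_gmms[0]), None) for c in candidates[0]]]
    for t in range(1, len(candidates)):
        row = []
        for cand in candidates[t]:
            target = gmm_logpdf(cand[2], target_gmms[t])
            # Best predecessor: previous path score plus concatenation score.
            score, back = max(
                (best[t - 1][i][0] + gaussian_logpdf(cand[1], prev[3], join_var), i)
                for i, prev in enumerate(candidates[t - 1])
            )
            row.append((score + target, back))
        best.append(row)

    # Trace back from the best final candidate.
    j = max(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for t in range(len(candidates) - 1, -1, -1):
        path.append(candidates[t][j][0])
        j = best[t][j][1]
    return path[::-1]
```

Working in the log domain turns the product of target and concatenation probabilities into a sum, which is what lets the dynamic-programming search reuse partial path scores. A fuller version would add the F0 and duration log-probabilities to the per-unit target term.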