Publication | Closed Access
TD-PSOLA versus harmonic plus noise model in diphone based speech synthesis
Citations: 37 | References: 9 | Year: 2002
Engineering, Speech Representation, Speech Enhancement, Phonology, Speech Recognition, Speech Coding, Phonetics, Noise, Health Sciences, Speech Synthesis, Computer Engineering, Speech Output, Sound Synthesis, Noise Model, Text-to-speech, Speech Communication, Speech Technology, Td-psola Versus, Speech Processing, Database Compression, Speech Perception
In an effort to select a speech representation for our next-generation concatenative text-to-speech synthesizer, two candidates are investigated: TD-PSOLA and the harmonic plus noise model (HNM). A formal listening test was conducted in which the two candidates were rated for intelligibility, naturalness, and pleasantness. Their suitability for database compression and their computational load are also discussed. The results show that HNM consistently outperforms TD-PSOLA on all of these criteria except computational load. HNM allows high-quality speech synthesis without smoothing problems at the segmental boundaries and without the buzziness or other oddities observed with TD-PSOLA.