Publication | Closed Access
HNM-based MFCC+F0 extractor applied to statistical speech synthesis
Citations: 21
References: 23
Year: 2011
Venue: Unknown
Keywords: Engineering, Speech Signals, Accurate MFCC Extraction, Speech Recognition, Speech Coding, Phonetics, Statistical Speech Synthesis, Noise, Robust Speech Recognition, Voice Recognition, Health Sciences, Speech Synthesis, Speech Output, Distant Speech Recognition, Signal Processing, Speech Communication, Speech Processing, Speech Perception, Hidden Markov Models
Currently, the statistical framework based on Hidden Markov Models (HMMs) plays a relevant role in speech synthesis, while voice conversion systems based on Gaussian Mixture Models (GMMs) are almost standard. In both cases, statistical modeling is applied to learn distributions of acoustic vectors extracted from speech signals, each vector containing a suitable parametric representation of one speech frame. The overall performance of the systems is often limited by the accuracy of the underlying speech parameterization and reconstruction method. The method presented in this paper allows accurate MFCC extraction and high-quality reconstruction of speech signals assuming a Harmonics plus Noise Model (HNM). Its suitability for high-quality HMM-based speech synthesis is shown through subjective tests.