Publication | Closed Access
Multi-distribution deep belief network for speech synthesis
Citations: 105
References: 11
Year: 2013
Unknown Venue
Engineering, Machine Learning, Speech Recognition, Speech Community, Robust Speech Recognition, Voice Recognition, Deep Belief Network, Health Sciences, Speech Synthesis, Speech Output, Generative Models, Computer Science, Speech Parameters, Deep Learning, Distant Speech Recognition, Speech Communication, Multi-speaker Speech Recognition, Speech Processing, Speech Input, Speech Perception
The deep belief network (DBN) has been shown to be a good generative model in tasks such as handwritten digit image generation. Previous work on DBNs in the speech community has mainly focused on using a generatively pre-trained DBN to initialize a discriminative model for better acoustic modeling in speech recognition (SR). To fully utilize its generative nature, we propose to model the speech parameters, including spectrum and F0, simultaneously and to generate these parameters from the DBN for speech synthesis. Compared with the predominant HMM-based approach, objective evaluation shows that the spectrum generated from the DBN has less distortion. Subjective results also confirm the advantage of the spectrum from the DBN, and the overall quality is comparable to that of context-independent HMM.