Publication | Closed Access
Towards Achieving Robust Universal Neural Vocoding
Citations: 90
References: 26
Year: 2019
Unknown Venue
Engineering, Machine Learning, Speech Recognition, Sparse Neural Network, Phonetics, Robust Speech Recognition, Language Studies, WaveRNN-based Vocoder, Speech Synthesis, Speech Output, Computer Science, Deep Learning, Distant Speech Recognition, Speech Communication, Speech Technology, Potential Universality, Neural Vocoders, Speech Processing, Linguistics
This paper explores the potential universality of neural vocoders. We train a WaveRNN-based vocoder on 74 speakers from 17 languages. When the recording conditions are studio-quality, this vocoder is shown to generate speech of consistently good quality (98% relative mean MUSHRA when compared to natural speech), regardless of whether the input spectrogram comes from a speaker or style seen during training or from an out-of-domain scenario. When the recordings show significant changes in quality, or when moving towards non-speech vocalizations or singing, the vocoder still significantly outperforms speaker-dependent vocoders, but operates at a lower average relative MUSHRA of 75%. These results are shown to be consistent across languages, regardless of whether they were seen during training (e.g. English or Japanese) or unseen (e.g.