Publication | Closed Access
Synthesizing Audio with Generative Adversarial Networks
Citations: 104
References: 25
Year: 2018
Unknown Venue
Music, Artificial Intelligence, Speech Recognition, Machine Learning, Audio Synthesis, Engineering, Generative Adversarial Network, Generative Models, Speech Processing, Generative Adversarial Networks, Computer Science, Sound Synthesis, Generative AI, Bird Vocalizations, Deep Learning, Generative Model, Generative System, Synthetic Image Generation
While Generative Adversarial Networks (GANs) have seen wide success at the problem of synthesizing realistic images, they have seen little application to the problem of unsupervised audio generation. Unlike for images, a barrier to success is that the best discriminative representations for audio tend to be non-invertible, and thus cannot be used to synthesize listenable outputs. In this paper, we introduce WaveGAN, a first attempt at applying GANs to raw audio synthesis in an unsupervised setting. Our experiments on speech demonstrate that WaveGAN can produce intelligible words from a small vocabulary of human speech, as well as synthesize audio from other domains such as bird vocalizations, drums, and piano. Qualitatively, we find that human judges prefer the generated examples from WaveGAN over those from a method that naively applies GANs to image-like audio feature representations.