Singing Voice Synthesis Based on Generative Adversarial Networks

Abstract

This paper proposes a generative adversarial training method for deep neural network (DNN)-based singing voice synthesis. The DNN-based approach has been used in statistical parametric singing voice synthesis and has improved the naturalness of the synthesized singing voice [1]. Recently, generative adversarial networks (GANs) [2] have attracted significant attention in various machine learning research areas, including speech synthesis [3]. GANs have achieved great success in modeling the distributions of complex data, and they have the potential to alleviate the over-smoothing problem in the generated speech parameters in speech synthesis. In this paper, we propose a DNN-based singing voice synthesis system that incorporates a GAN. Experimental results show that the proposed method outperforms the conventional method in the naturalness of the synthesized singing voice.
