Publication | Open Access
Neural Discrete Representation Learning
Citations: 1.9K
References: 0
Year: 2017
Natural Language Processing, Geometric Learning, Vector Quantisation, Speech Recognition, Machine Learning, Data Science, Engineering, Computational Neuroscience, Useful Representations, Autoencoders, Sparse Neural Network, Generative Models, Speech Processing, Generative Model, Computer Science, Deep Learning, Generative System, Representation Learning
Learning useful representations without supervision remains a key challenge in machine learning. The authors propose a simple yet powerful generative model that learns discrete representations. Their Vector Quantised-Variational AutoEncoder (VQ-VAE) has an encoder that outputs discrete latent codes via vector quantisation, and its prior is learnt rather than fixed. This design avoids the posterior collapse often seen in VAEs, where the latents are ignored once they are paired with a powerful autoregressive decoder. When paired with an autoregressive prior, the VQ-VAE generates high-quality images, videos, and speech, performs speaker conversion, and learns phonemes without supervision, which the authors present as evidence of the utility of the learned representations.
Learning useful representations without supervision remains a key challenge in machine learning. In this paper, we propose a simple yet powerful generative model that learns such discrete representations. Our model, the Vector Quantised-Variational AutoEncoder (VQ-VAE), differs from VAEs in two key ways: the encoder network outputs discrete, rather than continuous, codes; and the prior is learnt rather than static. In order to learn a discrete latent representation, we incorporate ideas from vector quantisation (VQ). Using the VQ method allows the model to circumvent issues of "posterior collapse" -- where the latents are ignored when they are paired with a powerful autoregressive decoder -- typically observed in the VAE framework. Pairing these representations with an autoregressive prior, the model can generate high quality images, videos, and speech as well as doing high quality speaker conversion and unsupervised learning of phonemes, providing further evidence of the utility of the learnt representations.
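The vector quantisation step mentioned in the abstract can be illustrated with a small sketch. It assumes that quantisation means replacing each continuous encoder output with its nearest entry in a codebook under squared Euclidean distance; the abstract itself does not spell out the distance measure. The names `quantise`, `squared_distance`, `codebook`, and `encoder_outputs` are hypothetical and chosen for illustration only. The sketch covers the lookup alone and leaves out how the encoder, codebook, and prior are trained.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantise(
    encoder_outputs: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each continuous vector to its nearest codebook entry.

    Assumption: "nearest" means smallest squared Euclidean distance.
    Returns the discrete codes (codebook indices) and the quantised vectors.
    """
    codes: List[int] = []
    quantised: List[List[float]] = []
    for z in encoder_outputs:
        # Index of the codebook entry closest to this encoder output.
        k = min(range(len(codebook)), key=lambda i: squared_distance(z, codebook[i]))
        codes.append(k)
        quantised.append(list(codebook[k]))
    return codes, quantised


# Toy data (hypothetical): three 2-D codebook entries, two encoder outputs.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
encoder_outputs = [[0.9, 1.2], [-0.2, 0.1]]

codes, quantised = quantise(encoder_outputs, codebook)
print(codes)      # discrete latent codes, one index per encoder output
print(quantised)  # the codebook vectors a decoder would receive
```

In this toy setting the discrete codes are plain integer indices, which is what makes it possible to fit a separate autoregressive prior over them, as the abstract describes.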