MidiNet: A Convolutional Generative Adversarial Network for Symbolic-domain Music Generation

TLDR

Most neural music generation models rely on recurrent networks, but DeepMind’s WaveNet demonstrates that convolutional neural networks can also produce realistic audio waveforms. The study investigates using convolutional neural networks to generate symbolic‑domain melodies bar by bar and introduces a conditional mechanism that allows generation from scratch, following a chord sequence, or conditioned on prior bars. MidiNet is a GAN comprising a CNN generator and a discriminator; it incorporates the conditional mechanism for flexible melody generation and can be expanded to multiple MIDI channels. In a user study against Google’s MelodyRNN, MidiNet’s eight‑bar melodies were rated as comparably realistic and pleasant, and as much more interesting.

Abstract

Most existing neural network models for music generation use recurrent neural networks. However, the recent WaveNet model proposed by DeepMind shows that convolutional neural networks (CNNs) can also generate realistic musical waveforms in the audio domain. In this light, we investigate using CNNs to generate a melody (a series of MIDI notes) one bar after another in the symbolic domain. In addition to the generator, we use a discriminator to learn the distributions of melodies, making the model a generative adversarial network (GAN). Moreover, we propose a novel conditional mechanism to exploit available prior knowledge, so that the model can generate melodies from scratch, by following a chord sequence, or by conditioning on the melody of previous bars (e.g. a priming melody), among other possibilities. The resulting model, named MidiNet, can be expanded to generate music with multiple MIDI channels (i.e. tracks). We conduct a user study to compare eight-bar melodies generated by MidiNet and by Google's MelodyRNN models, each time using the same priming melody. The results show that MidiNet performs comparably with MelodyRNN models in being realistic and pleasant to listen to, yet MidiNet's melodies are reported to be much more interesting.
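
The bar-by-bar conditional generation described in the abstract can be illustrated with a minimal control-flow sketch. This is not the MidiNet implementation: `toy_generator` is a hypothetical hand-written stand-in for the trained CNN generator, no discriminator or GAN training is shown, and the constants (128 pitches, 16 time steps per bar, a 100-dimensional noise vector) as well as all function names are assumptions made for illustration. The sketch only shows how a noise vector, an optional chord, and the previous bar could jointly condition each new bar.

```python
import random

# Assumed dimensions for illustration; the abstract does not state them.
PITCHES = 128        # full MIDI pitch range
STEPS_PER_BAR = 16   # e.g. sixteenth-note resolution in 4/4
NOISE_DIM = 100      # length of the random input vector


def snap_to_chord(pitch, chord):
    """Return the pitch nearest to `pitch` whose pitch class is in `chord`."""
    candidates = [p for p in range(PITCHES) if p % 12 in chord]
    return min(candidates, key=lambda p: abs(p - pitch))


def toy_generator(noise, prev_bar=None, chord=None):
    """Hypothetical stand-in for a trained generator (NOT a CNN).

    Returns one bar as a STEPS_PER_BAR x PITCHES binary piano roll with
    exactly one active pitch per time step (a monophonic melody).
    """
    # Condition on the previous bar: continue from its last pitch.
    pitch = prev_bar[-1].index(1) if prev_bar is not None else 60
    bar = []
    for step in range(STEPS_PER_BAR):
        # The noise vector drives a small random walk over pitches.
        move = int(round(noise[step % len(noise)] * 2))
        pitch = min(PITCHES - 1, max(0, pitch + move))
        # Condition on the chord: keep only chord tones.
        if chord is not None:
            pitch = snap_to_chord(pitch, chord)
        row = [0] * PITCHES
        row[pitch] = 1
        bar.append(row)
    return bar


def generate_melody(n_bars, priming_bar=None, chords=None, seed=0):
    """Generate `n_bars` bars one after another.

    priming_bar: optional piano-roll bar to condition the first bar on.
    chords: optional list with one set of pitch classes per bar.
    With both left as None, the melody is generated from scratch.
    """
    rng = random.Random(seed)
    bars = []
    prev = priming_bar
    for i in range(n_bars):
        noise = [rng.gauss(0, 1) for _ in range(NOISE_DIM)]
        chord = chords[i] if chords is not None else None
        bar = toy_generator(noise, prev_bar=prev, chord=chord)
        bars.append(bar)
        prev = bar  # the new bar conditions the next one
    return bars


# Eight bars following a (hypothetical) C major chord throughout.
melody = generate_melody(8, chords=[{0, 4, 7}] * 8, seed=1)
```

Calling `generate_melody(8)` with no chords and no priming bar corresponds to generation from scratch, while passing `priming_bar` mirrors the priming-melody setting used in the user study.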
