Large Scale GAN Training for High Fidelity Natural Image Synthesis

TLDR

Despite recent progress in generative image modeling, generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. The authors train GANs at the largest scale yet attempted and study the instabilities specific to that scale. Applying orthogonal regularization to the generator makes it amenable to a truncation trick that trades off sample fidelity against variety. The resulting BigGANs set a new state of the art in class-conditional synthesis on ImageNet at 128×128, with an Inception Score of 166.5 and an FID of 7.4, improving over the previous best IS of 52.52 and FID of 18.6.

Abstract

Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal. To this end, we train Generative Adversarial Networks at the largest scale yet attempted, and study the instabilities specific to such scale. We find that applying orthogonal regularization to the generator renders it amenable to a simple truncation trick, allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input. Our modifications lead to models which set the new state of the art in class-conditional image synthesis. When trained on ImageNet at 128×128 resolution, our models (BigGANs) achieve an Inception Score (IS) of 166.5 and Fréchet Inception Distance (FID) of 7.4, improving over the previous best IS of 52.52 and FID of 18.6.
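
The truncation trick mentioned in the abstract can be illustrated with a minimal sketch. It assumes the generator's input z is drawn from a standard normal and that truncation is done by resampling any entry whose magnitude exceeds a threshold, which is one common way to reduce the variance of the input. The function name truncated_normal, the latent size of 128, and the threshold of 0.5 are illustrative choices, not values taken from the text above.

```python
import numpy as np


def truncated_normal(shape, threshold, rng):
    """Draw N(0, 1) samples, resampling entries with |z| > threshold.

    Hypothetical helper for illustration only. A smaller threshold
    gives lower-variance inputs (typically higher fidelity, less
    variety); a very large threshold leaves the samples unchanged.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    z = rng.standard_normal(shape)
    mask = np.abs(z) > threshold
    # Keep resampling only the out-of-range entries until none remain.
    while mask.any():
        z[mask] = rng.standard_normal(int(mask.sum()))
        mask = np.abs(z) > threshold
    return z


rng = np.random.default_rng(0)
z_full = rng.standard_normal((1000, 128))                        # untruncated input
z_trunc = truncated_normal((1000, 128), threshold=0.5, rng=rng)  # truncated input

print("std without truncation:", z_full.std())
print("std with truncation:   ", z_trunc.std())
```

Feeding z_trunc instead of z_full to a trained generator is where the trade-off appears: every entry stays within the threshold, so the input variance drops, at the cost of covering less of the latent space.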
