Publication | Closed Access
Spatial Fusion GAN for Image Synthesis
Citations: 168
References: 39
Year: 2019
Unknown Venue
Machine Vision, Image Analysis, Machine Learning, Spatial Fusion GAN, Engineering, Foreground Objects, Generative Adversarial Network, Places Foreground Objects, Image Synthesis, Synthesis Realism, Generative Models, Style Transfer, Generative AI, Deep Learning, Computer Vision, Synthetic Image Generation
GANs have advanced realistic image synthesis, yet most work focuses on either appearance or geometry, rarely both. The study introduces SF‑GAN, a model that fuses geometry and appearance synthesis to generate realistic images in both domains. SF‑GAN employs a geometry synthesizer that learns background context to place foreground objects and an appearance synthesizer that adjusts color, brightness, and style with a guided filter; the two modules reference each other and are trained end‑to‑end with minimal supervision, and the model is evaluated on scene‑text synthesis and realistic glasses/hats matching. Experiments show that SF‑GAN outperforms state‑of‑the‑art methods in both qualitative and quantitative evaluations.
Recent advances in generative adversarial networks (GANs) have shown great potential for realistic image synthesis, yet most existing work addresses synthesis realism in either appearance space or geometry space, and few address both. This paper presents Spatial Fusion GAN (SF-GAN), which combines a geometry synthesizer and an appearance synthesizer to achieve synthesis realism in both geometry and appearance spaces. The geometry synthesizer learns the contextual geometry of background images, then transforms and places foreground objects into them consistently. The appearance synthesizer adjusts the color, brightness, and style of the foreground objects and embeds them into the background images harmoniously, incorporating a guided filter to preserve detail. The two synthesizers are interconnected as mutual references and can be trained end-to-end with little supervision. SF-GAN has been evaluated on two tasks: (1) realistic scene text image synthesis for training better recognition models; and (2) glasses and hat wearing, which realistically matches glasses and hats with real portraits. Qualitative and quantitative comparisons with the state of the art demonstrate the superiority of the proposed SF-GAN.
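The abstract mentions a guided filter for detail preservation but does not say how it is wired into the appearance synthesizer. As a rough illustration only, the sketch below implements the classic single-channel guided filter (a local linear model between a guide image and a source image) in plain Python. The function names `box_mean`, `combine`, and `guided_filter`, the window radius `r`, and the regularizer `eps` are assumptions made for this example and are not taken from SF-GAN.

```python
def box_mean(img, r):
    """Mean over each pixel's (2r+1)x(2r+1) window, clipped at the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def combine(f, *imgs):
    """Apply f element-wise across same-sized images (lists of rows)."""
    return [[f(*vals) for vals in zip(*rows)] for rows in zip(*imgs)]


def guided_filter(guide, src, r=1, eps=1e-3):
    """Filter `src` so that its edges follow the structure of `guide`.

    Hypothetical helper for illustration: grayscale images as nested lists
    of floats. Larger `eps` smooths more; smaller `eps` keeps more detail.
    """
    mean_i = box_mean(guide, r)
    mean_p = box_mean(src, r)
    corr_ii = box_mean(combine(lambda i, j: i * j, guide, guide), r)
    corr_ip = box_mean(combine(lambda i, p: i * p, guide, src), r)

    # Per-window linear model q = a * I + b, with
    # a = cov(I, p) / (var(I) + eps) and b = mean(p) - a * mean(I).
    a = combine(
        lambda cii, cip, mi, mp: (cip - mi * mp) / (cii - mi * mi + eps),
        corr_ii, corr_ip, mean_i, mean_p,
    )
    b = combine(lambda a_, mi, mp: mp - a_ * mi, a, mean_i, mean_p)

    # Average the coefficients of all windows covering each pixel.
    mean_a = box_mean(a, r)
    mean_b = box_mean(b, r)
    return combine(lambda ma, mb, i: ma * i + mb, mean_a, mean_b, guide)
```

In a compositing setting, one plausible use (again an assumption, not the paper's stated design) is to pass the original foreground as `guide` and the appearance-adjusted foreground as `src`, so the adjusted colors are kept while fine edges from the original are restored.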