Learning to predict indoor illumination from a single image

TLDR

Previous work on indoor illumination estimation relies on specialized capture, user input, or simple scene models, which limits its applicability. This study proposes an automatic method to infer high dynamic range (HDR) illumination from a single low dynamic range (LDR) photo of an indoor scene. The approach first trains a lighting classifier to annotate light sources in a large set of LDR environment maps, then uses those annotations to train a neural network that predicts light locations from a limited field-of-view image, and finally fine-tunes the network on a small set of HDR environment maps to estimate light intensities. The resulting HDR illumination estimates outperform previous state-of-the-art methods and enable photo-realistic 3D object insertion, as validated by a perceptual user study.

Abstract

We propose an automatic method to infer high dynamic range illumination from a single, limited field-of-view, low dynamic range photograph of an indoor scene. In contrast to previous work that relies on specialized image capture, user input, and/or simple scene models, we train an end-to-end deep neural network that directly regresses a limited field-of-view photo to HDR illumination, without strong assumptions on scene geometry, material properties, or lighting. We show that this can be accomplished in a three-step process: 1) we train a robust lighting classifier to automatically annotate the location of light sources in a large dataset of LDR environment maps, 2) we use these annotations to train a deep neural network that predicts the location of lights in a scene from a single limited field-of-view photo, and 3) we fine-tune this network using a small dataset of HDR environment maps to predict light intensities. This allows us to automatically recover high-quality HDR illumination estimates that significantly outperform previous state-of-the-art methods. Consequently, using our illumination estimates for applications like 3D object insertion produces photo-realistic results that we validate via a perceptual user study.
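
The split between locating lights on LDR data and fitting intensities on HDR data can be motivated with a small numeric sketch. The code below is an illustration only, not the authors' method: the toy luminance grid, the function names `to_ldr` and `saturated_mask`, and the plain saturation threshold (a stand-in for the trained lighting classifier) are all assumptions, and gamma encoding is omitted for simplicity. It shows that clipping HDR radiance to an 8-bit range preserves where bright sources are, but not how bright they are.

```python
def to_ldr(hdr, exposure=1.0):
    """Clip linear HDR luminance to [0, 1] and quantize to 8 bits.

    Simplified camera model (assumption): no gamma curve, no noise.
    """
    return [
        [min(255, max(0, round(value * exposure * 255))) for value in row]
        for row in hdr
    ]


def saturated_mask(ldr, threshold=255):
    """Mark pixels at or above the threshold as candidate light sources.

    Hypothetical stand-in for a learned light classifier.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in ldr]


# Toy 3x4 "environment map" of linear luminance (made-up values).
# Two light sources differ in intensity by a factor of 12.5.
hdr = [
    [0.05, 0.10, 0.08, 0.06],
    [0.20, 50.0, 0.30, 4.0],
    [0.10, 0.15, 0.12, 0.09],
]

ldr = to_ldr(hdr)
mask = saturated_mask(ldr)

# Both lights are found at the right locations...
print(mask)
# ...but they saturate to the same LDR value, so their relative
# intensity cannot be read back from the LDR image alone.
print(ldr[1][1], ldr[1][3])
```

In this toy case both sources map to the same saturated LDR value, which is consistent with the abstract's design: light locations can be learned from plentiful LDR environment maps, while intensities require HDR data.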
