Make3D: Learning 3D Scene Structure from a Single Still Image

TLDR

This paper addresses the problem of estimating detailed 3D structure from a single still image of an unstructured environment. The goal is to produce 3D models that are both quantitatively accurate and visually pleasing. A Markov Random Field (MRF), trained by supervised learning, infers plane parameters for each small homogeneous image patch; these parameters capture the patch's 3D location and orientation. The MRF models both image depth cues and the relationships between patches. Because the only structural assumption is that the scene decomposes into many small planes, the method supports detailed 3D reconstruction and richer flythroughs. It yields qualitatively correct 3D models for 64.9% of 588 Internet images and has been extended to produce large-scale 3D models from a few images.

Abstract

We consider the problem of estimating detailed 3D structure from a single still image of an unstructured environment. Our goal is to create 3D models that are both quantitatively accurate and visually pleasing. For each small homogeneous patch in the image, we use a Markov Random Field (MRF) to infer a set of "plane parameters" that capture both the 3D location and 3D orientation of the patch. The MRF, trained via supervised learning, models both image depth cues and the relationships between different parts of the image. Other than assuming that the environment is made up of a number of small planes, our model makes no explicit assumptions about the structure of the scene; this enables the algorithm to capture much more detailed 3D structure than does prior art and also gives a much richer experience in the 3D flythroughs created using image-based rendering, even for scenes with significant nonvertical structure. Using this approach, we have created qualitatively correct 3D models for 64.9 percent of 588 images downloaded from the Internet. We have also extended our model to produce large-scale 3D models from a few images.
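
To make the notion of "plane parameters" concrete, here is a minimal sketch of how a single 3-vector can encode both the location and the orientation of a patch. It assumes a parameterization that the abstract does not spell out: the camera sits at the origin, and a patch's plane is a vector alpha such that every 3D point q on the plane satisfies alpha . q = 1. Under that assumption, the unit normal is alpha / |alpha|, the distance from the camera to the plane is 1 / |alpha|, and the depth along a unit viewing ray r is 1 / (r . alpha). The function names are hypothetical, and the sketch covers only the geometry, not the MRF inference or the learned depth cues.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return [x / norm for x in v]


def plane_normal_and_distance(alpha):
    """Split plane parameters into orientation and location.

    Assumes the plane is the set of points q with alpha . q = 1,
    so the normal is alpha/|alpha| and the camera-to-plane
    distance is 1/|alpha|.
    """
    mag = math.sqrt(sum(a * a for a in alpha))
    if mag == 0.0:
        raise ValueError("alpha must be non-zero")
    return [a / mag for a in alpha], 1.0 / mag


def depth_along_ray(alpha, ray):
    """Depth at which a viewing ray from the origin meets the plane."""
    r = _unit(ray)
    denom = sum(a * x for a, x in zip(alpha, r))
    if denom <= 0.0:
        raise ValueError("ray does not hit the plane in front of the camera")
    return 1.0 / denom


def backproject(alpha, ray):
    """3D point where the viewing ray meets the plane."""
    r = _unit(ray)
    d = depth_along_ray(alpha, ray)
    return [d * x for x in r]


# Hypothetical patch: a fronto-parallel plane two units in front of the camera.
alpha = [0.0, 0.0, 0.5]
normal, distance = plane_normal_and_distance(alpha)
point = backproject(alpha, [1.0, 0.0, 1.0])
```

Because every pixel in a patch shares one alpha, the depth of each pixel follows from its viewing ray alone, which is what lets a per-patch estimate yield a dense 3D model.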
