Publication | Closed Access
A Point Set Generation Network for 3D Object Reconstruction from a Single Image
Citations: 2.4K · References: 17 · Year: 2017 · Venue: unknown
Topics: Point Cloud Coordinates, Engineering, Machine Learning, Point Cloud Processing, Computer-aided Design, Point Cloud, 3D Computer Vision, Image Analysis, Differentiable Rendering, Data Science, Single Image, Computational Geometry, Geometric Modeling, Machine Vision, Computer Science, Medical Image Computing, Deep Learning, 3D Object Recognition, 3D Data Processing, Computer Vision, Deep Neural Networks, 3D Vision, Natural Sciences, Dense Reconstruction, Conditional Shape Sampler, 3D Reconstruction, Multi-view Geometry, Scene Modeling, Object Reconstruction
Deep neural networks are increasingly used to generate 3D data, yet most existing methods rely on regular representations such as voxel grids or image collections, which obscure the invariance of shapes under geometric transformations and suffer from other limitations; moreover, the ground-truth shape for a single input image can be ambiguous. This paper reconstructs 3D shapes from a single image by directly predicting point-cloud coordinates, addressing that output ambiguity with a novel architecture, loss function, and learning paradigm. The resulting model is a conditional shape sampler that predicts multiple plausible point-cloud reconstructions from one image. Experiments show the system outperforms state-of-the-art single-image 3D reconstruction methods, performs strongly on shape completion, and can produce multiple plausible predictions.
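The "conditional shape sampler" idea can be illustrated with a minimal sketch: condition a predictor on an image feature plus a random code, draw several samples, and penalize only the best one against the single ground-truth shape, so different codes can specialize to different plausible reconstructions. Everything here is illustrative (the `sampler` function and `set_distance` helper are stand-ins, not the paper's architecture or training code):

```python
import numpy as np

rng = np.random.default_rng(0)

def set_distance(p, q):
    # Symmetric nearest-neighbor (Chamfer-style) distance between
    # point sets p (N, 3) and q (M, 3), invariant to point ordering.
    d2 = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

def sampler(image_feature, noise):
    # Stand-in for a learned conditional sampler: maps an image feature
    # vector plus a random code to an (N, 3) point set. Purely illustrative.
    return np.tanh(np.outer(noise, image_feature)[:, :3])

def min_of_n_loss(image_feature, ground_truth, n_samples=4):
    # Draw several random codes and keep only the best sample's loss,
    # so the sampler is free to cover multiple plausible shapes.
    losses = [
        set_distance(
            sampler(image_feature, rng.standard_normal(ground_truth.shape[0])),
            ground_truth,
        )
        for _ in range(n_samples)
    ]
    return min(losses)
```

A usage sketch: `min_of_n_loss(np.ones(8), np.zeros((5, 3)))` returns the smallest of four sampled set distances; in training, gradients would flow only through that best sample.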
Generation of 3D data by deep neural networks has been attracting increasing attention in the research community. The majority of extant works resort to regular representations such as volumetric grids or collections of images; however, these representations obscure the natural invariance of 3D shapes under geometric transformations and also suffer from a number of other issues. In this paper we address the problem of 3D reconstruction from a single image, generating a straightforward form of output -- point cloud coordinates. Along with this problem arises a unique and interesting issue: the ground-truth shape for an input image may be ambiguous. Driven by this unorthodox output form and the inherent ambiguity in the ground truth, we design an architecture, loss function, and learning paradigm that are novel and effective. Our final solution is a conditional shape sampler, capable of predicting multiple plausible 3D point clouds from an input image. In experiments, not only does our system outperform state-of-the-art methods on single-image 3D reconstruction benchmarks, it also shows strong performance on 3D shape completion and a promising ability to make multiple plausible predictions.
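The "unorthodox output form" above, an unordered point set, requires a loss that is invariant to point ordering. One common choice for comparing point sets in this setting is the symmetric Chamfer distance, sketched here in plain NumPy (an illustrative sketch, not the authors' code):

```python
import numpy as np

def chamfer_distance(p, q):
    """Symmetric Chamfer distance between point sets p (N, 3) and q (M, 3).

    For each point in one set, find its nearest neighbor in the other
    set; sum the mean squared nearest-neighbor distances in both
    directions. The result does not depend on point ordering.
    """
    # Pairwise squared distances via broadcasting, shape (N, M).
    diff = p[:, None, :] - q[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Nearest-neighbor terms in both directions.
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()

# Identical sets have zero distance; a unit shift along y costs 1 + 1 = 2.
a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(chamfer_distance(a, a))  # 0.0
```

The O(N·M) pairwise matrix is fine for small point sets; large clouds would typically use a KD-tree or batched GPU nearest-neighbor search instead.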