Publication | Closed Access
FoldingNet: Point Cloud Auto-Encoder via Deep Grid Deformation
Citations: 1.3K
References: 54
Year: 2018
Venue: Unknown
Geometric Learning, Engineering, Machine Learning, Point Cloud Processing, Point Cloud, 3D Computer Vision, Image Analysis, Data Science, Pattern Recognition, Robot Learning, Computational Geometry, Arbitrary Point Cloud, Deep Grid Deformation, Machine Vision, Computer Science, Deep Learning, 3D Object Recognition, Computer Vision, Point Clouds
Recent deep networks that directly process point sets, such as PointNet, have set the state of the art for supervised point-cloud tasks like classification and segmentation. This work introduces an end-to-end deep auto-encoder designed to tackle unsupervised learning on point clouds. The encoder augments PointNet with a graph-based enhancement to capture local structure, while the decoder deforms a canonical 2D grid onto the 3D surface of the point cloud, yielding low reconstruction errors even for delicate shapes. The folding-based decoder uses only about 7% of the parameters of a fully-connected decoder yet produces a more discriminative representation that achieves higher linear SVM classification accuracy, and is theoretically capable of reconstructing any point cloud from a 2D grid. Code is available at http://www.merl.com/research/license#FoldingNet.
Recent deep networks that directly handle points in a point set, e.g., PointNet, have been state-of-the-art for supervised learning tasks on point clouds such as classification and segmentation. In this work, a novel end-to-end deep auto-encoder is proposed to address unsupervised learning challenges on point clouds. On the encoder side, a graph-based enhancement is applied on top of PointNet to promote local structures. Then, a novel folding-based decoder deforms a canonical 2D grid onto the underlying 3D object surface of a point cloud, achieving low reconstruction errors even for objects with delicate structures. The proposed decoder uses only about 7% of the parameters of a decoder built from fully-connected neural networks, yet leads to a more discriminative representation that achieves higher linear SVM classification accuracy than the benchmark. In addition, the proposed decoder structure is shown, in theory, to be a generic architecture that is able to reconstruct an arbitrary point cloud from a 2D grid. Our code is available at http://www.merl.com/research/license#FoldingNet.
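The folding idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released code: the two-step folding, the 512-dimensional codeword, the 45 x 45 grid, the hidden widths, and every function name (`make_grid`, `init_mlp`, `mlp`, `fold_decode`) are assumptions chosen for illustration. The weights are random and untrained, so the output only demonstrates the data flow (a shared codeword concatenated with each grid point and mapped to 3D), not a meaningful reconstruction.

```python
import numpy as np

def make_grid(side, lo=-1.0, hi=1.0):
    """Return a (side*side, 2) array of points on a regular 2D grid."""
    axis = np.linspace(lo, hi, side)
    u, v = np.meshgrid(axis, axis, indexing="ij")
    return np.stack([u.ravel(), v.ravel()], axis=1)

def init_mlp(rng, sizes):
    """Random weights for a small MLP; `sizes` lists the layer widths."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the MLP to each row of x, with ReLU on all but the last layer."""
    for i, (w, b) in enumerate(params):
        x = x @ w + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def fold_decode(codeword, grid, fold1, fold2):
    """Two folding steps: 2D grid -> intermediate 3D points -> final 3D points."""
    # Every grid point is paired with the same codeword (assumed design).
    tiled = np.tile(codeword, (grid.shape[0], 1))
    mid = mlp(fold1, np.concatenate([tiled, grid], axis=1))
    return mlp(fold2, np.concatenate([tiled, mid], axis=1))

rng = np.random.default_rng(0)
code_dim, hidden = 512, 512          # assumed sizes, for illustration only
grid = make_grid(45)                 # 45 * 45 = 2025 grid points (assumed)
fold1 = init_mlp(rng, [code_dim + 2, hidden, hidden, 3])
fold2 = init_mlp(rng, [code_dim + 3, hidden, hidden, 3])
codeword = rng.normal(size=code_dim) # stand-in for an encoder output
points = fold_decode(codeword, grid, fold1, fold2)
print(points.shape)                  # (2025, 3)
```

Because the same small perceptrons are shared across all grid points, the parameter count in this sketch does not grow with the number of output points, which is consistent with the abstract's claim that the folding decoder is much smaller than a fully-connected one.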