Concepedia

Publication | Closed Access

Deep Closest Point: Learning Representations for Point Cloud Registration

Citations: 970
References: 44
Year: 2019

Yue Wang, Justin Solomon

ICCV 2019

TLDR

Point cloud registration seeks a rigid transformation that aligns two point clouds, but iterative methods such as ICP often converge to spurious local optima. The authors propose Deep Closest Point (DCP), a learning-based method that combines a point-cloud embedding network, an attention-based pointer module that approximates combinatorial matching, and a differentiable SVD layer that extracts the final rigid transformation. Trained end-to-end on ModelNet40, DCP outperforms ICP, its variants (e.g., Go-ICP, FGR), and the learning-based method PointNetLK in several settings; the authors also evaluate how well the learned features transfer to unseen objects and analyze whether domain-specific and/or global features facilitate rigid registration.

Abstract

Point cloud registration is a key problem for computer vision applied to robotics, medical imaging, and other applications. This problem involves finding a rigid transformation from one point cloud into another so that they align. Iterative Closest Point (ICP) and its variants provide simple and easily-implemented iterative methods for this task, but these algorithms can converge to spurious local optima. To address local optima and other difficulties in the ICP pipeline, we propose a learning-based method, titled Deep Closest Point (DCP), inspired by recent techniques in computer vision and natural language processing. Our model consists of three parts: a point cloud embedding network, an attention-based module combined with a pointer generation layer to approximate combinatorial matching, and a differentiable singular value decomposition (SVD) layer to extract the final rigid transformation. We train our model end-to-end on the ModelNet40 dataset and show in several settings that it performs better than ICP, its variants (e.g., Go-ICP, FGR), and the recently-proposed learning-based method PointNetLK. Beyond providing a state-of-the-art registration technique, we evaluate the suitability of our learned features transferred to unseen objects. We also provide preliminary analysis of our learned model to help understand whether domain-specific and/or global features facilitate rigid registration.
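The SVD layer described in the abstract solves a weighted orthogonal Procrustes (Kabsch) problem: given soft correspondences between the two clouds, the optimal rotation and translation have a closed form via the SVD of a cross-covariance matrix. A minimal NumPy sketch of that closed-form step (function name and uniform-weight default are illustrative, not the paper's code):

```python
import numpy as np

def rigid_from_correspondences(X, Y, weights=None):
    """Closed-form rigid alignment mapping points X onto Y (Kabsch algorithm).

    X, Y: (N, 3) arrays of corresponding points.
    weights: optional (N,) soft-correspondence weights.
    Returns (R, t) with Y ~= X @ R.T + t.
    """
    if weights is None:
        weights = np.ones(len(X))
    w = weights / weights.sum()

    # Weighted centroids and centered clouds.
    cx = (w[:, None] * X).sum(axis=0)
    cy = (w[:, None] * Y).sum(axis=0)
    Xc, Yc = X - cx, Y - cy

    # 3x3 weighted cross-covariance, then its SVD.
    H = (w[:, None] * Xc).T @ Yc
    U, _, Vt = np.linalg.svd(H)

    # Flip the last singular vector if needed so R is a proper
    # rotation (det = +1), not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    t = cy - R @ cx
    return R, t
```

In DCP this step is differentiated through during training; frameworks with autograd-aware SVD (e.g., PyTorch) make the same closed form usable as a network layer.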
