Concepedia

Publication | Closed Access

PointNet++ Grasping: Learning An End-to-end Spatial Grasp Generation Algorithm from Sparse Point Clouds

Citations: 139

References: 17

Year: 2020

TLDR

Grasping novel objects in unstructured environments is important for robot manipulation, yet most existing methods rely on a costly grasp-sampling step combined with a local deep-learning feature extractor, which is especially slow when graspable points are sparse (for example, at the edge of a bowl). The study proposes an end-to-end method that directly predicts the poses, categories, and quality scores of all grasps. It feeds the whole sparse point cloud into a PointNet++-based network trained with a multi-mask loss and requires no sampling or search process. To produce multi-object training data, the authors propose a fast multi-object grasp detection algorithm based on the Ferrari-Canny metric, and generate a single-object dataset (79 YCB objects, 23.7k grasps) and a multi-object dataset (20k annotated point clouds with masks). The network's weights total only about 11.6M, a full prediction takes about 102 ms on a GeForce 840M GPU, and the method achieves a 71.43% success rate and a 91.60% completion rate, which the authors report as better than current state-of-the-art methods.

Abstract

Grasping novel objects is important for robot manipulation in unstructured environments. Most current works require a grasp sampling process to obtain grasp candidates, combined with a local feature extractor based on deep learning. This pipeline is time-consuming, especially when grasp points are sparse, such as at the edge of a bowl. In this paper, we propose an end-to-end approach that directly predicts the poses, categories and scores (qualities) of all the grasps. It takes the whole sparse point cloud as input and requires no sampling or search process. Moreover, to generate training data for multi-object scenes, we propose a fast multi-object grasp detection algorithm based on the Ferrari-Canny metric. A single-object dataset (79 objects from the YCB object set, 23.7k grasps) and a multi-object dataset (20k point clouds with annotations and masks) are generated. A PointNet++-based network combined with a multi-mask loss is introduced to deal with different training points. The whole weight size of our network is only about 11.6M, and a whole prediction process takes about 102 ms on a GeForce 840M GPU. Our experiments show that our work achieves a 71.43% success rate and a 91.60% completion rate, which is better than current state-of-the-art works.
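The abstract does not spell out the multi-mask loss, so the following is only a minimal sketch of the general idea: each per-point loss term is averaged over its own subset of points. The function names, the two masks (`has_label`, `is_graspable`), the per-point error inputs, and the weights are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Sequence, Tuple


def masked_mean(values: Sequence[float], mask: Sequence[int]) -> float:
    """Average `values` over entries whose mask is 1; return 0.0 if none are selected."""
    selected = [v for v, m in zip(values, mask) if m]
    return sum(selected) / len(selected) if selected else 0.0


def multi_mask_loss(
    score_err: Sequence[float],     # per-point grasp-quality error (hypothetical input)
    class_err: Sequence[float],     # per-point category error (hypothetical input)
    pose_err: Sequence[float],      # per-point grasp-pose error (hypothetical input)
    has_label: Sequence[int],       # assumed mask: 1 if the point is annotated at all
    is_graspable: Sequence[int],    # assumed mask: 1 if the point carries a valid grasp
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),  # assumed term weights
) -> float:
    """Combine three loss terms, each averaged only over the points its mask selects."""
    w_score, w_class, w_pose = weights
    return (
        w_score * masked_mean(score_err, has_label)
        + w_class * masked_mean(class_err, has_label)
        # Pose error is only meaningful where a grasp actually exists.
        + w_pose * masked_mean(pose_err, is_graspable)
    )


if __name__ == "__main__":
    total = multi_mask_loss(
        score_err=[0.1, 0.3, 0.5, 0.9],
        class_err=[0.2, 0.4, 0.6, 1.0],
        pose_err=[1.0, 2.0, 9.0, 9.0],
        has_label=[1, 1, 1, 0],
        is_graspable=[1, 1, 0, 0],
    )
    print(f"total loss: {total:.4f}")
```

In this toy call the last point is unlabeled and so contributes to no term, while the third point is labeled but has no valid grasp, so its large pose error is ignored. Separating the masks in this way is one plausible reading of "different training points".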
