Concepedia

Publication | Open Access

Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models

Citations: 27

References: 37

Year: 2023

Abstract

… control (DIAL): we utilize semi-supervised language labels to propagate CLIP's semantic knowledge onto large datasets of unlabeled demonstration data, from which we then train language-conditioned policies. This method enables cheaper acquisition of useful language descriptions compared to expensive human labels, allowing for more efficient label coverage of large-scale datasets. We apply DIAL to a challenging real-world robotic manipulation domain where only 3.5% of the 80,000 demonstrations contain crowd-sourced language annotations. Through a large-scale study of over 1,300 real-world evaluations, we find that DIAL enables imitation learning policies to acquire new capabilities and generalize to 60 novel instructions unseen in the original dataset.

Figure, Step 4: Run trained policy on unseen instructions, for example "Lift up the green chip bag from the bowl and drop it at the bottom left corner of the table".
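The relabeling idea in the abstract can be illustrated with a minimal sketch. It assumes that each demonstration and each candidate instruction has already been embedded into a shared space by a CLIP-like model, and that an unlabeled demonstration receives the k candidate instructions with the highest cosine similarity. The abstract does not state the selection rule, so the top-k rule, the function names (`cosine`, `augment_instructions`), and the toy vectors are all hypothetical and not the authors' implementation. The last lines work out how many demonstrations the quoted 3.5% corresponds to.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def augment_instructions(episode_embeddings, instruction_embeddings, k=1):
    """Assign each unlabeled episode its k most similar candidate instructions.

    Both arguments map an identifier to a precomputed embedding. In practice
    these would come from a CLIP-like model; here they are plain lists.
    """
    labels = {}
    for ep_id, ep_vec in episode_embeddings.items():
        ranked = sorted(
            instruction_embeddings,
            key=lambda text: cosine(ep_vec, instruction_embeddings[text]),
            reverse=True,
        )
        labels[ep_id] = ranked[:k]
    return labels


# Toy embeddings, made up purely for illustration.
episodes = {
    "ep_0": [1.0, 0.1, 0.0],
    "ep_1": [0.0, 0.2, 1.0],
}
candidates = {
    "pick up the chip bag": [0.9, 0.0, 0.1],
    "open the drawer": [0.1, 0.1, 0.9],
}

labels = augment_instructions(episodes, candidates, k=1)
print(labels)

# 3.5% of 80,000 demonstrations carry crowd-sourced annotations.
labeled_count = round(0.035 * 80_000)
print(labeled_count)  # 2800
```

Under these assumptions, the augmented labels would be merged with the human-annotated demonstrations before training a language-conditioned policy; the training step itself is outside the scope of this sketch.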
