Publication | Open Access
Robotic Skill Acquisition via Instruction Augmentation with Vision-Language Models
Citations: 27
References: 37
Year: 2023
Venue: Unknown
Keywords: Artificial Intelligence, Engineering, Machine Learning, Instruction Augmentation, Cognitive Robotics, Language Learning, Language Processing, Natural Language Processing, Multimodal LLM, Data Science, Language Acquisition, Robot Learning, Language Studies, Large AI Model, Imitation Learning, Robotics, Vision Robotics, Computer Science, LLM-based Agent
Figure (Step 4: Run trained policy on unseen instructions): "Lift up the green chip bag from the bowl and drop it at the bottom left corner of the table"

… control (DIAL): we utilize semi-supervised language labels to propagate CLIP's semantic knowledge onto large datasets of unlabeled demonstration data, from which we then train language-conditioned policies. This method enables cheaper acquisition of useful language descriptions compared to expensive human labels, allowing for more efficient label coverage of large-scale datasets. We apply DIAL to a challenging real-world robotic manipulation domain where only 3.5% of the 80,000 demonstrations contain crowd-sourced language annotations. Through a large-scale study of over 1,300 real-world evaluations, we find that DIAL enables imitation learning policies to acquire new capabilities and generalize to 60 novel instructions unseen in the original dataset.
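The relabeling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the similarity threshold, and the toy 3-dimensional vectors are all hypothetical stand-ins. In a real pipeline the vectors would come from a vision-language model such as CLIP (the abstract describes semi-supervised labels, which suggests the model is first adapted on the small annotated subset). The sketch only shows the propagation step: each unlabeled episode receives the candidate instructions whose embeddings are most similar to its own.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def augment_instructions(episode_embeddings, candidate_embeddings,
                         top_k=2, min_similarity=0.5):
    """Assign up to top_k candidate instructions to each unlabeled episode.

    Both arguments are dicts mapping an id or instruction text to an
    embedding vector. The threshold and top_k values are illustrative
    assumptions, not values taken from the paper.
    """
    labels = {}
    for episode_id, episode_vec in episode_embeddings.items():
        # Score every candidate instruction against this episode.
        scored = [(cosine(episode_vec, vec), text)
                  for text, vec in candidate_embeddings.items()]
        # Highest similarity first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Keep only sufficiently similar candidates, at most top_k of them.
        labels[episode_id] = [text for score, text in scored[:top_k]
                              if score >= min_similarity]
    return labels


# Toy embeddings standing in for vision-language model outputs (hypothetical).
candidates = {
    "pick up the green chip bag": [1.0, 0.0, 0.0],
    "lift the chip bag from the bowl": [0.8, 0.0, 0.6],
    "open the top drawer": [0.0, 1.0, 0.0],
}
episodes = {
    "episode_0": [0.9, 0.0, 0.2],
    "episode_1": [0.1, 0.9, 0.0],
}

augmented = augment_instructions(episodes, candidates)
for episode_id, instructions in augmented.items():
    print(episode_id, instructions)
```

A language-conditioned policy would then be trained on the original annotations together with these propagated labels. Keeping several instructions per episode, rather than a single best match, is one way to broaden label coverage; whether and how to threshold is a design choice this sketch leaves open.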