Tell me Dave: Context-sensitive grounding of natural language to manipulation instructions

TLDR

Robots must interpret natural language commands, and how a task is executed varies with the environment. This work addresses performing sequences of mobile manipulation tasks from natural language instructions. We collected a dataset of free-form natural language task descriptions paired with grounded task logs from an online robot simulator, built a verb–environment instruction library, and proposed a model, based on an energy function in a form isomorphic to a conditional random field, that accounts for language variation, grounding ambiguity, incomplete or noisy instructions, environment context, and task constraints. The model achieved 61.8% accuracy on simulator tasks, outperforming the state of the art, and a grounded instruction sequence was demonstrated on a PR2 robot using Learning from Demonstration.

Abstract

It is important for a robot to be able to interpret natural language commands given by a human. In this paper, we consider performing a sequence of mobile manipulation tasks with instructions described in natural language. Given a new environment, even a simple task such as boiling water would be performed quite differently depending on the presence, location and state of the objects. We start by collecting a dataset of task descriptions in free-form natural language and the corresponding grounded task logs of the tasks performed in an online robot simulator. We then build a library of verb–environment instructions that represents the possible instructions for each verb in that environment; these may or may not be valid for a different environment and task context. We present a model that takes into account the variations in natural language and the ambiguities in grounding them to robotic instructions with appropriate environment context and task constraints. Our model also handles incomplete or noisy natural language instructions. It is based on an energy function that encodes such properties in a form isomorphic to a conditional random field. We evaluate our model on tasks given in a robotic simulator and show that it outperforms the state of the art with 61.8% accuracy. We also demonstrate a grounded robotic instruction sequence on a PR2 robot using the Learning from Demonstration approach.
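
To make the idea of an energy function over candidate instructions concrete, here is a minimal sketch in Python. It is not the paper's model. It assumes a simple chain structure in which each verb has a few candidate instructions drawn from a library, a unary energy scores how well a candidate fits the verb and environment, and a pairwise energy penalises consecutive candidates that violate a task constraint. All names (`LIBRARY`, `UNARY`, `PAIRWISE`, `min_energy_sequence`) and all energy values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical library: candidate instructions available for each verb
# in one particular environment (assumed, not taken from the paper).
LIBRARY: Dict[str, List[str]] = {
    "fill": ["fill_pot_from_tap", "fill_cup_from_tap"],
    "heat": ["heat_on_stove", "heat_in_microwave"],
}

# Hypothetical unary energies: lower means a better fit of the candidate
# to the verb in this environment.
UNARY: Dict[Tuple[str, str], float] = {
    ("fill", "fill_pot_from_tap"): 0.2,
    ("fill", "fill_cup_from_tap"): 0.5,
    ("heat", "heat_on_stove"): 0.4,
    ("heat", "heat_in_microwave"): 0.3,
}

# Hypothetical pairwise energies: a high value marks a constraint
# violation between consecutive instructions (e.g. pot in a microwave).
PAIRWISE: Dict[Tuple[str, str], float] = {
    ("fill_pot_from_tap", "heat_on_stove"): 0.0,
    ("fill_pot_from_tap", "heat_in_microwave"): 1.0,
    ("fill_cup_from_tap", "heat_on_stove"): 1.0,
    ("fill_cup_from_tap", "heat_in_microwave"): 0.0,
}


def min_energy_sequence(verbs: List[str]) -> Tuple[List[str], float]:
    """Find the lowest-energy instruction sequence by dynamic programming.

    This is the standard Viterbi recursion on a chain, minimising the sum
    of unary and pairwise energies.
    """
    # best[c] = (energy, path) of the cheapest sequence ending in candidate c
    best = {c: (UNARY[(verbs[0], c)], [c]) for c in LIBRARY[verbs[0]]}
    for verb in verbs[1:]:
        new_best = {}
        for cand in LIBRARY[verb]:
            # Pick the predecessor that minimises the accumulated energy.
            prev = min(best, key=lambda p: best[p][0] + PAIRWISE[(p, cand)])
            prev_energy, prev_path = best[prev]
            total = prev_energy + PAIRWISE[(prev, cand)] + UNARY[(verb, cand)]
            new_best[cand] = (total, prev_path + [cand])
        best = new_best
    energy, path = min(best.values(), key=lambda item: item[0])
    return path, energy


path, energy = min_energy_sequence(["fill", "heat"])
print(path, round(energy, 3))
```

In this toy setup, choosing the lowest unary energy for each verb independently would pair the pot with the microwave. Minimising the joint energy avoids that by taking the pairwise constraint into account, which is the kind of context sensitivity the abstract describes.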
