Understanding Natural Language Commands for Robotic Navigation and Mobile Manipulation

TLDR

This paper presents Generalized Grounding Graphs, a model for understanding natural language commands given to autonomous systems that perform navigation and mobile manipulation in semi-structured environments. Unlike previous approaches, which use models with fixed structure to infer the likelihood of a sequence of actions given the environment and the command, the framework dynamically constructs a probabilistic graphical model from each command's hierarchical and compositional semantic structure. The model is trained on a crowdsourced corpus of commands paired with robot actions. It is evaluated by inferring plans from commands, executing them in a realistic robot simulator, and asking users to assess the results. The system infers and executes plans for commands such as "Put the tire pallet on the truck" and successfully follows many commands from the corpus.

Abstract

This paper describes a new model for understanding natural language commands given to autonomous systems that perform navigation and mobile manipulation in semi-structured environments. Previous approaches have used models with fixed structure to infer the likelihood of a sequence of actions given the environment and the command. In contrast, our framework, called Generalized Grounding Graphs, dynamically instantiates a probabilistic graphical model for a particular natural language command according to the command's hierarchical and compositional semantic structure. Our system performs inference in the model to find and execute plans corresponding to natural language commands such as "Put the tire pallet on the truck." The model is trained on a corpus of commands collected through crowdsourcing: we pair each command with robot actions and use the corpus to learn the parameters of the model. We evaluate the robot's performance by inferring plans from natural language commands, executing each plan in a realistic robot simulator, and asking users to evaluate the system's performance. We demonstrate that our system can successfully follow many natural language commands from the corpus.
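
To make the idea of dynamically instantiating a model from a command's structure more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hand-written hierarchical parse of the example command and emits one factor per phrase, connecting that phrase's grounding variable to the grounding variables of its child phrases. The names `Phrase`, `Factor`, and `build_factors`, the variable naming scheme, and the particular parse are all hypothetical; the abstract does not specify the factor form, the parser, or how parameters are learned.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phrase:
    """A node in a hypothetical hierarchical parse of a command."""
    text: str
    children: List["Phrase"] = field(default_factory=list)


@dataclass
class Factor:
    """One factor linking a phrase's grounding variable to its children's."""
    phrase: str
    grounding_var: str
    child_vars: List[str]


def build_factors(root: Phrase) -> List[Factor]:
    """Walk the parse tree and emit one factor per phrase, in pre-order."""
    factors: List[Factor] = []
    counter = 0

    def visit(node: Phrase) -> str:
        nonlocal counter
        var = f"g{counter}"  # assumed naming scheme for grounding variables
        counter += 1
        factor = Factor(node.text, var, [])
        factors.append(factor)
        for child in node.children:
            factor.child_vars.append(visit(child))
        return var

    visit(root)
    return factors


# Assumed parse of the example command; a real system would obtain this from a parser.
command = Phrase("Put", [
    Phrase("the tire pallet"),
    Phrase("on", [Phrase("the truck")]),
])

for f in build_factors(command):
    print(f.grounding_var, repr(f.phrase), "->", f.child_vars)
```

Because the set of factors is derived from the parse tree, a command with a different structure yields a graph with a different shape, which is the contrast with fixed-structure models drawn in the abstract. Inference would then search over assignments to the grounding variables, a step this sketch leaves out.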
