AAAI Best Demonstration Award for developing an AI agent that learns tasks from natural language instructions
CSE PhD candidate Aaron Mininger and John L. Tishman Professor John Laird have received the Best Demonstration Award at the annual conference of the Association for the Advancement of Artificial Intelligence (AAAI) for their demonstration, “A Demonstration of Compositional, Hierarchical Interactive Task Learning.”
The demonstration showed how a learning agent built with the Soar cognitive architecture could learn new tasks in a single session from situated natural language instruction. Called Rosie, the agent has been implemented on multiple real and simulated robots and taught a wide range of household and office tasks, as well as over 60 games and puzzles.
At AAAI, Mininger’s video showed off Rosie’s capabilities with a simulated scenario in which the agent learns how to patrol a barracks environment. The agent is given knowledge of rooms, objects, people, and basic navigation up front, but has no specific knowledge of how to order and perform the tasks involved in patrolling.
To teach it, Mininger simply describes the series of steps Rosie should take to perform a certain task, using plain English. As the agent is guided through a task, it creates a declarative task network behind the scenes that connects the task’s arguments and goals with subtasks. It then compiles the task knowledge into procedural rules through a mechanism called chunking.
“These rules are then used to efficiently execute the task,” Mininger explains, “and are the key to the agent’s ability to generalize from a single example.”
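The idea of turning a taught, declarative task description into something directly executable can be illustrated with a minimal Python sketch. All names here (`TaskNode`, `compile_task`, the primitive actions) are hypothetical illustrations, not Soar’s actual API, and a Python closure only loosely stands in for chunking’s compilation of deliberate steps into procedural rules:

```python
from dataclasses import dataclass, field

# Hypothetical sketch: a declarative task network node connects a task's
# arguments and goal to an ordered list of subtasks, loosely mirroring
# the structure described above (names are illustrative, not Soar's).
@dataclass
class TaskNode:
    name: str
    arguments: list
    goal: str
    subtasks: list = field(default_factory=list)

def compile_task(node, primitives):
    """'Compile' the declarative structure into one callable, analogous
    in spirit to chunking collapsing interpreted steps into fast rules."""
    steps = [compile_task(s, primitives) if isinstance(s, TaskNode)
             else primitives[s]
             for s in node.subtasks]
    def run(env):
        for step in steps:      # execute each subtask in order
            step(env)
        env["achieved"].add(node.goal)
    return run

# Usage: "teach" a patrol step once as a sequence of primitive actions.
log = []
primitives = {
    "goto-room": lambda env: log.append("goto-room"),
    "scan-room": lambda env: log.append("scan-room"),
}
patrol_step = TaskNode("check-room", ["room"], "room-checked",
                       subtasks=["goto-room", "scan-room"])
run_check = compile_task(patrol_step, primitives)
run_check({"achieved": set()})
```

Once compiled, the task runs without re-interpreting the declarative network, which is the property the quote above attributes to the learned rules.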
The ability to generalize is an important feature of Rosie, Mininger says. Given an initial explanation, the agent isn’t limited in its understanding to the specific example used for instruction. If told how to report a fire or other emergency, for example, Rosie can then trigger that set of tasks from any part of the environment it is situated in.
After instruction, Rosie has a general task structure that includes arguments, subtasks, and a directed graph of subgoals that represents the control flow of the task. The subtasks in this structure can be conditional or looping as needed, and include perceptual, communicative, and mental actions. Rosie can, for example, receive perceptual input at a certain place, remember that location for later, and then describe it to a person.
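The directed graph of subgoals with conditional and looping control flow can be sketched as a small guarded-edge graph. This is a minimal illustration of the concept, assuming hypothetical subgoal names and conditions; it is not Rosie’s internal representation:

```python
# Hypothetical sketch: control flow as a directed graph of subgoals.
# Each node names a subtask; each outgoing edge carries a condition on
# the agent's state, which yields branches and loops naturally.
patrol_flow = {
    "start":          [("goto-next-room", lambda s: s["rooms_left"] > 0),
                       ("report-done",    lambda s: s["rooms_left"] == 0)],
    "goto-next-room": [("scan-room",      lambda s: True)],
    "scan-room":      [("start",          lambda s: True)],  # loop back
    "report-done":    [],                                    # terminal
}

def execute(flow, state, actions, node="start", max_steps=50):
    """Walk the graph, following the first edge whose condition holds."""
    trace = []
    for _ in range(max_steps):
        edges = flow[node]
        if not edges:           # terminal subgoal reached
            break
        for nxt, cond in edges:
            if cond(state):
                actions.get(nxt, lambda s: None)(state)  # run subtask
                trace.append(nxt)
                node = nxt
                break
    return trace

# Usage: patrol two rooms, then report completion.
state = {"rooms_left": 2}
actions = {"scan-room": lambda s: s.update(rooms_left=s["rooms_left"] - 1)}
trace = execute(patrol_flow, state, actions)
```

The loop over rooms and the branch to `report-done` both fall out of the same edge-condition mechanism, which is why one structure can express both conditional and repeated subtasks.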
See some of the tasks Rosie has been trained on over the years, and view Mininger’s demonstration: