Pat Langley
School of Computing and Informatics, Arizona State University
Tempe, Arizona USA
Mental Simulation and Learning in the ICARUS Architecture
Thanks to D. Choi, G. Cleveland, A. Danielescu, N. Li, D. and D. Stracuzzi for their contributions. This talk reports research partly funded by a grant from the Office of Naval Research, which is not responsible for its contents.
Cognitive Architectures
A cognitive architecture (Newell, 1990) is the infrastructure for an intelligent system that is constant across domains:
the memories that store domain-specific content
the system’s representation and organization of knowledge
the mechanisms that use this knowledge in performance
the processes that learn this knowledge from experience
An architecture typically comes with a programming language that eases construction of knowledge-based systems.
Research in this area incorporates many ideas from psychology about the nature of human thinking.
The ICARUS Architecture
ICARUS (Langley, 2006) is a computational theory of the human cognitive architecture that posits:
1. Short-term memories are distinct from long-term stores
2. Memories contain modular elements cast as symbolic structures
3. Long-term structures are accessed through pattern matching
4. Cognition occurs in retrieval/selection/action cycles
5. Learning involves monotonic addition of elements to memory
6. Learning is incremental and interleaved with performance
It shares these assumptions with other cognitive architectures like Soar (Laird et al., 1987) and ACT-R (Anderson, 1993).
Goals for ICARUS
Our main objectives in developing ICARUS are to produce:
a computational theory of higher-level cognition in humans
that is qualitatively consistent with results from psychology
that exhibits as many distinct cognitive functions as possible
Although quantitative fits to specific results are desirable, they can distract from achieving broad theoretical coverage.
Distinctive Features of ICARUS
However, ICARUS also makes assumptions that distinguish it from these architectures:
1. Cognition is grounded in perception and action
2. Categories and skills are separate cognitive entities
3. Short-term elements are instances of long-term structures
4. Skills and concepts are organized in a hierarchical manner
5. Inference and execution are more basic than problem solving
Some of these tenets also appear in Bonasso et al.’s (2003) 3T, Freed’s (1998) APEX, and Sun et al.’s (2001) CLARION.
Cascaded Integration in ICARUS
Like other unified cognitive architectures, ICARUS incorporates a number of distinct modules.
ICARUS adopts a cascaded approach to integration in which lower-level modules produce results for higher-level ones.
conceptual inference
skill execution
problem solving
learning
Structure and Use of Conceptual Memory
ICARUS organizes conceptual memory in a hierarchical manner.
Conceptual inference occurs from the bottom up, starting from percepts to produce high-level beliefs about the current state.
ICARUS Concepts for In-City Driving
((in-rightmost-lane ?self ?clane)
 :percepts  ((self ?self) (segment ?seg)
             (line ?clane segment ?seg))
 :relations ((driving-well-in-segment ?self ?seg ?clane)
             (last-lane ?clane)
             (not (lane-to-right ?clane ?anylane))))

((driving-well-in-segment ?self ?seg ?lane)
 :percepts  ((self ?self) (segment ?seg) (line ?lane segment ?seg))
 :relations ((in-segment ?self ?seg)
             (in-lane ?self ?lane)
             (aligned-with-lane-in-segment ?self ?seg ?lane)
             (centered-in-lane ?self ?seg ?lane)
             (steering-wheel-straight ?self)))

((in-lane ?self ?lane)
 :percepts ((self ?self segment ?seg)
            (line ?lane segment ?seg dist ?dist))
 :tests    ((> ?dist -10) (<= ?dist 0)))
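The bottom-up step from percepts to beliefs can be illustrated with a minimal Python sketch of the in-lane concept above. The dictionary encoding of percepts and the function name infer_in_lane are assumptions made for illustration; they are not part of ICARUS, which matches concept definitions like those shown against its perceptual buffer.

```python
def infer_in_lane(percepts):
    """Derive (in-lane ?self ?lane) beliefs from percepts.

    Mirrors the :tests field of the concept: the line's signed distance
    must satisfy -10 < dist <= 0 and both percepts must share a segment.
    """
    selves = [p for p in percepts if p["type"] == "self"]
    lines = [p for p in percepts if p["type"] == "line"]
    beliefs = []
    for agent in selves:
        for line in lines:
            same_segment = line["segment"] == agent["segment"]
            if same_segment and -10 < line["dist"] <= 0:
                beliefs.append(("in-lane", agent["name"], line["name"]))
    return beliefs


# Hypothetical percepts, reusing object names from the belief listing.
percepts = [
    {"type": "self", "name": "me", "segment": "g550"},
    {"type": "line", "name": "g599", "segment": "g550", "dist": -3.0},
    {"type": "line", "name": "g601", "segment": "g550", "dist": 7.0},
]
print(infer_in_lane(percepts))
```

Higher-level concepts such as driving-well-in-segment would then match against beliefs produced this way, which is what makes the inference hierarchical.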
Representing Short-Term Beliefs/Goals
(current-street me A)              (current-segment me g550)
(lane-to-right g599 g601)          (first-lane g599)
(last-lane g599)                   (last-lane g601)
(at-speed-for-u-turn me)           (slow-for-right-turn me)
(steering-wheel-not-straight me)   (centered-in-lane me g550 g599)
(in-lane me g599)                  (in-segment me g550)
(on-right-side-in-segment me)      (intersection-behind g550 g522)
(building-on-left g288)            (building-on-left g425)
(building-on-left g427)            (building-on-left g429)
(building-on-left g431)            (building-on-left g433)
(building-on-right g287)           (building-on-right g279)
(increasing-direction me)          (buildings-on-right g287 g279)
Skill Execution in ICARUS
Skill execution occurs from the top down, starting from goals to find applicable paths through the skill hierarchy.
This process repeats on each cycle to produce goal-directed but reactive behavior, biased toward continuing initiated skills.
ICARUS Skills for In-City Driving
((in-rightmost-lane ?self ?line)
 :percepts ((self ?self) (line ?line))
 :start    ((last-lane ?line))
 :subgoals ((driving-well-in-segment ?self ?seg ?line)))

((driving-well-in-segment ?self ?seg ?line)
 :percepts ((segment ?seg) (line ?line) (self ?self))
 :start    ((steering-wheel-straight ?self))
 :subgoals ((in-segment ?self ?seg)
            (centered-in-lane ?self ?seg ?line)
            (aligned-with-lane-in-segment ?self ?seg ?line)
            (steering-wheel-straight ?self)))

((in-segment ?self ?endsg)
 :percepts ((self ?self speed ?speed)
            (intersection ?int cross ?cross)
            (segment ?endsg street ?cross angle ?angle))
 :start    ((in-intersection-for-right-turn ?self ?int))
 :actions  ((steer 1)))
Execution and Problem Solving in ICARUS
[Figure: flowchart with components Problem, Skill Hierarchy, Primitive Skills, Reactive Execution, an "impasse?" decision (yes/no), Problem Solving, and Executed plan.]
Problem solving involves means-ends analysis that chains backward over skills and concept definitions, executing skills whenever they become applicable.
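A minimal Python sketch of this backward chaining is given below, under strong simplifying assumptions: skills have a single goal literal and a set of start conditions, they never delete literals, and chaining runs over skills only (not concept definitions). The Skill class, the solve function, and the skill names are hypothetical and serve only to illustrate the control flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    goal: str          # literal the skill achieves
    start: frozenset   # literals that must hold before it can execute


def solve(goal, state, skills, plan=None, depth=0, max_depth=10):
    """Chain backward from goal; apply a skill as soon as it is applicable.

    Returns (new_state, plan) or None if no chain of skills reaches the goal.
    """
    plan = [] if plan is None else plan
    if goal in state:
        return state, plan
    if depth >= max_depth:
        return None
    for skill in skills:
        if skill.goal != goal:
            continue
        current, steps, ok = state, list(plan), True
        for condition in sorted(skill.start):
            result = solve(condition, current, skills, steps, depth + 1, max_depth)
            if result is None:  # this subgoal cannot be reached; try another skill
                ok = False
                break
            current, steps = result
        if ok:
            # All start conditions now hold, so the skill "executes".
            return current | {skill.goal}, steps + [skill.name]
    return None


skills = [
    Skill("turn-right", "in-segment", frozenset({"in-intersection"})),
    Skill("enter-intersection", "in-intersection", frozenset({"at-speed"})),
    Skill("slow-down", "at-speed", frozenset()),
]
final_state, plan = solve("in-segment", frozenset(), skills)
print(plan)
```

In the architecture itself, execution happens in the environment and the state is re-perceived on each cycle; the sketch replaces that with a simple set union.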
ICARUS Learns Skills from Problem Solving
[Figure: the same flowchart (Problem, Skill Hierarchy, Primitive Skills, Reactive Execution, "impasse?" decision, Problem Solving, Executed plan) with an added Skill Learning component.]
Learning from Problem Solutions
ICARUS incorporates a mechanism for learning new skills that:
operates whenever problem solving overcomes an impasse
incorporates only information available from the goal stack
generalizes beyond the specific objects concerned
depends on whether chaining involved skills or concepts
supports cumulative learning and within-problem transfer
This skill creation process is fully interleaved with means-ends analysis and execution.
Learned skills carry out forward execution in the environment rather than backward chaining in the mind.
ICARUS Summary
ICARUS is a unified theory of the cognitive architecture that:
includes hierarchical memories for concepts and skills;
interleaves conceptual inference with reactive execution;
resorts to problem solving when it lacks routine skills;
learns such skills from successful resolution of impasses.
We have developed ICARUS agents for a variety of simulated physical environments, including urban driving.
However, it has a number of limitations that we must address to improve its coverage of human intelligence.
Limitations of ICARUS’ Learning Abilities
ICARUS provides a plausible account for learning hierarchical skills from successful problem solving.
Recent work (Li et al., in press) has adapted this mechanism to learn from worked-out problem solutions by:
storing states that arise in each step of the given solution
using means-ends analysis to explain why each step occurred
acquiring a new skill for each subproblem explained this way
However, ICARUS cannot learn from mistakes, such as those that result from unexpected goal interactions.
Goal-Driven Execution: A Recipe for Disaster
ICARUS incorporates a goal memory that contains a prioritized set of top-level goals.
On each cycle, the architecture notes the most important goal not satisfied by its current beliefs.
This goal determines which path through the skill hierarchy ICARUS selects for execution.
As a result, the system ignores already satisfied goals while working on this objective.
However, unforeseen interactions among goals can produce undesirable outcomes.
For instance, suddenly changing lanes to avoid a stalled vehicle can lead to collision with another one.
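The goal-selection rule, and why it overlooks satisfied goals, can be shown in a few lines of Python. The function select_goal and the goal names are hypothetical; the sketch assumes goals and beliefs are plain ground literals.

```python
def select_goal(prioritized_goals, beliefs):
    """Return the highest-priority goal not satisfied by current beliefs."""
    for goal in prioritized_goals:  # ordered from most to least important
        if goal not in beliefs:
            return goal
    return None  # every goal is currently satisfied


# Hypothetical situation: the agent is safe but stuck behind a stalled vehicle.
goals = ["not-colliding", "past-stalled-vehicle"]
beliefs = {"not-colliding", "in-right-lane"}
focus = select_goal(goals, beliefs)
print(focus)  # past-stalled-vehicle
```

Because not-colliding already holds, nothing in this selection step protects it while the agent pursues past-stalled-vehicle, which is the interaction the slide warns about.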
Learning from Goal Violations
An extended ICARUS that learns from unforeseen events might:
1. Encounter a situation in which pursuing goal A leads it to violate previously satisfied goal B.
2. Use counterfactual reasoning to identify what it could have done differently to avoid the error.
3. Analyze the alternative to acquire a specialized skill indexed by goals A and B.
4. In future runs, prefer the specialized skill during execution, leading it to avoid the error.
Implementing this approach requires three basic extensions to the ICARUS architecture.
An Episodic Belief Memory
Before it can analyze the reasons why an error occurred, ICARUS must encode its previous experience.
We have introduced an episodic belief memory (Stracuzzi et al., in press) that:
retains all beliefs inferred on earlier cognitive cycles; and
annotates beliefs with time stamps specifying when they held.
These let the architecture reconstruct states that the agent has encountered recently.
The current implementation has no mechanisms for forgetting or retrieval, but we plan to add these in the future.
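One way to picture such a memory is as a table of time-stamped intervals per belief. The Python sketch below is an assumption about a possible encoding, not the ICARUS implementation: the class EpisodicBeliefMemory, its methods, and the use of cycle numbers as time stamps are all hypothetical.

```python
class EpisodicBeliefMemory:
    """Stores, for each belief, the cycle intervals during which it held."""

    def __init__(self):
        # belief -> list of [start_cycle, end_cycle]; end is None while it holds
        self._intervals = {}

    def update(self, cycle, beliefs):
        """Record the beliefs inferred on this cycle."""
        beliefs = set(beliefs)
        # Close intervals for beliefs that no longer hold.
        for belief, spans in self._intervals.items():
            if spans[-1][1] is None and belief not in beliefs:
                spans[-1][1] = cycle
        # Open intervals for beliefs that have just started to hold.
        for belief in beliefs:
            spans = self._intervals.setdefault(belief, [])
            if not spans or spans[-1][1] is not None:
                spans.append([cycle, None])

    def state_at(self, cycle):
        """Reconstruct the set of beliefs that held on a past cycle."""
        return {
            belief
            for belief, spans in self._intervals.items()
            for start, end in spans
            if start <= cycle and (end is None or cycle < end)
        }


memory = EpisodicBeliefMemory()
memory.update(1, {("in-lane", "me", "g599"), ("slow-for-right-turn", "me")})
memory.update(2, {("in-lane", "me", "g599")})
memory.update(3, {("in-lane", "me", "g601")})
print(memory.state_at(1))
```

Like the implementation described above, the sketch never forgets anything, so storage grows with the number of distinct belief intervals.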
Learning from Counterfactual Reasoning
Before it can learn what it should have done differently, ICARUS must identify an alternative behavioral trajectory.
We have developed a counterfactual reasoning capability that:
works backward from the violated goal to consider the agent’s choices at each step;
carries out repeated forward search to find a path that would have avoided the goal violation; and
analyzes this path to create a new skill that takes both goals into account.
Because analysis starts with the conjoined goal, it produces a new skill with a specialized head and preconditions.
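The backward walk over choice points combined with repeated forward search might be sketched as follows. This is only an illustration under stated assumptions: the two-lane grid world, the breadth-first search, and the names step, forward_search, and counterfactual are hypothetical, and the final step of compiling the safe path into a new skill is omitted.

```python
from collections import deque


def forward_search(start, actions, step, achieved, violated, limit=6):
    """Breadth-first search for actions that reach the goal without a violation."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if achieved(state):
            return path
        if len(path) >= limit:
            continue
        for action in actions:
            nxt = step(state, action)
            if nxt in seen or violated(nxt):
                continue  # skip repeats and states that violate the second goal
            seen.add(nxt)
            frontier.append((nxt, path + [action]))
    return None


def counterfactual(trace, actions, step, achieved, violated):
    """Walk backward over visited states; return the latest one with a safe path."""
    for index in range(len(trace) - 1, -1, -1):
        path = forward_search(trace[index], actions, step, achieved, violated)
        if path is not None:
            return index, path
    return None


# Toy world: state = (lane, position); lane 0 is right, lane 1 is left.
OBSTACLES = {(0, 3), (1, 2)}  # stalled vehicle in lane 0, another car in lane 1


def step(state, action):
    lane, position = state
    if action == "left":
        lane = 1
    elif action == "right":
        lane = 0
    return (lane, position + 1)


# States visited before the agent swerved left at (0, 1) and hit (1, 2).
failed_trace = [(0, 0), (0, 1)]
result = counterfactual(
    failed_trace,
    ["ahead", "left", "right"],
    step,
    achieved=lambda s: s[1] >= 4,      # goal A: get past the stalled vehicle
    violated=lambda s: s in OBSTACLES,  # goal B: avoid collisions
)
print(result)
```

Here the search finds that waiting one step before changing lanes would have satisfied both goals, which is the kind of alternative trajectory the learning step would then analyze.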
A Trace of Counterfactual Reasoning
[Figure: trace of counterfactual reasoning over lane-changing skills, with nodes including avoid-obstacles, crossing-into-left-lane, crossing-into-left-lane-straight, crossing-into-right-lane, lane-aligned-straight, wheels-straight, on-left-side, on-right-side, and throttle-special-value; two branches are marked "failed attempt" and one "successful attempt".]
A Specificity Bias for Skill Execution
For ICARUS to benefit from skills learned by its counterfactual reasoning, it must prefer them over ones that caused errors.
We have altered the architecture’s execution module to prefer:
skills with more specific heads that match top-level goals
skills with more specific conditions that match the state
These lead ICARUS to mask skills indexed by single goals with ones that handle goal interactions.
This in turn lets the system improve its ability to avoid errors in an incremental, cumulative manner.
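A minimal sketch of such a preference is shown below. It assumes, for illustration only, that specificity can be measured by counting literals in a skill's head and conditions; the dictionary encoding, the function choose_skill, and the skill names are hypothetical.

```python
def choose_skill(skills, goals, beliefs):
    """Pick the applicable skill with the most specific head, then conditions."""
    applicable = [
        skill for skill in skills
        if skill["head"] <= goals and skill["conditions"] <= beliefs
    ]
    if not applicable:
        return None
    # Assumption: more literals means more specific.
    return max(applicable, key=lambda s: (len(s["head"]), len(s["conditions"])))


skills = [
    {"name": "change-lanes",
     "head": {"past-stalled-vehicle"},
     "conditions": {"stalled-vehicle-ahead"}},
    {"name": "change-lanes-when-clear",  # learned from a goal violation
     "head": {"past-stalled-vehicle", "not-colliding"},
     "conditions": {"stalled-vehicle-ahead", "left-lane-clear"}},
]
goals = {"past-stalled-vehicle", "not-colliding"}
beliefs = {"stalled-vehicle-ahead", "left-lane-clear"}
chosen = choose_skill(skills, goals, beliefs)
print(chosen["name"])  # change-lanes-when-clear
```

When both skills apply, the one indexed by both goals masks the single-goal skill, which matches the masking behavior described above.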
Related Work on Error-Driven Learning
Our approach to learning from execution errors differs from, but bears similarities to:
Learning search-control rules by discrimination in SAGE (Langley, 1985)
Analytical learning from failure in Soar (Laird et al., 1986) and Prodigy (Minton, 1988)
Ohlsson’s (1996) theory of learning from constraint violations
Mueller and Dyer’s (1985) model of learning by daydreaming
The latter comes closest to our use of counterfactual reasoning, but it was not cast within a unified cognitive architecture.
Research Plans: Reasoning about Others
We designed ICARUS to model intelligent behavior in embodied agents, but our work to date has treated them in isolation.
The framework can deal with other independent agents, but only by viewing them as other objects in the environment.
Yet people can reason more deeply about the goals and actions of others, then use their inferences to make decisions.
Adding this capability to ICARUS will require extending its representation, performance processes, and learning methods.
An Extended Representation
For ICARUS to reason about other agents’ mental states, it must first represent and store them.
We plan to introduce modal predicates like belief, goal, and intention to modify inferences like:
• (goal me (in-left-lane me segment16))
• (belief me (goal driver2 (in-right-lane driver2 segment16)))
• (belief me (belief driver2 (in-right-lane me segment16)))
• (goal me (belief driver2 (goal me (in-left-lane me segment16))))
This scheme eliminates the need for separate goal and belief memories, so a single ‘working memory’ will suffice.
We can also include time stamps with each substructure to indicate its temporal scope.
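Embedded modal literals of this kind map naturally onto nested tuples. The Python sketch below is one possible encoding, not the planned ICARUS representation; the MODALS set and the unwrap helper are hypothetical names.

```python
MODALS = {"belief", "goal", "intention"}


def unwrap(literal):
    """Split a nested modal literal into its modal chain and core literal."""
    chain = []
    while literal[0] in MODALS:
        modal, agent, literal = literal  # peel off one level of embedding
        chain.append((modal, agent))
    return chain, literal


# (goal me (belief driver2 (goal me (in-left-lane me segment16))))
nested = ("goal", "me",
          ("belief", "driver2",
           ("goal", "me",
            ("in-left-lane", "me", "segment16"))))
chain, core = unwrap(nested)
print(chain)
print(core)
```

Because every element carries its own modal wrapper, goals and beliefs can sit side by side in one working memory, and a time stamp could be added as an extra field at each level.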
A Flexible Inference Mechanism
The current ICARUS inference process is both deductive and exhaustive, making it implausible and ineffective.
The revised architecture will carry out hill climbing through a space of possible worlds (truth assignments on ground literals).
Each step will involve changing an existing literal’s truth value or generating an entirely new literal.
• ICARUS will guide its inferential choices either by posterior probabilities or by expected values.
• The system will also take into account recency of elements matched by consequents or antecedents.
This approach is influenced by Polyscheme, Markov logic, and theories of spreading activation.
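Hill climbing over truth assignments can be sketched as below. The scoring function is an assumption: it sums the weights of satisfied rules, loosely in the spirit of Markov logic, and stands in for the posterior probabilities, expected values, and recency that the planned mechanism would use. The names score and hill_climb, the rules, and the weights are hypothetical.

```python
def score(world, rules):
    """Sum the weights of rules (antecedents -> consequent) the world satisfies."""
    total = 0.0
    for antecedents, consequent, weight in rules:
        body_holds = all(world.get(a, False) for a in antecedents)
        if not body_holds or world.get(consequent, False):
            total += weight
    return total


def hill_climb(world, rules, candidates):
    """Repeatedly flip the one candidate literal that most improves the score."""
    world = dict(world)
    while True:
        best_world, best_score = None, score(world, rules)
        for literal in candidates:
            trial = dict(world)
            # Flipping an absent literal corresponds to generating a new one.
            trial[literal] = not world.get(literal, False)
            trial_score = score(trial, rules)
            if trial_score > best_score:
                best_world, best_score = trial, trial_score
        if best_world is None:  # local optimum: no single flip helps
            return world
        world = best_world


# Hypothetical rules about what another driver perceives and believes.
rules = [
    (["i-see-obstacle"], "driver2-sees-obstacle", 2.0),
    (["driver2-sees-obstacle"], "driver2-believes-obstacle", 1.0),
]
observed = {"i-see-obstacle": True}  # treated as fixed evidence
result = hill_climb(
    observed, rules,
    candidates=["driver2-sees-obstacle", "driver2-believes-obstacle"],
)
print(result)
```

Unlike exhaustive deduction, this search only explores worlds one step away from the current one, so it can stop at a local optimum; that trade-off is inherent to hill climbing.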
Default Reasoning and Revisions
Given basic inference rules, these changes should let ICARUS make abductive leaps about others’ mental states.
The agent’s initial statements about others’ beliefs will be the same as those for the agent.
But additional information can lead the system to revise these assumptions nonmonotonically when needed.
• E.g., we assume that others can see what we see, then alter these beliefs if we note evidence otherwise.
This explains why making inferences about others often takes extra time and effort.
Learning to Reason about Others
Reasoning about others comes more easily to the experienced than to children and novices.
We can explain this with a mechanism that learns inference rules from empirical regularities among beliefs by:
• Generating new structures based on co-occurrences of literals in working memory; and
• Updating probabilities associated with antecedents and rules based on later co-occurrences.
When these specialized rules drive inference, they mask more basic ones, reducing the need for later revisions.
This causes more direct inferences about others’ mental states, thus reaching conclusions with less time and effort.
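The two learning steps, proposing rules from co-occurring literals and updating their probabilities, can be approximated with simple counts. The sketch below is an assumption about one way to do this: the RuleLearner class is hypothetical, rules are limited to a single antecedent, and the probability is a plain conditional frequency.

```python
from collections import Counter
from itertools import permutations


class RuleLearner:
    """Tracks co-occurrence statistics for candidate rules of the form a -> b."""

    def __init__(self):
        self.seen = Counter()      # literal -> snapshots containing it
        self.together = Counter()  # (antecedent, consequent) -> joint count

    def observe(self, working_memory):
        """Update counts from one snapshot of working memory."""
        literals = set(working_memory)
        self.seen.update(literals)
        # Every ordered pair of co-occurring literals is a candidate rule.
        self.together.update(permutations(literals, 2))

    def probability(self, antecedent, consequent):
        """Estimate P(consequent | antecedent) from the counts so far."""
        if self.seen[antecedent] == 0:
            return 0.0
        return self.together[(antecedent, consequent)] / self.seen[antecedent]


learner = RuleLearner()
learner.observe({"driver2-signals-left", "driver2-wants-left-lane"})
learner.observe({"driver2-signals-left", "driver2-wants-left-lane"})
learner.observe({"driver2-signals-left"})
p = learner.probability("driver2-signals-left", "driver2-wants-left-lane")
print(round(p, 2))  # 0.67
```

A rule whose estimated probability becomes high enough could then be used directly during inference, bypassing the default-then-revise path.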
Summary of Planned Research
To provide ICARUS with the capability to reason about others’ mental states, we plan to:
• Extend its representation to support embedded modal literals;
• Alter inference to hill climb through possible worlds guided by recencies and probabilities;
• Combine default reasoning about others with nonmonotonic revision when appropriate; and
• Acquire specialized inference rules from experience to reduce the need for such belief revision.
We will implement these extensions to ICARUS and test them in urban driving and other settings.
End of Presentation