Pat Langley

School of Computing and Informatics, Arizona State University

Tempe, Arizona USA

Mental Simulation and Learning in the ICARUS Architecture

Thanks to D. Choi, G. Cleveland, A. Danielescu, N. Li, D. and D. Stracuzzi for their contributions. This talk reports research partly funded by a grant from the Office of Naval Research, which is not responsible for its contents.

Cognitive Architectures

A cognitive architecture (Newell, 1990) is the infrastructure for an intelligent system that is constant across domains:

the memories that store domain-specific content

the system’s representation and organization of knowledge

the mechanisms that use this knowledge in performance

the processes that learn this knowledge from experience

An architecture typically comes with a programming language that eases construction of knowledge-based systems.

Research in this area incorporates many ideas from psychology about the nature of human thinking.

The ICARUS Architecture

ICARUS (Langley, 2006) is a computational theory of the human cognitive architecture that posits:

1. Short-term memories are distinct from long-term stores

2. Memories contain modular elements cast as symbolic structures

3. Long-term structures are accessed through pattern matching

4. Cognition occurs in retrieval/selection/action cycles

5. Learning involves monotonic addition of elements to memory

6. Learning is incremental and interleaved with performance

It shares these assumptions with other cognitive architectures like Soar (Laird et al., 1987) and ACT-R (Anderson, 1993).

Goals for ICARUS

Our main objectives in developing ICARUS are to produce:

a computational theory of higher-level cognition in humans

that is qualitatively consistent with results from psychology

that exhibits as many distinct cognitive functions as possible

Although quantitative fits to specific results are desirable, they can distract from achieving broad theoretical coverage.

Distinctive Features of ICARUS

However, ICARUS also makes assumptions that distinguish it from these architectures:

1. Cognition is grounded in perception and action

2. Categories and skills are separate cognitive entities

3. Short-term elements are instances of long-term structures

4. Skills and concepts are organized in a hierarchical manner

5. Inference and execution are more basic than problem solving

Some of these tenets also appear in Bonasso et al.’s (2003) 3T, Freed’s (1998) APEX, and Sun et al.’s (2001) CLARION.

Cascaded Integration in ICARUS

Like other unified cognitive architectures, ICARUS incorporates a number of distinct modules.

ICARUS adopts a cascaded approach to integration in which lower-level modules produce results for higher-level ones.

conceptual inference

skill execution

problem solving

learning

Structure and Use of Conceptual Memory

ICARUS organizes conceptual memory in a hierarchical manner.

Conceptual inference occurs from the bottom up, starting from percepts to produce high-level beliefs about the current state.

ICARUS Concepts for In-City Driving

((in-rightmost-lane ?self ?clane)
 :percepts  ((self ?self) (segment ?seg)
             (line ?clane segment ?seg))
 :relations ((driving-well-in-segment ?self ?seg ?clane)
             (last-lane ?clane)
             (not (lane-to-right ?clane ?anylane))))

((driving-well-in-segment ?self ?seg ?lane)
 :percepts  ((self ?self) (segment ?seg) (line ?lane segment ?seg))
 :relations ((in-segment ?self ?seg)
             (in-lane ?self ?lane)
             (aligned-with-lane-in-segment ?self ?seg ?lane)
             (centered-in-lane ?self ?seg ?lane)
             (steering-wheel-straight ?self)))

((in-lane ?self ?lane)
 :percepts ((self ?self segment ?seg)
            (line ?lane segment ?seg dist ?dist))
 :tests    ((> ?dist -10) (<= ?dist 0)))

Representing Short-Term Beliefs/Goals

(current-street me A)               (current-segment me g550)
(lane-to-right g599 g601)           (first-lane g599)
(last-lane g599)                    (last-lane g601)
(at-speed-for-u-turn me)            (slow-for-right-turn me)
(steering-wheel-not-straight me)    (centered-in-lane me g550 g599)
(in-lane me g599)                   (in-segment me g550)
(on-right-side-in-segment me)       (intersection-behind g550 g522)
(building-on-left g288)             (building-on-left g425)
(building-on-left g427)             (building-on-left g429)
(building-on-left g431)             (building-on-left g433)
(building-on-right g287)            (building-on-right g279)
(increasing-direction me)           (buildings-on-right g287 g279)
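As a rough illustration of bottom-up inference, the Python sketch below derives a belief such as (in-lane me g599) from percepts. The percept dictionaries, the distance values, and the function name are hypothetical; only the numeric test (> ?dist -10) (<= ?dist 0) and the shared-segment requirement come from the in-lane concept. It is a sketch under these assumptions, not the ICARUS matcher.

```python
# Hypothetical percepts; attribute names mirror the in-lane concept.
percepts = [
    {"type": "self", "id": "me", "segment": "g550"},
    {"type": "line", "id": "g599", "segment": "g550", "dist": -4.0},
    {"type": "line", "id": "g601", "segment": "g550", "dist": 6.0},
]

def infer_in_lane(percepts):
    """Return (in-lane ?self ?lane) beliefs whose percepts and tests match."""
    beliefs = []
    for me in (p for p in percepts if p["type"] == "self"):
        for line in (p for p in percepts if p["type"] == "line"):
            same_segment = line["segment"] == me["segment"]
            # Mirrors :tests ((> ?dist -10) (<= ?dist 0))
            if same_segment and -10 < line["dist"] <= 0:
                beliefs.append(("in-lane", me["id"], line["id"]))
    return beliefs

beliefs = infer_in_lane(percepts)
```

Higher-level concepts such as driving-well-in-segment would then match against beliefs like this one rather than against raw percepts.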

Skill Execution in ICARUS

Skill execution occurs from the top down, starting from goals to find applicable paths through the skill hierarchy.

This process repeats on each cycle to produce goal-directed but reactive behavior, biased toward continuing initiated skills.

ICARUS Skills for In-City Driving

((in-rightmost-lane ?self ?line)
 :percepts ((self ?self) (line ?line))
 :start    ((last-lane ?line))
 :subgoals ((driving-well-in-segment ?self ?seg ?line)))

((driving-well-in-segment ?self ?seg ?line)
 :percepts ((segment ?seg) (line ?line) (self ?self))
 :start    ((steering-wheel-straight ?self))
 :subgoals ((in-segment ?self ?seg)
            (centered-in-lane ?self ?seg ?line)
            (aligned-with-lane-in-segment ?self ?seg ?line)
            (steering-wheel-straight ?self)))

((in-segment ?self ?endsg)
 :percepts ((self ?self speed ?speed)
            (intersection ?int cross ?cross)
            (segment ?endsg street ?cross angle ?angle))
 :start    ((in-intersection-for-right-turn ?self ?int))
 :actions  ((steer 1)))
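The top-down search for an applicable path through the skill hierarchy can be sketched in Python as follows. This is a simplified, variable-free illustration: the SKILLS table, the use of plain strings for goals and beliefs, and the find_path function are assumptions made for the example, and the bias toward continuing initiated skills is omitted.

```python
# Hypothetical, variable-free skill table: head -> list of clauses.
SKILLS = {
    "in-rightmost-lane": [
        {"start": {"last-lane"}, "subgoals": ["driving-well-in-segment"]},
    ],
    "driving-well-in-segment": [
        {"start": {"steering-wheel-straight"},
         "subgoals": ["in-segment", "centered-in-lane"]},
    ],
    "in-segment": [
        {"start": {"in-intersection-for-right-turn"}, "actions": ["steer 1"]},
    ],
}

def find_path(goal, beliefs, path=()):
    """Depth-first, top-down search for an applicable path ending in actions."""
    if goal in beliefs:  # goal already satisfied, nothing to execute
        return None
    for clause in SKILLS.get(goal, []):
        if not clause["start"] <= beliefs:  # start conditions must hold
            continue
        here = path + (goal,)
        if "actions" in clause:  # primitive skill: the path ends here
            return here, clause["actions"]
        for sub in clause["subgoals"]:
            if sub not in beliefs:
                found = find_path(sub, beliefs, here)
                if found:
                    return found
                break  # first unsatisfied subgoal has no applicable path
    return None

beliefs = {"last-lane", "steering-wheel-straight",
           "in-intersection-for-right-turn"}
result = find_path("in-rightmost-lane", beliefs)
```

Calling find_path again on each cycle with updated beliefs gives the reactive flavor described above, since the selected path can change as soon as the state does.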

Execution and Problem Solving in ICARUS

[Figure: flowchart linking Problem, Reactive Execution, Skill Hierarchy, Primitive Skills, an "impasse?" test (yes/no), Problem Solving, and Executed Plan.]

Problem solving involves means-ends analysis that chains backward over skills and concept definitions, executing skills whenever they become applicable.
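A minimal Python sketch of means-ends analysis interleaved with execution appears below. The two operators, their preconditions and effects, and the solve function are hypothetical and far simpler than ICARUS skills: there are no variables, no delete effects, and no chaining over concept definitions.

```python
# Hypothetical primitive skills: name -> (preconditions, effects).
OPERATORS = {
    "slow-down": ({"moving"}, {"at-turn-speed"}),
    "turn-right": ({"at-turn-speed", "in-intersection"}, {"in-segment"}),
}

def solve(goal, state, plan, depth=5):
    """Achieve `goal` by chaining backward; execute skills once applicable."""
    if goal in state:
        return True
    if depth == 0:
        return False
    for name, (pre, effects) in OPERATORS.items():
        if goal not in effects:
            continue
        # Unmet preconditions become subgoals, solved in a fixed order.
        if all(solve(p, state, plan, depth - 1) for p in sorted(pre)):
            state |= effects   # simulated execution of the skill
            plan.append(name)
            return True
    return False

state = {"moving", "in-intersection"}
plan = []
solved = solve("in-segment", state, plan)
```

Note that slow-down is executed as soon as its preconditions hold, before the solver returns to the turn-right subproblem, which mirrors the eager execution mentioned above.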

ICARUS Learns Skills from Problem Solving

[Figure: the same flowchart of Problem, Reactive Execution, Skill Hierarchy, Primitive Skills, the "impasse?" test (yes/no), Problem Solving, and Executed Plan, with Skill Learning added.]

Learning from Problem Solutions

ICARUS incorporates a mechanism for learning new skills that:

operates whenever problem solving overcomes an impasse

incorporates only information available from the goal stack

generalizes beyond the specific objects concerned

depends on whether chaining involved skills or concepts

supports cumulative learning and within-problem transfer

This skill creation process is fully interleaved with means-ends analysis and execution.

Learned skills carry out forward execution in the environment rather than backward chaining in the mind.

ICARUS Summary

ICARUS is a unified theory of the cognitive architecture that:

includes hierarchical memories for concepts and skills;

interleaves conceptual inference with reactive execution;

resorts to problem solving when it lacks routine skills;

learns such skills from successful resolution of impasses.

We have developed ICARUS agents for a variety of simulated physical environments, including urban driving.

However, it has a number of limitations that we must address to improve its coverage of human intelligence.

Limitations of ICARUS’ Learning Abilities

ICARUS provides a plausible account for learning hierarchical skills from successful problem solving.

Recent work (Li et al., in press) has adapted this mechanism to learn from worked-out problem solutions by:

storing states that arise in each step of the given solution

using means-ends analysis to explain why each step occurred

acquiring a new skill for each subproblem explained this way

However, ICARUS cannot learn from mistakes, such as those that result from unexpected goal interactions.

Goal-Driven Execution: A Recipe for Disaster

ICARUS incorporates a goal memory that contains a prioritized set of top-level goals.

On each cycle, the architecture notes the most important goal not satisfied by its current beliefs.

This goal determines which path through the skill hierarchy ICARUS selects for execution.

As a result, the system ignores already satisfied goals while working on this objective.

However, unforeseen interactions among goals can produce undesirable outcomes.

For instance, suddenly changing lanes to avoid a stalled vehicle can lead to collision with another one.
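The goal selection step, and the reason it invites trouble, can be seen in a few lines of Python. The goal names and the select_goal function are hypothetical; goals and beliefs are plain strings rather than ICARUS literals.

```python
def select_goal(prioritized_goals, beliefs):
    """Return the highest-priority goal not satisfied by current beliefs."""
    for goal in prioritized_goals:
        if goal not in beliefs:
            return goal
    return None

# Hypothetical goal memory, ordered from most to least important.
goals = ["avoid-collision", "avoid-stalled-vehicle", "in-rightmost-lane"]
beliefs = {"avoid-collision", "in-rightmost-lane"}
focus = select_goal(goals, beliefs)
```

Here avoid-collision is currently satisfied, so it plays no role in choosing a skill for avoid-stalled-vehicle, even though it has higher priority. Nothing prevents the selected skill from violating it.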

Learning from Goal Violations

An extended ICARUS that learns from unforeseen events might:

1. Encounter a situation in which pursuing goal A leads it to violate previously satisfied goal B.

2. Use counterfactual reasoning to identify what it could have done differently to avoid the error.

3. Analyze the alternative to acquire a specialized skill indexed by goals A and B.

4. In future runs, prefer the specialized skill during execution, leading it to avoid the error.

Implementing this approach requires three basic extensions to the ICARUS architecture.

An Episodic Belief Memory

Before it can analyze the reasons why an error occurred, ICARUS must encode its previous experience.

We have introduced an episodic belief memory (Stracuzzi et al., in press) that:

retains all beliefs inferred on earlier cognitive cycles; and

annotates beliefs with time stamps specifying when they held.

These let the architecture reconstruct states that the agent has encountered recently.

The current implementation has no mechanisms for forgetting or retrieval, but we plan to add these in the future.
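One way to picture such a memory is the Python sketch below, which stores each belief with the intervals of cognitive cycles over which it held and rebuilds a past state on demand. The class name, the interval encoding, and the assumption that record is called once per cycle are illustrative choices, not details of the ICARUS implementation.

```python
class EpisodicBeliefMemory:
    """Keeps every inferred belief with the cycles during which it held."""

    def __init__(self):
        self.intervals = {}  # belief -> list of [start_cycle, end_cycle]

    def record(self, cycle, beliefs):
        """Store the beliefs inferred on `cycle` (call once per cycle)."""
        for belief in beliefs:
            spans = self.intervals.setdefault(belief, [])
            if spans and spans[-1][1] == cycle - 1:
                spans[-1][1] = cycle           # belief persisted: extend span
            else:
                spans.append([cycle, cycle])   # belief (re)appeared: new span

    def state_at(self, cycle):
        """Reconstruct the set of beliefs that held on a past cycle."""
        return {b for b, spans in self.intervals.items()
                if any(start <= cycle <= end for start, end in spans)}

memory = EpisodicBeliefMemory()
memory.record(1, {"in-lane", "wheels-straight"})
memory.record(2, {"in-lane"})
memory.record(3, {"in-lane", "wheels-straight"})
```

Because nothing is ever deleted, the memory grows monotonically, which matches the absence of a forgetting mechanism noted above.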

Learning from Counterfactual Reasoning

Before it can learn what it should have done differently, ICARUS must identify an alternative behavioral trajectory.

We have developed a counterfactual reasoning capability that:

works backward from the violated goal to consider the agent’s choices at each step;

carries out repeated forward search to find a path that would have avoided the goal violation; and

analyzes this path to create a new skill that takes both goals into account.

Because analysis starts with the conjoined goal, it produces a new skill with a specialized head and preconditions.
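The first two steps, backing up through earlier choices and searching forward from each alternative, can be sketched in Python on a toy lane-change problem. Everything here is an assumption made for illustration: the two-lane world, the UNSAFE cells standing in for a stalled vehicle and a passing car, and the breadth-first forward search. Skill creation from the recovered path is not shown.

```python
from collections import deque

ACTIONS = ("stay", "left")
# Toy assumptions: stalled vehicle at (R, 4), passing car at (L, 2).
UNSAFE = {("R", 4), ("L", 2)}

def step(state, action):
    """Advance one position, moving to the left lane if requested."""
    lane, pos = state
    return ("L" if action == "left" else lane, pos + 1)

def forward_search(state, goal_pos, limit=6):
    """Breadth-first search for actions reaching goal_pos with no violation."""
    frontier = deque([(state, [])])
    while frontier:
        current, actions = frontier.popleft()
        if current[1] >= goal_pos:
            return actions
        if len(actions) < limit:
            for action in ACTIONS:
                nxt = step(current, action)
                if nxt not in UNSAFE:
                    frontier.append((nxt, actions + [action]))
    return None

def counterfactual(trace, goal_pos):
    """trace: (state, action) pairs, the last action causing the violation."""
    for i in range(len(trace) - 1, -1, -1):   # work backward from the error
        state, chosen = trace[i]
        for alt in ACTIONS:
            nxt = step(state, alt)
            if alt == chosen or nxt in UNSAFE:
                continue
            rest = forward_search(nxt, goal_pos)
            if rest is not None:
                return i, [alt] + rest
    return None

# The agent stayed once, then swerved left into the passing car at (L, 2).
trace = [(("R", 0), "stay"), (("R", 1), "left")]
alternative = counterfactual(trace, goal_pos=5)
```

In this toy run the agent learns that it should have waited before changing lanes: the returned index identifies the choice point to revise, and the action list is the trajectory that a learning step could then analyze.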

A Trace of Counterfactual Reasoning

[Figure: trace of counterfactual reasoning contrasting failed attempts with a successful attempt. Node labels include avoid-obstacles, on-left-side, on-right-side, crossing-into-left-lane, crossing-into-left-lane-straight, crossing-into-right-lane, lane-aligned-straight, wheels-straight, and throttle-special-value.]

A Specificity Bias for Skill Execution

For ICARUS to benefit from skills learned by its counterfactual reasoning, it must prefer them over ones that caused errors.

We have altered the architecture’s execution module to prefer:

skills with more specific heads that match top-level goals

skills with more specific conditions that match the state

These lead ICARUS to mask skills indexed by single goals with ones that handle goal interactions.

This in turn lets the system improve its ability to avoid errors in an incremental, cumulative manner.
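A small Python sketch of this preference follows. The two skills, their names, and the measure of specificity (the number of goals in the head, then the number of conditions) are assumptions chosen for illustration; the slides do not state how ICARUS measures specificity.

```python
# Hypothetical skills: one indexed by a single goal, one by two goals.
SKILLS = [
    {"name": "swerve-left",
     "head": {"avoid-stalled-vehicle"},
     "conditions": {"stalled-vehicle-ahead"}},
    {"name": "check-then-change-lane",
     "head": {"avoid-stalled-vehicle", "avoid-collision"},
     "conditions": {"stalled-vehicle-ahead", "left-lane-clear"}},
]

def choose_skill(skills, goals, beliefs):
    """Pick the applicable skill with the most specific head, then conditions."""
    applicable = [s for s in skills
                  if s["head"] <= goals and s["conditions"] <= beliefs]
    return max(applicable,
               key=lambda s: (len(s["head"]), len(s["conditions"])),
               default=None)

goals = {"avoid-stalled-vehicle", "avoid-collision"}
chosen = choose_skill(SKILLS, goals,
                      {"stalled-vehicle-ahead", "left-lane-clear"})
fallback = choose_skill(SKILLS, goals, {"stalled-vehicle-ahead"})
```

When both skills apply, the two-goal skill masks the single-goal one. The general skill remains available as a fallback when the specialized conditions do not match.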

Related Work on Error-Driven Learning

Our approach to learning from execution errors differs from, but bears similarities to:

Learning search-control rules by discrimination in SAGE (Langley, 1985)

Analytical learning from failure in Soar (Laird et al., 1986) and Prodigy (Minton, 1988)

Ohlsson’s (1996) theory of learning from constraint violations

Mueller and Dyer’s (1985) model of learning by daydreaming

The latter comes closest to our use of counterfactual reasoning, but it was not cast within a unified cognitive architecture.

Research Plans: Reasoning about Others

We designed ICARUS to model intelligent behavior in embodied agents, but our work to date has treated them in isolation.

The framework can deal with other independent agents, but only by viewing them as other objects in the environment.

Yet people can reason more deeply about the goals and actions of others, then use their inferences to make decisions.

Adding this capability to ICARUS will require extending its representation, performance processes, and learning methods.

An Extended Representation

For ICARUS to reason about other agents’ mental states, it must first represent and store them.

We plan to introduce modal predicates like belief, goal, and intention to modify inferences like:

• (goal me (in-left-lane me segment16))

• (belief me (goal driver2 (in-right-lane driver2 segment16)))

• (belief me (belief driver2 (in-right-lane me segment16)))

• (goal me (belief driver2 (goal me (in-left-lane me segment16))))

This scheme eliminates the need for separate goal and belief memories, so a single ‘working memory’ will suffice.

We can also include time stamps with each substructure to indicate its temporal scope.
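To make the embedded structure concrete, the Python sketch below encodes the four modal literals as nested tuples in a single working memory and peels off the chain of modal operators. The tuple encoding and the unwrap helper are hypothetical, and time stamps are omitted.

```python
MODALS = {"belief", "goal", "intention"}

# The four example literals, encoded as nested tuples (an assumed encoding).
working_memory = [
    ("goal", "me", ("in-left-lane", "me", "segment16")),
    ("belief", "me",
     ("goal", "driver2", ("in-right-lane", "driver2", "segment16"))),
    ("belief", "me",
     ("belief", "driver2", ("in-right-lane", "me", "segment16"))),
    ("goal", "me",
     ("belief", "driver2",
      ("goal", "me", ("in-left-lane", "me", "segment16")))),
]

def unwrap(literal):
    """Split a literal into its chain of (modal, agent) pairs and its core."""
    chain = []
    while literal[0] in MODALS:
        chain.append((literal[0], literal[1]))
        literal = literal[2]
    return chain, literal

chain, core = unwrap(working_memory[3])
```

The last literal unwraps into three modal layers around (in-left-lane me segment16): the agent wants driver2 to believe that the agent has that goal. Goals and beliefs sit side by side in the same list, which is the sense in which one memory suffices.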

A Flexible Inference Mechanism

The current ICARUS inference process is both deductive and exhaustive, making it implausible and ineffective.

The revised architecture will carry out hill climbing through a space of possible worlds (truth assignments on ground literals).

Each step will involve changing an existing literal’s truth value or generating an entirely new literal.

• ICARUS will guide its inferential choices either by posterior probabilities or by expected values.

• The system will also take into account recency of elements matched by consequents or antecedents.

This approach is influenced by Polyscheme, Markov logic, and theories of spreading activation.
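The search through possible worlds can be illustrated with a toy Python hill climber. This is a sketch under strong assumptions: rule weights stand in for the posterior probabilities or expected values mentioned above, recency is ignored, and the rules, literals, and scoring function are invented for the example.

```python
# Hypothetical weighted rules: (weight, antecedents, consequent).
RULES = [
    (2.0, {"signals-left(driver2)"}, "goal(driver2, in-left-lane)"),
    (1.0, {"goal(driver2, in-left-lane)"}, "will-merge(driver2)"),
]
EVIDENCE = {"signals-left(driver2)"}  # observed literals stay fixed

def score(world):
    """Sum the weights of rules that the truth assignment does not violate."""
    total = 0.0
    for weight, antecedents, consequent in RULES:
        fires = all(world.get(a, False) for a in antecedents)
        if not (fires and not world.get(consequent, False)):
            total += weight
    return total

def hill_climb(world):
    """Flip or add one literal at a time while the score improves."""
    literals = {l for _, ante, cons in RULES for l in ante | {cons}} - EVIDENCE
    while True:
        best, best_score = None, score(world)
        for literal in sorted(literals):
            trial = dict(world)
            trial[literal] = not world.get(literal, False)
            if score(trial) > best_score:
                best, best_score = trial, score(trial)
        if best is None:
            return world   # local optimum reached
        world = best

final = hill_climb({"signals-left(driver2)": True})
```

Each accepted step either changes a literal's truth value or introduces a literal that was absent from the world, and the search stops at a local optimum rather than computing every deductive consequence.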

Default Reasoning and Revisions

Given basic inference rules, these changes should let ICARUS make abductive leaps about others’ mental states.

By default, the agent will attribute to others the same beliefs that it holds itself.

But additional information can lead the system to revise these assumptions nonmonotonically when needed.

• E.g., we assume that others can see what we see, then alter these beliefs if we note evidence otherwise.

This explains why making inferences about others often takes extra time and effort.

Learning to Reason about Others

Reasoning about others comes more easily to the experienced than to children and novices.

We can explain this with a mechanism that learns inference rules from empirical regularities among beliefs by:

• Generating new structures based on co-occurrences of literals in working memory; and

• Updating probabilities associated with antecedents and rules based on later co-occurrences.

When these specialized rules drive inference, they mask more basic ones, reducing the need for later revisions.

This causes more direct inferences about others’ mental states, thus reaching conclusions with less time and effort.
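The two learning steps, proposing structures from co-occurring literals and updating their probabilities, can be sketched in Python with simple counts. The RuleLearner class, the restriction to single-antecedent rules, and the frequency estimate of the conditional probability are assumptions for illustration only.

```python
from collections import Counter
from itertools import permutations

class RuleLearner:
    """Estimates P(consequent | antecedent) from co-occurrence counts."""

    def __init__(self):
        self.seen = Counter()    # how often each literal appeared
        self.pairs = Counter()   # how often (antecedent, consequent) co-occurred

    def observe(self, working_memory):
        """Update counts from one snapshot of working memory (a set)."""
        for literal in working_memory:
            self.seen[literal] += 1
        for antecedent, consequent in permutations(working_memory, 2):
            self.pairs[(antecedent, consequent)] += 1

    def probability(self, antecedent, consequent):
        """Relative frequency of the consequent given the antecedent."""
        if self.seen[antecedent] == 0:
            return 0.0
        return self.pairs[(antecedent, consequent)] / self.seen[antecedent]

learner = RuleLearner()
learner.observe({"signals-left(driver2)", "goal(driver2, in-left-lane)"})
learner.observe({"signals-left(driver2)"})
```

After these two snapshots, the rule from signals-left to the goal has estimated probability 0.5, while the reverse rule has probability 1.0. Later co-occurrences would keep adjusting both estimates.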

Summary of Planned Research

To provide ICARUS with the capability to reason about others’ mental states, we plan to:

• Extend its representation to support embedded modal literals;

• Alter inference to hill climb through possible worlds guided by recencies and probabilities;

• Combine default reasoning about others with nonmonotonic revision when appropriate; and

• Acquire specialized inference rules from experience to reduce the need for such belief revision.

We will implement these extensions to ICARUS and test them in urban driving and other settings.

End of Presentation
