self-improvement through self-understanding: model-based reflection for agent adaptation


TRANSCRIPT

Page 1: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

J. William Murdock 1/42

Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

J. William Murdock
Intelligent Decision Aids Group
Navy Center for Applied Research in Artificial Intelligence
Naval Research Laboratory, Code 5515
Washington, DC 20375
[email protected]
http://bill.murdocks.org

Presentation at NIST – March 18, 2002

Page 2: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Adaptation

• People adapt very well.
  – They figure out how to do new things.
  – If something doesn’t work, they try something else.
  – They understand how and why they are doing things.
• Computer programs do not adapt very well.
  – They can only do what they are programmed for.
  – They keep making the same mistakes.
  – They have no understanding of themselves.

Can we make computer programs adapt?

Page 3: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

REM (Reflective Evolutionary Mind)

• Operating environment for intelligent agents
• Provides support for adaptation to new functional requirements
• Uses functional models, generative planning, and reinforcement learning
• J. William Murdock and Ashok K. Goel

Page 4: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Example: Web Browsing Agent

• A mock-up of web browsing software
• Based on Mosaic for X Windows, version 2.4
• Imitates not only behavior but also internal process and information of Mosaic 2.4

(Diagram: document types handled by the browser: ps, pdf, txt, html.)

Page 5: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Example: Disassembly and Assembly

• Software agent for disassembly in the domain of cameras
  – Information about cameras
  – Information about relevant actions
    • e.g., pulling, unscrewing, etc.
  – Information about disassembly processing
    • e.g., decide how to disconnect subsystems from each other and then decide how to disassemble those subsystems separately.
• Agent now needs to assemble a camera

Page 6: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

TMK (Task-Method-Knowledge)

• TMK models provide the agent with knowledge of its own design.
• TMK encodes:
  – Tasks: functional specification / requirements and results
  – Methods: behavioral specification / composition and control
  – Knowledge: domain concepts and relations

(Diagram: an Access task decomposed into Request, Receive, and Store, spanning remote and local knowledge: URLs, servers, documents, etc.)

Page 7: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

REM Reasoning Process

(Diagram: an unimplemented task and a set of input values are passed to Adaptation, which produces an adapted method and an adapted implemented task; Execution then runs the implemented task on the input values, yielding a set of output values and a trace.)

Page 8: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Adaptation Process

(Diagram: a task can be adapted by Proactive Model Transfer from a similar implemented task, by Failure-Driven Model Transfer from an existing method and a trace, or by Generative Planning from a set of input values; together with a Situator (for Q-Learning), the process produces an adapted method and an adapted implemented task.)

Page 9: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Execution Process

(Diagram: given an implemented task, its method, and a set of input values, execution cycles through Select Method, Select Next Task Within Method, and Execute Primitive Task, producing a set of output values and a trace.)
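The execution cycle above can be sketched as a tiny recursive interpreter. This is a toy stand-in, not REM's actual code; the dict-based task representation and the first-applicable-method policy are illustrative assumptions.

```python
# Sketch of the execution loop: to execute a task, either run its primitive
# procedure or select a method and execute that method's subtasks in order.
# Toy model only; task/method dicts and the selection policy are assumptions.

def execute(task, state):
    if task.get("procedure"):                          # primitive task
        return task["procedure"](state)
    method = select_method(task["methods"], state)     # "Select Method"
    for subtask in method["series"]:                   # "Select Next Task Within Method"
        state = execute(subtask, state)                # primitives run at the leaves
    return state

def select_method(methods, state):
    # Toy policy: first method whose provided condition holds.
    return next(m for m in methods if m.get("provided", lambda s: True)(state))

# A two-step task accomplished by one method:
task = {"methods": [{"series": [{"procedure": lambda s: s + ["request"]},
                                {"procedure": lambda s: s + ["receive"]}]}]}
```

In REM itself the selection points are decided by Q-learning rather than a fixed policy, and execution additionally records a trace.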

Page 10: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Selection: Q-Learning

• Popular, simple form of reinforcement learning.
• In each state, each possible decision is assigned an estimate of its potential value (“Q”).
• For each decision, preference is given to higher Q values.
• Each decision is reinforced, i.e., its Q value is altered based on the results of the actions.
• These results include actual success or failure and the Q values of the next available decisions.
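The reinforcement step described above can be sketched as a standard tabular Q-learning update. This is a generic textbook sketch, not REM's implementation; the learning-rate and discount parameters (`alpha`, `gamma`) and their values are assumptions.

```python
# Minimal tabular Q-learning sketch: Q values keyed by (state, decision),
# updated from the actual result (reward) plus the best Q value among the
# next available decisions. Illustrative only; not REM's code.

from collections import defaultdict

class QTable:
    def __init__(self, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)   # Q values start at 0
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount on future decisions (assumed value)

    def update(self, state, decision, reward, next_state, next_decisions):
        best_next = max((self.q[(next_state, d)] for d in next_decisions),
                        default=0.0)
        key = (state, decision)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])

qt = QTable()
# One reinforcement after a successful decision:
qt.update("s0", "method-A", 1.0, "s1", ["method-B", "method-C"])
```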

Page 11: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Q-Learning in REM

• Decisions are made for method selection and for selecting new transitions within a method.
• A decision state is a point in the reasoning (i.e., task, method) plus the set of all decisions which have been made in the past.
• Initial Q values are set to 0.
• REM either decides on the option with the highest Q value or randomly selects an option with probabilities weighted by Q value (configurable).
• A decision receives positive reinforcement when it leads immediately (without any other decisions) to the success of the overall task.
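The configurable selection described above (greedy versus Q-weighted random) can be sketched as follows. This is an illustrative sketch, not REM's code, and it assumes non-negative Q values for the weighted case.

```python
# Sketch of the configurable selection policy: either pick the option with
# the highest Q value, or sample with probability proportional to Q.
# Illustrative only; assumes non-negative Q values when weighted=True.

import random

def select(q_values, weighted=False, rng=random):
    # q_values: dict mapping option -> Q value
    if not weighted:
        return max(q_values, key=q_values.get)
    total = sum(q_values.values())
    if total == 0:
        return rng.choice(list(q_values))   # all-zero Q: choose uniformly
    r = rng.uniform(0, total)
    acc = 0.0
    for option, q in q_values.items():
        acc += q
        if r <= acc:
            return option
    return option
```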

Page 12: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Task-Method-Knowledge Language (TMKL)

• A new, powerful formalism of TMK developed for REM.
• Uses LOOM, a popular off-the-shelf knowledge representation framework: concepts, relations, etc.

REM models not only the tasks of the domain but also itself in TMKL.

Page 13: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Tasks in TMKL

• All tasks can have input & output parameter lists and given & makes conditions.
• A non-primitive task must have one or more methods which accomplish it.
• A primitive task must include one or more of the following: source code, a logical assertion, a specified output value.
• Unimplemented tasks have neither methods nor any of these primitive implementations.

Page 14: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

TMKL Task

(define-task communicate-with-www-server
  :input (input-url)
  :output (server-reply)
  :makes (:and (document-at-location (value server-reply)
                                     (value input-url))
               (document-at-location (value server-reply)
                                     local-host))
  :by-mmethod (communicate-with-server-method))

Page 15: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Methods in TMKL

• Methods have provided and additional result conditions which specify incidental requirements and results.
• In addition, a method specifies a start transition for its processing control.
• Each transition specifies requirements for using it and a new state that it goes to.
• Each state has a task and a set of outgoing transitions.

Page 16: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Simple TMKL Method

(define-mmethod external-display
  :provided (:not (internal-display-tag (value server-tag)))
  :series (select-display-command
           compile-display-command
           execute-display-command))

Page 17: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Complex TMKL Method

(define-mmethod make-plan-node-children-mmethod
  :series (select-child-plan-node
           make-subplan-hierarchy
           add-plan-mappings
           set-plan-node-children))

(tell (transition>links make-plan-node-children-mmethod-t3
                        equivalent-plan-nodes
                        child-equivalent-plan-nodes)
      (transition>next make-plan-node-children-mmethod-t5
                       make-plan-node-children-mmethod-s1)
      (:create make-plan-node-children-terminate transition)
      (reasoning-state>transition make-plan-node-children-mmethod-s1
                                  make-plan-node-children-terminate)
      (:about make-plan-node-children-terminate
              (transition>provided
                '(terminal-addam-value (value child-plan-node)))))

Page 18: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Knowledge in TMKL

Foundation: LOOM
– Concepts, instances, relations
– Concepts and relations are instances and can have facts about them.

Knowledge representation in TMKL involves LOOM plus some TMKL-specific reflective concepts and relations.

Page 19: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Some TMKL Knowledge Modeling

(defconcept location)
(defconcept computer :is-primitive location)
(defconcept url :is-primitive location :roles (text))
(defrelation text :range string :characteristics :single-valued)
(defrelation document-at-location :domain reply :range location)
(tell (external-state-relation document-at-location))

Page 20: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Sample Meta-Knowledge in TMKL

• relation characteristics
  – single-valued / multiple-valued
  – symmetric, commutative
• relations over relations
  – external / internal
  – state / definitional
• generic relations
  – same-as
  – instance-of
  – inverse-of
• concepts involving concepts
  – thing
  – meta-concept
  – concept

Page 21: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Web Browsing Agent

• Interactive Domain: Web agent is affected by the user and by the network
• Dynamic Domain: Both users and networks often change
• Knowledge Intensive Domain: Documents, networks, servers, local software, etc.

Mock-up of a web browser: steps through the web-browsing process.

Page 22: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Tasks and Methods of Web Agent

(Diagram: task-method decomposition. Process URL has the Process URL Method: Communicate with WWW Server, then Display File. Communicate with WWW Server has the Communicate with WWW Server Method: Request from Server, then Receive from Server. Display File has the Display File Method: Interpret Reply, then Display Interpreted File. Display Interpreted File has two methods: External Display (Select Display Command, Compile Display Command, Execute Display Command) and Internal Display (Execute Internal Display).)

Page 23: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Example: PDF Viewer

• The web agent is asked to browse the URL for a PDF file. It does not have any information about external viewers for PDF.
• Because the agent already has a task for browsing URLs, it is executed first.
• When the system fails, the user provides feedback indicating the correct viewer.
• Failure-Driven Model Transfer

Page 24: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Web Agent Adaptation

(Diagram: before adaptation, the External Display method runs Select Display Command, Compile Display Command, Execute Display Command. After adaptation, Select Display Command is a task with two methods: Select Display Command Base Method, wrapping the Select Display Command Base Task, and Select Display Command Alternate Method, wrapping the Select Display Command Alternate Task.)

Page 25: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Physical Device Disassembly

• ADDAM: Legacy software agent for case-based, design-level disassembly planning and (simulated) execution
• Interactive: Agent connects to a user specifying goals and to a complex physical environment
• Dynamic: New designs and demands
• Knowledge Intensive: Designs, plans, etc.

Page 26: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Disassembly → Assembly

• A user with access to the ADDAM disassembly agent wishes to have this agent instead do assembly.
• ADDAM has no assembly method and thus must adapt first.
• Since assembly is similar to disassembly, REM selects Proactive Model Transfer.

Page 27: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Pieces of ADDAM which are Key to Disassembly → Assembly

(Diagram: Disassemble is accomplished by Plan Then Execute Disassembly: Adapt Disassembly Plan, then Execute Plan. Under Adapt Disassembly Plan: Topology Based Plan Adaptation and Make Plan Hierarchy, with the Make Equivalent Plan Nodes Method (Make Equivalent Plan Node, Add Equivalent Plan Node) and Map Dependencies (Select Dependency, Assert Dependency). Under Execute Plan: Hierarchical Plan Execution (Select Next Action, Execute Action).)

Page 28: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

New Adapted Task in Disassembly → Assembly

(Diagram: the adapted structure for the new Assemble task mirrors the one above, with COPIED versions of Plan Then Execute Disassembly, Adapt Disassembly Plan, Execute Plan, Topology Based Plan Adaptation, Make Plan Hierarchy, Make Equivalent Plan Nodes Method, Make Equivalent Plan Node, Add Equivalent Plan Node, Map Dependencies, Select Dependency, and Hierarchical Plan Execution. Assert Dependency is INVERTED; two new tasks, INSERTED Inversion Task 1 and INSERTED Inversion Task 2, are added; Select Next Action and Execute Action are reused unchanged.)

Page 29: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Task: Assert Dependency

Before:
define-task Assert-Dependency
  input: target-before-node, target-after-node
  asserts: (node-precedes (value target-before-node)
                          (value target-after-node))

After:
define-task Mapped-Assert-Dependency
  input: target-before-node, target-after-node
  asserts: (node-follows (value target-before-node)
                         (value target-after-node))
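The transfer step shown above rewrites a task's assertion by swapping a relation for its inverse (node-precedes becomes node-follows). A minimal sketch of that rewrite, with a hypothetical inverse table and tuple representation (not REM's actual code):

```python
# Sketch of relation inversion during model transfer: swap the relation in
# an assertion for its known inverse, leaving its arguments untouched.
# The INVERSES table and the tuple encoding are illustrative assumptions.

INVERSES = {"node-precedes": "node-follows",
            "node-follows": "node-precedes"}

def invert_assertion(assertion):
    # assertion is a tuple like ("node-precedes", before, after)
    relation, *args = assertion
    return (INVERSES.get(relation, relation), *args)

before = ("node-precedes", "target-before-node", "target-after-node")
after = invert_assertion(before)
# after == ("node-follows", "target-before-node", "target-after-node")
```

In REM this kind of rewrite is driven by the model's own meta-knowledge (the inverse-of relation) rather than a hand-built table.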

Page 30: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Task: Make Equivalent Plan Node

define-task make-equivalent-plan-node
  input: base-plan-node, parent-plan-node, equivalent-topology-node
  output: equivalent-plan-node
  makes: (:and
           (plan-node-parent (value equivalent-plan-node)
                             (value parent-plan-node))
           (plan-node-object (value equivalent-plan-node)
                             (value equivalent-topology-node))
           (:implies (plan-action (value base-plan-node))
                     (type-of-action (value equivalent-plan-node)
                                     (type-of-action (value base-plan-node)))))
  by procedure ...

Page 31: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Task: Inserted-Reversal-Task

define-task inserted-reversal-task
  input: equivalent-plan-node
  asserts: (type-of-action
             (value equivalent-plan-node)
             (inverse-of
               (type-of-action
                 (value equivalent-plan-node))))
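The inserted task above replaces each plan node's action type with its inverse, so a disassembly step such as unscrewing becomes a screwing step in the assembly plan. A toy sketch of that step (the action names and plan-node dict shape are illustrative assumptions, not ADDAM's representation):

```python
# Sketch of the inserted reversal step: replace a plan node's action type
# with its inverse when transferring a disassembly plan to assembly.
# INVERSE_OF entries and the dict-based plan node are assumptions.

INVERSE_OF = {"unscrew": "screw", "screw": "unscrew",
              "pull": "push", "push": "pull"}

def reverse_action(plan_node):
    node = dict(plan_node)  # copy; keep the original node unchanged
    node["type-of-action"] = INVERSE_OF[node["type-of-action"]]
    return node

node = {"object": "camera-back", "type-of-action": "unscrew"}
# reverse_action(node)["type-of-action"] == "screw"
```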

Page 32: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

ADDAM Example: Layered Roof

Page 33: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Roof Assembly

(Chart: elapsed time in seconds, on a log scale from 1 to 1,000,000, versus number of boards, 1 through 7, for REM: Meta-CBR, REM: Graphplan, and REM: Q-Learning.)

Page 34: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Modified Roof Assembly: No Conflicting Goals

(Chart: elapsed time in seconds, on a log scale from 1 to 100,000, versus number of boards, 1 through 7, for REM: Meta-CBR, REM: Graphplan, and REM: Q-Learning.)

Page 35: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Applicability of Proactive Model Transfer

• Knowledge about the concepts and relations in the domain
• Knowledge about how the tasks and methods affect these concepts and relations
• Differences between the old task and the new map onto knowledge of the concepts and relations in the domain.

Page 36: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Applicability of Failure-Driven Model Transfer

• May need less knowledge about the domain itself since the adaptation is grounded in a specific incident.
  – e.g., feedback about PDF for an example instead of advance knowledge of all document types.
• Still requires knowledge about how the tasks and methods interact with the domain.

Page 37: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Additional Mechanisms

• Model-based adaptation may leave some design decisions unsolved.
  – These decisions may be solved by traditional decision-making mechanisms, e.g., reinforcement learning.
• Models may be unavailable or irrelevant for some tasks or subtasks.
  – Generative planning can combine primitive actions.

Page 38: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Level of Decomposition

• Level of decomposition may be dictated by the nature of the agent.
  – Some tasks simply cannot be decomposed.
• In other situations, level of decomposition may be guided by the nature of the adaptation to be done.
  – Can be brittle if unpredicted demands arise.
• REM enables autonomous decomposition of primitives, which addresses this problem.

Page 39: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Computational Costs

• Reasoning about models incurs some costs.
  – For very easy problems, this overhead may not be justified.
  – For other problems, the benefits enormously outweigh these costs.

Models can localize planning and learning.

Page 40: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Knowledge Requirements

• Someone has to build an agent.
• The builder should know what that agent does and how it does it, and so can make a model.
• An analyst may be able to understand the builder’s notes, etc., and so can make a model.
• Some evidence for this exists in the context of software engineering / architectural extraction.

Page 41: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Current Work: AHEAD

• Theme: Analyzing hypotheses regarding asymmetric threats (e.g., criminals, terrorists).
  – Input: Hypotheses regarding a potential threat
  – Output: Argument for and/or against the hypotheses
• Technique: Analogy over functional models
  – An extension to TMKL will encode known behaviors for asymmetric threats and the purposes that the behaviors serve.
  – Analogical reasoning will enable retrieval and mapping of new hypotheses to existing models.
  – Models will provide arguments about how observed actions do or do not support the purposes of the hypothesized behavior.
• Naval Research Laboratory / DARPA Evidence Extraction and Link Discovery program
• David Aha, J. William Murdock, Len Breslow


Page 42: Self-Improvement through Self-Understanding: Model-Based Reflection for Agent Adaptation

Summary

• REM (Reflective Evolutionary Mind)
  – Operating environment for agents that adapt
• TMKL (Task-Method-Knowledge Language)
  – The language for agents in REM
  – Functional modeling language for encoding computational processes
• Adaptation
  – Some kinds of adaptation can be performed using specialized model-based techniques
  – Others require more generic planning & learning mechanisms (localized using models)