Integrated Learning & Training for Interactive Characters

Bruce Blumberg & the Synthetic Characters Group
www.media.mit.edu/~bruce

Post on 20-Dec-2015


Page 1

Integrated Learning & Training for Interactive Characters

Bruce Blumberg & the Synthetic Characters Group

www.media.mit.edu/~bruce

Page 2

Field work…

Page 3

Where we have been and are going…

Page 4

Practical & compelling real-time learning

• Easy for interactive characters to learn what they ought to be able to learn
• Easy for a human trainer to guide learning process
• A compelling user experience
• Provide heuristics and practical design principles

Page 5

Our bias & focus

• Learning occurs within an innate structure that biases…
  • Attention
  • Motivation
  • Innate frequency, form and organization of behavior
  • When certain things are most easily learned
• What are the catalytic components of the scaffolding that dramatically facilitate the learning & training process?

Page 6

Where we draw from…

• Reinforcement learning
  • Barto & Sutton 98, Mitchell 97, Kaelbling 90, Drescher 91
• Animal training and ethology
  • Lindsay 00, Lorenz & Leyhausen 73, Ramirez 99, Pryor 99, Coppinger 01
• Motor learning
  • van de Panne et al 93,94, Grzeszczuk & Terzopoulos 95, Hodgins & Pollard 97, Gleicher 98, Faloutsos et al 01
• Behavior Architectures
  • Reynolds 87, Tu & Terzopoulos 94, Perlin & Goldberg 96, Funge et al 99, Burke et al 01
• Computer games & digital pets
  • Dogz, AIBO, Black & White

Page 7

Dobie T. Coyote Goes to School

[Short Dobie video]

Page 8

Reinforcement Learning (R.L.) As Starting Point

      A1       A2       A3
S1    Q(1,1)   Q(1,2)   Q(1,3)
S2    Q(2,1)   Q(2,2)   Q(2,3)
S3    Q(3,1)   Q(3,2)   Q(3,3)

Rows: set of all possible states of world. Columns: set of all possible actions. For example, Q(2,3) is the utility of taking action A3 in state S2.

• Dogs solve a simpler problem in a much larger space & one that is more relevant to interactive characters
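The Q table on this slide can be read as a lookup from a (state, action) pair to a utility. The following is a minimal sketch of that reading, not the authors' implementation: the utility numbers are invented for illustration, and the greedy choice rule is an assumption that the slide does not state.

```python
# Hypothetical utilities: the numbers are invented for illustration only.
ACTIONS = ["A1", "A2", "A3"]

# q[state][action] plays the role of Q(state, action) in the slide's table.
q = {
    "S1": {"A1": 0.1, "A2": 0.5, "A3": 0.2},
    "S2": {"A1": 0.0, "A2": 0.3, "A3": 0.9},
    "S3": {"A1": 0.4, "A2": 0.4, "A3": 0.1},
}


def utility(state: str, action: str) -> float:
    """Return Q(state, action), the utility of taking `action` in `state`."""
    return q[state][action]


def greedy_action(state: str) -> str:
    """Pick the highest-utility action in `state` (on a tie, the first listed)."""
    return max(ACTIONS, key=lambda a: q[state][a])


print(utility("S2", "A3"))   # the entry the slide calls Q(2,3)
print(greedy_action("S2"))
```

Even in this toy form, the table needs one entry per state-action pair, which is why its size becomes a concern once the state and action sets grow.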

Page 9

The problem facing dogs (real and synthetic)

[Diagram labels: Set of all possible actions; Set of all motivational goals; Set of all possible stimuli]

What do I do, when, in order to best satisfy my motivational goals?

Page 10

The space of possible stimuli is wicked big

[Diagram: State Space, the set of all possible stimuli. Axes: Modality of Stimuli; Time of Occurrence. Modalities shown: Smells, Motion, Sounds, Dog sounds, Speech, Whistles]

Page 11

The space of possible actions is also very big

[Diagram: Action Space, the set of all possible actions. Axes: Action; Time of Performance. Actions shown: Figure-8, Shake, Low shake, High-5, Beg, Down, Left ear twitch]

Page 12

Who gets credit for good things happening?

[Diagram: reward (“Yumm..”) shown against Action (Figure-8, Shake, Low shake, High-5, Beg, Down, Left ear twitch) and Modality of Stimuli (Motion, Sounds, Dog sounds, Speech, Whistles)]

Page 13

Dogs seem to constrain search for causal agents

[Timeline: Sit, Approach, Eat]

• Attention Window: Cue given immediately before or as dog is moving into desired pose
• Consequences Window: Trainer “clicks”, signaling reward is coming. When reward is actually received

Dogs make the problem tractable by constraining search for causal agents to narrow temporal windows

Page 14

Dogs seem to use implicit feedback to guide perceptual learning

[Timeline. Actions: Sit, Approach, Eat. Events: “sit-utterance” perceived; dog decides to sit; “click” perceived]

Build & update perceptual model of “sit-utterance”

Dogs use rewarded action to identify potentially promising state to explore and to guide formation of perceptual models

Page 15

Dogs seem to give credit where credit is due

[Timeline. Actions: Sit, Approach, Eat. Events: “sit-utterance” perceived; dog decides to sit; “click” perceived]

1. Credit sitting in presence of “sit-utterance”
2. Build & update perceptual model of “sit-utterance”

Page 16

Dogs seem to give credit where credit is due…

• Trainer repeatedly lures dog through a trajectory or into a pose
• Eventually, dog performs behavior spontaneously
• Implication
  • Dog associates reward with resulting body configuration or trajectory and not just with “follow-your-nose”

Page 17

D.L.: Take Advantage of Predictable Regularities

• Constrain search for causal agents by taking advantage of temporal proximity & natural hierarchy of state spaces
  • Use consequences to bias choice of action
  • But vary performance and attend to differences
• Explore state and action spaces on “as-needed” basis
  • Build models on demand

Page 18

D.L.: Make Use of All Feedback: Explicit & Implicit

• Use rewarded action as context for identifying
  • Promising state space and action space to explore
  • Good examples from which to construct perceptual models, e.g.,
    • A good example of a “sit-utterance” is one that occurs within the context of a rewarded Sit.

Page 19

D.L.: Make Them Easy to Train

• Respond quickly to “obvious” contingencies
• Support Luring and Shaping
  • Techniques to prompt infrequently expressed or novel motor actions
• “Trainer friendly” credit assignment
  • Assign credit to candidate that matches trainer’s expectation

Page 20

The System

Page 21

Representation of State: Percept

• Percepts are atomic perception units
• Recognize and extract features from sensory data
• Model-based
• Organized in dynamic hierarchy

Page 22

Representation of State-Action Pairs: Action Tuples

[Diagram labels: Percept; Action; Value; Novelty; Reliability; Percept Activation]

Action Tuples are organized in dynamic hierarchy and compete probabilistically based on their learned value and reliability
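One way to picture Action Tuples competing probabilistically is a weighted random draw. This is a minimal sketch under stated assumptions: the field names follow the diagram labels, but the scoring rule (weight = value × reliability) and the `compete` helper are hypothetical, since the slides do not give the actual formula.

```python
import random
from dataclasses import dataclass


@dataclass
class ActionTuple:
    percept: str        # triggering context, e.g. "sit-utterance"
    action: str         # e.g. "Sit"
    value: float        # learned value of doing `action` given `percept`
    reliability: float  # how consistently the action has been rewarded
    novelty: float = 0.0


def selection_weights(tuples):
    """Assumed scoring rule: weight = value * reliability, floored at a tiny epsilon."""
    return [max(t.value * t.reliability, 1e-6) for t in tuples]


def compete(tuples, rng=random):
    """Sample one tuple with probability proportional to its weight."""
    return rng.choices(tuples, weights=selection_weights(tuples), k=1)[0]


candidates = [
    ActionTuple("sit-utterance", "Sit", value=1.0, reliability=0.9),
    ActionTuple("true", "Scratch", value=0.1, reliability=0.1),
]
print(compete(candidates).action)  # usually "Sit", occasionally "Scratch"
```

Sampling rather than always taking the top-scoring tuple leaves room for low-weight tuples to be tried now and then.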

Page 23

Representation of Action: Labeled Path Through Space of Body Configurations

• A motor program generates a path through a graph of annotated poses, e.g.,
  • Sit animation
  • Follow-your-nose procedure
• Paths can be compared and classified just like perceptual events using Motor Model Percepts

Page 24

Use Time to Constrain Search for Causal Agents

[Timeline: Scratch, Sit, Good Thing]

• Attention Window: Look here for cues that appear correlated with increased likelihood of action being followed by a good thing
• Consequences Window: Assume any good or bad things that happen here are associated with the preceding action and the context in which it was performed
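The two windows can be sketched as simple time filters over a log of perceived events. This is an illustrative sketch only: the `Event` type, the helper names, and the window lengths (1 s and 2 s) are assumptions, and the slides do not say how the windows are sized.

```python
from dataclasses import dataclass


@dataclass
class Event:
    label: str
    time: float  # seconds since the start of the session


def events_in_window(events, start, end):
    """Return the events whose timestamp falls in the closed interval [start, end]."""
    return [e for e in events if start <= e.time <= end]


def candidate_cues(events, action_start, attention_span=1.0):
    """Attention window: cues perceived shortly before the action begins."""
    return events_in_window(events, action_start - attention_span, action_start)


def consequences(events, action_end, consequence_span=2.0):
    """Consequences window: good or bad things shortly after the action ends."""
    return events_in_window(events, action_end, action_end + consequence_span)


log = [
    Event("whistle", 0.5),
    Event("sit-utterance", 4.6),
    Event("click", 6.2),
    Event("doorbell", 12.0),
]
# Suppose Sit ran from t=5.0 to t=6.0.
print([e.label for e in candidate_cues(log, action_start=5.0)])
print([e.label for e in consequences(log, action_end=6.0)])
```

Events outside both windows (the whistle and the doorbell here) are never considered as causes or consequences of Sit, which is what keeps the search small.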

Page 25

Four Important Tasks Are Performed During Credit Assignment

• Choose most worthy Action Tuple heuristically based on reliability and novelty statistics
• Update value
• Create new Action Tuples as appropriate
• Guide State and Action Space Discovery
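The first two tasks, choosing the most worthy Action Tuple and updating its value, might look roughly like the sketch below. Everything specific here is an assumption made for illustration: reliability as successes over trials, novelty as 1 / (1 + trials), the tie-breaking order, and the learning rate are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class ActionTuple:
    percept: str
    action: str
    value: float = 0.0
    trials: int = 0
    successes: int = 0

    @property
    def reliability(self) -> float:
        """Fraction of credited trials that ended in reward."""
        return self.successes / self.trials if self.trials else 0.0

    @property
    def novelty(self) -> float:
        """Assumed novelty measure: high when the tuple has rarely been tried."""
        return 1.0 / (1 + self.trials)


def most_worthy(candidates):
    """Assumed heuristic: prefer reliable tuples, break ties toward novel ones."""
    return max(candidates, key=lambda t: (t.reliability, t.novelty))


def assign_credit(candidates, reward, learning_rate=0.2):
    """Credit a single tuple: record the trial and move its value toward the reward."""
    winner = most_worthy(candidates)
    winner.trials += 1
    if reward > 0:
        winner.successes += 1
    winner.value += learning_rate * (reward - winner.value)
    return winner


generic = ActionTuple("true", "Sit", trials=10, successes=3)
cued = ActionTuple("sit-utterance", "Sit", trials=5, successes=4)
print(assign_credit([generic, cued], reward=1.0).percept)
```

Only the chosen tuple's statistics change, so a generic tuple for the same action does not gain value from rewards that the more specific tuple explains better.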

Page 26

Implicit Feedback Guides State Space Discovery

[Timeline: Scratch, Beg, Good Thing; utterance “beg”]

• Utterance occurs within window but not classified by any existing percept
• Good Thing appears. Create a new Percept with “beg” example as initial model

This means that Percepts are only created to recognize “promising” utterances

Page 27

Implicit Feedback Identifies Good Examples

[Timeline: Scratch, Beg, Good Thing; utterance “beg”]

• Classify utterance as “beg”.
• Good Thing appears. Update model of “beg” utterance using “beg” that occurred in attention window

This means model is built using good examples

Page 28

Unrewarded Examples Don’t Get Added to Models

[Timeline: Scratch, Beg, Sit; utterance “Leg”]

• Utterance classified as “Beg” by mistake. Beg becomes active.
• Beg ends without food appearing. Do not update model since example may have been bad.

Actually, bad examples can be used to build model of “not-Beg.”
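The rule that percept models are created and updated only from rewarded examples can be sketched as follows. This is a toy illustration, not the system's classifier: the `UtterancePercept` class, the running-mean model, and the made-up feature vectors are all assumptions.

```python
class UtterancePercept:
    """Toy perceptual model: the mean of the feature vectors it has accepted."""

    def __init__(self, label, first_example):
        self.label = label
        self.mean = list(first_example)
        self.count = 1

    def update(self, example):
        """Fold one more good example into the running mean."""
        self.count += 1
        self.mean = [m + (x - m) / self.count for m, x in zip(self.mean, example)]


def on_action_finished(percepts, label, example, rewarded):
    """Create or update the percept for `label` only if the action was rewarded."""
    if not rewarded:
        return percepts.get(label)   # unrewarded example: leave the model alone
    if label not in percepts:
        percepts[label] = UtterancePercept(label, example)   # first good example
    else:
        percepts[label].update(example)
    return percepts[label]


percepts = {}
on_action_finished(percepts, "beg", [1.0, 3.0], rewarded=True)       # creates the model
on_action_finished(percepts, "beg", [3.0, 5.0], rewarded=True)       # refines it
on_action_finished(percepts, "beg", [100.0, 100.0], rewarded=False)  # ignored
print(percepts["beg"].count, percepts["beg"].mean)
```

A "not-Beg" model, as the last line of the slide suggests, could be kept the same way by routing the unrewarded examples to a second percept instead of discarding them.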

Page 29

Most Worthy Action Tuple Gets Credit

[Timeline. Actions: Sit, Good Thing. Events: “sit-utterance” perceived; <true/Sit> begins; “click” perceived]

But credit goes to <“sit-utterance”/Sit>

Page 30

Create New Action Tuples As Appropriate

Page 31

Implicit Feedback Guides Action Space Discovery

[Timeline: Follow-your-nose, Good Thing; path label: Down]

• “Follow-your-nose” action accumulates path through pose-space
• Good Thing appears. Compare accumulated path to known paths

Down gets the credit for Good Thing appearing, rather than “Follow-your-nose.”

Page 32

If Path Is Novel, Create a New Motor Program and Action

[Timeline: Follow-your-nose, Good Thing; path label: Figure-8]

• “Follow-your-nose” action accumulates path through pose-space
• Good Thing appears. Compare accumulated path to known paths

Figure-8 is created and subsequent examples of Figure-8 are used to improve model of path
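Comparing an accumulated path against known paths, and creating a new action when nothing matches, could be sketched like this. The pose names, the mismatch-fraction distance, and the threshold are hypothetical; the slides say only that paths are compared and that a novel path becomes a new motor program.

```python
def path_distance(a, b):
    """Assumed metric: fraction of positions where two pose sequences differ,
    with any length mismatch counted as differences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    same = sum(1 for x, y in zip(a, b) if x == y)
    return 1.0 - same / longest


def credit_path(known_paths, accumulated, new_name, threshold=0.34):
    """Return the action to credit; register `new_name` if the path is novel."""
    best_name, best_dist = None, float("inf")
    for name, path in known_paths.items():
        d = path_distance(path, accumulated)
        if d < best_dist:
            best_name, best_dist = name, d
    if best_name is not None and best_dist <= threshold:
        return best_name                        # close enough to a known action
    known_paths[new_name] = list(accumulated)   # novel path: new motor program
    return new_name


known = {"Down": ["stand", "crouch", "down"], "Sit": ["stand", "sit"]}
print(credit_path(known, ["stand", "crouch", "down"], "New-1"))
print(credit_path(known, ["stand", "turn-left", "turn-right", "stand"], "Figure-8"))
```

In the first call the lured path matches Down, so Down rather than the luring action receives credit; in the second, no known path is close, so Figure-8 is added to the repertoire.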

Page 33

Dobie T. Coyote…

[Long Dobie video]

Page 34

Limitations and Future Work

• Important extensions
  • Other kinds of learning (e.g., social or spatial)
  • Generalization
  • Sequences
  • Expectation-based emotion system
• How will the system scale?

Page 35

Useful Insights

• Use
  • Temporal proximity to limit search.
  • Hierarchical representations of state, action and state-action space & use implicit feedback to guide exploration
  • “Trainer friendly” credit assignment
• Luring and shaping are essential

Page 36

Future work: the problem of sequences…

Page 37

Who gets credit for good things happening?

[Timeline: orient, eye, stalk, chase, grab-bite, kill-bite, then “Yumm..”]

Pages 38-40

Conventional idea: back propagation from goal

[Timeline: orient, eye, stalk, chase, grab-bite, kill-bite, then “Yumm..”. Credit flows backward]

Page 41

Back propagation is a slow way to learn…

• Search space is potentially huge
• Individual elements of sequence may need to be perfected in order to reach goal at all.
• Necessary but rarely successful behaviors may be very difficult to learn.
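To see why credit that flows backward from the goal reaches early elements weakly, consider a simple discounted scheme. This is an illustrative simplification rather than the method the slides critique in detail: the per-step discount of 0.8 and the `backward_credit` helper are assumptions.

```python
# Element names as shown on the sequence slides.
SEQUENCE = ["orient", "eye", "stalk", "chase", "grab-bite", "kill-bite"]


def backward_credit(sequence, reward=1.0, discount=0.8):
    """Give the final element the full reward and each earlier one a discounted share."""
    credit = {}
    share = reward
    for element in reversed(sequence):
        credit[element] = share
        share *= discount
    return credit


for element, amount in backward_credit(SEQUENCE).items():
    print(f"{element:10s} {amount:.3f}")
```

Under these assumptions the first element, orient, receives about a third of the credit that kill-bite does, and it receives nothing at all on the many attempts that never reach the goal.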

Pages 42-43

Leyhausen’s suggestion…

[Timeline: orient, eye, stalk, chase, grab-bite, kill-bite, each labeled “motivation & reward”]

Each element is innately self-motivating and has innate reward metric

Page 44

Functional goal plays incidental role

[Timeline: orient, eye, stalk, chase, grab-bite, kill-bite, then “Yumm..”]

Propagated value from functional goal plays incidental role

Page 45

Coppinger’s suggestion, part 1…

[Timeline: orient, eye, stalk, chase, grab-bite, kill-bite. Labels: Internal motivation; External motivation]

Varying innate tendency to follow behavior with “next” in sequence

Page 46

Coppinger’s suggestion, part 2

[Diagram labels: Wolf; Border Collie; Livestock Guarding dog]

Page 47

Coppinger’s suggestion, part 3

[Diagram, after Coppinger: Border Collie and Livestock Guarding dog. Labels: Sensitive period for social development; Onset of predatory behaviors]

Border Collies incorporate predatory patterns into social play because of early onset of these patterns

Page 48

Future work: the problem of sequences

• What can we learn from how animals “learn” sequences?
  • Sequence may be “learned” apart from function
  • Elements may be self-motivating and have local metric of goodness
  • Innate bias of varying degree to perform all or part of “sequence”
  • Role of developmental timing in determining function

Page 49

Future work: learning from learning

• Does it pay to explore?
  • Does exploring lead to good things or bad things?
• Best environment in which to explore?
• What can be learned from watching others?
• How do I know if I am doing it right?

Page 50

Predictability & control

• Is it a predictable world?
  • Can I predict the potential arrival of good or bad things so as to:
    • Increase chances of good thing happening
    • Decrease chances of bad thing happening
• Is it a controllable world?
  • Can I control the world so as to satisfy my motivational goals?

Page 51

P & C are learned, and in turn affect learning

Bad Thing                                         |                    | Good Thing
I got smacked by surprise                         | Increase attention | I got a cookie by surprise
I will get smacked unless I do something (maybe)  | Active exploration | I will get a cookie if I do something (maybe)
I will get smacked unless I run away              | Exploitation       | I will get a cookie if I sit
I will get smacked no matter what I do            | Low exploration    | I will get a cookie no matter what I do
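The rows of this table can be read as a mapping from what the character believes about an outcome to how it should learn next. The sketch below encodes that reading; the boolean encoding of predictability and the three-valued encoding of controllability are assumptions made for illustration, not something the slide specifies.

```python
def learning_mode(predictable, controllable):
    """Map beliefs about an outcome to the response listed in the table.

    `controllable` is True (a known action works), False (nothing works),
    or None (some action might work, but which one is not yet known).
    """
    if not predictable:
        return "increase attention"   # the outcome arrived by surprise
    if controllable is None:
        return "active exploration"   # "if I do something (maybe)"
    if controllable:
        return "exploitation"         # "if I sit" / "unless I run away"
    return "low exploration"          # "no matter what I do"


print(learning_mode(predictable=False, controllable=None))
print(learning_mode(predictable=True, controllable=None))
print(learning_mode(predictable=True, controllable=True))
print(learning_mode(predictable=True, controllable=False))
```

The same function serves both columns of the table, since the response depends on predictability and controllability rather than on whether the outcome is a smack or a cookie.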

Page 52

Predictability and Controllability

[Quadrant diagram, after Lindsay. Quadrants: -p/c; -p/-c; p/c; p/-c. States shown: Anxiety, Confidence, Depression, Frustration]

Legend: p = predictable; -p = unpredictable; c = controllable; -c = uncontrollable

Page 53

Practical & compelling real-time learning

• Easy for interactive characters to learn what they ought to be able to learn
• Easy for a human trainer to guide learning process
• A compelling user experience
• Provide heuristics and practical design principles

Page 54

Acknowledgements

• Members of the Synthetic Characters Group, past, present & future
• Gary Wilkes
• Funded by the Digital Life Consortium

Page 55

The problem

• If each element in sequence has 3 variants, there are 729 possible combinations of which 1 may work (ignoring stimuli)
• If there are 12 possible stimuli, there are 1,586,874,322,944 possible combinations of stimuli-action pairs to explore.
• Don’t know if it is the right sequence until goal is reached
• What happens if “variant” needs to be learned?
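The counts on this slide can be checked with a few lines of arithmetic. The sketch assumes the sequence has the six elements named on the earlier sequence slides (orient, eye, stalk, chase, grab-bite, kill-bite). The slide does not show how its larger figure is derived; the code only notes that it equals 108 to the sixth power.

```python
elements = ["orient", "eye", "stalk", "chase", "grab-bite", "kill-bite"]
variants_per_element = 3

# Three variants for each of six elements.
action_combinations = variants_per_element ** len(elements)
print(action_combinations)           # 729

# The slide's larger figure equals 108 ** 6, i.e. 108 choices per element.
# How the slide arrives at 108 choices is not stated.
print(108 ** len(elements))          # 1586874322944

# For comparison only: pairing each of 3 variants with one of 12 stimuli
# would give 36 choices per element.
print((3 * 12) ** len(elements))     # 2176782336
```

Whichever way the stimuli are counted, the number of combinations is far too large to explore by trial and error when success is only known once the goal is reached.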

Page 56

Big idea: innate biases facilitate learning

• Biases include…
  • Temporal Proximity implies causality
  • Attend more readily to certain classes of stimuli than to others (motion vs. speech)
  • Lazy discovery (pay attention once you have a reason to pay attention)
  • Elements may be “innately” self-motivating and have local metric of “goodness”