
Page 1: Holland and Goodman – Caltech – Banbury 2001

Holland and Goodman – Caltech – Banbury 2001

(Autonomous robots) + (Dynamic environment) + (Intelligent control) = Consciousness?

Owen Holland and Rod Goodman

California Institute of Technology

The Banbury Center, Cold Spring Harbor Laboratory

May 13-16 2001

Page 2: Holland and Goodman – Caltech – Banbury 2001


Some facts (for engineers)

Animals are autonomous embodied entities with a particular mission – propagation of their genes

Animal brains are controllers evolved to achieve that mission

Humans are the most intelligent animals

Humans are the only animals known to be conscious

Page 3: Holland and Goodman – Caltech – Banbury 2001

Hypothesis

When an autonomous embodied system, with a difficult animal-like mission in a difficult environment, has a sufficiently high level of intelligence (i.e. is able to achieve that mission well), then it may exhibit consciousness, either as a necessary component for achieving the mission, or as a by-product.

Page 4: Holland and Goodman – Caltech – Banbury 2001

Strategies for building a conscious machine

(A) Build the most intelligent robot we can, with the right sort of mission in the right sort of environment, and see if it’s conscious

Page 5: Holland and Goodman – Caltech – Banbury 2001

Why (A) is not a good idea

We don’t know how difficult the mission or the environment would have to be

We don’t know whether the robot would be intelligent enough

And even if the robot turned out to be conscious, and we could prove that it was, we wouldn’t know exactly why

Page 6: Holland and Goodman – Caltech – Banbury 2001

Strategies for building a conscious machine

(B) Build a dumb robot, with a simple mission, in a simple environment.

If it can cope, make the mission and environment more difficult until it can’t. Then make the robot smarter until it can cope.

Repeat until conscious.

Page 7: Holland and Goodman – Caltech – Banbury 2001

Why (B) is a better idea than (A)

It probably reflects what happened during our evolution.

We can start now, because we know how to build robots too dumb to be conscious.

If we detect the appearance of consciousness after increasing the robot’s intelligence, we have a chance of identifying what underlies consciousness.

Page 8: Holland and Goodman – Caltech – Banbury 2001

What brains do (1)

Simple brains (nervous systems) perform mappings from sensory inputs to motor outputs; the interaction of this process with the environment produces behavior.

This can be a very effective strategy, especially when coupled with the use of the environment as an external memory (stigmergy).

We can build excellent real robots (behavior-based, collective) using these principles.
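
To make the sensor-to-motor mapping concrete, here is a minimal Python sketch of a purely reactive controller in this spirit: two wheel speeds computed directly from eight IR proximity readings, with no internal state at all. The weight values are illustrative assumptions, not the authors' controller.

```python
import numpy as np

def reactive_step(ir: np.ndarray) -> np.ndarray:
    """Map 8 IR proximity readings (0..1, 1 = very close) to 2 wheel speeds."""
    base = np.array([0.5, 0.5])  # cruise forward
    # Braitenberg-style coupling: an obstacle on the left speeds up the left
    # wheel and slows the right one, steering the robot away from it.
    weights = np.array([
        [ 0.4,  0.3,  0.1, 0.0, 0.0, -0.1, -0.3, -0.4],  # left wheel
        [-0.4, -0.3, -0.1, 0.0, 0.0,  0.1,  0.3,  0.4],  # right wheel
    ])
    return base + weights @ ir

# Obstacle on the front-left: the robot veers right.
print(reactive_step(np.array([0.9, 0.2, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])))
```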

Page 9: Holland and Goodman – Caltech – Banbury 2001

What brains do (2)

There seems to be little doubt that complex brains build and exploit models.

Craik (1943) ‘The brain models external reality’

But we still know very little about how the models are built, what they are like, and how they are used.

Page 10: Holland and Goodman – Caltech – Banbury 2001

What intelligent engineering control systems do

The most powerful control systems for controlling unknown state-rich plants (e.g. chemical plants) in complex, dynamic, uncertain environments are adaptive model-based predictive controllers.

They build internal models of the behavior of the plant in the environment, and use them to predict plant behavior and compute appropriate control actions.
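
As a rough illustration, here is a minimal sketch of the adaptive model-based predictive idea: fit a one-step linear plant model online from observed data, then pick the action whose predicted outcome lands closest to the setpoint. Real industrial controllers are far richer; every name and constant here is an assumption for illustration.

```python
import numpy as np

class AdaptivePredictiveController:
    """Toy adaptive MPC: learn x' ~ a*x + b*u online, predict, then act."""

    def __init__(self, setpoint: float):
        self.a, self.b = 0.0, 0.0   # plant model parameters, learned from data
        self.setpoint = setpoint

    def update_model(self, x, u, x_next, lr=0.1):
        # Gradient step on the squared one-step prediction error.
        err = x_next - (self.a * x + self.b * u)
        self.a += lr * err * x
        self.b += lr * err * u

    def act(self, x, candidates=np.linspace(-1.0, 1.0, 21)):
        # Predict the next state for each candidate action; pick the action
        # whose prediction lands closest to the setpoint.
        pred = self.a * x + self.b * candidates
        return candidates[np.argmin(np.abs(pred - self.setpoint))]

ctrl = AdaptivePredictiveController(setpoint=1.0)
x = 0.0
for _ in range(200):               # interact with a hidden toy plant
    u = ctrl.act(x)
    x_next = 0.8 * x + 0.5 * u     # the "real" plant, unknown to the controller
    ctrl.update_model(x, u, x_next)
    x = x_next
print(round(x, 2))                 # settles near the setpoint of 1.0
```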

Page 11: Holland and Goodman – Caltech – Banbury 2001

Where do models come from?

Where does a control system’s model of something come from?

- it’s built-in

- it’s partially built-in, and is modified by experience

- it’s taught

- it’s built from scratch

Page 12: Holland and Goodman – Caltech – Banbury 2001

Improvement by exercise

Once a model has been acquired or updated by exposure to the world, it can often be improved by exercising or ‘running’ it.

- e.g. Sutton’s Dyna architectures for efficient reinforcement learning in maze environments
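
For reference, a minimal tabular sketch of the Dyna idea: each real experience updates the value estimates directly and is also stored in a model, and the model is then replayed for a few extra "imagined" updates. The grid states, action names, and constants are illustrative.

```python
import random
from collections import defaultdict

ACTIONS = ["up", "down", "left", "right"]
Q = defaultdict(lambda: defaultdict(float))   # state -> action -> value
model = {}                                    # (state, action) -> (reward, next state)

def dyna_q_step(s, a, r, s2, alpha=0.1, gamma=0.95, n_planning=10):
    # 1. Direct reinforcement learning from the real transition.
    best_next = max(Q[s2][b] for b in ACTIONS)
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    # 2. Remember what the world did.
    model[(s, a)] = (r, s2)
    # 3. Planning: replay remembered transitions to propagate value faster.
    for _ in range(n_planning):
        (ps, pa), (pr, ps2) = random.choice(list(model.items()))
        best = max(Q[ps2][b] for b in ACTIONS)
        Q[ps][pa] += alpha * (pr + gamma * best - Q[ps][pa])

dyna_q_step(s=(0, 0), a="right", r=0.0, s2=(0, 1))
dyna_q_step(s=(0, 1), a="right", r=1.0, s2=(0, 2))  # reward reaches (0, 0) via replay
print(Q[(0, 0)]["right"])     # value at the start state, propagated back by replay
```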

Page 13: Holland and Goodman – Caltech – Banbury 2001

Robots and models

What can a robot use a model for?

- augmenting sensory information

- feedback and feedforward control

- detecting novelty or anomaly

- planning

Page 14: Holland and Goodman – Caltech – Banbury 2001

Planning with models

What gets planned? Action sequences.

(State t) + (Action t) => (State t+1)

(State t+1) + (Action t+1) => (State t+2) etc

For optimal planning: find the sequence of actions likely to make the greatest contribution to the success of the mission, and execute it. For useful planning, do better than is possible with no planning.
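
A minimal sketch of this chaining, under an assumed one-step model and an assumed value function: score every action sequence of a fixed depth by rolling the model forward, and keep the best. "Useful" planning only has to beat the no-planning baseline.

```python
from itertools import product

def plan(state, model, value, actions, depth=3):
    """Chain one-step predictions (state_t, action_t) -> state_{t+1} and
    return the action sequence whose predicted end state scores highest."""
    best_seq, best_val = None, float("-inf")
    for seq in product(actions, repeat=depth):
        s = state
        for a in seq:
            s = model(s, a)          # predicted next state
        if value(s) > best_val:
            best_seq, best_val = seq, value(s)
    return best_seq

# Toy world: walk on the integer line toward a goal at +3.
print(plan(0, model=lambda s, a: s + a, value=lambda s: -abs(s - 3),
           actions=(-1, 0, 1)))      # -> (1, 1, 1)
```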

Page 15: Holland and Goodman – Caltech – Banbury 2001

What we want to know

For each robot, environment, and mission:

- What is the model like?

- How well does it correspond to the real world?

- Can it be used for control? If so, how good is it?

- Can it be used for planning? If so, how good is it?

- Are there any behavioral phenomena reminiscent of consciousness-related human behavior?

- Are there any phenomena connected with internal processes reminiscent of conscious human experience?

Page 16: Holland and Goodman – Caltech – Banbury 2001

Robots

Page 17: Holland and Goodman – Caltech – Banbury 2001


A Simple Robot

• The Khepera miniature robot (5.5 cm)

• Features:

  • 8 IR sensors which allow it to detect objects

  • two independently controlled motors

Page 18: Holland and Goodman – Caltech – Banbury 2001


Webots – Khepera Embodied Simulator

Simulators allow faster operation than real robots, particularly if learning is involved.

Simulator complexity is acceptable for a simple robot like the Khepera, but for more complex robots the simulator may become too complex, or may not simulate the real world accurately.

Page 19: Holland and Goodman – Caltech – Banbury 2001


A Generic Robot Controller Architecture

[Diagram: a recurrent neural machine with input units, state units, hidden units, and output units; sensory inputs (including feedback from motors and effectors) enter the network, and controller outputs drive the motors and effectors.]

• The controller of the robot is an artificial neural network with recurrent feedback, capable of forming internal representations of sensory information in the form of a neural state machine.

• Sensory inputs (vision, sound, smell, etc.) from sensors are fed to this structure.

• Sensory inputs also include feedback from the motors and effectors.

• Controller outputs drive the locomotion and manipulators of the robot.

• The neural controller learns to perform a task, using neural network and genetic algorithm techniques.

• But the internal model of the controller is implicit and therefore hidden from us.
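
A minimal sketch of such a controller, assuming an Elman-style network: the state units carry the previous hidden activity back in alongside the sensory inputs. The sizes, tanh nonlinearity, and random weights are illustrative; the slide leaves these open.

```python
import numpy as np

class RecurrentController:
    """Neural state machine: hidden activity depends on inputs AND history."""

    def __init__(self, n_in=10, n_hidden=16, n_out=2, seed=0):
        rng = np.random.default_rng(seed)
        self.W_in = rng.normal(0.0, 0.3, (n_hidden, n_in + n_hidden))
        self.W_out = rng.normal(0.0, 0.3, (n_out, n_hidden))
        self.state = np.zeros(n_hidden)          # the "state units"

    def step(self, sensors: np.ndarray) -> np.ndarray:
        # Hidden units see the current sensors plus the previous internal state.
        self.state = np.tanh(self.W_in @ np.concatenate([sensors, self.state]))
        return np.tanh(self.W_out @ self.state)  # motor commands in [-1, 1]

ctrl = RecurrentController()
motors = ctrl.step(np.zeros(10))   # e.g. 8 IR readings + 2 motor feedback signals
```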

Page 20: Holland and Goodman – Caltech – Banbury 2001


Understanding the internal model

[Diagram: the recurrent neural machine (input, state, hidden, and output units), with sensory inputs including feedback from motors and effectors, and motor and effector drive outputs; an INVERSE recurrent neural machine observes its internal activity, and the outputs of the inverse lie in the same sensory space as the inputs of the forward controller.]

• Introduce a second recurrent neural network, separate from the first system, which learns the inverse relationship between the internal activity of the controller and the sensory input space.

• This mechanism will allow us to represent the hidden internal state of the controller in terms of the sensory inputs that correspond to that state.

• Thus we may claim to know something of "what the robot is thinking".

• We assume that the controller is learned first, and that, once this is learned and reasonably stable, the inverse can be learned. (A minimal sketch follows.)
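
As a standalone sketch of the inversion step, with linear least squares standing in for the second recurrent network (an assumption made to keep the example short): record pairs of (hidden state, sensory input) from a frozen controller, then fit a map from hidden states back to sensor space.

```python
import numpy as np

rng = np.random.default_rng(1)

# Frozen stand-in controller: hidden state h from sensors s and previous h.
W = rng.normal(0.0, 0.3, (16, 10 + 16))
h = np.zeros(16)
states, sensors = [], []
for _ in range(500):
    s = rng.random(10)                    # stand-in sensory stream
    h = np.tanh(W @ np.concatenate([s, h]))
    states.append(h.copy())
    sensors.append(s)

# Fit the inverse: hidden state -> the sensory input that produced it.
X, Y = np.array(states), np.array(sensors)
W_inv, *_ = np.linalg.lstsq(X, Y, rcond=None)

decoded = X @ W_inv                       # "what the robot is seeing", read out
print("reconstruction MSE:", np.mean((decoded - Y) ** 2))
```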

Page 21: Holland and Goodman – Caltech – Banbury 2001


Simplified Inverse

• In this experiment, we use a controller model which is much less powerful than the recurrent controllers described above, but which allows us to illustrate the principle, and in particular makes "inversion" of the forward controller extremely simple.

• The crucial simplification we make is that the controller learns its representation directly in the input space. Thus there is no inverse to learn: the internal representation learned by the robot is directly visible as an input-space vector.

• The first phase is to learn or program the forward model, i.e. the robot controller. In this simple experiment we program in a simple reactive wall-following behavior rather than learning a complex behavior. The robot starts with no internal model, and adaptively learns its internal representation in an unsupervised manner as it performs its wall-following behavior.

Page 22: Holland and Goodman – Caltech – Banbury 2001


The Learning Algorithm (based on the Linaker and Niklasson 2000 ARAVQ algorithm)

• A 10-dimensional feature space is formed from the 8 Khepera IR sensor signals plus the 2 motor drive signals.

• Clusters feature vectors by change detection, to form prototype feature-vector "models".

• Unsupervised.

• Adds new models based on two criteria:

  • Novelty: large distance from existing models

  • Stability: low variance in the buffered history of features

• Adapts existing models over time.

• We program in a simple "wall following" behavior to act as a "teacher". (A sketch of the clustering step follows.)
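
Below is a minimal sketch of this allocation rule, assuming Euclidean distance, an illustrative window size, and illustrative thresholds; the published ARAVQ algorithm has more machinery than this.

```python
import numpy as np

class ARAVQ:
    """Allocate a new prototype only when the recent input window is both
    stable (low spread) and novel (far from every existing model)."""

    def __init__(self, novelty=0.3, stability=0.1, window=10, lr=0.02):
        self.models, self.buf = [], []
        self.novelty, self.stability = novelty, stability
        self.window, self.lr = window, lr

    def step(self, x: np.ndarray):
        self.buf.append(x)
        if len(self.buf) > self.window:
            self.buf.pop(0)
        mean = np.mean(self.buf, axis=0)
        spread = np.mean([np.linalg.norm(b - mean) for b in self.buf])
        dists = [np.linalg.norm(mean - m) for m in self.models]
        if spread < self.stability and (not self.models or min(dists) > self.novelty):
            self.models.append(mean.copy())       # allocate a new model
        elif self.models:
            k = int(np.argmin(dists))             # adapt the winner slowly
            self.models[k] += self.lr * (x - self.models[k])
        return self.models

q = ARAVQ()
for t in range(100):                              # a 10-D feature stream that
    q.step(np.zeros(10) if t < 50 else np.ones(10))  # switches "situation" once
print(len(q.models))                              # -> 2 learned prototypes
```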

Page 23: Holland and Goodman – Caltech – Banbury 2001


Learning in action

Colors show learned concepts:

• Black – right wall

• Blue – wall ahead

• Green – 45-degree right wall

• Red – corridor

• Light blue – outside corner

Page 24: Holland and Goodman – Caltech – Banbury 2001


Running with the model

• Switch off the wall follower.

• The robot "sees" features as it moves.

• Choose the closest learned model vector at each tick.

• Use the model vector's motor drive values to actually drive the motors. (A sketch of this loop follows.)
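
A minimal sketch of one tick of this loop, with two illustrative prototypes standing in for the learned ARAVQ models:

```python
import numpy as np

def model_driven_step(ir, models):
    """Pick the nearest 10-D prototype (8 IR + 2 motor values) by its sensory
    part, and return its stored motor part to drive the wheels."""
    k = int(np.argmin([np.linalg.norm(ir - m[:8]) for m in models]))
    return models[k][8:]                  # the prototype's two motor drives

# Two illustrative prototypes: open space vs. wall close on the left.
models = [np.r_[0.1 * np.ones(8), 0.5, 0.5],            # open space: go straight
          np.r_[1.0, 1.0, 0.1 * np.ones(6), 0.6, 0.2]]  # left wall: turn away
print(model_driven_step(np.array([0.9, 0.8, 0, 0, 0, 0, 0, 0]), models))
```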

Page 25: Holland and Goodman – Caltech – Banbury 2001


Running with the model

Color indicates which is the current "best" model feature.

Page 26: Holland and Goodman – Caltech – Banbury 2001


Run the model in the real robot

Page 27: Holland and Goodman – Caltech – Banbury 2001


Invert the motor signals back to sensory signals to infer an egocentric "map" of the environment as "seen" by the robot.

[Plot: the inferred egocentric map.]
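
One ingredient of such an egocentric map is dead reckoning from the motor stream alone. A minimal sketch of differential-drive odometry, with an assumed wheel base and time step rather than the Khepera's real calibration:

```python
import numpy as np

def integrate_path(motor_pairs, wheel_base=0.053, dt=0.1):
    """Turn a sequence of (left, right) wheel speeds into (x, y) waypoints."""
    x, y, theta = 0.0, 0.0, 0.0
    path = [(x, y)]
    for vl, vr in motor_pairs:
        v = (vl + vr) / 2.0          # forward speed of the body
        w = (vr - vl) / wheel_base   # turn rate from the wheel-speed difference
        theta += w * dt
        x += v * np.cos(theta) * dt
        y += v * np.sin(theta) * dt
        path.append((x, y))
    return np.array(path)

# A straight run followed by a gentle left arc.
cmds = [(0.05, 0.05)] * 10 + [(0.03, 0.06)] * 10
print(integrate_path(cmds)[-1])      # final (x, y) of the "imagined" path
```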

Page 28: Holland and Goodman – Caltech – Banbury 2001


Keeping it Real

• Mapping with the real robot

[Plot: the map inferred from the real robot's signals.]

Page 29: Holland and Goodman – Caltech – Banbury 2001


Manipulating the model "mentally" to make a decision - "planning"

• Take the sequence of learned model feature vectors and cluster sub-sequences into higher-level concepts.

• For example:

  • Blue-Green-Black = left corner

  • Red = corridor

  • Black = right wall

• At any instant, ask the robot to go "home".

• Run the model forwards mentally to decide whether it is shorter to go ahead or to go back.

• Take the appropriate action. (A sketch of this decision follows.)
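
A minimal sketch of the forward/backward comparison, assuming the higher-level concepts around the loop form a cyclic sequence; the labels and loop are illustrative:

```python
def decide(sequence, position, home):
    """Mentally step the concept sequence both ways from the current position
    and report whether home is closer ahead or behind (home must occur)."""
    n = len(sequence)
    ahead = next(d for d in range(1, n + 1) if sequence[(position + d) % n] == home)
    behind = next(d for d in range(1, n + 1) if sequence[(position - d) % n] == home)
    return "flash LEDs (home ahead)" if ahead <= behind else "rotate (home behind)"

loop = ["corridor", "left corner", "right wall", "home", "right wall"]
print(decide(loop, position=0, home="home"))   # -> rotate (home behind)
```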

Page 30: Holland and Goodman – Caltech – Banbury 2001


Decision Time

Corridor corner is home

Rotate = Home is behind me

Flash LEDs = Home is ahead of me

Page 31: Holland and Goodman – Caltech – Banbury 2001


Inverse Predictor Architecture

[Diagram: the controller and the inverse, with a switch selecting between real-world sense signals and model-world sense signals at the controller input, and a switch on the motor signals to the real robot.]

• We now allow the inverse to be fed back into the controller via the switch.

• Thus the controller has an image of its internal hidden state or "self" in the same feature space as its real sensory inputs.

• Thus it can "see" what it "itself" is thinking.

• As before, "we" can also observe what the machine is "thinking".

Page 32: Holland and Goodman – Caltech – Banbury 2001


Consequences of the architecture

• In "normal" mode, the controller produces motor signals based on the sensory input it "sees" (including motor/effector feedback), and normally we expect the inverse to show us what it is seeing. The inverse also allows a mismatch between a predicted and an actual sensory input to be detected, indicating a novel experience, which in turn could focus attention and learning in the main controller. Noisy, ambiguous, and partial inputs can be "completed".

• In "thinking" or "planning" mode, the real world is disconnected from the controller input, and the mental images output by the inverse are input to the controller instead. Thus sequences of planned action towards a goal can take place in mental space, and then be executed as action. By switching between normal mode and "thinking" mode, we can emulate the robot doing both reactive control and thinking at the same (really multiplexed) time, as humans do when driving a car on "automatic" while "thinking" of something else.

• In "sleeping" mode, we shut off the sensory input and allow noise to be input. The inverse will then output "mental images", which can themselves be fed back into the input (because they have the same representation), producing a complex series of "imagined" mental images, or "dreams". We can use this "sleeping" mode to actually learn (or at least update) the inverse: the input noise vector is a "sensory input" vector like any other (whether it is structured accordingly or not), so the inverse should be able to reproduce it from the state and motor signals, and the reproduction error can be used to update the inverse.

• If we do not disconnect the motors during "dreaming", we will get "sleepwalking" or "twitching". And if the controller is continually learning, the inverse must be continually updated; if the two fall too far out of synchronization, we could get irrational sequences in "thinking" mode, or worse, in execution mode - an analog of "madness".

(A minimal sketch of the three modes appears below.)
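
A minimal sketch of the switch between the three modes, with stand-in callables for the controller and the inverse; everything here is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def run(mode, controller, inverse, read_sensors, x0=None, steps=5, dim=10, seed=0):
    """mode is "normal" (world in the loop), "thinking" (mental images fed
    back, starting from x0), or "sleeping" (a noise-seeded chain of images)."""
    rng = np.random.default_rng(seed)
    if mode == "normal":
        x = read_sensors()
    elif mode == "thinking" and x0 is not None:
        x = x0                          # start planning from a given situation
    else:
        x = rng.random(dim)             # noise seeds the "dream"
    images = []
    for _ in range(steps):
        hidden = controller(x)          # internal state from the current input
        mental = inverse(hidden)        # hidden state decoded to sensor space
        images.append(mental)
        x = read_sensors() if mode == "normal" else mental  # the switch
    return images                       # motors stay disconnected off-line

# Stand-ins: a random feature map and its transpose as a crude "inverse".
W = np.random.default_rng(1).normal(0.0, 0.3, (16, 10))
dreams = run("sleeping",
             controller=lambda s: np.tanh(W @ s),
             inverse=lambda h: np.clip(W.T @ h, 0.0, 1.0),
             read_sensors=lambda: np.zeros(10))
print(len(dreams), dreams[0].shape)     # five mental images in sensor space
```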

Page 33: Holland and Goodman – Caltech – Banbury 2001


Where's the Consciousness?

• Not there yet

• More complex robots

• More complex environments

• More complex architecture

SONY DREAM ROBOT

• Head: 2 degrees of freedom

• Body: 2 degrees of freedom

• Arms: 4 degrees of freedom (x2)

• Legs: 6 degrees of freedom (x2)

• Total of 24 degrees of freedom

Page 34: Holland and Goodman – Caltech – Banbury 2001


Increasing complexity

Environment                       Agent
Fixed environment                 Movable body
Moving objects                    More sensors
Movable objects                   Effectors
Objects with different values     Articulated body
Other agents – prey               Metabolic state
Other agents – predators          Acquired skills
Other agents – competitors        Tools
Other agents – collaborators      Imitative learning
Other agents – mates              Language
Etc                               Etc

Page 35: Holland and Goodman – Caltech – Banbury 2001


Multi-stage planning

At each step:

- what actions could it take?
- what actions should it take?
- what actions would it take?

The planning system needs:

- a good and current model of the world
- a good and current model of the agent's abilities, expressible in terms of their effects on the model world
- an associated executive system to use the information generated by the planning system

Page 36: Holland and Goodman – Caltech – Banbury 2001


A framework?

[Diagram: a self model and an environment model, each receiving updates, both feeding updates to an executive.]

Page 37: Holland and Goodman – Caltech – Banbury 2001


Speculation…

There may be something it is like to be such a self-model linked to such a world model in a robot with a mission