
The ICSI/Berkeley Neural Theory of Language Project

Learning early constructions (Chang, Mok)

ECG

Moving from Spatial Relations to Verbs

• Open class vs. closed class

– How do we represent verbs (say, of hand motion)?

• Can we build models of verbs based on motor control primitives?

• If so, how can models overcome central limitations of Regier’s system?

– Inference

– Abstract uses

Coordination of Pattern Generators

Coordination

• PATTERN GENERATORS, separate neural networks that control each limb, can interact in different ways to produce various gaits.

– In ambling (top) the animal must move the fore and hind leg of one flank in parallel.

– Trotting (middle) requires movement of diagonal limbs (front right and back left, or front left and back right) in unison.

– Galloping (bottom) involves the forelegs, and then the hind legs, acting together.
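The three gaits described above can be summarized as phase relations between four limb controllers. The following is only a sketch: the limb labels, the phase values (fractions of a stride cycle), and the `limbs_in_unison` helper are assumptions made for illustration and do not come from the slides.

```python
# Hypothetical phase offsets (fraction of a stride cycle) for each limb.
# LF/RF = left/right fore, LH/RH = left/right hind. Values are illustrative only.
GAITS = {
    "amble":  {"LF": 0.0, "LH": 0.0, "RF": 0.5, "RH": 0.5},  # legs of one flank together
    "trot":   {"LF": 0.0, "RH": 0.0, "RF": 0.5, "LH": 0.5},  # diagonal legs together
    "gallop": {"LF": 0.0, "RF": 0.0, "LH": 0.5, "RH": 0.5},  # forelegs, then hind legs
}


def limbs_in_unison(gait):
    """Group the limbs that share a phase, i.e. that move together in this gait."""
    groups = {}
    for limb, phase in GAITS[gait].items():
        groups.setdefault(phase, []).append(limb)
    # Order the groups by phase so the first group is the one that moves first.
    return [sorted(limbs) for _, limbs in sorted(groups.items())]
```

Under these assumed values, `limbs_in_unison("trot")` pairs each fore limb with the opposite hind limb, while `limbs_in_unison("amble")` pairs the limbs of one flank.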

Preshaping While Reaching to Grasp

Internal Model and Efference Copy

Many areas code for motion parameters

Multiple, chronically implanted, intracranial microelectrode arrays would be used to sample the activity of large populations of single cortical neurons simultaneously. The combined activity of these neural ensembles would then be transformed by a mathematical algorithm into continuous three-dimensional arm-trajectory signals that would be used to control the movements of a robotic prosthetic arm. A closed control loop would be established by providing the subject with both visual and tactile feedback signals generated by movement of the robotic arm.

Rizzolatti et al. 1998

A New Picture

The fronto-parietal networks

Rizzolatti et al. 1998

F5 Mirror Neurons

Gallese and Goldman, TICS 1998

Category Loosening in Mirror Neurons (~60%)

(Gallese et al. Brain 1996)

Observed: A: Precision Grip

B: Whole Hand Prehension

Action: C: Precision Grip

D: Whole Hand Prehension

Umiltà et al. Neuron 2001

A (Full vision)

B (Hidden)

C (Mimicking)

D (Hidden Mimicking)

F5 Audio-Visual Mirror Neurons

Kohler et al. Science (2002)

Summary of Fronto-Parietal Circuits

Motor-Premotor/Parietal Circuits

PMv (F5ab) – AIP Circuit

“grasp” neurons – fire in relation to movements of hand prehension necessary to grasp object

F4 (PMC) (behind arcuate) – VIP Circuit

transforming peri-personal space coordinates so can move toward objects

PMv (F5c) – PF Circuit

different mirror circuits for grasping, placing or manipulating object

Together suggest cognitive representation of the grasp, active in action imitation and action recognition

Evidence in Humans for Mirror, General Purpose, and Action-Location Neurons

Mirror: Fadiga et al. 1995; Grafton et al. 1996; Rizzolatti et al. 1996; Cochin et al. 1998; Decety et al. 1997; Decety and Grèzes 1999; Hari et al. 1999; Iacoboni et al. 1999; Buccino et al. 2001.

General Purpose: Perani et al. 1995; Martin et al. 1996; Grafton et al. 1996; Chao and Martin 2000.

Action-Location: Bremmer, et al., 2001.

Itti: CS564 - Brain Theory and Artificial Intelligence. FARS Model

FARS (Fagg-Arbib-Rizzolatti-Sakata) Model

AIP

F5

dorsal/ventral streams

Task Constraints (F6)

Working Memory (46?)

Instruction Stimuli (F2)

AIP – Dorsal Stream: Affordances (ways to grab this “thing”)

IT – Ventral Stream: Recognition (“It’s a mug”)

PFC

AIP extracts the set of affordances for an attended object. These affordances highlight the features of the object relevant to physical interaction with it.

MULTI-MODAL INTEGRATION

The premotor and parietal areas, rather than having separate and independent functions, are neurally integrated not only to control action, but also to serve the function of constructing an integrated representation of:

(a) Actions, together with (b) objects acted on, and (c) locations toward which actions are directed.

In these circuits sensory inputs are transformed in order to accomplish not only motor but also cognitive tasks, such as space perception and action understanding.

Modeling Motor Schemas

• Relevant requirements (Stromberg, Latash, Kandel, Arbib, Jeannerod, Rizzolatti)

– Should model coordinated, distributed, parameterized control programs required for motor action and perception.

– Should be an active structure.

– Should be able to model concurrent actions and interrupts.

– Should model hierarchical control (higher-level motor centers to muscle extensors/flexors).

• Computational model called x-schemas (http://www.icsi.berkeley.edu/NTL)

An Active Model of Events

• At the computational level, actions and events are coded in active representations called x-schemas, which are extensions of stochastic Petri nets.

• x-schemas are fine-grained action and event representations that can be used for monitoring and control as well as for inference.

Model Review: Stochastic Petri Nets

Basic Mechanism

• Arc types: precondition arc, resource arc, inhibition arc

• Firing function: conjunctive, logistic, exponential family

[Figure: example net shown before firing (Firing Semantics) and after firing (Result of Firing)]
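The firing mechanism reviewed above can be sketched in a few lines of Python. This is a minimal illustration, not the NTL implementation: the `Transition` class and the `enabled` and `fire` functions are hypothetical names, only the conjunctive firing function is modelled (no logistic or stochastic firing), and it is an assumption here that precondition arcs test tokens without consuming them while resource arcs consume them.

```python
from dataclasses import dataclass, field


@dataclass
class Transition:
    name: str
    preconditions: dict = field(default_factory=dict)  # place -> tokens required (assumed not consumed)
    resources: dict = field(default_factory=dict)      # place -> tokens required and consumed
    inhibitors: set = field(default_factory=set)       # places that must hold no tokens
    outputs: dict = field(default_factory=dict)        # place -> tokens produced


def enabled(t, marking):
    """Conjunctive firing rule: every arc condition must hold at once."""
    return (all(marking.get(p, 0) >= n for p, n in t.preconditions.items())
            and all(marking.get(p, 0) >= n for p, n in t.resources.items())
            and all(marking.get(p, 0) == 0 for p in t.inhibitors))


def fire(t, marking):
    """Return the marking that results from firing t; the input is not mutated."""
    if not enabled(t, marking):
        raise ValueError(f"{t.name} is not enabled")
    new = dict(marking)
    for p, n in t.resources.items():
        new[p] = new.get(p, 0) - n   # resource arcs remove tokens
    for p, n in t.outputs.items():
        new[p] = new.get(p, 0) + n   # output arcs add tokens
    return new
```

For example, a transition with `preconditions={"ready": 1}`, `resources={"energy": 2}` and `outputs={"done": 1}` fired in the marking `{"ready": 1, "energy": 3}` leaves `ready` untouched, reduces `energy` to 1, and adds one token to `done`.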

Model Review

Active representations

• Many inferences about actions derive from what we know about executing them

• Representation based on stochastic Petri nets captures dynamic, parameterized nature of actions

• Generative model: action, recognition, planning, language

Walking:

bound to a specific walker with a direction or goal

consumes resources (e.g., energy)

may have termination condition (e.g., walker at goal)

ongoing, iterative action

walker=Harry

goal=home

energy

walker at goal
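The walking example can be read as a small control loop: an iterated step that is bound to a walker and a goal, consumes a resource, and stops on a termination condition. The sketch below makes several simplifying assumptions that are not in the slides: positions are integers on a line, each step costs a fixed amount of energy, and the function name `walk` and its parameters are hypothetical.

```python
def walk(walker, position, goal, energy, step_cost=1):
    """Repeat a 'step' action until the walker is at the goal or energy runs out."""
    steps = 0
    while position != goal and energy >= step_cost:
        position += 1 if goal > position else -1  # one step toward the goal
        energy -= step_cost                        # each step consumes the resource
        steps += 1
    return {
        "walker": walker,
        "position": position,
        "energy": energy,
        "at_goal": position == goal,  # termination condition from the slide
        "steps": steps,
    }
```

With `walker="Harry"` starting three steps from home and five units of energy, the loop ends with the termination condition satisfied; with only two units of energy it stops short of the goal.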

Preshaping While Reaching to Grasp


Representing concepts using triangle nodes

triangle nodes:

when two of the neurons fire, the third also fires
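The triangle-node rule stated above is simple enough to write down directly. This is a sketch only: the `TriangleNode` class and its `settle` method are hypothetical names, and using the three units as an entity, a feature, and a value is an assumption made for the example.

```python
class TriangleNode:
    """Three linked units: if any two are firing, the third fires as well."""

    def __init__(self, a, b, c):
        self.units = (a, b, c)

    def settle(self, firing):
        """Return the set of units firing after the node has completed the pattern."""
        active = set(firing) & set(self.units)
        if len(active) >= 2:
            return set(self.units)  # two active units recruit the third
        return active               # a single unit is not enough


# Hypothetical binding of an entity, one of its features, and that feature's value.
ham_color = TriangleNode("Ham", "Color", "pink")
```

Here `ham_color.settle({"Ham", "Color"})` completes the pattern with `"pink"`, whereas `ham_color.settle({"Ham"})` leaves the other two units silent.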

Barrett      Ham            Container         Push
dept ~ CS    Color ~ pink   Inside ~ region   Schema ~ slide
sid ~ 001    Taste ~ salty  Outside ~ region  Posture ~ palm
emp ~ GSI                   Bdy. ~ curve      Dir. ~ away

Chang        Pea            Purchase          Stroll
dept ~ Ling  Color ~ green  Buyer ~ person    Schema ~ walk
sid ~ 002    Taste ~ sweet  Seller ~ person   Speed ~ slow
emp ~ Gra                   Cost ~ money      Dir. ~ ANY
                            Goods ~ thing

Feature Structures in Four Domains

Simulation hypothesis

We understand utterances by mentally simulating their content.

– Simulation exploits some of the same neural structures activated during performance, perception, imagining, memory…

– Linguistic structure parameterizes the simulation.

• Language gives us enough information to simulate

Simulation Semantics

• BASIC ASSUMPTION: SAME REPRESENTATION FOR PLANNING AND SIMULATIVE INFERENCE

– Evidence for common mechanisms for recognition and action (mirror neurons) in the F5 area (Rizzolatti et al. 1996, Gallese 1996, Buccino 2002, Tettamanti 2004) and from motor imagery (Jeannerod 1996)

• IMPLEMENTATION:

– x-schemas affect each other by enabling, disabling or modifying execution trajectories. Whenever the CONTROLLER schema makes a transition it may set, get, or modify state leading to triggering or modification of other x-schemas. State is completely distributed (a graph marking) over the network.

• RESULT: INTERPRETATION IS IMAGINATIVE SIMULATION!

Simulation-based language understanding

[Figure labels: Utterance (“Harry walked into the cafe.”), Constructions, General Knowledge, Belief State, Analysis Process, Semantic Specification, Simulation (CAFE)]

construction WALKED
  form
    self_f.phon [wakt]
  meaning : Walk-Action
  constraints
    self_m.time before Context.speech-time
    self_m.aspect encapsulated

Simulation specification

A simulation specification consists of:

– schemas evoked by constructions

– bindings between schemas
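As a rough data-structure view of the two ingredients just listed, a simulation specification can be held as a set of evoked schemas plus a list of bindings between their roles. Everything in this sketch is illustrative: the `Schema` and `SimulationSpec` classes, the role names, and the choice of a `Container` schema for "into the cafe" are assumptions, not the analyzer's actual output.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    name: str
    roles: dict = field(default_factory=dict)  # role name -> filler (None if unfilled)


@dataclass
class SimulationSpec:
    schemas: dict = field(default_factory=dict)   # schema name -> Schema
    bindings: list = field(default_factory=list)  # pairs of (schema, role) that co-refer

    def evoke(self, schema):
        """Record a schema evoked by some construction in the utterance."""
        self.schemas[schema.name] = schema

    def bind(self, left, right):
        """Record that two (schema name, role name) slots refer to the same thing."""
        self.bindings.append((left, right))


def harry_walked_into_the_cafe():
    """Hypothetical specification for the example utterance."""
    spec = SimulationSpec()
    spec.evoke(Schema("Walk-Action", {"walker": "Harry", "goal": None,
                                      "time": "before speech-time",
                                      "aspect": "encapsulated"}))
    spec.evoke(Schema("Container", {"inside": "cafe"}))  # assumed schema for "into"
    spec.bind(("Walk-Action", "goal"), ("Container", "inside"))
    return spec
```

The binding is what lets a simulation treat the goal of the walk and the inside of the cafe as one and the same location.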

Language Development in Children

• 0-3 mo: prefers sounds in native language

• 3-6 mo: imitation of vowel sounds only

• 6-8 mo: babbling in consonant-vowel segments

• 8-10 mo: word comprehension, starts to lose sensitivity to consonants outside native language

• 12-13 mo: word production (naming)

• 16-20 mo: word combinations, relational words (verbs, adj.)

• 24-36 mo: grammaticization, inflectional morphology

• 3 years – adulthood: vocab. growth, sentence-level grammar for discourse purposes

food    toys    misc.  people  sound      emotion  action  prep.  demon.  social
cookie  horse   door   boy     boom       oh       open    on     that    no
banana  box     eye    mommy   choo-choo  uhoh     sit     in     here    hi
spoon   hammer  shoe   daddy   moo        whee     get     out    there   bye
bottle  truck          baby    woof       yum      go      up     this    more
juice   bead           girl                                down           no more
apple   ball                                                              yes
        cow

Words learned by most 2-year-olds in a play school (Bloom 1993)

Regier Model Limitations

• Scale

• Uniqueness/Plausibility

• Grammar

• Abstract Concepts

• Inference

• Representation

• Biological Realism

Learning Verb Meanings (David Bailey)

A model of children learning their first verbs.

Assumes parent labels child’s actions.

Child knows parameters of action, associates with word

Program learns well enough to:

1) Label novel actions correctly

2) Obey commands using new words (simulation)

System works across languages

Mechanisms are neurally plausible.

Reasoning about Actions in Artificial Intelligence (AI)

• The earliest work on actions in AI took a deductive approach

– designers hoped to represent all the system’s ‘world knowledge’ explicitly as axioms, and use ordinary logic (the predicate calculus) to deduce the effects of actions

• Envisaging a certain situation S was modeled by having the system entertain a set of axioms describing the situation

• To this set of axioms the system would apply an action - by postulating the occurrence of some action A in situation S - and then deduce the effect of A in S, producing a description of the outcome situation S'

Grasping: the action

• A set of pre-conditions in S

– free_top(y), free_hand(x), accessible(y)

• The grasp action (effect axiom):

– Result(Grasp(x,y, S), hold(x,y,S’))

• A set of effects describing the new situation S’

– hold(x,y), not(free_hand(x))
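The grasp action above can be approximated as a function from one situation to the next. This is a sketch under stated assumptions: a situation is represented as a set of ground facts (tuples), and the update uses add and delete sets in the STRIPS style rather than logical deduction from the effect axiom. The function name `grasp` and the fact encoding are hypothetical.

```python
def grasp(situation, hand, obj):
    """Apply grasp(hand, obj) to a situation given as a set of fact tuples."""
    preconditions = {
        ("free_top", obj),
        ("free_hand", hand),
        ("accessible", obj),
    }
    if not preconditions <= situation:
        missing = preconditions - situation
        raise ValueError(f"preconditions not met: {sorted(missing)}")
    # Effects: the hand now holds the object and is no longer free.
    return (situation - {("free_hand", hand)}) | {("hold", hand, obj)}
```

Note that the set update silently carries every other fact over to the new situation; a purely deductive formulation does not get that persistence for free.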

Actions

• An action is described as an axiom linking preconditions (literals and terms true in the before situation) to effects (literals and terms true in the after situation).

• The action specification is called an effect axiom

Problems with action concepts

• Frame problem

• Qualification problem

• Ramification problem

The Frame Problem

• Which things don’t change in an action?

– S1: blue(x), on_table(x), free_hand(y)

– Action grasp(y,x)

– S2: in_hand(x,y), hold(x,y), ?

Frame axioms are needed in logic

• Consider some typical frame axioms associated with the action-type:

• move x onto y.

– If z != x and I move x onto y, then if z was on w before, then z is on w after.

– If x is blue before, and I move x onto y, then x is blue after.
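To see why the frame axioms above are needed, compare what can be concluded about the after situation with and without them. The sketch below is illustrative only: facts are tuples, the function names are hypothetical, and the two rules in `frame_axioms` are hand-written counterparts of the two axioms listed (the colour rule is generalized to any blue object).

```python
def move_effects(x, y):
    """Effect axiom only: what 'move x onto y' itself asserts afterwards."""
    return {("on", x, y)}


def frame_axioms(before, x, y):
    """Explicit persistence rules for 'move x onto y'."""
    after = set()
    for fact in before:
        # If z != x and z was on w before, then z is on w after.
        if fact[0] == "on" and fact[1] != x:
            after.add(fact)
        # If an object is blue before, it is blue after.
        if fact[0] == "blue":
            after.add(fact)
    return after


def move(before, x, y, use_frame_axioms=True):
    """Derive the after situation, optionally without any frame axioms."""
    after = move_effects(x, y)
    if use_frame_axioms:
        after |= frame_axioms(before, x, y)
    return after
```

Without the frame axioms the only derivable fact is the new `on` relation; every additional kind of fact (size, owner, temperature, and so on) would need a persistence rule of its own.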

Active Representations don’t need frame axioms

• X-schemas directly model change, so no need for frame axioms. Also, they deal with concurrency, so no need to treat one action at a time.

• Building on x-schema-type models, a new set of logics called resource logics attempts to model the frame problem directly.

Ramification Problem

How do I specify all the effects?

– Direct (if I move, I change my location) and

– Indirect (things that were accessible before I moved may not be anymore)

• Central issue is to propagate changes of an action to all the connected knowledge that might be impacted.

• How might the brain do this?

• Spreading Activation
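One way to picture spreading activation as a solution to the ramification problem is a graph in which a change at one node passes a decayed amount of activation to everything connected to it. This is a minimal sketch, not a neural model: the `spread_activation` function, the decay factor, the cut-off threshold, and the example graph are all assumptions.

```python
def spread_activation(graph, sources, decay=0.5, threshold=0.1):
    """Propagate activation from the changed nodes to connected knowledge.

    graph maps each node to the nodes it is linked to; activation shrinks by
    `decay` on every link and stops spreading once it falls below `threshold`.
    """
    activation = {node: 1.0 for node in sources}
    frontier = list(sources)
    while frontier:
        node = frontier.pop()
        passed = activation[node] * decay
        if passed < threshold:
            continue  # too weak to affect anything further away
        for neighbour in graph.get(node, ()):
            if passed > activation.get(neighbour, 0.0):
                activation[neighbour] = passed
                frontier.append(neighbour)
    return activation


# Hypothetical knowledge links: moving changes my location, which in turn
# bears on what is accessible and therefore on what can be grasped.
links = {
    "my_location": ["accessible_objects"],
    "accessible_objects": ["graspable_objects"],
}
```

Calling `spread_activation(links, ["my_location"])` marks the direct effect most strongly and the indirect ones progressively less, without anyone having enumerated the indirect effects in advance.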
