Distributed Representation, Connection-Based Learning, and Memory (PDP Class Lecture, January 26, 2011)


Page 1

Distributed Representation, Connection-Based Learning, and Memory

PDP Class Lecture, January 26, 2011

Page 2

The Concept of a Distributed Representation

• Instead of assuming that an object (concept, etc.) is represented in the mind by a single unit, we consider the possibility that it could be represented by a pattern of activation over a population of units.

• The elements of the pattern may represent (approximately) some feature or sensible combination of features, but they need not.

• What is crucial is that no units are dedicated to a single object; in general all units participate in the representation of many different objects.

• Neurons in the monkey visual cortex appear to exemplify these properties.

• Note that neurons in some parts of the brain are more selective than others but (in most people’s view) this is just a matter of degree.

Page 3

Stimuli used by Baylis, Rolls, and Leonard (1991)

Page 4

Responses of Four Neurons to Face and Non-Face Stimuli in Previous Slide

Page 5

Responses to various stimuli by a neuron responding to a Tabby Cat

(Tanaka et al., 1991)

Page 6

Another Example Neuron

Page 7

Kiani et al., J Neurophysiol 97: 4296–4309, 2007.

Page 8

The Infamous ‘Jennifer Aniston’ Neuron

Page 9

A ‘Halle Berry’ Neuron

Page 10

A ‘Sydney Opera House’ Neuron

Figures on this and the previous two slides are from: Quiroga, Q., et al., 2005, Nature, 435, 1102–1107.

Page 11

Computational Arguments for the Use of Distributed Representations (Hinton et al., 1986)

• They use the units in a network more efficiently

• They support generalization on the basis of similarity

• They can support micro-inferences based on consistent relationships between participating units
– E.g., units activated by male facial features would activate units associated with lower-pitched voices.

• Overlap increases generalization and micro-inferences; less overlap reduces them (see the sketch below).

• There appears to be less overlap in the hippocampus than in other cortical areas – an issue to which we will return in a later lecture.
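To make the overlap and generalization points concrete, here is a small, purely illustrative sketch that is not part of the lecture: the unit count, feature assignments, and pattern values are invented, and overlap is measured as a normalized dot product between distributed patterns.

```python
import numpy as np

# Hypothetical distributed patterns over 8 units (1 = active, 0 = inactive).
# "dog" and "wolf" share most of their active units; "bagel" shares almost none.
patterns = {
    "dog":   np.array([1, 1, 1, 1, 0, 0, 1, 0]),
    "wolf":  np.array([1, 1, 1, 0, 1, 0, 1, 0]),
    "bagel": np.array([0, 0, 0, 1, 0, 1, 0, 1]),
}

def overlap(a, b):
    """Normalized dot product: 1.0 = identical patterns, 0.0 = no shared units."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

for name in ("wolf", "bagel"):
    print(f"overlap(dog, {name}) = {overlap(patterns['dog'], patterns[name]):.2f}")

# Whatever is learned about "dog" (e.g., its connections to other units) transfers
# strongly to "wolf" because of the shared units, and barely at all to "bagel".
```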

Page 12

What is a Memory?

• The trace left in the memory system by an experience?

• A representation brought back to mind of a prior event or experience?

• Note that in some theories, these things are assumed to be one and the same (although there may be some decay or corruption).

Page 13

Further questions

• Do we store separate representations of items and categories?
– Experiments suggest participants are sensitive to item information and also to the category prototype.

• Exemplar models store traces of each item encountered. But what is an item? Do items ever repeat? Is it exemplars all the way down?

• For further discussion, see the exchange of papers by Bowers (2009) and Plaut & McClelland (2010) in the readings listed for today’s lecture.

Page 14

A PDP Approach to Memory

• An experience is a pattern of activation over neurons in one or more brain regions.

• The trace left in memory is the set of adjustments to the strengths of the connections.
– Each experience leaves such a trace, but the traces are not separable or distinct.
– Rather, they are superimposed in the same set of connection weights.

• Recall involves the recreation of a pattern of activation, using a part or associate of it as a cue.

• Every act of recall is always an act of reconstruction, shaped by the traces of many other experiences.

Page 15

The Hopfield Network

• A memory is a random pattern of 1’s and (effectively) -1’s over the units in a network like the one shown here (there are no self-connections).

• To learn, a pattern is clamped on the units; weights are learned using the Hebb rule.

• A set of patterns can be stored in this way.

• The network is probed by setting the units to an initial state and then updating them asynchronously (as in the cube example), using a step function, until the activations stop changing. Input is removed during settling. (A sketch of this storage-and-retrieval scheme appears below.)

• The result is the retrieved memory.

– Noisy or incomplete patterns can be cleaned up or completed.

• The network itself makes decisions; “no complex external machinery is required.”

• If many memories are stored, there is cross-talk among them.

– If random vectors are used, capacity is only about 0.14N, where N is the number of units.
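Below is a minimal sketch of the storage-and-retrieval scheme just described, assuming ±1 units, the outer-product form of the Hebb rule, and asynchronous updates with a step function; the unit count, number of stored patterns, and amount of noise are invented for illustration and are not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100                                        # number of units (illustrative)
patterns = rng.choice([-1, 1], size=(5, N))    # five random +1/-1 memories

# Hebbian storage: sum of outer products of each pattern with itself,
# with the self-connections removed.
W = np.zeros((N, N))
for p in patterns:
    W += np.outer(p, p)
np.fill_diagonal(W, 0)

def retrieve(probe, max_sweeps=20):
    """Update units asynchronously with a step function until nothing changes."""
    s = probe.copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(N):
            new = 1 if W[i] @ s >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:                        # activations have stopped changing
            break
    return s

# Probe with a noisy version of the first memory (20% of units flipped).
noisy = patterns[0].copy()
noisy[rng.choice(N, size=20, replace=False)] *= -1
print("overlap with stored memory:", retrieve(noisy) @ patterns[0] / N)  # near 1.0
```

With five stored patterns and 100 units the network is well under the roughly 0.14N capacity limit, so the noisy probe settles back onto the stored memory.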

Page 16

The McClelland/Rumelhart (1985) Distributed Memory Model

• Inspired by the ‘Brain-State in a Box’ model of James Anderson, which predates the Hopfield net.

• Uses continuous units with activations between -1 and 1.

• Uses the same activation function as the IAC model, without a threshold.

• Net input is the sum of the external and internal inputs:

neti = ei + ii

• Learning occurs according to the ‘Delta Rule’, where δi is the error for unit i, ε is the learning rate, and aj is the activation of sending unit j (a sketch of one learning step follows this slide):

δi = ei – ii
wij += ε δi aj

• Short- vs. long-lasting changes to weights:

– As a first approximation, weight increments are thought to decay rapidly from their initial values to smaller, more permanent values.
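Here is a minimal sketch of one delta-rule learning trial as stated on this slide, with the activations simply clamped to the training pattern (the settling dynamics of the continuous units are skipped); the learning rate, unit count, number of trials, and pattern values are invented for illustration.

```python
import numpy as np

N = 16
eps = 0.05                              # hypothetical learning rate
W = np.zeros((N, N))                    # auto-associative weights, no self-connections

def delta_rule_step(W, pattern, eps):
    """One learning trial with activations clamped to the training pattern."""
    a = pattern                         # unit activations (between -1 and 1)
    e = pattern                         # external input
    i = W @ a                           # internal input from the other units
    delta = e - i                       # delta-rule error: external minus internal input
    W = W + eps * np.outer(delta, a)    # w_ij += eps * delta_i * a_j
    np.fill_diagonal(W, 0)              # keep self-connections at zero
    return W

rng = np.random.default_rng(1)
pattern = rng.choice([-0.5, 0.5], size=N)   # one random training pattern
for _ in range(50):
    W = delta_rule_step(W, pattern, eps)

# After repeated trials the internal input reproduces the external input,
# so the pattern can later be completed from a partial cue.
print(np.allclose(W @ pattern, pattern, atol=0.01))   # True
```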

Page 17

Basic properties of auto-associator networks

• They can learn multiple ‘memories’ in the same set of weights
– Recall: pattern completion
– Recognition: strength of pattern activation
– Facilitation of processing: how quickly and strongly settling occurs

• With the Hebb Rule:
– Orthogonal patterns can be stored without mutual contamination (up to N of them, but then the memory ‘whites out’).

• With the Delta Rule:
– Sets of non-orthogonal patterns can be learned, and some of the cross-talk can be eliminated.

• However, there is a limitation:
– The external input to each unit must be linearly predictable from a weighted combination of the activations of all of the other units.
– The weights learned are the solutions to a set of simultaneous linear equations (illustrated in the sketch after this list).
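To illustrate the last two points, here is a small sketch with invented sizes and random patterns (not from the lecture): for each receiving unit, the cross-talk-free weights are just the least-squares solution of the simultaneous linear equations relating that unit's external input to the other units' activations across the training patterns, which is the solution the delta rule converges to when one exists.

```python
import numpy as np

rng = np.random.default_rng(2)
N, P = 10, 4                              # units and training patterns (illustrative)
A = rng.uniform(-1, 1, size=(P, N))       # P training patterns, one per row

# For each receiving unit i, solve (in the least-squares sense)
#   sum_{j != i} w_ij * a_j^(p) = a_i^(p)   for every pattern p.
W = np.zeros((N, N))
for i in range(N):
    others = [j for j in range(N) if j != i]   # exclude the self-connection
    coeffs, *_ = np.linalg.lstsq(A[:, others], A[:, i], rcond=None)
    W[i, others] = coeffs

# The internal input now reproduces the external input for every stored pattern,
# i.e., each unit's input is linearly predictable from the other units.
print(np.allclose(A @ W.T, A, atol=1e-6))   # True
```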

Page 18

Issues addressed by the M&R Distributed Memory Model

• Memory for general and specific information
– Learning a prototype
– Learning multiple prototypes in the same network
– Learning general as well as specific information

Page 19

Weights after learning from distortions of a prototype (each with a different ‘name’)

Page 20

[Figure: weight matrix with Sending Units on one axis and Receiving Units on the other]

Weights after learning Dog, Cat, and Bagel patterns

Page 21

Whittlesea (1983)

• Examined the effect of general and specific information on identification of letter strings after exposure to varying numbers and degrees of distortions to particular prototype strings.

Page 22

Whittlesea’s Experiments

• Each experiment involved different numbers of distortions presented different numbers of times during training.

• Each test involved other distortions; Whittlesea never tested the prototype, but I did in some of my simulations.

• Performance measures are the per-letter increase in identification relative to baseline (E) and the increase in the dot product of the input with the resulting activation due to learning (S).

Pages 23–26

Example stimuli

Spared category learning with impaired exemplar learning in amnesia?

This happens in the model if we simply assume amnesia reflects a smaller value of the learning rate parameter. (Amnesia is a bit more interesting than this – see a later lecture.)

Page 27

Limitations of Auto-Associator Models

• Capacity is limited
– Different variants have different capacities
– The sparser the patterns, the larger the number that can be learned

• Sets of patterns violating the linear predictability constraint cannot be learned perfectly.

• Does not capture effects indicative of representational and behavioral sharpening
– ‘Strength Mirror Effect’
– Sharpening of neural representations after repetition

• We will return to these issues in a later lecture, after we have a procedure in hand for training connections into hidden units.