TRANSCRIPT
Grow your own representations: Computational constructivism
Joseph L. Austerweil, Thomas L. Griffiths, and Kevin Canini (University of California, Berkeley)
Robert L. Goldstone (Indiana University)
Todd Gureckis (New York University)
Matt Jones (University of Colorado, Boulder)
[Slide: one stimulus, two different responses]
Stimulus: a painting.
Response 1 (Representation 1): "This is ugly." ("My kid could make this.")
Response 2 (Representation 2): "This is beautiful." ("Incredible painting style.")
Why use representations?
Behavior = f(Stimulus)   vs.   Representation = g(Stimulus), Behavior = h(Representation)
Representations explain how different behaviors arise from the same stimulus: the different behaviors are due to different representations.
Representations change through experience with new stimuli. If representations are determined by stimuli, are they superfluous?
Their utility can be salvaged by explicitly formulating how representations change with experience.
In this symposium, we explore recent computational proposals for how representations change with experience:
Nonparametric Bayesian models - Austerweil, Gureckis, Canini, & Griffiths
Connectionist - Goldstone & Gureckis
Reinforcement learning - Jones
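The contrast between modeling behavior directly as f(Stimulus) and routing it through a representation can be made concrete with a toy sketch. Everything below (the feature names, the response rule) is an illustrative assumption, not a model from the talk:

```python
# Two viewers receive the same stimulus but form different
# representations (g1 vs. g2), so the same response rule h
# produces different behaviors. All names are hypothetical.
stimulus = "an abstract painting"

def g1(s):
    # Representation 1: surface features only.
    return {"messy": True}

def g2(s):
    # Representation 2: style features.
    return {"messy": False, "expressive": True}

def h(rep):
    # One shared response rule operating on the representation.
    return "This is ugly." if rep.get("messy") else "This is beautiful."

print(h(g1(stimulus)))  # This is ugly.
print(h(g2(stimulus)))  # This is beautiful.
```

A single fixed f(Stimulus) could not produce both behaviors from the same input; the intermediate representation g(Stimulus) is what carries the difference.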
What are representations and what does it mean for them to change?
A representation is something that stands in place for something else (Palmer, 1978).
Example representations: the activation of a layer of artificial neurons, or a set of features.
Example things they stand for: objects in the world, or a symbol in another process.
Based on its input, a representation may become active, which denotes the presence of the thing(s) it stands for.
Representational change happens when:
1. The value of inputs that activate a representation changes (selective attention).
2. Two distinct representations merge (unitization).
3. A fused representation splits into new representations (differentiation).
Questions to keep in mind
Does any feature weight change constitute representation change? Does any attentional change count?
If not, do any of the discussed models change representations? Are "combinations" of fixed primitives enough, or are flexible primitives needed? What about when the information content of a feature changes?
Inductive biases in representation formation. Example: continuity constraints on perceptual feature learning.
Extremely strong: no representation learning. Extremely weak: any representation goes (no constraints).
How domain-general is representation change? Are the mechanisms equivalent (chunking = unitization?)? Are there both domain-general and domain-specific inductive biases?
General: fewer features when possible. Specific: good continuity of features (in perception).
Are the discussed models competing or complementary? Representations at different levels of explanation.
Outline of symposiumAusterweil & Griffiths - Introduction and nonparametric Bayesian models of feature representation
Goldstone - Building flexible categorization models by grounding them in perception
Jones - Constructing representations through reinforcement learning by improving generalization
Canini & Griffiths - A nonparametric hierarchical Bayesian framework for modeling human categorization
Gureckis - Endnote: Breaking sticks or breaking clusters? Representation building, learning, and the brain
Nonparametric Bayesian models of feature learning
Joe Austerweil and Tom Griffiths
Department of Psychology, UC Berkeley
http://cocosci.berkeley.edu/
Features
Features are the elementary primitives in cognitive models.
In many cases, the features are ambiguous:
The appropriate feature representation of an object is context-dependent.
Inferring a feature representation is an inductive problem.
Bayesian inference provides a rational solution.
Challenge: How do you form a set of possible representations?
Nonparametric Bayes
Challenge: How do you form a set of possible representations?
Idea: Use flexible hypothesis spaces from nonparametric Bayesian models.
What is a nonparametric Bayesian model?
It defines a prior over representations with potentially infinitely many features (consistent with Goldmeier, 1936/1972; Goodman, 1972; Murphy & Medin, 1985; ...).
Unlike fixed-feature models, it infers the number of features.
It combines a bias towards simpler feature representations with the flexibility to grow in complexity as more data are observed.
[Figure: observations and the features inferred from them]
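The nonparametric prior referred to later in the talk is the Indian Buffet Process (IBP), which defines a distribution over binary feature-ownership matrices with an unbounded number of columns. A minimal sketch of sampling from that prior, assuming the standard generative recipe (function name and defaults are illustrative):

```python
import numpy as np

def sample_ibp(n_objects, alpha, seed=None):
    """Draw one binary feature-ownership matrix Z from the IBP prior.

    Rows are objects, columns are features; the number of columns is
    not fixed in advance (alpha controls the expected number of features).
    """
    rng = np.random.default_rng(seed)
    Z = np.zeros((n_objects, 0), dtype=int)
    for i in range(n_objects):
        if Z.shape[1] > 0:
            # Reuse existing feature k with probability m_k / (i + 1),
            # where m_k counts earlier objects that already have it
            # (a rich-get-richer bias towards fewer, shared features).
            m = Z[:i].sum(axis=0)
            Z[i] = rng.random(Z.shape[1]) < m / (i + 1)
        # Each object also invents Poisson(alpha / (i + 1)) new features.
        n_new = rng.poisson(alpha / (i + 1))
        new_cols = np.zeros((n_objects, n_new), dtype=int)
        new_cols[i] = 1
        Z = np.hstack([Z, new_cols])
    return Z

Z = sample_ibp(20, alpha=3.0, seed=0)
print(Z.shape)  # (20, K) with K inferred from the draw, not fixed
```

The two update rules encode exactly the trade-off described above: a simplicity bias (features already used by many objects are preferentially reused) plus the flexibility to add new features as more data arrive.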
Austerweil & Griffiths (2009; in press)
[Figure: objects composed of parts 1-6 plus a shared part; the features inferred when parts are correlated vs. when parts are independent]
Incorporating Domain Constraints
Austerweil & Griffiths (2009; in press)
Visual search for objects with correlated parts (Shiffrin & Lightfoot, 1997)
[Figure: objects x1-x4; the features inferred without a proximity constraint vs. with a proximity constraint]
Feature learning with transforms
Austerweil & Griffiths (2010)
[Figure: a part combined with a transformation]
Features occur differently across presentations.
It is ambiguous whether the parts are distinct features or the same feature under different transformations.
Feature learning with transforms
[Figure: unitized vs. separate object sets]
Two object sets where vertical bars are translated either together (unitized) or independently (separate).
People use the set of objects they observe to decide which representation is appropriate.
The smallest representation that can encode the observed objects is used.
Austerweil & Griffiths (2010)
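The "smallest representation that can encode the observed objects" idea can be caricatured in a few lines. Here objects are pairs of vertical-bar positions, and the encodings and function names are illustrative assumptions, not the paper's actual model:

```python
# Toy sketch: choose between a unitized representation (one two-bar
# feature that translates as a unit) and a separate representation
# (two independently translating bar features).

def unitized_encodes(objects):
    # If both bars always move together, the gap between them is fixed,
    # so a single translated two-bar feature suffices.
    gaps = {right - left for left, right in objects}
    return len(gaps) == 1

def n_features(objects):
    # Prefer the smaller representation that still encodes the data:
    # 1 unitized feature if possible, otherwise 2 separate features.
    return 1 if unitized_encodes(objects) else 2

together = [(0, 3), (1, 4), (2, 5)]      # bars translate together
independent = [(0, 3), (1, 5), (2, 4)]   # bars translate independently

print(n_features(together))     # 1 -> unitized feature preferred
print(n_features(independent))  # 2 -> separate features needed
```

The same observed parts thus yield different feature representations depending on how they covary across the object set, which is the behavioral prediction tested below.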
Feature learning with transforms
[Figure: human ratings (0-6) and model activations for New Unitized (New Unit) and New Separate (New Sep) test images, under the Unitized (Unit) and Separate (Sep) training sets]
Austerweil & Griffiths (2010)
Feature learning with transforms
Are these two features the same?
Should all transforms be included?
Square or diamond? (Mach, 1914)
Hypothesis: people infer the set of transformations allowed for a given feature.
Austerweil & Griffiths (2010)
Feature learning with transforms: contextual effects on allowable transforms
[Figure: a rotation set and a size set, each followed by the test question "or ?"]
Austerweil & Griffiths (2010)
Feature learning with transforms
[Figure: human ratings (0-6) and model activations for New Rotation (New Rot) and New Size test images, under the Rotation (Rot) and Size training sets]
Austerweil & Griffiths (2010)
Incremental learning
(Schyns & Rodet, 1997; Austerweil & Griffiths, in prep.)
[Figure: parts A and B and their composite AB; the features learned under two training orders]
Train: AB, A, B. Test: Is this AB? People: No; IBP: Yes; PF: No.
Train: A, B, AB. Test: Is this AB? People: Yes; IBP: Yes; PF: Yes.
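One way to see how an incremental learner produces such order effects is with a deliberately simple caricature: a learner that only adds a new feature when the current object cannot be built from features it already has. The encoding and function below are illustrative assumptions, not the rational incremental learner from the talk:

```python
# Objects are sets of parts. The learner processes training objects in
# order and commits to features greedily, so early objects shape the
# final representation.

def incremental_features(training):
    features = []
    for obj in training:
        # Cover the object with features learned so far; only add a
        # new feature for whatever remains uncovered.
        covered = set()
        for f in features:
            if f <= obj:
                covered |= f
        if covered != obj:
            features.append(frozenset(obj - covered))
    return features

ab_first = [frozenset("ab"), frozenset("a"), frozenset("b")]
ab_last = [frozenset("a"), frozenset("b"), frozenset("ab")]

print(incremental_features(ab_first))  # a fused {a, b} feature, then {a}, {b}
print(incremental_features(ab_last))   # just {a} and {b}; AB needs nothing new
```

A batch learner sees the same three objects in both conditions and so cannot distinguish them; an incremental learner that commits as it goes ends up with a fused AB feature in one order but not the other.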
Conclusions
Nonparametric Bayesian models are a framework for feature representation inference that:
has a flexible set of features, but with soft constraints;
has domain-general constraints: fewer features are better (e.g., simplicity);
can impose domain-specific constraints (e.g., proximity).
They predict that the correlation between parts should affect the inferred feature representation, which has been confirmed experimentally.
They learn features that are transformed when instantiated in objects, and the types of transformations those features are allowed to undergo.
People also infer features that undergo transformations, which potentially explains when features are orientation-variant or orientation-invariant.
They demonstrate the importance of representations at the computational level for generalization behavior.
Ordering effects can be explained at the algorithmic level using a rational incremental learner.
Acknowledgements
• Other symposium speakers
• Tania Lombrozo
• Karen Schloss
• Stephen Palmer
• Rob Goldstone
• Michael Pacer & Joseph Jay Williams
• RAs: David Belford, Brian Tang, Shubin Li, Ingrid Liu, Julia Ying
• CoCoSci, Concepts and Cognition Coalition
• You!