Distributed Representations:
Simon D. Levy
Department of Computer Science
Washington and Lee University
Lexington, VA 24450
PHIL 395
9 May 2006
Preaching to the Choir in Church(land)
Theme: A Neuro-Manifesto
The real motive behind eliminative materialism
is the worry that the “propositional” kinematics
and “logical” dynamics of folk psychology
constitute a radically false account of the
cognitive activity of humans, and of the higher
animals generally. The worry is that our folk
conception of how cognitive creatures represent
the world ... is a thoroughgoing
misrepresentation of what really takes
place inside us.
[It] turns out that we don't think the way we think we
think! The scientific evidence coming in all around us
is clear: Symbolic conscious reasoning, which is
extracted through protocol analysis from serial verbal
introspection, is a myth. [It] is entirely clear that the
symbolic mind that AI has tried for 50 years to
simulate is just a story we humans tell ourselves to
predict and explain the unimaginably complex
processes occurring in our evolved brains.
Local vs. Distributed Representation
• Folk psychology representations are local: "a place
for every symbol, and every symbol in its place".
• Neural-net representations are distributed: "each
entity is represented by a pattern of activity
distributed over many computing elements, and each
computing element is involved in representing many
different entities". (Hinton 1984)
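The contrast can be sketched in a few lines of Python (a hypothetical illustration, not from the talk; the entities and activity patterns are made up):

```python
# Local ("a place for every symbol"): one computing element per entity.
local = {
    "dog":     [1, 0, 0, 0, 0, 0, 0, 0],
    "cat":     [0, 1, 0, 0, 0, 0, 0, 0],
    "robin":   [0, 0, 1, 0, 0, 0, 0, 0],
    "sparrow": [0, 0, 0, 1, 0, 0, 0, 0],
}

# Distributed: each entity is a pattern over many elements, each element
# participates in many entities, and similar entities share elements.
distributed = {
    "dog":     [1, 1, 0, 1, 0, 0, 1, 0],
    "cat":     [1, 1, 0, 0, 1, 0, 1, 0],
    "robin":   [0, 1, 1, 0, 0, 1, 0, 1],
    "sparrow": [0, 1, 1, 0, 1, 1, 0, 1],
}

def overlap(u, v):
    """Count of computing elements active in both patterns."""
    return sum(a * b for a, b in zip(u, v))

# In the local code every pair of entities is equally unrelated;
# in the distributed code, similarity is built into the representation.
print(overlap(local["dog"], local["cat"]))               # 0
print(overlap(distributed["dog"], distributed["cat"]))   # 3
print(overlap(distributed["dog"], distributed["robin"])) # 1
```

The overlap structure is the point: similarity between concepts falls out of the code itself, rather than being stored as a separate symbolic fact.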
• Commonest distributed representation is a vector
of real numbers.
• You already know how vectors can be obtained by
back-propagation / gradient-descent.
• Today I’ll talk about some other (faster, more
plausible) ways of obtaining the vectors.
Variation I: The Hard Problem
A typical American seventh grader knows the meaning
of 10-15 words today that she didn't know yesterday ...
The typical seventh grader would have read less than
50 paragraphs since yesterday, from which she should
have learned less than three new words. Apparently,
she mastered the meanings of many words that she
did not encounter. – Landauer 1997
Latent Semantic Analysis
"You shall know a word by the company it keeps"
– J. R. Firth
• Make a table showing how many times each word occurs
in each of a set of documents, or with another word, etc. -
purely local info
• Mathematically “smear” this information across each row
of the table, showing how likely the word would be to occur
in the other documents – distributed info
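The first bullet — the purely local count table — can be sketched in Python (a hypothetical illustration; the documents and words are made up, and the SVD "smearing" step that real LSA performs on this table is omitted here):

```python
import math

# Toy corpus: two "medical" documents and two "aviation" documents.
docs = [
    "the doctor examined the patient",
    "the nurse helped the doctor",
    "the pilot flew the plane",
    "the plane landed and the pilot waved",
]

# Word-by-document count table: one row per word, one column per
# document -- purely local co-occurrence information.
vocab = sorted({w for d in docs for w in d.split()})
counts = {w: [d.split().count(w) for d in docs] for w in vocab}

def cosine(u, v):
    """Cosine similarity between two count rows."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Words that keep the same company already get similar rows; LSA's SVD
# step then generalizes this to documents where a word never appeared.
print(cosine(counts["doctor"], counts["nurse"]))  # high: shared documents
print(cosine(counts["doctor"], counts["pilot"]))  # 0.0: no shared documents
```

Even before the SVD, "doctor" and "nurse" come out similar because their rows overlap; the smearing step is what lets similarity leak across documents the word was never seen in.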
Landauer, T. K., Foltz, P. W., & Laham, D. (1998).
Introduction to Latent Semantic Analysis.
Discourse Processes, 25, 259-284.
[Figure from Landauer, Foltz, & Laham (1998)]
[Figure from Landauer, Foltz, & Laham (1998)]
Latent Semantic Analysis
• As in Elman’s SRN, reps. of similar concepts end up close
together in “meaning space”
• Amazingly useful
• Intelligent information retrieval: “Smart Googling”
(Berry et al. 1994)
• Automatic essay grading: “Who’s really looking at your SAT?”
(Landauer et al. 2000)
• Disambiguating words for automatic translation
(Davis & Levy 2006: http://www.cs.wlu.edu/translate)...
Variation II: The Harder Problem
The Language of Thought: Binding and Recursion
• LSA (and Elman-style hidden vectors) only give us the
representations of individual words/concepts
• Documents are just unstructured “bags of words”
• Without folk-psychological structures, how do we represent
1) the distinction between, e.g., “Lois loves Clark” and
“Clark loves Lois”?
2) intentional concepts like
“Perry knows that [Lois loves Clark]”?
Binding as Vector Product (Smolensky 1990)
[Figure © 2004 Indiana University and Michael Gasser,
www.cs.indiana.edu/classes/b651/Notes/convolution.html, 24 Feb 2004]
• Cool, but problematic: the product of two n-dimensional vectors
has n × n elements, so representations keep getting bigger with
each binding...
Holographic Reduced Representations (Plate 1991)
• Binding by “circular convolution”: sum over diagonals with circularity to keep fixed size:
[Figure © 2004 Indiana University and Michael Gasser,
www.cs.indiana.edu/classes/b651/Notes/convolution.html, 24 Feb 2004]
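A minimal sketch of circular convolution (assuming the standard definition; the example vectors are made up). The point is dimensionality: binding two n-dimensional vectors yields another n-dimensional vector, not an n × n product:

```python
def circ_conv(x, y):
    """Circular convolution: z[j] = sum_k x[k] * y[(j - k) mod n].
    Equivalent to summing the (wrapped) diagonals of the outer product."""
    n = len(x)
    return [sum(x[k] * y[(j - k) % n] for k in range(n)) for j in range(n)]

a = [1.0, 0.0, 0.0, 0.0]   # the identity element for circular convolution
b = [0.5, 0.5, 0.0, 0.0]

bound = circ_conv(a, b)
print(len(bound))  # 4: same dimensionality as the inputs
print(bound)       # [0.5, 0.5, 0.0, 0.0]: convolving with the identity returns b
```

Because the result has the same fixed size as its arguments, the output of one binding can itself be an argument to the next.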
Holographic Reduced Representations (Plate 1991)
• Keeping the # of dimensions constant allows us to build intentional representations of arbitrary complexity:
KNOWER*PERRY + KNOWN*(LOVER*LOIS + LOVEE*CLARK)
• As with LSA, similar propositions end up close together in “proposition space”
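The nested expression above can be sketched directly (role and filler names are from the slide; the dimensionality, random-vector construction, and use of circular correlation as the approximate inverse are standard HRR assumptions, not details given in the talk):

```python
import math, random

random.seed(0)
N = 512  # dimensionality of every role, filler, and proposition vector

def rand_vec():
    """Random vector with elements ~ N(0, 1/N), so norm is roughly 1."""
    s = 1.0 / math.sqrt(N)
    return [random.gauss(0.0, s) for _ in range(N)]

def conv(x, y):
    """Circular convolution: the binding operator."""
    return [sum(x[k] * y[(j - k) % N] for k in range(N)) for j in range(N)]

def corr(x, y):
    """Circular correlation: approximate unbinding (inverse of conv)."""
    return [sum(x[k] * y[(j + k) % N] for k in range(N)) for j in range(N)]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))

KNOWER, KNOWN, LOVER, LOVEE = (rand_vec() for _ in range(4))
PERRY, LOIS, CLARK = (rand_vec() for _ in range(3))

# KNOWER*PERRY + KNOWN*(LOVER*LOIS + LOVEE*CLARK) -- still N-dimensional.
inner = add(conv(LOVER, LOIS), conv(LOVEE, CLARK))
prop = add(conv(KNOWER, PERRY), conv(KNOWN, inner))

# Who is the knower?  Unbind with the KNOWER role and compare the noisy
# result against candidate fillers (a "clean-up memory").
probe = corr(KNOWER, prop)
print(cosine(probe, PERRY))  # clearly the largest
print(cosine(probe, LOIS))   # near zero
```

Unbinding is approximate: the probe is the correct filler plus noise, which is why HRR systems pair the algebra with a clean-up memory of known vectors.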
Holographic Reduced Representations (Plate 1991)
• Mathematically, the same operations are used to produce holograms
Variation III: The Hardest Problem
Language
• Language is a structured relationship between a set
of structured meanings and a set of structured
utterances.
• Children acquire this mapping after exposure to a
tiny fraction of the possible meaning/utterance pairs,
and [pace Elman] with very little corrective
feedback.
Asking the Right Questions
• How might a language organize itself to deal with
the fact that only an infinitesimal fraction of the
possible meaning/utterance pairs will be heard by a
given speaker in their lifetime?
• How might a nervous system (synaptic weights,
topology of neurons) organize itself to match the
regularities in its environment?
Self-Organizing Maps (Kohonen 1984)
• Input data consisting of N-dimensional vectors
• Nodes (units) in a 2D grid
• Each node has a synaptic weight vector of N dimensions
• Simple, "unsupervised" learning algorithm...
SOM Learning Algorithm
1. Pick an input vector at random
2. The "winning" node is the one whose weight vector is
closest to the input vector in vector space.
3. Update weights of winner and its grid neighbors
to move them closer to the input
Get Matlab code: http://www.cs.wlu.edu/~levy/som
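The talk points to Matlab code at the URL above; here is a hedged pure-Python sketch of the same three steps (grid size, learning rates, neighborhood radius, and training data are illustrative choices, not taken from that code):

```python
import random

random.seed(1)
ROWS, COLS, DIM = 6, 6, 2  # 2D grid of nodes, 2-dimensional inputs

# Each node in the grid starts with a random N-dimensional weight vector.
weights = {(r, c): [random.random() for _ in range(DIM)]
           for r in range(ROWS) for c in range(COLS)}

# Training data: random points in the unit square.
data = [[random.random(), random.random()] for _ in range(200)]

def dist2(u, v):
    """Squared Euclidean distance in vector space."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

for t in range(2000):
    x = random.choice(data)                                   # 1. random input
    win = min(weights, key=lambda n: dist2(weights[n], x))    # 2. winner
    for (r, c), w in weights.items():                         # 3. update winner
        grid_d = abs(r - win[0]) + abs(c - win[1])            #    and neighbors
        if grid_d <= 1:
            rate = 0.1 if grid_d == 0 else 0.05  # neighbors move less
            for i in range(DIM):
                w[i] += rate * (x[i] - w[i])     # move weights toward input

# After training, each input lies close to some node's weight vector.
err = sum(min(dist2(w, x) for w in weights.values()) for x in data) / len(data)
print(err)  # small average quantization error
```

Note that the grid topology enters only through step 3: neighbors in the grid are dragged toward the same inputs, which is what makes nearby nodes come to represent nearby regions of the input space.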
SOM Learning: A Two-Part Invention in Two Dimensions
SOM Learning: A Three-Part Invention in Three Dimensions
Self-Organizing Language
● So the grid can have any number of dimensions!
● Replace the grid with a high-dimensional HRR vector
● Learn to map from HRRs for meanings to HRRs
for utterances.
● What sort of regularities emerge?
Conclusions
● Distributed/vector representations can encode all
sorts of information once thought to be solely the
domain of folk psychology.
● But we will need completely new organizational
principles (holograms, deformable maps, fractals,
error gradients) to be able to tackle the really hard
problems.
Thank You!