Modelling Language Acquisition with Neural Networks
Steve R. Howell
A preliminary research plan
Presentation Overview
Goals & challenges of this modelling project
Examination of previous & related research
Overall plan of this project
Implementation and evaluation details, where available
Project Goals
Model two aspects of human language acquisition in a single neural network through word prediction mechanisms: grammar and semantics
Use only orthographic representations, not phonological
Use small but functional word corpus (e.g. child’s basic functional vocabulary?)
Challenges
Need a network architecture capable of modelling both grammar and semantics
Most humans learn language phonologically first, reading later. What if phonology is required?
Computational limits restrict us to a small word corpus; can we achieve functional communication with it?
Previous Research
Elman (1990)
Mozer (1987)
Seidenberg & McClelland (1989)
Landauer et al. (LSA)
Rao & Ballard (1997)
Elman (1990)
Competitors Strengths Weaknesses
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
FOR MORE INFO...
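For concreteness, the next-word-prediction mechanism of an Elman-style simple recurrent network can be sketched as follows. This is an illustrative toy (the vocabulary, hidden size, learning rate, and corpus are invented here), not Elman's actual simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and "subject verb object" corpus (invented for illustration).
vocab = ["boy", "sees", "dog", "girl", "chases", "cat"]
word_to_idx = {w: i for i, w in enumerate(vocab)}
V, H = len(vocab), 8

W_xh = rng.normal(0, 0.1, (H, V))   # input -> hidden
W_hh = rng.normal(0, 0.1, (H, H))   # context (previous hidden state) -> hidden
W_hy = rng.normal(0, 0.1, (V, H))   # hidden -> output

def one_hot(i):
    v = np.zeros(V)
    v[i] = 1.0
    return v

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def step(x, h_prev):
    # Hidden state mixes the current word with the saved context.
    h = np.tanh(W_xh @ x + W_hh @ h_prev)
    y = softmax(W_hy @ h)           # distribution over the next word
    return h, y

# Train with one-step truncated backpropagation on next-word prediction.
corpus = [["boy", "sees", "dog"], ["girl", "chases", "cat"]] * 50
lr = 0.5
for sent in corpus:
    h = np.zeros(H)
    for t in range(len(sent) - 1):
        x = one_hot(word_to_idx[sent[t]])
        target = word_to_idx[sent[t + 1]]
        h_new, y = step(x, h)
        dy = y.copy()
        dy[target] -= 1.0                      # softmax cross-entropy gradient
        dh = (W_hy.T @ dy) * (1 - h_new ** 2)  # backprop through tanh
        W_hy -= lr * np.outer(dy, h_new)
        W_xh -= lr * np.outer(dh, x)
        W_hh -= lr * np.outer(dh, h)
        h = h_new

# After training, "boy" should make "sees" the most likely next word.
h, y = step(one_hot(word_to_idx["boy"]), np.zeros(H))
prediction = vocab[int(np.argmax(y))]
print(prediction)
```

The context weights `W_hh` are what let the network learn sequential (grammatical) structure beyond single-word statistics.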
Mozer (1987)
Competitors Strengths Weaknesses
Mozer, M.C. (1987). Early parallel processing in reading: A connectionist approach. In M. Coltheart (Ed.), Attention and Performance XII: The psychology of reading.
Seidenberg & McClelland (1989)
Competitors Strengths Weaknesses
Seidenberg, M.S. & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.
Landauer et al.
“LSA” - a model of semantic learning by statistical regularity detection using Principal Components Analysis
Very large word corpus and significant processing resources required, but good performance
Data set apparently proprietary
Don’t call them, they’ll call you.
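The statistical-regularity idea behind LSA can be illustrated at toy scale: build a word-by-document count matrix, reduce it with a truncated SVD (closely related to the Principal Components Analysis mentioned above), and compare words in the reduced space. The four tiny "documents" below are invented; Landauer's corpus is vastly larger.

```python
import numpy as np

# Four tiny "documents" spanning two topics (invented data).
docs = [
    "dog barks at cat",
    "cat chases dog",
    "stock market rises",
    "market prices fall",
]
vocab = sorted({w for d in docs for w in d.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Word-by-document count matrix.
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[idx[w], j] += 1

# Truncated SVD: keep only the top-k dimensions of variation.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
word_vecs = U[:, :k] * s[:k]   # each row is a word's reduced "semantic" vector

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Words from the same topic end up closer than words across topics.
same_topic = cosine(word_vecs[idx["dog"]], word_vecs[idx["cat"]])
cross_topic = cosine(word_vecs[idx["dog"]], word_vecs[idx["market"]])
print(same_topic, cross_topic)
```

Note the model never sees word order: only co-occurrence counts, which is exactly the "bag of words" property discussed later.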
Rao & Ballard (1997)
Competitors Strengths Weaknesses
Rao, R.P.N. & Ballard, D.H. (1997). Dynamic model of visual recognition predicts neural response properties in the visual cortex. Neural Computation, 9, 721-763.
Overall View of this Project
Architecture as in Rao & Ballard
Recurrent structure is as applicable to temporal variability as to spatial variability, or more so
Starting with a single-layer network, moving to a multi-layer Rao & Ballard net
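A sketch of the single-layer case: the network holds an internal state r, predicts its input as U r, and adjusts both the state and the weights to reduce the prediction error, in the spirit of Rao & Ballard's predictive-coding scheme. All dimensions and learning rates here are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n_input, n_state = 16, 4
U = rng.normal(0, 0.1, (n_input, n_state))   # generative (top-down) weights

def settle(I, U, n_steps=200, lr_r=0.02):
    """Infer the state r for one input by gradient descent on the error."""
    r = np.zeros(n_state)
    for _ in range(n_steps):
        err = I - U @ r          # prediction error at the input layer
        r += lr_r * (U.T @ err)  # move the state to reduce the error
    return r

# Inputs drawn from a 4-dimensional subspace, so a 4-unit state can
# in principle learn to predict them well.
basis = rng.normal(0, 0.5, (n_input, n_state))
lr_U = 0.01
errors = []
for _ in range(300):
    I = basis @ rng.normal(0, 1, n_state)
    r = settle(I, U)
    err = I - U @ r
    U += lr_U * np.outer(err, r)   # weight update driven by residual error
    errors.append(float(err @ err))

# Prediction error should drop as the weights learn the input subspace.
early, late = np.mean(errors[:20]), np.mean(errors[-20:])
print(early, late)
```

In the multi-layer version, each layer plays this same game with the layer below, which is what the temporal extension proposed here would build on.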
Overall View (ctd.)
Input representations very high-level
First layer of net thus a word prediction (from letters) level
Second layer adds word prediction from previous words (simple grammar? words predict next words?)
Overall View (ctd.)
Additional higher levels should add a larger temporal range of influence
Words in a large temporal range around the current word help to predict it.
Implies semantic linkage?
Analogous to the LSA “bag of words” approach at these levels
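The "bag of words" notion can be made concrete: represent each word by the unordered counts of words occurring near it, and compare words by the similarity of those counts. The toy corpus and window size below are invented for illustration.

```python
import numpy as np

# Invented toy corpus; "." is kept as a token.
corpus = ("the dog ran home . the cat ran home . "
          "the dog ate food . the cat ate food .").split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
window = 2   # how many words on each side count as "context"

# Count, for each word, which words occur within its window (order ignored).
counts = np.zeros((len(vocab), len(vocab)))
for t, w in enumerate(corpus):
    for off in range(-window, window + 1):
        if off != 0 and 0 <= t + off < len(corpus):
            counts[idx[w], idx[corpus[t + off]]] += 1

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "dog" and "cat" occur in near-identical bags of context words, so their
# vectors should be more similar than those of "dog" and "food".
sim_same = cosine(counts[idx["dog"]], counts[idx["cat"]])
sim_diff = cosine(counts[idx["dog"]], counts[idx["food"]])
print(sim_same, sim_diff)
```

This is the kind of order-free regularity the higher network levels would be expected to pick up, while the lower levels handle word order.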
Possible Advantages
Lower level units learn grammar, higher level units learn semantics
Effectively combines grammar-learning methods with LSA-like statistical bag-of-words approach
Top-down prediction route allows for possible extension to language generation, unlike LSA
Disadvantages
Relatively complex mathematical implementation (Kalman filter), especially compared to Elman nets
Unclear how well higher levels will actually perform semantic learning
While better suited than LSA, it is as yet unclear how to modify the net for language generation
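On the Kalman-filter point: even the one-dimensional predict/correct cycle shows the extra machinery (explicit variances and gains) relative to plain backpropagation. A minimal sketch, with all numbers purely illustrative:

```python
import numpy as np

def kalman_step(x, P, z, Q=0.01, R=0.1):
    """One predict/correct cycle for a scalar constant-state model."""
    P = P + Q              # predict: uncertainty grows by process noise Q
    K = P / (P + R)        # Kalman gain: how much to trust the measurement
    x = x + K * (z - x)    # correct the estimate toward the measurement z
    P = (1 - K) * P        # corrected uncertainty
    return x, P

rng = np.random.default_rng(2)
true_state = 1.0
x, P = 0.0, 1.0            # initial estimate and its uncertainty
for _ in range(100):
    z = true_state + rng.normal(0, 0.1)   # noisy measurement
    x, P = kalman_step(x, P, z)
print(x)   # estimate converges toward the true state
```

Rao & Ballard's full formulation applies this estimation machinery to whole state vectors at every network level, which is the source of the implementation complexity.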
Resources Required
Hope to find information about basic functional vocabulary of English language (600-800 words?)
Models of language acquisition of course imply comparison to children’s language learning/usage, not adults: child data?
Model Evaluation (basic)
If the lower levels learn to predict words from the previous word or two, then we can test as Elman did.
If the higher levels learn semantic regularities as in LSA, then we can test as for LSA.
Model Evaluation (optimistic)
If generative modifications can be made, might be able to output words/phrases semantically linked to input words/phrases (ElizaNET?).
Child Turing test? (human judges compare model output to real children’s output for the same input?)
Current Status/Next Steps
Still reviewing previous research
Working through implementation details of the Rao & Ballard algorithm
Must consider different types of high-level input representations
Need to develop/acquire a basic English vocabulary/grammar data set
Thank you.
Questions and comments are expressly welcomed. Thoughts on
any of the questions raised herein will be extremely valuable.