Statistical Learning in Infants (and bigger folks)

Upload: lionel-mclaughlin

Post on 14-Dec-2015

TRANSCRIPT

  • Slide 1

Statistical Learning in Infants (and bigger folks)

  • Slide 2

Statistical Learning
Neural network models emphasize the value of statistical information in language
What information can be extracted from this?
Is this sufficient to account for human performance?
Are humans able to perform this kind of analysis?
If so, does it contribute to an understanding of the uniquely human ability to learn language?

  • Slide 3

Saffran, Aslin, & Newport (1996)
8-month-old infants
Passive exposure to continuous speech (2 mins): bidakupadotigolabubidaku
Test (Experiment #2):
    bidakubidakubidakubidakubidaku
    kupadokupadokupadokupadokupado
Infants listen longer to unfamiliar sequences
Transitional probabilities: bi da ku pa do ti (1.0 within a word, .33 across a word boundary)
Jenny Saffran, Dick Aslin, Elissa Newport

  • Slides 4-6 (no text)

  • Slide 7

What is it good for? Word Learning
Transitional probabilities: local minima = word boundaries
Saffran's example: pretty baby /prItibebi/
    p(ti|prI) = 0.8
    p(be|ti) = 0.03
How else could children segment words?
    Words in isolation (Peters, 1983; Pinker, 1984)
    Stress-based segmentation: 90% of English words are stress-initial (Cutler & Carter, 1987)
    Phonotactic segmentation, e.g., *dnight (Gambell & Yang, 2005)

  • Slide 8

Are Local Minima Effective?
Gambell & Yang (2005) - Adult input to children from 3 corpora in CHILDES
226,178 words, 263,660 syllables
Precision: hits/(hits + false alarms) = 41.6%
Recall: hits/(hits + misses) = 23.3%

  • Slide 9

More Statistical Learning
Additional stimulus types: tones, shapes, etc.
Additional species

  • Slide 10

Cotton-top Tamarin
Jackendoff
Marc Hauser

  • Slides 11-14 (no text)

  • Slide 15

Where do constraints come from?
Substantive Constraints: If the statistical learning mechanism is able to pick up regularities that go beyond those found in natural languages, then there must be additional substantive linguistic constraints that provide the restrictions on natural languages
Constraints on Learning & Processing: some of the constraints on natural language structure might arise from constraints on the computational abilities this mechanism exhibits. (p. 130)

  • Slides 16-20 (no text)

  • Slide 21

Albert Bregman

  • Slide 22

Autosegmental Phonology

    k       t       b
    |       |       |
    C - V - C - V - C
        |       |
        a       a

  • Slide 23

Where do constraints come from?
This compatibility between learning and languages in turn suggests that natural language structures may be formed, at least in part, by the constraints and selectivities of what human learners find easy to acquire. (p. 159)

  • Slide 24

Where do constraints come from?
How well does this generalize?

  • Slide 25

Where do constraints come from?
Substantive Constraints vs. Constraints on Learning or Processing
Rather than removing the need for substantive constraints, Newport's approach seems to shift the burden of explanation onto the theory of representations

  • Slides 26-29 (no text)

  • Slide 30

Jenny Saffran, Curr. Dir. Psych. Sci., 12: 110-114 (2003)

  • Slide 31 (no text)

  • Slide 32

Experiment 1 - Syllable Size
Step 1: Pattern Induction
    Regime A: CVCV words, e.g., boga, diku
    Regime B: CVCCVC words, e.g., bikrub, gadkug
Step 2: Segmentation
    4 words: [baku, dola], [tupgod, girbup]
    Continuous stream: tupgodbakugirbupdolabaku
Step 3: Testing
    Same words used in segmentation: [baku, dola], [tupgod, girbup]
    Infants listened longer to words consistent w/ induced pattern

  • Slide 33

Experiment 2 - Phonotactics
Step 1: Pattern Induction
    Regime A: -V+V syllables, e.g., todkad, pibtug
    Regime B: +V-V syllables, e.g., dakdot, gutbip
Step 2: Segmentation
    4 words: [kibpug, pagkob], [bupgok, gikbap]
    Continuous stream: pagkobbupgokgikbapkibpug
Step 3: Testing
    Same words used in segmentation: [kibpug, pagkob], [bupgok, gikbap]
    Infants listened longer to words consistent w/ induced pattern

  • Slide 34

Experiment 3 - Unnatural Phonotactics
Experiment 2: -V+V pattern is stated over a feature-based class: /p,t,k/ vs. /b,d,g/
Experiment 3: Modify segment groupings: /p,d,k/ vs. /b,t,g/
Other details just like Experiment 2
No listening preference at test phase

  • Slide 35

Conclusion
To the extent that patterns that do not occur in natural languages are more difficult to acquire, we may consider the possibility that constraints on how infants learn may have served to shape the phonology of natural languages. Patterns that are difficult to acquire are less likely to persist cross-linguistically than those that are easily learned. Thus, languages may exploit devices such as voicing regularities in part because they are readily acquired by young learners. [Saffran & Thiessen 2003, p. 491]

  • Slide 36

Marcus et al. (1999)
Training
    ABA: ga na ga, li ti li
    ABB: ga na na, li ti ti
Testing
    ABA: wo fe wo
    ABB: wo fe fe
Gary Marcus

  • Slide 37

#1: ABB vs. ABA
#2: ABB vs. ABA
#3: ABB vs. AAB

  • Slide 38 (no text)

  • Slide 39

(Peña et al. 2002)

  • Slides 40-41 (no text)

  • Slide 42

Rule learning in infants is domain-specific
Marcus, Fernandes & Johnson, submitted

  • Slides 43-48 (no text)

  • Slide 49

So what are we learning?

  • Slide 50 (no text)

  • Slide 51

Verb Argument Structure

  • Slide 52

Baker (1979)
Alternating Verbs
    John gave a cookie to the boy.
    John gave the boy a cookie.
    Mary showed some photos to her family.
    Mary showed her family some photos.
Non-Alternating Verbs
    John donated a painting to the museum.
    *John donated the museum a painting.
    Mary displayed her art collection to the visitors.
    *Mary displayed the visitors her art collection.
Learnability problem: how to avoid overgeneralization

  • Slide 53

Verb Argument Structure
Locative Verbs
    Sally poured the water into the glass.
    *Sally poured the glass with water.
    *Sally filled the water into the glass.
    Sally filled the glass with water.
    Sally piled the books on the table.
    Sally piled the table with books.

  • Slide 54

Verb Argument Structure
Locative Verbs (same examples as Slide 53)
    Figure-verbs -- manner of motion: pour, spill, drip, shake, etc.
    Ground-verbs -- change of state: fill, cover, decorate, soak, etc.
    Alternator-verbs -- manner & change: pile, scatter, load, etc.

  • Slide 55

Verb Classes
Assumptions
    1. Linking rules are consistent across languages
    2. Linking rules need not be learned

  • Slide 56

Seidenberg (1997, Science)
Locative Verb Constructions
    John poured the water into the cup
    *John poured the cup with water
    *Sue filled the water into the glass
    Sue filled the glass with water
    Bill loaded the apples onto the truck
    Bill loaded the truck with apples
Connectionist networks are well suited to capturing systems with this character. Importantly, a network configured as a device that learns to perform a task such as mapping from sound to meaning will act as a discovery procedure, determining which kinds of information are relevant. Evidence that such models can encode precisely the right combinations of probabilistic constraints is provided by Allen (42), who implemented a network that learns about verbs and their argument structures from naturalistic input. (p. 1602)

  • Slide 57

Seidenberg (Science, 3/14/97)
Research on language has arrived at a particularly interesting point, however, because of important developments outside of the linguistic mainstream that are converging on a different view of the nature of language. These developments represent an important turn of events in the history of ideas about language. (p. 1599)

  • Slide 58

Seidenberg (Science, 3/14/97)
A second implication concerns the relevance of poverty-of-the-stimulus arguments to other aspects of language. Verbs and their argument structures are important, but they are language specific rather than universal properties of languages and so must be learned from experience. (p. 1602)

  • Slide 59

Verb Argument Structure
Distributional learning of linking rules (Seidenberg 1997, Science)
    a. Innate knowledge of nature of solution
    b. Model needs to select relevant semantic features from a pool of candidates
Verbs and their argument structures are important, but they are language specific rather than universal properties of languages and so must be learned from experience. (p. 1602)
(Allen & Seidenberg, 1997)

  • Slide 60

Allen's Model
Learns associations between (i) specific verbs & argument structures and (ii) semantic representations
Feature encoding for verbs, 360 features
    [eat]: +act, +cause, +consume, etc.
    [John]: +human, +animate, +male, +automotive, -vehicle

  • Slide 61

Allen's Model
Learns associations between (i) specific verbs & argument structures and (ii) semantic representations
Training set: 1200 utterance types taken from caretaker speech in Peter corpus (CHILDES)

  • Slide 62

Allen's Model
Fine-grained distinction between hit, carry
    John kicked Mary the ball
    *John carried Mary the basket
    [kick]: +cause, +apply-force, +move, +travel, +contact, +hit-with-foot, +strike, +kick, +instantaneous-force, +hit
    [carry]: +cause, +apply-force, +move, +travel, +contact, +carry, +support, +continuous-force, +accompany

  • Slides 63-65 (same text as Slide 62)

  • Slide 66

Allen's Model
Fine-grained distinction between hit, carry
    John kicked Mary the ball
    *John carried Mary the basket
This behavior shows crucially that the network is not merely sensitive to overall semantic similarity: rather, the network has organized the semantic space such that some features are more important than other. (p. 5)

  • Slide 67

Verb Argument Structure
English
    John piled the books on the table.
    John piled the table with books.
Korean
    Yumi-ka    chaek-lul    chaeksang-ey    ssa-ass-ta.
        -Nom   book-Acc     table-Loc       pile-Past-Dec
    Yumi piled books on the table.

    *Yumi-ka   chaeksang-lul   chaek-elo     ssa-ass-ta.
         -Nom  table-Acc       books-with    pile-Past-Dec
    Yumi piled the table with books.
Korean is more restrictive than English - conflates pile-class and pour-class.

  • Slide 68

Verb Argument Structure: Typological Survey
    1. English
    2. Korean
    3. Turkish
    4. Chinese
    5. Japanese
    6. Yoruba
    7. Hebrew
    8. French
    9. Spanish (Castilian)
    10. Spanish (Argentinian)
    11. Arabic
    12. Thai
    13. Luganda
    14. Malay
    15. Hindi
    16. Ewe
    17. Italian
    18. Brazilian Port.
    19. Russian
    20. Polish

  • Slide 69

Verb Argument Structure
Simple VP Structures: She filled the water into the glass. She stuffed feathers into the pillow.
Simple VP Structures: She poured the glass with water. She piled the shelf with books.
Adjectival Passives: The filled water. The stuffed feathers. (*)
Verb Serialization: She pour-put the glass. She pile-put the shelf.
Slide labels, in transcript order: English diff. / Korean same; Korean diff.; Korean diff.; English diff. / Korean same
(Kim, 1999; Kim, Landau, & Phillips, 1999)
Verb class contrasts that seem to disappear in simple/frequent structures reemerge in constructions that are less frequent.

  • Slide 70

Verb Argument Structure
Allen & Seidenberg model presents interesting case for reduced innate knowledge about linking rules
If they are right that linking rules are learned from distributional analysis of common constructions:
    There should be more cross-language variation in verb classes
    We should not find cross-linguistically consistent contrasts buried in obscure corners of the grammars of particular languages

  • Slide 71

Constrained statistical learning idea
    Newport & Aslin 2004 - Cognitive Psych.: only certain relations learned
    Saffran 2003 - Curr. Dir. Psych. Sci.: only certain generalizations made
    Saffran & Thiessen 2003 - Dev. Psych, 39, 484-494
Learning Rules
    Marcus et al. 1999 - Science: generalizing beyond the training stimuli
    Peña et al. 2002 - Science: importance of segmentation for generalization
    Marcus et al. 2005 - Nature?: importance of speech-like stimuli
    Marcus et al. 200x - Cognition: These are simple linear patterns; language requires much more
Conclusion
    Literature has clarified a number of issues regarding systematicity in language
    Controlled studies of forming generalizations
    Distinguish roles of statistics: representation vs. learning tool
    Seidenberg & Locative verbs?
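
Worked sketch (not from the slides): the transitional-probability idea on Slides 3, 7 and 8 can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used by Saffran et al. or by Gambell & Yang. The word order in the toy stream, the fixed two-letter syllabification, the function names, and the choice to score boundaries (rather than whole words) are all assumptions made for illustration. Only the three nonsense words bidaku, padoti and golabu and the precision/recall formulas come from the slides.

```python
from collections import Counter

def syllabify(stream, size=2):
    """Cut a stream into fixed-size syllables (assumption: every syllable is two letters)."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]

def transitional_probabilities(syllables):
    """TP(y | x) = count(x immediately followed by y) / count(x with any successor)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}

def segment_at_local_minima(syllables, tps):
    """Insert a word boundary wherever a TP is strictly lower than both neighbouring TPs."""
    seq = [tps[(x, y)] for x, y in zip(syllables, syllables[1:])]
    words, current = [], [syllables[0]]
    for i, tp in enumerate(seq):
        left = seq[i - 1] if i > 0 else float("inf")
        right = seq[i + 1] if i + 1 < len(seq) else float("inf")
        if tp < left and tp < right:
            words.append("".join(current))
            current = []
        current.append(syllables[i + 1])
    words.append("".join(current))
    return words

def boundary_positions(words):
    """Character offsets of the internal word boundaries."""
    positions, offset = set(), 0
    for word in words[:-1]:
        offset += len(word)
        positions.add(offset)
    return positions

def precision_recall(predicted_words, gold_words):
    """Precision = hits / (hits + false alarms); recall = hits / (hits + misses)."""
    predicted = boundary_positions(predicted_words)
    gold = boundary_positions(gold_words)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical familiarization stream: three words from Slide 3 in an assumed order
# with no immediate repetitions, concatenated without pauses.
A, B, C = "bidaku", "padoti", "golabu"
gold_words = [A, B, C, A, C, B, A, B, C, B, A, C]
syllables = syllabify("".join(gold_words))
tps = transitional_probabilities(syllables)
segmented = segment_at_local_minima(syllables, tps)
precision, recall = precision_recall(segmented, gold_words)

print(tps[("bi", "da")], tps[("ku", "pa")])
print(segmented)
print(precision, recall)
```

In this artificial language every within-word transition has probability 1.0 and every cross-word transition is lower, so the local-minimum rule recovers all boundaries. The 41.6% precision and 23.3% recall on Slide 8 were obtained on real child-directed speech, where the statistics are far less clean.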