A computational study of cross-situational techniques for learning word-to-meaning mappings


Page 1: A computational study of cross-situational techniques for learning word-to-meaning mappings

A computational study of cross-situational techniques for learning word-to-meaning mappings

Jeffrey Mark Siskind
Presented by David Goss-Grubbs
March 5, 2006

Page 2: The Problem: Mapping Words to Concepts

►Child hears John went to school
►Child sees GO(John, TO(school))
►Child must learn:
  John → John
  went → GO(x, y)
  to → TO(x)
  school → school
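The sketches that follow render these ideas in Python, encoding conceptual expressions as nested tuples with the functor first and strings such as 'x1' for variables. This encoding and all helper names are illustrative conventions, not Siskind's.

```python
# Target lexicon for this slide, under an illustrative encoding:
# nested tuples, functor first; 'x1', 'x2', ... are variables.
target_lexicon = {
    'John':   'John',               # a constant concept
    'went':   ('GO', 'x1', 'x2'),
    'to':     ('TO', 'x1'),
    'school': 'school',
}
heard = ['John', 'went', 'to', 'school']
seen  = ('GO', 'John', ('TO', 'school'))
```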

Page 3: Two Problems

►Referential uncertainty: the scene supports spurious hypotheses such as
  MOVE(John, feet)
  WEAR(John, RED(shirt))
►Determining the correct alignment, e.g. ruling out a misalignment such as:
  John → TO(x)
  walked → school
  to → John
  school → GO(x, y)

Page 4: Helpful Constraints

►Partial knowledge
►Cross-situational inference
►Covering constraints
►Exclusivity

Page 5: Partial Knowledge

►Child hears Mary lifted the block
►Child sees
  CAUSE(Mary, GO(block, UP))
  WANT(Mary, block)
  BE(block, ON(table))
►If the child knows lift contains CAUSE, the latter two hypotheses can be ruled out.

Page 6: Cross-situational inference

►John lifted the ball
  CAUSE(John, GO(ball, UP))
►Mary lifted the block
  CAUSE(Mary, GO(block, UP))
►Thus, lifted must be one of:
  {UP, GO(x, y), GO(x, UP), CAUSE(x, y), CAUSE(x, GO(y, z)), CAUSE(x, GO(y, UP))}
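Mechanically, this set can be computed by enumerating every subexpression of each observed meaning, abstracting subtrees to variables in all possible ways, and intersecting across situations. A sketch under the tuple encoding above; the procedure is a reconstruction of the idea, not Siskind's code:

```python
from itertools import product

def subexprs(e):
    """Yield e and every subexpression of e."""
    yield e
    if isinstance(e, tuple):
        for arg in e[1:]:
            yield from subexprs(arg)

def gens(e):
    """All abstractions of e: replace any subtree with a variable
    placeholder (None), or keep the functor and abstract each arg."""
    yield None
    if isinstance(e, tuple):
        for args in product(*(gens(a) for a in e[1:])):
            yield (e[0],) + args
    else:
        yield e

def number_vars(e, n=None):
    """Rename None placeholders to x1, x2, ... left to right, so
    alpha-equivalent abstractions compare equal."""
    if n is None:
        n = [0]
    if e is None:
        n[0] += 1
        return f'x{n[0]}'
    if isinstance(e, tuple):
        return (e[0],) + tuple(number_vars(a, n) for a in e[1:])
    return e

def generalizations(meaning):
    """All variablized subexpressions of one meaning hypothesis."""
    return {number_vars(g) for s in subexprs(meaning) for g in gens(s)}

lifted1 = ('CAUSE', 'John', ('GO', 'ball', 'UP'))
lifted2 = ('CAUSE', 'Mary', ('GO', 'block', 'UP'))
common = generalizations(lifted1) & generalizations(lifted2)
print(common - {'x1'})   # dropping the trivial bare variable leaves
                         # exactly the six candidates on this slide
```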

Page 7: Covering constraints

►Assume: all components of an utterance's meaning come from the meanings of words in that utterance.
►If it is known that CAUSE is not part of the meaning of John, the, or ball, it must be part of the meaning of lifted.
►(But what about constructional meaning?)
►(Both this constraint and the next are sketched in code after Page 8.)

Page 8: Exclusivity

►Assume: any portion of the meaning of an utterance comes from no more than one of its words.
►If John walked → WALK(John) and John → John, then walked can be no more than WALK(x).
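At the level of conceptual symbols, covering and exclusivity both reduce to set operations. A toy rendering of the two inferences, paraphrasing the slides rather than Siskind's exact update rules:

```python
# Toy versions of both constraints as operations on conceptual-symbol
# sets; a paraphrase of the slides, not Siskind's exact formulation.
def covering(meaning_syms, word, others, possible, necessary):
    """Symbols of the utterance meaning that no other word could
    possibly contribute must come from `word` (Page 7)."""
    others_can = set().union(*(possible[o] for o in others))
    necessary[word] |= meaning_syms - others_can

def exclusivity(word, others, possible, necessary):
    """Symbols that some other word must contribute cannot also come
    from `word`, assuming each occurs once in the meaning (Page 8)."""
    claimed = set().union(*(necessary[o] for o in others))
    possible[word] -= claimed

# "John walked" means WALK(John); the entry for John is already known.
possible  = {'John': {'John'}, 'walked': {'WALK', 'John'}}
necessary = {'John': {'John'}, 'walked': set()}
covering({'WALK', 'John'}, 'walked', ['John'], possible, necessary)
exclusivity('walked', ['John'], possible, necessary)
print(necessary['walked'], possible['walked'])   # {'WALK'} {'WALK'}
```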

Page 9: Three more problems

►Bootstrapping
►Noisy input
►Homonymy

Page 10: Bootstrapping

►Lexical acquisition is much easier if some of the language is already known
►Some of Siskind's strategies (e.g. cross-situational learning) work without such knowledge
►Others (e.g. exclusivity) require it
►The algorithm starts off slow, then speeds up

Page 11: Noise

►Only a subset of all possible meanings will be available to the algorithm
►If none of them contains the correct meaning, cross-situational learning would cause those words never to be acquired
►Some portion of the input must therefore be ignored
►(A statistical approach is rejected; it is not clear why)

Page 12: Homonymy

►As with noisy input, cross-situational techniques would fail to find a consistent mapping for homonymous words
►When an inconsistency is found, a split is made
►If the split is corroborated, a new sense is created; otherwise it is noise

Page 13: The problem, formally stated

►From: a sequence of utterances
  Each utterance is an unordered collection of words
  Each utterance is paired with a set of conceptual expressions
►To: a lexicon
  The lexicon maps each word to a set of conceptual expressions, one for each sense of the word

Page 14: Composition

►Select one sense for each word
►Find all ways of combining these conceptual expressions (a toy example follows this slide)
►The meaning of an utterance is derived only from the meanings of its component words
►Every conceptual expression in the meanings of the words must appear in the final conceptual expression (copies are possible)
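Under the tuple encoding, one "way of combining" senses is substituting one sense's expression for a variable of another. A toy composer for "John went to school"; the exhaustive search over alternative combinations is omitted:

```python
def substitute(e, var, val):
    """Replace variable `var` in expression e with `val`."""
    if e == var:
        return val
    if isinstance(e, tuple):
        return (e[0],) + tuple(substitute(a, var, val) for a in e[1:])
    return e

went, to = ('GO', 'x1', 'x2'), ('TO', 'x1')
m = substitute(to, 'x1', 'school')    # TO(school)
m = substitute(went, 'x2', m)         # GO(x1, TO(school))
m = substitute(m, 'x1', 'John')       # GO(John, TO(school))
print(m)   # every word's expression appears in the result
```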

Page 15: The simplified algorithm: no noise or homonymy

►Two learning stages
  Stage 1: the set of conceptual symbols, e.g. {CAUSE, GO, UP}
  Stage 2: the conceptual expression, e.g. CAUSE(x, GO(y, UP))

Page 16: Stage 1: Conceptual symbol set

►Maintain sets of necessary and possible conceptual symbols for each word
►Initialize the former to the empty set and the latter to the universal set
►Utterances will increase the necessary set and decrease the possible set, until they converge on the actual conceptual symbol set
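A minimal sketch of the Stage 1 bookkeeping. The toy UNIVERSE and the single update rule below compress Siskind's several inference rules into two set operations, so treat it as illustrative only:

```python
# A minimal Stage 1 sketch; UNIVERSE is a toy symbol inventory.
UNIVERSE = {'CAUSE', 'GO', 'TO', 'UP', 'WANT', 'BE', 'ON',
            'John', 'Mary', 'ball', 'block', 'table', 'arm'}

necessary = {}   # word -> symbols its meaning must contain
possible  = {}   # word -> symbols its meaning may contain

def stage1_update(words, hypotheses):
    """hypotheses: one or more symbol sets, one per meaning
    hypothesis that survived filtering for this utterance."""
    for w in words:
        necessary.setdefault(w, set())
        possible.setdefault(w, set(UNIVERSE))
    in_some = set().union(*hypotheses)
    in_all  = set.intersection(*hypotheses)
    for w in words:
        # A word cannot mean symbols that no hypothesis contains.
        possible[w] &= in_some
        # Covering: symbols in every hypothesis that no other word
        # could contribute must be part of w's meaning.
        others_can = set().union(*(possible[o] for o in words if o != w))
        necessary[w] |= in_all - others_can
```

Repeated across utterances, necessary grows and possible shrinks until the two sets coincide, as in the tables on Pages 18 and 20.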

Page 17: Stage 2: Conceptual expression

►Maintain a set of possible conceptual expressions for each word
►Initialize to the set of all expressions that can be composed from the actual conceptual symbol set
►New utterances will decrease the possible conceptual expression set until only one remains
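Continuing the sketch, and reusing generalizations() from the Page 6 block: once Stage 1 has converged on a word's symbol set, the Stage 2 narrowing can be approximated as one intersection per utterance. One real limitation: this simplification numbers every variable freshly, so it cannot express the shared variable in CAUSE(x, GO(y, TO(x))) that Siskind's full algorithm recovers (see Page 21).

```python
# Continuing the sketch; generalizations() is defined after Page 6.
def symbols(e):
    """Conceptual symbols (functors and constants) in an expression,
    treating strings like 'x1' as variables, not symbols."""
    if isinstance(e, tuple):
        return {e[0]}.union(*(symbols(a) for a in e[1:]))
    return set() if e.startswith('x') else {e}

def stage2_update(word, meaning, possible_exprs, symbol_set):
    """Keep only candidate expressions that are generalized
    subexpressions of this utterance's meaning and use exactly the
    word's converged Stage 1 symbol set."""
    candidates = {g for g in generalizations(meaning)
                  if symbols(g) == symbol_set}
    if word in possible_exprs:
        possible_exprs[word] &= candidates
    else:
        possible_exprs[word] = candidates   # first sighting

exprs = {}
stage2_update('took', ('CAUSE', 'John', ('GO', 'ball', ('TO', 'John'))),
              exprs, {'CAUSE', 'GO', 'TO'})
print(exprs['took'])   # {('CAUSE', 'x1', ('GO', 'x2', ('TO', 'x3')))}
```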

Page 18: Example

        Necessary   Possible
John    {John}      {John, ball}
took    {CAUSE}     {CAUSE, WANT, GO, TO, arm}
the     {}          {WANT, arm}
ball    {ball}      {ball, arm}

Page 19: Selecting the meaning

John took the ball
►CAUSE(John, GO(ball, TO(John)))
►WANT(John, ball)
►CAUSE(John, GO(PART-OF(LEFT(arm), John), TO(ball)))
  The second is eliminated because it lacks CAUSE (necessary for took)
  The third is eliminated because no word has LEFT or PART-OF
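The filter this example applies, sketched over the Page 18 table: a hypothesis survives only if it contains every word's necessary symbols and introduces no symbol outside the union of the words' possible symbols.

```python
# Hypothesis filter over the Page 18 table.
necessary = {'John': {'John'}, 'took': {'CAUSE'},
             'the': set(), 'ball': {'ball'}}
possible  = {'John': {'John', 'ball'},
             'took': {'CAUSE', 'WANT', 'GO', 'TO', 'arm'},
             'the':  {'WANT', 'arm'},
             'ball': {'ball', 'arm'}}

def viable(hypothesis_symbols):
    need = set().union(*necessary.values())
    can  = set().union(*possible.values())
    return need <= hypothesis_symbols <= can

print(viable({'CAUSE', 'John', 'GO', 'ball', 'TO'}))   # True
print(viable({'WANT', 'John', 'ball'}))                # False: no CAUSE
print(viable({'CAUSE', 'John', 'GO', 'PART-OF', 'LEFT',
              'arm', 'TO', 'ball'}))                   # False: LEFT, PART-OF
```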

Page 20: Updated table

        Necessary         Possible
John    {John}            {John}
took    {CAUSE, GO, TO}   {CAUSE, GO, TO}
the     {}                {}
ball    {ball}            {ball}

Page 21: Stage 2

CAUSE(John, GO(ball, TO(John)))

John    {John}
took    {CAUSE(x, GO(y, TO(x)))}
the     {}
ball    {ball}

Page 22: Noise and Homonymy

►Noisy or homonymous data can corrupt the lexicon by:
  Adding an incorrect element to the set of necessary elements
  Taking a correct element away from the set of possible elements
►This may or may not create an inconsistent entry

Page 23: Extended algorithm

►Necessary and possible conceptual symbols are mapped to senses rather than words
►Words are mapped to their senses
►Each sense has a confidence factor

Page 24: Sense assignment

►For each utterance, find the cross-product of all the senses
►Choose the "best" consistent sense assignment
►Update the entries for those senses as before
►Add to a sense's confidence factor each time it is used in a preferred assignment
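Schematically, the sense-assignment loop might look as follows. consistent() and update_entry() are stand-ins for the Stage 1/2 machinery above, and scoring an assignment by summed confidence is a guess at what "best" means here:

```python
from itertools import product

def consistent(assignment, meaning):
    return True        # stand-in for the Stage 1/2 consistency checks

def update_entry(sense, meaning):
    pass               # stand-in for narrowing the sense's sets

def process_utterance(words, hypotheses, senses, confidence):
    choices = [dict(zip(words, pick))
               for pick in product(*(senses[w] for w in words))]
    viable = [(a, m) for a in choices for m in hypotheses
              if consistent(a, m)]
    if not viable:
        return None    # handled by the inconsistent-utterance step (Page 25)
    assignment, meaning = max(
        viable, key=lambda am: sum(confidence[s] for s in am[0].values()))
    for sense in assignment.values():
        confidence[sense] += 1       # preferred assignments corroborate
        update_entry(sense, meaning)
    return assignment, meaning
```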

Page 25: Inconsistent utterances

►Add the minimal number of new senses until the utterance is no longer inconsistent. Three possibilities:
  The current utterance is noise, and the new senses are bad (they will be ignored)
  There really are new senses
  The original senses were bad, and the right senses are only now being added
►On occasion, remove senses with low confidence factors

Page 26: Four simulations

►Vary the task along five parameters
►Vocabulary growth rate by size of corpus
►Number of required exposures to a word by size of corpus
►How high can it scale?

Page 27: Method (1 of 2)

►Construct a random lexicon
►Vary it by three parameters:
  Vocabulary size
  Homonymy rate
  Conceptual-symbol inventory size

Page 28: Method (2 of 2)

►Construct a series of utterances, each paired with a set of meaning hypotheses
►Vary this by the following parameters:
  Noise rate
  Degree of referential uncertainty
  Cluster size (5)
  Similarity probability (.75)

Page 29: Sensitivity analysis [chart]

Page 30: Vocabulary size [chart]

Page 31: Degree of referential uncertainty [chart]

Page 32: Noise rate [chart]

Page 33: Conceptual-symbol inventory size [chart]

Page 34: Homonymy rate [chart]

Page 35: Vocabulary Growth [chart]

Page 36: Number of exposures [chart]