The Semantic Quilt: Contexts, Co-occurrences, Kernels, and Ontologies

Ted Pedersen, University of Minnesota, Duluth
http://www.d.umn.edu/~tpederse

Upload: university-of-minnesota-duluth

Posted on 11-May-2015


DESCRIPTION

A talk on the Semantic Quilt, which combines various methods of "doing semantics" into a more unified framework.

TRANSCRIPT

Page 1: The Semantic Quilt

The Semantic Quilt: Contexts, Co-occurrences, Kernels, and Ontologies

Ted Pedersen, University of Minnesota, Duluth

http://www.d.umn.edu/~tpederse

Page 2: The Semantic Quilt

Create by stitching together

Page 3: The Semantic Quilt

Sew together different materials

Page 4: The Semantic Quilt

[Quilt diagram: four patches labeled Ontologies, Co-Occurrences, Kernels, and Contexts]

Page 5: The Semantic Quilt

Semantics in NLP
Potentially useful for many applications:

Machine Translation
Document or Story Understanding
Text Generation
Web Search
…

Can come from many sources
Not well integrated
Not well defined?

Page 6: The Semantic Quilt

What do we mean by semantics? …it depends on our resources…

Ontologies – relationships among concepts
similar / related concepts connected

Dictionary – definitions of senses / concepts
similar / related senses have similar / related definitions

Contexts – short passages of words
similar / related words occur in similar / related contexts

Co-occurrences – a word is defined by the company it keeps
words that occur with the same kinds of words are similar / related

Page 7: The Semantic Quilt

What level of granularity?

words
terms / collocations
phrases
sentences
paragraphs
documents
books

Page 8: The Semantic Quilt

The Terrible Tension: Ambiguity versus Granularity

Words are potentially very ambiguous
But we can list them (sort of)
…we can define their meanings (sort of)
…not ambiguous to a human reader, but hard for a computer to know which meaning is intended

Terms / collocations are less ambiguous
Difficult to enumerate because there are so many, but can be done for a domain (e.g., medicine)

Phrases (short contexts) can still be ambiguous, but not to the same degree as words or terms/collocations

Page 9: The Semantic Quilt

The Current State of Affairs

Most resources and methods focus on word or term semantics
makes it possible to build resources (manually or automatically) with reasonable coverage, but…
…techniques become very resource dependent
…resources become language dependent
…introduces a lot of ambiguity
…not clear how to bring together resources

Similarity is a useful organizing principle, but…
…there are lots of ways to be similar

Page 10: The Semantic Quilt

Similarity as Organizing Principle

Measure word association using knowledge lean methods that are based on co-occurrence information from large corpora

Measure contextual similarity using knowledge lean methods that are based on co-occurrence information from large corpora

Measure conceptual similarity / relatedness using a structured repository of knowledge
Lexical database WordNet
Unified Medical Language System (UMLS)

Page 11: The Semantic Quilt

Things we can do now…

Identify associated words
fine wine
baseball bat

Identify similar contexts
I bought some food at the store
I purchased something to eat at the market

Assign meanings to words
I went to the bank/[financial-inst.] to deposit my check

Identify similar (or related) concepts
frog : amphibian
Duluth : snow

Page 12: The Semantic Quilt

Things we want to do…

Integrate different resources and methods

Solve bigger problems
some of what we do now is a means to an unclear end

Be Language Independent
Offer Broad Coverage

Reduce dependence on manually built resources
ontologies, dictionaries, labeled training data…

Page 13: The Semantic Quilt

Semantic Patches to Sew Together

Contexts
SenseClusters : measures similarity between written texts (i.e., contexts)

Co-Occurrences
Ngram Statistics Package : measures association between words, identifies collocations or terms

Kernels
WSD-Shell : supervised learning for word sense disambiguation, in the process of including SVMs with user-defined kernels

“Ontologies”
WordNet-Similarity : measures similarity between concepts found in WordNet
UMLS-Similarity

All of these are projects at the University of Minnesota, Duluth

Page 14: The Semantic Quilt

[Quilt diagram: four patches labeled Ontologies, Co-Occurrences, Kernels, and Contexts]

Page 15: The Semantic Quilt

Ngram Statistics Package

http://ngram.sourceforge.net

Co-Occurrences

Page 16: The Semantic Quilt

Things we can do now…

Identify associated words
fine wine
baseball bat

Identify similar contexts
I bought some food at the store
I purchased something to eat at the market

Assign meanings to words
I went to the bank/[financial-inst.] to deposit my check

Identify similar (or related) concepts
frog : amphibian
Duluth : snow

Page 17: The Semantic Quilt

Co-occurrences and semantics?

Individual words (esp. common ones) are very ambiguous
bat
line

Pairs of words disambiguate each other
baseball bat
vampire … Transylvania
product line
speech … line

Page 18: The Semantic Quilt

Why pairs of words?

Zipf’s Law
most words are rare, most bigrams are even more rare, most ngrams are rarer still
the more common a word, the more senses it will have

“Co-occurrences” are less frequent than individual words, and tend to be less ambiguous as a result
Mutually disambiguating

Page 19: The Semantic Quilt

Bigrams

Window Size of 2
baseball bat, fine wine, apple orchard, bill clinton

Window Size of 3
house of representatives, bottle of wine

Window Size of 4
president of the republic, whispering in the wind

Selected using a small window size (2-4 words)
Objective is to capture a regular or localized pattern between two words (collocation?)
If order doesn’t matter, then these are co-occurrences…
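
The sliding-window idea above can be sketched in a few lines of Python. This is an illustrative sketch, not the Ngram Statistics Package itself; `window_pairs` is a hypothetical helper name.

```python
from collections import Counter

def window_pairs(tokens, window=2, ordered=True):
    """Collect word pairs occurring within `window` positions of each other.
    window=2 yields only adjacent bigrams; a larger window allows gaps,
    so 'house of representatives' also yields ('house', 'representatives')."""
    pairs = Counter()
    for i, w1 in enumerate(tokens):
        # pair w1 with every token up to window-1 positions to its right
        for j in range(i + 1, min(i + window, len(tokens))):
            pair = (w1, tokens[j]) if ordered else tuple(sorted((w1, tokens[j])))
            pairs[pair] += 1
    return pairs

tokens = "house of representatives passed the bill".split()
pairs = window_pairs(tokens, window=3)
```

With `ordered=False` the pairs become unordered co-occurrences, matching the slide's distinction between bigrams and co-occurrences.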

Page 20: The Semantic Quilt

“occur together more often than expected by chance…”

Observed frequencies for two words occurring together and alone are stored in a 2x2 matrix

Expected values are calculated based on the model of independence and the observed values
How often would you expect these words to occur together, if they only occurred together by chance?
If two words occur “significantly” more often than the expected value, then the words do not occur together by chance.

Page 21: The Semantic Quilt

Measures and Tests of Association
http://ngram.sourceforge.net

Log-likelihood Ratio
Mutual Information
Pointwise Mutual Information
Pearson’s Chi-squared Test
Phi Coefficient
Fisher’s Exact Test
T-test
Dice Coefficient
Odds Ratio

Page 22: The Semantic Quilt

What do we get at the end?

A list of bigrams or co-occurrences that are significant or interesting (meaningful?)
automatic
language independent

These can be used as building blocks for systems that do semantic processing
relatively unambiguous
often very informative about topic or domain
can serve as a fingerprint for a document or book

Page 23: The Semantic Quilt

[Quilt diagram: four patches labeled Ontologies, Co-Occurrences, Kernels, and Contexts]

Page 24: The Semantic Quilt

SenseClusters

http://senseclusters.sourceforge.net

Contexts

Page 25: The Semantic Quilt

Things we can do now…

Identify associated words
fine wine
baseball bat

Identify similar contexts
I bought some food at the store
I purchased something to eat at the market

Assign meanings to words
I went to the bank/[financial-inst.] to deposit my check

Identify similar (or related) concepts
frog : amphibian
Duluth : snow

Page 26: The Semantic Quilt

Identify Similar Contexts

Find phrases that say the same thing using different words
I went to the store
Ted drove to Wal-Mart

Find words that have the same meaning in different contexts
The line is moving pretty fast
I stood in line for 12 hours

Find different words that have the same meaning in different contexts
The line is moving pretty fast
I stood in the queue for 12 hours

Page 27: The Semantic Quilt

SenseClusters Methodology

Represent contexts using first or second order co-occurrences

Reduce dimensionality of vectors
Singular value decomposition

Cluster the context vectors
Find the number of clusters
Label the clusters

Evaluate and/or use the contexts!

Page 28: The Semantic Quilt

Second Order Features

Second order features encode something ‘extra’ about a feature that occurs in a context, something not available in the context itself

Native SenseClusters : each feature is represented by a vector of the words with which it occurs

Latent Semantic Analysis : each feature is represented by a vector of the contexts in which it occurs

Page 29: The Semantic Quilt

Similar Contexts may have the same meaning…

Context 1: He drives his car fast
Context 2: Jim speeds in his auto

Car -> motor, garage, gasoline, insurance
Auto -> motor, insurance, gasoline, accident

Car and Auto share many co-occurrences…
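
The overlap between the car and auto co-occurrence profiles can be quantified with cosine similarity. A sketch with made-up co-occurrence counts (the vectors below are illustrative, not corpus-derived):

```python
import math

def cosine(u, v):
    """Cosine between two sparse co-occurrence vectors stored as dicts."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm = lambda x: math.sqrt(sum(c * c for c in x.values()))
    return dot / (norm(u) * norm(v))

# hypothetical co-occurrence counts from a corpus
car    = {"motor": 3, "garage": 2, "gasoline": 2, "insurance": 1}
auto   = {"motor": 2, "insurance": 2, "gasoline": 1, "accident": 1}
needle = {"thread": 3, "sew": 2}

sim_car_auto   = cosine(car, auto)    # high: many shared co-occurrences
sim_car_needle = cosine(car, needle)  # 0.0: no shared co-occurrences
```

Words whose company overlaps heavily (car, auto) score near 1; words keeping entirely different company score 0.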

Page 30: The Semantic Quilt

Second Order Context Representation

Bigrams used to create a word matrix
Cell values = log-likelihood of the word pair

Rows are first order co-occurrence vectors for a word

Represent a context by averaging the vectors of the words in that context
Context includes the Cxt positions around the target, where Cxt is typically 5 or 20.
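
The averaging step can be sketched as follows. The vectors and dimension names are made up for illustration; `second_order_context` is a hypothetical helper, not SenseClusters code.

```python
def second_order_context(context_words, word_vectors, dims):
    """Average the first-order co-occurrence vectors of the words in a
    context to produce one second-order vector for the whole context."""
    vec = [0.0] * len(dims)
    used = 0
    for w in context_words:
        if w in word_vectors:              # skip words with no row in the matrix
            used += 1
            for k, d in enumerate(dims):
                vec[k] += word_vectors[w].get(d, 0.0)
    return [x / used for x in vec] if used else vec

# hypothetical word matrix rows (cell values = log-likelihood scores)
dims = ["motor", "gasoline", "insurance"]
word_vectors = {"car": {"motor": 3, "gasoline": 2},
                "drives": {"motor": 1}}

ctx = second_order_context(["he", "drives", "his", "car"], word_vectors, dims)
```

Words without a row ("he", "his") contribute nothing; the context vector is the mean of the rows for "drives" and "car".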

Page 31: The Semantic Quilt

2nd Order Context Vectors

He won an Oscar, but Tom Hanks is still a nice guy.

[Table: first-order co-occurrence vectors for “won”, “Oscar”, and “guy” over the dimensions baseball, football, actor, movie, war, family, and needle, averaged into a single second-order vector (O2context) for the context; the numeric cell values were garbled in extraction.]

Page 32: The Semantic Quilt

3232

After context representation…

Second order vector is an average of the word vectors that make up the context; captures indirect relationships
Reduced by SVD to principal components

Now, cluster the vectors!
Many methods; we often use k-means or repeated bisections
CLUTO
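
The clustering step can be sketched with a tiny pure-Python k-means. SenseClusters actually delegates to CLUTO; this toy version uses deterministic initialization (first k vectors as centroids) purely for clarity.

```python
def kmeans(vectors, k, iters=20):
    """Toy k-means for clustering second-order context vectors.
    Returns the cluster index assigned to each input vector."""
    centroids = [list(v) for v in vectors[:k]]   # deterministic init
    assign = [0] * len(vectors)
    for _ in range(iters):
        # assignment step: each vector goes to nearest centroid
        for i, v in enumerate(vectors):
            assign[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(v, centroids[c])))
        # update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [v for i, v in enumerate(vectors) if assign[i] == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

contexts = [[5, 0], [4, 1], [0, 5], [1, 4]]   # two obvious groups
labels = kmeans(contexts, k=2)
```

Contexts whose second-order vectors point the same way end up in the same cluster, which is exactly the grouping described on the next slide.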

Page 33: The Semantic Quilt

What do we get at the end?

Contexts organized into some number of clusters based on the similarity of their co-occurrences

Contexts which share words that tend to co-occur with the same other words are clustered together
2nd order co-occurrences

Page 34: The Semantic Quilt

[Quilt diagram: Ontologies (WordNet-Similarity), Co-Occurrences (Ngram Statistics Package), Kernels (WSD-Shell), Contexts (SenseClusters)]

Page 35: The Semantic Quilt

Oh… we also get plenty of these…

Similarity Matrices…
Word by Word
Ngram by Ngram
Word by Context
Ngram by Context
Context by Word
Context by Ngram
Context by Context

Page 36: The Semantic Quilt

The WSD-Shell

http://www.d.umn.edu/~tpederse/supervised.html

Kernels

Page 37: The Semantic Quilt

Things we can do now…

Identify associated words
fine wine
baseball bat

Identify similar contexts
I bought some food at the store
I purchased something to eat at the market

Assign meanings to words
I went to the bank/[financial-inst.] to deposit my check

Identify similar (or related) concepts
frog : amphibian
Duluth : snow

Page 38: The Semantic Quilt

Machine Learning Approach

Annotate text with sense tags
must select a sense inventory

Find interesting features
bigrams and co-occurrences quite effective

Learn a model
Apply the model to untagged data

Works very well… given sufficient quantities of training data and sufficient coverage of your sense inventory

Page 39: The Semantic Quilt

Kernel Methods

The challenge for any learning algorithm is to separate the training data into groups by finding a boundary (hyperplane)

Sometimes in the original space this boundary is hard to find

Transform data via a kernel function to a different, higher dimensional representation, where boundaries are easier to spot

Page 40: The Semantic Quilt

Kernels are similarity matrices

NSP produces word by word similarity matrices, for use by SenseClusters

SenseClusters produces various sorts of similarity matrices based on co-occurrences

…which can be used as kernels
Latent Semantic kernel
Bigram Association kernel
Co-occurrence Association kernel
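
The simplest such kernel is the linear one: the Gram matrix of dot products between the co-occurrence feature vectors of each pair of contexts. A sketch with made-up feature rows (this illustrates the idea of "a similarity matrix as a kernel", not any specific kernel shipped with the WSD-Shell):

```python
def cooccurrence_kernel(X):
    """Linear kernel over co-occurrence feature vectors:
    K[i][j] = dot product (similarity) between contexts i and j."""
    n = len(X)
    return [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(n)]
            for i in range(n)]

# hypothetical co-occurrence feature vectors for three contexts
X = [[1, 2, 0],
     [2, 1, 0],
     [0, 0, 3]]
K = cooccurrence_kernel(X)
```

A Gram matrix of this form is symmetric and positive semi-definite, which is the property an SVM requires of a user-defined kernel.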

Page 41: The Semantic Quilt

What do we get at the end?

More accurate supervised classifiers that potentially require less training data

Kernel improves the ability to find boundaries between training examples by transforming the feature space to a higher dimensional “cleaner” space…

Page 42: The Semantic Quilt

[Quilt diagram: Ontologies (WordNet-Similarity), Co-Occurrences (Ngram Statistics Package), Kernels (WSD-Shell), Contexts (SenseClusters)]

Page 43: The Semantic Quilt

WordNet-Similarity

http://wn-similarity.sourceforge.net

Ontologies

Page 44: The Semantic Quilt

Things we can do now…

Identify associated words
fine wine
baseball bat

Identify similar contexts
I bought some food at the store
I purchased something to eat at the market

Assign meanings to words
I went to the bank/[financial-inst.] to deposit my check

Identify similar (or related) concepts
frog : amphibian
Duluth : snow

Page 45: The Semantic Quilt

Similarity and Relatedness

Two concepts are similar if they are connected by is-a relationships.
A frog is-a-kind-of amphibian
An illness is-a health_condition

Two concepts can be related in many ways…
A human has-a-part liver
Duluth receives-a-lot-of snow

…similarity is one way to be related

Page 46: The Semantic Quilt

WordNet-Similarity
http://wn-similarity.sourceforge.net

Path based measures
Shortest path (path)
Wu & Palmer (wup)
Leacock & Chodorow (lch)
Hirst & St-Onge (hso)

Information content measures
Resnik (res)
Jiang & Conrath (jcn)
Lin (lin)

Gloss based measures
Banerjee and Pedersen (lesk)
Patwardhan and Pedersen (vector, vector_pairs)

Page 47: The Semantic Quilt

Path Finding

Find the shortest is-a path between two concepts
Rada, et al. (1989)

Scaled by depth of hierarchy
Leacock & Chodorow (1998)

Depth of subsuming concept scaled by sum of the depths of the individual concepts
Wu and Palmer (1994)
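
A sketch of path counting on a tiny hand-built is-a hierarchy (the hierarchy below is hypothetical, not loaded from WordNet, and the Leacock & Chodorow formula -log(path / 2D) is shown with an assumed taxonomy depth D):

```python
import math

# hypothetical toy is-a hierarchy: child -> parent
is_a = {"car": "motor-vehicle", "motor-vehicle": "vehicle",
        "boat": "vehicle", "vehicle": "object"}

def ancestors(c):
    """Path from a concept up to the root, inclusive."""
    path = [c]
    while path[-1] in is_a:
        path.append(is_a[path[-1]])
    return path

def path_length(a, b):
    """Number of nodes on the shortest is-a path (Rada et al. style)."""
    pa, pb = ancestors(a), ancestors(b)
    common = next(x for x in pa if x in pb)   # lowest shared ancestor
    return pa.index(common) + pb.index(common) + 1

def lch(a, b, depth=4):
    """Leacock & Chodorow: -log(path / (2 * taxonomy depth))."""
    return -math.log(path_length(a, b) / (2.0 * depth))
```

So car-boat has the 4-node path car, motor-vehicle, vehicle, boat; shorter paths give higher lch scores.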

Page 48: The Semantic Quilt

[Figure: a fragment of the WordNet noun is-a hierarchy, from Jiang and Conrath (1997): object at the top, subsuming artifact; one branch through instrumentality and conveyance to vehicle, motor-vehicle, and car, with watercraft, boat, and ark nearby; another branch through article and ware to table-ware, cutlery, and fork.]

Page 49: The Semantic Quilt

Information Content

Measure of specificity in an is-a hierarchy (Resnik, 1995)
-log (probability of concept)
High information content values mean very specific concepts (like pitch-fork and basketball shoe)

Count how often a concept occurs in a corpus
Increment the count associated with that concept, and propagate the count up!
If based on word forms, increment all concepts associated with that form

Page 50: The Semantic Quilt

Observed “car”...

*root* (32783 + 1)
  motor vehicle (327 + 1)
    car (73 + 1)
      cab (23)
        minicab (6)
      stock car (12)
    bus (17)

Page 51: The Semantic Quilt

Observed “stock car”...

*root* (32784 + 1)
  motor vehicle (328 + 1)
    car (74 + 1)
      cab (23)
        minicab (6)
      stock car (12 + 1)
    bus (17)

Page 52: The Semantic Quilt

After Counting Concepts...

*root* (32785)
  motor vehicle (329)  IC = 1.9
    car (75)
      cab (23)
        minicab (6)
      stock car (13)  IC = 3.1
    bus (17)  IC = 3.5

Page 53: The Semantic Quilt

Similarity and Information Content

Resnik (1995) uses the information content of the least common subsumer to express similarity between two concepts

Lin (1998) scales the information content of the least common subsumer by the sum of the information content of the two concepts

Jiang & Conrath (1997) find the difference between the least common subsumer’s information content and the sum of the two individual concepts

Page 54: The Semantic Quilt

What do we get at the end?

Similarity (or relatedness) scores between pairs of words / concepts that are based on path lengths, but augmented with distributional information from corpora

Can create a similarity matrix between concepts based on these scores

Page 55: The Semantic Quilt

[Quilt diagram: Ontologies (WordNet-Similarity), Co-Occurrences (Ngram Statistics Package), Kernels (WSD-Shell), Contexts (SenseClusters)]

Page 56: The Semantic Quilt

Using Dictionary Glosses to Measure Relatedness

Lesk (1986) Algorithm – measure the relatedness of two concepts by counting the number of shared words in their definitions
Cold - a mild viral infection involving the nose and respiratory passages (but not the lungs)
Flu - an acute febrile highly contagious viral disease

Adapted Lesk (Banerjee & Pedersen, 2003) – expand glosses to include those of directly related concepts
Cold - a common cold affecting the nasal passages and resulting in congestion and sneezing and headache; mild viral infection involving the nose and respiratory passages (but not the lungs); a disease affecting the respiratory system
Flu - an acute and highly contagious respiratory disease of swine caused by the orthomyxovirus thought to be the same virus that caused the 1918 influenza pandemic; an acute febrile highly contagious viral disease; a disease that can be communicated from one person to another
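
The basic overlap count is a one-liner in Python. This is a sketch of the idea with a made-up stopword list; the real algorithm also scores multi-word overlaps more highly.

```python
# small illustrative stopword list (function words ignored by the overlap)
STOPWORDS = {"a", "an", "the", "and", "of", "but", "not"}

def lesk_overlap(gloss1, gloss2):
    """Count content words shared by two definitions (Lesk-style)."""
    w1 = set(gloss1.lower().split()) - STOPWORDS
    w2 = set(gloss2.lower().split()) - STOPWORDS
    return len(w1 & w2)

cold = "a mild viral infection involving the nose and respiratory passages"
flu = "an acute febrile highly contagious viral disease"
overlap = lesk_overlap(cold, flu)   # the glosses share only 'viral'
```

The tiny overlap (one word) on such short glosses is exactly the brittleness that motivates the gloss-vector approach on the next slide.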

Page 57: The Semantic Quilt

Gloss Vectors

Leskian approaches require exact matches in glosses
Glosses are short, and use related but not identical words

Solution? Expand glosses by replacing each content word with a co-occurrence vector derived from corpora
Rows are words in glosses, columns are the co-occurring words in a corpus, cell values are their log-likelihood ratios
Average the word vectors to create a single vector that represents the gloss/sense (Patwardhan & Pedersen, 2003)
2nd order co-occurrences

Measure relatedness using cosine rather than exact match!
Methodology the same as that used in SenseClusters

Page 58: The Semantic Quilt

What do we get at the end?

Relatedness scores between pairs of words / concepts that are based on the content of WordNet (viewing it more like an MRD than an ontology)

Can create a “relatedness” matrix between concepts based on these scores

Page 59: The Semantic Quilt

Why measure conceptual similarity?

A word will take the sense that is most related to the surrounding context
I love Java, especially the beaches and the weather.
I love Java, especially the support for concurrent programming.
I love java, especially first thing in the morning with a bagel.

Page 60: The Semantic Quilt

Word Sense Disambiguation

…can be performed by finding the sense of a word most related to its neighbors

Here, we define similarity and relatedness with respect to WordNet-Similarity

WordNet-SenseRelate
AllWords – assign a sense to every content word
TargetWord – assign a sense to a given word

• http://senserelate.sourceforge.net

Page 61: The Semantic Quilt

WordNet-SenseRelate
http://senserelate.sourceforge.net

SenseRelate

Page 62: The Semantic Quilt

SenseRelate Algorithm

For each sense of a target word in context
  For each content word in the context
    For each sense of that content word
      Measure similarity/relatedness between sense of target word and sense of content word with WordNet::Similarity
      Keep running sum for score of each sense of target

Pick sense of target word with highest score with words in context

Go to the next word, repeat
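The loop above can be sketched in code; the sense inventory and relatedness scores below are invented, standing in for WordNet and WordNet::Similarity.

```python
def senserelate(target, context, senses_of, relatedness):
    """Pick the sense of `target` whose summed relatedness to the
    senses of the surrounding content words is highest."""
    best_sense, best_score = None, float("-inf")
    for sense in senses_of(target):            # each sense of the target
        score = 0.0
        for word in context:                   # each content word in context
            if word == target:
                continue
            for neighbour_sense in senses_of(word):
                # running sum of relatedness, as on the slide
                score += relatedness(sense, neighbour_sense)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Toy run echoing "I love Java, especially the beaches..."
senses = {"java": ["java_island", "java_coffee"], "beach": ["beach_shore"]}
scores = {("java_island", "beach_shore"): 0.9,
          ("java_coffee", "beach_shore"): 0.1}
pick = senserelate("java", ["java", "beach"],
                   lambda w: senses[w],
                   lambda a, b: scores.get((a, b), 0.0))
# pick is "java_island": the island sense is most related to "beach"
```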

Page 63: The Semantic Quilt

Coverage…

WordNet
Nouns – 82,000 concepts
Verbs – 14,000 concepts
Adjectives – 18,000 concepts
Adverbs – 4,000 concepts

Words not found in WordNet can’t be disambiguated by SenseRelate

language and resource dependent…

Page 64: The Semantic Quilt

What do we get at the end?

Can assign a sense to every word (known to WordNet) in running text

Can assign similarity scores to pairs of contexts, or a word and a given set of words…

Can turn these into a matrix…

Page 65: The Semantic Quilt

[Quilt diagram: Ontologies – WordNet-Similarity; Co-Occurrences – Ngram Statistics Package; Kernels – WSD-Shell; Contexts – SenseClusters and SenseRelate]

Page 66: The Semantic Quilt

Kernels are similarity matrices

NSP produces word by word similarity matrices, for use by SenseClusters

SenseClusters produces various similarity matrices based on co-occurrences

WordNet-Similarity produces concept by concept similarity matrices

SenseRelate produces context by context similarity matrices based on concept similarity

All of these could be used as kernels for Supervised WSD
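One way to see such a matrix acting as a kernel is a kernel perceptron trained directly on a precomputed context-by-context similarity matrix. This is a sketch, not the WSD-Shell's actual learner, and the matrix and sense labels are made up.

```python
# Precomputed context-by-context similarity matrix and sense labels
# (all values invented for illustration).
K = [[1.0, 0.8, 0.1, 0.2],
     [0.8, 1.0, 0.2, 0.1],
     [0.1, 0.2, 1.0, 0.9],
     [0.2, 0.1, 0.9, 1.0]]
y = [1, 1, -1, -1]            # two contexts per sense

def train(K, y, epochs=10):
    """Kernel perceptron: learn per-example mistake counts alpha,
    using only similarity values, never explicit feature vectors."""
    alpha = [0] * len(y)
    for _ in range(epochs):
        for i in range(len(y)):
            s = sum(alpha[j] * y[j] * K[j][i] for j in range(len(y)))
            if y[i] * s <= 0:      # misclassified (or untouched) example
                alpha[i] += 1
    return alpha

def predict(alpha, K, y, i):
    """Similarity-weighted vote of the training examples for column i."""
    s = sum(alpha[j] * y[j] * K[j][i] for j in range(len(y)))
    return 1 if s >= 0 else -1

alpha = train(K, y)
preds = [predict(alpha, K, y, i) for i in range(len(y))]
```

Because the learner only ever consults `K`, any of the matrices above could be dropped in, provided it behaves like a valid kernel.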

Page 67: The Semantic Quilt

[Quilt diagram: Ontologies – WordNet-Similarity; Co-Occurrences – Ngram Statistics Package; Kernels – WSD-Shell; Contexts – SenseClusters and SenseRelate]

Page 68: The Semantic Quilt

SenseClusters Input … matrices

Word by Word co-occurrences to create second order representation (Native)

Context by Word co-occurrences to create LSA representation…

Concept by Concept similarity scores from WordNet::Similarity

Context by Context similarity scores from SenseRelate

Page 69: The Semantic Quilt

[Quilt diagram: Ontologies – WordNet-Similarity; Co-Occurrences – Ngram Statistics Package; Kernels – WSD-Shell; Contexts – SenseClusters and SenseRelate]

Page 70: The Semantic Quilt

Identifying Collocations

…could benefit from word clusters found in SenseClusters

…could benefit from similarity measures from WordNet::Similarity…

Page 71: The Semantic Quilt

[Quilt diagram: Ontologies – WordNet-Similarity; Co-Occurrences – Ngram Statistics Package; Kernels – WSD-Shell; Contexts – SenseClusters and SenseRelate]

Page 72: The Semantic Quilt
Page 73: The Semantic Quilt

Conclusion

Time to integrate what we have at the word and term level
look for ways to stitch semantic patches together

This will increase our coverage and decrease language dependence
make the quilt bigger and sturdier

We will then be able to look at a broader range of languages and semantic problems
calm problems with the warmth of your lovely quilt…

Page 74: The Semantic Quilt

Many Thanks…

SenseClusters

Amruta Purandare (MS '04)
Anagha Kulkarni (MS '06)
Mahesh Joshi (MS '06)

WordNet Similarity
Sid Patwardhan (MS '03)
Jason Michelizzi (MS '05)

SenseRelate
Satanjeev Banerjee (MS '02)
Sid Patwardhan (MS '03)
Jason Michelizzi (MS '05)
Varada Kolhatkar (MS '09)

Ngram Statistics Package
Satanjeev Banerjee (MS '02)
Bridget McInnes (MS '04, PhD '??)
Saiyam Kohli (MS '06)

Supervised WSD
Saif Mohammad (MS '03)
Amruta Purandare (MS '04)
Mahesh Joshi (MS '06)
Bridget McInnes (MS '04, PhD '??)

Page 75: The Semantic Quilt

URLs

Ngram Statistics Package
http://ngram.sourceforge.net

SenseClusters
http://senseclusters.sourceforge.net

WordNet-Similarity
http://wn-similarity.sourceforge.net

SenseRelate WSD
http://senserelate.sourceforge.net

Supervised WSD
http://www.d.umn.edu/~tpederse/supervised.html