![Page 1: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/1.jpg)
Algorithms for Natural Language Processing
Lecture 7: Lexical Semantics
![Page 2: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/2.jpg)
Three Ways of Looking at Word Meaning
• Decompositional– What the “components” of meaning “in” a word
are• Ontological– How the meaning of the word relates to the
meanings of other words• Distributional– What contexts the word is found in, relative to
other words
![Page 3: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/3.jpg)
DecompositionalSemantics
<latexit sha1_base64="YkJ5iztoEKnk0ilCsQYv0uTieao=">AAACgXicbVDLattAFB0pfaTqy203hW7UOC0tIUHyJosSMOmmm4ILdRzwGHM1upIHz0PMjEqM0B/kB7vpR/QLMnJVSJ1eGDice869d05WCW5dkvwMwr179x883H8UPX7y9NnzwYuXF1bXhuGUaaHNZQYWBVc4ddwJvKwMgswEzrL1564/+4HGcq2+u02FCwml4gVn4Dy1HFxTpbnKUTnq8MoZ2WR601I6H1VuER1SgYWb0wxLrhowBjZtw8ZNK9roKH4f//WsagnKu6LjW2SBEgTuspDXwnUkRZX3I6nh5cotDpeDYXKSbCu+C9IeDElfk+XgF801q6U/nwmwdp4m/mw/1XHmV0e0tlgBW0OJcw8VSLSLZptaG7/zTB4X2vinXLxlbzsakNZuZOaVEtzK7vY68r+9bpytkPn9ORbg/9sxBYKrDdrmK1QVV+WZD+S4C8XLLDoJXHWyZgZGabaOJ0a3PpB09/t3wcXoJPX422g4Pu+j2SdvyAH5QFJySsbkC5mQKWHkd/A6eBschHvhxzAJR3+kYdB7XpF/Kvx0AzMMxWY=</latexit> <latexit sha1_base64="nWJX1Zb98te8LQ/UGy3nUE2Ll1M=">AAACgnicbZDLattAFIZHStom7s1NV6UbEaeltCRI7qKbBky7yabgQB0HPMYcjY7kwXMRM6NSI/QIecBs8hJ9gY4cBXI7MPDznev8aSm4dXF8GYRb20+ePtvZ7T1/8fLV6/6bvTOrK8NwwrTQ5jwFi4IrnDjuBJ6XBkGmAqfp6mebn/5BY7lWv926xLmEQvGcM3AeLfoXVGmuMlSOOvzrjKwLbkRD6WxYunnvgArM3YymWHBVgzGwbmo2qhvR9L5EH6ObpmUlQfmuOzBHCQJbeniLQlYJ10KKKutGUsOLpZsfLPqD+CjeRPRQJJ0YkC7Gi/4VzTSrpL+fCbB2lsT+bD/VceZX92hlsQS2ggJnXiqQaOf1xrYm+uBJFuXa+KdctKG3O2qQ1q5l6isluKW9n2vho7l2nC2R+f0Z5uD/25IcwVUGbf0LypKr4tgbctia4sssOglctWX1FIzSbBWNjW68Icn97z8UZ8OjxOvT4WD0o7Nmh7wn++QTScg3MiInZEwmhJF/wbtgPxiE2+HnMAm/XpeGQdfzltyJ8Pt/KULF0g==</latexit>
<latexit sha1_base64="kzjnS6APslgHFuaz4pLe84ZgBGg=">AAACgXicbVBda9RAFJ2kamv82uqL4EvsVlGkJdkXH0RY9MUXYQW3W9hZlpvJTXbY+Qgzk9Il5B/4B33xR/QXdLKNsLZeGDice869d05WCW5dkvwOwr179x/sHzyMHj1+8vTZ4PD5mdW1YThlWmhznoFFwRVOHXcCzyuDIDOBs2z9tevPLtBYrtVPt6lwIaFUvOAMnKeWg19Uaa5yVI46vHRGNhJUS+l8VLlFdEwFFm5OMyy5asAY2LQNGzetaKMP8dv4r2dV37iikx2yQAkCO3ZXCnktXEdSVHk/khpertzieDkYJqfJtuK7IO3BkPQ1WQ7+0FyzWvrzmQBr52niz/ZTHWd+dURrixWwNZQ491CBRLtotqm18RvP5HGhjX/KxVt219GAtHYjM6+U4Fb2dq8j/9vrxtkKmd+fYwH+vx1TILjaoG2+Q1VxVX72gZx0oXiZRSeBq07WzMAozdbxxOjWB5Le/v5dcDY6TT3+MRqOv/TRHJBX5Ii8Iyn5SMbkG5mQKWHkKngZvA6Owr3wfZiEoxtpGPSeF+SfCj9dAxM2xVY=</latexit>
<latexit sha1_base64="G9USG7ljhlk93GXwi3Vgkv3K4gA=">AAACg3icbVDbattAEF0pTZu6l7jtW/siYrcUQoNkCn0KmPYlLwUX6jjgNWa0GsmL9yJ2V22N0C/k//qSr8gHZOUo0DgdWDicOWdm56Sl4NbF8d8g3Hu0//jJwdPes+cvXh72X70+t7oyDKdMC20uUrAouMKp407gRWkQZCpwlq6/tf3ZLzSWa/XTbUpcSCgUzzkD56ll/5IqzVWGylGHf5yR9W8tQTWUzkelW/SGVGDu5jTFgqsajIFNU7Nx3Yimdxx9iO5cq+rWdY/MUYLAXRaySriWpKiybiQ1vFi5xXDZH8Qn8baihyDpwIB0NVn2r2imWSX9AUyAtfMk9t/2Ux1nfnWPVhZLYGsocO6hAol2UW9za6L3nsmiXBv/lIu27L+OGqS1G5l6pQS3sru9lvxvrx1nS2R+f4Y5+HtbJkdwlUFbf4ey5Ko49YF8akPxMotOAletrJ6BUZqto4nRjQ8k2T3/ITgfnSQe/xgNxl+7aA7IO3JEPpKEfCFjckYmZEoYuQ7eBoNgGO6Hx+Eo/HwrDYPO84bcq/D0BkDLxk4=</latexit>
![Page 4: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/4.jpg)
Limitations of DecompositionalSemantics
• Where do the features come from?– How do you divide semantic space into features like
this?– How do you settle on a final list?
• How do you assign features to words in a principled fashion?
• How do you link these features to the real world?• For these reasons, decompositional semantics is
the least computationally useful approach to semantics
![Page 5: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/5.jpg)
Ontological Approaches
to Semantics
![Page 6: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/6.jpg)
Semantic Relations
• In grammar school, or in preparation for standardized tests, you may have learned the following terms:
synonymy, antonymy• Synonymy and antonymy are relations
between words. They are not alone:hyponymy, hypernymy, meronymy, holonymy
![Page 7: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/7.jpg)
Semantic Relations• Synonymy—equivalence
– <small, little>• Antonymy—opposition
– <small, large>• Hyponymy—subset; is-a relation
– <dog, mammal>• Hypernymy—superset
– <mammal, dog>• Meronymy—part-of relation
– <liver, body>• Holonymy—has-a relation
– <body, liver>
![Page 8: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/8.jpg)
Lexical Mini-Ontology
wall.v.1
wall.n.1
surround.v.2
fence.n.1
build.v.1
door.n.1
building.n.1 enclosure.n.1
destroy.v.1
hypernymy (is-a)hypernym
hyponym
synonymysynonym synonym
meronymy (has-a)holonym (whole)
meronym (part)
antonymyantonym antonym
![Page 9: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/9.jpg)
WordNet
• WordNet is a lexical resource that organizes words according to their semantic relations
![Page 10: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/10.jpg)
WordNet
• Words have different senses
• Each of those senses is associated with a synset(a set of words that are roughly synonymous for a particular sense)
• These synsets are associated with one another through relations like antonymy, hyponymy, and meronymy
![Page 11: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/11.jpg)
WordNet is a glorified
electronic thesaurus
![Page 12: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/12.jpg)
Synsets for dog (n)• S: (n) dog, domestic dog, Canis familiaris (a member of the genus Canis
(probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds) "the dog barked all night"
• S: (n) frump, dog (a dull unattractive unpleasant girl or woman) "she got a reputation as a frump"; "she's a real dog"
• S: (n) dog (informal term for a man) "you lucky dog" • S: (n) cad, bounder, blackguard, dog, hound, heel (someone who is morally
reprehensible) "you dirty dog" • S: (n) frank, frankfurter, hotdog, hot dog, dog, wiener, wienerwurst,
weenie (a smooth-textured sausage of minced beef or pork usually smoked; often served on a bread roll)
• S: (n) pawl, detent, click, dog (a hinged catch that fits into a notch of a ratchet to move a wheel forward or prevent it from moving backward)
• S: (n) andiron, firedog, dog, dog-iron (metal supports for logs in a fireplace) "the andirons were too hot to touch"
12
![Page 13: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/13.jpg)
What’s a Fish? (According to WordNet)
• fish (any of various mostly cold-blooded aquatic vertebrates usually having scales and breathing through gills)
• aquatic vertebrate (animal living wholly or chiefly in or on water) • vertebrate, craniate (animals having a bony or cartilaginous skeleton with a
segmented spinal column and a large brain enclosed in a skull or cranium)• chordate (any animal of the phylum Chordata having a notochord or spinal
column) • animal, animate being, beast, brute, creature, fauna (a living organism
characterized by voluntary movement) • organism, being (a living thing that has (or can develop) the ability to act or
function independently) • living thing, animate thing (a living (or once living) entity) • whole, unit (an assemblage of parts that is regarded as a single entity)• object, physical object (a tangible and visible entity; an entity that can cast a
shadow)• entity (that which is perceived or known or inferred to have its own distinct
existence (living or nonliving))
13
![Page 14: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/14.jpg)
Thesaurus-based Word Similarity
14
giraffe gazelle
Order Artiodactyla
Class Mammalia
lion
Order Carnivora
Genus Felidae
Genus Caniformia
Genus Bovidae
Genus Giraffidae
…
![Page 15: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/15.jpg)
Information Content
15(Adapted from Lin. 1998. An information Theoretic Definition of Similarity. ICML.)
# words that are equivalent to or are hyponyms of cIC(c) = -log
# words in corpus
Entity
Inanimate-object
Natural-object
Geological formation
Natural-elevation
Hill
Shore
Coast
0.93
1.79
4.12
6.34
9.09 9.39
10.88 10.74
![Page 16: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/16.jpg)
WordNet Interfaces
• Various interfaces to WordNet are available– Many languages listed at
https://wordnet.princeton.edu/related-projects– NLTK (Python)
>>> from nltk.corpus import wordnet as wn>>> wn.synsets('dog’)
(returns list of Synset objects)http://www.nltk.org/howto/wordnet.html
![Page 17: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/17.jpg)
Limitations of WordNet and Ontological Semantics
• WordNet is a useful resource• There are intrinsic limits to this type of resource,
however:– It requires many years of manual effort by skilled
lexicographers– In the case of WordNet, some of the lexicographers were
not that skilled, and this has led to inconsistencies– The ontology is only as good as the ontologist(s); it is not
driven by data• We will now look at an approach to lexical semantics
that is data driven and does not rely on lexicographers
![Page 18: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/18.jpg)
Beef
Sentences from the brown corpus. Extracted from the concordancer in The Compleat Lexical Tutor, http://www.lextutor.ca/
![Page 19: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/19.jpg)
Chicken
![Page 20: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/20.jpg)
Context Vectors
![Page 21: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/21.jpg)
Hypothetical Counts based on Syntactic Dependencies
Modified-by-ferocious(adj)
Subject-of-devour(v)
Object-of-pet(v)
Modified-by-African(adj)
Modified-by-big(adj)
Lion 15 5 0 6 15
Dog 7 3 8 0 12
Cat 1 1 6 1 9
Elephant 0 0 0 10 15
…
21
![Page 22: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/22.jpg)
A Problem
• Some words are going to occur together many times just because they are very frequent
• The English words the and is are likely to occur in the same window many times
• They may not have a lot to do with one another except for the fact that they are frequent
• How should we address this?
![Page 23: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/23.jpg)
Pointwise Mutual Information
PMI(w, f) = log2p(w, f)
p(w)⇥ p(f)= log2
N ⇥ count(w, f)
count(w)⇥ count(f)
![Page 24: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/24.jpg)
Distributionally Similar Words
24
Rumvodkacognacbrandywhiskyliquordetergentcolaginlemonadecocoachocolatescotchnoodletequilajuice
Writereadspeakpresentreceivecallreleasesignofferknowacceptdecideissueprepareconsiderpublish
Ancientoldmoderntraditionalmedievalhistoricfamousoriginalentiremainindianvarioussingleafricanjapanesegiant
Mathematicsphysicsbiologygeologysociologypsychologyanthropologyastronomyarithmeticgeographytheologyhebreweconomicschemistryscripturebiotechnology
(from an implementation of the method described in Lin. 1998. Automatic Retrieval and Clustering of Similar Words. COLING-ACL. Trained on newswire text.)
![Page 25: Algorithms for Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/7-lexicalsemantics.pdfSynsetsfor dog (n) • S: (n) dog, domestic dog, Canisfamiliaris(a member](https://reader033.vdocuments.net/reader033/viewer/2022050518/5fa273d722e332060b4ee5ca/html5/thumbnails/25.jpg)
Questions?