Learning Valid Adverb-Adjective Pairs
Source: web.stanford.edu/~cysuen/projects/cs224u_presentation.pdf
LEARNING VALID ADVERB-ADJECTIVE PAIRS
CAROLINE SUEN
CS224U WINTER 2013
THE CHALLENGE
We can say:
• “The glass is half full.”
• or “Wow, Bob is really tall.”
But can we say:
• “Wow, Bob is half tall.”
• or “The glass is really full”?
Goal: develop a model that can learn whether an adverb and an adjective can be used together and make grammatical sense.
PRIOR WORK
Syrett and Lidz (2010)
• Use linguistics to develop patterns
Sentiment analysis
• Benamara et al. (2007), Liu et al. (2009)
Adjective-noun pairs
• Hatzivassiloglou et al. (1993)
EXTRACTING DATA
Co-occurrence counts (rows: adjectives, columns: adverbs)
          half  completely  extremely  nearly
full         5           3          3       1
tall         0           0          4       0
smart        0           1          4       0
daylong      0           0          0       1
• New York Times dataset, ~18000 articles
• Stanford POS tagger to find valid adverb-adjective pairs
• 1019 adverbs, 4876 adjectives, 19337 pairs
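The counting step can be sketched as follows. This is a minimal illustration, not the extraction pipeline used in the project. It assumes Penn Treebank tags and a simple "RB directly followed by JJ" rule, and the hand-tagged sentences are hypothetical stand-ins for tagger output.

```python
from collections import Counter

# Hypothetical, hand-tagged input: (token, Penn Treebank tag) pairs.
tagged_sentences = [
    [("The", "DT"), ("glass", "NN"), ("is", "VBZ"), ("half", "RB"), ("full", "JJ")],
    [("Bob", "NNP"), ("is", "VBZ"), ("extremely", "RB"), ("tall", "JJ")],
    [("She", "PRP"), ("is", "VBZ"), ("extremely", "RB"), ("tall", "JJ")],
]

def count_pairs(sentences):
    """Count (adverb, adjective) pairs where an RB tag is directly followed by JJ."""
    counts = Counter()
    for sent in sentences:
        for (w1, t1), (w2, t2) in zip(sent, sent[1:]):
            if t1 == "RB" and t2 == "JJ":
                counts[(w1.lower(), w2.lower())] += 1
    return counts

pair_counts = count_pairs(tagged_sentences)
```

Each key of `pair_counts` corresponds to one cell of a count table like the one on this slide.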
BUILDING A GRAPH
[Figure: bipartite graph linking the adverbs (half, completely, extremely, nearly) to the adjectives (full, tall, smart, daylong)]
Relatively sparse bipartite graph
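To make "relatively sparse" concrete, here is a small sketch that builds the bipartite graph from the nonzero cells of the example count table and computes edge density. The dictionary layout and the `density` helper are illustrative assumptions. The full-dataset figures (1019 adverbs, 4876 adjectives, 19337 pairs) come from the data-extraction slide.

```python
from collections import defaultdict

# Toy counts: nonzero cells of the example table on the data-extraction slide.
counts = {
    ("half", "full"): 5, ("completely", "full"): 3, ("extremely", "full"): 3,
    ("nearly", "full"): 1, ("extremely", "tall"): 4, ("completely", "smart"): 1,
    ("extremely", "smart"): 4, ("nearly", "daylong"): 1,
}

adj_of = defaultdict(set)   # adverb -> adjectives it was seen with
adv_of = defaultdict(set)   # adjective -> adverbs it was seen with
for (adv, adj), n in counts.items():
    if n > 0:
        adj_of[adv].add(adj)
        adv_of[adj].add(adv)

def density(num_edges, num_left, num_right):
    """Fraction of possible adverb-adjective edges that are present."""
    return num_edges / (num_left * num_right)

toy_density = density(len(counts), len(adj_of), len(adv_of))   # 8 of 16 possible edges
full_density = density(19337, 1019, 4876)                      # figures from the slides
```

On the full dataset this comes to roughly 0.4% of all possible adverb-adjective edges.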
PARTITIONING
[Figure: the same adverb nodes (half, completely, extremely, nearly) and adjective nodes (full, tall, smart, daylong), partitioned into groups]
BUILDING A GRAPH: TECHNICAL DETAILS
• Used Stanford Network Analysis Platform
• Experimented:
  • Find dense bipartite subgraphs using the frequent itemset algorithm
  • Build adverb graphs and adjective graphs and run community detection algorithms on these graphs
    • Based on common neighbors
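The frequent itemset idea can be illustrated by treating each adjective as a "transaction" whose items are the adverbs seen with it. A set of adverbs shared by enough adjectives then corresponds to a complete bipartite subgraph. The sketch below is a brute-force stand-in for a real frequent itemset miner, and the function name and parameters are hypothetical.

```python
from itertools import combinations

# Toy graph: adjective -> adverbs seen with it (from the example count table).
adv_of = {
    "full": {"half", "completely", "extremely", "nearly"},
    "tall": {"extremely"},
    "smart": {"completely", "extremely"},
    "daylong": {"nearly"},
}

def frequent_adverb_sets(adv_of, size, min_support):
    """Map each adverb set of the given size to the adjectives containing all of it,
    keeping only sets supported by at least min_support adjectives."""
    adverbs = sorted(set().union(*adv_of.values()))
    result = {}
    for combo in combinations(adverbs, size):
        support = {adj for adj, advs in adv_of.items() if set(combo) <= advs}
        if len(support) >= min_support:
            result[frozenset(combo)] = support
    return result

bicliques = frequent_adverb_sets(adv_of, size=2, min_support=2)
```

On the toy data the only pair of adverbs shared by two adjectives is {completely, extremely}, supported by "full" and "smart".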
[Figure: the bipartite graph shown alongside the adjective graph (full, tall, smart, daylong) and the adverb graph (half, completely, extremely, nearly) built from it]
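One way to build an adjective graph "based on common neighbors" is to link two adjectives whenever they share at least one adverb. This is a sketch under that assumption; the `min_common` threshold and the edge weights are illustrative and not taken from the slides. Swapping in an adverb-to-adjectives map gives the adverb graph in the same way.

```python
from itertools import combinations

# Toy graph: adjective -> adverbs seen with it (from the example count table).
adv_of = {
    "full": {"half", "completely", "extremely", "nearly"},
    "tall": {"extremely"},
    "smart": {"completely", "extremely"},
    "daylong": {"nearly"},
}

def project(neighbors, min_common=1):
    """Link two same-side nodes if they share at least min_common neighbors.
    Returns {(node_a, node_b): number of shared neighbors}."""
    edges = {}
    for a, b in combinations(sorted(neighbors), 2):
        common = neighbors[a] & neighbors[b]
        if len(common) >= min_common:
            edges[(a, b)] = len(common)
    return edges

adjective_graph = project(adv_of)
```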
CLIQUE PERCOLATION
[Figure: illustration of clique percolation, from Wikipedia]
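For reference, clique percolation groups k-cliques that overlap in k-1 nodes, and each connected group of cliques forms one community, so communities may overlap. The pure-Python sketch below enumerates cliques by brute force and is only meant for toy graphs. It is not the SNAP implementation, and the function name is hypothetical.

```python
from itertools import combinations

def k_clique_communities(edges, k=3):
    """Clique percolation: merge k-cliques that share k-1 nodes into communities."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Brute-force k-clique enumeration (exponential; toy graphs only).
    cliques = [frozenset(c) for c in combinations(sorted(adj), k)
               if all(b in adj[a] for a, b in combinations(c, 2))]
    # Union-find over clique indices.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)
    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return sorted(groups.values(), key=sorted)

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"),  # two triangles sharing b-c
         ("d", "e"),                                                  # bridge, in no triangle
         ("e", "f"), ("e", "g"), ("f", "g")]                          # a separate triangle
communities = k_clique_communities(edges, k=3)
```

Here the two triangles sharing the edge b-c percolate into one community, while the triangle e-f-g stays separate because the bridge d-e belongs to no triangle.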
CLASSIFY: DOES AN EDGE BELONG?
Use the communities that adverb u and adjective v are in. If, by combining these communities, the edge density is sufficiently high, we claim that u and v can be paired up.
Harder case:
• An adverb is in communities C1 and C2. How likely is it to be connected to an adjective in communities D1, D2, and D3?
• Thankfully, this is rare!
• Larger and more densely connected communities are given higher weight
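A minimal sketch of this decision rule follows. The slides do not give the exact weighting or threshold, so the choices here are assumptions: each community pair (C, D) is weighted by its number of possible edges, and the threshold of 0.5 and the toy communities are hypothetical.

```python
def block_density(C, D, edges):
    """Fraction of possible adverb-adjective edges between C and D that are observed."""
    present = sum((u, v) in edges for u in C for v in D)
    return present / (len(C) * len(D))

def can_pair(u, v, adverb_comms, adjective_comms, edges, threshold=0.5):
    """Predict whether adverb u and adjective v can be paired.

    Averages block densities over every (C, D) with u in C and v in D,
    weighting larger blocks more heavily (an assumed weighting).
    """
    num = den = 0.0
    for C in adverb_comms:
        if u not in C:
            continue
        for D in adjective_comms:
            if v not in D:
                continue
            w = len(C) * len(D)
            num += w * block_density(C, D, edges)
            den += w
    return den > 0 and num / den >= threshold

# Toy training edges (nonzero cells of the example count table).
edges = {("half", "full"), ("completely", "full"), ("extremely", "full"),
         ("nearly", "full"), ("extremely", "tall"), ("completely", "smart"),
         ("extremely", "smart"), ("nearly", "daylong")}
# Hypothetical communities, chosen by hand for illustration.
adverb_comms = [{"completely", "extremely"}, {"half", "nearly"}]
adjective_comms = [{"full", "smart", "tall"}, {"daylong"}]

half_tall = can_pair("half", "tall", adverb_comms, adjective_comms, edges)
completely_tall = can_pair("completely", "tall", adverb_comms, adjective_comms, edges)
```

With these toy communities, "half tall" is rejected (2 of 6 possible edges in its block) while "completely tall" is accepted (5 of 6), even though neither pair appears in the training edges.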
EVALUATION: RECALL
• Hold out “test data” (1100 edges); the remaining edges are “training data”
• Find communities based on training data
• Observe fraction of test data edges recovered
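The hold-out procedure can be sketched as below. The helper names, the fixed seed, and the stub predictor are hypothetical. In the actual setup the predictor would be the community-based classifier trained on the remaining edges.

```python
import random

def split_edges(edges, num_test, seed=0):
    """Shuffle edges reproducibly and hold out num_test of them as test data."""
    shuffled = sorted(edges)
    random.Random(seed).shuffle(shuffled)
    return set(shuffled[num_test:]), set(shuffled[:num_test])   # (train, test)

def recall(test_edges, predict):
    """Fraction of held-out edges that the predictor recovers."""
    if not test_edges:
        return 0.0
    hits = sum(1 for u, v in test_edges if predict(u, v))
    return hits / len(test_edges)

edges = {("half", "full"), ("completely", "full"), ("extremely", "full"),
         ("nearly", "full"), ("extremely", "tall"), ("completely", "smart"),
         ("extremely", "smart"), ("nearly", "daylong")}
train, test = split_edges(edges, num_test=3)

# Stub predictor standing in for the community-based classifier.
stub_recall = recall({("extremely", "tall"), ("half", "full")},
                     lambda u, v: u == "extremely")
```

Because only true edges are held out, this measures recall and says nothing about precision.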
EVALUATION: RECALL
Not enough connections: 260 (23.6%)
Not discovered by community detection algorithm: 129 (11.7%)
Correctly discovered by community detection algorithm: 711 (64.6%)
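As a sanity check, the shares can be recomputed from the raw counts, which sum to the 1100 held-out edges. The snippet also computes recall restricted to test edges that had enough connections to be classified. The dictionary keys are shorthand labels chosen for this sketch.

```python
counts = {
    "not enough connections": 260,
    "missed by community detection": 129,
    "correctly discovered": 711,
}
total = sum(counts.values())   # 1100 test edges
shares = {k: round(100 * n / total, 1) for k, n in counts.items()}

# Recall restricted to test edges that had enough connections to be classified.
classifiable = total - counts["not enough connections"]   # 840 edges
restricted_recall = round(100 * counts["correctly discovered"] / classifiable, 1)
```

Recomputed from the counts, the shares are 23.6%, 11.7%, and 64.6%, and the restricted recall is 711 / 840, or about 84.6%.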
CHALLENGES + NEXT STEPS
• Not enough pairings
  • (recall for test data with enough connections: 84.6%)
• Clique percolation is slow
  • priority was building evaluation framework first
  • next steps: experimenting with clustering
• Adjective edge connections are much more important than adverb connections
• Current framework does not test precision
  • MTurk for crowd-sourced, hand-labeled data
• Potential next step:
  • Check Syrett and Lidz’s linguistic results
THE END
THANKS FOR LISTENING! :)