Semantic Frame Identification with Distributed Word Representations
Karl Moritz Hermann‡ ¹
Jason Weston†, Dipanjan Das†, Kuzman Ganchev†
† Google Inc., New York
‡ Department of Computer Science, University of Oxford
Baltimore, 25.06.2014
¹ The majority of this research is the result of an internship at Google.
We investigate features and classifiers for frame-semantic parsing
Frame-Semantic Parsing Intro
Frame-Semantic Parsing
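The task of extracting semantic predicate-argument structures from text.
John sold Mary a car .
[Figure: dependency parse of the sentence (nsubj, iobj, dobj, det) annotated with the frame COMMERCE_BUY, evoked by sell.V, and the frame elements Seller, Buyer and Goods]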
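As a rough illustration of what such a predicate-argument structure looks like as data, here is a minimal sketch. The class name `FrameInstance` and its fields are hypothetical, and the assignment of roles to words (John as Seller, Mary as Buyer, "a car" as Goods) is our reading of the example figure, not something the slide spells out.

```python
from dataclasses import dataclass, field


@dataclass
class FrameInstance:
    """One predicate-argument structure: target, frame, and role fillers."""
    target: str                                   # lexical unit, e.g. "sell.V"
    frame: str                                    # frame evoked by the target
    elements: dict = field(default_factory=dict)  # role name -> text span


sentence = "John sold Mary a car ."
parse = FrameInstance(
    target="sell.V",
    frame="COMMERCE_BUY",
    # Role assignment is an assumption read off the example sentence.
    elements={"Seller": "John", "Buyer": "Mary", "Goods": "a car"},
)

# Every role filler should be a span of the sentence.
for role, span in parse.elements.items():
    assert span in sentence, (role, span)
```

Frame identification predicts the `frame` field; argument identification fills `elements`.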
Frame parsing as a two-stage task
Frame-Semantic Parsing is a combination of two tasks:
• Frame Identification
• Argument Identification
This paper focuses on Frame Identification
However, we also present results on the full pipeline task.
Representing frame instances
Default Approach
• Frame instances represented by candidate and context
• Context: sparse features based on parse tree
Distributed Approach
• Replaces binary features with word embeddings
• Context: same features, but represented distributionally
Instances to Vectors
Step 1: Parsing
• Dependency parse of input sentence
• Paths are considered relative to candidate word
Step 2: Context word selection strategy
• Direct dependents based on parse
• Argument dependency paths learned from gold data
Step 3: Embeddings
• Replace words with embeddings
Step 4: Single vector creation
• Merge embeddings into a unified vector representation
• Effectively concatenation; zeros for empty slots
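Steps 2 to 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slot inventory `SLOTS`, the toy 2-dimensional embeddings in `EMB`, placing the candidate's own embedding in the first block, and mapping out-of-vocabulary words to zeros are all assumptions made for the example.

```python
def instance_vector(candidate, context, embeddings, slots, dim):
    """Concatenate the candidate embedding with one block per context slot."""
    zero = [0.0] * dim
    vec = list(embeddings.get(candidate, zero))
    for slot in slots:
        word = context.get(slot)                # None when the slot is empty
        vec.extend(embeddings.get(word, zero))  # zeros for empty or unknown
    return vec


# Hypothetical 2-dimensional embeddings, kept tiny for readability.
EMB = {
    "sold": [0.5, -1.0],
    "John": [1.0, 0.0],
    "Mary": [0.0, 1.0],
    "car": [2.0, 2.0],
}
SLOTS = ["nsubj", "iobj", "dobj", "prep"]  # fixed slot order (assumed)

context = {"nsubj": "John", "iobj": "Mary", "dobj": "car"}
x = instance_vector("sold", context, EMB, SLOTS, dim=2)
print(len(x), x)
```

Because the slot order is fixed, every instance has the same length, (1 + number of slots) × dim, and the same dependency position always lands in the same block of the vector.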
Joint-space Model (Wsabie) — Learning
Joint-space Model
• Instances represented in $\mathbb{R}^d$ based on pre-trained embeddings.
• Labels represented as discrete values given lexicon (size $F$).
• Learn a linear mapping $M : \mathbb{R}^d \to \mathbb{R}^m$.
• Learn a matrix $Y \in \mathbb{R}^{F \times m}$ to represent labels.
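• Objective function: $\sum_x \sum_{\bar{y}} L\big(\mathrm{rank}_y(x)\big) \max\big(0, \gamma + s(x, y) - s(x, \bar{y})\big)$, where $\bar{y}$ ranges over the incorrect labels.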
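To make the objective concrete, the sketch below evaluates it for a single training instance. Several things here are assumptions rather than content of the slide: s is treated as a distance (lower is better), in line with the "distance metric" wording on the classification slide; L is taken to be the harmonic weighting L(k) = 1 + 1/2 + ... + 1/k known from the original WSABIE formulation; and the rank is computed exactly as the number of incorrect labels that violate the margin, whereas WSABIE-style training usually estimates it by sampling. If s is read as a similarity instead, the two score terms swap.

```python
def rank_weight(k):
    """L(k) = 1 + 1/2 + ... + 1/k (assumed harmonic weighting); L(0) = 0."""
    return sum(1.0 / i for i in range(1, k + 1))


def warp_loss(dist, gold, gamma=1.0):
    """Loss of one instance given its distance to every candidate label.

    dist maps label -> s(x, label); lower means closer (assumption).
    """
    hinges = [
        max(0.0, gamma + dist[gold] - d)
        for label, d in dist.items()
        if label != gold
    ]
    rank = sum(1 for h in hinges if h > 0)  # negatives violating the margin
    return rank_weight(rank) * sum(hinges)


# Hypothetical distances from one projected instance to three frames.
distances = {"COMMERCE_BUY": 0.5, "GIVING": 1.0, "MOTION": 3.0}
loss = warp_loss(distances, gold="COMMERCE_BUY", gamma=1.0)
print(loss)
```

Only GIVING lies within the margin of the correct frame here, so it is the only label that contributes to the loss; MOTION is already far enough away.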
Joint-space Model (Wsabie) — Classification
Joint-Space $\mathbb{R}^m$
• Project candidate into joint-space
• Only consider label projections
• Restrict labels using lexicon
• Classify candidate using suitable distance metric
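The four classification steps can be sketched as below. This is an illustrative toy, assuming Euclidean distance as the "suitable distance metric" and assuming that a predicate missing from the lexicon falls back to the full label set; the matrix `M`, label embeddings `Y` and `LEXICON` are made-up values.

```python
import math


def project(M, x):
    """Apply the linear map M (m rows, d columns) to the instance vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in M]


def classify(x, predicate, M, Y, lexicon):
    """Return the lexicon-licensed frame whose embedding is nearest to M x."""
    z = project(M, x)
    candidates = lexicon.get(predicate, list(Y))  # assumed fallback: all frames
    return min(candidates, key=lambda frame: math.dist(z, Y[frame]))


# Hypothetical parameters with d = 3 and m = 2.
M = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
Y = {"COMMERCE_BUY": [1.0, 1.0], "GIVING": [0.0, 0.0], "MOTION": [1.0, 0.9]}
LEXICON = {"sell.V": ["COMMERCE_BUY", "GIVING"]}

frame = classify([1.0, 0.9, 5.0], "sell.V", M, Y, LEXICON)
print(frame)
```

In this toy setup MOTION is the nearest label overall, but the lexicon does not license it for sell.V, so the restriction step changes the prediction.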
Experiments
Learning Setup
• Neural language model (∼ Bengio et al., 2003) trained on over 100 billion tokens to learn 128-dimensional word embeddings
• FrameNet 1.5 and Ontonotes 4.0 (PropBank, WSJ-only) used for training the actual Wsabie models
• Hyperparameters optimised on development data
Baselines: Where is the power coming from?
Models Evaluated
• Log-Linear Words
• Log-Linear Embeddings
• Wsabie
Evaluation
Evaluation Settings
• We evaluate on FrameNet (here) and PropBank (see paper)
• FrameNet setup follows Das et al. (2014), with a restricted lexicon during training (Semafor)
• Multiple evaluations
  • All: evaluates all frames
  • Rare: predicates with frequency ≤ 11 in the training data
  • Unseen: predicates not observed in the training data
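The evaluation subsets can be derived from training-set frequencies along these lines. This is a sketch under assumptions: the function name `subsets` and the toy counts are hypothetical, and treating Rare and Unseen as disjoint (Rare meaning 1 to 11 training occurrences) is our reading, since the slide does not say whether unseen predicates also count as rare.

```python
from collections import Counter

RARE_MAX = 11  # threshold from the slide: frequency <= 11


def subsets(predicate, train_counts):
    """Return the evaluation subsets a test predicate falls into."""
    n = train_counts[predicate]  # a Counter yields 0 for unseen keys
    names = ["All"]
    if n == 0:
        names.append("Unseen")
    elif n <= RARE_MAX:
        names.append("Rare")     # assumed to exclude unseen predicates
    return names


# Hypothetical training frequencies.
train = Counter({"sell.V": 40, "barter.V": 3})

for pred in ["sell.V", "barter.V", "haggle.V"]:
    print(pred, subsets(pred, train))
```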
Frame Identification Results (FrameNet - All Predicates)
[Bar chart: FrameNet (Semafor), F1-score. Das supervised 82.97; Das best 83.60; LL-Embeddings 83.94; LL-Words 84.53; Wsabie 86.49]
Figure: Frame identification results on the FrameNet dataset. We restrict the training data to the Semafor lexicon for comparability with Das et al. (2014).
Frame Identification Results (FrameNet - Rare)
[Bar chart: FrameNet (Semafor), F1-score. Das supervised 80.97; Das best 82.31; LL-Embeddings 81.03; LL-Words 81.65; Wsabie 85.22]
Figure: Frame identification results on the FrameNet dataset. We restrict the training data to the Semafor lexicon for comparability with Das et al. (2014).
Frame Identification Results (FrameNet - Unseen)
[Bar chart: FrameNet (Semafor), F1-score. Das supervised 23.08; Das best 42.67; LL-Embeddings 27.97; LL-Words 27.27; Wsabie 46.15]
Figure: Frame identification results on the FrameNet dataset. We restrict the training data to the Semafor lexicon for comparability with Das et al. (2014).
Full pipeline
Full frame-semantic analysis
In addition to the frame identification experiments, we also use our models as part of a full frame-semantic parsing setup together with a standard argument identification method.
Argument Identification System
• Standard set of discrete features
• Local log-linear model
• Global inference via hard constraints and ILP
Full pipeline results (FrameNet)
[Bar chart: FrameNet (Semafor), F1-score. Das supervised 64.05; Das best 64.54; Log-Words 67.06; Wsabie 68.69]
Figure: Full frame-structure prediction results for the FrameNet dataset (Semafor).
Conclusion
• Novel approach to frame identification
• Model outperforms prior state of the art on frame identification
• In a pipeline setting with a standard argument identification system, the model also sets a new state of the art on various semantic parsing tasks
• General approach, could easily be extended for alternative frame-semantic parsing frameworks
The End
Thank you! Questions?
Frame Identification Results
Semafor Lexicon

| Model | All | Ambiguous | Rare | Unseen |
|---|---|---|---|---|
| Das et al., 2014 supervised | 82.97 | 69.27 | 80.97 | 23.08 |
| Das et al., 2014 best | 83.60 | 69.19 | 82.31 | 42.67 |
| Log-Linear Words | 84.53 | 70.55 | 81.65 | 27.27 |
| Log-Linear Embed. | 83.94 | 70.27 | 81.03 | 27.97 |
| Wsabie Embedding | 86.49 | 73.39 | 85.22 | 46.15 |

Table: Frame identification results on the FrameNet test data
Frame Identification Results
Full Lexicon

| Model | All | Ambiguous | Rare |
|---|---|---|---|
| Log-Linear Words | 87.33 | 70.55 | 87.19 |
| Log-Linear Embed. | 86.94 | 70.26 | 86.56 |
| Wsabie Embedding | 88.41 | 73.10 | 88.93 |

Table: Frame identification results on the FrameNet test data
Full Structure Prediction Results
Semafor Lexicon

| Model | Precision | Recall | F1 |
|---|---|---|---|
| Das et al. supervised | 67.81 | 60.68 | 64.05 |
| Das et al. best | 68.33 | 61.14 | 64.54 |
| Log-Linear Words | 71.21 | 63.37 | 67.06 |
| Wsabie Embedding | 73.00 | 64.87 | 68.69 |

Table: Full structure prediction results for FrameNet test data. We compare to the prior state of the art (Das et al., 2014).
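As a quick sanity check on how the three columns relate, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall values reported in this table; the dictionary name `ROWS` is ours. Because the reported precision and recall are themselves rounded to two decimals, the recomputed values can differ from the reported F1 in the last digit.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the Semafor-lexicon table.
ROWS = {
    "Das et al. supervised": (67.81, 60.68, 64.05),
    "Das et al. best": (68.33, 61.14, 64.54),
    "Log-Linear Words": (71.21, 63.37, 67.06),
    "Wsabie Embedding": (73.00, 64.87, 68.69),
}

for model, (p, r, reported) in ROWS.items():
    print(f"{model}: recomputed {f1(p, r):.3f}, reported {reported:.2f}")
```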
Full Structure Prediction Results
Full Lexicon

| Model | Precision | Recall | F1 |
|---|---|---|---|
| Log-Linear Words | 73.31 | 65.20 | 69.01 |
| Wsabie Embedding | 74.29 | 66.02 | 69.91 |

Table: Full structure prediction results for FrameNet test data. We compare to the prior state of the art (Das et al., 2014).
Argument-Only Results (CoNLL 2005 task)
[Bar chart: PropBank, F1-score. Collins 73.62; Charniak 76.29; Combined 78.69; Log-W 77.23; Wsabie 77.14]
Figure: Argument-only evaluation (argument identification metrics) using the CoNLL 2005 shared task evaluation script (Carreras and Màrquez, 2005). Results from Punyakanok et al. (2008) are taken from Table 11 of that paper.
Frame Identification Results (PropBank)
[Bar chart: PropBank, F1-score. Log-E 94.04; Log-W 94.74; Wsabie 94.56]
Figure: Frame identification results on the PropBank datasets.
Full pipeline results (PropBank)
[Bar chart: PropBank, F1-score for Log-W and Wsabie; all extracted values lie between 79.61 and 79.65]
Figure: Full frame-structure prediction results for the PropBank dataset.