![Page 1: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/1.jpg)
Densely Connected GraphDensely Connected GraphConvolutional Networks forConvolutional Networks for
Graph-to-Sequence LearningGraph-to-Sequence Learning
Joint work with Yan Zhang, Zhiyang Teng, Wei Lu
1
Zhijiang Guo
![Page 2: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/2.jpg)
Graph-to-Sequence LearningGraph-to-Sequence Learning
AMR-to-Text GenerationAMR-to-Text Generation
Syntax-Based Machine TranslationSyntax-Based Machine Translation
2
![Page 3: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/3.jpg)
You guys knowwhat I mean.
AMR-to-Text GenerationAMR-to-Text Generation
3
![Page 4: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/4.jpg)
Ignore the graphstructure
(Konstas et al. 2017)
4
Sequence EncoderSequence Encoder
![Page 5: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/5.jpg)
5
Graph State LSTM (Song et al., 2018)Gated Graph Neural Networks (Beck et al., 2018)
Recurrent Graph EncoderRecurrent Graph Encoder
![Page 6: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/6.jpg)
Empirically, the best performance of GCNs is achievedwith a 2-layer model (Li et al., 2018, Xu et al., 2018)
GCNsGCNs
8
![Page 7: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/7.jpg)
first convolutional layer captures first-orderproximity (immediate neighbors) information
GCNsGCNs
8
First-OrderProximity
![Page 8: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/8.jpg)
GCNsGCNs
second convolutional layer capturessecond-order proximity information
9
Second-OrderProximity
![Page 9: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/9.jpg)
(Bastings et al., 2017) (Damonte and Cohen , 2019 )6
Convolutional Graph EncoderConvolutional Graph Encoder
![Page 10: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/10.jpg)
Is it possible to build a more expressive GCNmodel to learn a better graph representation
without relying on additional LSTM?
7
MotivationMotivation
Densely Connected GraphConvolutional Networks (DCGCNs)
![Page 11: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/11.jpg)
one layer takes inputs from all preceding layersrather than the previous layer only (Huang et al., 2017)
10
DenselyConnected
Dense ConnectivityDense Connectivity
![Page 12: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/12.jpg)
Densely Connected Sub-Block
Stack Identical BlocksStack Identical Blocks
Linear Combination Layer
11
Densely Connected GCNsDensely Connected GCNs
![Page 13: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/13.jpg)
Both sub-blocks are denselyconnected graph convolutionallayers with different numbers
(m and n) of layers.
12
Densely Connected Sub-BlockDensely Connected Sub-Block
![Page 14: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/14.jpg)
Densely Connected Sub-BlockDensely Connected Sub-Block
13
Sub-blocks with different numberof layers capture structural
information at different abstractlevels, similar to different filters.
![Page 15: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/15.jpg)
Densely Connected Sub-BlockDensely Connected Sub-Block
For parameter efficiency,the output dimension of each
layer in the sub-block isdesigned to be small.
14
![Page 16: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/16.jpg)
Densely Connected Sub-BlockDensely Connected Sub-Block
Input dimension: 300Sub-block layers: 3
Output dimension: 300(concatenate output from all 3 layers)
Hidden dimension of each layer:100 = 300 / 3 (proportional to #layers)
15
![Page 17: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/17.jpg)
Linear Combination LayerLinear Combination Layer
This layer assigns differentweights to outputs of differentlayers. Initial inputs of the sub-block are also incorporate by
the residual connection.
16
![Page 18: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/18.jpg)
Graph-to-Sequence ModelGraph-to-Sequence Model
17
![Page 19: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/19.jpg)
AMR-to-Text Generation AMR-to-Text Generation
18
AMR 2015AMR 2015
AMR 2017AMR 2017
Syntax-Based Machine TranslationSyntax-Based Machine Translation
English-Czech (WMT 16)English-Czech (WMT 16)
English-German (WMT 16)English-German (WMT 16)
ExperimentsExperiments
![Page 20: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/20.jpg)
Dataset Train Dev TestAMR 2015 16,833 1,368 1,371AMR 2017 36,521 1,368 1,371
En-Cs 181,112 2,656 2,999En-De 226,822 2,169 2,999
19
Data StatisticsData Statistics
![Page 21: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/21.jpg)
Sequential Encoder: LSTM (Konstas et al., 2017)Graph Encoder: GS LSTM (Song et al., 2018)
Model External Data BLEULSTM No 22.0GS LSTM No 23.3GCN + LSTM No 24.4DCGCN No 25.7
GCN + LSTM (Damonte and Cohen , 2019 )
20
AMR 2015AMR 2015
![Page 22: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/22.jpg)
Using External Training Data (0.2M)
21
AMR 2015AMR 2015
Model External Data BLEULSTM 0.2M 27.4GS LSTM 0.2M 28.2DCGCN 0.1M 29.0DCGCN 0.2M 31.6
![Page 23: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/23.jpg)
Using External Training Data (0.3M)
21
Model External Data BLEULSTM 2M 32.3LSTM 20M 33.8GS LSTM 2M 33.6DCGCN (Single) 0.3M 33.2DCGCN (Ensemble) 0.3M 35.3
AMR 2015AMR 2015
![Page 24: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/24.jpg)
Model #Parameters BLEU CHRF++LSTM 28.4M 21.7 49.1GGNNs 28.3M 23.3 50.4GCN + LSTM N/A 24.5 N/ADCGCN 18.5M 27.6 57.3
Sequential Encoder: LSTM (Beck et al., 2017)Graph Encoder: GGNNs (Beck et al., 2018)
22
GCN + LSTM (Damonte and Cohen , 2019 )
AMR 2017 (Single)AMR 2017 (Single)
![Page 25: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/25.jpg)
Model #Parameters BLEU CHRF++LSTM 142.0M 26.6 52.5GGNNs 141.0M 27.5 53.5DCGCN 92.5M 30.4 59.6
Sequential Encoder: LSTM (Beck et al., 2017)Graph Encoder: GGNNs (Beck et al., 2018)
23
AMR 2017 (Ensemble)AMR 2017 (Ensemble)
![Page 26: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/26.jpg)
Model Type #Param BLEU CHRF++BoW + GCN Single N/A 12.2 N/A
CNN + GCN Single N/A 13.7 N/A
BiRNN + GCN Single N/A 16.1 N/A
Seq2Seq Single 41.4M 15.5 40.8
GGNNs Single 41.2M 16.7 42.4
Our DCGCN Single 29.7M 19.0 44.1
Sequential Encoder: LSTM (Konstas et al., 2017)Graph Encoder: GGNNs(Beck et al., 2018)
BoW/CNN/RNN + GCN (Bastings et al., 2017)
24
English-GermanEnglish-German
![Page 27: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/27.jpg)
Model Type #Param BLEU CHRF++BoW + GCN Single N/A 7.5 N/A
CNN + GCN Single N/A 8.7 N/A
BiRNN + GCN Single N/A 9.6 N/A
Seq2Seq Single 41.4M 8.9 33.8
GGNNs Single 41.2M 9.8 33.3
Our DCGCN Single 29.7M 12.1 37.1
26
English-GzechEnglish-GzechSequential Encoder: LSTM (Konstas et al., 2017)Graph Encoder: GGNNs(Beck et al., 2018)
BoW/CNN/RNN + GCN (Bastings et al., 2017)
![Page 28: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/28.jpg)
Density of Connection Density of Connection
Model BLEUDCGCN 25.5
- {4} dense block 24.8
- {3, 4} dense block 23.8
- {2, 3, 4} dense blocks 23.2
28
![Page 29: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/29.jpg)
Ablation TestAblation Test
Model BLEUDCGCN 25.5
- Global Node (GN) 24.2
- Linear Combination (LC) 23.7
- GN, LC 22.9
29
![Page 30: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/30.jpg)
ConclusionConclusion
DCGCNs allow the encoder to better capturethe rich structural information of a graph,especially when it is large.
Future: investigate how other NLP applicationscan potentially benefit from our proposedapproach.
30
![Page 31: Z h ijia n g G u o D ens el y Con n ec t ed Gra p h Convol u t i on a l … · 2021. 1. 30. · Z h ijia n g G u o. Graph-to-Sequence Learning A M R-to-T ex t Gen erat i on Synt a](https://reader036.vdocuments.net/reader036/viewer/2022071411/610717ffbc7fbe173346b2a8/html5/thumbnails/31.jpg)
Thank YouThank You Code available:
http://www.statnlp.org/research/machine-learning