![Page 1: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/1.jpg)
Statistical Phrase Alignment Model Using Dependency Relation Probability
Toshiaki Nakazawa and Sadao Kurohashi
Kyoto University
![Page 2: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/2.jpg)
Outline
- Background
- Tree-based Statistical Phrase Alignment Model
- Model Training
- Experiments
- Conclusions
![Page 3: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/3.jpg)
Conventional Word Sequence Alignment
[Figure: example sentence pair shown as two word sequences]
Japanese: 受 (accept) 光 (light) 素子 (device) に (ni) は (ha) フォト (photo) ゲート (gate) を (wo) 用いた (used)
English: A photogate is used for the photodetector
![Page 4: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/4.jpg)
[Figure: alignment matrix produced by grow-diag-final-and]
English: … exhibited a strong inhibitory effect on tumor growth in the castrated mice as in the non-castrated mice …
Japanese: 非 去勢 マウス と 同様に 去勢 マウス の 腫よう の 成長 に 対し 強い 抑制 効果 を 示した
![Page 5: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/5.jpg)
Conventional Word Sequence Alignment
[Figure: the example sentence pair shown first as plain word sequences and then as dependency trees]
Japanese: 受 (accept) 光 (light) 素子 (device) に (ni) は (ha) フォト (photo) ゲート (gate) を (wo) 用いた (used)
English: A photogate is used for the photodetector
Proposed Model
1. Dependency trees
![Page 6: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/6.jpg)
Proposed Model
[Figure: proposed alignment of the sentence pair 受光素子にはフォトゲートを用いた / A photogate is used for the photodetector]
1. Dependency trees
2. Phrase alignment
3. Bi-directional agreement
![Page 7: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/7.jpg)
[Figure: alignment matrices for one sentence pair, comparing grow-diag-final-and on word sequences with the proposed model on dependency trees]
English: … exhibited a strong inhibitory effect on tumor growth in the castrated mice as in the non-castrated mice …
Japanese: 非 去勢 マウス と 同様に 去勢 マウス の 腫よう の 成長 に 対し 強い 抑制 効果 を 示した
![Page 8: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/8.jpg)
Related Work
- Using tree structures: [Cherry and Lin, 2003], [Quirk et al., 2005], [Galley et al., 2006], ITG, …
- Considering phrase alignment: [Zhang and Vogel, 2005], [Ion et al., 2006], …
- Using two directed models simultaneously: [Liang et al., 2006], [Graca et al., 2008], …
![Page 9: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/9.jpg)
Tree-based Statistical Phrase Alignment Model
![Page 10: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/10.jpg)
Dependency Analysis of Sentences
Source (Japanese): 受光素子にはフォトゲートを用いた
Target (English): A photogate is used for the photodetector
[Figure: dependency trees of both sentences, drawn in word order with the head node of each tree marked. Glosses: 受 (accept) 光 (light) 素子 (device) に (ni) は (ha) フォト (photo) ゲート (gate) を (wo) 用いた (used)]
![Page 11: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/11.jpg)
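Overview of the Proposed Model (in comparison to the IBM models)

IBM models find the best alignment by

$$\hat{\mathbf{a}} = \arg\max_{\mathbf{a}} p(\mathbf{f}, \mathbf{a} \mid \mathbf{e}) = \arg\max_{\mathbf{a}} p(\mathbf{f} \mid \mathbf{e}, \mathbf{a}) \cdot p(\mathbf{a} \mid \mathbf{e})$$

- f: source sentence, e: target sentence, a: alignment
- p(f | e, a): lexical prob. (word translation)
- p(a | e): alignment prob. (word reordering)

Proposed model: the same decomposition, with phrase translation as the lexical prob. and dependency relation as the alignment prob.

$$\hat{\mathbf{a}} = \arg\max_{\mathbf{a}} p(\mathbf{f}, \mathbf{a} \mid \mathbf{e}) = \arg\max_{\mathbf{a}} p(\mathbf{f} \mid \mathbf{e}, \mathbf{a}) \cdot p(\mathbf{a} \mid \mathbf{e})$$

The two directions are then used simultaneously:

$$\hat{\mathbf{a}} = \arg\max_{\mathbf{a}} p(\mathbf{f}, \mathbf{a} \mid \mathbf{e}) \cdot p(\mathbf{e}, \mathbf{a} \mid \mathbf{f}) = \arg\max_{\mathbf{a}} p(\mathbf{f} \mid \mathbf{e}, \mathbf{a}) \cdot p(\mathbf{a} \mid \mathbf{e}) \cdot p(\mathbf{e} \mid \mathbf{f}, \mathbf{a}) \cdot p(\mathbf{a} \mid \mathbf{f})$$

- p(f | e, a), p(e | f, a): phrase translation
- p(a | e), p(a | f): dependency relation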
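As a rough illustration of the bidirectional objective, the sketch below scores each candidate alignment by the product of the four factors (in log space) and keeps the best one. The candidate names and probability values are made up for the example; the slides do not give a candidate list of this form.

```python
import math

def joint_log_score(lex_fe, align_e, lex_ef, align_f):
    """log of p(f|e,a) * p(a|e) * p(e|f,a) * p(a|f)."""
    return sum(math.log(p) for p in (lex_fe, align_e, lex_ef, align_f))

def best_alignment(candidates):
    """candidates: dict mapping an alignment id to its four factor values."""
    return max(candidates, key=lambda a: joint_log_score(*candidates[a]))

# Hypothetical factor values for two candidate alignments.
candidates = {
    "a1": (0.20, 0.10, 0.30, 0.10),  # product = 6e-4
    "a2": (0.25, 0.05, 0.20, 0.20),  # product = 5e-4
}
best = best_alignment(candidates)
```

Working in log space avoids underflow when the products run over long sentences.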
![Page 12: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/12.jpg)
Phrase Translation Probability
![Page 13: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/13.jpg)
Phrase Translation Probability
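Note that the sentences are not previously segmented into phrases.

IBM Model:

$$p(\mathbf{f} \mid \mathbf{e}, \mathbf{a}) = \prod_{j=1}^{J} p(f_j \mid e_{a_j})$$

Proposed model:

$$p(\mathbf{f} \mid \mathbf{e}, A) = \prod_{j=1}^{J} p(F_{s(j)} \mid E_{A_{s(j)}})$$

Example:
[Figure: source words f1 … f5 grouped into phrases F1, F2, F3; target words e1 … e4 grouped into phrases E1, E2, E3]
- s(j): s(1) = 1, s(2) = 2, s(3) = 2, s(4) = 3, s(5) = 1
- A: A_1 = 2, A_2 = 3, A_3 = 0 (NULL)

$$p(\mathbf{f} \mid \mathbf{e}, A) = p(F_1 \mid E_2) \cdot p(F_2 \mid E_3) \cdot p(F_2 \mid E_3) \cdot p(F_3 \mid \text{NULL}) \cdot p(F_1 \mid E_2)$$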
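The worked example can be checked with a few lines of Python. This is a minimal sketch: `s` and `A` are taken from the slide, while the probability table holds made-up values, since the slide gives none.

```python
from math import prod

def phrase_translation_prob(s, A, table):
    """p(f | e, A) = product over source words j of p(F_s(j) | E_A_s(j)).

    s     : s[j-1] is the index of the source phrase containing word f_j
    A     : A[k] is the index of the target phrase aligned to F_k (0 = NULL)
    table : hypothetical lookup of p(F_k | E_m), keyed by (k, m)
    """
    return prod(table[(k, A[k])] for k in s)

s = [1, 2, 2, 3, 1]        # s(1) .. s(5) from the slide
A = {1: 2, 2: 3, 3: 0}     # A_1 = 2, A_2 = 3, A_3 = 0 (NULL)
table = {(1, 2): 0.5, (2, 3): 0.4, (3, 0): 0.1}   # made-up values

p = phrase_translation_prob(s, A, table)
```

Because the product runs over words rather than phrases, a phrase that covers two source words contributes its probability twice, as in the expansion on the slide.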
![Page 14: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/14.jpg)
Dependency Relation Probability
![Page 15: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/15.jpg)
Dependency Relations
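[Figure: a parent-child pair on the source side (f_p in phrase F_s(p), f_c in phrase F_s(c)) and the relation on the target side between the aligned phrases E_A_s(p) and E_A_s(c)]
- Parent-child: rel(fc, fp) = c
- Grandparent-child: rel(fc, fp) = c;c
- Inverted parent-child: rel(fc, fp) = p
- Aligned to NULL: rel(fc, fp) = NULL_p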
![Page 16: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/16.jpg)
Dependency Relation Probability
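$$p(A \mid \mathbf{e}) = \prod_{(p,c) \in D_{s\text{-}pc}} p_t(\mathrm{rel}(f_c, f_p))$$

D_{s-pc} is the set of parent-child word pairs in the source sentence.

The source-side dependency relation probability is defined in the same manner:

$$p(A \mid \mathbf{f}) = \prod_{(p,c) \in D_{t\text{-}pc}} p_s(\mathrm{rel}(e_c, e_p))$$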
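One way to read the definition is as a product over source parent-child pairs of the probability of the path between their aligned target nodes. The sketch below follows that reading under several assumptions: the tree is a hypothetical child-to-parent map, each source word is aligned to at most one target node, every NULL case is collapsed into a single `"NULL"` label (the slides use `NULL_p`), and `"SAME"` is an invented label for two words aligned to the same node. The values for `c`, `c;c` and `p` are the example values from the training slide; the `NULL` value is made up.

```python
def path_relation(parent_of, start, end):
    """Label the tree path from `start` to `end`.

    'p' is one step up to a parent, 'c' one step down to a child;
    steps are joined with ';' (e.g. 'c;c' for a grandchild).
    """
    def ancestors(node):
        chain = [node]
        while parent_of.get(chain[-1]) is not None:
            chain.append(parent_of[chain[-1]])
        return chain

    up, down = ancestors(start), ancestors(end)
    common = next(n for n in up if n in down)      # lowest common ancestor
    steps = ["p"] * up.index(common) + ["c"] * down.index(common)
    return ";".join(steps) if steps else "SAME"

def dependency_relation_prob(src_pc_pairs, align, tgt_parent_of, p_t):
    """Product of p_t(rel) over source (parent, child) pairs."""
    prob = 1.0
    for parent, child in src_pc_pairs:
        tp, tc = align.get(parent), align.get(child)
        if tp is None or tc is None:
            rel = "NULL"                           # simplification of NULL_p
        else:
            rel = path_relation(tgt_parent_of, tp, tc)
        prob *= p_t[rel]
    return prob

# Hypothetical target tree: E1 is the root, E2 its child, E3 its grandchild.
tgt_parent_of = {"E1": None, "E2": "E1", "E3": "E2"}
align = {"f1": "E1", "f2": "E3", "f3": "E2"}       # f4 is NULL-aligned
src_pc_pairs = [("f1", "f2"), ("f2", "f3"), ("f1", "f4")]
p_t = {"c": 0.4, "c;c": 0.3, "p": 0.2, "NULL": 0.1}

prob = dependency_relation_prob(src_pc_pairs, align, tgt_parent_of, p_t)
```

Here the three pairs yield the relations `c;c`, `p` and `NULL`, so the product is 0.3 × 0.2 × 0.1.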
![Page 17: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/17.jpg)
Model Training
![Page 18: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/18.jpg)
Model Training
- Step 1 (word base): Estimate word translation prob. (IBM Model 1) and initialize dependency relation prob.
- Step 2 (tree base): Estimate phrase translation prob. and dependency relation prob.
  - E-step
    1. Create initial alignment
    2. Modify the alignment by hill-climbing
  - Generate possible phrases
  - M-step: Parameter estimation

Example parameters:
- Word translation: p(コロラド | Colorado) = 0.7, p(大学 | university) = 0.6, …
- Dependency relation: p(c) = 0.4, p(c;c) = 0.3, p(p) = 0.2, …
- Phrase translation: p(コロラド | Colorado) = 0.7, p(大学 | university) = 0.6, p(コロラド 大学 | university of Colorado) = 0.9, …
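Step 1 relies on IBM Model 1. The sketch below is the textbook EM procedure for Model 1, not the authors' implementation; the toy corpus, the explicit `"NULL"` target word and the function name `ibm_model1` are assumptions made for the example.

```python
from collections import defaultdict

def ibm_model1(pairs, iterations=5):
    """Estimate t[(f, e)] = p(f | e) by EM over (source_words, target_words) pairs."""
    src_vocab = {f for fs, _ in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(src_vocab))     # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)                    # expected counts c(f, e)
        total = defaultdict(float)                    # expected counts c(e)
        for fs, es in pairs:
            es = ["NULL"] + list(es)                  # allow alignment to NULL
            for f in fs:
                z = sum(t[(f, e)] for e in es)        # E-step normalizer
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalize the expected counts per target word.
        t = defaultdict(float, {fe: count[fe] / total[fe[1]] for fe in count})
    return t

# Toy corpus, invented for illustration.
pairs = [
    (["コロラド", "大学"], ["Colorado", "university"]),
    (["大学"], ["university"]),
    (["コロラド"], ["Colorado"]),
]
t = ibm_model1(pairs)
```

After a few iterations the table favors p(コロラド | Colorado) over p(大学 | Colorado), which is the kind of word-level estimate Step 2 starts from.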
![Page 19: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/19.jpg)
Step 2 (E-step)
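- Initial alignment is greedily created
- Modify the initial alignment with the operations: Swap, Reject, Add, Extend

[Figure: example of hill-climbing on the sentence pair 受光素子にはフォトゲートを用いた / A photogate is used for the photodetector, showing the initial alignment followed by a Swap and a Reject operation]

The objective being maximized:

$$\hat{\mathbf{a}} = \arg\max_{\mathbf{a}} p(\mathbf{f}, \mathbf{a} \mid \mathbf{e}) \cdot p(\mathbf{e}, \mathbf{a} \mid \mathbf{f}) = \arg\max_{\mathbf{a}} p(\mathbf{f} \mid \mathbf{e}, \mathbf{a}) \cdot p(\mathbf{a} \mid \mathbf{e}) \cdot p(\mathbf{e} \mid \mathbf{f}, \mathbf{a}) \cdot p(\mathbf{a} \mid \mathbf{f})$$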
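The hill-climbing loop can be sketched as a generic greedy search. Everything specific here is an assumption: alignments are one-to-one sets of (source index, target index) links, the score is a made-up weight matrix `W` standing in for the model probability, and only Reject, Add and Swap are implemented (Extend needs phrases, which this toy setting lacks).

```python
def hill_climb(initial, neighbors, score):
    """Move to the best-scoring neighbor until no neighbor improves the score."""
    current, current_score = initial, score(initial)
    while True:
        best, best_score = None, current_score
        for cand in neighbors(current):
            s = score(cand)
            if s > best_score:
                best, best_score = cand, s
        if best is None:
            return current, current_score
        current, current_score = best, best_score

def make_neighbors(n_src, n_tgt):
    def neighbors(a):
        links = sorted(a)
        for link in links:                              # Reject: drop one link
            yield a - {link}
        used_i = {i for i, _ in a}
        used_j = {j for _, j in a}
        for i in range(n_src):                          # Add: link two free words
            for j in range(n_tgt):
                if i not in used_i and j not in used_j:
                    yield a | {(i, j)}
        for x in range(len(links)):                     # Swap: exchange targets
            for y in range(x + 1, len(links)):
                (i1, j1), (i2, j2) = links[x], links[y]
                yield (a - {links[x], links[y]}) | {(i1, j2), (i2, j1)}
    return neighbors

# Made-up link weights standing in for the model score.
W = [[2.0, -1.0, 0.5],
     [-1.0, -0.5, 1.5],
     [0.0, 1.0, -2.0]]

def score(a):
    return sum(W[i][j] for i, j in a)

initial = frozenset({(0, 0), (1, 1), (2, 2)})
best, best_score = hill_climb(initial, make_neighbors(3, 3), score)
```

Starting from the diagonal alignment, a single Swap of the last two links raises the score from -0.5 to 4.5, after which no operation improves it.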
![Page 20: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/20.jpg)
Generate Possible Phrases
- Generate new possible phrases by merging the NULL-aligned nodes into their parent or child non-NULL-aligned nodes
- The new possible phrases are taken into consideration from the next iteration

[Figure: example on the sentence pair 受光素子にはフォトゲートを用いた / A photogate is used for the photodetector]
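The merging rule can be sketched as follows. The dependency tree and the set of aligned words below are hypothetical (the slides do not give the parse), and phrases are represented simply as sets of words.

```python
def candidate_phrases(parent_of, aligned):
    """Merge each NULL-aligned node with an aligned parent or aligned child.

    parent_of : child -> parent map of a dependency tree (root maps to None)
    aligned   : set of nodes that currently have a non-NULL alignment
    Returns a set of new phrase candidates, each a frozenset of two nodes.
    """
    children = {}
    for node, par in parent_of.items():
        if par is not None:
            children.setdefault(par, []).append(node)

    out = set()
    for node in parent_of:
        if node in aligned:
            continue                                   # only NULL-aligned nodes merge
        par = parent_of[node]
        if par is not None and par in aligned:
            out.add(frozenset({node, par}))
        for ch in children.get(node, []):
            if ch in aligned:
                out.add(frozenset({node, ch}))
    return out

# Hypothetical parse of "A photogate is used for the photodetector".
parent_of = {
    "used": None, "photogate": "used", "a": "photogate", "is": "used",
    "for": "used", "photodetector": "for", "the": "photodetector",
}
aligned = {"used", "photogate", "photodetector"}       # assumed alignment state

phrases = candidate_phrases(parent_of, aligned)
```

In this toy setting "for" yields two candidates, one with its parent "used" and one with its child "photodetector"; which of them survives is left to the next training iteration.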
![Page 22: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/22.jpg)
Experiments
![Page 23: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/23.jpg)
Alignment Experiments
- Training: JST Ja-En paper abstract corpus (1M sentences, Ja: 36.4M words, En: 83.6M words)
- Test: 475 sentences with gold-standard alignments annotated by hand
- Parsers: KNP for Japanese, MSTParser for English
- Evaluation criteria: Precision, Recall, F1
- For the proposed model, we ran 5 iterations in each step
![Page 24: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/24.jpg)
Experimental Results
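| Method | Pre. | Rec. | F |
|---|---|---|---|
| Proposed | 87.75 | 50.27 | 63.92 |
| intersection | 90.34 | 34.28 | 49.71 |
| grow-final-and | 81.32 | 48.85 | 61.04 |
| grow-diag-final-and | 79.39 | 51.15 | 62.22 |

Proposed vs. grow-diag-final-and: +1.7 F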
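For reference, precision, recall and F can be computed from link sets as below. This is a sketch that assumes alignments are compared as sets of (source index, target index) links; the slides do not spell out how links are counted. The toy link sets are invented.

```python
def prf(predicted, gold):
    """Precision, recall and F1 of predicted links against gold links."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

def f_measure(p, r):
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    return 2 * p * r / (p + r)

# Toy link sets, invented for illustration.
predicted = {(0, 0), (1, 1), (2, 3)}
gold = {(0, 0), (1, 1), (2, 2), (3, 3)}
precision, recall, f1 = prf(predicted, gold)

# Recomputing F from the reported precision and recall of the proposed model.
f_proposed = f_measure(87.75, 50.27)
```

Rounded to two decimals, `f_proposed` matches the F value reported for the proposed model.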
![Page 25: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/25.jpg)
Effectiveness of Phrase and Tree
| Setting | Pre. | Rec. | F |
|---|---|---|---|
| Trees + Phrases (Proposed) | 85.54 | 51.00 | 63.90 |
| Trees | 89.77 | 39.47 | 54.83 |
| Phrases | 84.41 | 47.33 | 60.65 |
| None | 85.07 | 38.06 | 52.59 |

[Figure: positional relations (-1, +1) used instead of dependency relations (c, p)]
![Page 26: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/26.jpg)
Discussions
- Parsing errors
  - Parsing accuracy is basically good, but the parsers still sometimes produce incorrect results
  - → Incorporate parsing probability into the model
- Search errors
  - Hill-climbing sometimes gets stuck in local optima
  - → Random restart
- Function words
  - Behave quite differently in different languages (e.g., case markers in Japanese, articles in English)
  - → Post-processing
![Page 27: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/27.jpg)
Post-processing for Function Words
- Reject correspondences between Japanese particles and English "be" or "have"
- Reject correspondences of English articles
- Japanese "する" (suru, "do") and "れる" (reru, passive auxiliary) or English "be" and "have" are merged into their parent verb or adjective if they are NULL-aligned

| Method | Pre. | Rec. | F | F gain |
|---|---|---|---|---|
| Proposed | 87.75 | 50.27 | 63.92 | |
| Proposed + modify | 87.83 | 58.40 | 70.16 | +6.2 |
| grow-diag-final-and | 79.39 | 51.15 | 62.22 | |
| grow-diag-final-and + modify | 80.46 | 51.15 | 62.54 | +0.3 |
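The two rejection rules can be sketched as a simple link filter. The word lists below are illustrative and far from complete, and matching on surface forms is an assumption; a real system would more likely use the part-of-speech information from the parsers. The third rule (merging into the parent) is not covered here.

```python
# Illustrative word lists (assumptions, not taken from the slides).
JA_PARTICLES = {"に", "は", "を", "の", "が", "と", "で"}
EN_BE_HAVE = {"be", "is", "are", "was", "were", "been", "have", "has", "had"}
EN_ARTICLES = {"a", "an", "the"}

def reject_function_word_links(links):
    """Drop links to English articles and particle/be/have links."""
    kept = set()
    for ja, en in links:
        en_lower = en.lower()
        if en_lower in EN_ARTICLES:
            continue                                   # rule: reject English articles
        if ja in JA_PARTICLES and en_lower in EN_BE_HAVE:
            continue                                   # rule: particle vs. be/have
        kept.add((ja, en))
    return kept

# Toy links for the running example sentence pair.
links = {
    ("は", "is"),
    ("フォトゲート", "A"),
    ("フォトゲート", "photogate"),
    ("用いた", "used"),
}
kept = reject_function_word_links(links)
```

Only the two content-word links remain after filtering in this example.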
![Page 28: Statistical Phrase Alignment Model Using Dependency Relation Probability Toshiaki Nakazawa and Sadao Kurohashi Kyoto University](https://reader036.vdocuments.net/reader036/viewer/2022062408/56649f095503460f94c1dacd/html5/thumbnails/28.jpg)
Conclusion and Future Work
- Linguistically motivated phrase alignment
  1. Dependency trees
  2. Phrase alignment
  3. Bi-directional agreement
- Significantly better results compared to conventional word alignment models
- Future work:
  - Apply the proposed model to other language pairs (Japanese-Chinese and so on)
  - Incorporate parsing probability into our model
  - Investigate the contribution of our alignment results to the translation quality