Exploration of System Combination in Statistical Machine Translation
DESCRIPTION
Different paradigms and approaches in Machine Translation (MT) result in different MT systems with their own strengths and weaknesses. The complementary strengths of multiple MT systems can be exploited by system combination. This research work examines the effect of system combination on MT via empirical experiments and, more importantly, uses system combination to improve a state-of-the-art Chinese-to-English statistical machine translation (SMT) system. Extensive experiments were carried out on gold-standard MT datasets, in particular the evaluation sets of the NIST Open Machine Translation (OpenMT) evaluation series and the Workshops on Statistical Machine Translation (WMT). We not only evaluate the effects of system combination on translation performance but also examine different ways of selecting component systems. Finally, we exploit different Chinese word segmentation (CWS) standards as a way to produce diverse translation output for system combination. This approach yields a significant gain of 0.5-0.8 BLEU points on average over strong baseline systems.
TRANSCRIPT
Exploration of system combination in statistical machine translation
Le Truong Vinh Phu
Supervisor: Prof. Ng Hwee Tou
Master of Computing dissertation
School of Computing
27th May 2014
Outline
• Introduction
• Literature Review
• Multi-Engine Machine Translation (MEMT)
• Experiments
• Conclusion and Future Research
Outline
• Introduction
♦ Machine translation (MT)
♦ Statistical machine translation (SMT)
♦ Machine translation system combination
♦ Problem description & objective
• Literature Review
• Multi-Engine Machine Translation (MEMT)
• Experiments
• Conclusion and Future Research
Machine translation (MT)
• the use of computers to automate translation
• difficulty: translation divergences
• real-world benefits
• different paradigms and approaches
♦ dictionary-based
♦ rule-based
♦ statistical
Statistical machine translation (SMT)
• enabled by the availability of large corpora (monolingual and bilingual)
• relying on probability models
♦ faithfulness
♦ fluency
• P(F|E): translation model, P(E): language model
• Phrase-based SMT (Koehn et al., 2003)
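The two models named on this slide are usually tied together by the noisy-channel decomposition. As a short worked equation, assuming F denotes the source sentence and E a candidate translation (the slide does not spell these roles out):

```latex
\hat{E} = \arg\max_{E} P(E \mid F)
        = \arg\max_{E} \frac{P(F \mid E)\, P(E)}{P(F)}
        = \arg\max_{E} P(F \mid E)\, P(E)
```

The denominator P(F) is constant for a given source sentence, so it drops out of the maximization. The translation model P(F|E) accounts for faithfulness and the language model P(E) for fluency.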
Statistical machine translation (SMT)
• Language model:
♦ conditional probability of a word given previous words
♦ requires monolingual corpus
• Alignment
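To make "conditional probability of a word given previous words" concrete, here is a minimal sketch of a bigram language model estimated by relative frequency from a monolingual corpus. The toy corpus and the function name `train_bigram_lm` are illustrative assumptions. Practical SMT language models use higher orders and smoothing, which this sketch omits.

```python
from collections import Counter


def train_bigram_lm(sentences):
    """Return prob(word, prev) = count(prev, word) / count(prev), unsmoothed."""
    history_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # Sentence boundary markers let the model score first and last words.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, word)] += 1

    def prob(word, prev):
        if history_counts[prev] == 0:
            return 0.0  # unseen history: no estimate without smoothing
        return bigram_counts[(prev, word)] / history_counts[prev]

    return prob


# Toy monolingual corpus (hypothetical data, for illustration only).
corpus = ["the cat sat on the mat", "the cat sat", "a hat on a mat"]
p = train_bigram_lm(corpus)
print(p("cat", "the"))  # relative frequency of "cat" after "the"
```

Unseen bigrams get probability zero here, which is the main reason real systems add smoothing or interpolation.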
Statistical machine translation (SMT)
• Reordering model:
♦ penalties for long-distance reordering
♦ distance-based (Koehn et al., 2005), phrase-based and hierarchical reordering (Galley & Manning, 2008)
• Automatic evaluation:
♦ BLEU (Papineni et al., 2002)
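Since BLEU is the metric behind every gain reported later, a compact sketch of corpus-level BLEU may help. It follows the general recipe of Papineni et al. (2002): clipped n-gram precisions up to 4-grams, their geometric mean, and a brevity penalty. The function names are assumptions. This is not a reimplementation of any official scoring script, which may differ in tokenization and in how the reference length is chosen.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """hypotheses: list of token lists; references: one list of token lists per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, refs in zip(hypotheses, references):
        hyp_len += len(hyp)
        # Use the reference closest in length to the hypothesis (ties go to the shorter).
        ref_len += min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp, n)
            max_ref = Counter()
            for r in refs:
                for gram, count in ngrams(r, n).items():
                    max_ref[gram] = max(max_ref[gram], count)
            # Clip each n-gram count by its maximum count in any single reference.
            matches[n - 1] += sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if min(matches) == 0:
        return 0.0  # some n-gram order has no match: geometric mean is zero
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


hyp = "the cat sat on a mat".split()
ref = "the cat sat on the mat".split()
print(corpus_bleu([hyp], [[ref]]))
```

The score is in the range 0 to 1. Reported BLEU points, such as the 0.5-0.8 gain in the abstract, are conventionally this value multiplied by 100.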
MT system combination
• different MT systems => different strengths and weaknesses
• synthesizing a consensus translation
• main aspects:
♦ combination method
♦ selection of good component systems to combine
Problem description & objective
• Problem description
♦ in which situations and settings does system combination work well?
• Objective:
♦ evaluating system combination via empirical experiments
➢ available datasets: NIST OpenMT, WMT
♦ utilizing system combination to improve a Chinese-to-English phrase-based system
Outline
• Introduction
• Literature Review
♦ System combination
♦ Confusion network decoding
♦ Other approaches
♦ Diverse hypothesis generation
• Multi-Engine Machine Translation (MEMT)
• Experiments
• Conclusion and Future Research
System combination
• successfully applied in speech recognition (Fiscus, 1997; Mangu et al., 2000)
• crucial steps: aligning hypotheses, controlling word order
• variety of approaches:
♦ hypothesis re-ranking (Hildebrand & Vogel, 2008)
♦ confusion networks (Rosti et al., 2007a, 2007b)
♦ collaborative decoding (Li et al., 2009)
Confusion network decoding
• current mainstream
• Bangalore et al. (2001), Matusov et al. (2006), Rosti et al. (2007a, 2007b), Sim et al. (2007), He et al. (2008)
• Rosti et al. (2007a)
♦ Sentence level
♦ Phrase level
♦ Word level
Confusion network decoding
• Example hypotheses: "cat sat the mat", "cat sitting on the mat", and "hat on a mat"
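The three example hypotheses can be used to illustrate word-level combination. The sketch below is a deliberately simplified stand-in for confusion network decoding, under assumptions that are mine and not the slide's. It takes the first hypothesis as a fixed backbone, aligns the others to it by word-level edit distance, and takes an unweighted majority vote in each backbone slot. Words that align to no backbone slot are dropped, whereas a real confusion network keeps them as extra arcs and weights votes by system confidence.

```python
from collections import Counter

EPS = "<eps>"  # marks a backbone slot with no aligned word


def align_to_backbone(backbone, hyp):
    """Edit-distance alignment: return one hyp word (or EPS) per backbone slot."""
    n, m = len(backbone), len(hyp)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i
    for j in range(m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitute = cost[i - 1][j - 1] + (backbone[i - 1] != hyp[j - 1])
            cost[i][j] = min(substitute, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Backtrace, preferring match/substitution over deletion over insertion.
    slots = [EPS] * n
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + (backbone[i - 1] != hyp[j - 1]):
            slots[i - 1] = hyp[j - 1]
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + 1:
            i -= 1  # backbone word has no counterpart: slot stays EPS
        else:
            j -= 1  # extra hypothesis word: dropped in this simplified sketch
    return slots


def combine(hypotheses):
    """Majority vote per backbone slot; the first hypothesis is the backbone."""
    backbone = hypotheses[0]
    columns = [Counter([word]) for word in backbone]
    for hyp in hypotheses[1:]:
        for column, word in zip(columns, align_to_backbone(backbone, hyp)):
            column[word] += 1
    winners = [column.most_common(1)[0][0] for column in columns]
    return [word for word in winners if word != EPS]


hypotheses = [s.split() for s in
              ["cat sat the mat", "cat sitting on the mat", "hat on a mat"]]
print(" ".join(combine(hypotheses)))
```

Changing which hypothesis serves as the backbone changes the result, which is one reason the choice of backbone and of alignment method receives so much attention in the cited work.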
Other approaches
• Collaborative decoding (Li et al., 2009)
♦ avoid early pruning of potentially good translations
♦ leverage agreement information of n-grams
• Multi-Engine Machine Translation (MEMT)
♦ METEOR alignment (Banerjee & Lavie, 2005)
♦ no fixed backbone
Diverse hypothesis generation
• Not a trivial problem (Siohan et al., 2005)
• Key point: complementary error patterns
• Approaches:
♦ selecting different systems of different paradigms
♦ diversifying one baseline system
➢ introducing randomness (Siohan et al., 2005)
➢ different morphological decompositions of source language (de Gispert et al., 2009)
➢ varying alignment algorithms (Xu & Rosti, 2010)
➢ controlling target “trait” values (Devlin and Matsoukas, 2012)
Diverse hypothesis generation
• Exploiting multiple Chinese word segmentation standards: Zhang et al. (2008), Dyer et al. (2008), Xu et al. (2005)
• Zhang et al. (2008):
♦ Exploiting four SIGHAN standards: AS, CITYU, MSR, PKU
Outline
• Introduction
• Literature Review
• Multi-Engine Machine Translation (MEMT)
♦ Overview
♦ Description
• Experiments
• Conclusion and Future Research
Overview
• Open source toolkit: http://kheafield.com/code/memt/
• WMT system name: cmu-combo (2009), cmu-heafield-combo (2010, 2011)
• Superior performance in WMT 2011
• Easy to use, robust and efficient
Description
• Combining 1-best outputs of component systems
♦ Pair-wise alignment (METEOR)
♦ Beam search
♦ Z-MERT tuning (Zaidan, 2009)
• Features:
♦ length
♦ language model
♦ backoff
♦ match
Description
• METEOR alignment:
♦ exact matches
♦ identical stems (Porter, 2001)
♦ WordNet synonyms (Miller, 1995)
♦ TERp unigram paraphrases (Snover et al., 2009)
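The staged matching listed above can be sketched as follows. This is a toy illustration, not the METEOR aligner or the MEMT code. The stem table and synonym set are hypothetical stand-ins for the Porter stemmer and WordNet, the paraphrase stage is omitted, and the greedy left-to-right linking ignores the way a real aligner resolves competing links.

```python
# Hypothetical lookup tables standing in for the Porter stemmer and WordNet.
TOY_STEMS = {"sitting": "sit", "sits": "sit"}
TOY_SYNONYMS = {frozenset({"mat", "rug"})}


def match_type(a, b):
    """Return the strictest stage at which two words match, or None."""
    if a == b:
        return "exact"
    if TOY_STEMS.get(a, a) == TOY_STEMS.get(b, b):
        return "stem"
    if frozenset({a, b}) in TOY_SYNONYMS:
        return "synonym"
    return None


def align(hyp_a, hyp_b):
    """Greedy one-to-one word alignment, trying stricter stages first.

    Returns {index in hyp_a: (index in hyp_b, stage)}.
    """
    links = {}
    used_b = set()
    for stage in ("exact", "stem", "synonym"):
        for i, a in enumerate(hyp_a):
            if i in links:
                continue  # already linked at a stricter stage
            for j, b in enumerate(hyp_b):
                if j not in used_b and match_type(a, b) == stage:
                    links[i] = (j, stage)
                    used_b.add(j)
                    break
    return links


a = "the cat sits on the mat".split()
b = "the cat sitting on a rug".split()
for i, (j, stage) in sorted(align(a, b).items()):
    print(a[i], "->", b[j], f"({stage})")
```

Running stricter stages first means a looser match never takes a word that an exact match could have used.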
Description
• Search space:
♦ picking one word at a time, from left to right
♦ maintaining two sets of “captured” and “uncaptured” words
♦ no duplication, fluency across switches
♦ no fixed backbone
Description
• final hypothesis weaves together parts of component outputs
Outline
• Introduction
• Literature Review
• Multi-Engine Machine Translation (MEMT)
• Experiments
♦ MEMT on WMT11
♦ MEMT on NIST MT08
♦ Diversifying Chinese-English phrase-based SMT
♦ Exploiting multiple CWS standards
• Conclusion and Future Research
MEMT on WMT11
• http://www.statmt.org/wmt11
• two language pairs: French-English and Spanish-English
• Ranking participating systems by BLEU on the test set
• Selecting different component systems for system combination
MEMT on WMT11
• French-English
[Figure: system combination gain]
MEMT on WMT11
• Spanish-English
[Figure: system combination gain]
MEMT on WMT11
• Spanish-English
♦ why is E1 (combining all) < E2 (excluding the bottom two)?
MEMT on NIST MT08
• LDC catalog no. LDC2010T21 and LDC2010T01
• No accompanying system papers
• Challenging: mix of newswire and web texts
• Chinese-English and Arabic-English
♦ split datasets into tuning set and test set
MEMT on NIST MT08
• Chinese-English:
♦ Tuning set: 524 sentences, test set: 788 sentences
♦ Combining the top 5 systems out of 23 systems
♦ similar to Ma and McKeown (2012)
• Arabic-English:
♦ Tuning set: 509 sentences, test set: 803 sentences
♦ Combining the top 7 systems out of 14 systems
MEMT on NIST MT08
• Chinese-English, gain = 3.76
MEMT on NIST MT08
• Arabic-English, gain = 3.47
Diversifying Chinese-English SMT
• Varying different steps of training pipeline
• Tune on MTC1+MTC3 datasets (LDC2002T01 and LDC2004T07), test on NIST02-NIST08 evaluation sets
• Varying decoding algorithm: Maximum A Posteriori (MAP), Minimum Bayes Risk (MBR), Lattice Minimum Bayes Risk (LMBR)
• Varying reordering model: word-based (wbe), phrase-based (phrase), hierarchical (hier), combined reordering (phrase-hier)
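The difference between MAP and MBR decoding can be shown on a toy N-best list. MAP returns the single most probable hypothesis. MBR returns the hypothesis with the highest expected similarity to the other hypotheses under the model's posterior. The sketch below uses a unigram F1 overlap as the similarity function, a simple stand-in for the BLEU-based gain typically used. The N-best list and its probabilities are hypothetical.

```python
from collections import Counter


def overlap_gain(hyp, other):
    """F1 over unigram counts between two token lists (stand-in for a BLEU-based gain)."""
    common = sum((Counter(hyp) & Counter(other)).values())
    if common == 0:
        return 0.0
    precision = common / len(hyp)
    recall = common / len(other)
    return 2 * precision * recall / (precision + recall)


def map_select(nbest):
    """Pick the hypothesis with the highest posterior probability."""
    return max(nbest, key=lambda item: item[1])[0]


def mbr_select(nbest):
    """Pick the hypothesis with the highest expected gain under the posterior."""
    def expected_gain(candidate):
        return sum(p * overlap_gain(candidate, other) for other, p in nbest)
    return max((tokens for tokens, _ in nbest), key=expected_gain)


# Hypothetical N-best list: (tokens, posterior probability).
nbest = [
    ("the cat sat on the mat".split(), 0.4),
    ("a cat sat on the mat".split(), 0.3),
    ("a cat sat on a mat".split(), 0.3),
]
print("MAP:", " ".join(map_select(nbest)))
print("MBR:", " ".join(mbr_select(nbest)))
```

On this list the two rules disagree: MBR prefers the second hypothesis because it is close to both of the others, even though it is not the most probable one on its own.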
Diversifying Chinese-English SMT
• Varying decoding algorithm, gain = -0.17
Diversifying Chinese-English SMT
• Varying reordering model, gain = 0.19
Exploiting multiple CWS standards
• Chinese Word Segmentation
♦ Correlates weakly with MT quality
♦ Potential source of diversity
• SIGHAN Bakeoff evaluation campaign
♦ Academia Sinica (AS)
♦ City University of Hong Kong (CITYU)
♦ Penn Chinese Treebank (CTB)
♦ Microsoft Research (MSR)
♦ Peking University (PKU)
Exploiting multiple CWS standards
• Chinese Word Segmentation
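To see why segmentation standards can act as a source of diversity, consider how the same sentence tokenizes under two conventions of different granularity. The sketch below uses forward maximum matching, a simple dictionary-based segmenter chosen only for brevity. It is not the segmenter used in the experiments, and the two lexicons are hypothetical, not samples of any SIGHAN standard.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation: take the longest lexicon entry at each position."""
    tokens = []
    i = 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            # A single character is always accepted as a fallback.
            if length == 1 or piece in lexicon:
                tokens.append(piece)
                i += length
                break
    return tokens


# Hypothetical lexicons: one keeps the institution name whole, the other splits it.
coarse_lexicon = {"北京大学", "学习"}
fine_lexicon = {"北京", "大学", "学习"}

sentence = "我在北京大学学习"  # "I study at Peking University"
print(forward_max_match(sentence, coarse_lexicon))
print(forward_max_match(sentence, fine_lexicon))
```

Each tokenization gives the downstream SMT system different word alignments and phrase tables, which is the kind of variation that system combination can later exploit.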
Exploiting multiple CWS standards
• Baseline System
♦ Chinese-English phrase-based SMT systems trained with Moses
♦ Segmenting and training five different systems corresponding to five CWS standards
♦ Training bi-text: 8,290,649 sentence pairs
♦ Interpolated language model of order 5
♦ Tuning set MTC1+MTC3: 1928 sentences, 4 references each
♦ GIZA++ alignment, combined reordering scheme, MBR decoding
Exploiting multiple CWS standards
• System combination experiments
♦ Same tuning set MTC1+MTC3
♦ ZMERT and PRO tuning
♦ Test sets: NIST 2002 to 2006, 2008
♦ Evaluation: mteval-v11b, case-insensitive
Exploiting multiple CWS standards
• Results – component systems
Exploiting multiple CWS standards
• Results – combining 5 systems
♦ Avg gain: 0.52 (ZMERT) and 0.82 (PRO)
Exploiting multiple CWS standards
• Results – combining the top 3 systems
♦ Avg gain: 0.35 (ZMERT) and 0.64 (PRO)
♦ Lower than when combining 5 systems
Exploiting multiple CWS standards
• Discussion
♦ CWS is a good source to generate diverse SMT systems
♦ Benefits:
➢ Reducing segmentation errors
➢ Reducing out-of-vocabulary words
➢ Providing diverse translations
Exploiting multiple CWS standards
• Component system outputs
Exploiting multiple CWS standards
• Combined system output
Conclusion and future research
• Conclusion
♦ System combination does benefit MT
♦ Exceptions
➢ Combining very few systems
➢ Some component systems with exceptionally bad performance
➢ Combining very similar systems (non-complementary)
♦ Achieved the goal of improving Chinese-English SMT system
Conclusion and future research
• Future research
♦ Evaluating different combination algorithms
➢ Collaborative decoding (Li et al., 2009)
♦ Trait-based approach as a way to generate diverse inputs (Devlin and Matsoukas, 2012)
Summary
• Empirical experiments
♦ MEMT as system combination module
♦ WMT and NIST evaluation sets
• System combination does benefit MT quality
♦ comparable, complementary input systems
• Exploiting multiple CWS as a way to diversify SMT systems
♦ improve a strong Chinese-English phrase-based system
♦ average gain 0.5-0.8 BLEU in NIST02-06 and NIST08
Thank You