Statistical Machine Translation
Part III – Phrase-based SMT / Decoding
Alex Fraser
Institute for Natural Language Processing
University of Stuttgart
2008.07.23 EMA Summer School
Outline
• Phrase-based translation
• Log-linear model
• Tuning log-linear model
• Decoding
Slide from Koehn 2008
Language Model
• Usually a trigram language model is used for p(e)
• p(the man went home) = p(the | START) p(man | START the) p(went | the man) p(home | man went)
• Language models work well for comparing the grammaticality of strings of the same length
  – However, when comparing short strings with long strings they favor short strings (each additional word multiplies in another probability below 1)
  – For this reason, a very important component of the language model is the length bonus
    • This is a constant > 1 that is multiplied in once for each English word in the hypothesis
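The effect of the length bonus can be illustrated with a small sketch. Everything here is an assumption made for illustration: the probability table `TOY_LM`, the fallback value `unseen`, and the bonus value 1.5 are invented, and a real system would use a trained, smoothed language model. The code scores a hypothesis with the trigram factorization from the slide, then adds log(bonus) once per word.

```python
import math

# Toy conditional probabilities p(word | context); the numbers are invented.
TOY_LM = {
    (("START",), "the"): 0.5,
    (("START", "the"), "man"): 0.1,
    (("the", "man"), "went"): 0.6,
    (("man", "went"), "home"): 0.8,
}

def lm_log_prob(words, lm=TOY_LM, unseen=1e-6):
    """Sum of log p(w | up to two preceding tokens), with a single START token."""
    history = ["START"]
    total = 0.0
    for w in words:
        context = tuple(history[-2:])
        # Unseen events get a tiny fixed probability (a stand-in for smoothing).
        total += math.log(lm.get((context, w), unseen))
        history.append(w)
    return total

def score_with_length_bonus(words, bonus=1.5):
    """LM log score plus log(bonus) per word, where bonus is a constant > 1."""
    return lm_log_prob(words) + len(words) * math.log(bonus)

short_hyp = "the man".split()
long_hyp = "the man went home".split()

for hyp in (short_hyp, long_hyp):
    print(" ".join(hyp), lm_log_prob(hyp), score_with_length_bonus(hyp))
```

With these toy numbers the plain language model score prefers the two-word string, while the score with the bonus prefers the four-word string. In a log-linear model the size of the bonus would be tuned rather than fixed by hand.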
Modified from Koehn 2008
Outline
• Phrase-based translation
• Log-linear model
• Tuning log-linear model
• Decoding
Outline
• Phrase-based translation model
• Log-linear model
• Tuning log-linear model automatically
• Decoding
Outline
• Phrase-based translation model
• Log-linear model
• Tuning log-linear model automatically
• Decoding
  – Basic phrase-based decoding
  – Dealing with complexity
    • Recombination
    • Pruning
    • Future cost estimation
  – Decoding output
Assignment 2
• Build a state-of-the-art phrase-based SMT system!
  – German to English or French to English
  – Using a small amount of data
  – This is a "learning by doing" exercise
• See my home page again
Thank you!