Introduction to Machine Translation - NJU NLP
nlp.nju.edu.cn/huangsj/mtcourse/mt1-intro.pdf


Shujian Huang

Introduction to Machine Translation


Machine Translation

- Automatically translate text from one language into another.
  - e.g., translating a text from English into Chinese


Interests

- Research perspective:
  - one of the first applications envisioned for computers
  - one of the most challenging problems in AI
  - requires knowledge from many NLP sub-areas
- Commercial perspective:
  - The US launched a series of projects: TIDES (Translingual Information Detection, Extraction and Summarization), GALE (Global Autonomous Language Exploitation), BOLT (Broad Operational Language Translation)
  - The EU spends more than $1 billion on translation
  - Practical usage


Machine Translation Roadmap

- 1949: The field of "machine translation" appeared in Warren Weaver's Memorandum on Translation.
- 1950s: Research groups established in the US, Japan, and Russia
- 1966: The Automatic Language Processing Advisory Committee (ALPAC) report gave a negative judgment
- 1968: SYSTRAN founded
- 1970s: MT used to translate text abstracts and technical manuals
- 1984: Trados founded (translation memory and CAT; acquired by SDL in 2005)
- 1990s: Statistical methods employed in MT research
- 1992: SDL plc founded
- 2002: Language Weaver founded (acquired by SDL in 2010 for $42.5m)
- 2007: Google announced its language service; the open-source SMT toolkit "Moses" released
- 2012: Google translates text equivalent to 1 million books in one day

[http://en.wikipedia.org/wiki/Machine_translation]

SMT Research Roadmap

- 1990
- 1993
- 1999: Phrase-based translation
- 2001: Syntax-based translation
- 2002: BLEU (evaluation metric)
- 2003: MERT (Minimum Error Rate Training)
- 2005: Hierarchical phrase-based translation


People

- Peter Brown (IBM Watson)
- Franz J. Och (RWTH, USC, Google)
- Philipp Koehn (USC, MIT, Edinburgh)
- Kevin Knight, Daniel Marcu (USC, Language Weaver)
- Hermann Ney (RWTH)
- Stephan Vogel (RWTH, CMU, Qatar)
- …


ACL Citations


Paradigms

- Rule-based Translation
- Statistical Machine Translation
- Example-based Translation (Translation Memory)

- Translation levels:
  - direct
  - transfer
  - interlingua


Rule-based Translation

- A dictionary that maps each English word to an appropriate German word.
- Rules representing regular English sentence structure.
- Rules representing regular German sentence structure.
- Rules relating the two structures to each other.


RBMT Example:

- A girl eats an apple.
  - Source language = English; target language = German
- 1st: get part-of-speech information for the source words:
  - a = indef. article; girl = noun; eats = verb; an = indef. article; apple = noun
- 2nd: get syntactic information about the verb "to eat":
  - NP-eat-NP; here: eat – Present Simple, 3rd Person Singular
- 3rd: parse the source sentence:
  - (NP an apple) = the object of eat


RBMT Example (cont.):

- 4th: translate the English words into German:
  - a (category = indef. article) => ein (category = indef. article)
  - girl (category = noun) => Mädchen (category = noun)
  - eat (category = verb) => essen (category = verb)
  - an (category = indef. article) => ein (category = indef. article)
  - apple (category = noun) => Apfel (category = noun)
- 5th: map the dictionary entries onto appropriately inflected forms (final generation):
  - A girl eats an apple. => Ein Mädchen isst einen Apfel.
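To make the transfer pipeline concrete, here is a minimal Python sketch covering just this one sentence pattern. The lexicon, the inflection table, and the DET-NOUN-VERB-DET-NOUN rule are illustrative assumptions, not part of any real RBMT system.

```python
# Toy rule-based English->German transfer for sentences of the form
# "DET NOUN VERB DET NOUN" (e.g. "A girl eats an apple.").
# The lexicon, rules and inflection table are illustrative only.

LEXICON = {
    "a": "ein", "an": "ein",        # indef. articles
    "girl": "Mädchen",              # noun
    "apple": "Apfel",               # noun
    "eats": "essen",                # verb (lemma)
}

# Generation step: agreement/inflection rules for this toy grammar.
INFLECT = {
    ("essen", "3sg"): "isst",
    ("ein", "subject:neuter"): "Ein",      # das Mädchen: neuter, nominative
    ("ein", "object:masculine"): "einen",  # der Apfel: masculine, accusative
}

def translate(sentence: str) -> str:
    words = sentence.rstrip(".").lower().split()
    if len(words) != 5:
        raise ValueError("toy grammar only handles DET NOUN VERB DET NOUN")
    det1, noun1, verb, det2, noun2 = (LEXICON[w] for w in words)
    # Generation: apply the inflection rules, keep SVO order for this clause.
    det1 = INFLECT.get((det1, "subject:neuter"), det1)
    verb = INFLECT.get((verb, "3sg"), verb)
    det2 = INFLECT.get((det2, "object:masculine"), det2)
    return f"{det1} {noun1} {verb} {det2} {noun2}."

print(translate("A girl eats an apple."))  # -> Ein Mädchen isst einen Apfel.
```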


Fundamentals of Statistical MT Systems

[Figure: the noisy-channel picture of SMT — French → "Broken English" → English. A French/English bilingual text and a monolingual English text each feed a statistical analysis step. Running example: "Que hambre tengo yo" ↔ I am so hungry, with candidate outputs: What hunger have I / Hungry I am so / I am so hungry / Have I that hunger …]

Statistical MT Systems

[Figure: the same picture annotated with its components — the Translation Model P(f|e), estimated from the French/English bilingual text; the Language Model P(e), estimated from the English text; and the decoding algorithm argmax_e P(e) * P(f|e). Example: "Que hambre tengo yo" → I am so hungry]

Bayes Rule


Given a source sentence f, the decoder should consider many possible translations … and return the target string e that maximizes P(e | f). By Bayes' rule, we can also write this as P(e) * P(f | e) / P(f) and maximize that instead. P(f) never changes while we compare different e's, so we can equivalently maximize P(e) * P(f | e).


The Noisy Channel Model

- Goal: a translation system from French to English
- Have a model P(e|f) which estimates the conditional probability of any English sentence e given the French sentence f; use the training corpus to set the parameters.
- A noisy channel model has two components:
  - P(e), the language model
  - P(f|e), the translation model

- This gives:

  P(e | f) = P(e, f) / P(f) = P(e) * P(f | e) / P(f)

  and

  argmax_e P(e | f) = argmax_e P(e) * P(f | e)


A division of labor

- Use of Bayes' rule ("the noisy channel model") allows a division of labor:
- The job of the translation model P(f|e) is just to model how various French words typically get translated into English (perhaps in a certain context)
  - P(f|e) doesn't have to worry about language-particular facts about English word order: that's the job of P(e)
- The job of the language model is to choose felicitous bags of words and to correctly order them for English
  - P(e) can do bag generation: putting a bag of words in order (see the sketch below)
    - e.g., hungry I am so → I am so hungry
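A minimal sketch of bag generation with a language model: a bigram model with add-one smoothing is estimated from a tiny corpus and used to pick the best ordering of a bag of words. The corpus, the smoothing choice, and the brute-force search over permutations are assumptions for this illustration, not something prescribed by the slides.

```python
from collections import Counter
from itertools import permutations
import math

# Tiny illustrative training corpus (an assumption for this sketch).
corpus = [
    "i am so hungry",
    "i am so tired",
    "you are so kind",
    "i am hungry",
]

# Estimate a bigram language model with add-one (Laplace) smoothing.
unigrams, bigrams = Counter(), Counter()
for line in corpus:
    tokens = ["<s>"] + line.split() + ["</s>"]
    unigrams.update(tokens[:-1])
    bigrams.update(zip(tokens[:-1], tokens[1:]))
vocab_size = len({w for line in corpus for w in line.split()} | {"</s>"})

def log_p(words):
    """log P(e) of a word sequence under the smoothed bigram model."""
    tokens = ["<s>"] + list(words) + ["</s>"]
    return sum(math.log((bigrams[(prev, cur)] + 1) /
                        (unigrams[prev] + vocab_size))
               for prev, cur in zip(tokens[:-1], tokens[1:]))

# Bag generation: order the bag {hungry, i, am, so} by LM score.
bag = ["hungry", "i", "am", "so"]
best = max(permutations(bag), key=log_p)
print(" ".join(best))                    # -> i am so hungry
print(log_p("i am so hungry".split()))   # higher (less negative) than ...
print(log_p("hungry i am so".split()))   # ... the scrambled order
```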


Three Problems for Statistical MT

- Language model
  - Given an English string e, assigns P(e) by formula
  - good English string -> high P(e)
  - random word sequence -> low P(e)
- Translation model
  - Given a pair of strings <f,e>, assigns P(f|e) by formula
  - <f,e> that look like translations -> high P(f|e)
  - <f,e> that don't look like translations -> low P(f|e)
- Decoding algorithm
  - Given a language model, a translation model, and a new sentence f, find the translation e maximizing P(e) * P(f|e) (a brute-force sketch follows below)
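A brute-force sketch of the decoding objective over a handful of candidates. The candidate strings and probability tables are made-up placeholders; a real decoder scores hypotheses with trained models and searches a vastly larger space with dynamic programming and pruning.

```python
import math

# Placeholder model scores for one fixed source sentence f.
P_e = {            # language model P(e)
    "this is a test": 0.020,
    "a test this is": 0.001,
    "test a is this": 0.0002,
}
P_f_given_e = {    # translation model P(f|e)
    "this is a test": 0.004,
    "a test this is": 0.005,
    "test a is this": 0.006,
}

def decode(candidates):
    """Return argmax_e P(e) * P(f|e), computed in log space for stability."""
    return max(candidates,
               key=lambda e: math.log(P_e[e]) + math.log(P_f_given_e[e]))

print(decode(P_e))  # -> this is a test (the LM outweighs the slightly better TM scores)
```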


Three Problems for Statistical MT (cont.)

- The language model P(e) could be an n-gram model, estimated from any data (a parallel corpus is not needed to estimate its parameters); see the decomposition below
- The translation model P(f|e) is trained from a parallel corpus of French/English pairs
- Note:
  - The translation model is backwards!
  - The language model can make up for deficiencies of the translation model.
  - Later we'll talk about how to build P(f|e).
  - Decoding, i.e., finding argmax_e P(e) * P(f|e), is also a challenging problem.
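For reference, the chain-rule factorization behind such an n-gram language model, with the usual Markov truncation (shown here for a trigram model; standard notation, not specific to these slides):

```latex
P(e) = P(e_1, \ldots, e_m)
     = \prod_{i=1}^{m} P(e_i \mid e_1, \ldots, e_{i-1})
\approx \prod_{i=1}^{m} P(e_i \mid e_{i-2}, e_{i-1})
```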


Example from Koehn and Knight tutorial

- Translation from French to English; candidate translations ranked by P(French|English) alone:
  - "Que hambre tengo yo" →
  - What hunger have I     P(F|E) = 0.000014
  - Hungry I am so         P(F|E) = 0.000001
  - I am so hungry         P(F|E) = 0.000001
  - Have I that hunger     P(F|E) = 0.000020


Example from Koehn and Knight tutorial (cont.)

- With a language model: rank by P(French|English) * P(English) (the products are multiplied out below)
  - "Que hambre tengo yo" →
  - What hunger have I     P = 0.000014 * 0.000001
  - Hungry I am so         P = 0.000001 * 0.0000014
  - I am so hungry         P = 0.000001 * 0.0001
  - Have I that hunger     P = 0.000020 * 0.00000098
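A quick check of the arithmetic with the numbers above: multiplying the translation-model and language-model scores now ranks the fluent sentence first.

```python
# (P(f|e), P(e)) pairs taken from the slide above.
candidates = {
    "What hunger have I": (0.000014, 0.000001),
    "Hungry I am so":     (0.000001, 0.0000014),
    "I am so hungry":     (0.000001, 0.0001),
    "Have I that hunger": (0.000020, 0.00000098),
}

for e, (tm, lm) in sorted(candidates.items(), key=lambda kv: -(kv[1][0] * kv[1][1])):
    print(f"{e:20s} P(f|e) * P(e) = {tm * lm:.3g}")
# "I am so hungry" wins with 1e-10, ahead of 1.96e-11, 1.4e-11 and 1.4e-12.
```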


- Reading Assignment:
