Post on 21-Dec-2015
Statistical Phrase-Based Translation
Authors: Koehn, Och, Marcu
Presented by Albert Bertram
Titles, charts, graphs, figures and tables were extracted from the paper. Critique and snarky remarks, however, are original.
Motivation
We have a new way of learning phrase translations, but…
“What is the best method to extract phrase translation pairs?”
What to do, what to do
Compose a framework for consistent comparison
Implement each algorithm
Compare the results
Evaluation Framework
Phrases
Models involved
  Language model
  Statistical model for translation
  Distortion model
Decoder
Evaluation Framework: Phrases
We all know what phrases are, right? NP, VP, wait, what? Oh.
Here, they’re generic, spanning, non-overlapping subsequences of words.
Are these guys really linguists?
Evaluation Framework: Models
Language Model
  Trigram, usually: p(e_n | e_{n-1}, e_{n-2})
Translation Model
  argmax_e p(e|f) = argmax_e p(f|e) p(e)
  e_best = argmax_e p(f|e) p_LM(e) ω^{length(e)}
  p(f|e) is decomposed into phrase translation probabilities and a distortion model
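The decomposition of p(f|e) was a figure on the slide and is not in this transcript. As given in the Koehn, Och and Marcu paper, the foreign sentence is split into I phrases, each phrase is translated on its own, and the phrases may be reordered:

```latex
p(\bar{f}_1^I \mid \bar{e}_1^I) = \prod_{i=1}^{I} \phi(\bar{f}_i \mid \bar{e}_i)\, d(a_i - b_{i-1})
```

Here \phi is the phrase translation probability and d is the distortion model, with a_i and b_{i-1} being the start and end positions of consecutive foreign phrases.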
Evaluation Framework: Models
Distortion Model
  d(a_i - b_{i-1})
  a_i = start position of the foreign phrase translated into the i-th English phrase
  b_{i-1} = end position of the foreign phrase translated into the (i-1)-th English phrase
Learned from the joint probability model Ping told us about
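To make the a_i and b_{i-1} bookkeeping concrete, here is a minimal sketch. It does not use the learned joint-model distortion from the slide. Instead it swaps in the simpler alternative the paper mentions, d(a_i - b_{i-1}) = alpha^|a_i - b_{i-1} - 1|. The function name, the value alpha = 0.5 and the convention b_0 = 0 are assumptions made for illustration.

```python
def distortion_prob(spans, alpha=0.5):
    """Product of d(a_i - b_{i-1}) over the phrases of one translation.

    spans: (start, end) foreign positions, 1-based and inclusive, listed in
    the order their English translations are produced.
    alpha: assumed penalty base; the paper tunes it, 0.5 is only a placeholder.
    """
    prob = 1.0
    prev_end = 0  # assumption: b_0 = 0, so a phrase starting at position 1 costs nothing
    for start, end in spans:
        prob *= alpha ** abs(start - prev_end - 1)
        prev_end = end
    return prob


# Foreign phrases covering positions 1-2, 3 and 4-6.
monotone = [(1, 2), (3, 3), (4, 6)]   # translated in source order
swapped = [(3, 3), (1, 2), (4, 6)]    # the second phrase is translated first

print(distortion_prob(monotone))  # no jumps, so no penalty
print(distortion_prob(swapped))   # each jump multiplies in a factor below 1
```

A monotone translation gets probability 1 under this form, and every jump away from the previous phrase shrinks it.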
Evaluation Framework: Decoder
Left-to-right incremental
Stack-based beam search
Estimates future costs
Same decoder used in all experiments
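A toy sketch of what a left-to-right, stack-based beam search looks like. This is not the decoder from the paper: it leaves out the language model, future cost estimation and hypothesis recombination, and it uses an assumed exponential distortion penalty. The phrase table, `beam_size`, `alpha` and `max_phrase_len` are all made up for illustration.

```python
import math


def decode(foreign, phrase_table, beam_size=5, alpha=0.5, max_phrase_len=3):
    """Build the English output left to right, one foreign phrase at a time."""
    n = len(foreign)
    # stacks[k] holds hypotheses covering exactly k foreign words. A hypothesis is
    # (log score, covered positions, end of the last foreign phrase, English words).
    stacks = [[] for _ in range(n + 1)]
    stacks[0].append((0.0, frozenset(), 0, ()))
    for k in range(n):
        # Beam pruning: only the best few hypotheses in each stack are expanded.
        survivors = sorted(stacks[k], key=lambda h: h[0], reverse=True)[:beam_size]
        for score, covered, prev_end, english in survivors:
            for start in range(n):
                for end in range(start + 1, min(start + max_phrase_len, n) + 1):
                    span = frozenset(range(start, end))
                    if span & covered:
                        continue  # these foreign words are already translated
                    options = phrase_table.get(tuple(foreign[start:end]), [])
                    for target, logp in options:
                        # With 0-based start and exclusive end, |start - prev_end|
                        # is the jump distance |a_i - b_{i-1} - 1|.
                        distortion = abs(start - prev_end) * math.log(alpha)
                        stacks[k + len(span)].append(
                            (score + logp + distortion, covered | span,
                             end, english + target))
    if not stacks[n]:
        return None  # no full-coverage hypothesis was found
    best_score, _, _, best_english = max(stacks[n], key=lambda h: h[0])
    return " ".join(best_english), best_score


# Hypothetical phrase table: foreign phrase -> [(English phrase, log probability)].
toy_table = {
    ("das",): [(("this",), math.log(0.6)), (("the",), math.log(0.4))],
    ("ist",): [(("is",), math.log(0.9))],
    ("das", "ist"): [(("this", "is"), math.log(0.8))],
    ("ein",): [(("a",), math.log(0.9))],
    ("kleines",): [(("small",), math.log(0.8))],
    ("haus",): [(("house",), math.log(0.9))],
    ("kleines", "haus"): [(("small", "house"), math.log(0.7))],
}

translation, score = decode("das ist ein kleines haus".split(), toy_table)
print(translation)
```

Grouping hypotheses by the number of foreign words covered keeps the pruning comparison roughly fair. The real decoder also adds a future cost estimate so that hypotheses which translated the easy words first do not crowd out the rest.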
Baseline: Syntactic Phrases
Learn only syntactically correct phrases
Start with the word-based alignment
Prune out the phrase pairs in which either side is not a subtree of its parsed sentence.
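A minimal sketch of the subtree check, assuming parse trees are given as nested tuples `(label, child, ...)` with words as plain strings, and phrase pairs as index spans. The tree encoding, the function names and the example sentences are all hypothetical.

```python
def constituent_spans(tree):
    """Return the (start, end) word spans, end exclusive, covered by a subtree."""
    spans = set()

    def walk(node, start):
        if isinstance(node, str):   # a leaf is a single word
            return start + 1
        end = start
        for child in node[1:]:      # node[0] is the constituent label
            end = walk(child, end)
        spans.add((start, end))
        return end

    walk(tree, 0)
    return spans


def syntactic_pairs(phrase_pairs, foreign_tree, english_tree):
    """Keep only the pairs whose two sides are both subtrees of their parses."""
    f_spans = constituent_spans(foreign_tree)
    e_spans = constituent_spans(english_tree)
    return [(f, e) for f, e in phrase_pairs if f in f_spans and e in e_spans]


german = ("S", ("NP", ("ART", "das"), ("NN", "haus")),
               ("VP", ("V", "ist"), ("ADJ", "klein")))
english = ("S", ("NP", ("DT", "the"), ("NN", "house")),
                ("VP", ("VBZ", "is"), ("JJ", "small")))

# Phrase pairs as ((foreign start, end), (English start, end)) taken from the alignment.
candidates = [((0, 2), (0, 2)),   # das haus / the house
              ((1, 3), (1, 3)),   # haus ist / house is
              ((2, 4), (2, 4))]   # ist klein / is small

kept = syntactic_pairs(candidates, german, english)
print(kept)
```

The pair "haus ist / house is" is consistent with the word alignment but crosses the NP/VP boundary, so the filter drops it.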
Experiment Background
Europarl and BLEU
Training corpus of 10, 20, 40, 80, 160 and 320 thousand sentence pairs
Baseline Results
Notice the bottom row there? Comparing these models is like taking a 5-year-old to a chess tournament.
More Experiments
Weighting Syntactic Phrases
Maximum Phrase Length
Lexical Weighting
Phrase Extraction Heuristic
Simpler Underlying Word-Based Models
Other Languages
Experiments and Results
Weighting Syntactic Phrases
  Double the count on syntactic phrases
    Is that sufficient?
  Insufficient post-analysis on this one
  The same BLEU score
    Were the translations in better syntax?
    Did the translations at least use more syntactic phrases?
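What "double the count" does to the phrase table can be sketched as follows, assuming the relative-frequency estimate phi(f|e) = count(f, e) / sum over f of count(f, e). The counts, the function name and the `weight` parameter are illustrative and not taken from the paper's data.

```python
from collections import defaultdict


def phrase_translation_probs(pair_counts, syntactic, weight=2):
    """Relative-frequency estimate of phi(f | e) with boosted syntactic pairs.

    pair_counts: {(foreign phrase, English phrase): count}
    syntactic: set of pairs that passed the parse-tree check
    weight: multiplier for syntactic pairs (2 means "double the count")
    """
    weighted = {pair: count * (weight if pair in syntactic else 1)
                for pair, count in pair_counts.items()}
    totals = defaultdict(float)
    for (_, e), count in weighted.items():
        totals[e] += count
    return {(f, e): count / totals[e] for (f, e), count in weighted.items()}


counts = {("das haus", "the house"): 3, ("haus", "the house"): 1}
syntactic = {("das haus", "the house")}

plain = phrase_translation_probs(counts, syntactic, weight=1)
boosted = phrase_translation_probs(counts, syntactic, weight=2)
print(plain[("das haus", "the house")], boosted[("das haus", "the house")])
```

Doubling only shifts probability mass between phrase pairs that compete for the same English phrase.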
Experiments and Results
Phrase Extraction Heuristic
  Align bidirectionally
    Note this gives two different word alignment sets
  Start with the intersection of the two sets
  Add possible alignments
    Only if they’re in the union of the sets
    Only if they connect at least one previously unaligned word
Experiments and Results
Phrase Extraction Heuristic
  Algorithm
    Start with the first English word
    Expand only directly adjacent alignment points
    Move to the next English word, repeat.
    Finally add non-adjacent alignment points which meet the heuristic criteria.