
Page 1: CS388: Natural Language Processing

Some slides adapted from Dan Klein, UC Berkeley

CS388: Natural Language Processing

Greg Durrett

Lecture 17: Machine Translation 1

Star Wars The Third Gathers: The Backstroke of the West (subtitles machine translated from Chinese)

Recall: Attention

[Figure: encoder states h_1 … h_4 over "the movie was great"; decoder state \bar{h}_1 after <s> attends over the encoder to produce context c_1 and predict the first target word "le"]

‣ For each decoder state, compute a weighted sum of input states:

e_{ij} = f(\bar{h}_i, h_j)

‣ Some function f (TBD)

\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{j'} \exp(e_{ij'})}

‣ Weighted sum of input hidden states (vector):

c_i = \sum_j \alpha_{ij} h_j

P(y_i | x, y_1, \ldots, y_{i-1}) = \mathrm{softmax}(W[c_i; \bar{h}_i])

‣ No attn: P(y_i | x, y_1, \ldots, y_{i-1}) = \mathrm{softmax}(W \bar{h}_i)
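As a quick refresher, here is a minimal numpy sketch of one attention step. It assumes f is a plain dot product (the slide leaves f unspecified) and made-up dimensions; it is an illustration, not the lecture's exact parameterization.

```python
import numpy as np

def softmax(x):
    x = x - np.max(x)              # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum()

def attention_step(h_bar_i, H, W):
    """One decoder step with attention.

    h_bar_i: decoder state, shape (d,)
    H:       encoder states h_1..h_T stacked, shape (T, d)
    W:       output projection, shape (V, 2d)
    """
    e = H @ h_bar_i                          # e_ij = f(h_bar_i, h_j), here a dot product
    alpha = softmax(e)                       # attention weights over input positions
    c_i = alpha @ H                          # context: weighted sum of encoder states
    logits = W @ np.concatenate([c_i, h_bar_i])
    return softmax(logits)                   # P(y_i | x, y_1..y_{i-1})
```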

Recall: Pointer Networks

‣ Standard decoder (P_vocab): softmax over the vocabulary, all words get >0 prob:

P(y_i | x, y_1, \ldots, y_{i-1}) = \mathrm{softmax}(W[c_i; \bar{h}_i])

‣ Pointer network: predict from source words instead of the target vocab:

P_{\mathrm{pointer}}(y_i | x, y_1, \ldots, y_{i-1}) \propto \begin{cases} h_j^\top V \bar{h}_i & \text{if } y_i = w_j \\ 0 & \text{otherwise} \end{cases}

[Figure: for source w_1 … w_4 = "the movie was great", the standard decoder spreads mass over the whole vocabulary ("the movie was great a of …"), while the pointer puts mass only on the source positions]
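A matching sketch of the pointer distribution, reusing the softmax helper above. V_mat is a hypothetical bilinear parameter, and the proportionality is implemented by exponentiating and normalizing the scores over source positions.

```python
def pointer_step(h_bar_i, H, V_mat):
    """Distribution over source positions instead of the target vocabulary.

    H:     encoder states, shape (T, d)
    V_mat: bilinear scoring matrix, shape (d, d)
    """
    scores = H @ (V_mat @ h_bar_i)   # score_j = h_j^T V h_bar_i
    return softmax(scores)           # P_pointer(y_i = w_j | x, y_1..y_{i-1})
```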

This Lecture

‣ MT basics, evaluation

‣ Word alignment

‣ Language models

‣ Phrase-based decoders

‣ Syntax-based decoders (probably next time)

Page 2: CS388: Natural Language Processing

MT Basics

MT

"Trump Pope family watch a hundred years a year in the White House balcony" (machine-translated headline)

People's Daily, August 30, 2017

MT Ideally

‣ I have a friend => ∃x friend(x, self) => J'ai un ami

J'ai une amie (friend is female)

‣ May need information you didn't think about in your representation

‣ Everyone has a friend => ∃x ∀y friend(x, y) or ∀x ∃y friend(x, y) => Tous ont un ami

‣ Can often get away without doing all disambiguation; the same ambiguities may exist in both languages

‣ Hard for semantic representations to cover everything

MT in Practice

Je fais un bureau ||| I make a desk

Je fais un bureau ||| I'm making a desk

Je fais une soupe ||| I'm making soup

Qu'est-ce que tu fais ? ||| What are you making?

‣ What are some translation pairs you can identify? How do you know?

‣ What makes this hard? Not word-to-word translation; multiple translations of a single source (ambiguous)

‣ Bitext: this is what we learn translation systems from

Page 3: CS388: Natural Language Processing

Levels of Transfer: Vauquois Triangle

[Figure: Vauquois triangle. Slide credit: Dan Klein]

‣ Today: mostly phrase-based, some syntax

Phrase-Based MT

‣ Key idea: translation works better the bigger chunks you use

‣ Remember phrases from training data, translate piece-by-piece, and stitch those pieces together to translate

‣ How to identify phrases? Word alignment over source-target bitext

‣ How to stitch together? Language model over the target language

‣ Decoder takes phrases and a language model and searches over possible translations

‣ NOT like standard discriminative models (take a bunch of translation pairs, learn a ton of parameters in an end-to-end way)

Phrase-Based MT

Phrase table P(f|e):

cat ||| chat ||| 0.9
the cat ||| le chat ||| 0.8
dog ||| chien ||| 0.8
house ||| maison ||| 0.6
my house ||| ma maison ||| 0.9
language ||| langue ||| 0.9
…

Language model P(e), estimated from unlabeled English data

‣ Noisy channel model: combine scores from the translation model + language model to translate foreign to English:

P(e|f) \propto P(f|e) P(e)

"Translate faithfully but make fluent English"
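A tiny sketch of the noisy channel combination; the phrases and probabilities below are invented for illustration.

```python
import math

# Toy phrase table P(f|e) and language model P(e); all numbers are made up.
p_f_given_e = {("le chat", "the cat"): 0.8, ("le chat", "cat the"): 0.8}
p_e = {"the cat": 1e-2, "cat the": 1e-6}

def noisy_channel_score(f, e):
    """log P(e|f) up to a constant: log P(f|e) + log P(e)."""
    return math.log(p_f_given_e[(f, e)]) + math.log(p_e[e])

# The TM is equally faithful either way; the LM breaks the tie toward fluent English.
best = max(p_e, key=lambda e: noisy_channel_score("le chat", e))
print(best)  # the cat
```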

Evaluating MT

‣ Fluency: does it sound good in the target language?

‣ Fidelity/adequacy: does it capture the meaning of the original?

‣ BLEU score: geometric mean of 1-, 2-, 3-, and 4-gram precision vs. a reference, multiplied by a brevity penalty (penalizes short translations):

\mathrm{BLEU} = \mathrm{BP} \cdot \exp\left( \sum_{n=1}^{N} w_n \log p_n \right), \qquad \mathrm{BP} = \min(1, e^{1 - r/c})

‣ Typically N = 4, w_n = 1/4

r = length of reference, c = length of prediction

‣ Does this capture fluency and adequacy?
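A short sketch of sentence-level BLEU under these definitions. Real implementations work at the corpus level and add smoothing; this unsmoothed version returns 0 if any n-gram order has no matches.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(pred, ref, N=4):
    """Unsmoothed sentence BLEU with uniform weights w_n = 1/N."""
    log_prec = 0.0
    for n in range(1, N + 1):
        p, r = ngrams(pred, n), ngrams(ref, n)
        overlap = sum((p & r).values())            # clipped n-gram matches
        if overlap == 0:
            return 0.0                             # no smoothing: zero precision kills the score
        log_prec += math.log(overlap / sum(p.values())) / N
    c, r_len = len(pred), len(ref)
    bp = min(1.0, math.exp(1 - r_len / c))         # brevity penalty
    return bp * math.exp(log_prec)
```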

Page 4: CS388: Natural Language Processing

BLEU Score

‣ At a corpus level, BLEU correlates pretty well with human judgments

‣ Better methods exist with a human in the loop

‣ If you're building real MT systems, you do user studies; in academia, you mostly use BLEU

Word Alignment

‣ Input: a bitext, pairs of translated sentences:

nous acceptons votre opinion . ||| we accept your view .

nous allons changer d'avis ||| we are going to change our minds

‣ Output: alignments between words in each sentence, e.g. "accept and acceptons are aligned"

‣ Not always one-to-one!

‣ We will see how to turn these into phrases

1-to-Many Alignments

Page 5: CS388: Natural Language Processing

Word Alignment

‣ Model P(f|e): probability of a "French" sentence being generated from an "English" sentence according to a model

‣ Correct alignments should lead to higher-likelihood generations, so by optimizing this objective we will learn correct alignments

‣ Latent variable model:

P(f | e) = \sum_a P(f, a | e) = \sum_a P(f | a, e) P(a)

IBM Model 1

Brown et al. (1993)

e: Thank you , I shall do so gladly .
f: Gracias , lo hare de muy buen grado .
a: 0 2 6 5 7 7 7 7 8

‣ Each French word is aligned to at most one English word

‣ Set P(a) uniformly (no prior over good alignments)

‣ P(f_i | e_{a_i}): word translation probability table

P(f, a | e) = \prod_{i=1}^{n} P(f_i | e_{a_i}) P(a_i)
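A compact EM sketch for Model 1 under the slide's assumptions (uniform P(a), so only the translation table t(f|e) is learned). The bitext and the <NULL> convention here are illustrative.

```python
from collections import defaultdict

def model1_em(bitext, iterations=10):
    """EM for IBM Model 1: learn t(f|e) from (french_tokens, english_tokens) pairs."""
    t = defaultdict(lambda: 1.0)                     # flat initialization
    for _ in range(iterations):
        count, total = defaultdict(float), defaultdict(float)
        for f_sent, e_sent in bitext:                # E-step
            e_sent = ["<NULL>"] + e_sent             # allow alignment to nothing
            for f in f_sent:
                z = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    p = t[(f, e)] / z                # posterior P(a_i = j | f, e)
                    count[(f, e)] += p
                    total[e] += p
        t = defaultdict(float,                       # M-step: renormalize counts
                        {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

bitext = [(["le", "chat"], ["the", "cat"]), (["le", "chien"], ["the", "dog"])]
t = model1_em(bitext)   # t[("chat", "cat")] ends up much larger than t[("chat", "the")]
```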

HMM for Alignment

Vogel et al. (1996)

e: Thank you , I shall do so gladly .
f: Gracias , lo hare de muy buen grado .
a: 0 2 6 5 7 7 7 7 8

‣ Sequential dependence between a's to capture monotonicity

‣ P(f_i | e_{a_i}): same as before

‣ Alignment distribution parameterized by jump size: we want local monotonicity, since most jumps are small

[Figure: learned distribution over jump sizes -2 -1 0 1 2 3]

‣ Re-estimate using the forward-backward algorithm

P(f, a | e) = \prod_{i=1}^{n} P(f_i | e_{a_i}) P(a_i | a_{i-1})
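A sketch of how a jump-size parameterization turns into the HMM's transition matrix P(a_i | a_{i-1}). The scores below are made up; in the real model they are re-estimated with forward-backward.

```python
import numpy as np

# Invented jump-size scores, peaked near small forward jumps (local monotonicity)
JUMP_SCORES = {-2: 0.5, -1: 1.0, 0: 2.0, 1: 4.0, 2: 1.0, 3: 0.5}

def jump_transitions(T):
    """Transition matrix P(a_i | a_{i-1}) over T English positions,
    parameterized only by the jump size a_i - a_{i-1}."""
    trans = np.zeros((T, T))
    for prev in range(T):
        for nxt in range(T):
            trans[prev, nxt] = JUMP_SCORES.get(nxt - prev, 0.01)  # floor for large jumps
        trans[prev] /= trans[prev].sum()                          # normalize each row
    return trans
```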

HMM Model

[Figure: alignment matrix produced by the HMM model]

‣ Which direction is this?

‣ Alignments are generally monotonic (along the diagonal)

‣ Some mistakes, especially when you have rare words (garbage collection)

Page 6: CS388: Natural Language Processing

Evaluating Word Alignment

Model        | AER
Model 1 INT  | 19.5
HMM E→F      | 11.4
HMM F→E      | 10.8
HMM AND      |  7.1
HMM INT      |  4.7
GIZA M4 AND  |  6.9

‣ "Alignment error rate" (AER): use labeled alignments on a small corpus

‣ Run Model 1 in both directions and intersect "intelligently" (INT)

‣ Run the HMM model in both directions and intersect "intelligently"

Phrase Extraction

‣ Find contiguous sets of aligned words in the two languages that don't have alignments to other words (a sketch of this consistency check follows the examples):

d'assister à la réunion et ||| to attend the meeting and

‣ Lots of phrases are possible; count across all sentences and score by frequency:

assister à la réunion ||| attend the meeting

la réunion et ||| the meeting and

nous ||| we

…
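A simplified sketch of consistent phrase-pair extraction from one aligned sentence pair. It skips the usual extension over unaligned boundary words; alignment is a set of (source index, target index) pairs.

```python
def extract_phrases(f_len, e_len, alignment, max_len=4):
    """Phrase pairs whose alignment links stay entirely inside the pair."""
    phrases = []
    for f1 in range(f_len):
        for f2 in range(f1, min(f1 + max_len, f_len)):
            # target positions linked to this source span
            es = [e for (f, e) in alignment if f1 <= f <= f2]
            if not es:
                continue
            e1, e2 = min(es), max(es)
            # consistency: nothing in [e1, e2] may align outside [f1, f2]
            if all(f1 <= f <= f2 for (f, e) in alignment if e1 <= e <= e2):
                phrases.append(((f1, f2), (e1, e2)))
    return phrases

# "nous acceptons votre opinion ." ||| "we accept your view ." (1-1 alignment)
print(extract_phrases(5, 5, {(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)}))
```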

Decoding

Recall: n-gram Language Models

‣ n-gram models: the distribution of the next word is a multinomial conditioned on the previous n-1 words:

P(w) = P(w_1) P(w_2 | w_1) P(w_3 | w_1, w_2) \ldots

P(w_i | w_1, \ldots, w_{i-1}) = P(w_i | w_{i-n+1}, \ldots, w_{i-1})

‣ "I visited San ___": put a distribution over the next word

Maximum likelihood estimate of this 3-gram probability from a corpus:

P(w | \text{visited San}) = \frac{\mathrm{count}(\text{visited San } w)}{\mathrm{count}(\text{visited San})}

‣ Typically use ~5-gram language models for translation

Page 7: CS388: Natural Language Processing

Phrase-Based Decoding

‣ Inputs:

‣ Phrase table: set of phrase pairs (e, f) with probabilities P(f|e)

‣ n-gram language model: P(e_i | e_1, \ldots, e_{i-1}) \approx P(e_i | e_{i-n+1}, \ldots, e_{i-1})

‣ What we want to find: e produced by a series of phrase-by-phrase translations from an input f, possibly with reordering

‣ Phrase lattices are big!

[Figure: phrase lattice over an example source sentence. Slide credit: Dan Klein]

Phrase-Based Decoding

[Figure: an input sentence, its candidate phrase translations, and the decoding objective for a 3-gram LM. Slide credit: Dan Klein]

Monotonic Translation

‣ If we translate with beam search, what state do we need to keep in the beam?

‣ What have we translated so far?

‣ What words have we produced so far?

‣ When using a 3-gram LM, we only need to remember the last 2 words! Koehn (2004)

Page 8: CS388: Natural Language Processing

MonotonicTranslaHon

…didnotidx=2

Marynot

Maryno

4.2

-1.2

-2.9

idx=2

idx=2

score=log[P(Mary)P(not|Mary)P(Maria|Mary)P(no|not)]{ {LM TM

Inreality:score=αlogP(LM)+βlogP(TM)…andTMisbrokendownintoseveralfeatures

Koehn(2004)

MonotonicTranslaHon

…notslapidx=5

…aslap

…noslap

8.7

-2.4

-1.1

idx=5

idx=5

‣ Severalpathscangetustothisstate,maxoverthem(likeViterbi)

…notgiveidx=3

…giveaidx=4

unabofetada|||aslap

bofetada|||slap ‣ Variable-lengthtranslaHonpieces=semi-HMM
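A sketch of one expansion step of this dynamic program. The state is (source index, last two target words), and recombination keeps only the best score per state; tm_score and lm_score are placeholder hooks that a real decoder would back with the phrase table and the n-gram LM.

```python
from collections import defaultdict

def tm_score(src_phrase, tgt_words):
    return 0.0  # placeholder: log P(f|e) from the phrase table

def lm_score(context, tgt_words):
    return 0.0  # placeholder: sum of log P(w | previous two words)

def advance(beam, phrase_options):
    """One step of monotonic phrase-based decoding.

    beam: dict mapping state (src_idx, last_two_words) -> best score so far
    phrase_options[src_idx]: list of (src_len, src_phrase, tgt_words)
    """
    new_beam = defaultdict(lambda: float("-inf"))
    for (idx, ctx), score in beam.items():
        for length, src, tgt in phrase_options.get(idx, []):
            s = score + tm_score(src, tgt) + lm_score(ctx, tgt)
            state = (idx + length, tuple((list(ctx) + list(tgt))[-2:]))
            # recombination: several paths reach this state; keep the max (Viterbi)
            new_beam[state] = max(new_beam[state], s)
    return dict(new_beam)
```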

Non-MonotonicTranslaHon

‣ Non-monotonictranslaHon:canvisitsourcesentence“outoforder”‣ Stateneedstodescribewhichwordshavebeentranslatedandwhichhaven’t

translated:Maria,dio,una,bofetada

‣ Bigenoughphrasesalreadycapturelotsofreorderings,sothisisn’tasimportantasyouthink

Training Decoders

score = α log P(LM) + β log P(TM) … and the TM is broken down into several features

‣ Usually 5-20 feature weights to set; we want to optimize for BLEU score, which is not differentiable

‣ MERT (Och 2003): decode to get 1000-best translations for each sentence in a small training set (<1000 sentences), then do line search on the parameters to directly optimize for BLEU
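A much-simplified sketch of the idea: with the n-best lists fixed, search over one weight to maximize corpus BLEU. Real MERT exploits the fact that BLEU is piecewise constant along the search line instead of using a grid; bleu_fn is a placeholder for a corpus-level scorer.

```python
import numpy as np

def line_search_weight(nbest, bleu_fn, grid=np.linspace(0.0, 2.0, 101)):
    """Pick the LM weight alpha maximizing corpus BLEU over fixed n-best lists.

    nbest: per sentence, a list of (hypothesis, log_lm, log_tm) triples
    bleu_fn: maps a list of chosen hypotheses to a corpus BLEU score
    """
    best_alpha, best_bleu = None, float("-inf")
    for alpha in grid:
        chosen = [max(cands, key=lambda h: alpha * h[1] + h[2])[0]
                  for cands in nbest]
        score = bleu_fn(chosen)
        if score > best_bleu:
            best_alpha, best_bleu = alpha, score
    return best_alpha
```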

Page 9: CS388: Natural Language Processing

Moses

‣ Pharaoh (Koehn, 2004) is the decoder from Koehn's thesis

‣ Moses: a toolkit for machine translation due to Philipp Koehn + Hieu Hoang

‣ Moses implements word alignment, language models, and this decoder, plus *a ton* more stuff

‣ Highly optimized and heavily engineered; you could more or less build SOTA translation systems with this from 2007-2015

‣ Next time: results on these and comparisons to neural methods

Syntax

Levels of Transfer: Vauquois Triangle

[Figure: Vauquois triangle. Slide credit: Dan Klein]

‣ Is syntax a "better" abstraction than phrases?

Syntactic MT

‣ Rather than use phrases, use a synchronous context-free grammar:

NP → [DT1 JJ2 NN3; DT1 NN3 JJ2]
DT → [the, la]
DT → [the, le]
NN → [car, voiture]
JJ → [yellow, jaune]

the yellow car ↔ la voiture jaune

[Figure: parallel NP trees with co-indexed children DT1 JJ2 NN3 (English) and DT1 NN3 JJ2 (French)]

‣ Assumes parallel syntax up to reordering

‣ Translation = parse the input with "half" of the grammar, read off the other half
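A toy sketch of that last step for the single NP rule above: parse with the source side, emit the target side's reordering. The parsing is hard-coded to one flat rule; a real system would use a chart parser and handle the la/le choice by noun gender.

```python
# Source RHS -> target RHS for NP -> [DT1 JJ2 NN3; DT1 NN3 JJ2]
RULES = {("DT", "JJ", "NN"): ("DT", "NN", "JJ")}
LEXICON = {"the": "la", "yellow": "jaune", "car": "voiture"}  # assumes a feminine noun

def translate_np(words, tags=("DT", "JJ", "NN")):
    """Parse with the source half of the grammar, read off the other half."""
    children = dict(zip(tags, words))             # DT->the, JJ->yellow, NN->car
    return [LEXICON[children[t]] for t in RULES[tags]]

print(" ".join(translate_np(["the", "yellow", "car"])))  # la voiture jaune
```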

Page 10: CS388: Natural Language Processing

Syntactic MT

[Figure: lexicalized synchronous rules. Slide credit: Dan Klein]

‣ Use lexicalized rules that look like "syntactic phrases"

‣ Leads to HUGE grammars; parsing is slow

Takeaways

‣ Phrase-based systems consist of 3 pieces: aligner, language model, decoder

‣ HMMs work well for alignment

‣ N-gram language models are scalable and historically worked well

‣ Decoder requires searching through a complex state space

‣ Lots of system variants incorporating syntax

‣ Next time: neural MT