domain adaptation for neural machine translationkevinduh/notes/1905-kduh-mtma-adapt.pdfsubnetworksto...

47
Domain Adaptation for Neural Machine Translation Kevin Duh Johns Hopkins University May 2019

Upload: others

Post on 15-Mar-2020

18 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

DomainAdaptationforNeuralMachineTranslation

KevinDuhJohnsHopkinsUniversity

May2019

Page 2: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Outline

1. Problemdefinition2. Surveyofadaptationmethods3. ErrorAnalysis4. PromisingResearchDirections

Page 3: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

DomainAdaptationProblem:MachineLearningPerspective

• Trainingdata:– (x1,y1),(x2,y2),(x3,y3),…, i.i.d.samplesfromdistributionD

– Buildmodelp(y|x)

• IftestdataisnotfromD, p(y|x)maybeoperatingataspaceitwasn’tbuiltfor.– Twocasesforwhatwemeanby“notfromD”

Page 4: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Visualization:Fittingp(y|x)

x

y

Page 5: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Case1:Testisnotininputdomain(CovariateShift)

x

y

testsamplext

Wheredoyoupredictyt?

Page 6: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Case2:Input-outputrelationchanges

x

y

testsamplext

Expectyt here

Butunexpectedlyyt ishere

Page 7: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

ExamplesinMachineTranslation(MT)

• Domainmismatchexample:– TrainingdataconsistsofPatent sentences– TestsampleisSocialMedia

• Case1:Testisnotininputdomain– cantranslatetechnicalwordslike“NMT”– noideahowtotranslate“OMG”

• Case2:Input-Outputrelationchanges– “CAT”translatestoawordthatmeans“ComputerAidedTranslation” ratherthan“Cutefurryanimal”

Page 8: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

“Domain”isafuzzyconceptinpractice

• Corporadifferby:– Topics:patents,politics,news,medicine– Style:formal,informal–Modality:written,spoken

• Oftenuse“domain”torefertodatasource:– e.g.Europarl,OpenSubtitles,TED,Paracrawl

• Bothcase1andcase2mismatchesoccuratmultiplelevels:lexical,syntactic,etc.

Page 9: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Examplesentences(case1):whichisPatent,TED,Subtitles,Europarl?

1. Weliveinadigitalworld,butwe’refairlyanalogcreatures.

2. Thetabletsexhibitimprovedbioavailabilityoftheactiveingredient.

3. So,um… she’skidding.4. Resumptionofthesession

12 3 4

Page 10: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Examplebitext (case2)

Page 11: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

WhyisDomainAdaptationanimportantprobleminMT?

• Itmaybeexpensivetoobtaintrainingbitextsthatarebothlarge &relevant totestdomain

• Oftenhavetoworkwithwhateverwecanget

Small Large

Irrelevant ✔

Relevant ✔ ✔✔

DataSize

Relevancetotestdomain

Page 12: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Terminology(1of2)

• Example:TestdomainisSocialMedia• In-domaindata– Datathatisrelevanttotestdomain:SNScorpus

• Out-of-domaindata– Datathatislessrelevanttotestdomain:Europarl

• General-domaindata–MayuseinterchangeablywithOut-of-Domain–Maymeanmixedcorpus:

• Europaral +Patent+TED• Europarl +Patent+TED+SNS

Page 13: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Terminology(2of2)

• Supervisedadaptationmethods:– AssumesOODbitext &IDbitext

• Unsupervisedadaptationmethods:– AssumesOODbitext &IDmonotext

Small Large

Irrelevant Out-of-Domain(OOD)

Relevant In-Domain(ID)

DataSize

Relevancetotestdomain

Page 14: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Outline

1. Problemdefinition2. Surveyofadaptationmethods3. ErrorAnalysis4. PromisingResearchDirections

Page 15: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

AtaxonomyofdomainadaptationmethodsforNMT

From:Chenhui ChuandRui Wang,ASurveyofDomainAdaptationforNeuralMachineTranslation,COLING2018

Page 16: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Datacentricadaptationmethods

Note:I’llpresentanassortmentofmethods,pickedmainlytodemonstratethevarietybutnotnecessarilyrepresentativeoftheliterature.

Page 17: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

SyntheticDataAugmentation(forwardorbacktranslation)

LargeOut-of-DomainBitext

In-domainMonotext

Out-of-DomainNMTModel

src trgsrc ortrg

Page 18: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

FilteringOut-of-DomainBitext forrelevantdatasubsets(esp.forcase2)

LargeOut-of-DomainBitext

SmallIn-domainBitext

Relevancemodele.g.ngram languagemodel

RobertCMooreandWilliamLewis.Intelligentselectionoflanguagemodeltrainingdata.ACL2010

Amittai Axelrod,Xiaodong He,andJianfeng Gao.Domainadaptationviapseudoin-domaindataselection.EMNLP2011

KevinDuh,GrahamNeubig,KatsuhitoSudoh,andHajimeTsukada.Adaptationdataselectionusingneurallanguagemodels:Experimentsinmachinetranslation.ACL2013

Marcin Junczys-Dowmunt.Dualconditionalcross- entropyfilteringofnoisyparallelcorpora.WMT2018

Page 19: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Trainingobjectivecentricmethods

Page 20: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

ContinuedTraining(a.k.a fine-tuning)

Out-of-DomainNMTModel

ContinuedTrainingNMTModel

RandomInitializedNMTModel

SmallIn-domainBitext

LargeOut-of-DomainBitext

Thisseemstobe1st citationonNMTcontinuedtraining:Minh-Thang Luong andChrisManning.StanfordNeuralMachineTranslationSystemsforSpokenLanguageDomain.IWSLT2016

Page 21: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

ContinuedTraining:GeneralDomainà PatentDomain

0

10

20

30

40

50

60

70

German Korean Russian Chinese

GeneralDomain In-Domain ContinuedTraining

+0.4

+1.8 +10.1+3.5

BLEU

ResultsfromJHUSCALEWorkshop2018:ResilientMachineTranslationinNewDomains

Page 22: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Generalalgorithm:1. TrainmodelonconvergenceondatasetA(A=OODbitext)2. ContinuetrainingondatasetB(B=in-domainbitext)

ContinuedTrainingVariants:• Detailsonlearningrate,etc.instep2matters• Addingaregularizationtermorfixsubnetworks instep2

– AntonioValerioMiceli Barone,BarryHaddow,UlrichGermann,andRicoSennrich.Regularizationtechniquesforfine-tuninginneuralmachinetranslation.EMNLP2017

– HudaKhayrallah,BrianThompson,KevinDuh,andPhilippKoehn.Regularizedtrainingobjectiveforcontinuedtrainingfordomainadaptationinneuralmachinetranslation.WNMT2018

– BrianThompson,HudaKhayrallah,Antonios Anastasopoulos,Arya D.McCarthy,KevinDuh,RebeccaMarvin,PaulMcNamee,JeremyGwinnup,TimAnderson,PhilippKoehn.FreezingSubnetworks toAnalyzeDomainAdaptationinNeuralMachineTranslation,WMT2018

• Differentwaystomixdata(e.g.A+Binstep2)ororderdata– Chenhui Chu,RajDabre,andSadao Kurohashi.Anempiricalcomparisonofdomainadaptation

methodsforneuralmachinetranslation.ACL2017– Marlies vanderWees,AriannaBisazza,andChristof Monz.Dynamicdataselectionforneural

machinetranslation.EMNLP2017– WeiWang,TaroWatanabe,Macduff Hughes,Tetsuji Nakagawa,andCiprian Chelba.Denoising

neuralmachinetranslationtrainingwithtrusteddataandonlinedataselection.WMT2018– Xuan Zhang,PamelaShapiro,Gaurav Kumar,PaulMcNamee,MarineCarpuat andKevinDuh.

CurriculumLearningforDomainAdaptationinNeuralMachineTranslation.NAACL2019

• Ensembling out-of-domainmodelandcontinuedtrainedmodel:– MarkusFreitag andYaser Al-Onaizan.2016.FastDomainAdaptationforNeuralMachine

Translation.ArXiV abs/1612.06897.

Page 23: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

InstanceWeighting

BoxingChen,ColinCherry,GeorgeFoster,SamuelLarkin.CostWeightingforNeuralMachineTranslationDomainAdaptation.WNMT2017

Rui Wang,MasaoUtiyama,Lemao Liu,Kehai Chen,Eiichiro Sumita.InstanceWeightingforNeuralMachineTranslationDomainAdaptation.EMNLP2017

Page 24: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Architecture/DecoderCentricmethods

• (Iprefertoconsiderthetwotypesofmethodstogethersinceitissometimesarbitrarytodifferentiatewhat’spartofthewholearchitectureandwhat’sonlypartofthedecodingprocess)

Page 25: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Fusionoftwomodels

Caglar Gulcehre,Orhan Firat,KelvinXu,Kyunghyun Cho,Loic Barrault,Huei-ChiLin,Fethi Bougares,Holger Schwenk,andYoshua Bengio.2015.Onusingmonolingualcorporainneuralmachinetranslation.CoRR,abs/1503.03535.

Page 26: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Translationmodelvs.Languagemodel

• Problem:Languagemodelcanbetoostrong

• Onesolution:

Toan NguyenandDavidChiang.ImprovingLexicalChoiceinNeuralMachineTranslation.NAACL2018(notethispaperaddressesthegeneralproblemofimproperlexicalchoice,butthisisafrequentproblemindomainadaptation)

Averagedsourcewordrepresentationatdecodetimet

Page 27: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Otheradaptationmethods

Page 28: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Adaptationatthetokenlevel:Subword Regularization

Taku Kudo,Subword Regularization:ImprovingNeuralNetworkTranslationModelswithMultipleSubword Candidates,ACL2018

Page 29: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Trainoverdifferentsubword segmentations,randomlysampled

Page 30: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Outline

1. Problemdefinition2. Surveyofadaptationmethods3. ErrorAnalysis4. PromisingResearchDirections

Page 31: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Thetruthis,it’shardtoanalyzeMTerrors

• ItwashardtopinpointwhytranslationwasincorrectintheSMTdays

• It’sperhapsevenharderforNMT• Butwetryanyway.Atleastitgivesawaytothinkabouttheproblem.

Page 32: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

S4Analysis(originallydevelopedforSMT)

• SEENerror:Neverseenthissourcewordbeforeinthetrainingdata(case1in1st partofthistalk)

• SENSEerror:Thesourcewordappearsinthetrainingdata,butisnotusedinthissense.(e.g.case2)

• SCOREerror:Thesourcewordanditstranslationappearsinthetrainingdata,butthecorrecttranslationisscoredlower

• SEARCHerror:Thecorrecttranslationisscoredhigher,butsomehowgotlostinthesearchprocess

AnnIrvine,JohnMorgan,MarineCarpuat,HalDaumeIII,andDragosMunteanu.2013.Measuringmachinetranslationerrorsinnewdomains.TransactionsoftheAssociationforComputationalLinguistics(TACL)

Page 33: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

S4Analysis(requiresreferenceandalignment)

AnnIrvine,JohnMorgan,MarineCarpuat,HalDaumeIII,andDragosMunteanu.2013.Measuringmachinetranslationerrorsinnewdomains.TransactionsoftheAssociationforComputationalLinguistics(TACL)

Page 34: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

S4Analysis(forNMT?)

• First,runexternalwordalignertodeterminecorrect/incorrectwords(buthowmuchcanwetrustthis?)

• SEENerror:– Checkout-of-vocabularywordsonsourceside

• SENSEerror:– Foragivensourcewordf,checkifthedesiredtranslationnever

appearsinthetargetsideoftrainingbitext wheref appears?• SCOREerror:

– Whennoneoftheaboveistrue?• SEARCHerror:

– Notsurehowtocheckbesidesincreasingbeam,but..

Page 35: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

FluentlyInadequateTranslations

• Source:(TEDtalk)

• Un-adaptedsystemoutput:I’mafraidI’mnotgoingtohavetogotobed.

• Gloss:Crowparentsseemtobeteachingtheiryoungtheseskills.

• Adaptedsystemoutput:Andtheirparentsalsotaughttheirchildrenhowtodoit.

• Translationsthatarefluentbuthavenothingtodowiththesourceareverydangerous!

MariannaJ.Martindale,MarineCarpuat,KevinDuhandPaulMcNamee.IdentifyingFluentlyInadequateOutputinNeuralandStatisticalMachineTranslation.MTSummit2019

Page 36: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Outline

1. Problemdefinition2. Surveyofadaptationmethods3. ErrorAnalysis4. PromisingResearchDirections(myopinion)

Page 37: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

PersonalizedAdaptation,e.g.forComputerAssistedTranslation(CAT)

• CATpresentsmanyinterestingopportunitiesforresearch(withrealuserimpact!)

• ExampleinterfaceatLilt.com:

PaulMichel,GrahamNeubig.ExtremeAdaptationforPersonalizedNeuralMachineTranslation.ACL2018Sachith SriRamKothur,RebeccaKnowlesandPhilippKoehn.Document-LevelAdaptationforNeuralMachineTranslation.WNMT2018

Page 38: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

AdaptationtoNewLanguages

• Givenbitext inlanguagepairsA->B,C->D– BuildatranslatorforA->D– BuildatranslatorforE->BwhereEisrelatedlanguagetoA

– Assumessomesharedrepresentation,canusecontinuedtraining,etc.

• Crazyideabutpotentiallylargeimpact

Barret Zoph,Deniz Yuret,JonathanMay,andKevinKnight.Transferlearningforlow-resourceneuralmachinetranslation.EMNLP2016;GrahamNeubig andJunjie Hu.RapidAdaptationofNeuralMachineTranslationtoNewLanguages.EMNLP2018

Page 39: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

UnderstandingadaptationerrorsasawaytounderstandNMTbehavior

• DomainAdaptationprovidesagoodtestbedforunderstandingoverfitting,etc.

• Whattriggersafluentlyinadequatetranslation?

• Whydoescatastrophicforgettinghappen?Thompson,et.al.NAACL2019;Saunders,et.al.ACL2019;Kirpatrick,et.al.PNAS2017

Page 40: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Betteradaptationalgorithms

• ManyopportunitiesfornewideasinNMT,e.g.• Batchvs.Onlinesetup– Onlinesetupmakescurriculumlearningeasier

• Easytodesignnewarchitectures–Multitasklearningforsharingparameters&data

• WhatideastoborrowfromSMT?–Modularizationoflexicalchoice,reordering,syntax,etc.

Page 41: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Re-cap

1. Problemdefinition2. Surveyofadaptationmethods3. ErrorAnalysis4. PromisingResearchDirections

Page 42: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Case1:Testisnotininputdomain(CovariateShift)

x

y

testsamplext

Wheredoyoupredictyt?

Page 43: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Case2:Input-outputrelationchanges

x

y

testsamplext

Expectyt here

Butunexpectedlyyt ishere

Page 44: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

WhyisDomainAdaptationanimportantprobleminMT?

• Expensivetoobtaintrainingbitexts thatarebothlarge &relevant totestdomain

Small Large

Irrelevant ✔

Relevant ✔ ✔✔

DataSize

Relevancetotestdomain

Page 45: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

AtaxonomyofdomainadaptationmethodsforNMT

From:Chenhui ChuandRui Wang,ASurveyofDomainAdaptationforNeuralMachineTranslation,COLING2018

Page 46: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

ContinuedTraining

Out-of-DomainNMTModel

ContinuedTrainingNMTModel

RandomInitializedNMTModel

SmallIn-domainBitext

LargeOut-of-DomainBitext

Page 47: Domain Adaptation for Neural Machine Translationkevinduh/notes/1905-kduh-MTMA-Adapt.pdfSubnetworksto Analyze Domain Adaptation in Neural Machine Translation, WMT 2018 • Different

Outline

1. Problemdefinition2. Surveyofadaptationmethods3. ErrorAnalysis– S4&FluentlyInadequateTranslations

4. PromisingResearchDirections– CAT,newlanguageadaptation,adaptationas

windowtounderstandNMT,etc.