How Deep Learning is making MT and other areas converge? MARTA R. COSTA-JUSSÀ UNIVERSITAT POLITÈCNICA DE CATALUNYA, BARCELONA


Page 1: HowDeep Learningismaking MT and otherareasconverge?mtm2017.unbabel.com/assets/images/slides/marta_jussa.pdf · Deep learning is… Abranch of machine learning based on a set of algorithms

How Deep Learning is making MT and other areas converge?

MARTA R. COSTA-JUSSÀ

UNIVERSITAT POLITÈCNICA DE CATALUNYA, BARCELONA


About me

• ASR • SMT+NN

LIMSI-CNRS, Paris

• SMT • S2S Translation

UPC, Barcelona

• SMT • CLIR

USP, São Paulo

• HMT

I2R, Singapore

• CLIR • OM

BM, Barcelona

• HMT

IPN, Mexico

• NMT • NLI • SLT

UPC, Barcelona

2004 2008 2012 2014 2015


Outline

Machine Translation and Deep Learning

Neural Machine Translation

Neural MT architecture applied to other areas
◦ NLP (Chatbot)
◦ Speech (End-to-End speech recognition, End-to-End speech translation)
◦ Image (Image captioning)

Neural MT inspired by other areas
◦ Image/NLP (Character-aware modelling)
◦ Machine Learning (Adversarial networks)

Discussion


Machine Translation

SOURCE LANGUAGE → MODEL → TARGET LANGUAGE

Model, dates and refs:

Rules, dictionaries: from the 1950s until now. Eurotra, Apertium… (Forcada, 2005)

Co-occurrences, frequency counts: from the 1990s until now. TC-Star, Moses… (Koehn, 2010)

Neural networks: starting in 2014… NEMATUS… (Cho, 2014)


Neural nets are…

Neural networks, a branch of machine learning, are a biologically-inspired programming paradigm which enables a computer to learn from observational data (http://neuralnetworksanddeeplearning.com/)


Deep learning is…

A branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using model architectures, with complex structures or otherwise, composed of multiple non-linear transformations (Wikipedia)

A set of machine learning algorithms which attempt to learn multiple-layered models of inputs, commonly neural networks (Du et al., 2013)


Neural Machine Translation


Motivation: End-to-end system

PHRASE-BASED

TRAINING: Parallel corpus, Word alignment, Phrase extraction, Translation model (finding the right target words given the source words); Monolingual corpus, Language model (ensure that translated words come in the right order)

TEST: Source language text, Preprocessing, Decoding, Postprocessing, Target language text

NEURAL

preprocessing, encoder, decoder


Related work: language modeling

Find a function that takes n-1 words as input and returns the conditional probability of the next one.

Recurrent neural networks make it possible to capture dependencies beyond the context window (via recursion).

p(I’m) p(fine|I’m) p(.|fine) EOS

I’m fine .
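As a minimal sketch of the count-based view above, the following Python estimates bigram probabilities (n = 2) from a toy corpus and scores a sentence with the chain rule. The corpus, the `<s>` start marker and all function names are illustrative assumptions, not taken from the talk.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences."""
    context_counts, bigram_counts = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["EOS"]  # "<s>" is an assumed start marker
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts

def cond_prob(word, prev, context_counts, bigram_counts):
    """Maximum-likelihood estimate of p(word | prev); 0.0 for unseen contexts."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, word)] / context_counts[prev]

def sentence_prob(tokens, context_counts, bigram_counts):
    """Chain rule: product of the conditional probabilities of each next word."""
    padded = ["<s>"] + tokens + ["EOS"]
    prob = 1.0
    for prev, word in zip(padded, padded[1:]):
        prob *= cond_prob(word, prev, context_counts, bigram_counts)
    return prob

# Toy corpus (made up for illustration).
corpus = [["I'm", "fine", "."], ["I'm", "tired", "."]]
ctx, big = train_bigram(corpus)
print(cond_prob("fine", "I'm", ctx, big))             # 0.5
print(sentence_prob(["I'm", "fine", "."], ctx, big))  # 0.5
```

A recurrent model replaces the fixed `prev` context with a hidden state that summarizes the whole history, which is what removes the window limit.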


Architecture: encoder-decoder

Encoder input: how are you ?

Decoder input: eos Cómo estás ?

Decoder output: Cómo estás ? EOS
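The relation between the decoder's input and output sequences in this example can be sketched in Python. This is only an illustration of the one-position shift used when training with teacher forcing; `make_decoder_io` is a hypothetical helper, not part of any toolkit mentioned in the talk.

```python
def make_decoder_io(target_tokens, eos="EOS"):
    """Return (decoder_inputs, decoder_outputs) for one target sentence.

    The decoder reads the target shifted right by one position, starting
    from the end-of-sentence marker, and is trained to predict the target
    followed by the end-of-sentence marker.
    """
    decoder_inputs = [eos] + list(target_tokens)
    decoder_outputs = list(target_tokens) + [eos]
    return decoder_inputs, decoder_outputs

inputs, outputs = make_decoder_io(["Cómo", "estás", "?"])
for step, (given, predicted) in enumerate(zip(inputs, outputs)):
    print(step, given, "->", predicted)
```

At test time the true previous word is not available, so the decoder is fed its own previous prediction instead.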


Attention-based mechanism

[Figure: encoder and decoder joined by an attention layer (+)]
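A minimal sketch of one attention step, assuming dot-product scoring as a simplification (the slide does not specify the scoring function, and attention-based NMT often uses a small feed-forward network instead). The vectors and function names are made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return attention weights and the weighted sum of encoder states."""
    # Assumption: dot product as the score between decoder and encoder states.
    scores = [sum(d * h for d, h in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# One toy vector per source word.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attend([2.0, 0.0], encoder_states)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

The context vector is recomputed at every decoding step, so each target word can draw on a different part of the source sentence.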


Neural MT architecture applied to other areas

NATURAL LANGUAGE PROCESSING

SPEECH

IMAGE


Natural Language Processing


Chatbot: a computer program that conducts a conversation


Chatbot models

RULE-BASED

ELIZA: decomposition rules and keyword matching (Weizenbaum, 1966)

AIML (Artificial Intelligence Markup Language) (Wallace, 1995)

<category><pattern>WHAT IS YOUR NAME</pattern><template>My name is Michael N.S Evanious.</template></category>

RETRIEVAL-BASED

IRIS: a chat-oriented dialogue system based on the vector space model framework (Banchs and Li, 2012)

https://hackernoon.com/chatbot-architecture-496f5bf820ed
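The rule-based idea behind the AIML category above can be sketched in a few lines of Python. This is an assumption-laden toy: it does exact matching after normalization only, whereas real AIML interpreters also support wildcards and other features. `RULES`, `normalize` and `respond` are hypothetical names.

```python
import re

# Hypothetical rule table: normalized pattern -> fixed template.
RULES = {
    "WHAT IS YOUR NAME": "My name is Michael N.S Evanious.",
}

def normalize(utterance):
    """Uppercase and strip punctuation so the input can match a stored pattern."""
    cleaned = re.sub(r"[^\w\s]", "", utterance.upper())
    return " ".join(cleaned.split())

def respond(utterance, default="I do not understand."):
    """Return the template of the matching rule, or a default reply."""
    return RULES.get(normalize(utterance), default)

print(respond("What is your name?"))
```

Every reply has to be written by hand, which is the limitation that retrieval-based and generative chatbots try to remove.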


Generative-based chatbots

H: what is your job?
Enc-dec: i’m a lawyer.
Human: what do you do?
Enc-dec: i’m a doctor.

This model can be trained end-to-end.

It cannot solve the problem of modeling the dialogue.

[Figure: encoder, decoder]


✓ The model can generate simple and basic conversations.

✗ The model cannot currently deliver realistic conversations.

Chatbot evaluation

Question                                                           Count

Enc-dec PREFERRED OVER Cleverbot                                   97

Cleverbot PREFERRED OVER Enc-dec                                   60

Tie                                                                20

Disagreement AMONG JUDGES (no consensus of 3 out of 4 judges)      23
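To read the counts above as proportions, a short Python calculation helps. The numbers are the ones on the slide; the dictionary keys are paraphrased labels.

```python
# Counts copied from the evaluation table; labels are paraphrased.
counts = {
    "Enc-dec preferred over Cleverbot": 97,
    "Cleverbot preferred over Enc-dec": 60,
    "Tie": 20,
    "Disagreement among judges": 23,
}

total = sum(counts.values())
shares = {outcome: 100.0 * n / total for outcome, n in counts.items()}

print("Total questions:", total)
for outcome, share in shares.items():
    print(f"{outcome}: {share:.1f}%")
```

The four outcomes add up to 200 questions, so the encoder-decoder model was preferred in 48.5% of them and Cleverbot in 30.0%.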


Chatbot examples

H: what is the purpose of being intelligent?

CleverBot: The purpose of life is to love God and love people.

Enc-dec: to find out what it is

H: What do you think about england during the reign of elizabeth?

CleverBot: I like him. Do you?

Enc-dec: it was a great place.

H: What is the color of a yellow car?

CleverBot: Brown.

Enc-dec: yellow.

H: what is your job?

CleverBot: I work in mental health

Enc-dec: i’m a lawyer.

H: what do you do?

CleverBot: I have no job.

Enc-dec: i’m a doctor.


Other NLP Tasks with Seq2seq

Text summarization: process of shortening a text document with software to create a summary with the major points of the original document.

Question Answering: automatically producing an answer to a question given a corresponding document.

Semantic Parsing: mapping natural language into a logical form that can be executed on a knowledge base and return an answer.

Syntactic Parsing: process of analysing a string of symbols, either in natural language or in computer languages, conforming to the rules of a formal grammar.


Speech Recognition


Speech Recognition system

[Figure: microphone → FEATURES → feature vector → RECOGNIZER → N-best hypotheses → DECISION → RECOGNIZED SENTENCE. Knowledge sources: lexicon, acoustic models, language models, task info.]

x = x_1 ... x_|x|    w = w_1 ... w_|w|


RNN/CNN-HMM + RNNLM

Language Model: (N-GRAM +) RNN

Acoustic Model: HMM, RNN/CNN

Phonetic inventory

Pronunciation Lexicon


Speech recognition with encoder-decoder with attention

[Figure: encoder, attention (+), decoder; labels: Language Model, Acoustic Model]


Listener

Challenge: speech signals can be hundreds to thousands of frames long.

Solution: use a pyramid BLSTM.
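The length reduction that a pyramid structure provides can be sketched in Python. The sketch only shows the time-resolution part: each layer concatenates pairs of consecutive frames, halving the sequence length, and the recurrent computation itself is omitted. The number of layers, the toy frames and the handling of an odd trailing frame are assumptions for illustration.

```python
def pyramid_reduce(frames):
    """Concatenate each pair of consecutive frames, halving the sequence length.

    An odd trailing frame is dropped here for simplicity (an assumption).
    """
    return [frames[i] + frames[i + 1] for i in range(0, len(frames) - 1, 2)]

def pyramid(frames, num_layers):
    """Apply the pairwise reduction num_layers times (the BLSTM is omitted)."""
    for _ in range(num_layers):
        frames = pyramid_reduce(frames)
    return frames

# 1000 toy frames with 2 features each; 3 layers is an illustrative choice.
frames = [[float(t), 0.0] for t in range(1000)]
reduced = pyramid(frames, num_layers=3)
print(len(frames), "->", len(reduced), "time steps;", len(reduced[0]), "features each")
```

With fewer time steps, the attention mechanism has far fewer encoder positions to weigh for every output symbol.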


Attend & Spell


End-to-end Speech-to-text

Model                  WER

CLDNN-HMM*             8.0

LAS + LM Rescoring     10.3

* Convolutional Long Short-Term Memory Fully Connected Deep Neural Network
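For readers unfamiliar with the metric in the table, WER (word error rate) is the word-level edit distance between the reference and the recognized sentence, divided by the reference length. A minimal Python sketch follows; the example sentences are made up and the function name is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """WER in percent: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 100.0 * prev[-1] / len(ref)

print(word_error_rate("how are you today", "how are you"))  # 25.0
```

Lower is better, so in the table the hybrid CLDNN-HMM system still beats the end-to-end LAS model with language model rescoring.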


End-to-end Speech-to-text Translation

Multi-task learning aims at improving the generalization performance of a task by using other related tasks.

What is new here compared to previous work? Multi-task training.

One-to-many: one encoder, multiple decoders

Spanish Speech → Speech Recognition, Speech Translation

Many-to-one: multiple encoders, one decoder

Speech Translation, Text Translation → English Text


Spanish->English FISHER/CALLHOME BLEU results

Model                      Test1    Test2

End-to-End ST              47.3     16.6

Multi-task                 48.7     17.4

ASR/NMT concatenation      45.4     16.6


Example of attention probabilities


Image


Image Captioning

A cat on the mat


Encoder-decoder with attention

[Figure: encoder, attention (+), decoder]


Captioning: Show, Attend & Tell


Results on the MSCOCO database

Method                                  BLEU

Log-Bilinear (Kiros et al., 2014a)      24.3

Enc-Dec (Vinyals et al., 2014a)         24.6

+Attention (Xu et al., 2015)            25.0


Other Computer Vision Tasks with Attention

Visual Question Answering: given an image and a natural language question about the image, the task is to provide an accurate natural language answer.

Video Caption Generation: attempts to generate a complete and natural sentence, enriching the single label as in video classification, to capture the most informative dynamics in videos.


Neural MT architecture inspired by other areas


Convolutional Neural Networks for character-aware Neural MT


German-English BLEU Results

Method     DE->EN    EN->DE

Phrase     20.99     17.04

NMT        20.64     17.15

+Char      22.10     20.22


Examples


Generative Adversarial Networks


German-to-English BLEU Results

Method                          DE->EN

Baseline (Shen et al., 2016)    25.84

+Adversarial                    27.94


German-to-English Example

Source: wir mussen verhindern , dass die menschen kenntnis erlangen von dingen , vor allem dann , wenn sie wahr sind .

Baseline: we need to prevent people who are able to know that people have to do, especially if they are true.

+Adversarial: we need to prevent people who are able to know about things, especially if they are true.

REF: we have to prevent people from finding about things, especially when they are true.


Discussion


Implementations of Encoder-Decoder

LSTM CNN


Attention-based mechanisms

Soft vs Hard: soft attention weights all pixels; hard attention crops the image and forces attention only on the kept part.

Global vs Local: a global approach always attends to all source words, and a local one only looks at a subset of source words at a time.

Intra vs External: intra attention is within the encoder’s input sentence; external attention is across sentences.
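The first two contrasts can be illustrated on a list of attention scores. This is a simplified Python sketch under stated assumptions: hard attention is shown as an argmax (the location can also be sampled), and the local window is given a fixed centre and width rather than a predicted one. All names and numbers are illustrative.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(scores):
    """Soft/global: every position receives a non-zero weight."""
    return softmax(scores)

def hard_attention(scores):
    """Hard: all weight on one position (argmax here, as a simplification)."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return [1.0 if i == best else 0.0 for i in range(len(scores))]

def local_attention(scores, center, half_width):
    """Local: softmax restricted to a window around `center`; zero elsewhere."""
    lo = max(0, center - half_width)
    hi = min(len(scores), center + half_width + 1)
    window = softmax(scores[lo:hi])
    return [0.0] * lo + window + [0.0] * (len(scores) - hi)

scores = [0.1, 2.0, 0.3, 1.5, 0.2]  # one toy score per source position
print([round(w, 3) for w in soft_attention(scores)])
print(hard_attention(scores))
print([round(w, 3) for w in local_attention(scores, center=3, half_width=1)])
```

In all three variants the weights sum to one; they differ only in how many positions are allowed to receive weight.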


One large encoder-decoder

• Text, speech, image… are they all converging to a single paradigm?

• If you know how to build a neural MT system, you may easily learn how to build a speech-to-text recognition system...

• Or you may train them together to achieve zero-shot AI.

* And other references on this research direction…


Thanks

[email protected]

WWW.COSTA-JUSSA.COM

Acknowledgements:

• Noé Casas and Carlos Escolano for their valuable feedback on the slides.

• MT-Marathon Organizers for inviting me to this exciting event.