TRANSCRIPT
How Deep Learning is making MT and other areas converge?
MARTA R. COSTA-JUSSÀ
UNIVERSITAT POLITÈCNICA DE CATALUNYA, BARCELONA
About me
2
• ASR, SMT+NN (LIMSI-CNRS, Paris)
• SMT, S2S Translation (UPC, Barcelona)
• SMT, CLIR (USP, São Paulo)
• HMT (I2R, Singapore)
• CLIR, OM (BM, Barcelona)
• HMT (IPN, Mexico)
• NMT, NLI, SLT (UPC, Barcelona)
Timeline: 2004, 2008, 2012, 2014, 2015
Outline

Machine Translation and Deep Learning
Neural Machine Translation
Neural MT architecture applied to other areas
◦ NLP (Chatbot)
◦ Speech (End-to-end speech recognition, End-to-end speech translation)
◦ Image (Image captioning)
Neural MT inspired by other areas
◦ Image/NLP (Character-aware modelling)
◦ Machine Learning (Adversarial networks)
Discussion
3
Machine Translation

SOURCE LANGUAGE → MODEL → TARGET LANGUAGE

• Rules, Dictionaries: from 1950 till now. Eurotra, Apertium… (Forcada, 2005)
• Co-occurrences, Frequency Counts: from 1990 till now. TC-Star, Moses… (Koehn, 2010)
• Neural Networks: starting in 2014… NEMATUS… (Cho, 2014)
4
Neural nets are…

Neural networks, a branch of machine learning, are a biologically-inspired programming paradigm which enables a computer to learn from observational data (http://neuralnetworksanddeeplearning.com/).
5
Deep learning is…

A branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using model architectures, with complex structures or otherwise, composed of multiple non-linear transformations (Wikipedia).

A set of machine learning algorithms which attempt to learn multiple-layered models of inputs, commonly neural networks (Du et al, 2013).
6
Neural Machine Translation
7
Motivation: End-to-end system

PHRASE-BASED
Source Language Text → Preprocessing → Decoding (Translation model + Language model) → Postprocessing → Target Language Text
• Translation model: finding the right target words given the source words; trained from a parallel corpus via word alignment and phrase extraction.
• Language model: ensures that translated words come in the right order; trained from a monolingual corpus.
(TRAINING / TEST)

NEURAL
Source Language Text → preprocessing → encoder → decoder → Target Language Text
8
Related work: language modeling
Find a function that takes as input n-1 words and returns a conditional probability of the next one.
Recurrent neural networks make it possible to capture dependencies beyond a fixed context window (via recursion).

p(I'm)  p(fine | I'm)  p(. | fine)  EOS
I'm     fine           .
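The recurrence can be sketched in a few lines: a toy Elman-style RNN language model (the random, untrained weights and the 4-word vocabulary are assumptions purely for illustration) that carries a hidden state through the sentence and emits a distribution over the next word at each step.

```python
import math
import random

random.seed(0)

VOCAB = ["<eos>", "i'm", "fine", "."]
V, H = len(VOCAB), 8  # toy vocabulary and hidden sizes

# Random toy parameters; a real model would learn these by backpropagation.
Wxh = [[random.uniform(-0.1, 0.1) for _ in range(V)] for _ in range(H)]
Whh = [[random.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(H)]
Who = [[random.uniform(-0.1, 0.1) for _ in range(H)] for _ in range(V)]

def step(h, word_id):
    """One recurrence: h_t = tanh(Wxh·x_t + Whh·h_{t-1}); softmax(Who·h_t)."""
    x = [1.0 if i == word_id else 0.0 for i in range(V)]  # one-hot input
    h_new = [math.tanh(sum(Wxh[j][i] * x[i] for i in range(V)) +
                       sum(Whh[j][k] * h[k] for k in range(H)))
             for j in range(H)]
    logits = [sum(Who[o][j] * h_new[j] for j in range(H)) for o in range(V)]
    exps = [math.exp(l) for l in logits]
    probs = [e / sum(exps) for e in exps]
    return h_new, probs

h = [0.0] * H
for w in ["i'm", "fine", "."]:
    h, probs = step(h, VOCAB.index(w))
    # probs is p(next word | all words so far), carried via the hidden state

print(len(probs), abs(sum(probs) - 1.0) < 1e-9)
```

Because the hidden state is fed back at every step, the prediction conditions on the whole history, not on a fixed n-1 word window.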
9
Architecture: encoder-decoder

encoder (input):          how are you ?
decoder (output):         Cómo estás ? EOS
decoder (input, shifted): eos Cómo estás ?
10
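A minimal sketch of this encode-then-decode loop, with toy untrained weights and vocabularies (every name and number here is illustrative, not the real system): the encoder folds the whole source into its final hidden state, and the decoder generates greedily from that state, feeding each prediction back in until an end-of-sequence symbol.

```python
import math
import random

random.seed(1)

SRC = {"how": 0, "are": 1, "you": 2, "?": 3}
TGT = ["<eos>", "cómo", "estás", "?"]
H = 6  # toy hidden size

def rand_mat(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

We = rand_mat(H, H + len(SRC))   # encoder recurrence weights
Wd = rand_mat(H, H + len(TGT))   # decoder recurrence weights
Wo = rand_mat(len(TGT), H)       # decoder output projection

def cell(W, h, x):
    """Simple recurrence: h' = tanh(W · [h; x])."""
    v = h + x
    return [math.tanh(sum(W[j][i] * v[i] for i in range(len(v))))
            for j in range(len(W))]

def one_hot(i, n):
    return [1.0 if j == i else 0.0 for j in range(n)]

# Encoder: read the source left to right; the final h summarizes the sentence.
h = [0.0] * H
for w in ["how", "are", "you", "?"]:
    h = cell(We, h, one_hot(SRC[w], len(SRC)))

# Decoder: start from the source summary, emit greedily, feed predictions back.
out, prev = [], 0  # prev = <eos> acts as the start symbol
for _ in range(5):  # length cap for the toy example
    h = cell(Wd, h, one_hot(prev, len(TGT)))
    logits = [sum(Wo[o][j] * h[j] for j in range(H)) for o in range(len(TGT))]
    prev = max(range(len(TGT)), key=lambda o: logits[o])
    if prev == 0:  # <eos> ends generation
        break
    out.append(TGT[prev])

print(out)  # with untrained weights the words are arbitrary; the loop is the point
```

Real systems replace the toy cell with LSTM/GRU layers and greedy search with beam search, but the control flow is the same.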
Attention-based mechanism
(diagram: encoder states combined by an attention module and fed to the decoder)
11
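The mechanism can be shown with plain numbers: the current decoder state is scored against every encoder state (dot-product scoring is one common choice among several), the scores are softmax-normalized into attention weights, and the weighted sum of encoder states becomes the context vector the decoder consumes at that step. The vectors below are made up for illustration.

```python
import math

# Toy encoder states, one per source word of "how are you ?".
enc_states = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.4], [0.1, 0.9]]
query = [0.1, 1.0]  # decoder state while producing some target word

# Alignment scores (dot product), then softmax into attention weights.
scores = [sum(q * e for q, e in zip(query, s)) for s in enc_states]
exps = [math.exp(s) for s in scores]
weights = [e / sum(exps) for e in exps]

# Context vector: attention-weighted sum of the encoder states. Unlike the
# plain encoder-decoder, the decoder sees a fresh summary at every step.
context = [sum(w * s[d] for w, s in zip(weights, enc_states)) for d in range(2)]

print(weights, context)
```

Note that the weights sum to 1 and concentrate on the encoder states most similar to the query, which is exactly what the alignment plots in later slides visualize.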
Neural MT architecture applied to other areas:
NATURAL LANGUAGE PROCESSING
SPEECH
IMAGE

Natural Language Processing
13
Chatbot: a computer program that conducts a conversation
14
Chatbot models

RULE-BASED
ELIZA: decomposition rules and keyword matching (Weizenbaum, 1966). AIML (Artificial Intelligence Markup Language) (Wallace, 1995).

RETRIEVAL-BASED
IRIS: a chat-oriented dialogue system based on the vector space model framework (Banchs and Li, 2012)

AIML example:
<category><pattern>WHAT IS YOUR NAME</pattern><template>My name is Michael N.S. Evanious.</template></category>

https://hackernoon.com/chatbot-architecture-496f5bf820ed
15
Generative-based chatbots

Human: what is your job?
Enc-dec: i'm a lawyer.
Human: what do you do?
Enc-dec: i'm a doctor.

This model can be trained end-to-end.
It cannot solve the problem of modeling the dialogue.
(encoder-decoder diagram)

✓ The model can generate simple and basic conversations.
✗ The model cannot currently deliver realistic conversations.
16
Chatbot evaluation

Outcome                                                   Count
Enc-dec preferred over Cleverbot                          97
Cleverbot preferred over Enc-dec                          60
Tie                                                       20
Disagreement among judges (no consensus of 3 out of 4)    23
17
Chatbot examples

Human: what is the purpose of being intelligent?
CleverBot: The purpose of life is to love God and love people.
Enc-dec: to find out what it is

Human: What do you think about england during the reign of elizabeth?
CleverBot: I like him. Do you?
Enc-dec: it was a great place.

Human: What is the color of a yellow car?
CleverBot: Brown.
Enc-dec: yellow.

Human: what is your job?
CleverBot: I work in mental health.
Enc-dec: i'm a lawyer.

Human: what do you do?
CleverBot: I have no job.
Enc-dec: i'm a doctor.
18
Other NLP Tasks with Seq2seq

Text summarization: shortening a text document with software to create a summary with the major points of the original document.
Question Answering: automatically producing an answer to a question given a corresponding document.
Semantic Parsing: mapping natural language into a logical form that can be executed on a knowledge base and return an answer.
Syntactic Parsing: analysing a string of symbols, either in natural language or in computer languages, conforming to the rules of a formal grammar.
19
Speech Recognition
20
Speech Recognition system

microphone → FEATURES → RECOGNIZER → DECISION → RECOGNIZED SENTENCE
(feature vector x = x1 … x|x|; N-best hypotheses; output w = w1 … w|w|)
Knowledge sources: Lexicon, Acoustic Models, Language models, Task info
21
RNN/CNN-HMM + RNN LM

Language Model: (n-gram +) RNN
Acoustic Model: RNN/CNN + HMM
Phonetic inventory, Pronunciation Lexicon
22
Speech recognition with encoder-decoder with attention
(diagram: the separate Language Model and Acoustic Model are replaced by a single encoder + attention + decoder)
23
Listener
Challenge: speech signals can be hundreds to thousands of frames long.
Solution: using a pyramid BLSTM.
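The pyramid idea can be sketched directly: each layer concatenates consecutive pairs of frames before passing them to the next BLSTM, halving the time resolution, so three layers shrink a 1000-frame input by 8x. The frame values below are dummies, and the real Listener interleaves actual BLSTM layers between the reductions; only the length reduction is shown.

```python
def pyramid_reduce(frames):
    """Concatenate consecutive frame pairs (dropping a trailing odd frame)."""
    return [frames[i] + frames[i + 1] for i in range(0, len(frames) - 1, 2)]

frames = [[float(t)] for t in range(1000)]  # 1000 toy 1-dimensional frames
for _ in range(3):                           # three pyramid layers
    frames = pyramid_reduce(frames)

print(len(frames), len(frames[0]))  # -> 125 8
```

Attention over 125 summary states is tractable; attention over the raw 1000 frames is what makes the naive encoder-decoder struggle on speech.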
24
Attend & Spell
25
End-to-end Speech-to-text

Model                WER
CLDNN-HMM*           8.0
LAS + LM Rescoring   10.3

*Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Network
26
End-to-end Speech-to-text Translation

Multi-task learning aims at improving the generalization performance of a task by using other related tasks.

What is new here compared to previous work? Multi-task training:
• One-to-many: one encoder (Spanish speech), multiple decoders (speech recognition, speech translation).
• Many-to-one: multiple encoders (speech translation, text translation), one decoder (English text).
27
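The one-to-many setup can be sketched as a training schedule (the alternating-batch sampling below is a common choice for illustration, not necessarily the paper's exact schedule): every sampled batch contributes gradients to the shared encoder, while only the sampled task's decoder is updated, which is how the related task regularizes the main one.

```python
import random

random.seed(0)

# One shared encoder, one decoder per task; counters stand in for updates.
shared_encoder_updates = 0
decoder_updates = {"speech_recognition": 0, "speech_translation": 0}

for step in range(100):
    task = random.choice(sorted(decoder_updates))  # sample a task per batch
    shared_encoder_updates += 1  # the encoder learns from every task
    decoder_updates[task] += 1   # only the sampled task's decoder moves

print(shared_encoder_updates, decoder_updates)
```

The encoder thus sees roughly twice as many updates as either decoder, which is the mechanism behind the BLEU gains of the multi-task row in the results table.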
Spanish->English FISHER/CALLHOME BLEU results

Model                   Test1   Test2
End-to-End ST           47.3    16.6
Multi-task              48.7    17.4
ASR/NMT concatenation   45.4    16.6
28
Example of attention probabilities
29
Image
30
Image Captioning
"A cat on the mat"
31
Encoder-decoder with attention
(diagram: CNN encoder, attention, decoder)
32
Captioning: Show, Attend & Tell
33
Results on the MS COCO database

Method                             BLEU
Log-Bilinear (Kiros et al 2014a)   24.3
Enc-Dec (Vinyals et al 2014a)      24.6
+Attention (Xu et al, 2015)        25.0
34
Other Computer Vision Tasks with Attention

Visual Question Answering: given an image and a natural language question about the image, the task is to provide an accurate natural language answer.
Video Caption Generation: attempts to generate a complete and natural sentence, enriching the single label as in video classification, to capture the most informative dynamics in videos.
35
Neural MT architecture inspired by other areas

Convolutional Neural Networks for character-aware Neural MT
37
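A character-aware word representation can be sketched as follows (the random, untrained embeddings, the filter count, and the sizes are all toy assumptions): each character is embedded, width-3 convolution filters slide over the character sequence, and max-over-time pooling keeps one feature per filter, so rare or unseen words still get a meaningful vector built from their spelling.

```python
import random

random.seed(2)

CHARS = "abcdefghijklmnopqrstuvwxyz"
DIM, FILTERS, WIDTH = 4, 5, 3  # toy embedding dim, filter count, filter width

# Random toy parameters; a real model learns these jointly with the NMT system.
emb = {c: [random.uniform(-1, 1) for _ in range(DIM)] for c in CHARS}
filters = [[random.uniform(-1, 1) for _ in range(WIDTH * DIM)]
           for _ in range(FILTERS)]

def word_vector(word):
    """Char-CNN word embedding: convolve over characters, max-pool over time."""
    chars = [emb[c] for c in word]  # only lowercase a-z in this toy example
    pooled = []
    for f in filters:
        acts = []
        for i in range(len(chars) - WIDTH + 1):
            window = chars[i] + chars[i + 1] + chars[i + 2]  # width-3 window
            acts.append(sum(w * x for w, x in zip(f, window)))
        pooled.append(max(acts))  # strongest character n-gram response
    return pooled

v = word_vector("translation")
print(len(v))  # one feature per filter -> 5
```

Because the vector depends only on spelling, morphologically related or out-of-vocabulary words share structure, which is what drives the +Char gains in the table below.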
German-English BLEU Results
38

Method   DE->EN   EN->DE
Phrase   20.99    17.04
NMT      20.64    17.15
+Char    22.10    20.22
Examples
39
Generative Adversarial Networks
40
German-to-English BLEU Results
41

Method                       DE->EN
Baseline (Shen et al 2016)   25.84
+Adversarial                 27.94
German-to-English Example
42

Source: wir mussen verhindern , dass die menschen kenntnis erlangen von dingen , vor allem dann , wenn sie wahr sind .
Baseline: we need to prevent people who are able to know that people have to do , especially if they are true .
+Adversarial: we need to prevent people who are able to know about things , especially if they are true .
REF: we have to prevent people from finding about things , especially when they are true .
Discussion
43
Implementations of Encoder-Decoder
LSTM / CNN
44
Attention-based mechanisms

Soft vs Hard: soft attention weights all pixels; hard attention crops the image and forces attention only on the kept part.
Global vs Local: a global approach always attends to all source words; a local one only looks at a subset of source words at a time.
Intra vs External: intra attention is within the encoder's input sentence; external attention is across sentences.
45
One large encoder-decoder

• Are text, speech, image… all converging to a single paradigm?
• If you know how to build a neural MT system, you may easily learn how to build a speech-to-text recognition system...
• Or you may train them together to achieve zero-shot AI.
* And other references on this research direction…
46
Thanks
WWW.COSTA-JUSSA.COM
Acknowledgements:
• Noé Casas and Carlos Escolano for their valuable feedback on the slides.
• MT-Marathon Organizers for inviting me to this exciting event.