deep learning iii unsupervised learning
TRANSCRIPT
![Page 1: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/1.jpg)
DeepLearningIIIUnsupervisedLearning
RussSalakhutdinov
Machine Learning Department Carnegie Mellon University
Canadian Institute of Advanced Research
![Page 2: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/2.jpg)
UnsupervisedLearning
Non-probabilis;cModelsØ SparseCodingØ AutoencodersØ Others(e.g.k-means)
ExplicitDensityp(x)
Probabilis;c(Genera;ve)Models
TractableModelsØ Fullyobserved
BeliefNetsØ NADEØ PixelRNN
Non-TractableModelsØ BoltzmannMachinesØ Varia;onal
AutoencodersØ HelmholtzMachinesØ Manyothers…
Ø Genera;veAdversarialNetworks
Ø MomentMatchingNetworks
ImplicitDensity
![Page 3: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/3.jpg)
TalkRoadmap• BasicBuildingBlocks:
Ø SparseCodingØ Autoencoders
• DeepGenera;veModelsØ RestrictedBoltzmannMachinesØ DeepBeliefNetworksandDeepBoltzmannMachinesØ HelmholtzMachines/Varia;onalAutoencoders
• Genera;veAdversarialNetworks
• ModelEvalua;on
![Page 4: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/4.jpg)
h3
h2
h1
v
W3
W2
W1
h3
h2
h1
v
W3
W2
W1
Deep Belief Network Deep Boltzmann Machine
DBNsvs.DBMs
DBNsarehybridmodels:• InferenceinDBNsisproblema;cduetoexplainingaway.• Onlygreedypretrainig,nojointop/miza/onoveralllayers.• Approximateinferenceisfeed-forward:nobo6om-upandtop-down.
![Page 5: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/5.jpg)
Mathema;calFormula;on
h3
h2
h1
v
W3
W2
W1
modelparameters
• BoVom-upandTop-down:
DeepBoltzmannMachine
BoVom-up Top-Down
Unlikemanyexis;ngfeed-forwardmodels:ConvNet(LeCun),HMAX(Poggioet.al.),DeepBeliefNets(Hintonet.al.)
• Dependenciesbetweenhiddenvariables.• Allconnec;onsareundirected.
Input
![Page 6: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/6.jpg)
h3
h2
h1
v
W3
W2
W1
NeuralNetworkOutput
h3
h2
h1
v
W3
W2
W1
Mathema;calFormula;on
DeepBoltzmannMachine
h3
h2
h1
v
W3
W2
W1
DeepBeliefNetwork
Unlikemanyexis;ngfeed-forwardmodels:ConvNet(LeCun),HMAX(Poggio),DeepBeliefNets(Hinton)
Input
![Page 7: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/7.jpg)
h3
h2
h1
v
W3
W2
W1
h3
h2
h1
v
W3
W2
W1
Mathema;calFormula;on
DeepBoltzmannMachine DeepBeliefNetwork
h3
h2
h1
v
W3
W2
W1
Unlikemanyexis;ngfeed-forwardmodels:ConvNet(LeCun),HMAX(Poggio),DeepBeliefNets(Hinton)
inference
NeuralNetworkOutput
Input
![Page 8: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/8.jpg)
Mathema;calFormula;on
modelparameters
Maximumlikelihoodlearning:
Problem:Bothexpecta;onsareintractable!
Learningruleforundirectedgraphicalmodels:MRFs,CRFs,Factorgraphs.
• Dependenciesbetweenhiddenvariables.
DeepBoltzmannMachine
h3
h2
h1
v
W3
W2
W1
![Page 9: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/9.jpg)
ApproximateLearning
(Approximate)MaximumLikelihood:
Notfactorialanymore!
h3
h2
h1
v
W3
W2
W1
• Bothexpecta;onsareintractable!
![Page 10: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/10.jpg)
Data
ApproximateLearning
(Approximate)MaximumLikelihood:h3
h2
h1
v
W3
W2
W1
Notfactorialanymore!
![Page 11: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/11.jpg)
ApproximateLearning
(Approximate)MaximumLikelihood:
Notfactorialanymore!
h3
h2
h1
v
W3
W2
W1 Varia;onalInference
Stochas;cApproxima;on(MCMC-based)
![Page 12: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/12.jpg)
PreviousWorkManyapproachesforlearningBoltzmannmachineshavebeenproposedoverthelast20years:• HintonandSejnowski(1983),• PetersonandAnderson(1987)• Galland(1991)• KappenandRodriguez(1998)• Lawrence,Bishop,andJordan(1998)• Tanaka(1998)• WellingandHinton(2002)• ZhuandLiu(2002)• WellingandTeh(2003)• YasudaandTanaka(2009)
ManyofthepreviousapproacheswerenotsuccessfulforlearninggeneralBoltzmannmachineswithhiddenvariables.
Real-worldapplica;ons–thousandsofhiddenandobservedvariableswithmillionsofparameters.
AlgorithmsbasedonContras;veDivergence,ScoreMatching,Pseudo-Likelihood,CompositeLikelihood,MCMC-MLE,PiecewiseLearning,cannothandlemul;plelayersofhiddenvariables.
![Page 13: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/13.jpg)
NewLearningAlgorithm
Condi;onal Uncondi;onal
PosteriorInference SimulatefromtheModel
Approximatecondi;onal
Approximatethejointdistribu;on
(Salakhutdinov, 2008; NIPS 2009)
![Page 14: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/14.jpg)
Condi;onal Uncondi;onal
PosteriorInference SimulatefromtheModel
Approximatethejointdistribu;on
Data-dependent
Approximatecondi;onal
NewLearningAlgorithm
Data-independent
density Match
(Salakhutdinov, 2008; NIPS 2009)
![Page 15: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/15.jpg)
Condi;onal Uncondi;onal
PosteriorInference
Approximatethejointdistribu;on
Data-dependent
Approximatecondi;onal
NewLearningAlgorithm
Data-independent
Match
KeyIdea:
MarkovChainMonteCarlo
Data-dependent:Varia/onalInference,mean-fieldtheoryData-independent:Stochas/cApproxima/on,MCMCbased
Mean-Field
SimulatefromtheModel
![Page 16: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/16.jpg)
h2
h1
v
Timet=1
Stochas;cApproxima;on
Update Updateh2
h1
v
t=2h2
h1
v
t=3
• Generate bysimula;ngfromaMarkovchainthatleavesinvariant(e.g.GibbsorM-Hsampler)
• Update byreplacingintractable withapointes;mate
Inprac;cewesimulateseveralMarkovchainsinparallel.(Robbins and Monro, Ann. Math. Stats, 1957; L. Younes, Probability Theory 1989)
Updateandsequen;ally,where
![Page 17: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/17.jpg)
LearningAlgorithmUpdateruledecomposes:
Truegradient Perturba;ontermAlmostsureconvergenceguaranteesaslearningrate
Problem:High-dimensionaldata:theprobabilitylandscapeishighlymul;modal.
Connec;onstothetheoryofstochas;capproxima;onandadap;veMCMC.
Keyinsight:Thetransi;onoperatorcanbeanyvalidtransi;onoperator–TemperedTransi;ons,Parallel/SimulatedTempering.
MarkovChainMonteCarlo
![Page 18: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/18.jpg)
PosteriorInference
Mean-Field
Varia;onalInferenceApproximateintractabledistribu;onwithsimpler,tractabledistribu;on :
Varia;onalLowerBound
MinimizeKLbetweenapproxima;ngandtruedistribu;onswithrespecttovaria;onalparameters.
![Page 19: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/19.jpg)
PosteriorInference
Mean-Field
Varia;onalInferenceApproximateintractabledistribu;onwithsimpler,tractabledistribu;on :
Mean-Field:Chooseafullyfactorizeddistribu;on:
with
Nonlinearfixed-pointequa;ons:
Varia/onalInference:Maximizethelowerboundw.r.t.Varia;onalparameters.
Varia;onalLowerBound
![Page 20: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/20.jpg)
PosteriorInference
Mean-Field
Varia;onalInferenceApproximateintractabledistribu;onwithsimpler,tractabledistribu;on :
1.Varia/onalInference:Maximizethelowerboundw.r.t.varia;onalparameters
MarkovChainMonteCarlo
2.MCMC:Applystochas;capproxima;ontoupdatemodelparameters
Almostsureconvergenceguaranteestoanasympto;callystablepoint.
Uncondi;onalSimula;onVaria;onalLowerBound
![Page 21: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/21.jpg)
PosteriorInference
Mean-Field
Varia;onalInferenceApproximateintractabledistribu;onwithsimpler,tractabledistribu;on :
1.Varia/onalInference:Maximizethelowerboundw.r.t.varia;onalparameters
MarkovChainMonteCarlo
2.MCMC:Applystochas;capproxima;ontoupdatemodelparameters
Almostsureconvergenceguaranteestoanasympto;callystablepoint.
Uncondi;onalSimula;on
FastInference
Learningcanscaletomillionsofexamples
Varia;onalLowerBound
![Page 22: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/22.jpg)
GoodGenera;veModel?HandwriVenCharacters
![Page 23: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/23.jpg)
GoodGenera;veModel?HandwriVenCharacters
![Page 24: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/24.jpg)
GoodGenera;veModel?HandwriVenCharacters
RealDataSimulated
![Page 25: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/25.jpg)
GoodGenera;veModel?HandwriVenCharacters
RealData Simulated
![Page 26: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/26.jpg)
GoodGenera;veModel?HandwriVenCharacters
![Page 27: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/27.jpg)
GoodGenera;veModel?MNISTHandwriVenDigitDataset
![Page 28: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/28.jpg)
Handwri;ngRecogni;on
LearningAlgorithm Error
Logis;cregression 12.0%K-NN 3.09%NeuralNet(PlaV2005) 1.53%SVM(Decosteet.al.2002) 1.40%DeepAutoencoder(Bengioet.al.2007)
1.40%
DeepBeliefNet(Hintonet.al.2006)
1.20%
DBM 0.95%
LearningAlgorithm Error
Logis;cregression 22.14%K-NN 18.92%NeuralNet 14.62%SVM(Larochelleet.al.2009) 9.70%DeepAutoencoder(Bengioet.al.2007)
10.05%
DeepBeliefNet(Larochelleet.al.2009)
9.68%
DBM 8.40%
MNISTDataset Op;calCharacterRecogni;on60,000examplesof10digits 42,152examplesof26EnglishleVers
Permuta;on-invariantversion.
![Page 29: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/29.jpg)
Genera;veModelof3-DObjects
24,000examples,5objectcategories,5differentobjectswithineachcategory,6lightningcondi;ons,9eleva;ons,18azimuths.
![Page 30: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/30.jpg)
3-DObjectRecogni;on
LearningAlgorithm ErrorLogis;cregression 22.5%K-NN(LeCun2004) 18.92%SVM(Bengio&LeCun2007) 11.6%DeepBeliefNet(Nair&Hinton2009)
9.0%
DBM 7.2%
PaVernComple;on
Permuta;on-invariantversion.
Whereelsecanweusegenera;vemodels?
![Page 31: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/31.jpg)
Data–Collec;onofModali;es• Mul;mediacontentontheweb-image+text+audio.
• Productrecommenda;onsystems.
• Robo;csapplica;ons.
AudioVision
TouchsensorsMotorcontrol
sunset,pacificocean,bakerbeach,seashore,ocean
car,automobile
![Page 32: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/32.jpg)
Challenges-I
Verydifferentinputrepresenta;ons
Image Text
sunset,pacificocean,bakerbeach,seashore,
ocean • Images–real-valued,dense
Difficulttolearncross-modalfeaturesfromlow-levelrepresenta;ons.
Dense
• Text–discrete,sparse
Sparse
![Page 33: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/33.jpg)
Challenges-II
Noisyandmissingdata
Image Textpentax,k10d,pentaxda50200,kangarooisland,sa,australiansealion
mickikrimmel,mickipedia,headshot
unseulpixel,naturey
<notext>
![Page 34: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/34.jpg)
Challenges-IIImage Text Textgeneratedbythemodel
beach,sea,surf,strand,shore,wave,seascape,sand,ocean,waves
portrait,girl,woman,lady,blonde,preVy,gorgeous,expression,model
night,noVe,traffic,light,lights,parking,darkness,lowlight,nacht,glow
fall,autumn,trees,leaves,foliage,forest,woods,branches,path
pentax,k10d,pentaxda50200,kangarooisland,sa,australiansealion
mickikrimmel,mickipedia,headshot
unseulpixel,naturey
<notext>
![Page 35: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/35.jpg)
ASimpleMul;modalModel• Useajointbinaryhiddenlayer.• Problem:Inputshaveverydifferentsta;s;calproper;es.
• Difficulttolearncross-modalfeatures.
0010
0Real-valued
1-of-K
![Page 36: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/36.jpg)
0010
0Dense,real-valuedimagefeatures
GaussianmodelReplicatedSoumax
Mul;modalDBM
Wordcounts
(Srivastava & Salakhutdinov, NIPS 2012, JMLR 2014)
![Page 37: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/37.jpg)
Mul;modalDBM
0010
0Dense,real-valuedimagefeatures
GaussianmodelReplicatedSoumax
Wordcounts
(Srivastava & Salakhutdinov, NIPS 2012, JMLR 2014)
![Page 38: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/38.jpg)
GaussianmodelReplicatedSoumax
0010
0
Mul;modalDBM
Wordcounts
Dense,real-valuedimagefeatures
(Srivastava & Salakhutdinov, NIPS 2012, JMLR 2014)
![Page 39: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/39.jpg)
0010
0Dense,real-valuedimagefeatures
Wordcounts
GaussianmodelReplicatedSoumax
Mul;modalDBM
BoVom-up+
Top-down
(Srivastava & Salakhutdinov, NIPS 2012, JMLR 2014)
![Page 40: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/40.jpg)
TextGeneratedfromImages
canada,nature,sunrise,ontario,fog,mist,bc,morning
insect,buVerfly,insects,bug,buVerflies,lepidoptera
graffi;,streetart,stencil,s;cker,urbanart,graff,sanfrancisco
portrait,child,kid,ritraVo,kids,children,boy,cute,boys,italy
dog,cat,pet,kiVen,puppy,ginger,tongue,kiVy,dogs,furry
sea,france,boat,mer,beach,river,bretagne,plage,briVany
Given Generated Given Generated
![Page 41: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/41.jpg)
TextGeneratedfromImagesGiven Generated
water,glass,beer,boVle,drink,wine,bubbles,splash,drops,drop
portrait,women,army,soldier,mother,postcard,soldiers
obama,barackobama,elec;on,poli;cs,president,hope,change,sanfrancisco,conven;on,rally
![Page 42: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/42.jpg)
Genera;ngTextfromImages
Samplesdrawnauerevery50stepsofGibbsupdates
![Page 43: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/43.jpg)
MIR-FlickrDataset
(Huiskes et al., 2010)
• 1millionimagesalongwithuser-assignedtags.
sculpture,beauty,stone
nikon,green,light,photoshop,apple,d70
white,yellow,abstract,lines,bus,graphic
sky,geotagged,reflec;on,cielo,bilbao,reflejo
food,cupcake,vegan
d80
anawesomeshot,theperfectphotographer,flash,damniwishidtakenthat,spiritofphotography
nikon,abigfave,goldstaraward,d80,nikond80
![Page 44: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/44.jpg)
Results• Logis;cregressionontop-levelrepresenta;on.• Mul;modalInputs
LearningAlgorithm MAP Precision@50
Random 0.124 0.124LDA[Huiskeset.al.] 0.492 0.754SVM[Huiskeset.al.] 0.475 0.758DBM-Labelled 0.526 0.791DeepBeliefNet 0.638 0.867Autoencoder 0.638 0.875DBM 0.641 0.873
MeanAveragePrecision
Labeled25Kexamples
+1Millionunlabelled
![Page 45: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/45.jpg)
TalkRoadmap• BasicBuildingBlocks:
Ø SparseCodingØ Autoencoders
• DeepGenera;veModelsØ RestrictedBoltzmannMachinesØ DeepBeliefNetwork,DeepBoltzmannMachinesØ HelmholtzMachines/Varia;onalAutoencoders
• Genera;veAdversarialNetworks
• ModelEvalua;on
![Page 46: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/46.jpg)
HelmholtzMachines• Hinton,G.E.,Dayan,P.,Frey,B.J.andNeal,R.,Science1995
Inputdata
h3
h2
h1
v
W3
W2
W1
Genera;veProcessApproximate
Inference
• Kingma&Welling,2014
• Rezende,Mohamed,Daan,2014
• Mnih&Gregor,2014
• Bornschein&Bengio,2015
• Tang&Salakhutdinov,2013
![Page 47: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/47.jpg)
HelmholtzMachines,DBNs,DBMs
h3
h2
h1
v
W3
W2
W1
h3
h2
h1
v
W3
W2
W1
Deep Boltzmann Machine
Helmholtz Machine
h3
h2
h1
v
W3
W2
W1
Deep Belief Network
![Page 48: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/48.jpg)
Varia;onalAutoencoders(VAEs)• TheVAEdefinesagenera;veprocessintermsofancestralsamplingthroughacascadeofhiddenstochas;clayers:
h3
h2
h1
v
W3
W2
W1
Eachtermmaydenoteacomplicatednonlinearrela;onship
• Samplingandprobabilityevalua;onistractableforeach.
Genera;veProcess
• denotesparametersofVAE.
• isthenumberofstochas/clayers.
Inputdata
![Page 49: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/49.jpg)
VAE:Example• TheVAEdefinesagenera;veprocessintermsofancestralsamplingthroughacascadeofhiddenstochas;clayers:
Thistermdenotesaone-layerneuralnet.
Determinis;cLayer
Stochas;cLayer
Stochas;cLayer
• denotesparametersofVAE.
• Samplingandprobabilityevalua;onistractableforeach.
• isthenumberofstochas/clayers.
![Page 50: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/50.jpg)
Varia;onalBound• TheVAEistrainedtomaximizethevaria;onallowerbound:
Inputdata
h3
h2
h1
v
W3
W2
W1
• Hardtoop;mizethevaria;onalboundwithrespecttotherecogni;onnetwork(high-variance).
• KeyideaofKingmaandWellingistousereparameteriza;ontrick.
• Tradingoffthedatalog-likelihoodandtheKLdivergencefromthetrueposterior.
![Page 51: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/51.jpg)
Reparameteriza;onTrick• Assumethattherecogni;ondistribu;onisGaussian:
withmeanandcovariancecomputedfromthestateofthehiddenunitsatthepreviouslayer.
• Alterna;vely,wecanexpressthisintermofauxiliaryvariable:
![Page 52: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/52.jpg)
• Assumethattherecogni;ondistribu;onisGaussian:
• Or
Determinis;cEncoder
• Therecogni;ondistribu;oncanbeexpressedintermsofadeterminis;cmapping:
Distribu;onofdoesnotdependon
Reparameteriza;onTrick
![Page 53: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/53.jpg)
Compu;ngtheGradients• Thegradientw.r.ttheparameters:bothrecogni;onandgenera;ve:
Gradientscanbecomputedbybackprop
Themappinghisadeterminis;cneuralnetforfixed.
Autoencoder
![Page 54: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/54.jpg)
ImportanceWeightedAutoencoders• CanimproveVAEbyusingfollowingk-sampleimportanceweigh;ngofthelog-likelihood:
wherearesampledfromtherecogni;onnetwork.
Inputdata
h3
h2
h1
v
W3
W2
W1
unnormalizedimportanceweights
(Burda, Grosse, Salakhutdinov, ICLR 2016)
![Page 55: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/55.jpg)
ImportanceWeightedAutoencoders• CanimproveVAEbyusingfollowingk-sampleimportanceweigh;ngofthelog-likelihood:
• Thisisalowerboundonthemarginallog-likelihood:
• SpecialCaseofk=1:SameasstandardVAEobjec;ve.
• UsingmoresamplesàImprovesthe;ghtnessofthebound.
![Page 56: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/56.jpg)
TighterLowerBound
• Forallk,thelowerboundssa;sfy:
• Usingmoresamplescanonlyimprovethe;ghtnessofthebound.
• Moreoverifisbounded,then:
![Page 57: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/57.jpg)
Compu;ngtheGradients• Wecanusetheunbiasedes;mateofthegradientusingreparameteriza;ontrick:
wherewedefinenormalizedimportanceweights:
![Page 58: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/58.jpg)
IWAEsvs.VAEs• Drawk-samplesformtherecogni;onnetwork
- ork-setsofauxiliaryvariables.• ObtainthefollowingMonteCarloes;mateofthegradient:
• ComparethistotheVAE’ses;mateofthegradient:
![Page 59: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/59.jpg)
Mo;va;ngExample• Canwegenerateimagesfromnaturallanguagedescrip;ons?
Astopsignisflyinginblueskies
Apaleyellowschoolbusisflyinginblueskies
Aherdofelephantsisflyinginblueskies
Alargecommercialairplaneisflyinginblueskies
(Mansimov, Parisotto, Ba, Salakhutdinov, 2015)
![Page 60: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/60.jpg)
Genera;ngImagesfromCap;ons
• Genera;veModel:Stochas;cRecurrentNetwork,chainedsequenceofVaria;onalAutoencoders,withasinglestochas;clayer.
• Recogni;onModel:Determinis;cRecurrentNetwork.
Stochas;cLayer
(Gregor et al., 2015)
![Page 61: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/61.jpg)
FlippingColorsAyellowschoolbusparkedintheparkinglot
Aredschoolbusparkedintheparkinglot
Agreenschoolbusparkedintheparkinglot
Ablueschoolbusparkedintheparkinglot
![Page 62: Deep Learning III Unsupervised Learning](https://reader030.vdocuments.net/reader030/viewer/2022012414/616ef0fe5f3a4046f92bd9b1/html5/thumbnails/62.jpg)
NovelSceneComposi;onsAtoiletseatsitsopeninthebathroom
AskGoogle?
Atoiletseatsitsopeninthegrassfield
BloombergNews