ttic.uchicago.edu/~dmcallester/DeepClass/GANs.pdf
TTIC 31230, Fundamentals of Deep Learning
David McAllester, April 2017
Generative Adversarial Networks (GANs)
The Generator and The Discriminator
A GAN consists of two networks: a generator P^gen_Θ(x) and a discriminator P^disc_Ψ(y|x).

Θ* = argmax_Θ min_Ψ E_{(x,y) ∼ (D ⊎ P^gen_Θ)} [ log 1/P^disc_Ψ(y|x) ]

Here x is drawn from the data distribution D or the generator distribution P^gen_Θ with equal probability, and y = 1 if x is drawn from D and y = −1 if x is drawn from P^gen_Θ.

The discriminator tries to determine which source x came from, and the generator tries to fool the discriminator.
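As a toy numerical check of the objective (the two distributions below are made up purely for illustration), we can evaluate E[log 1/P^disc_Ψ(y|x)] for discrete x: x comes from D with probability 1/2 (labeled y = 1) and from P^gen with probability 1/2 (labeled y = −1).

```python
import math

# Hypothetical data and generator distributions over three discrete points.
D    = {0: 0.5, 1: 0.3, 2: 0.2}   # data distribution D(x)
Pgen = {0: 0.2, 1: 0.3, 2: 0.5}   # generator distribution P_gen(x)

def gan_objective(p_disc):
    """E_{(x,y)~(D ⊎ P_gen)}[log 1/P_disc(y|x)]:
    x is drawn from D (y = 1) or P_gen (y = -1), each with probability 1/2.
    p_disc[x] is the discriminator's probability that x is real."""
    total = 0.0
    for x in D:
        total += 0.5 * D[x]    * math.log(1.0 / p_disc[x])        # y = 1
        total += 0.5 * Pgen[x] * math.log(1.0 / (1 - p_disc[x]))  # y = -1
    return total

# An uninformative discriminator (always 50/50) incurs exactly log 2 loss.
uninformative = {x: 0.5 for x in D}
print(gan_objective(uninformative))  # = log 2 ≈ 0.6931
```

The discriminator minimizes this quantity, so any informative discriminator achieves a loss below log 2; the generator then pushes in the opposite direction.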
Consistency
If the discriminator is perfect, then the only way to fool it is to exactly copy the data distribution.

Consistency Theorem: If P^gen_Θ(x) and P^disc_Ψ(y|x) are both universally expressive (any distribution can be represented), then P^gen_{Θ*} = D.
DC GANs, Radford, Metz and Chintala, ICLR 2016
The Generator
Generated Bedrooms
Interpolated Faces
[Ayan Chakrabarti]
Conditional Distribution Modeling
All distribution modeling methods apply to conditional distributions.

For conditional GANs we allow the generator to take x as an input and generate a conditional value c.

Θ* = argmax_Θ min_Ψ E_{x ∼ D, (c,y) ∼ (D(c|x) ⊎ P^gen_Θ(c|x))} [ log 1/P^disc_Ψ(y|c,x) ]

Here y = 1 if c is drawn from D(c|x) and y = −1 if c is drawn from P^gen_Θ(c|x).
The Case of Imperfect Generation
Θ* = argmax_Θ min_Ψ E_{(x,y) ∼ (D ⊎ P^gen_Θ)} [ log 1/P^disc_Ψ(y|x) ]

Ψ*(Θ) = argmin_Ψ E_{(x,y) ∼ (D ⊎ P^gen_Θ)} [ log₂ 1/P_Ψ(y|x) ]

P^disc_{Ψ*(Θ)}(y = 1|x) = P(x, y = 1)/P(x) = D(x)/(D(x) + P^gen_Θ(x))
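The closed form for the optimal discriminator can be sanity-checked numerically (again with made-up discrete distributions): the discriminator P(y = 1|x) = D(x)/(D(x) + P^gen_Θ(x)) should achieve a lower log loss than any perturbation of it.

```python
import math

D    = {0: 0.5, 1: 0.3, 2: 0.2}   # hypothetical data distribution
Pgen = {0: 0.2, 1: 0.3, 2: 0.5}   # hypothetical generator distribution

def disc_loss(p):
    """Discriminator log loss (base 2) under the equal-probability mixture."""
    return sum(0.5 * D[x]    * math.log2(1.0 / p[x])
             + 0.5 * Pgen[x] * math.log2(1.0 / (1.0 - p[x])) for x in D)

# Closed-form optimum: P(y = 1|x) = D(x) / (D(x) + P_gen(x)).
p_star = {x: D[x] / (D[x] + Pgen[x]) for x in D}

# Perturbing the optimal discriminator can only increase the loss.
for eps in (0.05, -0.05):
    p_pert = {x: min(max(p_star[x] + eps, 1e-6), 1 - 1e-6) for x in D}
    assert disc_loss(p_pert) >= disc_loss(p_star)
print(disc_loss(p_star))
```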
Θ* = argmax_Θ E_{(x,y) ∼ (D ⊎ P^gen_Θ)} [ −log₂ P^disc_{Ψ*(Θ)}(y|x) ]

= argmax_Θ (1/2) E_{x ∼ D} [ log₂ (D(x) + P^gen_Θ(x))/D(x) ] + (1/2) E_{x ∼ P^gen_Θ} [ log₂ (D(x) + P^gen_Θ(x))/P^gen_Θ(x) ]

= argmax_Θ 1 − (1/2) KL(D, A) − (1/2) KL(P^gen_Θ, A)

A(x) = (1/2)(D(x) + P^gen_Θ(x))
Jensen-Shannon Divergence (JSD)
We have arrived at the Jensen-Shannon divergence.
Θ* = argmin_Θ JSD(D, P^gen_Θ)

JSD(P, Q) = (1/2) KL(P, (P + Q)/2) + (1/2) KL(Q, (P + Q)/2)

0 ≤ JSD(P, Q) = JSD(Q, P) ≤ 1
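These properties of the JSD (measured in bits, so the upper bound is 1) are easy to verify on small discrete distributions; the numbers below are arbitrary examples.

```python
import math

def kl2(p, q):
    """KL divergence in bits between two discrete distributions."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence: average KL to the midpoint mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl2(p, m) + 0.5 * kl2(q, m)

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

print(jsd(p, q))            # strictly between 0 and 1
print(jsd(p, p))            # 0.0: identical distributions
print(jsd([1, 0], [0, 1]))  # 1.0: disjoint supports hit the upper bound
```

The disjoint-support case is exactly the regime discussed next: when the generator and data distributions do not overlap, the JSD saturates at its maximum.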
The Discriminator Tends to Win
If the discriminator “wins”, the discriminator log loss goes to zero (becomes exponentially small) and there is no gradient to guide the generator.

In this case learning stops and the generator is blocked from minimizing JSD(D, P^gen_Θ).
The Standard Fix
The standard fix is to replace the loss
ℓ = −log P^disc_Ψ(y|x)

with

ℓ̃ = −y log P^disc_Ψ(1|x)

These two loss functions agree when y = 1 (the case where x is drawn from D) but are very different when x is drawn from the generator (y = −1) and P^disc_Ψ(1|x) is exponentially close to zero.
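A quick numerical comparison makes the difference concrete (the probability value below is an arbitrary example of a discriminator that has nearly “won”):

```python
import math

def loss_original(p1, y):
    """ℓ = -log P(y|x), with P(-1|x) = 1 - P(1|x)."""
    return -math.log(p1 if y == 1 else 1.0 - p1)

def loss_fixed(p1, y):
    """The standard fix: ℓ̃ = -y log P(1|x)."""
    return -y * math.log(p1)

p1 = 1e-6  # discriminator is nearly certain a generated x is fake
# For a generated sample (y = -1) the original loss is nearly zero ...
print(loss_original(p1, y=-1))  # ≈ 1e-6: almost no gradient signal
# ... while the fixed loss is large in magnitude, so the generator
# still receives a strong gradient pushing P(1|x) upward.
print(loss_fixed(p1, y=-1))     # ≈ -13.8

# The two losses agree exactly when y = 1 (x drawn from the data).
assert loss_original(0.3, 1) == loss_fixed(0.3, 1)
```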
A Margin Interpretation of the Standard Fix
The standard fix can be interpreted in terms of the “margin” of binary classification.

For y ∈ {−1, 1} we typically have s_Ψ(1|x) = −s_Ψ(−1|x), and softmax over 1 and −1 gives

P_Ψ(y|x) = 1/(1 + e^{−m})

where the margin is m = 2 y s_Ψ(x).

The margin is large when the prediction is confidently correct.
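One can check directly that the two-way softmax over scores s and −s collapses to a sigmoid of the margin m = 2ys (the score values below are arbitrary):

```python
import math

def softmax_prob(y, s):
    """Two-way softmax with scores s(1|x) = s and s(-1|x) = -s."""
    return math.exp(y * s) / (math.exp(s) + math.exp(-s))

def margin_prob(y, s):
    """P(y|x) = 1 / (1 + e^{-m}) with margin m = 2*y*s."""
    return 1.0 / (1.0 + math.exp(-2 * y * s))

# The two forms agree for any score and label.
for s in (-2.0, -0.3, 0.0, 1.5):
    for y in (1, -1):
        assert abs(softmax_prob(y, s) - margin_prob(y, s)) < 1e-12
print(margin_prob(1, 1.5))  # large positive margin -> confident correct
```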
A Margin Interpretation of the Standard Fix
In the standard fix we (essentially) take the loss to be the margin of the discriminator.

The generator wants to reduce the discriminator's margin.

The direction of the update is the same, but the step is much larger under the margin loss for generated inputs and large discriminator margins.
END