progress on generative adversarial networksmac.xmu.edu.cn/valse2017/ppt/apr/12gan_zwm.pdf ·...

Progress on Generative Adversarial Networks

Wangmeng Zuo

Vision Perception and Cognition Centre, Harbin Institute of Technology

Content

• Image generation: problem formulation

• Three issues about GAN

• Discriminate a complex distribution from another one

• Improve the training of generator

• Reveal the connection between input and output

Image generation


• Goal: learn a generative model that transforms the input into an image from a specific distribution

• Input: an image or a variable from the input distribution

• Output: an image from the desired distribution

• One typical setting: generate an image from Gaussian noise

Image translation (Zhu et al., Arxiv 2017)

Image restoration

Super-resolution (Sajjadi et al., Arxiv 2016)

Deblocking (Guo & Chao, Arxiv 2016)

Face editing

Face hallucination (Sønderby et al., ICLR 2017)

Gender transfer (Li et al., Arxiv 2016)

Domain Adaptation

Refining synthetic images (Shrivastava et al., Arxiv 2016)

Generating realistic images from rendered images (Bousmalis et al., Arxiv 2016)

Image captioning

Dai et al., Arxiv 2017

Three issues you should know about GAN

• Goal: transform a sample or variable from the input distribution into a sample from the desired distribution

• Distribution discrepancy measurement: How to evaluate the closeness between the output and the desired distributions?

• Generation network design: How to design and train the generator?

• How to connect the input and output?

Distribution discrepancy measurement

How to evaluate the closeness between two distributions?

• KL-Divergence?

• Discriminator

• Unbiased Look at Dataset Bias (A. Torralba, A. Efros, CVPR 2011)

• P(x_s), P(x_t)

Generative Adversarial Networks (Goodfellow et al., NIPS 2014)

• Update the generator to generate more realistic images

• Improve the discriminator to discriminate the synthetic images from real ones

Generative Adversarial Nets (Goodfellow et al., NIPS 2014)
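For reference, the two-player game of Goodfellow et al. (NIPS 2014) can be written as the minimax objective below. The symbols are those of the paper, not of this talk: p_data is the real image distribution, p_z the noise prior, G the generator and D the discriminator.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator update ascends V with G fixed; the generator update descends V with D fixed.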

Mode Collapse

• D in inner loop: convergence to correct distribution

• G in inner loop: place all mass on most likely point
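One way to read the two bullets above: the order of the min and the max matters. Written with the same value function V(G, D) as in Goodfellow et al. (notation assumed here), the two problems are in general not equivalent.

```latex
\min_G \max_D V(G, D) \;\neq\; \max_D \min_G V(G, D)
```

With D in the inner loop (left side), the generator is trained against the best discriminator and is driven towards the data distribution. With G in the inner loop (right side), the generator only needs to answer a fixed discriminator and can map every input to the single point that this discriminator rates as most realistic.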

MMD for measuring distribution discrepancy

• Maximum Mean Discrepancy (MMD) (Borgwardt, Bioinformatics 2006)
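A common way to write the squared MMD between distributions p and q is sketched below. The notation is an assumption rather than the talk's: \phi is a feature map into a reproducing kernel Hilbert space \mathcal{H}, and k(x, y) = \langle \phi(x), \phi(y) \rangle is the associated kernel.

```latex
\mathrm{MMD}^2(p, q)
  = \left\| \mathbb{E}_{x \sim p}[\phi(x)] - \mathbb{E}_{y \sim q}[\phi(y)] \right\|_{\mathcal{H}}^2
  = \mathbb{E}_{x, x' \sim p}[k(x, x')]
  - 2\, \mathbb{E}_{x \sim p,\, y \sim q}[k(x, y)]
  + \mathbb{E}_{y, y' \sim q}[k(y, y')]
```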

MMD

• Choosing the kernel (feature map)

• Linear and Gaussian RBF kernel

• Multiple kernel (Gretton et al., NIPS 2012)

• Let the feature map be a CNN (Salimans et al., NIPS 2016)

• Adversarial learning

• Fix the feature map, update the generator to minimize MMD

• Fix the generator, update the feature map to maximize MMD
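As a rough illustration of the fixed-kernel case, the sketch below computes the biased sample estimate of squared MMD with a Gaussian RBF kernel in plain Python. The function names (`rbf_kernel`, `mean_kernel`, `mmd2_biased`), the bandwidth `sigma` and the toy samples are assumptions made for this example; they are not taken from the talk or from any of the cited papers.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs (x, y), x from xs and y from ys."""
    total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd2_biased(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between two samples (hypothetical helper)."""
    return (mean_kernel(xs, xs, sigma)
            + mean_kernel(ys, ys, sigma)
            - 2.0 * mean_kernel(xs, ys, sigma))

# Toy 2-D samples (made up): one "real" set and two "generated" sets.
real = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
fake_close = [[0.05, 0.0], [0.0, 0.05], [-0.05, -0.05]]
fake_far = [[3.0, 3.0], [3.1, 2.9], [2.9, 3.1]]

print("MMD^2(real, fake_close):", mmd2_biased(real, fake_close))
print("MMD^2(real, fake_far):  ", mmd2_biased(real, fake_far))
```

A generator trained to minimize this quantity would prefer `fake_close` over `fake_far`. In the adversarial variant the kernel or feature map is itself updated to make the two samples easier to tell apart.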

MMD in image generation

• Generative Moment Matching Networks (GMMN) (Li et al., ICML 2015)

Weighted GMMN

MMD in image generation

• Improved GAN (Salimans et al., NIPS 2016)

• Wasserstein GAN (Arjovsky et al., Arxiv 2017)

• Wasserstein-1 distance
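For reference, the Wasserstein-1 (earth mover's) distance between the real distribution p_r and the generated distribution p_g, together with its Kantorovich-Rubinstein dual form used in Wasserstein GAN, can be written as follows. The notation follows Arjovsky et al.: \Pi(p_r, p_g) is the set of joint distributions with marginals p_r and p_g, and the supremum runs over 1-Lipschitz functions f.

```latex
W(p_r, p_g)
  = \inf_{\gamma \in \Pi(p_r, p_g)} \mathbb{E}_{(x, y) \sim \gamma} \left[ \| x - y \| \right]
  = \sup_{\| f \|_L \le 1} \mathbb{E}_{x \sim p_r}[f(x)] - \mathbb{E}_{x \sim p_g}[f(x)]
```

In this view the critic f plays a role similar to the feature map in MMD: it is the function whose mean differs most between the two distributions.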

Use MMD to evaluate the performance of a generative model (Sutherland et al., 2016)

Design and train the generator

DCGAN (Radford et al., ICLR 2016)

• Fully convolutional networks

• Apply batch normalization (BN) to most layers, except the last layer of the generator and the first layer of the discriminator

• Two mini-batches for the discriminator are normalized separately
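A minimal sketch of the last point, assuming the two mini-batches are the real one and the generated one: each batch is normalized with its own statistics instead of statistics pooled over both. The helper `batch_norm` and the toy batches are hypothetical; a real discriminator would apply this per layer, with learned scale and shift parameters.

```python
import math

def batch_norm(batch, eps=1e-5):
    """Normalize each feature of a mini-batch to zero mean and unit variance."""
    n = len(batch)
    dims = len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(dims)]
    variances = [sum((row[j] - means[j]) ** 2 for row in batch) / n
                 for j in range(dims)]
    return [[(row[j] - means[j]) / math.sqrt(variances[j] + eps)
             for j in range(dims)]
            for row in batch]

# Toy feature batches (made up): real and generated samples differ in scale.
real_batch = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
fake_batch = [[-5.0, 0.0], [-4.0, 1.0], [-3.0, 2.0]]

# Each mini-batch uses its own mean and variance.
real_norm = batch_norm(real_batch)
fake_norm = batch_norm(fake_batch)

print(real_norm)
print(fake_norm)
```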

Stacked generator

• Zhang et al., Arxiv 2016; Huang et al., Arxiv 2016

Image enhancement: ResNet

• Super-resolution (Ledig et al., Arxiv 2016)

• Facial attribute transfer (Li et al., Arxiv 2016)

Image translation: U-Net

• Image translation (Isola et al., Arxiv 2016)

• Guided face completion (Zhao et al., 2017)

Image captioning

• Dai et al., Arxiv 2017

Connect Input and Output

InfoGAN (Chen et al., NIPS 2016)

• GAN

• InfoGAN (Chen et al., NIPS 2016)

• Input: z, c

• Interpretable and disentangled representations

• Easy to train
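For reference, the InfoGAN objective of Chen et al. adds a mutual-information term to the standard GAN value function V(D, G), so that the latent code c stays recoverable from the generated image G(z, c). The weight \lambda and the auxiliary distribution Q follow the paper's notation.

```latex
\min_G \max_D V_I(D, G) = V(D, G) - \lambda\, I\big(c;\, G(z, c)\big)
```

Since the mutual information is hard to compute directly, the paper optimizes a variational lower bound L_I(G, Q) instead:

```latex
\min_{G, Q} \max_D V_{\mathrm{InfoGAN}}(D, G, Q) = V(D, G) - \lambda\, L_I(G, Q)
```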

Perceptual loss (Li & Wand, Arxiv 2016)


Adaptive perceptual loss (Li et al., Arxiv 2016)


Conditional GAN (Isola et al., Arxiv 2016)

• Supervised GAN learning

• Positive pair: (input, ground truth)

• Negative pair: (input, synthesis)
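The positive and negative pairs correspond to the two terms of the conditional GAN loss in Isola et al. In the paper's notation, x is the input image, y the ground truth and z the noise; the discriminator sees the input together with either the ground truth or the synthesized output.

```latex
\mathcal{L}_{cGAN}(G, D) =
  \mathbb{E}_{x, y} \left[ \log D(x, y) \right]
  + \mathbb{E}_{x, z} \left[ \log \left( 1 - D(x, G(x, z)) \right) \right]
```

The paper combines this with an L1 reconstruction term, giving G^* = \arg\min_G \max_D \mathcal{L}_{cGAN}(G, D) + \lambda \mathcal{L}_{L1}(G).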

Extra Guidance (Zhao et al., 2017)


Cycle-Consistent supervision

• Shen & Liu, Arxiv 2016

• Zhu et al., Arxiv 2017

• Liu et al., Arxiv 2017

• Yi et al., Arxiv 2017
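As one concrete instance, the cycle-consistency loss in Zhu et al. (Arxiv 2017) uses two mappings, G from domain X to domain Y and F from Y back to X, and penalizes the round-trip reconstruction error in both directions. The other works listed above use related but not necessarily identical formulations.

```latex
\mathcal{L}_{cyc}(G, F) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \| F(G(x)) - x \|_1 \right]
  + \mathbb{E}_{y \sim p_{\mathrm{data}}(y)} \left[ \| G(F(y)) - y \|_1 \right]
```

This term stands in for the paired supervision of the conditional GAN when no (input, ground truth) pairs are available.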

Summary

• How to evaluate the closeness between the output and the desired distributions?

• Classifier and MMD

• How to design and train the generator?

• Problem-specific

• How to connect the input and output?

• Paired, unpaired, guidance

References

• I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, NIPS 2014.

• A. Radford, L. Metz, S. Chintala, Unsupervised representation learning with deep convolutional generative adversarial networks, ICLR 2016.

• P. Isola, J.Y. Zhu, T. Zhou, A.A. Efros, Image-to-image translation with conditional adversarial networks, Arxiv 2016.

• M. S. M. Sajjadi, B. Schölkopf, M. Hirsch, EnhanceNet: Single Image Super-Resolution through Automated Texture Synthesis, Arxiv 2016.

• C. Ledig, L. Theis, F. Huszar, J. Caballero, Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network, Arxiv 2016.

• H. Zhang, T. Xu, H. Li, S. Zhang, X. Huang, X. Wang, D. Metaxas, StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks, Arxiv 2016.

• C. K. Sønderby, J. Caballero, L. Theis, W. Shi, F. Huszár, Amortised MAP Inference for image super-resolution, ICLR 2017.

• M. Li, W. Zuo, D. Zhang, Deep Identity-aware Transfer of Facial Attributes, Arxiv 2016.

• K. Bousmalis, N. Silberman, D. Dohan, D. Erhan, D. Krishnan, Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks, Arxiv 2016.

• X. Chen, Y. Duan, R. Houthooft, J. Schulman, I. Sutskever, P. Abbeel, InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets, NIPS 2016.

• T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, X. Chen, Improved techniques for training GANs, NIPS 2016.

• H. Yan, Y. Ding, P. Li, Q. Wang, Y. Xu, W. Zuo, Mind the Class Weight Bias: Weighted Maximum Mean Discrepancy for Unsupervised Domain Adaptation, CVPR 2017.

• A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, A. Smola, A kernel two-sample test, Journal of Machine Learning Research, 2012.

• A. Gretton, D. Sejdinovic, H. Strathmann, S. Balakrishnan, M. Pontil, K. Fukumizu, B. K. Sriperumbudur, Optimal kernel choice for large-scale two-sample tests, NIPS 2012.

• Y. Li, K. Swersky, R. Zemel, Generative moment matching networks, ICML 2015.

• J.-Y. Zhu, T. Park, P. Isola, A. A. Efros, Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks, Arxiv 2017.
