deep learning on computer vision...

National Taiwan University

Dept. of EE, Research Assistant

Yen-Cheng Liu

Deep Learning on Computer Vision Applications

Outline

• Convolutional Neural Network (CNN)

• Applications

• Super-resolution

• Compressed Artifact Reduction

• Reconstruct Compressed Image

• Context Inpainting

• Colorization

• Sketch Simplification

• Style Transfer

• Depth Estimation

• Semantic Segmentation

• Standard Model

• Image-to-Image Model

No Math in this tutorial

How I feel when no math appears in a paper

Convolutional Neural Network

• CNN History

– In the 1990s, CNNs were a dominant tool, but they fell out of fashion, particularly in computer vision, with the rise of support vector machines (SVMs).

– In 2012, CNNs became popular again after their significant success on the ILSVRC (ImageNet Large Scale Visual Recognition Challenge).

• Standard structure

Convolution layers and pooling layers, followed by fully connected layers

Convolutional Neural Network

• Components

– Convolution layers, pooling layers, and fully connected layers

– Purpose: originally for classification (e.g., LeNet)

Convolution Layer

Image Credit: Stanford CS231n

Pooling Layer Fully-Connected Layer

Input Feature Map Filters

Output Feature Map
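To make the convolution and pooling layers concrete, here is a minimal pure-Python sketch. The 5x5 input, the 2x2 filter, and the function names `conv2d` and `max_pool2d` are illustrative assumptions, not taken from the slides; a real CNN learns its filter values during training.

```python
def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation, as in most CNN libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Input feature map: a 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Filter: responds to left-to-right intensity increases (hypothetical values).
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)   # 4x4 output feature map
pooled = max_pool2d(feature_map)      # 2x2 after pooling
```

The output feature map is non-zero only where the filter overlaps the edge, and pooling keeps that response while halving the spatial size.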

Standard CNN Model

Convolution layers and pooling layers Fully connected layers

Input Output

What you did yesterday……

Ground Truth Label

+Human Machine

What you did yesterday……

“9” “5” “2” “7”

Recognition

Output

Standard CNN Model - Example

Object Datasets

Input Output

“Dog”

Object Recognition

Face Datasets

Input Output

“柯P”

Face Recognition

X Datasets

+ Label

X Recognition
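The last stage of the standard model, turning extracted features into a label such as “Dog”, can be sketched as one fully connected layer followed by a softmax. All numbers, the three class names, and the function names below are hypothetical and chosen only for illustration.

```python
import math


def fully_connected(features, weights, biases):
    """One fully connected layer: a weighted sum per output class."""
    return [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values: a flattened 4-element feature vector and 3 classes.
labels = ["Dog", "Cat", "Bird"]
features = [1.0, 0.0, 2.0, 1.0]
weights = [[0.5, 0.1, 0.8, 0.2],    # weights for "Dog"
           [0.1, 0.9, 0.1, 0.1],    # weights for "Cat"
           [0.2, 0.2, 0.1, 0.3]]    # weights for "Bird"
biases = [0.0, 0.0, 0.0]

probs = softmax(fully_connected(features, weights, biases))
prediction = labels[probs.index(max(probs))]
```

With these made-up weights the “Dog” score is the largest, so the predicted label is “Dog”. Changing the dataset and labels changes what the same structure recognizes.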

Jonathan Long Evan Shelhamer Trevor Darrell

(from UC Berkeley)

Image-to-Image Model

[Figure: Input; Convolution layers; Fully connected layers; Up-sampling; Output]
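In an image-to-image model the output has to be an image again, so the reduced feature maps are brought back to the input resolution by up-sampling. The sketch below uses nearest-neighbor up-sampling, which is only one possible choice and is assumed here for simplicity; learned up-sampling layers are common in practice. The function names are illustrative.

```python
def downsample(image, factor=2):
    """Keep every `factor`-th pixel, as a stand-in for pooling or striding."""
    return [row[::factor] for row in image[::factor]]


def upsample_nearest(image, factor=2):
    """Nearest-neighbor up-sampling: repeat each pixel factor x factor times."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

small = downsample(image)            # 2x2 map
restored = upsample_nearest(small)   # back to 4x4, same size as the input
```

The restored map has the input's size but not its detail, which is why such models learn the layers around the up-sampling step.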

Applications

Super-resolution

Image Credit: Wei-Sheng Lai @ UC Merced

Super-resolution

• Input: Low-resolution image Y

• Output: High-resolution image F(Y)

Image Credit: Wei-Sheng Lai @ UC Merced
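As a point of reference for the mapping F, the sketch below uses plain bilinear interpolation as a non-learned stand-in. This is an assumption for illustration only: the networks cited on this slide learn F from data, and the function name `bilinear_upscale` is hypothetical.

```python
def bilinear_upscale(image, scale):
    """Upscale a 2D grayscale image by an integer factor (bilinear, corner-aligned)."""
    h, w = len(image), len(image[0])
    out_h, out_w = h * scale, w * scale
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional input row.
        y = i * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        fy = y - y0
        row = []
        for j in range(out_w):
            x = j * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            fx = x - x0
            # Blend the four neighboring input pixels.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


Y = [[0.0, 1.0],
     [1.0, 2.0]]              # low-resolution input
F_Y = bilinear_upscale(Y, 2)  # 4x4 output
```

Interpolation can only smooth between existing pixels; a learned F is meant to recover detail that interpolation cannot.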

[1] “Deep Laplacian Pyramid Networks for Fast and Accurate Super-Resolution”, Lai et al., CVPR ‘17

[2] “Image super-resolution using deep convolutional networks”, Dong et al., TPAMI ‘16

[3] “Accelerating the super-resolution convolutional neural network”, Dong et al., ECCV ‘16

[4] “Accurate image super-resolution using very deep convolutional network”, Kim et al., CVPR ‘16

[5] “Deeply-recursive convolutional network for image super-resolution”, Kim et al., CVPR ‘16

[6] “Learning a Deep Convolutional Network for Image Super-Resolution”, Dong et al., ECCV ‘14

Super-resolution

Reconstructing Compressed Image

• Kulkarni et al. [7] present a non-iterative and extremely fast algorithm to reconstruct images from compressively sensed (CS) random measurements

[7] “ReconNet: Non-Iterative Reconstruction of Images from Compressively Sensed Random Measurements”, Kulkarni et al., CVPR ‘16
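The "random measurements" that such a network receives can be sketched as y = Φx, where x is a flattened image block and Φ is a random matrix with far fewer rows than columns. The block size, the Gaussian entries, the seed, and the function names below are assumptions for illustration; the reconstruction network itself is not shown.

```python
import random


def random_measurement_matrix(num_measurements, signal_length, seed=0):
    """Gaussian random measurement matrix (each row is one random projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(signal_length)]
            for _ in range(num_measurements)]


def measure(phi, x):
    """Compressive measurement y = Phi x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


# A hypothetical 4x4 image block, flattened to a length-16 vector.
block = [float(v) for v in range(16)]
phi = random_measurement_matrix(num_measurements=4, signal_length=16)
y = measure(phi, block)

measurement_rate = len(y) / len(block)  # 4 / 16 = 0.25
```

A reconstruction method has to map the 4 numbers in y back to the 16 pixels of the block, which is the step a non-iterative network performs in a single forward pass.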

Image Artifacts Removal

• Compression artifacts can be removed by using ARCNN [8]

[8] “Compression Artifacts Reduction by a Deep Convolutional Network”, Dong et al., ICCV ‘15

Context Inpainting

• Pathak et al. [9] generate the contents of an arbitrary image region conditioned on its surroundings using a CNN

[9] “Context Encoders: Feature Learning by Inpainting”, Pathak et al., CVPR ‘16

Context Inpainting

• Pathak et al. [9] generate the contents of an arbitrary image region conditioned on its surroundings using a Generative Adversarial Network (GAN)
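Training an inpainting network needs a loss that judges only the region the network had to fill in. Below is a minimal sketch of such a masked reconstruction loss; the 3x3 values and the function name are hypothetical, and the adversarial (GAN) part of the training objective is not shown.

```python
def masked_l2_loss(prediction, target, mask):
    """Mean squared error computed only where mask == 1 (the missing region)."""
    total, count = 0.0, 0
    for pred_row, tgt_row, mask_row in zip(prediction, target, mask):
        for p, t, m in zip(pred_row, tgt_row, mask_row):
            if m:
                total += (p - t) ** 2
                count += 1
    return total / count if count else 0.0


target = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0]]
# 1 marks the dropped-out region the network must fill in.
mask = [[0, 0, 0],
        [0, 1, 1],
        [0, 0, 0]]
# Hypothetical network output: off by 1.0 inside the hole, arbitrary outside.
prediction = [[0.0, 0.0, 0.0],
              [0.0, 6.0, 7.0],
              [0.0, 0.0, 0.0]]

loss = masked_l2_loss(prediction, target, mask)  # (1 + 1) / 2 = 1.0
```

Pixels outside the mask do not affect the loss, so the network is judged only on the region it generated.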

Context Inpainting

Face Inpainting

• Input: Corrupted Facial Image

• Output: Complete Facial Image

[10] “Generative Face Completion”, Li et al., CVPR ‘17

[11] “DeMeshNet: Blind Face Inpainting for Deep MeshFace Verification”, Zhang et al., CVPR ‘17

[Figure: Input and Output face pairs]

Face Rotation

• Input: Facial Image

• Output: Facial Image with given angle

[12] “Rotating Your Face Using Multi-task Deep Neural Network”, Yim et al., CVPR ‘15

[13] “Disentangled Representation Learning GAN for Pose-Invariant Face Recognition”, Tran et al., CVPR ‘17

Attribute Manipulation

[14] “StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation”, Choi et al., CVPR ‘18

Colorization

• Cheng et al. [15] investigate the colorization problem, which converts a grayscale image into a color one

[15] “Deep Colorization”, Cheng et al., ICCV ‘15

Colorization

• Iizuka et al. [16] propose a technique to automatically colorize grayscale images that combines both global priors and local image features.

– Global image priors are extracted from the entire image

– Local image features are computed from small image patches

[16] “Let there be Color!: Joint End-to-end Learning of Global and Local Image Priors for Automatic Image Colorization with Simultaneous Classification”, Iizuka et al., SIGGRAPH ‘16
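One simple way to combine a global prior with local features is to append the same global vector to the local feature at every spatial position. The sketch below assumes that scheme for illustration; the feature sizes and the function name `fuse_global_local` are hypothetical, and a trained model would follow the concatenation with learned layers.

```python
def fuse_global_local(local_features, global_vector):
    """Append the same global feature vector to the local features at every position."""
    return [
        [list(local) + list(global_vector) for local in row]
        for row in local_features
    ]


# Hypothetical sizes: a 2x2 grid of 3-dimensional local features
# and one 2-dimensional global feature for the whole image.
local_features = [[[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
                  [[0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]]
global_vector = [9.0, 8.0]

fused = fuse_global_local(local_features, global_vector)
```

After fusion every position carries 5 values: its own local description plus a summary of the whole scene, so a color decision for one patch can depend on what kind of image it belongs to.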

Sketch Simplification

• Simo-Serra et al. [17] propose a CNN structure to simplify sketch drawings

• This architecture can process images of any resolution because it is a fully convolutional network

Input Image Output Image

[17] “Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup”, Simo-Serra et al., SIGGRAPH ‘16
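The reason a fully convolutional network accepts any resolution is that a convolution filter has a fixed size and simply slides over whatever input it is given. The sketch below applies one zero-padded 3x3 filter to two inputs of different sizes; the averaging filter stands in for learned weights and the function name is an assumption.

```python
def conv2d_same(image, kernel):
    """3x3 convolution with zero padding, so the output keeps the input size."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-1, 2):
                for dj in range(-1, 2):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:  # pixels outside count as zero
                        acc += image[y][x] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


# A hypothetical 3x3 averaging filter standing in for learned weights.
kernel = [[1 / 9] * 3 for _ in range(3)]

small = [[1.0] * 4 for _ in range(3)]    # 3x4 input
large = [[1.0] * 10 for _ in range(7)]   # 7x10 input

out_small = conv2d_same(small, kernel)   # 3x4 output
out_large = conv2d_same(large, kernel)   # 7x10 output
```

The same nine weights handle both inputs. A fully connected layer, by contrast, has one weight per input element and therefore fixes the input size.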

Sketch Simplification

• Simo-Serra et al. [17] propose a CNN structure to simplify sketch drawings

• More challenging input: rough raster images instead of vector images

[17] “Learning to Simplify: Fully Convolutional Networks for Rough Sketch Cleanup”, Simo-Serra et al., SIGGRAPH ‘16

Sketch-to-Photo Inversion

[18] “Scribbler: Controlling Deep Image Synthesis with Sketch and Color”, Sangkloy et al., CVPR ‘17

Output

Attribute Manipulation + Style Transfer

[19] “Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation”, Liu et al., CVPR ‘18

No Label Supervision

Output

Artistic Image Style Transfer

• Gatys et al. [20] propose a system that uses neural representations to separate and recombine the content and style of arbitrary images

[20] “A Neural Algorithm of Artistic Style”, Gatys et al., arXiv ‘15

• Based on the VGG-19 network

• Extract the feature maps of a single photo and an artwork to generate an image that mixes the content of the photo with the style of the artwork

[Figure: VGG-19; Content / Style Representation]

[20] “A Neural Algorithm of Artistic Style”, Gatys et al., arXiv ‘15
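In Gatys et al., content is represented by the feature maps themselves, while style is represented by correlations between feature maps, collected in a Gram matrix. Here is a minimal sketch of that style representation; the two 2x2 feature maps are made-up values rather than real VGG-19 activations, and the normalization used in practice is omitted.

```python
def gram_matrix(feature_maps):
    """Gram matrix of C feature maps: entry (a, b) is the inner product
    of flattened map a with flattened map b."""
    flat = [[v for row in fmap for v in row] for fmap in feature_maps]
    return [
        [sum(x * y for x, y in zip(fa, fb)) for fb in flat]
        for fa in flat
    ]


# Two hypothetical 2x2 feature maps (channels) from one layer.
features = [
    [[1.0, 0.0],
     [0.0, 1.0]],
    [[1.0, 1.0],
     [1.0, 1.0]],
]

G = gram_matrix(features)  # [[2.0, 2.0], [2.0, 4.0]]
```

Because each entry sums over all positions, the Gram matrix records which features occur together but not where. That makes it a description of texture and style instead of layout.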

• Feed-forward CNN model (about 1000x faster than Gatys et al.)

• The loss network is based on a pre-trained VGG-16 network

[21] “Perceptual losses for real-time style transfer and super-resolution”, Johnson et al., ECCV ‘16

Artistic Video Style Transfer

https://www.youtube.com/watch?v=Khuj4ASldmU

[22] “Artistic style transfer for videos”, Ruder et al., arXiv ‘16

Artistic 360 Video Style Transfer

https://www.youtube.com/watch?v=pkgMUfNeUCQ

[23] “Artistic style transfer for videos and spherical images”, Ruder et al., arXiv ‘17

Deep Photo Enhancer

[24] “Deep Photo Enhancer: Unpaired Learning for Image Enhancement from Photographs with GANs”, Chen et al., CVPR ‘18

Single Image Depth Estimation

• Depth Estimation CNN

– Input: RGB image

– Output: Depth/Disparity Estimation

[25] “Unsupervised Monocular Depth Estimation with Left-Right Consistency”, Godard et al., CVPR ‘17

[26] “Unsupervised Learning of Depth and Ego-motion from Video”, Zhou et al., CVPR ‘17

[27] “Semi-Supervised Deep Learning for Monocular Depth Map Prediction”, Kuznietsov et al., CVPR ‘17
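Depth and disparity are interchangeable outputs because, for a rectified stereo camera pair, depth = focal length × baseline / disparity. The sketch below applies that relation to a small disparity map; the camera parameters and the function name are hypothetical values chosen for illustration.

```python
def disparity_to_depth(disparity, focal_length_px, baseline_m):
    """Convert a disparity map (pixels) to depth (meters): depth = f * B / d."""
    return [
        [focal_length_px * baseline_m / d if d > 0 else float("inf") for d in row]
        for row in disparity
    ]


# Hypothetical camera: 720-pixel focal length, 0.5 m stereo baseline.
disparity = [[36.0, 18.0],
             [72.0, 0.0]]
depth = disparity_to_depth(disparity, focal_length_px=720.0, baseline_m=0.5)
```

Larger disparity means a closer object: 72 pixels maps to 5 m, 18 pixels to 20 m, and zero disparity is treated as infinitely far away.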

Semantic Segmentation

• Semantic Segmentation CNN (pixel-wise classification)

– Input: RGB image

– Output: Pixel-wise class prediction

[28] “Fully Convolutional Networks for Semantic Segmentation”, Long et al., CVPR ‘15

[29] “Pyramid Scene Parsing Network”, Zhao et al., CVPR ‘17

https://www.youtube.com/watch?v=qWl9idsCuLQ
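Pixel-wise classification means the network outputs one score per class at every pixel, and the label map is obtained by picking the best class at each position. Here is a minimal sketch of that final step; the class names and scores are made up, and the function name is an assumption.

```python
def pixelwise_argmax(scores, class_names):
    """Pick the highest-scoring class independently at every pixel."""
    return [
        [class_names[pixel.index(max(pixel))] for pixel in row]
        for row in scores
    ]


class_names = ["road", "car", "sky"]  # hypothetical label set
# scores[i][j] holds one score per class for pixel (i, j).
scores = [
    [[0.1, 0.2, 0.7], [0.2, 0.1, 0.7]],
    [[0.8, 0.1, 0.1], [0.3, 0.6, 0.1]],
]

label_map = pixelwise_argmax(scores, class_names)
```

The result has the same height and width as the score map, with one class name per pixel.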

Today’s Practice!

Style Transfer + Semantic Segmentation

• Champandard [30] introduces a novel concept: augmenting the artistic style transfer algorithm with semantic annotations

[Figure labels: Painting; by CNN or Human; Doodle by Human; Result]

[30] “Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks”, Champandard, arXiv ‘16

Summary

Today’s Presentation (Computer Vision)

• Style Transfer

• Semantic Segmentation

• Depth Estimation

• Photo Enhancer

• Attribute Manipulation

• Super-Resolution

• Colorization

• Image Artifacts Removal

• Context Inpainting

Conclusion

• Research areas including computer vision, image processing, and computer graphics have achieved great success with deep learning

• Convolutional neural networks are still evolving and continue to achieve impressive performance

– Unsupervised learning

– Meta learning (learning to learn)

– Explanation of neural networks
