Applied Deep Learning 11/03: Convolutional Neural Networks


TRANSCRIPT

Slide credit: Mark Chang

Convolutional Neural Networks
• We need a whole course to talk about this topic: http://cs231n.stanford.edu/syllabus.html
• However, we only have one lecture

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN


Image Recognition
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf

Local Connectivity
• Neurons connect to a small region of the input

Parameter Sharing
• The same feature in different positions: neurons share the same weights

Parameter Sharing
• Different features in the same position: neurons have different weights

Convolutional Layers
(figure: input and output volumes with width, height, and depth; neurons at the same output depth use a shared weight)

Convolutional Layers (input depth = 1, output depth = 2)
b1 = wb1·a1 + wb2·a2
b2 = wb1·a2 + wb2·a3
c1 = wc1·a1 + wc2·a2
c2 = wc1·a2 + wc2·a3

Convolutional Layers (input depth = 2, output depth = 2)
c1 = a1·wc1 + b1·wc2 + a2·wc3 + b2·wc4
c2 = a2·wc1 + b2·wc2 + a3·wc3 + b3·wc4

Convolutional Layers (input depth = 2, output depth = 2, second filter)
d1 = a1·wd1 + b1·wd2 + a2·wd3 + b2·wd4
d2 = a2·wd1 + b2·wd2 + a3·wd3 + b3·wd4
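A quick numeric check of the depth-1 case above, as a minimal NumPy sketch; the input values and weights are illustrative assumptions, not numbers from the slides.

```python
import numpy as np

# Depth-1 input a1..a3; two filters (wb, wc) slide with shared weights,
# exactly as in the equations above. Values are made up for illustration.
a = np.array([1.0, 2.0, 3.0])        # a1, a2, a3
wb = np.array([0.5, -1.0])           # wb1, wb2
wc = np.array([2.0, 0.25])           # wc1, wc2

b = np.array([wb @ a[0:2], wb @ a[1:3]])  # b1 = wb1*a1 + wb2*a2, b2 = ...
c = np.array([wc @ a[0:2], wc @ a[1:3]])  # c1, c2: second output channel
print(b, c)   # [-1.5 -2. ] [2.5  4.75]
```

The two output channels b and c together form the depth-2 output volume; in the depth-2 input case of the last two slides, each filter simply carries four weights (two per input channel) instead of two.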

Convolutional Layers
(figure: outputs of one convolutional layer become the inputs of the next)

Hyper-parameters of CNN
• Stride: how far the filter moves at each step (e.g., stride = 1, stride = 2)
• Padding: zeros added around the border (e.g., padding = 0, padding = 1)

Example
Input volume: 7×7×3; filter: 3×3×3; stride = 2; padding = 1 → output volume: 3×3×2
http://cs231n.github.io/convolutional-networks/
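The output size follows the usual formula (W − F + 2P)/S + 1. A small sketch, assuming the 7×7 volume shown already includes a padding of 1 around a 5×5 input:

```python
def conv_output_size(w, f, stride, padding):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    return (w - f + 2 * padding) // stride + 1

# Both views of the slide's example give the same 3x3 output:
print(conv_output_size(5, 3, stride=2, padding=1))  # 3 (5x5 input, pad 1)
print(conv_output_size(7, 3, stride=2, padding=0))  # 3 (padded 7x7 volume)
```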

Convolutional Layers (animation)
http://cs231n.github.io/convolutional-networks/

Relationship with Convolution
y[n] = Σ_k x[k]·w[n−k]
(figure: sequences x[n] and w[n]; to produce y[n], the kernel is flipped to w[n−k] and shifted to position n)
y[0] = x[−2]w[2] + x[−1]w[1] + x[0]w[0]
y[1] = x[−1]w[2] + x[0]w[1] + x[1]w[0]
y[2] = x[0]w[2] + x[1]w[1] + x[2]w[0]
y[4] = x[2]w[2] + x[3]w[1] + x[4]w[0]
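A minimal NumPy check of the formula. One caveat worth stating: the formula above is true discrete convolution (the kernel is flipped), whereas the layer equations earlier in the lecture (b1 = wb1·a1 + wb2·a2, …) are cross-correlation, i.e., convolution without the flip; the two differ only by reversing the kernel.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([0.5, 1.0, 0.25])              # w[0], w[1], w[2]

y = np.convolve(x, w)                       # y[n] = sum_k x[k] * w[n-k]
y_corr = np.correlate(x, w, mode="valid")   # no flip: what conv layers compute
print(y)
print(y_corr)
```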

Nonlinearity
• Rectified Linear Unit (ReLU):
n_out = n_in if n_in > 0, otherwise 0
Example: ReLU([1, 4, −3, 1]) = [1, 4, 0, 1]
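The slide's example, checked in NumPy:

```python
import numpy as np

n_in = np.array([1, 4, -3, 1])
n_out = np.maximum(n_in, 0)   # n_out = n_in if n_in > 0, else 0
print(n_out)                  # [1 4 0 1]
```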

Why ReLU?
• Easy to train
• Avoids the vanishing-gradient problem: a saturated sigmoid has gradient ≈ 0, while ReLU does not saturate

Why ReLU?
• Biological reason: a neuron stays silent under weak stimulation and fires under strong stimulation, similar to ReLU
(figure: spike trains of a neuron under weak vs. strong stimulation)

Pooling Layer (2×2 windows, no overlap, no weights, depth = 1)
Input (4×4):
1 3 2 4
5 7 6 8
0 0 3 3
5 5 0 0
Maximum pooling, e.g. Max(1, 3, 5, 7) = 7, Max(0, 0, 5, 5) = 5:
7 8
5 3
Average pooling, e.g. Avg(1, 3, 5, 7) = 4:
4 5
2.5 1.5
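A minimal NumPy sketch reproducing the slide's 4×4 example with non-overlapping 2×2 windows:

```python
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 7, 6, 8],
              [0, 0, 3, 3],
              [5, 5, 0, 0]], dtype=float)

# Split into four non-overlapping 2x2 windows, then reduce each window.
blocks = x.reshape(2, 2, 2, 2).swapaxes(1, 2)
print(blocks.max(axis=(2, 3)))    # [[7. 8.] [5. 3.]]
print(blocks.mean(axis=(2, 3)))   # [[4.  5. ] [2.5 1.5]]
```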

Why “Deep” Learning?

Visual Perception of Human
http://www.nature.com/neuro/journal/v8/n8/images/nn0805-975-F1.jpg

Visual Perception of Computer
Input Layer → Convolutional Layer (receptive fields) → Pooling Layer → Convolutional Layer (receptive fields) → Pooling Layer → …

Visual Perception of Computer
Input image → Input Layer → Convolutional Layer with receptive fields → filter responses → Max-pooling Layer with width = 3, height = 3 → filter responses

Fully-Connected Layer
• Fully-connected layers: global feature extraction
• Softmax layer: classifier
Pipeline: input image → Input Layer → Convolutional Layer → Pooling Layer → Convolutional Layer → Pooling Layer → Fully-Connected Layer → Softmax Layer → class label
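A minimal Keras sketch of this pipeline (conv → pool → conv → pool → fully-connected → softmax); the layer sizes and the 28×28 input are illustrative assumptions, not values from the lecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),    # fully-connected: global features
    layers.Dense(10, activation="softmax"),  # softmax: class probabilities
])
model.summary()
```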

Visual Perception of Computer
• AlexNet
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

Training
• Forward propagation, from neuron n1 to neuron n2 through weight w21:
n2(in) = w21·n1(out)
n2(out) = g(n2(in)), where g is the activation function

Training
• Update weights (cost function J):
∂J/∂w21 = (∂J/∂n2(out)) · (∂n2(out)/∂n2(in)) · (∂n2(in)/∂w21)
w21 ← w21 − η·∂J/∂w21
⇒ w21 ← w21 − η · (∂J/∂n2(out)) · (∂n2(out)/∂n2(in)) · (∂n2(in)/∂w21)

Training
• Update weights: since n2(out) = g(n2(in)) and n2(in) = w21·n1(out),
∂n2(out)/∂n2(in) = g′(n2(in)) and ∂n2(in)/∂w21 = n1(out)
⇒ w21 ← w21 − η · (∂J/∂n2(out)) · g′(n2(in)) · n1(out)

Training
• Propagate to the previous layer (cost function J):
∂J/∂n1(in) = (∂J/∂n2(out)) · (∂n2(out)/∂n2(in)) · (∂n2(in)/∂n1(out)) · (∂n1(out)/∂n1(in))
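A tiny numeric illustration of this chain rule, assuming g = ReLU and a squared-error cost J = (n2(out) − t)²/2; all values are made up.

```python
w21, n1_out, t, eta = 0.8, 1.5, 2.0, 0.1

n2_in = w21 * n1_out
n2_out = max(n2_in, 0.0)                  # g = ReLU (assumption)

dJ_dn2out = n2_out - t                    # dJ/dn2(out) for squared error
g_prime = 1.0 if n2_in > 0 else 0.0       # dn2(out)/dn2(in)
dJ_dw21 = dJ_dn2out * g_prime * n1_out    # chain rule from the slides
w21 -= eta * dJ_dw21                      # gradient-descent update
print(dJ_dw21, w21)                       # -1.2 0.92
```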

Training Convolutional Layers
• Example: a convolutional layer with inputs a1, a2, a3 and outputs b1, b2, with shared weights wb1, wb2
To simplify the notation, in the following slides b1 means b1(in), a1 means a1(out), and so on.

Training Convolutional Layers
• Forward propagation:
b1 = wb1·a1 + wb2·a2
b2 = wb1·a2 + wb2·a3

Training Convolutional Layers
• Update weights (cost function J): the shared weight wb1 appears in both b1 and b2, so both gradients contribute:
wb1 ← wb1 − η·(∂J/∂b1 · ∂b1/∂wb1 + ∂J/∂b2 · ∂b2/∂wb1)
From b1 = wb1·a1 + wb2·a2 and b2 = wb1·a2 + wb2·a3:
∂b1/∂wb1 = a1, ∂b2/∂wb1 = a2
⇒ wb1 ← wb1 − η·(∂J/∂b1 · a1 + ∂J/∂b2 · a2)

Training Convolutional Layers
• Similarly for wb2:
wb2 ← wb2 − η·(∂J/∂b1 · ∂b1/∂wb2 + ∂J/∂b2 · ∂b2/∂wb2)
∂b1/∂wb2 = a2, ∂b2/∂wb2 = a3
⇒ wb2 ← wb2 − η·(∂J/∂b1 · a2 + ∂J/∂b2 · a3)
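The shared-weight updates above, as a small NumPy sketch; the upstream gradients ∂J/∂b1 and ∂J/∂b2 are assumed to be given by the layers above.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])            # a1, a2, a3
w = np.array([0.5, -0.3])                # wb1, wb2
dJ_db = np.array([0.2, -0.1])            # assumed upstream gradients
eta = 0.1

dJ_dwb1 = dJ_db[0] * a[0] + dJ_db[1] * a[1]   # dJ/db1*a1 + dJ/db2*a2
dJ_dwb2 = dJ_db[0] * a[1] + dJ_db[1] * a[2]   # dJ/db1*a2 + dJ/db2*a3
w -= eta * np.array([dJ_dwb1, dJ_dwb2])
print(w)   # [ 0.5  -0.31]
```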

Training Convolutional Layers
• Propagate to the previous layer (cost function J):
From b1 = wb1·a1 + wb2·a2 and b2 = wb1·a2 + wb2·a3:
∂b1/∂a1 = wb1, ∂b1/∂a2 = wb2, ∂b2/∂a2 = wb1, ∂b2/∂a3 = wb2
∂J/∂a1 = ∂J/∂b1 · wb1
∂J/∂a2 = ∂J/∂b1 · wb2 + ∂J/∂b2 · wb1
∂J/∂a3 = ∂J/∂b2 · wb2
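And the input gradients, continuing the same sketch. Note the pattern: each a_i collects the upstream gradients of every output it fed, weighted by the shared weights, which is itself a convolution with the flipped kernel.

```python
wb1, wb2 = 0.5, -0.3
dJ_db1, dJ_db2 = 0.2, -0.1                 # assumed upstream gradients

dJ_da1 = dJ_db1 * wb1                      # a1 only feeds b1
dJ_da2 = dJ_db1 * wb2 + dJ_db2 * wb1       # a2 feeds both b1 and b2
dJ_da3 = dJ_db2 * wb2                      # a3 only feeds b2
print(dJ_da1, dJ_da2, dJ_da3)              # 0.1 -0.11 0.03
```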

Max-Pooling Layers during Training
• Pooling layers have no weights, so there are no weights to update
b1 = max(a1, a2)
b2 = max(a2, a3)
b2 = a2 if a2 ≥ a3, otherwise a3  ⇒  ∂b2/∂a2 = 1 if a2 ≥ a3, otherwise 0

Max-Pooling Layers during Training
• Propagate to the previous layer (cost function J): the gradient is routed to the input that won the max
If a1 > a2: ∂b1/∂a1 = 1, ∂b1/∂a2 = 0 ⇒ ∂J/∂a1 = ∂J/∂b1
If a2 > a3: ∂b2/∂a2 = 1, ∂b2/∂a3 = 0 ⇒ ∂J/∂a2 = ∂J/∂b2

Max-Pooling Layers during Training
• What if a1 = a2?
◦ Choose the node with the smaller index (e.g., with a1 = a2 = a3, ∂J/∂b1 is routed to a1 and ∂J/∂b2 to a2)
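A small NumPy sketch of this routing rule, including the tie-break; np.argmax already returns the smallest index on ties.

```python
import numpy as np

a = np.array([1.0, 4.0, 4.0])            # a1, a2, a3 (note the tie a2 = a3)
windows = [(0, 1), (1, 2)]               # b1 = max(a1,a2), b2 = max(a2,a3)
dJ_db = np.array([0.7, -0.2])            # assumed upstream gradients

dJ_da = np.zeros_like(a)
for g, win in zip(dJ_db, windows):
    i = win[np.argmax(a[list(win)])]     # winner; ties go to the smaller index
    dJ_da[i] += g                        # route the gradient to the winner
print(dJ_da)   # [0.  0.5 0. ]
```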

Avg-Pooling Layers during Training
• Pooling layers have no weights, so there are no weights to update
b1 = (a1 + a2)/2
b2 = (a2 + a3)/2
∂b2/∂a2 = ∂b2/∂a3 = 1/2

Avg-Pooling Layers during Training
• Propagate to the previous layer (cost function J):
∂b1/∂a1 = ∂b1/∂a2 = 1/2, ∂b2/∂a2 = ∂b2/∂a3 = 1/2
∂J/∂a1 = ½·∂J/∂b1
∂J/∂a2 = ½·(∂J/∂b1 + ∂J/∂b2)
∂J/∂a3 = ½·∂J/∂b2

ReLU during Training
n_out = n_in if n_in > 0, otherwise 0
∂n_out/∂n_in = 1 if n_in > 0, otherwise 0

Training CNN

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

LeNet
• Paper: http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf
• Yann LeCun: http://yann.lecun.com/exdb/lenet/

ImageNet Challenge
• ImageNet Large Scale Visual Recognition Challenge: http://image-net.org/challenges/LSVRC/
• Dataset: 1000 categories; training: 1,200,000 images; validation: 50,000; testing: 100,000
http://vision.stanford.edu/Datasets/collage_s.png

ImageNet Challenge
http://www.qingpingshan.com/uploads/allimg/160818/1J22QI5-0.png

AlexNet (2012)
• Paper: http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
• The resurgence of deep learning (Geoffrey Hinton, Alex Krizhevsky)

VGGNet (2014)
• Paper: https://arxiv.org/abs/1409.1556
• Configuration D: VGG16; configuration E: VGG19. All filters are 3×3

VGGNet
• More layers with smaller (3×3) filters work better
• More nonlinearity, fewer parameters:
◦ One 5×5 filter: 5×5 = 25 parameters, 1 nonlinearity
◦ Two stacked 3×3 filters (covering the same 5×5 receptive field): 3×3×2 = 18 parameters, 2 nonlinearities

VGG19
• conv1_1, conv1_2 (3×3 conv, depth = 64) → maxpool
• conv2_1, conv2_2 (3×3 conv, depth = 128) → maxpool
• conv3_1 … conv3_4 (3×3 conv, depth = 256) → maxpool
• conv4_1 … conv4_4 (3×3 conv, depth = 512) → maxpool
• conv5_1 … conv5_4 (3×3 conv, depth = 512) → maxpool
• FC1, FC2 (size = 4096) → softmax (size = 1000)

GoogLeNet (2014)
• Paper: http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf
• A 22-layer deep network built from Inception modules

Inception Module
• What is the best filter size? 3×3? 5×5?
• Use them all, and combine:
previous layer → [1×1 convolution | 3×3 convolution | 5×5 convolution | 3×3 max-pooling] → filter concatenation

Inception Module with Dimension Reduction
• Use 1×1 filters to reduce the depth dimension
• Example: input size 1×1×256; a 1×1 convolution with weights 1×1×256×128 gives output size 1×1×128 (depth reduced from 256 to 128)
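A minimal TensorFlow sketch of this 1×1 reduction; the 28×28 spatial size is an illustrative assumption (a 1×1 filter leaves whatever spatial size unchanged).

```python
import tensorflow as tf

x = tf.random.normal([1, 28, 28, 256])               # 256 input channels
reduce = tf.keras.layers.Conv2D(128, kernel_size=1)  # 1x1x256x128 weights
print(reduce(x).shape)                               # (1, 28, 28, 128)
```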

ResNet (2015)
• Paper: https://arxiv.org/abs/1512.03385
• Residual Networks
• 152 layers

ResNet
• Residual learning: a building block learns a residual function F(x) and outputs F(x) + x via a shortcut connection
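A minimal Keras sketch of the building block, assuming the two-layer residual function F from the paper's figure; the filter count is illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=64):
    f = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    f = layers.Conv2D(filters, 3, padding="same")(f)   # residual function F(x)
    return layers.ReLU()(f + x)                        # output: F(x) + x
```

Because the block only has to learn the residual F(x) = H(x) − x rather than the full mapping H(x), very deep stacks of such blocks remain trainable.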

Residual Learning with Dimension Reduction
• Using 1×1 filters

Pretrained Model Download
• http://www.vlfeat.org/matconvnet/pretrained/
◦ AlexNet: http://www.vlfeat.org/matconvnet/models/imagenet-matconvnet-alex.mat
◦ VGG19: http://www.vlfeat.org/matconvnet/models/imagenet-vgg-verydeep-19.mat
◦ GoogLeNet: http://www.vlfeat.org/matconvnet/models/imagenet-googlenet-dag.mat
◦ ResNet: http://www.vlfeat.org/matconvnet/models/imagenet-resnet-152-dag.mat

Using a Pretrained Model
• Lower layers: edges, blobs, textures (more general)
• Higher layers: object parts (more specific)
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

Transfer Learning
• The pretrained model is trained on the ImageNet dataset
• If your data is similar to the ImageNet data:
◦ Fix all CNN (convolutional) layers
◦ Train the fully-connected layer on your labeled data

Transfer Learning
• If your data is very different from the ImageNet data:
◦ Fix the lower CNN layers
◦ Train the higher CNN layers and the fully-connected layers on your labeled data
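A minimal Keras transfer-learning sketch matching the first case (data similar to ImageNet): freeze the pretrained convolutional layers and train only a new fully-connected head. The head sizes and the 5 output classes (the flower dataset on the next slide) are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.VGG19(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False                       # "fix all CNN layers"

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),    # new FC layer to train
    layers.Dense(5, activation="softmax"),   # e.g. 5 flower classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

For the second case, one would instead re-enable training (`layer.trainable = True`) on the upper convolutional blocks only.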

TensorFlow Transfer Learning Example
• https://www.tensorflow.org/versions/r0.11/how_tos/style_guide.html
• Dataset: http://download.tensorflow.org/example_images/flower_photos.tgz
◦ daisy: 634 photos; dandelion: 899; roses: 642; tulips: 800; sunflowers: 700
• Fix the pretrained layers; train only the final layer

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

Visualizing CNN
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

Visualizing CNN
• Feed a flower image through the CNN and record the filter responses
• Feed random noise through the CNN and record the filter responses

Gradient Ascent
• Magnify the filter response, starting from a random-noise image x
• Filter response: f; score: F = Σ_{i,j} f_{i,j}
• The gradient ∂F/∂x points from lower score toward higher score
• Update x with learning rate η: x ← x + η·∂F/∂x
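A minimal TensorFlow sketch of this loop; the choice of network (VGG19), layer, filter index, image size, and learning rate are all illustrative assumptions.

```python
import tensorflow as tf

base = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
feat = tf.keras.Model(base.input, base.get_layer("block3_conv1").output)

x = tf.Variable(tf.random.uniform([1, 128, 128, 3]))  # random-noise canvas
eta = 0.01
for _ in range(100):
    with tf.GradientTape() as tape:
        F = tf.reduce_sum(feat(x)[..., 42])   # score F of one chosen filter
    x.assign_add(eta * tape.gradient(F, x))   # x <- x + eta * dF/dx
```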

Gradient Ascent
(figure: images generated by gradient ascent)

Different Layers of Visualization
(figure: CNN features visualized at different layers)

Multiscale Image Generation
visualize → resize → visualize → resize → visualize
(figure: results at each scale)

DeepDream
• https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
• Source code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb
• http://download.tensorflow.org/example_images/flower_photos.tgz
(figure: DeepDream results)

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

Neural Art
• Paper: https://arxiv.org/abs/1508.06576
• Source code: https://github.com/ckmarkoh/neuralart_tensorflow
• content + style → artwork
◦ content: http://www.taipei-101.com.tw/upload/news/201502/2015021711505431705145.JPG
◦ style: https://github.com/andersbll/neural_artistic_style/blob/master/images/starry_night.jpg?raw=true

The Mechanism of Painting
• Artist: Brain; scene + style → artwork
• Computer: Neural Networks; scene + style → artwork

Misconception

Content Generation
• The artist's brain receives the content as neural stimulation and draws on the canvas, minimizing the difference between the two

Content Generation
• Feed the content image and the canvas into VGG19 and compare their filter responses (a width×height by depth volume)
• Update the colors of the canvas pixels to minimize the difference; the result reproduces the content

Content Generation
• Layer l's filter responses are matrices indexed by depth (i) and position (j, running over width×height): one for the input photo, one for the input canvas

Content Generation
• Backward propagation: compute the gradient of the difference in layer l's filter responses with respect to the input canvas, and update the canvas with the learning rate

Content Generation
• Results from different VGG19 layers: conv1_2, conv2_2, conv3_4, conv4_4, conv5_1, conv5_2

Style Generation
• Feed the artwork into VGG19 and compute the Gram matrix G of the filter responses
• Filter responses (width×height by depth) are position-dependent; the Gram matrix (depth × depth) is position-independent

Style Generation
• Gram matrix entry G_{k1,k2}: the inner product, over all positions j (width×height), of filter k1's and filter k2's responses at layer l
(figure: a numeric example of filter responses and the resulting depth×depth Gram matrix)
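A minimal NumPy sketch of the Gram matrix computation; the depth and spatial size are illustrative.

```python
import numpy as np

depth, positions = 64, 32 * 32
F = np.random.rand(depth, positions)   # layer-l filter responses
G = F @ F.T                            # G[k1,k2] = sum_j F[k1,j] * F[k2,j]
print(G.shape)                         # (64, 64): position-independent
```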

Style Generation
• Compute layer l's Gram matrix for the input artwork and for the input canvas
• Update the colors of the canvas pixels to minimize the difference between the two Gram matrices; the result reproduces the style

Style Generation
• Results using increasing sets of VGG19 layers: Conv1_1; Conv1_1 + Conv2_1; Conv1_1 + Conv2_1 + Conv3_1; Conv1_1 + Conv2_1 + Conv3_1 + Conv4_1; Conv1_1 + Conv2_1 + Conv3_1 + Conv4_1 + Conv5_1

Artwork Generation
• Minimize both differences at once: content uses the filter responses of Conv4_2; style uses the Gram matrices of Conv1_1, Conv2_1, Conv3_1, Conv4_1, Conv5_1

Content v.s. Style
(figure: results for different content/style weight ratios: 0.15, 0.05, 0.02, 0.007)

Neural Doodle
• Paper: https://arxiv.org/abs/1603.01768
• Source code: https://github.com/alexjc/neural-doodle
• Inputs: style and content images plus semantic maps; output: result image

Neural Doodle
• Image analogy
• Scary link, view at your own risk! https://raw.githubusercontent.com/awentzonline/image-analogies/master/examples/images/trump-image-analogy.jpg

Real-time Texture Synthesis
• Paper: https://arxiv.org/pdf/1604.04382v1.pdf
◦ GAN: https://arxiv.org/pdf/1406.2661v1.pdf
◦ VAE: https://arxiv.org/pdf/1312.6114v10.pdf
• Source code: https://github.com/chuanli11/MGANs

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

A Convolutional Neural Network for Modelling Sentences
• Paper: https://arxiv.org/abs/1404.2188
• Source code: https://github.com/FredericGodin/DynamicCNN

Drawbacks of Recursive Neural Networks (RvNN)
• A human-labeled syntax tree is needed during training
(figure: word vectors of "This is a dog" are combined by the same RvNN, following the syntax tree)

Drawbacks of Recursive Neural Networks (RvNN)
• Ambiguity in natural language
http://3rd.mafengwo.cn/travels/info_weibo.php?id=2861280
http://www.appledaily.com.tw/realtimenews/article/new/20151006/705309/

Element-wise 1D Operations on Word Vectors
• 1D convolution or 1D pooling, applied element-wise over neighboring word vectors (e.g., over "This is a")

From RvNN to CNN
• RvNN: the same RvNN cell is applied recursively, following the syntax tree
• CNN: different convolutional layers (conv1, conv2, conv3) are stacked over the sentence

CNN with Max-Pooling Layers
• The conv + max-pooling hierarchy is similar to a syntax tree
• But a human-labeled syntax tree is not needed
(figure: conv1 → pool1 → conv2 over "This is a dog"; max-pooling selects which conv1 outputs survive)

Sentiment Analysis by CNN
• Use a softmax layer to classify the sentiments
• "This movie is awesome" → positive; "This movie is awful" → negative
(architecture: conv1 → pool1 → conv2 → softmax)
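A minimal Keras sketch of a sentence-classification CNN of this shape (word vectors → 1D convolution → pooling → softmax); the vocabulary size and dimensions are illustrative assumptions, and max-over-time pooling stands in for the pooling schemes discussed below.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Embedding(input_dim=10000, output_dim=128),  # word vectors
    layers.Conv1D(100, 3, activation="relu"),           # conv over word windows
    layers.GlobalMaxPooling1D(),                        # fixed-size output
    layers.Dense(2, activation="softmax"),              # positive / negative
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```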

Sentiment Analysis by CNN
• Build the "correct syntax tree" by training
• If "This movie is awesome" is classified as negative, the error is backward-propagated and the weights are updated until the prediction becomes positive

Multiple Filters
• Several filters per layer (filter11, filter12, filter13 in layer 1; filter21, filter22, filter23 in layer 2) give richer features than an RNN

Sentences can't be easily resized
• An image can easily be resized; a sentence cannot
• e.g., 全台灣最高樓在台北 ("Taiwan's tallest building is in Taipei") can only be "resized" by rewording it: 全台灣最高的高樓在台北市 / 全台灣最高樓在台北市 / 台灣最高樓在台北

Various Input Size
• Convolutional layers and pooling layers can handle input of various sizes (e.g., "This is a dog" vs. "the dog run")

Various Input Size
• Fully-connected layers and the softmax layer need fixed-size input

k-max Pooling
• Choose the k largest values
• Preserve the order of the input values
• Variable-size input, fixed-size output
• 3-max pooling: [13, 4, 1, 7, 8] → [13, 7, 8]; [12, 5, 21, 15, 7, 4, 9] → [12, 21, 15]
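A minimal NumPy sketch reproducing the slide's two examples:

```python
import numpy as np

def k_max_pooling(x, k):
    idx = np.sort(np.argsort(x)[-k:])   # positions of the k largest, in order
    return x[idx]

print(k_max_pooling(np.array([13, 4, 1, 7, 8]), 3))          # [13  7  8]
print(k_max_pooling(np.array([12, 5, 21, 15, 7, 4, 9]), 3))  # [12 21 15]
```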

Wide Convolution
• Ensures that all weights reach the entire sentence
(figure: wide convolution vs. narrow convolution)

Dynamic k-max Pooling
• Apply wide convolution & k-max pooling at every layer, with
k_l = max(k_top, ⌈((L − l)/L)·s⌉)
where:
◦ l: index of the current layer
◦ k_l: k of the current layer
◦ k_top: k of the top layer (a constant)
◦ L: total number of layers (a constant)
◦ s: length of the input sentence

Dynamic k-max Pooling
• Examples with L = 2, k_top = 3:
◦ s = 10: k1 = max(3, ⌈(2 − 1)/2 × 10⌉) = 5
◦ s = 14: k1 = max(3, ⌈(2 − 1)/2 × 14⌉) = 7
◦ s = 8: k1 = max(3, ⌈(2 − 1)/2 × 8⌉) = 4
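The schedule, checked against all three examples:

```python
import math

def dynamic_k(l, L, k_top, s):
    """k_l = max(k_top, ceil((L - l) / L * s))"""
    return max(k_top, math.ceil((L - l) / L * s))

for s in (10, 14, 8):
    print(s, dynamic_k(l=1, L=2, k_top=3, s=s))   # 5, 7, 4
```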

Dynamic k-max Pooling
(figure: the full model, alternating wide convolution & dynamic k-max pooling)

Convolutional Neural Networks for Sentence Classification
• Paper: http://www.aclweb.org/anthology/D14-1181
• Source code: https://github.com/yoonkim/CNN_sentence

Static & Non-Static Channel
• Word vectors pretrained by word2vec
• Static channel: fix the values during training
• Non-static channel: update the values during training

About the Lecturer
Mark Chang
• Email: ckmarkoh at gmail dot com
• Blog: http://cpmarkchang.logdown.com
• Github: https://github.com/ckmarkoh
• Slideshare: http://www.slideshare.net/ckmarkohchang
• Youtube: https://www.youtube.com/channel/UCckNPGDL21aznRhl3EijRQw

HTC Research & Healthcare, Deep Learning Algorithms Research Engineer
