Applied Deep Learning 11/03: Convolutional Neural Networks


TRANSCRIPT

Slide credit: Mark Chang

Convolutional Neural Networks
• We need a whole course to talk about this topic: http://cs231n.stanford.edu/syllabus.html
• However, we only have one lecture

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN


Image Recognition
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf

Local Connectivity
• Neurons connect to a small region of the input

Parameter Sharing
• The same feature in different positions: neurons share the same weights

Parameter Sharing
• Different features in the same position: neurons have different weights

Convolutional Layers
(figure: input and output volumes with width, height, and depth; neurons at the same output depth use a shared weight)

Convolutional Layers (input depth = 1, output depth = 2)
b1 = wb1·a1 + wb2·a2
b2 = wb1·a2 + wb2·a3
c1 = wc1·a1 + wc2·a2
c2 = wc1·a2 + wc2·a3

Convolutional Layers (input depth = 2, output depth = 2)
c1 = a1·wc1 + b1·wc2 + a2·wc3 + b2·wc4
c2 = a2·wc1 + b2·wc2 + a3·wc3 + b3·wc4

Convolutional Layers (input depth = 2, output depth = 2, second filter)
d1 = a1·wd1 + b1·wd2 + a2·wd3 + b2·wd4
d2 = a2·wd1 + b2·wd2 + a3·wd3 + b3·wd4
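A quick numeric check of the depth-1 case above, as a minimal NumPy sketch; the input values and weights are illustrative assumptions, not numbers from the slides.

```python
import numpy as np

# Depth-1 input a1..a3; two filters (wb, wc) slide with shared weights,
# exactly as in the equations above. Values are made up for illustration.
a = np.array([1.0, 2.0, 3.0])        # a1, a2, a3
wb = np.array([0.5, -1.0])           # wb1, wb2
wc = np.array([2.0, 0.25])           # wc1, wc2

b = np.array([wb @ a[0:2], wb @ a[1:3]])  # b1 = wb1*a1 + wb2*a2, b2 = ...
c = np.array([wc @ a[0:2], wc @ a[1:3]])  # c1, c2: second output channel
print(b, c)   # [-1.5 -2. ] [2.5  4.75]
```

The two output channels b and c together form the depth-2 output volume; in the depth-2 input case of the last two slides, each filter simply carries four weights (two per input channel) instead of two.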

Convolutional Layers
(figure: outputs of one convolutional layer become the inputs of the next)

Hyper-parameters of CNN
• Stride: how far the filter moves at each step (e.g., stride = 1, stride = 2)
• Padding: zeros added around the border (e.g., padding = 0, padding = 1)

Example
Input volume: 7×7×3; filter: 3×3×3; stride = 2; padding = 1 → output volume: 3×3×2
http://cs231n.github.io/convolutional-networks/
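The output size follows the usual formula (W − F + 2P)/S + 1. A small sketch, assuming the 7×7 volume shown already includes a padding of 1 around a 5×5 input:

```python
def conv_output_size(w, f, stride, padding):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    return (w - f + 2 * padding) // stride + 1

# Both views of the slide's example give the same 3x3 output:
print(conv_output_size(5, 3, stride=2, padding=1))  # 3 (5x5 input, pad 1)
print(conv_output_size(7, 3, stride=2, padding=0))  # 3 (padded 7x7 volume)
```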

Convolutional Layers (animation)
http://cs231n.github.io/convolutional-networks/

Relationship with Convolution
y[n] = Σ_k x[k]·w[n−k]
(figure: sequences x[n] and w[n]; to produce y[n], the kernel is flipped to w[n−k] and shifted to position n)
y[0] = x[−2]w[2] + x[−1]w[1] + x[0]w[0]
y[1] = x[−1]w[2] + x[0]w[1] + x[1]w[0]
y[2] = x[0]w[2] + x[1]w[1] + x[2]w[0]
y[4] = x[2]w[2] + x[3]w[1] + x[4]w[0]
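A minimal NumPy check of the formula. One caveat worth stating: the formula above is true discrete convolution (the kernel is flipped), whereas the layer equations earlier in the lecture (b1 = wb1·a1 + wb2·a2, …) are cross-correlation, i.e., convolution without the flip; the two differ only by reversing the kernel.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([0.5, 1.0, 0.25])              # w[0], w[1], w[2]

y = np.convolve(x, w)                       # y[n] = sum_k x[k] * w[n-k]
y_corr = np.correlate(x, w, mode="valid")   # no flip: what conv layers compute
print(y)
print(y_corr)
```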

Nonlinearity
• Rectified Linear Unit (ReLU):
n_out = n_in if n_in > 0, otherwise 0
Example: ReLU([1, 4, −3, 1]) = [1, 4, 0, 1]
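The slide's example, checked in NumPy:

```python
import numpy as np

n_in = np.array([1, 4, -3, 1])
n_out = np.maximum(n_in, 0)   # n_out = n_in if n_in > 0, else 0
print(n_out)                  # [1 4 0 1]
```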

Why ReLU?
• Easy to train
• Avoids the vanishing-gradient problem: a saturated sigmoid has gradient ≈ 0, while ReLU does not saturate

Why ReLU?
• Biological reason: a neuron stays silent under weak stimulation and fires under strong stimulation, similar to ReLU
(figure: spike trains of a neuron under weak vs. strong stimulation)

Pooling Layer (2×2 windows, no overlap, no weights, depth = 1)
Input (4×4):
1 3 2 4
5 7 6 8
0 0 3 3
5 5 0 0
Maximum pooling, e.g. Max(1, 3, 5, 7) = 7, Max(0, 0, 5, 5) = 5:
7 8
5 3
Average pooling, e.g. Avg(1, 3, 5, 7) = 4:
4 5
2.5 1.5
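A minimal NumPy sketch reproducing the slide's 4×4 example with non-overlapping 2×2 windows:

```python
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 7, 6, 8],
              [0, 0, 3, 3],
              [5, 5, 0, 0]], dtype=float)

# Split into four non-overlapping 2x2 windows, then reduce each window.
blocks = x.reshape(2, 2, 2, 2).swapaxes(1, 2)
print(blocks.max(axis=(2, 3)))    # [[7. 8.] [5. 3.]]
print(blocks.mean(axis=(2, 3)))   # [[4.  5. ] [2.5 1.5]]
```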

Why “Deep” Learning?

Visual Perception of Human
http://www.nature.com/neuro/journal/v8/n8/images/nn0805-975-F1.jpg

Visual Perception of Computer
Input Layer → Convolutional Layer (receptive fields) → Pooling Layer → Convolutional Layer (receptive fields) → Pooling Layer → …

Visual Perception of Computer
Input image → Input Layer → Convolutional Layer with receptive fields → filter responses → Max-pooling Layer with width = 3, height = 3 → filter responses

Fully-Connected Layer
• Fully-connected layers: global feature extraction
• Softmax layer: classifier
Pipeline: input image → Input Layer → Convolutional Layer → Pooling Layer → Convolutional Layer → Pooling Layer → Fully-Connected Layer → Softmax Layer → class label
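A minimal Keras sketch of this pipeline (conv → pool → conv → pool → fully-connected → softmax); the layer sizes and the 28×28 input are illustrative assumptions, not values from the lecture.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),    # fully-connected: global features
    layers.Dense(10, activation="softmax"),  # softmax: class probabilities
])
model.summary()
```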

Visual Perception of Computer
• AlexNet
http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

Training
• Forward propagation, from neuron n1 to neuron n2 through weight w21:
n2(in) = w21·n1(out)
n2(out) = g(n2(in)), where g is the activation function

Training
• Update weights (cost function J):
∂J/∂w21 = (∂J/∂n2(out)) · (∂n2(out)/∂n2(in)) · (∂n2(in)/∂w21)
w21 ← w21 − η·∂J/∂w21
⇒ w21 ← w21 − η · (∂J/∂n2(out)) · (∂n2(out)/∂n2(in)) · (∂n2(in)/∂w21)

Training
• Update weights: since n2(out) = g(n2(in)) and n2(in) = w21·n1(out),
∂n2(out)/∂n2(in) = g′(n2(in)) and ∂n2(in)/∂w21 = n1(out)
⇒ w21 ← w21 − η · (∂J/∂n2(out)) · g′(n2(in)) · n1(out)

Training
• Propagate to the previous layer (cost function J):
∂J/∂n1(in) = (∂J/∂n2(out)) · (∂n2(out)/∂n2(in)) · (∂n2(in)/∂n1(out)) · (∂n1(out)/∂n1(in))
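A tiny numeric illustration of this chain rule, assuming g = ReLU and a squared-error cost J = (n2(out) − t)²/2; all values are made up.

```python
w21, n1_out, t, eta = 0.8, 1.5, 2.0, 0.1

n2_in = w21 * n1_out
n2_out = max(n2_in, 0.0)                  # g = ReLU (assumption)

dJ_dn2out = n2_out - t                    # dJ/dn2(out) for squared error
g_prime = 1.0 if n2_in > 0 else 0.0       # dn2(out)/dn2(in)
dJ_dw21 = dJ_dn2out * g_prime * n1_out    # chain rule from the slides
w21 -= eta * dJ_dw21                      # gradient-descent update
print(dJ_dw21, w21)                       # -1.2 0.92
```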

Training Convolutional Layers
• Example: a convolutional layer with inputs a1, a2, a3 and outputs b1, b2, with shared weights wb1, wb2
To simplify the notation, in the following slides b1 means b1(in), a1 means a1(out), and so on.

Training Convolutional Layers
• Forward propagation:
b1 = wb1·a1 + wb2·a2
b2 = wb1·a2 + wb2·a3

Training Convolutional Layers
• Update weights (cost function J): the shared weight wb1 appears in both b1 and b2, so both gradients contribute:
wb1 ← wb1 − η·(∂J/∂b1 · ∂b1/∂wb1 + ∂J/∂b2 · ∂b2/∂wb1)
From b1 = wb1·a1 + wb2·a2 and b2 = wb1·a2 + wb2·a3:
∂b1/∂wb1 = a1, ∂b2/∂wb1 = a2
⇒ wb1 ← wb1 − η·(∂J/∂b1 · a1 + ∂J/∂b2 · a2)

Training Convolutional Layers
• Similarly for wb2:
wb2 ← wb2 − η·(∂J/∂b1 · ∂b1/∂wb2 + ∂J/∂b2 · ∂b2/∂wb2)
∂b1/∂wb2 = a2, ∂b2/∂wb2 = a3
⇒ wb2 ← wb2 − η·(∂J/∂b1 · a2 + ∂J/∂b2 · a3)
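The shared-weight updates above, as a small NumPy sketch; the upstream gradients ∂J/∂b1 and ∂J/∂b2 are assumed to be given by the layers above.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])            # a1, a2, a3
w = np.array([0.5, -0.3])                # wb1, wb2
dJ_db = np.array([0.2, -0.1])            # assumed upstream gradients
eta = 0.1

dJ_dwb1 = dJ_db[0] * a[0] + dJ_db[1] * a[1]   # dJ/db1*a1 + dJ/db2*a2
dJ_dwb2 = dJ_db[0] * a[1] + dJ_db[1] * a[2]   # dJ/db1*a2 + dJ/db2*a3
w -= eta * np.array([dJ_dwb1, dJ_dwb2])
print(w)   # [ 0.5  -0.31]
```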

Training Convolutional Layers
• Propagate to the previous layer (cost function J):
From b1 = wb1·a1 + wb2·a2 and b2 = wb1·a2 + wb2·a3:
∂b1/∂a1 = wb1, ∂b1/∂a2 = wb2, ∂b2/∂a2 = wb1, ∂b2/∂a3 = wb2
∂J/∂a1 = ∂J/∂b1 · wb1
∂J/∂a2 = ∂J/∂b1 · wb2 + ∂J/∂b2 · wb1
∂J/∂a3 = ∂J/∂b2 · wb2
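And the input gradients, continuing the same sketch. Note the pattern: each a_i collects the upstream gradients of every output it fed, weighted by the shared weights, which is itself a convolution with the flipped kernel.

```python
wb1, wb2 = 0.5, -0.3
dJ_db1, dJ_db2 = 0.2, -0.1                 # assumed upstream gradients

dJ_da1 = dJ_db1 * wb1                      # a1 only feeds b1
dJ_da2 = dJ_db1 * wb2 + dJ_db2 * wb1       # a2 feeds both b1 and b2
dJ_da3 = dJ_db2 * wb2                      # a3 only feeds b2
print(dJ_da1, dJ_da2, dJ_da3)              # 0.1 -0.11 0.03
```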

Max-Pooling Layers during Training
• Pooling layers have no weights, so there are no weights to update
b1 = max(a1, a2)
b2 = max(a2, a3)
b2 = a2 if a2 ≥ a3, otherwise a3  ⇒  ∂b2/∂a2 = 1 if a2 ≥ a3, otherwise 0

Max-Pooling Layers during Training
• Propagate to the previous layer (cost function J): the gradient is routed to the input that won the max
If a1 > a2: ∂b1/∂a1 = 1, ∂b1/∂a2 = 0 ⇒ ∂J/∂a1 = ∂J/∂b1
If a2 > a3: ∂b2/∂a2 = 1, ∂b2/∂a3 = 0 ⇒ ∂J/∂a2 = ∂J/∂b2

Max-Pooling Layers during Training
• What if a1 = a2?
◦ Choose the node with the smaller index (e.g., with a1 = a2 = a3, ∂J/∂b1 is routed to a1 and ∂J/∂b2 to a2)
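A small NumPy sketch of this routing rule, including the tie-break; np.argmax already returns the smallest index on ties.

```python
import numpy as np

a = np.array([1.0, 4.0, 4.0])            # a1, a2, a3 (note the tie a2 = a3)
windows = [(0, 1), (1, 2)]               # b1 = max(a1,a2), b2 = max(a2,a3)
dJ_db = np.array([0.7, -0.2])            # assumed upstream gradients

dJ_da = np.zeros_like(a)
for g, win in zip(dJ_db, windows):
    i = win[np.argmax(a[list(win)])]     # winner; ties go to the smaller index
    dJ_da[i] += g                        # route the gradient to the winner
print(dJ_da)   # [0.  0.5 0. ]
```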

Avg-Pooling Layers during Training
• Pooling layers have no weights, so there are no weights to update
b1 = (a1 + a2)/2
b2 = (a2 + a3)/2
∂b2/∂a2 = ∂b2/∂a3 = 1/2

Avg-Pooling Layers during Training
• Propagate to the previous layer (cost function J):
∂b1/∂a1 = ∂b1/∂a2 = 1/2, ∂b2/∂a2 = ∂b2/∂a3 = 1/2
∂J/∂a1 = ½·∂J/∂b1
∂J/∂a2 = ½·(∂J/∂b1 + ∂J/∂b2)
∂J/∂a3 = ½·∂J/∂b2

ReLU during Training
n_out = n_in if n_in > 0, otherwise 0
∂n_out/∂n_in = 1 if n_in > 0, otherwise 0

Training CNN

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

LeNet
• Paper: http://vision.stanford.edu/cs598_spring07/papers/Lecun98.pdf
• Yann LeCun: http://yann.lecun.com/exdb/lenet/

ImageNet Challenge
• ImageNet Large Scale Visual Recognition Challenge: http://image-net.org/challenges/LSVRC/
• Dataset: 1000 categories; training: 1,200,000 images; validation: 50,000; testing: 100,000
http://vision.stanford.edu/Datasets/collage_s.png

ImageNet Challenge
http://www.qingpingshan.com/uploads/allimg/160818/1J22QI5-0.png

AlexNet (2012)
• Paper: http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf
• The resurgence of deep learning (Geoffrey Hinton, Alex Krizhevsky)

VGGNet (2014)
• Paper: https://arxiv.org/abs/1409.1556
• Configuration D: VGG16; configuration E: VGG19. All filters are 3×3

VGGNet
• More layers with smaller (3×3) filters work better
• More nonlinearity, fewer parameters:
◦ One 5×5 filter: 5×5 = 25 parameters, 1 nonlinearity
◦ Two stacked 3×3 filters (covering the same 5×5 receptive field): 3×3×2 = 18 parameters, 2 nonlinearities

VGG19
• conv1_1, conv1_2 (3×3 conv, depth = 64) → maxpool
• conv2_1, conv2_2 (3×3 conv, depth = 128) → maxpool
• conv3_1 … conv3_4 (3×3 conv, depth = 256) → maxpool
• conv4_1 … conv4_4 (3×3 conv, depth = 512) → maxpool
• conv5_1 … conv5_4 (3×3 conv, depth = 512) → maxpool
• FC1, FC2 (size = 4096) → softmax (size = 1000)

GoogLeNet (2014)
• Paper: http://www.cs.unc.edu/~wliu/papers/GoogLeNet.pdf
• A 22-layer deep network built from Inception modules

Inception Module
• What is the best filter size? 3×3? 5×5?
• Use them all, and combine:
previous layer → [1×1 convolution | 3×3 convolution | 5×5 convolution | 3×3 max-pooling] → filter concatenation

Inception Module with Dimension Reduction
• Use 1×1 filters to reduce the depth dimension
• Example: input size 1×1×256; a 1×1 convolution with weights 1×1×256×128 gives output size 1×1×128 (depth reduced from 256 to 128)
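A minimal TensorFlow sketch of this 1×1 reduction; the 28×28 spatial size is an illustrative assumption (a 1×1 filter leaves whatever spatial size unchanged).

```python
import tensorflow as tf

x = tf.random.normal([1, 28, 28, 256])               # 256 input channels
reduce = tf.keras.layers.Conv2D(128, kernel_size=1)  # 1x1x256x128 weights
print(reduce(x).shape)                               # (1, 28, 28, 128)
```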

ResNet (2015)
• Paper: https://arxiv.org/abs/1512.03385
• Residual Networks
• 152 layers

ResNet
• Residual learning: a building block learns a residual function F(x) and outputs F(x) + x via a shortcut connection
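A minimal Keras sketch of the building block, assuming the two-layer residual function F from the paper's figure; the filter count is illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters=64):
    f = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    f = layers.Conv2D(filters, 3, padding="same")(f)   # residual function F(x)
    return layers.ReLU()(f + x)                        # output: F(x) + x
```

Because the block only has to learn the residual F(x) = H(x) − x rather than the full mapping H(x), very deep stacks of such blocks remain trainable.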

Residual Learning with Dimension Reduction
• Using 1×1 filters

Pretrained Model Download
• http://www.vlfeat.org/matconvnet/pretrained/
◦ AlexNet: http://www.vlfeat.org/matconvnet/models/imagenet-matconvnet-alex.mat
◦ VGG19: http://www.vlfeat.org/matconvnet/models/imagenet-vgg-verydeep-19.mat
◦ GoogLeNet: http://www.vlfeat.org/matconvnet/models/imagenet-googlenet-dag.mat
◦ ResNet: http://www.vlfeat.org/matconvnet/models/imagenet-resnet-152-dag.mat

Using a Pretrained Model
• Lower layers: edges, blobs, textures (more general)
• Higher layers: object parts (more specific)
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

Transfer Learning
• The pretrained model is trained on the ImageNet dataset
• If your data is similar to the ImageNet data:
◦ Fix all CNN (convolutional) layers
◦ Train the fully-connected layer on your labeled data

Transfer Learning
• If your data is very different from the ImageNet data:
◦ Fix the lower CNN layers
◦ Train the higher CNN layers and the fully-connected layers on your labeled data
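A minimal Keras transfer-learning sketch matching the first case (data similar to ImageNet): freeze the pretrained convolutional layers and train only a new fully-connected head. The head sizes and the 5 output classes (the flower dataset on the next slide) are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

base = tf.keras.applications.VGG19(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False                       # "fix all CNN layers"

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),    # new FC layer to train
    layers.Dense(5, activation="softmax"),   # e.g. 5 flower classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
```

For the second case, one would instead re-enable training (`layer.trainable = True`) on the upper convolutional blocks only.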

TensorFlow Transfer Learning Example
• https://www.tensorflow.org/versions/r0.11/how_tos/style_guide.html
• Dataset: http://download.tensorflow.org/example_images/flower_photos.tgz
◦ daisy: 634 photos; dandelion: 899; roses: 642; tulips: 800; sunflowers: 700
• Fix the pretrained layers; train only the final layer

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

Visualizing CNN
http://vision03.csail.mit.edu/cnn_art/data/single_layer.png

Visualizing CNN
• Feed a flower image through the CNN and record the filter responses
• Feed random noise through the CNN and record the filter responses

Gradient Ascent
• Magnify the filter response, starting from a random-noise image x
• Filter response: f; score: F = Σ_{i,j} f_{i,j}
• The gradient ∂F/∂x points from lower score toward higher score
• Update x with learning rate η: x ← x + η·∂F/∂x
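A minimal TensorFlow sketch of this loop; the choice of network (VGG19), layer, filter index, image size, and learning rate are all illustrative assumptions.

```python
import tensorflow as tf

base = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
feat = tf.keras.Model(base.input, base.get_layer("block3_conv1").output)

x = tf.Variable(tf.random.uniform([1, 128, 128, 3]))  # random-noise canvas
eta = 0.01
for _ in range(100):
    with tf.GradientTape() as tape:
        F = tf.reduce_sum(feat(x)[..., 42])   # score F of one chosen filter
    x.assign_add(eta * tape.gradient(F, x))   # x <- x + eta * dF/dx
```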

Gradient Ascent
(figure: images generated by gradient ascent)

Different Layers of Visualization
(figure: CNN features visualized at different layers)

Multiscale Image Generation
visualize → resize → visualize → resize → visualize
(figure: results at each scale)

DeepDream
• https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
• Source code: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/deepdream/deepdream.ipynb
• http://download.tensorflow.org/example_images/flower_photos.tgz
(figure: DeepDream results)

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

Neural Art
• Paper: https://arxiv.org/abs/1508.06576
• Source code: https://github.com/ckmarkoh/neuralart_tensorflow
• content + style → artwork
◦ content: http://www.taipei-101.com.tw/upload/news/201502/2015021711505431705145.JPG
◦ style: https://github.com/andersbll/neural_artistic_style/blob/master/images/starry_night.jpg?raw=true

The Mechanism of Painting
• Artist: Brain; scene + style → artwork
• Computer: Neural Networks; scene + style → artwork

Misconception

Content Generation
• The artist's brain receives the content as neural stimulation and draws on the canvas, minimizing the difference between the two

Content Generation
• Feed the content image and the canvas into VGG19 and compare their filter responses (a width×height by depth volume)
• Update the colors of the canvas pixels to minimize the difference; the result reproduces the content

Content Generation
• Layer l's filter responses are matrices indexed by depth (i) and position (j, running over width×height): one for the input photo, one for the input canvas

Content Generation
• Backward propagation: compute the gradient of the difference in layer l's filter responses with respect to the input canvas, and update the canvas with the learning rate

Content Generation
• Results from different VGG19 layers: conv1_2, conv2_2, conv3_4, conv4_4, conv5_1, conv5_2

Style Generation
• Feed the artwork into VGG19 and compute the Gram matrix G of the filter responses
• Filter responses (width×height by depth) are position-dependent; the Gram matrix (depth × depth) is position-independent

Style Generation
• Gram matrix entry G_{k1,k2}: the inner product, over all positions j (width×height), of filter k1's and filter k2's responses at layer l
(figure: a numeric example of filter responses and the resulting depth×depth Gram matrix)
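A minimal NumPy sketch of the Gram matrix computation; the depth and spatial size are illustrative.

```python
import numpy as np

depth, positions = 64, 32 * 32
F = np.random.rand(depth, positions)   # layer-l filter responses
G = F @ F.T                            # G[k1,k2] = sum_j F[k1,j] * F[k2,j]
print(G.shape)                         # (64, 64): position-independent
```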

Style Generation
• Compute layer l's Gram matrix for the input artwork and for the input canvas
• Update the colors of the canvas pixels to minimize the difference between the two Gram matrices; the result reproduces the style

Style Generation
• Results using increasing sets of VGG19 layers: Conv1_1; Conv1_1 + Conv2_1; Conv1_1 + Conv2_1 + Conv3_1; Conv1_1 + Conv2_1 + Conv3_1 + Conv4_1; Conv1_1 + Conv2_1 + Conv3_1 + Conv4_1 + Conv5_1

Artwork Generation
• Minimize both differences at once: content uses the filter responses of Conv4_2; style uses the Gram matrices of Conv1_1, Conv2_1, Conv3_1, Conv4_1, Conv5_1

Content v.s. Style
(figure: results for different content/style weight ratios: 0.15, 0.05, 0.02, 0.007)

Neural Doodle
• Paper: https://arxiv.org/abs/1603.01768
• Source code: https://github.com/alexjc/neural-doodle
• Inputs: style and content images plus semantic maps; output: result image

Neural Doodle
• Image analogy
• Scary link, view at your own risk! https://raw.githubusercontent.com/awentzonline/image-analogies/master/examples/images/trump-image-analogy.jpg

Real-time Texture Synthesis
• Paper: https://arxiv.org/pdf/1604.04382v1.pdf
◦ GAN: https://arxiv.org/pdf/1406.2661v1.pdf
◦ VAE: https://arxiv.org/pdf/1312.6114v10.pdf
• Source code: https://github.com/chuanli11/MGANs

Outline
• CNN (Convolutional Neural Networks) Introduction
• Evolution of CNN
• Visualizing the Features
• CNN as Artist
• Sentiment Analysis by CNN

A Convolutional Neural Network for Modelling Sentences
• Paper: https://arxiv.org/abs/1404.2188
• Source code: https://github.com/FredericGodin/DynamicCNN

Drawbacks of Recursive Neural Networks (RvNN)
• A human-labeled syntax tree is needed during training
(figure: word vectors of "This is a dog" are combined by the same RvNN, following the syntax tree)

Drawbacks of Recursive Neural Networks (RvNN)
• Ambiguity in natural language
http://3rd.mafengwo.cn/travels/info_weibo.php?id=2861280
http://www.appledaily.com.tw/realtimenews/article/new/20151006/705309/

Element-wise 1D Operations on Word Vectors
• 1D convolution or 1D pooling, applied element-wise over neighboring word vectors (e.g., over "This is a")

From RvNN to CNN
• RvNN: the same RvNN cell is applied recursively, following the syntax tree
• CNN: different convolutional layers (conv1, conv2, conv3) are stacked over the sentence

CNN with Max-Pooling Layers
• The conv + max-pooling hierarchy is similar to a syntax tree
• But a human-labeled syntax tree is not needed
(figure: conv1 → pool1 → conv2 over "This is a dog"; max-pooling selects which conv1 outputs survive)

Sentiment Analysis by CNN
• Use a softmax layer to classify the sentiments
• "This movie is awesome" → positive; "This movie is awful" → negative
(architecture: conv1 → pool1 → conv2 → softmax)
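A minimal Keras sketch of a sentence-classification CNN of this shape (word vectors → 1D convolution → pooling → softmax); the vocabulary size and dimensions are illustrative assumptions, and max-over-time pooling stands in for the pooling schemes discussed below.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Embedding(input_dim=10000, output_dim=128),  # word vectors
    layers.Conv1D(100, 3, activation="relu"),           # conv over word windows
    layers.GlobalMaxPooling1D(),                        # fixed-size output
    layers.Dense(2, activation="softmax"),              # positive / negative
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```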

Sentiment Analysis by CNN
• Build the "correct syntax tree" by training
• If "This movie is awesome" is classified as negative, the error is backward-propagated and the weights are updated until the prediction becomes positive

Multiple Filters
• Several filters per layer (filter11, filter12, filter13 in layer 1; filter21, filter22, filter23 in layer 2) give richer features than an RNN

Sentences can't be easily resized
• An image can easily be resized; a sentence cannot
• e.g., 全台灣最高樓在台北 ("Taiwan's tallest building is in Taipei") can only be "resized" by rewording it: 全台灣最高的高樓在台北市 / 全台灣最高樓在台北市 / 台灣最高樓在台北

Various Input Size
• Convolutional layers and pooling layers can handle input of various sizes (e.g., "This is a dog" vs. "the dog run")

Various Input Size
• Fully-connected layers and the softmax layer need fixed-size input

k-max Pooling
• Choose the k largest values
• Preserve the order of the input values
• Variable-size input, fixed-size output
• 3-max pooling: [13, 4, 1, 7, 8] → [13, 7, 8]; [12, 5, 21, 15, 7, 4, 9] → [12, 21, 15]
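A minimal NumPy sketch reproducing the slide's two examples:

```python
import numpy as np

def k_max_pooling(x, k):
    idx = np.sort(np.argsort(x)[-k:])   # positions of the k largest, in order
    return x[idx]

print(k_max_pooling(np.array([13, 4, 1, 7, 8]), 3))          # [13  7  8]
print(k_max_pooling(np.array([12, 5, 21, 15, 7, 4, 9]), 3))  # [12 21 15]
```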

Wide Convolution
• Ensures that all weights reach the entire sentence
(figure: wide convolution vs. narrow convolution)

Dynamic k-max Pooling
• Apply wide convolution & k-max pooling at every layer, with
k_l = max(k_top, ⌈((L − l)/L)·s⌉)
where:
◦ l: index of the current layer
◦ k_l: k of the current layer
◦ k_top: k of the top layer (a constant)
◦ L: total number of layers (a constant)
◦ s: length of the input sentence

Dynamic k-max Pooling
• Examples with L = 2, k_top = 3:
◦ s = 10: k1 = max(3, ⌈(2 − 1)/2 × 10⌉) = 5
◦ s = 14: k1 = max(3, ⌈(2 − 1)/2 × 14⌉) = 7
◦ s = 8: k1 = max(3, ⌈(2 − 1)/2 × 8⌉) = 4
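The schedule, checked against all three examples:

```python
import math

def dynamic_k(l, L, k_top, s):
    """k_l = max(k_top, ceil((L - l) / L * s))"""
    return max(k_top, math.ceil((L - l) / L * s))

for s in (10, 14, 8):
    print(s, dynamic_k(l=1, L=2, k_top=3, s=s))   # 5, 7, 4
```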

Dynamic k-max Pooling
(figure: the full model, alternating wide convolution & dynamic k-max pooling)

Convolutional Neural Networks for Sentence Classification
• Paper: http://www.aclweb.org/anthology/D14-1181
• Source code: https://github.com/yoonkim/CNN_sentence

Static & Non-Static Channel
• Word vectors pretrained by word2vec
• Static channel: fix the values during training
• Non-static channel: update the values during training

About the Lecturer
Mark Chang
• Email: ckmarkoh at gmail dot com
• Blog: http://cpmarkchang.logdown.com
• Github: https://github.com/ckmarkoh
• Slideshare: http://www.slideshare.net/ckmarkohchang
• Youtube: https://www.youtube.com/channel/UCckNPGDL21aznRhl3EijRQw

HTC Research & Healthcare, Deep Learning Algorithms Research Engineer
