Lecture 7: Convolutional Networks
justincj/slides/eecs498/498_FA2019_lecture07.pdf
Justin Johnson, September 24, 2019


Page 1: Lecture 7: Convolutional Networks

Page 2: Reminder: A2

Due Monday, September 30, 11:59pm (even if you enrolled late!)

Your submission must pass the validation script.

Page 3: Slight schedule change

Content originally planned for today got split into two lectures, which pushes the schedule back a bit:

A4 due date: Friday 11/1 -> Friday 11/8
A5 due date: Friday 11/15 -> Friday 11/22
A6 due date: still Friday 12/6

Page 4: Last Time: Backpropagation

[Figure: computational graph of a linear classifier: inputs x and W produce scores s; hinge loss and regularizer R combine into the loss L]

Represent complex expressions as computational graphs.

Forward pass computes outputs; backward pass computes gradients.

[Figure: a node f with its local gradients, an upstream gradient, and downstream gradients]

During the backward pass, each node in the graph receives upstream gradients and multiplies them by local gradients to compute downstream gradients.

Page 5:

Input image (2, 2):
56 231
24  2
Stretch pixels into a column vector (4,), then apply x -> W1 -> h -> W2 -> s.

For CIFAR: Input: 3072; Hidden layer: 100; Output: 10

f(x, W) = Wx

Problem: So far our classifiers don't respect the spatial structure of images!

Page 6:

Solution: Define new computational nodes that operate on images!

Page 7: Components of a Fully-Connected Network

x -> h -> s

Fully-Connected Layers; Activation Function

Page 8: Components of a Convolutional Network

x -> h -> s

Convolution Layers; Pooling Layers; Fully-Connected Layers; Activation Function; Normalization


Page 10: Fully-Connected Layer

32x32x3 image -> stretch to 3072x1

Input: 3072x1; Weights: 10x3072; Output: 10x1

Page 11: Fully-Connected Layer

Each output is 1 number: the result of taking a dot product between a row of W and the input (a 3072-dimensional dot product).
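The fully-connected layer above can be sketched in a few lines of NumPy (a minimal illustration, not the lecture's code; the random values are placeholders for an image and learned weights):

```python
import numpy as np

# Fully-connected layer on a stretched CIFAR image:
# a 3x32x32 image becomes a 3072-vector; a 10x3072 weight matrix
# maps it to 10 scores, one 3072-dim dot product per row of W.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 32, 32)).reshape(3072)  # stretch pixels into a column
W = rng.standard_normal((10, 3072))
b = rng.standard_normal(10)

s = W @ x + b                              # scores, shape (10,)
assert s.shape == (10,)
assert np.isclose(s[3], W[3] @ x + b[3])   # score 3 = row 3 of W dotted with x
```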

Page 12: Convolution Layer

3x32x32 image: preserve spatial structure (depth/channels: 3, height: 32, width: 32)

Page 13: Convolution Layer

3x32x32 image; 3x5x5 filter

Convolve the filter with the image, i.e. "slide over the image spatially, computing dot products."

Page 14: Convolution Layer

Filters always extend the full depth of the input volume.

Page 15: Convolution Layer

3x32x32 image; 3x5x5 filter

Each output is 1 number: the result of taking a dot product between the filter and a small 3x5x5 chunk of the image (i.e. a 3*5*5 = 75-dimensional dot product + bias).

Page 16: Convolution Layer

Convolve (slide) the 3x5x5 filter over all spatial locations of the 3x32x32 image to produce a 1x28x28 activation map.

Page 17: Convolution Layer

Consider repeating with a second (green) filter: we get two 1x28x28 activation maps.

Page 18: Convolution Layer

Consider 6 filters, each 3x5x5 (6x3x5x5 filters in all): the convolution layer gives 6 activation maps, each 1x28x28. Stack activations to get a 6x28x28 output image!

Page 19: Convolution Layer

Also a 6-dim bias vector: one bias per filter.

Page 20: Convolution Layer

Another view of the 6x28x28 output: a 28x28 grid with a 6-dim vector at each point.

Page 21: Convolution Layer

A 2x3x32x32 batch of images, convolved with 6x3x5x5 filters (plus the 6-dim bias vector), gives a 2x6x28x28 batch of outputs.

Page 22: Convolution Layer

In general: an N x Cin x H x W batch of images, convolved with Cout x Cin x Kh x Kw filters (plus a Cout-dim bias vector), gives an N x Cout x H' x W' batch of outputs.

Page 23: Stacking Convolutions

Input: N x 3 x 32 x 32
Conv W1: 6x3x5x5, b1: 6 -> First hidden layer: N x 6 x 28 x 28
Conv W2: 10x6x3x3, b2: 10 -> Second hidden layer: N x 10 x 26 x 26
Conv W3: 12x10x3x3, b3: 12 -> ...

Page 24: Stacking Convolutions

Q: What happens if we stack two convolution layers?

Page 25: Stacking Convolutions

Q: What happens if we stack two convolution layers?
A: We get another convolution! (Recall y = W2 W1 x is a linear classifier.)

So we insert activation functions between them: Conv -> ReLU -> Conv -> ReLU -> Conv -> ReLU.

Page 26: What do convolutional filters learn?

(Setup: Input N x 3 x 32 x 32 -> Conv (W1: 6x3x5x5, b1: 6) -> ReLU -> First hidden layer N x 6 x 28 x 28)

Page 27: What do convolutional filters learn?

Linear classifier: one template per class.

Page 28: What do convolutional filters learn?

MLP: bank of whole-image templates.

Page 29: What do convolutional filters learn?

First-layer conv filters: local image templates (often learn oriented edges, opposing colors).

AlexNet: 64 filters, each 3x11x11.

Page 30: A closer look at spatial dimensions

Input: N x 3 x 32 x 32 -> Conv (W1: 6x3x5x5, b1: 6) -> ReLU -> First hidden layer: N x 6 x 28 x 28

Page 31: A closer look at spatial dimensions

Input: 7x7; Filter: 3x3 (slide the 3x3 filter across the 7x7 input)


Page 35: A closer look at spatial dimensions

Input: 7x7; Filter: 3x3; Output: 5x5

Page 36: A closer look at spatial dimensions

Input: 7x7; Filter: 3x3; Output: 5x5

In general: Input W, Filter K -> Output W - K + 1

Problem: Feature maps "shrink" with each layer!

Page 37: A closer look at spatial dimensions

[Figure: the 7x7 input surrounded by a border of zeros]

Solution: padding. Add zeros around the input.

Page 38: A closer look at spatial dimensions

In general: Input W, Filter K, Padding P -> Output W - K + 1 + 2P

Very common: set P = (K - 1) / 2 to make the output have the same size as the input!

Page 39: Receptive Fields

For convolution with kernel size K, each element in the output depends on a KxK receptive field in the input.

Page 40: Receptive Fields

Each successive convolution adds K - 1 to the receptive field size; with L layers the receptive field size is 1 + L * (K - 1).

Be careful to distinguish "receptive field in the input" from "receptive field in the previous layer"; hopefully clear from context!
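The growth rule above is easy to check numerically (a small sketch; the function name is mine):

```python
def receptive_field(num_layers, kernel_size):
    # Each conv layer adds K - 1 to the receptive field, starting from 1 pixel.
    return 1 + num_layers * (kernel_size - 1)

assert receptive_field(1, 3) == 3     # one 3x3 conv sees a 3x3 patch
assert receptive_field(2, 3) == 5     # two stacked 3x3 convs see 5x5
assert receptive_field(10, 3) == 21   # growth is only linear in depth
```

The linear growth is exactly why large images need many layers (or downsampling) before a single output can see the whole input.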

Page 41: Receptive Fields

Problem: For large images we need many layers for each output to "see" the whole image.

Page 42: Receptive Fields

Solution: Downsample inside the network.

Page 43: Strided Convolution

Input: 7x7; Filter: 3x3; Stride: 2


Page 45: Strided Convolution

Input: 7x7; Filter: 3x3; Stride: 2; Output: 3x3

Page 46: Strided Convolution

Input: 7x7; Filter: 3x3; Stride: 2; Output: 3x3

In general: Input W, Filter K, Padding P, Stride S -> Output (W - K + 2P) / S + 1
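The general output-size formula can be wrapped in a small helper (my naming; the integer division assumes the stride divides evenly, as in the examples on these slides):

```python
def conv_output_size(w, k, p=0, s=1):
    # Output = (W - K + 2P) / S + 1
    assert (w - k + 2 * p) % s == 0, "stride must divide (W - K + 2P) evenly"
    return (w - k + 2 * p) // s + 1

assert conv_output_size(7, 3) == 5            # plain 3x3 conv on 7x7
assert conv_output_size(7, 3, p=0, s=2) == 3  # the strided example above
assert conv_output_size(32, 5, p=2) == 32     # "same" padding: P = (K - 1) / 2
```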

Page 47: Convolution Example

Input volume: 3x32x32; 10 5x5 filters with stride 1, pad 2

Output volume size: ?

Page 48: Convolution Example

Output volume size: (32 + 2*2 - 5)/1 + 1 = 32 spatially, so 10x32x32.

Page 49: Convolution Example

Number of learnable parameters: ?

Page 50: Convolution Example

Number of learnable parameters: 760.
Parameters per filter: 3*5*5 + 1 (for bias) = 76; 10 filters, so the total is 10 * 76 = 760.

Page 51: Convolution Example

Number of multiply-add operations: ?

Page 52: Convolution Example

Number of multiply-add operations: 768,000.
10*32*32 = 10,240 outputs; each output is the inner product of two 3x5x5 tensors (75 elements); total = 75 * 10,240 = 768K.
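Both counts follow mechanically from the shapes; a quick sketch (variable names are mine):

```python
# The example above: 3x32x32 input; 10 filters of size 5x5, stride 1, pad 2.
cin, h, w = 3, 32, 32
cout, k, p, s = 10, 5, 2, 1

params = cout * (cin * k * k + 1)               # +1 for each filter's bias
h_out = (h - k + 2 * p) // s + 1
w_out = (w - k + 2 * p) // s + 1
macs = (cout * h_out * w_out) * (cin * k * k)   # one 75-elem dot product per output

assert params == 760
assert macs == 768_000
```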

Page 53: Example: 1x1 Convolution

A 64x56x56 input, convolved with 32 1x1 filters, gives a 32x56x56 output (each filter has size 1x1x64 and performs a 64-dimensional dot product).

Page 54: Example: 1x1 Convolution

Stacking 1x1 conv layers gives an MLP operating on each input position.

Lin et al., "Network in Network", ICLR 2014
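The equivalence between a 1x1 convolution and a per-position linear layer can be verified directly in NumPy (a sketch under the slide's 64 -> 32 channel setup, with random stand-in weights and no bias):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 56, 56))   # Cin x H x W
W = rng.standard_normal((32, 64))       # Cout x Cin (the 1x1 kernels)

# 1x1 conv: out[o, i, j] = sum_c W[o, c] * x[c, i, j]
out_conv = np.einsum('oc,chw->ohw', W, x)

# Same computation as a linear layer applied at each of the H*W positions:
out_linear = (x.reshape(64, -1).T @ W.T).T.reshape(32, 56, 56)

assert np.allclose(out_conv, out_linear)
```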

Page 55: Convolution Summary

Input: Cin x H x W
Hyperparameters:
- Kernel size: KH x KW
- Number of filters: Cout
- Padding: P
- Stride: S
Weight matrix: Cout x Cin x KH x KW, giving Cout filters of size Cin x KH x KW
Bias vector: Cout
Output size: Cout x H' x W', where:
- H' = (H - K + 2P) / S + 1
- W' = (W - K + 2P) / S + 1

Page 56: Convolution Summary

Common settings:
- KH = KW (small square filters)
- P = (K - 1) / 2 ("same" padding)
- Cin, Cout = 32, 64, 128, 256 (powers of 2)
- K = 3, P = 1, S = 1 (3x3 conv)
- K = 5, P = 2, S = 1 (5x5 conv)
- K = 1, P = 0, S = 1 (1x1 conv)
- K = 3, P = 1, S = 2 (downsample by 2)
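Putting the summary together, here is a deliberately naive (loop-based, not vectorized) 2D convolution sketch, assuming square kernels and equal zero padding on all sides:

```python
import numpy as np

def conv2d(x, w, b, stride=1, pad=0):
    """Naive 2D convolution. x: (Cin, H, W); w: (Cout, Cin, K, K); b: (Cout,)."""
    cout, cin, k, _ = w.shape
    x = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # zeros around the input
    h_out = (x.shape[1] - k) // stride + 1
    w_out = (x.shape[2] - k) // stride + 1
    out = np.empty((cout, h_out, w_out))
    for f in range(cout):                  # each filter
        for i in range(h_out):             # each output row
            for j in range(w_out):         # each output column
                patch = x[:, i*stride:i*stride+k, j*stride:j*stride+k]
                out[f, i, j] = np.sum(patch * w[f]) + b[f]
    return out

# The running example: 3x32x32 image, 6 filters of size 3x5x5 -> 6x28x28.
out = conv2d(np.random.randn(3, 32, 32), np.random.randn(6, 3, 5, 5), np.zeros(6))
assert out.shape == (6, 28, 28)
```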

Page 57: Other types of convolution

So far: 2D convolution.
Input: Cin x H x W; Weights: Cout x Cin x K x K

Page 58: Other types of convolution

1D convolution:
Input: Cin x W; Weights: Cout x Cin x K

Page 59: Other types of convolution

3D convolution (a Cin-dim vector at each point in an H x W x D volume):
Input: Cin x H x W x D; Weights: Cout x Cin x K x K x K

Page 60: PyTorch Convolution Layer

[Screenshot: PyTorch convolution layer documentation]

Page 61: PyTorch Convolution Layers

[Screenshot: PyTorch convolution layers documentation]

Page 62: Components of a Convolutional Network

x -> h -> s

Convolution Layers; Pooling Layers; Fully-Connected Layers; Activation Function; Normalization

Page 63: Pooling Layers: Another way to downsample

Hyperparameters: kernel size, stride, pooling function

Page 64: Max Pooling

Single depth slice:
1 1 2 4
5 6 7 8
3 2 1 0
1 2 3 4

Max pooling with 2x2 kernel size and stride 2:
6 8
3 4

Introduces invariance to small spatial shifts. No learnable parameters!
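The 2x2, stride-2 max pool above can be reproduced with a reshape trick (a sketch that assumes the kernel size equals the stride and divides the input evenly):

```python
import numpy as np

x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]])

# Split into non-overlapping 2x2 blocks, then take the max of each block.
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))

assert (pooled == np.array([[6, 8],
                            [3, 4]])).all()
```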

Page 65: Pooling Summary

Input: C x H x W
Hyperparameters:
- Kernel size: K
- Stride: S
- Pooling function (max, avg)
Output: C x H' x W', where:
- H' = (H - K) / S + 1
- W' = (W - K) / S + 1
Learnable parameters: none!

Common settings: max, K = 2, S = 2; max, K = 3, S = 2 (AlexNet)

Page 66: Components of a Convolutional Network

x -> h -> s

Convolution Layers; Pooling Layers; Fully-Connected Layers; Activation Function; Normalization

Page 67: Convolutional Networks

Classic architecture: [Conv, ReLU, Pool] x N, flatten, [FC, ReLU] x N, FC

Example: LeNet-5

LeCun et al., "Gradient-based learning applied to document recognition", 1998

Page 68: Example: LeNet-5

Layer                         | Output Size  | Weight Size
Input                         | 1 x 28 x 28  |
Conv (Cout=20, K=5, P=2, S=1) | 20 x 28 x 28 | 20 x 1 x 5 x 5
ReLU                          | 20 x 28 x 28 |
MaxPool (K=2, S=2)            | 20 x 14 x 14 |
Conv (Cout=50, K=5, P=2, S=1) | 50 x 14 x 14 | 50 x 20 x 5 x 5
ReLU                          | 50 x 14 x 14 |
MaxPool (K=2, S=2)            | 50 x 7 x 7   |
Flatten                       | 2450         |
Linear (2450 -> 500)          | 500          | 2450 x 500
ReLU                          | 500          |
Linear (500 -> 10)            | 10           | 500 x 10

LeCun et al., "Gradient-based learning applied to document recognition", 1998
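Every output size in the table follows from the conv/pool size formulas earlier in the lecture; a quick sketch of the spatial sizes (the helper name is mine):

```python
def out_size(h, k, p, s):
    # (H - K + 2P) / S + 1, for both convs and pools
    return (h - k + 2 * p) // s + 1

h = 28                                          # input: 1 x 28 x 28
h = out_size(h, k=5, p=2, s=1); assert h == 28  # Conv, 20 channels
h = out_size(h, k=2, p=0, s=2); assert h == 14  # MaxPool
h = out_size(h, k=5, p=2, s=1); assert h == 14  # Conv, 50 channels
h = out_size(h, k=2, p=0, s=2); assert h == 7   # MaxPool
assert 50 * h * h == 2450                       # Flatten -> Linear(2450 -> 500)
```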


As we go through the network:
- Spatial size decreases (using pooling or strided conv)
- Number of channels increases (total "volume" is preserved!)
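The sizes in the LeNet-5 table follow directly from the standard output-size formula, out = (in + 2P − K) / S + 1 for convolution and out = (in − K) / S + 1 for pooling without padding. A minimal pure-Python sketch (the helper names `conv_out` and `pool_out` are illustrative, not from the lecture) that reproduces the table's shapes and weight counts:

```python
def conv_out(size, K, P=0, S=1):
    """Spatial output size of a conv layer: (size + 2P - K) // S + 1."""
    return (size + 2 * P - K) // S + 1

def pool_out(size, K, S):
    """Spatial output size of a pooling layer (no padding)."""
    return (size - K) // S + 1

# Follow the LeNet-5 table, starting from a 1 x 28 x 28 input.
s = conv_out(28, K=5, P=2, S=1)   # conv1: 28 -> 28
s = pool_out(s, K=2, S=2)         # pool1: 28 -> 14
s = conv_out(s, K=5, P=2, S=1)    # conv2: 14 -> 14
s = pool_out(s, K=2, S=2)         # pool2: 14 -> 7
flat = 50 * s * s                 # flatten: 50 x 7 x 7 = 2450

# Weight counts from the table (ignoring biases):
params = {
    "conv1": 20 * 1 * 5 * 5,
    "conv2": 50 * 20 * 5 * 5,
    "fc1": 2450 * 500,
    "fc2": 500 * 10,
}
```

Note how the fully-connected layer dominates the parameter count (2450 x 500 = 1,225,000 weights), even though the conv layers do most of the computation.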

Problem: Deep networks are very hard to train!

Components of a Convolutional Network:
Convolution Layers, Pooling Layers, Fully-Connected Layers, Activation Functions, Normalization

Batch Normalization

Idea: "Normalize" the outputs of a layer so they have zero mean and unit variance.

Why? Helps reduce "internal covariate shift", improves optimization.

We can normalize a batch of activations like this:

x̂ = (x − 𝞵) / 𝝈

This is a differentiable function, so we can use it as an operator in our networks and backprop through it!

Ioffe and Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift", ICML 2015

Concretely, for an input x of shape N x D:

Per-channel mean 𝞵, shape is D:   𝞵_j = (1/N) Σ_i x_{i,j}
Per-channel std 𝝈, shape is D:    𝝈_j = sqrt((1/N) Σ_i (x_{i,j} − 𝞵_j)²)
Normalized x̂, shape is N x D:     x̂_{i,j} = (x_{i,j} − 𝞵_j) / 𝝈_j

Problem: What if zero-mean, unit variance is too hard of a constraint?

Solution: learnable scale and shift parameters ɣ, β (each of shape D):

y = ɣ (x − 𝞵) / 𝝈 + β     (output y, shape is N x D)

Learning ɣ = 𝝈, β = 𝞵 will recover the identity function!
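The identity-recovery claim is easy to check numerically. A minimal pure-Python sketch of the training-time batchnorm computation (the function names `batch_stats` and `batchnorm_train` are illustrative, not from the lecture; ε is omitted for clarity):

```python
import math

def batch_stats(x):
    """Per-column mean and std of a batch x of shape N x D (lists of lists)."""
    N, D = len(x), len(x[0])
    mu = [sum(row[j] for row in x) / N for j in range(D)]
    sigma = [math.sqrt(sum((row[j] - mu[j]) ** 2 for row in x) / N)
             for j in range(D)]
    return mu, sigma

def batchnorm_train(x, gamma, beta):
    """y = gamma * (x - mu) / sigma + beta, with mu/sigma from the batch."""
    mu, sigma = batch_stats(x)
    return [[gamma[j] * (row[j] - mu[j]) / sigma[j] + beta[j]
             for j in range(len(row))] for row in x]

x = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
mu, sigma = batch_stats(x)

# Setting gamma = sigma and beta = mu undoes the normalization,
# so the layer reduces to the identity function.
y = batchnorm_train(x, gamma=sigma, beta=mu)
```

With ɣ = 1, β = 0 the output columns have zero mean and unit variance; with ɣ = 𝝈, β = 𝞵 the input comes back unchanged.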

Batch Normalization: Test-Time

Problem: The estimates of 𝞵 and 𝝈 depend on the minibatch; we can't do this at test-time!

Solution: at test time, replace 𝞵 and 𝝈 with (running) averages of the values seen during training.

During testing, batchnorm becomes a linear operator! It can be fused with the previous fully-connected or conv layer.
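The fusion works because test-time batchnorm is a fixed per-channel affine map: for a linear layer y = xW + b followed by batchnorm with frozen 𝞵, 𝝈, ɣ, β, the fused layer uses W' = W · (ɣ/𝝈) and b' = ɣ(b − 𝞵)/𝝈 + β. A pure-Python sketch with illustrative names and made-up numbers:

```python
def linear(x, W, b):
    """y = x @ W + b for x: N x Din, W: Din x Dout (plain lists)."""
    return [[sum(xi[k] * W[k][j] for k in range(len(W))) + b[j]
             for j in range(len(b))] for xi in x]

def bn_eval(y, mu, sigma, gamma, beta):
    """Test-time batchnorm: a fixed affine map per output channel."""
    return [[gamma[j] * (row[j] - mu[j]) / sigma[j] + beta[j]
             for j in range(len(row))] for row in y]

def fuse(W, b, mu, sigma, gamma, beta):
    """Fold batchnorm into the linear layer: W' = W*s, b' = s*(b - mu) + beta."""
    s = [gamma[j] / sigma[j] for j in range(len(b))]
    W_f = [[W[k][j] * s[j] for j in range(len(b))] for k in range(len(W))]
    b_f = [s[j] * (b[j] - mu[j]) + beta[j] for j in range(len(b))]
    return W_f, b_f

x = [[1.0, 2.0], [3.0, -1.0]]
W = [[0.5, -1.0], [2.0, 0.3]]
b = [0.1, -0.2]
mu, sigma = [0.4, 0.6], [1.5, 2.0]   # running averages from training
gamma, beta = [1.2, 0.8], [0.0, 0.5]

y_two_ops = bn_eval(linear(x, W, b), mu, sigma, gamma, beta)
W_f, b_f = fuse(W, b, mu, sigma, gamma, beta)
y_fused = linear(x, W_f, b_f)   # same result, one layer instead of two
```

This is why batchnorm has zero inference-time overhead: the two operations collapse into a single linear layer.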

Batch Normalization for ConvNets

Batch Normalization for fully-connected networks:
x: N × D
𝞵, 𝝈: 1 × D
ɣ, β: 1 × D
y = ɣ (x − 𝞵) / 𝝈 + β

Batch Normalization for convolutional networks (Spatial Batchnorm, BatchNorm2D):
x: N × C × H × W
𝞵, 𝝈: 1 × C × 1 × 1
ɣ, β: 1 × C × 1 × 1
y = ɣ (x − 𝞵) / 𝝈 + β
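The only difference between the two cases is which axes the statistics average over: spatial batchnorm pools each channel's statistics over the batch and both spatial dimensions, so the stats have shape C. A small pure-Python sketch (the function name `spatial_channel_stats` is illustrative):

```python
def spatial_channel_stats(x):
    """Per-channel mean/var of x with shape N x C x H x W (nested lists),
    averaging over batch and spatial axes -- the stats have shape C."""
    N, C = len(x), len(x[0])
    H, W = len(x[0][0]), len(x[0][0][0])
    mu, var = [], []
    for c in range(C):
        vals = [x[n][c][h][w]
                for n in range(N) for h in range(H) for w in range(W)]
        m = sum(vals) / len(vals)
        mu.append(m)
        var.append(sum((v - m) ** 2 for v in vals) / len(vals))
    return mu, var

# Toy tensor: channel c holds the constant value float(c) everywhere.
N, C, H, W = 2, 3, 4, 4
x = [[[[float(c)] * W for _ in range(H)] for c in range(C)] for _ in range(N)]
mu, var = spatial_channel_stats(x)   # mu == [0.0, 1.0, 2.0], var all zero
```

In a real framework this is just a reduction over axes (0, 2, 3) of the N × C × H × W tensor.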

Batch normalization is usually inserted after Fully-Connected or Convolutional layers, and before the nonlinearity:

FC -> BN -> tanh -> FC -> BN -> tanh -> ...

Batch normalization:
- Makes deep networks much easier to train!
- Allows higher learning rates, faster convergence
- Networks become more robust to initialization
- Acts as regularization during training
- Zero overhead at test-time: can be fused with conv!
- Not well-understood theoretically (yet)
- Behaves differently during training and testing: this is a very common source of bugs!

(Figure: ImageNet accuracy vs. training iterations, with and without batch normalization.)

Ioffe and Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift", ICML 2015

Layer Normalization

Batch Normalization for fully-connected networks:
x: N × D
𝞵, 𝝈: 1 × D
ɣ, β: 1 × D
y = ɣ (x − 𝞵) / 𝝈 + β

Layer Normalization for fully-connected networks:
x: N × D
𝞵, 𝝈: N × 1
ɣ, β: 1 × D
y = ɣ (x − 𝞵) / 𝝈 + β

Same behavior at train and test! Used in RNNs, Transformers.

Ba, Kiros, and Hinton, "Layer Normalization", arXiv 2016
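In code the difference from batchnorm is just the axis: each row (sample) gets its own mean and std, so nothing depends on the rest of the batch and the train/test mismatch disappears. A minimal pure-Python sketch (the name `layernorm` is illustrative):

```python
import math

def layernorm(x, gamma, beta, eps=1e-5):
    """Normalize each row of x (N x D) over its own D features."""
    out = []
    for row in x:
        m = sum(row) / len(row)                       # per-sample mean
        v = sum((t - m) ** 2 for t in row) / len(row)  # per-sample variance
        s = math.sqrt(v + eps)
        out.append([gamma[j] * (row[j] - m) / s + beta[j]
                    for j in range(len(row))])
    return out

x = [[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]]
y = layernorm(x, gamma=[1.0] * 3, beta=[0.0] * 3)

# A single sample normalizes identically regardless of the rest of the batch:
y_single = layernorm([x[0]], gamma=[1.0] * 3, beta=[0.0] * 3)
```

That batch-independence is what makes layer normalization convenient for RNNs and Transformers, where batch statistics are awkward to maintain.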

Instance Normalization

Batch Normalization for convolutional networks:
x: N × C × H × W
𝞵, 𝝈: 1 × C × 1 × 1
ɣ, β: 1 × C × 1 × 1
y = ɣ (x − 𝞵) / 𝝈 + β

Instance Normalization for convolutional networks:
x: N × C × H × W
𝞵, 𝝈: N × C × 1 × 1
ɣ, β: 1 × C × 1 × 1
y = ɣ (x − 𝞵) / 𝝈 + β

Same behavior at train / test!

Ulyanov et al, "Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis", CVPR 2017
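Same game with the axes: instance normalization computes a separate mean and std for every (sample, channel) pair, over spatial positions only, so its statistics have shape N × C. A pure-Python sketch (the name `instancenorm` is illustrative; ɣ and β omitted for brevity):

```python
import math

def instancenorm(x, eps=1e-5):
    """Normalize x (N x C x H x W, nested lists) per (sample, channel),
    over the spatial dimensions only."""
    out = []
    for sample in x:
        norm_sample = []
        for chan in sample:
            vals = [v for row in chan for v in row]
            m = sum(vals) / len(vals)
            s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals) + eps)
            norm_sample.append([[(v - m) / s for v in row] for row in chan])
        out.append(norm_sample)
    return out

x = [[[[1.0, 2.0], [3.0, 4.0]],       # sample 0, channel 0
      [[10.0, 10.0], [20.0, 20.0]]]]  # sample 0, channel 1
y = instancenorm(x)   # each channel of each sample now has ~zero mean
```

Like layer normalization, this uses no batch statistics, so training and test behavior match.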

Comparison of Normalization Layers

(Figure: Batch Norm, Layer Norm, Instance Norm, and Group Norm, visualized by which slices of an N × C × (H, W) tensor are normalized together.)

Wu and He, "Group Normalization", ECCV 2018

Group Normalization

(Figure: Group Normalization computes statistics over groups of channels, together with the spatial dimensions.)

Wu and He, "Group Normalization", ECCV 2018

Components of a Convolutional Network:
Convolution Layers (the most computationally expensive!), Pooling Layers, Fully-Connected Layers, Activation Functions, Normalization

Summary: Components of a Convolutional Network
Convolution Layers, Pooling Layers, Fully-Connected Layers, Activation Functions, Normalization

Problem: What is the right way to combine all these components?

Next time: CNN Architectures