Deep learning by convolutional networks
Michael Kampffmeyer
Geilo Winter School, 16. Jan 2017
What we will cover...
Introduction
Background
CNNs
Practical tips & software
Segmentation & object detection
Conclusion
Slide inspirations
- Hugo Larochelle: Neural networks course
- Christopher Olah: http://colah.github.io
- Fei-Fei Li, Andrej Karpathy and Justin Johnson: cs231n slides
Introduction Background CNNs Practical tips & software Segmentation & object detection Conclusion
Deep Learning is everywhere
- Image processing
  - Classification
  - Segmentation
  - Localization
  - Detection
- Speech and text processing
  - Translation
  - Caption generation
  - Word embeddings
  - Sequence prediction
- Reinforcement learning
  - Automatic game playing
- ... and much more
[http://image-net.org]
[Karpathy, 2015]
[https://deepmind.com]
Neural Networks
[Figure: feed-forward network with an input layer (X1 to X4), two hidden layers of five units each (weighted sum Σ followed by activation g(·)), and an output layer (Σ followed by the output activation o(·))]
Neurons
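Pre-activation of a single neuron
$$a(x) = b + \sum_i w_i x_i = b + w^\top x$$
Output of the neuron
$$h(x) = g(a(x))$$
- b is the bias
- w are the weights
- g(·) is the activation function
[Figure: a single neuron with inputs x1, x2 and a constant input 1, weights w1, w2 and bias b, a summation Σ followed by the activation g(·)]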
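A minimal sketch of the single-neuron computation, assuming NumPy and a sigmoid as the activation g. The input, weight and bias values are illustrative and not taken from the slides.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def neuron(x, w, b, g):
    """Pre-activation a(x) = b + w^T x, output h(x) = g(a(x))."""
    a = b + np.dot(w, x)
    return g(a)

# Illustrative values (assumed, not from the slides).
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.25])
b = 0.0
h = neuron(x, w, b, sigmoid)  # a = 0.5 - 0.5 = 0, so h = sigmoid(0)
```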
Theorem (Universal approximation theorem)
"A single hidden layer neural network with a linear output unit can approximate any continuous function arbitrarily well, given enough hidden units" (Hornik, 1991)
- However, learning it is very difficult
- In practice use hierarchical representations
Activations - Linear
- No input squashing
- Not used in hidden layers in practice (stacked linear layers collapse to a single linear map)
$$g(a) = a$$
[Plot: g(a) for a in [-3, 3]]
Activations - Sigmoid
- Squashes between 0 and 1
- Strictly increasing
- Bounded
- Always positive
- Used in AE, RNN, shallow CNNs
$$g(a) = \mathrm{sigmoid}(a) = \frac{1}{1 + e^{-a}}$$
[Plot: g(a) for a in [-3, 3]]
Activations - Tanh
- Squashes between -1 and 1
- Strictly increasing
- Bounded
- Both positive and negative activations
- Used mainly in RNN
$$g(a) = \tanh(a) = \frac{e^{a} - e^{-a}}{e^{a} + e^{-a}} = \frac{e^{2a} - 1}{e^{2a} + 1}$$
[Plot: g(a) for a in [-3, 3]]
Activations - ReLU
- Non-decreasing
- Bounded below
- Sparse activations (faster training)
- More robust against vanishing gradients
- Used in CNNs
$$g(a) = \mathrm{ReLU}(a) = \max(a, 0)$$
[Plot: g(a) for a in [-3, 3]]
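The four activations discussed so far can be sketched in a few lines, assuming NumPy. The function names are chosen here for illustration, and the sample points match the plotted range.

```python
import numpy as np

def linear(a):
    return a

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def tanh(a):
    return np.tanh(a)

def relu(a):
    return np.maximum(a, 0.0)

# Evaluate on a few points from the plotted range [-3, 3].
a = np.array([-3.0, 0.0, 3.0])
values = {g.__name__: g(a) for g in (linear, sigmoid, tanh, relu)}
```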
Stacking Neurons
A single neuron can solve linearly separable problems
[Source: Hugo Larochelle]
But not problems that are not linearly separable
[Source: Hugo Larochelle]
These require transformations: the power of hierarchical representations
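As an illustration of why a transformation helps, XOR is a standard example of a problem that is not linearly separable. It is chosen here as an assumption and may differ from the figure in the slides. The sketch below uses hand-picked (not learned) weights for one hidden layer of two ReLU units, after which a single linear output unit solves it.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

# Hand-picked weights, assumed for illustration only.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])

def xor_net(x):
    h = relu(W1 @ x + b1)  # hidden layer transforms the input space
    return w2 @ h          # linear output unit on the transformed features

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
outputs = np.array([xor_net(x) for x in inputs])
```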
Forward pass
Hidden layer (pre-activation)
$$a^{(k)}(x) = b^{(k)} + W^{(k)} h^{(k-1)}(x)$$
Hidden layer activation
$$h^{(k)}(x) = g(a^{(k)}(x))$$
Output layer
$$h^{(L+1)}(x) = o(a^{(L+1)}(x)) = f(x)$$
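A minimal sketch of the forward pass, assuming NumPy, ReLU hidden activations, a softmax output, and the convention h^(0)(x) = x. The layer sizes and random weights are illustrative assumptions.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

def forward(x, weights, biases, g=relu, o=softmax):
    h = x  # assumed convention: h^(0)(x) = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = g(b + W @ h)                     # hidden layers k = 1..L
    return o(biases[-1] + weights[-1] @ h)   # output layer L+1

# Illustrative network: 4 inputs, two hidden layers of 5 units, 3 classes.
rng = np.random.default_rng(0)
sizes = [4, 5, 5, 3]
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
f_x = forward(rng.normal(size=4), weights, biases)
```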
Output activation
- Common output activation for multi-class classification
- Want
  - Estimate p(y = c | x)
  - Strictly positive
  - Sums to 1
- Bounded below
$$o(a) = \mathrm{softmax}(a) = \left[ \frac{e^{a_1}}{\sum_c e^{a_c}} \; \cdots \; \frac{e^{a_C}}{\sum_c e^{a_c}} \right]$$
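The softmax can be sketched as follows, assuming NumPy. Subtracting the maximum before exponentiating is a common numerical-stability trick that is not mentioned in the slides; it leaves the result unchanged because the common factor cancels in the ratio.

```python
import numpy as np

def softmax(a):
    shifted = a - np.max(a)  # stability trick (assumption): avoids overflow in exp
    e = np.exp(shifted)
    return e / np.sum(e)

p = softmax(np.array([1.0, 2.0, 3.0]))
p_large = softmax(np.array([1001.0, 1002.0, 1003.0]))  # same result, no overflow
```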
Classification loss function
Loss function: a measure of how well the model is performing
In classification we want to estimate
$$f(x)_c = p(y = c \mid x)$$
Reformulated as a minimization problem (minimize the negative log-likelihood)
$$\ell(f(x), y) = -\sum_c 1_{y=c} \log f(x)_c = -\log f(x)_y$$
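A short sketch of the negative log-likelihood, assuming NumPy and an already computed probability vector f(x). Both forms of the equation are implemented to show that the indicator sum reduces to picking out the entry of the true class; the probabilities are illustrative.

```python
import numpy as np

def nll(f_x, y):
    """-log f(x)_y for the true class index y."""
    return -np.log(f_x[y])

def nll_indicator(f_x, y):
    """Same loss written as -sum_c 1_{y=c} log f(x)_c."""
    one_hot = np.zeros_like(f_x)
    one_hot[y] = 1.0
    return -np.sum(one_hot * np.log(f_x))

f_x = np.array([0.7, 0.2, 0.1])  # illustrative model output
loss_correct = nll(f_x, 0)       # true class has high probability: small loss
loss_wrong = nll(f_x, 2)         # true class has low probability: large loss
```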
Reminder: Gradient descent
- Minimize the loss function by following the negative gradient
- Mini-batch SGD
[Source: sebastianraschka.com]
Data: Training samples
Result: Trained model
initialize parameters Θ;
for N epochs do
    for each training sample (x_t, y_t) do
        Δ = −∇_Θ ℓ(f(x_t; Θ), y_t) − λ ∇_Θ Ω(Θ);
        Θ ← Θ + αΔ;
    end
end
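The loop above can be sketched in runnable form, assuming NumPy. The model (linear regression with squared loss), the regularizer Ω(Θ) = ½‖Θ‖², the synthetic data and all hyperparameter values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_theta = np.array([2.0, -1.0])   # used only to generate synthetic data
X = rng.normal(size=(200, 2))
y = X @ true_theta

def grad_loss(theta, x_t, y_t):
    # Gradient of the per-sample loss 0.5 * (theta^T x_t - y_t)^2.
    return (theta @ x_t - y_t) * x_t

def grad_reg(theta):
    # Gradient of the assumed regularizer 0.5 * ||theta||^2.
    return theta

alpha, lam, n_epochs = 0.05, 1e-4, 20
theta = np.zeros(2)                  # initialize parameters
for epoch in range(n_epochs):
    for x_t, y_t in zip(X, y):
        delta = -grad_loss(theta, x_t, y_t) - lam * grad_reg(theta)
        theta = theta + alpha * delta
```

A mini-batch variant would average `grad_loss` over a small batch of samples before each update instead of using a single sample.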
Backpropagation - Intuition
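Common abstraction for neural networks: computation graph
- e = (a + b) * (b + 1), i.e. e = c * d with c = a + b and d = b + 1
- a = 2 and b = 1
Compute partial derivatives
- $\frac{\partial}{\partial a}(a + b) = \frac{\partial a}{\partial a} + \frac{\partial b}{\partial a} = 1$
- $\frac{\partial}{\partial c}(c * d) = c \frac{\partial d}{\partial c} + d \frac{\partial c}{\partial c} = d$
Compute $\frac{\partial e}{\partial b}$
- Multivariate chain rule
- $\frac{\partial e}{\partial b} = \frac{\partial e}{\partial c} \frac{\partial c}{\partial b} + \frac{\partial e}{\partial d} \frac{\partial d}{\partial b}$
- $\frac{\partial e}{\partial b} = 2 * 1 + 3 * 1 = 5$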
[Source: Christopher Olah (http://colah.github.io)]
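The computation-graph example can be traced in code. This is a plain-Python sketch with the intermediate nodes named c = a + b and d = b + 1; the function name is hypothetical.

```python
def forward_backward(a, b):
    # Forward pass through the graph.
    c = a + b
    d = b + 1
    e = c * d
    # Backward pass: local derivatives combined with the chain rule.
    de_dc = d                       # d(c*d)/dc
    de_dd = c                       # d(c*d)/dd
    de_da = de_dc * 1               # a reaches e only through c
    de_db = de_dc * 1 + de_dd * 1   # b reaches e through both c and d
    return e, de_da, de_db

e, de_da, de_db = forward_backward(2.0, 1.0)
```

With a = 2 and b = 1 the forward pass gives c = 3, d = 2 and e = 6, and the backward pass reproduces the 2 * 1 + 3 * 1 sum for the derivative with respect to b.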
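Getting closer to a neural network
- A more complex example with 9 paths
- Computing $\frac{\partial Z}{\partial X} = \alpha\delta + \alpha\varepsilon + \alpha\zeta + \beta\delta + \beta\varepsilon + \beta\zeta + \gamma\delta + \gamma\varepsilon + \gamma\zeta$
- Does not scale to large networks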
[Source: Christopher Olah (http://colah.github.io)]
- Backpropagation is more efficient: it factors the sum over paths
- Computing $\frac{\partial Z}{\partial X} = (\alpha + \beta + \gamma)(\delta + \varepsilon + \zeta)$
[Source: Christopher Olah (http://colah.github.io)]
- Self-contained modules
- Forward propagation: compute the output based on the child layer(s)
- Backward propagation: compute the gradient w.r.t. the children based on the parent layer(s)
[Source: Hugo Larochelle]
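The module view can be sketched as small classes that each expose a forward and a backward method, assuming NumPy. The class names, weights and the choice of sum(out) as the scalar being differentiated are illustrative assumptions, not the interface of any particular library.

```python
import numpy as np

class Linear:
    def __init__(self, W, b):
        self.W, self.b = W, b

    def forward(self, h):
        self.h = h                      # cache the input for the backward pass
        return self.W @ h + self.b

    def backward(self, grad_out):
        self.grad_W = np.outer(grad_out, self.h)  # gradient w.r.t. the weights
        self.grad_b = grad_out                    # gradient w.r.t. the bias
        return self.W.T @ grad_out                # gradient w.r.t. the child

class ReLU:
    def forward(self, a):
        self.mask = a > 0
        return a * self.mask

    def backward(self, grad_out):
        return grad_out * self.mask

layers = [Linear(np.array([[1.0, -1.0], [0.5, 2.0]]), np.zeros(2)), ReLU()]
x = np.array([1.0, 2.0])

out = x
for layer in layers:                # forward: children to parents
    out = layer.forward(out)

grad = np.ones_like(out)            # gradient of sum(out) w.r.t. out
for layer in reversed(layers):      # backward: parents to children
    grad = layer.backward(grad)
```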
Problems with deep networks
- Optimization more difficult
  - Vanishing gradients (Filippo)
- Overfitting is a problem
  - Better regularization
- However, many benefits
Dropout regularization (Hinton et al. 2012)
- Training
  - Drop units with dropout probability p
  - Reduces co-adaptation
- Test
  - Scale weights by the keep probability (1 - p)
[Figure: network with input, hidden and output layers]
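A minimal dropout sketch, assuming NumPy. It scales the activations at test time, which has the same effect as scaling the outgoing weights by (1 - p); the function names and the test vector are illustrative.

```python
import numpy as np

def dropout_train(h, p, rng):
    """Training: drop each unit independently with probability p."""
    mask = rng.random(h.shape) >= p   # keep a unit with probability 1 - p
    return h * mask

def dropout_test(h, p):
    """Test: keep all units and scale by the keep probability 1 - p."""
    return h * (1.0 - p)

rng = np.random.default_rng(0)
h = np.ones(10000)
p = 0.5
train_out = dropout_train(h, p, rng)
test_out = dropout_test(h, p)
```

The test-time scaling makes the expected activation match what the next layer saw on average during training.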
Batch normalization (Ioffe and Szegedy, 2015)
- Normalize pre-activation
- Training
  - Normalize batch by mean and std
- Test
  - Normalize by global mean and std
[Ioffe and Szegedy, 2015]
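A simplified batch-normalization sketch, assuming NumPy. Tracking the global statistics with an exponential running average, the momentum and epsilon values, and the omission of the learned scale and shift parameters are all simplifying assumptions made here.

```python
import numpy as np

class BatchNorm:
    def __init__(self, dim, momentum=0.9, eps=1e-5):
        self.running_mean = np.zeros(dim)
        self.running_var = np.ones(dim)
        self.momentum, self.eps = momentum, eps

    def forward(self, a, training):
        if training:
            # Normalize by the statistics of the current batch.
            mean, var = a.mean(axis=0), a.var(axis=0)
            m = self.momentum
            self.running_mean = m * self.running_mean + (1 - m) * mean
            self.running_var = m * self.running_var + (1 - m) * var
        else:
            # Normalize by the global (running) statistics.
            mean, var = self.running_mean, self.running_var
        return (a - mean) / np.sqrt(var + self.eps)

rng = np.random.default_rng(0)
bn = BatchNorm(dim=3)
batch = rng.normal(loc=5.0, scale=2.0, size=(64, 3))
out_train = bn.forward(batch, training=True)
out_test = bn.forward(batch[:1], training=False)
```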
CNNs
[LeCun et al. 1998]
Hierarchical features
[Source: Y. LeCun]
Convolutions
Advantages of convolution
- Locally connected
- Translation-invariant
- Position explicitly encoded
- Independent of input size
[He, ICCV15 tutorial]
Convolution is defined as
$$g(x, y) = w(x, y) * f(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t) f(x + s, y + t)$$
- Learn image filters w(x, y) that automatically detect relevant features
Example: 4 × 4 input f (zero-padded), 3 × 3 filter w, output g

f (zero-padded) =
0  0  0  0  0  0
0  1  2  3  4  0
0  5  6  7  8  0
0  9 10 11 12  0
0 13 14 15 16  0
0  0  0  0  0  0

w =
1 2 1
2 4 2
1 2 1

g =
 24  40  52  45
 64  96 112  92
112 160 176 140
108 152 164 129
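The worked example can be reproduced with a direct implementation of the double sum, assuming NumPy and zero padding so that the output has the same size as the input. The function name is hypothetical. Note that the formula as written slides the filter without flipping it (strictly a cross-correlation); for the symmetric filter used here the two operations give the same result.

```python
import numpy as np

def conv2d_same(f, w):
    """g(x, y) = sum_s sum_t w(s, t) f(x + s, y + t) with zero padding."""
    a, b = w.shape[0] // 2, w.shape[1] // 2
    padded = np.pad(f, ((a, a), (b, b)))   # default mode pads with zeros
    g = np.zeros_like(f)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            window = padded[x:x + w.shape[0], y:y + w.shape[1]]
            g[x, y] = np.sum(w * window)
    return g

f = np.arange(1, 17).reshape(4, 4)
w = np.array([[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]])
g = conv2d_same(f, w)
```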
Pooling
- Mean and max pooling
- Larger receptive field
- Overlapping/non-overlapping
- Downsampling of the feature representation

Example: 2 × 2 max pool, stride 2

Input:
 1  2  3  4
 5  6  7  8
 9 10 11 12
13 14 15 16

Output:
 6  8
14 16
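A small sketch of max pooling that reproduces the example, assuming NumPy. The function name and defaults are chosen for illustration; replacing `.max()` with `.mean()` would give mean pooling.

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = window.max()
    return out

x = np.arange(1, 17).reshape(4, 4)
pooled = max_pool(x)  # non-overlapping because stride equals the pool size
```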
Architecture - Layer components
CNNs consist of several layers with these components
[Figure labels: Input, Convolution, Pooling, Nonlinearity, Feature maps, Normalization]
Increasing depth
[LeCun et al. 1998]
[Krizhevsky et al. 2012]
[Szegedy et al. 2015]
[He et al. 2015]
Increasing depth - ImageNet
[He, ICML 2016 Tutorial]
Practical tips
CNNs for small datasets:
- Data augmentation
- Transfer learning
Architecture:
- Small filters
Data augmentation
Common augmentations:
- Flip
- Rotate
- Random crops
- Jitter
- Add noise
- Change contrast
- Move slightly along principal components of RGB color space
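A few of these augmentations can be sketched with NumPy alone. The crop size, noise level, flip probability and the assumption that images are float arrays in [0, 1] with shape H × W × 3 are all illustrative choices.

```python
import numpy as np

def augment(image, rng, crop=24, noise_std=0.01):
    """Random horizontal flip, random crop and additive Gaussian noise."""
    if rng.random() < 0.5:
        image = image[:, ::-1]                        # flip left-right
    top = rng.integers(0, image.shape[0] - crop + 1)
    left = rng.integers(0, image.shape[1] - crop + 1)
    image = image[top:top + crop, left:left + crop]   # random crop
    image = image + rng.normal(scale=noise_std, size=image.shape)  # add noise
    return np.clip(image, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))   # stand-in for a real training image
augmented = augment(image, rng)
```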
Transfer learning
Training CNNs on tasks without massive datasets is common practice
Several approaches:
- Perform unsupervised training on a large unlabeled dataset, and fine-tune with labelled data
- Pre-train on a large dataset, and fine-tune to the new data
- Pre-train on a large dataset, extract the features and classify the new data using your favorite classifier
Transfer learning - Medium dataset
Take a pre-trained model and fine-tune to new tasks
Transfer learning - Small dataset
Extract the features and classify with favorite classifier
Small filters
What size filter to choose:
- 3 × 3 is very common
- Larger receptive fields can be represented by stacking small filters
[Szegedy et al., 2015]
- More efficient
- More nonlinearity
5 × 5 conv weights: C × (5 × 5 × C) = 25C²
2 × (3 × 3) conv weights: 2 × C × (3 × 3 × C) = 18C²
- Can go even smaller
  - 1 × 3 conv and 3 × 1 conv
[Szegedy et al., 2015]
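The weight counts for small filters can be checked with a few lines of arithmetic. This sketch ignores biases (as the counts above do), assumes stride-1 convolutions with C input and C output channels, and uses hypothetical helper names and an illustrative value of C.

```python
def conv_weights(kernel_h, kernel_w, channels_in, channels_out):
    """Number of weights in one convolution layer (biases ignored)."""
    return channels_out * kernel_h * kernel_w * channels_in

def stacked_receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1 convolutions along one axis."""
    field = 1
    for k in kernel_sizes:
        field += k - 1   # each layer widens the field by k - 1
    return field

C = 64                                            # illustrative channel count
one_5x5 = conv_weights(5, 5, C, C)                # 25 C^2
two_3x3 = 2 * conv_weights(3, 3, C, C)            # 18 C^2
pair_1x3_3x1 = conv_weights(1, 3, C, C) + conv_weights(3, 1, C, C)  # 6 C^2
```

Two stacked 3 × 3 layers cover the same 5 × 5 receptive field as a single 5 × 5 layer with fewer weights and an extra nonlinearity in between.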
Software
Many software alternatives:
- Torch
- Caffe
- Theano
- TensorFlow
- Neon
- Keras
[Source: Alex Wiltschko]
Caffe:
+ Fast
+ Feedforward
+ Finetuning
+ Easy to get started
+ Great model zoo
- RNN
- Extensibility (CUDA/C++)
Torch:
+ Extensibility
+ Model zoo
- RNN
- Lua
Theano:
+ Python
+ Good abstraction
+ RNN
+ Extensibility
- Pretrained models
- Debugging
TensorFlow:
+ Python
+ Good abstraction
+ RNN
+ Best parallelism
- Performance
- Flexibility
Caffe - Finetuning example
Possible to fine-tune a network without writing code (using the C++ API):
- Step 1: Download a pre-trained model from the model zoo. [https://github.com/BVLC/caffe/wiki/Model-Zoo]
- Step 2: Modify .prototxt and define solver.prototxt (nice visualization: http://ethereon.github.io/netscope/quickstart.html)
- Step 3: Run ./build/tools/caffe train -solver vggModel/solver.prototxt -weights vggModel/VGG_CNN_S.caffemodel -gpu 0
Segmentation using CNNs
- Pixel-wise classification
- Want end-to-end learnable architecture
- Less sensitive than traditional segmentation methods
[Kampffmeyer et al. 2016]
Segmentation using CNNs - Patch based
- Patch-based approach
- Very intuitive
- Very computationally expensive
Replace fully connected layer with convolutions [Sermanet et al., 2013]
- More efficient
- Not dependent on image size
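The equivalence behind this replacement can be illustrated with a toy case, assuming NumPy: a fully connected layer that classifies a flattened 3 × 3 patch gives the same scores as a convolution whose kernels are the reshaped weight rows, evaluated at that patch position. The sizes, names and random weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
patch_size, n_classes = 3, 2
W_fc = rng.normal(size=(n_classes, patch_size * patch_size))  # FC weights
kernels = W_fc.reshape(n_classes, patch_size, patch_size)     # same weights as filters

def fc_on_patch(patch):
    """Patch-based approach: classify one flattened patch."""
    return W_fc @ patch.ravel()

def conv_valid(image):
    """Convolutional approach: score every patch position in one pass."""
    out_h = image.shape[0] - patch_size + 1
    out_w = image.shape[1] - patch_size + 1
    out = np.zeros((n_classes, out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + patch_size, j:j + patch_size]
            out[:, i, j] = np.sum(kernels * window, axis=(1, 2))
    return out

image = rng.normal(size=(5, 5))   # works for any image at least 3 x 3
score_map = conv_valid(image)
```

In a real network the convolutional form shares the computation of earlier layers across overlapping patches, which is where the efficiency gain comes from.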
Segmentation using CNNs - Fully convolutional
Learn an upsampling from the feature representation back to pixel space [Long et al. 2015]
- Efficient
- Not dependent on image size
- End-to-end learning on whole images
- Better accuracy
Object detection
Two main approaches
- Region Proposals
- Regression
[Ren et al., 2016]
Object detection - Region Proposals - RCNN
RCNN [Girshick et al. 2014]
- Region proposal (e.g. selective search)
- Classify regions
- Computationally expensive
- Non-maximum suppression
[Girshick et al. 2014]
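Non-maximum suppression can be sketched as a greedy loop, assuming NumPy, boxes given as [x1, y1, x2, y2], and an intersection-over-union threshold of 0.5. This is a generic illustration with made-up boxes and scores, not the exact procedure of the RCNN implementation.

```python
import numpy as np

def iou(box, boxes):
    """Intersection over union of one box with an array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Keep the best-scoring box, drop boxes overlapping it, repeat."""
    order = np.argsort(scores)[::-1]   # indices sorted by descending score
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        order = rest[iou(boxes[best], boxes[rest]) <= iou_threshold]
    return keep

boxes = np.array([[0, 0, 10, 10],
                  [1, 1, 11, 11],     # heavily overlaps the first box
                  [20, 20, 30, 30]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```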
Object detection - Region Proposals - Fast RCNN
Fast RCNN [Girshick, 2015]
- Use fully convolutional idea for efficiency
- Bounding box regression offsets
- Faster and better accuracy
[Girshick, 2015]
Object detection - Region Proposals - Faster RCNN
Faster RCNN [Ren et al., 2016]
- Region proposal network
- Faster and improved overall accuracy
[Ren et al., 2016]
Take away message
- CNNs are powerful models
- State of the art on many tasks
- Don’t necessarily require large datasets (data augmentation and transfer learning help)