Deep Learning and (Image) Classification · 2019-11-13
TRANSCRIPT
DATA
Edouard Oyallon
Deep Learning and (Image) Classification
advisor: Stéphane Mallat, following the works of Laurent Sifre, Joan Bruna, …
Deep learning: a technical breakthrough
• Deep learning has made it possible to solve a large number of tasks that were considered extremely challenging for a computer.
• The technique is generic and scalable; it simply requires a large amount of data.
• There is a lot of hype, and engineers with deep learning profiles are in high demand.
Face recognition
• Face recognition tasks are almost solved (three years of research):
Strategy games
• Game of Go: long considered impossible to solve with Monte Carlo tree search alone, and now solved (two years of research):
Natural Language Processing
• Translation (Google just updated its translation system with recurrent neural networks):
Pure black box
• However, nobody really knows how it works, and only a few research works have clues.
• People claim "AI" is rising and that we are simulating the "brain", while mathematicians avoid those techniques like the plague.
• Can we do maths in deep learning?
Outline
• Classification?
• Understanding variabilities in high dimension
• Deepnets
• Wavelets
Classification?
What color should this circle be?
Classification of signals
• Let $n > 0$ and $(X, Y) \in \mathbb{R}^n \times \mathcal{Y}$ be random variables.
• Problem: estimate $\hat y$ such that $\hat y = \arg\inf_y \mathbb{E}(|y(X) - Y|)$.
• We are given a training set $(x_i, y_i) \in \mathbb{R}^n \times \mathcal{Y}$ to build $\hat y$.
• Say one can write $\hat y = \mathrm{Classifier}(\Phi x)$, the Classifier being built from $(\Phi x_i, y_i)$.
• 3 ways to build $\Phi$: supervised (from $(x_i, y_i)_i$), unsupervised (from $(x_i)_i$), predefined (geometric priors).
(Figure: toy example with $n = 2$ and $\mathcal{Y}$ a pair of colors.)
Classifier
• A classifier is an algorithm that outputs the probability, for a given sample $x_i$, of belonging to a class $y_i$.
• A classical example is given by the Support Vector Machine (SVM): minimising the distance between the two convex hulls and taking the associated separating hyperplane, i.e. asking $w^T x + b \geq 0$?
Linear classifier. Ref.: Vapnik, Chervonenkis, 1963.
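The linear decision rule above can be sketched in a few lines; the following is an illustrative NumPy implementation of a linear SVM trained by subgradient descent on the hinge loss (the data, regularisation strength and step size are arbitrary choices, not part of the slides):

```python
import numpy as np

# Linear SVM trained by subgradient descent on the regularised hinge loss.
# The blobs, regularisation strength and step size are illustrative choices.
rng = np.random.default_rng(0)

# Two linearly separable Gaussian blobs in R^2, labels in {-1, +1}.
X = np.vstack([rng.normal(-2.0, 0.5, (50, 2)), rng.normal(2.0, 0.5, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

w, b, lam = np.zeros(2), 0.0, 0.01
for _ in range(500):
    active = y * (X @ w + b) < 1              # samples violating the margin
    if active.any():
        w -= 0.1 * (lam * w - (y[active, None] * X[active]).mean(axis=0))
        b += 0.1 * y[active].mean()
    else:
        w -= 0.1 * lam * w                    # only the regulariser remains

pred = np.sign(X @ w + b)                     # decide with the sign of w^T x + b
accuracy = (pred == y).mean()
print(accuracy)
```

On such well-separated blobs the hyperplane classifies the training set essentially perfectly.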
Discrete image to continuous image
• An image corresponds to the discretisation of a physical analog signal (light!).
• An array of numbers: $x[n_1, n_2] \in \mathbb{R}$, $n_1, n_2 \leq N$.
• One can set $x(u) = \sum_{n \in \mathbb{Z}^2} x[n] \delta_n(u)$; then $\mathcal{F}x(\omega) = \sum_{n \in \mathbb{Z}^2} x[n] e^{-in\omega}$, with $\mathcal{F}x \in L^2[0,1]$.
• Nyquist-Shannon sampling property: $\exists!\, \tilde x \in L^2(\mathbb{R})$, $\mathrm{support}(\mathcal{F}\tilde x) \subset [-\frac{1}{2}, \frac{1}{2}]$, $\mathcal{F}\tilde x|_{[-\frac{1}{2}, \frac{1}{2}]} = \mathcal{F}x$.
(Figure: two spectra supported in $[-\frac{1}{2}, \frac{1}{2}]$ on the frequency axis $\omega$.)
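As a 1-D sanity check of the sampling property, a band-limited signal can be recovered at an off-grid point from its integer samples via sinc interpolation; the frequency `f0`, grid size and evaluation point below are illustrative choices:

```python
import numpy as np

# Shannon reconstruction of a band-limited signal from its integer samples:
# x(t) = sum_n x[n] sinc(t - n). f0 is below the Nyquist frequency 1/2
# (in cycles per sample); N and t are illustrative choices.
N, f0 = 4096, 0.1
n = np.arange(N)
samples = np.cos(2 * np.pi * f0 * n)

t = N / 2 + 0.5                               # an off-grid evaluation point
reconstructed = np.sum(samples * np.sinc(t - n))
exact = np.cos(2 * np.pi * f0 * t)
print(abs(reconstructed - exact))             # small truncation error
```

The residual error comes only from truncating the (infinite) interpolation sum to N terms.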
High-dimensional classification
• Caltech 101, etc.
• Estimation problem: $(x_i, y_i) \in \mathbb{R}^{224^2} \times \{1, \ldots, 1000\}$, $i < 10^6$; find $x \longrightarrow y(x)$?
• Training set to predict labels: "Rhinos" vs. not a "rhino".
(Speaker note: introduce deep networks, then how wavelets are related to deep networks.)
High-dimensionality issues
• Density functions $p(x)$ are difficult to estimate in high dimension.
• For a fixed number of points and bin size, as the dimension $N$ increases, most bins are likely to be empty.
(Figure: histograms of the same sample in dimension $N = 1$, $N = 2$, …)
Curse of dimensionality
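The empty-bin phenomenon can be observed numerically; this small NumPy experiment (sample size and bin count are arbitrary choices) tracks the fraction of occupied bins as the dimension N grows:

```python
import numpy as np

# Fraction of occupied histogram bins for a fixed sample size and a fixed
# number of bins per axis, as the dimension N grows (illustrative sizes).
rng = np.random.default_rng(0)
n_samples, bins_per_axis = 1000, 4

fractions = {}
for N in [1, 2, 4, 8]:
    x = rng.uniform(0.0, 1.0, (n_samples, N))
    cells = np.floor(x * bins_per_axis).astype(int)    # bin index per axis
    occupied = len({tuple(c) for c in cells})          # distinct non-empty bins
    fractions[N] = occupied / bins_per_axis**N
    print(N, fractions[N])
```

With 1000 points, every bin is hit in dimension 1, while in dimension 8 at most 1000 of the 65,536 bins can be non-empty.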
Image variabilities
• Geometric variability, i.e. groups acting on images (translation, rotation, scaling): $x_\tau(u) = x(u - \tau(u))$, $\tau \in C^1$. Other sources: luminosity, occlusion, small deformations. Not informative.
• Class variability: intraclass variability vs. extraclass variability.
• High variance: it must be reduced.
Fighting the curse of dimensionality
• Objective: building a representation $\Phi x$ of $x$ such that a simple (say Euclidean) classifier can estimate the label $y$: $\|\Phi x - \Phi x'\| \leq 1 \Rightarrow y(x) = y(x')$.
• Designing $\Phi : \mathbb{R}^D \to \mathbb{R}^d$, $D \gg d$, consists of building an approximation of a low-dimensional space which is regular with respect to the class.
• How can we do that?
Separation - Contraction
• In high dimension, typical distances are huge, so an appropriate representation must contract the space: $\|\Phi x - \Phi x'\| \leq \|x - x'\|$.
• While preventing the different classes from collapsing: $\exists \epsilon > 0$, $y(x) \neq y(x') \Rightarrow \|\Phi x - \Phi x'\| \geq \epsilon$.
Nature of the variabilities
• A classification problem can be written as a loss minimisation: $\inf_{\mathrm{Classifier}, \Phi} \sum_i \mathrm{loss}(x_i, y_i)$, with $\mathrm{loss}(x, y) = \|\mathrm{Classifier}(\Phi x) - y\|$.
• A symmetry corresponds to a transformation $L$ that preserves the class: $(x, y)$ in the training set $\iff$ $(Lx, y)$ in the training set.
• The representation should also preserve it: $\Phi L x = \Phi x \Rightarrow \mathrm{loss}(Lx, y) = \mathrm{loss}(x, y)$.
An example: translation
• Translation is a linear action: $\forall u \in \mathbb{R}^2$, $L_a x(u) = x(u - a)$. (Translated "cats" remain "cats".)
• (Culture) The set of translations is a (Lie) group with an exponential map: $L_{a+b} = L_a \circ L_b$ and $L_a x(u) = \sum_{n \geq 0} \frac{d^n x}{du^n}(u) \frac{(-a)^n}{n!} = e^{-a \frac{d}{du}} x(u)$. Similar to: $\theta \to e^{i\theta}$.
• In many cases, it is a variability to reduce.
Convolution: covariance to translation
• A linear (bounded) operator $W$ of $L^2$ is a convolution iff it is covariant with the action of translations: $\forall a$, $L_a W = W L_a$, i.e. $Wx(u - a) = W x_a(u)$ with $x_a(u) = x(u - a)$.
• In this case, $\exists w$, $Wx(u) = \int x(t) w(u - t)\, dt$.
• And it is diagonalised in the Fourier basis: $W e^{i \omega^T u} = \mathcal{F}w(\omega)\, e^{i \omega^T u}$.
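Both properties can be checked numerically with circular convolutions, where discrete shifts and the DFT stand in for the continuous translations and Fourier transform; the signal and filter below are arbitrary:

```python
import numpy as np

# Circular convolution: covariant with (circular) translations and
# diagonalised by the Fourier basis. Signal and filter are arbitrary.
rng = np.random.default_rng(0)
N = 64
x, w = rng.normal(size=N), rng.normal(size=N)

def conv(x, w):
    # circular convolution computed in the Fourier domain
    return np.fft.ifft(np.fft.fft(x) * np.fft.fft(w))

# Covariance: W(L_a x) = L_a (W x), with L_a a circular shift by a.
a = 5
assert np.allclose(conv(np.roll(x, a), w), np.roll(conv(x, w), a))

# Diagonalisation: a Fourier mode is an eigenvector with eigenvalue Fw(k).
k = 3
e = np.exp(2j * np.pi * k * np.arange(N) / N)
assert np.allclose(conv(e, w), np.fft.fft(w)[k] * e)
print("ok")
```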
Invariance to translation
• In many cases, one wishes to be globally invariant to translation; a simple way is to perform an averaging: $Ax = \int L_a x \, da = \int x(u) \, du$. It is the 0 frequency!
• Even if it can be localised, the averaging keeps the low-frequency structures: the invariance brings a loss of information!
• Covariance (even non-linear) plus averaging implies invariance: $A L_a = A$, so $W L_a = L_a W \Rightarrow A W L_a x = A L_a W x = A W x$. An invariant is created!
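A quick numerical illustration of $AL_a = A$ and of invariance through a covariant operator, here with a random image, a random convolution filter, and a pointwise modulus as the non-linearity (all arbitrary choices):

```python
import numpy as np

# Averaging over translations creates an invariant: A x = sum_u x(u) is
# unchanged by a shift, and so is A rho(W x) when W is a convolution and
# rho a pointwise modulus. The image and filter are arbitrary.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8))
w = rng.normal(size=(8, 8))

A = lambda z: z.sum()                        # averaging: the 0 frequency
W = lambda z: np.real(np.fft.ifft2(np.fft.fft2(z) * np.fft.fft2(w)))

shifted = np.roll(x, shift=(3, 5), axis=(0, 1))            # L_a x
assert np.isclose(A(x), A(shifted))                        # A L_a = A
assert np.isclose(A(np.abs(W(x))), A(np.abs(W(shifted))))  # A rho W L_a = A rho W
print("ok")
```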
Averaging makes the Euclidean distance meaningful in high dimension
(Figure: two patterns $x$ and $y$ at different locations satisfy $\|x - y\|^2 = 2$, whatever the translation or rotation between them; only after averaging does the distance become informative.)
• Averaging is the key to get invariants.
How to tackle the curse of dimensionality?
• Cascade operators that are covariant with translation, then average, to build an invariant to translations: $A W_J \ldots W_1 L_a x = A W_J \ldots W_1 x$.
• Linear and non-linear contractions to reduce the volume: $\|\rho(x) - \rho(y)\| \leq \|x - y\|$.
• An interesting object: $\Phi x = A \rho W_J \ldots \rho W_1 x$.
How to tackle the curse of dimensionality? (2)
• Weak differentiability property: $\sup_L \frac{\|\Phi L x - \Phi x\|}{\|L x - x\|} < \infty \Rightarrow \exists$ a "weak" $\partial_x \Phi \Rightarrow \Phi L x \approx \Phi x + \partial_x \Phi \, L + o(\|L\|)$, where $L$ is a displacement and $\partial_x \Phi$ a linear operator.
• A linear projection (to kill the term $\partial_x \Phi \, L$) builds an invariant.
How can we build $\Phi$?
• Enumerating the different variabilities is hard.
• Since deep neural networks solve the vision classification task, they must build invariance to a large set of intra-class variabilities.
• So, what is a deep network?
Delving into the technique
• Building a deep network is challenging.
• It requires a large amount of data and GPUs…
• …and there are many more details.
Dataset: CIFAR
• 50,000 images for training, 10,000 images for testing, of size 32×32 (small), 10/100 classes
Dataset: ImageNet
• 1.2M labeled images for training, 1000 classes (car, dog, …) of various sizes, 400k for testing
• Natural images with large variability
Benchmarks
(Figure: % accuracy versus year, 2010-2016, on a 60-100 scale. ImageNet: "Handcrafted" vs. "DeepNet", with the human level marked; ref.: image-net.org. CIFAR: "Unsupervised" vs. "DeepNet", with the human level marked; ref.: http://rodrigob.github.io/are_we_there_yet/build/)
DeepNet?
• A DeepNet is a cascade of linear operators with a pointwise non-linearity.
• Each operator is learned in a supervised way.
• A formal way to write it: $x_{j+1} = \rho W_j x_j$, where $W_j$ is a linear operation, $\rho$ a pointwise non-linearity, and $x_j$ a layer of neurons.
Ref.: Rich feature hierarchies for accurate object detection and semantic segmentation, Girshick et al.
Ref.: Convolutional networks and applications in vision, Y. LeCun et al.
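The cascade $x_{j+1} = \rho W_j x_j$ can be sketched directly; here random matrices stand in for the learned operators and $\rho$ is a ReLU (the widths are arbitrary):

```python
import numpy as np

# The cascade x_{j+1} = rho(W_j x_j): random matrices stand in for the
# learned operators W_j, and rho is a pointwise ReLU. Widths are arbitrary.
rng = np.random.default_rng(0)
rho = lambda z: np.maximum(z, 0.0)           # pointwise non-linearity

sizes = [64, 32, 16, 10]                     # x_0 in R^64, output in R^10
Ws = [rng.normal(0.0, 1.0 / np.sqrt(m), (n, m)) for m, n in zip(sizes, sizes[1:])]

x = rng.normal(size=sizes[0])                # the input x_0
for W in Ws:
    x = rho(W @ x)                           # one layer of neurons
print(x.shape)
```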
Architecture of a CNN
• Cascade of convolutional operators and non-linear operators: $x_{j+1}(u, \lambda) = \rho\big(\sum_\gamma x_j(\cdot, \gamma) \star w_{j,\lambda,\gamma}(u)\big)$, where the kernels $w_{j,\lambda,\gamma}$ are learned.
• Can be interpreted as neurons sharing weights: $x_0 \to x_1 \to x_2 \to \ldots \to x_J \to$ Classifier.
• Designing a state-of-the-art deep net is generally hard and requires a lot of engineering.
Typical CNN architecture
"AlexNet": 60M parameters, 8 layers
Ref.: ImageNet Classification with Deep Convolutional Networks, A. Krizhevsky et al.
Inception Net
5M parameters, 38 layers
Ref.: Going Deeper with Convolutions, C. Szegedy et al.

ResNet
4M parameters, 152 layers
Ref.: Deep Residual Learning for Image Recognition, K. He et al.

Deep Face
120M parameters, 7 layers
Optimizing a DeepNet
• The output $\Phi x$ has the dimension of the number of classes. The DeepNet operators are optimised via the negative cross-entropy and stochastic gradient descent: $-\sum_n \sum_{\mathrm{class}} 1_{y_n = \mathrm{class}} \log(\Phi x_n)_{\mathrm{class}}$
• All the functions are differentiable: back-propagation algorithm + stochastic gradient: $w_j^{i+1} = w_j^i - \alpha_i \nabla_{w_j}(w_j^i, X_j)$, with $\alpha_i$ the learning rate and $X_j$ a randomly selected sample.
• It is absolutely non-convex! No guarantee to converge.
Ref.: Convolutional networks and applications in vision, Y. LeCun et al.
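One stochastic gradient step on the cross-entropy can be sketched for the simplest case of a linear map followed by a softmax; the sizes, sample and learning rate are illustrative, and a real DeepNet would back-propagate through the whole cascade:

```python
import numpy as np

# One SGD step w <- w - alpha * grad on the cross-entropy of a linear
# model Phi(x) = softmax(W x); sizes, sample and learning rate are
# illustrative.
rng = np.random.default_rng(0)
n_features, n_classes = 16, 10
W = rng.normal(0.0, 0.1, (n_classes, n_features))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss(W, x, y):
    return -np.log(softmax(W @ x)[y])        # cross-entropy on one sample

x, y = rng.normal(size=n_features), 3        # a randomly selected sample
alpha = 0.05                                 # learning rate

before = loss(W, x, y)
p = softmax(W @ x)
p[y] -= 1.0                                  # d loss / d (Wx) = p - onehot(y)
W = W - alpha * np.outer(p, x)               # the gradient step
after = loss(W, x, y)
print(before > after)
```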
CUDA
• Deep learning algorithms rely heavily on linear operations.
• CUDA routines allow linear algebra to be implemented efficiently: a speed-up of 80.
• What costs the most with GPUs is the I/O.
Implementation of a CNN
• Split the dataset into batches of size 256; the CPU handles the input pipeline while the GPU computes the layers $x_1, x_2, \ldots$, the loss, and the gradients with respect to $w_1, w_2, \ldots$
• Back-propagation applies the chain rule layer by layer: $\frac{\partial \mathrm{loss}}{\partial w_j} = \frac{\partial \mathrm{loss}}{\partial x_j} \frac{\partial x_j}{\partial w_j}$ and $\frac{\partial \mathrm{loss}}{\partial x_{j-1}} = \frac{\partial x_j}{\partial x_{j-1}} \frac{\partial \mathrm{loss}}{\partial x_j}$.
• Typical training time on ImageNet: 100 epochs, 2 hours per epoch.
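The two chain-rule formulas can be checked on a tiny two-layer network against a finite-difference derivative; everything here (sizes, loss) is an illustrative choice:

```python
import numpy as np

# Back-propagation on a two-layer net x1 = rho(W1 x0), x2 = W2 x1 with
# loss = ||x2 - y||^2, checked against a central finite difference.
# All sizes are illustrative.
rng = np.random.default_rng(0)
x0, y = rng.normal(size=8), rng.normal(size=4)
W1, W2 = rng.normal(size=(6, 8)), rng.normal(size=(4, 6))
rho = lambda z: np.maximum(z, 0.0)

z1 = W1 @ x0; x1 = rho(z1); x2 = W2 @ x1     # forward pass
g2 = 2.0 * (x2 - y)                          # d loss / d x2
gW2 = np.outer(g2, x1)                       # d loss / d W2
g1 = (W2.T @ g2) * (z1 > 0)                  # d loss / d z1 through rho
gW1 = np.outer(g1, x0)                       # d loss / d W1

def loss_at(W1p):
    return np.sum((W2 @ rho(W1p @ x0) - y) ** 2)

eps = 1e-5
Wp, Wm = W1.copy(), W1.copy()
Wp[0, 0] += eps; Wm[0, 0] -= eps
num = (loss_at(Wp) - loss_at(Wm)) / (2 * eps)
print(abs(num - gW1[0, 0]))
```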
Softwares…
• All the packages are based on GPUs; select your favorite via simplicity of benchmarking, data input…
• All are available in Python or C++, developed by FB, Google, … there is a war!
(Figure, subjective: frameworks placed along two axes, modularity (hard to simple) and simplicity of inputting the data: MatConvNet, Theano, TensorFlow, PyTorch, Caffe.)
Training your own CNN
• Again, the optimisation is non-convex: a lot of hyper-parameters (learning rate, l2 regularization, …) to tune.
• Demo!
Why is deep learning dangerous?
• Pure black box. Few mathematical results are available. Many rely on a "manifold hypothesis", which is clearly wrong (example: stability to diffeomorphisms).
• No stability results: "small" variations of the inputs might have a large impact on the system. And this happens.
• Small data?
• Shall we learn each layer from scratch? (geometric priors?)
• Because of the cascade, features are hard to interpret.
Identifying the variabilities?
• Several works showed that a deepnet exhibits some covariances:
• Manifold of faces at a certain depth:
• Can we generalise these?
Ref.: Understanding deep features with computer-generated imagery, M. Aubry, B. Russell
Ref.: Unsupervised Representation Learning with Deep Convolutional GANs, Radford, Metz & Chintala
Why does it work?
• Progressively, a linear separation occurs.
• In fact, Euclidean distances become more meaningful with depth, and symmetry groups seem to appear. This indicates a progressive dimensionality reduction!
(Figure: classification accuracy measured after each layer $W_1, W_2, W_3, W_4$.)
Ref.: Building a Regular Decision Boundary with Deep Networks, CVPR 2017, E. Oyallon
Ref.: Multiscale Hierarchical Convolutional Networks, Jacobsen, Oyallon, Mallat, Smeulders
Ref.: Visualizing and Understanding Convolutional Networks, M. Zeiler, R. Fergus
Wavelets: avoiding learning?
• Wavelets help to describe signal structures. $\psi$ is a wavelet iff $\psi \in L^2(\mathbb{R}^2, \mathbb{C})$ and $\int_{\mathbb{R}^2} \psi(u) \, du = 0$.
• They are chosen localised in space and frequency.
• Wavelets can be dilated to give a multi-scale representation of signals, and rotated to describe rotations: $\psi_{j,\theta} = \frac{1}{2^{2j}} \psi\big(\frac{r_\theta(u)}{2^j}\big)$.
• Design wavelets selective to an informative variability.
(Figure: isotropic $|\psi|$ vs. non-isotropic $|\psi|$.)
The Gabor wavelet
$\phi(u) = \frac{1}{2\pi\sigma} e^{-\frac{\|u\|^2}{2\sigma}}$, $\psi(u) = \frac{1}{2\pi\sigma} e^{-\frac{\|u\|^2}{2\sigma}} (e^{i\xi \cdot u} - \kappa)$, with $\kappa$ chosen so that $\int \psi = 0$.
(For the sake of simplicity, the formulas are given in the isotropic case.)
• Heisenberg principle: good localisation in space and in Fourier.
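A discrete Gabor wavelet can be built directly from this recipe; the sketch below uses a common normalisation of the Gaussian envelope (with $\sigma^2$ in the exponent, slightly different from the slide's convention) and computes the constant $\kappa$ so that the filter has zero mean, as the wavelet condition requires. `N`, `sigma` and `xi` are illustrative values:

```python
import numpy as np

# A Morlet-style Gabor wavelet on a discrete grid: Gaussian envelope times
# a complex exponential, minus kappa so that the mean is zero (the wavelet
# admissibility condition). N, sigma and xi are illustrative values.
N, sigma, xi = 64, 4.0, np.pi / 2
u = np.arange(N) - N // 2
U1, U2 = np.meshgrid(u, u, indexing="ij")

envelope = np.exp(-(U1**2 + U2**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
carrier = np.exp(1j * xi * U1)               # oscillation along one axis
kappa = (envelope * carrier).sum() / envelope.sum()
psi = envelope * (carrier - kappa)           # zero-mean Gabor wavelet

print(abs(psi.sum()))                        # ~0 by construction
```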
Wavelet Transform
• Wavelet transform: $Wx = \{x \star \psi_{j,\theta},\ x \star \phi_J\}_{\theta, j \leq J}$.
• Isometric and linear operator of $L^2$, with $\|Wx\|^2 = \sum_{\theta, j \leq J} \int |x \star \psi_{j,\theta}|^2 + \int |x \star \phi_J|^2$.
• Covariant with translation: $W(x_{\tau = c}) = (Wx)_{\tau = c}$.
• Nearly commutes with diffeomorphisms: $\|[W, \cdot_\tau]\| \leq C \|\nabla \tau\|$.
• A good baseline to describe an image!
(Figure: tiling of the frequency plane $(\omega_1, \omega_2)$ by the wavelet supports.)
Ref.: Group Invariant Scattering, S. Mallat
Success story: wavelets for textures & digits
• Non-learned representations have been successfully used on:
• Digits (patterns), where the variability is small deformations + translation. Ref.: Invariant Convolutional Scattering Networks, J. Bruna and S. Mallat
• Textures (stationary processes), where the variability is rotation + scale. Ref.: Rotation, Scaling and Deformation Invariant Scattering for texture discrimination, L. Sifre and S. Mallat
• However, all the variabilities (groups) here are perfectly understood (not so with natural images).
Conclusion
• Deep learning architectures are of interest thanks to their outstanding numerical results.
• Black boxes must be opened via maths.
• Check my website for software and papers: http://www.di.ens.fr/~oyallon/
Thank you!