mvfl_part3 cvpr2012 spatio-temporal and higher-order feature learning
TRANSCRIPT
7/31/2019 Mvfl_part3 Cvpr2012 Spatio-temporal and Higher-Order Feature Learning
http://slidepdf.com/reader/full/mvflpart3-cvpr2012-spatio-temporal-and-higher-order-feature-learning 1/87
Outline
1 Introduction
   Feature Learning
   Correspondence in Computer Vision
   Relational feature learning
2 Learning relational features
   Sparse Coding Review
   Encoding relations
   Inference
   Learning
3 Factorization, eigen-spaces and complex cells
   Factorization
   Eigen-spaces, energy models, complex cells
4 Applications
   Applications
   Conclusions
Roland Memisevic (Uni Frankfurt) Multiview Feature Learning Tutorial at CVPR 2012 71 / 174
Complexity
The number of parameters is about n × n × n (!)
More, if we want sparse, overcomplete hiddens.
There is a simple, yet far-reaching, way to reduce that number.
Factorization
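[Figure: the parameter tensor $w_{ijk}$ expressed through three filter matrices $w^x_{if}$, $w^y_{jf}$, $w^z_{kf}$]
$w_{ijk} = \sum_f w^x_{if}\, w^y_{jf}\, w^z_{kf}$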
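The factorization can be made concrete with a small NumPy sketch. The sizes `n` and `F`, the random filter matrices, and the helper name `factored_tensor` are illustrative assumptions, not part of the tutorial; the point is only to compare the parameter counts and to show how the full tensor would be rebuilt from the three filter matrices.

```python
import numpy as np

def factored_tensor(Wx, Wy, Wz):
    """Return the full tensor w[i, j, k] = sum_f Wx[i, f] * Wy[j, f] * Wz[k, f]."""
    return np.einsum("if,jf,kf->ijk", Wx, Wy, Wz)

# Hypothetical sizes: n units per group, F factors.
n, F = 8, 5
rng = np.random.default_rng(0)
Wx = rng.standard_normal((n, F))
Wy = rng.standard_normal((n, F))
Wz = rng.standard_normal((n, F))

W = factored_tensor(Wx, Wy, Wz)

full_params = n ** 3          # unfactored three-way tensor
factored_params = 3 * n * F   # three filter matrices
print(W.shape, full_params, factored_params)
```

In practice the full tensor would never be materialized; it is built here only to show what the three matrices stand for.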
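Factorization is filter matching
[Figure: images x, y (pixels $x_i$, $y_j$) with filter matrices $W^x$, $W^y$; mapping units z ($z_k$) with filter matrix $W^z$]
$z_k = \sum_{ij} w_{ijk}\, x_i y_j = \sum_{ij} \sum_f w^x_{if}\, w^y_{jf}\, w^z_{kf}\, x_i y_j = \sum_f w^z_{kf} \Big(\sum_i w^x_{if}\, x_i\Big) \Big(\sum_j w^y_{jf}\, y_j\Big)$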
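As a sanity check on this identity, the following sketch compares inference through the full tensor with inference through filter responses. All sizes, the random data, and the function name `mapping_preactivation` are assumptions made for illustration.

```python
import numpy as np

def mapping_preactivation(x, y, Wx, Wy, Wz):
    """Factored inference: filter both images, multiply responses, pool with Wz."""
    fx = Wx.T @ x          # filter responses on x, one per factor
    fy = Wy.T @ y          # filter responses on y, one per factor
    return Wz @ (fx * fy)  # z_k = sum_f Wz[k, f] * fx[f] * fy[f]

# Hypothetical sizes and random data.
nx, ny, nz, F = 6, 7, 4, 5
rng = np.random.default_rng(0)
Wx = rng.standard_normal((nx, F))
Wy = rng.standard_normal((ny, F))
Wz = rng.standard_normal((nz, F))
x = rng.standard_normal(nx)
y = rng.standard_normal(ny)

# Reference: build the full tensor and contract it with both images.
W = np.einsum("if,jf,kf->ijk", Wx, Wy, Wz)
z_full = np.einsum("ijk,i,j->k", W, x, y)
z_fact = mapping_preactivation(x, y, Wx, Wy, Wz)
print(np.allclose(z_full, z_fact))
```

The factored route needs only three matrix-vector products, which is why the products of filter responses are the natural quantity to compute.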
Factorization is filter matching
$E = \sum_{ijk} \Big(\sum_f w^x_{if}\, w^y_{jf}\, w^z_{kf}\Big) x_i y_j z_k = \sum_f \Big(\sum_i w^x_{if}\, x_i\Big) \Big(\sum_j w^y_{jf}\, y_j\Big) \Big(\sum_k w^z_{kf}\, z_k\Big)$
Factorized models
Factored Gated Boltzmann machines
Exponentiate and normalize energy (just like RBM).
Learning and inference exactly like before.
(Taylor, 2009), (Memisevic, Hinton; 2009)
Factorized models
[Figure: factored relational autoencoder; labels x, y, z]
Factored Relational Autoencoders
Again, everything like before. Back-propagate through the filters.
Conditional learning trivial.
Joint learning by adding two asymmetric objectives.
Square pooling models
Square pooling:
Another way to learn filter matching is with square pooling models, for example:
ASSOM (Kohonen, 1996)
ISA (Hyvarinen, 2000)
Product of T-distributions (Osindero et al., 2006)
(Karklin, Lewicki; 2008)
cRBM (Ranzato et al., 2009)
Often, $W^z$ is constrained so each hidden sees only a few squared inputs. That way hiddens can be thought of as encoding subspace norms.
[Figure: x and y (units $x_i$, $y_j$) are filtered by $W^x$, $W^y$; the responses pass through $(\cdot)^2$ and are pooled by $W^z$ into z ($z_k$)]
Square pooling models
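Square pooling:
Why is square pooling the same?
The activity that a hidden unit gets is:
$\sum_f w^z_{kf} \big(W^{x\,T}_{\cdot f}\, x + W^{y\,T}_{\cdot f}\, y\big)^2 = \sum_f w^z_{kf} \Big[\, 2\, \big(W^{x\,T}_{\cdot f}\, x\big)\big(W^{y\,T}_{\cdot f}\, y\big) + \big(W^{x\,T}_{\cdot f}\, x\big)^2 + \big(W^{y\,T}_{\cdot f}\, y\big)^2 \Big]$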
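The expansion can be checked numerically. In this sketch the sizes, the random filters, and the function names are hypothetical; it evaluates the square pooling activity directly and via the three expanded terms, so that the cross term (twice the factored filter-matching response) becomes visible.

```python
import numpy as np

def square_pooling(x, y, Wx, Wy, Wz):
    """Hidden activity: pool the squared sum of filter responses on x and y."""
    s = Wx.T @ x + Wy.T @ y
    return Wz @ s ** 2

def expanded_terms(x, y, Wx, Wy, Wz):
    """Same activity, split into the cross term and the two square terms."""
    fx, fy = Wx.T @ x, Wy.T @ y
    cross = Wz @ (2.0 * fx * fy)   # twice the factored (filter matching) response
    squares = Wz @ (fx ** 2 + fy ** 2)
    return cross, squares

# Hypothetical sizes and random data.
n, nz, F = 6, 3, 5
rng = np.random.default_rng(0)
Wx, Wy = rng.standard_normal((n, F)), rng.standard_normal((n, F))
Wz = rng.standard_normal((nz, F))
x, y = rng.standard_normal(n), rng.standard_normal(n)

direct = square_pooling(x, y, Wx, Wy, Wz)
cross, squares = expanded_terms(x, y, Wx, Wy, Wz)
print(np.allclose(direct, cross + squares))
```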
Square pooling models
Square pooling:
Learning is somewhat more difficult than with factored gated feature learning.
Example ISA: Gradient-based, while enforcing $W^{xy\,T} W^{xy} = I$ after every gradient step (eigen-decomposition).
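The slides do not spell out how the constraint is enforced. One common choice, assumed here, is symmetric orthonormalization through an eigen-decomposition of $W^T W$. In the sketch below `W` stands in for the concatenated filter matrix $W^{xy}$, and the sizes and the function name are hypothetical.

```python
import numpy as np

def symmetric_orthonormalize(W):
    """Return W (W^T W)^(-1/2), whose columns are orthonormal.

    Assumes W has full column rank, so that W^T W is positive definite.
    """
    eigvals, U = np.linalg.eigh(W.T @ W)            # W^T W = U diag(eigvals) U^T
    inv_sqrt = U @ np.diag(eigvals ** -0.5) @ U.T   # (W^T W)^(-1/2)
    return W @ inv_sqrt

rng = np.random.default_rng(0)
W = rng.standard_normal((10, 4))   # hypothetical: 10 inputs, 4 filters
W_orth = symmetric_orthonormalize(W)
print(np.allclose(W_orth.T @ W_orth, np.eye(4)))
```

In a training loop this projection would be applied after each gradient step, which is the extra cost that factored gated models avoid.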
Examples
Toy examples:
There is no structure in these images.
Only in how they change.
Learned filters $w^x_{if}$
Learned filters $w^y_{jf}$
Frequency/orientation histograms
Velocity tuning of mapping units
Filters learned from split-screen shifts
Affine filters
“Filtering”-filters
Rotation filters
Filters learned by watching TV
“Bag-Of-Warps”
Linear image warps
Consider a linear transformation in pixel space (“warp”):
$y = L x$
Now consider the following task:
Given two images x, y, what is the warp that relates them?
This is exactly the problem that mapping units should be able to solve.
Orthogonal image warps
$y = L x$
We restrict our attention to orthogonal warps in the following, that is:
$L^T L = I$
These include all permutations (“shuffling pixels”).
Orthogonal warps are the only transformations we can see anyway, if all our images are white:
$I = C_y = L C_x L^T = L L^T$
(Bethge, 2007)
To get a better understanding of what mapping units really do, we make use of two properties of orthogonal image warps:
Properties of orthogonal image warps
(I) Orthogonal transformations decompose into 2-D rotations
An orthogonal matrix is similar to a matrix that performs axis-aligned two-dimensional rotations:
$V^T L V = \begin{pmatrix} R_1 & & \\ & \ddots & \\ & & R_k \end{pmatrix}, \qquad R_i = \begin{pmatrix} \cos(\theta_i) & -\sin(\theta_i) \\ \sin(\theta_i) & \cos(\theta_i) \end{pmatrix}$
This follows, for example, from the fact that the eigen-decomposition
$L = V D V^T$
has complex eigenvalues of length 1.
The eigenspaces are also known as invariant subspaces.
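A quick numerical illustration of the unit-length eigenvalues, under the assumption that a random orthogonal matrix (obtained from a QR decomposition) is a fair stand-in for an orthogonal warp:

```python
import numpy as np

rng = np.random.default_rng(0)
# QR of a random square matrix yields a (hypothetical) orthogonal warp Q.
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))

eigvals = np.linalg.eigvals(Q)
moduli = np.abs(eigvals)     # all equal to 1 up to rounding
angles = np.angle(eigvals)   # rotation angles of the invariant subspaces
print(np.round(moduli, 6))
```

Each complex-conjugate pair of eigenvalues corresponds to one 2-D rotation block; real eigenvalues (+1 or -1) correspond to one-dimensional invariant directions.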
Example: Translation and the Fourier spectrum
Translation is an example of an orthogonal warp.
1-D translation matrices are circulants, which have ones along an off-diagonal, like so:
$L = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{pmatrix}$
The two-dimensional eigen-features of this matrix turn out to be sine-/cosine-pairs (Fourier features).
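The claim can be verified for the 5 × 5 shift matrix shown above. The helper `shift_matrix` and the choice of frequency index `m` are assumptions for illustration; the sketch checks that a complex Fourier feature is an eigenvector and that the shift acts on its cosine/sine pair as a 2-D rotation.

```python
import numpy as np

def shift_matrix(n, k=1):
    """Circulant L with (L @ x)[i] == x[(i + k) % n]."""
    return np.roll(np.eye(n), k, axis=1)

n, m = 5, 1                       # signal length and frequency index (hypothetical)
L = shift_matrix(n)
t = np.arange(n)
alpha = 2 * np.pi * m / n
c, s = np.cos(alpha * t), np.sin(alpha * t)   # cosine/sine pair
f = c + 1j * s                                # complex Fourier feature
lam = np.exp(1j * alpha)                      # eigenvalue on the unit circle

print(np.allclose(L @ f, lam * f))
# In the real basis (c, s) the shift is a rotation by alpha:
print(np.allclose(L @ c, np.cos(alpha) * c - np.sin(alpha) * s))
print(np.allclose(L @ s, np.sin(alpha) * c + np.cos(alpha) * s))
```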
Quadrature pairs
Since the invariant subspaces of orthogonal warps are two-dimensional, eigenvectors come in pairs:
$v_R, v_I$
They form an orthogonal basis for the invariant subspace.
In the case of translation, $v_I$ is a sine and $v_R$ is a cosine feature.
Waves with 90 degrees phase difference are known as a “quadrature pair”.
But the concept is more general and applies to all orthogonal matrices.
The eigenvector pairs of orthogonal transformations have been referred to as “generalized quadrature pairs” (Bethge et al., 2007).
Properties of commuting image warps
(II) Commuting transformations share an eigen-basis
Any two transformations that commute share a single eigen-basis.
They differ only in their eigenvalues.
“Proof”: Consider A and B with AB = BA, and let v be an eigenvector of B whose eigenvalue λ has multiplicity one. We have
$BAv = ABv = \lambda Av.$
So Av is also an eigenvector of B with the same eigenvalue.
Since λ has multiplicity one, Av must be a multiple of v, so v is an eigenvector of A as well.
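A small numerical check of property (II), using two circulant shifts as the commuting pair. The matrix size and the use of `np.linalg.eig` to obtain the eigenvectors are choices made for this sketch only.

```python
import numpy as np

n = 7
L1 = np.roll(np.eye(n), 1, axis=1)   # shift by one pixel
L2 = np.roll(np.eye(n), 2, axis=1)   # shift by two pixels

commute = np.allclose(L1 @ L2, L2 @ L1)

lam1, V = np.linalg.eig(L1)          # columns of V: unit-norm eigenvectors of L1
# If L2 shares these eigenvectors, L2 @ v is a multiple mu of v for every column v.
mu = np.sum(V.conj() * (L2 @ V), axis=0)
residual = np.linalg.norm(L2 @ V - V * mu)
print(commute, residual)
```

Here the eigenvalues of the two-pixel shift are the squares of those of the one-pixel shift: same eigenvectors, doubled rotation angles.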
Translation example continued
All circulants have the Fourier basis as eigen-basis.
Properties (I) and (II) taken together now allow us to state the following:
Any two orthogonal, commuting transformations differ only with respect to the rotation angles in the eigenspaces.
So to apply a transformation you can equivalently perform a set of independent 2-D rotations.
[Figure: an image x and its warped version y = L x]
To infer the transformation, given two images x and y: Project x and y onto the eigenvectors, then compute the rotation angles!
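For translations this recipe can be sketched with the FFT, which is assumed here to play the role of the projection onto the (Fourier) eigenvectors. The function name `infer_shift`, the signal length, and the random 1-D "image" are hypothetical.

```python
import numpy as np

def infer_shift(x, y):
    """Recover a cyclic shift from per-subspace rotation angles (phase differences)."""
    X, Y = np.fft.fft(x), np.fft.fft(y)    # projections onto the Fourier eigenvectors
    angles = np.angle(Y * np.conj(X))      # rotation angle in each subspace
    n = len(x)
    # For y[i] = x[(i + k) % n], the angle at frequency 1 is 2*pi*k/n (mod 2*pi).
    k = int(np.round(angles[1] * n / (2 * np.pi))) % n
    return k, angles

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
y = np.roll(x, -3)                         # y[i] = x[(i + 3) % 16]
k, angles = infer_shift(x, y)
print(k)
```

The division of phases implicit in `np.angle` corresponds to normalizing the projections, which is exactly the step that becomes unreliable when a projection is small.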
Extracting sub-space rotations, naive approach
[Figure: projections of x and y in the complex plane (axes Re, Im), with phase angles $\phi_x$ and $\phi_y$]
In each subspace:
Normalize the 2-D projections to unit norm, then read off the angle between them.
Extracting rotations by computing angles
To read off the angle, compute the inner product (after normalizing projections to unit norm).
Formally,
$\cos(\phi_y - \phi_x) = \cos\phi_y \cos\phi_x + \sin\phi_y \sin\phi_x = (v_R^T y)(v_R^T x) + (v_I^T y)(v_I^T x)$
Compute the sum over products of filter responses.
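A minimal sketch of this computation for one Fourier subspace, assuming a cyclic shift as the warp. The frequency index `m`, the shift `k`, and the helper `unit_projection` are illustrative choices.

```python
import numpy as np

n, m, k = 16, 2, 3                   # signal length, frequency index, shift (hypothetical)
t = np.arange(n)
v_R = np.cos(2 * np.pi * m * t / n)  # cosine filter
v_I = np.sin(2 * np.pi * m * t / n)  # sine filter (quadrature partner)

def unit_projection(img):
    """Project onto the (v_R, v_I) subspace and normalize to unit norm."""
    p = np.array([v_R @ img, v_I @ img])
    return p / np.linalg.norm(p)

rng = np.random.default_rng(0)
x = rng.standard_normal(n)
y = np.roll(x, -k)                   # y[i] = x[(i + k) % n]

# Sum over products of (normalized) filter responses = cosine of the rotation angle.
cos_delta = unit_projection(x) @ unit_projection(y)
print(cos_delta, np.cos(2 * np.pi * m * k / n))
```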
Sub-space rotation detectors
Normalizing to unit norm can be a bad idea, if projections are small:
The aperture problem
Consider the left shift of a horizontal bar.
It is impossible to see the transformation in this case.
This is known as the aperture problem.
Normalizing subspace projections would amount to pretending we could see the transformation!
A second way to get the rotations:
Absorb the rotation into one of the eigenvectors, then try to detect rotation angles.
Extracting rotations by detecting angles
Formally, let the output filter pair $v^\theta_R, v^\theta_I$ be the input filter rotated by θ degrees (in complex notation: $v^\theta = \exp(i\theta)\, v$).
Measure how well the image pair x, y conforms with this rotation:
$r^\theta := \cos(\phi_y - \phi_x - \theta) = \cos(\phi_y)\cos(\phi_x + \theta) + \sin(\phi_y)\sin(\phi_x + \theta) = (v^{\theta\,T}_R y)(v_R^T x) + (v^{\theta\,T}_I y)(v_I^T x)$
Again we have to sum over products of filter responses.
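The detector idea can be sketched as a small bank of units, each tuned to one angle. Everything concrete below (the Fourier subspace, the eight candidate angles, the function names) is an assumption for illustration. Note that the projections are deliberately not normalized, so the response scales with how much of the image pair is visible in the subspace.

```python
import numpy as np

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def detector_response(x, y, V, theta):
    """r_theta = (v_R^theta . y)(v_R . x) + (v_I^theta . y)(v_I . x)."""
    V_theta = V @ rotation(theta)       # output filter pair: input pair rotated by theta
    return (V_theta.T @ y) @ (V.T @ x)

n, m, k = 16, 2, 3                      # hypothetical length, frequency, shift
t = np.arange(n)
w = 2 * np.pi * m / n
V = np.sqrt(2 / n) * np.stack([np.cos(w * t), np.sin(w * t)], axis=1)  # columns v_R, v_I

rng = np.random.default_rng(0)
x = rng.standard_normal(n)
y = np.roll(x, -k)                      # y[i] = x[(i + k) % n]

thetas = np.linspace(-np.pi, np.pi, 8, endpoint=False)
responses = np.array([detector_response(x, y, V, th) for th in thetas])
best_theta = thetas[np.argmax(responses)]
print(best_theta, -w * k)
```

With this sign convention the shift rotates the projection by `-w * k`, so the unit tuned to that angle responds most strongly, with a response equal to the squared norm of the projection of x.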
[Figure: graphical model over the images x, y with units $x_i$, $y_j$]
For each subspace, we will need several mapping units, each tuned to a different angle, $\theta_i$.
The set of mapping unit responses will now constitute a population code that represents the observed transformation.
A mapping unit is conservative: It fires only if a transform is present and if it is visible in the image pair.
But there is still one problem...
Subspace rotation detector graphical model
[Figure: graphical model over the images x, y with units $x_i$, $y_j$]
But the aperture problem causes another problem:
Take a video showing translations and generate two copies: Low-pass filter each frame in the first; high-pass filter each frame in the second.
Now the transformation will be visible only in some components in the first and in other components in the second video.
These subspace features are content-dependent!
The solution:
Let hiddens pool within and pool across subspaces.
This is exactly the factored bilinear model.
Summary: Learning relation-detectors
The cross-correlation model
A hidden variable that computes the sum over products of filter responses can detect rotations, θ, in an invariant subspace.
To reconstruct the transformed output from the input image, it has to pool over multiple 2-dimensional subspaces.
The population code of such hiddens is a good code for image transformations.
Learning requires contrast normalization + keeping the scales of filters roughly the same!
Learning quadrature features
[Figure: images x, y (units $x_i$, $y_j$) and mapping units z ($z_k$), with a separate pooling layer]
We can see the quadrature features, if we outsource the across-subspace pooling into a separate layer.
Rotation “quadrature” filters
Mixed transformations
Quadrature features from natural video
Energy models
$z_k = \sum_f w_{fk} \big(u_f^T x + v_f^T y\big)^2 = 2 \sum_f w_{fk} \big(u_f^T x\big)\big(v_f^T y\big) + \sum_f w_{fk} \big(u_f^T x\big)^2 + \sum_f w_{fk} \big(v_f^T y\big)^2$
When we apply energy models to the concatenation of two images, we add square terms in inference.
This may make the rotation detectors more conservative.
Otherwise inference is the same!
The energy model
(Adelson and Bergen, 1985): Motion
(Ohzawa, DeAngelis, Freeman; 1990): Disparity
Equivalence to cross-correlation: See, for example, (Fleet et al.; 1994).
Learning energy models on movies
[Figure: projections of the frames in the complex plane (axes Re, Im), with phases $\phi_{x_1}, \ldots, \phi_{x_t}$]
What happens when we train energy models on movies?
Hiddens receive all pairs of products between filters applied to frames.
So they detect the repeated application of the same eigenvalue:
$\Big(\sum_s v_s^T x_s\Big)^2 = \sum_s \big(v_s^T x_s\big)^2 + \sum_{s \neq t} \big(v_s^T x_s\big)\big(v_t^T x_t\big)$
Training energy models via gating
We can train a cross-correlation model via the energy mechanism.
But we can do the opposite, too:
Plug in the same data left and right and tie left and right filters.
So we don’t have to use ISA or PoT to train energy models.
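As a rough illustration of why tying works, the sketch below feeds the same input to both sides of a factored gating unit and shares the filters, which reduces the pooled products to pooled squares. The function names `gated_mapping` and `energy_mapping`, the matrix names `U`, `V`, `W`, and all sizes are assumptions for this example; the output nonlinearity is left out, so only pre-activations are compared. For a movie, `x` would be the concatenation of the frames.

```python
import numpy as np

def gated_mapping(x, y, U, V, W):
    """Factored gating: pool (via W) over products of filter responses."""
    return W.T @ ((U.T @ x) * (V.T @ y))

def energy_mapping(x, U, W):
    """Energy model: pool (via W) over squared filter responses."""
    return W.T @ ((U.T @ x) ** 2)

rng = np.random.default_rng(0)
n, F, K = 16, 8, 4                  # hypothetical: input size, factors, hidden units
U = rng.standard_normal((n, F))     # filters, shared between left and right
W = rng.standard_normal((F, K))     # pooling weights
x = rng.standard_normal(n)          # e.g. concatenated movie frames

# Same data left and right, left and right filters tied (V = U).
tied = gated_mapping(x, x, U, U, W)
energy = energy_mapping(x, U, W)

print(np.allclose(tied, energy))
```

With this tying, any training procedure available for the gating model can be reused unchanged to fit the energy model.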
A covariance encoder trained on movies
Learning cross-correlation and energy models
Take-home message, factored model
To learn about transformation, let hidden units pool over products of filter responses (gated feature learning) or pool over squares of sums of filter responses (energy model).
A bag of tricks
Tricks for learning:
Normalize filters during learning, so they grow slowly, and they grow together: Normalize with a running average of the average filter norms.
Connect top-level hiddens locally to the factors.
Probably even better: make them locally overlapping (“Topographic ICA”).
DC-centering and contrast-normalization for each patch.
Plus: Whiten the data before learning, using PCA or ZCA.
Fast learning: large datasets essential (use GPUs...).
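Three of these tricks (per-patch DC-centering with contrast normalization, ZCA whitening, and filter normalization with a running average) could look roughly as follows. This is one possible reading of the slide, not the tutorial's code: the function names, the `eps` values, the `momentum` value and the toy data are all assumptions.

```python
import numpy as np

def center_and_normalize(X, eps=1e-2):
    """DC-center and contrast-normalize each patch (one patch per row)."""
    X = X - X.mean(axis=1, keepdims=True)
    return X / (X.std(axis=1, keepdims=True) + eps)

def zca_whiten(X, eps=1e-5):
    """ZCA-whiten the rows of X; eps (assumed) guards near-zero eigenvalues."""
    X = X - X.mean(axis=0, keepdims=True)
    cov = X.T @ X / X.shape[0]
    eigval, eigvec = np.linalg.eigh(cov)
    zca = eigvec @ np.diag(1.0 / np.sqrt(eigval + eps)) @ eigvec.T
    return X @ zca

def normalize_filters(U, running_norm, momentum=0.95):
    """Rescale every filter (column of U) to one shared, slowly moving norm."""
    norms = np.linalg.norm(U, axis=0)
    # Running average of the average filter norm; momentum is an assumed value.
    running_norm = momentum * running_norm + (1.0 - momentum) * norms.mean()
    return U * (running_norm / (norms + 1e-8)), running_norm

rng = np.random.default_rng(0)
# Toy correlated "patches": 2000 patches of 8 pixels each.
patches = rng.standard_normal((2000, 8)) @ (np.eye(8) + 0.5)
normalized = center_and_normalize(patches)
white = zca_whiten(normalized)

# Covariance spectrum after whitening, in ascending order.
cov_white = white.T @ white / white.shape[0]
eig = np.linalg.eigvalsh(cov_white)

U = rng.standard_normal((8, 16))    # 16 hypothetical filters
U_new, running_norm = normalize_filters(U, running_norm=1.0)
```

Because DC-centering removes the mean direction from every patch, the whitened covariance has one eigenvalue near zero and the rest near one. Calling `normalize_filters` after each parameter update keeps all filters at the same norm while letting that norm drift slowly.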