sparse kernel learning for image annotation
![Page 1: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/1.jpg)
Sparse Kernel Learning for Image Annotation
Sean Moran and Victor Lavrenko
Institute of Language, Cognition and Computation
School of Informatics
University of Edinburgh
ICMR’14 Glasgow, April 2014
![Page 2: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/2.jpg)
Sparse Kernel Learning for Image Annotation
Overview
SKL-CRM
Evaluation
Conclusion
![Page 4: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/4.jpg)
Assigning words to pictures
[Figure: annotation pipeline. Each image is described by multiple features (GIST, SIFT, LAB, HAAR), each drawn as a feature vector X1 to X6.]
Training Dataset: images with keyword annotations, e.g. "Tiger, Grass, Whiskers"; "City, Castle, Smoke"; "Tiger, Tree, Leaves"; "Eagle, Sky"
Feature Extraction: Multiple Features (GIST, SIFT, LAB, HAAR)
Annotation Model: maps the features of a Testing Image to a ranked list of words:
P(Tiger | image) = 0.15
P(Grass | image) = 0.12
P(Whiskers | image) = 0.12
P(Leaves | image) = 0.10
P(Tree | image) = 0.10
P(Sky | image) = 0.08
P(Waterfall | image) = 0.05
P(City | image) = 0.03
P(Castle | image) = 0.03
P(Eagle | image) = 0.02
P(Smoke | image) = 0.01
Top 5 words as annotation: Tiger, Grass, Tree, Leaves, Whiskers
This talk: How best to combine features?
![Page 5: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/5.jpg)
Previous work
- Topic models: Latent Dirichlet Allocation (LDA) [Barnard et al. ’03], Machine Translation [Duygulu et al. ’02]
- Mixture models: Continuous Relevance Model (CRM) [Lavrenko et al. ’03], Multiple Bernoulli Relevance Model (MBRM) [Feng ’04]
- Discriminative models: Support Vector Machine (SVM) [Verma and Jawahar ’13], Passive Aggressive Classifier [Grangier ’08]
- Local learning models: Joint Equal Contribution (JEC) [Makadia ’08], Tag Propagation (Tagprop) [Guillaumin et al. ’09], Two-pass KNN (2PKNN) [Verma et al. ’12]
![Page 6: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/6.jpg)
Combining different feature types
- Previous work: linear combination of feature distances in a weighted summation with “default” kernels:
[Figure: Kernels. GG(x; p) plotted against x for p = 1 (Laplacian), p = 2 (Gaussian) and p = 15 (Uniform)]
- Standard kernel assignment: Gaussian for GIST, Laplacian for colour features, χ² for SIFT
![Page 7: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/7.jpg)
Data-adaptive visual kernels
- Our contribution: permit the visual kernels themselves to adapt to the data:
[Figure: Kernels. GG(x; p) plotted against x for p = 1 (Laplacian), p = 2 (Gaussian) and p = 15 (Uniform). Dataset: Corel 5K]
- Hypothesis: the optimal kernels for GIST, SIFT, etc. depend on the image dataset itself
![Page 8: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/8.jpg)
Data-adaptive visual kernels
- Our contribution: permit the visual kernels themselves to adapt to the data:
[Figure: Kernels. GG(x; p) plotted against x for p = 1 (Laplacian), p = 2 (Gaussian) and p = 15 (Uniform). Dataset: IAPR TC12]
- Hypothesis: the optimal kernels for GIST, SIFT, etc. depend on the image dataset itself
![Page 9: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/9.jpg)
Sparse Kernel Continuous Relevance Model (SKL-CRM)
![Page 10: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/10.jpg)
Continuous Relevance Model (CRM)
- CRM estimates the joint distribution of image features ($\mathbf{f}$) and words ($\mathbf{w}$) [Lavrenko et al. 2003]:

$$P(\mathbf{w}, \mathbf{f}) = \sum_{J \in T} P(J) \prod_{j=1}^{N} P(w_j \mid J) \prod_{i=1}^{M} P(\vec{f}_i \mid J)$$

  - $P(J)$: uniform prior for training image $J$
  - $P(\vec{f}_i \mid J)$: Gaussian non-parametric kernel density estimate
  - $P(w_j \mid J)$: multinomial for word smoothing
- Estimate the marginal probability distribution over individual tags:

$$P(w \mid \mathbf{f}) = \frac{P(w, \mathbf{f})}{\sum_{w} P(w, \mathbf{f})}$$

- The words with the highest $P(w \mid \mathbf{f})$ (e.g. the top 5) are used as the annotation
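The two equations above can be sketched in a few lines of Python for the single-word case (N = 1). This is an illustrative sketch, not the authors' implementation: the function names, the shared bandwidth `beta`, and the interpolation weight `mu` used to smooth the word probabilities are assumptions, since the slide only states that a smoothed multinomial is used. Every vocabulary word is assumed to occur in at least one training annotation.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    if top == -math.inf:
        return -math.inf
    return top + math.log(sum(math.exp(v - top) for v in values))

def log_gaussian_kde(x, regions, beta):
    """log of the Gaussian kernel density estimate of vector x from `regions`."""
    d = len(x)
    log_norm = -0.5 * d * math.log(2 * math.pi * beta ** 2)
    logs = [log_norm - 0.5 * sum((a - b) ** 2 for a, b in zip(x, r)) / beta ** 2
            for r in regions]
    return log_sum_exp(logs) - math.log(len(regions))

def crm_annotate(test_regions, train_regions, train_words, vocab,
                 beta=1.0, mu=0.9, top_k=5):
    """Return the top_k (word, P(w | f)) pairs for one test image."""
    n_train = len(train_regions)
    # Background word frequencies over the training set (assumed smoothing term).
    total = sum(len(ws) for ws in train_words)
    background = {w: sum(w in ws for ws in train_words) / total for w in vocab}
    log_joint = {w: [] for w in vocab}
    for regions, words in zip(train_regions, train_words):
        # log prod_i P(f_i | J) under the kernel density estimate for image J.
        log_pf = sum(log_gaussian_kde(x, regions, beta) for x in test_regions)
        for w in vocab:
            # P(w | J): relative frequency in J's annotation, smoothed.
            p_w = mu * (w in words) / len(words) + (1 - mu) * background[w]
            # Uniform prior P(J) = 1 / n_train.
            log_joint[w].append(-math.log(n_train) + math.log(p_w) + log_pf)
    # Sum over training images J, then normalise over words to get P(w | f).
    log_pwf = {w: log_sum_exp(v) for w, v in log_joint.items()}
    log_z = log_sum_exp(list(log_pwf.values()))
    ranked = sorted(((w, math.exp(lp - log_z)) for w, lp in log_pwf.items()),
                    key=lambda pair: -pair[1])
    return ranked[:top_k]
```

Working in log space avoids underflow when the product over regions becomes very small, which happens quickly for high-dimensional features.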
![Page 11: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/11.jpg)
Sparse Kernel Learning CRM (SKL-CRM)
- Introduce a binary kernel-feature alignment matrix $\Psi_{u,v}$:

$$P(I \mid J) = \prod_{i=1}^{M} \sum_{j=1}^{R} \exp\left\{ -\frac{1}{\beta} \sum_{u,v} \Psi_{u,v} \, k_v(\vec{f}^{\,u}_i, \vec{f}^{\,u}_j) \right\}$$

- $k_v(\vec{f}^{\,u}_i, \vec{f}^{\,u}_j)$: $v$-th kernel function on the $u$-th feature type
- $\beta$: kernel bandwidth parameter
- Goal: learn $\Psi_{u,v}$ by directly maximising the annotation F1 score on a held-out validation dataset
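As a rough illustration of how the alignment matrix switches kernel terms on and off, the sketch below evaluates the expression for P(I | J) for one test image and one training image. The data layout (dictionaries keyed by feature name), the function names, and the two example distance-like kernels are assumptions made for this example; each k_v is treated as the term that enters the exponent, as in the formula.

```python
import math

def l1_distance(a, b):
    """Example kernel term: sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def squared_l2_distance(a, b):
    """Example kernel term: sum of squared differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def image_similarity(test_image, train_image, psi, kernels, beta=1.0):
    """Evaluate prod_i sum_j exp(-(1/beta) * sum_{u,v} psi[u,v] * k_v(f_i^u, f_j^u)).

    test_image / train_image: dict feature name -> list of region vectors
        (every feature type is assumed to have the same number of regions).
    psi: dict (feature name, kernel name) -> 0 or 1.
    kernels: dict kernel name -> function(vector, vector) -> float.
    """
    active = [(u, v) for (u, v), on in psi.items() if on]
    first = next(iter(test_image))
    m, r = len(test_image[first]), len(train_image[first])
    product = 1.0
    for i in range(m):
        total = 0.0
        for j in range(r):
            # Only pairs with psi[u, v] = 1 contribute to the exponent.
            energy = sum(kernels[v](test_image[u][i], train_image[u][j])
                         for u, v in active)
            total += math.exp(-energy / beta)
        product *= total
    return product
```

Because Ψ is binary and sparse, feature types with no active kernel are never touched, which is where the saving over using every feature comes from.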
![Page 12: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/12.jpg)
Generalised Gaussian Kernel
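- Shape factor $p$: traces out an infinite family of kernels

$$P(\vec{f}_i \mid \vec{f}_j) = \frac{p^{1-1/p}}{2\beta\,\Gamma(1/p)} \exp\left[ -\frac{1}{p} \frac{|\vec{f}_i - \vec{f}_j|^{p}}{\beta^{p}} \right]$$

- $\Gamma$: Gamma function
- $\beta$: kernel bandwidth parameter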
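A minimal sketch of this kernel for a scalar difference x = f_i − f_j is given below; the function name is an assumption, and for vector features a choice of norm for |f_i − f_j| would also be needed, which the slide does not specify. Setting p = 2 recovers the Gaussian density with standard deviation β, and p = 1 recovers the Laplacian density with scale β.

```python
import math

def generalised_gaussian(x, p, beta=1.0):
    """Density GG(x; p) for a scalar difference x, shape p > 0 and bandwidth beta."""
    norm = p ** (1.0 - 1.0 / p) / (2.0 * beta * math.gamma(1.0 / p))
    return norm * math.exp(-(abs(x) ** p) / (p * beta ** p))

# Shape values shown on the slides: Laplacian, Gaussian, near-uniform.
for p in (1.0, 2.0, 15.0):
    print(p, [round(generalised_gaussian(x, p), 4) for x in (0.0, 0.5, 1.0, 2.0)])
```

Small values of p concentrate the mass in a sharp peak at zero, while large values flatten the density towards a uniform window of width governed by β.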
![Page 13: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/13.jpg)
Generalised Gaussian Kernel
[Figure: GG(x; p) plotted against x for p = 2 (Gaussian)]
![Page 14: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/14.jpg)
Generalised Gaussian Kernel
[Figure: GG(x; p) plotted against x for p = 1 (Laplacian)]
![Page 15: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/15.jpg)
Generalised Gaussian Kernel
[Figure: GG(x; p) plotted against x for p = 15 (Uniform)]
![Page 16: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/16.jpg)
Multinomial Kernel
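- Multinomial kernel optimised for count-based features:

$$P(\vec{f}_i \mid \vec{f}_j) = \frac{\left(\sum_{d} f_{i,d}\right)!}{\prod_{d} \left(f_{i,d}!\right)} \prod_{d} \left(p_{j,d}\right)^{f_{i,d}}$$

- $f_{i,d}$: count for bin $d$ in the unlabelled image $i$
- $f_{j,d}$: count for bin $d$ in the training image $j$
- Jelinek-Mercer smoothing is used to estimate $p_{j,d}$:

$$p_{j,d} = \lambda \frac{f_{j,d}}{\sum_{d} f_{j,d}} + (1 - \lambda) \frac{\sum_{j} f_{j,d}}{\sum_{j,d} f_{j,d}}$$

- We also consider the standard χ² and Hellinger kernels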
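The multinomial kernel and its smoothed bin probabilities can be sketched as follows. The function names are illustrative assumptions, `collection_counts` stands for the per-bin totals over all training images, and the kernel is returned in log space because the factorials overflow quickly for realistic histograms.

```python
import math

def smoothed_bin_probs(train_counts, collection_counts, lam=0.99):
    """Jelinek-Mercer estimate of p_{j,d} for one training image j.

    train_counts: histogram f_{j,d} of training image j.
    collection_counts: per-bin totals sum_j f_{j,d} over all training images.
    """
    image_total = sum(train_counts)
    collection_total = sum(collection_counts)
    return [lam * f / image_total + (1.0 - lam) * c / collection_total
            for f, c in zip(train_counts, collection_counts)]

def log_multinomial_kernel(test_counts, bin_probs):
    """log P(f_i | f_j) for integer bin counts f_i under bin probabilities p_j."""
    n = sum(test_counts)
    # log of the multinomial coefficient (sum_d f_{i,d})! / prod_d f_{i,d}!
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(f + 1) for f in test_counts)
    # Bins with zero count contribute a factor of 1 and are skipped.
    log_like = sum(f * math.log(p) for f, p in zip(test_counts, bin_probs) if f > 0)
    return log_coeff + log_like
```

The smoothing term matters whenever a test image has a non-zero count in a bin that is empty in the training image: without it, p_{j,d} would be zero and the whole kernel value would collapse to zero.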
![Page 17: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/17.jpg)
Greedy kernel-feature alignment
Iteration 0: F1 = 0.0

Kernel-feature alignment matrix Ψ (rows: kernels, columns: features):

| | GIST | SIFT | LAB | HAAR |
|---|---|---|---|---|
| Laplacian | 0 | 0 | 0 | 0 |
| Gaussian | 0 | 0 | 0 | 0 |
| Uniform | 0 | 0 | 0 | 0 |

[Figure: feature vectors (X1 to X6) of the Testing Image and a Training Image for GIST, SIFT, LAB and HAAR, with the candidate kernels GG(x; p): Laplacian (p = 1), Gaussian (p = 2), Uniform (p = 15)]
![Page 18: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/18.jpg)
Greedy kernel-feature alignment
Iteration 1: F1 = 0.25

Kernel-feature alignment matrix Ψ (rows: kernels, columns: features):

| | GIST | SIFT | LAB | HAAR |
|---|---|---|---|---|
| Laplacian | 0 | 0 | 0 | 0 |
| Gaussian | 1 | 0 | 0 | 0 |
| Uniform | 0 | 0 | 0 | 0 |

[Figure: feature vectors (X1 to X6) of the Testing Image and a Training Image for GIST, SIFT, LAB and HAAR, with the candidate kernels GG(x; p): Laplacian (p = 1), Gaussian (p = 2), Uniform (p = 15)]
![Page 19: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/19.jpg)
Greedy kernel-feature alignment
Iteration 2: F1 = 0.34

Kernel-feature alignment matrix Ψ (rows: kernels, columns: features):

| | GIST | SIFT | LAB | HAAR |
|---|---|---|---|---|
| Laplacian | 0 | 0 | 0 | 0 |
| Gaussian | 1 | 0 | 0 | 0 |
| Uniform | 0 | 0 | 0 | 1 |

[Figure: feature vectors (X1 to X6) of the Testing Image and a Training Image for GIST, SIFT, LAB and HAAR, with the candidate kernels GG(x; p): Laplacian (p = 1), Gaussian (p = 2), Uniform (p = 15)]
![Page 20: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/20.jpg)
Greedy kernel-feature alignment
Iteration 3: F1 = 0.38

Kernel-feature alignment matrix Ψ (rows: kernels, columns: features):

| | GIST | SIFT | LAB | HAAR |
|---|---|---|---|---|
| Laplacian | 0 | 0 | 0 | 0 |
| Gaussian | 1 | 1 | 0 | 0 |
| Uniform | 0 | 0 | 0 | 1 |

[Figure: feature vectors (X1 to X6) of the Testing Image and a Training Image for GIST, SIFT, LAB and HAAR, with the candidate kernels GG(x; p): Laplacian (p = 1), Gaussian (p = 2), Uniform (p = 15)]
![Page 21: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/21.jpg)
Greedy kernel-feature alignment
Iteration 4: F1 = 0.42

Kernel-feature alignment matrix Ψ (rows: kernels, columns: features):

| | GIST | SIFT | LAB | HAAR |
|---|---|---|---|---|
| Laplacian | 0 | 0 | 1 | 0 |
| Gaussian | 1 | 1 | 0 | 0 |
| Uniform | 0 | 0 | 0 | 1 |

[Figure: feature vectors (X1 to X6) of the Testing Image and a Training Image for GIST, SIFT, LAB and HAAR, with the candidate kernels GG(x; p): Laplacian (p = 1), Gaussian (p = 2), Uniform (p = 15)]
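The iterations on these slides can be read as greedy forward selection over (feature, kernel) pairs. The sketch below is one way to write that loop, under two assumptions that are not stated on the slides: each feature type receives at most one kernel, and the search stops as soon as no candidate improves the validation F1. The callback `validation_f1` is hypothetical and stands for training and scoring the annotation model with a given alignment.

```python
def greedy_alignment(features, kernels, validation_f1):
    """Greedy forward selection of (feature, kernel) pairs.

    validation_f1: hypothetical callback mapping a dict {feature: kernel}
        (the non-zero entries of Psi) to an F1 score on held-out data.
    Returns the final alignment and the list of ((feature, kernel), F1) steps.
    """
    alignment = {}
    best_f1 = 0.0
    history = []
    while len(alignment) < len(features):
        best_step = None
        for u in features:
            if u in alignment:
                continue  # assumption: at most one kernel per feature type
            for v in kernels:
                score = validation_f1({**alignment, u: v})
                if score > best_f1:
                    best_f1, best_step = score, (u, v)
        if best_step is None:
            break  # no candidate improves validation F1, so stop
        alignment[best_step[0]] = best_step[1]
        history.append((best_step, best_f1))
    return alignment, history
```

Each iteration costs one model evaluation per remaining (feature, kernel) candidate, so the total number of evaluations grows roughly quadratically in the number of features rather than exponentially as an exhaustive search over Ψ would.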
![Page 22: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/22.jpg)
Evaluation
![Page 23: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/23.jpg)
Datasets/Features
- Standard evaluation datasets:
  - Corel 5K: 5,000 images (landscapes, cities), 260 keywords
  - IAPR TC12: 19,627 images (tourism, sports), 291 keywords
  - ESP Game: 20,768 images (drawings, graphs), 268 keywords
- Standard “Tagprop” feature set [Guillaumin et al. ’09]:
  - Bag-of-words histograms: SIFT [Lowe ’04] and Hue [van de Weijer & Schmid ’06]
  - Global colour histograms: RGB, HSV, LAB
  - Global GIST descriptor [Oliva & Torralba ’01]
  - Descriptors, except GIST, also computed in a 3x1 spatial arrangement [Lazebnik et al. ’06]
![Page 24: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/24.jpg)
Evaluation Metrics
- Standard evaluation metrics [Guillaumin et al. ’09]:
  - Mean per-word Recall (R)
  - Mean per-word Precision (P)
  - F1 Measure
  - Number of words with recall > 0 (N+)
- Fixed annotation length of 5 keywords
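These metrics can be computed from the predicted and ground-truth keyword sets as sketched below. The function name is an assumption, and so are two conventions the slide does not spell out: F1 is taken as the harmonic mean of the mean precision and mean recall, and a word that is never predicted contributes a precision of 0.

```python
def annotation_metrics(predicted, ground_truth, vocab):
    """Mean per-word precision, mean per-word recall, F1 and N+.

    predicted / ground_truth: lists of sets of words, one set per test image.
    """
    precisions, recalls, n_plus = [], [], 0
    for w in vocab:
        n_pred = sum(w in p for p in predicted)
        n_true = sum(w in g for g in ground_truth)
        n_correct = sum(w in p and w in g for p, g in zip(predicted, ground_truth))
        precisions.append(n_correct / n_pred if n_pred else 0.0)
        recalls.append(n_correct / n_true if n_true else 0.0)
        # N+ counts words that are correctly assigned to at least one image.
        n_plus += n_correct > 0
    mean_p = sum(precisions) / len(vocab)
    mean_r = sum(recalls) / len(vocab)
    f1 = 2 * mean_p * mean_r / (mean_p + mean_r) if mean_p + mean_r else 0.0
    return mean_p, mean_r, f1, n_plus
```

Averaging per word rather than per image means rare keywords weigh as much as frequent ones, so a model cannot score well by predicting only the most common tags.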
![Page 25: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/25.jpg)
F1 score of CRM model variants
[Figure: bar chart of F1 (y-axis, 0.00 to 0.45) on Corel 5K, IAPR TC12 and ESP Game for three models: CRM, CRM 15 and SKL-CRM]
![Page 26: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/26.jpg)
F1 score of CRM model variants
[Figure: bar chart of F1 (y-axis, 0.00 to 0.45) on Corel 5K, IAPR TC12 and ESP Game for three models: CRM, CRM 15 and SKL-CRM]
Callouts:
- Original CRM, Duygulu et al. features
![Page 27: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/27.jpg)
F1 score of CRM model variants
[Figure: bar chart of F1 (y-axis, 0.00 to 0.45) on Corel 5K, IAPR TC12 and ESP Game for three models: CRM, CRM 15 and SKL-CRM]
Callouts:
- Original CRM, Duygulu et al. features
- Original CRM, 15 Tagprop features: +71%
![Page 28: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/28.jpg)
F1 score of CRM model variants
[Figure: bar chart of F1 (y-axis, 0.00 to 0.45) on Corel 5K, IAPR TC12 and ESP Game for three models: CRM, CRM 15 and SKL-CRM]
Callouts:
- Original CRM, Duygulu et al. features
- Original CRM, 15 Tagprop features: +71%
- SKL-CRM, 15 Tagprop features: +45%
![Page 29: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/29.jpg)
F1 score of SKL-CRM on Corel 5K
[Figure: F1 (y-axis, 0.31 to 0.45) against feature type (x-axis) for three series: SKL-CRM (Valid F1), SKL-CRM (Test F1) and Tagprop (Test F1). Feature types on the x-axis: HSV_V3H1, DS, HS_V3H1, HSV, HS, HH_V3H1, GIST, LAB_V3H1, RGB_V3H1, RGB, DH_V3H1, DH, HH, LAB, DS_V3H1]
![Page 34: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/34.jpg)
Optimal kernel-feature alignments on Corel 5K
- Optimal alignments [1]:
  - HSV: Multinomial (λ = 0.99)
  - HSV_V3H1: Generalised Gaussian (p = 0.9)
  - Harris Hue (HH_V3H1): Generalised Gaussian (p = 0.1) ≈ Dirac spike!
  - Harris SIFT (HS): Gaussian
  - HS_V3H1: Generalised Gaussian (p = 0.7)
  - Dense SIFT (DS): Laplacian
- Our data-driven kernels are more effective than standard kernels
- No alignment agrees with the literature default assignment, i.e. Gaussian for GIST, Laplacian for colour histogram, χ² for SIFT

[1] V3H1 denotes descriptors computed in a spatial arrangement
![Page 35: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/35.jpg)
SKL-CRM Results vs. Literature (Precision & Recall)
[Figure: bar chart of mean per-word recall (R) and precision (P), y-axis 0.20 to 0.50, on Corel 5K and IAPR TC12 for MBRM, JEC, Tagprop, GS and SKL-CRM]
![Page 36: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/36.jpg)
SKL-CRM Results vs. Literature (N+)
[Figure: bar chart of N+ (y-axis, 0 to 300) on Corel 5K and IAPR TC12 for MBRM, JEC, Tagprop, GS and SKL-CRM]
![Page 37: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/37.jpg)
Conclusion
![Page 38: Sparse Kernel Learning for Image Annotation](https://reader034.vdocuments.net/reader034/viewer/2022052411/55649b31d8b42afd4f8b4bf3/html5/thumbnails/38.jpg)
Conclusions and Future Work
- Proposed a sparse kernel model for image annotation
- Key experimental findings:
  - Default kernel-feature alignment is suboptimal
  - Data-adaptive kernels are superior to standard kernels
  - A sparse set of features is just as effective as a much larger set
  - Greedy forward selection is as effective as gradient ascent
- Future work: superposition of kernels per feature type