Object Recognizing
Posted on 21-Dec-2015

TRANSCRIPT
Object Recognizing
We will discuss:
• Features
• Classifiers
• Example ‘winning’ system
Object Classes
Class Non-class
Class Non-class
Features and Classifiers
Same features with different classifiers
Same classifier with different features
Generic Features
Simple (wavelets) Complex (Geons)
Class-specific Features: Common Building Blocks
Optimal Class Components?
• Large features are too rare
• Small features are found everywhere
Find features that carry the highest amount of information
Entropy
Entropy of a binary variable x ∈ {0, 1}:
H = -∑i p(xi) log2 p(xi)
p = (0.5, 0.5): H = ?
p = (0.1, 0.9): H = 0.47
p = (0.01, 0.99): H = 0.08
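The entropy values in the table can be checked with a short computation (a quick sketch; the helper name `binary_entropy` is ours, not from the lecture):

```python
import math

def binary_entropy(p):
    """Entropy in bits of a binary variable with P(x=1) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# The values from the slide's table
print(round(binary_entropy(0.5), 2))   # 1.0 (the '?' entry)
print(round(binary_entropy(0.1), 2))   # 0.47
print(round(binary_entropy(0.01), 2))  # 0.08
```

Entropy is maximal (1 bit) for the uniform distribution and falls toward 0 as the variable becomes predictable.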
Mutual Information I(C;F)
Class:   1 1 0 1 0 1 0 0
Feature: 1 0 0 1 1 1 0 0
I(C;F) = H(C) – H(C|F)
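For the bit strings on this slide, I(C;F) = H(C) - H(C|F) can be computed directly. A minimal sketch (function names are ours):

```python
import math

def entropy(probs):
    """Entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(c_bits, f_bits):
    """I(C;F) = H(C) - H(C|F) for two aligned binary sequences."""
    n = len(c_bits)
    h_c = entropy([c_bits.count(b) / n for b in "01"])
    h_c_given_f = 0.0
    for f in "01":
        idx = [i for i in range(n) if f_bits[i] == f]
        if not idx:
            continue
        cond = [c_bits[i] for i in idx]
        probs = [cond.count(b) / len(cond) for b in "01"]
        h_c_given_f += (len(idx) / n) * entropy(probs)
    return h_c - h_c_given_f

# The example from the slide
print(round(mutual_information("11010100", "10011100"), 3))  # 0.189
```

Here H(C) = 1 bit (four class and four non-class examples), and observing the feature reduces the uncertainty by about 0.19 bits.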
Optimal classification features
• Theoretically: maximizing delivered information minimizes classification error
• In practice: informative object components can be identified in training images
Selecting Fragments
Mutual Info vs. Threshold
[Plot: mutual information as a function of detection threshold for candidate face fragments: forehead, hairline, mouth, eye, nose, nose bridge, long hairline, chin, two eyes.]
Adding a New Fragment (max-min selection)
ΔMI(Fi, Fk) = MI(C; Fi, Fk) - MI(C; Fk)
Select: maxi mink ΔMI(Fi, Fk)
(Min over the existing fragments Fk, max over the entire pool of candidates Fi.)
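The max-min rule can be sketched as a greedy loop over a pool of candidates. Here fragments are represented abstractly as binary detection vectors over the training images; this toy setup and all names are ours:

```python
import math

def entropy_bits(labels):
    """Entropy in bits of a list of discrete labels."""
    n = len(labels)
    return -sum((labels.count(v) / n) * math.log2(labels.count(v) / n)
                for v in set(labels))

def mi(class_bits, feature_rows):
    """MI(C; F1..Fk): the joint feature is a tuple per example."""
    n = len(class_bits)
    joint = [tuple(row[i] for row in feature_rows) for i in range(n)]
    h_c = entropy_bits(list(class_bits))
    h_cond = 0.0
    for f in set(joint):
        idx = [i for i in range(n) if joint[i] == f]
        h_cond += (len(idx) / n) * entropy_bits([class_bits[i] for i in idx])
    return h_c - h_cond

def max_min_select(class_bits, pool, n_select):
    """Greedy max-min: add the candidate whose worst-case MI gain,
    taken over the fragments already selected, is largest."""
    selected = [max(pool, key=lambda f: mi(class_bits, [f]))]
    pool = [f for f in pool if f != selected[0]]
    while pool and len(selected) < n_select:
        def worst_gain(f):
            return min(mi(class_bits, [f, s]) - mi(class_bits, [s])
                       for s in selected)
        best = max(pool, key=worst_gain)
        selected.append(best)
        pool.remove(best)
    return selected

C = "11110000"
sel = max_min_select(C, ["10101010", "11110000", "11000000"], 2)
print(sel[0])  # 11110000 (the perfectly informative fragment)
```

The min over already-selected fragments penalizes candidates that are redundant with something the classifier already has, even if they are individually informative.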
Highly Informative Face Fragments
Horse-class features
Car-class features
Pictorial features, learned from examples
Fragments with positions
On all detected fragments within their regions
Star model
Detected fragments ‘vote’ for the center location
Find location with maximal vote
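The voting step can be sketched in a few lines. Each detection carries a learned offset to the object center; the toy data and names are ours:

```python
from collections import Counter

def vote_for_center(detections):
    """Each detection is ((x, y), (dx, dy)): the fragment position plus its
    learned offset to the object center. Return the location with, and the
    count of, the most votes."""
    votes = Counter()
    for (x, y), (dx, dy) in detections:
        votes[(x + dx, y + dy)] += 1
    center, count = votes.most_common(1)[0]
    return center, count

# Three fragments agreeing on center (50, 40), plus one outlier
dets = [((30, 40), (20, 0)), ((50, 20), (0, 20)),
        ((60, 45), (-10, -5)), ((5, 5), (1, 1))]
print(vote_for_center(dets))  # ((50, 40), 3)
```

In practice the votes are accumulated into a spatial grid with some tolerance, but the idea is the same: consistent fragments reinforce one center hypothesis.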
Bag of words
Object → Bag of ‘words’
Bag of visual words: a large collection of image patches
1. Feature detection and representation
• Regular grid (Vogel & Schiele, 2003; Fei-Fei & Perona, 2005)
Each class has its own histogram of words
SVM – linear separation in feature space
Optimal Separation
SVM vs. Perceptron
Find a separating plane such that the closest points are as far from it as possible.
The Margin
Separating line: w ∙ x + b = 0
Far line: w ∙ x + b = +1
Their distance: w ∙ ∆x = +1
Separation: |∆x| = 1/|w|
Margin: 2/|w|
Max Margin Classification
The examples are vectors xi; the labels yi are +1 for class, -1 for non-class.
Maximize the margin 2/|w|; equivalently (the form usually used), minimize ½|w|² subject to yi(w ∙ xi + b) ≥ 1.
How to solve such a constrained optimization?
Using Lagrange multipliers: minimize
LP = ½|w|² - ∑i αi [yi(w ∙ xi + b) - 1]
with αi ≥ 0 the Lagrange multipliers.
Minimize LP: set all derivatives to 0:
∂LP/∂w = 0 → w = ∑i αi yi xi
∂LP/∂b = 0 → ∑i αi yi = 0
Also optimize over the αi.
Dual formulation: maximize the Lagrangian w.r.t. the αi under the above conditions, substituted back into LP.
Dual formulation
A mathematically equivalent formulation: maximize the Lagrangian with respect to the αi. After manipulations, a nice concise optimization:
LD = ∑i αi - ½ ∑i ∑j αi αj yi yj <xi ∙ xj>, subject to αi ≥ 0 and ∑i αi yi = 0.
SVM: in simple matrix form
Maximize LD = ∑i αi - ½ αᵀHα, where H is a simple ‘data matrix’: Hij = yi yj <xi ∙ xj>.
We first find the α. From this we can find w, b, and the support vectors.
Final classification: w ∙ x + b = ∑i αi yi <xi ∙ x> + b, because w = ∑i αi yi xi.
Only the terms <xi ∙ x> with support vectors are used.
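A tiny worked example of the matrix form, with one support vector per class; the data and α values are our own toy choices, picked so the dual optimum can be verified by hand:

```python
# Two support vectors, one per class, in 2-D
X = [(1.0, 1.0), (-1.0, -1.0)]
y = [1.0, -1.0]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# The 'data matrix' from the slide: H_ij = y_i y_j <x_i, x_j>
H = [[y[i] * y[j] * dot(X[i], X[j]) for j in range(2)] for i in range(2)]

# For this toy problem the dual optimum is alpha = (0.25, 0.25):
# sum alpha_i y_i = 0 forces alpha_1 = alpha_2, and the margin
# constraints are tight at both points.
alpha = [0.25, 0.25]

# w = sum_i alpha_i y_i x_i
w = tuple(sum(alpha[i] * y[i] * X[i][k] for i in range(2)) for k in range(2))
# b from y_i (w.x_i + b) = 1 at any support vector
b = y[0] - dot(w, X[0])

def classify(x):
    # Equivalent to the sign of sum_i alpha_i y_i <x_i, x> + b
    return 1.0 if dot(w, x) + b > 0 else -1.0

print(w, b)                   # (0.5, 0.5) 0.0
print(classify((2.0, 0.5)))   # 1.0
print(classify((-0.5, -2.0))) # -1.0
```

Both points are support vectors here; with more data, most αi would be zero and only the support vectors would enter the final sum.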
Full story – separable case
Or use f(x) = ∑i αi yi <xi ∙ x> + b
Quadratic Programming (QP)
Minimize (with respect to x): ½ xᵀQx + cᵀx
subject to one or more constraints of the form:
Ax ≤ b (inequality constraints)
Ex = d (equality constraints)
The non-separable case
It turns out that we can get a very similar formulation of the problem and its solution if we penalize incorrect classifications in a certain way. The penalty is Cξi, where ξi ≥ 0 is the distance of the misclassified point from its margin plane. We now minimize ½|w|² + C ∑i ξi, subject to yi(w ∙ xi + b) ≥ 1 - ξi.
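One way to see the penalized objective at work is a subgradient-descent sketch of the same soft-margin cost. Note this is an alternative to the QP solver on the previous slides, not the method described here; the 1-D data and all names are ours:

```python
def train_soft_margin(xs, ys, C=1.0, lr=0.01, epochs=200):
    """Minimize 0.5*w^2 + C * sum_i max(0, 1 - y_i*(w*x_i + b)) by
    subgradient descent on 1-D inputs. The hinge term plays the role
    of the C*xi_i penalties in the constrained formulation."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, yv in zip(xs, ys):
            if yv * (w * x + b) < 1:        # margin violated: xi_i > 0
                w -= lr * (w - C * yv * x)
                b += lr * C * yv
            else:                           # only the regularizer acts
                w -= lr * w
    return w, b

# Mostly separable 1-D data with one outlier (1.5 labeled -1)
xs = [2.0, 3.0, 2.5, -2.0, -3.0, 1.5]
ys = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
w, b = train_soft_margin(xs, ys)
preds = [1.0 if w * x + b > 0 else -1.0 for x in xs]
print(preds)
```

The five consistent points end up classified correctly while the outlier is sacrificed, paying its Cξi penalty, which is exactly the trade-off the soft margin allows.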
Kernel Classification
Using kernels
A kernel K(x, x′) is associated with a mapping x → φ(x). We could compute φ(x) and perform a linear classification in the target space.
It turns out that this can be done directly using the kernel, without the mapping; the results are equivalent. The optimal separation in the target space is the same as what we get using the procedure below, which is just the linear case with the kernel replacing the dot product.
In training, use K(xi, xj) in place of <xi ∙ xj>.
In classification, use f(x) = ∑i αi yi K(xi, x) + b.
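A sketch of classification in kernel form. To keep it short we obtain the multipliers with a kernel perceptron rather than the SVM QP (a deliberate simplification, named plainly), but the resulting classifier has exactly the form ∑i αi yi K(xi, x):

```python
import math

def rbf(u, v, gamma=1.0):
    """RBF (Gaussian) kernel between two points."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def kernel_perceptron(X, y, kernel, epochs=20):
    """Learn multipliers alpha_i for f(x) = sign(sum_i alpha_i y_i K(x_i, x)).
    Simpler than the SVM QP, but it yields the same kernel-sum classifier form."""
    n = len(X)
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            s = sum(alpha[j] * y[j] * kernel(X[j], X[i]) for j in range(n))
            if s * y[i] <= 0:          # misclassified: raise this point's weight
                alpha[i] += 1.0
    return alpha

# XOR: not linearly separable, but separable with an RBF kernel
X = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
y = [1.0, 1.0, -1.0, -1.0]
alpha = kernel_perceptron(X, y, rbf)

def f(x):
    s = sum(a * yi * rbf(xi, x) for a, yi, xi in zip(alpha, y, X))
    return 1.0 if s > 0 else -1.0

print([f(x) for x in X])  # [1.0, 1.0, -1.0, -1.0]
```

XOR is the classic case where no linear w ∙ x + b works, yet the kernel form separates it without ever computing φ(x) explicitly.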
Summary points
• Linear separation with the largest margin, f(x) = w∙x + b
• Dual formulation, f(x) = ∑i αi yi <xi ∙ x> + b
• Natural extension to non-separable classes
• Extension through kernels, f(x) = ∑i αi yi K(xi, x) + b
Felzenszwalb et al.
• Felzenszwalb, McAllester, Ramanan, CVPR 2008. A Discriminatively Trained, Multiscale, Deformable Part Model
Object model using HoG
A bicycle and its ‘root filter’. The root filter is a patch of HoG descriptors: the image is partitioned into 8×8-pixel cells, and in each cell we compute a histogram of gradient orientations.
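A bare-bones sketch of the per-cell orientation histograms. The real descriptor also does block normalization and bin interpolation, which we omit; the function and toy image are ours:

```python
import math

def hog_cells(image, cell=8, n_bins=9):
    """Per-cell histograms of (unsigned) gradient orientation,
    magnitude-weighted. A minimal sketch of the HoG idea."""
    h, w = len(image), len(image[0])
    cells = [[[0.0] * n_bins for _ in range(w // cell)]
             for _ in range(h // cell)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            gx = image[i][j + 1] - image[i][j - 1]   # central differences
            gy = image[i + 1][j] - image[i - 1][j]
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180   # unsigned
            b = int(ang / (180 / n_bins)) % n_bins
            cells[i // cell][j // cell][b] += mag
    return cells

# A vertical step edge: all gradient energy falls in the 0-degree bin
img = [[0.0] * 8 + [1.0] * 8 for _ in range(16)]
feats = hog_cells(img)
print(len(feats), len(feats[0]), len(feats[0][0]))  # 2 2 9
```

A 16×16 image with 8×8 cells yields a 2×2 grid of 9-bin histograms; a filter is then a template over such a grid, scored by dot product.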
Using patches with HoG descriptors and classification by SVM
Dealing with scale: multi-scale analysis
The filter is searched on a pyramid of HoG descriptors, to deal with unknown scale.
Adding Parts
A part is Pi = (Fi, vi, si, ai, bi).
Fi is the filter for the i-th part; vi is the center of a box of possible positions for part i relative to the root position; si is the size of this box.
ai and bi are two-dimensional vectors specifying the coefficients of a quadratic function measuring a score for each possible placement of the i-th part. That is, ai and bi are two numbers each, and the penalty for a deviation (∆x, ∆y) from the expected location is a1 ∆x + a2 ∆y + b1 ∆x² + b2 ∆y².
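The quadratic placement penalty can be written directly from that definition (a trivial sketch; the function name is ours):

```python
def deformation_penalty(dx, dy, a, b):
    """Quadratic placement cost for one part, as on the slide:
    a1*dx + a2*dy + b1*dx^2 + b2*dy^2, with a = (a1, a2), b = (b1, b2)."""
    a1, a2 = a
    b1, b2 = b
    return a1 * dx + a2 * dy + b1 * dx * dx + b2 * dy * dy

# Purely quadratic cost (a = 0): penalty grows with distance from v_i
print(deformation_penalty(0, 0, (0, 0), (1, 1)))  # 0
print(deformation_penalty(2, 1, (0, 0), (1, 1)))  # 5
```

The linear terms let the learned "expected" location shift away from vi; the quadratic terms control how stiffly the part is tied to it.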
Bicycle model: root, parts, spatial map
Person model
Match Score
The full score of a potential match is: ∑i Fi ∙ Hi + ∑i (ai1 ∆xi + ai2 ∆yi + bi1 ∆xi² + bi2 ∆yi²)
Fi ∙ Hi is the appearance part; (∆xi, ∆yi), the deviation of part pi from its expected location in the model, gives the spatial part.
Using SVM:
The score of a match can be expressed as the dot product of a vector β of coefficients with an image-derived vector ψ:
Score = β∙ψ
Use the vectors ψ to train an SVM classifier:
β∙ψ ≥ +1 for class examples
β∙ψ ≤ -1 for non-class examples
β∙ψ ≥ +1 for class examples, β∙ψ ≤ -1 for non-class examples.
However, ψ depends on the placement z, that is, on the values of ∆xi, ∆yi.
We need to take the best ψ over all placements; in their notation, f is this best-placement score. Classification then uses β∙f > 1.
Finding β, SVM training:
In analogy to classical SVMs, we would like to train from labeled examples D = (<x1, y1>, . . . , <xn, yn>) by optimizing the following objective function.
Recognition
Search with gradient descent over the placement, including the levels in the hierarchy. Start with the root filter and find places of high score for it. At these high-scoring locations, search for the optimal placement of the parts, at the level with twice the resolution of the root filter, using gradient descent.
With the optimal placement, use:
β∙ψ ≥ +1 for class examples, β∙ψ ≤ -1 for non-class examples.
• Training -- positive examples with bounding boxes around the objects, and negative examples.
• Learn root filter using SVM
• Define fixed number of parts, at locations of high energy in the root filter HoG
• Use these to start the iterative learning
Hard Negatives
The set M of hard negatives for a known β and data set D: these are the support vectors (y∙f = 1) or misses (y∙f < 1).
Optimal SVM training does not need all the examples; hard examples are sufficient. For a given β, use the positive examples plus C hard examples. Use this data to compute β by standard SVM training. Iterate (with a new set of C hard examples).
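The hard-negative loop can be sketched as follows. We substitute a simple margin-perceptron trainer for the SVM solver and use 1-D scores; everything here is a toy stand-in for the actual pipeline, with names of our own:

```python
def train(xs, ys, epochs=50, lr=0.1):
    """Stand-in margin-perceptron trainer; the paper uses an SVM here."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, yv in zip(xs, ys):
            if yv * (w * x + b) < 1:
                w += lr * yv * x
                b += lr * yv
    return w, b

def mine_hard_negatives(pos, negs, rounds=3, cache=3):
    """Iterate: train on positives plus the current hard negatives,
    rescore all negatives, keep the highest-scoring (hardest) ones."""
    hard = negs[:cache]
    w, b = 0.0, 0.0
    for _ in range(rounds):
        xs = pos + hard
        ys = [1.0] * len(pos) + [-1.0] * len(hard)
        w, b = train(xs, ys)
        # For negatives, smallest y*f means largest f: sort by score
        hard = sorted(negs, key=lambda x: -(w * x + b))[:cache]
    return w, b

pos = [2.0, 3.0, 2.5]
negs = [-3.0, -2.0, -1.0, 0.5, -2.5, -0.2]
w, b = mine_hard_negatives(pos, negs)
print(round(w, 2), round(b, 2))
```

Easy negatives (far on the correct side) never re-enter the cache, so each round spends its training budget on the negatives the current model still confuses.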
All images contain at least 1 bike
Correct person detections
Difficult images, medium results. About 0.5 precision at 0.5 recall
All images contain at least 1 bird
Average precision: roughly, an AP of 0.3 means that in a test with 1000 class images, out of the top 1000 detections, 300 will be true class examples (recall = precision = 0.3).
Future Directions
• Dealing with a very large number of classes: ImageNet, 15,000 categories, 12 million images
• To consider: human-level performance for at least one class