support vector machines a.k.a, whirlwind o’ vector algebra sec. 6.3 svm tutorial by c. burges (on...
Post on 22-Dec-2015
![Page 1: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/1.jpg)
Support Vector Machines
a.k.a. Whirlwind o’ Vector Algebra
Sec. 6.3
SVM Tutorial by C. Burges (on class “resources” page)
![Page 2: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/2.jpg)
Administrivia
•Reminder: straw poll
•RL or Unsup?
![Page 3: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/3.jpg)
Nonlinear data projection
•Suppose you have a “projection function”: $\Phi(x)$
•Original feature space: $\mathbb{R}^d$
•“Projected” space: $\mathbb{R}^D$
•Usually $D \gg d$
•Do learning w/ linear model in the projected space
•Ex:
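The slide's own example did not survive extraction. As a stand-in, here is a minimal sketch of the idea, assuming a hypothetical 1-D dataset and a hypothetical projection `phi(x) = (x, x^2)`: no threshold separates the classes in the original space, but a linear model does in the projected space.

```python
# Hypothetical 1-D data: class +1 sits outside [-1, 1], class -1 inside.
points = [-2.0, -1.5, -0.5, 0.0, 0.5, 1.5, 2.0]
labels = [+1, +1, -1, -1, -1, +1, +1]


def threshold_separable(xs, ys):
    """True if some threshold t and orientation s classify every point correctly."""
    cuts = sorted(xs)
    candidates = [cuts[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(cuts, cuts[1:])]
    candidates += [cuts[-1] + 1.0]
    for t in candidates:
        for s in (+1, -1):
            if all((1 if s * (x - t) > 0 else -1) == y for x, y in zip(xs, ys)):
                return True
    return False


def phi(x):
    """Assumed projection function: map x to (x, x^2)."""
    return (x, x * x)


def linear_predict(w, w0, z):
    """Sign of w . z + w0 for a projected point z."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + w0
    return 1 if score > 0 else -1


# In the projected space the line x^2 = 1 separates the data:
# w = (0, 1), w0 = -1 gives score = x^2 - 1.
w, w0 = (0.0, 1.0), -1.0
predictions = [linear_predict(w, w0, phi(x)) for x in points]

print(threshold_separable(points, labels))  # linear model in the original space
print(predictions == labels)                # linear model in the projected space
```

The weights here are picked by hand to make the point; nothing is learned.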
![Page 4: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/4.jpg)
The catch...
•How many dimensions does the projected space have?
•For degree-k polynomial expansions: $\binom{d+k-1}{k}$
•E.g., for k=4, d=256 (16x16 images), that is 183,181,376
•Yikes!
•For “radial basis functions”, infinitely many
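A quick way to check the blow-up, assuming the count used in the Burges tutorial for a homogeneous degree-k polynomial in d variables, $\binom{d+k-1}{k}$ (the helper name is made up for this sketch):

```python
from math import comb


def homogeneous_poly_dim(d, k):
    """Number of distinct degree-k monomials in d variables: C(d + k - 1, k)."""
    return comb(d + k - 1, k)


print(homogeneous_poly_dim(2, 2))    # 3: x1^2, x1*x2, x2^2
print(homogeneous_poly_dim(256, 4))  # 183181376 for 16x16 images, k = 4
```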
![Page 5: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/5.jpg)
Linear surfaces for cheap
•Can’t directly find linear surfaces in the projected space
•Have to find a clever “method” for finding them indirectly
•It’ll take (quite) a bit of work to get there...
•Will need different criterion than
•We’ll look for the “maximum margin” classifier
•Surface s.t. class 1 (“true”) data falls as far as possible on one side; class -1 (“false”) falls as far as possible on the other
![Page 6: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/6.jpg)
Max margin hyperplanes
[Figure: a separating hyperplane and its margin]
![Page 7: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/7.jpg)
Max margin is unique
[Figure: the maximum-margin hyperplane and its margin]
![Page 8: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/8.jpg)
Exercise
•Given a hyperplane defined by a weight vector $w$ and offset $w_0$
•What is the equation for points on the surface of the hyperplane?
•What are the equations for points on the two margins?
•Give an expression for the distance between a point and the hyperplane (and/or either margin)
•What is the role of $w_0$?
![Page 9: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/9.jpg)
5 minutes of math...
•A dot product (inner product) is a projection of one vector onto another
•When the projection of X onto w is equal to $-w_0/\|w\|$ (i.e., $w \cdot X + w_0 = 0$), then X falls exactly onto the w hyperplane
[Figure: weight vector w, the hyperplane, and a point X]
![Page 10: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/10.jpg)
![Page 11: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/11.jpg)
5 minutes of math...
•BTW, are we sure that hyperplane is perpendicular to w? Why?
•Consider any two vectors, $X_1$ and $X_2$, falling exactly on the hyperplane, then:
$w \cdot X_1 + w_0 = 0$ and $w \cdot X_2 + w_0 = 0$, so $w \cdot (X_1 - X_2) = 0$
$X_1 - X_2$ is some vector in the hyperplane
$w$ is perpendicular to any vector in the hyperplane
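A small numeric check of that argument, assuming the hyperplane is written as w . X + w0 = 0 and using hypothetical values for w, w0, and the two points:

```python
def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical hyperplane in R^2: w . X + w0 = 0 with w = (3, 4), w0 = -12.
w, w0 = (3.0, 4.0), -12.0

# Two points chosen to lie on it: 3*4 + 4*0 - 12 = 0 and 3*0 + 4*3 - 12 = 0.
x1, x2 = (4.0, 0.0), (0.0, 3.0)
on_plane = dot(w, x1) + w0 == 0 and dot(w, x2) + w0 == 0

# Their difference lies in the hyperplane, and w is orthogonal to it.
diff = tuple(a - b for a, b in zip(x1, x2))
print(on_plane, dot(w, diff))  # True 0.0
```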
![Page 12: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/12.jpg)
![Page 13: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/13.jpg)
5 minutes of math...
•Projections on one side of the line have dot products >0 (i.e., $w \cdot X + w_0 > 0$)...
•... and on the other, <0
[Figure: weight vector w, the hyperplane, and a point X]
![Page 14: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/14.jpg)
5 minutes of math...
•What is the distance from any vector X to the hyperplane?
[Figure: point X at unknown distance r from the hyperplane with normal w]
![Page 15: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/15.jpg)
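5 minutes of math...
•What is the distance from any vector X to the hyperplane?
•Write X as a point on plane + offset from plane: $X = X_p + r\,\frac{w}{\|w\|}$
[Figure: X decomposed into a point on the plane plus an offset along w]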
![Page 16: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/16.jpg)
5 minutes of math...
•Now:
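The derivation on this slide did not survive extraction. A sketch of the standard argument, assuming the hyperplane is $w \cdot X + w_0 = 0$ and writing $X_p$ (a name introduced here) for the point on the plane closest to $X$:

```latex
\begin{aligned}
X &= X_p + r\,\frac{w}{\|w\|}, \qquad w \cdot X_p + w_0 = 0 \\
w \cdot X + w_0 &= (w \cdot X_p + w_0) + r\,\frac{w \cdot w}{\|w\|} = r\,\|w\| \\
r &= \frac{w \cdot X + w_0}{\|w\|}
\end{aligned}
```

The offset from the plane must point along $w$ because $w$ is perpendicular to the plane; $r$ is how far you travel along the unit vector $w/\|w\|$.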
![Page 17: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/17.jpg)
5 minutes of math...
•Theorem: The distance, r, from any point X to the hyperplane defined by $w$ and $w_0$ is given by: $r = \frac{w \cdot X + w_0}{\|w\|}$
•Lemma: The distance from the origin to the hyperplane is given by: $\frac{w_0}{\|w\|}$
•Also: r>0 for points on one side of the hyperplane; r<0 for points on the other
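The signed distance is easy to compute directly. A minimal sketch, assuming the hyperplane convention w . X + w0 = 0 and hypothetical numbers:

```python
from math import sqrt


def signed_distance(w, w0, x):
    """Signed distance r = (w . x + w0) / ||w|| from point x to the hyperplane."""
    norm = sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + w0) / norm


w, w0 = (3.0, 4.0), -12.0  # hypothetical hyperplane 3*x1 + 4*x2 - 12 = 0

print(signed_distance(w, w0, (4.0, 0.0)))  # 0.0   (on the plane)
print(signed_distance(w, w0, (4.0, 3.0)))  # 2.4   (positive side)
print(signed_distance(w, w0, (0.0, 0.0)))  # -2.4  (origin: w0 / ||w||)
```

Note that the origin case is signed: the origin is 2.4 units from the plane, on the negative side.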
![Page 18: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/18.jpg)
Back to SVMs & margins
•The margins are parallel to hyperplane, so are defined by same w, plus constant offsets
[Figure: hyperplane with normal w and a margin at offset b on each side]
![Page 19: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/19.jpg)
Back to SVMs & margins
•The margins are parallel to hyperplane, so are defined by same w, plus constant offsets
•Want to ensure that all data points are “outside” the margins
[Figure: hyperplane with normal w and a margin at offset b on each side]
![Page 20: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/20.jpg)
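Maximizing the margin
•So now we have a learning criterion function:
•Pick w to maximize b s.t. all points still satisfy $y_i \, \frac{w \cdot X_i + w_0}{\|w\|} \ge b$
•Note: w.l.o.g. can rescale w arbitrarily (why?)
•So can formulate full problem as:
Minimize: $\frac{1}{2}\|w\|^2$
Subject to: $y_i (w \cdot X_i + w_0) \ge 1$ for all $i$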
•But how do you do that? And how does this help?
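One way to see the rescaling step: multiplying w and w0 by the same positive constant leaves the hyperplane and every geometric distance unchanged, so you can pick the scale at which the closest point satisfies y (w . X + w0) = 1. The margin is then 1/||w||, and maximizing it is the same as minimizing ||w||. A sketch with hypothetical data and a hand-picked hyperplane:

```python
from math import sqrt


def norm(w):
    return sqrt(sum(wi * wi for wi in w))


def functional_margin(w, w0, x, y):
    """y * (w . x + w0): positive when x is on the correct side."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + w0)


# Hypothetical toy data: two points per class on either side of x1 = 0.
data = [((2.0, 1.0), +1), ((3.0, -1.0), +1), ((-2.0, 0.0), -1), ((-4.0, 2.0), -1)]

w, w0 = (5.0, 0.0), 0.0  # the plane x1 = 0 at an arbitrary scale
geo_before = min(functional_margin(w, w0, x, y) for x, y in data) / norm(w)

# Rescale so the closest point has functional margin exactly 1.
scale = min(functional_margin(w, w0, x, y) for x, y in data)
w_s, w0_s = tuple(wi / scale for wi in w), w0 / scale
geo_after = min(functional_margin(w_s, w0_s, x, y) for x, y in data) / norm(w_s)

# Geometric margin is unchanged by rescaling and equals 1 / ||w_s||.
print(geo_before, geo_after, 1.0 / norm(w_s))
```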
![Page 21: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/21.jpg)
Quadratic programming
•Problems of the form
Minimize: $\frac{1}{2} z^\top Q z + c^\top z$
Subject to: $A z \le h$
•are called “quadratic programming” problems
•There are off-the-shelf methods to solve them
•Actually solving this is way, way beyond the scope of this class
•Consider it a black box
•If a solution exists, it will be found & be unique
•Expensive, but not intractably so
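To see how the max-margin problem (minimize $\frac{1}{2}\|w\|^2$ subject to $y_i (w \cdot X_i + w_0) \ge 1$) fits the black box, here is one common way to write a generic QP, with the SVM as an instance. The symbols $z$, $Q$, $c$, $A$, $h$ are assumptions introduced for this sketch, not necessarily the slide's notation.

```latex
\begin{aligned}
\text{Generic QP:}\quad & \min_{z}\ \tfrac{1}{2} z^\top Q z + c^\top z
  \quad \text{s.t.}\quad A z \le h \\
\text{SVM instance:}\quad & z = \begin{pmatrix} w \\ w_0 \end{pmatrix},\quad
  Q = \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix},\quad c = 0, \\
& A_{i,:} = -y_i \begin{pmatrix} X_i^\top & 1 \end{pmatrix},\quad h_i = -1
\end{aligned}
```

Row $i$ of $A z \le h$ reads $-y_i (w \cdot X_i + w_0) \le -1$, which is the margin constraint with the sign flipped. Because $w_0$ does not appear in the objective, $Q$ is positive semidefinite rather than positive definite.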
![Page 22: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/22.jpg)
Nonseparable data
•What if the data isn’t linearly separable?
•Project into higher dim space (we’ll get there)
•Allow some “slop” in the system
•Allow margins to be violated “a little”
[Figure: hyperplane with normal w]
![Page 23: Support Vector Machines a.k.a, Whirlwind o’ Vector Algebra Sec. 6.3 SVM Tutorial by C. Burges (on class “resources” page)](https://reader035.vdocuments.net/reader035/viewer/2022062421/56649d7e5503460f94a61bbd/html5/thumbnails/23.jpg)
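The new “slackful” QP
•The $\xi_i$ are “slack variables”
•Allow margins to be violated a little
•Still want to minimize margin violations, so add them to QP instance:
•Minimize: $\frac{1}{2}\|w\|^2 + C \sum_i \xi_i$
•Subject to: $y_i (w \cdot X_i + w_0) \ge 1 - \xi_i$ and $\xi_i \ge 0$ for all $i$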
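For intuition only, here is a rough sketch that swaps the QP solver for plain subgradient descent. It assumes the usual soft-margin objective, $\frac{1}{2}\|w\|^2 + C \sum_i \xi_i$, and uses the fact that at the optimum each slack equals $\max(0,\ 1 - y_i (w \cdot X_i + w_0))$. The data, `C`, `lr`, and `steps` are all hypothetical choices, and this is not how an off-the-shelf QP package works.

```python
def soft_margin_objective(w, w0, data, C):
    """0.5*||w||^2 + C * sum of slacks, slack_i = max(0, 1 - y_i*(w.x_i + w0))."""
    slack = [max(0.0, 1.0 - y * (sum(a * b for a, b in zip(w, x)) + w0))
             for x, y in data]
    return 0.5 * sum(a * a for a in w) + C * sum(slack)


def train_soft_margin(data, C=1.0, steps=2000, lr=0.01):
    """Plain subgradient descent on the objective above (not a QP solver)."""
    dim = len(data[0][0])
    w, w0 = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_w0 = list(w), 0.0  # gradient of 0.5*||w||^2
        for x, y in data:
            if y * (sum(a * b for a, b in zip(w, x)) + w0) < 1.0:  # slack active
                grad_w = [g - C * y * xi for g, xi in zip(grad_w, x)]
                grad_w0 -= C * y
        w = [a - lr * g for a, g in zip(w, grad_w)]
        w0 -= lr * grad_w0
    return w, w0


# Hypothetical data: the last +1 point lies on the segment between the two
# -1 points, so no hyperplane separates the classes and slack is unavoidable.
data = [((2.0, 2.0), +1), ((3.0, 1.0), +1), ((-2.0, -1.0), -1),
        ((-3.0, -2.0), -1), ((-2.5, -1.5), +1)]

w, w0 = train_soft_margin(data)
start = soft_margin_objective([0.0, 0.0], 0.0, data, C=1.0)
end = soft_margin_objective(w, w0, data, C=1.0)
print(start, end)  # objective at w = 0 versus after training
```

Larger `C` punishes margin violations more heavily; smaller `C` trades violations for a wider margin.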