Machine Learning using Matlab - Uni Konstanz

Lecture 7: Support Vector Machine (SVM)


Post on 20-May-2020



Note
● The deadline for presentation applications is 11.06.2017. If you have not sent your application yet, please send it as soon as possible.
● The presentation schedule will be released on our course website next week.
● In Thursday's lab session I will give a quiz. If you finish it in time, you will receive a bonus on your final score.


Outline
● Primal and dual forms
● Feature map
● Kernel trick
● Regression
● SVM toolbox


Intuition

[Figure: separating hyperplane and its margin, with the support vectors labeled.]

SVM is also called a “maximum margin classifier”.


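“Hard” margin
● Given training examples $(x_i, y_i)$, $i = 1, \dots, m$, with labels $y_i \in \{-1, +1\}$, SVM aims to find an optimal hyperplane $w^\top x + b = 0$ so that:

$$\max_{w, b} \; \frac{2}{\|w\|} \quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1, \; i = 1, \dots, m$$

● It is equivalent to minimizing the following function:

$$\min_{w, b} \; \frac{1}{2} \|w\|^2 \quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1, \; i = 1, \dots, m$$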


Which classifier is better?

Tradeoff between the margin and the number of mistakes in the training data


Introduce “slack” variables

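[Figure: margin with the support vectors labeled; slack variables $\xi_i \ge 0$ measure how far a point violates the margin.]

● For $0 < \xi_i \le 1$, the point is between the margin and the correct side of the hyperplane. This is a margin violation.
● For $\xi_i > 1$, the point is misclassified.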


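“Soft” margin solution
The optimization problem becomes:

$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1 - \xi_i, \; \xi_i \ge 0, \; i = 1, \dots, m$$

● Every constraint can be satisfied if $\xi_i$ is sufficiently large.
● C is a regularization parameter:
  ○ Small C ⇒ large margin
  ○ Large C ⇒ narrow margin
  ○ C = ∞ ⇒ hard margin
● This is called the primal form of SVM.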
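To make the role of the slack variables and of C concrete, the following minimal sketch evaluates a soft-margin primal objective for a fixed hyperplane. It assumes the common textbook form 0.5·‖w‖² + C·Σ ξ_i with ξ_i = max(0, 1 − y_i(w·x_i + b)); the data points, w, b, and the helper names are made up for illustration and do not come from the lecture.

```python
# Sketch: soft-margin primal objective for a FIXED (w, b).
# Assumed form: 0.5 * ||w||^2 + C * sum_i xi_i (toy data, hypothetical names).

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def slacks(w, b, X, y):
    """Smallest xi_i >= 0 that satisfies y_i (w.x_i + b) >= 1 - xi_i."""
    return [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]

def primal_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * (sum of slacks)."""
    return 0.5 * dot(w, w) + C * sum(slacks(w, b, X, y))

# Four toy points on a line; the third one carries the "wrong" label.
X = [[2.0, 0.0], [0.5, 0.0], [-0.5, 0.0], [-2.0, 0.0]]
y = [1, 1, 1, -1]
w, b = [1.0, 0.0], 0.0

xi = slacks(w, b, X, y)
# xi[1] lies in (0, 1]: margin violation; xi[2] > 1: misclassified.
small_C = primal_objective(w, b, X, y, C=1.0)
large_C = primal_objective(w, b, X, y, C=10.0)
```

With a larger C the same slack costs more, so the optimizer is pushed towards fewer violations at the price of a narrower margin.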


Different regularization


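Dual form
With Lagrange multipliers $\alpha_i$, we have the dual form of SVM:

$$\max_{\alpha} \; \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \; \sum_{i=1}^{m} \alpha_i y_i = 0$$

The decision function can be rewritten:

$$f(x) = \sum_{i=1}^{m} \alpha_i y_i \, x_i^\top x + b$$

Prediction is very fast because most $\alpha_i$ are zero: only the support vectors contribute to the sum.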
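The sparsity argument can be sketched in a few lines. The example assumes the standard dual decision function f(x) = Σ α_i y_i ⟨x_i, x⟩ + b. The multipliers below were worked out by hand for this four-point toy set (they are not the output of a solver), and all names are illustrative.

```python
# Sketch: dual-form decision function that skips training points with alpha_i = 0.
# Toy data and hand-derived multipliers; names are hypothetical.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def decision(x, alphas, X, y, b):
    """f(x) = sum_i alpha_i * y_i * <x_i, x> + b over the support vectors only."""
    return b + sum(a * yi * dot(xi, x)
                   for a, xi, yi in zip(alphas, X, y) if a > 0.0)

X = [[1.0, 1.0], [-1.0, -1.0], [3.0, 3.0], [-3.0, -3.0]]
y = [1, -1, 1, -1]
# Only the two points closest to the boundary get non-zero multipliers;
# they satisfy sum_i alpha_i * y_i = 0 and give w = (0.5, 0.5), b = 0.
alphas = [0.25, 0.25, 0.0, 0.0]
b = 0.0

n_support = sum(1 for a in alphas if a > 0.0)
f_pos = decision([2.0, 0.0], alphas, X, y, b)     # positive side
f_neg = decision([-1.0, -1.0], alphas, X, y, b)   # a support vector, sits on the margin
```

The cost of one prediction scales with the number of support vectors, not with the size of the training set.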


What if the data is not linearly separable?
● In logistic regression, we add more parameters to make the decision boundary nonlinear.
● However, we cannot do the same in SVM, because we still want to learn a linear classifier.


Map data into higher dimension

[Figure: after the mapping, the data is linearly separable in 3D space by a plane with normal vector w.]


Feature map
● By mapping data from a d-dimensional to a D-dimensional space (d < D), we can still learn a linear classifier.
● $\phi: x \in \mathbb{R}^d \mapsto \phi(x) \in \mathbb{R}^D$, where $\phi$ is called the feature map.

What changes in classifier learning after mapping the features?
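As a small illustration of a feature map, the sketch below lifts 2D points into 3D with the hypothetical map φ(x1, x2) = (x1, x2, x1² + x2²). Points inside and outside a circle cannot be split by a line in 2D, but in 3D a plane separates them. The map, the data, and the hand-picked plane are assumptions chosen for the example, not taken from the slides.

```python
# Sketch: a 2D -> 3D feature map that makes "inside vs. outside a circle" linear.
# phi, the data and the plane (w, b) are illustrative assumptions.

def phi(x):
    """Feature map R^2 -> R^3: keep both coordinates, add the squared radius."""
    x1, x2 = x
    return (x1, x2, x1 * x1 + x2 * x2)

def f(x, w, b):
    """Linear classifier applied in the mapped space: w . phi(x) + b."""
    return sum(wi * zi for wi, zi in zip(w, phi(x))) + b

inner = [(0.0, 0.5), (0.5, 0.0), (-0.3, -0.3)]   # class -1, near the origin
outer = [(2.0, 0.0), (0.0, -2.0), (1.5, 1.5)]    # class +1, far from the origin

# Hand-picked plane "third coordinate = 1" in 3D, i.e. the circle of radius 1 in 2D.
w, b = (0.0, 0.0, 1.0), -1.0

inner_ok = all(f(x, w, b) < 0 for x in inner)
outer_ok = all(f(x, w, b) > 0 for x in outer)
```

The classifier stays linear in φ(x); only its preimage in the original 2D space is a nonlinear (circular) boundary.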


Kernel trick - demonstration


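Transformed feature in primal form
Classifier:

$$f(x) = w^\top \phi(x) + b, \quad w \in \mathbb{R}^D$$

Optimization:

$$\min_{w \in \mathbb{R}^D, \, b, \, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{s.t.} \quad y_i (w^\top \phi(x_i) + b) \ge 1 - \xi_i, \; \xi_i \ge 0$$

● Simply map $x$ to $\phi(x)$, where the data is separable.
● Solve for $w$ in the high-dimensional space.
● There are many more parameters to learn for $w$ if D >> d. Can we avoid this?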


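Transformed feature in dual form
Classifier:

$$f(x) = \sum_{i=1}^{m} \alpha_i y_i \, \phi(x_i)^\top \phi(x) + b$$

Optimization:

$$\max_{\alpha} \; \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \, \phi(x_i)^\top \phi(x_j) \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \; \sum_{i=1}^{m} \alpha_i y_i = 0$$

● In dual form, $\phi(x)$ only occurs in pairs $\phi(x_i)^\top \phi(x_j)$.
● Only the m-dimensional vector $\alpha$ needs to be learnt.
● Kernel: $k(x_i, x_j) = \phi(x_i)^\top \phi(x_j)$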


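Kernel in dual form
Classifier:

$$f(x) = \sum_{i=1}^{m} \alpha_i y_i \, k(x_i, x) + b$$

Optimization:

$$\max_{\alpha} \; \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j) \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \; \sum_{i=1}^{m} \alpha_i y_i = 0$$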


Kernel trick

1. The classifier can be learnt and applied without explicitly computing $\phi(x)$.
2. All that is required is to compute the kernel function $k(x, x')$.
3. The complexity of learning depends on the number of training examples m rather than on the dimension D of the feature space.
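A quick numerical check of the first two points, as a sketch: for 2D inputs, the kernel k(x, z) = (x·z)² equals the inner product of the explicit feature vectors φ(x) = (x1², √2·x1·x2, x2²). This particular kernel and map are a standard textbook pairing chosen here for illustration; the function names are hypothetical.

```python
import math

# Sketch: the kernel value equals an inner product in feature space,
# so phi never has to be computed when only k is needed.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def phi(x):
    """Explicit feature map R^2 -> R^3 matching the kernel (x . z)^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def kernel(x, z):
    """Homogeneous degree-2 polynomial kernel, computed in the ORIGINAL space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
via_kernel = kernel(x, z)            # one 2D inner product and a square
via_phi = dot(phi(x), phi(z))        # explicit mapping followed by a 3D inner product
same = math.isclose(via_kernel, via_phi)
```

For this tiny example the saving is negligible, but the kernel side keeps its cost when D grows very large, while the explicit side does not.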


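Common kernel functions
● Linear kernel: $k(x, x') = x^\top x'$
● Polynomial kernel: $k(x, x') = (1 + x^\top x')^d$
  ○ Contains all polynomial terms up to degree d
● Radial Basis Function (Gaussian kernel): $k(x, x') = \exp\left(-\frac{\|x - x'\|^2}{2\sigma^2}\right)$
  ○ Infinite-dimensional feature space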

How many parameters do you need to tune?
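The three kernels can be written down directly, which also makes their tunable parameters visible. This is a sketch under common conventions: the polynomial kernel is taken as (1 + x·z)^d and the Gaussian kernel as exp(−‖x − z‖² / (2σ²)). Other libraries parameterize these differently (for example with a gamma factor), so the exact forms and function names here are assumptions.

```python
import math

# Sketch: common kernel functions with their hyperparameters as arguments.

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def linear_kernel(x, z):
    """k(x, z) = x . z  (no kernel parameter)."""
    return dot(x, z)

def polynomial_kernel(x, z, degree):
    """k(x, z) = (1 + x . z)^degree  (one parameter: the degree)."""
    return (1.0 + dot(x, z)) ** degree

def rbf_kernel(x, z, sigma):
    """k(x, z) = exp(-||x - z||^2 / (2 sigma^2))  (one parameter: sigma)."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

x, z = [1.0, 2.0], [3.0, 4.0]
k_lin = linear_kernel(x, z)
k_poly = polynomial_kernel(x, z, degree=2)
k_rbf_same = rbf_kernel(x, x, sigma=1.0)              # identical points
k_rbf_far = rbf_kernel([0.0, 0.0], [1.0, 1.0], sigma=1.0)
```

Each kernel parameter is tuned in addition to the regularization parameter C of the SVM itself.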


Kernel trick - summary
● The classifier can be learnt in a high-dimensional feature space without explicitly knowing the feature map.
● Kernels can also be used elsewhere, for example in kernel PCA and kernel k-means.
● Different kernel functions may be applied to different scenarios.
● However, the optimal parameters have to be chosen empirically.


Support Vector Regression (SVR)

ε-insensitive loss
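The ε-insensitive loss is commonly defined as max(0, |y − f(x)| − ε): errors inside a tube of half-width ε around the prediction cost nothing, and larger errors are penalized linearly. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
import math

# Sketch: epsilon-insensitive loss as commonly used in SVR.

def eps_insensitive_loss(y_true, y_pred, eps):
    """max(0, |y_true - y_pred| - eps): zero inside the eps-tube, linear outside."""
    return max(0.0, abs(y_true - y_pred) - eps)

inside = eps_insensitive_loss(1.0, 1.05, eps=0.1)   # error 0.05 is within the tube
outside = eps_insensitive_loss(1.0, 1.5, eps=0.1)   # error 0.5 exceeds the tube by 0.4
```

Training points that fall strictly inside the tube incur no loss, which is what makes the SVR solution sparse.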


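SVR primal form

$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{m} (\xi_i + \xi_i^*)$$

$$\text{s.t.} \quad y_i - w^\top x_i - b \le \epsilon + \xi_i, \quad w^\top x_i + b - y_i \le \epsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0$$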


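SVR dual form

Introducing Lagrange multipliers $\alpha_i, \alpha_i^*$, we have:

$$\max_{\alpha, \alpha^*} \; -\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) \, x_i^\top x_j - \epsilon \sum_{i=1}^{m} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{m} y_i (\alpha_i - \alpha_i^*)$$

$$\text{s.t.} \quad \sum_{i=1}^{m} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C$$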


SVR - summary
● SVR is the extension of SVM to regression, so the optimization algorithms for SVM can be applied to SVR directly [Smola ’04].
● Likewise, the “kernel trick” can also be applied to SVR.
● Q: How many parameters should I tune if I use a Gaussian kernel?

Three parameters, namely C, σ, and ε.


SVM toolbox
1. Libsvm: https://www.csie.ntu.edu.tw/~cjlin/libsvm/
2. SVMlight: http://svmlight.joachims.org/


SVM - summary
● SVM was originally proposed by Boser, Guyon and Vapnik in 1992 and gained increasing popularity in the late 1990s.
● SVM can be applied to complex data types beyond feature vectors (e.g. graphs, sequences, relational data) by designing kernel functions for such data.
● For multiclass problems, you can use either a one-vs-rest scheme or a multi-class SVM, e.g. [Weston ’99] and [Crammer ’01].
● SVM training is a convex problem, so it has a globally optimal solution. However, the computational cost increases with the number of training examples. Therefore, more efficient optimization algorithms have been proposed, e.g. SMO [Platt ’99] and [Joachims ’99].
● Tuning SVMs remains a black art: selecting a specific kernel and its parameters is usually done in a try-and-see manner.