cs540 machine learning lecture 6 - cs.ubc.camurphyk/teaching/cs540-fall08/l6.pdf · cs540 machine...
TRANSCRIPT
1
CS540 Machine learningLecture 6
2
Last time
• Linear and ridge regression (QR, SVD, LMS)
3
This time
• Logistic regression• MLE
• Perceptron algorithm• IRLS
• Multinomial logistic regression
4
Logistic regression
• Model for binary classification
SigmoidorLogisticfunction
5
Logistic regression in 1d
SAT scores
6
Logistic regression in 2d
7
Notation
8
Why the logistic function?
• McCulloch Pitts model of neuron
9
Log-odds ratio
10
This time
• Logistic regression• MLE
• Perceptron algorithm• IRLS
• Multinomial logistic regression
11
MLE
12
Gradient
13
Perceptron algorithm
14
Perceptron algorithm
15
Convergence
• If linearly separable (so errors = 0), guaranteed to converge, but may do so slowly
• If not linearly separable, may not converge
16
This time
• Logistic regression• MLE
• Perceptron algorithm• IRLS
• Multinomial logistic regression
17
IRLS
• Iteratively reweighted least squares for finding the MLE for logistic regression
• Special case of Newton’s algorithm
18
Newton’s method
• Consider a quadratic objective
• Gradient methods may take many steps, but we can “hop” to the minimum in 1 step if we use
• In general, g(x) will not be linear in x, but we can linearize it. Alternatively we can approximate f by a quadratic.
19
Linearize the gradient
• First order Taylor series approx of g(x) around x_k
0
g(x)g
lin(x)
xk xk+dk
20
Approximate the function
• Construct second order Taylor of f(x) around x_k
f(x)fquad
(x)
xk xk+dk
21
Newton’s algorithm
Use QR to solve H dk = -gk for dk
22
Gradient and Hessian
23
Generic solver
24
IRLS
25
L2 regularization
• Needed to prevent overfitting and w -> inf
26
This time
• Logistic regression• MLE
• Perceptron algorithm• IRLS
• Multinomial logistic regression
27
Multinomial logistic regression
• Y in {1,…,C} categorical
Binary case
softmax
28
Softmax function
29
MLE
Can compute gradient and Hessian and use Newton’s method
Can add L2 regularizer
Can use faster optimization methods eg bound optimization