
Page 1: CS 461: Machine Learning, Lecture 2

Dr. Kiri Wagstaff
kiri.wagstaff@calstatela.edu

Page 2: Introductions

Share with us:
- Your name
- Grad or undergrad?
- What inspired you to sign up for this class?
- Anything specific that you want to learn from it?

Page 3: Today’s Topics

- Review
- Homework 1
- Supervised Learning Issues
- Decision Trees
- Evaluation
- Weka

Page 4: Review

- Machine Learning: computers learn from their past experience
- Inductive Learning: generalize to new data
- Supervised Learning: training data are <x, g(x)> pairs, i.e., a known label or output value for each training example; covers classification and regression
- Instance-Based Learning: 1-Nearest Neighbor and k-Nearest Neighbors (see the sketch below)

[Figure: the learning loop — induction/generalization turns observations into a hypothesis; the hypothesis drives actions and guesses; feedback and more observations lead to refinement.]
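As a refresher on instance-based prediction, here is a minimal 1-Nearest-Neighbor sketch in Java; the class and method names are our own, not from any library the slides use.

```java
import java.util.List;

/** Minimal 1-Nearest-Neighbor sketch: predict the label of the closest
 *  training example. Assumes a non-empty training set. */
public class NearestNeighbor {
    /** A labeled training example: feature vector x plus its class label. */
    public record Example(double[] x, String label) {}

    /** Classify a query point by copying the label of its nearest neighbor. */
    public static String classify(List<Example> training, double[] query) {
        Example best = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Example e : training) {
            double d = 0;
            for (int i = 0; i < query.length; i++) {  // squared Euclidean distance
                double diff = e.x()[i] - query[i];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = e; }
        }
        return best.label();
    }
}
```

k-Nearest Neighbors generalizes this by keeping the k closest examples and taking a majority vote over their labels.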

Page 5: Homework 1

Solution/Discussion

Page 6: Issues in Supervised Learning

1. Representation: which features to use?

2. Model Selection: complexity, noise, bias

Page 7: When don’t you want zero error?

[Alpaydin 2004 © The MIT Press]

Page 8: Model Selection and Complexity

Rule of thumb: prefer the simplest model that has “good enough” performance

[Alpaydin 2004 © The MIT Press]

“Make everything as simple as possible, but not simpler.” -- Albert Einstein

Page 9: Another reason to prefer simple models…

[Tom Dietterich]

Page 10: Decision Trees

Chapter 9

Page 11: Decision Trees

Example: Diagnosis of Psychotic Disorders

Page 12: (Hyper-)Rectangles == Decision Tree

[Alpaydin 2004 © The MIT Press]

[Figure label: discriminant — the tree’s axis-parallel tests partition the input space into hyper-rectangles.]

Page 13: Measuring Impurity

1. Calculate error using the majority label

2. More sensitive: use entropy

For node m, N_m instances reach m, and N_m^i of them belong to class C_i:

$$\hat{P}(C_i \mid \mathbf{x}, m) \equiv p_m^i = \frac{N_m^i}{N_m}$$

Node m is pure if p_m^i is 0 or 1. Entropy:

$$\mathcal{I}_m = -\sum_{i=1}^{K} p_m^i \log_2 p_m^i$$

Majority-label (misclassification) error:

$$I_m = 1 - \max_i \left( p_m^i \right)$$

After a split into n branches, where N_{mj} instances take branch j:

$$\mathcal{I}'_m = -\sum_{j=1}^{n} \frac{N_{mj}}{N_m} \sum_{i=1}^{K} p_{mj}^i \log_2 p_{mj}^i
\qquad
I'_m = \sum_{j=1}^{n} \frac{N_{mj}}{N_m} \left( 1 - \max_i \left( p_{mj}^i \right) \right)$$

[Alpaydin 2004 © The MIT Press]
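Rendering the entropy formulas as code, here is a minimal Java sketch; the class and method names are our own, not from the text or from Weka.

```java
/** Entropy impurity at a node and after a split, following the formulas
 *  above. Class counts play the role of N_m^i. */
public class Impurity {
    /** I_m = -sum_i p_m^i log2 p_m^i, from the class counts at node m. */
    public static double entropy(int[] classCounts) {
        double total = 0;
        for (int c : classCounts) total += c;
        double impurity = 0.0;
        for (int c : classCounts) {
            if (c == 0) continue;                 // treat 0 * log 0 as 0
            double p = c / total;
            impurity -= p * Math.log(p) / Math.log(2);
        }
        return impurity;
    }

    /** I'_m: child entropies weighted by the fractions N_mj / N_m. */
    public static double splitEntropy(int[][] childCounts) {
        double total = 0;
        double[] childTotal = new double[childCounts.length];
        for (int j = 0; j < childCounts.length; j++) {
            for (int c : childCounts[j]) childTotal[j] += c;
            total += childTotal[j];
        }
        double impurity = 0.0;
        for (int j = 0; j < childCounts.length; j++)
            impurity += childTotal[j] / total * entropy(childCounts[j]);
        return impurity;
    }
}
```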

Page 14: Should we play tennis?

[Tom Mitchell]

Page 15: How well does it generalize?

[Tom Dietterich, Tom Mitchell]

Page 16: Decision Tree Construction Algorithm

[Alpaydin 2004 © The MIT Press]
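The referenced figure gives the textbook’s greedy, recursive tree-growing procedure. Below is a rough but runnable Java sketch of the same idea, simplified to one numeric feature and a binary label; the names are our own, not Alpaydin’s pseudocode.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Greedy top-down tree construction for one numeric feature and a binary
 *  label: split on the threshold with lowest weighted child entropy, and
 *  stop (make a leaf) once a node is pure enough. Assumes non-empty input. */
public class TinyTree {
    record Example(double x, boolean label) {}

    /** Leaves carry a prediction; internal nodes carry a threshold test. */
    static class Node {
        Boolean prediction;   // non-null iff this is a leaf
        double threshold;
        Node left, right;     // x < threshold goes left, else right
    }

    static double log2(double v) { return Math.log(v) / Math.log(2); }

    /** Binary entropy of the label distribution at a node. */
    static double entropy(List<Example> data) {
        double p = data.stream().filter(Example::label).count() / (double) data.size();
        return (p == 0 || p == 1) ? 0 : -(p * log2(p) + (1 - p) * log2(1 - p));
    }

    static Node build(List<Example> data, double maxEntropy) {
        Node node = new Node();
        List<Example> sorted = new ArrayList<>(data);
        sorted.sort(Comparator.comparingDouble(Example::x));
        int bestI = -1;
        double bestImpurity = Double.POSITIVE_INFINITY;
        if (entropy(data) >= maxEntropy) {
            // Candidate thresholds: midpoints between distinct adjacent values.
            for (int i = 1; i < sorted.size(); i++) {
                if (sorted.get(i - 1).x() == sorted.get(i).x()) continue;
                List<Example> l = sorted.subList(0, i), r = sorted.subList(i, sorted.size());
                double im = (l.size() * entropy(l) + r.size() * entropy(r)) / data.size();
                if (im < bestImpurity) { bestImpurity = im; bestI = i; }
            }
        }
        if (bestI < 0) {  // pure enough, or no valid split: majority-label leaf
            node.prediction = 2 * sorted.stream().filter(Example::label).count() >= sorted.size();
            return node;
        }
        node.threshold = (sorted.get(bestI - 1).x() + sorted.get(bestI).x()) / 2;
        node.left = build(sorted.subList(0, bestI), maxEntropy);   // recurse per branch
        node.right = build(sorted.subList(bestI, sorted.size()), maxEntropy);
        return node;
    }
}
```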

Page 17: Decision Trees = Profit!

PredictionWorks: “Increasing customer loyalty through targeted marketing”

Page 18: Decision Trees in Weka

Use the Explorer to run J48 on the height/weight data. Who does it misclassify?
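The same experiment can also be scripted against Weka’s Java API instead of the Explorer GUI; a minimal sketch follows, where the ARFF filename is a placeholder for the course’s height/weight data.

```java
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class J48Demo {
    public static void main(String[] args) throws Exception {
        // Load the dataset; the filename is a stand-in for the course data.
        Instances data = new DataSource("height_weight.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);  // last attribute = class
        J48 tree = new J48();            // Weka's C4.5 decision-tree learner
        tree.buildClassifier(data);
        System.out.println(tree);        // prints the learned tree structure
    }
}
```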

Page 19: Building a Regression Tree

Same algorithm… different criterion:
- Instead of impurity, use Mean Squared Error (in the local region)
- Predict the mean output for the node
- Compute the training error (the same as computing the variance of the outputs at the node)
- Keep splitting until the node error is acceptable; then it becomes a leaf (acceptable: error < threshold)
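A minimal Java sketch of the node-level criterion just described; the names are our own.

```java
/** Node statistics for a regression tree: predict the mean output, and
 *  measure training error as MSE around that mean (i.e., the variance). */
public class RegressionNode {
    /** The node's prediction is the mean of the outputs reaching it. */
    static double mean(double[] outputs) {
        double sum = 0;
        for (double y : outputs) sum += y;
        return sum / outputs.length;
    }

    /** Training error at the node = mean squared error around the mean,
     *  which is exactly the variance of the outputs at the node. */
    static double mse(double[] outputs) {
        double m = mean(outputs);
        double err = 0;
        for (double y : outputs) err += (y - m) * (y - m);
        return err / outputs.length;
    }

    /** A node becomes a leaf once its error falls below the threshold. */
    static boolean isLeaf(double[] outputs, double threshold) {
        return mse(outputs) < threshold;
    }
}
```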

Page 20: Turning Trees into Rules

[Alpaydin 2004 © The MIT Press]
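Concretely, each root-to-leaf path becomes one IF-THEN rule whose conditions are the tests along the path, conjoined. For the tennis example from page 14, hypothetical extracted rules might read:

IF (Outlook = Sunny) AND (Humidity = Normal) THEN PlayTennis = Yes
IF (Outlook = Rain) AND (Wind = Strong) THEN PlayTennis = No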

Page 21: Weka Machine Learning Library

Weka Explorer’s Guide

Page 22: Summary: Key Points for Today

- Supervised Learning
  - Representation: which features are available
  - Model Selection: which hypothesis to choose?
- Decision Trees
  - Hierarchical, non-parametric, greedy
  - Nodes test a feature value; leaves classify items (or predict values)
  - Minimize impurity (% error or entropy)
  - Turning trees into rules
- Evaluation
  - (10-fold) Cross-Validation
  - Confusion Matrix
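For the evaluation points, Weka’s Evaluation class produces both 10-fold cross-validation results and the confusion matrix; a minimal sketch, with the dataset filename again a placeholder:

```java
import java.util.Random;
import weka.classifiers.Evaluation;
import weka.classifiers.trees.J48;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class CrossValidate {
    public static void main(String[] args) throws Exception {
        Instances data = new DataSource("height_weight.arff").getDataSet();
        data.setClassIndex(data.numAttributes() - 1);
        Evaluation eval = new Evaluation(data);
        // 10-fold cross-validation with a fixed seed for reproducibility
        eval.crossValidateModel(new J48(), data, 10, new Random(1));
        System.out.println(eval.toSummaryString());   // accuracy, error rates
        System.out.println(eval.toMatrixString());    // the confusion matrix
    }
}
```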

Page 23: Next Time

Support Vector Machines (read Ch. 10.1-10.4, 10.6, 10.9)

Evaluation (read Ch. 14.4, 14.5, 14.7, 14.9)

Questions to answer from the reading: posted on the website (calendar)