
Classification in WEKA 2009/11/10

Petra Kralj Novak

Petra.Kralj@ijs.si


Practice with Weka

1. Build a decision tree with the ID3 algorithm on the lenses dataset and evaluate it on a separate test set

2. Classification on the CAR dataset

– Preparing the data

– Building decision trees

– Naive Bayes classifier

– Understanding the Weka output


Weka

Download version 3.6

Weka is open-source software for machine learning and data mining.

http://www.cs.waikato.ac.nz/ml/weka/


Run Weka

Choose Explorer


Exercise 1: Lenses dataset

• In the Weka data mining tool, induce a decision tree for the lenses dataset with the ID3 algorithm.

• Data:

– lensesTrain.arff

– lensesTest.arff

• Compare the outcome with the manually obtained results.
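
To help with the manual comparison, here is a minimal Python sketch of the ID3 idea: at each node, pick the nominal attribute with the highest information gain and recurse. The function names (`entropy`, `information_gain`, `id3`) and the toy rows are assumptions made for illustration. The rows are not the contents of lensesTrain.arff, and Weka's Id3 implementation may differ in details such as tie-breaking.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Entropy reduction obtained by splitting rows on a nominal attribute."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def id3(rows, attrs, target):
    """Return a nested dict {attribute: {value: subtree}}, or a class label at a leaf."""
    labels = [r[target] for r in rows]
    # Stop when the node is pure or no attributes are left; predict the majority class.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3([r for r in rows if r[best] == v], rest, target)
                   for v in sorted({r[best] for r in rows})}}

# Toy rows invented for illustration; NOT the contents of lensesTrain.arff.
toy = [
    {"tears": "reduced", "astigmatic": "no", "lens": "none"},
    {"tears": "reduced", "astigmatic": "yes", "lens": "none"},
    {"tears": "normal", "astigmatic": "no", "lens": "soft"},
    {"tears": "normal", "astigmatic": "yes", "lens": "hard"},
]
tree = id3(toy, ["tears", "astigmatic"], "lens")
print(tree)
```

On the toy rows, the root split is on `tears`, because that attribute separates the `none` examples from the rest in one step.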


Load the data


Load the data - 2

lensesTrain.arff


The data are loaded

Target variable

Choose “Classify”


Choose algorithm


trees

Id3


[Screenshot: numbered steps 1 to 5; the test set file is lensesTest.arff]


Decision tree


Classification accuracy

Confusion matrix
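
Classification accuracy can be read off the confusion matrix: it is the sum of the diagonal (correctly classified examples) divided by the total number of examples. A minimal Python sketch follows. The function name `accuracy` and the counts are hypothetical, not actual Weka output for the lenses data.

```python
def accuracy(matrix):
    """Fraction of examples on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical counts for a 3-class problem (rows: actual, columns: classified as).
example = [
    [3, 0, 1],
    [0, 2, 0],
    [1, 0, 3],
]
print(f"accuracy = {accuracy(example):.2f}")  # 8 of 10 examples lie on the diagonal
```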


Exercise 2: CAR dataset

• 1728 examples

• 6 attributes

– 6 nominal

– 0 numeric

• Nominal target variable

– 4 classes: unacc, acc, good, v-good

– Distribution of classes: unacc (70%), acc (22%), good (4%), v-good (4%)

• No missing values
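
Because the classes are unbalanced, it helps to keep a baseline in mind when reading accuracy figures: a classifier that always predicts the most frequent class is already right about 70% of the time. The short Python sketch below works this out from the rounded shares listed above; the variable names are assumptions made for illustration.

```python
import math

# Class shares as listed on the slide (rounded percentages).
shares = {"unacc": 0.70, "acc": 0.22, "good": 0.04, "v-good": 0.04}

# Majority-class baseline: always predict the most frequent class.
majority_class = max(shares, key=shares.get)
baseline_accuracy = shares[majority_class]

# Entropy of the class distribution in bits (the quantity a tree learner tries to reduce).
class_entropy = -sum(p * math.log2(p) for p in shares.values())

print(majority_class, baseline_accuracy, round(class_entropy, 2))
```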


Preparing the data for WEKA - 1

Data in a spreadsheet (e.g. MS Excel)

- Rows are examples

- Columns are attributes

- The last column is the target variable


Preparing the data for WEKA - 2

Save as “.csv”

- Careful with dots “.”, commas “,” and semicolons “;”!
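
The warning matters because some spreadsheet locale settings export semicolon-separated files with decimal commas, while a comma-separated file with decimal dots is the safer input. The Python sketch below shows one way to normalize such an export. The function name `normalize_csv`, the column names, and the values are hypothetical, and the regular expression only covers simple numbers such as "3,5".

```python
import csv
import io
import re

# Matches simple numbers written with a decimal comma, e.g. "3,5" or "-0,25".
DECIMAL_COMMA = re.compile(r"^-?\d+,\d+$")

def normalize_csv(text, delimiter=";"):
    """Rewrite delimiter-separated text as comma-separated text.

    Fields that look like numbers with a decimal comma (e.g. "3,5")
    are rewritten with a decimal dot ("3.5").
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(
            [f.replace(",", ".") if DECIMAL_COMMA.match(f) else f for f in row]
        )
    return out.getvalue()

# Hypothetical export; the column names and values are made up.
raw = "buying;doors;score;class\nhigh;2;3,5;unacc\nlow;4;1,25;acc\n"
print(normalize_csv(raw))
```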


Load the data

Target variable

Car.csv


Choose algorithm J48


Building and evaluating the tree


Classified as

Actual values

Classification accuracy
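
The two labels describe the layout of the confusion matrix: each row collects the examples of one actual class, and each column counts how those examples were classified. The Python sketch below builds such a matrix from two label lists. The function name `confusion_matrix` and the labels are hypothetical, not real Weka output for the CAR data.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows are actual classes, columns are the classes assigned by the model."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

classes = ["unacc", "acc", "good", "v-good"]
# Hypothetical labels, not real Weka output.
actual = ["unacc", "unacc", "acc", "good", "v-good", "acc"]
predicted = ["unacc", "acc", "acc", "acc", "v-good", "acc"]

matrix = confusion_matrix(actual, predicted, classes)
for name, row in zip(classes, matrix):
    print(f"{name:>6}", row)
```

Reading along a row shows where the examples of one actual class ended up; reading down a column shows which actual classes were assigned that label.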


Right mouse click


Tree pruning

Set the minimal number of objects per leaf to 15

Parameters of the algorithm (right mouse click)
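
The minimal number of objects per leaf acts as a stopping rule while the tree is grown. In Weka's J48 the setting is named minNumObj (option -M on the command line). The Python sketch below shows one simplified, assumed reading of such a rule in C4.5-style learners: a candidate split is accepted only if at least two of its branches receive at least the minimum number of examples. The function name `split_allowed` and the branch sizes are hypothetical, and Weka's own check may differ in detail.

```python
def split_allowed(branch_sizes, min_objects=15):
    """Accept a split only if at least two branches hold min_objects examples or more.

    Simplified assumption for illustration, not Weka's actual code.
    """
    large_enough = sum(1 for n in branch_sizes if n >= min_objects)
    return large_enough >= 2

# Hypothetical branch sizes for two candidate splits.
print(split_allowed([40, 30, 5]))   # two branches reach 15 examples
print(split_allowed([60, 10, 5]))   # only one branch reaches 15 examples
```

Raising the minimum rejects more candidate splits, which is why the resulting tree has fewer leaves and nodes.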


Reduced number of leaves and nodes

Easier to interpret

Lower classification accuracy


Naïve Bayes classifier
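
As a rough illustration of the idea behind the classifier, the Python sketch below scores each class by its prior probability multiplied by the per-attribute conditional probabilities, treating the nominal attributes as independent given the class. The function names (`train_nb`, `predict_nb`), the add-one (Laplace) smoothing, and the toy rows are assumptions of this sketch. The rows are not the CAR data, and the sketch is not Weka's implementation.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, attrs, target):
    """Count classes, attribute values per class, and the values seen per attribute."""
    class_counts = Counter(r[target] for r in rows)
    value_counts = defaultdict(Counter)  # (attribute, class) -> Counter of values
    values = defaultdict(set)            # attribute -> set of observed values
    for r in rows:
        for a in attrs:
            value_counts[(a, r[target])][r[a]] += 1
            values[a].add(r[a])
    return class_counts, value_counts, values

def predict_nb(model, example, attrs):
    """Return the class with the highest log-probability score."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior
        for a in attrs:
            # Add-one (Laplace) smoothing: an assumption of this sketch.
            count = value_counts[(a, c)][example[a]]
            score += math.log((count + 1) / (n_c + len(values[a])))
        if score > best_score:
            best, best_score = c, score
    return best

# Toy rows invented for illustration; NOT the CAR dataset.
rows = [
    {"safety": "low", "persons": "2", "class": "unacc"},
    {"safety": "low", "persons": "4", "class": "unacc"},
    {"safety": "high", "persons": "4", "class": "acc"},
    {"safety": "high", "persons": "2", "class": "unacc"},
    {"safety": "med", "persons": "4", "class": "acc"},
]
attrs = ["safety", "persons"]
model = train_nb(rows, attrs, "class")
print(predict_nb(model, {"safety": "high", "persons": "4"}, attrs))
print(predict_nb(model, {"safety": "low", "persons": "2"}, attrs))
```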