classification in weka

27
[email protected] Classification in WEKA 8.11.2007 Petra Kralj [email protected]

Upload: others

Post on 12-Sep-2021

19 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Classification in WEKA

[email protected]

Classification in WEKA 8.11.2007

Petra [email protected]

Page 2: Classification in WEKA

[email protected]

Practice with Weka1. Build a decision tree with the ID3

algorithm on the lenses dataset, evaluate on a separate test set

2. Classification on the CAR dataset– Preparing the data– Building decision trees– Naive Bayes classifier– Understanding the Weka output

Page 3: Classification in WEKA

[email protected]

Exercise 1: Lenses dataset

• In the Weka data mining tool induce a decision tree for the lenses dataset with the ID3 algorithm.

• Data:– lensesTrain.arff– lensesTest.arff

• Compare the outcome with the manually obtained results.

Page 4: Classification in WEKA

[email protected]

Weka

Downloadversion 3.4

Weka is open source softwere for machine learning and data mining. http://www.cs.waikato.ac.nz/ml/weka/

Page 5: Classification in WEKA

[email protected]

Run Weka

Choose Explorer

Page 6: Classification in WEKA

[email protected]

Load the data

Page 7: Classification in WEKA

[email protected]

Load the data - 2

lensesTrain.arff

Page 8: Classification in WEKA

[email protected]

The data are loaded

Target variable

Choose

“Classify”

Page 9: Classification in WEKA

[email protected]

Choose algoritem

Page 10: Classification in WEKA

[email protected]

trees

Id3

Page 11: Classification in WEKA

[email protected]

1 2

3

5

lensesTest.arff

4

Page 12: Classification in WEKA

[email protected]

Decision tree

Page 13: Classification in WEKA

[email protected]

Classification accuracy

Confusion matrix

Page 14: Classification in WEKA

[email protected]

Exercise 2: CAR dataset

• 1728 examples• 6 attributes

– 6 nominal– 0 numeric

• Nominal target variable– 4 classes: unacc, acc, good, v-good– Distribution of classes

• unacc (70%), acc (22%), good (4%), v-good (4%)• No missing values

Page 15: Classification in WEKA

[email protected]

Preparing the data for WEKA - 1

Data in a spreadsheet (e.g. MS Excel)

- Rows are examples- Columns are attributes- The last column is the

target variable

Page 16: Classification in WEKA

[email protected]

Preparing the data for WEKA - 2

Save as “.csv”- Careful with dots “.”,

commas “,” and semicolons “;”!

Page 17: Classification in WEKA

[email protected]

Load the data

Target variable

Car.csv

Page 18: Classification in WEKA

[email protected]

Choose algorithm J48

Page 19: Classification in WEKA

[email protected]

Building and evaluating the tree

Page 20: Classification in WEKA

[email protected]

Classified as

Actual values

Classification accuracy

Page 21: Classification in WEKA

[email protected]

Right mouse click

Page 22: Classification in WEKA

[email protected]

Tree pruning

Set the minimal number of objects

per leaf to 15

Parameters of the algorithm

(right mouse click)

Page 23: Classification in WEKA

[email protected]

Reduced number of leaves and

nodes Easier to interpret

Lower classification

accuracy

Page 25: Classification in WEKA

[email protected]

Naïve Bayes classifier