Department of Computer Science, University of Waikato, New Zealand
Eibe Frank
WEKA: A Machine Learning Toolkit
The Explorer
• Classification and Regression
• Clustering
• Association Rules
• Attribute Selection
• Data Visualization
The Experimenter
The Knowledge Flow GUI
Conclusions
Machine Learning with WEKA
based on notes by
04/10/23 University of Waikato 3
WEKA: the software
• Machine learning/data mining software written in Java (distributed under the GNU General Public License)
• Used for research, education, and applications
• Complements “Data Mining” by Witten & Frank
• Main features:
  • Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods
  • Graphical user interfaces (incl. data visualization)
  • Environment for comparing learning algorithms
WEKA: versions
• There are several versions of WEKA:
  • WEKA 3.0: “book version”, compatible with the description in the data mining book (1st edition)
  • WEKA 3.2: “GUI version”, adds graphical user interfaces (earlier version is command-line only)
  • WEKA 3.4 ++ on SoC linux and ISS windows
• This talk is based on snapshots of WEKA 3.3 … with some extra up-to-date snapshots
• Only changes are “layout” and some extras
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
WEKA only deals with “flat” files
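To make the layout of such a flat file concrete, here is a minimal Python sketch that reads the heart-disease example above into plain Python structures. It is not WEKA's own loader: the function name parse_arff is made up for illustration, the trailing "..." row is left out, and only numeric and nominal attributes without quoting are handled.

```python
# Minimal ARFF reader: a sketch for simple files only
# (no quoted values, sparse rows, or string/date attributes).

ARFF_TEXT = """\
@relation heart-disease-simplified
@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
"""


def parse_arff(text):
    """Return (relation, attributes, rows) for a simple ARFF document."""
    relation = None
    attributes = []  # (name, "numeric") or (name, [nominal values])
    rows = []        # one dict per instance, keyed by attribute name
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # blank line or ARFF comment
            continue
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                if value == "?":              # "?" marks a missing value
                    row[name] = None
                elif kind == "numeric":
                    row[name] = float(value)
                else:
                    row[name] = value
            rows.append(row)
            continue
        keyword = line.split(None, 1)[0].lower()
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                # nominal attribute: keep the list of allowed values
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif keyword == "@data":
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation)
print(len(attributes), "attributes,", len(rows), "instances")
print(rows[3]["cholesterol"])  # the "?" in the last row is read as None
```

Every instance is one line with one value per declared attribute, which is what "flat" means here: no nesting and no links between tables.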
Explorer: pre-processing the data
• Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
• Data can also be read from a URL or from an SQL database (using JDBC)
• Pre-processing tools in WEKA are called “filters”
• BUT it may be easier to reformat to ARFF yourself: write a program in Python / Java … or just use WordPad to type in the text, but make sure the format is right! Doing this also helps with data understanding
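As an illustration of "write a program" to reformat data to ARFF, the following Python sketch converts CSV text into ARFF text. The function name csv_to_arff and its type-guessing rule are assumptions made for this example, not part of WEKA: a column is declared numeric if all of its non-missing values parse as numbers, and nominal otherwise.

```python
import csv
import io

# The heart-disease rows from the earlier slide, written as CSV.
# The empty cholesterol cell in the last row stands for a missing value.
CSV_TEXT = """\
age,sex,chest_pain_type,cholesterol,exercise_induced_angina,class
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,,no,not_present
"""


def is_number(value):
    """True if the string can be read as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text (first row = column names) to ARFF text.

    Sketch only: no quoting is done, so values containing spaces or
    commas are not supported. Empty cells are written as "?".
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]
    rows = [[cell.strip() or "?" for cell in row] for row in reader if row]

    lines = ["@relation " + relation, ""]
    for index, name in enumerate(header):
        present = [row[index] for row in rows if row[index] != "?"]
        if present and all(is_number(v) for v in present):
            lines.append("@attribute %s numeric" % name)
        else:
            # dict.fromkeys removes duplicates but keeps first-seen order
            distinct = list(dict.fromkeys(present))
            lines.append("@attribute %s {%s}" % (name, ",".join(distinct)))
    lines += ["", "@data"]
    lines += [",".join(row) for row in rows]
    return "\n".join(lines) + "\n"


arff = csv_to_arff(CSV_TEXT, "heart-disease-simplified")
print(arff)
```

Note that a nominal attribute declared this way lists only the values that actually occur in the data. Here chest_pain_type would lack atyp_angina, so check the generated header by hand, which is also a good way to get to know the data.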
Explorer: building “classifiers”
• Classifiers in WEKA are models for predicting nominal or numeric quantities
• Implemented learning schemes include:
  • Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, …
• You explore by trying different classifiers, see which works best for you…
WEKA from ISS PC
2009
@relation ukus
@attribute center numeric
@attribute centre numeric
@attribute centerpercent numeric
@attribute color numeric
@attribute colour numeric
@attribute colorpercent numeric
@attribute english {UK,US}
@data
1,32,3, 0,20,0, UK
0,25,0, 0,12,0, UK
9,27,25, 0,84,0, UK
0,19,0, 0,24,0, UK
0,16,0, 0,14,0, UK
0,16,0, 0,12,0, UK
0,21,0, 0,38,0, UK
0,25,0, 0,34,0, UK
2,26,7, 2,3,40, UK
2,32,5, 1,59,2, UK
31,0,100, 55,0,100, US
61,0,100, 26,0,100, US
24,0,100, 11,0,100, US
12,1,92, 21,4,84, US
8,0,100, 4,2,67, US
10,0,100, 8,0,100, US
19,0,100, 22,0,100, US
14,0,100, 7,0,100, US
14,0,100, 6,0,100, US
8,5,62, 24,0,100, US
@relation test
@attribute center numeric
@attribute centre numeric
@attribute centerpercent numeric
@attribute color numeric
@attribute colour numeric
@attribute colorpercent numeric
@attribute english {UK,US}
@data
10,5,33, 0,20,0, UK
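To show what a classifier does with the "ukus" training file and the one-instance "test" file above, here is a small Python sketch of a 1-nearest-neighbour classifier, one kind of instance-based learner. It is an illustration under stated assumptions, not a reproduction of any WEKA classifier: it uses plain Euclidean distance on the raw attribute values, whereas WEKA's own instance-based learners may scale attributes differently, and the function names are made up for this example.

```python
import math

# Attribute order: center, centre, centerpercent, color, colour, colorpercent
TRAIN = [
    ((1, 32, 3, 0, 20, 0), "UK"),
    ((0, 25, 0, 0, 12, 0), "UK"),
    ((9, 27, 25, 0, 84, 0), "UK"),
    ((0, 19, 0, 0, 24, 0), "UK"),
    ((0, 16, 0, 0, 14, 0), "UK"),
    ((0, 16, 0, 0, 12, 0), "UK"),
    ((0, 21, 0, 0, 38, 0), "UK"),
    ((0, 25, 0, 0, 34, 0), "UK"),
    ((2, 26, 7, 2, 3, 40), "UK"),
    ((2, 32, 5, 1, 59, 2), "UK"),
    ((31, 0, 100, 55, 0, 100), "US"),
    ((61, 0, 100, 26, 0, 100), "US"),
    ((24, 0, 100, 11, 0, 100), "US"),
    ((12, 1, 92, 21, 4, 84), "US"),
    ((8, 0, 100, 4, 2, 67), "US"),
    ((10, 0, 100, 8, 0, 100), "US"),
    ((19, 0, 100, 22, 0, 100), "US"),
    ((14, 0, 100, 7, 0, 100), "US"),
    ((14, 0, 100, 6, 0, 100), "US"),
    ((8, 5, 62, 24, 0, 100), "US"),
]

# The single instance from the "test" relation, with its given label.
TEST = ((10, 5, 33, 0, 20, 0), "UK")


def distance(a, b):
    """Euclidean distance between two equal-length attribute tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_1nn(train, features):
    """Return the label of the training instance closest to `features`."""
    nearest = min(train, key=lambda example: distance(example[0], features))
    return nearest[1]


def leave_one_out_accuracy(train):
    """Classify each training instance using all the others; return the hit rate."""
    correct = 0
    for i, (features, label) in enumerate(train):
        rest = train[:i] + train[i + 1:]
        if predict_1nn(rest, features) == label:
            correct += 1
    return correct / len(train)


print("predicted:", predict_1nn(TRAIN, TEST[0]), "actual:", TEST[1])
print("leave-one-out accuracy:", leave_one_out_accuracy(TRAIN))
```

With this distance measure the test instance is assigned to UK, matching its label, and leave-one-out on the 20 training instances gives an accuracy of 1.0. Swapping predict_1nn for another prediction function is one simple way to compare classifiers on the same data.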
WEKA has more…
• Clustering data into groups
• Finding associations between attributes
• Visualisation - online analytical processing
• Experimenter to run and compare different ML algorithms
• Knowledge Flow GUI
• 3rd-party add-ons: sourceforge.net
• http://www.cs.waikato.ac.nz/ml/weka