logistic regression notes
TRANSCRIPT
-
7/27/2019 Logistic Regression Notes
1/50
A COMPARISON OF MULTIPLE REGRESSION, LOGISTIC REGRESSION AND
DISCRIMINANT FUNCTION IN CLASSIFICATION OF OBSERVATIONS
by: Dr. Yap Bee Wah
UNIVERSITI TEKNOLOGI MARA, Faculty of Information Technology & Quantitative Sciences
[Plot: petal width (PETALWID) by iris TYPE: iris setosa, iris versicolor, iris virginica]
Kolokium Statistik 24 Julai 2004 Th 5, FTMSK
-
KOLOKIUM STATISTIK 2004, FTMSK
OVERVIEW OF PRESENTATION
Introduction
Multiple Regression
Logistic Regression
Discriminant Function
Methodology (Model Building and Evaluation Process)
Results
Conclusion
-
Introduction:
Two (2) pioneer studies:

Efron (1975) studied the relative efficiency of logistic regression and normal discriminant analysis. He found that, typically, logistic regression is between one half and two thirds as effective as normal discrimination.
(Efron, B. (1975). The Efficiency of Logistic Regression Compared to Normal Discriminant Function Analysis. Journal of the American Statistical Association, December 1975, Volume 70, Number 352, Theory and Methods Section)

Press and Wilson (1978) compared logistic regression and parametric discriminant analysis and concluded that logistic regression is preferable to parametric discriminant analysis in cases for which the variables do not have multivariate normal distributions. However, for normal distributions, logistic regression is less efficient than parametric discriminant analysis.
(Press, S. J. & Wilson, S. (1978). Choosing between logistic regression and discriminant analysis. Journal of the American Statistical Association, 73, 699-705)
-
Introduction to Multiple Linear Regression

Multiple Linear Regression is a useful statistical modeling technique for describing the relationship between a response (dependent) variable and one or several predictor variables.

When the response variable is dichotomous (2 categories) or polytomous (more than 2 categories), logistic regression or discriminant analysis is frequently used to model the relationship.
-
Multiple Regression Model

Consider k predictor variables; the multiple regression model is stated as follows:

Y_i = β_0 + β_1 X_{1i} + β_2 X_{2i} + ... + β_k X_{ki} + ε_i,   ε_i ~ N(0, σ²)

where β_0, β_1, ..., β_k are the regression coefficients. The response Y must be a quantitative variable.
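The model above can be fitted by ordinary least squares. A minimal sketch in NumPy with k = 2 predictors; all data below are made up for illustration and do not come from the slides:

```python
import numpy as np

# Simulate made-up data from a known model Y = 1 + 2*X1 - 0.5*X2 + error
rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 2))                  # two quantitative predictors
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n)

# Design matrix with a leading column of ones for the intercept b0
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # [b0, b1, b2]
print(np.round(coef, 2))
```

With this little noise, the least-squares estimates land close to the true coefficients (1, 2, -0.5).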
-
Regression: Research Application Example

IS Faculty Research Productivity: Influential Factors and Implications
by: Qing Hu & T. Grandon Gill (Florida Atlantic University)
(Information Resources Management Journal, Vol 13, No 2, 2000)

Response: Research Productivity (annual rate of publication)
Predictors:
Number of years in IS faculty
Percentage of time allocated for teaching
Percentage of time allocated for research
Percentage of time allocated for academic services
Type of degree
-
Introduction to Logistic Regression

Allows estimating the probability of an event happening.
Useful for modeling data with a dichotomous dependent variable (Y) (e.g. survive/die; purchase/do not purchase; pass/fail, etc.).
Allows a mixture of quantitative and qualitative predictor variables (X).
-
Application examples

Dependent variable: Y = 1 if survive, 0 otherwise
Independent variables: X_1 = age, X_2 = length of illness, X_3 = dosage level, X_4 = gender

Dependent variable: Y = 1 if settle credit card bills, 0 otherwise
Independent variables: X_1 = age, X_2 = number of credit cards, X_3 = income, X_4 = number of children, X_5 = gender
-
Logit model, otherwise known as the logistic regression model

For k explanatory variables and i = 1, 2, ..., n, the model is

log( p_i / (1 - p_i) ) = β_0 + β_1 x_{1i} + β_2 x_{2i} + ... + β_k x_{ki}

where p_i = P(Y_i = 1). The left-hand side is referred to as the logit or log-odds.
-
We can solve the logit equation to obtain:

Pr(Y = 1) = 1 / (1 + exp[-(β_0 + β_1 X_1 + β_2 X_2 + ... + β_k X_k)])

In mathematical terms, this formula is called the logistic function and can be written as:

f(z) = 1 / (1 + e^{-z}),   where z = β_0 + β_1 X_1 + ... + β_k X_k
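The logistic function is straightforward to evaluate directly. A small sketch; the coefficient values b0, b1 and the predictor value x below are invented for illustration:

```python
import math

def logistic(z):
    """Logistic function f(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and predictor value, for illustration only
b0, b1, x = -1.5, 0.8, 2.0
z = b0 + b1 * x
p = logistic(z)        # Pr(Y = 1) for this observation
print(round(p, 3))     # always strictly between 0 and 1
```

Whatever z is, the output lies in (0, 1), which is what makes the logistic function suitable for modeling a probability.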
-
Simple logit model

Let Y and X_1 be defined as follows:

Y = 1 if develop lung cancer, 0 otherwise
X_1 = 1 if smoker, 0 otherwise

log[ P(Y = 1) / P(Y = 0) ] = β_0 + β_1 X_1

Hence,

log[odds(Y = 1 | X_1 = 1)] = β_0 + β_1
log[odds(Y = 1 | X_1 = 0)] = β_0

OR_{S vs NS} = odds(smoker) / odds(nonsmoker) = e^{β_0 + β_1} / e^{β_0} = e^{β_1}

OR (odds ratio): a ratio of 2 odds.
-
Interpretation of the odds ratio

If, for example, β̂_1 = 1.0986, then the odds ratio is e^{1.0986} ≈ 3. This odds ratio (OR) indicates that a smoker's odds of developing lung cancer are 3 times those of a nonsmoker.
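The arithmetic in this example can be checked directly:

```python
import math

# The slide's example: if the estimated coefficient b1 = 1.0986,
# the odds ratio is e^(b1), which is approximately 3.
b1 = 1.0986
odds_ratio = math.exp(b1)
print(round(odds_ratio, 3))
```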
-
Introduction to Discriminant Analysis

An appropriate technique for classifying or separating individuals into different groups (dependent variable) based on a set of quantitative independent random variables.
Involves deriving the linear combination of predictor variables (called the discriminant function) that will discriminate best between the given groups.
The main objective of discriminant analysis is to predict group membership based on a set of quantitative variables.
Assumption: the predictor variables for each group have a multivariate normal distribution.
-
Scatter Plot of Income vs Lotsize

[Scatter plot of income against LOTSIZE, with points labeled by GROUP: owners, nonowners]

Can we find a discriminant function based on income and lot size of house to predict if a house owner will or will not purchase a lawn mower? (Johnson and Wichern, Applied Multivariate Statistical Analysis, Wiley, 2002)
-
We can classify a new observation (x_0) using:
1) Linear or quadratic discriminant functions
2) Posterior probabilities

P(π_k | x) = p_k f_k(x) / Σ_{i=1}^{g} p_i f_i(x)

where p_i is the prior probability of group i, and P(π_k | x) is the probability that the observation comes from group π_k given that x was observed.
-
Classification for two (2) normal populations

Homoscedastic case (when Σ_1 = Σ_2): allocate x_0 to π_1 if

(x̄_1 - x̄_2)' S_pooled^{-1} x_0 - (1/2)(x̄_1 - x̄_2)' S_pooled^{-1} (x̄_1 + x̄_2) ≥ ln[ (c(1|2)/c(2|1)) (p_2/p_1) ]

Otherwise, allocate x_0 to π_2.

The left-hand side is the linear discriminant function evaluated at the observation x_0; c(1|2) and c(2|1) are the costs of misclassification, and p_1, p_2 are the prior probabilities.

Note: assume c(1|2) = c(2|1) and p_1 = p_2 if they are unknown; hence the right-hand side is ln(1) = 0.

Source: Johnson & Wichern, 2002
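A sketch of this allocation rule in NumPy, assuming equal priors and equal misclassification costs so the threshold is ln(1) = 0. The group means and pooled covariance matrix below are invented for illustration:

```python
import numpy as np

# Invented sample means and pooled covariance matrix for two groups
xbar1 = np.array([2.0, 3.0])          # sample mean of group 1
xbar2 = np.array([0.0, 1.0])          # sample mean of group 2
S_pooled = np.array([[1.0, 0.2],
                     [0.2, 1.5]])     # pooled sample covariance matrix

S_inv = np.linalg.inv(S_pooled)
a = S_inv @ (xbar1 - xbar2)           # coefficient vector of the linear rule

def allocate(x0):
    """Allocate x0 to group 1 if the discriminant score exceeds the
    midpoint term (1/2)(xbar1 - xbar2)' S^-1 (xbar1 + xbar2), else group 2."""
    score = a @ x0
    midpoint = 0.5 * a @ (xbar1 + xbar2)
    return 1 if score >= midpoint else 2

print(allocate(np.array([2.1, 2.9])))   # a point near xbar1
print(allocate(np.array([-0.2, 0.8])))  # a point near xbar2
```

Points near each group mean are allocated to that group; the rule draws a straight boundary halfway between the means in the metric defined by S_pooled.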
-
Classification for two (2) normal populations

Heteroscedastic case (when Σ_1 ≠ Σ_2): allocate x_0 to π_1 if

-(1/2) x_0' (S_1^{-1} - S_2^{-1}) x_0 + (x̄_1' S_1^{-1} - x̄_2' S_2^{-1}) x_0 - k ≥ ln[ (c(1|2)/c(2|1)) (p_2/p_1) ]

where

k = (1/2) ln( |S_1| / |S_2| ) + (1/2)(x̄_1' S_1^{-1} x̄_1 - x̄_2' S_2^{-1} x̄_2)

Otherwise, allocate x_0 to π_2. The left-hand side is the quadratic discriminant function.

Source: Johnson & Wichern, 2002
-
Example: Admission into graduate programs based on GPA and GMAT

Response variable (Y): 1 = admit, 2 = do not admit
Independent variables (X): X_1 = undergraduate GPA, X_2 = GMAT score

n_1 = 31, n_2 = 28; x̄_1 = (3.40, 561.23)', x̄_2 = (2.48, 447.07)'

[The sample covariance matrices S_1, S_2 and S_pooled are given on the slide.]
-
Classification with several populations

Allocate x_0 to π_k if the linear discriminant score d_k(x) is the largest of d_1(x), d_2(x), ..., d_g(x), where

d_i(x) = x̄_i' S_pooled^{-1} x - (1/2) x̄_i' S_pooled^{-1} x̄_i + ln p_i,   i = 1, 2, ..., g

(1) This assumes equal covariance matrices; these are Fisher's discriminant functions given in SPSS/SAS output.

(2) For unequal covariance matrices, use the quadratic discriminant score

d_i^Q(x) = -(1/2) ln|S_i| - (1/2)(x - x̄_i)' S_i^{-1} (x - x̄_i) + ln p_i,   i = 1, 2, ..., g

Source: Johnson & Wichern, 2002
-
Assessing the performance of the classification functions

Error rate: the percentage of observations misclassified.

                    Predicted Membership
Actual Membership   Owners   Non-owners   Sample size
Owners              n_1c     n_1m         n_1
Non-owners          n_2m     n_2c         n_2

Error rate = (n_1m + n_2m) / (n_1 + n_2)
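The error-rate formula above is a one-liner to compute; the counts below are invented for illustration:

```python
# Hypothetical 2x2 classification table counts:
# n1c, n1m = correctly / incorrectly classified in group 1
# n2c, n2m = correctly / incorrectly classified in group 2
n1c, n1m = 80, 20
n2c, n2m = 90, 10

n1 = n1c + n1m                          # group sample sizes
n2 = n2c + n2m
error_rate = (n1m + n2m) / (n1 + n2)    # fraction misclassified overall
print(error_rate)
```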
-
Comparing the performance of multiple regression, logistic regression, and discrimination functions in classification of observations

These three statistical methods were applied to a data set to compare their predictive ability of classifying a baby as low birth weight or normal based on several predictor variables.
-
Dependent variable: Y = Birth weight (g)

Independent variables
X1 = Race (Malay, Chinese and Indian)   X6 = Abortion (yes, no)
X2 = Gender (male, female)              X7 = Mother's height (cm)
X3 = Mother's age (years)               X8 = Vitamin (mg)
X4 = Father's income (RM)               X9 = Weight gain (kg)
X5 = Parity (children)                  X10 = Antenatal visits (number of times)

Data set (collected in 1997) courtesy of Hospital Kuala Lumpur
-
Methodology (The Process of Developing and Evaluating the Models)

1. Split the data into the training data set (n1 = 365) and the validation data set (n2 = 50).
2. Build the model(s) using the training data set.
3. Check model adequacy using plots of residuals and other diagnostics. If remedial measures are needed, revise the model and re-check; otherwise continue.
4. Evaluate the performance of the model using the validation data set.
5. Find the probabilities of misclassification: E1, E2 and E3.
6. Compare the error rates E1, E2 and E3 and select the best model.
-
SPSS Results (Multiple Linear Regression Analysis)

ANOVA
Model 1       Sum of Squares   df    Mean Square   F        Sig.
Regression    16531299         4     4132824.770   15.500   .000
Residual      95990816         360   266641.157
Total         1.13E+08         364

Coefficients
              Unstandardized           Standardized
Model 1       B           Std. Error   Beta    t        Sig.
(Constant)    -1532.707   857.305              -1.788   .075
PARITY        45.828      17.534       .131    2.614    .009
MUM_HEIG      23.679      5.506        .210    4.300    .000
WGHTGAIN      39.234      9.698        .210    4.046    .000
ANT_VST       51.366      14.606       .178    3.517    .000

Model Summary
Model 1:  R = .383   R Square = .147   Adjusted R Square = .137   Std. Error of the Estimate = 516.37

All four predictor variables (PARITY, MUM_HEIG, WGHTGAIN, ANT_VST) are significant.
-
SPSS Results (Multiple Linear Regression)

The final estimated regression function is:

Birth Weight = -1532.707 + 45.828(Parity) + 23.679(Mother's Height) + 39.234(Weight Gain) + 51.366(Antenatal Visits)
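Plugging a hypothetical mother's values into this estimated function (the input values are invented, not taken from the data set):

```python
# Hypothetical inputs: parity = 2 children, mother's height = 155 cm,
# weight gain = 10 kg, 12 antenatal visits
parity, height_cm, weight_gain_kg, visits = 2, 155, 10, 12

birth_weight = (-1532.707 + 45.828 * parity + 23.679 * height_cm
                + 39.234 * weight_gain_kg + 51.366 * visits)
print(round(birth_weight, 1))   # predicted birth weight in grams
```

A prediction above 2500 g, as here, would later be classified as normal birth weight under the cut-off used in this study.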
-
Multiple Regression Results: interpretation of the estimated regression coefficients

1. Parity (b1 = 45.828): for every additional child in the family, the birth weight of babies will increase by approximately 46 g, holding mother's height, weight gain and antenatal visits constant.
2. Mother's height (b2 = 23.679): the birth weight of babies will increase by approximately 24 g for every 1 cm increase in mother's height, holding parity, weight gain and antenatal visits constant. (Birth weight is higher for taller mothers.)
3. Weight gain (b3 = 39.234): the birth weight of babies will increase by approximately 40 g for every 1 kg increase in weight gain, holding parity, mother's height and antenatal visits constant.
4. Antenatal visits (b4 = 51.366): the birth weight of babies will increase by approximately 52 g for every one additional antenatal visit, holding parity, mother's height and weight gain constant.
-
Checking Model Adequacy Through Diagnostic Plots

[Q-Q plot of residuals; plot of standardized residuals against regression standardized predicted values]

Notes: Kolmogorov-Smirnov = 0.045, p-value = 0.077; Skewness = -0.153; Kurtosis = 0.048.

No violation of the regression model assumptions of normal errors with constant variance.
-
Evaluating Regression Model Performance Through Error Rate

The estimated regression function is then used to predict the birth weight of the 50 observations in the validation sample.
Predicted values below 2500 g were classified as low birth weight; otherwise, they were classified as normal birth weight.
The following classification table gives the true and predicted categories obtained.
-
Error Rate for the Multiple Regression Model

Classification Table
                        Predicted
Observed          Normal weight   Low weight   Total
Normal weight     34              0            34
Low weight        15              1            16
Total             49              1            50

E1 = (15 + 0) / 50 = 0.30
-
APPLYING LOGISTIC REGRESSION

Y = 1 if low birth weight (birth weight < 2500 g), 0 otherwise

Independent variables
X1 = Race (Malay, Chinese and Indian)   X6 = Abortion (yes, no)
X2 = Gender (male, female)              X7 = Mother's height (cm)
X3 = Mother's age (years)               X8 = Vitamin (mg)
X4 = Father's income (RM)               X9 = Weight gain (kg)
X5 = Parity (children)                  X10 = Antenatal visits (number of times)
-
SPSS Results for Multiple Logistic Regression

Step 1      B        S.E.    Wald     df   Sig.   Exp(B)
WGHTGAIN    -.193    .052    13.666   1    .000   .824
ANT_VST     -.194    .070    7.624    1    .006   .824
ABORT(1)    .648     .308    4.428    1    .035   1.912
MUM_HEIG    -.108    .028    15.136   1    .000   .898
Constant    18.247   4.380   17.356   1    .000   8.4E+07

The estimated logistic regression model obtained:

P(Y_j = 1) = 1 / (1 + e^{-z_j})

where z_j = 18.247 - 0.193(Weight gain) - 0.194(Antenatal visits) + 0.648(History of abortion) - 0.108(Mother's height)
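Plugging hypothetical values into the estimated model above (the inputs are invented for illustration; the abortion term is the 0/1 indicator for history of abortion):

```python
import math

# Hypothetical mother: weight gain = 10 kg, 12 antenatal visits,
# no history of abortion, height = 155 cm
weight_gain, visits, abortion, height = 10, 12, 0, 155

z = (18.247 - 0.193 * weight_gain - 0.194 * visits
     + 0.648 * abortion - 0.108 * height)
p_low = 1.0 / (1.0 + math.exp(-z))   # estimated P(low birth weight)
print(round(p_low, 3))
```

For these inputs the estimated probability is well below 0.5, so the baby would be classified as normal birth weight.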
-
Interpretation of the odds ratios

1. Weight gain: the odds ratio means that for every 1 kg increase in weight gain, the odds of low birth weight will decrease.
2. Antenatal visits: the odds ratio indicates that when a mother increases antenatal visits by 1, the odds of low birth weight will decrease.
3. Abortion: the odds ratio indicates that a mother who has had abortion(s) is approximately 2 times more likely to have a baby with low birth weight compared to those who have no history of abortion(s).
4. Mother's height: the odds ratio indicates that the odds of low birth weight are lower for mothers who are taller.
-
Evaluating the performance of the logistic regression model

The estimated logistic function is then used to predict the 50 observations in the validation data set.

If P(Y_j = 1) ≥ 0.5, j = 1, 2, ..., 50, we classify the observation as belonging to π_1 (low birth weight).
-
Error Rate for the Logistic Regression Model

                        Predicted
Observed          Normal weight   Low weight   Total
Normal weight     33              1            34
Low weight        11              5            16
Total             44              6            50

E2 = (11 + 1) / 50 = 0.24
-
Discriminant Analysis (Checking the assumption of multivariate normal distribution)

Variables          Normal birth weight    Low birth weight
Mother's age       Approximately Normal   Approximately Normal
Father's income    Nonnormal              Nonnormal
Parity             Approximately Normal   Approximately Normal
Mother's height    Approximately Normal   Approximately Normal
Vitamin            Approximately Normal   Approximately Normal
Weight gain        Approximately Normal   Approximately Normal
Antenatal visits   Approximately Normal   Approximately Normal
-
Chi-square plots for checking multivariate normality

[Chi-square plots of Mahalanobis distance against chi-square quantiles for the Low Birth Weight Group and the Normal Birth Weight Group]

The chi-square plots indicate both groups have approximate multivariate normal distributions.
-
Discriminant Analysis Results

Box's M Test of Equality of Covariance Matrices
H_0: Σ_1 = Σ_2
H_1: Σ_1 ≠ Σ_2

Box's M = 12.83, F = 2.11, df1 = 6, df2 = 217363, Sig. = 0.049

Since Sig. = 0.049 is not significant at the 1% level, we can assume equal covariance matrices and use Fisher's Linear Discriminant Function.
-
SPSS Output (Discriminant Functions)

Classification Function Coefficients (Fisher's linear discriminant functions)
WEI_CODE      0 normal weight   1 low weight
MUM_HEIG      6.699             6.601
WGHTGAIN      .397              .238
ANT_VST       2.965             2.797
(Constant)    -532.689          -515.224

Normal birth weight category:
d1 = -532.689 + 6.699(mother's height) + 0.397(weight gain) + 2.965(antenatal visits)

Low birth weight category:
d2 = -515.224 + 6.601(mother's height) + 0.238(weight gain) + 2.797(antenatal visits)
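To classify a new case with the two Fisher classification functions above, evaluate both and assign the group with the larger score. A sketch with invented input values:

```python
# Hypothetical mother: height = 150 cm, weight gain = 6 kg, 4 antenatal visits
height, weight_gain, visits = 150, 6, 4

# Fisher classification functions from the SPSS output above
d1 = -532.689 + 6.699 * height + 0.397 * weight_gain + 2.965 * visits  # normal
d2 = -515.224 + 6.601 * height + 0.238 * weight_gain + 2.797 * visits  # low

group = "low birth weight" if d2 > d1 else "normal birth weight"
print(round(d1, 3), round(d2, 3), group)
```

For this short mother with low weight gain and few visits, d2 exceeds d1, so the case lands in the low birth weight group.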
-
Discriminant Analysis Results (Cont'd)

Classification Results
                                         Predicted Group Membership
                                         normal weight   low weight   Total
Original          normal weight (Count)  170             96           266
                  low weight (Count)     28              71           99
                  normal weight (%)      63.9            36.1         100.0
                  low weight (%)         28.3            71.7         100.0
Cross-validated   normal weight (Count)  169             97           266
                  low weight (Count)     28              71           99
                  normal weight (%)      63.5            36.5         100.0
                  low weight (%)         28.3            71.7         100.0

Cross-validation error rate of the model = 0.34
-
Evaluating the performance of the discriminant functions

The estimated discriminant functions are then used to predict the group membership of the 50 observations in the validation data set.

If d2(x_j) > d1(x_j), we classify the observation into π_1 (low birth weight).
-
Evaluating Discriminant Function Performance Through Error Rate

Classification Table
                        Predicted
Observed          Normal weight   Low weight   Total
Normal weight     22              12           34
Low weight        6               10           16
Total             28              22           50

E3 = (12 + 6) / 50 = 0.36
-
Summary of the Models' Performances

Statistical model                 Significant variables                                    Error rate
1. Multiple linear regression     Mother's height, weight gain, antenatal visits, parity   0.30
2. Multiple logistic regression   Mother's height, weight gain, antenatal visits,          0.24
                                  history of abortion(s)
3. Discriminant analysis          Mother's height, weight gain, antenatal visits           0.36

Note: comparing the models with the same significant predictor variables, the error rates for Multiple Regression, Logistic Regression and Discriminant Analysis are 0.28, 0.26 and 0.36 respectively.
-
Conclusion of Study

The significant predictor variables affecting the birth weight of babies are weight gain, number of antenatal visits, parity, mother's height and history of abortion(s).
The logistic regression model is found to be the best model in this study, as it has the lowest error rate.
-
Some interesting research papers

(1) Logistic Regression for Data Mining and High-Dimensional Classification
Paul Komarek (PhD thesis, Carnegie Mellon University, 2004, 138 pages)
www.autonlab.org/autonweb/showPaper.jsp?ID=komarek:Ir_thesis

(2) Predicting Housing Value: A Comparison of Multiple Regression and Artificial Neural Networks
Nghiep Nguyen & Al Cripps
Journal of Real Estate Research, Vol 22, p313-336, 2001.
-
7/27/2019 Logistic Regression Notes
47/50
KOLOKIUM STATISTIK 2004,
FTMSK
47
Some interesting research papers

(3) Application of f-regression to fuzzy classification problem
Boris Izyumov
Proceedings of 3rd International Conference on Fuzzy Logic and Technology (EUS, 2003), Zittau, Germany (2003), pp781-766

(4) Assessing and Predicting Information and Communication Technology Literacy in Education Undergraduates
JoAnne Davies (PhD thesis, Department of Educational Psychology, Edmonton, Alberta, 2002)
-
Some interesting research papers

(5) Discriminant Analysis for recognition of human face images
Kamran Etemad & Rama Chellappa
J. Optical Soc. of America, Vol 14, No 8, 1997
-
FACULTY OF INFORMATION TECHNOLOGY & QUANTITATIVE SCIENCES, UiTM
-
FACULTY OF INFORMATION TECHNOLOGY &
QUANTITATIVE SCIENCES, UiTM