Logistic regression (cases 4-6)
04/21/2023 Unlock the Potential of Data Analysis
Logistic Regression Cases (4-6)
Presenter: Muhammad Akram Naseem ([email protected])
Research Centre for Training and Development (RCTD)
What is Logistic Regression?
Logistic regression analysis is used when the dependent variable is categorical.
Binary logistic regression is most useful when you want to model the event probability for a categorical response variable with two outcomes.
For example: a doctor wants to accurately diagnose a possibly cancerous tumor.
Binary logistic regression
Accurate diagnosis (AD) depends on the experience of the doctor.
Dependent variable: Accurate diagnosis (AD) (Yes, No)
Independent variable: Experience
Binary logistic regression
A loan officer wants to know whether the next customer is likely to default or not.
Default from loan (DFL) depends on monthly income.
Dependent variable: Default from loan (DFL) (Yes, No)
Independent variable: Monthly income
Binary logistic regression
Satisfaction (Yes, No) of the employees of a certain organization depends on workload.
Dependent variable: Satisfaction (Yes, No)
Independent variable: Workload
Binary logistic regression (Case-4)
Case: A study is conducted to determine the impact of age on coronary heart disease (CHD).
Dependent variable: CHD (1 = Yes, 0 = No)
Independent variable: Age
Data file: logistic regression
Binary logistic regression (Case-4)
1. Click on Analyze
2. Click on Binary Logistic
3. Shift the dependent variable
4. Shift the independent variable
5. Click on OK
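The menu steps above hand the estimation to the software. As a rough illustration of what such a fit involves, here is a minimal sketch that estimates a one-predictor binary logistic model by Newton-Raphson maximum likelihood. It is not claimed to be the routine the software runs internally. The `ages` and `chd` lists are small hypothetical values made up for illustration (they are not the contents of the data file), and `fit_logistic` is a name introduced here.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector: first derivatives of the log likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix entries, using weights p * (1 - p)
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data, for illustration only
ages = [25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
chd = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

b0, b1 = fit_logistic(ages, chd)
print(f"Z = {b0:.3f} + {b1:.4f} (age)")
```

With real data, the fitted constant and slope play the same role as the B column in the output that follows.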
Binary logistic regression (Case-4)

Omnibus Tests of Model Coefficients
        Chi-square   df     Sig.
Step    29.31        1.00   0.00
Block   29.31        1.00   0.00
Model   29.31        1.00   0.00
The p-value (Sig.) suggests that the model is significant.

Model Summary
Step   -2 Log likelihood   Cox & Snell R Square
1      107.35              0.25
Cox & Snell R Square indicates the explanatory power of the model.

Classification Table
                                  Predicted Coronary Heart Disease
Observed Coronary Heart Disease   NO    YES
NO                                45    12
YES                               14    29
Proportion correctly classified: (45 + 29)/100 = 0.74
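The two summary figures on this slide can be reproduced from the numbers shown. The sketch below recomputes the proportion correctly classified from the four cells of the classification table, and the Cox & Snell R Square from the model chi-square using the standard identity 1 - exp(-G/n). The variable names are introduced here for illustration.

```python
import math

# Classification table for Case 4: (observed, predicted) -> count
table = {("NO", "NO"): 45, ("NO", "YES"): 12,
         ("YES", "NO"): 14, ("YES", "YES"): 29}

n = sum(table.values())
correct = table[("NO", "NO")] + table[("YES", "YES")]
accuracy = correct / n

# Cox & Snell R Square from the model chi-square G: 1 - exp(-G / n)
model_chi_square = 29.31
cox_snell = 1 - math.exp(-model_chi_square / n)

print(f"n = {n}, accuracy = {accuracy:.2f}, Cox & Snell = {cox_snell:.2f}")
# prints: n = 100, accuracy = 0.74, Cox & Snell = 0.25
```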
Binary logistic regression (Case-4)
Age is a significant variable.

Variables in the Equation
           B       S.E.   Wald    df   Sig.   Exp(B) = OR
AGE        0.11    0.02   21.25   1    0.00   1.117
Constant   -5.31   1.13   21.94   1    0.00   0.01

exp(0.11) = 1.12 means that with each one-year increase in age, the odds of CHD are multiplied by 1.12, provided all other factors are kept constant.
Z = -5.31 + 0.1109 (age)
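To make the Exp(B) column concrete, the short sketch below exponentiates the age coefficient from the equation Z = -5.31 + 0.1109 (age). The helper name `odds_ratio` and the ten-year example are additions for illustration. Because the model is linear in Z, the effect of several years compounds multiplicatively.

```python
import math

b_age = 0.1109  # coefficient of age from Z = -5.31 + 0.1109 (age)

def odds_ratio(b, years=1):
    """Multiplicative change in the odds for a given increase in age."""
    return math.exp(b * years)

print(f"1 year:   {odds_ratio(b_age):.3f}")      # 1.117
print(f"10 years: {odds_ratio(b_age, 10):.2f}")  # 3.03
```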
Binary logistic regression (Case-4)
Z = -5.31 + 0.1109 (age)
P(CHD) = 1 / (1 + e^(-Z))

Age   Z       P(CHD)
30    -2.01   0.12
35    -1.46   0.19
40    -0.91   0.29
45    -0.36   0.41
50    0.19    0.55
55    0.74    0.68
60    1.29    0.78
65    1.84    0.86
70    2.39    0.92
75    2.94    0.95
[Chart: P(CHD) by age, 30 to 75]
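The Age, Z and P(CHD) columns can be regenerated with a few lines of code. This is a minimal sketch; `p_chd` is a name introduced here. It uses the rounded slope 0.11, because the tabulated Z values agree with 0.11 rather than with 0.1109.

```python
import math

def p_chd(age, b0=-5.31, b1=0.11):
    """Return (Z, P(CHD)) for a given age under the fitted model."""
    z = b0 + b1 * age
    return z, 1.0 / (1.0 + math.exp(-z))

print("Age      Z  P(CHD)")
for age in range(30, 80, 5):
    z, p = p_chd(age)
    print(f"{age:3d} {z:6.2f}  {p:.2f}")
```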
Binary logistic regression (Case-5)
In Case 5, our objective is to determine the impact of a categorical (binary) explanatory variable on a categorical (binary) dependent variable, how the analysis is performed, and how the findings are interpreted.
File used: Case5.sav
Dependent variable: Baby Birth Weight
Independent variable: Smoking Status
Binary logistic regression (Case-5)
Dependent variable: Baby Birth Weight, classified as low weight (1) or not low weight (0)
Independent variable: Smoking Status, classified as smoker (1) or non-smoker (0)
Binary logistic regression (Case-5)

Omnibus Tests of Model Coefficients
        Chi-square   df   Sig.
Step    4.867        1    0.027
Block   4.867        1    0.027
Model   4.867        1    0.027
The p-value (Sig.) suggests that the model is significant at the 5% level.

Model Summary
Step   -2 Log likelihood   Cox & Snell R Square
1      229.805             0.025
Cox & Snell R Square indicates the explanatory power of the model.

Classification Table
                        Predicted Birth Weight
Observed Birth Weight   Not low   Low
Not low                 130       0
Low                     59        0
Proportion correctly classified: (130 + 0)/189 = 0.69
Binary logistic regression (Case-5)
Smoking status (SS) is a significant variable.

Variables in the Equation
           B        S.E.    Wald     df   Sig.    Exp(B) = OR
SS         0.704    0.320   4.852    1    0.028   2.022
Constant   -1.087   0.215   25.627   1    0.00    0.337

exp(0.704) = 2.02 means that the odds of a low birth weight baby for mothers who smoke are 2.02 times the odds for mothers who do not smoke during pregnancy, provided all other factors are kept constant.
Z = -1.087 + 0.704 (SS)
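The Wald and Sig. columns for SS can be checked from B and S.E. alone. The sketch below assumes the usual Wald statistic (B / S.E.) squared with 1 degree of freedom, and uses the fact that the upper tail of a 1 df chi-square equals erfc(sqrt(x / 2)). The function name `wald_test` is introduced here. The recomputed Wald value differs slightly from the tabulated 4.852 because B and S.E. are shown rounded to three decimals.

```python
import math

def wald_test(b, se):
    """Wald chi-square (1 df) and its p-value for one coefficient."""
    wald = (b / se) ** 2
    # For 1 df, the chi-square upper tail equals erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(wald / 2))
    return wald, p_value

wald, p_value = wald_test(0.704, 0.320)
print(f"Wald = {wald:.2f}, Sig. = {p_value:.3f}, Exp(B) = {math.exp(0.704):.3f}")
```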
Binary logistic regression (Case-5)
Z = -1.087 + 0.704 (SS)
P(LBW) = 1 / (1 + e^(-Z))

SS    Z        Prob
Yes   -0.383   0.41
No    -1.087   0.25
[Chart: Probability of LBW by smoking status (SS) of mothers]
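The two probabilities and the Exp(B) value of 2.02 are linked through the odds p / (1 - p). As a small check, the sketch below computes both probabilities from the equation and then recovers the odds ratio from them. The helper names are introduced here. It also prints the plain ratio of the two probabilities, which is smaller than the odds ratio; this is why Exp(B) is best read as a ratio of odds.

```python
import math

def prob(z):
    """Logistic transform of the linear predictor Z."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    return p / (1.0 - p)

p_smoker = prob(-1.087 + 0.704 * 1)     # SS = 1
p_nonsmoker = prob(-1.087 + 0.704 * 0)  # SS = 0

ratio_of_odds = odds(p_smoker) / odds(p_nonsmoker)
ratio_of_probs = p_smoker / p_nonsmoker

print(f"P(LBW | smoker)     = {p_smoker:.2f}")
print(f"P(LBW | non-smoker) = {p_nonsmoker:.2f}")
print(f"odds ratio          = {ratio_of_odds:.2f}")
print(f"probability ratio   = {ratio_of_probs:.2f}")
```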
Binary logistic regression (Case-6)
In this case, we study the impact of race (Black, White, Other) on birth weight (Low, Not Low).
Dependent variable: Birth Weight Status (BWS)
Explanatory variable: Race
Binary logistic regression (Case-6)
We will create two dummy variables for race:
Race1: White = 1, Black = 0, Other = 0
Race2: White = 0, Black = 1, Other = 0
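The coding scheme above can be written as a small function. This is an illustrative sketch; `race_dummies` is a name introduced here, and the input is assumed to be a text label. A category with both dummies equal to 0 ("Other") serves as the reference category against which White and Black are compared.

```python
def race_dummies(race):
    """Return (race1, race2); 'Other' is the reference category."""
    label = race.strip().lower()
    race1 = 1 if label == "white" else 0
    race2 = 1 if label == "black" else 0
    return race1, race2

for r in ["White", "Black", "Other"]:
    print(r, race_dummies(r))
# White (1, 0)
# Black (0, 1)
# Other (0, 0)
```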
Binary logistic regression (Case-6)

Omnibus Tests of Model Coefficients
        Chi-square   df   Sig.
Step    4.636        2    0.098
Block   4.636        2    0.098
Model   4.636        2    0.098
The p-value (Sig.) suggests that the model is significant only at the 10% level, not at the 5% level.

Model Summary
Step   -2 Log likelihood   Cox & Snell R Square
1      230.036             0.024
Cox & Snell R Square indicates the explanatory power of the model.

Classification Table
                        Predicted Birth Weight
Observed Birth Weight   Not low   Low
Not low                 130       0
Low                     59        0
Proportion correctly classified: (130 + 0)/189 = 0.69
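The Sig. value of the omnibus test and the Cox & Snell R Square can both be recovered from the model chi-square. The sketch below relies on two standard facts: for 2 degrees of freedom the chi-square upper tail is exactly exp(-x / 2), and Cox & Snell R Square equals 1 - exp(-G / n). The sample size is taken as the sum of the two observed row totals in the classification table; the variable names are introduced here.

```python
import math

model_chi_square = 4.636
n = 130 + 59  # observed row totals from the classification table

# For 2 df, the chi-square upper tail is exactly exp(-x / 2)
p_value = math.exp(-model_chi_square / 2)

# Cox & Snell R Square from the model chi-square G: 1 - exp(-G / n)
cox_snell = 1 - math.exp(-model_chi_square / n)

print(f"Sig. = {p_value:.3f}, Cox & Snell = {cox_snell:.3f}")
# prints: Sig. = 0.098, Cox & Snell = 0.024
```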
Binary logistic regression (Case-6)
Neither race dummy is significant at the 5% level.

Variables in the Equation
           B        S.E.    Wald    df   Sig.    Exp(B) = OR
race1      -0.599   0.347   2.973   1    0.085   0.549
race2      0.232    0.470   0.244   1    0.621   1.261
Constant   -0.542   0.252   4.650   1    0.031   0.581
The Exp(B) column gives the odds ratios.
Z = -0.542 - 0.599 race1 + 0.232 race2
Binary logistic regression (Case-6)
Z = -0.542 - 0.599 race1 + 0.232 race2
P(LBW) = 1 / (1 + e^(-Z))

Race    Z       Prob
White   -1.14   0.242
Black   -0.31   0.423
Other   -0.54   0.368
[Chart: Probability of LBW between different races]
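The three rows of the table follow from plugging each race's dummy values into the equation. A minimal sketch, with the coefficients copied from the Variables in the Equation table and the dictionary and function names introduced here for illustration:

```python
import math

coef = {"constant": -0.542, "race1": -0.599, "race2": 0.232}
dummies = {"White": (1, 0), "Black": (0, 1), "Other": (0, 0)}

def p_lbw(race1, race2):
    """Return (Z, P(LBW)) for one combination of the race dummies."""
    z = coef["constant"] + coef["race1"] * race1 + coef["race2"] * race2
    return z, 1.0 / (1.0 + math.exp(-z))

results = {race: p_lbw(*d) for race, d in dummies.items()}
for race, (z, p) in results.items():
    print(f"{race:6s} {z:6.2f}  {p:.3f}")
```

For the reference category ("Other"), both dummies are 0, so Z reduces to the constant alone.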
Classical regression vs logistic regression

Classical Regression
1. Model significance is tested by the t or F test
2. Coefficients are estimated by least squares
3. Interpretation is straightforward
4. Explanatory power of the model is determined by R2

Logistic Regression
1. Model significance is tested by chi-square
2. Coefficients are estimated by maximum likelihood
3. Interpretation is through the odds ratio
4. Explanatory power of the model is determined by pseudo R2
Best of luck