Logistic Regression III: Advanced topics
![Page 1: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/1.jpg)
Logistic Regression III: Advanced topics
![Page 2: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/2.jpg)
Conditional Logistic Regression for Matched Data
![Page 3: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/3.jpg)
Recall: Matching
Matching can control for extraneous sources of variability and increase the power of a statistical test.
Match M controls to each case based on potential confounders, such as age and gender.
If the data are matched, you must account for the matching in the statistical analysis!!
![Page 4: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/4.jpg)
Recall: Agresti example, diabetes and MI
Match each MI case to a control (no MI) based on age and gender.
Ask about history of diabetes to find out if diabetes increases your risk for MI.
![Page 5: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/5.jpg)
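Diabetes history in the 144 matched pairs (rows: MI cases; columns: MI controls):

| MI cases \ MI controls | Diabetes | No diabetes | Total |
|---|---|---|---|
| Diabetes | 9 | 37 | 46 |
| No diabetes | 16 | 82 | 98 |
| Total | 25 | 119 | 144 |

odds(“favors” case | discordant pair):

$$\widehat{OR} = \frac{b}{c} = \frac{37}{16} \approx 2.3$$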
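The matched-pairs estimate uses only the discordant pairs. A minimal Python sketch of that arithmetic follows; the function name `matched_pair_or` is illustrative and does not come from the slides.

```python
def matched_pair_or(case_only_exposed, control_only_exposed):
    """Odds ratio for 1:1 matched pairs: ratio of the two discordant counts."""
    if control_only_exposed == 0:
        raise ValueError("no pairs favor the control; the ratio is undefined")
    return case_only_exposed / control_only_exposed

# Discordant pairs from the diabetes/MI table:
# b = 37 pairs in which only the case had diabetes,
# c = 16 pairs in which only the control had diabetes.
or_hat = matched_pair_or(37, 16)
print(or_hat)  # 2.3125
```

The 9 + 82 concordant pairs never enter the calculation, which is why matched data cannot be analyzed as if the two groups were independent.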
![Page 6: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/6.jpg)
Conditional Logistic Regression
![Page 7: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/7.jpg)
The Conditional Likelihood: each discordant stratum (rather than individual) gets 1 term in the likelihood

$$L = \prod_{i=1}^{\text{all strata}} \frac{P(D \mid \text{case exposures}) \, P(\sim D \mid \text{control exposures})}{P(D \mid \text{case exposures}) \, P(\sim D \mid \text{control exposures}) + P(\sim D \mid \text{case exposures}) \, P(D \mid \text{control exposures})}$$
Note: the marginal probability of disease may differ in each age-gender stratum, but we assume that the (multiplicative) increase in disease risk due to exposure is constant across strata.
For each stratum, we add to the likelihood: the CONDITIONAL probability that the case got disease and the control did not, given that we have a case-control pair.
The numerator is the probability (as a function of exposures) that the case gets disease and the control does not.
The denominator is the probability that the case gets disease and the control does not OR that the control (with all her exposures) gets disease and the case (with all her exposures) does not.
![Page 8: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/8.jpg)
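Recall probability terms:

$$P(D \mid E) = \frac{e^{\alpha+\beta}}{1+e^{\alpha+\beta}} \qquad P(D \mid \sim E) = \frac{e^{\alpha}}{1+e^{\alpha}}$$

$$P(\sim D \mid E) = \frac{1}{1+e^{\alpha+\beta}} \qquad P(\sim D \mid \sim E) = \frac{1}{1+e^{\alpha}}$$

$$\ln\left(\frac{P(D \mid E)}{1-P(D \mid E)}\right) = \alpha + \beta(1) \qquad \ln\left(\frac{P(D \mid \sim E)}{1-P(D \mid \sim E)}\right) = \alpha + \beta(0) = \alpha$$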
![Page 9: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/9.jpg)
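The four possible exposure patterns for a case-control pair, and the term each contributes to the likelihood (stratum i has its own intercept $\alpha_i$):

Both exposed (concordant):

| | Case (MI) | Control |
|---|---|---|
| Diabetes | 1 | 1 |
| No diabetes | 0 | 0 |

$$L_i(\alpha_i,\beta) = \frac{\frac{e^{\alpha_i+\beta}}{1+e^{\alpha_i+\beta}} \cdot \frac{1}{1+e^{\alpha_i+\beta}}}{\frac{e^{\alpha_i+\beta}}{1+e^{\alpha_i+\beta}} \cdot \frac{1}{1+e^{\alpha_i+\beta}} + \frac{e^{\alpha_i+\beta}}{1+e^{\alpha_i+\beta}} \cdot \frac{1}{1+e^{\alpha_i+\beta}}} = \frac{1}{2}$$

Case exposed, control unexposed (discordant, favors the case):

| | Case (MI) | Control |
|---|---|---|
| Diabetes | 1 | 0 |
| No diabetes | 0 | 1 |

$$L_i(\alpha_i,\beta) = \frac{\frac{e^{\alpha_i+\beta}}{1+e^{\alpha_i+\beta}} \cdot \frac{1}{1+e^{\alpha_i}}}{\frac{e^{\alpha_i+\beta}}{1+e^{\alpha_i+\beta}} \cdot \frac{1}{1+e^{\alpha_i}} + \frac{e^{\alpha_i}}{1+e^{\alpha_i}} \cdot \frac{1}{1+e^{\alpha_i+\beta}}}$$

Case unexposed, control exposed (discordant, favors the control):

| | Case (MI) | Control |
|---|---|---|
| Diabetes | 0 | 1 |
| No diabetes | 1 | 0 |

$$L_i(\alpha_i,\beta) = \frac{\frac{e^{\alpha_i}}{1+e^{\alpha_i}} \cdot \frac{1}{1+e^{\alpha_i+\beta}}}{\frac{e^{\alpha_i}}{1+e^{\alpha_i}} \cdot \frac{1}{1+e^{\alpha_i+\beta}} + \frac{e^{\alpha_i+\beta}}{1+e^{\alpha_i+\beta}} \cdot \frac{1}{1+e^{\alpha_i}}}$$

Both unexposed (concordant):

| | Case (MI) | Control |
|---|---|---|
| Diabetes | 0 | 0 |
| No diabetes | 1 | 1 |

$$L_i(\alpha_i,\beta) = \frac{\frac{e^{\alpha_i}}{1+e^{\alpha_i}} \cdot \frac{1}{1+e^{\alpha_i}}}{\frac{e^{\alpha_i}}{1+e^{\alpha_i}} \cdot \frac{1}{1+e^{\alpha_i}} + \frac{e^{\alpha_i}}{1+e^{\alpha_i}} \cdot \frac{1}{1+e^{\alpha_i}}} = \frac{1}{2}$$

The concordant pairs contribute a constant (1/2) that does not depend on β, so only the discordant pairs carry information.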
![Page 10: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/10.jpg)
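The conditional likelihood =

$$\prod_{i=1}^{m} \frac{\frac{e^{\alpha_i+\beta}}{1+e^{\alpha_i+\beta}} \cdot \frac{1}{1+e^{\alpha_i}}}{\frac{e^{\alpha_i+\beta}}{1+e^{\alpha_i+\beta}} \cdot \frac{1}{1+e^{\alpha_i}} + \frac{e^{\alpha_i}}{1+e^{\alpha_i}} \cdot \frac{1}{1+e^{\alpha_i+\beta}}} \times \prod_{j=1}^{n} \frac{\frac{e^{\alpha_j}}{1+e^{\alpha_j}} \cdot \frac{1}{1+e^{\alpha_j+\beta}}}{\frac{e^{\alpha_j}}{1+e^{\alpha_j}} \cdot \frac{1}{1+e^{\alpha_j+\beta}} + \frac{e^{\alpha_j+\beta}}{1+e^{\alpha_j+\beta}} \cdot \frac{1}{1+e^{\alpha_j}}}$$

The first product runs over the m discordant strata that favor the case; the second runs over the n discordant strata that favor the control.

Within each age-gender stratum, the case and the control share the same baseline odds of disease, but these baseline odds may differ across strata.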
![Page 11: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/11.jpg)
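Conditional Logistic Regression

$$L(\beta) = \prod_{i=1}^{m} \frac{e^{\alpha_i+\beta}}{e^{\alpha_i+\beta}+e^{\alpha_i}} \times \prod_{j=1}^{n} \frac{e^{\alpha_j}}{e^{\alpha_j}+e^{\alpha_j+\beta}}$$

$$= \prod_{i=1}^{m} \frac{e^{\beta}}{1+e^{\beta}} \times \prod_{j=1}^{n} \frac{1}{1+e^{\beta}} = \left(\frac{e^{\beta}}{1+e^{\beta}}\right)^{m} \left(\frac{1}{1+e^{\beta}}\right)^{n}$$

(m = number of discordant strata that favor the case; n = number of discordant strata that favor the control)

*** The α's cancel! (gets rid of nuisance parameter) ***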
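The cancellation of the stratum intercepts can be checked numerically. The sketch below (function names and the value of `beta` are illustrative assumptions, not from the slides) evaluates the conditional term for one pair under very different baseline odds.

```python
import math

def p_disease(alpha, beta, exposed):
    """P(D | exposure) under the logistic model with stratum intercept alpha."""
    z = alpha + beta * exposed
    return math.exp(z) / (1.0 + math.exp(z))

def conditional_term(alpha, beta, case_exposed, control_exposed):
    """P(case diseased, control not | exactly one of the two is diseased)."""
    p_case = p_disease(alpha, beta, case_exposed)
    p_ctrl = p_disease(alpha, beta, control_exposed)
    numerator = p_case * (1.0 - p_ctrl)
    denominator = numerator + p_ctrl * (1.0 - p_case)
    return numerator / denominator

beta = 0.8  # arbitrary illustrative value
for alpha in (-3.0, 0.0, 2.5):  # very different baseline odds
    favors_case = conditional_term(alpha, beta, 1, 0)
    favors_control = conditional_term(alpha, beta, 0, 1)
    concordant = conditional_term(alpha, beta, 1, 1)
    print(alpha, favors_case, favors_control, concordant)
```

Whatever value `alpha` takes, the discordant terms should equal e^β/(1+e^β) and 1/(1+e^β) up to floating-point error, and the concordant term should be 1/2.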
![Page 12: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/12.jpg)
Example: MI and diabetes

$$L(\beta) = \left(\frac{e^{\beta}}{1+e^{\beta}}\right)^{37} \left(\frac{1}{1+e^{\beta}}\right)^{16}$$
![Page 13: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/13.jpg)
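Conditional Logistic Regression

$$\log L(\beta) = 37\beta - 53\log(1+e^{\beta})$$

$$\frac{d\log L(\beta)}{d\beta} = 37 - \frac{53\,e^{\beta}}{1+e^{\beta}} = 0$$

$$37(1+e^{\beta}) = 53\,e^{\beta} \;\Rightarrow\; 37 = 16\,e^{\beta} \;\Rightarrow\; e^{\hat\beta} = \frac{37}{16}$$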
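The maximization for the MI and diabetes example (37 discordant pairs favoring the case, 16 favoring the control) can be verified with a few lines of Python. This is only a sketch; the function names and the grid search are illustrative choices, not part of the lecture.

```python
import math

def log_likelihood(beta, m=37, n=16):
    """Conditional log-likelihood: m pairs favor the case, n favor the control."""
    return m * beta - (m + n) * math.log(1.0 + math.exp(beta))

def score(beta, m=37, n=16):
    """First derivative of the log-likelihood with respect to beta."""
    return m - (m + n) * math.exp(beta) / (1.0 + math.exp(beta))

beta_hat = math.log(37 / 16)  # closed-form solution of score(beta) = 0
print(beta_hat, math.exp(beta_hat), score(beta_hat))

# Crude numerical check: no beta on a grid around beta_hat does better.
grid = [beta_hat + step / 100.0 for step in range(-200, 201)]
best = max(grid, key=log_likelihood)
```

The score at `beta_hat` should be zero up to rounding, and `best` should coincide with `beta_hat`, so the conditional maximum likelihood estimate reproduces the matched-pairs odds ratio b/c.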
![Page 14: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/14.jpg)
In SAS…

    proc logistic data = YourData;
      model MI (event = "Yes") = diabetes;
      strata PairID;
    run;
![Page 15: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/15.jpg)
Example: Prenatal ultrasound examinations and risk of childhood leukemia: case-control study
BMJ 2000;320:282-283
Could there be an association between exposure to ultrasound in utero and an increased risk of childhood malignancies?
Previous studies have found no association, but they have had poor statistical power to detect an association.
Swedish researchers performed a nationwide population-based case-control study using prospectively assembled data on prenatal exposure to ultrasound.
![Page 16: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/16.jpg)
Example: Prenatal ultrasound examinations and risk of childhood leukemia: case-control study
BMJ 2000;320:282-283
535 cases: all children born and diagnosed as having myeloid leukemia between 1973 and 1989 in Swedish registers of birth, cancer, and causes of death.
535 matched controls: 1 control was randomly selected for each case from the Swedish Birth Registry, matched by sex and year and month of birth.
![Page 17: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/17.jpg)
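Ultrasound exposure in the 535 matched pairs (rows: myeloid leukemia cases; columns: matched controls):

| Cases \ Controls | Ultrasound | No ultrasound | Total |
|---|---|---|---|
| Ultrasound | 115 | 85 | 200 |
| No ultrasound | 100 | 235 | 335 |
| Total | 215 | 320 | 535 |

$$\widehat{OR} = \frac{b}{c} = \frac{85}{100} = 0.85$$

But this type of analysis is limited to a single dichotomous exposure…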
![Page 18: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/18.jpg)
Used conditional logistic regression to look at dose-response with number of ultrasounds:
Results:
* No ultrasounds (reference): OR = 1.0
* 1-2 ultrasounds: OR = 0.91
* >=3 ultrasounds: OR = 0.64
Conclusion: no evidence of a positive association between prenatal ultrasound and childhood leukemia; if anything, there is some evidence of an inverse association (which could be explained by the reasons for frequent ultrasound).
![Page 19: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/19.jpg)
Extension: 1:M matching
Each term in the likelihood represents a stratum of 1+M individuals
More complicated likelihood expression!
Just as easy to implement in SAS, as we’ll see Wednesday…
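The slides do not write out the 1:M likelihood. A sketch of the standard form, assuming one case per stratum and a single exposure variable, is that each stratum contributes the case's e^(βx) divided by the sum of e^(βx) over all 1+M members. The function names and the small data set below are hypothetical.

```python
import math

def stratum_term(beta, case_exposure, control_exposures):
    """Conditional likelihood contribution of one stratum (1 case, M controls)."""
    exposures = [case_exposure] + list(control_exposures)
    weights = [math.exp(beta * x) for x in exposures]
    return weights[0] / sum(weights)

def conditional_log_likelihood(beta, strata):
    """Sum of log contributions; strata holds (case_exposure, control_exposures)."""
    return sum(math.log(stratum_term(beta, case, controls))
               for case, controls in strata)

# Hypothetical 1:3 matched strata (made-up data, for illustration only).
strata = [(1, [0, 0, 1]), (0, [1, 0, 0]), (1, [0, 0, 0])]
print(conditional_log_likelihood(0.5, strata))

# With M = 1 the term reduces to the 1:1 expression e^beta / (1 + e^beta).
print(stratum_term(0.5, 1, [0]), math.exp(0.5) / (1.0 + math.exp(0.5)))
```

As in the 1:1 case, no stratum intercept appears anywhere in the expression, and a stratum whose members all share the same exposure contributes a constant.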
![Page 20: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/20.jpg)
Ordinal Logistic Regression
![Page 21: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/21.jpg)
Ordinal Logistic Regression
What if your outcome variable has more than two levels?
For ordinal outcomes, use ordinal logistic regression:
* Relies on the cumulative logit
* Models the predicted probability of multiple outcomes
* Also known as the “proportional odds model”
![Page 22: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/22.jpg)
Ordinal Variable Example: Likert Scale
1 = strongly disagree
2 = disagree
3 = neutral
4 = agree
5 = strongly agree
Cumulative outcomes:
* strongly agree vs. the rest
* agree or strongly agree vs. neutral or negative
* agree, strongly agree, or neutral vs. negative
* the rest vs. strongly disagree
Ordinal logistic regression gives you a way to model these cumulative outcomes all at once!
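For a scale with K ordered levels there are K-1 such cumulative outcomes. A small Python sketch (the helper name `cumulative_splits` is illustrative) enumerates them for the five-level Likert scale:

```python
LEVELS = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]

def cumulative_splits(levels):
    """Return the K-1 ways to cut an ordered scale into 'higher' vs. 'lower'."""
    splits = []
    for cut in range(len(levels) - 1, 0, -1):
        splits.append((levels[cut:], levels[:cut]))
    return splits

for higher, lower in cumulative_splits(LEVELS):
    print(" or ".join(higher), "vs.", " or ".join(lower))
```

Each split corresponds to one cumulative logit, so a five-level outcome gives four intercepts in the ordinal model.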
![Page 23: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/23.jpg)
Ordinal Variable Example: Continuous variable measured crudely
1 = breastfed >=6 months
2 = breastfed 4-5 months
3 = breastfed 2-3 months
4 = breastfed <2 months
The outcome variable, breastfeeding duration, was only measured at a limited number of time points, so it may not be best modeled as a continuous variable in linear regression. Use ordinal logistic!
![Page 24: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/24.jpg)
Another example, 3 levels:
From my data on runners:
1 = eumenorrhea (normal menses) (66.6%)
2 = oligomenorrhea (mild irregularity) (24.6%)
3 = amenorrhea (severe irregularity) (8.6%)
Amenorrhea alone is the most “severe” outcome; amenorrhea or oligomenorrhea is a more inclusive definition of a “positive” outcome.
![Page 25: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/25.jpg)
Cumulative logit, 3 groups (2 potential “positive” outcomes)

$$\text{cumulative logit for amenorrhea} = \log\left(\frac{p_{\text{amenorrhea}}}{p_{\text{oligomenorrhea or normal}}}\right)$$

$$\text{cumulative logit for any irregularity} = \log\left(\frac{p_{\text{amenorrhea or oligomenorrhea}}}{p_{\text{normal}}}\right)$$
In words:
The log odds of having amenorrhea (versus everything else).
And the log odds of having any irregularity (versus normal).
![Page 26: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/26.jpg)
Corresponding logistic model (no predictors)
The intercept-only model, no predictors (two intercepts!):
Log odds (amenorrhea) = α_amen
Log odds (any irregularity) = α_(amen or oligo)
![Page 27: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/27.jpg)
Fitted model:
Logit of amenorrhea:
8.6% of my sample has amenorrhea
Odds = 8.6/91.4 = .094
Ln(.094) = -2.36
Logit of any irregularity:
33.2% has any irregularity (24.6% + 8.6%)
Odds = 33.2/66.8 = .497, or about 1/2
Ln(.497) = -.70
Fitted models are:
Log odds (amenorrhea) = -2.36
Log odds (any irregularity) = -0.70
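With no predictors, each intercept is simply the logit of the corresponding cumulative sample proportion. A minimal Python sketch using the percentages reported for the runners data (variable names are illustrative):

```python
import math

def logit(p):
    """Log odds of a probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

# Sample proportions from the runners data, as reported on the slides.
p_amen = 0.086
p_oligo = 0.246

alpha_amen = logit(p_amen)           # cumulative logit for amenorrhea
alpha_any = logit(p_amen + p_oligo)  # cumulative logit for any irregularity
print(alpha_amen, alpha_any)
```

Both values should agree with the fitted intercepts of about -2.36 and -0.70 to two decimal places; small discrepancies beyond that reflect rounding of the reported percentages.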
![Page 28: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/28.jpg)
Logistic model with predictors:
Log odds (amenorrhea) = α_amen + β1*X1 + β2*X2
Log odds (any irregularity) = α_(amen or oligo) + β1*X1 + β2*X2
Note, different intercepts but shared betas (shared slopes)!
![Page 29: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/29.jpg)
Odds ratio interpretation (a):

$$OR = \frac{\text{odds of amenorrhea for the exposed}}{\text{odds of amenorrhea for the unexposed}} = \frac{e^{\alpha_{\text{amen}} + \beta_{\text{exposure}}(1) + \beta_{\text{confounder}}(1)}}{e^{\alpha_{\text{amen}} + \beta_{\text{exposure}}(0) + \beta_{\text{confounder}}(1)}} = e^{\beta_{\text{exposure}}(1)} = e^{\beta_{\text{exposure}}}$$
![Page 30: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/30.jpg)
Odds ratio interpretation (b):

$$OR = \frac{\text{odds of any menstrual irregularity for the exposed}}{\text{odds of any menstrual irregularity for the unexposed}} = \frac{e^{\alpha_{\text{amen or oligo}} + \beta_{\text{exposure}}(1) + \beta_{\text{confounder}}(1)}}{e^{\alpha_{\text{amen or oligo}} + \beta_{\text{exposure}}(0) + \beta_{\text{confounder}}(1)}} = e^{\beta_{\text{exposure}}(1)} = e^{\beta_{\text{exposure}}}$$
![Page 31: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/31.jpg)
Odds ratio interpretation:
Interpretation of the betas:
e^β = adjusted odds ratio
For every 1-unit increase in X, e^β is the multiplicative increase in the odds of any menstrual irregularity compared with none, and it is also the multiplicative increase in the odds of amenorrhea compared with the other two categories (adjusted for any other predictors in the model).
Note: proportional odds assumption! The odds ratios are the same across different levels of the outcome.
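The proportional odds property follows directly from sharing one slope across the cumulative logits. The sketch below uses made-up parameter values (not estimates from the lecture) to show that both cumulative outcomes give the same odds ratio even though their intercepts differ.

```python
import math

def cumulative_log_odds(alpha, beta, x):
    """Cumulative logit: outcome-specific intercept plus the shared slope."""
    return alpha + beta * x

def odds_ratio(alpha, beta, x_low, x_high):
    """Ratio of the cumulative odds at x_high to the cumulative odds at x_low."""
    return (math.exp(cumulative_log_odds(alpha, beta, x_high))
            / math.exp(cumulative_log_odds(alpha, beta, x_low)))

# Hypothetical values, chosen only for illustration.
alpha_amen, alpha_any, beta = -2.4, -0.7, 0.3

print(odds_ratio(alpha_amen, beta, 0, 1))  # amenorrhea vs. the other two
print(odds_ratio(alpha_any, beta, 0, 1))   # any irregularity vs. normal
print(math.exp(beta))                      # both ratios should match this
```

If the data do not support a common slope, the two ratios estimated from separate binary models would differ, which is what the proportional odds assumption rules out.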
![Page 32: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/32.jpg)
Example predictor, EDI-A:
Score on the anorexia subscale of the eating disorder inventory (EDI-A)
![Page 33: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/33.jpg)
Cumulative logit plot (4 bins)
The intercept for any irregularity (the log odds of any irregularity where EDI-A=0)
The intercept for amenorrhea (the log odds of amenorrhea where EDI-A=0)
These lines should be linear and parallel (equal slopes, one beta!)
The slopes represent the increase in the log odds of either outcome for every 1-unit increase in EDI-A score.
![Page 34: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/34.jpg)
Fitted model with EDI-A:
Analysis of Maximum Likelihood Estimates

| Parameter | DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq |
|---|---|---|---|---|---|
| Intercept 1 | 1 | -3.2630 | 0.3823 | 72.8648 | <.0001 |
| Intercept 2 | 1 | -1.3888 | 0.2478 | 31.4220 | <.0001 |
| EDIA | 1 | 0.1211 | 0.0265 | 20.9065 | <.0001 |

Log odds (amen) = -3.2630 + 0.1211*EDI-A
Log odds (any irregularity) = -1.3888 + 0.1211*EDI-A
![Page 35: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/35.jpg)
Fitted Model: Predicted logit at every level of EDI-A
![Page 36: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/36.jpg)
Compare actual data and fitted model:
![Page 37: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/37.jpg)
Fitted model with EDI-A:
Odds Ratio Estimates

| Effect | Point Estimate | 95% Wald Lower Limit | 95% Wald Upper Limit |
|---|---|---|---|
| EDIA | 1.129 | 1.072 | 1.189 |
For every 1-unit increase in EDI-A score, there’s a 13% increase in the odds of being amenorrheic versus the other two categories and a 13% increase in the odds of being amenorrheic or oligomenorrheic versus normal.
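The odds ratio and its Wald limits can be recovered from the EDIA estimate and standard error in the maximum likelihood table. A short Python sketch, assuming the usual 1.96 normal quantile for a 95% interval:

```python
import math

estimate, std_error = 0.1211, 0.0265  # EDIA row of the SAS output
z = 1.96                              # normal quantile for a 95% interval

odds_ratio = math.exp(estimate)
lower = math.exp(estimate - z * std_error)
upper = math.exp(estimate + z * std_error)
print(f"{odds_ratio:.3f} ({lower:.3f}, {upper:.3f})")  # 1.129 (1.072, 1.189)
```

The interval is computed on the log odds scale and then exponentiated, which is why it is not symmetric around 1.129.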
![Page 38: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/38.jpg)
Predictions:
Log odds (amenorrhea) = -3.2630 + 0.1211*EDI-A
Log odds (any irregularity) = -1.3888 + 0.1211*EDI-A
The model predicts that a woman with an EDI-A score of 15 would have:

$$P(\text{amen}) = \frac{e^{-3.2630+.1211(15)}}{1+e^{-3.2630+.1211(15)}} = \frac{e^{-1.4461}}{1+e^{-1.4461}} = 19\%$$

$$P(\text{any irregularity}) = \frac{e^{-1.3888+.1211(15)}}{1+e^{-1.3888+.1211(15)}} = \frac{e^{.4281}}{1+e^{.4281}} = 60.5\%$$
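These predictions can be reproduced from the fitted coefficients. The sketch below (function names are illustrative) also derives the probability of each individual category by differencing the cumulative probabilities, a step the slides do not show.

```python
import math

def inverse_logit(log_odds):
    """Convert log odds to a probability."""
    return math.exp(log_odds) / (1.0 + math.exp(log_odds))

def predict(edi_a, alpha_amen=-3.2630, alpha_any=-1.3888, beta=0.1211):
    """Predicted category probabilities from the fitted cumulative-logit model."""
    p_amen = inverse_logit(alpha_amen + beta * edi_a)
    p_any = inverse_logit(alpha_any + beta * edi_a)
    return {
        "amenorrhea": p_amen,
        "oligomenorrhea": p_any - p_amen,  # any irregularity minus amenorrhea
        "normal": 1.0 - p_any,
    }

probs = predict(15)
print({name: round(p, 3) for name, p in probs.items()})
```

With the coefficients as printed (four decimals), the logits come out at about -1.4465 and 0.4277 rather than the -1.4461 and .4281 shown, presumably because the slide values were computed from unrounded estimates; the predicted probabilities still round to 19% and 60.5%.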
![Page 39: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/39.jpg)
Predictions:
Amenorrhea: predicted logit = -1.446, predicted probability = 19%
Any irregularity: predicted logit = .4281, predicted probability = 60.5%
50% probability line
![Page 40: Logistic Regression III: Advanced topics](https://reader036.vdocuments.net/reader036/viewer/2022062309/56814e0b550346895dbb7899/html5/thumbnails/40.jpg)
Advantages & disadvantages
Ordinal logistic is better than running separate logistic models for different outcomes (e.g., one model for amenorrhea, one model for any irregularity) because of the improvement in statistical power!
Ordinal logistic prevents you from having to arbitrarily turn an ordinal variable into a binary variable!
But does require that you meet the proportional odds assumption…