Post on 19-Dec-2015
Modeling Continuous Longitudinal Data
Introduction to continuous longitudinal data: Examples
Homeopathy vs. placebo in treating pain after surgery
Figure: mean pain assessments by visual analogue scales (VAS) on the day of surgery and on days 1-7 after surgery (morning and evening).
Copyright ©1995 BMJ Publishing Group Ltd. Lokken, P. et al. BMJ 1995;310:1439-1442

Divalproex vs. placebo for treating bipolar depression
Davis et al. “Divalproex in the treatment of bipolar depression: A placebo controlled study.” J Affective Disorders 85 (2005) 259-266.

Randomized trial of in-field treatments of acute mountain sickness
Figure: mean (SD) score of acute mountain sickness in subjects treated with simulated descent (one hour of treatment in the hyperbaric chamber) or dexamethasone.
Copyright ©1995 BMJ Publishing Group Ltd. Keller, H.-R. et al. BMJ 1995;310:1232-1235

Pint of milk vs. control on bone acquisition in adolescent females
Figure: mean (SE) percentage increases in total body bone mineral and bone density over 18 months. P values are for the differences between groups by repeated measures analysis of variance.
Copyright ©1997 BMJ Publishing Group Ltd. Cadogan, J. et al. BMJ 1997;315:1255-1260

Counseling vs. control on smoking in pregnancy
Copyright ©2000 BMJ Publishing Group Ltd. Hovell, M. F. et al. BMJ 2000;321:337-342
Longitudinal data: broad form
id  time1  time2  time3  time4
1     31     29     15     26
2     24     28     20     32
3     14     20     28     30
4     38     34     30     34
5     25     29     25     29
6     30     28     16     34
Hypothetical data from Twisk, chapter 3, page 26, table 3.4. Jos W. R. Twisk. Applied Longitudinal Data Analysis for Epidemiology: A Practical Guide. Cambridge University Press, 2003.
Longitudinal data: Long form
Hypothetical data from Twisk, chapter 3, page 26, table 3.4
id  time  score
1    1     31
1    2     29
1    3     15
1    4     26
2    1     24
2    2     28
2    3     20
2    4     32
3    1     14
3    2     20
3    3     28
3    4     30
4    1     38
4    2     34
4    3     30
4    4     34
5    1     25
5    2     29
5    3     25
5    4     29
6    1     30
6    2     28
6    3     16
6    4     34
Converting data from broad to long in SAS…
data long;
  set broad;
  /* write one output row per time point for each subject */
  time=1; score=time1; output;
  time=2; score=time2; output;
  time=3; score=time3; output;
  time=4; score=time4; output;
run;
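For readers without SAS, the same broad-to-long reshaping can be sketched in Python. This is only an illustration: the function name `broad_to_long` and the dictionary layout are assumptions made here, not part of the lecture.

```python
# Broad form: one row per subject, one column per time point (Twisk table 3.4).
broad = [
    {"id": 1, "time1": 31, "time2": 29, "time3": 15, "time4": 26},
    {"id": 2, "time1": 24, "time2": 28, "time3": 20, "time4": 32},
    {"id": 3, "time1": 14, "time2": 20, "time3": 28, "time4": 30},
    {"id": 4, "time1": 38, "time2": 34, "time3": 30, "time4": 34},
    {"id": 5, "time1": 25, "time2": 29, "time3": 25, "time4": 29},
    {"id": 6, "time1": 30, "time2": 28, "time3": 16, "time4": 34},
]

def broad_to_long(rows, n_times=4):
    """Return one record per subject per time point (long form)."""
    long_rows = []
    for row in rows:
        for t in range(1, n_times + 1):
            long_rows.append({"id": row["id"], "time": t, "score": row[f"time{t}"]})
    return long_rows

long_data = broad_to_long(broad)
for rec in long_data[:4]:  # the four records of subject 1
    print(rec["id"], rec["time"], rec["score"])
```

Like the SAS data step, this writes four output records for every input row.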
Profile plots (use long form)
The plot tells a lot!
Mean response plot
[Figures: mean response plot; mean response superimposed on the profile plots; smoothed versions of both]
Two groups (e.g., treatment vs. placebo)
id  group  time1  time2  time3  time4
1     A      31     29     15     26
2     A      24     28     20     32
3     A      14     20     28     30
4     B      38     34     30     34
5     B      25     29     25     29
6     B      30     28     16     34
Hypothetical data from Twisk, chapter 3, page 40, table 3.7
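Profile plots by group
[Figure: individual profiles, lines labeled by group (A, B)]
Mean plots by group
[Figure: mean response over time, one line per group (A, B)]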
Possible questions…
Overall, are there significant differences between time points?
  From plots: looks like some differences (time3 and time4 look different)
Overall, are there significant changes from baseline?
  From plots: at time3 or time4 maybe
Do the two groups differ at any time points?
  From plots: certainly at baseline; some difference everywhere
Do the two groups differ in their responses over time?**
  From plots: their response profiles look similar over time, though A and B are closer by the end.
Statistical analysis strategies
Traditional approaches (this week):
Strategy 1: ANCOVA on the final measurement, adjusting for baseline differences (end-point analysis)
Strategy 2: repeated-measures ANOVA, “univariate” approach
Strategy 3: “multivariate” ANOVA approach
Newer approaches (next week):
Strategy 4: GEE
Strategy 5: Mixed models
In two/three weeks:
Strategy 6: Modeling change
Comparison of traditional and new methods
FROM: Ralitza Gueorguieva, PhD; John H. Krystal, MD. Move Over ANOVA: Progress in Analyzing Repeated-Measures Data and Its Reflection in Papers Published in the Archives of General Psychiatry. Arch Gen Psychiatry. 2004;61:310-317.
Things to consider:
1. Spacing of time intervals
Repeated-measures ANOVA and MANOVA require that all subjects are measured at the same time intervals; our plots above assumed this too!
MANOVA weights all time intervals evenly (as if evenly spaced)
2. Assumptions of the model
ALL strategies assume a normally distributed outcome and homogeneity of variances
But all strategies are robust against violations of this assumption, especially if the sample size is >30
**Univariate repeated-measures ANOVA also assumes sphericity, or compound symmetry
3. Missing data
All traditional analyses require imputation of missing data
(also need to know: does the SAS PROC require the long or broad form of the data?)
Compound symmetry
Compound symmetry requires:
(a) The variances of the outcome variable must be the same at each time point
(b) The correlations between repeated measurements must be equal, regardless of the time interval between measurements.
(a) Variances at each time point (visually)
Does variance look equal across time points??
--Looks like most variability at time1 and least at time4…
(a) Variances at each time point (numerically)
id           time1     time2     time3     time4
1              31        29        15        26
2              24        28        20        32
3              14        20        28        30
4              38        34        30        34
5              25        29        25        29
6              30        28        16        34
Variance:  65.60000  20.40000  39.46667   9.76667
(b) Correlation (covariance) across time points
          time1      time2      time3      time4
time1   1.00000    0.94035   -0.14150    0.28445
time2   0.94035    1.00000   -0.02819    0.26921
time3  -0.14150   -0.02819    1.00000    0.27844
time4   0.28445    0.26921    0.27844    1.00000
Certainly do NOT have equal correlations!
Time1 and time2 are highly correlated (r=0.94), but time1 and time3 are weakly negatively correlated (r=-0.14)!
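Both compound-symmetry checks (equal variances, equal correlations) can be reproduced by hand. The sketch below uses plain Python with sample (n-1) variances and Pearson correlations; the helper names `covariance` and `correlation` are made up for this illustration.

```python
from math import sqrt

scores = {
    "time1": [31, 24, 14, 38, 25, 30],
    "time2": [29, 28, 20, 34, 29, 28],
    "time3": [15, 20, 28, 30, 25, 16],
    "time4": [26, 32, 30, 34, 29, 34],
}

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation from the sample covariances."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# (a) variance at each time point
variances = {name: covariance(v, v) for name, v in scores.items()}
# (b) correlation between every pair of time points
corr = {(a, b): correlation(scores[a], scores[b]) for a in scores for b in scores}

for name in scores:
    print(name, round(variances[name], 5),
          [round(corr[(name, other)], 5) for other in scores])
```

Under compound symmetry the four variances would be roughly equal and so would all the off-diagonal correlations; here neither holds.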
Compound symmetry would look like…
          time1      time2      time3      time4
time1   1.00000   -0.04878   -0.04878   -0.04878
time2  -0.04878    1.00000   -0.04878   -0.04878
time3  -0.04878   -0.04878    1.00000   -0.04878
time4  -0.04878   -0.04878   -0.04878    1.00000
Missing Data
Very important to fill in missing data! Otherwise, you have to throw out the whole observation.
With missing data, changes in the mean over time may just reflect the drop-out pattern; you cannot compare time point 1 with 50 people to time point 2 with 35 people!
We will implement the classic “last observation carried forward” strategy for simplicity.
Other, more complicated imputation strategies may be more appropriate.
LOCF (Last Observation Carried Forward)
Before imputation (. = missing):
Subject     HRSD 1  HRSD 2  HRSD 3  HRSD 4
Subject 1     20      13       .       .
Subject 2     21      21      20      19
Subject 3     19      18      10       6
Subject 4     30       .      25      23
After LOCF:
Subject     HRSD 1  HRSD 2  HRSD 3  HRSD 4
Subject 1     20      13      13      13
Subject 2     21      21      20      19
Subject 3     19      18      10       6
Subject 4     30      30      25      23
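A minimal sketch of LOCF in Python, assuming missing values are coded as `None` and visits are listed in time order. The function name `locf` is hypothetical; a value missing at the very first visit has nothing to carry forward and stays missing.

```python
def locf(values):
    """Fill each missing value (None) with the most recent observed value."""
    filled = []
    last = None  # no observation seen yet
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# HRSD scores at four visits; None marks a missed visit.
hrsd = {
    "Subject 1": [20, 13, None, None],
    "Subject 2": [21, 21, 20, 19],
    "Subject 3": [19, 18, 10, 6],
    "Subject 4": [30, None, 25, 23],
}

imputed = {subject: locf(visits) for subject, visits in hrsd.items()}
for subject, visits in imputed.items():
    print(subject, visits)
```

Subject 1 (a drop-out) keeps the last observed score for all later visits, while Subject 4 has a single intermittent gap filled from the preceding visit.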
Strategy 1: End-point analysis
proc glm data=broad;
  class group;
  /* final measurement modeled on baseline and group (ANCOVA) */
  model time4 = time1 group;
run;
Removes repeated measures problem by considering only a single time point (the final one).
Ignores intermediate data completely
Asks whether or not the two group means differ at the final time point, adjusting for differences at baseline (using ANCOVA).
Comparing groups at every follow-up time point in this way would hugely increase your type I error.
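The end-point ANCOVA can be checked without SAS by fitting two least-squares models (baseline only, then baseline plus group) and comparing their residual sums of squares. The sketch below is one way to do that in plain Python; the helpers `solve` and `ols` and the 0/1 coding of group B are assumptions made for this illustration, not what PROC GLM does internally.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """Least squares via the normal equations; returns (coefficients, SSE)."""
    n = len(y)
    xtx = [[sum(ci[k] * cj[k] for k in range(n)) for cj in columns] for ci in columns]
    xty = [sum(c[k] * y[k] for k in range(n)) for c in columns]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[k] for b, c in zip(beta, columns)) for k in range(n)]
    return beta, sum((yk - fk) ** 2 for yk, fk in zip(y, fitted))

time1 = [31, 24, 14, 38, 25, 30]   # baseline
time4 = [26, 32, 30, 34, 29, 34]   # final measurement
group_b = [0, 0, 0, 1, 1, 1]       # assumed coding: 1 = group B, 0 = group A
ones = [1] * 6

_, sse_reduced = ols([ones, time1], time4)            # baseline only
beta, sse_full = ols([ones, time1, group_b], time4)   # baseline + group
df_error = len(time4) - 3
f_group = (sse_reduced - sse_full) / (sse_full / df_error)

print("error SS:", round(sse_full, 4))
print("baseline-adjusted B minus A difference:", round(beta[2], 4))
print("F for group:", round(f_group, 2))
```

In these data the fitted baseline slope is essentially zero once group is in the model, so the adjusted group means coincide with the raw time4 means.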
Strategy 1: End-point analysis
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2      13.50000000    6.75000000      0.57   0.6155
Error              3      35.33333333   11.77777778
Corrected Total    5      48.83333333

R-Square   Coeff Var   Root MSE   time4 Mean
0.276451    11.13041   3.431877     30.83333

Source   DF    Type I SS    Mean Square   F Value   Pr > F
time1     1   3.95121951     3.95121951      0.34   0.6031
group     1   9.54878049     9.54878049      0.81   0.4343

group   time4 LSMEAN   Pr > |t|
A         29.3333333     0.4343
B         32.3333
Least-squares means of the two groups at time4, adjusted for baseline differences (not significantly different)
From end-point analysis…
Overall, are there significant differences between time points?
  Can’t say
Overall, are there significant changes from baseline?
  Can’t say
Do the two groups differ at any time points?
  They don’t differ at time4
Do the two groups differ in their responses over time?
  Can’t say
Strategy 2: univariate repeated measures ANOVA (rANOVA)
Just good old regular ANOVA, but accounting for between-subject differences
BUT first… Naive analysis
Run ANOVA on long form of data, ignoring correlations within subjects (also ignoring group for now):
proc anova data=long;
  class time;
  model score = time;
run;
Compares means from each time point as if they were independent samples (analogous to using a two-sample t-test when a paired t-test is appropriate). Results in loss of power!
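One-way ANOVA (naïve)
$$SSB\ (\text{between times}) = 6 \times \left[(27-27)^2 + (28-27)^2 + (22.33-27)^2 + (30.83-27)^2\right] = 224.79$$
$$SSW\ (\text{within time}) = (31-27)^2 + (24-27)^2 + \dots + (29-30.83)^2 + (34-30.83)^2 = 676.17$$
id      time1   time2   time3   time4
1         31      29      15      26
2         24      28      20      32
3         14      20      28      30
4         38      34      30      34
5         25      29      25      29
6         30      28      16      34
MEAN:   27.00   28.00   22.33   30.83   (overall mean: 27.00)
Between times: the four time-point means compared with the overall mean.
Within time: each score compared with the mean of its own time point.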
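The two sums of squares above can be recomputed directly from the data. The sketch below is a plain-Python illustration (variable names are chosen here, not taken from the lecture); it uses the unrounded grand mean, which is about 27.04, whereas the slide rounds it to 27.

```python
# Scores at each time point, treated (naively) as four independent samples.
times = {
    1: [31, 24, 14, 38, 25, 30],
    2: [29, 28, 20, 34, 29, 28],
    3: [15, 20, 28, 30, 25, 16],
    4: [26, 32, 30, 34, 29, 34],
}

def mean(xs):
    return sum(xs) / len(xs)

all_scores = [s for v in times.values() for s in v]
grand_mean = mean(all_scores)

# Between times: time-point means around the grand mean, weighted by sample size.
ssb = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in times.values())
# Within time: each score around the mean of its own time point.
ssw = sum((s - mean(v)) ** 2 for v in times.values() for s in v)

df_between = len(times) - 1                # 3
df_within = len(all_scores) - len(times)   # 20
f_naive = (ssb / df_between) / (ssw / df_within)

print(round(ssb, 2), round(ssw, 2), round(f_naive, 2))
```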
One-way ANOVA results
The ANOVA Procedure
Dependent Variable: score
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3      224.7916667    74.9305556      2.22   0.1177
Error             20      676.1666667    33.8083333
Corrected Total   23      900.9583333

Source   DF      Anova SS   Mean Square   F Value   Pr > F
time      3   224.7916667    74.9305556      2.22   0.1177
Twisk: Output 3.3
Univariate repeated-measures ANOVA
Explain away some error variability by accounting for differences between subjects:
-SSE was 676.17
-This will be reduced by variability between subjects
proc glm data=broad;
  model time1-time4 = ;
  repeated time;
run; quit;
rANOVA
id      time1   time2   time3   time4    MEAN
1         31      29      15      26    25.25
2         24      28      20      32    26.00
3         14      20      28      30    23.00
4         38      34      30      34    34.00
5         25      29      25      29    27.00
6         30      28      16      34    27.00
MEAN:   27.00   28.00   22.33   30.83   27.00
$$SS_{id}\ (\text{between subjects}) = 4 \times \left[(25.25-27)^2 + (26-27)^2 + (23-27)^2 + \dots + (27-27)^2\right] = 276.21$$
$$SSB\ (\text{between times}) = 224.79\ (\text{from before})$$
$$\text{unexplained variability} = 676.17 - 276.21 = 399.96$$
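The partition used by rANOVA (total = between times + between subjects + unexplained) can be sketched in a few lines of Python. Names are illustrative; the unrounded grand mean is used, which is why the between-subjects term comes out as 276.21 rather than the 276.25 that a grand mean of exactly 27 would give.

```python
# One row per subject: scores at time1..time4.
rows = [
    [31, 29, 15, 26],
    [24, 28, 20, 32],
    [14, 20, 28, 30],
    [38, 34, 30, 34],
    [25, 29, 25, 29],
    [30, 28, 16, 34],
]
n, T = len(rows), len(rows[0])

def mean(xs):
    return sum(xs) / len(xs)

grand_mean = mean([y for row in rows for y in row])

ss_total = sum((y - grand_mean) ** 2 for row in rows for y in row)
ss_time = sum(n * (mean([row[j] for row in rows]) - grand_mean) ** 2 for j in range(T))
ss_subjects = sum(T * (mean(row) - grand_mean) ** 2 for row in rows)
ss_error = ss_total - ss_time - ss_subjects   # unexplained variability

df_time = T - 1                   # 3
df_error = (n - 1) * (T - 1)      # 15
f_time = (ss_time / df_time) / (ss_error / df_error)

print(round(ss_subjects, 2), round(ss_error, 2), round(f_time, 2))
```

Removing the between-subject variability shrinks the error term from 676.17 to 399.96, which is what raises F from 2.22 in the naive analysis to 2.81.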
rANOVA results
The GLM Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects
                                                                     Adj Pr > F
Source        DF   Type III SS   Mean Square   F Value   Pr > F    G - G     H - F
time           3   224.7916667    74.9305556      2.81   0.0752   0.1311    0.1114
Error(time)   15   399.9583333    26.6638889

Greenhouse-Geisser Epsilon   0.4857
Huynh-Feldt Epsilon          0.6343
Between-time variability: the time row. Unexplained variability: the Error(time) row.
Repeated measures p-value = .0752
After G-G correction for non-sphericity = .1311
(H-F correction gives .1114)
Idea of the G-G and H-F corrections, analogous to the pooled vs. unpooled variance t-test: if we have to estimate more things because variances/covariances aren’t equal, then we lose some degrees of freedom and the p-value increases.
These epsilons should be 1.0 if sphericity holds. Sphericity assumption appears violated.
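As a rough check on the reported Greenhouse-Geisser epsilon, the sketch below applies Box's closed-form expression to the sample covariance matrix of the four time points. This is a plain-Python illustration under that formula, not a description of how PROC GLM computes it; variable names are chosen here.

```python
rows = [
    [31, 29, 15, 26],
    [24, 28, 20, 32],
    [14, 20, 28, 30],
    [38, 34, 30, 34],
    [25, 29, 25, 29],
    [30, 28, 16, 34],
]
n, T = len(rows), len(rows[0])

# Sample covariance matrix of the T repeated measurements.
col_means = [sum(r[j] for r in rows) / n for j in range(T)]
cov = [
    [sum((r[j] - col_means[j]) * (r[k] - col_means[k]) for r in rows) / (n - 1)
     for k in range(T)]
    for j in range(T)
]

mean_diag = sum(cov[j][j] for j in range(T)) / T    # average variance
grand_mean = sum(sum(row) for row in cov) / T ** 2  # average of all entries
row_means = [sum(row) / T for row in cov]
sum_sq = sum(v * v for row in cov for v in row)

# Box's epsilon: 1 under sphericity, with lower bound 1 / (T - 1).
numerator = T ** 2 * (mean_diag - grand_mean) ** 2
denominator = (T - 1) * (
    sum_sq - 2 * T * sum(m * m for m in row_means) + T ** 2 * grand_mean ** 2
)
epsilon = numerator / denominator

print(round(epsilon, 4))
print("adjusted df:", round(epsilon * (T - 1), 2), round(epsilon * (n - 1) * (T - 1), 2))
```

The corrected test multiplies both the numerator and the error degrees of freedom by epsilon, so with epsilon near 0.49 the F of 2.81 is referred to roughly 1.5 and 7.3 degrees of freedom instead of 3 and 15.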
With two groups: Naive analysis
Run ANOVA on long form of data, ignoring correlations within subjects:
proc anova data=long;
  class time group;
  model score = time group group*time;
run;
As if there are 8 independent samples: 2 groups at each time point.
Two-way ANOVA (naïve)
grp     time1   time2   time3   time4    MEAN
A         31      29      15      26
A         24      28      20      32
A         14      20      28      30
MEAN:   23.00   25.67   21.00   29.33   24.75
B         38      34      30      34
B         25      29      25      29
B         30      28      16      34
MEAN:   31.00   30.33   23.67   32.33   29.33
Overall mean = 27
$$SSE\ (\text{within each group at each time}) = (31-23)^2 + (24-23)^2 + (14-23)^2 + (29-25.67)^2 + \dots + (34-32.33)^2 = 523.33$$
$$SSB\ (\text{between groups}) = 12 \times \left[(24.75-27)^2 + (29.33-27)^2\right] = 126.04$$
$$SSB\ (\text{between times}) = 224.79\ (\text{from before})$$
Recall: SST = 900.9583333; group by time = 900.9583 - 523.33 - 224.79 - 126.04 = 26.79
Results: Naïve analysis
The ANOVA Procedure
Dependent Variable: score
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              7      377.6250000    53.9464286      1.65   0.1924
Error             16      523.3333333    32.7083333
Corrected Total   23      900.9583333

Source       DF      Anova SS   Mean Square   F Value   Pr > F
time          3   224.7916667    74.9305556      2.29   0.1173
group         1   126.0416667   126.0416667      3.85   0.0673
time*group    3    26.7916667     8.9305556      0.27   0.8439
Univariate repeated-measures ANOVA
Reduce error variability by between-subject differences:
-SSE was 523.33
-This will be reduced by variability between subjects
proc glm data=broad;
  class group;
  model time1-time4 = group;
  repeated time;
run; quit;
rANOVA
grp     time1   time2   time3   time4    MEAN
A         31      29      15      26    25.25
A         24      28      20      32    26.00
A         14      20      28      30    23.00
MEAN:   23.00   25.67   21.00   29.33   24.75
B         38      34      30      34    34.00
B         25      29      25      29    27.00
B         30      28      16      34    27.00
MEAN:   31.00   30.33   23.67   32.33   29.33
Overall mean = 27
Between subjects in each group (each subject's mean compared with the mean of its own group):
$$SS_{id}\ (\text{between subjects}) = 4 \times \left[(25.25-24.75)^2 + (26-24.75)^2 + \dots + (27-29.33)^2\right] = 150.17$$
$$\text{unexplained variability} = 523.33 - 150.17 = 373.17$$
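The two-group partition can be verified the same way. The sketch below assumes a balanced design (equal group sizes, no missing values), which is what lets the interaction be obtained by subtraction; all names are illustrative.

```python
groups = {
    "A": [[31, 29, 15, 26], [24, 28, 20, 32], [14, 20, 28, 30]],
    "B": [[38, 34, 30, 34], [25, 29, 25, 29], [30, 28, 16, 34]],
}
T = 4

def mean(xs):
    return sum(xs) / len(xs)

all_rows = [row for rows in groups.values() for row in rows]
n_subjects = len(all_rows)
grand = mean([y for row in all_rows for y in row])

ss_total = sum((y - grand) ** 2 for row in all_rows for y in row)
ss_time = sum(n_subjects * (mean([row[j] for row in all_rows]) - grand) ** 2
              for j in range(T))
ss_group = sum(len(rows) * T * (mean([y for row in rows for y in row]) - grand) ** 2
               for rows in groups.values())
# Variability among the 8 group-by-time cell means.
ss_cells = sum(len(rows) * (mean([row[j] for row in rows]) - grand) ** 2
               for rows in groups.values() for j in range(T))
ss_interaction = ss_cells - ss_time - ss_group   # valid for balanced data
ss_within_cells = ss_total - ss_cells            # the naive SSE
# Subject means around the mean of their own group.
ss_subjects = sum(T * (mean(row) - mean([y for r in rows for y in r])) ** 2
                  for rows in groups.values() for row in rows)
ss_error_time = ss_within_cells - ss_subjects

df_error_between = n_subjects - len(groups)    # 4
df_error_time = df_error_between * (T - 1)     # 12
ms_error_time = ss_error_time / df_error_time
f_time = (ss_time / (T - 1)) / ms_error_time
f_interaction = (ss_interaction / ((len(groups) - 1) * (T - 1))) / ms_error_time
f_group = (ss_group / (len(groups) - 1)) / (ss_subjects / df_error_between)

print(round(ss_subjects, 2), round(ss_error_time, 2))
print(round(f_time, 2), round(f_interaction, 2), round(f_group, 2))
```

Note that the group effect is tested against the between-subjects error (4 degrees of freedom), while time and time*group are tested against the within-subjects error (12 degrees of freedom).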
rANOVA results (two groups)
The GLM Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects
                                                                     Adj Pr > F
Source        DF   Type III SS   Mean Square   F Value   Pr > F    G - G     H - F
time           3   224.7916667    74.9305556      2.41   0.1178   0.1743    0.1283
time*group     3    26.7916667     8.9305556      0.29   0.8338   0.6954    0.8118
Error(time)   12   373.1666667    31.0972222

Greenhouse-Geisser Epsilon   0.4863
Huynh-Feldt Epsilon          0.885

The GLM Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects
Source   DF   Type III SS   Mean Square   F Value   Pr > F
group     1   126.0416667   126.0416667      3.36   0.1408
Error     4   150.1666667    37.5416667

Between-subjects group effect: usually of less interest!
Time*group interaction: what we care about!
No apparent difference in responses over time between the groups.
From rANOVA analysis…
Overall, are there significant differences between time points?
  No, Time not statistically significant (p=.1743, G-G)
Overall, are there significant changes from baseline?
  No, Time not statistically significant
Do the two groups differ at any time points?
  No, Group not statistically significant (p=.1408)
Do the two groups differ in their responses over time?**
  No, not even close; Group*Time (p-value>.60)
Strategy 3: rMANOVA
Multivariate: more than one dependent variable
Multivariate approach to repeated measures: treats the response variable as a multivariate response vector.
Not just for repeated measures, but appropriate for other situations with multiple dependent variables.
Analogous to paired t-test
Recall: paired t-test:
$$\bar{Y}_{diff} = \frac{\sum_{i=1}^{n} (y_{i2} - y_{i1})}{n}, \qquad T = \frac{\bar{Y}_{diff}}{SD(\bar{Y}_{diff})} \sim t_{n-1}$$
Paired t-test compares the difference values between two time points to their standard error.
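MANOVA is just a paired t-test where the outcome variable is a vector of differences rather than a single difference:
$$F = \frac{N-T+1}{(N-1)(T-1)}\, H^2, \qquad H^2 = N\, \bar{y}_{diff}'\, S_{diff}^{-1}\, \bar{y}_{diff}$$
Called: Hotelling's Trace
Where N is the number of subjects and T is the number of time points, so there are T-1 differences; $\bar{y}_{diff}$ is the vector of mean differences and $S_{diff}$ is their covariance matrix.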
id  group  diff1  diff2  diff3
1     A      -2    -14     11
2     A       4     -8     12
3     A       6      8      2
4     B      -4     -4      4
5     B       4     -4      4
6     B      -2    -12     18
Note: weights all differences equally, so hard to interpret if time intervals are unevenly spaced.
Note: assumes differences follow a multivariate normal distribution + multivariate homogeneity of variances assumption
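To make the "vector-valued paired t-test" idea concrete, the sketch below computes H² and the F statistic for the time effect from the table of successive differences, ignoring group. It is a plain-Python illustration of the formula above; the helper `solve` is written here to avoid forming the matrix inverse explicitly.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Successive differences (diff1, diff2, diff3) for the six subjects.
diffs = [
    [-2, -14, 11],
    [4, -8, 12],
    [6, 8, 2],
    [-4, -4, 4],
    [4, -4, 4],
    [-2, -12, 18],
]
N = len(diffs)   # subjects
p = len(diffs[0])  # T - 1 differences
T = p + 1        # time points

mean_diff = [sum(d[j] for d in diffs) / N for j in range(p)]
# Sample covariance matrix of the difference vectors.
s_diff = [
    [sum((d[j] - mean_diff[j]) * (d[k] - mean_diff[k]) for d in diffs) / (N - 1)
     for k in range(p)]
    for j in range(p)
]

# H^2 = N * mean' * S^{-1} * mean, computed by solving S x = mean.
x = solve(s_diff, mean_diff)
h2 = N * sum(m * xi for m, xi in zip(mean_diff, x))
f_stat = (N - T + 1) / ((N - 1) * (T - 1)) * h2

print(round(h2, 3), round(f_stat, 2), "on", T - 1, "and", N - T + 1, "df")
```

With N=6 and T=4 only 3 denominator degrees of freedom remain, which illustrates why the multivariate approach can be weak in small samples.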
On same output as rANOVA
proc glm data=broad;
  model time1-time4 = ;
  repeated time;
run; quit;
Null hypothesis: diff1=0, diff2=0, diff3=0
Results (time only)
MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time Effect
H = Type III SSCP Matrix for time
E = Error SSCP Matrix
S=1 M=0.5 N=0.5

Statistic                      Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.24281920      3.12        3        3   0.1876
Pillai's Trace            0.75718080      3.12        3        3   0.1876
Hotelling-Lawley Trace    3.11829053      3.12        3        3   0.1876
Roy's Greatest Root       3.11829053      3.12        3        3   0.1876
•4 separate F-statistics (slightly different versions of MANOVA statistic)
•all give the same answer: change over time is not significant
•compare to rANOVA results: G-G time p-value=.13
Use Wilks’ Lambda in general.
Use Pillai’s Trace for small sample sizes (when assumptions of model are violated)
On same output as rANOVA
proc glm data=broad;
  class group;
  model time1-time4 = group;
  repeated time;
run; quit;
Results (two groups)
The GLM Procedure
Repeated Measures Analysis of Variance
MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time Effect
Statistic                      Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.23333404      2.19        3        2   0.3287
Pillai's Trace            0.76666596      2.19        3        2   0.3287
Hotelling-Lawley Trace    3.28570126      2.19        3        2   0.3287
Roy's Greatest Root       3.28570126      2.19        3        2   0.3287

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no time*group Effect
Statistic                      Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda             0.77496006      0.19        3        2   0.8932
Pillai's Trace            0.22503994      0.19        3        2   0.8932
Hotelling-Lawley Trace    0.29038909      0.19        3        2   0.8932
Roy's Greatest Root       0.29038909      0.19        3        2   0.8932

No differences between times.
No differences in change over time between the groups (compare to G-G time*group p-value=.6954)
From rMANOVA analysis…
Overall, are there significant differences between time points?
  No, Time not statistically significant (p=.3287)
Overall, are there significant changes from baseline?
  No, Time not statistically significant
Do the two groups differ at any time points?
  Can’t say (never looked at raw scores, only difference values)
Do the two groups differ in their responses over time?**
  No, not even close; Group*Time (p-value=.89)
Can also test for the shape of the response profile…
proc glm data=broad;
  class group;
  model time1-time4 = group;
  /* time has 4 levels (time1-time4), giving 3 polynomial contrasts */
  repeated time 4 polynomial / summary;
run; quit;
The GLM Procedure
Repeated Measures Analysis of Variance
Analysis of Variance of Contrast Variables
time_N represents the nth degree polynomial contrast for time

Contrast Variable: time_1 (linear)
Source   DF   Type III SS   Mean Square   F Value   Pr > F
Mean      1    10.2083333    10.2083333      0.21   0.6716
group     1    21.6750000    21.6750000      0.44   0.5421
Error     4   195.7666667    48.9416667

Contrast Variable: time_2 (quadratic)
Source   DF   Type III SS   Mean Square   F Value   Pr > F
Mean      1   84.37500000   84.37500000      3.80   0.1231
group     1    5.04166667    5.04166667      0.23   0.6586
Error     4   88.83333333   22.20833333

Contrast Variable: time_3 (cubic)
Source   DF   Type III SS   Mean Square   F Value   Pr > F
Mean      1   130.2083333   130.2083333      5.88   0.0724
group     1     0.0750000     0.0750000      0.00   0.9564
Error     4    88.5666667     22.141666
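The polynomial contrast tables can be reproduced by scoring each subject with orthonormal polynomial coefficients for four equally spaced time points and then running a one-way ANOVA on each score. The sketch below assumes equal group sizes (so the "Mean" sum of squares is simply N times the squared overall mean) and uses the standard integer coefficients scaled to unit length; the function name `contrast_anova` is hypothetical.

```python
from math import sqrt

groups = {
    "A": [[31, 29, 15, 26], [24, 28, 20, 32], [14, 20, 28, 30]],
    "B": [[38, 34, 30, 34], [25, 29, 25, 29], [30, 28, 16, 34]],
}

# Orthogonal polynomial coefficients for 4 equally spaced time points.
contrasts = {
    "linear": [-3, -1, 1, 3],
    "quadratic": [1, -1, -1, 1],
    "cubic": [-1, 3, -3, 1],
}

def contrast_anova(coefs):
    """Return (SS mean, SS group, SS error, F for mean) for one contrast."""
    norm = sqrt(sum(c * c for c in coefs))  # scale to unit length
    scores = {
        g: [sum(c * y for c, y in zip(coefs, row)) / norm for row in rows]
        for g, rows in groups.items()
    }
    all_scores = [s for v in scores.values() for s in v]
    n = len(all_scores)
    overall = sum(all_scores) / n
    ss_mean = n * overall ** 2  # assumes equal group sizes
    ss_group = sum(len(v) * (sum(v) / len(v) - overall) ** 2 for v in scores.values())
    ss_error = sum((s - sum(v) / len(v)) ** 2 for v in scores.values() for s in v)
    f_mean = ss_mean / (ss_error / (n - len(scores)))
    return ss_mean, ss_group, ss_error, f_mean

results = {name: contrast_anova(c) for name, c in contrasts.items()}
for name, (ss_mean, ss_group, ss_error, f_mean) in results.items():
    print(name, round(ss_mean, 4), round(ss_group, 4), round(ss_error, 4), round(f_mean, 2))
```

In each table the "Mean" row asks whether that trend component is present overall, and the "group" row asks whether it differs between groups A and B.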
Can also get successive paired t-tests
proc glm data=broad;
class group;
model time1-time4= group;
repeated time profile /summary ;
run; quit;
**Not adjusted for multiple comparisons!
Repeated Measures Analysis of Variance
Analysis of Variance of Contrast Variables
time_N represents the nth successive difference in time

Contrast Variable: time_1 (time1 vs. time2)
Source   DF   Type III SS   Mean Square   F Value   Pr > F
Mean      1    6.00000000    6.00000000      0.35   0.5879
group     1   16.66666667   16.66666667      0.96   0.3823
Error     4   69.33333333   17.33333333

Contrast Variable: time_2 (time2 vs. time3)
Source   DF   Type III SS   Mean Square   F Value   Pr > F
Mean      1   192.6666667   192.6666667      2.56   0.1850
group     1     6.0000000     6.0000000      0.08   0.7918
Error     4   301.3333333    75.3333333

Contrast Variable: time_3 (time3 vs. time4)
Source   DF   Type III SS   Mean Square   F Value   Pr > F
Mean      1   433.5000000   433.5000000      9.06   0.0395
group     1     0.1666667     0.1666667      0.00   0.9558
Error     4   191.3333333    47.8333333
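The successive-difference tables follow from the raw (unscaled) differences between adjacent time points. The sketch below recomputes the "Mean" sum of squares, the within-group error and the resulting F for each difference; it assumes equal group sizes and makes no multiple-comparison adjustment, matching the caveat above. Names are illustrative.

```python
groups = {
    "A": [[31, 29, 15, 26], [24, 28, 20, 32], [14, 20, 28, 30]],
    "B": [[38, 34, 30, 34], [25, 29, 25, 29], [30, 28, 16, 34]],
}

def successive_difference_test(j):
    """ANOVA on the difference between time j+1 and time j+2 (0-based index j)."""
    diffs = {g: [row[j + 1] - row[j] for row in rows] for g, rows in groups.items()}
    all_diffs = [d for v in diffs.values() for d in v]
    n = len(all_diffs)
    overall = sum(all_diffs) / n
    ss_mean = n * overall ** 2  # assumes equal group sizes
    ss_error = sum((d - sum(v) / len(v)) ** 2 for v in diffs.values() for d in v)
    f_mean = ss_mean / (ss_error / (n - len(diffs)))
    return ss_mean, ss_error, f_mean

profile = [successive_difference_test(j) for j in range(3)]
for label, (ss_mean, ss_error, f_mean) in zip(
        ["time1 vs. time2", "time2 vs. time3", "time3 vs. time4"], profile):
    print(label, round(ss_mean, 4), round(ss_error, 4), round(f_mean, 2))
```

Only the time3 vs. time4 step stands out (F about 9.06 on 1 and 4 degrees of freedom), consistent with the unadjusted p-value of 0.0395 in the output.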
Univariate vs. multivariate
If the compound symmetry assumption is met, the univariate approach has more power (more degrees of freedom).
But if compound symmetry is not met, the type I error of the univariate approach is increased.
Summary: rANOVA and rMANOVA
Require imputation of missing data
rANOVA requires compound symmetry (though there are corrections for this)
Require subjects measured at same time points
But, easy to implement and interpret
Practice: rANOVA and rMANOVA
What effects do you expect to be statistically significant?
Time?
Group?
Time*group?
Within-subjects effects, but no between-subjects effects.
Time is significant.
Group*time is significant.
Group is not significant.
Practice: rANOVA and rMANOVA
Between group effects; no within subject effects:
Time is not significant.
Group*time is not significant.
Group IS significant.
Practice: rANOVA and rMANOVA
Some within-group effects, no between-group effect.
Time is significant.
Group is not significant.
Time*group is not significant.
References
Jos W. R. Twisk. Applied Longitudinal Data Analysis for Epidemiology: A Practical Guide. Cambridge University Press, 2003.