University of Copenhagen, Department of Biostatistics
Faculty of Health Sciences
Correlated data: Introduction
Julie Lyng Forman & Lene Theil Skovgaard
November 25, 2013
Introduction
- The idea of the course
- Comparing two types of measurement
- Logarithmic transformation
- Linear regression
- The general linear model
Home page: http://staff.pubhealth.ku.dk/~lts/CorrelatedMeasurements
E-mail: [email protected]
Aim of the course
To make the participants able to:
- understand and interpret advanced statistical analyses
- judge the assumptions behind the use of various methods of analysis
- perform own analyses using SAS
- understand output from a statistical program package - in general, i.e. other than SAS
- present results from a statistical analysis - numerically and graphically
To create a better platform for communication between 'users' of statistics and statisticians, to benefit subsequent collaboration.
We expect students to . . .
Be interested
Be motivated
- ideally from your own (future) research project
Have basic knowledge of statistical concepts such as:
- mean, average
- variance, standard deviation, standard error
- distribution
- correlation, regression, ANOVA
- t-test, χ²-test, F-test
Topics for the course
Quantitative data (normal distribution):
- Analysis of variance
- Variance component models
- General linear models / regression analysis
- Linear mixed models

Non-normal outcome (binary data or count data):
- Logistic or Poisson regression
- Generalized linear mixed models

Not covered:
- Multivariate data (several outcomes at once)
- Censored data (survival analysis)
Recommended reading
- The lecture notes (can be downloaded from the course web pages).
- Brief notes about SAS programming (can be downloaded from the course web pages).
- B.T. West, K.B. Welch and A.T. Galecki: Linear Mixed Models: A Practical Guide Using Statistical Software, Chapman & Hall/CRC, 2007.

We teach SAS programming
- . . . but the book also covers SPSS, R, Stata, and HLM.
Teaching activities
Lectures:
- Mornings (9.15–12.00)
- Copies of overheads must be downloaded in advance
- Coffee break around 10–10.30

Computer labs:
- In the afternoon (13.00–15.45) following each lecture
- Coffee, tea, and cake will be served
- Exercises will be handed out
- Solutions can be downloaded after classes
Course diploma
To pass the course 80% attendance is required.
- It is your responsibility to sign the list each morning and each afternoon.
- Note: 5 × 2 = 10 lists; 80% equals 8 half days.

There is no compulsory homework . . .
- but to benefit from the course you need to work with the material at home.
- We expect you to do so!
What are repeated measurements?
Repeated measurements refer to data where the same outcome has been measured in different situations (or at different spots) on the same individuals.
- Special case: longitudinal means repeatedly over time.

Repeated measurements are termed clustered data when the same outcome is measured on groups of individuals from the same families/workplaces/school classes/villages/etc.
Paired data
The most simple example of clustered or repeated measurements.
- Two replicates or two subjects per cluster

Examples of paired data:
- Same person with treatment and placebo (cross-over studies)
- Baseline-follow-up studies
- Twin studies
- Comparison of two measurement methods
- Reliability of a measurement method

Quantitative outcomes are analysed with the paired t-test, BUT often the test is not in focus; rather estimation/quantification.
Statistical analysis
The usual assumption is that observations are independent.
If you have clustered or repeated measurements the assumption of independence is violated.
- Your analyses must account for the repetitions/clustering.
- In this course we will teach you how to do it.
Warning: Ignoring the repetitions/clustering and doing a standard analysis most often leads to:
- P-values that are too small or too large.
- Confidence intervals that are too wide or too narrow.
Example: MF vs SV
Two measurement methods, expected to give the same result:

MF: Transmitral volumetric flow, determined by Doppler echocardiography

SV: Left ventricular stroke volume, determined by cross-sectional echocardiography
subject   MF   SV
      1   47   43
      2   66   70
      3   68   72
      4   69   81
      5   70   60
    ...  ...  ...
     18  105   98
     19  112  108
     20  120  131
     21  132  131
Comparison of measurement methods
Usually a comparison of a new experimental method with an established method (the reference).
- How well do the two measurements agree?
- Is the new method biased compared to the reference?

The data is paired:
- The subjects act as their own controls
- Hence we look at differences within subjects

Set up a statistical model to:
- Describe the typical size of the differences
- Test if the bias (i.e. the mean difference) is zero
Description of the data

Graphical description:
- Scatterplot
- Sample paths
- Bland-Altman plot
- Histogram
Numerical description
Variable   Mean    Std.Dev
-------------------------
MF        86.05      20.32
SV        85.81      21.19
DIF        0.24       6.96
AVERAGE   85.93      20.46
-------------------------
Statistical model for paired data
xi: MF-measurement for the i'th subject
yi: SV-measurement for the i'th subject

Look at the differences:

di = xi − yi,  for i = 1, . . . , 21

The model assumes that the differences* are:
- independent
- normally distributed: di ∼ N(δ, σd²)

* No assumptions are made about the distribution of the individual flow measurements.
The normal distribution
[Figure: density curves of two normal distributions, N(µ1, σ1²) and N(µ2, σ2²), with the means µ1, µ2 and the points µ ± σ marked on the x-axis]
N (µ, σ2)
The mean is often denoted µ or α.

The standard deviation is often denoted σ or ω.
The variance is σ2.
Paired t-test in SAS
Can be performed in two different ways:
1. as a paired two-sample test
PROC TTEST;
  PAIRED mf*sv;
RUN;
The TTEST Procedure

Statistics
                Lower CL          Upper CL  Lower CL           Upper CL
Difference   N  Mean      Mean    Mean      Std Dev   Std Dev  Std Dev
mf - sv     21  -2.932    0.2381  3.4078    5.3275    6.9635   10.056

Difference  Std Err  Minimum  Maximum
mf - sv      1.5196      -13       10

T-Tests
Difference  DF  t Value  Pr > |t|
mf - sv     20     0.16    0.8771
One-sample tests in SAS, for differences
2. as a one-sample test on the differences:
PROC UNIVARIATE NORMAL;
  VAR dif;
RUN;
The UNIVARIATE Procedure
Variable: dif

Tests for Location: Mu0=0

Test          -Statistic-    -----p Value------
Student's t   t  0.156687    Pr > |t|    0.8771
Sign          M       2.5    Pr >= |M|   0.3593
Signed Rank   S         8    Pr >= |S|   0.7603

Moments

N                     21    Sum Weights           21
Mean          0.23809524    Sum Observations       5
Std Deviation 6.96351034    Variance      48.4904762
...                  ...    ...                  ...
About the paired t-test
Test of the null hypothesis H0 : δ = 0 (no bias)
The t-statistic is given by:

t = (d̄ − 0) / SEM = (0.24 − 0) / (6.96/√21) = 0.158 ∼ t(20)

which gives P = 0.88, i.e. no significant bias.
Does this mean that the measurement methods are equally good?
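The course uses SAS, but the arithmetic behind the t-statistic is easy to reproduce from the summary statistics alone; a quick cross-check in Python (the numbers are taken from the slide):

```python
import math

# Summary statistics for the differences d = MF - SV (from the slide)
n = 21
mean_d = 0.24
sd_d = 6.96

sem = sd_d / math.sqrt(n)   # standard error of the mean
t = (mean_d - 0) / sem      # paired t-statistic, n - 1 = 20 df

print(f"SEM = {sem:.2f}, t = {t:.3f}")  # SEM = 1.52, t = 0.158
```

Comparing |t| = 0.158 against the t(20) distribution reproduces the P-value of 0.88 reported by PROC TTEST.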
Estimation of bias
The estimated mean difference is given by
d̄ = 0.24 cm³
The estimate is our best guess, but repeating the experiment would give us a somewhat different result.

The estimate has a distribution, with an uncertainty called the standard error of the estimate.
- The standard error of the mean is given by

  SEM = sd/√n = 6.96/√21 = 1.52 cm³
General confidence intervals
Confidence intervals tell us what the parameter is likely to be.
- An interval that 'catches' the true mean with 95% probability is called a 95% confidence interval.
- 95% is called the coverage.

The usual construction is:
- Average ± t97.5%(n − 1) · SEM
- Often a good approximation, even if data are not normally distributed (due to the central limit theorem).
The t-quantile t97.5% may be looked up in a table or computed by a program (e.g. R, see http://mirrors.dotsrc.org/cran/).
Confidence limits for the bias
For the differences mf−sv, we get the confidence interval:

d̄ ± t97.5%(20) · SEM = 0.24 ± 2.086 · 6.96/√21 = (−2.93 ; 3.41)

If there is a bias, it is likely (i.e. with 95% certainty) within the limits (−2.93 cm³, 3.41 cm³).

Conclusion:
We cannot rule out a bias of approx. 3 cm³ in either direction.
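As a sketch outside the course's SAS material, the confidence interval can be recomputed directly in Python (the t-quantile 2.086 is taken from the slide):

```python
import math

n, mean_d, sd_d = 21, 0.24, 6.96
t975 = 2.086                       # 97.5% quantile of t(20)
sem = sd_d / math.sqrt(n)
lo, hi = mean_d - t975 * sem, mean_d + t975 * sem
print(f"95% CI: ({lo:.2f}, {hi:.2f})")  # 95% CI: (-2.93, 3.41)
```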
P-values and confidence intervals
Tests and confidence intervals are equivalent in a certain sense:
- They agree on 'reasonable' values for the mean.
- The confidence interval contains the values δ0 for which H0 : δ = δ0 would be accepted.

But the P-value is less informative than the confidence interval:
- If the study is large, a tiny bias may be significant.
- If the study is small, a large bias may be insignificant.
- Better use the confidence interval to judge the clinical implications of the bias!
Note the difference
Standard error (of the mean), SE(M):
- tells us something about the uncertainty of the estimate of the mean
- SEM = SD/√n is the standard deviation in the distribution of the estimate
- is used for comparisons, relations etc.

Standard deviation, SD:
- tells us something about the variation in our sample,
- and presumably in the population
- is used when describing the data
Normal regions
The normal region is an interval containing 95% of the 'typical' observations, i.e. the midrange of the population:

2.5%-quantile to 97.5%-quantile

If the distribution is normal N(µ, σ²), then
- the 2.5%- to 97.5%-quantile range is µ ± 1.96σ

An estimated normal region is given by:

Average ± 2 × SD
But this does not account for parameter uncertainty!
Prediction intervals
A prediction interval has to 'catch' future observations with high probability, say 95%.

x̄ ± 2s is a good prediction interval if the sample is large. But if the sample is small the coverage will be too low.

95% coverage is attained by the prediction interval:

(x̄ − s·√(1 + 1/n)·t97.5%,  x̄ + s·√(1 + 1/n)·t97.5%)
I.e. the probability that a randomly chosen subject from the population has a value in this interval is 95% if the data is normal.
Limits of agreement
Limits-of-agreement is the prediction interval for the difference between two measuring methods
- important for deciding whether or not two measurement methods may replace each other.
Limits-of-agreement for mf−sv are given by:

0.24 ± 2.086 · √(1 + 1/21) · 6.96 = (−14.62, 15.10)
While "x̄ ± 2s" is too narrow / has too low coverage:

d̄ ± 2·sd = 0.24 ± 2·6.96 = (−13.68, 14.16)
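The limits-of-agreement formula is equally simple to evaluate directly; a sketch in Python (note how the t-quantile and the √(1 + 1/n) factor widen the naive ±2·sd interval):

```python
import math

n, mean_d, sd_d = 21, 0.24, 6.96
t975 = 2.086                               # 97.5% quantile of t(20)
half = t975 * sd_d * math.sqrt(1 + 1 / n)  # prediction-interval half-width
print(f"limits of agreement: ({mean_d - half:.2f}, {mean_d + half:.2f})")
# limits of agreement: (-14.62, 15.10)
```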
Derivation of the prediction interval
Assume that dnew is a new observation; then

dnew − d̄ ∼ N(0, σd²·(1 + 1/n))

(dnew − d̄) / (sd·√(1 + 1/n)) ∼ t(n − 1)

implying that with 95% probability:

t2.5% < (dnew − d̄) / (sd·√(1 + 1/n)) < t97.5%

d̄ + sd·√(1 + 1/n)·t2.5% < dnew < d̄ + sd·√(1 + 1/n)·t97.5%

d̄ − sd·√(1 + 1/n)·t97.5% < dnew < d̄ + sd·√(1 + 1/n)·t97.5%

since t2.5% = −t97.5% by symmetry of the t-distribution.
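The claimed 95% coverage of this prediction interval can be checked by simulation; a sketch in Python (hypothetical standard-normal data with n = 20, and the t-quantile 2.093 for 19 df hard-coded):

```python
import random
import statistics

random.seed(1)
n, t975 = 20, 2.093        # 97.5% quantile of t(n - 1) = t(19)
reps, hits = 20000, 0

for _ in range(reps):
    sample = [random.gauss(0, 1) for _ in range(n)]
    m = statistics.mean(sample)
    s = statistics.stdev(sample)
    half = t975 * s * (1 + 1 / n) ** 0.5   # prediction-interval half-width
    d_new = random.gauss(0, 1)             # a 'future' observation
    hits += m - half < d_new < m + half

print(f"coverage = {hits / reps:.3f}")     # close to 0.95
```

Dropping the √(1 + 1/n) factor (or using 1.96 instead of the t-quantile) makes the simulated coverage fall visibly below 95%, which is exactly the point of the derivation above.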
Assumptions for the paired comparison
The differences:
- are independent, i.e. the subjects are unrelated
- are normally distributed: judged graphically or numerically
  - by inspection of histograms or QQ-plots
  - by formal tests (e.g. PROC UNIVARIATE NORMAL in SAS)
- have identical variances: judged using the 'Bland-Altman plot' of differences vs. averages

Sometimes it is necessary to transform the data in order to fulfill the assumptions.
Checking normality: the QQ-plot
Observed quantiles against theoretical normal quantiles.

If the data is normal, the points will be close to the line.
Model assumption: Normality?
Assumption: the differences follow a normal distribution.
We can check the assumption by e.g. looking at the histogram or the QQ-plot.

But with large samples the assumption is not always necessary:
- The validity of the t-test and the confidence intervals only relies on the distribution of the average d̄ . . .
- and averages tend to be normal due to the CLT.

However: Normal regions (e.g. limits of agreement) require a normal distribution.
The central limit theorem (CLT)
Averages of rolls of dice are more normal than a single roll
[Figure: four histograms of the average of 1, 2, 10, and 50 dice rolls; as the number of rolls increases, the distribution of the average concentrates around 3.5 and becomes increasingly bell-shaped]
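The dice illustration is straightforward to reproduce by simulation; a sketch in Python (10 000 replications per panel, not part of the course's SAS material):

```python
import random
import statistics

random.seed(0)

def averages(k, reps=10000):
    """Simulate the average of k dice rolls, reps times."""
    return [statistics.mean(random.randint(1, 6) for _ in range(k))
            for _ in range(reps)]

for k in (1, 2, 10, 50):
    a = averages(k)
    print(f"{k:2d} rolls: mean = {statistics.mean(a):.2f}, "
          f"SD = {statistics.stdev(a):.3f}")
# The SD of the average shrinks like 1/sqrt(k):
# theoretical values 1.708, 1.208, 0.540, 0.242
```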
Classical two-sample (unpaired) comparison
If the two treatments were applied to separate groups of subjects, we have independent samples.

Traditional model assumptions:

x11, · · · , x1n1 ∼ N(µ1, σ²)
x21, · · · , x2n2 ∼ N(µ2, σ²)

- All observations are independent
- Observations follow a normal distribution within each group
- Both groups have the same variance, σ²
- The mean values, µ1 and µ2, may differ
Paired or unpaired comparison?
Note the consequences for the difference between MF and SV:
Estimated mean difference:
- 0.24, CI: (−2.93, 3.41) according to the paired t-test
- 0.24, CI: (−12.71, 13.19) according to the unpaired t-test

i.e. same estimate but a much wider confidence interval.
- The latter is wrong!

You have to respect your design.
- Do not forget to take advantage of a subject serving as its own control (higher power with fewer individuals).
Comparing measurement methods
When comparing two measurement methods:
- We have to determine the proper scale before carrying out the statistical analysis.

Is the precision of the measurements approximately the same over the entire range?
- In that case look at differences on an absolute scale
- Use the differences between the raw measurements

Or does the precision increase with the size of the quantity being measured?
- In that case look at differences on a relative scale
- Make a logarithmic transformation
Another comparison: REFE vs TEST
Two methods for determining concentration of glucose:
- REFE: Colour test, may be 'polluted' by uric acid
- TEST: Enzymatic test, more specific for glucose

Ref: R.G. Miller et al. (eds): Biostatistics Casebook. John Wiley & Sons, 1980.
nr.   REFE  TEST
  1    155   150
  2    160   155
  3    180   169
...    ...   ...
 44     94    88
 45    111   102
 46    210   188

average  144.1  134.2
SD        91.0   83.2
The usual analysis - the naive approach
Do we see a systematic difference?
Test 'δ = 0' assuming di = REFEi − TESTi ∼ N(δ, σd²)

d̄ = 9.89, sd = 9.70 ⇒ t = d̄/SEM = d̄/(sd/√n) = 6.92 ∼ t(45)

hence P < 0.0001, i.e. strong indication of bias.

Limits of agreement tell us that the typical differences are

9.89 ± t97.5%(45) · √(1 + 1/46) · 9.70 = (−9.85, 29.64)
Is this a valid analysis?!?
Plots of the raw data
Scatter plot and Bland-Altman plot:

The variance of the differences increases with the level, so the model assumptions of the usual analysis are violated!
Plots of the log-transformed data
Precision seems to be relative, hence we do a log-transformation.
- The plots look better, except for an outlier.
Close up
Following a logarithmic transformation (and omission of the outlier) the Bland-Altman plot looks OK.
Notes on the log-transformation
- It is the original measurements that have to be transformed with the logarithm, not the differences!
- Never make a logarithmic transformation on data that might be negative!
- It does not matter which logarithm you choose (i.e. which base of the logarithm) since they are all proportional.
- The procedure with construction of limits of agreement is now repeated for the transformed observations.
- The result can be transformed back to the original scale with the anti-logarithm (exp for the natural logarithm).
The correct analysis
Do we see a systematic difference?
Test 'δ = 0' assuming di = log(REFEi) − log(TESTi) ∼ N(δ, σd²)

d̄ = 0.066, sd = 0.042 ⇒ t = d̄/SEM = d̄/(sd/√n) = 10.66 ∼ t(45)

P < 0.0001, i.e. strong indication of bias.

Limits of agreement tell us that the typical differences are

0.066 ± t97.5%(45) · √(1 + 1/46) · 0.042 = (−0.020, 0.152)
. . . on Log-scale!
Back transformation
Limits of agreement on log-scale are (−0.020, 0.152), meaning that for 95% of the subjects we will have:

−0.020 < log(REFE) − log(TEST) < 0.152

i.e. −0.020 < log(REFE/TEST) < 0.152

Back transforming (using the exponential function):

0.980 = exp(−0.020) < REFE/TEST < exp(0.152) = 1.164

or reversed: 0.859 = 1/1.164 < TEST/REFE < 1/0.980 = 1.020

So TEST will typically lie 14% below to 2% above REFE.
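The back-transformation is a one-liner to verify; a quick check in Python (log-scale limits taken from the analysis on the previous slide):

```python
import math

lo_log, hi_log = -0.020, 0.152            # limits of agreement on log-scale

lo_ratio, hi_ratio = math.exp(lo_log), math.exp(hi_log)   # REFE/TEST
print(f"REFE/TEST in ({lo_ratio:.3f}, {hi_ratio:.3f})")
# REFE/TEST in (0.980, 1.164)

print(f"TEST/REFE in ({1 / hi_ratio:.3f}, {1 / lo_ratio:.3f})")
# TEST/REFE in (0.859, 1.020)
```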
Limits of agreement on the original scale
Non-normal data
If the normal distribution is not a good description:
- Tests and confidence intervals are valid if the sample is sufficiently large (due to the central limit theorem).
- To judge the reliability for a given sample:
  - Use resampling techniques
  - Or check with a statistician
- Normal regions and limits of agreement become untrustworthy!
Example: Fertility and aging
Cross-sectional study: 527 women aged 22–42.
Objective: How does fertility decline with age?
Outcomes: Physiological markers of fertility
- Menstrual cycle length
- Reproductive hormones (FSH, AMH, . . . )
- Ovarian volume
- Antral follicle count (AFC)
Simple linear regression for AFC
AFC = α + β · age + ε – is this a good model?
Log-linear regression
A more plausible model is exponential decay, implying a linear model on logarithmic scale: log(AFC) = α + β · age + ε
Regression with SAS
PROC GLM DATA=menopause;
  MODEL logafc = age / SOLUTION CLPARM;
RUN;
The GLM Procedure

R-Square  Coeff Var  Root MSE  logafc Mean
0.053070  21.53554   0.622772  2.891832

Source  DF  Type III SS  Mean Square  F Value  Pr > F
AGE      1  11.41154527  11.41154527    29.42  <.0001

Parameter  Estimate      Std.Error   t Value  Pr > |t|  95% Confidence Limits
Intercept  4.066684811   0.21828311    18.63  <.0001     3.637869196   4.495500427
AGE       -0.035958049   0.00662907    -5.42  <.0001    -0.048980815  -0.022935284
Note: We could have used PROC REG instead.
Regression equation and estimates
The estimates for the linear regression on logarithmic scale are:
Intercept α = 4.07 (95% CI 3.64 to 4.50)
- The "expected value for age = 0"!

Regression coefficient β = −0.036 (95% CI −0.049 to −0.023)
- The expected decrease in log(AFC) with one year of aging.
Rate of decline
We see exponential decay on the natural scale.
The expected AFC for age x (median or geometric mean) is
AFC(x) = exp(α+ βx)
- With one year of aging, x → x + 1
- AFC(x + 1) = exp(α + β(x + 1)) = exp(β) · AFC(x)
- Annual rate of change is the factor exp(β), corresponding to the decline {1 − exp(β)} · 100%.
- Estimated by exp(β) = 0.9646, i.e. a decline of 3.5%.
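A small numeric check of the back-transformed slope, done outside SAS (the coefficient is the AGE estimate from the regression output):

```python
import math

beta = -0.035958049           # slope on the log-scale (AGE estimate)
factor = math.exp(beta)       # annual multiplicative change in AFC
decline = (1 - factor) * 100  # annual percentage decline
print(f"factor = {factor:.4f}, decline = {decline:.1f}% per year")
# factor = 0.9647, decline = 3.5% per year
```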
Multiple regression
The regression could be biased by possible confounders:
- Use of oral contraceptives (yes, no)
- Smoking (current, former or never)
- Prenatal smoking exposure (yes, no)
- BMI (underweight, normal weight, overweight, obese)

Adjust for these in a multiple regression (general linear model):

Yi = α + β·Xi + β1·Xi,1 + . . . + βk·Xi,k + εi

with k additional covariates.
- Some of these are dummy variables coding for relevant groups.
SAS-program
PROC GLM DATA=menopause;
  CLASS oc smoking prenatsmoke bmigrp;
  MODEL logafc = oc smoking prenatsmoke bmigrp age / SOLUTION CLPARM;
  OUTPUT OUT=diagnostics p=fitted r=residual student=stres;
RUN;
The GLM Procedure
                      Sum of
Source           DF   Squares       Mean Square  F Value  Pr > F
Model             8    22.7349809   2.8418726       7.58  <.0001
Error           497   186.2490086   0.3747465
Corrected Total 505   208.9839894

R-Square  Coeff Var  Root MSE  logafc Mean
0.108788  21.15842   0.612165  2.893247
SAS-output
Source       DF  Type III SS  Mean Square  F Value  Pr > F
OC            1   8.38447592   8.38447592    22.37  <.0001
SMOKING       2   0.04472481   0.02236240     0.06  0.9421
PRENATSMOKE   1   1.74079772   1.74079772     4.65  0.0316
BMIGRP        3   0.68550681   0.22850227     0.61  0.6089
AGE           1  15.39698818  15.39698818    41.09  <.0001

                                        Standard
Parameter                 Estimate      Error       t Value  Pr > |t|
Intercept             4.007665017 B   0.29614093    13.53    <.0001
OC no                 0.313390980 B   0.06625480     4.73    <.0001
OC yes                0.000000000 B   .              .       .
SMOKING never        -0.023610470 B   0.07174410    -0.33    0.7422
SMOKING previous     -0.023529734 B   0.08113255    -0.29    0.7719
SMOKING smoker        0.000000000 B   .              .       .
PRENATSMOKE no-smoke  0.130881971 B   0.06072597     2.16    0.0316
PRENATSMOKE smoke     0.000000000 B   .              .       .
BMIGRP normal         0.153602199 B   0.18013313     0.85    0.3942
BMIGRP over25         0.084779529 B   0.19228883     0.44    0.6595
BMIGRP over30         0.050248838 B   0.21702144     0.23    0.8170
BMIGRP under18.5      0.000000000 B   .              .       .
AGE                  -0.047386837     0.00739279    -6.41    <.0001

Adjusted β = −0.047, i.e. a rate of decline of 4.6%.
SAS-output
Parameter              95% Confidence Limits
Intercept              3.425822534   4.589507500
OC no                  0.183216961   0.443565000
OC yes                 .             .
SMOKING never         -0.164569584   0.117348643
SMOKING previous      -0.182934802   0.135875335
SMOKING smoker         .             .
PRENATSMOKE no-smoke   0.011570705   0.250193237
PRENATSMOKE smoke      .             .
BMIGRP normal         -0.200314124   0.507518523
BMIGRP over25         -0.293019675   0.462578732
BMIGRP over30         -0.376143740   0.476641415
BMIGRP under18.5       .             .
AGE                   -0.061911820  -0.032861855
. . . with 95% confidence interval (−0.062, −0.033), corresponding to a decline between 3.2% and 6.0%.
Interpretation of regression coefficients
Simple regression: Y = α + β · age + ε
- β is the expected change in log(AFC) when age increases by one year.

Multiple regression: Y = α + β · age + β1·X1 + . . . + βk·Xk + ε
- β is the expected change in log(AFC) when age increases by one year and all other covariates are held fixed.

Similarly for the other covariates:
- e.g. exp(0.154) ≈ 1.166, i.e. 16.6% higher AFC for normal BMI compared to BMI < 18.5, all other covariates held fixed.
Hypothesis tests
Does AFC decline with age?
T-test for H0: β = 0:
- β = −0.0474, s.e.(β) = 0.0074, t = β/s.e.(β) = −6.41.
- P < 0.0001 in the t-distribution with 497 degrees of freedom.

Equivalent F-test:
- Mean Square(Age)/Mean Square(Error) = 41.09
- P < 0.0001 in the F-distribution with (1, 497) degrees of freedom.

Note: In case of a categorical covariate with more than two levels only the F-test is generally applicable.
Tests of type I and type III
Mind the difference!
Type I: Tests the effect of each covariate after adjustment for all other covariates above it on the list.
- Sequential tests, to be read bottom-up.

Type III: Tests the effect of each covariate after adjustment for all other covariates on the list.
- Non-sequential tests; pick the one that you like.
Predictions (fitted values)
log(AFC) = α + β · age + β1 · I(no prenatal smoking)
         + β2 · I(never smoker) + β3 · I(previous smoker)
         + β4 · I(normal BMI) + . . . + β6 · I(BMI > 30)
         + β7 · I(no use of oral contraceptives)

Expected log(AFC) of a 30 year old woman, never smoker, no prenatal smoking exposure, normal weight, non-user of oral contraceptives:

log(AFC) = 4.008 − 0.047·30 − 0.024 + 0.131 + 0.154 + 0.313 = 3.172

I.e. we expect an AFC of exp(3.172) ≈ 24.
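The fitted value is just a sum of the relevant coefficients; a sketch in Python (coefficients rounded as on the slide):

```python
import math

# Coefficients (rounded) from the multiple-regression output
log_afc = (4.008            # intercept
           - 0.047 * 30     # age 30
           - 0.024          # never smoker
           + 0.131          # no prenatal smoking exposure
           + 0.154          # normal BMI
           + 0.313)         # no use of oral contraceptives

print(f"log(AFC) = {log_afc:.3f}, AFC = {math.exp(log_afc):.1f}")
# log(AFC) = 3.172, AFC = 23.9 (i.e. roughly 24)
```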
Model assumptions
The general linear model assumes that:
1. The observations are independent
2. The linear model for the mean is correct
3. The error terms (εi's) are normally distributed with zero mean and equal variances
Use the residuals for model diagnostics:

Ri = Yi − Ŷi

- "Observed value − Predicted value"
- Standardized values are preferred for diagnostics (because of varying estimation uncertainty in the predicted values)
Residual plot
Should be fairly symmetric around zero and with no systematic patterns.
Residuals against covariates
Similar plot – looking for a non-linear relation with a covariate.
Checking normality: the QQ-plot
Example: Maternal age at menopause
Example: Maternal age at menopause
Does the decline in fertility depend on hereditary factors?

Three groups according to maternal age at menopause:
- Early: ≤ 45 years of age
- Normal: 46 to 54 years of age
- Late: > 55 years of age

We have a log-linear model for each group.
- Is the rate of decline the same in all three groups?
Analysis of covariance
Another name for a general linear model with one quantitative covariate and one categorical covariate.
- We have one regression line for each group.

Are the lines parallel?
- If not, we have an interaction between the two covariates.

Are the lines identical?
- If not, we have differences among the groups.
Example: Maternal age at menopause
In the late-group fertility seems to increase with age???
Estimating regression lines
Model: log(AFC)ij = αj + βj · ageij,  j = 1, 2, 3
- One set of regression parameters per group
- Re-set the intercept at age = 22 for interpretability

data menopause;
  set menopause;
  age22 = age - 22;
run;

proc glm data=menopause;
  class menogrp;
  model logafc = menogrp age22*menogrp / noint solution clparm;
run;
ANCOVA-output
The GLM Procedure
Dependent Variable: logafc
R-Square  Coeff Var  Root MSE  logafc Mean
0.082607  21.27821   0.615330  2.891832

                                     Standard
Parameter             Estimate       Error       t Value  Pr > |t|  95% Confidence Limits
MENOGRP early          3.328468294   0.20237671    16.45  <.0001     2.930893639   3.726042949
MENOGRP late           2.744304704   0.24071674    11.40  <.0001     2.271409991   3.217199417
MENOGRP normal         3.334785604   0.08572241    38.90  <.0001     3.166381562   3.503189646
AGE22*MENOGRP early   -0.052377545   0.01703328    -3.08  0.0022    -0.085839902  -0.018915188
AGE22*MENOGRP late     0.022007035   0.01998117     1.10  0.2712    -0.017246526   0.061260596
AGE22*MENOGRP normal  -0.042078492   0.00764074    -5.51  <.0001    -0.057088940  -0.027068044
The increasing rate in the late maternal menopause group is insignificant (P = 0.27).
Rates of decline
When the slopes are back-transformed, they become estimated rates of decline, with 95% confidence intervals:

Maternal menopause     Rate of change in AFC per year (95% CI)
Early (≤ 45 years)     −5.1% (−8.2% to −1.9%)
Normal (46–54 years)   −4.1% (−5.5% to −2.7%)
Late (> 55 years)      +2.2% (−1.7% to +6.3%)
Increasing rate in the late-group might as well be a chance finding.
Re-parametrisationSame model other parameters:
log(AFC)i = α+ β · age + δ1 · I (group=1) + δ2 · I (group=2)+γ1 · I (group=1) · age + γ2 · I (group=2) · age
I Group 3 is reference with regression parameters α and β.I δ’s and γ’s are differences in regression parameters wrt ref.I Allows for testing differences among the groups.
title1 'ANCOVA';
proc glm data=menopause;
  class menogrp;
  model logafc = menogrp age22 age22*menogrp / solution;
run;

71 / 80
ANCOVA-output
The GLM Procedure
R-Square    Coeff Var    Root MSE    logafc Mean
0.082607    21.27821     0.615330    2.891832
Source           DF   Type III SS   Mean Square   F Value   Pr > F
MENOGRP           2    2.05075422    1.02537711      2.71   0.0676
AGE22             1    2.65777726    2.65777726      7.02   0.0083
AGE22*MENOGRP     2    3.77690717    1.88845358      4.99   0.0072
                                          Standard
Parameter                 Estimate        Error        t Value  Pr > |t|
Intercept                3.334785604 B    0.08572241    38.90   <.0001
MENOGRP early           -0.006317310 B    0.21978322    -0.03   0.9771
MENOGRP late            -0.590480900 B    0.25552472    -2.31   0.0212
MENOGRP normal           0.000000000 B     .              .      .
AGE22                   -0.042078492 B    0.00764074    -5.51   <.0001
AGE22*MENOGRP early     -0.010299053 B    0.01866852    -0.55   0.5814
AGE22*MENOGRP late       0.064085527 B    0.02139224     3.00   0.0029
AGE22*MENOGRP normal     0.000000000 B     .              .      .
The regression coefficients (slopes) differ significantly between groups; the intercepts do not.

72 / 80
Missing data problem?
We have missing data . . .
I among younger women whose mothers are not yet menopausal
I i.e. missing not at random
I data from some of the potentially most fertile women tend to be missing

This may cause bias
I particularly in the late group.
73 / 80
Assuming identical intercepts
Leave out the main effect of menogrp.
title1 'ANCOVA with same intercept at age 22';
proc glm data=menopause;
  class menogrp;
  model logafc = age22 age22*menogrp / solution clparm;
run;
Output:
Source           DF   Type I SS     Mean Square   F Value   Pr > F
AGE22             1   11.41154527   11.41154527     29.94   <.0001
AGE22*MENOGRP     2    4.30076782    2.15038391      5.64   0.0038
The rates of decline still differ significantly between groups (P=0.004).
74 / 80
A prettier picture
75 / 80
Estimated rates of decline
. . . when assuming identical intercepts (at age 22).
Estimated rates of decline with 95%-confidence intervals:
Maternal menopause     Rate of decline in AFC per year (95% CI)
Early (≤ 45 years)     4.7% (3.1% to 6.3%)
Normal (46-54 years)   3.7% (2.3% to 4.9%)
Late (> 55 years)      2.0% (0.4% to 3.6%)
76 / 80
Summary statistics
Numerical description of quantitative variables:
Location, center
I average (mean value), x̄ = (x1 + · · · + xn)/n
I median (middle observation, 50% above and 50% below)

Variation
I variance, s² = Σ(xi − x̄)²/(n − 1) (quadratic units)
I standard deviation, s = √variance (same units as the outcome)
I quantiles, e.g. the Inter Quartile Range (25% to 75% quantile)
I standard error, SE = s/√n (uncertainty of the mean estimate)
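These summary statistics can be computed in any environment; a Python sketch using only the standard library (the course itself uses SAS), on a small made-up sample:

```python
import math
import statistics

x = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up illustrative sample
n = len(x)

mean = sum(x) / n                                    # average
median = statistics.median(x)                        # middle observation
var = sum((xi - mean) ** 2 for xi in x) / (n - 1)    # sample variance
sd = math.sqrt(var)                                  # standard deviation
se = sd / math.sqrt(n)                               # standard error of the mean
q1, q2, q3 = statistics.quantiles(x, n=4)            # 25%, 50%, 75% quantiles
iqr = q3 - q1                                        # inter quartile range

print(mean, median, round(var, 3), round(sd, 3), round(se, 3), iqr)
# → 3.0 3.0 2.5 1.581 0.707 3.0
```
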
77 / 80
The summary statistics for ’MF vs SV’ are made using the code:
Note: the data is read in from the file ’mf_sv.txt’(text file with two columns and 21 observations)
DATA mydata;
  INFILE 'mf_sv.txt' FIRSTOBS=2;
  INPUT mf sv;
  dif = mf - sv;
  average = (mf + sv)/2;
RUN;

PROC MEANS DATA=mydata MEAN STD;
RUN;
78 / 80
The pictures for ’MF vs SV’ are made using the code:
proc gplot;
  plot mf*sv / haxis=axis1 vaxis=axis2 frame;
  axis1 value=(H=2) minor=NONE label=(H=2);
  axis2 value=(H=2) minor=NONE label=(A=90 R=0 H=2);
  symbol1 v=circle i=none c=BLACK l=1 w=2;
run;

proc gplot;
  plot flow*method=subject / nolegend haxis=axis1 vaxis=axis2 frame;
  axis1 value=(H=2) minor=NONE label=(H=2);
  axis2 value=(H=2) minor=NONE label=(A=90 R=0 H=2);
  symbol1 v=circle i=join l=1 w=2 r=21;
run;
79 / 80
proc gplot;
  plot dif*average / vref=0 lv=1 vref=0.24 15.5 -15.0 lv=2
                     haxis=axis1 vaxis=axis2 frame;
  axis1 value=(H=2) minor=NONE label=(H=2 'average');
  axis2 order=(-16 to 16 by 4) value=(H=2) minor=NONE
        label=(A=90 R=0 H=2 'difference MF-SV');
  symbol1 v=circle i=none l=1 w=2;
  title h=3 'Bland Altman plot';
run;

title;
proc gchart;
  vbar dif;
run;
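The reference lines in the Bland-Altman plot (the bias 0.24 and limits 15.5 and -15.0 in the vref= option) are the mean difference ± 1.96 standard deviations of the differences. A Python sketch of that computation on made-up paired data (hypothetical values, not the actual MF/SV measurements):

```python
import math

# Hypothetical paired measurements from two methods (not the real MF/SV data)
mf = [47.0, 52.0, 61.0, 58.0, 49.0, 55.0]
sv = [45.0, 55.0, 58.0, 60.0, 47.0, 54.0]

dif = [m - s for m, s in zip(mf, sv)]        # method difference
avg = [(m + s) / 2 for m, s in zip(mf, sv)]  # method average (x-axis)

n = len(dif)
bias = sum(dif) / n                          # mean difference
sd = math.sqrt(sum((d - bias) ** 2 for d in dif) / (n - 1))
upper = bias + 1.96 * sd                     # upper limit of agreement
lower = bias - 1.96 * sd                     # lower limit of agreement

print(f"bias={bias:.2f}, limits of agreement: {lower:.2f} to {upper:.2f}")
# → bias=0.50, limits of agreement: -4.26 to 5.26
```
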
80 / 80