Rerandomization in Randomized Experiments
Kari Lock and Don Rubin
Harvard University
JSM 2010
The “Gold Standard”
Why are randomized experiments so good?
• They yield unbiased estimates of the treatment effect
• They eliminate (?) confounding factors… ON AVERAGE
For any particular experiment, covariate imbalance is possible (and likely)
Rerandomization
• Suppose you are doing a randomized experiment and have covariate information available before conducting the experiment
• You randomize to treatment and control, but get a “bad” randomization
• Can you rerandomize?
• Yes, but you first need to specify a concrete definition of “bad”
1) Collect covariate data. Specify a criterion, based on covariate balance, determining when a randomization is unacceptable
2) (Re)randomize subjects to treated and control, then check covariate balance. If unacceptable, rerandomize; if acceptable, continue
3) Conduct experiment
4) Analyze results with a Fisher randomization test
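The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the data, and the balance criterion (absolute difference in mean age of at most 0.5) are all hypothetical. The sketch also assumes that the randomization test draws its reference distribution only from assignments that pass the same criterion.

```python
import random
from statistics import mean

def draw_assignment(n, rng):
    """Pure randomization: n // 2 treated (1) and the rest control (0)."""
    labels = [1] * (n // 2) + [0] * (n - n // 2)
    rng.shuffle(labels)
    return labels

def mean_diff(values, labels):
    """Treated-minus-control difference in means."""
    treated = [v for v, w in zip(values, labels) if w == 1]
    control = [v for v, w in zip(values, labels) if w == 0]
    return mean(treated) - mean(control)

def rerandomize(n, acceptable, rng, max_tries=100_000):
    """Step 2: redraw until the pre-specified criterion accepts the assignment."""
    for _ in range(max_tries):
        labels = draw_assignment(n, rng)
        if acceptable(labels):
            return labels
    raise RuntimeError("no acceptable randomization found")

def randomization_p_value(outcome, observed_labels, acceptable, rng, draws=1000):
    """Step 4: two-sided randomization test under the sharp null of no effect.

    Assumption: the reference distribution is built only from assignments
    that pass the same acceptance criterion.
    """
    observed = abs(mean_diff(outcome, observed_labels))
    hits = 0
    for _ in range(draws):
        labels = rerandomize(len(outcome), acceptable, rng)
        if abs(mean_diff(outcome, labels)) >= observed:
            hits += 1
    return hits / draws

# Hypothetical data: one covariate (age) and an outcome with no treatment effect.
rng = random.Random(0)
age = [rng.gauss(20, 2) for _ in range(20)]
outcome = [rng.gauss(0, 1) for _ in range(20)]

# Step 1: criterion fixed in advance and symmetric in the group labels.
def acceptable(labels):
    return abs(mean_diff(age, labels)) <= 0.5

labels = rerandomize(len(age), acceptable, rng)
p_value = randomization_p_value(outcome, labels, acceptable, rng)
```

Because the criterion uses an absolute difference, swapping the treated and control labels never changes whether an assignment is accepted.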
Unbiased
To maintain an unbiased estimate of the treatment effect, the decision to rerandomize or not must be automatic, specified in advance, and blind to which group is treated
Theorem: If the treated and control groups are the same size, and if for every unacceptable randomization the exact opposite randomization is also unacceptable, then rerandomization yields an unbiased estimate of the treatment effect.
Mahalanobis Distance
Define overall covariate distance by $M = D' r^{-1} D$
Under adequate sample sizes and pure randomization: $M \sim \chi^2_k$
$D_j$: standardized difference between treated and control covariate means for covariate $j$
$k$ = number of covariates
$D = (D_1, \ldots, D_k)$
$r$ = covariate correlation matrix = $\mathrm{cov}(D)$
Choose a and rerandomize when M > a
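A minimal sketch of computing M for one candidate assignment, using only the Python standard library. It works from the raw differences in covariate means and their covariance under pure randomization, S(1/n_T + 1/n_C), where S is the sample covariance matrix of the covariates. Since M is affinely invariant, this should give the same value as the standardized form. The function names and the data are hypothetical.

```python
from statistics import mean

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes a is nonsingular (no perfectly collinear covariates)."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def mahalanobis_m(rows, labels):
    """M for covariate rows (one list per subject) and 0/1 treatment labels."""
    n, k = len(rows), len(rows[0])
    n_t = sum(labels)
    n_c = n - n_t
    cols = [[row[j] for row in rows] for j in range(k)]
    centered = []
    for col in cols:
        mu = mean(col)
        centered.append([x - mu for x in col])
    # Covariance of the difference in means under pure randomization.
    scale = 1.0 / n_t + 1.0 / n_c
    cov_diff = [[scale * sum(p * q for p, q in zip(ci, cj)) / (n - 1)
                 for cj in centered] for ci in centered]
    diff = []
    for col in cols:
        treated = [x for x, w in zip(col, labels) if w == 1]
        control = [x for x, w in zip(col, labels) if w == 0]
        diff.append(mean(treated) - mean(control))
    # M = diff' cov_diff^{-1} diff, computed with a linear solve.
    return sum(d * z for d, z in zip(diff, solve(cov_diff, diff)))

# Hypothetical data: six subjects, two covariates.
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0], [5.0, 7.0], [6.0, 2.0]]
labels = [1, 0, 1, 0, 1, 0]
m = mahalanobis_m(rows, labels)
```

An assignment would then be rejected whenever `m` exceeds the chosen threshold a.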
Rerandomization Based on M
• Since M follows a known distribution, it is easy to specify the proportion of rejected randomizations
• M is affinely invariant
• Correlations between covariates are maintained
• The variance reduction on each covariate is the same (and known)
• The variance reduction for any linear combination of the covariates is known
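As an illustration of the first point, the threshold a can be picked so that a desired proportion of randomizations is accepted. The sketch below assumes M follows a chi-square distribution with k degrees of freedom and uses only the standard library. The function names and the 10% acceptance rate are hypothetical choices, not values from the slides.

```python
import math

def chi2_cdf(x, k, terms=500):
    """P(chi-square with k df <= x), via the power series for the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    s, h = k / 2.0, x / 2.0
    term = 1.0 / s
    total = term
    for n in range(1, terms):
        term *= h / (s + n)
        total += term
    return math.exp(s * math.log(h) - h - math.lgamma(s)) * total

def threshold_for_acceptance(p_accept, k, tol=1e-10):
    """Find a with P(M <= a) = p_accept by bisection, assuming M ~ chi-square_k."""
    if not 0.0 < p_accept < 1.0:
        raise ValueError("p_accept must be strictly between 0 and 1")
    lo, hi = 0.0, float(k)
    while chi2_cdf(hi, k) < p_accept:  # widen the bracket until it contains a
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, k) < p_accept:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical choice: accept about 10% of randomizations with k = 8 covariates.
a = threshold_for_acceptance(0.10, 8)
```

A smaller acceptance proportion gives tighter balance but requires more redraws on average, roughly 1 / p_accept per accepted randomization.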
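Rerandomization
Theorem: If $n_T = n_C$ and rerandomization occurs when $M > a$, then
Difference in Covariate Means:
$E(\bar{X}_T - \bar{X}_C \mid M \le a) = 0$, $\mathrm{cov}(\bar{X}_T - \bar{X}_C \mid M \le a) = v_a \, \mathrm{cov}(\bar{X}_T - \bar{X}_C)$
and
Difference in Outcome Means:
$E(\bar{Y}_T - \bar{Y}_C \mid M \le a) = 0$, $\mathrm{var}(\bar{Y}_T - \bar{Y}_C \mid M \le a) = \left(1 - (1 - v_a) R^2\right) \mathrm{var}(\bar{Y}_T - \bar{Y}_C)$
where $v_a = \frac{2}{k} \cdot \frac{\gamma(k/2 + 1,\, a/2)}{\gamma(k/2,\, a/2)}$, and $\gamma$ is the incomplete gamma function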
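A small sketch of evaluating the variance-reduction factor v_a and the resulting outcome variance ratio, using only the standard library. The function names are hypothetical. R² is taken here to be the squared multiple correlation between the outcome and the covariates, which the slide does not spell out. The value R² = 0.5 in the example is not stated in the source; it is the value implied by the slide's reported v_a = .16 and theoretical outcome ratio of .58.

```python
import math

def lower_incomplete_gamma(s, x, terms=500):
    """Lower incomplete gamma function gamma(s, x) via its power series."""
    term = 1.0 / s
    total = term
    for n in range(1, terms):
        term *= x / (s + n)
        total += term
    return (x ** s) * math.exp(-x) * total

def v_a(k, a):
    """Covariate variance ratio when randomizations with M > a are rejected."""
    numerator = lower_incomplete_gamma(k / 2.0 + 1.0, a / 2.0)
    denominator = lower_incomplete_gamma(k / 2.0, a / 2.0)
    return (2.0 / k) * numerator / denominator

def outcome_variance_ratio(v, r_squared):
    """Outcome variance ratio 1 - (1 - v_a) R^2 from the theorem."""
    return 1.0 - (1.0 - v) * r_squared

# Hypothetical threshold a = 2.0 with k = 8 covariates.
example_v = v_a(8, 2.0)

# R^2 = 0.5 is inferred from the slide's figures, not given in the source.
implied_ratio = outcome_variance_ratio(0.16, 0.5)
```

The factor v_a always lies between 0 and 1 and shrinks as the threshold a is lowered, so stricter acceptance rules give larger variance reductions.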
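Figure: Standardized Differences in Covariate Means, Pure Randomization vs. Re-Randomization (horizontal axis from -4 to 4)
Covariates: male, age, collgpaa, actcomp, preflit, likelit, likemath, numbmath
Values of $\mathrm{var}(\bar{X}_{j,T} - \bar{X}_{j,C} \mid M \le a) / \mathrm{var}(\bar{X}_{j,T} - \bar{X}_{j,C})$ across the eight covariates: 0.14, 0.15, 0.17, 0.14, 0.16, 0.16, 0.16, 0.15
(theoretical $v_a$ = .16)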
Figure: Difference in Outcome Means Under Null, Pure Randomization vs. Re-Randomization
Panels: Math, Verbal (horizontal axis: Difference in Means, from -1.0 to 1.0)
$\mathrm{var}(\bar{Y}_T - \bar{Y}_C \mid M \le a) / \mathrm{var}(\bar{Y}_T - \bar{Y}_C) = .57$ (theory = .58)
Equivalent to increasing the sample size by a factor of 1.7
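The sample-size statement can be checked with a short calculation. With equal group sizes, the variance of a difference in means is inversely proportional to the total sample size, so a variance ratio translates directly into an effective sample-size factor. The symbol $n_{\text{eff}}$ is introduced here for illustration only; using the theoretical ratio of .58:

```latex
\[
\frac{n_{\text{eff}}}{n}
  = \frac{\mathrm{var}(\bar{Y}_T - \bar{Y}_C)}{\mathrm{var}(\bar{Y}_T - \bar{Y}_C \mid M \le a)}
  = \frac{1}{0.58} \approx 1.7
\]
```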
Conclusion
• Rerandomization improves covariate balance between the treated and control groups, and increases precision in estimating the treatment effect if the covariates are correlated with the response
• Rerandomization gives the researcher more power to detect a significant result, and more faith that an observed effect is really due to the treatment