naihua duan ucla and rand may 2000 selection bias in treatment assignment/delivery

May 24, 2000 NIDA/NIMH Substance Abuse Conference

1

Research Designs, Statistical Strategies for Dealing with Selection Bias in Treatment Delivery, and Limitations

Naihua Duan

UCLA and RAND

May 2000

Selection bias in treatment assignment/delivery

Research designs

Mitigating for overt selection bias

Dealing with hidden selection bias

Discussions


2

Selection Bias in Treatment Delivery

In naturalistic settings:

Pre-treatment health → treatment delivered

Pre-treatment health → outcome

Treated group dissimilar from untreated group

Direct comparison of treated vs. untreated results in a biased estimate of the treatment effect

Need to mitigate selection bias in order to assess treatment effect more appropriately
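A small simulation can make the bias concrete. The sketch below is illustrative only: the data-generating model, the selection probabilities (0.8 and 0.2), and the function name `naive_difference` are assumptions, not part of the talk. Healthier patients are more likely to be treated and also have better outcomes, so the raw treated-vs-untreated difference overstates the true effect.

```python
import random
import statistics

def naive_difference(n=20000, true_effect=1.0, seed=0):
    """Treated-minus-untreated mean outcome when healthier patients select into treatment."""
    rng = random.Random(seed)
    treated, untreated = [], []
    for _ in range(n):
        health = rng.gauss(0.0, 1.0)            # pre-treatment health (hypothetical scale)
        p_treat = 0.8 if health > 0 else 0.2    # assumed selection: healthier -> treated more often
        t = 1 if rng.random() < p_treat else 0
        # Outcome depends on pre-treatment health and on treatment.
        y = 2.0 * health + true_effect * t + rng.gauss(0.0, 1.0)
        (treated if t else untreated).append(y)
    return statistics.mean(treated) - statistics.mean(untreated)

naive = naive_difference()
print(f"true effect = 1.00, naive difference = {naive:.2f}")
```

Under these assumptions the naive difference comes out well above the true effect of 1.0, because it also absorbs the difference in pre-treatment health between the two groups.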


3

Selection Bias in Treatment Delivery: Typology

Overt selection bias

Treatment related to covariates

$T \not\perp X$

Given covariates, treatment independent of outcome

$T \perp Y \mid X$ (ignorability)

Like a stratified randomized experiment

Hidden selection bias

Given covariates, treatment still related to outcome

$T \not\perp Y \mid X$

Rosenbaum (1995) Observational Studies, Springer-Verlag


5

Research Designs

Ideal randomized clinical trial (RCT)

Imperfect RCT with noncompliance

Randomized encouragement design (RED)

Observational studies

Settings: controlled vs. naturalistic

Treatment assignment/delivery: mandated vs. choice

Treated vs. untreated groups: balance vs. imbalance

Research questions: efficacy vs. adoption, program effect, and efficacy

Analytic strength: internal validity vs. external validity


6

Randomized Clinical Trial

[Flow diagram: Recruit, consent, enroll → Randomize → Assign to Tx or Assign to Control]

Assign to Tx: adoption = compliance; not adopted = non-compliance

Assign to Control: adoption = non-compliance; not adopted = compliance

Intensive efforts made to mandate assignment


7

Randomized Encouragement Design

[Flow diagram: Recruit, consent, enroll → Randomize → Encourage Tx or No encouragement]

Encourage Tx: adoption = compliance; not adopted = non-compliance

No encouragement: adoption = non-compliance; not adopted = compliance

Encouragement: training, providing information, case management, reducing barriers (child care, transportation, flexible hours, reducing co-payment…), decorating the waiting room, ...


8

Randomized Encouragement Design: Features

Analogous to marketing experiment

Encouragement → higher adoption rate?

→ better overall outcomes? → better outcomes for new users?

Naturalistic, incorporate user preferences, facilitate choice

Broader participation, external validity, dissemination

Zelen (1979 NEJM, 1990 Stat. in Medicine: randomized consent design), Holland (1988) in Clogg CC, ed. Sociological Methodology, Hirano et al. (2000, Biostatistics), Wells et al. (2000, JAMA), Duan et al. (2000, manuscript)


9

Mitigating Overt Selection Bias

Assume overt selection bias: $T \not\perp X$

Assume no hidden selection bias: $T \perp Y \mid X$

Covariate adjustment through ANCOVA

Stratification (through propensity score method)

Matching (through propensity score method)


10

Covariate Adjustment

$Y = \alpha + \beta T + \gamma X \; (+ \; \delta \, T X) + \varepsilon$

Extrapolation can be risky when imbalance is substantial

[Figure: Y versus X (pre-Tx health), for T = 1 and T = 0]
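As a rough illustration of covariate adjustment with a single covariate, the sketch below computes the treatment coefficient of the additive model (outcome regressed on treatment and one covariate, common slope, no interaction) in closed form: the raw difference in mean outcomes minus the pooled within-group slope times the difference in covariate means. The simulated data, the logistic selection rule, and the name `ancova_effect` are assumptions made for the example.

```python
import math
import random

def ancova_effect(x, t, y):
    """OLS coefficient on T in the model Y = a + b*T + c*X (common slope)."""
    groups = {0: ([], []), 1: ([], [])}
    for xi, ti, yi in zip(x, t, y):
        groups[ti][0].append(xi)
        groups[ti][1].append(yi)
    means = {}
    sxx = sxy = 0.0
    for g, (xs, ys) in groups.items():
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        means[g] = (mx, my)
        sxx += sum((xi - mx) ** 2 for xi in xs)
        sxy += sum((xi - mx) * (yi - my) for xi, yi in zip(xs, ys))
    slope = sxy / sxx                                  # pooled within-group slope on X
    raw_diff = means[1][1] - means[0][1]               # unadjusted difference in Y
    return raw_diff - slope * (means[1][0] - means[0][0])

# Hypothetical data: selection on X only (overt bias), true effect 1.0.
rng = random.Random(1)
x, t, y = [], [], []
for _ in range(20000):
    xi = rng.gauss(0.0, 1.0)
    ti = 1 if rng.random() < 1.0 / (1.0 + math.exp(-xi)) else 0
    x.append(xi)
    t.append(ti)
    y.append(2.0 * xi + 1.0 * ti + rng.gauss(0.0, 1.0))

estimate = ancova_effect(x, t, y)
print(f"ANCOVA-adjusted effect = {estimate:.2f}")
```

The adjustment works here because the simulated outcome really is linear in the covariate; when the model is wrong and the groups overlap poorly, the same formula extrapolates.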


11

Limitations for Covariate Adjustment

Extrapolation can be risky when imbalance is substantial

Compare apples and oranges, rely on model to adjust

Careful model diagnosis is essential

Multivariate imbalance might be more problematic

Why so popular?

Ease of push-button analysis

Almost always gives an answer

Could be a bad answer!


12

Stratification When Covariate Is Univariate

Stratify, then compare by stratum

Compare apples and apples, oranges and oranges

[Figure: Y versus X (pre-Tx health), for T = 1 and T = 0]


13

Stratification: Procedure

Stratify, then compare treated vs. untreated by stratum

Two-sample comparison within each stratum

ANCOVA within each stratum

Assess interactions across strata

Synthesize treatment effects across strata

Weighted average

Overall intervention effect on treated

Overall intervention effect on untreated

Overall intervention effect on entire pool

Can be specified as ANCOVA with interactions

Nonparametric regression of Y on X, stratified by T
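The procedure above can be sketched for a single covariate as follows. This is a minimal illustration, not the speaker's implementation: it cuts the sample into quantile strata, takes the two-sample difference within each stratum, and averages the differences weighted by stratum size (the effect on the entire pool). Strata lacking treated or untreated units are reported rather than silently filled in. The simulated data and the name `stratified_effect` are assumptions.

```python
import math
import random

def stratified_effect(x, t, y, n_strata=5):
    """Quantile-stratify on x, compare treated vs. untreated within strata,
    and return (size-weighted average difference, list of unusable strata)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    n = len(order)
    weighted_sum, total_weight, skipped = 0.0, 0, []
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [y[i] for i in idx if t[i] == 1]
        y0 = [y[i] for i in idx if t[i] == 0]
        if not y1 or not y0:
            skipped.append(s)          # imbalance too severe to compare in this stratum
            continue
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        weighted_sum += len(idx) * diff
        total_weight += len(idx)
    if total_weight == 0:
        raise ValueError("no stratum contains both treated and untreated units")
    return weighted_sum / total_weight, skipped

# Hypothetical data: selection on x only, true effect 1.0.
rng = random.Random(1)
x, t, y = [], [], []
for _ in range(20000):
    xi = rng.gauss(0.0, 1.0)
    ti = 1 if rng.random() < 1.0 / (1.0 + math.exp(-xi)) else 0
    x.append(xi)
    t.append(ti)
    y.append(2.0 * xi + 1.0 * ti + rng.gauss(0.0, 1.0))

estimate, skipped = stratified_effect(x, t, y)
print(f"stratified effect = {estimate:.2f}, unusable strata = {skipped}")
```

With five strata some within-stratum imbalance remains, so the estimate is typically closer to the true effect than the raw comparison but not exactly on it; weighting by the number of treated (or untreated) units per stratum instead would target the effect on the treated (or untreated).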


14

Covariate Adjustment, Nonparametric Version

OK for low dimension X

Curse of dimensionality for high dimension X

[Figure: Y versus X (pre-Tx health), for T = 1 and T = 0]


15

Stratification: Features

Why not used as widely as ANCOVA?

Does not always give an answer

Provides warning where imbalance is too severe

Not a push-button operation, but not difficult

How to stratify?

Clinical judgement

Usually not critical; sensitivity analysis recommended

Cochran-Rubin-Rosenbaum recommend 5 strata

How to stratify with multi-dimensional covariates?

Curse of dimensionality

Use propensity score method to reduce dimensionality


16

Propensity Score Method

Assume overt selection bias, no hidden selection bias

$T \perp Y \mid X$

$\pi = \pi(X) = P(T = 1 \mid X)$ is the propensity score

Example: $\mathrm{logit}(\pi(X)) = \alpha + \beta X$

$\pi(X)$ is a balancing score (most parsimonious)

$T \perp X \mid \pi(X)$

Given $\pi(X)$, treatment independent of outcome

$T \perp Y \mid \pi(X)$

Need only stratify by propensity score

Other dimensions of X can be neglected in assessing treatment effect


17

Propensity Score Method: Procedure

Estimate $\pi(X) = P(T = 1 \mid X)$

Logistic regression of T on X

Stratify sample (X, T, and Y) by estimated $\pi(X)$ or by the fitted linear predictor $\hat\beta X$

Sort out apples and oranges

Analyze each stratum, compare treated vs. untreated

Two sample comparison within stratum

ANCOVA within stratum

Assess interactions across strata

Synthesize treatment effects across strata

Weighted average...
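The steps above can be sketched end to end in plain Python. Everything specific here is an assumption for illustration: the two simulated covariates, the coefficients, the plain gradient-ascent fit (a statistics package would normally do this step), and the helper names `fit_logistic`, `propensity`, and `stratified_effect`.

```python
import math
import random

def fit_logistic(X, t, steps=200, lr=1.0):
    """Fit logit P(T=1|X) = b0 + b.X by gradient ascent on the mean log-likelihood."""
    beta = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for row, ti in zip(X, t):
            z = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
            resid = ti - 1.0 / (1.0 + math.exp(-z))
            grad[0] += resid
            for j, v in enumerate(row):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, row):
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

def stratified_effect(score, t, y, n_strata=5):
    """Size-weighted average of within-stratum differences, strata = score quantiles."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    n = len(order)
    num = den = 0.0
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [y[i] for i in idx if t[i] == 1]
        y0 = [y[i] for i in idx if t[i] == 0]
        if y1 and y0:                      # skip strata with no comparison group
            num += len(idx) * (sum(y1) / len(y1) - sum(y0) / len(y0))
            den += len(idx)
    return num / den

# Hypothetical data: two covariates drive both treatment and outcome; true effect 1.0.
rng = random.Random(2)
X, t, y = [], [], []
for _ in range(4000):
    x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    p = 1.0 / (1.0 + math.exp(-(0.8 * x1 - 0.5 * x2)))
    ti = 1 if rng.random() < p else 0
    X.append((x1, x2))
    t.append(ti)
    y.append(1.0 * ti + 1.5 * x1 + 1.0 * x2 + rng.gauss(0.0, 1.0))

beta = fit_logistic(X, t)
scores = [propensity(beta, row) for row in X]
estimate = stratified_effect(scores, t, y)
y1 = [yi for yi, ti in zip(y, t) if ti == 1]
y0 = [yi for yi, ti in zip(y, t) if ti == 0]
naive = sum(y1) / len(y1) - sum(y0) / len(y0)
print(f"naive = {naive:.2f}, propensity-stratified = {estimate:.2f}")
```

Only the one-dimensional score is used to form strata, even though two covariates drive selection, which is the dimension reduction the method offers.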


19

Propensity Score Method: Stratification for Y

Stratify, then compare by stratum

Compare apples and apples, oranges and oranges

[Figure: Y versus X, for T = 0 and T = 1]


20

Propensity Score Method: Model Specification

Specification of propensity score model

Lean towards over-fitting vs. under-fitting?

Model diagnosis: are the covariates balanced across treatment groups within each stratum?

Stratify by propensity score and key covariates (one or two)?

Model misspecification less serious than ANCOVA?

Only rank of estimated propensity score is used

Stratification not sensitive to minor perturbations in model

Limited empirical evidence (Drake 1993 Biometrics, Dehejia and Wahba 1999 JASA)
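One way to carry out the balance diagnosis mentioned above is to compute a standardized mean difference for each covariate, overall and within each propensity stratum. The sketch below is a hedged illustration: the standardized difference is one common balance measure rather than one named in the talk, and the simulated data use the true propensity as a stand-in for an estimated score. The names `std_diff` and `balance_by_stratum` are hypothetical.

```python
import math
import random
import statistics

def std_diff(values, t):
    """Standardized mean difference of a covariate, treated minus untreated."""
    v1 = [v for v, ti in zip(values, t) if ti == 1]
    v0 = [v for v, ti in zip(values, t) if ti == 0]
    if len(v1) < 2 or len(v0) < 2:
        return None                              # too few units to assess balance
    pooled_sd = math.sqrt((statistics.variance(v1) + statistics.variance(v0)) / 2)
    return (statistics.mean(v1) - statistics.mean(v0)) / pooled_sd

def balance_by_stratum(score, t, covariate, n_strata=5):
    """Standardized difference of one covariate inside each score-quantile stratum."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    n = len(order)
    result = []
    for s in range(n_strata):
        idx = order[s * n // n_strata:(s + 1) * n // n_strata]
        result.append(std_diff([covariate[i] for i in idx], [t[i] for i in idx]))
    return result

# Hypothetical data; the true propensity stands in for an estimated one.
rng = random.Random(5)
x1, x2, t, score = [], [], [], []
for _ in range(10000):
    a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    p = 1.0 / (1.0 + math.exp(-(0.8 * a - 0.5 * b)))
    x1.append(a)
    x2.append(b)
    score.append(p)
    t.append(1 if rng.random() < p else 0)

overall = std_diff(x1, t)
within = balance_by_stratum(score, t, x1)
print(f"overall: {overall:.2f}")
print("within strata:", [None if w is None else round(w, 2) for w in within])
```

Large within-stratum differences on an important covariate would suggest refining the propensity model or stratifying on that covariate as well.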


21

Propensity Score Method: Options

Stratification

Matching (case-control)

Curse of dimensionality relevant, less critical

Mahalanobis distance matching

Match on propensity score (+ a few key covariates?)

Design stage vs. analysis stage

Primary vs. secondary data collection

ANCOVA: regress Y on T and propensity score (+ a few key covariates? + interactions?)

Nonparametric regression? Stratified by T?
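As an illustration of the matching option, the sketch below does greedy 1:1 nearest-neighbour matching on the propensity score, without replacement and with a caliper. This particular algorithm, the caliper value, and the name `greedy_match` are assumptions chosen for simplicity; the talk does not prescribe a matching algorithm. The simulated data again use the true propensity as a stand-in for an estimated score.

```python
import math
import random

def greedy_match(score, t, caliper=0.02):
    """Match each treated unit to the nearest unused control within the caliper."""
    treated = [i for i, ti in enumerate(t) if ti == 1]
    available = {i for i, ti in enumerate(t) if ti == 0}
    pairs = []
    for i in treated:
        if not available:
            break
        j = min(available, key=lambda c: abs(score[c] - score[i]))
        if abs(score[j] - score[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)        # without replacement
        # treated units with no control inside the caliper stay unmatched
    return pairs

# Hypothetical data: about 30% treated, selection on x, true effect 1.0.
rng = random.Random(3)
t, y, score = [], [], []
for _ in range(2000):
    x = rng.gauss(0.0, 1.0)
    p = 1.0 / (1.0 + math.exp(-(-1.0 + 0.8 * x)))
    ti = 1 if rng.random() < p else 0
    t.append(ti)
    score.append(p)
    y.append(x + 1.0 * ti + rng.gauss(0.0, 1.0))

pairs = greedy_match(score, t)
att = sum(y[i] - y[j] for i, j in pairs) / len(pairs)
print(f"{len(pairs)} matched pairs, estimated effect on matched treated = {att:.2f}")
```

Treated units left unmatched are a signal of the same severe imbalance that stratification flags, and the estimate then applies only to the matched treated units.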


22

Dimension Reduction

Fundamental challenge in ANCOVA

Valid assessment of treatment effect can be obtained using nonparametric regression of Y on X, stratified by T

Curse of dimensionality

No obvious way to reduce dimensionality?

Propensity score method is an elegant way to reduce dimensionality

Alternative dimension reduction methods?

Slicing regression (Duan and Li 1991 Annals of Statistics, Li 1991 JASA): use inverse regression of X on Y...


23

Propensity Score Method: References

Rosenbaum and Rubin (1983 Biometrika, 1984 JASA)

Lavori, Dawson, and Mueller (1994 Stat. in Medicine)

Rosenbaum (1995) Observational Studies, Springer-Verlag

Rubin (1997) Annals of Internal Medicine

D’Agostino (1998 Stat. in Medicine)

Normand et al. (2000 manuscript)

Hirano et al. (2000 manuscript)


24

Dealing with Hidden Selection Bias

$T \not\perp Y \mid X$

Very challenging problem, no easy solutions

Given X, how does treatment depend on outcome?

Overt selection bias can be made to look like stratified randomized experiment

Hidden selection bias cannot be made to…

Rosenbaum-Rubin’s sensitivity analysis

Instrumental variable analysis a la Rubin Causal Model

Selection modeling


25

Rosenbaum’s Sensitivity Analysis: General Principle

How robust is the observed treatment effect against hidden selection bias?

Analogous to pattern mixture model for missing data

Formulate a family of plausible models for hidden selection bias (from mild to severe)

Assess treatment effect under each model

Determine how much hidden selection bias wipes out treatment effect

Is this much hidden selection bias realistic?

Specificity analysis


26

Unobserved Confounder Model

$\mathrm{logit}(\pi(X_i)) = \alpha + \beta X_i + \gamma U_i$, $0 \le U_i \le 1$

$\gamma > 0$: maximum impact of unobserved hidden bias

$\Gamma = \exp(\gamma)$ is the upper bound on the odds ratio between $\pi(X_i)$'s given $X$

Example: 2 x 2 table (analyzed with Fisher’s exact test)

Worst case scenario for hidden bias:

Unobserved health is a perfect predictor of survival

Healthy patients are more likely to receive treatment

$U_i = 1$ for all survivors; $U_i = 0$ for all deceased

Null distribution is a tilted hypergeometric distribution

Given $\Gamma$, derive P-value under tilted hypergeometric distribution
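For the 2 x 2 example, the worst-case P-value can be computed directly. The sketch below assumes the setup on this slide: with the unobserved confounder equal to 1 for survivors and 0 for the deceased, the number of treated survivors follows a hypergeometric distribution whose probabilities are tilted by the bound on the treatment odds ratio raised to the number of treated survivors. The function name, argument layout, and the example table are hypothetical.

```python
from math import comb

def worst_case_pvalue(a, b, c, d, odds_bound=1.0):
    """One-sided upper-bound P-value for the table
           survived  died
    treated    a       b
    control    c       d
    under hidden bias that multiplies the odds of treatment by at most odds_bound.
    odds_bound = 1 reproduces the one-sided Fisher exact test."""
    m = a + b                       # number treated
    k = a + c                       # number of survivors
    n = a + b + c + d
    lo, hi = max(0, m - (n - k)), min(m, k)
    # Tilted hypergeometric weights for x treated survivors.
    weights = {x: comb(k, x) * comb(n - k, m - x) * odds_bound ** x
               for x in range(lo, hi + 1)}
    total = sum(weights.values())
    return sum(w for x, w in weights.items() if x >= a) / total

# Hypothetical table: 3 of 4 treated survive, 1 of 4 controls survive.
for bound in (1.0, 1.5, 2.0, 3.0):
    p = worst_case_pvalue(3, 1, 1, 3, odds_bound=bound)
    print(f"odds bound = {bound:.1f}: worst-case P-value = {p:.3f}")
```

Scanning the bound upward until the P-value crosses the chosen significance level gives the amount of hidden bias needed to explain away the observed effect.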


27

Rosenbaum’s Sensitivity Analysis: Limitations

Does not give THE answer (should we expect one?)

Rosenbaum’s sensitivity analysis is based on permutation test (tilted by hidden selection bias)

Permutation test is the foundation for randomized trials, but rarely used: heavy computation burden

Used more in recent years, e.g., COMMIT

Special software required for tilted permutation test

Programming logic not difficult

Very heavy computation burden

Inertia for users to stay with familiar packages
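To show that the programming logic is indeed simple, here is a minimal sketch of an ordinary (untilted) Monte Carlo permutation test for a difference in means. It corresponds to the case of no hidden bias only; the tilted version used in the sensitivity analysis would reweight the permutations and is not shown. The data values and the name `permutation_pvalue` are made up for the example.

```python
import random

def permutation_pvalue(y, t, n_perm=2000, seed=0):
    """One-sided Monte Carlo permutation P-value for mean(treated) - mean(untreated)."""
    rng = random.Random(seed)

    def diff(labels):
        y1 = [yi for yi, li in zip(y, labels) if li == 1]
        y0 = [yi for yi, li in zip(y, labels) if li == 0]
        return sum(y1) / len(y1) - sum(y0) / len(y0)

    observed = diff(t)
    labels = list(t)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)                 # re-randomize treatment labels
        if diff(labels) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)        # add-one correction keeps the P-value above zero

# Hypothetical outcomes for 6 treated and 6 untreated subjects.
y = [3.1, 4.2, 5.0, 4.8, 6.1, 5.5, 2.0, 2.9, 3.3, 1.8, 2.5, 3.0]
t = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p_value = permutation_pvalue(y, t)
print(f"permutation P-value = {p_value:.4f}")
```

The cost grows with the number of permutations and the sample size, which is the computation burden the slide refers to.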


28

Instrumental Variable (IV) Analysis for RED, a la Rubin Causal Model

Encouragement intervention serves as instrumental variable

Assume binary intervention (I = 0, 1)

binary treatment (T = 0, 1)

T(0)  T(1)  Category
0     0     Never takers
0     1     Compliers (new users)
1     0     Defiers (assumed to be absent)
1     1     Always takers

Very likely different beyond observed characteristics


29

IV Analysis: Observed Compliance Status

I = 0:

Untreated: C or N

Treated: A or D

I = 1:

Untreated: N or D

Treated: C or A

Randomized encouragement design

Compliance status distributed similarly across intervention groups

%(C) = %(treated | I = 1) − %(treated | I = 0)

= %(untreated | I = 0) − %(untreated | I = 1)


30

IV Analysis: Intervention Effect by Subgroups

Key assumption:

Effect of encouragement intervention mediated entirely through treatment (exclusion restriction)

Always takers and never takers: no treatment variation

⇒ no intervention effect [exclusion restriction]

⇒ cannot assess treatment effect

Intervention effect manifested entirely through compliers
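The last point can be written out as a short decomposition. The notation is not on the slides and is introduced here only for illustration: $p_A$, $p_N$, and $p_C$ are the population shares of always takers, never takers, and compliers (defiers assumed absent, shares equal across arms by randomization), and $\Delta_g$ is the average effect of the encouragement intervention on the outcome in group $g$.

$$
E[Y \mid I = 1] - E[Y \mid I = 0] = p_A \Delta_A + p_N \Delta_N + p_C \Delta_C = p_C \Delta_C ,
$$

since the exclusion restriction gives $\Delta_A = \Delta_N = 0$.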


31

Complier Average Causal Effect

Treatment “Efficacy” on compliers:

CACE = Program effect / Incremental adoption rate

Program effect = intent-to-treat effect for encouragement intervention on outcome

Incremental adoption rate = intent-to-treat effect for encouragement intervention on adoption

Distribute intervention effect on outcome over compliers
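The ratio above is straightforward to compute from a randomized encouragement design. The sketch below simulates one under stated assumptions (50% never takers, 30% compliers, 20% always takers, no defiers, exclusion restriction holding, treatment effect 2.0) and recovers the complier effect as the ratio of the two intent-to-treat contrasts. The population shares, effect size, and the name `cace_estimate` are hypothetical.

```python
import random

def cace_estimate(i, t, y):
    """Return (CACE, program effect, incremental adoption rate) from
    encouragement i, treatment received t, and outcome y."""
    def arm_mean(values, arm):
        selected = [v for v, ii in zip(values, i) if ii == arm]
        return sum(selected) / len(selected)
    program_effect = arm_mean(y, 1) - arm_mean(y, 0)     # ITT effect on outcome
    adoption_gain = arm_mean(t, 1) - arm_mean(t, 0)      # ITT effect on adoption
    return program_effect / adoption_gain, program_effect, adoption_gain

# Hypothetical randomized encouragement design.
rng = random.Random(4)
i, t, y = [], [], []
baseline = {"N": 0.0, "C": 0.5, "A": 1.0}      # groups differ beyond observed traits
for _ in range(40000):
    u = rng.random()
    kind = "N" if u < 0.5 else ("C" if u < 0.8 else "A")
    ii = rng.randint(0, 1)                     # randomized encouragement
    ti = 1 if kind == "A" or (kind == "C" and ii == 1) else 0
    i.append(ii)
    t.append(ti)
    y.append(baseline[kind] + 2.0 * ti + rng.gauss(0.0, 1.0))

cace, program_effect, adoption_gain = cace_estimate(i, t, y)
print(f"program effect = {program_effect:.2f}, "
      f"incremental adoption = {adoption_gain:.2f}, CACE = {cace:.2f}")
```

Because the program effect is divided by an adoption gain well below 1, sampling noise in the numerator is magnified in the ratio, so a small adoption gain makes the estimate imprecise.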


32

IV Analysis: External Validity

Treatment effect estimable only for compliers (new users)

Intrinsic limitation of design (RED or imperfect RCT)

Should we be concerned about treatment effect for always takers and never takers?

Yes for efficacy trials, less so for RED

Never taker might never adopt treatment voluntarily

Mandate vs. choice

Universal dissemination vs. practical dissemination

Always takers more critical; absent for new treatments

Presence of defiers is likely to cancel some of the intervention effect

IV estimate is conservative for true CACE


33

IV Analysis: Discussions

Exclusion restriction needs to be entertained carefully

Likelihood and Bayesian methods available under weaker assumptions

Non-randomized encouragement design (observational studies with instrumental variables)

Example: McClellan et al. JAMA 1994, distance to alternative types of hospitals

IV analysis usually deflates precision substantially

Bias-variance trade-off?

Combine propensity score analysis with IV analysis?


34

IV Analysis: References

Sommer and Zeger (1991 Stat. in Medicine)

Angrist, Imbens, and Rubin (1996 JASA)

Imbens and Rubin (1997 Annals of Statistics)

Little and Yau (1998 Psych Methods)

Hirano, Imbens, Rubin, and Zhou (2000 Biostatistics)

Wells, et al. (2000, manuscript)


35

Discussions

Formulate research questions

Treatment effect for whom? Adoption?

Careful design usually more effective than analytic solutions

Matching to avoid severe imbalance

Promising methods for mitigating overt selection bias

Careful modeling warranted

Propensity score method worth exploring

Nonparametric regression worth exploring

Hidden selection bias very challenging

Rosenbaum’s sensitivity analysis warranted

IV analysis and selection model require careful assessment
