personalized medicine and dynamic treatment …personalized medicine and dynamic treatment regimes...

118
Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University April 21, 2014 Trends and Innovations in Clinical Trial Statistics Conference 1/118

Upload: others

Post on 23-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Personalized Medicine and DynamicTreatment Regimes

Marie Davidian and Butch Tsiatis

Department of StatisticsNorth Carolina State University

April 21, 2014

Trends and Innovations in Clinical Trial Statistics Conference

1/118

Page 2: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Acknowledgement

IMPACT: Innovative Methods Program for AdvancingClinical Trials• A joint venture of the University of North Carolina at Chapel

Hill, Duke University, and North Carolina State University• Supported by Program Project grant P01 CA142538 from

the National Cancer Institute, Statistical Methods forCancer Clinical Trials

• Comprises five interrelated research projects, includingDiscovery of Dynamic Treatment Regimes

This short course is part of IMPACT’s education andoutreach efforts

2/118

Page 3: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Course Outline

Introduction to Personalized Medicine and Dynamic TreatmentRegimes

Estimation of Optimal Treatment Regimes for a Single Decision

Estimation of Optimal Treatment Regimes from a ClassificationPerspective

SMARTs and Estimation of Optimal Treatment Regimes forMultiple Decisions

3/118

Page 4: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Roadmap

8:00 am – 8:45 am Marie will give an introduction to dynamictreatment regimes

8:45 am – 10:00 am Butch will discuss estimation of optimaltreatment regimes in the single decisionsetting

10:00 am – 10:20 am Break

10:20 am – 10:50 am Butch will discuss estimation of optimalregimes as a classification problem

10:50 am – 12:00 pm Marie will discuss SMART studies andestimation of optimal treatment regimesin the multiple decision setting

4/118

Page 5: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Course Outline

Introduction to Personalized Medicine and Dynamic TreatmentRegimes

Estimation of Optimal Treatment Regimes for a Single Decision

Estimation of Optimal Treatment Regimes from a ClassificationPerspective

SMARTs and Estimation of Optimal Treatment Regimes forMultiple Decisions

5/118

Page 6: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

What is Personalized Medicine?

“The right treatment for the right patient (at the right time) ”

6/118

Page 7: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

What is Personalized Medicine?

In general: One size does not fit all• Multiple treatment options may be available• Patient heterogeneity

I Across patients: What works for one patient may not workfor another

I Within patients: What works now may not work later

Premise: Use information on a patient’s characteristics todetermine which treatment option s/he should receive (andwhen. . . )• Genetic/genomic, demographic,. . .• Physiologic/clinical measures, medical history,. . .

7/118

Page 8: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Popular Perspective on Personalized Medicine

Subgroup identification/targeted treatment:• Are there subgroups of patients who are likely to benefit

from a particular treatment?• Can biomarkers be developed to identify such patients?• Can a treatment be developed that targets a subgroup that

is very likely to benefit from that treatment?

Focus: Treating and targeting treatment for subgroups of thepatient population• “Right patient for the treatment ”

8/118

Page 9: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Another Perspective on Personalized Medicine

Clinical practice: Clinicians make (a series of) treatmentdecision(s) over the course of a patient’s disease or disorder• Fixed schedule , milestone in the disease process, event

necessitating a decision• Multiple treatment options• Accrued information on the patient

Focus: Given information on patient’s characteristics, can wedetermine the treatment from among the available options mostlikely to benefit him/her?• “Right treatment for the patient ”• This is the perspective we will take in this course

9/118

Page 10: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Clinical Decision-Making

How are these decisions made?• Clinical judgment , practice guidelines• Synthesize all information on a patient up to the point of a

decision to determine next treatment action• Goal: “Individualize ” the decision to the patient• Can this be formalized and made evidence-based ?

10/118

Page 11: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Dynamic Treatment Regime

Formalizing personalized medicine: At any decision• Construct a rule that takes as input the available

information on the patient to that point and dictates thenext treatment from among the possible, feasible options

Dynamic treatment regime: A set of formal rules , eachcorresponding to a decision point• Dynamic because the treatment action varies depending

on the accrued information

11/118

Page 12: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Dynamic Treatment Regime

Assume: There is a clinical outcome by which treatmentbenefit can be assessed• Survival time, CD4 count, indicator of no myocardial

infarction within 30 days, . . .• Larger outcomes are better

Intuitively: Rules should depend on characteristics (variables ,covariates ) that exhibit a qualitative interaction with treatment• “Tailoring variables ”

12/118

Page 13: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Tailoring Variables

Ave

rage

Out

com

e

Trt A

Trt ATrt B

Trt B

No Mutation Mutation

13/118

Page 14: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Tailoring Variables

Ave

rage

Out

com

e

Trt A

Trt ATrt B

Trt B

No Mutation Mutation

14/118

Page 15: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Single Decision Point

Simple example: Which treatment to give patients whopresent with primary operable breast cancer ?• Treatment options : L-phenylalanine mustard and

5-fluorouracil (PF, c1) or c1 + tamoxifen (PFT, c2)• Available information: age (years), progesterone receptor

(PR) level (fmol)• Outcome: Disease-free survival to three years

Gail and Simon (1985):• Data on ∼ 1,300 patients in a National Surgical Adjuvant

Breast and Bowel Project (NSABP) clinical trial

15/118

Page 16: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Single Decision Point

Gail and Simon rule/regime:• “If age < 50 years and PR < 10 fmol, give c1 (coded as 1);

otherwise, give c2 (coded as 0)”• Mathematically: The rule d is (rectangular region )

d(age,PR) = I(age < 50 and PR < 10)

Alternatively: Rules of form (hyperplane )

d(age,PR) = I{age + 8.7 log(PR)− 60 > 0}

16/118

Page 17: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Multiple Decision Points

Two decision points:• Decision 1 : Induction chemotherapy (options c1, c2)• Decision 2 : Maintenance treatment for patients who

respond (options m1, m2), Salvage chemotherapy for thosewho don’t (options s1, s2)

• Outcome : Survival time

17/118

Page 18: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Multiple Decision Points

• At baseline: Information x1 (accrued information h1 = x1)

• Decision 1: Two options {c1, c2}; rule 1: d1(x1)⇒ x1 → {c1, c2}• Between decisions 1 and 2: Collect additional information x2,

including responder status

• Accrued information h2 = {x1, chemotherapy at decision 1, x2}• Decision 2: Four options {m1,m2, s1, s2}; rule 2: d2(h2)⇒

h2 → {m1,m2} (responder), h2 → {s1, s2} (nonresponder)

• Regime : d = (d1,d2)

18/118

Page 19: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Summary

Single decision: 1 decision point• Baseline information x ∈ X• Set of treatment options a ∈ A• Decision rule d(x), d : X → A• Dynamic treatment regime: d

Multiple decisions: K decision points• Baseline information x1, intermediate information xk

between decisions k − 1 and k , k = 2, . . . ,K• Set of treatment options at decision k ak ∈ Ak

• Accrued information h1 = x1 ∈ H1,

hk = {x1,a1, x2,a2, . . . , xk−1,ak−1, xk} ∈ Hk , k = 2, . . . ,K

• Decision rules d1(h1),d2(h2), . . . ,dK (hK ), dk : Hk → Ak

• Dynamic treatment regime d = (d1,d2, . . . ,dK )

19/118

Page 20: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Considerations

Challenge: Infinitude of possible regimes d• D = class of all possible dynamic treatment regimes• Can we find the “best ” set of rules; i.e., the “best ” dynamic

treatment regime in D?• How do we define “best? ”

20/118

Page 21: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Optimal Dynamic Treatment Regime

An optimal dynamic treatment regime: Should satisfy• If all patients in the population were to receive treatment

according to regime d , the expected (average) outcome forthe population would be as large as possible

• If an individual patient were to receive treatment accordingto regime d = (d1, . . . ,dK ), his/her expected outcomewould be as large as possible given the informationavailable on him/her

• Can we formalize this?

21/118

Page 22: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Potential Outcomes

Single decision: Possible treatment options a ∈ A• For a randomly chosen patient from the population, define

the random variable Y ∗(a) = the outcome the patient wouldachieve if s/he were to receive treatment option a

• “Potential outcome ”• E.g., if A = {0,1} (two possible treatment options)

I Y ∗(1)= the outcome a patient would achieve if s/he weregiven treatment 1

I Similarly for Y ∗(0)

• E{Y ∗(1)} is the average outcome if all patients in thepopulation were to receive 1; similarly for E{Y ∗(0)}

22/118

Page 23: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Potential Outcomes

Potential outcome for a regime: For any d ∈ D• Y ∗(d) = the outcome a patient would achieve if s/he

received treatment according to regime d• If A = {0,1} and patient has baseline information X = x

d(x) = 0 or 1

• Potential outcome under d

Y ∗(d) = Y ∗(1)d(X ) + Y ∗(0){1− d(X )}

23/118

Page 24: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Optimal Dynamic Treatment Regime

Thus:• E{Y ∗(d)|X = x} is the expected outcome for a patient with

information x if s/he were to receive treatment according toregime d ∈ D

• E{Y ∗(d)} = E [ E{Y ∗(d)|X} ] is the expected (average)outcome for the population if all patients were to receivetreatment according to regime d ∈ D

Optimal regime: dopt is a regime in D such that• E{Y ∗(d)} ≤ E{Y ∗(dopt )} for all d ∈ D• E{Y ∗(d)|X = x} ≤ E{Y ∗(dopt )|X = x} for all d ∈ D and

all x ∈ X

24/118

Page 25: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Optimal Dynamic Treatment Regime

Multiple decisions: Same idea, only more complicated• Baseline information X1

• Potential outcomes under a regime d ∈ D

X ∗2 (d), . . . ,X ∗K (d),Y ∗(d)

• Patient presents with information X1; all subsequentinformation determined by d

Optimal regime: dopt is a regime in D such that• E{Y ∗(d)} ≤ E{Y ∗(dopt )} for all d ∈ D• E{Y ∗(d)|X1 = x1} ≤ E{Y ∗(dopt )|X1 = x1} for all d ∈ D

and x1 ∈ H1

25/118

Page 26: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value of a Regime

Value of regime d: The expected (average) outcome for thepopulation if all patients were to receive treatment according toregime d ∈ D• Value of regime d is E{Y ∗(d)}• Sometimes written as V (d) = E{Y ∗(d)}

26/118

Page 27: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Important Philosophical Point

Distinguish between:• The “best ” treatment for a patient• The “best ” treatment decision for a patient given the

information available

Best treatment for a patient: Option abest ∈ A correspondingto the largest Y ∗(a) for that patient

Best treatment decision given the information available:• We cannot hope to determine abest because we can never

see all the potential outcomes on a given patient• We can hope to make the optimal decision given the

information available, i.e., find dopt that makesE{Y ∗(d)|X = x} (and the value, E{Y ∗(d)}) as large aspossible

27/118

Page 28: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimation of Optimal Treatment Regimes

Result: This perspective on personalized medicine boils downto discovery of optimal dynamic treatment regimes using data• Existing data from observational studies (e.g., registries),

previously conducted clinical trials• Prospectively collected data from clinical trials conducted

specifically for this purpose (coming up)

28/118

Page 29: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimation of Optimal Treatment Regimes

Single decision: Data (Xi ,Ai ,Yi), i = 1, . . . ,n• n subjects indexed by i• Xi = information observed on subject i• Ai = observed treatment actually received by subject i• Yi = observed outcome for subject i• Goal: Under suitable assumptions , estimate dopt using

these data

29/118

Page 30: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimation of Optimal Treatment Regimes

Multiple decisions: Data

(X1i ,A1i ,X2i ,A2i , . . . ,X(K−1)i ,A(K−1)i ,XKi ,Yi), i = 1, . . . ,n

• X1i = Baseline information observed on subject i• Xki , k = 2, . . . ,K = intermediate information between

decisions k − 1 and k on subject i• Aki , k = 1, . . . ,K = observed treatment actually received

by subject i at decision k• Yi = observed outcome for subject i ; can be ascertained

after decision K or can be a function of X2i , . . . ,XKi

• Goal: Under suitable assumptions , estimatedopt = (dopt

1 , . . . ,doptK ) using these data

30/118

Page 31: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimation of Optimal Treatment Regimes

Challenges:1. The optimal dynamic treatment regime dopt is defined in

terms of potential outcomes (not the observed data)2. This sounds hard ; does it really have to be?3. Were all possibly useful tailoring variables that clinicians

used in the study at each decision point recorded in thedata?

31/118

Page 32: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimation of Optimal Treatment Regimes

Challenge 1: dopt is defined in terms of potential outcomes• Need to be able to express the definition of dopt

equivalently in terms of the data• Possible under certain assumptions• Butch will demonstrate in the single decision case• Also possible in the multiple decision case (but harder)

32/118

Page 33: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimation of Optimal Treatment Regimes

Challenge 2: Can’t we just piece together results from severalstudies to figure out the optimal regime?

• Study comparing induction chemotherapies based on response

• Study comparing maintenance therapies based on survival timeamong responders to induction therapy

• Study comparing salvage therapies based on survival timeamong nonresponders

• Wouldn’t the regime that uses the “best ” option in each studyhave to have the “best ” average outcome?

• Delayed effects: The induction therapy with the highestproportion of responders might have other effects that rendersubsequent treatments less effective in regard to survival

• Result: Must consider the entire sequence of decisions

33/118

Page 34: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Clinical Trials for Discovery of Treatment Regimes

Challenge 3: Conduct a clinical trial specifically designed forestimation of optimal dynamic treatment regimes• SMART: Sequential Multiple Assignment Randomized Trial• Randomize subjects to the treatment options at each

decision point• Collect extensive, detailed information initially and

intermediate to decision points on possible tailoringvariables

Later: Marie will have more to say on both of these issues

34/118

Page 35: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Clinical Trials for Discovery of Treatment Regimes

Cancer

Response

NoResponse

Response

NoResponse

C

C

1

2

M1

M2

M1

M2

S1

S2

S1

S2

Randomization at •s

35/118

Page 36: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Course Outline

Introduction to Personalized Medicine and Dynamic TreatmentRegimes

Estimation of Optimal Treatment Regimes for a Single Decision

Estimation of Optimal Treatment Regimes from a ClassificationPerspective

SMARTs and Estimation of Optimal Treatment Regimes forMultiple Decisions

36/118

Page 37: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Optimal Regime

Assume: Large outcomes are good

An optimal regime:• A regime that, if followed by all patients in the population,

yields the largest outcome on average• That is, yields the largest value

Goal: Given data (evidence ) from a clinical trial orobservational study, estimate an optimal regime satisfying thisdefinition• For now : Consider regimes involving a single decision /rule

37/118

Page 38: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Statistical Framework

Simplest setting: A single decision with two treatment options• A = {0,1}

Observed data: (Yi ,Xi ,Ai), i = 1, . . . ,n, iid• Yi outcome, Xi baseline covariates, Ai = 0,1 treatment

received

Treatment regime: A single rule• A function d : X → {0, 1}

38/118

Page 39: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Statistical Framework

Breast cancer example: Which treatment to give patients whopresent with primary operable breast cancer ?• Two treatment options (0 or 1), x =(age, PR)• Possible rules

d(age,PR) = I(age < 50 and PR < 10)

d(age,PR) = I{age + 8.7log(PR)− 60 > 0}

Goal, restated:• Let D be the class of all possible regimes d• Estimate dopt ∈ D such that, if dopt were followed by all

patients in the population, it would lead to largest averageoutcome (value ) among all regimes in D

39/118

Page 40: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Potential Outcomes

Reminder: We can hypothesize potential outcomes• Y ∗(1) = outcome that would be achieved if patient were to

receive 1; Y ∗(0) defined similarly• E{Y ∗(1)} is the average outcome if all patients in the

population were to receive 1; and similarly for E{Y ∗(0)}• We observe

Y = Y ∗(1)A + Y ∗(0)(1− A)

40/118

Page 41: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Potential Outcomes

No unmeasured confounders: Assume that

Y ∗(0),Y ∗(1) ⊥⊥ A|X

• X contains all information used to assign treatments• Automatically satisfied for data from a randomized trial• Standard but unverifiable assumption for observational

studies• Implies that

E{Y ∗(1)} = E [E{Y ∗(1)|X}]= E [E{Y ∗(1)|X ,A = 1}]= E{E(Y |X ,A = 1) }

and similarly for E{Y ∗(0)}

41/118

Page 42: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Potential Outcomes

E{Y ∗(1)} = E{E(Y |X ,A = 1) }

Implication for estimating E{Y ∗(1)}: Similarly for E{Y ∗(0)}• E(Y |X ,A) = Q(X ,A) is the regression of Y on X and A• E(Y |X ,A) is unknown• Posit a model Q(X ,A;β) for Q(X ,A)

• Estimate β based on observed data =⇒ β(e.g., least squares)

• Estimator for E{Y ∗(1)}

n−1n∑

i=1

Q(Xi ,1; β)

42/118

Page 43: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Potential Outcomes

Potential outcome for a regime:• For any d ∈ D, define Y ∗(d) to be the potential outcome

for a patient if s/he were given treatment according toregime d

Y ∗(d) = Y ∗(1)d(X ) + Y ∗(0){1− d(X )}

• E{Y ∗(d)} is the average outcome for the population if allpatients were treated according to regime d

• That is, E{(Y ∗(d)} = V (d) is the value of regime d

43/118

Page 44: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value of a Regime

Y ∗(d) = Y ∗(1)d(X ) + Y ∗(0){1− d(X )}

Value of regime d: Using no unmeasured confounders

E{Y ∗(d)} = E [E{Y ∗(d)|X}]

= E[E{Y ∗(1)|X}d(X ) + E{Y ∗(0)|X}{1− d(X )}

]= E

[E(Y |X ,A = 1)d(X ) + E(Y |X ,A = 0){1− d(X )}

]= E [Q(X ,1)d(X ) + Q(X ,0){1− d(X )}],

where E(Y |X ,A) = Q(X ,A)

44/118

Page 45: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimating the Value of a Regime

E{Y ∗(d)} = E [Q(X ,1)d(X ) + Q(X ,0){1− d(X )}]

Again: E(Y |X ,A) is not known• Posit a model Q(X ,A;β) for E(Y |X ,A)

• Estimate β based on observed data =⇒ β(e.g., least squares)

• Estimate V (d) = E{Y ∗(d)} by

V (d) = n−1n∑

i=1

[Q(Xi ,1, β)d(Xi) + Q(X1,0, β){1− d(Xi)}]

45/118

Page 46: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Optimal Regime

Reminder: dopt is a regime in D such that• E{Y ∗(d)} ≤ E{Y ∗(dopt )} for all d ∈ D• E{Y ∗(d)|X = x} ≤ E{Y ∗(dopt )|X = x} for all d ∈ D and

x ∈ X

Optimal regime:

dopt (x) = arg maxa={0,1} E{Y ∗(a)|X = x}

• Thus

dopt (x) = I[E{Y ∗(1)|X = x} > E{Y ∗(0)|X = x}]= I{Q(x ,1) > Q(x ,0) }

46/118

Page 47: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimating the Optimal Regime

“Regression estimator”:• Estimate dopt by

doptREG(x) = I{Q(x ,1; β) > Q(x ,0; β) }

• Estimator for V (dopt ) = E{Y ∗(dopt )}

VREG(doptREG) = n−1

n∑i=1

[Q(X ,1i , β)dopt

REG(Xi)+Q(X ,0i , β){1−doptREG(Xi)}

]Concern: Q(X ,A;β) may be misspecified , so dopt

REG could befar from dopt

47/118

Page 48: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Optimal Restricted Regime

Alternative perspective: Q(X ,A;β) defines a class of regimes

d(x , β) = I{Q(x ,1;β) > Q(x ,0;β)},

indexed by β, that may or may not contain dopt

• E.g., suppose in truth

E(Y |X ,A) = exp{1 + X1 + 2X2 + 3X1X2 + A(1− 2X1 + X2)}

=⇒ dopt (x) = I(x2 ≥ 2x1 − 1) (hyperplane)

48/118

Page 49: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Optimal Restricted Regime

Posited model:

Q(X ,A;β) = β0 + β1X1 + β2X2 + A(β3 + β4X1 + β5X2)

• Regimes I{Q(x ,1;β) > Q(x ,0;β)} define a class ofregimes Dη with elements

I(x2 ≥ η1x1+η0) or I(x2 ≤ η1x1+η0), η0 = −β3/β5, η1 = −β4/β5

depending on the sign of β5

• Parameter η is defined as a function of β• The optimal regime in this case is contained in Dη• However, the estimated regime I{Q(x ,1; β) > Q(x ,0; β}

may not estimate the optimal regime within Dη if theposited model is incorrect

49/118

Page 50: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Optimal Restricted Regime

Suggests: Consider directly a restricted class of regimes Dηwith elements of form

d(x ; η) = dη(x) indexed by η

• Such regimes may be motivated by a regression model orbased on cost , feasibility in practice, interpretability; e.g.,

d(x ; η) = I(x1 < η0, x2 < η1)

• Dη may or may not contain dopt but is still of interest

• Optimal restricted regime doptη (x) = d(x ; ηopt ),

ηopt = arg maxη E{Y ∗(dη)}

50/118

Page 51: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimating the Optimal Restricted Regime

Optimal restricted regime: doptη (x) = d(x ; ηopt ),

ηopt = arg maxη E{Y ∗(dη)} = arg maxη V (dη)

Approach:• Directly estimate the value V (dη) = E{Y ∗(dη)} for any

fixed η =⇒ V (dη)

• Estimate the optimal restricted regime by finding

ηopt = arg maxηV (dη) =⇒ doptη (x) = d(x ; ηopt )

• We refer to this as a value search estimator for doptη

51/118

Page 52: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Estimators

Required: A “good ” estimator for V (dη)

• Missing data analogy• Let Cη denote η-regime consistency indicator

Cη = Ad(X ; η) + (1− A){1− d(X ; η)}

• “Full data ” are {X ,Y ∗(dη)}; “observed data ” are(X ,Cη,CηY )

• =⇒ Only a subset of subjects have observed outcomesunder dη; the rest are missing

52/118

Page 53: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Estimators

Cη = Ad(X ; η) + (1− A){1− d(X ; η)}

Propensity scores:• π(X ) = pr(A = 1|X ) is the propensity score for treatment• Randomized trial: π(X ) is known• Observational study: Posit a model π(X ; γ) (e.g., logistic

regression) and fit using (Ai ,Xi), i = 1, . . . ,n =⇒ γ.• Propensity of receiving treatment consistent with dη

πc(X ; η) = pr(Cη = 1|X ) = E(Cη|X )

= E [Ad(X ; η) + (1− A){1− d(X ; η)}|X ]

= π(X )d(X ; η) + {1− π(X )}{1− d(X ; η)}

• Write πc(X ; η, γ) with π(X ; γ)

53/118

Page 54: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Estimators

Estimators for V (dη) = E{Y ∗(dη)}: For fixed η• Inverse probability weighted estimator

VIPWE (dη) = n−1n∑

i=1

Cη,iYi

πc(Xi ; η, γ).

• Consistent for V (dη) if π(X ; γ) (hence πc(X ; η, γ)) is correct

54/118

Page 55: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Estimators

Consistency:

E{

CηYπc(X ; η)

}= E

{CηY ∗(dη)

πc(X ; η)

}= E

[E{

CηY ∗(dη)

πc(X ; η)

∣∣∣∣Y ∗(dη),X}]

= E[

E{Cη|Y ∗(dη),X}Y ∗(dη)

πc(X ; η)

]= E

[E{Cη|X}Y ∗(dη)

πc(X ; η)

]= E

{πc(X ; η)Y ∗(dη)

πc(X ; η)

}= E{Y ∗(dη)}

55/118

Page 56: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Estimators

Estimators for V (dη) = E{Y ∗(dη)}: For fixed η• Doubly robust augmented inverse probability weighted

estimator

VAIPWE (dη) = n−1n∑

i=1

{Cη,iYi

πc(Xi ; η, γ)−

Cη,i − πc(Xi ; η, γ)

πc(Xi ; η, γ)m(Xi ; η, β)

}m(X ; η, β) = E{Y ∗(dη)|X} = Q(X ,1;β)d(X ; η)+Q(0,X ;β){1−d(X ; η)}

and Q(X ,A;β) is a model for E(Y |X ,A)

• Consistent if either π(X , γ) or Q(X ,A;β) is correct

56/118

Page 57: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Augmented Estimator

Under MAR: Y ∗(dη)⊥⊥Cη|X• If γ

p−→ γ∗ and βp−→ β∗, this estimator

p−→

E{

CηYπc(X ; η, γ∗)

− Cη − πc(X ; η, γ∗)

πc(X ; η, γ∗)m(X ; η, β∗)

}= E

[Y ∗(dη) +

{Cη − πc(X ; η, γ∗)

πc(X ; η, γ∗)

}{Y ∗(dη)−m(X ; η, β∗)}

]= E{Y ∗(dη)}+ E

[{Cη − πc(X ; η, γ∗)

πc(X ; η, γ∗)

}{Y ∗(dη)−m(X ; η, β∗)}

]• Hence the estimator is consistent if either

I π(X ; γ∗) = π(X )⇒ πc(X ; η, γ∗) = πc(X ; η)(propensity correct)

I Q(X ,A;β∗) = Q(X ,A)⇒ m(X ; η, β∗) = m(X ; η)(regression correct )

I Double robustness

57/118

Page 58: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Estimators

Result: Estimators ηopt for ηopt obtained by maximizingVIPWE (dη) or VAIPWE (dη) in η

• Estimated optimal restricted regime doptη (x) = d(x ; ηopt )

• Non-smooth functions of η; must use suitable optimizationtechniques

• Estimators for V (doptη ) = E{Y ∗(dopt

η )}

VIPWE (doptη,IPWE ) or VAIPWE (dopt

η,AIPWE )

Can calculate standard errors• Semiparametric theory : AIPWE is more efficient than

IPWE for estimating V (dη) = E{Y ∗(dη)}• =⇒ Estimating regimes based on AIPWE should be

“better ”

58/118

Page 59: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Simulation Studies

Extensive simulations: One representative scenario• True E(Y |X ,A) of form

Qt (X ,A;β) = exp{β0+β1X 21 +β2X 2

2 +β3X1X2+A(β4+β5X1+β6X2)}

• Misspecified model for E(Y |X ,A)

Qm(X ,A;β) = β0 + β1X1 + β2X2 + A(β3 + β4X1 + β5X2)

• =⇒ Dη = {I(η0 + η1X1 + η2X2 > 0)}, dopt ∈ Dη• True propensity score

logit{πt (X ; γ)} = γ0 + γ1X 21 + γ2X 2

2

• Misspecified propensity score

logit{πm(X ; γ)} = γ0 + γ1X1 + γ2X2

59/118

Page 60: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Simulation Studies

Here: Both the correct and misspecified outcome regressionmodels define a class of regimes

Dη = {I(η0 + η1x1 + η2x2 > 0)}

so that dopt ∈ Dη• Other scenarios with dopt /∈ Dη• 1000 Monte Carlo data sets with n = 500

60/118

Page 61: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Simulation Studies

• Truth: V (dopt ) = E{Y ∗(dopt )} = 3.71

• V (dopt ) obtained for each data set using 106 Monte Carlosimulations

Method V (dopt ) SE Cov V (dopt )

REG-Qt 3.70 (0.14) – – 3.71 (0.00)REG-Qm 3.44 (0.18) – – 3.27 (0.19)

PS correct

IPWE 4.01 (0.26) 0.28 86.1 3.63 (0.07)AIPWE-Qt 3.72 (0.15) 0.15 94.7 3.70 (0.01)AIPWE-Qm 3.85 (0.21) 0.23 91.8 3.66 (0.07)

PS incorrect

IPWE 4.06 (0.22) 0.23 69.4 3.42 (0.20)AIPWE-Qt 3.72 (0.15) 0.15 95.2 3.70 (0.01)AIPWE-Qm 3.81 (0.18) 0.19 94.1 3.57 (0.20)

61/118

Page 62: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Simulation Studies

Performance: Empirical CDFs of over 1000 data sets ofV (dopt )/V (dopt ) under true and misspecified models

62/118

Page 63: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Application: NSABP Trial

Recall: Two treatment options, n = 1276• A = 0 if PFT (c2), = 1 if PF (c1)• Y = 1 if patient survived disease-free to 3 years, = 0

otherwise• x = {age, log(PR+1)}=(x1, x2)

• Consider regimes of the form

d(x ; η) = I(age < η0 and PR < η1)

• Gail and Simon : η0 = 50, η1 = 10

63/118

Page 64: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Application: NSABP Trial

IPWE and AIPWE:• Propensity: π = n−1∑n

i=1 Ai (randomized trial )• Outcome regression: With expit(u) = eu/(1 + eu)

Q(x ,a;β) = expit{β0 + β1x1 + β2x2 + a(β3 + β4x1 + β5x2)}

Fit by logistic regression

Results: Regimes of form

d(x ; η) = I(age < η0 and PR < η1)

• Gail and Simon : η0 = 50, η1 = 10• Estimated optimal regimes :

ηopt0 ηopt

1 V (doptη ) (95% CI)

IPWE 56 5 0.68 (0.64,0.72)AIPWE 60 9 0.69 (0.65,0.72)

64/118

Page 65: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Discussion

• Two approaches to estimation of optimal regimes for asingle decision point

• Regression methods – estimate an optimal regime basedon a posited regression model

• Value search methods – estimate an optimal treatmentregime within a specified class by maximizing the value

• Robustness to misspecification (AIPWE)• Both methods may be extended to multiple decision points

(later)• Next: Alternative classification perspective for single

decision

65/118

Page 66: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

References

Gail, M. and Simon, R. (1985). Testing for qualitativeinteractions between treatment effects and patient subsets.Biometrics 41, 361–372.

Zhang, B., Tsiatis, A. A., Laber, E. B., and Davidian, M. (2012).A robust method for estimating optimal treatment regimes.Biometrics 68, 1010–1018.

66/118

Page 67: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

BreakPlease return by 10:20 am

67/118

Page 68: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Course Outline

Introduction to Personalized Medicine and Dynamic TreatmentRegimes

Estimation of Optimal Treatment Regimes for a Single Decision

Estimation of Optimal Treatment Regimes from a ClassificationPerspective

SMARTs and Estimation of Optimal Treatment Regimes forMultiple Decisions

68/118

Page 69: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Classification Methods

Generic classification situation:• Z = outcome , class , label ; here, Z = {0, 1} (binary )• X = vector of covariates, features taking values in X , the

feature space• d is a classifier: d : X → {0, 1}• D is a family of classifiers , e.g.,

I Hyperplanes of the form

I(η0 + η1X1 + η2X2 > 0)

I Rectangular regions of the form

I(X1 < a1) + I(X1 ≥ a1,X2 < a2)

69/118

Page 70: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Classification Methods

Generic classification problem:• Training set: (Xi ,Zi), i = 1, . . . ,n• Find classifier d ∈ D that minimizes

I Classification error

n∑i=1

{Zi − d(Xi )}2

I Weighted classification error

n∑i=1

wi{Zi − d(Xi )}2

70/118

Page 71: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Classification Methods

Approaches:• This problem has been studied extensively by statisticians

and computer scientists• Machine learning (supervised learning)• Many methods and software are available• Recursive partitioning (CART ): Rectangular regions• Support vector machines: Hyperplanes, etc.

71/118

Page 72: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Estimators, Revisited

Recall: Estimation of dη ∈ restricted class Dη

ηopt = arg maxηV (dη) = arg maxηE{Y ∗(dη)}

• Doubly robust AIPWE

VAIPWE (dη) = n−1n∑

i=1

{Cη,iYi

πc(Xi ; η, γ)−

Cη,i − πc(Xi ; η, γ)

πc(Xi ; η, γ)m(Xi ; η, β)

}

Cη,i = Aid(Xi ; η) + (1− Ai){1− d(Xi ; η)}πc(Xi ; η, γ) = π(Xi ; γ)d(Xi ; η) + {1− π(Xi ; γ)}{1− d(Xi ; η)}m(Xi ; η, β) = Q(Xi ,1; β)d(Xi ; η) + {1−Q(Xi ,0, β)}{1− d(Xi ; η)}

72/118

Page 73: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Estimators, Revisited

Algebra: VAIPWE (dη) may be rewritten as

n−1n∑

i=1

d(Xi ; η)C(Xi) + terms not involving d

C(Xi) =

{AiYi

π(Xi , γ)− Ai − π(Xi , γ)

π(Xi ; γ)Q(Xi ,1; β)

}−

{(1− Ai)Yi

1− π(Xi ; γ)+

Ai − π(Xi ; γ)

1− π(Xi ; γ)Q(Xi ,0; β)

},

• The contrast function is

E{C(Xi)|Xi} ≈ C(Xi) = Q(Xi ,1)−Q(Xi ,0)

73/118

Page 74: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Contrast Function

E{C(Xi)|Xi} ≈ C(Xi) = Q(Xi ,1)−Q(Xi ,0)

Result: C(Xi) can be viewed as an estimator for the contrastfunction for subject i• If we knew the functions Q(Xi ,1) and Q(Xi ,0), we should

assign treatment

I{C(Xi) > 0} = I{Q(Xi ,1)−Q(Xi ,0) > 0}

to patient i .

74/118

Page 75: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Classification Perspective

ηopt = arg maxηn∑

i=1

d(Xi ; η)C(Xi)

Further algebra: Another identity

d(Xi ; η)C(Xi) = −|C(Xi)|[I{C(Xi) > 0} − d(Xi ; η)]2

+ |C(Xi)|I{C(Xi) > 0}

• Hence

ˆηopt = arg minηn∑

i=1

|C(Xi)| [I{C(Xi) > 0} − d(Xi ; η)]2,

75/118

Page 76: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Classification Perspective

ηopt = arg minηn∑

i=1

|C(Xi)| [I{C(Xi) > 0} − d(Xi ; η)]2

Alternative formulation: This can be viewed as a weightedclassification problem with• Label I{C(Xi) > 0}• Classifier d(Xi ; η)

• Weight |C(Xi)|

76/118

Page 77: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

NSABP Trial, Revisited

Recall: Two treatment options, n = 1276• A = 0 if PFT (c2), = 1 if PF (c1)• Y = 1 if patient survived disease-free to 3 years, = 0

otherwise• x = {age, log(PR+1)}=(x1, x2)

AIPWE: Classification formulation• Propensity: π = n−1∑n

i=1 Ai (randomized trial )• Outcome regression: With expit(u) = eu/(1 + eu)

Q(x ,a;β) = expit{β0 + β1x1 + β2x2 + a(β3 + β4x1 + β5x2)}

Fit by logistic regression

77/118

Page 78: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

NSABP Trial, Revisited

Classifier: Recursive partitioning (CART ):• R function rpart with default settings except for weights|C(Xi)| and labels I{C(Xi) > 0}

Results: Regime of the form

d(x ; η) = I(age < η0 and PR < η1)

ηopt0 ηopt

1 V (doptη ) (95% CI)

IPWE 56.0 5.0 0.68 (0.64,0.72)AIPWE 60.0 9.0 0.69 (0.65,0.72)CART 59.5 16.5 0.68 (0.65,0.72)

78/118

Page 79: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Discussion

• Estimation of optimal regime using “off-the-shelf ”classification methods

• Estimated contrast functions constructed independently ofclass of regimes

• Form of estimated optimal regime determined byclassification method

• Extension to multiple decisions ongoing

79/118

Page 80: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

References

Zhang, B., Tsiatis, A. A., Davidian, M., Zhang, M., and Laber, E.B. (2012). Estimating optimal treatment regimes from aclassification perspective. Stat 1, 103–114.

Zhao, Y., Zeng, D., Rush, A. J., and Kosorok, M. R. (2012).Estimating individualized treatment rules using outcomeweighted learning. Journal of the American StatisticalAssociation 107, 1106–1118.

80/118

Page 81: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Course Outline

Introduction to Personalized Medicine and Dynamic TreatmentRegimes

Estimation of Optimal Treatment Regimes for a Single Decision

Estimation of Optimal Treatment Regimes from a ClassificationPerspective

SMARTs and Estimation of Optimal Treatment Regimes forMultiple Decisions

81/118

Page 82: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Recap: Multiple Decision Points

Two decision points:• Decision 1 : Induction chemotherapy (options c1, c2)• Decision 2 : Maintenance treatment for patients who

respond (options m1, m2), Salvage chemotherapy for thosewho don’t (options s1, s2)

• Outcome : Survival time

82/118

Page 83: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Recap: Multiple Decision Points

• At baseline: Information x1 (accrued information h1 = x1)

• Decision 1: Two options {c1, c2}; rule 1: d1(x1)⇒ x1 → {c1, c2}• Between decisions 1 and 2: Collect additional information x2,

including responder status

• Accrued information h2 = {x1, chemotherapy at decision 1, x2}• Decision 2: Four options {m1,m2, s1, s2}; rule 2: d2(h2)⇒

h2 → {m1,m2} (responder), h2 → {s1, s2} (nonresponder)

• Regime : d = (d1,d2)

83/118

Page 84: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Recap: Multiple Decision Points

In general: K decision points• Baseline information x1, intermediate information xk

between decisions k − 1 and k , k = 2, . . . ,K• Set of treatment options at decision k ak ∈ Ak

• Accrued information h1 = x1 ∈ H1,

hk = {x1,a1, x2,a2, . . . , xk−1,ak−1, xk} ∈ Hk , k = 2, . . . ,K

• Decision rules d1(h1),d2(h2), . . . ,dK (hK ), dk : Hk → Ak

• Dynamic treatment regime d = (d1,d2, . . . ,dK )

• D is the set of all possible K -decision regimes

84/118

Page 85: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Recap: Optimal Regime for Multiple Decisions

Optimal regime: dopt ∈ D such that a patient with baselineinformation X1 = x1 who receives all K treatments according todopt has expected outcome as large as possible

Potential outcomes under a regime d ∈ D:• Baseline information X1, potential outcomes

X ∗2 (d), . . . ,X ∗K (d),Y ∗(d)

dopt satisfies:• E{Y ∗(d)} ≤ E{Y ∗(dopt )} for all d ∈ D• E{Y ∗(d)|X1 = x1} ≤ E{Y ∗(dopt )|X1 = x1} for all d ∈ D

and x1 ∈ H1

85/118

Page 86: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Recap: Delayed Effects

Recall: Can not “piece together” an optimal regime fromseveral separate studies• In the cancer example, the regime that uses the “best ”

options from separate studies of inductionchemotherapies , maintenance therapies , and salvagetherapies would not necessarily lead to dopt

• Delayed effects: The induction therapy with the highestproportion of responders might have other effects thatrender subsequent treatments less effective in regard tosurvival

• Result: Require data from a study involving the entiresequence of decisions

86/118

Page 87: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimation of Optimal Treatment Regimes

K decisions: Data

(X1i ,A1i ,X2i ,A2i , . . . ,X(K−1)i ,A(K−1)i ,XKi ,Yi), i = 1, . . . ,n

• X1i = Baseline information observed on subject i• Xki , k = 2, . . . ,K = intermediate information between

decisions k − 1 and k on subject i• Aki , k = 1, . . . ,K = observed treatment actually received

by subject i at decision k• Hi = accrued information for subject i up to decision k

H1i = X1i , Hki = (X1i ,A1i , . . . ,A(k−1)i ,Xki), k = 2, . . . ,K

• Yi = observed outcome for subject i ; can be ascertainedafter decision K or can be a function of X2i , . . . ,XKi

87/118

Page 88: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Data Sources

Goal: Under suitable assumptions , estimate dopt using thesedata

Studies:• Existing data from longitudinal observational studies,

previously conducted clinical trials with follow-up• Prospectively collected data from clinical trials conducted

specifically for this purpose

88/118

Page 89: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Assumptions

Single decision setting: Require the assumption of nounmeasured confounders• Treatment received is independent of potential outcomes

that would be achieved under all treatment options givenbaseline X

• Unverifiable for observational studies• Automatically satisfied in a randomized trial

89/118

Page 90: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Assumptions

Multiple decision setting: Require an analogous assumptionat all decision points• Sequential randomization assumption• Treatment received at each decision k = 1, . . . ,K is

independent of potential outcomes that would be achievedunder all treatment options at all decisions given accruedinformation Hk

• Unverifiable for longitudinal observational studies• Randomized trial ?

90/118

Page 91: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

SMART

Randomized study: Sequential Multiple AssignmentRandomized Trial – SMART• Randomize subjects to treatment options that are feasible

given their accrued information at each decision point• Sequential randomization assumption is automatically

satisfied

91/118

Page 92: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

SMART

Cancer

Response

NoResponse

Response

NoResponse

C

C

1

2

M1

M2

M1

M2

S1

S2

S1

S2

Randomization at •s

92/118

Page 93: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

SMART

Embedded regimes: The SMART for the cancer exampleembeds 8 simple regimes for j , `,p = 1,2 of the form

“Give cj followed by m` if response, else sp if nonresponse”

• E.g., for j = 1, ` = 1, p = 1

d1(h1) = c1 for all h1

d2(h2) = m1 if response= s1 if nonresponse

• These embedded regimes use no baseline information atdecision 1 and response status only at decision 2

• By randomization , there will be subjects whose actualtreatment received is consistent with following at least oneof these embedded regimes

93/118

Page 94: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

SMART

Result: Multi-purpose• Can estimate the value for each embedded regime and

formally compare (e.g., using suitable inverse probabilityweighted estimators)

• Can collect extensive, detailed information at baseline andintermediate to decisions 1 and 2 to inform estimation ofan optimal regime

SMART design principles:• Embed simple regimes• Base sample size on conventional comparisons, e.g.,

decision 1 treatments• Base sample size on embedded regimes (how?)• Open problem

94/118

Page 95: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Estimation of Optimal Treatment Regimes

Goal, restated: Estimate dopt satisfying• E{Y ∗(d)} ≤ E{Y ∗(dopt )} for all d ∈ D• E{Y ∗(d)|X1 = x1} ≤ E{Y ∗(dopt )|X1 = x1} for all d ∈ D

and x1 ∈ H1

Sequential randomization assumption: Data from• A SMART• A fabulous longitudinal observational study

For definiteness: Take K = 2 and Ak = {0,1}, k = 1,2• Recall accrued information

H1i = X1i , H2i = (X1i ,A1i ,X2i)

95/118

Page 96: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Characterizing the Optimal Regime

Optimal regime dopt : Follows from backward induction(dynamic programming )• Formally in terms of potential outcomes• Sequential randomization assumption allows equivalent

expressions in terms of observed data (X1,A1,X2,A2,Y )(as for single decision and no unmeasured confounders)

96/118

Page 97: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Characterizing the Optimal Regime

Optimal regime dopt : Backward induction• Decision 2: Q2(H2,A2) = E(Y |H2,A2)

dopt2 (h2) = I{Q2(h2,1) > Q2(h2,0)} = arg maxa2={0,1}Q2(h2,a2)

Y2(h2) = max{Q2(h2,0),Q2(h2,1)}

• Decision 1: Q1(H1,A1) = E{Y2(H2)|H1,A1}

dopt1 (h1) = I{Q1(h1,1) > Q1(h1,0)]} = arg maxa1={0,1}Q1(h1,a1)

Y1(h1) = max{Q1(h1,0),Q2(h1,1)}

• dopt = (dopt1 ,dopt

2 )

• The value of dopt is V (dopt ) = E{Y1(H1)}• Y2(h2) and Y1(h1) are referred to as the value functions

97/118

Page 98: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Q-Learning

Q-learning: May be thought of as a generalization of theregression estimator to sequential decisions• Reinforcement learning in computer science• Posit models for the “Q-functions ”• Involves some complications not present in the single

decision case

98/118

Page 99: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Q-Learning

Estimation of dopt :• Decision 2: Posit and fit a model Q2(H2,A2;β2) by

regressing Y on H2,A2 (e.g., least squares) and estimate

doptQ,2(h2) = I{Q2(h2,1; β2) > Q2(h2,0; β2)}

• For each i , form “predicted value ”

Y 2i = Y2i(H2i ; β2) = max{Q2(H2i ,0; β2),Q2(H2i ,1; β2)}

• Decision 1: Posit and fit a model Q1(H1,A1;β1) by

regressing Y 2 on H1,A1 (e.g., least squares) and estimate

doptQ,1(h1) = I{Q1(h1,1; β1) > Q2(h1,0; β1)}

• Estimated regime doptQ = (dopt

Q,1, doptQ,2)

99/118

Page 100: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Q-Learning

Estimation of V (dopt ):• For each i , form

Y 1i = Y1i(H1i ; β1) = max{Q1(H1i ,0; β1),Q1(H1i ,1; β1)}

• Estimate the value V (dopt ) by

VQ(dopt ) = n−1n∑

i=1

Y 1i

100/118

Page 101: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Q-Learning

Decision 2: Positing/fitting Q2(H2,A2;β2) is a standardregression problem• Data (H2i ,A2i ,Yi), i = 1, . . . ,n• For continuous Y , linear models are often used, e.g.

Q2(h2,a2;β2) = hT2 β21 + a2(hT

2 β22), h2 = (1,hT2 )T

• Standard model selection methods, diagnostics, etc• Under the linear model

dopt2 (h2) = I(hT

2 β22 > 0)

Y2(H2;β2) = HT2 β21 + (HT

2 β22)I(HT2 β22 > 0)

101/118

Page 102: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Q-Learning

Y2(H2;β2) = HT2 β21 + (HT

2 β22)I(HT2 β22 > 0)

Decision 1: It is common to assume a linear model forQ1(H1,A1) = E{Y2(H2;β2)|H1,A1}

Q1(h1,a1;β1) = hT1 β11 + a2(hT

1 β12), h1 = (1,hT1 )T

• “Data ” (H1i ,A1i ,Y 2i), i = 1 . . . ,n

• This is clearly a nonstandard regression problem; more ina moment

• Under the linear model (if it were correct )

dopt1 (h1) = I(hT

1 β12 > 0)

102/118

Page 103: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Q-Learning

Decision 1: Even if the Decision 2 linear model is correctlyspecified, a linear model at Decision 1 is almost certainlyincorrect• In truth Q2(h2,a2) = −x1 + a2(0.5a1 + x2) and

X2|X1 = x1,A1 = a1 ∼ N (ξx1, σ2)

• May show that

Q1(h1,a1) = −x1 +σ√2π

exp{−(0.5a1 + ξx1)2/(2σ2)}

+(0.5a1 + ξx1)Φ{(0.5a1 + ξx1)/σ}

• Not linear in general

103/118

Page 104: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Q-Learning

Issues and challenges:• Need not use linear models• Regardless, as in the single decision case, incorrect model

specification will impact quality of estimation of dopt

• Modeling at decisions K − 1, . . . ,1 challenging due to needto model max

• More flexible models for Q-functions can be used• Because of nonsmooth max operator, standard asymptotic

theory is invalid• Considerable current research

104/118

Page 105: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Other Regression Methods

A-learning: Related, alternative class of methods• g-estimation• In general, for k = 1, . . . ,K

doptk (hk ) = I{Qk (hk ,1) > Qk (hk ,0)} = I{Qk (hk ,1)−Qk (hk ,0) > 0}

depends only on the sign of the contrast function

Qk (hk ,1)−Qk (hk ,0)

• Model the contrast functions instead of the full Q-functions• Along with propensity scores for each decision known in a

SMART)

105/118

Page 106: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Methods

Generalization to K > 1:• Consider directly a restricted class of regimes Dη with

elements dη = (dη,1, . . . ,dη,K ); at decision k

dη,k (hk ) = dk (hk ; ηk )

• Based on cost , feasibility , interpretability at each decision• Optimal restricted regime dopt

η

ηopt = arg maxηE{Y ∗(dη)} = arg maxηV (dη)

• Estimator V (dη) for fixed η; maximize in η to obtain ηopt

• Required: A “good ” V (dη)

106/118

Page 107: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Methods

Extend missing data analogy to monotone dropout: K = 2• “Full data ”

{X1,X ∗2 (dη),Y ∗(dη)}• Define η-regime consistency indicator Cη

• Cη =∞: If a patient’s actual treatments A1,A2 are allconsistent with following dη, then

(X1,X2,Y ) = {X1,X ∗2 (dη),Y ∗(dη)}

• Cη = 2: If actual A1 is consistent with following dη but A2 isnot, then

(X1,X2) = {X1,X ∗2 (dη)}but Y ∗(dη) is “missing ” (“dropout ” before decision 2)

• Cη = 1: If neither of A1,A2 is consistent with following dη,both X ∗2 (dη),Y ∗(dη) are “missing ” (“dropout ” beforedecision 1)

107/118

Page 108: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Methods

Propensity scores: At decision k = 1, . . . ,K

πk (Hk ) = pr(Ak = 1|Hk )

• Randomized trial (SMART): πk (hk ) is known• Observational study: Posit and fit models πk (hk ; γk )

• Can express propensities of receiving treatment consistentwith dη through decision k in terms of πk (hk )

Result: Can develop IPWE and doubly-robust AIPWEestimators for V (dη) in terms of Cη and πk (hk )

108/118

Page 109: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Value Search Methods

Issues and challenges:• As for K = 1, is nonstandard optimization problem• IPWE (leading term for AIPWE) involves only subjects with

Cη =∞ (consistent with following regime for all Kdecisions )

• May become infeasible for K > 3• Simulation evidence: Performance comparable to

Q-learning with correct models; AIPWE is robust to modelmodel misspecification while Q-learning is not

109/118

Page 110: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Application: STAR*D

STAR*D: Sequenced Treatment Alternatives to RelieveDepression – SMART with ∼ 4000 subjects• Goal: Strategies for patients whose depressive disorder

does not benefit sufficiently from citalopram• Here: A simplified version• Run-in: All received citalopram• Decision 1: 1260 non-responders randomized to augment

(0) or switch (1)• Decision 2: 330 non-responders to decision 1 treatment

randomized to augment (0) or switch (1)• Response: Quick Inventory of Depressive

Symptomatology (QIDS) score

12-week QIDS < 5 (remission) OR > 50% decrease from entry

110/118

Page 111: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Simplified STAR*D schematic        

             

     

                                                   

Non-response to citalopram

Augment

Switch

Response

Non-Response

Response

Non-Response

Augment

Augment

Switch

Switch

Continue current

treatment

Continue current

treatment

Randomization at •s

111/118

Page 112: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Application: STAR*D

Data:• Baseline (at decision 1): X1 = (X11,X12), X11 = QIDS at

decision 1, X12 = QIDS slope from entry to decision 1• Decision 2: X2 = (X21,X22), X21 = QIDS at decision 2,

X22 = QIDS slope from decision 1 to decision 2• Treatment: Ak = 0,1, k = 1,2• Outcome: Cumulative average negative QIDS

Y = −I(X21 ≤ L0)X21 − I(X21 > L0)(X21 + T )/2

L0 = max(5,Xentry/2), T = ending QIDS

(function of accrued information )

112/118

Page 113: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Application: STAR*D

Y = −I(X21 ≤ L0)X21 − I(X21 > L0)(X21 + T )/2

L0 = max(5,Xentry/2), T = ending QIDS

Q-learning:• Decision 2:

Q2(H2,A2;β2) = −{I(X21 ≤ L0) + I(X21 > L0)/2}X21

+I(X21 > L0)(β20 + β21X21 + β22X22 + β23A2)/2

• Decision 1:

Q1(H1,A1) = β10 + β11X11 + β12X12 + A1(β13 + β14X12)

113/118

Page 114: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Application: STAR*D

Estimated regime:

doptQ,2(h2) = always switch

doptQ,1(h1) = I(x12 > −0.97) ( switch if QIDS slope > -0.97)

• VQ(dopt ) = −7.97

114/118

Page 115: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Discussion

• Two classes of methods for estimation of optimal regimesfor multiple decision points

• Q- and A-learning (sequential regression methods) –estimate an optimal regime based on sequential positedregression models

• Potential for model misspecification is high• Value search methods – robustness to misspecification• Limitation to small K due to need for “regime consistency ”

115/118

Page 116: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

References

Lunceford, J., Davidian, M., and Tsiatis, A. A. (2002).Estimation of the survival distribution of treatment regimes intwo-stage randomization designs in clinical trials. Biometrics58, 48–57.

Murphy, S. A. (2005). An experimental design for thedevelopment of adaptive treatment strategies. Statistics inMedicine 24, 1455–1481.

Schulte, P. J., Tsiatis, A. A., Laber, E. B., and Davidian, M.(2014). Q- and A-learning methods for estimating optimaldynamic treatment regimes. Statistical Science, in press.

Zhang, B., Tsiatis, A. A., Laber, E. B., and Davidian, M. (2013).Robust estimation of optimal dynamic treatment regimes forsequential treatment decisions. Biometrika 100, 681–694.

116/118

Page 117: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Closing Remarks

• Estimation of optimal treatment regimes is a wide openarea of research

• SMARTs are the “gold standard ” data source forestimation of optimal regimes

• Design considerations for SMARTs?• High-dimensional covariate information? Regression

model selection ?• “Black box ” vs. restricted class of regimes?• Inference ?• Balancing multiple outcomes (e.g., efficacy vs. toxicity )?• . . .

117/118

Page 118: Personalized Medicine and Dynamic Treatment …Personalized Medicine and Dynamic Treatment Regimes Marie Davidian and Butch Tsiatis Department of Statistics North Carolina State University

Thought Leaders

2013 MacArthur Fellow Susan Murphy and Jamie Robins

118/118