
The Supervised Learning Approach To Estimating Heterogeneous Causal Regime Effects

Thai T. Pham

Stanford Graduate School of Business
thaipham@stanford.edu

May, 2016

Introduction

Observations

Many sequential treatment settings: patients adjust medications over multiple periods; students decide whether to follow an educational honors program over multiple years; in the labor market, the unemployed might participate in a set of programs (job search, subsidized job, training) sequentially.

Heterogeneity in treatment sequence reactions: medication effects can be heterogeneous across patients and across time; the same holds for educational and labor market programs.

Hard to set up sequential randomized experiments in reality.

Thai T. Pham Estimation of Heterogeneous Causal Regime Effects 2 / 27

Introduction

Contributions

Develop a nonparametric framework using supervised learning to estimate heterogeneous treatment regime effects from observational (or experimental) data.

Treatment Regime: a set of functions of characteristics and intermediate outcomes.

Propose using a supervised learning approach (deep learning), which gives good estimation accuracy and is robust to model misspecification.

Propose a matching-based testing method for the estimation of heterogeneous treatment regime effects.

Propose a matching-based kernel estimator for the variance of heterogeneous treatment regime effects (time permitting).


Introduction

Contributions (cont’d)

In this paper, we

Focus on a dynamic setting with multiple treatments applied sequentially (in contrast to a single treatment).

Focus on the heterogeneous (in contrast to average) effect of a sequence of treatments, i.e., a treatment regime.

Focus on observational data (in contrast to experimental data).


An Illustrative Model: The Setup

Setup - Motivational Dataset

The North Carolina Honors Program Dataset.

There are 24,112 observations in total.

X0 = [Y0, d1, d2, d3], where Y0 is the Math test score at the end of 8th grade and d1, d2, d3 are census-data dummy variables.

W0,W1 ∈ {0,1} are treatment variables.

Y1: end of 9th grade Math test score.

Y2: end of 10th grade Math test score (object of interest).

Y0,Y1,Y2 are pre-scaled to have zero mean and unit variance.


An Illustrative Model: The Setup

Setup - Model

End of eighth grade:

Students’ initial information X0, which includes the Math test score Y0 and other personal information (d1, d2, d3), is observed.

Decide to follow honors (W0 = 1) or standard (W0 = 0) program.

End of ninth grade:

X0,W0, and Math test score Y1 are observed.

Decide to switch or stay in current program (W1 = 1 or 0).

End of tenth grade:

X0,W0,Y1,W1, and Math test score Y2 are observed.

Object of interest: Y2 (it could be any function of X0, Y1, Y2).


An Illustrative Model: The Setup

Potential Outcome (PO) Framework

Treatment regime d = (d0, d1) has

d0 : X0 → W0 ∈ {0,1} and d1 : X0 × W0 × Y1 → W1 ∈ {0,1}.

Potential Outcome Y1(W0) = Y1. Also, the observed outcome is

Y1 = W0 · Y1(1) + (1 − W0) · Y1(0).

Similarly, Y2^d = Y2 if the subject follows regime d. We also write Y2 = Y2^d = Y2(W0, W1) when d0 maps to W0 and d1 maps to W1.

We have

Y2 = W0 W1 · Y2(1,1) + W0 (1 − W1) · Y2(1,0) + (1 − W0) W1 · Y2(0,1) + (1 − W0)(1 − W1) · Y2(0,0).
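As a quick numerical sanity check of the decomposition above, the following sketch (variable names hypothetical) draws random potential outcomes and confirms that the observed outcome always picks out the potential outcome matching the realized treatments (W0, W1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# hypothetical potential outcomes Y2(a, b), one draw per subject and regime
Y2 = {(a, b): rng.normal(size=n) for a in (0, 1) for b in (0, 1)}
W0 = rng.integers(0, 2, size=n)
W1 = rng.integers(0, 2, size=n)

# observed outcome: exactly one of the four indicator products equals 1
obs = (W0 * W1 * Y2[(1, 1)]
       + W0 * (1 - W1) * Y2[(1, 0)]
       + (1 - W0) * W1 * Y2[(0, 1)]
       + (1 - W0) * (1 - W1) * Y2[(0, 0)])
```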


An Illustrative Model: The Setup

Types of Treatment Regime

Static Treatment Regime: subjects specify (or are assigned) the whole treatment plan based only on the initial covariates (X0).

So d : X0 → (W0, W1) ∈ {0,1}^2.

Dynamic Treatment Regime: subjects choose (or are assigned) the initial treatment based on the initial covariates (X0); then subsequently choose (or are assigned) the next treatment based on the initial covariates (X0), the first-period treatment (W0), and the intermediate outcome (Y1); and so on.

This is our original setup.


An Illustrative Model: The Setup

Potential Outcome (PO) Framework (Cont’d)

Objective: Estimate E[Y2^d − Y2^{d′}] for individuals (or on average), and derive the heterogeneous optimal regime

d*(C) = argmax_d E[Y2^d | C]

for individual covariates C.

Difficulties:

Fundamental Problem of Causal Inference: for each subject, we never observe both Y2^d and Y2^{d′}.

Selection Bias: students following d may be fundamentally different from those following d′ (e.g., students with good test scores choose the honors program in each period).


An Illustrative Model: Identification Results

Identification Result - Static Treatment Regime

Theorem (Identification Result - STR)

Let d0 = d0(X0) and d1 = d1(X0). Then (with Assumptions)

E[ Y2 · 1{W0 = d0} · 1{W1 = d1} / ( P(W0 = d0 | X0) · P(W1 = d1 | X0) ) | X0 ] = E[ Y2^d | X0 ].

Corollary:

E[ Y2 · ( W0 W1 / (e0 e1) − (1 − W0)(1 − W1) / ((1 − e0)(1 − e1)) ) | X0 ] = E[ Y2(1,1) − Y2(0,0) | X0 ].

Here, e0 = P(W0 = 1 | X0) and e1 = P(W1 = 1 | X0). The quantity inside the left-hand expectation is the observed (estimable) transformed outcome; the right-hand side is the unobserved difference of potential outcomes.
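The corollary's transformed outcome is simple to compute once propensity scores are in hand. A minimal numpy sketch (function name hypothetical); under complete randomization with e0 = e1 = 0.5, its sample mean should approximate E[Y2(1,1) − Y2(0,0)]:

```python
import numpy as np

def transformed_outcome(y2, w0, w1, e0, e1):
    """Observable quantity whose conditional mean equals
    E[Y2(1,1) - Y2(0,0) | X0] under the identification result."""
    return y2 * (w0 * w1 / (e0 * e1)
                 - (1 - w0) * (1 - w1) / ((1 - e0) * (1 - e1)))
```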



An Illustrative Model: Identification Results

Identification Result - Dynamic Treatment Regime

Theorem (Identification Result - DTR)

Let d0 = d0(X0) and d1 = d1(X0, X1, Y1, W0). Then (with Assumptions)

In period T = 1:

E[ Y2 · 1{W1 = d1} / P(W1 = d1 | X0, X1, Y1, W0) | X0, X1, Y1, W0 ] = E[ Y2^{d1} | X0, X1, Y1, W0 ],

where the left-hand side is observed (estimable) and Y2^{d1} is a potential outcome.

In period T = 0:

E[ Y2 · 1{W1 = d1} · 1{W0 = d0} / ( P(W1 = d1 | X0, X1, Y1, W0) · P(W0 = d0 | X0) ) | X0 ] = E[ Y2^d | X0 ].



Model Estimation

Challenges In Traditional Approach

Goal: Specify a relation b/w transformed outcome T and covariates C.

Econometric approaches assume T = h(C; β) + ε for a fixed (linear) function h(·) with E[ε | C] = 0, and estimate β by minimizing

||T − h(C; β)||².

Problem: linear models need not give good estimates in general.


Model Estimation

Machine Learning Approach

Machine learning methods generally give much more accurate estimates than traditional econometric models.

Empirical comparisons of different machine learning methods with linear regressions:

Caruana and Niculescu-Mizil (2006)1

Morton, Marzban, Giannoulis, Patel, Aparasu, and Kakadiaris (2014)2

1 Caruana, R. and A. Niculescu-Mizil (2006), “An Empirical Comparison of Supervised Learning Algorithms,” Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA.

2 Morton, A., E. Marzban, G. Giannoulis, A. Patel, R. Aparasu, and I. A. Kakadiaris (2014), “A Comparison of Supervised Machine Learning Techniques for Predicting Short-Term In-Hospital Length of Stay Among Diabetic Patients,” 13th International Conference on Machine Learning and Applications.


Model Estimation

Machine Learning Approach (Cont’d)

Goal: Specify a relation b/w transformed outcome T and covariates C.

Machine learning (ML) methods allow h(·) to vary in complexity, and estimate β by minimizing ||T − h(C; β)||² + λ g(β), where g penalizes complex models.

Data set = (Training, Validation, Test). Use the training set (with cross-validation) to choose the optimal h(·) in terms of complexity, the validation set to choose the optimal hyperparameter λ, and the test set to evaluate performance.

RMSE is the comparison criterion.

Hence, the ML approach is flexible and performance oriented.
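For the special case h(C; β) = Cβ with g(β) = ||β||², the penalized minimization above has a closed form (ridge regression); a minimal sketch (function name hypothetical), where λ = 0 recovers ordinary least squares:

```python
import numpy as np

def penalized_fit(C, T, lam):
    """Minimize ||T - C @ beta||^2 + lam * ||beta||^2 in closed form:
    beta = (C'C + lam I)^{-1} C'T."""
    p = C.shape[1]
    return np.linalg.solve(C.T @ C + lam * np.eye(p), C.T @ T)
```

Larger λ shrinks the coefficients toward zero, trading bias for variance.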


Model Estimation

Estimating Model

Propensity Score Estimation: Use logistic regression or other ML techniques such as Random Forest, Gradient Boosting, etc.

Full Model Estimation: (Though many ML techniques would work) We use a deep learning method from the machine learning literature called the “Multilayer Perceptron.”

It possesses the universal approximation property: it can approximate any continuous function on any compact subset of R^n.


Model Estimation

Multilayer Perceptron (MLP)

Assume we want to estimate T = h(C; β) + ε. An MLP with one hidden layer considers

h(C; β) = Σ_{j=1}^{K} αj σ(γj^T C + θj) and β = (K, (αj, γj, θj)_{j=1}^{K}),

where σ is a sigmoid function such as σ(x) = 1/(1 + exp(−x)). Empirically, MLP (and deep learning in general) is shown to work very well (LeCun et al.3, Mnih et al.4).

3 LeCun, Y., Y. Bengio, and G. Hinton (2015), “Deep Learning,” Nature 521, 436-444 (28 May).

4 Mnih, V., K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski, S. Petersen, C. Beattie, A. Sadik, I. Antonoglou, H. King, D. Kumaran, D. Wierstra, S. Legg, and D. Hassabis (2015), “Human-level control through deep reinforcement learning,” Nature 518, 529-533 (26 February).
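The one-hidden-layer form above can be written down directly; a numpy sketch of the forward pass (the training of α, γ, θ by penalized least squares is omitted):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mlp_one_hidden(C, alpha, gamma, theta):
    """h(C; beta) = sum_j alpha_j * sigmoid(gamma_j^T C + theta_j),
    evaluated for a batch C of shape (n, d); gamma has shape (K, d),
    alpha and theta have shape (K,)."""
    return sigmoid(C @ gamma.T + theta) @ alpha
```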


Model Estimation: Testing Method

Matching Based Testing Method

The identification results relate an unobserved difference of potential outcomes Z to an observed (or estimable) transformed outcome T:

E[T | C] = E[Z | C].

For example, in STR: Z = Y2^d − Y2^{d′}, C = X0.

Randomly draw M units with treatment regime d. Denote by x0^{d,m} and y2^{d,m} the covariates and corresponding outcomes.

For each m, determine x0^{d′,m} = argmin_{x0^i : regime = d′} ||x0^i − x0^{d,m}||_2.

Let τm = y2^{d,m} − y2^{d′,m}. Here, τ is a proxy for the unobserved Z.

Let τ̂ be the estimator which fits x0 to T. Define τ̂m = (1/2)(τ̂(x0^{d,m}) + τ̂(x0^{d′,m})).

Define the matching loss M: sqrt( (1/M) Σ_{m=1}^{M} (τm − τ̂m)² ).


Simulations: Setup

Simulation Setup

Test the ability of our method to adapt to heterogeneity in the treatment regime effect.

50,000 obs for training; 5,000 obs for validation; 5,000 for testing.

X0 ∼ U([0,1]^10); W0 ∈ {0,1}; Y1 ∈ R with standard normal noise; W1 ∈ {0,1}; Y2 ∈ R with standard normal noise.

e0(X0) = e1(X0) = e1(X0,W0,Y1) = 0.5.

τ1(X0) = E[ Y1^{W0=1} − Y1^{W0=0} | X0 ] = ξ(X0[1]) ξ(X0[2]); and

τ2(X0, W0, Y1) = E[ Y2^{W1=1} − Y2^{W1=0} | X0, W0, Y1 ] = ρ(Y1) ρ(W0) ξ(X0[1]),

where

ξ(x) = 2 / (1 + e^{−12(x − 1/2)}); ρ(x) = 1 + 1 / (1 + e^{−20(x − 1/3)}).
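The simulation's effect functions can be coded directly from the formulas above; a sketch of ξ, ρ, and the two effect surfaces (X0[1], X0[2] on the slide are the first two coordinates, zero-indexed below):

```python
import numpy as np

def xi(x):
    # xi(x) = 2 / (1 + exp(-12 (x - 1/2)))
    return 2.0 / (1.0 + np.exp(-12.0 * (x - 0.5)))

def rho(x):
    # rho(x) = 1 + 1 / (1 + exp(-20 (x - 1/3)))
    return 1.0 + 1.0 / (1.0 + np.exp(-20.0 * (x - 1.0 / 3.0)))

def tau1(X0):
    # first-period effect: depends only on the first two covariates
    return xi(X0[:, 0]) * xi(X0[:, 1])

def tau2(X0, W0, Y1):
    # second-period effect: heterogeneous in Y1, W0, and X0[:, 0]
    return rho(Y1) * rho(W0) * xi(X0[:, 0])
```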


Simulations: Results

Simulation Results

Table: Performance In Terms of Root Mean Squared Error (RMSE)

Method        Linear Regression (LR)   Multilayer Perceptron (MLP)
STR           1.75                     1.66
DTR: T = 0    0.74                     0.13
DTR: T = 1    1.10                     0.20

* sdv(TO: T = 0) = 2.41; sdv(true effect: T = 0) = 1.34.
* sdv(TO: T = 1) = 3.21; sdv(true effect: T = 1) = 2.52.

Comments:
MLP returns very good results, and it outperforms LR.
The static setting does not fit here, as the RMSEs on STR are bad.


Simulations: Results

Simulation Results (cont’d)5

[Scatter-plot figure, six panels. First row: MLP: T = 0 | True Effect: T = 0 | LR: T = 0. Second row: MLP: T = 1 | True Effect: T = 1 | LR: T = 1.]

Figure: Heterogeneous Treatment Regime Effect Using Validation and Test Data. The first row corresponds to period T = 0 and the second row corresponds to period T = 1. In each period: the middle picture visualizes the true treatment effect; the left one is the estimated effect using Multilayer Perceptron; and the right one is the estimated effect using Linear Regression.

5 We thank Wager and Athey (2015) for sharing their visualization code.

Empirical Application: Estimation of Propensity Scores

Propensity Score Estimation

Use the North Carolina Honors Program data (from the illustrative model).

Estimate P(W0 = 1 | X0), P(W1 = 1 | X0), and P(W1 = 1 | X0, Y1, W0).

Treat this as a probabilistic classification problem and use Random Forest.


Empirical Application: Static Treatment Regime Estimation

Model Estimation - Static Treatment Regime

Use three methods: Linear Regression, Gradient Boosting, and Multilayer Perceptron.

Method                  Validation Matching Loss   Test Matching Loss
Linear Regression       11.15                      9.77
Gradient Boosting       5.03                       4.89
Multilayer Perceptron   3.20                       3.27

Comments:
MLP outperforms the other methods.
All results are bad, which signals the dynamic nature of the data.



Empirical Application: Dynamic Treatment Regime Estimation

Model Estimation - Dynamic Treatment Regime

Period T = 1

Method                  Validation Matching Loss   Test Matching Loss
Linear Regression       1.29                       1.29
Gradient Boosting       0.94                       1.01
Multilayer Perceptron   0.94                       1.01

* sdv(TO: val) = 4.06; sdv(est. true effect: val) = 0.85.
* sdv(TO: test) = 4.03; sdv(est. true effect: test) = 0.92.


Empirical Application: Dynamic Treatment Regime Estimation

Model Estimation - Dynamic Treatment Regime (Cont’d)

Period T = 0

Method                  Validation Matching Loss   Test Matching Loss
Linear Regression       3.29                       3.45
Gradient Boosting       1.51                       1.63
Multilayer Perceptron   1.14                       1.60

DTR - Use only students who follow the optimal treatment in T = 1.

* sdv(TO: val) = 6.94; sdv(est. true effect: val) = 0.84.
* sdv(TO: test) = 7.45; sdv(est. true effect: test) = 0.98.

Remark: The results are worse than those in the simulations due to unobserved heterogeneity.


Empirical Application: Heterogeneous Optimal Regime Estimation

Heterogeneous Optimal Regime

Static Treatment Regime


Empirical Application: Heterogeneous Optimal Regime Estimation

Heterogeneous Optimal Regime (cont’d)

STR: Estimated gain per student from the heterogeneous optimal regime over the homogeneous optimal regime (0,0):

( Σ_{(1,1) opt} [Y2(1,1) − Y2(0,0)] + Σ_{(1,0) opt} [Y2(1,0) − Y2(0,0)] + Σ_{(0,1) opt} [Y2(0,1) − Y2(0,0)] ) / ( Σ_{(0,0) not opt} 1 ) = 0.91.

DTR: Estimated gain per student in T = 0 from the heterogeneous optimal W0 over the homogeneous optimal treatment W0 = 0:

Σ_{W0=1 optimal} [Y2^{W0=1} − Y2^{W0=0}] / (#obs used in T = 0 s.t. W0 = 1 opt) = 0.74.

* mean(Y2) = 0; min(Y2) = −4.06; max(Y2) = 3.66.


Conclusion

Conclusion

We developed a nonparametric framework using supervised learning to estimate heterogeneous causal regime effects.

Our model addresses the dynamic treatment setting, population heterogeneity, and the difficulty of setting up sequential randomized experiments in reality.

We introduced a machine learning approach, in particular deep learning, which demonstrates strong estimation power and is robust to model misspecification.

We also introduced a matching-based testing method for the estimation of heterogeneous treatment regime effects. A matching-based kernel estimator for the variance of these effects is introduced in the Appendix.


Appendix

Variance Estimation - Matching Kernel Approach

Matching

M matching pairs [(x0^{d,m}, y2^{d,m}); (x0^{d′,m}, y2^{d′,m})], m = 1, ..., M.

Fix x0^new. To estimate σ²(x0^new) = Var(Y2^d − Y2^{d′} | x0^new), we define

εm = τm − τ̂m.

Let x0^{mean,m} = (x0^{d,m} + x0^{d′,m}) / 2. An estimator for σ²(x0^new) is

σ̂²(x0^new) = Σ_{m=1}^{M} K(H^{−1}[x0^{mean,m} − x0^new]) εm² / Σ_{m=1}^{M} K(H^{−1}[x0^{mean,m} − x0^new]).

