causal inference from graphical models iisteffen/teaching/grad/caus13ii.pdfcausal interpretation of...

42
Recapitulating Bayesian networks Causal Bayesian networks Structural equation systems Computation of effects Identifiability of causal effects Chain graph models References Causal Inference from Graphical Models — II Steffen Lauritzen, University of Oxford Graduate Lectures Oxford, October 2013 Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Upload: others

Post on 19-Jan-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Causal Inference from Graphical Models — II

Steffen Lauritzen, University of Oxford

Graduate Lectures

Oxford, October 2013

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 2: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Markov propertiesMoralization

An probability distribution P of Xv , v ∈ V satisfies the localMarkov property w.r.t. a directed acyclic graph D if

(L) : ∀α ∈ V : α⊥⊥{nd(α) \ pa(α)} | pa(α).

It factorizes over D if its density or probability mass function f hasthe form

(F) : f (x) =∏v∈V

f (xv | xpa(v)).

It satisfies the global Markov property w.r.t. D if

(G) : A⊥d B |S ⇒ A⊥⊥B | S .

These directed Markov properties are equivalent:

(G) ⇐⇒ (L) ⇐⇒ (F).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 3: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Markov propertiesMoralization

Separation in DAGs

A node γ in a trail τ is a collider if edges meet head-to head at γ:s-� s- - s� - s� �

A trail τ from α to β in D is active relative to S if both conditionsbelow are satisfied:

I all its colliders are in S ∪ an(S)

I all its non-colliders are outside S

A trail that is not active is blocked. Two subsets A and B ofvertices are d-separated by S if all trails from A to B are blockedby S . We write A⊥d B |S .

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 4: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Markov propertiesMoralization

Separation by example

3 6

1 5 7

2 4

u uu u u

u u

-

@@@R

����

@@@R

-

@@@R

@@@R

����

����

-

3 6

1 5 7

2 4

u uu u u

u u

-

@@@R

����

@@@R

-

@@@R

@@@R

����

����

-

For S = {5}, the trail (4, 2, 5, 3, 6) is active, whereas the trails(4, 2, 5, 6) and (4, 7, 6) are blocked. For S = {3, 5}, they are allblocked.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 5: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Markov propertiesMoralization

The moral graph Dm of a DAG D is obtained by adding undirectededges between unmarried parents and subsequently droppingdirections, as in the example below:

3 6

1 5 7

2 4

s ss s ss s-

@@R

���

@@R

-@@R

@@R

���

���

-

3 6

1 5 7

2 4

s ss s ss s-

@@R

���

@@R

-@@R�

�@@R

���

���

-

3 6

1 5 7

2 4

s ss s ss s@@

��

@@

@@

��

��

��@@

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 6: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Markov propertiesMoralization

Alternative equivalent separation

To resolve query involving three sets A, B, S :

1. Reduce to subgraph induced by ancestral set DAn(A∪B∪S) ofA ∪ B ∪ S ;

2. Moralize to form (DAn(A∪B∪S))m ;

3. Say that S m-separates A from B and write A⊥m B | S if andonly if S separates A from B in this undirected graph.

It then holds that A⊥m B | S if and only if A⊥d B | S.Proof in Lauritzen (1996) needs to allow self-intersecting paths tobe correct.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 7: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Markov propertiesMoralization

Forming ancestral set

3 6

1 5

2 4

u uu u

u u

-

@@@R

����

@@@R

@@@R

����

-

The subgraph induced by all ancestors of nodes involved in thequery 4⊥m 6 | 3, 5?

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 8: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Markov propertiesMoralization

Adding links between unmarried parents

3 6

1 5

2 4

u uu u

u u

-

@@@R

����

@@@R

@@@R

����

-

Adding an undirected edge between 2 and 3 with common child 5in the subgraph induced by all ancestors of nodes involved in thequery 4⊥m 6 | 3, 5?

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 9: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Markov propertiesMoralization

Dropping directions

3 6

1 5

2 4

u uu u

u u@@@

���

@@@

@@@

���

Since {3, 5} separates 4 from 6 in this graph, we can conclude that4⊥m 6 | 3, 5

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 10: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Intervention vs. observationCausal interpretation

Standard causal interpretation of any probabilistic model (Spirteset al., 1993; Pearl, 2000) emphasizes distinction betweenconditioning by observation and conditioning by intervention.

We use special notations for this

P(X = x |Y ← y) = P{X = x | do(Y = y)} = p(x || y), (1)

whereas

p(y | x) = p(Y = y |X = x) = P{Y = y | is(X = x)}.

Causal interpretation of a Bayesian network involves giving (1) asimple form.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 11: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Intervention vs. observationCausal interpretation

We say that a BN is causal w.r.t. atomic interventions at B ⊆ V ifit holds for any A ⊆ B that

p(x || x∗A) =∏

v∈V \A

p(xv | xpa(v))

∣∣∣∣∣∣xA=x∗A

For A = ∅ we obtain standard factorisation.

Note that conditional distributions p(xv | xpa(v)) are stable underinterventions which do not involve xv . Such assumption must bejustified in any given context.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 12: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Linear structural equation systemsIntervention by replacement

3 6

1 5 7

2 4

s ss s ss s-@@R

���

@@R

-@@R@@R

��� ���

-

A linear structural equation system for this network is

X1 ← α1 + U1

X2 ← α2 + β21x1 + U2

X3 ← α3 + β31x1 + U3

X4 ← α4 + β42x2 + U4

X5 ← α5 + β52x2 + β53x3 + U5

X6 ← α6 + β63x3 + β65x5 + U6

X7 ← α7 + β74x4 + β75x5 + β76x6 + U7.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 13: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Linear structural equation systemsIntervention by replacement

After intervention by replacement, the system changes to

X1 ← α1 + U1

X2 ← α2 + β21x1 + U2

X3 ← α3 + β31x1 + U3

X4 ← x∗4

X5 ← α5 + β52x2 + β53x3 + U5

X6 ← α6 + β63x3 + β65x5 + U6

X7 ← α7 + β74x∗4 + β75x5 + β76x6 + U7.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 14: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Linear structural equation systemsIntervention by replacement

Justification of causal models by structural equations

Intervention by replacement in structural equation system impliesD causal for distribution of Xv , v ∈ V .

Occasionally used for justification of CBN.

Ambiguity in choice of gv and Uv makes this problematic.

May take stability of conditional distributions as a primitive ratherthan structural equations.

Structural equations more expressive when choice of gv and Uv

can be externally justified.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 15: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Assessment of effects of actionsIntervention diagrams

a l b

r

t - t -

t@

@@I

����6 t

a - treatment with AZT; l - intermediate response (possible lungdisease); b - treatment with antibiotics; r - survival after a fixedperiod.Predict survival if Xa ← 1 and Xb ← 1, assuming stableconditional distributions.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 16: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Assessment of effects of actionsIntervention diagrams

G-computation

a l b

r

t - t -

t@

@@I

����6 t

p(1r || 1a, 1b) =∑xl

p(1r , xl || 1a, 1b)

=∑xl

p(1r | xl , 1a, 1b)p(xl | 1a).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 17: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Assessment of effects of actionsIntervention diagrams

Augment each node v ∈ A where intervention is contemplatedwith additional parent variable Fv .Fv has state space Xv ∪ {φ} and conditional distributions in theintervention diagram are

p′(xv | xpa(v), fv ) =

{p(xv | xpa(v)) if fv = φ

δxv ,x∗v if fv = x∗v ,

where δxy is Kronecker’s symbol

δxy =

{1 if x = y0 otherwise.

Fv is forcing the value of Xv when Fv 6= φ.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 18: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Assessment of effects of actionsIntervention diagrams

It now holds in the extended DAG, i.e. the intervention diagramthat

p(x) = p′(x |Fv = φ, v ∈ A),

but also

p(x || x∗B) = P(X = x |XB ← x∗B)

= P ′(x |Fv = x∗v , v ∈ B,Fv = φ, v ∈ B \ A),

In particular it holds that if pa(v) = ∅, then p(x | x∗v ) = p(xv || x∗v ).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 19: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Assessment of effects of actionsIntervention diagrams

More generally we can explicitly join decision nodes δ ∈ ∆ to theDAG as parents of nodes which they affect.

Further, each of these can have parents in D or in ∆ to indicatethat intervention at δ may depend on states of pa(δ). A strategy σyields a conditional distribution of decisions, given their parents toyield

f (x ||σ) =∏v∈V

f (xv | xpa(v))∏δ∈∆

σ(xδ | xpa(δ))

where now pa(v) refer to parents in the extended diagram, whichmust be a DAG to make sense.

This formally corresponds to the notion of LIMIDs (Lauritzen andNilsson, 2001).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 20: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Assessment of effects of actionsIntervention diagrams

A C

B D

E

FA

FB

FD

FC

LIMID for a causal interpretation of a DAG. Red nodes represent(external) forces or interventions that affect the conditionaldistributions of their children. Note that interventions can beallowed to depend on other variables (treatment strategies).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 21: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

Treatment variable t, response r , set of observed covariates C ,unobserved variables U.

When and how can p(Xr || xt) be calculated from p(xt , xr , xC ), thelatter in principle being observable from data?

In this case we could say that C is a identifier for assessing theeffect of T on R.

Answer can be found by analysing intervention diagram.

Simplest cases known as back-door and front-door criteria andformulae.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 22: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

D′ denotes D augmented with Ft .Assume C ⊇ C0, where C0 satisfies

(BD1) Covariates in C0 are unaffected by an intervention:C0⊥D′ Ft ;

(BD2) Intervention only affects response through chosentreatment: R ⊥D′ Ft |C0 ∪ {t}.

Then C identifies the effect of the treatment t on R as

p(xr || x∗t ) =∑xC0

p(xr | xC0 , x∗t )p(xC0).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 23: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

Confounding

Ft t r

u

t - t -

t@@@R? t

Ft t r

u

t��� t tt@@@

The unobserved confounder Xu is affecting both treatment andresponse.BD2 is violated; graph to the right reveals that Ft is notd-separated from r by t, so treatment effect is not identifiable.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 24: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

Randomisation

Ft t r

c u

t - t -

t@@@R

?

t?t

Ft t r

c u

t��� t��� tt@@@

t

When Xt is randomised, possibly depending on observed covariatec , confounding is resolved.Now Ft ⊥D′ r | {c , t} and c is an identifier.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 25: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

Sufficient covariate

Ft t r

u c

t - t -

t -

?

t?t

Ft t r

u c

t��� t��� tt t

Alternatively, an observed covariate c can ‘screen away’ theconfounding effect on the treatment.Also here, Ft ⊥D′ r | {c , t} and c is an identifier.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 26: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

Ft t c r

u

t - t - t -

t@@@R

��� t

In this case c is the agent through which the treatment effects theresponse. Then one can show

p(xr || x∗t ) =∑xc

p(xc | x∗t )∑xt

p(xr | xc , xt)p(xt).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 27: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

I is an instrument (Durbin, 1954; Bowden and Turkington, 1984;Angrist et al., 1996) if

Fi i t r

Ft u

t - t - t -

t?

t?

��� t

Fi

ti t r

Ft u

t�������

��

t��� tt t

i is treatment assigned, t is treatment taken.

The graph to the right reveals that r ⊥D′ Fi | {i} so the effect ofthe treatment assignment is identified.

However, r is not d-separated from Ft by t so the effect of thetreatment itself cannot.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 28: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

In the linear case, the effect of t on r can be found as the ratio ofeffects of i on r and the effect of i on t, both of which areidentified.

But linearity and additivity of errors are very strong assumptions.

Bounds are available in the general case using linear programmingmethods (Balke and Pearl, 1997; Dawid, 2003).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 29: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

Mendelian randomization

Same as instrumental variable

Fg g x r

u

t - t - t -

t?

��� t

g is gene assigned, x could be exposure or expression.

Bounds for exposure effects are available.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 30: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Back-door criterion and formulaClassic casesFront-door formulaInstrumental variable

It holdsmaxxt

∑xr

maxxi

p(xr , xt | xi )p(xr ) ≤ 1, (2)

This instrumental inequality was first derived by Pearl (1995). Canbe used to falsify that something is an instrument (Ramsahai andLauritzen, 2011).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 31: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

A standard chain graph is a mixed graph with no multiple edges,no bi-directed edges, and no directed or semi-directed cycles i.e. nocycles with all arrows on the cycle pointing in the same direction.

A C

B D

E

A C

B D

E

The graph to the left is a chain graph, with chain components(connected components after removing arrows){A}, {B}, {C ,D}, {E}. The graph to the right is not a chaingraph, due to the semi-directed cycle 〈A→ C — D → B — A〉.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 32: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

A chain graph with no undirected edges is a directed acyclic graphor DAG.

A chain graph with no directed edges is an undirected graph orUG.

The chain components T of a chain graph are connectedcomponents of subgraph induced by undirected edges.

In a DAG, all chain components are singletons and in an undirectedgraph, the chain components are the connected components.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 33: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

The chain graph Markov has an outer factorization

f (x) =∏τ∈T

f(xτ | xpa(τ)

), (3)

where each factor further factorizes w.r.t. the graph G∗(τ) as

f(xτ | xpa(τ)

)= Z−1

(xpa(τ)

) ∏A∈A(τ)

φA(xA), (4)

where A(τ) are the complete sets in G∗(τ) and

Z(xpa(τ)

)=∑xτ

∏A∈A(τ)

φA(xA).

The graph G∗(τ) is obtained from Gτ∪pa(τ) by dropping directionson edges and adding edges between any pair of members of pa(τ).Matched by a global Markov property as for DAGs and UGs.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 34: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

A C

B D

E

A C

B D

G∗({C ,D})

Chain components {A}, {B}, {C ,D}, {E}.Outer factorization:

f (x) = f (xA)f (xB)f (xCD | xAB)f (xE | xCD)

Inner factorization:

f (xCD | xAB) = Z−1(xAB)φ(xAC )φ(xBD)φ(xCD).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 35: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

A C

B D

E

A C

B D

A

B

A C

B D

E

GmAB Gm

ABCD Gm

Chain components {A}, {B}, {C ,D}, {E}.Conditional independence read from sequence of moral graphs

A⊥⊥B, C ⊥⊥B | {A,D}, D ⊥⊥A | {B,C}, E ⊥⊥{A,B} | {C ,D}

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 36: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

Intervention conditioning in an undirected graph, corresponding toferromagnetism, is made by

f (xV \B || x∗B) = (Z ∗)−1∏A∈A

φA(xA)

∣∣∣∣∣xB=x∗B

= f (xV \B | x∗B).

Hence this corresponds to standard conditioning.

More generally, the system can be affected by new potentials

f (xV ||σ) = (Z ∗)−1∏a∈A

φA(xA)∏B∈B

σB(xB)

where the atomic interventions above correspond to some of thenew potentials being Dirac delta functions, known as quenching inPhysics.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 37: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

There is a similar intervention calculus for chain graphs

f (x) =∏τ∈T

f(xτ | xpa(τ)

) ∏δ∈∆

σ(xδ | xpa(δ))

where each factor in the left product further factorizes according tothe graph G∗(τ) as before. Also pa refer to parents in the extendedgraph, hence may include intervention nodes.

To make sense, the extended diagram must be a chain graph.

This form of LIMIDs was discussed in Cowell et al. (1999).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 38: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

LIMID for a chain graph

A C

B D

E

FA

FB

FD

FC

The exact same interpretation can be given to a chain graph.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 39: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

DefinitionFactorization and Markov propertyCausal interpretation in undirected graphsCausal chain graphs

Atomic intervention conditioning in a chain graph now leads to

f (xV \A || x∗A) =f (x)∏

τ∈T f (xτ∩A | xpa(τ))

∣∣∣∣xA=x∗A

.

This specializes to standard conditioning in undirected graphs andintervention conditioning in DAGs (Lauritzen and Richardson,2002).

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 40: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Angrist, J. D., Imbens, G. W., and Rubin, D. B. (1996).Identification of causal effects using instrumental variables (withdiscussion). Journal of the American Statistical Association,91:444–472.

Balke, A. and Pearl, J. (1997). Bounds on treatment effects fromstudies with imperfect compliance. Journal of the AmericanStatistical Association, 92:1171–1176.

Bowden, R. J. and Turkington, D. A. (1984). InstrumentalVariables. Cambridge University Press, Cambridge, UK.

Cowell, R. G., Dawid, A. P., Lauritzen, S. L., and Spiegelhalter,D. J. (1999). Probabilistic Networks and Expert Systems.Springer-Verlag, New York.

Dawid, A. P. (2003). Causal inference using influence diagrams:the problem of partial compliance. In Green, P. J., Hjort, N. L.,

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 41: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

and Richardson, S., editors, Highly Structured StochasticSystems, chapter 3, pages 45–65. Clarendon Press.

Durbin, J. (1954). Errors in variables. Review of the InternationalStatistical Institute, 22:23–32.

Lauritzen, S. L. (2001). Causal inference from graphical models. InBarndorff-Nielsen, O. E., Cox, D. R., and Kluppelberg, C.,editors, Complex Stochastic Systems, pages 63–107. Chapmanand Hall/CRC Press, London/Boca Raton.

Lauritzen, S. L. and Nilsson, D. (2001). Representing and solvingdecision problems with limited information. ManagementScience, 47:1238–1251.

Lauritzen, S. L. and Richardson, T. S. (2002). Chain graph modelsand their causal interpretation (with discussion). Journal of theRoyal Statistical Society, Series B, 64:321–361.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II

Page 42: Causal Inference from Graphical Models IIsteffen/teaching/grad/caus13II.pdfCausal interpretation of a Bayesian network involves giving (1) a simple form. Ste en Lauritzen, University

RecapitulatingBayesian networks

Causal Bayesian networksStructural equation systems

Computation of effectsIdentifiability of causal effects

Chain graph modelsReferences

Pearl, J. (1995). Causal inference from indirect experiments.Artificial Intelligence in Medicine, 7:561–582.

Pearl, J. (2000). Causality. Cambridge University Press,Cambridge.

Ramsahai, R. R. and Lauritzen, S. L. (2011). Likelihood analysis ofthe binary instrumental variable model. Biometrika, 98:987–994.

Spirtes, P., Glymour, C., and Scheines, R. (1993). Causation,Prediction and Search. Springer-Verlag, New York. Reprinted byMIT Press.

Steffen Lauritzen, University of Oxford Causal Inference from Graphical Models — II