Distinguishing between Cause and Effect: Estimation of Causal Graphs with two Variables
Jonas Peters, ETH Zurich
Tutorial
NIPS 2013 Workshop on Causality, 9th December 2013
F. H. Messerli: Chocolate Consumption, Cognitive Function, and Nobel Laureates, N Engl J Med 2012
Jonas Peters (ETH Zurich) Distinguishing between Cause and Effect 9th December 2013
Problem: Given $P(X, Y)$, can we infer whether
$X \to Y$ or $Y \to X$?
Difficulty: So much symmetry:
$P(X) \cdot P(Y \mid X) = P(X, Y) = P(X \mid Y) \cdot P(Y)$
We need assumptions! (E.g., Markov and faithfulness do not suffice.)
Surprise (for some assumptions):
identifiability for 2 variables ⇒ identifiability for p variables
J. Peters, J. Mooij, D. Janzing and B. Schölkopf: Causal Discovery with Continuous Additive Noise Models, arXiv:1309.6779
Idea No. 1: Linear Non-Gaussian Additive Models (LiNGAM)
Structural assumptions like additive non-Gaussian noise models break the symmetry:
$Y = \beta X + N_Y, \quad N_Y \perp\!\!\!\perp X,$
with $N_Y$ non-Gaussian.
Asymmetry No. 1
Consider a distribution corresponding to
$Y = \beta X + N_Y$
• $N_Y \perp\!\!\!\perp X$
• $N_Y$ non-Gaussian
(graph: $X \to Y$)
Then there is no
$X = \phi Y + N_X$
• $N_X \perp\!\!\!\perp Y$
• $N_X$ non-Gaussian
(graph: $Y \to X$)
S. Shimizu, P.O. Hoyer, A. Hyvärinen and A.J. Kerminen: A linear non-Gaussian acyclic model for causal discovery, JMLR 2006
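The asymmetry can be illustrated on simulated data. The sketch below is not the LiNGAM algorithm of the cited paper: it only fits a least-squares line in each direction and checks whether the residual still depends on the regressor. The dependence measure (the correlation between squared regressor and squared residual), the uniform distributions, the sample size and the names `dependence_proxy`, `score_forward` and `score_backward` are all assumptions made for illustration; a proper independence test would be used in practice.

```python
import random

def mean(values):
    return sum(values) / len(values)

def correlation(u, v):
    """Pearson correlation of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

def dependence_proxy(regressor, target):
    """Fit target = slope * regressor + residual by least squares, then
    return |corr(regressor^2, residual^2)| on centred data.
    This is a crude stand-in for a real independence test."""
    mr, mt = mean(regressor), mean(target)
    r = [a - mr for a in regressor]
    t = [b - mt for b in target]
    slope = sum(a * b for a, b in zip(r, t)) / sum(a * a for a in r)
    resid = [b - slope * a for a, b in zip(r, t)]
    return abs(correlation([a * a for a in r], [e * e for e in resid]))

rng = random.Random(0)      # fixed seed, illustrative only
n, beta = 2000, 1.0         # assumed sample size and coefficient
x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
n_y = [rng.uniform(-1.0, 1.0) for _ in range(n)]   # non-Gaussian noise
y = [beta * a + e for a, e in zip(x, n_y)]

score_forward = dependence_proxy(x, y)    # residual of Y regressed on X
score_backward = dependence_proxy(y, x)   # residual of X regressed on Y
print(f"X -> Y: {score_forward:.3f}   Y -> X: {score_backward:.3f}")
```

In this setup the forward score should be close to zero while the backward score is clearly larger, because the backward residual is uncorrelated with $Y$ but not independent of it. If both $X$ and the noise were drawn from Gaussians instead, both scores would be expected to be small, which is the exception the non-Gaussianity assumption rules out.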
Idea No. 2: Additive noise models
Nonlinear functions are also fine!
$Y = f(X) + N_Y, \quad N_Y \perp\!\!\!\perp X$
P. Hoyer, D. Janzing, J. Mooij, J. Peters and B. Schölkopf: Nonlinear causal discovery with additive noise models, NIPS 2008
Asymmetry No. 2
Consider a distribution corresponding to
$Y = f(X) + N_Y$
with $N_Y \perp\!\!\!\perp X$
(graph: $X \to Y$)
Then for “most combinations” $(f, P(X), P(N_Y))$ there is no
$X = g(Y) + M_X$
with $M_X \perp\!\!\!\perp Y$
(graph: $Y \to X$)
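A small simulation can make this concrete. The following is a rough sketch, not the procedure of the cited paper: it regresses each variable on the other with a simple moving-average smoother and then compares how much the residual variance changes across the range of the regressor. The choice $f(x) = x^3$, the noise level, the smoother, the variance-ratio score and all function names are assumptions for illustration; nonparametric regression followed by a kernel independence test would be the more usual choice.

```python
import random

def smoothed_residuals(regressor, target, k=20):
    """Residuals of target after a moving-average fit over the 2k+1 nearest
    points in regressor order. Returned in order of increasing regressor."""
    order = sorted(range(len(regressor)), key=lambda i: regressor[i])
    t = [target[i] for i in order]
    n = len(t)
    resid = []
    for pos in range(n):
        window = t[max(0, pos - k):min(n, pos + k + 1)]
        resid.append(t[pos] - sum(window) / len(window))
    return resid

def variance_ratio(resid_sorted, bins=5):
    """Largest over smallest residual variance across equal-sized bins
    of the (sorted) regressor. Close to 1 if the spread is constant."""
    size = len(resid_sorted) // bins
    variances = []
    for b in range(bins):
        chunk = resid_sorted[b * size:(b + 1) * size]
        m = sum(chunk) / len(chunk)
        variances.append(sum((e - m) ** 2 for e in chunk) / len(chunk))
    return max(variances) / min(variances)

rng = random.Random(1)      # fixed seed, illustrative only
n = 1000
x = [rng.uniform(-1.5, 1.5) for _ in range(n)]
noise = [rng.gauss(0.0, 0.2) for _ in range(n)]
y = [a ** 3 + e for a, e in zip(x, noise)]   # assumed example: f(x) = x^3

ratio_forward = variance_ratio(smoothed_residuals(x, y))    # fit Y on X
ratio_backward = variance_ratio(smoothed_residuals(y, x))   # fit X on Y
print(f"X -> Y: {ratio_forward:.2f}   Y -> X: {ratio_backward:.2f}")
```

In the forward direction the residual spread should be roughly constant, so the ratio stays near 1. In the backward direction the spread of $X$ given $Y$ is wide near $Y = 0$ and narrow for large $|Y|$, so the ratio should be much larger. The score only detects changes in variance, so it can miss other forms of dependence.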
$Y = f(X) + N_Y, \quad N_Y \perp\!\!\!\perp X$
$X = g(Y) + N_X, \quad N_X \perp\!\!\!\perp Y$
Idea No. 3: Gaussian Process Inference (GPI)
We can always write∗
$Y = f(X, N_Y), \quad N_Y \perp\!\!\!\perp X$
and
$X = g(Y, N_X), \quad N_X \perp\!\!\!\perp Y$
Which model is more “complex”? Use Bayesian model comparison.
J. M. Mooij, O. Stegle, D. Janzing, K. Zhang, B. Schölkopf: Probabilistic latent variable models for distinguishing between cause and effect, NIPS 2010
∗E.g., J. Peters: Restricted Structural Equation Models for Causal Inference, PhD Thesis
Asymmetry No. 3
1. Fix the noise distribution to be $\mathcal{N}(0, 1)$.
2. Put a prior $p(\theta_X)$ on the input distribution $p(x \mid \theta_X)$ (complexity of $X$).
3. Put a prior $p(\theta_f)$ on the functions $p(f \mid \theta_f)$ (complexity of $f$).
4. Approximate the marginal likelihood for $X \to Y$:
$$p(x, y) = p(x) \cdot p(y \mid x) = \int p(x \mid \theta_X)\, p(\theta_X)\, d\theta_X \cdot \int \delta\big(y - f(x, e)\big)\, p(e)\, p(f)\, de\, df$$
5. Approximate the marginal likelihood for $Y \to X$.
6. Compare.
[Figure: graphical model with nodes $\theta_f$, $f$, $Y$, $X$, $\theta_X$, $E$]
J. M. Mooij, O. Stegle, D. Janzing, K. Zhang, B. Schölkopf: Probabilistic latent variable models for distinguishing between cause and effect, NIPS 2010
Idea No. 4: Information Geometric Causal Inference (IGCI)
Assume a deterministic relationship
$Y = f(X)$
and that $f$ and $P(X)$ are “independent”.
D. Janzing, J. M. Mooij, K. Zhang, J. Lemeire, J. Zscheischler, P. Daniušis, B. Steudel, B. Schölkopf: Information-geometric approach to inferring causal directions, Artificial Intelligence 2012
[Figure: graph of $f(x)$ with the densities $p(x)$ and $p(y)$ drawn along the $x$ and $y$ axes]
Asymmetry No. 4
Consider $Y = f(X)$ with $f \neq \mathrm{id}$, $f : [0, 1] \to [0, 1]$ invertible, and $X = g(Y)$. If
$$\text{“cov”}(\log f', p_X) = \int \log\big(f'(x)\big)\, p_X(x)\, dx - \int \log f'(x)\, dx = 0$$
then
$$\text{“cov”}(\log g', p_Y) = \int \log\big(g'(y)\big)\, p_Y(y)\, dy - \int \log g'(y)\, dy > 0$$
D. Janzing, J. M. Mooij, K. Zhang, J. Lemeire, J. Zscheischler, P. Daniušis, B. Steudel, B. Schölkopf:
Information-geometric approach to inferring causal directions, Artificial Intelligence 2012
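Why the second quantity is positive can be seen with a change of variables. The derivation below is a sketch under additional assumptions not stated on the slide: $f$ is differentiable and strictly increasing with $f(0) = 0$ and $f(1) = 1$, and all integrals run over $[0, 1]$.

```latex
% g = f^{-1}, hence g'(f(x)) = 1 / f'(x); substitute y = f(x), dy = f'(x) dx,
% and use p_Y(y) dy = p_X(x) dx.
\begin{align*}
\int \log g'(y)\, p_Y(y)\, dy &= -\int \log f'(x)\, p_X(x)\, dx, \\
\int \log g'(y)\, dy &= -\int \log f'(x)\, f'(x)\, dx .
\end{align*}
% Subtract, then use the assumption \int \log f' \, p_X \, dx = \int \log f' \, dx:
\begin{align*}
\text{``cov''}(\log g', p_Y)
  &= \int f'(x) \log f'(x)\, dx - \int \log f'(x)\, p_X(x)\, dx \\
  &= \int \big(f'(x) - 1\big) \log f'(x)\, dx \;\ge\; 0 .
\end{align*}
```

The integrand $(f' - 1) \log f'$ is non-negative everywhere and vanishes only where $f' = 1$, so the integral is strictly positive unless $f$ is the identity, which the slide excludes.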
Open Questions 1: Quantifying Identifiability
Proposition
Assume $P(X, Y)$ is generated by
$Y = f(X) + N_Y$
with independent $X$ and $N_Y$.
Then
$$\inf_{Q \in \{Q \,:\, Y \to X\}} \mathrm{KL}(P \,\|\, Q) = \;?$$
first steps to understand the geometry
gives us finite sample guarantees
Open Questions 2: Robustness
What happens if assumptions are violated? E.g., in case of confounding?
[Figure: graph over $X$, $Y$ and a confounder $Z$]
Can we still infer $X \to Y$? How useful is this?
Conclusions
In theory, we can break the symmetry between cause and effect.
• restricted structural equation models:
  - linear functions, additive non-Gaussian noise
  - nonlinear functions, additive noise
• complexity measures on functions and distributions
• “independence” between function and input distribution
• ... principles behind new methods from challenge?
Causal inference problem of climate change is solved! Fight the cause!
Don’t fly! (Zurich-SFO: 5.4 t CO2)
Compensate!
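IGCI
It turns out that if $X \to Y$, then
$$\int \log |f'(x)|\, p(x)\, dx < \int \log |g'(y)|\, p(y)\, dy$$
Estimator:
$$C_{X \to Y} := \frac{1}{m} \sum_{j=1}^{m} \log \left| \frac{y_{j+1} - y_j}{x_{j+1} - x_j} \right| \approx \int \log |f'(x)|\, p(x)\, dx$$
Infer $X \to Y$ if $C_{X \to Y} < C_{Y \to X}$.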
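The estimator above can be sketched in a few lines. This is a minimal illustration, assuming the observations are sorted by the candidate cause, both variables are rescaled to $[0, 1]$, and pairs with a zero difference are skipped; the helper names `rescale` and `slope_score` and the example $f(x) = x^2$ with uniform $p(x)$ are hypothetical choices, not taken from the slides.

```python
import math
import random

def rescale(values):
    """Min-max rescaling to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def slope_score(cause, effect):
    """Mean of log |slope| between neighbouring points after sorting
    by the candidate cause; pairs with a zero difference are skipped."""
    pairs = sorted(zip(rescale(cause), rescale(effect)))
    logs = []
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        if x1 != x0 and y1 != y0:
            logs.append(math.log(abs((y1 - y0) / (x1 - x0))))
    return sum(logs) / len(logs)

rng = random.Random(2)                     # fixed seed, illustrative only
x = [rng.random() for _ in range(1000)]    # p(x) uniform on [0, 1]
y = [a * a for a in x]                     # assumed example: f(x) = x^2

c_xy = slope_score(x, y)   # approximates the integral of log|f'(x)| p(x)
c_yx = slope_score(y, x)   # approximates the integral of log|g'(y)| p(y)
decision = "X -> Y" if c_xy < c_yx else "Y -> X"
print(f"C_xy = {c_xy:.3f}, C_yx = {c_yx:.3f}, inferred: {decision}")
```

For this example the exact value of $\int \log |f'(x)|\, p(x)\, dx$ is $\log 2 - 1 \approx -0.31$, so `c_xy` should come out negative and `c_yx` positive, and the rule infers $X \to Y$.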
$Y = \beta X + N_Y, \quad N_Y \perp\!\!\!\perp X, \quad N_Y$ non-Gaussian
[Figure: $Y$ plotted against $X$]
$X = \phi Y + N_X, \quad N_X \perp\!\!\!\perp Y, \quad N_X$ non-Gaussian
[Figure: $Y$ plotted against $X$]
Does X cause Y or vice versa?
Real Data
Does X cause Y or vice versa?
No (not enough) data for chocolate
... but we have data for coffee!
[Figure: number of Nobel laureates per 10 million plotted against coffee consumption per capita (kg)]
Correlation: 0.698, p-value: $< 2.2 \cdot 10^{-16}$.
Nobel Prize → Coffee: Dependent residuals (p-value of 0).
Coffee → Nobel Prize: Dependent residuals (p-value of 0).
⇒ Model class too small? Causally insufficient?
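The linear Gaussian case
$Y = \beta X + N_Y$
with independent
$X \sim \mathcal{N}(0, \sigma_X^2)$ and $N_Y \sim \mathcal{N}(0, \sigma_{N_Y}^2)$.
Then there is a linear SEM with
$X = \alpha Y + M_X$
How can we find $\alpha$ and $M_X$?
[Figure: the vectors $X$, $\beta X$, $N_Y$ and $Y$ in $L^2$]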
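One way to answer the question is the standard least-squares argument, given here as a sketch rather than as the slide's own derivation: project $X$ onto $Y$ in $L^2$, so that the remainder is uncorrelated with $Y$.

```latex
\begin{align*}
\alpha &= \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(Y)}
        = \frac{\beta \sigma_X^2}{\beta^2 \sigma_X^2 + \sigma_{N_Y}^2}, \\
M_X &:= X - \alpha Y = (1 - \alpha\beta)\, X - \alpha N_Y, \\
\operatorname{Cov}(M_X, Y) &= \operatorname{Cov}(X, Y) - \alpha \operatorname{Var}(Y) = 0 .
\end{align*}
```

Since $M_X$ and $Y$ are jointly Gaussian, zero covariance implies $M_X \perp\!\!\!\perp Y$, so the backward equation is a valid linear SEM with independent noise. This is why the linear Gaussian case does not allow the direction to be identified.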