Post on 19-Dec-2015
Causality Workbench clopinet.com/causality
Challenges in Causality
Isabelle Guyon, Clopinet
Constantin Aliferis and Alexander Statnikov, Vanderbilt Univ.
André Elisseeff and Jean-Philippe Pellet, IBM Zürich
Gregory F. Cooper, University of Pittsburgh
Peter Spirtes, Carnegie Mellon
Causal discovery
What affects…
…your health?
…climate changes?
…the economy?
Which actions will have beneficial effects?
What is causality?
• Many definitions:
  – Science
  – Philosophy
  – Law
  – Psychology
  – History
  – Religion
  – Engineering
• “Cause is the effect concealed, effect is the cause revealed” (Hindu philosophy)
Feature Selection
[Figure: features X and target Y]
Predict Y from features X1, X2, …
Select most predictive features.
Causation
[Figure: features X and target Y]
Predict the consequences of actions:
Under “manipulations” by an external agent, some features are no longer predictive.
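The point of this slide can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not in the slides: a hypothetical three-variable linear-Gaussian system in which `cause` influences the target `y` and `y` influences `effect`. When an external agent sets `effect` independently of `y`, its correlation with the target vanishes, while the true cause stays predictive.

```python
import numpy as np


def simulate(n, manipulate_effect, seed=0):
    """Draw n samples from a toy system: cause -> y -> effect (hypothetical)."""
    rng = np.random.default_rng(seed)
    cause = rng.standard_normal(n)
    y = cause + 0.5 * rng.standard_normal(n)
    if manipulate_effect:
        # An external agent sets the value, cutting the link from y.
        effect = rng.standard_normal(n)
    else:
        effect = y + 0.5 * rng.standard_normal(n)
    return cause, effect, y


def correlations_with_target(n=10000, manipulate_effect=False, seed=0):
    """Return (corr(cause, y), corr(effect, y)) for the toy system."""
    cause, effect, y = simulate(n, manipulate_effect, seed)
    return (np.corrcoef(cause, y)[0, 1], np.corrcoef(effect, y)[0, 1])


if __name__ == "__main__":
    print("observational:", correlations_with_target(manipulate_effect=False))
    print("manipulated:  ", correlations_with_target(manipulate_effect=True))
```

In the observational setting both features correlate strongly with the target, so a feature selector would keep both; only the cause remains useful once the effect is manipulated.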
Available data
• A lot of “observational” data.
  Correlation ≠ Causality!
• Experiments are often needed, but:
  – Costly
  – Unethical
  – Infeasible
Causal discovery from “observational data”
Example algorithm: PC (Peter Spirtes and Clark Glymour, 1999)
Let A, B, C ∈ X and V ⊆ X. Initialize with a fully connected un-oriented graph.
1. Conditional independence. Cut the connection A — B if ∃ V s.t. (A ⊥ B | V).
2. Colliders. In triplets A — C — B (with A and B not connected), if there is no subset V containing C s.t. A ⊥ B | V, orient edges as: A → C ← B.
3. Constraint-propagation. Orient edges until no change:
(i) If A → B → … → C, and A — C then A → C.
(ii) If A → B — C then B → C.
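Steps 1 and 2 above can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation, and it rests on assumptions not stated in the slides: variables are jointly Gaussian, conditional independence is tested with a Fisher z-test on partial correlations, and the collider rule is simplified to checking whether C belongs to the separating set that was found for A and B. The function names and the `alpha` threshold are hypothetical. Step 3 (constraint propagation) is omitted.

```python
import itertools
import math

import numpy as np


def ci_test(data, a, b, cond, alpha=0.01):
    """True if columns a and b look independent given the columns in cond."""
    idx = [a, b] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)
    # Partial correlation of a and b given cond, from the precision matrix.
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 0.999999), -0.999999)
    n = data.shape[0]
    z = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - len(cond) - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    return p_value > alpha


def pc_skeleton(data, alpha=0.01):
    """Step 1: start fully connected, cut A-B when some V separates them."""
    d = data.shape[1]
    adj = {i: set(range(d)) - {i} for i in range(d)}
    sepset = {}
    size = 0
    while any(len(adj[a]) - 1 >= size for a in adj):
        for a in range(d):
            for b in list(adj[a]):
                others = sorted(adj[a] - {b})
                for cond in itertools.combinations(others, size):
                    if ci_test(data, a, b, cond, alpha):
                        adj[a].discard(b)
                        adj[b].discard(a)
                        sepset[frozenset((a, b))] = set(cond)
                        break
        size += 1
    return adj, sepset


def orient_colliders(adj, sepset):
    """Step 2: orient A -> C <- B when A, B are not adjacent and C is not
    in their separating set. Returns a set of (tail, head) pairs."""
    arrows = set()
    for c in adj:
        for a, b in itertools.combinations(sorted(adj[c]), 2):
            if b not in adj[a] and c not in sepset[frozenset((a, b))]:
                arrows.add((a, c))
                arrows.add((b, c))
    return arrows


if __name__ == "__main__":
    # Toy collider: column 0 -> column 2 <- column 1.
    rng = np.random.default_rng(0)
    a = rng.standard_normal(5000)
    b = rng.standard_normal(5000)
    c = a + b + 0.5 * rng.standard_normal(5000)
    adj, sepset = pc_skeleton(np.column_stack([a, b, c]))
    print(adj, orient_colliders(adj, sepset))
```

Conditioning sets are drawn only from the current neighbors of A and grow by one variable per pass, which is what keeps the number of tests manageable on sparse graphs.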
Difficulties
• Violated assumptions:
  – Causal sufficiency
  – Markov equivalence
  – Faithfulness
  – Linearity
  – Gaussianity
• Overfitting (statistical complexity):
  – Finite sample size
• Algorithm efficiency (computational complexity):
  – Thousands of variables
  – Tens of thousands of examples
Our approach
What is the causal question?
Why should we care?
What is hard about it?
Is this solvable?
Is this a good benchmark?
Our challenges
Find…
• Problems
• Data
• Metrics
• Challenge protocols
• Implementation
Upcoming datasets
[Figure: upcoming datasets and application domains; axes run from 0 to 16000 and from 0 to 100, axis names not recoverable. Labels: DALTON, ECONO, TIED, Ecology, Healthcare, Mass spec, Marketing, Conceptual, Psychology, Epidemiology, Internet, Climatology, Neuroscience, Security, Sociology]
Want to contribute data?
• Real data:
  – Non-confidential
  – Large number of samples
  – Large number of variables
  – Observational and experimental
• Semi-artificial data:
  – Re-simulated
  – Real data + artificial variables
Metrics
• Fulfillment of an objective:
  – Future (prediction)
  – Past (counterfactual)
• Causal relationships:
  – Existence
  – Strength
  – Degree
Examples of objectives
• Medicine and epidemiology
  – Maximize life expectancy
  – Maximize drug efficacy
  – Minimize contagion
• Economy and marketing
  – Maximize Gross National Product (GNP)
  – Maximize sales
  – Minimize churn rate
Causality assessment with manipulations
LUCAS0: natural
[Figure: causal graph over the variables Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue]
Causality assessment with manipulations
LUCAS1: manipulated
[Figure: causal graph over the variables Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue]
Causality assessment with manipulations
LUCAS2: manipulated
[Figure: causal graph over the variables Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue]
Goal driven causality
[Figure: causal graph with nodes numbered 0 to 11; example variable list: 4, 11, 2, 3, 1]
• We define: V=variables of interest (e.g. MB (Markov blanket), direct causes, ...)
• Participants return: S=selected subset (ordered or not).
• We assess causal relevance: R=f(V,S).
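The slides leave the relevance function f unspecified. As a minimal sketch of one possible choice (an assumption made here for illustration, not the challenge's actual metric), f can compare the returned subset S against the variables of interest V using precision, recall and their harmonic mean. The set `V` in the usage lines is hypothetical; `S` reuses the list shown on the slide.

```python
def relevance(v, s):
    """Precision, recall and F-measure of the selected subset s
    with respect to the variables of interest v (order is ignored)."""
    v, s = set(v), set(s)
    hits = len(v & s)
    precision = hits / len(s) if s else 0.0
    recall = hits / len(v) if v else 0.0
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    V = {1, 2, 3, 4}        # hypothetical variables of interest
    S = [4, 11, 2, 3, 1]    # subset returned by a participant
    print(relevance(V, S))
```

For an ordered S, a rank-based measure would be a natural alternative, since it rewards placing the variables of interest first.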
Using artificial “probes”
LUCAP0: natural
[Figure: causal graph over the variables Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue, with added probes P1, P2, P3, PT]
Using artificial “probes”
LUCAP1&2: manipulated
[Figure: causal graph over the variables Lung Cancer, Smoking, Genetics, Coughing, Attention Disorder, Allergy, Anxiety, Peer Pressure, Yellow Fingers, Car Accident, Born an Even Day, Fatigue, with added probes P1, P2, P3, PT]
Scoring using “probes”
• What we can compute (Fscore):
– Negative class = probes (here, all “non-causes”, all manipulated).
– Positive class = other variables (may include causes and non causes).
• What we want (Rscore):
– Positive class = causes.
– Negative class = non-causes.
• What we get (asymptotically):
Fscore = (NTruePos/NReal) Rscore + 0.5 (NTrueNeg/NReal)
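The asymptotic relation above can be evaluated and inverted directly. The sketch below is illustrative and makes one assumption that the slide does not state explicitly: NReal, the number of real (non-probe) variables, equals NTruePos + NTrueNeg, the true causes plus the true non-causes among them. The function names are hypothetical.

```python
def expected_fscore(rscore, n_true_pos, n_true_neg):
    """Fscore predicted by the asymptotic relation for a given Rscore.

    Assumes n_real = n_true_pos + n_true_neg (see lead-in).
    """
    n_real = n_true_pos + n_true_neg
    return (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)


def rscore_from_fscore(fscore, n_true_pos, n_true_neg):
    """Invert the relation to recover the Rscore from a computed Fscore."""
    if n_true_pos == 0:
        raise ValueError("Rscore is undefined when there are no true causes")
    n_real = n_true_pos + n_true_neg
    return (fscore - 0.5 * n_true_neg / n_real) * n_real / n_true_pos


if __name__ == "__main__":
    # 5 true causes and 15 non-causes among 20 real variables.
    f = expected_fscore(rscore=1.0, n_true_pos=5, n_true_neg=15)
    print(f, rscore_from_fscore(f, 5, 15))
```

Because the relation is affine with a positive slope, ranking entries by Fscore yields the same order as ranking them by Rscore, as long as there is at least one true cause.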
Conclusion
• Try our first challenge, learn, and win!!!!
  – WCCI08 Workshop, Hong Kong, June 2008
    • Travel grants for top-ranking students.
  – Proceedings of JMLR. Top-ranking entrants will be invited to write a paper.
    • Best paper award: free WCCI registration.
  – Prizes: P(i)=$100. P = n*sum P(i).
• Your problem solved by dozens of research groups:
  – Help us organize the next challenge!