Graphical Causal Models: Determining Causes from Observations
William Marsh, Risk Assessment and Decision Analysis (RADAR), Computer Science



Page 1: Graphical Causal Models: Determining Causes from Observations

Graphical Causal Models: Determining Causes from Observations

William Marsh

Risk Assessment and Decision Analysis (RADAR)

Computer Science

Page 2: Graphical Causal Models: Determining Causes from Observations

RADAR Group, Computer Science

Risk Assessment and Decision Analysis

Research areas
  Software engineering, safety, finance, legal
  A new initiative in medical data analysis: DIADEM

Norman Fenton (group leader)

Martin Neil

http://www.dcs.qmul.ac.uk/researchgp/radar/

Page 3: Graphical Causal Models: Determining Causes from Observations

Outline

Graphical Causal Models
  Bayesian networks: prediction or diagnosis
  Causal induction: learning causes from data
  Causal effect estimation: strength of causal relationships from data

DIADEM project

Page 4: Graphical Causal Models: Determining Causes from Observations

Bayesian Nets

Page 5: Graphical Causal Models: Determining Causes from Observations

Detecting Asthma Exacerbations

Aim to assist early detection of asthma episodes in Paediatric A&E
  Using only data already available electronically

Network created by
  Experts
  Data

Page 6: Graphical Causal Models: Determining Causes from Observations

Bayes’ Theorem

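Joint probability:

$$P(A, B) = P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$$

Revised belief about A, given evidence B:

$$P(A \mid B) \propto P(B \mid A)\,P(A)$$

P(A | B): revised belief about A, given evidence B

P(A): prior probability of A

P(B | A): factor to update belief about A, given evidence B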

Page 7: Graphical Causal Models: Determining Causes from Observations

Bayes’ Theorem (Made Easy)

A person has a positive test result
  How likely is it they are infected? 17%

Infection (yes, no) → Test (pos, neg)

Infection rate: P(I) = 1%

False positive: P(T=pos | I=no) = 5%

Negligible false negative
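
The 17% figure can be checked directly from the numbers on this slide. The following is a minimal sketch, assuming that "negligible false negative" means P(T=pos | I=yes) = 1; the function name `posterior_infected` is illustrative and not part of the original slides.

```python
def posterior_infected(prior, false_positive_rate, false_negative_rate=0.0):
    """Return P(I=yes | T=pos) for a binary infection/test model."""
    # P(T=pos | I=yes); equals 1 when false negatives are negligible (assumption)
    sensitivity = 1.0 - false_negative_rate
    # P(T=pos) by the law of total probability
    p_pos = sensitivity * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(I | T=pos) = P(T=pos | I) P(I) / P(T=pos)
    return sensitivity * prior / p_pos


p = posterior_infected(prior=0.01, false_positive_rate=0.05)
print(f"P(infected | positive test) = {p:.1%}")
```

With a 1% infection rate, most positive results come from the much larger uninfected group, which is why the posterior stays well below 50% despite the accurate test.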

Page 8: Graphical Causal Models: Determining Causes from Observations

Medical Uses of BNs

Diagnosis
  Differential diagnosis from symptoms

Prediction
  Likely outcome

Building a BN
  From expert knowledge: expert system
  From data: data mining

Page 9: Graphical Causal Models: Determining Causes from Observations

Beyond Bayesian Networks

Page 10: Graphical Causal Models: Determining Causes from Observations

Cause versus Association

Both represent the fever-infection association

‘Causal model’ has arrow from cause to effect

Infection → Fever or Fever → Infection?

Joint probability same:

$$P(I, F) = P(F \mid I)\,P(I) = P(I \mid F)\,P(F)$$
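
The point that both arrow directions encode the same joint distribution can be illustrated numerically. This is a sketch with hypothetical probabilities (the slides give no numbers for this example): it builds the joint from P(F | I) P(I), derives P(F) and P(I | F) from it, and rebuilds the joint in the reversed direction.

```python
# Hypothetical numbers, chosen only for illustration.
p_i = 0.1                               # P(I=yes)
p_f_given_i = {True: 0.9, False: 0.2}   # P(F=yes | I)


def bern(p, value):
    """Probability of a binary value when P(value=True) = p."""
    return p if value else 1.0 - p


values = (True, False)

# Direction Infection -> Fever: P(I, F) = P(F | I) P(I)
joint = {(i, f): bern(p_f_given_i[i], f) * bern(p_i, i)
         for i in values for f in values}

# Derive the parameters of the reversed model from the joint
p_f = joint[(True, True)] + joint[(False, True)]          # P(F=yes)
p_i_given_f = {f: joint[(True, f)] / (joint[(True, f)] + joint[(False, f)])
               for f in values}                           # P(I=yes | F)

# Direction Fever -> Infection: P(I, F) = P(I | F) P(F)
joint_reversed = {(i, f): bern(p_i_given_f[f], i) * bern(p_f, f)
                  for i in values for f in values}

max_diff = max(abs(joint[k] - joint_reversed[k]) for k in joint)
print(f"largest difference between the two joints: {max_diff:.2e}")
```

Observational data alone therefore cannot separate the two graphs; the arrow direction carries extra, causal information.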

Page 11: Graphical Causal Models: Determining Causes from Observations

Causal Induction

Discover causal relationships from data

Sometimes distinguishable
  … different conditional independence

[Figure: two alternative graphs over A, B and C]
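
The arrow directions of the two graphs were lost in extraction. As an illustration only, the sketch below assumes one graph is a chain A → B → C and the other a collider A → B ← C, with hypothetical probabilities, and checks by exact enumeration which independence relations each one implies.

```python
from itertools import product


def bern(p, value):
    """Probability of a binary value when P(value=1) = p."""
    return p if value else 1.0 - p


# Assumed structures and hypothetical parameters (not from the slides).
# Chain: A -> B -> C
chain = {(a, b, c): bern(0.3, a) * bern(0.8 if a else 0.1, b) * bern(0.7 if b else 0.2, c)
         for a, b, c in product((0, 1), repeat=3)}

# Collider: A -> B <- C
P_B = {(0, 0): 0.05, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.9}   # P(B=1 | A, C)
collider = {(a, b, c): bern(0.3, a) * bern(0.4, c) * bern(P_B[(a, c)], b)
            for a, b, c in product((0, 1), repeat=3)}


def marginal(joint, keep):
    """Marginal distribution over the variable positions listed in keep."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


def independent(joint, x, y, given=(), tol=1e-12):
    """Check X independent of Y given Z via P(x,y,z) P(z) == P(x,z) P(y,z)."""
    pz = marginal(joint, given)
    pxz = marginal(joint, (x,) + given)
    pyz = marginal(joint, (y,) + given)
    pxyz = marginal(joint, (x, y) + given)
    for key, p in pxyz.items():
        xv, yv, z = key[0], key[1], key[2:]
        if abs(p * pz[z] - pxz[(xv,) + z] * pyz[(yv,) + z]) > tol:
            return False
    return True


A, B, C = 0, 1, 2
for name, joint in (("chain", chain), ("collider", collider)):
    print(name,
          "| A indep C:", independent(joint, A, C),
          "| A indep C given B:", independent(joint, A, C, given=(B,)))
```

Under these assumptions the chain makes A and C dependent but independent given B, while the collider does the opposite, so data can in principle tell the two structures apart.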

Page 12: Graphical Causal Models: Determining Causes from Observations

Causal Induction – Application

Discover causal relationships from data
  Need lots of data

Applied to gene regulatory networks
  Data from micro-array experiments
  Recent explanation of limitations

Page 13: Graphical Causal Models: Determining Causes from Observations

Estimating Causal Effects

Suppose A is a cause of B

What is the causal effect?
  Is it p(B | A)?

A → B

Page 14: Graphical Causal Models: Determining Causes from Observations

Benefits of Sports?

Is there a relationship between sport and exam success?
  Data available
  ‘Intelligence’ correlate

Is this the correct test?

P(exam=pass | sport) > P(exam=pass | no-sport)

[Figure: graph over intelligence, sport and exam result]

Page 15: Graphical Causal Models: Determining Causes from Observations

Benefits of Sports?

When we condition on ‘sport’ (observe)
  Probability for ‘exam result’ changes
  Probability for ‘intelligence’ changes

What if I decide to start sport?

p(pass | sport) = 73% > p(pass | no-sport) = 67%

[Figure: graph over intelligence, sport and exam result]

Page 16: Graphical Causal Models: Determining Causes from Observations

Intervention v Observation

Causal effect differs from conditional probability
  Mostly interested in consequence of change
  Causal effects can be measured by a Randomised Controlled Trial
  Causal effect of sport on exam results not identifiable

Change (intervention): P(pass | do(sport)) < P(pass | do(no sport))

[Figure: graph over intelligence, sport and exam result]
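
The gap between observing and intervening can be made concrete with a small model. The sketch below assumes the graph intelligence → sport, intelligence → exam result, sport → exam result, and uses hypothetical probabilities (they do not reproduce the 73% and 67% quoted earlier). It also treats intelligence as known, which is possible only in a simulation; with intelligence unmeasured the `do` quantities could not be computed from data this way.

```python
# Hypothetical model, for illustration only.
P_INTEL = {"high": 0.5, "low": 0.5}                 # P(intelligence)
P_SPORT = {"high": 0.8, "low": 0.2}                 # P(sport | intelligence)
P_PASS = {("high", True): 0.85, ("high", False): 0.90,
          ("low", True): 0.45, ("low", False): 0.50}  # P(pass | intelligence, sport)


def p_pass_observed(sport):
    """P(pass | sport): conditioning also shifts belief about intelligence."""
    num = den = 0.0
    for level, p_level in P_INTEL.items():
        p_s = P_SPORT[level] if sport else 1.0 - P_SPORT[level]
        num += p_level * p_s * P_PASS[(level, sport)]
        den += p_level * p_s
    return num / den


def p_pass_do(sport):
    """P(pass | do(sport)): intelligence keeps its prior distribution."""
    return sum(p_level * P_PASS[(level, sport)] for level, p_level in P_INTEL.items())


print(f"observe:   {p_pass_observed(True):.3f} vs {p_pass_observed(False):.3f}")
print(f"intervene: {p_pass_do(True):.3f} vs {p_pass_do(False):.3f}")
```

In this toy model students who do sport pass more often, yet making a given student start sport lowers the pass probability, because the observed association is driven by intelligence.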

Page 17: Graphical Causal Models: Determining Causes from Observations

Benefit of Sport

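New observable variable ‘attendance at lectures’

Causal effect of sport on exam results now identifiable

[Figure: graph over sport (S), attendance (A), exam result (E) and intelligence]

$$P(E \mid do(S)) = \sum_{A} P(A \mid S) \sum_{S'} P(E \mid S', A)\,P(S')$$

Here S' ranges over the values of sport in the inner sum.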
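
The formula can be checked on a simulated model. The sketch below assumes the graph intelligence → sport, sport → attendance, attendance → exam result, intelligence → exam result, with intelligence unmeasured and hypothetical probabilities throughout. It evaluates the formula using only the observed distribution over (S, A, E) and compares the result with the true interventional probability computed from the full model.

```python
from itertools import product

# Hypothetical parameters; U stands for the unmeasured 'intelligence'.
P_U = 0.5                                        # P(U=1)
P_S_GIVEN_U = {1: 0.8, 0: 0.2}                   # P(S=1 | U)
P_A_GIVEN_S = {1: 0.6, 0: 0.9}                   # P(A=1 | S)
P_E_GIVEN_AU = {(1, 1): 0.95, (1, 0): 0.60,
                (0, 1): 0.70, (0, 0): 0.30}      # P(E=1 | A, U)


def bern(p, value):
    """Probability of a binary value when P(value=1) = p."""
    return p if value else 1.0 - p


# Observed distribution P(S, A, E): the full joint with U summed out.
observed = {}
for u, s, a, e in product((0, 1), repeat=4):
    p = (bern(P_U, u) * bern(P_S_GIVEN_U[u], s)
         * bern(P_A_GIVEN_S[s], a) * bern(P_E_GIVEN_AU[(a, u)], e))
    observed[(s, a, e)] = observed.get((s, a, e), 0.0) + p


def p_obs(s=None, a=None, e=None):
    """Observed probability of the given values; None means 'summed out'."""
    return sum(p for (s_, a_, e_), p in observed.items()
               if s in (None, s_) and a in (None, a_) and e in (None, e_))


def front_door(s):
    """P(E=1 | do(S=s)) from observed quantities only."""
    total = 0.0
    for a in (0, 1):
        p_a_given_s = p_obs(s=s, a=a) / p_obs(s=s)
        inner = sum(p_obs(s=s2, a=a, e=1) / p_obs(s=s2, a=a) * p_obs(s=s2)
                    for s2 in (0, 1))
        total += p_a_given_s * inner
    return total


def truth(s):
    """P(E=1 | do(S=s)) from the full model, which uses the hidden U."""
    return sum(bern(P_A_GIVEN_S[s], a)
               * sum(bern(P_U, u) * P_E_GIVEN_AU[(a, u)] for u in (0, 1))
               for a in (0, 1))


for s in (1, 0):
    print(f"do(S={s}): formula {front_door(s):.4f}, full model {truth(s):.4f}")
```

The two columns agree, which is what identifiability means here: the interventional quantity is recovered without ever measuring intelligence, provided the assumed graph is correct.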

Page 18: Graphical Causal Models: Determining Causes from Observations

Estimating Causal Effects

Rules to convert causal to statistical questions
  Generalises e.g. stratification, potential outcomes
  Assumptions: a causal model
  Some assumptions may be testable

Causal model
  Some variables observed, others not measured
  Some causal effects identifiable

Challenges
  Causal models for complex applications
  Statistical implications

Page 19: Graphical Causal Models: Determining Causes from Observations

Example Application

Royal London trauma service
  Criteria for activation of the trauma team
  Aim to prevent unnecessary trauma team calls

Extensive records of trauma patient outcomes

US study of 1495 admissions proposed new ‘triage’ criteria
  Significant decrease in overtriage: 51% → 29%
  Insignificant increase in undertriage: 1% → 3%
  None of the patients undertriaged by new criteria died

Does this show safety of new criteria?

Page 20: Graphical Causal Models: Determining Causes from Observations

DIADEM Project

Page 21: Graphical Causal Models: Determining Causes from Observations

Digital Economy in Healthcare

Data Information and Analysis for clinical DEcision Making

EPSRC Digital Economy Cluster

Partnership between solution providers and clinical data analysis problem holders

Summarise unsolved data analysis needs, in relation to the analysis techniques available

Join the DIADEM cluster

Page 22: Graphical Causal Models: Determining Causes from Observations

Cluster Activities and Outcomes

Engage stakeholders and build a community:
  Creation of a community web-site and forum
  Meetings with potential ‘problem holders’
  Workshops

A road map: data and information
  Follow-up proposal

A self-sustaining website: health data analytics

Page 23: Graphical Causal Models: Determining Causes from Observations

Summary

Bayesian networks
  Prediction and diagnosis

Causal induction
  Identify (some) causal relationships from (lots of) data

Causal effects
  Experimental results from …
  … non-experimental data
  … assumptions (causal model)

Join the DIADEM cluster