causality and causal inference in survival analysisfolk.uio.no › borgan › imb9335 › slides ›...

14
Causality and causal inference in survival analysis NORBIS course, Oslo Friday, December 13, 2019 Odd O. Aalen Dept. of Biostatistics, Institute of Basic Medical Sciences University of Oslo 2 New England Journal of Medicine, Editorial, Jan. 6, 2000, p. 42-49 The eleven most important developments in medicine in the past millennium Elucidation of human anatomy and physiology Discovery of cells and their substructures Elucidation of the chemistry of life Application of statistics to medicine Development of anesthesia Discovery of the relation of microbes to disease Elucidation of inheritance and genetics Knowledge of the immune system Development of body imaging Discovery of antimicrobial agents Development of molecular pharmacotherapy 3 From NEJM: The origin of modern epidemiology: 1854, when John Snow demonstrated the transmission of cholera from contaminated water The majority of people who got ill used the Broad Street Pump in London's Golden Square He removed the pump handle from the polluted well and the spread of the disease stopped. The original Broad Street pump © Wellcome Library, London 4 From NEJM: Earliest clinical trial in 1747 Scurvy (Serious disease: Magellan lost 80% of his men from scurvy) James Lind treated 12 scorbutic ship passengers on a British navy ship with cider, an elixir of vitriol, vinegar, sea water, oranges and lemon Those who got oranges and lemon did not get ill Supply of lemon juice eliminated scurvy from the navy

Upload: others

Post on 26-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Causality and causalinference in survivalanalysis

NORBIS course, Oslo

Friday, December 13, 2019

Odd O. Aalen

Dept. of Biostatistics,

Institute of Basic Medical Sciences

University of Oslo2

New England Journal of Medicine,

Editorial, Jan. 6, 2000, p. 42-49 The eleven most important developments in

medicine in the past millennium Elucidation of human anatomy and physiology

Discovery of cells and their substructures

Elucidation of the chemistry of life

Application of statistics to medicine

Development of anesthesia

Discovery of the relation of microbes to disease

Elucidation of inheritance and genetics

Knowledge of the immune system

Development of body imaging

Discovery of antimicrobial agents

Development of molecular pharmacotherapy

3

From NEJM: The origin of modern epidemiology:

1854, when John Snow demonstrated the transmission of cholera from contaminated water

The majority of people who got ill used the Broad Street Pump in London's Golden Square

He removed the pump handle from the polluted well and the spread of the disease stopped.

The original Broad Street pump

© Wellcome Library, London 4

From NEJM:

Earliest clinical trial in 1747

Scurvy (Serious disease: Magellan lost 80% of his men from scurvy)

James Lind treated 12 scorbutic ship passengers on a British navy ship with cider, an elixir of vitriol, vinegar, sea water, oranges and lemon

Those who got oranges and lemon did not get ill

Supply of lemon juice eliminated scurvy from the navy

Page 2: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

5

So, it is all about causality

Statistics is important because it is conceived ascontributing to a causal understanding which is needed in prevention and treatment of disease.

Statistics can indicate causality even in the absence of a mechanistic understanding. Treatment of scurvy far ahead of the knowledge of

vitamin C John Snow: 20 years ahead of Pasteur

Going to modern times next: Causality and statistics – a happy couple?

6

Modern breakthrough based on statistics:

Sleeping position influences risk of crib death

Crib deaths in Norway 1986-2003

Num

ber

7

Sudden infant death syndrome (SIDS)

The risk of SIDS is strongly increased (RR up to

13) when the infant is sleeping on its stomach

compared to sleeping on its back.

This is simple because

An intervention could be conceived and was easy to carry out in practice

The effect was immediate

The effect was very strong

None of these conditions normally hold in epidemiology

Questions

Why is causality important in medicine and other fields?

How can statistics say something about causality?

What is causality?

8

Page 3: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Danger of e-cigarettes

Center for Disease

Control (CDC):

A form of vitamin E has been identified as a “very strong culprit” in lung injuries related to vaping THC, health officials reported on Friday, a major advance in a frightening outbreak that has killed 40 people and sickened 2,051.

Analysis of lung fluid. Statistical association.

9 10

CREDITZEPHYR/SCIENCE SOURCE

Surgery for Blocked

Arteries Is Often

Unwarranted,

Researchers FindDrug therapy alone may save lives as

effectively as bypass or stenting procedures, a

large federal study showed.

New York Times, 16.11.2019

Veierød11

Association is not causation

If A and Y are associated, then this is compatible with four different types of causal relationship:

Direct causation

Reverse causation

Confounding

Collider effect (selection)

12

A Y

A Y

A Y

U

A Y

S

Page 4: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

1313

Causality is hard:Mechanisms in biologyare complex.Intervention may havea number of unexpectedeffects followingdifferent pathways.

- Closing a boulevardin Paris …

What is causality?

The simplest formulation: it

is about intervention

What is the effect of an action?

This is the focus of Judea

Pearl’s famous book

Seeing vs. doing

Pearl implies that statistics

per se says nothing about

interventions

14

Do-operator:P(y | do(x), z)

Interventions in medicine

Medical doctors, psychologists and otherhealth personell constantly intervene:

Giving treatment, medications, advice …

Statistics is expected to be a tool for validating such actions, to prove or disprove their validity

This cannot be done without taking thecausal issue seriously

15

Seeing vs doing (Pearl)

Pearl makes a fundamental distinction between

seeing and doing. Causality is about doing,

while most statistical data is about seeing

Seeing and doing may coincide in experiments

because of the ability to control the setting. The

“big” experiment in medicine is the randomized

clinical trial where the effect of doing is directly

visible

16

Page 5: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Registries contain data on seeing only, and not doing

We want to say something about the effect of intervention. BUT: the registry only contains a description of what has happened, there is no information about what could have happened if one acted differently. Therefore, you can’t (directly) say anything about the effect of intervention

This is the case for observational data in general

Still, causal inference can help us if we collect enough data and the right type of data…

17

Randomised clinical trials

The established solution to the confounder problem. We create both a factual and a counterfactual world

One of the great pillars of medical research. An unrivalled source of reliable information. Thousands of clinical trials carried out every year

But limitations: very many exclusions (children e.g.), could be distant from clinical practice, often extremely expensive. Clinical trials become unethical once a secure effect has been established

18

19

Lancet 2010; 376:

1741–50

Proximal colon

Distal colon

In the absence of randomization

We need to correct for confounders

But many of them will be unknown

Sensitivity analysis is an interesting option

20

Page 6: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Graphical models

Graphical models with arrows and boxes are common. However, Judea Pearl has lifted them to a new level

A number of rules for evaluating graphs can be defined

These are applicable in practice as shall be demonstrated

21

Directed acyclic graph – DAG

Graph with arrows, where you never return to the same node

22

A

U

Y

L

Outcome

Confounder

Treatment /

Risk factor

Mediator

Collider: where two or more arrows meet

The following rules decide whether

a path is open or closed

1. A path with colliding arrows is closed (→←). If there are no colliders the path is open.

2. To conditon on a non-collider closes thepath.

3. To condition on a collider (or descendantof a collider) opens the path

23

What do we mean by “to condition on”

We mean e.g. to include a variable in the regression.

To include a confounder is usually ok

To include a collider is dangerous

However, a collider may not be avoided if it represents inherent selection in the data

24

Page 7: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Keep causal paths open and non-causal paths closed Example (Hein Stigum): red arrow is causal, black path

is not causal (backdoor path). Conditioning on age (or obesity) blocks the back-door path

25

Vitamin E

Age

Birth defects

Obesity

Exercise (Hein Stigum)

We want the causal effect of physical activity on CHD (coronary heart disease). What should we adjust for?

26

2727

Birth defects. Adjustment for confounder?Source: Hernán et al, Amer. J. Epidem. 2002, 155, 176-184

When estimating the effect of E on D, shall you adjust for C? No, one should not adjust for a collider. Case-control study on folic acid supplementation and neural tube

defects. Adjusted OR: 0.80 (0.62, 1.21), non-adjusted OR 0.65 (0.46,0.94)

Exercise: A larger DAG

28

Page 8: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

When is a DAG causal?

Two views

Robins and Hernán: A DAG is causal when

1. Lack of an arrow can be interpreted as lack of direct causal effect

2. All common causes, even if unmeasured, of any pair of variables on the graph are themselves on the graph

Note: this requires a concept of direct cause

Pearl: A DAG becomes causal if intervening on

a node has the effect of removing all arrows into

the node while the DAG is otherwise unchanged

29

Metformin and diabetes Metformin is a standard

treatment for diabetes type 2

It does improve the diabetes

However, it is not clear that it improves the negative longterm effects of diabetes, like cardiovascular disease

Possibly, high glucose is just a symptom which does not have a causal effect oncomplications

Same issue with LDL cholesterol and heartdisease 30

Metformin

Compli-

cations

Diabetes

type 2

Glucose

Metformin

Compli-

cations

Glucose

Survival paradoxes

There are a number of well known paradoxes in epidemiology.

Obesity paradox denotes cases where obesity is a risk factor for developing a condition, like heart failure, kidney failure or myocardial infarction, but once this condition has occurred obesity appears as a protective factor for survival.

Cholesterol paradox: low cholesterol is associated with a poor prognosis in heart failure and in myocardial infarction.

31

Obesity paradox

5070 references in Google Scholar

Apparent “obesity paradox” seen in diabetes,

coronary artery disease, heart failure, peripheral

arterial disease, hypertension, chronic

obstructive pulmonary disease, lung cancer,

esophageal adenocarcinoma, myocardial

infarction, coronary revascularization, acute

coronary syndromes, and among hemodialysis

patients. Preston and Stokes, Epidemiology,

201432

Page 9: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Examples from the literature

Obesity increases mortality in women, but

reduces mortality in diabetic women. Lajous et

al, Epidemiology, 2014

Obesity reduces mortality among heart failure

patients. Banack and Kaufman, Epidemiology,

2013

Smoking appears to protect against Alzheimer’s

disease. Part of the effect is due to survival

paradox. Chang, Alzheimer Dis Assoc Disord,

201233

Why these paradoxes

There would be a mix of explanations

Some biological, I do not look at those

Some statistical, only those are considered

here

So, how can biases be produced? The typical explanation concentrates on collider effects

34

Questions

Why do we denote findings like this as paradoxes?

Can you imagine a reason for these paradoxes?

35

Remember

When we condition on a collider we may activate a non-causal pathway, thereby creating bias.

We’ll come back to the obesity paradoxes in a moment, but first …

36

Page 10: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

One collider is ageThe survival of the fittest

Age selection is unavoidable. People die. If you

concentrate on an elderly population it is

selected. People who are vulnerable to high

cholesterol, obesity, smoking etc. may have

died. The robust ones survive

So, you may measure something in a young

person and it may have a causal effect. Then the

causal effect dissipates with age and is

substituted by a selection effect

37

Frailty selection is well known

Assume a constant causal hazard ratio: r(t)= r

Observed hazard ratio:

38

• The relative risk may declineand even cross over due to artefacts.

• This may not reflect anychange in risk at theindividual level.

• This is a causal issue if youare interested in whathappens to individuals whentime passes.

• See related comments in Hernan (2010): “The hazard of hazard ratios”

Seeing vs. doing

Comparing two treatment groups by hazard ratio. End of trial, should all patients be put on the default treatment?

Assume hazard ratio equals 1. Apparently no difference in treatment effect. So, ok to change?

Surprise! The intervention yields a sharp change in the event rate in the group that changes treatment.

39

0 1 2 3 4 5Time

0.5

1.0

1.5

Hazard ratio

Intervene

Gulsvik et al (2012) demonstrate that high serum cholesterol appears as a "protective" factor with respect to general mortality at old age, something which could be an artefact, especially since statin treatments reduce disease incidence of cardiovascular disease in the elderly (Savarese et al, 2013). A strongly reduced risk with age was also seen for smoking in Gulsvik et al. (2012).

40

102-Year-Old Woman Quits Smoking (But Not for Her Health)

Page 11: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Collider effects and survival

Survival bias due to age

41

Survived

to time t

Randomized

treatment

Survived

to time

t+dt

Con-

founder

Questions

In what sense is age a collider?

Why may the apparent protective effect of high cholesterol in old age be a selection effect?

42

Banack and Kaufman: Epidemiology, 2013

Obesity paradox in heart failure

For patients with heart failure: The odds ratio of mortality was 0.79 (0.70–0.88) for overweight and 0.65 (0.57–0.74) for obese

43

We need computation as well

The collider effects just tell us that there might be something to worry about, but not whether the bias is large or small, or even nonexistent. (There do exist models where colliders have no effect at all)

This requires a mathematical model where we explicitly model individual variation

44

Page 12: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Risk model

Model for the first event (like diabetes etc)

Hazard rate equals rate due to special risk

factor (e.g. obesity) plus model with built-in risk variation Z

By doing some mathematics, we can calculate the rate of mortality after the first event has occurred. It turns out to be smaller for those with the special risk than for those without

45

Lower curve: with special risk factorUpper curve: without

46

Question

What is the collider in the heart failure example?

47

Mediation

Can we understand mechanisms by using statistics?

Path analysis (Wright, 1921) introduced the idea of direct, indirect and total effects and presented a simple calculus for these effects based on linear regression models.

A lot of recent and sophisticated development of these ideas in the causal inference literature

48

OutcomeTreatment

Mediator

How much of the effect passes

directly from treatment to

outcome (direct effect) and

how much passes through the

mediator (indirect effect)?

Page 13: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Mediation: Cholesterol treatment

49

Mediation and confounding. Can we estimate the direct effect of statin on coronary heart disease?Confounders between mediator an outcome may give a false impression of an increased indirect effect

50

Example

Psychotherapy: It is well documented that therapeutic alliance appears to be a mediator. However, there is an obvious possibility of confounding effects

Which type of effects could that be?

Psycho-

therapyOutcome

Therapeutic

alliance

Confounder

51

Quotes from Dunn et alStatist. Med. 2007; 26:4719–4745

“Although there is an enormous methodological literature on the estimation of the effects of mediators …., most of it completely ignores the technical challenges raised by measurement errors in the proposed mediators and by potential hidden confounding of mediator(s) and outcome.”

“The assumptions concerning the lack of hidden confounding and measurement errors are very rarely stated, let alone their validity discussed. One suspects that the majority of investigators are oblivious of these two requirements. One is left with the unsettling thought that the thousands of investigations of mediational mechanisms in the psychological and other literatures are of unknown and questionable value.”

One object of criticism is the famous paper by Baron & Kenny (1986). This has now 88 800 references according to Google Scholar.

You ignore causal inference at your own peril

Example: Hormone replacement therapy (HRT) at menopause used to be viewed as a good idea because it was supposed to improve the health and well being ofthe woman. In fact, HRT was dangerous, increasing risk of breast cancer, heart disease and stroke

But the epidemiological studies were wrongly interpreted

due to a lack of causal understanding, see Hernan et al, (2008). Randomized trials gave the correct answer

Frailty selection is a relevant issue (Stensrud et al, 2017)

Today, the attitude to use of HRT is very cautious

52

Page 14: Causality and causal inference in survival analysisfolk.uio.no › borgan › IMB9335 › slides › Causality_Friday_Aalen.pdf · Causality and causal inference in survival analysis

Pearl’s new book

Popularizing ideasfrom his first book

Many niceexamples

Discussing alsoartificialintelligence

53

Summary

Causal inference is a large and complex area

Various types of survival bias are always present

Causal inference more and more becomes the

norm of analysis and presentation in an

international epidemiological setting

54

Some references

Lange, T. and Hansen, J. V, (2011). Direct and Indirect Effects in a Survival

Context. Epidemiology, 22(4):575-581. The electronic supplement contains

some information on software

Røysland, K., Jon Michael Gran et al. (2011). Analyzing direct and indirect

effects of treatment using dynamic path analysis applied to data from the

Swiss HIV Cohort Study. Statistics in Medicine. DOI: 10.1002/sim.4324

Pearl J. Causality: Models, Reasoning, and Inference, 2nd edn. Cambridge

University Press: Cambridge, 2009.

Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s

dream? Epidemiology 2006; 17:360–372.

Gran JM et al. A sequential Cox approach for estimating the causal effect of

treatment in the presence of time dependent confounding. Statistics in

Medicine 2010; 29:2757–2768

Robins JM, Hernan MA, Brumback B. Marginal structural models and causal

inference in epidemiology. Epidemiology 2000; 11(5):550–560.