from association analysis to causal discovery

37
From Association Analysis to Causal Discovery Prof Jiuyong Li University of South Australia

Upload: kalia-houston

Post on 01-Jan-2016

43 views

Category:

Documents


0 download

DESCRIPTION

From Association Analysis to Causal Discovery. Prof Jiuyong Li University of South Australia. Association analysis. Diapers -> Beer Bread & Butter -> Milk. P ositive correlation of birth rate to stork population. increasing the stork population would increase the birth rate ?. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: From Association Analysis to Causal Discovery

From Association Analysis to Causal Discovery

Prof Jiuyong Li

University of South Australia

Page 2: From Association Analysis to Causal Discovery

Association analysis• Diapers -> Beer• Bread & Butter -> Milk

Page 3: From Association Analysis to Causal Discovery

Positive correlation of birth rate to stork population

• increasing the stork population would increase the birth rate?

Page 4: From Association Analysis to Causal Discovery

Further evidence for Causality ≠ AssociationsSimpson paradox

Recovered Not recovered Sum Recover rate

Drug 20 20 40 50%

No Drug 16 24 40 40%

36 44 80

Female Recovered Not recovered Sum Recover rate

Drug 2 8 10 20%

No Drug 9 21 30 30%

11 29 40

Male Recovered Not recovered Sum Recover rate

Drug 18 12 30 60%

No Drug 7 3 10 70%

25 15 40

Page 5: From Association Analysis to Causal Discovery

Association and Causal Relationship

• Two variables X and Y.• Prob(Y | X) ≠ P(Y), X is associated with Y (association

rules)• Prob(Y | do X) ≠ Prob(Y | X)• How does Y vary when X changes?

• The key, How to estimate Prob(Y | do X)? • In association analysis, the relationship of X and Y

is analysed in isolation. • However, the relationship between X and Y is

affected by other variables.

5

Page 7: From Association Analysis to Causal Discovery

Causal discovery 2• Bayesian network based

causal inference – Do-calculus (Pearl 2000)– IDA (Maathuis et al. 2009) – To infer causal effects in

a Bayesian network. – However– Constructing a Bayesian

network is NP hard– Low scalability to large

number of variables

Page 8: From Association Analysis to Causal Discovery

Leaning causal structures• PC algorithm (Spirtes,

Glymour and Scheines)– Not (A ╨ B | Z), there is an

edge between A and B.– The search space

exponentially increases with the number of variables.

• Constraint based search– CCC (G. F. Cooper, 1997)– CCU (C. Silverstein et. al.

2000)– Efficiently removing non-

causal relationships.

A C

B

ABC

CCU

A C

B

ABC, ABC, CAB

CCC

Page 9: From Association Analysis to Causal Discovery

Association rules

• Many efficient algorithms

• Hundreds of thousands to millions of rules.– Many are spurious.

• Interpretability– Association rules do

not indicate causal effects.

Page 10: From Association Analysis to Causal Discovery

Causal rules• Discover causal relationships using partial association

and simulated cohort study. • Do not rely on Bayesian network structure learning. The

discovery of causal rules also have strong theoretical support.

• Discover both single cause and combined causes.• Can be discovered efficiently.

• Z. Jin, J. Li, L. Liu, T. D. Le, B. Sun, and R. Wang, Discovery of causal rules using partial association. ICDM, 2012

• J. Li, T. D. Le, L. Liu, J. Liu, Z. Jin, and B. Sun. Mining causal association rules. In Proceedings of ICDM Workshop on Causal Discovery (CD), 2013.

Page 11: From Association Analysis to Causal Discovery

Problem

A B C D E F Y #repeats

1 1 1 1 1 1 1 14

1 0 1 1 1 1 1 8

1 1 0 1 0 1 1 15

0 1 1 1 1 1 1 8

0 1 0 0 0 0 0 5

0 0 0 0 1 0 1 6

1 0 0 0 0 1 0 4

1 0 1 1 1 0 0 3

0 1 0 1 1 0 0 3

0 1 0 0 1 0 0 5

Discover causal rules from large databases of binary variables

A YC YBF YDE Y

Page 12: From Association Analysis to Causal Discovery

Partial association test

I J

K

I JK

I J

K

M. W. Birch, 1964.

Nonzero partial association

Page 13: From Association Analysis to Causal Discovery

Partial association test – an example

4. Partial association test.A B C D E F Y G #repeat

1 1 1 1 1 1 1 0 14

1 0 1 1 1 1 1 0 8

1 1 0 1 0 1 1 0 15

0 1 1 1 1 1 1 0 8

0 1 0 0 0 0 0 0 5

0 0 0 0 1 0 1 0 6

1 0 0 0 0 1 0 0 4

1 0 1 1 1 0 0 0 3

1 1 1 1 0 1 1 1 3

0 1 0 0 1 0 0 0 5

k kk

kkkk

k k

kkkk

nnnnnn

nnnnn

KYXPA

)1(

)21

|(|

),,(

2001.1

201100011

),,( ACDEYBFPA

68.125

8031401100011

k

kkkk

n

nnnn 6776.0)125(25

3112214

)1( 22001.1

kk

kkkk

nn

nnnn

Page 14: From Association Analysis to Causal Discovery

Fast partial association test

• K denotes all possible variable combinations, the number is very large.

• Counting the frequencies of the combinations is also time consuming.

• Our solution: – Sort data and count frequencies of the

equivalence classes.– Only use the combinations existing in the data set.

Page 15: From Association Analysis to Causal Discovery

Pruning strategies Definition (Redundant causal rules): Assume that X W, if X → Y is a causal rule, ⊂rule W → Y is redundant as it does not provide new information.

Definition (Condition for testing causal rules): We only test a combined causal rule XV → Y if X and Y have a zero association and V and Y have a zero association (cannot pass the qui-square test in step 3).

Page 16: From Association Analysis to Causal Discovery

AlgorithmA B C D E F G Y #repeats

1 1 1 1 1 1 0 1 14

1 0 1 1 1 1 0 1 8

1 1 0 1 0 1 0 1 15

0 1 1 1 1 1 0 1 8

0 1 0 0 0 0 0 0 5

0 0 0 0 1 0 0 1 6

1 0 0 0 0 1 0 0 4

1 0 1 1 1 0 0 0 3

1 1 1 1 1 1 1 0 3

0 1 0 0 1 0 0 0 5

1. Prune the variable set (support)

2. Create the contingency table for each variable X

x

Y=1 Y=0 Total

X=1 n11 n12 n1.

X=0 n21 n22 n2.

Total n.1 n.2 n

3. Calculate the • If go to next step

2, YX

22, YX

22, YX

2,2

1,1

22

, )(

))((ji

ji ij

ijijYX nE

nEn

4. Partial association test.• If PA(X, Y, K) is nonzero

then XY is a causal rule.

5. Repeat 1-4 for each variable

which is the combination of variables in set N

• If move X to a set N

positive association

zero association

2),,( KYXPA

Page 17: From Association Analysis to Causal Discovery

Experimental evaluations• We use the Arrhythmia data set in UCI machine learning

repository.

– We need to classify the presence and absence of cardiac arrhythmia. The data set contains 452 records and each record obtains 279 data attributes and one class attribute

• Our results are quite consistent with the results from CCC method.

• Some rules in CCC are removed by our method as they cannot pass the partial association test.

• Our method can discover the combined rules. CCC and CCU methods are not set to discover these rules.

Page 18: From Association Analysis to Causal Discovery

Comparison with CCC and CCU

Page 19: From Association Analysis to Causal Discovery

Experimental evaluations

Figure 1: Extraction Time Comparison (20K Records) Figure 1: Extraction Time Comparison (100K Records)

Page 20: From Association Analysis to Causal Discovery

Summary 1

• Simpson paradox– Associations might be inconsistent in subsets

• Partial association test– Test the persistency of associations in all possible

partitions. – Statistically sound.– Efficiency in sparse data.

• What else?

Page 21: From Association Analysis to Causal Discovery

Cohort study 1

Defined population

Expose Not expose

Not havea disease

Have a disease

Not have a disease

Have a disease

• Prospective: follow up.• Retrospective: look back. Historic study.

Page 22: From Association Analysis to Causal Discovery

Cohort study 2

• Cohorts: share common characteristics but exposed or not exposed.

• Determine how the exposure causes an outcome.

• Measure: odds ratio = (a/b) / (c/d)Diseased Healthy

Exposed a bNot exposed c d

Page 23: From Association Analysis to Causal Discovery

Limitations of cohort study• Need to know a hypothesis beforehand• Domain experts determine the control

variables.• Collect data and test the hypothesis. • Not for data exploration.

• We need– Given a data set without any hypotheses.– An automatic method to find and validate

hypotheses.– For data exploration.

Page 24: From Association Analysis to Causal Discovery

Control variables

• If we do not control covariates (especially those correlated to the outcome), we could not determine the true cause.

• Too many control variables result too few matched cases in data.– How many people with the same race, gender, blood type,

hair colour, eye colour, education level, …. • Irrelevant variables should not be controlled.

– Eye colour may not relevant to the study.

Cause Outcome

Other factors

Page 25: From Association Analysis to Causal Discovery

Matches• Exact matching

– Exact matches on all covariates. Infeasible.• Limited exact matching

– Exact matches on a few key covariates. • Nearest neighbour matching

– Find the closest neighbours• Propensity score matching

– Based on the predicted effect of a treatment of covariates.

Page 26: From Association Analysis to Causal Discovery

Method1

A B C D E F Y

1 1 1 1 1 1 1

1 0 1 1 1 1 1

1 1 0 1 0 1 1

0 1 1 1 1 1 1

0 1 0 0 0 0 0

0 0 0 0 1 0 1

1 0 0 0 0 1 0

1 0 1 1 1 0 0

0 1 0 1 1 0 0

0 1 0 0 1 0 0

Discover causal association rules from large databases of binary variables

A YA B C D E F Y

1 1 1 1 1 1 1

1 0 1 0 1 1 1

1 1 0 1 0 1 0

1 0 1 0 1 0 0

0 1 1 1 1 1 0

0 0 1 0 1 1 0

0 1 0 1 0 1 1

0 0 1 0 1 0 1

Fair dataset

Page 27: From Association Analysis to Causal Discovery

Methods

A B C D E F Y

1 1 1 1 1 1 1

1 0 1 0 1 1 1

1 1 0 1 0 1 0

1 0 1 0 1 0 0

0 1 1 1 1 1 0

0 0 1 0 1 1 0

0 1 0 1 0 1 1

0 0 1 0 1 0 1

Fair dataset• A: Exposure variable

• {B,C,D,E,F}: controlled variable set.

• Rows with the same color for the controlled variable set are called matched record pairs.

A=0

A=1 Y=1 Y=0

Y=1 n11 n12

Y=0 n21 n22

• An association rule is a causal association rule if: A Y

1)( YAOddsRatiofD

Page 28: From Association Analysis to Causal Discovery

Algorithm

28

A B C D E F G Y

1 1 1 1 1 1 0 1

… … …

1 1 0 1 0 1 0 1

1. Remove irrelevant variables (support, local support, association)

2. Find the exclusive variables of the exposure variable (support, association), i.e. G, F.

The controlled variable set = {B, C, D, E}.

x

3. Find the fair dataset. Search for all matched record pairs

4. Calculate the odds-ratio to identify if the testing rule is causal

5. Repeat 2-4 for each variable which is the combination of variables. Only consider combination of non-causal factors.

For each association rule (e. g. ) A Y

A B C D E Y

1 1 1 1 1 1

… … …

0 1 1 1 1 0

… …

x

Page 29: From Association Analysis to Causal Discovery

Experimental evaluations

Page 30: From Association Analysis to Causal Discovery

Experimental evaluations

Figure 1: Extraction Time Comparison (20K Records)

CAR CCC CCU

Page 31: From Association Analysis to Causal Discovery

Experimental evaluations

Page 32: From Association Analysis to Causal Discovery

Causality – Judea Pearl

Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2000.32

X1 X2 … Xn-1 Xn

5.2 7.5 6.5 5.2

5.6 7.2 6.6 5.3

… … … … …

5.4 7.1 7.1 5.7

5.7 6.9 6.9 5.8

+1

+0.8

Page 33: From Association Analysis to Causal Discovery

Methods

• IDA– Maathuis, H. M.,

Colombo, D., Kalisch, M., and Buhlmann, P. (2010). Predicting causal effects in large-scale systems from observational data. Nature Methods, 7(4), 247–249.

33

Page 34: From Association Analysis to Causal Discovery

Conclusions• Association analysis has been widely used in data

mining, but associations do not indicate causal relationships.

• Association rule mining can be adapted for causal relationship discovery by combining some statistical methods.

– Partial association test

– Cohort study

• They are efficient alternatives for causal Bayesian network based methods.

• They are capable of finding combined causal factors.

Page 35: From Association Analysis to Causal Discovery

Discussions• Causality and classification

– Estimate prob (Y| do X) instead of prob (Y|X).

• Feature section versus controlled variable selection.

• Evaluation of causes.– Not classification accuracy– Bayesian networks??

Page 36: From Association Analysis to Causal Discovery

Research Collaborators

• Jixue Liu• Lin Liu• Thuc Le• Jin Zhou• Bin-yu Sun

Page 37: From Association Analysis to Causal Discovery

Thank you for listening

Questions please ??