Richard Scheines, Carnegie Mellon University
Causal Graphical Models II: Applications with Search

Upload: reynold-dennis

Post on 23-Dec-2015



1. Foreign Investment

2. Welfare Reform

3. Online Learning

4. Charitable Giving

5. Stress & Prayer

6. Test Anxiety

7. Causal Connectivity among Brain Regions

Case Studies


1. Exceedingly simple

2. Background theory weak

3. Claim:

– Not: search output is true

– Is: search adds value

Case Studies


Case Study 1: Foreign Investment

Does Foreign Investment in 3rd World Countries cause Political Repression?

Timberlake, M. and Williams, K. (1984). Dependence, political exclusion, and government repression: Some cross-national evidence. American Sociological Review 49, 141-146.

N = 72

PO degree of political exclusivity

CV lack of civil liberties

EN energy consumption per capita (economic development)

FI level of foreign investment


Correlations

        po       fi       en
fi    -.175
en    -.480    0.330
cv    0.868    -.391    -.430
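A constraint-based search asks whether an association survives conditioning on other variables. The sketch below is an illustration using only the correlations on this slide, and is not necessarily the test Tetrad ran. It computes the first-order partial correlation of po and fi given en and a Fisher z statistic. The function names are hypothetical.

```python
import math

# Correlations reported on the slide (N = 72).
r_po_fi, r_po_en, r_fi_en = -0.175, -0.480, 0.330
n = 72

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y given z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def fisher_z_stat(r, n, n_cond):
    """Approximately standard-normal statistic for H0: partial correlation = 0."""
    return 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - n_cond - 3)

r = partial_corr(r_po_fi, r_po_en, r_fi_en)
z = fisher_z_stat(r, n, n_cond=1)
print(f"r(po, fi | en) = {r:.3f}, z = {z:.2f}")
```

The partial correlation is close to zero and the statistic is far inside the usual plus or minus 1.96 band, so these correlations do not reject independence of po and fi given en.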

Case Study 1: Foreign Investment


Regression Results

po = .227*fi - .176*en + .880*cv

        fi        en        cv
SE    (.058)    (.059)    (.060)
t     3.941     -2.99     14.6

Interpretation: foreign investment increases political repression
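The regression coefficients can be recovered from the correlation table alone, assuming the reported coefficients are standardized (that is an assumption, though the numbers agree). A minimal sketch: solve R_xx * beta = r_xy, where R_xx holds the correlations among fi, en, cv and r_xy their correlations with po. The helper `solve` is a hypothetical small Gaussian-elimination routine, not part of any package.

```python
# Correlations from the Correlations slide, ordered fi, en, cv.
R_xx = [[1.0, 0.330, -0.391],
        [0.330, 1.0, -0.430],
        [-0.391, -0.430, 1.0]]
r_xy = [-0.175, -0.480, 0.868]  # correlations of fi, en, cv with po

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

beta = solve(R_xx, r_xy)                      # standardized coefficients
r_squared = sum(b * r for b, r in zip(beta, r_xy))
print([round(b, 3) for b in beta], round(r_squared, 3))
```

Note the sign flip: fi is negatively correlated with po (-.175) yet gets a positive coefficient once en and cv are included, which is the basis for the interpretation above.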

Case Study 1: Foreign Investment


Alternatives

[Figure: three path diagrams over FI, PO, CV, and En.
Regression: edge coefficients .217, .88, -.176.
Tetrad - FCI: search output.
Third diagram: edge coefficients .31, -.23, .86, -.48. Fit: df = 2, χ² = 0.12, p-value = .94]

Case Study 1: Foreign Investment

There is no model with testable constraints (df > 0) in which FI has a positive effect on PO that is not rejected by the data.
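The fit p-values quoted in this deck are upper-tail chi-square probabilities, so they can be checked with a short standard-library routine. The sketch below uses the closed-form finite series for integer degrees of freedom. `chi2_sf` is a hypothetical helper name, not a Tetrad function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2) / k
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite series in sqrt(x).
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return tail + 2 * phi * total

print(round(chi2_sf(0.12, 2), 2))  # fit reported for Case Study 1
```

A large p-value here means the model's testable constraints are not rejected by the data. It does not show the model is true.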


Aurora Jackson, Richard Scheines

Single Mothers’ Self-Efficacy, Parenting in the Home Environment, and Children’s Development in a Two-Wave Study

(Social Work Research, 29, 1, 7-20)

Case Study 2: Welfare Reform


Two-Wave Longitudinal Study

• Longitudinal Data

o Time 1: 1996-97 (N = 188)

o Time 2: 1998-99 (N = 178)

• Single black mothers in NYC

• Current and former welfare recipients

• With a child who was 3 to 5 at time 1 and 6 to 8 at time 2

Case Study 2: Welfare Reform


Constructs/Scales/Measures

• Employment Status

• Perceived Self-efficacy

• Depressive Symptoms

• Quality of Mother/Father Relationship

• Father/Child Contact

• Quality of Home Environment

• Behavior Problems

• Cognitive Development

Case Study 2: Welfare Reform


Background Knowledge

Tier 1:

• Employment Status

Tier 2:

• Depression

• Self-efficacy

• Mother/Father Relationship

• Father/Child Contact

• Mother’s Parenting/HOME

Tier 3:

• Negative Behaviors

• Cognitive Development

Over 22 million path models consistent with these constraints

Case Study 2: Welfare Reform


[Figure: two fitted path diagrams, labeled "Tetrad Equivalence Class" and "Conceptual Model", over Employment Status (Time 1), Mother’s self-efficacy (Time 1), Mother’s depressive symptoms (Time 1), Mother/Father Relationship (Time 1), Father/Child Contact (Time 1), Mother’s Parenting/Home Environment (Time 1), Negative Behaviors (Time 2), and Cognitive Development (Time 2).]

First diagram: path coefficients .215**, -.456*, -.129*, .407**, .162*, -.291**, .166*, .184*. Fit: χ²(19) = 18.87, p = .46, GFI = .97, AGFI = .95

Second diagram: path coefficients .215**, -.472*, -.184*, .407**, .150*, -.291**, .166*, -.166*. Fit: χ²(20) = 22.3, p = .32, GFI = .97, AGFI = .95

* = p < .05 ** = p < .01

Case Study 2: Welfare Reform


[Figure: the two path diagrams from the previous slide, repeated, labeled "Tetrad" and "Conceptual Model".]

Points of Agreement:

• Mother’s Self-Efficacy mediates the effect of Employment on all other variables.

• Home environment mediates the effect of all other factors on outcomes: Cognitive Development and Problem Behaviors

Points of Disagreement:

• Depression: key cause vs. only an effect

Case Study 2: Welfare Reform


Online Course in Causal & Statistical Reasoning

Case Study 3: Online Courseware


Variables

Pre-test (%)

Print-outs (% modules printed)

Quiz Scores (avg. %)

Voluntary Exercises (% completed)

Final Exam (%)

9 other variables

Case Study 3: Online Courseware

Tier 1

Tier 2

Tier 3


Printing and Voluntary Comprehension Checks: 2002 --> 2003

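[Figure: two path diagrams.
2002: over pre, print, voluntary questions, quiz, and final; edge coefficients .302*, -.41**, .75**, .353*, .323*.
2003: over pre, print, voluntary questions, and final; edge coefficients -.08, -.16, .41*, .25*.]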

Case Study 3: Online Courseware


Variables

Tangibility/Concreteness (experimental manipulation)

Imaginability (Likert 1-7)

Impact (avg. of 2 Likert items)

Sympathy (Likert)

Donation ($)

Case Study 4: Charitable Giving

Cryder & Loewenstein (in prep)


Theoretical Model

Case Study 4: Charitable Giving

[Figure: path diagram over Tangibility, Imaginability, Impact, Sympathy, and Donation.]

Study 1 (N = 94): df = 5, χ² = 52.0, p = 0.0000


GES Outputs

Case Study 4: Charitable Giving

[Figure: two GES output path diagrams over Tangibility, Imaginability, Impact, Sympathy, and Donation.]

First output, study 1: df = 5, χ² = 5.88, p = 0.32

Second output, study 1: df = 5, χ² = 3.99, p = 0.55


Theoretical Model

Case Study 4: Charitable Giving

[Figure: the theoretical model and two further path diagrams over Tangibility, Imaginability, Impact, Sympathy, and Donation.]

Theoretical model, study 2 (N = 115): df = 5, χ² = 62.6, p = 0.0000

Second diagram, study 2: df = 5, χ² = 8.23, p = 0.14

Third diagram, study 2: df = 5, χ² = 7.48, p = 0.18


Build Pure Clusters

Output - provably reliable (pointwise consistent):

Equivalence class of measurement models over a pure subset of measures

[Figure: BPC maps a True Model (latents Stress, Dep, Health; measures m1 to m11) to an Output measurement model (latents L1, L2, L3; measures m1 to m9).]


Build Pure Clusters

Qualitative Assumptions:

1. Two types of nodes: measured (M) and latent (L)

2. M ↛ L (measured don’t cause latents)

3. Each m ∈ M measures (is a direct effect of) at least one l ∈ L

4. No cycles involving M

Quantitative Assumptions:

1. Each m ∈ M is a linear function of its parents plus noise

2. P(L) has second moments, positive variances, and no deterministic relations
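One well-known consequence of these assumptions: when several measures are pure indicators of a single latent, their covariances satisfy vanishing tetrad differences. The slides do not say which tests BPC runs, so the sketch below only illustrates that constraint on model-implied covariances. The loadings and the shared error covariance are made-up values, and the function names are hypothetical.

```python
from itertools import combinations

def implied_cov(loadings, latent_var, extra=None):
    """Off-diagonal covariances implied by a one-latent linear model.
    extra maps a pair (i, j), i < j, to error covariance shared by measures i and j."""
    extra = extra or {}
    return {(i, j): loadings[i] * loadings[j] * latent_var + extra.get((i, j), 0.0)
            for i, j in combinations(range(len(loadings)), 2)}

def tetrad_differences(cov, quad):
    """The three tetrad differences among four measures."""
    a, b, c, d = quad
    s = lambda i, j: cov[(min(i, j), max(i, j))]
    return (s(a, b) * s(c, d) - s(a, c) * s(b, d),
            s(a, b) * s(c, d) - s(a, d) * s(b, c),
            s(a, c) * s(b, d) - s(a, d) * s(b, c))

# Four pure measures of one latent: all tetrad differences vanish.
pure = implied_cov([0.9, 0.7, 0.8, 0.6], latent_var=1.0)
# Same loadings, but measures 0 and 1 share an extra error source (impure).
impure = implied_cov([0.9, 0.7, 0.8, 0.6], latent_var=1.0, extra={(0, 1): 0.2})

print(tetrad_differences(pure, (0, 1, 2, 3)))
print(tetrad_differences(impure, (0, 1, 2, 3)))
```

With sample data the differences would be tested statistically rather than compared to exact zero.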


Case Study 5: Stress, Depression, and Religion

MSW Students (N = 127) 61 - item survey (Likert Scale)

• Stress: St1 - St21

• Depression: D1 - D20

• Religious Coping: C1 - C20

[Figure: Specified Model. Latents Stress (indicators St1 to St21), Depression (indicators Dep1 to Dep20), and Coping (indicators C1 to C20); latent-to-latent edges signed +, -, +. Model fit: p = 0.00]


Build Pure Clusters

[Figure: BPC output. Stress: St3, St4, St16, St18, St20. Depression: Dep9, Dep13, Dep19. Coping: C9, C12, C14, C15.]

Case Study 5: Stress, Depression, and Religion


Assume Stress temporally prior:

MIMbuild to find Latent Structure:

[Figure: latent structure over Stress (St3, St4, St16, St18, St20), Depression (Dep9, Dep13, Dep19), and Coping (C9, C12, C14, C15); two latent edges, both signed +. Model fit: p = 0.28]

Case Study 5: Stress, Depression, and Religion


Case Study 6: Test Anxiety

Bartholomew and Knott (1999), Latent variable models and factor analysis

12th Grade Males in British Columbia (N = 335)

20 - item survey (Likert Scale items): X1 - X20:

Exploratory Factor Analysis:

[Figure: two-factor model with latents Emotionality and Worry over indicators X2, X8, X9, X10, X15, X16, X18, X3, X4, X5, X6, X7, X14, X17, X20.]


Build Pure Clusters:

[Figure: BPC output with latents Emotionality, Cares About Achieving, and Self-Defeating over indicators X2, X8, X9, X10, X11, X16, X18, X3, X5, X7, X14, X6.]

Case Study 6: Test Anxiety


Build Pure Clusters:

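[Figure: BPC model with latents Emotionality, Worries About Achieving, and Self-Defeating over indicators X2, X8, X9, X10, X11, X16, X18, X3, X5, X7, X14, X6.]

Exploratory Factor Analysis:

[Figure: two-factor model with latents Emotionality and Worry over indicators X2, X8, X9, X10, X15, X16, X18, X3, X4, X5, X6, X7, X14, X17, X20.]

p-values shown: 0.00 and 0.47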

Case Study 6: Test Anxiety


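MIMbuild

[Figure: latent structure over Emotionality, Worries About Achieving, and Self-Defeating, with indicators X2, X8, X9, X10, X11, X16, X18, X3, X5, X7, X14, X6. p = .43]

[Figure: the same three constructs as scales (Emotionality-Scale, Worries About Achieving-Scale, Self-Defeating), labeled Uninformative.]

Scales: No Independencies or Conditional Independencies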

Case Study 6: Test Anxiety


• Goals:
– Identify relatively BIG brain regions (ROIs).
– Figure out how they influence one another, with what timing sequences, in producing behaviors of interest.
– Figure out individual differences.

Case Study 7: fMRI Brain Connectivity


• Experiment: (Xue and Poldrack, unpublished)
– 13 right handed subjects
– On each trial, subject judged whether visual stimuli rhymed or not
– 8 pairs of words/nonwords presented for 2.5 seconds each in eight 20 second blocks, separated by 20 seconds of visual fixation
– TR = 2000 milliseconds
– 160 time points.
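The design numbers are mutually consistent, which the sketch below checks by building a 0/1 block regressor. It assumes each task block is followed by one fixation period. The slide does not state the ordering, and the constant and function names are hypothetical.

```python
TR_SECONDS = 2.0          # TR = 2000 milliseconds
N_BLOCKS = 8
BLOCK_SECONDS = 20.0
FIXATION_SECONDS = 20.0
PAIRS_PER_BLOCK = 8
SECONDS_PER_PAIR = 2.5

# 8 pairs at 2.5 s each fill one 20 s block.
assert PAIRS_PER_BLOCK * SECONDS_PER_PAIR == BLOCK_SECONDS

def block_regressor():
    """1 for scans acquired during a task block, 0 during fixation.
    Assumes the run is task, fixation, task, fixation, and so on."""
    scans_task = int(BLOCK_SECONDS / TR_SECONDS)
    scans_rest = int(FIXATION_SECONDS / TR_SECONDS)
    return ([1] * scans_task + [0] * scans_rest) * N_BLOCKS

design = block_regressor()
print(len(design), sum(design))
```

Eight 40-second cycles sampled every 2 seconds give the 160 time points listed above.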

Case Study 7: fMRI


• Problems:
– Criteria for identifying ROIs
– Individuals differ
  • Brain ROIs
  • Parameter values
– Brain processing is cyclic
– Time:
  • Varying time delays of neuron ROI BOLD response
  • Time series sampling rate vs. processing rate
– Search Space
  • 11 ROIs – 323 DAGs
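The DAG count printed on the slide is garbled in this transcript, so the sketch below computes the size of the search space directly with Robinson's recurrence for the number of labeled DAGs. This is a computed figure, not one taken from the slide.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled directed acyclic graphs on n nodes (Robinson's recurrence)."""
    if n == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
               for k in range(1, n + 1))

print(count_dags(11))
```

For 11 nodes the count is on the order of 10^22, so exhaustive enumeration is not an option and a guided search is needed.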

Case Study 7: fMRI Brain Connectivity


ROI Construction

• Mean of signal intensity among voxels in a cluster at a time
• 1st or ... 4th principal component
• Average of top X% variance
• Maximum variance voxel
• Eyeballs
• Etc.
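Three of the listed summaries can be sketched with the standard library alone. The voxel data are made up, the function names are hypothetical, and "top X% variance" is read here as averaging the voxels whose variance is in the top fraction, which is one plausible reading of the bullet.

```python
from statistics import mean, pvariance

def roi_mean(voxels):
    """voxels: list of equal-length voxel time series. Mean signal per time point."""
    return [mean(v[t] for v in voxels) for t in range(len(voxels[0]))]

def roi_max_variance(voxels):
    """Time series of the single voxel with the largest variance."""
    return max(voxels, key=pvariance)

def roi_top_variance_mean(voxels, fraction):
    """Mean time series over the voxels in the top `fraction` by variance."""
    ranked = sorted(voxels, key=pvariance, reverse=True)
    keep = max(1, round(len(ranked) * fraction))
    return roi_mean(ranked[:keep])

# Toy cluster of three voxels observed at four time points.
cluster = [[1.0, 2.0, 3.0, 2.0],
           [1.0, 1.0, 1.0, 1.0],
           [0.0, 4.0, 0.0, 4.0]]

print(roi_mean(cluster))
print(roi_max_variance(cluster))
print(roi_top_variance_mean(cluster, 0.34))
```

The choices can disagree sharply, as here, where the flat voxel pulls the mean toward a signal none of the informative voxels show.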

Case Study 7: fMRI


Example ROIs

Case Study 7: fMRI


Case Study 7: fMRI Brain Connectivity

– Individuals differ
  • Brain ROIs
  • Parameter values

– Assume
  • same qualitative causal structure
  • different quantitative causal structure (mixed effects)

– iMAGES search
  • Apply GES to each subject, 1 step
  • Take step = max(avg. BIC score) to each search
  • Repeat
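The step-selection idea can be sketched as follows: each candidate move is scored separately on every subject, and the move with the best average score is applied to all subjects' graphs. This is a toy illustration, not the iMAGES implementation. The score gains are invented, the names are hypothetical, and it assumes higher BIC is better.

```python
from statistics import mean

def best_shared_step(candidate_gains):
    """candidate_gains maps a candidate move (e.g. an edge to add) to a list of
    per-subject BIC improvements. Returns the move with the highest average
    improvement, or None if no move improves the average score."""
    move, gains = max(candidate_gains.items(), key=lambda kv: mean(kv[1]))
    return move if mean(gains) > 0 else None

# Hypothetical score gains for three subjects; higher is better.
gains = {
    ("ROI1", "ROI2"): [4.0, 3.5, 5.0],
    ("ROI2", "ROI3"): [9.0, -2.0, -1.0],   # strong in one subject only
    ("ROI1", "ROI3"): [0.5, 0.2, -0.1],
}
step = best_shared_step(gains)
print(step)
```

Averaging favors moves supported across subjects over a move that is strongly preferred by a single subject, which matches the assumption of a shared qualitative structure.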


Time Problem 1

• fMRI recordings at time intervals can be analyzed as a collection of independent cases.

• Or, they can be analyzed as an auto-regressive time series.

• Which is better?
– No general answer.
– But if you think the neural activities measured at time t influence the measurements at time t+1 then the data should be treated as a lag 1 auto-regressive time series.
– But then Granger causality isn’t a consistent estimator of causal relations.

Case Study 7: fMRI


Granger Causality Corrected

Causal processes faster than the sampling rate:

[Figure: time-series graph over X, Y, and Z at times t and t+1.]

Regress on t variables

Apply GES to the RESIDUALS of the regression (Demiralp, Hoover)

NO False path
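The first half of this recipe, regressing every variable at t+1 on all variables at t and keeping the residuals, can be sketched with the standard library. The search step is not shown because no GES implementation is assumed here. The simulated series and its coefficients are invented, and the function names are hypothetical.

```python
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def lag1_residuals(series):
    """series: list of time points, each a list of variable values.
    Regresses each variable at t+1 on an intercept and all variables at t (OLS)
    and returns the residuals, one row per time point t+1."""
    past = [[1.0] + row for row in series[:-1]]
    future = series[1:]
    k = len(past[0])
    xtx = [[sum(p[i] * p[j] for p in past) for j in range(k)] for i in range(k)]
    columns = []
    for v in range(len(series[0])):
        xty = [sum(p[i] * f[v] for p, f in zip(past, future)) for i in range(k)]
        beta = solve(xtx, xty)
        columns.append([f[v] - sum(b * x for b, x in zip(beta, p))
                        for p, f in zip(past, future)])
    return [list(row) for row in zip(*columns)]

# Toy series in which X drives Y within a single sampling interval.
rng = random.Random(0)
series = [[0.0, 0.0, 0.0]]
for _ in range(300):
    x_prev, y_prev, z_prev = series[-1]
    x = 0.5 * x_prev + rng.gauss(0, 1)
    y = 0.3 * y_prev + 0.8 * x + rng.gauss(0, 1)
    z = 0.4 * z_prev + 0.6 * y_prev + rng.gauss(0, 1)
    series.append([x, y, z])

residuals = lag1_residuals(series)  # input for a search over same-time relations
```

The residuals still carry the within-interval dependence of Y on X, which is what the subsequent search is meant to pick up.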

Case Study 7: fMRI


Time Problem 2

• Varying time delays: neurons → BOLD responses

Case Study 7: fMRI

• Try all time shifts of one or two units over all subsets of 3 vars, choose shift that leads to best likelihoods


[Figure: Lag 0 result and Lag 1 result.]


Simulation Studies:

• 11 ROIs, each consisting of 50 simulated neurons:
  • Neuron output spikes simulated by thresholding a tanh function of the sum of neuron inputs.
  • Excitatory feedback
  • Random subset of neurons in one ROI input to random subset of neurons in an “effectively connected ROI”

• Measured variables = BOLD function of sum of ROI neurons + Gaussian error with variance = error variances of empirical measured variables in the X/P experiment.

Case Study 7: fMRI


Simulate the Xue/Poldrack Experiment Time Series:

• Repeat 10 times:
– Randomly generate a graphical structure with 11 nodes and 11 (feedforward) directed edges
– Randomly select a subset of simulated ROIs.
– Generate data
– Randomly shift 0 to 3 variables one or 2 time steps forward.
– Apply the iMAGES method with 0 lag and 1 lag, with backshifting.

• Tabulate the errors.

Case Study 7: fMRI


Simulation Results

0 Lag:

Average number of false positive edges: 0.7

Average number of mis-directed edges: 1.6

1 Lag Residuals:

Average number of false positive edges: 1.2

Average number of mis-directed edges: 1.8

Case Study 7: fMRI


Other Cases

Economics
• Bessler, Pork Prices
• Hoover, multiple
• Cryder & Loewenstein, Charitable Giving

Educational Research
• Easterday, Bias & Recall
• Laski, Numerical coding

Climate Research
• Glymour, Chu, , Teleconnections

Biology
• Shipley,
• SGS, Spartina Grass

Neuroscience
• Glymour & Ramsey, fMRI

Epidemiology
• Scheines, Lead & IQ


Straw Men!

• Model Search ignores theory

• Model Search hides assumptions

• Model Search needs more assumptions than standard statistical models


References

Biology

Chu, T., Glymour, C., Scheines, R., & Spirtes, P. (2002). A Statistical Problem for Inference to Regulatory Structure from Associations of Gene Expression Measurement with Microarrays. Bioinformatics, 19: 1147-1152.

Shipley, B. Exploring hypothesis space: examples from organismal biology. Computation, Causation and Discovery. C. Glymour and G. Cooper. Cambridge, MA, MIT Press.

Shipley, B. (1995). Structured interspecific determinants of specific leaf area in 34 species of herbaceous angiosperms. Functional Ecology 9.

General

Spirtes, P., Glymour, C., Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition, MIT Press.

Pearl, J. (2000). Causality: Models, Reasoning, and Inference, Cambridge University Press.


Scheines, R. (2000). Estimating Latent Causal Influences: TETRAD III Variable Selection and Bayesian Parameter Estimation: the effect of Lead on IQ, Handbook of Data Mining, Pat Hayes, editor, Oxford University Press.

Jackson, A., and Scheines, R., (2005). Single Mothers' Self-Efficacy, Parenting in the Home Environment, and Children's Development in a Two-Wave Study, Social Work Research , 29, 1, pp. 7-20.

Timberlake, M. and Williams, K. (1984). Dependence, political exclusion, and government repression: Some cross-national evidence. American Sociological Review 49, 141-146.

 


Economics

Akleman, Derya G., David A. Bessler, and Diana M. Burton. (1999). ‘Modeling corn exports and exchange rates with directed graphs and statistical loss functions’, in Clark Glymour and Gregory F. Cooper (eds) Computation, Causation, and Discovery, American Association for Artificial Intelligence, Menlo Park, CA and MIT Press, Cambridge, MA, pp. 497-520.

Awokuse, T. O. (2005) “Export-led Growth and the Japanese Economy: Evidence from VAR and Directed Acyclical Graphs,” Applied Economics Letters 12(14), 849-858.

Bessler, David A. and N. Loper. (2001) “Economic Development: Evidence from Directed Acyclical Graphs” Manchester School 69(4), 457-476.

Bessler, David A. and Seongpyo Lee. (2002). ‘Money and prices: U.S. data 1869-1914 (a study with directed graphs)’, Empirical Economics, Vol. 27, pp. 427-46.

Demiralp, Selva and Kevin D. Hoover. (2003) “Searching for the Causal Structure of a Vector Autoregression,” Oxford Bulletin of Economics and Statistics 65(supplement), pp. 745-767.

Haigh, M.S., N.K. Nomikos, and D.A. Bessler (2004) “Integration and Causality in International Freight Markets: Modeling with Error Correction and Directed Acyclical Graphs,” Southern Economic Journal 71(1), 145-162.

Sheffrin, Steven M. and Robert K. Triest. (1998). ‘A new approach to causality and economic growth’, unpublished typescript, University of California, Davis.


Swanson, Norman R. and Clive W.J. Granger. (1997). ‘Impulse response functions based on a causal approach to residual orthogonalization in vector autoregressions’, Journal of the American Statistical Association, Vol. 92, pp. 357-67.

Demiralp, S., Hoover, K., & Perez, S. (2008). A Bootstrap Method for Identifying and Evaluating a Structural Vector Autoregression, Oxford Bulletin of Economics and Statistics, 70(4), 509-533.

Hoover, Kevin D., Selva Demiralp, and Stephen J. Perez. Empirical Identification of the Vector Autoregression: The Causes and Effects of U.S. M2. Paper written for the Conference in Honour of David F. Hendry at Oxford University, 23-25 August 2007.

A. Moneta, and P. Spirtes “Graphical Models for the Identification of Causal Structures in Multivariate Time Series Model”, Proceedings of the 2006 Joint Conference on Information Sciences, JCIS 2006, Kaohsiung, Taiwan, ROC, October 8-11,2006, Atlantis Press, 2006.


Extra


Lead and IQ: Variable Selection

Backwards Stepwise Regression

Measured Lead + 5 Covariates

Measured Lead + 39 Covariates

Final Variables (Needleman)

- lead: baby teeth
- fab: father’s age
- mab: mother’s age
- nlb: number of live births
- med: mother’s education
- piq: parent’s IQ
- ciq: child’s IQ


Needleman Regression

Standardized coefficients, t-ratios, and p-values for significance:

ciq = - .143 lead - .204 fab - .159 nlb + .219 med + .237 mab + .247 piq

               lead     fab      nlb      med      mab      piq
coefficient    -.143    -.204    -.159    .219     .237     .247
t-ratio        2.32     1.79     2.30     3.08     1.97     3.87
p-value        0.02     0.09     0.02     <0.01    0.05     <0.01

All variables significant at .1

R² = .271


TETRAD Variable Selection

Tetrad:

mab _||_ ciq

fab _||_ ciq

nlb _||_ ciq | med

[Figure: graph over ciq, mab, fab, nlb, lead, piq, and med.]

Regression:

mab _||_ ciq | {lead, med, piq, nlb, fab}

fab _||_ ciq | {lead, med, piq, nlb, mab}

nlb _||_ ciq | {lead, med, piq, mab, fab}


Regressions

Standardized coefficients, t-ratios, and p-values for significance:

Needleman (R² = .271)

ciq = - .143 lead - .204 fab - .159 nlb + .219 med + .237 mab + .247 piq

               lead     fab      nlb      med      mab      piq
coefficient    -.143    -.204    -.159    .219     .237     .247
t-ratio        2.32     1.79     2.30     3.08     1.97     3.87
p-value        0.02     0.09     0.02     <0.01    0.05     <0.01

TETRAD (R² = .243)

ciq = - .177 lead + .251 med + .253 piq

               lead     med      piq
coefficient    -.177    .251     .253
t-ratio        2.89     3.50     3.59
p-value        <0.01    <0.01    <0.01


Measurement Error

• Measured regressor variables are proxies that involve measurement error

• Errors-in-all-variables model for Lead’s influence on IQ - underidentified

[Figure: errors-in-all-variables model with latents Actual Lead Exposure, Environmental Stimulation, and Genetic factors, and measured variables lead, med, piq, and ciq.]

Strategies:

• Sensitivity Analysis

• Bayesian Analysis
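Why measurement error matters can be seen in the simplest case: with a single regressor, the expected OLS slope on a noisy proxy shrinks by the proportion of the proxy's variance that is error. The sketch below checks that formula by simulation. It covers only the one-regressor case, not the three-proxy model on this slide, where the bias need not be a simple shrinkage. The effect size and function names are hypothetical.

```python
import random

def attenuated_slope(true_slope, error_share):
    """Expected OLS slope of y on a noisy proxy of x, where error_share is the
    proportion of the proxy's variance that is measurement error."""
    return true_slope * (1.0 - error_share)

def simulated_slope(true_slope, error_share, n=50_000, seed=0):
    """Monte Carlo check: regress y on proxy = x + error, with var(x) = 1."""
    rng = random.Random(seed)
    err_sd = (error_share / (1.0 - error_share)) ** 0.5
    proxies, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        proxies.append(x + rng.gauss(0, err_sd))
        ys.append(true_slope * x + rng.gauss(0, 1))
    mp, my = sum(proxies) / n, sum(ys) / n
    sxy = sum((p - mp) * (y - my) for p, y in zip(proxies, ys))
    sxx = sum((p - mp) ** 2 for p in proxies)
    return sxy / sxx

# Hypothetical true effect of -0.5 with 20% of proxy variance due to error.
print(attenuated_slope(-0.5, 0.2), round(simulated_slope(-0.5, 0.2), 3))
```

Because the error shares are not identified from the data, the analysis has to either vary them (sensitivity analysis) or put a prior on them (Bayesian analysis).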


Prior over Measurement Error

Proportion of Variance from Measurement Error

• Measured Lead: Mean = .2, SD = .1
• Parent’s IQ: Mean = .3, SD = .15
• Mother’s Education: Mean = .3, SD = .15

Prior Otherwise uninformative

[Figure: the errors-in-all-variables model with latents Actual Lead Exposure, Environmental Stimulation, and Genetic factors, repeated.]


Posterior

[Figure: histogram of the distribution of LEAD->ciq (axis: Frequency), with an "Expected if Normal" overlay and Zero marked.]

Robust over similar priors


Using Needleman’s Covariates

With similar prior, the marginal posterior:

[Figure: histogram of the distribution of LEAD->ciq (axis: Frequency), with an "Expected if Normal" overlay and Zero marked.]

Very Sensitive to Prior Over Regressors TETRAD eliminated