Simpson's Paradox and Related Problems

Upload: amos-merritt

Post on 04-Jan-2016

Simpson's Paradox

And related problems

Simpson's Paradox

• Admission data from the 1960s show that male and female applicants had different admission rates at a famous university graduate school.

• But everyone involved at the graduate school claimed that the process was fair.

Hypothetical Data

• Two schools (Arts and Engineering)

• Male admission rate = 35/80 = .44

• Female admission rate = 20/60 = .33

2 by 2 table

          admit   deny
male        35     45
female      20     40

Further analysis

• School of Arts

• Male admission rate = 5/20 = .25

• Female admission rate = 10/40 = .25

• School of Engineering

• Male admission rate = 30/60 = 0.5

• Female admission rate = 10/20 = 0.5

Data

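• Notice that the per-school counts (Engineering + Arts) add up to the combined table:

• 30 + 5 = 35 (males admitted)

• 30 + 15 = 45 (males denied)

• 10 + 10 = 20 (females admitted)

• 10 + 30 = 40 (females denied)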

School of Engineering

          admit   deny
male        30     30
female      10     10

School of Arts

          admit   deny
male         5     15
female      10     30

Why?

• Within each school, the admission rates for male and female applicants are equal, so each school looks fair.

• But, on the whole, it seems that female students are discriminated against.

Reason

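• Most female applicants apply to the School of Arts (40 of 60), while most male applicants apply to the School of Engineering (60 of 80)

• The admission rate for the School of Arts is low for both male and female applicants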
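The reasoning above can be checked numerically. The following is a minimal Python sketch using only the hypothetical counts from the slides; the dictionary layout and the function names `overall_rate` and `weighted_rate` are illustrative choices, not part of the original deck. It shows that each sex's overall admission rate is a weighted average of the per-school rates, weighted by where that sex applies.

```python
# Hypothetical counts from the slides, stored as (admitted, applicants).
applications = {
    "Arts":        {"male": (5, 20),  "female": (10, 40)},
    "Engineering": {"male": (30, 60), "female": (10, 20)},
}

def overall_rate(sex):
    """Pooled admission rate for one sex across both schools."""
    admitted = sum(applications[school][sex][0] for school in applications)
    applied = sum(applications[school][sex][1] for school in applications)
    return admitted / applied

def weighted_rate(sex):
    """The same rate, written as a weighted average of per-school rates.

    Each school's rate is weighted by the share of this sex's
    applications that went to that school.
    """
    total = sum(applications[school][sex][1] for school in applications)
    result = 0.0
    for school in applications:
        admitted, applied = applications[school][sex]
        share = applied / total          # fraction applying to this school
        school_rate = admitted / applied  # admission rate at this school
        result += share * school_rate
    return result

for sex in ("male", "female"):
    print(sex, round(overall_rate(sex), 4), round(weighted_rate(sex), 4))
```

Both schools admit men and women at the same rate, so the gap in the overall rates comes entirely from the different weights.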

Maori versus Non-Maori

Age      Maori (deaths/1000)   Non-Maori (deaths/1000)
0-4             3.68                    2.75
5-14            0.28                    0.27
15-24           1.26                    1.06
25-44           2.44                    1.31
45-64          15.0                     8.76
65+            67.36                   54.75

But, on the whole

• For Maori, death rate = 4.65/1000

• For non-Maori, death rate = 8.35/1000

The lesson we learn

• We cannot draw a conclusion based on the data without understanding how the data are obtained.

Causal relation

• When we say that there is sex discrimination in the admission process, we mean that sex is a cause and admission is the consequence.

• In science, how can we conclude that factor A causes outcome B?

Some possible mistakes

• Data: from hospital records

• Death rates of surgical patients are different for operations with different anesthetics

• Halothane (1.7%), Pentothal (1.7%), Cyclopropane (3.4%), Ether (1.9%)

• Can we say that cyclopropane is more dangerous than the other anesthetics?

Answer

• No! The patients in the worst condition were the ones receiving cyclopropane.

Study the effect of vaccine on preventing Polio

• Can we give the vaccine to all students and compare the proportion of students having polio at the end of the year with the proportion in the previous year?

• Can we give the vaccine to all students in New York City and compare the proportion of students having polio there with the corresponding proportion in Chicago?

Further questions

• Can we compare the above proportion for students from private schools with that for students from public schools?

• Can we compare the above proportion of male students with that of female students?

How to know the effect of vaccine in preventing polio

• We need two groups: a control group (no “real” treatment) and a treatment group (given the vaccine)

We should compare the two groups under “equal” conditions

• People are different from each other

• By randomly assigning participants to the two groups, we can make the groups nearly identical in their conditions, that is, about the same on average

Real difficulties

• Many factors affect the outcome, and it is impossible to control all of them

Design of an Experiment

• To compare one treatment (A) with another treatment (B), we need to randomly assign each patient to a group that receives one of the treatments

The vaccine can prevent Polio

• USA, 1954 field trial: over two million children involved

• Can we let the students voluntarily select their own treatment?

Randomization

• We need to randomly assign each schoolchild to receive either the vaccine or a placebo

• The purpose of such randomization is to ensure the comparability of the two groups

• Unfortunately many physicians could not understand the importance of the randomization
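A minimal Python sketch of the random assignment described above. The participant list, the `age` covariate, and the function name `randomize` are all hypothetical, introduced only for illustration; the sketch is not the procedure used in the actual trial. It shuffles the participants, splits them in half, and then compares the average of one covariate in the two groups.

```python
import random

def randomize(participants, seed=None):
    """Randomly split participants into two groups of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy, so the input is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def mean_age(group):
    """Average age in a group of (id, age) pairs."""
    return sum(age for _, age in group) / len(group)

# Hypothetical participants: (id, age) pairs with ages between 6 and 9.
data_rng = random.Random(0)
children = [(i, data_rng.randint(6, 9)) for i in range(10_000)]

vaccine_group, placebo_group = randomize(children, seed=1)

# With random assignment, the two averages should be close to each other,
# even though age played no role in deciding who went where.
print(round(mean_age(vaccine_group), 3), round(mean_age(placebo_group), 3))
```

The same balancing applies to covariates nobody measured, which is the main reason to randomize rather than let participants choose their own treatment.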

Placebo

• In this case, the placebo is another liquid, similar in appearance to the vaccine, that is injected into the children.

• It is used so that all children receive the “same” treatment, so that a difference in the results cannot be explained as a psychological effect

Data

                     Polio (after half year)   No polio (after half year)
Control (placebo)          A = 115                   B = 201,114
Treatment                  C = 33                    D = 200,712
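To read the table, it helps to turn the four counts into rates. The sketch below is one way to do that in Python; the variable names and the choice of "cases per 100,000" as the unit are assumptions made for illustration, and only the counts A, B, C, and D come from the slide.

```python
# Counts from the slide: A, B for the placebo group; C, D for the vaccine group.
polio_control, no_polio_control = 115, 201_114
polio_vaccine, no_polio_vaccine = 33, 200_712

def rate_per_100k(cases, non_cases):
    """Polio cases per 100,000 children in one group."""
    return 100_000 * cases / (cases + non_cases)

control_rate = rate_per_100k(polio_control, no_polio_control)
vaccine_rate = rate_per_100k(polio_vaccine, no_polio_vaccine)

# Ratio of the two rates: values below 1 mean fewer cases with the vaccine.
risk_ratio = vaccine_rate / control_rate

print(f"control: {control_rate:.1f} per 100,000")
print(f"vaccine: {vaccine_rate:.1f} per 100,000")
print(f"risk ratio: {risk_ratio:.2f}")
```

Because the two groups were formed by randomization and are nearly the same size, the comparison of rates is a fair one.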

An example

• The University Group Diabetes Program

• Randomly assign patients to 4 groups:

• Group 1: Placebo

• Group 2: Tolbutamide

• Group 3: Insulin Standard

• Group 4: Insulin variable

The results are controversial

• Is it really random?

Seven risk factors

• There are 7 risk factors related to diabetes

• Age of 55 or older, high blood pressure, history of chest pains, electrocardiogram (ECG), history of digitalis use, high cholesterol level, overweight, and calcification of the arteries

Risk factor distributions

No. of RF     I    II   III    IV
0            28    25    22    15
1            60    50    62    76
2            59    58    60    57
3            26    34    34    30
4            10    17     8     4
5             2     4     8     4
6             0     1     1     1

Surprise?

• The distributions in the four groups are almost identical

• Notice that the distributions were studied after the experiment was done. It is quite likely that randomization makes all potential risk factors about equally distributed across the groups.
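One simple way to compare the four columns of the risk-factor table is to convert counts to group sizes and to the average number of risk factors per patient. The Python sketch below does this; the data structure and variable names are illustrative assumptions, and only the counts come from the slide.

```python
# Rows: number of risk factors; columns: groups I, II, III, IV (from the slide).
risk_factor_counts = {
    0: (28, 25, 22, 15),
    1: (60, 50, 62, 76),
    2: (59, 58, 60, 57),
    3: (26, 34, 34, 30),
    4: (10, 17, 8, 4),
    5: (2, 4, 8, 4),
    6: (0, 1, 1, 1),
}
groups = ("I", "II", "III", "IV")

# Number of patients in each group.
totals = [
    sum(row[g] for row in risk_factor_counts.values())
    for g in range(len(groups))
]

# Average number of risk factors per patient in each group.
mean_risk_factors = [
    sum(k * row[g] for k, row in risk_factor_counts.items()) / totals[g]
    for g in range(len(groups))
]

for name, n, mean in zip(groups, totals, mean_risk_factors):
    print(f"group {name}: n = {n}, mean risk factors = {mean:.2f}")
```

A formal comparison would use a test of homogeneity on the full table; the summary here is only a quick descriptive check.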

Exercise one

• How can we show that vitamin C prevents people from catching a cold?

FDA

• Food and Drug Administration

• Guidelines for developing drugs and treatments

• Statisticians should be involved in the design of the experiment and analysis of the data

Some past errors

• Hormone therapy (approved by the FDA) was used to treat menopausal symptoms and to prevent osteoporosis, or age-related loss of bone density

• Later experiments showed that it does not protect against heart disease or strokes, and that it increases the risk of dangerous blood clots and gallbladder disease.

Smoking and Lung Cancer

• For ethical reasons, we cannot randomly assign a person to smoke or not to smoke

Observational study

• Case-control study

• We study the smoking habits of patients with lung cancer in the hospital

• In the same hospital, we study the smoking habits of patients with other diseases (without lung cancer, of about the same age and the same gender)

• Or, we can study individuals without lung cancer from the same community

Example

• Oral contraceptives and Thromboembolic diseases

• Cases: all women in the hospital with thromboembolic diseases

• Controls: ?

Selection of controls

• Hospital: same as the case

• Discharge date: same 6-month interval

• Discharge status: all alive

• Age: same 5-year span

• Marital status: same

• Residence: same metropolitan area

• Race: same

Selection of controls

• Parity: same (no pregnancies, one or two, three or more)

• Hospital status: same (ward, semiprivate, or private room)

Observational study

• Cohort study

• At the beginning, we have two groups, one smoking and the other non-smoking

• Wait for 5 years and study the proportions of persons getting lung cancer in the two groups

Cancer risk

• Many reports on cancer risk were based on observational studies. Their results were not really reliable.

Exercise two

• Think about the validity of using a case-control study for the following task: showing that salted fish can cause nasopharyngeal cancer.

Question

• Comment on the following:

• In a 1996 study by Dr. Leslie Wolfson of the University of Connecticut, tai chi was compared to balance training, strength training, and combined balance and strength training in people with an average age of eighty. Those who learned tai chi gained significantly more balance and strength than the other groups.

Case 1

• We obtain data on recoveries for males and females who have received either a treatment (t) or a control (c)

Males

         R=1   R=0
T=t       18    12
T=c        7     3

Females

         R=1   R=0
T=t        2     8
T=c        9    21

Combined

         R=1   R=0
T=t       20    20
T=c       16    24

Question

• The recovery rate is higher for T=c for both males and females

• But the recovery rate is higher for T=t for the combined group?

• For a new subject whose gender is unknown, which treatment should we prefer, t or c?
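The reversal stated in this question can be verified directly from the three tables. The Python sketch below uses only the counts from the slides; the dictionary layout and the function name `recovery_rate` are illustrative assumptions. It computes the recovery rate under each treatment within each group and for the pooled data.

```python
# Counts from the slides, stored as (recovered, not recovered).
strata = {
    "males":   {"t": (18, 12), "c": (7, 3)},
    "females": {"t": (2, 8),   "c": (9, 21)},
}

def recovery_rate(counts):
    """Fraction recovered, given a (recovered, not recovered) pair."""
    recovered, not_recovered = counts
    return recovered / (recovered + not_recovered)

# Pool the two strata by adding the counts cell by cell.
combined = {
    arm: tuple(sum(strata[group][arm][i] for group in strata) for i in range(2))
    for arm in ("t", "c")
}

for name, table in {**strata, "combined": combined}.items():
    rate_t = recovery_rate(table["t"])
    rate_c = recovery_rate(table["c"])
    higher = "t" if rate_t > rate_c else "c"
    print(f"{name:9s} t = {rate_t:.2f}  c = {rate_c:.2f}  higher: {higher}")
```

The code only confirms the arithmetic; which table should guide the choice between t and c depends on how the data were obtained, which is the point of the question.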

Another situation

• Data on yields and heights for samples of black and white plants

Tall

         Y=1   Y=0
C=w       18    12
C=b        7     3

Short

         Y=1   Y=0
C=w        2     8
C=b        9    21

Combined

         Y=1   Y=0
C=w       20    20
C=b       16    24

Question

• Should we plant a white (C=w) or a black (C=b) variety of plant, without knowing the height the plant will grow to?