Making Inferences and Justifying Conclusions

Roxy Peck

Cal Poly, San Luis Obispo

NCTM 2016 San Francisco

1

Common Core State Standards for Mathematics

S-IC 3 Recognize the purposes of and differences among sample surveys, experiments, and observational studies; explain how randomization relates to each.

S-IC 4 Use data from a sample survey to estimate a population mean or proportion; develop a margin of error through the use of simulation models for random sampling.

S-IC 5 Use data from a randomized experiment to compare two treatments; use simulation to decide if differences between parameters are significant.

S-IC 6 Evaluate reports based on data.


2

Common Core State Standards for Mathematics

These standards include difficult (but important) statistical concepts.

Concepts of random selection, random assignment, study design, sampling variability, margin of error, and statistical significance are not just for AP Statistics anymore! They are now part of the “for all” part of the high school curriculum.

In most Common Core schools (and “Common Core-like” schools), every high school teacher of mathematics is now being asked to develop students’ statistical thinking as well as their mathematical thinking.

This is a big challenge!

3

So Where Do We Start??

In this session, we will consider class activities/demonstrations (depending on your access to technology) that address

The difference between observational studies and experiments.

The difference between random selection and random assignment.

How study design relates to the types of conclusions that can be drawn.

Using simulation to develop the concept of margin of error.

Using simulation to develop the concept of statistical significance.

But we won’t have time for all five, so we will go very quickly through the first three and then focus on the last two.


4

Observational Studies versus Experiments

Observational study

Observe characteristics of a sample selected from one or more populations

Goal is to use sample data to learn about the corresponding population

Important that the sample be representative of the population

Experiment

Study how a response variable behaves under different experimental conditions

Person conducting the experiment decides what the experimental conditions will be and who will be in each experimental group

Important to have comparable experimental groups


5

Observational Studies versus Experiments

Observational studies (includes sample surveys)

Want random selection from population of interest since it is important to have a sample that is representative of the population.

Random selection enables generalizing from sample to the population.

Experiments

Want random assignment of “subjects” to experimental conditions to create comparable experimental groups.

Random assignment enables drawing a cause and effect conclusion (changes in the experimental conditions cause change in response).

Experiments may or may not include random selection of subjects.
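
To make the two kinds of randomization concrete, here is a minimal Python sketch. The population of 500 ID numbers, the sample size of 40, and the two group names are all made up for illustration; they do not come from any study in this session.

```python
import random

rng = random.Random(2016)  # fixed seed so the demonstration is repeatable

# Hypothetical population of 500 ID numbers (illustration only).
population = list(range(1, 501))

# Random selection: each member of the population has the same chance
# of ending up in the sample, which supports generalizing to the population.
sample = rng.sample(population, k=40)

# Random assignment: shuffle the subjects, then split them into two
# experimental groups, which supports cause and effect conclusions.
subjects = sample[:]
rng.shuffle(subjects)
treatment_group = subjects[:20]
control_group = subjects[20:]

print("Sampled", len(sample), "of", len(population))
print("Treatment:", sorted(treatment_group))
print("Control:  ", sorted(control_group))
```

An experiment that skips the selection step and only does the assignment step (for example, with volunteers) can still support a cause and effect conclusion, but not generalization to a larger population.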


6

So What is Randomization??

Random selection

Random assignment

Randomization

Let’s keep it simple and not confuse students!


7

Give students lots of practice doing things like this…


8


9


10


11

Margin of Error and Statistical Significance

Observational Studies and Sample Surveys

Question of interest: How far off might my estimate be?

Experiments

Question of interest: Could this have happened by chance when there is no difference in the response to the different experimental conditions?


12

Margin of Error

Statistical Significance

CCSS limits these concepts: margin of error to observational studies, statistical significance to experiments.

Using Simulation to Develop Concept of Margin of Error

How far off might my estimate be?

Study on facial stereotyping (thanks to Allan Rossman and Beth Chance for this example).

Reference: Psychonomic Bulletin & Review, 2007, 14(5), 901-907.


13

Bob or Tim?

One of these men is named Bob and one is named Tim.

They were asked “Which man is named Tim and which is named Bob?”


14

Bob or Tim?

Want to use data to estimate the proportion of U.S. adults that would choose the man on the left as Tim.

We will assume that this group is representative of the population of adults in the U.S.

For this group, the proportion who chose the man on the left as Tim is:

But I don’t have an internet connection, so I am going to pretend that we are a group of 100 people and that 78 picked the man on the left as Tim. With a class where I would have internet access, I would use the real class data. The proportion who choose the man on the left as Tim is pretty consistently around 0.80.


15

Motivating Margin of Error

Based on my sample of 100 people, my estimate of the proportion of U.S. adults who would choose the man on the left as Tim is 0.78.

But I don’t expect this to be exactly equal to the actual population proportion. How close can I expect my estimate to be to the actual value?

Margin of error is the maximum likely error. It would not be likely that my estimate would be off by more than this amount. “Likely” is defined in terms of 95%: if I were to take many samples from the population and use each sample to estimate the population value, 95% of these estimates would differ from the actual value by less than this amount.


16


How do we get a sense of how far off my estimate is likely to be?

Create a BIG hypothetical population with a proportion of “successes” that is equal to my sample proportion.

Take a random sample of the same size as my original sample from the big hypothetical population and calculate the proportion for this simulated sample.

Repeat many times to get a collection of simulated sample proportions.

Look at the simulated sample proportions to see how far off they tended to be from the known proportion for my BIG hypothetical population.

The margin of error based on the simulated sample proportions is a reasonable estimate of the margin of error I should associate with my original estimate.
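
For classrooms without the applet, the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the 10,000 repetitions, and the seed are my own choices, and the BIG hypothetical population is treated as effectively infinite, so each simulated person is a "success" independently with probability equal to the sample proportion.

```python
import random

def simulated_margin_of_error(p_hat, n, reps=10_000, seed=1):
    """Estimate a 95% margin of error for a sample proportion by simulation.

    Assumption: the BIG hypothetical population is treated as infinite, so
    each simulated respondent is a success with probability p_hat.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(reps):
        # One simulated sample of size n from the hypothetical population.
        successes = sum(rng.random() < p_hat for _ in range(n))
        # How far this simulated proportion is from the known proportion.
        errors.append(abs(successes / n - p_hat))
    errors.sort()
    # Distance that 95% of the simulated proportions stay within.
    return errors[int(0.95 * reps) - 1]

# Bob or Tim example: 78 of 100 chose the man on the left as Tim.
moe = simulated_margin_of_error(p_hat=0.78, n=100)
print("Simulated margin of error:", round(moe, 2))
```

Because the simulated proportions can only take values in steps of 0.01 when n = 100, different runs may land on neighboring values; the point for students is that they cluster around the same answer.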


17


http://www.rossmanchance.com/ISIapplets.html


18




19



20


95% of simulated sample proportions were between 0.71 and 0.84. Since the actual proportion in the BIG hypothetical population was 0.78, we could say that about 95% of the simulated sample proportions were within about 0.07 of the actual population value.

Margin of error is 0.07.

So we think that our estimate of 0.78 is probably within about 0.07 of the actual proportion of adults who would choose the man on the left as Tim.


21


Extension: If you have access to technology, have students each carry out a simulation to get their own margin of error estimates. Compare with other students so that they see that the simulation method tends to produce consistent results.

Also works for simulating margin of error for estimating a population mean. But to create the BIG hypothetical population to sample from, we create a population that consists of a large number of copies of our sample (which we think is representative of the population). Sampling from this BIG hypothetical population is equivalent to sampling with replacement from the original sample.
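
The version for a mean can be sketched the same way, using sampling with replacement from the original sample. The data below are invented purely for illustration (say, hours of sleep for 12 students), and the function name and number of repetitions are my own choices.

```python
import random

def resampled_margin_of_error(sample, reps=5_000, seed=1):
    """Estimate a 95% margin of error for a sample mean by resampling.

    Sampling with replacement from the sample stands in for sampling from
    a BIG hypothetical population made of many copies of the sample.
    """
    rng = random.Random(seed)
    n = len(sample)
    sample_mean = sum(sample) / n
    errors = []
    for _ in range(reps):
        resample = rng.choices(sample, k=n)  # with replacement
        errors.append(abs(sum(resample) / n - sample_mean))
    errors.sort()
    # Distance that 95% of the simulated means stay within.
    return errors[int(0.95 * reps) - 1]

# Hypothetical data, made up for this sketch.
hours_of_sleep = [6.5, 7.0, 8.0, 5.5, 7.5, 6.0, 9.0, 7.0, 6.5, 8.5, 7.0, 6.0]
mean_moe = resampled_margin_of_error(hours_of_sleep)
print("Sample mean:", round(sum(hours_of_sleep) / len(hours_of_sleep), 2))
print("Simulated margin of error:", round(mean_moe, 2))
```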


22


“For all” stops here.

For AP Stat, can motivate margin of error this way and then move on to more traditional approach. For example, using the large-sample formula for the margin of error when estimating a population proportion, the margin of error for the Bob or Tim example based on n = 100 and a sample proportion of 0.78 is 0.08, compared to the 0.07 from the simulation.
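
The slide does not spell the formula out; I am assuming it is the usual large-sample 95% margin of error, 1.96 times the square root of p(1 - p)/n. A quick check of the 0.08 figure under that assumption:

```python
import math

def formula_margin_of_error(p_hat, n, z=1.96):
    """Large-sample 95% margin of error for a proportion (assumed formula)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

formula_moe = formula_margin_of_error(0.78, 100)
print(round(formula_moe, 2))  # prints 0.08
```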


23


Using Simulation to Develop Concept of Statistical Significance

Could this have happened by chance when…?

Study to determine if reducing body temperature for three days would improve survival for newborn babies whose brains were temporarily deprived of oxygen as a result of complications at birth.

Reference: The New England Journal of Medicine, October 13, 2005, 1574-1584.

Infants were randomly assigned to a cooling group (102 infants) or a control group (103 infants).


24

Experiment



Death or moderate to severe disability occurred in 45 of 102 infants in the cooling group (44%).

Death or moderate to severe disability occurred in 64 of 103 infants in the control group (62%).

Could this difference have happened just by chance if there is no real difference in the death and disability rates for the two experimental conditions? If not, we say that the difference is statistically significant.


25



Could this have happened by chance?

By chance, we mean due just to the way infants were assigned to the two groups.

If the cooling treatment has no effect, then the difference in the survival rates is just because more of the infants who were going to survive happened to be assigned to the cooling group. Is this a plausible explanation for the difference?

Let’s explore…


26



Start with a simpler version to demonstrate the method so that students understand it

4 of 10 in cooling group (40%)

6 of 10 in control group (60%)

Applet from



27




28

Data From Original Groups




29

20 infants re-randomized into 2 groups



Could have happened by chance.


30


Using Simulation to Develop Concept of Statistical Significance: Now with real data

Unlikely to have occurred just by chance


31



Conclusions

The difference between the death and disability proportions for the two experimental conditions (cooling, control) is statistically significant.

By statistically significant, we mean that it is unlikely that we would observe a difference this large just due to chance.

Sample size plays an important role: a difference of -0.20 was not significant with sample sizes of 10, but a difference of -0.18 is significant with sample sizes of around 100.
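
Without the applet, the re-randomization idea can be sketched in Python. This is a rough illustration, not the applet's own code: the function name, the 10,000 repetitions, and the seed are my choices, and I am assuming we count re-randomizations whose difference (cooling minus control) is at or below the observed one.

```python
import random

def rerandomization_p_value(events_a, n_a, events_b, n_b, reps=10_000, seed=1):
    """Fraction of random re-assignments giving a difference in proportions
    (group A minus group B) at or below the observed difference.

    Assumes the treatment has no effect, so each subject's outcome is fixed
    and only the group labels change.
    """
    rng = random.Random(seed)
    total_events = events_a + events_b
    # 1 = death or disability, 0 = neither; one entry per subject.
    outcomes = [1] * total_events + [0] * (n_a + n_b - total_events)
    observed = events_a / n_a - events_b / n_b
    count = 0
    for _ in range(reps):
        rng.shuffle(outcomes)                  # re-randomize the subjects
        sim_a = sum(outcomes[:n_a])            # events landing in group A
        sim_b = total_events - sim_a           # the rest land in group B
        if sim_a / n_a - sim_b / n_b <= observed + 1e-12:
            count += 1
    return count / reps

small_p = rerandomization_p_value(4, 10, 6, 10)       # simpler version
real_p = rerandomization_p_value(45, 102, 64, 103)    # cooling experiment
print("10 per group:       ", small_p)
print("About 100 per group:", real_p)
```

With 10 infants per group a difference this large shows up in a large share of the re-randomizations, while with the real group sizes it is rare, which is the sample size point made above.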


32



Extension: If you have access to technology, have students each carry out a simulation to draw their own conclusion about statistical significance. Compare with other students so that they see that the simulation method tends to produce consistent results.

Also works for simulating difference in means for numerical data. If the treatment has no effect, the method assumes each subject's numerical response would be the same no matter which treatment group the subject was assigned to. It investigates the question “could this have happened by chance when there is no treatment effect?” by randomly reassigning the observed response values to experimental groups and calculating the difference in means.
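
A sketch of the difference in means version follows. The two small data sets are invented for illustration, and counting differences at least as large in either direction is my assumption; a class could just as well count in one direction only.

```python
import random

def rerandomization_p_value_means(group_a, group_b, reps=10_000, seed=1):
    """Fraction of random re-assignments whose difference in means is at
    least as large (in either direction) as the observed difference."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    count = 0
    for _ in range(reps):
        # Reassign the observed responses to the two groups at random.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed) - 1e-12:
            count += 1
    return count / reps

# Hypothetical response values, made up for this sketch.
treatment = [12.1, 14.3, 11.8, 15.0, 13.6, 14.9]
control = [10.2, 11.5, 12.0, 9.8, 11.1, 10.7]
means_p = rerandomization_p_value_means(treatment, control)
print("Proportion of re-randomizations at least this extreme:", means_p)
```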


33



“For all” stops here.

For AP Stat, can motivate idea of significance and the meaning of p-value using this approach and then move on to more traditional approach. For example, the p-value for the large sample two proportions z test for the cooling experiment data is 0.005, compared to 0.01 from the simulation.
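
As a check on the 0.005 figure, here is a sketch of the pooled two proportions z test. I am assuming a one-sided alternative (the cooling group has the lower rate), since that is what appears to match 0.005; a two-sided test would double the p-value.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled large-sample z test for two proportions.

    Returns the z statistic and the lower-tail p-value P(Z <= z),
    which is a one-sided p-value (an assumption of this sketch).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF written with the complementary error function.
    p_value = 0.5 * math.erfc(-z / math.sqrt(2))
    return z, p_value

z_stat, z_p_value = two_proportion_z_test(45, 102, 64, 103)
print("z =", round(z_stat, 3), " one-sided p-value =", round(z_p_value, 3))
```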


34

Concluding Remarks

The two class activities/demonstrations (depending on your access to technology in the classroom) can be used to develop an understanding of margin of error and statistical significance that is consistent with the intent of the Common Core State Standards.

In more advanced settings, such as AP Statistics, these activities can be used as a starting point to develop an understanding of the concepts before jumping into more formal methods for computing margin of error and p-values.


35

Thanks for attending this session!

Comments or questions?


36
