Preliminaries Introduction to Statistical Investigations

Upload: tryna

Post on 24-Feb-2016


Page 1: Preliminaries Introduction to Statistical Investigations

Preliminaries

Introduction to Statistical Investigations

Page 2: Preliminaries Introduction to Statistical Investigations

Statistics vs. Anecdotal Evidence

Smoking causes cancer. Seat belts save lives.

Page 3: Preliminaries Introduction to Statistical Investigations

Do Vaccines Cause Autism?

Nelson says it wasn't long after her son Parker's shots at 15 months that she noticed something was wrong.

"He had run a slight fever after the vaccinations, but I didn't think anything of it," said Nelson. "… about a week after that he just completely stopped talking."

After months of worrying, wondering, and going back and forth with doctors, an official diagnosis was made: autism.

Nelson believes it started with the vaccines.

"Gradually, I started piecing it together. He got sick after his vaccinations and about a week later everything changed. He was a completely different little boy then," said Nelson.

http://www.wsaz.com/charleston/headlines/19376044.html

Page 4: Preliminaries Introduction to Statistical Investigations

Statistics

Scientific conclusions cannot be based on anecdotal evidence. We need evidence from data.

Statistics is the science of producing useful data to address a research question, analyzing the resulting data, and drawing appropriate conclusions from the data.

Page 5: Preliminaries Introduction to Statistical Investigations

Six-Step Statistical Investigation Method

1. Ask a research question (research hypothesis)

2. Design a study and collect data

3. Explore the data

4. Draw inferences (logic of inference: significance, estimation)

5. Formulate conclusions (scope of inference: generalization, causation)

6. Look back and ahead

Page 6: Preliminaries Introduction to Statistical Investigations

Example P.1: Organ Donations

While a majority of people approve of organ donation in principle, far fewer actually sign up when getting a driver’s license.

Different states have different recruiting methods.

Do these different methods result in different sign-up rates?

Page 7: Preliminaries Introduction to Statistical Investigations

Recruiting Organ Donors

Step 1: Ask a research question

In general: Is there a method that will increase the likelihood that a person agrees to become an organ donor?

More specifically: Does the default option presented to driver’s license applicants influence the likelihood of someone becoming an organ donor?

Page 8: Preliminaries Introduction to Statistical Investigations

Recruiting Organ Donors

Step 2: Design a study and collect data

The researchers decided to recruit various participants and ask them to pretend to apply for a new driver’s license.

The participants did not know in advance that different options were given for the donor question, or even that this issue was the main focus of the study.

They offered an incentive of $4.00 for completing an online survey. After the results were collected, the researchers removed data arising from multiple responses from the same IP address, surveys completed in less than five seconds, and respondents whose residential address could not be verified.

Page 9: Preliminaries Introduction to Statistical Investigations

Recruiting Organ Donors

Step 2: Design a study and collect data

Some of the participants were forced to make a choice of becoming a donor or not, without being given a default option (the “neutral” group).

Other participants were told that the default option was not to be a donor but that they could choose to become a donor if they wished (the “opt-in” group).

The remaining participants were told that the default option was to be a donor but that they could choose not to become a donor if they wished (the “opt-out” group).

Page 10: Preliminaries Introduction to Statistical Investigations

Recruiting Organ Donors

Step 3: Explore the data

44 of the 56 (78.6%) participants in the neutral group agreed to become organ donors,

23 of the 55 (41.8%) participants in the opt-in group agreed to become organ donors, and

41 of the 50 (82.0%) participants in the opt-out group agreed to become organ donors.
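The percentages in Step 3 are sample proportions: the number who agreed divided by the group size. A minimal Python sketch that reproduces them from the reported counts (the dictionary layout and the function name `sample_proportion` are illustrative choices, not part of the study):

```python
# Counts reported in Step 3: (agreed to donate, group size).
counts = {
    "neutral": (44, 56),
    "opt-in": (23, 55),
    "opt-out": (41, 50),
}


def sample_proportion(successes: int, n: int) -> float:
    """Return the fraction of a group that agreed to become donors."""
    return successes / n


proportions = {group: sample_proportion(*pair) for group, pair in counts.items()}

for group, p in proportions.items():
    # Format each proportion as a percentage with one decimal place.
    print(f"{group:8s} {p:.1%}")
```

Rounded to one decimal place, the results agree with the 78.6%, 41.8%, and 82.0% shown on the slide.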

 

Page 11: Preliminaries Introduction to Statistical Investigations

Recruiting Organ Donors

Step 4: Draw inferences beyond the data

Using methods that you will learn in this course, the researchers analyzed whether the observed differences between the groups were large enough to indicate that the default option had a genuine effect.

In particular, they reported strong evidence that the neutral and opt-out versions do lead to a higher chance of agreeing to become a donor, as compared to the opt-in version currently used in many states.

In fact, they could be quite confident that the neutral version increases the chances that a person agrees to become a donor by between 20 and 54 percentage points, a difference large enough to save thousands of lives per year in the United States.
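The slides do not say which method produced the 20 to 54 percentage point range. As one possibility, the sketch below computes a standard large-sample (Wald) 95% confidence interval for the difference between the neutral and opt-in proportions. The choice of method, the 95% level, and the function name `diff_in_proportions_ci` are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def diff_in_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample (Wald) interval for p1 - p2.

    This is an assumed method, not necessarily the one the researchers used.
    """
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference between two independent proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# Neutral group (44 of 56) minus opt-in group (23 of 55).
low, high = diff_in_proportions_ci(44, 56, 23, 55)
print(f"neutral minus opt-in: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval runs from roughly 0.20 to 0.54, which is close to the range quoted on the slide.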

Page 12: Preliminaries Introduction to Statistical Investigations

Recruiting Organ Donors

Step 5: Formulate conclusions

Based on the analysis of the data and the design of the study, it is reasonable for these researchers to conclude that the neutral version causes an increase in the proportion who agree to become donors.

But because the participants in the study were volunteers recruited from internet bulletin boards, generalizing conclusions beyond these participants is only legitimate if they are representative of a larger group of people.

Page 13: Preliminaries Introduction to Statistical Investigations

Recruiting Organ Donors

Step 6: Look back and ahead

One limitation of the study is that participants were asked to imagine how they would respond, which might not mirror how people would actually respond in such a situation.

A new study might look at people’s actual responses to questions about organ donation or could monitor donor rates for states that adopt a new policy.

Researchers could also examine whether presenting educational material on organ donation might increase people’s willingness to donate.

Another improvement would be to include participants from wider demographic groups than these volunteers.

Page 14: Preliminaries Introduction to Statistical Investigations

Terminology

The individual entities on which data are recorded are called observational units.

The recorded characteristics of the observational units are the variables of interest.

Variables can be:

◦ Quantitative: You can add, subtract, etc. with the values. Height, weight, distance, time…

◦ Categorical: Labels for which arithmetic does not make sense. Sex, ethnicity, eye color…

What are the observational units and variables in the Organ Donation Study?

Page 15: Preliminaries Introduction to Statistical Investigations

More Terminology

The distribution of a variable describes the pattern of value/category outcomes.

For the organ donation study the bar chart shown displays the distribution of responses.

Page 16: Preliminaries Introduction to Statistical Investigations

Example P.2: Old Faithful

Page 17: Preliminaries Introduction to Statistical Investigations

Old Faithful

How faithful is Old Faithful?

Can the time of the next eruption be accurately predicted?

Page 18: Preliminaries Introduction to Statistical Investigations

Old Faithful

Page 19: Preliminaries Introduction to Statistical Investigations

Old Faithful

Researchers collected data on 222 eruptions taken over a number of days in the summers of 1978 and 1979.

The results are shown in a dotplot.

[Dotplot: time until next eruption (min), axis from 40 to 100]

Page 20: Preliminaries Introduction to Statistical Investigations

Old Faithful

What are the observational units and variable in this study? Is the variable quantitative or categorical?

We can see from the dotplot that Old Faithful is not perfectly predictable. The time until the next eruption varies from eruption to eruption.

This variability is the most fundamental property in studying Statistics. Without variability, we wouldn’t need statistics.

Page 21: Preliminaries Introduction to Statistical Investigations

Old Faithful

Let’s take another look at the dotplot and describe the distribution.

What could be some explanations for the variability?

[Dotplot: time until next eruption (min), axis from 40 to 100]

Page 22: Preliminaries Introduction to Statistical Investigations

Old Faithful

One explanation could be the duration of the previous eruption (short: less than 3.5 min; long: more than 3.5 min).

[Dotplots: time until next eruption (min), axis from 40 to 100, shown separately by eruption type (short, long)]
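Splitting the waiting times by the duration of the previous eruption can be sketched as below. The `Eruption` record, the sample values, and the handling of a duration of exactly 3.5 minutes are assumptions for illustration; the values are made up and are not the 1978 and 1979 measurements.

```python
from dataclasses import dataclass


@dataclass
class Eruption:
    duration_min: float  # how long this eruption lasted
    wait_min: float      # time until the next eruption


# Hypothetical values for illustration only; not the actual study data.
eruptions = [
    Eruption(1.9, 55), Eruption(4.3, 80), Eruption(2.1, 58),
    Eruption(4.6, 84), Eruption(3.9, 75), Eruption(1.8, 51),
]


def eruption_type(e: Eruption, cutoff: float = 3.5) -> str:
    # Assumption: a duration of exactly 3.5 min counts as long,
    # since the slide does not say how that boundary case is handled.
    return "short" if e.duration_min < cutoff else "long"


# Collect the waiting times separately for each eruption type.
waits_by_type = {"short": [], "long": []}
for e in eruptions:
    waits_by_type[eruption_type(e)].append(e.wait_min)

print(waits_by_type)
```

Here eruption type is a categorical variable derived from a quantitative one (duration), which is what lets the two dotplots be drawn side by side.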

Page 23: Preliminaries Introduction to Statistical Investigations

Old Faithful

Summer 2005

Page 24: Preliminaries Introduction to Statistical Investigations

Old Faithful

One way to measure the center of a distribution is with the average, also called the mean.

One way to measure variability is with the standard deviation, which is roughly the average distance between a data value in the distribution and the mean of the distribution.
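As a small illustration of these two summaries, the sketch below computes the mean and the sample standard deviation with Python's standard `statistics` module, and also the plain average distance from the mean for comparison with the "roughly" in the description above. The waiting times are hypothetical values, not the Old Faithful data.

```python
from statistics import mean, stdev

# Hypothetical waiting times in minutes; not the actual Old Faithful data.
waits = [51, 55, 58, 75, 80, 84]

center = mean(waits)
spread = stdev(waits)  # sample standard deviation (divides by n - 1)

# Average absolute distance from the mean, for comparison with the SD.
avg_distance = mean(abs(w - center) for w in waits)

print(f"mean = {center:.1f}, SD = {spread:.1f}, average distance = {avg_distance:.1f}")
```

The standard deviation and the average distance are not identical, but they are of similar size, which is the sense in which the standard deviation is "roughly" the average distance from the mean.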

Page 25: Preliminaries Introduction to Statistical Investigations

Old Faithful

                        Mean (min)    Standard deviation (min)
Overall                 71.0          12.8
After short duration    56.3          8.5
After long duration     78.7          6.3

[Dotplots: time until next eruption (min), axis from 40 to 100, shown separately by eruption type (short, long)]

Page 26: Preliminaries Introduction to Statistical Investigations

Old Faithful

Basic Terminology

Some aspects to look for in a distribution of a quantitative variable are:

◦ Shape: Is the distribution symmetric? Mound-shaped? Are there several peaks or clusters?

◦ Center: Where is the distribution centered? What is a typical value?

◦ Variability: How spread out are the data? Are most within a certain range of values?

◦ Unusual observations: Are there outliers that deviate markedly from the overall pattern of the other data values? Are there other unusual features in the distribution?

Page 27: Preliminaries Introduction to Statistical Investigations

Exploration P.3: Cars or Goats

Pages P-13 to P-17