Introduction to Hypothesis Testing
AP Statistics Chap 11-1
Statistical Dilemma
AT&T believes the average telephone bill in Columbus, Georgia is $42.05 per month.
They take a sample of 100 bills and find that the average value of the sample is $55.57.
What does it mean?
Hypothesis Testing
Population
It is currently accepted that the mean age is 50.
Sample
Now select a random sample. Suppose the sample mean is 20.
Compare the sample results to currently accepted facts/thoughts.
Conclusion: Mean age is lower than thought.
How strong is the evidence?
What is a Hypothesis?
• A hypothesis is a theory proposed to explain an observation. It is stated as a claim about a population parameter:
– population mean
Example: The mean monthly cell phone bill of this city is μ = $42
– population proportion
Example: The proportion of adults in this city with cell phones is p = .68
The Null Hypothesis, H0
• States the currently accepted fact
Example: The average number of TV sets in U.S. homes is at least three (H0: μ ≥ 3)
• Is always about a population parameter, not about a sample statistic
Correct: H0: μ ≥ 3
Incorrect: H0: x̄ ≥ 3
The Null Hypothesis, H0
• Assume that the null hypothesis is true until there is sufficient evidence to reject it.
– Similar to the notion of innocent until proven guilty
• Always contains the “=”, “≤” or “≥” sign
• May or may not be rejected
– Never proven true or false
The Alternative Hypothesis, HA
• Is generally the hypothesis that is believed by the researcher based on the sample.
• Challenges H0
• Is the opposite of the null hypothesis
– e.g.: The average number of TV sets in U.S. homes is less than 3 (HA: μ < 3)
• Never contains the “=”, “≤” or “≥” sign
• Stated as “≠”, “>” or “<”
Reason for Rejecting H0
Sampling Distribution of the Statistic x̄
If H0 is true, then μ = 50.
If it is unlikely that we would get a sample mean of this value (x̄ = 20) ...
... if in fact this were the population mean (μ = 50) ...
... then we reject the null hypothesis that μ = 50.
Level of Significance, α
• Defines unlikely values of sample statistic if null hypothesis is true
– Defines rejection region of the sampling distribution
• Is designated by α (level of significance)
– Typical values are .01, .05, or .10
• Is selected by the researcher at the beginning
Level of Significance and the Rejection Region
Level of significance = α
Rejection region is shaded
• Lower tail test: H0: μ = 50, HA: μ < 50 (area α in the lower tail)
• Upper tail test: H0: μ = 50, HA: μ > 50 (area α in the upper tail)
• Two tailed test: H0: μ = 50, HA: μ ≠ 50 (area α/2 in each tail)
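The boundaries of the three rejection regions can be sketched numerically. This is only an illustration, and it assumes the test statistic is a standardized z-score following a standard normal curve, which the slides do not state. The function name critical_values is hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the z cutoff(s) that bound the rejection region.

    Assumption (not from the slides): the test statistic is a
    standard normal z-score.
    """
    z = NormalDist()  # mean 0, standard deviation 1
    if tail == "lower":
        # Reject H0 when z falls below this cutoff.
        return (z.inv_cdf(alpha),)
    if tail == "upper":
        # Reject H0 when z falls above this cutoff.
        return (z.inv_cdf(1 - alpha),)
    if tail == "two":
        # Area alpha/2 in each tail; reject outside (-c, c).
        c = z.inv_cdf(1 - alpha / 2)
        return (-c, c)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

for tail in ("lower", "upper", "two"):
    print(tail, [round(c, 3) for c in critical_values(0.05, tail)])
```

With α = .05 the cutoffs are roughly −1.645 (lower tail), 1.645 (upper tail), and ±1.96 (two tailed).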
p-Value Approach to Testing
• p-value: Probability of obtaining a test statistic at least as extreme (≤ or ≥) as the observed sample value, given H0 is true
– Also called observed level of significance
p-Value Approach to Testing
• Obtain the p-value from a computer randomization model
• Compare the p-value with α
– If p-value < α, reject H0
– If p-value ≥ α, do not reject H0
Interpreting the p-value…
p-value < .01: Overwhelming Evidence (Highly Significant)
.01 ≤ p-value < .05: Strong Evidence (Significant)
.05 ≤ p-value < .10: Weak Evidence (Not Significant)
p-value ≥ .10: No Evidence (Not Significant)
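The decision rule and the evidence scale can be written as two small helper functions. This is a minimal sketch: the function names are hypothetical, and placing a boundary value such as exactly .05 in the weaker category is an assumption, since the scale only marks the cut points.

```python
def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 only when p-value < alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

def interpret_p_value(p_value):
    """Map a p-value onto the evidence scale with cut points .01, .05, .10."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "Overwhelming Evidence (Highly Significant)"
    if p_value < 0.05:
        return "Strong Evidence (Significant)"
    if p_value < 0.10:
        return "Weak Evidence (Not Significant)"
    return "No Evidence (Not Significant)"

# Example p-values chosen only for illustration.
for p in (0.004, 0.03, 0.07, 0.25):
    print(p, decide(p), interpret_p_value(p))
```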
Dogs and Owners
Pictures were taken of 25 owners and their purebred dogs, selected at random from dog parks. Study participants were shown a picture of an owner together with pictures of two dogs (the owner’s dog and another random dog from the study) and asked to choose which dog most resembled the owner. Of the 25 owners, 16 were paired with the correct dog. Is this convincing evidence that dogs tend to resemble their owners, or just the result of random chance?
How extreme is a p̂ of .64 if the results are due to random chance?
Distribution of sample proportions
P-Value = .238 for two tail test
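A randomization model for this example can be sketched as follows. It assumes that under H0 each of the 25 owners is matched correctly with probability .5, like a coin flip, and it counts results at least as far from 12.5 correct as the observed 16. The function names, seed, and number of repetitions are illustrative choices, not the model used for the slide.

```python
import random
from math import comb

N_OWNERS = 25    # owners in the study
N_CORRECT = 16   # owners paired with the correct dog (p-hat = .64)

def exact_two_tail_p(n, k):
    """Exact two-tail p-value when every match is a fair coin flip."""
    distance = abs(k - n / 2)
    # Sum the probabilities of all counts at least as far from n/2 as k.
    favorable = sum(comb(n, i) for i in range(n + 1)
                    if abs(i - n / 2) >= distance)
    return favorable / 2 ** n

def simulated_two_tail_p(n, k, reps=10_000, seed=1):
    """Estimate the same p-value by simulating random guessing."""
    rng = random.Random(seed)
    distance = abs(k - n / 2)
    hits = 0
    for _ in range(reps):
        correct = sum(rng.random() < 0.5 for _ in range(n))
        if abs(correct - n / 2) >= distance:
            hits += 1
    return hits / reps

print(round(exact_two_tail_p(N_OWNERS, N_CORRECT), 4))
print(simulated_two_tail_p(N_OWNERS, N_CORRECT))
```

The exact value is about .23, and a simulated value varies from run to run, so a result near the reported .238 is consistent with this model. Either way the p-value is well above .10, so 16 correct out of 25 is not convincing evidence.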
Attitude Toward Divorce
Do men and women have different views on divorce? A May 2010 Gallup poll of U.S. citizens over the age of 18 asked participants if they view divorce as “morally acceptable”. Of the 1029 adults surveyed, 71% of men and 67% of women responded ‘yes’.
What does the survey indicate? Men and women may differ in opinion.
What is the no change hypothesis? Men and women do not differ in opinion.
H0: pM − pW = 0
HA: pM − pW ≠ 0
Attitude Toward Divorce
Is there sufficient evidence that men and women differ?
Caffeine and Finger Tapping
Researchers trained a sample of male college students to tap their fingers at a rapid rate. The sample was then divided at random into two groups of ten students each. Each student drank the equivalent of about two cups of coffee, which included about 200 mg of caffeine for the students in one group but was decaffeinated coffee for the second group. After a two-hour period, each student was tested to measure finger tapping rate (taps per minute). The goal of the experiment was to determine whether caffeine produces an increase in the average tap rate.
What are the null and alternate hypotheses?
Caffeine and Finger Tapping
Hypotheses
H0: μC = μNC
HA: μC > μNC
Or
H0: μC − μNC = 0
HA: μC − μNC > 0
Smiles and Punishment
Researchers conducted a study examining the effect of a smile on the leniency of disciplinary action. For each suspect, along with a description of the offense, a picture was provided with either a smile or neutral facial expression. A leniency score was calculated based on the disciplinary action. The experimenters are testing to see if the average leniency score is higher for smiling students than it is for students with a neutral facial expression.
What are the null and alternate hypotheses?
H0: μS = μNS
HA: μS > μNS
Smiles and Punishment
If α = .05, are the results statistically significant?
NFL Uniforms vs Penalties
In a study of the relationship between the type of uniforms worn by professional sports teams and the aggressiveness of the team, researchers considered teams from the National Football League (NFL). Participants with no knowledge of the teams rated the jerseys on characteristics such as timid/aggressive, nice/mean and good/bad. The averages of these responses produced a “malevolence” index with higher scores signifying impressions of more malevolent uniforms. To measure aggressiveness, the authors used the amount of penalty yards converted to z-scores and averaged for each team over the seasons from 1970-1986. The sample correlation is r = 0.43.
Is there a correlation between uniforms and penalties in the NFL?
What are H0 and HA?
NFL Uniforms vs Penalties
Hypotheses
H0: ρ = 0
HA: ρ ≠ 0
Lithium vs Placebo
An experiment to investigate the effectiveness of the two drugs desipramine and lithium in the treatment of cocaine addiction was conducted. Subjects (cocaine addicts seeking treatment) were randomly assigned to take one of the treatment drugs or a placebo so that there were 24 patients in each group. The results of the study are summarized in the table below. The question of interest is whether lithium is more effective at preventing relapse than taking an inert pill. State the null and alternative hypotheses.
H0: pL = pN
HA: pL < pN
(Here p is the proportion of patients who relapse, so a smaller pL means lithium is more effective.)
How would you test these hypotheses?
Type I and Type II Errors
Possible Hypothesis Test Outcomes

                      State of Nature
Decision              H0 True         H0 False
Do Not Reject H0      No Error        Type II Error
Reject H0             Type I Error    No Error
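A Type I error means rejecting H0 when H0 is actually true, and the level of significance α is the chance of doing so. The simulation below is a sketch of that idea, not part of the slides. It assumes a lower tail z-test with a known standard deviation, and the values μ = 50, σ = 10, n = 25, the seed, and the function name are all hypothetical.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(mu0=50.0, sigma=10.0, n=25, alpha=0.05,
                      reps=4000, seed=2):
    """Fraction of samples that wrongly reject a true H0: mu = mu0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(alpha)   # lower tail cutoff
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(reps):
        # H0 is true here: every sample really comes from mean mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / std_error
        if z < z_crit:
            rejections += 1                # a Type I error
    return rejections / reps

print(type_i_error_rate())
```

The printed rate should land close to α = .05, because about 5% of samples fall in the rejection region even though H0 is true.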
Practical vs Statistical Significance
A local college offers an SAT preparation course and provides a statistical analysis on its website showing that 95% of students improve their SAT score after taking its $1000 course.
How much would it have to improve your score to make the cost of the course worthwhile?
50 points? 100 points? 300 points?
Statistical significance does not indicate the size of the difference.
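A short calculation shows why. The sketch below uses made-up numbers: an average gain of 5 points and a standard deviation of 100 points are hypothetical, and a one-sided z-test is assumed. The same small gain is far from significant with 25 students but highly significant with 10,000 students.

```python
from statistics import NormalDist

def one_sided_p(mean_gain, sd, n):
    """p-value for H0: mean gain = 0 against HA: mean gain > 0 (z-test)."""
    z = mean_gain / (sd / n ** 0.5)
    return 1 - NormalDist().cdf(z)

# Hypothetical values: a 5 point average gain, standard deviation 100.
for n in (25, 10_000):
    print(n, one_sided_p(5, 100, n))
```

The gain is 5 points in both cases. Only the sample size changes, so a small p-value by itself says nothing about whether the improvement is worth $1000.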