chapter 11 multinomial experiments and contingency tablesdenis.santos/arquivos/slides/... ·...
TRANSCRIPT
SlideSlide 2Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
11-1 Overview
11-2 Multinomial Experiments:Goodness-of-fit
11-3 Contingency Tables:Independence and Homogeneity
11-4 McNemar’s Test for Matched Pairs
Chapter 11Multinomial Experiments and Contingency Tables
SlideSlide 3Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Section 11-1 & 11-2 Overview and
Multinomial Experiments: Goodness of Fit
Created by Erin Hodgess, Houston, TexasRevised to accompany 10th Edition, Jim Zimmer, Chattanooga State,
Chattanooga, TN
SlideSlide 4Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Overview�We focus on analysis of categorical (qualitative
or attribute) data that can be separated into different categories (often called cells ).
�Use the χχχχ2 (chi-square) test statistic (Table A- 4).
�The goodness-of-fit test uses a one-way frequency table (single row or column).
�The contingency table uses a two-way frequency table (two or more rows and columns).
SlideSlide 5Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Key Concept
Given data separated into different categories, we will test the hypothesis that the distribution of the data agrees with or
“fits” some claimed distribution.
The hypothesis test will use the chi-square distribution with the observed frequency counts and the frequency counts that we
would expect with the claimed distribution.
SlideSlide 6Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Multinomial ExperimentThis is an experiment that meets the following conditions:
1. The number of trials is fixed.
2. The trials are independent.
3. All outcomes of each trial must be classified into exactly one of several different categories.
4. The probabilities for the different categories remain constant for each trial.
Definition
SlideSlide 7Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Last Digits of Weights
When asked, people often provide weights that are somewhat lower than their actual weights. So how can researchers verify that weights were obtained through actual measurements instead of asking subjects?
SlideSlide 8Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Test the claim that the digits in Table 11-2 do not occur with the same frequency.
Table 11-2 summarizes the last digit of weights of 80 randomly selected students.
Example: Last Digits of Weights
SlideSlide 9Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
1. The number of trials (last digits) is the fixed number 80.2. The trials are independent, because the last di git of any individual’s weight does not affect the last di git of any other weight.3. Each outcome (last digit) is classified into ex actly 1 of 10 different categories. The categories are 0, 1, … , 9.
4. Finally, in testing the claim that the 10 digit s are equally likely, each possible digit has a probabili ty of 1/10, and by assumption, that probability remains constan t for each subject.
Verify that the four conditions of a multinomial experiment are satisfied.
Example: Last Digits of Weights
SlideSlide 10Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Definition
Goodness-of-fit Test
A goodness-of-fit test is used to test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution.
SlideSlide 11Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
O represents the observed frequency of an outcome.
E represents the expected frequency of an outcome.
k represents the number of different categories or outcomes.
n represents the total number of trials .
Goodness-of-Fit TestNotation
SlideSlide 12Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Expected Frequencies
If all expected frequencies are equal :
the sum of all observed frequencies divided by the number of categories
nE =
k
SlideSlide 13Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
If expected frequencies are not all equal :
Each expected frequency is found by multiplying the sum of all observed
frequencies by the probability for the category.
E = n p
Expected Frequencies
SlideSlide 14Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Goodness-of-Fit Test in Multinomial Experiments
1. The data have been randomly selected.2. The sample data consist of frequency counts
for each of the different categories.3. For each category, the expected frequency is at
least 5. (The expected frequency for a category is the frequency that would occur if the data actually have the distribution that is being claimed. There is no requirement that the observed frequency for each category must be at least 5.)
Requirements
SlideSlide 15Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Goodness-of-Fit Test in Multinomial Experiments
Critical Values1. Found in Table A-4 using k – 1 degrees of
freedom, where k = number of categories.
2. Goodness-of-fit hypothesis tests are always right-tailed .
χχχχ2 = ΣΣΣΣ (O – E)2
E
Test Statistics
SlideSlide 16Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
�A large disagreement between observed and expected values will lead to a large value of χχχχ2
and a small P-value.
�A significantly large value of χχχχ2 will cause a rejection of the null hypothesis of no difference between the observed and the expected.
�A close agreement between observed and expected values will lead to a small value of χχχχ2
and a large P-value.
Goodness-of-Fit Test in Multinomial Experiments
SlideSlide 17Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Relationships Among the χχχχ2 Test Statistic, P-Value, and Goodness-of-Fit
Figure 11-3
SlideSlide 18Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Last Digit AnalysisTest the claim that the digits in Table 11-2
do not occur with the same frequency.
H0: p0 = p1 = ………… = p9
H1: At least one of the probabilities is different from the others.
αααα = 0.05
k – 1 = 9
χχχχ2.05, 9= 16.919
SlideSlide 19Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Last Digit AnalysisTest the claim that the digits in Table 11-2
do not occur with the same frequency.
Because the 80 digits would be uniformly distributed through the 10 categories, each expected frequency should be 8.
SlideSlide 20Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Last Digit AnalysisTest the claim that the digits in Table 11-2
do not occur with the same frequency.
SlideSlide 21Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Last Digit Analysis
From Table 11-3, the test statistic is χχχχ2 = 156.500.
Since the critical value is 16.919, we reject the null hypothesis of equal probabilities.
There is sufficient evidence to support the claim that the last digits do not occur with the same relative frequency.
Test the claim that the digits in Table 11-2 do not occur with the same frequency.
SlideSlide 22Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Detecting Fraud Unequal Expected Frequencies
In the Chapter Problem, it was noted that statistic s can be used to detect fraud. Table 11-1 lists the perc entages for leading digits from Benford’s Law.
SlideSlide 23Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Detecting Fraud
Observed Frequencies and Frequencies Expected with Benford’s Law
Test the claim that there is a significant discrepa ncy between the leading digits expected from Benford’s Law and the leading digits from the 784 checks.
SlideSlide 24Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Detecting Fraud
H0: p1 = 0.301, p2 = 0.176, p3 = 0.125, p4 = 0.097, p5 = 0.079, p6 = 0.067, p7 = 0.058, p8 = 0.051 and p9 = 0.046
H1: At least one of the proportions is different from the claimed values.
αααα = 0.01
k – 1 = 8
χχχχ2.01,8= 20.090
Test the claim that there is a significant discrepa ncy between the leading digits expected from Benford’s Law and the leading digits from the 784 checks.
SlideSlide 25Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Detecting Fraud
The test statistic is χχχχ2 = 3650.251.
Since the critical value is 20.090, we reject the n ull hypothesis.
There is sufficient evidence to reject the null hypothesis. At least one of the proportions is different than expected.
Test the claim that there is a significant discrepa ncy between the leading digits expected from Benford’s Law and the leading digits from the 784 checks.
SlideSlide 26Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Detecting FraudTest the claim that there is a significant discrepa ncy between the leading digits expected from Benford’s Law and the leading digits from the 784 checks.
Figure 11-5
SlideSlide 27Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Detecting Fraud
Figure 11-6 Comparison of Observed Frequencies and Frequencies Expected with Benford’s Law
SlideSlide 28Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
RecapIn this section we have discussed:
Multinomial Experiments: Goodness-of-Fit
- Equal Expected Frequencies
- Unequal Expected Frequencies
Test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution.
SlideSlide 29Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Section 11-3 Contingency Tables: Independence and
Homogeneity
Created by Erin Hodgess, Houston, TexasRevised to accompany 10th Edition, Jim Zimmer, Chattanooga State,
Chattanooga, TN
SlideSlide 30Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Key ConceptIn this section we consider contingency tables (or two-way frequency tables), which include frequency counts for categorical data arranged in a table with a least two rows and at least two columns.
We present a method for testing the claim that the row and column variables are independent of each other.
We will use the same method for a test of homogeneity, whereby we test the claim that different populations have the same proportion of some characteristics.
SlideSlide 31Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Contingency Table(or two-way frequency table)
A contingency table is a table in which frequencies correspond to two variables.
(One variable is used to categorize rows, and a second variable is used to categorize columns.)
Contingency tables have at least two rows and at least two columns.
Definition
SlideSlide 32Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Case-Control Study of Motorcycle Drivers
Is the color of the motorcycle helmet somehow related to the risk of crash related injuries?
491
213
704
377
112
489
31
8
39
899
333
1232
Black White Yellow/OrangeRow
Totals
Controls (not injured)
Cases (injured or killed)
Column Totals
SlideSlide 33Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Test of Independence
A test of independence tests the null hypothesis that there is no association between the row variable and the column variable in a contingency table. (For the null hypothesis, we will use the statement that “the row and column variables are independent.”)
Definition
SlideSlide 34Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Requirements
1. The sample data are randomly selected and are represented as frequency counts in a two-way table.
2. The null hypothesis H0 is the statement that the row and column variables are independent ; the alternative hypothesis H1 is the statement that the row and column variables are dependent .
3. For every cell in the contingency table, the expectedfrequency E is at least 5. (There is no requirement that every observed frequency must be at least 5. Also there is no requirement that the population mu st have a normal distribution or any other specific distribution.)
SlideSlide 35Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Test of IndependenceTest Statistic
Critical Values1. Found in Table A-4 using
degrees of freedom = ( r – 1)(c – 1)
r is the number of rows and c is the number of colu mns
2. Tests of Independence are always right-tailed.
χχχχ2 = ΣΣΣΣ (O – E)2
E
SlideSlide 36Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
(row total ) (column total )(grand total )
E =
Total number of all observed
frequencies in the table
Expected Frequency
SlideSlide 37Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Test of Independence
This procedure cannot be used to establish a direct cause-and-effect link between variables in question.
Dependence means only there is a relationship between the two variables.
SlideSlide 38Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Expected Frequency for Contingency Tables
E = • •grand total
row total column totalgrand total
grand total
E = (row total) (column total)(grand total)
(probability of a cell)
n • p
SlideSlide 39Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
491
213
704
377
112
489
31
8
39
899
333
1232
Black White Yellow/OrangeRow
Totals
Controls (not injured)
Cases (injured or killed)
Column Totals
For the upper left hand cell:
= 513.714E =(899)(704)
1232
Case-Control Study of Motorcycle Drivers
(row total) (column total)E =
(grand total)
899
1232704
899
1232
SlideSlide 40Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
491513.714
213
704
377
112
489
31
8
39
899
333
1232
Black White Yellow/OrangeRow
TotalsControls (not injured)Expected
Cases (injured or killed)
Column Totals
Case-Control Study of Motorcycle Drivers
= 513.714E =(899)(704)
1232
(row total) (column total)E =
(grand total)
Expected
SlideSlide 41Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
491513.714
213
704
377
112
489
31
8
39
899
333
1232
Black White Yellow/OrangeRow
TotalsControls (not injured)Expected
Cases (injured or killed)
Column Totals
190.286
Calculate expected for all cells.
To interpret this result for the upper left hand cell, we can say that although 491 riders with black helmets were not injured, we would have expected the number to be 513.714 if crash related injuries are independent of helmet color.
356.827
132.173
28.459
10.541
Case-Control Study of Motorcycle Drivers
Expected
SlideSlide 42Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Using a 0.05 significance level, test the claim that group (control or case) is independent of
the helmet color.
H0: Whether a subject is in the control group or case group is independent of the helmet color. (Injurie s are independent of helmet color.)
H1: The group and helmet color are dependent.
Case-Control Study of Motorcycle Drivers
SlideSlide 43Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
491513.714
213
704
377
112
489
31
8
39
899
333
1232
Black White Yellow/OrangeRow
Totals
Cases (injured or killed)Expected
Column Totals
Controls (not injured)Expected
190.286
356.827
132.173
28.459
10.541
2 2 22 ( ) (491 513.714) (8 10.541)
...513.714 10.541
O E
Eχ − − −= = + +∑
2 8.775χ =
Case-Control Study of Motorcycle Drivers
SlideSlide 44Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
H0: Row and column variables are independent.H1: Row and column variables are dependent.
The test statistic is χχχχ2 = 8.775
αααα = 0.05
The number of degrees of freedom are(r–1)(c–1) = (2–1)(3–1) = 2.
The critical value (from Table A-4) is χχχχ2.05,2 = 5.991.
Case-Control Study of Motorcycle Drivers
SlideSlide 45Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
We reject the null hypothesis. It appears there is an association between helmet color and motorcycle safety.
Case-Control Study of Motorcycle Drivers
Figure 11-4
SlideSlide 46Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Relationships Among Key Components in Test of Independence
Figure 11-8
SlideSlide 47Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Definition
Test of Homogeneity
In a test of homogeneity , we test the claim that different populations have the same proportions of some characteristics.
SlideSlide 48Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
How to Distinguish Between a Test of Homogeneity
and a Test for Independence:
Were predetermined sample sizes used for different populations (test of homogeneity), or was one big sample drawn so both row and column totals were determined randomly (test of independence)?
SlideSlide 49Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Using Table 11-6 with a 0.05 significance level, test the effect of pollster gender on survey
responses by men.
Example: Influence of Gender
SlideSlide 50Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
H0: The proportions of agree/disagree responses are the same for the subjects interviewed by men and the subjects interviewed by women.
H1: The proportions are different.
Using Table 11-6 with a 0.05 significance level, test the effect of pollster gender on survey
responses by men.
Example: Influence of Gender
SlideSlide 51Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Using Table 11-6 with a 0.05 significance level, test the effect of pollster gender on survey
responses by men.
Example: Influence of Gender
Minitab
SlideSlide 52Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
RecapIn this section we have discussed:
Contingency tableswhere categorical data is arranged in a table with a least two rows and at least two columns.
* Test of Independencetests the claim thatthe row and column variables are independentof each other.
* Test of Homogeneitytests the claim thatdifferent populations have the sameproportion of some characteristics.
SlideSlide 53Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Section 11-4 McNemar’s Test for
Matched Pairs
Created by Erin Hodgess, Houston, TexasRevised to accompany 10th Edition, Jim Zimmer, Chattanooga State,
Chattanooga, TN
SlideSlide 54Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Key Concept
The Contingency table procedures in Section 11-3 are based on independentdata. For 2 x 2 tables consisting of frequency counts that result from matched pairs, we do not have independence, and for such cases, we can use McNemar’s test for matched pairs.
SlideSlide 55Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Table 11-9 is a general table summarizing the frequency counts that
result from matched pairs.
SlideSlide 56Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
McNemar’s Test uses frequency counts from matched pairs of nominal data from two
categories to test the null hypothesis that for a table such as Table 11-9, the frequencies b
and c occur in the same proportion.
Definition
SlideSlide 57Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Requirements
1. The sample data have been randomly selected.
2. The sample data consist of matched pairs of frequency counts.
3. The data are at the nominal level of measurement, and each observation can be classified two ways: (1) According to the category distinguishing values with each matched pair, and (2) according to another category with two possible values.
4. For tables such as Table 11-9, the frequencies are such that b + c ≥ 10.
SlideSlide 58Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
RequirementsTest Statistic (for testing the null hypothesis thatfor tables such as Table 11-9, the frequencies band c occur in the same proportion):
22 ( 1)b c
b cχ
− −=
+Where the frequencies of b and c are obtained from the 2 x 2 table with a format similar toTable 11-9.
Critical values1. The critical region is located in the right tail only .2. The critical values are found in Table A- 4 by us ing degrees of freedom = 1.
SlideSlide 59Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Comparing TreatmentsTable 11-10 summarizes the frequency counts that resulted from the matched pairs using Pedacream on one foot and Fungacream on the other foot.
SlideSlide 60Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Comparing Treatments
H0: The following two proportions are the same:
• The proportion of subjects with no cure on the Pedacream-treated foot and a cure on the Fungacream-treated foot.
• The proportion of subjects with a cure on the Pedacream-treated and no cure on the Fungacream-treated foot.
Based on the results, does there appear to be a difference?
SlideSlide 61Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Comparing Treatments
Note that the requirements have been met:
The data consist of matched pairs of frequency counts from randomly selected subjects.
Each observation can be categorized according to two variables. (One variable has values of “Pedacream” and “Fungacream,” and the other variable has values of “cured” and “not cured.”)
b = 8 and c = 40, so b + c ≥ 10.
SlideSlide 62Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Comparing Treatments
2 22 ( 1) (8 40 1)
20.0218 40
b c
b cχ
− − − −= = =
+ +
Test statistic
Critical value from Table A-4
2 3.841χ =
Reject the null hypothesis.
It appears the creams produce different results.
SlideSlide 63Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
Example: Comparing Treatments
Note that the test did not use the categories where both feet were cured or where neither foot was cured. Only results from categories that are different were used.
Definition .
Discordant pairsof results come from pairs of categories in which the two categories are different (as in cure/no cure or no cure/cure).
SlideSlide 64Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley.
RecapIn this section we have discussed:
McNemar’s test for matched pairs.
• Data are place in a 2 x 2 table where eachobservation is classified in two ways.
• The test only compares categories that are different (discordant pairs).