Chapter 11: Multinomial Experiments and Contingency Tables

Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley.

11-1 Overview

11-2 Multinomial Experiments: Goodness-of-Fit

11-3 Contingency Tables: Independence and Homogeneity

11-4 McNemar’s Test for Matched Pairs

Chapter 11: Multinomial Experiments and Contingency Tables


Section 11-1 & 11-2 Overview and Multinomial Experiments: Goodness-of-Fit

Created by Erin Hodgess, Houston, Texas. Revised to accompany 10th Edition, Jim Zimmer, Chattanooga State, Chattanooga, TN


Overview

• We focus on analysis of categorical (qualitative or attribute) data that can be separated into different categories (often called cells).

• Use the χ² (chi-square) test statistic (Table A-4).

• The goodness-of-fit test uses a one-way frequency table (single row or column).

• The contingency table uses a two-way frequency table (two or more rows and columns).


Key Concept

Given data separated into different categories, we will test the hypothesis that the distribution of the data agrees with or “fits” some claimed distribution.

The hypothesis test will use the chi-square distribution with the observed frequency counts and the frequency counts that we would expect with the claimed distribution.


Definition

Multinomial Experiment

This is an experiment that meets the following conditions:

1. The number of trials is fixed.

2. The trials are independent.

3. All outcomes of each trial must be classified into exactly one of several different categories.

4. The probabilities for the different categories remain constant for each trial.


Example: Last Digits of Weights

When asked, people often provide weights that are somewhat lower than their actual weights. So how can researchers verify that weights were obtained through actual measurements instead of asking subjects?


Table 11-2 summarizes the last digit of weights of 80 randomly selected students. Test the claim that the digits in Table 11-2 do not occur with the same frequency.

Verify that the four conditions of a multinomial experiment are satisfied.

1. The number of trials (last digits) is the fixed number 80.

2. The trials are independent, because the last digit of any individual’s weight does not affect the last digit of any other weight.

3. Each outcome (last digit) is classified into exactly 1 of 10 different categories. The categories are 0, 1, … , 9.

4. Finally, in testing the claim that the 10 digits are equally likely, each possible digit has a probability of 1/10, and by assumption, that probability remains constant for each subject.
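The four conditions can be made concrete with a small simulation. The sketch below is only an illustration: the function name `simulate_last_digits` and the seed are hypothetical, and the simulated counts are not the data of Table 11-2. It draws a fixed number of independent trials, classifies each into exactly one of 10 categories, and uses the same probability 1/10 on every trial.

```python
import random
from collections import Counter


def simulate_last_digits(n_trials=80, seed=0):
    """Simulate a multinomial experiment: n_trials independent last digits,
    each digit 0-9 having the same probability 1/10 on every trial."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    digits = [rng.randrange(10) for _ in range(n_trials)]
    counts = Counter(digits)
    # One frequency count per category, including digits that never occurred.
    return [counts.get(d, 0) for d in range(10)]


counts = simulate_last_digits()
print(counts, "total trials:", sum(counts))
```

Whatever the individual counts turn out to be, they always sum to the fixed number of trials, which is what makes the one-way frequency table suitable for a goodness-of-fit test.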



Definition

Goodness-of-fit Test

A goodness-of-fit test is used to test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution.


Goodness-of-Fit Test Notation

O represents the observed frequency of an outcome.

E represents the expected frequency of an outcome.

k represents the number of different categories or outcomes.

n represents the total number of trials.


Expected Frequencies

If all expected frequencies are equal, each expected frequency is the sum of all observed frequencies divided by the number of categories:

$E = \frac{n}{k}$

If expected frequencies are not all equal, each expected frequency is found by multiplying the sum of all observed frequencies by the probability for the category:

$E = np$
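Both rules for expected frequencies can be sketched in a few lines of Python. The helper names `expected_equal` and `expected_from_probs` are hypothetical; the inputs are the values quoted in this chapter's examples (80 last digits in 10 categories, and 784 checks with the Benford's Law proportions).

```python
def expected_equal(n, k):
    """E = n / k for each of k equally likely categories."""
    return [n / k] * k


def expected_from_probs(n, probs):
    """E = n * p for each category probability p."""
    return [n * p for p in probs]


# Last-digit example: 80 digits spread over 10 equally likely categories.
last_digit_expected = expected_equal(80, 10)

# Fraud example: 784 checks, leading-digit proportions from Benford's Law.
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]
check_expected = expected_from_probs(784, benford)

print(last_digit_expected)
print([round(e, 3) for e in check_expected])
```

In both cases the expected frequencies add up to n, which is a quick sanity check on the claimed probabilities.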


Goodness-of-Fit Test in Multinomial Experiments

Requirements

1. The data have been randomly selected.

2. The sample data consist of frequency counts for each of the different categories.

3. For each category, the expected frequency is at least 5. (The expected frequency for a category is the frequency that would occur if the data actually have the distribution that is being claimed. There is no requirement that the observed frequency for each category must be at least 5.)



Test Statistic

$\chi^2 = \sum \frac{(O - E)^2}{E}$

Critical Values

1. Found in Table A-4 using k – 1 degrees of freedom, where k = number of categories.

2. Goodness-of-fit hypothesis tests are always right-tailed.


• A large disagreement between observed and expected values will lead to a large value of χ² and a small P-value.

• A significantly large value of χ² will cause a rejection of the null hypothesis of no difference between the observed and the expected.

• A close agreement between observed and expected values will lead to a small value of χ² and a large P-value.
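The link between disagreement and the size of χ² can be seen numerically. In this minimal sketch the function name `chi_square_statistic` is hypothetical and both sets of observed counts are made up for illustration (100 trials, 4 equally likely categories, so every expected frequency is 25); they are not data from the textbook.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


expected = [25, 25, 25, 25]           # n = 100, k = 4, so E = n / k = 25

close_agreement = [24, 26, 25, 25]    # hypothetical counts near the expected values
large_disagreement = [18, 22, 20, 40] # hypothetical counts far from the expected values

small_stat = chi_square_statistic(close_agreement, expected)
large_stat = chi_square_statistic(large_disagreement, expected)

print(small_stat, large_stat)
```

Because every term is a squared difference divided by a positive expected frequency, χ² can only grow as the observed counts move away from the expected counts, which is why the test is right-tailed.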



Figure 11-3: Relationships Among the χ² Test Statistic, P-Value, and Goodness-of-Fit


Example: Last Digit Analysis

Test the claim that the digits in Table 11-2 do not occur with the same frequency.

H0: p0 = p1 = … = p9

H1: At least one of the probabilities is different from the others.

α = 0.05

k – 1 = 9

$\chi^2_{0.05,\,9} = 16.919$




Because the 80 digits would be uniformly distributed through the 10 categories, each expected frequency should be 8.





Example: Last Digit Analysis

From Table 11-3, the test statistic is χ² = 156.500.

Because the test statistic exceeds the critical value of 16.919, we reject the null hypothesis of equal probabilities.

There is sufficient evidence to support the claim that the last digits do not occur with the same relative frequency.
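The decision rule for a right-tailed goodness-of-fit test can be written out as a short sketch. The function name `goodness_of_fit_decision` and the lookup dictionary are hypothetical conveniences; the dictionary holds only critical values quoted in this chapter's goodness-of-fit examples and is not a substitute for Table A-4.

```python
# Critical values quoted in this chapter, keyed by (degrees of freedom, alpha).
CRITICAL_VALUES = {
    (9, 0.05): 16.919,   # last-digit example
    (8, 0.01): 20.090,   # Benford's Law example
}


def goodness_of_fit_decision(statistic, k, alpha):
    """Right-tailed test: reject H0 when the statistic exceeds the critical value."""
    df = k - 1  # degrees of freedom for k categories
    critical = CRITICAL_VALUES[(df, alpha)]
    return statistic > critical, df, critical


reject, df, critical = goodness_of_fit_decision(156.500, k=10, alpha=0.05)
print("df =", df, "critical value =", critical, "reject H0:", reject)
```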


Example: Detecting Fraud (Unequal Expected Frequencies)

In the Chapter Problem, it was noted that statistics can be used to detect fraud. Table 11-1 lists the percentages for leading digits from Benford’s Law.


Example: Detecting Fraud

Observed Frequencies and Frequencies Expected with Benford’s Law

Test the claim that there is a significant discrepancy between the leading digits expected from Benford’s Law and the leading digits from the 784 checks.



H0: p1 = 0.301, p2 = 0.176, p3 = 0.125, p4 = 0.097, p5 = 0.079, p6 = 0.067, p7 = 0.058, p8 = 0.051, and p9 = 0.046

H1: At least one of the proportions is different from the claimed values.

α = 0.01

k – 1 = 8

$\chi^2_{0.01,\,8} = 20.090$




The test statistic is χ² = 3650.251.

Because the test statistic exceeds the critical value of 20.090, we reject the null hypothesis.

There is sufficient evidence to support the claim of a discrepancy: at least one of the proportions differs from the value expected under Benford’s Law.



Example: Detecting Fraud

Figure 11-5



Figure 11-6 Comparison of Observed Frequencies and Frequencies Expected with Benford’s Law


Recap

In this section we have discussed:

Multinomial Experiments: Goodness-of-Fit

- Equal Expected Frequencies

- Unequal Expected Frequencies

Test the hypothesis that an observed frequency distribution fits (or conforms to) some claimed distribution.


Section 11-3 Contingency Tables: Independence and Homogeneity


Key Concept

In this section we consider contingency tables (or two-way frequency tables), which include frequency counts for categorical data arranged in a table with at least two rows and at least two columns.

We present a method for testing the claim that the row and column variables are independent of each other.

We will use the same method for a test of homogeneity, whereby we test the claim that different populations have the same proportion of some characteristics.


Definition

Contingency Table (or two-way frequency table)

A contingency table is a table in which frequencies correspond to two variables. (One variable is used to categorize rows, and a second variable is used to categorize columns.)

Contingency tables have at least two rows and at least two columns.


Case-Control Study of Motorcycle Drivers

Is the color of the motorcycle helmet somehow related to the risk of crash-related injuries?

                             Black   White   Yellow/Orange   Row Totals
Controls (not injured)         491     377              31          899
Cases (injured or killed)      213     112               8          333
Column Totals                  704     489              39         1232


Definition

Test of Independence

A test of independence tests the null hypothesis that there is no association between the row variable and the column variable in a contingency table. (For the null hypothesis, we will use the statement that “the row and column variables are independent.”)


Requirements

1. The sample data are randomly selected and are represented as frequency counts in a two-way table.

2. The null hypothesis H0 is the statement that the row and column variables are independent; the alternative hypothesis H1 is the statement that the row and column variables are dependent.

3. For every cell in the contingency table, the expected frequency E is at least 5. (There is no requirement that every observed frequency must be at least 5. Also there is no requirement that the population must have a normal distribution or any other specific distribution.)


Test of Independence

Test Statistic

$\chi^2 = \sum \frac{(O - E)^2}{E}$

Critical Values

1. Found in Table A-4 using degrees of freedom = (r – 1)(c – 1), where r is the number of rows and c is the number of columns.

2. Tests of independence are always right-tailed.

Expected Frequency

$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})}$

where the grand total is the total number of all observed frequencies in the table.


Test of Independence

This procedure cannot be used to establish a direct cause-and-effect link between variables in question.

Dependence means only there is a relationship between the two variables.


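Expected Frequency for Contingency Tables

$E = (\text{grand total}) \cdot \frac{\text{row total}}{\text{grand total}} \cdot \frac{\text{column total}}{\text{grand total}}$

Here the grand total plays the role of n and the product of the two fractions is the probability of a cell, so E = n • p. Simplifying gives

$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})}$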


For the upper left-hand cell:

$E = \frac{(\text{row total})(\text{column total})}{(\text{grand total})} = \frac{(899)(704)}{1232} = 513.714$




Calculate the expected frequency for every cell.

                             Black      White      Yellow/Orange   Row Totals
Controls (not injured)         491        377                 31          899
  Expected                 513.714    356.827             28.459
Cases (injured or killed)      213        112                  8          333
  Expected                 190.286    132.173             10.541
Column Totals                  704        489                 39         1232

To interpret this result for the upper left-hand cell, we can say that although 491 riders with black helmets were not injured, we would have expected the number to be 513.714 if crash-related injuries are independent of helmet color.
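The expected frequency for every cell follows from the row totals, column totals, and grand total. The sketch below applies E = (row total)(column total)/(grand total) to the helmet-color counts given in this example; the function name `expected_frequencies` is a hypothetical helper, not something from the textbook.

```python
def expected_frequencies(table):
    """Expected count for each cell: (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Observed counts; columns are Black, White, Yellow/Orange.
observed = [
    [491, 377, 31],  # Controls (not injured)
    [213, 112, 8],   # Cases (injured or killed)
]

expected = expected_frequencies(observed)
for row in expected:
    print([round(e, 3) for e in row])
```

Rounded to three decimal places, the results should match the expected values shown with the table, and every expected frequency is at least 5, so that requirement of the test is met.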


Using a 0.05 significance level, test the claim that group (control or case) is independent of the helmet color.

H0: Whether a subject is in the control group or case group is independent of the helmet color. (Injuries are independent of helmet color.)

H1: The group and helmet color are dependent.



$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(491 - 513.714)^2}{513.714} + \cdots + \frac{(8 - 10.541)^2}{10.541} = 8.775$



H0: Row and column variables are independent.

H1: Row and column variables are dependent.

The test statistic is χ² = 8.775.

α = 0.05

The number of degrees of freedom is (r – 1)(c – 1) = (2 – 1)(3 – 1) = 2.

The critical value (from Table A-4) is $\chi^2_{0.05,\,2} = 5.991$.



Because the test statistic 8.775 exceeds the critical value 5.991, we reject the null hypothesis. It appears there is an association between helmet color and motorcycle safety.
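The whole test of independence for the helmet data can be reproduced with the standard library alone. This is a sketch under one stated assumption: the P-value line uses the closed form exp(−χ²/2), which is the right-tail area of the chi-square distribution only when there are exactly 2 degrees of freedom, as there are here. For other table sizes a statistics library or Table A-4 would be needed.

```python
import math

# Observed counts; columns are Black, White, Yellow/Orange.
observed = [
    [491, 377, 31],  # Controls (not injured)
    [213, 112, 8],   # Cases (injured or killed)
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        statistic += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)

# Assumption: this closed form is valid only because df == 2.
p_value = math.exp(-statistic / 2)

critical_value = 5.991  # from the example, alpha = 0.05, 2 degrees of freedom
reject = statistic > critical_value

print(round(statistic, 3), df, round(p_value, 4), reject)
```

The statistic agrees with the 8.775 reported in the example, and the P-value falls below the 0.05 significance level, so the critical-value and P-value approaches lead to the same decision.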


Figure 11-4


Figure 11-8: Relationships Among Key Components in Test of Independence


Definition

Test of Homogeneity

In a test of homogeneity, we test the claim that different populations have the same proportions of some characteristics.


How to Distinguish Between a Test of Homogeneity and a Test for Independence:

Were predetermined sample sizes used for different populations (test of homogeneity), or was one big sample drawn so both row and column totals were determined randomly (test of independence)?


Example: Influence of Gender

Using Table 11-6 with a 0.05 significance level, test the effect of pollster gender on survey responses by men.


H0: The proportions of agree/disagree responses are the same for the subjects interviewed by men and the subjects interviewed by women.

H1: The proportions are different.




Minitab



Contingency tables, where categorical data are arranged in a table with at least two rows and at least two columns.

* Test of Independence tests the claim that the row and column variables are independent of each other.

* Test of Homogeneity tests the claim that different populations have the same proportion of some characteristics.


Section 11-4 McNemar’s Test for Matched Pairs


Key Concept

The contingency table procedures in Section 11-3 are based on independent data. For 2 x 2 tables consisting of frequency counts that result from matched pairs, we do not have independence, and for such cases, we can use McNemar’s test for matched pairs.


Table 11-9 is a general table summarizing the frequency counts that result from matched pairs.


Definition

McNemar’s Test uses frequency counts from matched pairs of nominal data from two categories to test the null hypothesis that for a table such as Table 11-9, the frequencies b and c occur in the same proportion.


Requirements

1. The sample data have been randomly selected.

2. The sample data consist of matched pairs of frequency counts.

3. The data are at the nominal level of measurement, and each observation can be classified two ways: (1) According to the category distinguishing values with each matched pair, and (2) according to another category with two possible values.

4. For tables such as Table 11-9, the frequencies are such that b + c ≥ 10.


Test Statistic (for testing the null hypothesis that for tables such as Table 11-9, the frequencies b and c occur in the same proportion):

$\chi^2 = \frac{(|b - c| - 1)^2}{b + c}$

where the frequencies b and c are obtained from the 2 x 2 table with a format similar to Table 11-9.

Critical Values

1. The critical region is located in the right tail only.

2. The critical values are found in Table A-4 by using degrees of freedom = 1.


Example: Comparing Treatments

Table 11-10 summarizes the frequency counts that resulted from the matched pairs using Pedacream on one foot and Fungacream on the other foot.


Example: Comparing Treatments

H0: The following two proportions are the same:

• The proportion of subjects with no cure on the Pedacream-treated foot and a cure on the Fungacream-treated foot.

• The proportion of subjects with a cure on the Pedacream-treated foot and no cure on the Fungacream-treated foot.

Based on the results, does there appear to be a difference?



Note that the requirements have been met:

The data consist of matched pairs of frequency counts from randomly selected subjects.

Each observation can be categorized according to two variables. (One variable has values of “Pedacream” and “Fungacream,” and the other variable has values of “cured” and “not cured.”)

b = 8 and c = 40, so b + c ≥ 10.



Test statistic

$\chi^2 = \frac{(|b - c| - 1)^2}{b + c} = \frac{(|8 - 40| - 1)^2}{8 + 40} = 20.021$

Critical value from Table A-4

$\chi^2 = 3.841$

Reject the null hypothesis.

It appears the creams produce different results.
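The McNemar calculation for this example can be checked with a short sketch. The function name `mcnemar_statistic` is hypothetical. The P-value line relies on the fact that, for 1 degree of freedom, the right-tail area of the chi-square distribution equals erfc(√(χ²/2)); that identity does not carry over to other degrees of freedom.

```python
import math


def mcnemar_statistic(b, c):
    """McNemar's test statistic (|b - c| - 1)^2 / (b + c) for the discordant counts."""
    if b + c < 10:
        raise ValueError("requirement b + c >= 10 is not met")
    return (abs(b - c) - 1) ** 2 / (b + c)


# Discordant-pair counts from the Pedacream/Fungacream example.
statistic = mcnemar_statistic(8, 40)

# Right-tail area for 1 degree of freedom (valid only for df = 1).
p_value = math.erfc(math.sqrt(statistic / 2))

critical_value = 3.841  # from the example, 1 degree of freedom
reject = statistic > critical_value

print(round(statistic, 3), reject)
```

Only the two discordant counts enter the function, so the counts for pairs in which both feet or neither foot were cured have no effect on the result.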



Note that the test did not use the categories where both feet were cured or where neither foot was cured. Only results from categories that are different were used.

Definition

Discordant pairs of results come from pairs of categories in which the two categories are different (as in cure/no cure or no cure/cure).



McNemar’s test for matched pairs:

• Data are placed in a 2 x 2 table where each observation is classified in two ways.

• The test only compares categories that are different (discordant pairs).
