A COMPARISON OF SOME CONTINUITY CORRECTIONS
FOR THE CHI-SQUARED TEST IN 3 X 3,
3 X 4, AND 3 X 5 TABLES
DISSERTATION
Presented to the Graduate Council of the
North Texas State University in Partial
Fulfillment of the Requirements
For the Degree of
DOCTOR OF PHILOSOPHY
by
Jerry D. Mullen, B.S., M.Ed.
Denton, Texas
May, 1987
Mullen, Jerry D., A Comparison of Some Continuity
Corrections for the Chi-Squared Test on 3 X 3, 3 X 4, and
3 X 5 Tables, Doctor of Philosophy (Educational Research),
May, 1987, 161 pp., 13 tables, 14 figures, bibliography, 49
titles.
This study was designed to determine whether chi-
squared based tests for independence give reliable estimates
(as compared to the exact values provided by Fisher's exact
probabilities test) of the probability of a relationship
between the variables in 3 X 3, 3 X 4, and 3 X 5 contingency
tables when the sample size is 10, 20, or 30. In addition
to the classical (uncorrected) chi-squared test, four
methods for continuity correction were compared to Fisher's
exact probabilities test. The four methods were Yates'
correction, two corrections attributed to Cochran, and
Mantel's correction. The study was modeled after a similar
comparison conducted on 2 X 2 contingency tables and
published by Michael Haber.
In a Monte Carlo simulation, 2,500 contingency tables
were generated and analyzed for each combination of table
dimension and sample size. The tables were categorized by
ranges of the minimum expected frequencies and, within each
category, the average probability estimates were compared to
the exact probability. For the uncorrected chi-squared test
and for each of the four correction methods, the ratio of
the associated probability to the exact probability was
reported. The results were examined to determine the
effects of sample size, minimum expected frequency, and
table dimension on the probability of independence indicated
by the chi-squared based methods.
The analyses showed that, on the average, larger sample
sizes improved the estimates of the probability. None of
the five chi-squared based tests, however, was a reliable
estimator of the exact probability (given by Fisher's exact
probabilities test) under the conditions established for
this study. The corrections of Yates and Mantel produced
the greatest deviations from the exact probability in most
cases, while the uncorrected chi-squared test and the test
corrected by Cochran's two methods produced generally better
estimates. The results also showed that neither small
expected frequencies nor contingency table dimension had any
significant effect on the probability estimates.
TABLE OF CONTENTS

LIST OF TABLES

LIST OF ILLUSTRATIONS

Chapter

I. INTRODUCTION
   Statement of the Problem
   Purposes and Questions of the Study
   Significance of the Study
   The Model of the Study
   Definitions
   Assumptions

II. SYNTHESIS OF RELATED LITERATURE
   Introduction
   Sampling Models
   Independence in Two-Way Tables
   Tests of Independence
   Fisher's Exact Probability Test
   The Chi-Squared Test
   Limitations to the Use of Chi-Squared Tests
   Popularity / Advantages of Chi-Squared Tests
   Continuity Corrections and Chi-Squared
   The Research of Michael Haber

III. PROCEDURES
   Introduction
   Notation
   The Simulation Structure
   Random Number Generation
   Subroutine Descriptions
   Summary

IV. ANALYSIS OF DATA
   Introduction
   Questions of the Study
   3 X 3 Contingency Table Results
   3 X 4 Contingency Table Results
   3 X 5 Contingency Table Results
   Effects of Sample Size
   Effects of Expected Frequency Range
   Effects of Table Dimension
   Effects of Exact Probability Range
   Summary

V. SUMMARY OF FINDINGS, CONCLUSIONS, AND RECOMMENDATIONS
   FOR FURTHER RESEARCH

APPENDIX A. The Main Routine
APPENDIX B. Random Number Generator Programs
APPENDIX C. Subroutine MART
APPENDIX D. Subroutine EV
APPENDIX E. Subroutine RCONT2
APPENDIX F. Subroutine CHISQ
APPENDIX G. Subroutine PVAL
APPENDIX H. Subroutine RXCPRB
APPENDIX I. Subroutine COCHR
APPENDIX J. Subroutine YATES
APPENDIX K. Subroutine RATIOS
APPENDIX L. Data Tables

BIBLIOGRAPHY
LIST OF TABLES

Table
I. Ranges of Minimum Expected Frequencies
II. Performance Ratio Means and Ranges, 3 X 3 Tables
III. Performance Ratio Means and Ranges, 3 X 4 Tables
IV. Performance Ratio Means and Ranges, 3 X 5 Tables
V. Performance Ratio Means, 3 X 3, N = 10
VI. Performance Ratio Means, 3 X 3, N = 20
VII. Performance Ratio Means, 3 X 3, N = 30
VIII. Performance Ratio Means, 3 X 4, N = 10
IX. Performance Ratio Means, 3 X 4, N = 20
X. Performance Ratio Means, 3 X 4, N = 30
XI. Performance Ratio Means, 3 X 5, N = 10
XII. Performance Ratio Means, 3 X 5, N = 20
XIII. Performance Ratio Means, 3 X 5, N = 30
LIST OF ILLUSTRATIONS

Performance Ratio (PR) Cluster Diagrams

Figure
1. PR, P_A/P_E vs. N, for e<0.5, P_E<0.5, 3 X 3
2. PR, P_A/P_E vs. N, for e<0.5, P_E>0.5, 3 X 3
3. PR, P_A/P_E vs. N, for e>0.5, P_E<0.5, 3 X 3
4. PR, P_A/P_E vs. N, for e>0.5, P_E>0.5, 3 X 3
5. PR, P_A/P_E vs. N, for e<0.5, P_E<0.5, 3 X 4
6. PR, P_A/P_E vs. N, for e<0.5, P_E>0.5, 3 X 4
7. PR, P_A/P_E vs. N, for e>0.5, P_E<0.5, 3 X 4
8. PR, P_A/P_E vs. N, for e>0.5, P_E>0.5, 3 X 4
9. PR, P_A/P_E vs. N, for e<0.5, P_E<0.5, 3 X 5
10. PR, P_A/P_E vs. N, for e<0.5, P_E>0.5, 3 X 5
11. PR, P_A/P_E vs. N, for e>0.5, P_E<0.5, 3 X 5
12. PR, P_A/P_E vs. N, for e>0.5, P_E>0.5, 3 X 5
13. PR, P_Y/P_E vs. e, for N = 30, P_E<0.5
14. PR, P_C/P_E vs. e, for N = 30, P_E<0.5
CHAPTER I
INTRODUCTION
Arrangements of frequency data into categories of two
variables simultaneously are commonly encountered in
educational and psychological research (1, p. 153). Such
arrangements are variously referred to as "cross-
classifications," "two-way tables," or "contingency tables,"
and they are further identified by specifying the number of
rows and columns comprising the table. Each row represents
a category of one of the two variables, while each column
represents a category of the other variable. A 3 X 5
contingency table, for example, has three rows, which
represent the three categories of one variable, and five
columns, one for each of the five categories of the second
variable. Intersections of rows and columns form cells in
the contingency table, and each cell indicates a cross-
classification according to the row and column categories.
The purpose for cross-classifying frequencies, from a
statistical analysis viewpoint, is to allow for
straightforward tests of independence (no association)
between the two variables (4, p. 209). Such tests are not
designed to measure the degree of association, but solely to
determine whether observed departures from independence may
or may not be attributed to chance (8, p. 90). Should the
hypothesis of independence be rejected, thereby indicating
the existence of some association or relationship between
the variables, further statistical tests may be used to
determine the strength and the direction of that association
(6, p. 41).
The initial step in the analysis of a contingency
table, therefore, is to perform a test of independence
between the two variables. The test most commonly used for
this purpose is a chi-squared goodness-of-fit test in which
the observed frequencies in the contingency table are
compared cell by cell to the frequencies which would be
expected under the condition of independence (4, p. 212).
The expected frequency for a given cell is determined by
multiplying the total number of observations for the row in
which that cell is located by the total number of
observations in the cell's column, and then dividing that
product by the total number of observations in the table.
The chi-squared statistic produced by the goodness-of-fit
test yields an approximation of the probability of
association between the two classification variables (8, p.
96). It is relatively easy to compute, and it is acceptably
accurate in many situations as long as its limitations are
understood and acknowledged. In educational and
psychological research, however, situations frequently occur
in which the limitations and conditions for applying chi-
squared tests cannot practically be met (1, p. 153). An
alternative in these situations is to use an exact test,
that is, one which does not approximate the probability of
association but, rather, which calculates it directly (6, p.
15).
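As a concrete illustration of the computations just
described, here is a minimal Python sketch of the
expected-frequency rule and the resulting goodness-of-fit
statistic. It is offered for illustration only: the
dissertation's own implementation consisted of FORTRAN
subroutines (CHISQ and others listed in the appendices), and
the function names below are mine.

import numpy as np

def expected_frequencies(table):
    # Expected count for each cell under independence:
    # (row total) * (column total) / (total observations).
    table = np.asarray(table, dtype=float)
    row_totals = table.sum(axis=1, keepdims=True)
    col_totals = table.sum(axis=0, keepdims=True)
    return row_totals * col_totals / table.sum()

def chi_squared_statistic(table):
    # Uncorrected goodness-of-fit statistic: sum over all cells of
    # (observed - expected)^2 / expected.
    observed = np.asarray(table, dtype=float)
    expected = expected_frequencies(observed)
    return float(((observed - expected) ** 2 / expected).sum())

# Example: a 3 X 5 table with N = 30.
sample = [[3, 2, 1, 2, 2],
          [1, 4, 2, 1, 2],
          [2, 1, 3, 2, 2]]
print(expected_frequencies(sample))
print(chi_squared_statistic(sample))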
Probably the best known exact test is one proposed by
Fisher, who admits that even for 2 X 2 contingency tables,
its calculation is "laborious" (8, pp. 96-97). The
calculation difficulties arise from the need to compute
factorials of row and column totals, observed frequencies,
and sample size. For larger tables (e.g., 3 X 3, 3 X 4, or
3 X 5), the computational difficulties make Fisher's exact
probabilities test impractical. The chi-squared test has
remained the most popular test as a result.
Researchers have attempted to compensate for some of
the limitations of the chi-squared test by proposing various
"corrections" which are intended to improve the test's
estimation of the probability of association. One of the
earliest of these was proposed by Yates (14), and it has
been widely used despite the considerable amount of
controversy it has engendered among researchers (6, p. 14).
Yates' correction, like another one proposed by Cochran (3,
pp. 331-332), attempts to overcome inaccuracies produced in
the chi-squared test when the number of observations in the
contingency table is small, and most of the research,
including the studies produced by those opposed to the
application of a correction, has been limited to 2 X 2
tables (7, p. 214). In a study published in 1980, Haber
used Fisher's exact probabilities test as a standard for
comparing the uncorrected chi-squared test and four
corrected chi-squared statistics in 2 X 2 contingency tables
(9). He concluded that for 2 X 2 tables with fixed marginal
totals the chi-squared test is improved by applying
continuity corrections, but that Yates' correction was the
least efficient in improving the approximation (9, p. 515).
Other researchers have attempted to take advantage of
improvements in electronic computer technology to develop
more efficient methods for calculating exact probabilities.
March (11) published an algorithm for general (r X c)
contingency table probabilities analysis in 1972, but his
technique was unacceptably slow for all but the smallest
tables. Various improvements on March's algorithm followed,
but no real breakthrough appeared until 1983, when Mehta and
Patel published their "network" algorithm (12). Although
the network algorithm is significantly faster than
algorithms based on March's approach, its application is
still not practical for general use. However, Mehta and
Patel recognized that their algorithm could be used in
research situations to extend Haber's comparisons of
continuity corrections to the chi-squared test, and they
recommended its application for that purpose (12, pp. 432-
433).
The study reported here is in response to Mehta and
Patel's recommendation. While Haber's work serves as a
general model, this study also addresses certain other
issues related to chi-squared goodness-of-fit tests as
applied to contingency tables. The fact that continuity
corrections are difficult to apply to contingency tables
larger than 2 X 2 has resulted in the widespread
misconception that the corrections are applicable only to
the 2 X 2 table. That they may be applied to larger tables
is suggested (and supported) by the work of several other
researchers (3, pp. 329-330; 10). A further limitation
encountered in many studies is a lower bound on the minimum
expected value allowable in a contingency table when chi-
squared tests are to be applied. The exact value of this
lower bound is a subject of extensive disagreement (6, p.
40). What is clear is that researchers could easily be
confused by the myriad of "rules of thumb" proffered by
various authors, especially if the contingency tables to be
analyzed are larger than 2 X 2.
Statement of the Problem
It is not known what correction for continuity, if any,
is appropriate when contingency tables larger than 2 X 2 are
tested for independence using the chi-squared goodness-of-
fit test. Furthermore, it is not known what effect small
sample size has when the test is applied to such tables, and
the effect of various table dimensions has not been studied.
Purposes and Questions of the Study
In this study the uncorrected chi-squared goodness-of-
fit test and the same test with corrections by Yates,
Cochran (two versions), and Mantel are compared to Fisher's
exact probabilities test.
The purposes of this study were to
1. determine whether chi-squared based tests for
independence give reliable estimates (as compared to the
exact values) of the probability of a relationship between
the variables in 3 X 3, 3 X 4, and 3 X 5 contingency tables
when the sample size, N, is 10, 20, or 30; and
2. determine which chi-squared based test provides the
best approximation of the exact probability calculated by
Fisher's exact probabilities test for 3 X 3, 3 X 4, and 3 X
5 contingency tables.
The following questions were addressed.
1. What is the effect of small sample size on the
chi-squared based statistics in the two-way contingency
tables used in the study?
2. Does a small expected frequency (0.05, for example)
in the contingency table influence the accuracy of the test
statistic?
3. Is there any pattern or trend indicated in the
accuracy of the probability based on the chi-squared
statistic as compared to the exact probability as the table
dimensions increase from 3 X 3 to 3 X 4 to 3 X 5?
Significance of the Study
This study is considered significant for at least three
reasons. First, researchers in education, psychology,
sociology, and other related areas of behavioral study will
be provided with definite recommendations concerning the
type of continuity corrections which should be applied to
their frequency data in contingency tables. Also, rules
about minimum expected values and sample sizes might be
formed based on the results of this research.
A second reason for significance is that this study
extends the body of knowledge produced by the work of
Haber. This extension includes both table dimension (number
of rows and columns) and limits on minimum expected values.
Finally, the study is considered significant because it
examines the real need for the development of algorithms for
computing exact probabilities in R X C contingency tables.
That is, if the chi-squared goodness-of-fit test can be
corrected for continuity and used instead of an exact test,
there is probably no real need for extensive development of
general exact probabilities algorithms. This statement is
supported primarily by the parsimony and familiarity
associated with chi-squared tests.
The Model of the Study
The tables used for analysis in the experiments
reported in this study represent a simple random sample of
tables with the specified row and column dimension and the
chosen sample size (13, p. 4). Within the individual
tables, however, the data are generated according to a
hypergeometric sampling scheme, in which both row and column
totals are fixed in advance (2, p. 448). Continuity
corrections to the chi-squared test require hypergeometric
sampling if the estimated probabilities are to be considered
unconditional (5). Haber's study is modeled exactly as
described here (9, p. 510). Since this study is an
extension of Haber's work, it is only appropriate that the
same model be followed.
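The appendices list a subroutine, RCONT2, for generating
random tables under this scheme. Purely as an illustration
(this Python sketch does not claim to reproduce that
routine), one standard construction that holds both sets of
marginal totals fixed is to label the N observations by row
category, shuffle them, and deal them into the column
categories; this is equivalent to sampling without
replacement.

import random

def random_table_fixed_margins(row_totals, col_totals, rng=random):
    # Label N objects by row category, shuffle, then deal them into
    # column categories; both sets of margins stay fixed.
    assert sum(row_totals) == sum(col_totals)
    row_labels = [i for i, r in enumerate(row_totals) for _ in range(r)]
    col_labels = [j for j, c in enumerate(col_totals) for _ in range(c)]
    rng.shuffle(row_labels)
    table = [[0] * len(col_totals) for _ in range(len(row_totals))]
    for i, j in zip(row_labels, col_labels):
        table[i][j] += 1
    return table

# One random 3 X 5 table with all margins fixed and N = 30.
print(random_table_fixed_margins([10, 10, 10], [6, 6, 6, 6, 6]))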
Definitions
Asymptotic.—A term referring to parameters which are
derived from or based upon large samples.
Cochran's Correction.—A continuity correction
suggested by Cochran for chi-squared tests applied to
contingency tables when sample size is small. The test
involves computing the next largest chi-squared value which
the data permit, and then reading the chi-squared table
halfway between this new value and the observed chi-squared
value.
Contingency Table.—An arrangement allowing for the
cross-classification of two or more variables, at least one
of which may be categorical.
Correction for Continuity.—A numerical procedure
designed to make probabilities obtained from approximating a
continuous distribution agree more closely with the
probabilities obtained from a discrete distribution.
Exceedance Probability.—The probability that the chi-
squared statistic will be greater than or equal to an
observed chi-squared test statistic.
Expected Frequency.—A theoretical cell count in a
contingency table determined from a knowledge of that cell's
row and column totals and the sample size used in the
experiment.
Fisher's Exact Probabilities Test.—A test developed by
Fisher which allows computation of the exact probability of
occurrence of a particular distribution of frequencies in a
contingency table, or of an even more extreme distribution,
given the same marginal totals.
Hypergeometric Sampling.—A sampling method for cross-
classifications in which both row totals and column totals
are fixed prior to drawing the sample. It is equivalent to
sampling without replacement from a finite population (2, p.
449).
Marginal Totals.—The set of frequency totals for the
rows or columns of a contingency table.
Measure of Association.—A statistic used to indicate
the degree and/or direction of relationship between two or
more variables in a contingency table.
Measure of Independence.—A statistic indicating
whether or not classifications based on one variable in a
contingency table are affected by classifications based on
the other.
Monte Carlo Method.—A technique for simulating random
selection by using a computer to generate data which are
then treated within the experimental procedure as though
they were actual observed data.
Multinomial Sampling.—A sampling technique in which
the total sample size, N, is fixed a priori and sampled
values are cross-classified according to the values of the
underlying variables.
n-Way Table.—A contingency table representing the
cross-classification of n variables.
Poisson Sampling.—A sampling scheme in which no
restrictions are placed on sample size or marginal totals.
In other words, nothing is fixed in advance except,
possibly, the time of collection of the sample.
Product-Multinomial Sampling.—A sampling technique in
which one set of margins (row totals or column totals) is
fixed in advance and a multinomial sample whose size equals
the row (or column) total is taken and classified according
to the column (row) variable.
Sampling Zero.—A zero entry in a contingency table
caused by sampling phenomena, but the expected value of the
entry is greater than zero.
Stirling's Formula.—A mathematical formula used to
approximate the value of the factorial of a large number
(usually greater than 100); the formula is shown following
these definitions.
Structural Zero.—A zero entry in a contingency table
produced by the nature of the data itself.
Yates' Correction.—A correction for continuity used
with the chi-squared goodness-of-fit test which involves
subtracting 0.5 from the positive discrepancies (observed
frequencies minus expected frequencies) and adding 0.5 to
the negative discrepancies before the discrepancies are
squared to compute the chi-squared test statistic.
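For reference, Stirling's formula in its usual form (a
standard result, stated here for convenience rather than
quoted from the dissertation) is

\[
n! \;\approx\; \sqrt{2\pi n}\left(\frac{n}{e}\right)^{n},
\qquad\text{so that}\qquad
\ln n! \;\approx\; n\ln n - n + \tfrac{1}{2}\ln(2\pi n).
\]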
Assumptions
This study assumes the truth of the following two
statements.
The data produced by the computer's random number
generator and used in the experiments reported here do not
differ from data collected and analyzed in normal
educational and psychological research experiments.
The small sample sizes and larger contingency table
dimensions better reflect the realities encountered in
educational and psychological research than do large samples
and 2 X 2 tables.
CHAPTER BIBLIOGRAPHY
1. Berry, Kenneth J. and Paul W. Mielke, Jr., "Subroutines for Computing Exact Chi-Square and Fisher's Exact Probability Tests," Educational and Psychological Measurement, XLV (Spring, 1985), 153-159.
2. Bishop, Yvonne M. M., Stephen E. Fienberg, and Paul W. Holland, Discrete Multivariate Analysis, Cambridge, Massachusetts, The MIT Press, 1975.
3. Cochran, William G., "The Chi-Squared Test of Goodness of Fit," Annals of Mathematical Statistics, XXIII (Spring, 1952), 315-345.
4. Cohen, Jacob, Statistical Power Analysis for the Behavioral Sciences, New York, Academic Press, 1969.
5. Conover, William J., "Some Reasons for Not Using the Yates Continuity Correction on 2 X 2 Contingency Tables," Journal of the American Statistical Association, LXIX (June, 1974), 374-376.
6. Everitt, B. S., The Analysis of Contingency Tables, London, Chapman and Hall, 1977.
7. Ferguson, George A., Statistical Analysis in Psychology and Education, 5th ed., New York, McGraw-Hill Book Company, 1981.
8. Fisher, Ronald A., Statistical Methods for Research Workers, 14th ed., New York, Hafner Publishing Company, 1973.
9. Haber, Michael, "A Comparison of Some Continuity Corrections for the Chi-Squared Test on 2 X 2 Tables," Journal of the American Statistical Association, LXXV (September, 1980), 510-515.
10. Mantel, Nathan, "The Continuity Correction," The American Statistician, XXX (May, 1976), 103-104.
11. March, David L., "Algorithm 434: Exact Probabilities for R x C Contingency Tables," Communications of the Association for Computing Machinery, XV (November, 1972), 991-992.
12. Mehta, Cyrus R. and Nitin R. Patel, "A Network Algorithm for Performing Fisher's Exact Test in r x c Contingency Tables," Journal of the American Statistical Association, LXXVIII (June, 1983), 427-434.
13. Reynolds, H. T., The Analysis of Cross-Classifications, New York, The Free Press, 1977.
14. Yates, Frank, "Contingency Tables Involving Small Numbers and the Chi-Squared Test," Journal of the Royal Statistical Society, Ser. B, Supp., I (1934), 217-235.
CHAPTER II
SYNTHESIS OF RELATED LITERATURE
Introduction
Cross-classifications are the most common way of
displaying and studying nominal and ordinal variables, so
common, in fact, that hardly any social scientist, whether
student or practitioner, ever completely avoids them (38, p.
xiii). The use of cross-classifications to summarize
counted data predates even the work of Quetelet and other
investigators who, in the mid-nineteenth century, attempted
to analyze the association between the variables in a 2 X 2
contingency table, but it was not until the turn of the
century that Pearson and Yule formulated the first major
developments in the analysis of categorical data (11, p.
4). These two giants of statistics carried on a protracted
debate concerning their separate concepts regarding the
implications of categorizing a variable. Pearson preferred
to view each variable as having an underlying continuum and
a multivariate normal distribution, while Yule held that the
categories of a cross-classification should be regarded as
fixed. Both positions are tenable in certain situations,
but both also have serious limitations in others, and the
debate in some ways has yet to be completely resolved (11,
pp. 4-5).
Over the decades since the debate began, Pearson's chi-
squared test has been the most frequently applied approach
for determining independence between contingency table
variables, even though Yule's position appears to dominate
the statistical literature of the past quarter-century (11,
p. 5). Yule's position is particularly prominent, for
example, in the development of the theory of log-linear
models. Even so, Pearson's chi-squared test has retained
its popularity, probably because it is relatively well
understood, it is easy to apply, and it is widely supported
as an appropriate technique for this purpose by many
published studies. It is so popular that one statistician,
Mosteller, was prompted to write that "I fear that the first
act of most social scientists upon seeing a contingency
table is to compute a chi-square test for it. Sometimes the
process is enlightening, sometimes wasteful, but sometimes
it does not go far enough" (33, p. 1). The application of
chi-squared tests for goodness-of-fit in determining
independence between contingency table variables is subject
to certain limitations which are, at best, vaguely defined.
Mosteller's quotation, above, reflects the general lack of
agreement concerning those limitations and the questionable
results that can be produced when ill-defined boundaries are
approached.
As an example, contingency tables with small samples
are often studied. Expected frequencies are calculated and
compared with the observed frequencies using goodness-of-fit
tests. While such methods can be justified in large
samples, their validity is questionable in small samples
(8). The real problem is that "small sample" is not
defined, and various researchers give different rules for
determining adequate sample size. This is just one example
of the lack of definition regarding contingency table
independence studies.
Sampling Models
Several sampling models are commonly encountered in the
collection of counted cross-classified data. The Poisson
model was first suggested by Fisher (14) in 1950. It
requires the observation of a set of Poisson processes, one
for each cell in the cross-classification, over a fixed
period of time, but with no a priori knowledge regarding the
number of observations to be used (11, p. 15). The most
commonly used model is a simple random sample, or
multinomial model, in which only the sample size is fixed in
advance (38, p. 4). A third model, the product-multinomial
model, uses fixed sample sizes for the categories of one
variable and classifies each member of the sample according
to the categories of the second variable. In other words,
one set of marginal totals, either row or column totals, is
fixed in advance (11, p. 15).
A fourth sampling model requires that both row and
column totals be fixed in advance, and that observations on
a simple random sample be classified accordingly (6, p.
159). The variables in this case are considered to be
hypergeometrically distributed, and the sampling scheme is
equivalent to random sampling without replacement (1, p.
450). Fisher described the classic example of this sampling
method in his tea-tasting experiment (13). A subject was
presented with a number of cups of tea, some of which had
been prepared by adding milk to tea, and the remainder by
adding tea to milk. The "tea-tasting lady" was told how
many, but not which ones, of the cups fell into each
category. Her task in the experiment was to taste each cup
and to try to determine the category to which each
belonged. Since the subject was informed as to how many
cups of each kind were in the sample, Fisher presumed that
her guesses would be matched to those two numbers, thereby
holding the marginal totals fixed. In this famous
experiment, the two variables were "actual method of
preparation" and "perceived method of preparation." Each
variable was divided into two categories, "tea into milk"
and "milk into tea."
Other examples of experiments using contingency tables
with fixed row and column totals were described by Conover
(6, pp. 159-162). In one of these, a psychologist asked a
subject to learn twenty-five words. The subject was given
twenty-five blue cards, each with one word on it. Five of
the cards had nouns, five had adjectives, five had adverbs,
five had verbs, and five had prepositions. The subject was
required to pair these blue cards with twenty-five white
cards, each also having one word and also having the same
distribution of the parts of speech. The subject was
allowed a period of time to pair the cards (one white card
with each blue card) and to study the pairs formed. Then he
was instructed to close his eyes while the words on the
white cards were read to him one by one. As each word was
read, he attempted to furnish the associated word from the
blue card. The psychologist was not interested in the
number of correct word pairings, but rather in examining the
pairing structure to determine if some sort of ordering was
indicated.
The importance of the hypergeometric model to this
study is that Fisher's exact probabilities test and
continuity corrections to chi-squared statistics are used
only for this sampling model if unconditional exceedance
probabilities are to be approximated (20, p. 510). If
conditional probabilities are acceptable, however, the
product-multinomial model may be used.
Independence in Two-Way Tables
The Concept of Independence
The initial step in the analysis of cross-classified
data is to perform some test of independence on the
classified observations. In general terms, independence
means that the probability of one event is not affected by
the occurrence or nonoccurrence of a second event (38, p.
7). In terms of cross-classifications, independence means
that classification of an observation within a category of
one of the variables is not affected by, and has no effect
on, its classification according to the other variable. If
this is the case, that is, if the two variables are
independent, then the observed frequencies in each category
should differ from the maximum likelihood expectations for
those frequencies only by amounts attributable to chance
factors (10, p. 6).
Maximum likelihood estimates have been shown to be
satisfactory on theoretical grounds for calculating
departures from independence (1, p. 58). They are
calculated for a given cell in a contingency table by
dividing the product of that cell's row total and column
total by the sample size, N. If a table has no structural
zeroes, then it has a non-zero maximum likelihood estimate
for the expected frequency in every cell, even if some cells
have no observed counts (1, p. 59). Determining the
significance of any departures of the observed frequencies
from these estimates amounts to performing some test for
independence between the variables of the contingency table.
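In symbols, writing r_i for the total of row i, c_j for
the total of column j, and N for the sample size, the
estimate for cell (i, j) is

\[
\hat{e}_{ij} \;=\; \frac{r_i\, c_j}{N}.
\]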
Tests of Independence
Bishop, Fienberg, and Holland (1, pp. 373-374) defined
four classes of approximation tests for independence (or,
alternatively, association) in contingency tables. Their
classes were (1) measures based on the ordinary chi-squared
statistic, including those using corrections for continuity,
(2) measures based on the cross-product ratio for 2 X 2
tables, (3) a "proportional reduction of error" measure that
Bishop, Fienberg, and Holland attributed to Goodman and
Kruskal, and (4) a "proportion of explained variance
measure" formulated by Light and Margolin. The measures
based on chi-squared are outgrowths of Karl Pearson's work,
and they are the primary tests of interest in this study.
Yule developed the cross-product ratio which, as mentioned
earlier, formed the basis for log-linear modeling
techniques. The other tests mentioned by Bishop, Fienberg,
and Holland represent some tests which are less well known,
but which have been proposed for certain experimental
situations to determine independence.
At another point in their book Bishop, Fienberg, and
Holland described Fisher's exact probabilities test (1, pp.
364-366). This test is frequently recommended in the
literature for use when sample size is so "small" that the
chi-squared test might produce questionable results (38, p.
10; 10, pp. 15-20). In other instances, continuity
corrected chi-squared tests are suggested as alternatives
when computing Fisher's exact probabilities test would
require excessive labor (24, p. 334). This is because
Fisher's exact probabilities test requires that the
factorial of every cell entry and every marginal total and
the sample size be computed. In all but the smallest
contingency tables, this is a formidable task when done
manually, and it is considered slow, at best, when the
calculations are made electronically. Recently developed
computer algorithms have made Fisher's exact probabilities
test a more attractive alternative in some cases. Still,
chi-squared based tests retain their popularity, both among
statisticians and among researchers in general.
The chi-squared distribution arises from the normal
distribution as the probability of the sums of squares of a
number of independent variables, each of which has a
standard normal distribution (10, p. 8). The exact form of
the distribution depends on the number of independent
variates involved, a number generally referred to as the
number of degrees of freedom. The chi-squared test for
independence, the familiar goodness-of-fit chi-squared
statistic, is based on the summation of squared deviations
between maximum likelihood estimates of the expected
frequencies in the cells of a contingency table and the
observed counts in each cell (38, p. 8). It is an
approximation whose adequacy assumes that several
conditions, including sample size and table completeness,
have been met (38, p. 9). When these conditions have not
been met, certain "corrections" to the computed chi-squared
statistic are recommended so as to make the resulting
statistic a more accurate approximation (11, p. 21).
Although the literature reveals a considerable number of
studies aimed at determining the real limitations on the use
of the chi-squared statistic, no definite rules exist
concerning its general use as a research tool (24, p. 19).
Despite this fact, the chi-squared statistic continues to be
the most often used independence test (5, p. 212).
Fisher's Exact Probabilities Test
Fisher's exact probabilities test does not use the chi-
squared approximation at all; instead, the exact probability
distribution of the observed frequencies is used (10, p.
15). Fisher proposed the test first for the treatment of 2
X 2 tables, and he admitted that it was "laborious, though
necessary in cases of doubt" (15, p. 96). In these 2 X 2
tables, the probability (P) of obtaining any particular
arrangement of the frequencies a, b, c, and d (the observed
frequencies in the cells of a 2 X 2 table) when the marginal
totals are fixed is
\[
P \;=\; \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,N!} \qquad \text{Eq. 1}
\]
where a! is read 'a factorial' and represents the product of
a and all the whole numbers less than it, down to one (10,
p. 15). Fisher (15) developed this formula from the
theories of probability which state that if the occurrence
of an event has probability p, the probability of its
occurring a times in (a + b) independent samples is given by
the binomial expansion term,
\[
\frac{(a+b)!}{a!\,b!}\; p^{a} q^{b},
\]
where q = 1 - p. The probability that it will occur c times
in a sample of size (c + d) is
\[
\frac{(c+d)!}{c!\,d!}\; p^{c} q^{d}.
\]
The probability of the observed frequencies a, b, c, and d
in a 2 X 2 contingency table is the product of these two
terms, or
\[
\frac{(a+b)!\,(c+d)!}{a!\,b!\,c!\,d!}\; p^{a+c} q^{b+d},
\]
and this product is not known if p is unknown. However, for
all tables having the same marginal totals a + c, b + d, a +
b, and c + d, the unknown factor involving p and q is always
the same. The probabilities of the observations are in
proportion to
\[
\frac{1}{a!\,b!\,c!\,d!},
\]
whatever the value of p may be. Fisher showed that the sum
of this quantity for all samples having the same marginal
totals is
\[
\frac{N!}{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!},
\]
where N is the sum a + b + c + d. Therefore, given the
marginal totals, the probability of any observed set of frequencies is
\[
P \;=\; \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{N!\;a!\,b!\,c!\,d!}.
\]
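This point probability is easy to evaluate directly. The
short Python sketch below is illustrative only; the function
name is mine, not the dissertation's.

from math import factorial

def fisher_2x2_probability(a, b, c, d):
    # Probability of the table [[a, b], [c, d]] given its marginal
    # totals, per the formula above.
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(n) * factorial(a) * factorial(b)
                   * factorial(c) * factorial(d))
    return numerator / denominator

# The tea-tasting layout with all eight cups classified correctly:
print(fisher_2x2_probability(4, 0, 0, 4))   # 1/70, about 0.0143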
This outline of Fisher's development of his exact test
shows that he considered only 2 X 2 tables with observed
frequencies a, b, c, and d. In general, if a sample of size
N is subjected to two different and independent
classifications, A and B, with R and C classes respectively,
the probability P of obtaining the observed array of cell
frequencies X(x_{ij}), under the conditions imposed by the
arrays of marginal totals A(r_i) and B(c_j), is given by
\[
P \;=\; \frac{\prod_{i=1}^{R} r_i!\;\prod_{j=1}^{C} c_j!}{N!\,\prod_{i=1}^{R}\prod_{j=1}^{C} x_{ij}!} \qquad \text{Eq. 2}
\]
This expression (Eq. 2) is exact and holds if (a) the parent
population is infinite or the sampling is done with
replacement of the sampled items, (b) the sampling is
random, (c) the population is homogeneous, and (d) the
marginal totals are considered fixed in repeated sampling.
To test the hypothesis that A and B are independent, the
probability of obtaining an array as probable as, or less
probable than, the observed array is found by (a)
computing the probability of the observed array, (b)
computing the probabilities for all other possible arrays of
cell frequencies, subject to the conditions imposed by the
fixed marginal totals, and (c) summing all of the
probabilities found in (b) that are less than or equal to
the probability of the observed array (28, p. 991).
Clearly, this quickly becomes laborious, even for 2 X 2
tables if N is very large (16, p. 18).
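For small tables these three steps can be carried out by
brute force. The following Python sketch is illustrative
only (the names are mine, and a practical program would use
the faster algorithms discussed below); it enumerates every
array consistent with the margins and sums the qualifying
probabilities.

from math import factorial
from itertools import product

def table_probability(table, row_totals, col_totals, n):
    # Eq. 2: probability of one array given fixed marginal totals.
    num = 1
    for t in list(row_totals) + list(col_totals):
        num *= factorial(t)
    den = factorial(n)
    for row in table:
        for x in row:
            den *= factorial(x)
    return num / den

def all_tables(row_totals, col_totals):
    # Recursively yield every nonnegative integer table with the
    # given row and column totals.
    if len(row_totals) == 1:
        yield [list(col_totals)]
        return
    for first in product(*(range(c + 1) for c in col_totals)):
        if sum(first) == row_totals[0]:
            rest_cols = [c - f for c, f in zip(col_totals, first)]
            for rest in all_tables(row_totals[1:], rest_cols):
                yield [list(first)] + rest

def fisher_exact_pvalue(observed):
    # Steps (a)-(c): sum the probabilities of all arrays as probable
    # as, or less probable than, the observed array.
    row_totals = [sum(r) for r in observed]
    col_totals = [sum(c) for c in zip(*observed)]
    n = sum(row_totals)
    p_obs = table_probability(observed, row_totals, col_totals, n)
    total = 0.0
    for t in all_tables(row_totals, col_totals):
        p = table_probability(t, row_totals, col_totals, n)
        if p <= p_obs + 1e-12:   # tolerance for floating-point ties
            total += p
    return total

print(fisher_exact_pvalue([[3, 1], [1, 3]]))   # about 0.486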
To facilitate the exact treatment of the general R x C
contingency table, researchers have developed computer
algorithms for Fisher's exact probabilities test. One of
the earliest of these was March's algorithm, which produced
an exhaustive enumeration of all possible R X C contingency
tables with fixed marginal sums (28). March used the facts that
when the marginal totals are fixed, the numerator of the
expression above (Eq. 2) divided by N! is a constant, and
that only the denominator products vary from table to
table. In order to avoid machine overflow and roundoff
errors, March computed the constant using logarithms. He
employed Stirling's formula to approximate the factorials of
all numbers greater than 100. Then, his algorithm varied
all the cells in the given table so as to produce new tables
fitting the marginal constraints, some of which had greater
probabilities and others which had smaller probabilities
than the original table's frequency distribution. The
probability of each new table was computed, again using
logarithms to compute the factorials and their products, and
then it was compared to the probability of the original
table. The probabilities which were smaller than that of
the original table were added together to compute the exact
probability of the observed distribution's occurrence.
March's results, in which errors were on the order of 10^…,
were verified using hand computation. He used his algorithm
to carry out the research for his doctoral study entitled
"Accuracy of the Chi-Square Approximation for 2 X 3
Contingency Tables with Small Expectations" at Lehigh
University in 1970.
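The log-space device is easy to demonstrate. In the Python
sketch below (mine, for illustration), math.lgamma supplies
ln Γ(n + 1), that is, ln n!, essentially exactly; it stands
in here for the Stirling approximation that March applied to
factorials of numbers above 100.

from math import exp, lgamma

def log_factorial(n):
    return lgamma(n + 1)   # ln(n!) without computing n! itself

def log_table_probability(table):
    # ln P of an R X C table under Eq. 2; no huge factorial is ever
    # formed, so overflow is avoided much as in March's algorithm.
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    log_p = sum(log_factorial(t) for t in row_totals)
    log_p += sum(log_factorial(t) for t in col_totals)
    log_p -= log_factorial(n)
    log_p -= sum(log_factorial(x) for row in table for x in row)
    return log_p

print(exp(log_table_probability([[3, 1], [1, 3]])))   # 16/70 ≈ 0.2286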
Boulton and Wallace (2) published a more efficient
algorithm which generated explicitly only those tables which
satisfied the marginal constraints, and which eliminated
tables with greater probabilities than the observed table.
In addition, it used an ordering technique which reduced the
time required to obtain the logarithm of the probability of
each generated table. Their tests demonstrated a
significant speed advantage over March's algorithm, an
advantage which improved as the table dimension was
increased. Furthermore, they indicated that their algorithm
could be extended to tables with more than two dimensions,
although they did not explicitly publish such an extension.
Two years later, in 1975, Hancock (21) published
another algorithm based on March's original work. It, like
the one by Boulton and Wallace, purported greater speed than
March's, accomplished by ignoring tables which satisfied the
marginal totals but which had probabilities greater than the
observed table. In most of the cases tested, Hancock's
algorithm was at least twice as fast as the one by March.
However, he made no attempt to extend the algorithm to
tables with more than two dimensions, and, in fact, he
admitted that his method would likely be "impractical" for
tables with more than nine degrees of freedom. At this
dimension, Hancock's algorithm required approximately
sixteen seconds of a CDC CYBER-73's central processing unit
(CPU) time to compute the probability. Still, this was a
great improvement over March's algorithm, which required
more than 500 seconds of CPU time for tables of the same
size on the same machine.
The next significant development in the evolution of
exact probabilities computation was the algorithm of Pagano
and Halvorsen (34), which was, in their terms, a network
algorithm for the calculation of the exact probability of an
R X C contingency table. It was an extension of an earlier
work by Mehta and Patel which provided an algorithm for the
2 X C contingency table. Mehta and Patel (30) subsequently
extended the bounds of computational feasibility given by
Pagano and Halvorsen with an algorithm they first published
in 1983, and which they have continued to refine. Like the
algorithm by Pagano and Halvorsen, theirs was a network
solution, and many R X C tables which were computationally
infeasible previously could be evaluated by their methods.
Their algorithms circumvented the need to enumerate
explicitly all the tables satisfying the specified marginal
totals and, instead, formulated the probability calculation
as a network problem. They constructed a network of nodes
and arcs such that all paths from the initial node to the
terminal node corresponded to a frequency distribution whose
probability was equal to or less than that of some observed
distribution. The sum of those path lengths gave the
probabilities for any observed table.
Mehta and Patel's continuing research produced a hybrid
test, one which blended exact and asymptotic theory so as to
produce results almost equivalent to Fisher's exact
probabilities test, while requiring considerably less
computational effort (31). According to their research,
this algorithm was especially efficient for data sets in
which a small number of cells were sparse or empty, although
the majority of cells had large entries. The computational
effort in these cases was similar to that required for the
chi-squared test. At the opposite extreme, where the
contingency table was sparse in all cells, the hybrid
algorithm proceeded exactly as the original network
algorithm. Between those extremes, the hybrid algorithm
adapted automatically to each specific distribution by
providing as much exact computation as necessary to
calculate a probability almost identical to that obtained by
Fisher's exact probabilities test. In effect, the hybrid
algorithm provided a smooth transition between a fully
asymptotic and a fully exact significance test. In the
process, it reduced the computing requirements of many
problems by "several" orders of magnitude (to use Mehta and
Patel's description), and in no case did it compromise the
accuracy of the probability value. Even so, the time
required to evaluate probabilities in large tables with
large sample sizes required unreasonably long calculation
periods for multi-user computers.
Fisher's exact probabilities test has been used most
often as a standard for the comparison of other easier-to-
use tests of independence. Garside and Mack (16) and
Conover (7) used it as a standard for their studies with 2 X
2 contingency tables, as did Haber (20) in the work which
directly led to this present study. Mehta and Patel
recommended their Fisher's exact probabilities algorithm for
use as a comparison standard (30, p. 433). It has also been
recommended by many textbook authors for use with 2 X 2
contingency tables when small sample size or small expected
values create doubt about the validity of a chi-squared
approximation (38; 10; 29, pp. 236-237). The concept of
"exactness" is apparently responsible for the test's being
accepted as a standard, but the term "exact" has been
criticized (16, p. 18). Starmer, Grizzle, and Sen (39)
cautioned that statisticians "should not be led into a
semantic trap by the words 'exact test'." They added that
the exact test is always conservative, and that there is, in
their opinion, no good reason for using it as a standard for
comparing competing tests. Upton (41, p. 38), in discussing
Fisher's exact probabilities test, referred to the word
"exact" as a "sobriquet" which has prejudiced users in favor
of the test. Bradley made the clearest case for the use of
Fisher's exact probabilities test when he described it as
"perfectly efficient" in the sense that it is an exact
method which uses all the information in the sample and does
not substitute an approximating distribution for the actual
distribution of the observed frequencies within the cells
(3, pp. 199-200). So, despite its laborious computation and
its alleged conservativeness, this property of perfect
efficiency has apparently convinced many researchers, as
noted above, to use Fisher's exact probabilities test as a
comparison standard.
Like most statistical tests, Fisher's exact
probabilities test does require that certain conditions be
met in order to validate its application. Reynolds
summarized the most stringent conditions by saying that the
test is appropriate when both the row and the column
marginal totals are fixed, or when the researcher is willing
to use a test conditional on a given set of marginals
(either row or column totals, but not both, are fixed) (38,
p. 10). Bradley (3) gave the more general assumptions as:
(a) each of the two sampled populations is assumed to be
infinite in size, (b) the categories of the two variables
are mutually exclusive and exhaustive, (c) the outcome of
drawing one observation is independent of the outcome of
drawing any other observation, and (d) the sampling is
random.
The Chi-Squared Test
History and Development
The use of the distribution of chi-squared for testing
independence between the variables in a two-way contingency
table is commonplace (24, p. 19). The test has its
beginnings in the work of Karl Pearson (35) published in
1900. In this paper Pearson proposed that the quantity
\[
X^{2} \;=\; \sum \frac{(\text{observed frequency} - \text{expected frequency})^{2}}{\text{expected frequency}} \qquad \text{Eq. 3}
\]
when the summation is extended over all classifications of
the sample, was distributed as chi-squared, and he used the
statistic exclusively for grouped continuous data. In doing
so he committed himself to the assumption that the expected
frequencies in all categories were large enough to satisfy
the asymptotic properties upon which the chi-squared distribution is based. In summary, the 1900 paper
established the necessary distribution theory for
determining significance levels when expected values are
provided exactly by the null hypothesis. However, he failed
to show that the exact distribution of X², which is
discontinuous, actually approached chi-squared as a limiting
distribution (4, p. 320). A rigorous mathematical proof of
this was provided by Cramer (9, p. 424).
The greatest battle over the use of the test, however,
had to do with the calculation of the number of degrees of
freedom employed when determining the approximated
probability of the observed distribution. Pearson did not
recognize that estimating the expected values from the
sample made a difference to the goodness-of-fit statistic by
reducing the overall number of degrees of freedom (23). In
proposing the use of the chi-squared test for goodness-of-
fit in 2 X 2 contingency tables, Pearson attributed three
degrees of freedom to X², whereas it should have received
only one. The confusion and controversy caused by this was
not settled for more than twenty years. A 1915 paper by
Greenwood and Yule (17) illustrated the uncertainty which
perplexed many critical users of the test. In their paper,
they followed Pearson's rule of assigning three degrees of
freedom to their 2 X 2 contingency tables in which a sample
of subjects exposed to cholera was categorized according to
whether or not individuals in the sample had been
inoculated, and also as to whether they contracted the
disease following exposure. The hypothesis that
inoculation was ineffective was also tested by calculating
the difference between the percent ill among the inoculated
and the non-inoculated and applying a "normal deviate"
test. This test, they found, gave statistically significant
results more often than did Pearson's test, and they gave
the impression of being confused as to which was the more
accurate. They finally decided to adopt Pearson's test
because it was the more conservative, but they added that
the issue deserved further theoretical investigation.
The issue was finally resolved in 1924 when Fisher (12)
showed the correction for Pearson's degrees of freedom
mistake by proving that in a 2 X 2 contingency table X² is
the square of a single quantity which had a limiting normal
distribution, i.e., one degree of freedom. He also proved
that the distribution of X² is contingent upon the method
used to estimate the expected frequencies. The "natural"
method, it would seem, would be one which results in the
minimum X², and Fisher showed that in the limit in large
samples the maximum likelihood method of estimation
accomplishes this. With the degrees of freedom battle
settled and the method of maximum likelihood estimation
established as theoretically sound, at least for large
samples, Pearson's chi-squared test rapidly gained
widespread acceptance.
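In modern notation, the rule that settled the dispute
assigns an r X c contingency table with margin-estimated
expected frequencies

\[
\mathrm{df} = (r - 1)(c - 1)
\]

degrees of freedom, so a 2 X 2 table has (2 - 1)(2 - 1) = 1,
as Fisher showed, rather than Pearson's three.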
Examples: The Use of Chi-Squared Tests
McNemar (29, pp. 219-220) cited three situations for
which chi-squared tests are appropriate. The first, which
he said did not often arise in social science research, was
"the discrepancy of observed frequencies from frequencies
expected on the basis of some a priori principle." His
example of this situation was a genetics study in which a
particular characteristic of parents was hypothesized to
appear in a specified proportion of their offspring. The
second situation he cited was the contingency table
application, which is of primary interest here. McNemar
described contingency tables as cross-classifications of two
"variables for which we have only categorized information
for N individuals. The variables might be in dichotomy
(fourfold table), or one might be a dichotomy and the other
manifold, or both might involve multiple categories." His
final situation was for testing goodness-of-fit of an
observed frequency distribution with another (given)
frequency distribution. He gave as an example the testing
of an observed distribution to determine its goodness-of-fit
with the normal distribution. Fisher described the results
of applying the chi-squared test in any of these situations
as a "logical disjunction: either the hypothesis is untrue,
or the value of chi-squared has attained by chance an
exceptionally high value" (15, p. 80).
Statistical literature is replete with examples of the
chi-squared tests' being applied to the classes of
situations McNemar defined, and interpreted according to
Fisher's logical disjunction. Some examples of the
contingency table situation follow.
Fisher used data attributed to Greenwood and Yule
regarding typhoid innoculations to illustrate the use of the
chi-squared test of independence in fourfold (2 X 2) tables.
He used Tocher's data from a pigmentation survey of Scottish
children to construct and test a 2 X 5 contingency table,
and he used data collected by Wachter to construct a 4 X 4
classification involving physical characteristics resulting
from a series of back-crosses in mice (15, pp. 85-89). As
he extended the dimensions of the contingency tables in each
succeeding example, he demonstrated the efficacy of the chi-
squared test for determining independence (or the lack of
it) between the two variables.
Guilford (19, pp. 234-236) used a 2 X 2 contingency
table to classify subjects according to marital status and
intelligence. He calculated a significant chi-squared value
to indicate that the classifications were not independent
for that sample. Guilford's study, like those used by
Fisher in his illustrations, was based on large sample
sizes, ranging from just over 400 (Guilford's data) to more
than 18,000 (the Greenwood and Yule data used by Fisher).
There is little or no disagreement in the literature
concerning the applicability of the chi-squared test under
these large sample conditions. As the sample size
decreases, or as the table dimensions increase, however, the
accuracy of the test comes into question. Naturally, then,
many studies have addressed this question, and some of them
are described in the following paragraphs.
Limitations to the Use of Chi-Squared Tests
The maximum likelihood estimates of the expected values
in contingency tables are directly proportional to the size
of the sample classified in the table and inversely
proportional to the number of rows and columns of the
table. Cochran stated that, since chi-squared has been
established as the limiting distribution of X² in large
samples, the smallest expected frequency in any category
should be ten (4, p. 328). He pointed out that some writers
recommend five as the lower limit, but he admitted that the
inflexible use of minimum expectations of five or ten may be
harmful (4, p. 329). Fisher (15) recommended a lower limit
of five. In their 1965 study, Lewontin and Felsenstein
investigated the robustness of the chi-squared test for
independence in 2 X n contingency tables. They summarized
their results with a "conservative rule": the 2 X n table
can be reliably tested using the chi-squared test if all
expected frequencies are one or greater (24, p. 31). Even
this, they said, is "extremely conservative," and if the
smallest expected frequency is 0.5 the test is still
applicable. Despite the results reported by Lewontin and
Felsenstein, most textbook authors still cling to the
"minimum expectation of five" rule (10; 29; 19).
McNemar (29) provided a conceptual explanation of the
problem. He pointed out that in the derivation of the
equation for the chi-squared distribution(s) it is assumed
that the distribution of the discrepancies (observed
frequency minus expected frequency) follows the normal
distribution. If an observed frequency is small, for
example, if it equals two, then the only smaller possible
observations are zero and one, while larger possible
observations may be three, four, five, and upward. The
distribution of the discrepancies, therefore, is likely to
be skewed, that is, non-normal, thereby violating the
assumption underlying the fundamental equation. The effect,
most noticeable for fewer degrees of freedom, is to create
discontinuities in the distribution of the test statistic,
X². Since the approximating chi-squared distribution is
continuous, the accuracy of the approximation is
questionable.
In their study, Lewontin and Felsenstein acknowledged
the existence of the discontinuities. They justified their
study, however, on the basis that only the upper tail of the
chi-squared distribution, where the cumulative distribution
exceeds 0.90, is important for tests of independence. Thus,
they said, good agreement between the distributions of X²
and chi-squared in this upper tail region has the effect of
making the chi-squared test robust, regardless of the
correspondence between the distributions for smaller values
of chi-squared (24, p. 20). Clearly, though, the issue is
not settled, and researchers have no concrete rule regarding
minimum expected frequencies.
One approach recommended when confronted with small
expected frequencies is to combine neighboring classes until
acceptable expectations are obtained (4, p. 328). Everitt
summarized some reasons for not using this technique (10, p.
40). Firstly, he said, considerable amounts of information
may be lost by combining categories, thereby detracting from
the interest and usefulness of the study. Secondly, the
randomness of the sample could be affected, thereby
violating the assumption of randomness upon which the chi-
squared test is founded. In addition, since the categories
are chosen in advance of the classification of the sample,
pooling categories after the data are seen may affect the
randomness of the sample in unpredictable ways. Lastly, the
inferences drawn may be influenced by the manner in which
the categories are combined. His conclusion was that
pooling classification categories should be avoided.
Another approach for handling the problems caused by
discontinuities in the test statistic is to "correct" the
discrepancies between observed and expected frequencies,
using some correction rule. A number of such rules have
been proposed, and several will be considered later in this
chapter.
Popularity and Advantages of Chi-Squared Tests
Cochran (4, p. 319), in 1952, wrote that "perhaps the
most common of all uses of the chi-squared test is for the 2
X 2 contingency table." Cohen reiterated this sentiment
when he stated that the most frequent application of chi-
squared is in contingency tests, and he did not limit it to
fourfold tables (5, p. 212). The use of the test for
contingency table analysis was termed "commonplace" by
Lewontin and Felsenstein (24, p. 19). Each of these
statements testifies to the popularity of the chi-squared
test for independence in cross-classifications.
Its popularity apparently derives from two facts.
First, as the historical survey indicates, the test has a
long history of use and is therefore well understood, and it
is considered central to the analysis of contingency tables
(10, p. 11). The second fact is that the X 2 test statistic
is relatively easy to compute, even for large contingency
tables (16, p. 18). These facts justify studies intended
either to improve the accuracy of the test or to extend its
use into more practical situations. In the next section
some attempts to accomplish these goals are examined.
Continuity Corrections and Chi-Squared
A fundamental problem with the use of the chi-squared test
to evaluate the probability of X 2 for a sample classified
in a contingency table, fundamental in the sense that it
always introduces some error, is that frequency distributions must
always be discontinuous, while the chi-squared distribution
is continuous. As Fisher pointed out, the result of this
situation is that the use of chi-squared in the comparison
of observed frequencies with expected frequencies can
provide only an approximation of the true probability for
the observed frequency distribution (15, pp. 92-93). The
continuous distribution of chi-squared, Fisher said, is the
limit toward which the true discontinuous distribution tends
as the sample size is increased. To avoid what Fisher
called "the irregularities produced by small numbers", he
stipulated that the expected frequency for every
classification be at least five, in which case the chi-
squared distribution gives an acceptable approximation.
Yates' Continuity Correction
Should the expected frequencies be small, however, the
number of distinct values of X 2 may be very limited (4, p.
331), and the chi-squared distribution may give a poor
approximation to the exact probability. Yates (42), in
1934, suggested that a correction be applied to X2, the so-
called correction for continuity, to make the tail areas
correspond to those of the hypergeometric distribution.
Yates observed that, as the sample size increases, the
hypergeometric distribution is increasingly well
approximated by the normal distribution. If a continuous
normal random variable N has the same mean and variance as
the hypergeometric random variable H, then as sample size
approaches infinity,
\Pr(N > k - 1/2) \rightarrow \Pr(H > k) .
Using this fact as his basis, Yates proposed that the X 2
test statistic be corrected so that
X^2 = \sum \frac{(|\text{observed frequency} - \text{expected frequency}| - 0.5)^2}{\text{expected frequency}}    Eq. 4
As consideration of this formula reveals, Yates' correction
reduces the numerator, thereby reducing the value of the X 2
statistic. Many introductory statistics textbooks recommend
the use of this "corrected" statistic (10; 19).
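The computation implied by Equation 4 is simple enough to sketch in FORTRAN, the language used throughout this study; the function name, argument list, and the zero guard on small discrepancies below are illustrative assumptions rather than a reproduction of any published routine.

      REAL FUNCTION YATSQ(NF, E, NR, NC)
C     Yates-corrected chi-squared (Equation 4): each absolute
C     discrepancy is reduced by 0.5 before squaring.  A minimal
C     sketch; YATSQ and its argument list are assumed names.
      INTEGER NR, NC, I, J, NF(NR,NC)
      REAL E(NR,NC), D
      YATSQ = 0.0
      DO 20 I = 1, NR
         DO 10 J = 1, NC
C           Guard so a discrepancy already below 0.5 contributes
C           zero instead of being inflated by the squaring.
            D = MAX(ABS(REAL(NF(I,J)) - E(I,J)) - 0.5, 0.0)
            YATSQ = YATSQ + D * D / E(I,J)
   10    CONTINUE
   20 CONTINUE
      RETURN
      END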
Yates proposed the correction for use with 2 X 2 tables
having fixed marginal totals (11, p. 21). Plackett (37),
analytically, and Grizzle (18), numerically, each concluded
that Yates' correction was not appropriate for the case of
only one (rows or columns) fixed marginal. The application
of the correction to tables with more degrees of freedom was
recommended by Cochran (4, p. 334). In his "summary
recommendations," Cochran recommended that the correction be
applied to tables having between two and sixty degrees of
freedom when all expected frequencies are less than five.
No examinations of this recommendation, empirical or
theoretical, have been discovered in the literature.
Recent literature, however, does reveal an active and
ongoing debate concerning the merits of applying Yates'
correction in contingency table analysis. The debate
centers around the suggestion that if the aim of applying
the correction is to cause the X 2 statistic to adhere more
closely to the large-sample chi-squared distribution, rather
than to the hypergeometric distribution, then the use of the
correction may not be appropriate (11, p. 22). Both
Plackett (37) and Grizzle (18) have shown that using the
corrected statistic in place of the uncorrected X 2 results
in an overly conservative test. That is, it too rarely
rejects the hypothesis of independence between the
classified variables. Grizzle and Plackett supported their
claims with empirical evidence gathered from Monte Carlo
experiments on 2 X 2 contingency tables.
Support for the application of Yates' correction has
come from Mantel and Greenhouse (27). Their argument
centered around two points. First, they said that the
proper probability model to use in a 2 X 2 table is the one
with both sets of marginal totals fixed, which yields the
hypergeometric distribution function for the test statistic
X 2. Second, they said that Yates' correction improves
probability estimates for the hypergeometric distribution
except "in pathological cases, such as when the distribution
is sufficiently asymmetric." Conover (7) rejected Mantel
and Greenhouse's arguments, and he used their own data to
compare probabilities for corrected and uncorrected test
statistics to the exact probabilities calculated using
Fisher's exact probabilities test. In a comment on
Conover's article, Starmer, Grizzle, and Sen (39) criticized
Conover's use of Fisher's exact probabilities test as a
comparison standard, but they supported his rejection of
Mantel and Greenhouse's conclusions. They provided their
own test data from which they concluded that, in general,
the uncorrected test statistic resulted in a better
approximation than did the corrected statistic. To compare,
they used a randomization procedure which they attributed to
Tocher (40). Tocher claimed that his randomization
procedure, although it would not be used by most
statisticians in practice, provided the most powerful test
against one-sided alternatives when both, one, or no
marginal totals were fixed in advance. Starmer, Grizzle,
and Sen, therefore, chose it as their standard of
comparison, noting that it would allow them to "search for
the best approximation to the most powerful test" which
would not require the undesirable feature of randomization
in order to achieve the desired significance level.
In a comment on the same article by Conover, Mantel
(25) held that Conover had misused the continuity
correction. Mantel pointed out that in calculating two-tail
probabilities for the continuity corrected test statistic
Conover simply doubled the single-tail probabilities, a
practice which, although he recognized as being "almost
universal," Mantel showed to be improper. The correct
method, he proposed, was to take the two-tail probability
for the observed corrected statistic and the corresponding
two-tail probability for the next-larger possible statistic
in the opposite tail, add them, and then use half their sum
as the proper two-tail value. For single-tail testing
Mantel used half the two-tail probability. In his comment
he used Conover's data (which Conover had borrowed from
Mantel and Greenhouse) and, by performing the calculations
according to what he held to be the correct method, he
showed that the continuity corrected X 2 statistic provided
"rather excellent agreement" with the exact probability as
computed by Fisher's test. In summarizing his comment,
Mantel stated that the real issue was not whether the
continuity correction should be applied. He said that "if
we are willing to assume that we are in something like a
discretized normal situation, we should be ready to make
calculations analogous to those we would make in a true
normal situation which we have discretized" (25, p. 380).
He said Conover had shown only that "his own proposed
miscalculation" did better than "an alternate
miscalculation which he (Conover) labeled the continuity
correction."
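To illustrate the arithmetic of Mantel's prescription with invented figures: if the usual two-tail probability for the observed corrected statistic were 0.040, and the two-tail probability for the next-larger possible statistic in the opposite tail were 0.028, then Mantel's two-tail value would be (0.040 + 0.028)/2 = 0.034, and his single-tail value would be half of that, 0.017.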
The most certain conclusion that one can draw from the
literature regarding Yates' continuity correction is, as
Miettinen (32) observed, that the controversy seems likely
to continue. Miettinen noted that the dilemma arises from
the "contrast between the evidence accrued from two lines of
inquiry." First, there is the question of whether the
continuity correction tends to make the sampling
distribution of the test statistic in the null case more
consistent with the theoretical model for the distribution,
which is usually the chi-squared distribution. The second
line of inquiry, Miettinen observed, focuses on the question
of whether the probability values from the corrected test
statistic tend to agree better with the corresponding exact
probabilities. He summarized the situation by saying that
the evidence in studying the first question was against
Yates' correction, while it supported the correction in
terms of the second question. As long as the two approaches
to evaluating Yates' continuity correction are regarded as
interchangeable, Miettinen said, the confusion will exist.
Cochran's Continuity Correction
Cochran, in his 1952 paper, provided an example of a
continuity correction which he preferred (4, pp. 329-331).
He used a 2 X 4 table in which both row totals were eight
and all column totals were four. The maximum likelihood
expectations for all cells, therefore, were two. He then
constructed all tables which satisfied the fixed marginal
totals and calculated the X 2 test statistic for each. In
all, only seven different X 2 values were found. When the
exact distribution of the X 2 values and the chi-squared
approximations were compared, the agreement was not good,
with the tabular values of chi-squared being consistently
low.
Cochran suggested a correction to provide an
improvement in the fit between the two values which he
calculated for an observed X 2 value by first finding the
next smaller value of X 2 obtainable from all tables having
the same marginal totals as the observed table. The next
smaller value of X 2 represents the table having the next
higher probability of occurrence. Then, he read the chi-
squared table at a point half way between the original
observed value and this new value of X2, and he used this
probability as the corrected one.
He summarized the procedure later in the paper (4, p.
332), saying that the steps were to "compute the next
largest value of X 2 which the structure of the data
permits. Read the chi-squared table at a point halfway
between this value and the observed X2." He noted that
sometimes the next largest value of X 2 is not immediately
obvious and trial and error might be required to find it.
Conover (7) attributed this correction technique to
Kendall and Stuart (22), who recommended a similar technique
in their discussion of probabilities for discrete
distributions. Mantel (26) stated that Kendall and Stuart
probably did not intend that technique to be applied to
contingency tables because they later, in the same book,
indicated that Yates' correction was the one they
preferred. Mantel believed that it was Cochran's suggested
method which more accurately reflected Conover's correction
approach.
Mantel's Continuity Correction
Mantel, in his comment on Conover's article, gave his
suggestion for a "correct" continuity correction (25, p.
379). To repeat what Mantel called his "prescription" for
correcting the test statistic, get the "usual two-tail
probability for the observed corrected statistic and the
corresponding two-tail probability for the next-larger
possible statistic in the opposite tail, and take half their
sum." This method gave Mantel corrected test statistics
which resulted in probabilities very close to the exact
probabilities for the 2 X 2 contingency tables he examined.
In effect, Mantel was suggesting that Cochran's correction
be applied to each tail of the observed distribution
separately (20, p. 510). Haber (20) interpreted Mantel's
procedure as follows: Let Py be the P-value of the chi-
squared test with the Yates correction for the observed
table. Let P'y be the maximal P-value that can be obtained
similarly from all the other tables with the same marginal
totals, subject to the condition that P'y < Py. Then, the
actual exceedance probability is (Py + P'y)/2. Haber noted
that if two tables produced the same deviations (sum of the
squared differences between the observed frequencies and the
expected frequencies) then Mantel's corrected statistic
equaled that produced by Yates' correction. Haber tested
Mantel's correction, along with several others, as shall be
examined in the following paragraphs.
The Research of Michael Haber
In 1980 Michael Haber published the results of a study
which he had designed to compare continuity corrections to
the chi-squared test for independence in 2 X 2 contingency
tables (20). He addressed the issue which Miettinen (32)
had called the "second line of inquiry," focusing on
determining which of the corrected statistics agreed best
with the exact probability values. In addition to the
uncorrected chi-squared value, Haber compared correction
methods which had been proposed over a forty year span, from
Yates' method, proposed in 1934, to Mantel's, proposed in
1974.
Between these two was Cochran's suggested technique,
the one sometimes attributed to Kendall and Stuart. Haber
included it in his study, noting, however, that it might be
interpreted (or applied) in two different ways. Conover (7)
had used the equation
X_S^2 = (X_0^2 + X_1^2) / 2    Eq. 5
to calculate his corrected test statistic, following
Cochran's suggestion of reading the chi-squared table at a
point halfway between the observed X 2 and the next largest
X 2 obtainable from tables having the same marginal totals.
Haber noted that a different correction is achieved if the
principle is applied to X instead of X 2 by performing what
he called a "two-sided normal test" on
X_C = (X_0 + X_1) / 2 .    Eq. 6
Because the arithmetic mean of two values never exceeds
their quadratic mean, X_C^2 never exceeds X_S^2, so the
approximated probability based on X_C will be greater than
the probability based on X_S^2. Haber tested
these as two separate corrections, then, and he compared
them with Yates' and Mantel's corrections.
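Given the observed statistic and the next value attainable under the same marginal totals, the two corrections of Equations 5 and 6 reduce to a few lines of FORTRAN; the following sketch uses assumed names (X0SQ, X1SQ, CORRSC) and is not Haber's code.

      SUBROUTINE CORRSC(X0SQ, X1SQ, XSSQ, XCSQ)
C     Cochran-style corrected statistics.  X0SQ is the observed
C     X squared; X1SQ is the next attainable X squared under
C     the same marginal totals.  XSSQ implements Equation 5
C     (the S method); XCSQ is the square of X_C from Equation 6
C     (the C method).  Illustrative sketch only.
      REAL X0SQ, X1SQ, XSSQ, XCSQ, XC
      XSSQ = (X0SQ + X1SQ) / 2.0
      XC = (SQRT(X0SQ) + SQRT(X1SQ)) / 2.0
      XCSQ = XC * XC
C     XCSQ never exceeds XSSQ, so the C method always yields
C     the larger (more conservative) probability of the two.
      RETURN
      END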
In his study, Haber assumed that the marginal totals
were fixed, citing Plackett (37) and Conover (7), who had
both pointed out that the continuity correction should not
be used if one or both sets of marginal totals were random.
Haber called the probabilities obtained under this fixed
marginals condition the "unconditional" exceedance
probabilities, but he indicated that his results would be
generalizable to the more common situation, in which at
least one set of marginal totals is not determined in
advance, if the researcher were willing to accept
"conditional" probabilities. The other conditions he
established were (a) N (the sample size in the tables)
ranging from ten to ninety-nine, (b) a minimum expected
frequency of one, as determined by the maximum likelihood
method, and (c) exact probabilities, as calculated by
Fisher's exact probabilities test, lying between 0.001 and
0.1.
Using Monte Carlo techniques, Haber generated almost
150,000 different 2 X 2 tables which satisfied these
conditions. He compared the probabilities derived from the
five chi-squared based test statistics to the exact
probability for each table, using a FORTRAN computer program
written especially for that purpose. He called the
uncorrected X 2 test statistic the U method, and the Yates
correction he called the Y method. The C method was his
name for Cochran's technique applied to the X test
statistic, and the S method was his term for Cochran's
correction applied to X 2. Mantel's correction method was
called the M method. For each method, the ratio of the
probability for the calculated test statistic to the exact
probability was calculated and averaged over all the tables
in a subgroup. The subgroups were classifications of
contingency tables according to sample size, minimum
expected frequency, and exact probability.
From the results of his experiment, Haber concluded
that both the U method and the Y method were "inappropriate"
for estimating the exact probabilities. He found them to be
strongly biased, and, in almost all cases, the C, S, and M
methods produced better approximations. In the case of the
U method, Haber discovered errors in which the estimated
probability was less than one-twentieth of the exact
probability. The Y method, on the other hand, overestimated
the exact probability by a factor of four or more in some
cases. These represented the extremes of the errors, and
they occurred in tables having some expected frequencies
less than five. Even in tables having higher minimum
expectations, however, the C, S, and M methods were
consistently better estimators of the exact probabilities.
Differences among the C, S, and M methods appeared to
be related to the size of the minimum expected frequencies.
For minimum expectations between one and three, the S method
seemed considerably inferior. For minimum expectations
ranging between three and five, both the C and the S method
outperformed the M method. For larger minimum expectations
(greater than five) the three methods produced only slight
differences which, under this condition, seemed related to
the range of the exact probabilities. For exact
probabilities greater than 0.01, the three methods were
essentially equal in their estimates, but for lower exact
probabilities the C and S methods both performed better than
the M method. As Haber indicated, these statements applied
to the average of all the tables in the subgroups, but not
necessarily for the individual tables.
All five of the approximation methods gave improved
estimates as the minimum expected frequency increased, with
certain exceptions. For example, the M method gave better
results when the minimum expected frequency was in the range
of two to three than when it was between three and five.
For small minimum expected values (five or less) all
the approximations became worse as sample size increased.
Haber recognized that this occurred because a small minimum
expectation when sample size is large can occur only for
highly skewed distributions of the test statistic. He
concluded, therefore, that the minimum expected frequency
alone did not account for possible failures of a chi-squared
based method of approximation.
Haber was not able to recommend a single approximation
for all the conditions included in the tables he studied.
Instead, he suggested that the method used be chosen
according to the sample size and the value of the minimum
expected value for the observed table. As an example, he
said that if the relative error is to be no greater than 50
per cent, either the C, S, or M method might be used for
minimum expected frequencies of five or greater. For
minimum expectations between three and five, the C method
could be used, but the M method would be better for minimum
expectations between two and three with sample sizes from
ten to fifty-nine, or for minimum expectations between one
and two when the sample size fell between ten and nineteen.
In general, Haber concluded that for 2 X 2 contingency
tables with fixed marginal totals, the approximations to the
exact probabilities were "considerably improved" with the
aid of a continuity correction. The traditional correction,
Yates' correction, he found to be inadequate for performing
the two-sided test. The three alternative methods, the two
derived from Cochran and the one suggested by Mantel, gave
"satisfactory" results, provided that the ratio of the
minimum expected frequency to the sample size exceeded 0.1.
In summary, Haber did not settle the ongoing debate
pitting Conover, Grizzle, and Plackett, who held that the
uncorrected X 2 test statistic was better, against Mantel and
Greenhouse, who supported the use of Yates' correction.
Actually, Haber refuted both claims when he concluded that
both the uncorrected statistic and the statistic corrected
by Yates' method were inappropriate, at least for the tables
included in his study. At best, Mantel might have felt some
support for his position, since Haber's M method was
actually Mantel's interpretation of the "correct" way to
apply Yates' correction technique.
Haber's general experimental model served as the
pattern for the present study. The goal here was to extend
Haber's comparisons to tables with dimensions of 3 X 3 , 3 X
4, and 3 X 5 . Additionally, no lower limit was placed on
the value of the minimum expected frequency, except that it
was to be greater than zero. Sample sizes of ten, twenty,
and thirty were used so as to more realistically simulate
research experiments in education and psychology.
CHAPTER BIBLIOGRAPHY
1. Bishop, Yvonne M. M., Stephen E. Fienberg, and Paul W. Holland, Discrete Multivariate Analysis, Cambridge, Massachusetts, The MIT Press, 1975.
2. Boulton, D. M. and C. S. Wallace, "Occupancy of a Rectangular Array," Computer Journal, XVI (January, 1973), 57-63.
3. Bradley, James V., Distribution-Free Statistical Tests, Englewood Cliffs, New Jersey, Prentice-Hall, Inc., 1968.
4. Cochran, William G., "The Chi-Squared Test of Goodness of Fit," Annals of Mathematical Statistics, XXIII (Spring, 1952), 315-345.
5. Cohen, Jacob, Statistical Power Analysis for the Behavioral Sciences, New York, Academic Press, 1969.
6. Conover, W. J., Practical Nonparametric Statistics, New York, John Wiley and Sons, Inc., 1971.
7. _______, "Some Reasons for Not Using the Yates Continuity Correction on 2 X 2 Contingency Tables," Journal of the American Statistical Association, LXIX (June, 1974), 374-376.
8. Cox, M. A. A. and R. L. Plackett, "Small Samples in Contingency Tables," Biometrika, LXVII (January, 1980), 1-13.
9. Cramer, H., Mathematical Methods of Statistics, Princeton, New Jersey, Princeton University Press, 1946.
10. Everitt, B. S., The Analysis of Contingency Tables, London, Chapman and Hall, 1977.
11. Fienberg, Stephen E., The Analysis of Cross-Classified Categorical Data, Cambridge, Massachusetts, The MIT Press, 1977.
12. Fisher, Ronald A., "The Conditions Under Which Chi Square Measures the Discrepancy Between Observation and Hypothesis," Journal of the Royal Statistical Society, LXXXVII (Winter, 1924), 442-450.
13. _______, The Design of Experiments, Edinburgh, Oliver and Boyd, 1935.
14. _______, "The Significance of Deviations from Expectation in a Poisson Series," Biometrics, VI (Spring, 1950), 17-24.
15. _______, Statistical Methods for Research Workers, 14th ed., New York, Hafner Publishing Company, 1973.
16. Garside, G. R. and C. Mack, "Actual Type 1 Error Probabilities for Various Tests in the Homogeneity Case of the 2 X 2 Contingency Table," The American Statistician, XXX (February, 1976), 18-21.
17. Greenwood, M. and G. U. Yule, "The Statistics of Anti-Typhoid and Anti-Cholera Inoculations and the Interpretation of Such Statistics in General," Proceedings of the Royal Society of Medicine, VIII (Spring, 1915), 113-190.
18. Grizzle, James E., "Continuity Correction in the Chi-Squared Test for 2 X 2 Tables," The American Statistician, XXI (October, 1967), 28-32.
19. Guilford, J. P., Fundamental Statistics in Psychology and Education, 4th ed., New York, McGraw Hill, 1965.
20. Haber, Michael, "A Comparison of Some Continuity Corrections for the Chi-Squared Test on 2 X 2 Tables," Journal of the American Statistical Association, LXXV (September, 1980), 510-515.
21. Hancock, T. W., "Remark on Algorithm 434," Communications of the Association for Computing Machinery, XVIII (February, 1975), 117-119.
22. Kendall, Maurice G. and Alan Stuart, The Advanced Theory of Statistics, Vol. 2, 2nd ed., New York, Hafner Publishing Company, 1967.
23. Lancaster, H., The Chi Squared Distribution, New York, John Wiley and Sons, 1969.
24. Lewontin, R. C. and J. Felsenstein, "The Robustness of Homogeneity Tests in 2 X N Tables," Biometrics, XXI (March, 1965), 19-33.
25. Mantel, Nathan, "Comment and a Suggestion," Journal of the American Statistical Association, LXIX (June, 1974), 378-380.
26. _______, "The Continuity Correction," The American Statistician, XXX (May, 1976), 103-104.
27. _______ and Samuel W. Greenhouse, "What Is the Continuity Correction?" The American Statistician, XXII (December, 1968), 27-30.
28. March, David L., "Algorithm 434: Exact Probabilities for R x C Contingency Tables," Communications of the Association for Computing Machinery, XV (November, 1972), 991-992.
29. McNemar, Quinn, Psychological Statistics, 3rd ed., New York, John Wiley and Sons, 1962.
30. Mehta, Cyrus R. and Nitin R. Patel, "A Network Algorithm for Performing Fisher's Exact Test in r x c Contingency Tables," Journal of the American Statistical Association, LXXVIII (June, 1983), 427-434.
31. _______, "A Hybrid Algorithm for Fisher's Exact Test in Unordered R X C Contingency Tables," Communications in Statistics - Theory and Methods, XV (April, 1986), 387-403.
32. Miettinen, Olli S., "Comment," Journal of the American Statistical Association, LXIX (June, 1974), 380-383.
33. Mosteller, Frederick, "Association and Estimation in Contingency Tables," Journal of the American Statistical Association, LXIII (January, 1968), 1-28.
34. Pagano, M. and K. Halvorsen, "An Algorithm for Finding the Exact Significance Levels of r x c Contingency Tables," Journal of the American Statistical Association, LXXVI (November, 1981), 931-934.
35. Pearson, Karl, "On the Criterion That a Given System of Deviations From the Probable in the Case of a Correlated System of Variables Is Such That It Can Be Reasonably Supposed to Have Arisen From Random Sampling," Philosophical Magazine, Series 5, L (Spring, 1900), 157-172.
36. _______, On the Theory of Contingency and Its Relation to Association and Normal Correlation, London, Drapers' Company, 1904.
37. Plackett, R. L., "The Continuity Correction in 2 x 2 Tables," Biometrika, LI (May, 1964), 327-337.
38. Reynolds, Henry T., The Analysis of Cross-Classifications, New York, The Free Press, 1977.
39. Starmer, C. Frank, James E. Grizzle, and P. K. Sen, "Comment," Journal of the American Statistical Association, LXIX (June, 1974), 376-378.
40. Tocher, K. D., "Extension of the Neyman-Pearson Theory of Tests to Discontinuous Variates," Biometrika, XXXVII (February, 1950), 130-144.
41. Upton, Graham J. G., "A Comparison of Alternative Tests for the 2 x 2 Comparative Trial," Journal of the Royal Statistical Society, CXLV (Spring, 1982), 86-105.
42. Yates, Frank, "Contingency Tables Involving Small Numbers and the Chi-Squared Test," Journal of the Royal Statistical Society, Series B, Supp. Vol. 1, II (Spring, 1934), 217-235.
CHAPTER III
PROCEDURES
Introduction
The simulation study described here is an extension of
the research of Michael Haber which was described in the last
chapter. Mehta and Patel (5) suggested such an extension,
stating that algorithms for Fisher's exact probabilities test
for the general R X C contingency table eliminated the
difficulty usually associated with performing that test in
tables larger than 2 X 2 . Additionally, they pointed out
that their own "network" algorithm overcame the disadvantage
of long CPU times required by March's (3) algorithm and the
modifications to it.
This extension examines not only the effects of larger
dimensions in the contingency tables, but also addresses the
questions of small sample sizes and small expected
frequencies. In Chapter I these questions were stated as
follows.
1. What is the effect of the small sample size on the
chi-squared based statistics in the two-way contingency
tables used in the study?
2. Does a small expected frequency (0.05, for example)
in the contingency table influence the accuracy of the test
statistic?
3. Is there any pattern or trend indicated in the
accuracy of the probability based on the chi-squared
statistic as compared to the exact probability as the table
dimensions increase from 3 X 3 to 3 X 4 to 3 X 5?
The approach Haber used in studying 2 X 2 contingency
tables was adopted for the present study; that is,
contingency tables simulated using a hypergeometric sampling
technique were tested for independence using the classical
chi-squared statistic, chi-squared corrected by Yates'
method, by two methods attributed to Cochran, and by a method
suggested by Mantel. The probability of independence as
indicated by each of these five statistics was compared to
the exact probability according to Fisher's exact
probabilities test.
The purpose of this chapter is to describe the
procedures used both to generate data for the simulation
study and to determine the statistics of independence. The
data in the contingency tables used in this study were
generated using Monte Carlo techniques. A computer program,
XTAB, was written especially for this purpose, using the
FORTRAN programming language. The program was compiled and
run on a Leading Edge Model D MS-DOS personal computer which
was equipped with a math coprocessor for extended precision.
The program also applied the various tests of independence to
the contingency tables, then analyzed and categorized the
results. This chapter is a description of the XTAB program
and the procedures which were used to verify its operation.
The chapter is organized into major sections covering the
main routine, the random number generators, and the
individual subroutines.
Notation
A system of notation for describing contingency tables,
their cell contents, and their marginal totals has been
developed and employed in the existing literature. Since
that notation is used throughout this chapter it is described
here.
Contingency tables are composed of cells arranged into
rows and columns. A table with i rows and j columns is
called an i X j table. Its rows are numbered from one to i,
with r_1 representing the first row and r_i representing the
last. Similarly, the first column is c_1 and the last column
is c_j. The observed frequencies are identified by the cells
into which they are classified. For example, the middle cell
in the first row of a 3 X 3 table contains frequency f_12,
where 12 indicates the first row, second column. In general,
frequencies are represented by f_ij. Expected frequencies are
represented by e_ij in this chapter.
The total of all the frequencies in row i of an R X C
contingency table is symbolized by r_i., so

r_{i.} = \sum_{j=1}^{c} f_{ij} .

Column totals for the r X c table likewise are represented by
the dot notation; the sum of frequencies in column j is c_.j,
where

c_{.j} = \sum_{i=1}^{r} f_{ij} .
In every case, the sample size is represented by N.
The Simulation Structure
A modular program organization was used, with a main
routine which called nine different subroutines at various
points in its execution. In addition, some of the
subroutines themselves called other subroutines. Some
sections of the program, especially in the subroutines, were
modified from routines published in the literature for public
use, but most of it was written originally for the present
study. Listings of the main routine and each subroutine are
included in the appendices.
The Main Routine
The main routine began by taking care of some required
housekeeping chores, including declaring variable types,
establishing output file specifications, setting common data
values, selecting sample size, and dimensioning arrays. It
was necessary to modify the array column dimension as the
experiment progressed from the 3 X 3 contingency tables to
the 3 X 4 and 3 X 5 tables. Only the column dimension had to
be altered because the number of rows was always three. This
was done manually, although the program could have been
constructed to accomplish the redimensioning automatically.
Sample size was also modified manually from table to table.
The decision to enter these changes from the keyboard
rather than to generate them automatically was influenced
both by the length of time the program required for execution
and by the desire to keep the program as readable as
possible. The same general procedures were repeated for each
of the nine combinations of table dimension and sample size
used in the simulation. Preliminary experiments had shown
that the shortest time required for any of the nine
combinations was approximately one hour. The longest had
been estimated to take more than 120 hours. Because of these
lengths, the decision was made to run the program nine times,
modified each time for the particular combination of sample
size and table dimension. This allowed more economical use
of the computing facilities and guarded against electrical
interruptions which might have halted the execution of a
single longer program. Each time the program was run, the
array dimensions and sample size were adjusted from the
keyboard.
Another task handled by the main routine was the
reformatting of the contingency tables before certain of the
subroutines were called. In some subroutines, row and column
marginal totals were required. These were passed to the
subroutine by combining them with the contingency table data
to form an array having one more column and one more row than
the contingency table. The extra row and column held the
marginal totals and the sample size.
Once all the tests on the contingency table data had
been performed, the main routine categorized each table
according to the magnitude of its exact probability (less
than or greater than 0.5), as determined by Fisher's exact
probabilities test, and by the size of the table's smallest
expected value. This procedure was repeated 2,500 times for
each combination of table dimension and sample size.
After the 2,500 tables had been analyzed and
categorized, the main routine prepared the results for
tabulation. One of the categories by which the individual
tables were sorted was the minimum value of the expected
frequencies calculated by the maximum likelihood formula,
EV_{ij} = \frac{(r_{i.})(c_{.j})}{N}    Eq. 7
First, two categories were used: expected frequencies less
than 0.5 and those equal to or greater than 0.5. Then,
because the ranges of these values varied from one
combination of table dimension and sample size to another,
these categories also were adjusted manually for nine
repeated executions of the program. Table I shows the range
of expected values for each of the nine combinations used in
the simulation.
TABLE I
RANGES OF MINIMUM EXPECTED FREQUENCIES CLASSIFIED ACCORDING TO TABLE DIMENSION AND
SAMPLE SIZE
DIMENSION     SAMPLE SIZE 10    SAMPLE SIZE 20    SAMPLE SIZE 30
3 X 3         0.1 - 0.9         0.05 - 1.8        0.033 - 3.33
3 X 4         0.1 - 0.6         0.05 - 1.5        0.033 - 2.33
3 X 5         0.1 - 0.6         0.05 - 1.2        0.033 - 2.00
In every case the lower end of the range equals the
reciprocal of the sample size. This is a result of the
formula for maximum likelihood expectations, Equation 7, in
which the product of the row and column total is divided by
the sample size. Since the minimum row and column values
are always one, the corresponding product is also one, and
the minimum expected frequency is the reciprocal of the
sample size. The largest possible minimum expected
frequencies are determined by dividing the sample size as
nearly equally as possible among the three rows and among
the three, four, or five columns and then applying the
maximum likelihood formula, using the smallest row total
and the smallest column total to form the numerator
product. For example, for a 3 X 4 table with sample size
ten, the most nearly equal row totals are three, three, and
four. The most nearly equal column totals are two, two,
three, and three. The minimum expected frequency in such a
table is ( 3 X 2 ) / 10, or 0.6. It is not possible to
obtain a minimum expected frequency any greater than 0.6 in
a 3 X 4 table with sample size ten.
By manually altering the category limits, the main
routine was adjusted to provide a better distribution of
the 2,500 tables among the categories of minimum expected
frequency, thereby allowing more precise resolution of
differences between the various tables. As its final task,
then, the main routine controlled the printout of the
results. Appendix A contains a listing of the main FORTRAN
routine adjusted for the largest contingency table, the 3 X
5 table with sample size thirty.
The Subroutines
The nine subroutines called by the main routine are
briefly identified here. A more complete description of
each one is given later in this chapter.
MART. Subroutine MART invoked a random number
generator subprogram to produce the fixed marginal totals
for each contingency table.
EV. This subroutine used the maximum likelihood
method to calculate the expected frequencies for the cells
of the contingency table being simulated. It also
determined the minimum expectation for the table.
RCONT2. Filling the contingency tables according to
the limits set by the row and column totals and sample size
was accomplished by RCONT2. It, too, used a random number
generator subprogram.
CHISQ. This subroutine calculated the Pearson chi-
squared statistic for the contingency table generated by
RCONT2. CHISQ was also called upon by other subroutines to
compare secondary tables to the original table.
PVAL. Subroutine PVAL was called by several other
program modules to compute the probability value for the
uncorrected chi-squared and the corrected chi-squared
statistics.
RXCPRB. This was the most extensive subroutine of the
nine. Its main task was to compute Fisher's exact
probability for each contingency table. Because it
generated all other more extreme tables meeting the same
marginal restrictions as a part of the process, it was used
to provide data for calculating the continuity corrected
chi-squared statistics of Cochran and Mantel. It was this
subroutine which accounted for the long execution time of
the overall program.
COCHR. Calculations of the two continuity corrections
to chi-squared as proposed by Cochran were accomplished by
this routine.
YATES. This module was used to calculate Yates'
correction to chi-squared for the original contingency
table.
RATIOS. Subroutine RATIOS was called by the main
routine to compare the probability of each of the chi-
squared statistics, corrected and uncorrected, to the exact
probability calculated by Fisher's exact probabilities
test. It generated the data tabulated in the main
routine's final output.
Random Number Generation
Randomness is an essential feature in any Monte Carlo
simulation. Two different random number generators, both
written as FORTRAN function subprograms, were employed in
the present study. A listing of each is given in Appendix
B.
The integer function IRAND was accessed by subroutine
MART to generate the fixed row and column totals for each
contingency table. IRAND was a modification of a random
number generator published in a textbook by Nanney (6, pp.
181-182). Subroutine MART determined the minimum and
maximum values of the particular marginal total to be
generated, then it passed these values and a seed number to
IRAND. IRAND began by generating a pseudo-random number
greater than zero and less than one using the
multiplicative congruential method, a widely used method
whose properties have been extensively studied. When the
multiplicative congruential method is implemented in
FORTRAN, intrinsic function MOD is used to return the
remainder of an arithmetic division operation. In IRAND,
the MOD function was invoked after the seed number had been
used with two constants in a sequence of arithmetic
operations (simple multiplication and addition). The
result of this sequence of operations was divided by
another constant, and the remainder of that division was
used to calculate the random fraction between zero and
one. This random fraction was in turn used to calculate an
integer result falling between the maximum and minimum
limits specified by subroutine MART.
The seed number passed by subroutine MART to IRAND was
used as a reference, then, for all the arithmetic
operations used in the random number generator. The
generator returned a different number every time it was
invoked because it, in the process of calculating the
random result, also altered the seed number which would be
used the next time it was invoked. Some argument-changing
procedure is the heart of almost all random number
generator routines. The results are called pseudo-random
because the entire sequence of generated numbers will be
repeated if the FORTRAN routine is restarted using the same
original seed number.
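A generator of the family just described can be sketched as follows; the constants below are a small illustrative choice from the textbook literature, not necessarily those used in Nanney's routine or in IRAND, and the name IRND is likewise assumed.

      INTEGER FUNCTION IRND(LO, HI, ISEED)
C     Pseudo-random integer between LO and HI inclusive.  The
C     seed is updated by the mixed congruential rule
C     ISEED = MOD(MA*ISEED + MC, MM); the constants are
C     illustrative only.
      INTEGER LO, HI, ISEED, MA, MC, MM
      PARAMETER (MA = 106, MC = 1283, MM = 6075)
      REAL R
      ISEED = MOD(MA * ISEED + MC, MM)
C     Random fraction strictly between zero and one, then
C     scaled onto the requested integer range.
      R = (REAL(ISEED) + 0.5) / REAL(MM)
      IRND = LO + INT(R * REAL(HI - LO + 1))
      RETURN
      END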
The second random number generator used in this study
was also written as a FORTRAN function subprogram. It was
accessed by subroutine RCONT2 to obtain pseudo-random
numbers between zero and one. RCONT2 used these numbers to
determine the frequencies in the contingency tables, given
the row and column totals. The function was called RANDOM,
and it was based on a routine published by Wichmann and
Hill (9) with modifications suggested by McLeod (4).
Wichmann and Hill developed their algorithm to
overcome some recognized disadvantages of other pseudo-
random number generators, namely their non-randomness at
the extremes of their distributions and their slow
execution times. Three simple multiplicative congruential
generators were used, each having a prime number for its
modulus and a primitive root for its multiplier. The three
results were added, and the fractional part was taken as
the random result. Wichmann and Hill showed that this sum
was rectangularly distributed and therefore statistically
satisfactory for generating a pseudo-random sequence.
However, McLeod noted that in some machines rounding errors
could produce zero values. Since the results were intended
always to be greater than zero (and less than one), McLeod
suggested a simple modification to guard against illegal
results. This modification was incorporated in the
function RANDOM used in the present study.
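The published AS 183 recurrences are compact enough to reproduce in outline; the floor applied to a rounded-to-zero result reflects the spirit of McLeod's modification, though its exact form here is an assumption.

      REAL FUNCTION RNDWH(IX, IY, IZ)
C     Wichmann-Hill AS 183: three small multiplicative
C     congruential generators whose scaled sum, taken modulo
C     one, is the returned uniform deviate.  The seeds IX, IY,
C     IZ (each between 1 and 30000) are supplied and updated
C     by the caller.
      INTEGER IX, IY, IZ
      IX = MOD(171 * IX, 30269)
      IY = MOD(172 * IY, 30307)
      IZ = MOD(170 * IZ, 30323)
      RNDWH = MOD(REAL(IX) / 30269.0 + REAL(IY) / 30307.0 +
     &            REAL(IZ) / 30323.0, 1.0)
C     Guard against a rounding-induced zero, after McLeod; the
C     particular floor value here is illustrative.
      IF (RNDWH .LE. 0.0) RNDWH = 0.5E-6
      RETURN
      END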
Both IRAND and RANDOM were tested before they were
incorporated into the overall program. In the test, random
numbers were generated and classified into one of
several categories. A chi-squared test was performed to
assure that the generated numbers were rectangularly-
distributed. After approximately 2,500 cycles each of the
routines' random outputs demonstrated an acceptable fit to
the rectangular distribution.
Subroutine Descriptions
Subroutine MART
MART, the first subroutine called during program
execution, was written to randomly select marginal totals
for a contingency table. The main routine passed to MART
the sample size and the number of columns used in the
table. The number of rows was always three. The
subroutine returned two vectors, NROWT and NCOLT, the row
and column totals, to the main routine. The random number
function subprogram IRAND was accessed by MART to obtain
the random vector elements. Appendix C is a FORTRAN
listing of subroutine MART.
Choosing row totals.—The row totals were selected
first. Because every table had three rows, and since a
restriction was that every row would have at least one non-
zero entry, the total for the first row had a maximum value
of two less than the sample size, N, and its minimum value
was one. That is,
1 \le r_{1.} \le (N - 2) .
For example, if the sample size were ten, then the first
row total was selected from the range of one through eight,
inclusive. Even if the maximum (eight, in this example)
were selected, each of the other two rows could still have
totals equal to one, thereby meeting the non-zero
requirement.
Once the first row total had been determined, the
second row was selected similarly. The maximum value for
the second row, however, had to be adjusted to account for
the number of entries used in the first row. So,
1 \le r_{2.} \le (N - 1 - r_{1.}) .
The expression on the right-hand side of this relation
shows the maximum value for the second row total, and it
implies that at least one entry must be reserved for the
third row (by subtracting one) first. Then, all entries
not used in the first row could possibly be used in the
second. Again, using sample size ten as an example, if the
first row total had been selected to be four, then the
maximum second row total would be five, leaving one for the
third row.
Calculating the last row total was straightforward
since row three contained all entries not used in rows one
and two. A simple subtraction produced the value:
r_{3.} = N - r_{1.} - r_{2.} .
Again, the standard dot notation is used to indicate
summation across all values of the dotted subscript (the
second, or column value, in this case). For a sample size
equal to ten, thirty-six arrangements of the digits one
through eight were possible for representing the row
totals. MART used the procedures outlined here to select
one of the thirty-six arrangements and assigned the chosen
values to the vector NROWT. A chi-squared test was used to
check for the randomness of the selections, and it verified
that each arrangement had equal probability of being chosen
when 2,500 selections were made.
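In code, the selection reduces to two bounded draws and a subtraction; the sketch below uses the illustrative IRND function given earlier, and ROWTOT is an assumed name rather than the MART listing.

      SUBROUTINE ROWTOT(N, NROWT, ISEED)
C     Randomly select three row totals that sum to N, each at
C     least one, following the two-stage scheme described in
C     the text.  Illustrative sketch only.
      INTEGER N, NROWT(3), ISEED, IRND
      NROWT(1) = IRND(1, N - 2, ISEED)
      NROWT(2) = IRND(1, N - 1 - NROWT(1), ISEED)
      NROWT(3) = N - NROWT(1) - NROWT(2)
      RETURN
      END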
Choosing column totals.—A procedure similar to the
one used to calculate row totals was employed to select the
column totals. However, because the number of columns was
varied from three to four to five, the algorithm had to
take this variation into account and adjust the maximum
allowable column total accordingly. For example, for a
sample size of ten, the maximum column total value allowed
in a 3 X 5 contingency table was six. For the same sample
size, a 3 X 4 table could have a maximum column total value
of seven, and a 3 X 3 table could have eight. These limits
ensured that the minimum column total was always one. The
selected arrangement of column total values was returned to
the main routine in vector NCOLT.
Subroutine EV
Subroutine EV was written to calculate expected
frequencies and to determine the minimum expected value for
each contingency table. Four items of information were
provided by the main routine as inputs to EV. These were
sample size, the vector of row totals NROWT, the vector of
column totals NCOLT, and the number of columns. The
subroutine used these data to calculate the maximum
likelihood expected frequencies for each contingency table
cell. The calculation formula was
(r. } (c .) o — 1 * • J eij — N
where e ^ is the expected frequency in cell of row i and
column jr r ^ is the row total for row i, c ^ is the column
total for column j, and N is the sample size. This
calculation formula was previously given as Equation 7.
A nested loop structure was employed to find the
expected value for each cell in the first row, then in the
second row, and finally in the third row. As each value
was calculated it was assigned to a new matrix called
EXVAL. The matrix EXVAL was returned to the main routine
for further use.
Subroutine EV also returned the value of the smallest
expected frequency in each contingency table. This value
was determined by declaring the first calculated
expectation to be the minimum, and then comparing each
succeeding calculation with the minimum. If the most
recent value were smaller than the minimum, it was declared
to be the minimum, and the process was continued until
all expected frequencies had been compared. The minimum
expected frequency was called EVMIN and was passed back to
the main routine where it was used to classify the
contingency table for the analysis of the final results.
The FORTRAN listing of subroutine EV is given in Appendix
D.
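The logic is a direct transcription of Equation 7 with a running minimum; a sketch follows (EVCALC is an assumed name, used to avoid implying it is the Appendix D listing).

      SUBROUTINE EVCALC(N, NROWT, NCOLT, NC, EXVAL, EVMIN)
C     Maximum likelihood expected frequencies (Equation 7) for
C     a three-row table, together with the smallest of them.
C     Illustrative sketch.
      INTEGER N, NC, NROWT(3), NCOLT(NC), I, J
      REAL EXVAL(3,NC), EVMIN
C     No expectation can exceed N, so N is a safe starting
C     minimum.
      EVMIN = REAL(N)
      DO 20 I = 1, 3
         DO 10 J = 1, NC
            EXVAL(I,J) = REAL(NROWT(I) * NCOLT(J)) / REAL(N)
            IF (EXVAL(I,J) .LT. EVMIN) EVMIN = EXVAL(I,J)
   10    CONTINUE
   20 CONTINUE
      RETURN
      END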
Subroutine RCONT2
Subroutine RCONT2 is listed in Appendix E. It was
used to generate the "observed" frequencies for the
contingency tables in the study. RC0NT2 was published as
AS 159 by Patefield (7), and it was used without
modification in this study.
As inputs RCONT2 required NROWT and NCOLT, the vectors
of marginal totals generated by subroutine MART. It also
was given the number of rows and columns in the contingency
table. Restrictions on these inputs were that at least two
rows and two columns were necessary, and all marginal
totals had to be positive. As published by Patefield,
RCONT2 was limited to samples of size 5,000, but this was
adjustable by changing two lines in Patefield's FORTRAN
coding.
RCONT2 returned the randomly generated contingency
table, which it called MATRIX, to the main routine. It
also returned a logical variable, KEY, which it used on
subsequent calls, and a fault indicator, IFAULT, which
reported violations of the input restrictions on rows,
columns, and marginal totals. In the process of generating
the "observed" frequencies, RC0NT2 used the random number
generator function RANDOM.
Subroutine CHISQ
This subroutine was written for this study to
calculate the chi-squared statistic for the contingency
tables, both those analyzed in the experiment and those
related tables generated in order to compute some of the
continuity corrections to chi-squared. The calling routine
supplied CHISQ with the table to be evaluated, the matrix
of expected values (EXVAL), and the number of columns.
CHISQ used Equation 3, the well-known formula
X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}
to calculate the chi-squared test statistic. The summation
extends over all the cells in the contingency table. CHISQ
returned this statistic to the calling routine. Appendix F
shows the FORTRAN listing for CHISQ.
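The corresponding loop is equally brief; CHISQF below is an assumed name, not the Appendix F listing.

      REAL FUNCTION CHISQF(NF, EXVAL, NC)
C     Pearson chi-squared statistic (Equation 3) for a 3 X NC
C     table of observed counts NF and expected values EXVAL.
C     Illustrative sketch.
      INTEGER NC, I, J, NF(3,NC)
      REAL EXVAL(3,NC), D
      CHISQF = 0.0
      DO 20 I = 1, 3
         DO 10 J = 1, NC
            D = REAL(NF(I,J)) - EXVAL(I,J)
            CHISQF = CHISQF + D * D / EXVAL(I,J)
   10    CONTINUE
   20 CONTINUE
      RETURN
      END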
Subroutine PVAL
In most instances, once a chi-squared statistic had
been calculated it was necessary to determine the
probability of the distribution of observed frequencies
from which it was derived. Subroutine PVAL was called to
make this determination. Poole and Borchers published a
program written in BASIC designed to accomplish this
probability calculation (8, pp. 130-132). PVAL was
essentially a FORTRAN translation of Poole's and Borchers'
routine. The BASIC program required the chi-squared value
and the number of degrees of freedom as inputs, but PVAL
was modified somewhat to fit the needs of this study.
Instead of being passed the number of degrees of freedom,
it was given the number of columns in the contingency
table. Since the number of rows was always three, the
number of degrees of freedom was easily determined by
applying the formula
df = (r - 1)(c - 1) ,
where r represented the number of rows and c was the number
of columns. The probability calculation formula used by
Poole and Borchers for an odd number of degrees of freedom
was
P = 1 - \frac{(X^2)^{(v+1)/2} e^{-X^2/2}}{1 \cdot 3 \cdot 5 \cdots v} \left( \frac{2}{X^2 \pi} \right)^{1/2} Z ,    Eq. 8

and for an even number of degrees of freedom they used

P = 1 - \frac{(X^2)^{v/2} e^{-X^2/2}}{2 \cdot 4 \cdots v} Z ,    Eq. 9

where v represented the number of degrees of freedom, and

Z = 1 + \sum_{m=1}^{\infty} \frac{(X^2)^m}{(v+2)(v+4) \cdots (v+2m)} .
These were incorporated into PVAL to compute the
probability value to a precision of approximately 10^-7.
PVAL is listed in Appendix G.
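One way to evaluate Equations 8 and 9 iteratively is sketched below: the leading coefficient is built up factor by factor, and the series Z is summed until a term falls below 10^-7. The function name and details are assumptions, not the Appendix G code.

      REAL FUNCTION PVALF(X2, V)
C     Upper-tail probability of chi-squared with V degrees of
C     freedom, from the series of Equations 8 and 9 (after
C     Poole and Borchers).  Illustrative sketch.
      REAL X2, C, Z, TERM
      INTEGER V, K, M
      IF (X2 .LE. 0.0) THEN
         PVALF = 1.0
         RETURN
      END IF
C     Leading coefficient: start at v = 1 or v = 2, then
C     multiply by X2/k for k = 3, 5, ... or 4, 6, ... up to V.
      IF (MOD(V, 2) .EQ. 1) THEN
         C = SQRT(2.0 * X2 / 3.1415927) * EXP(-X2 / 2.0)
         K = 1
      ELSE
         C = X2 * EXP(-X2 / 2.0) / 2.0
         K = 2
      END IF
   10 IF (K .LT. V) THEN
         K = K + 2
         C = C * X2 / REAL(K)
         GO TO 10
      END IF
C     Z = 1 + sum over m of X2**m / ((v+2)(v+4)...(v+2m)).
      Z = 1.0
      TERM = 1.0
      M = 0
   20 M = M + 1
      TERM = TERM * X2 / REAL(V + 2 * M)
      Z = Z + TERM
      IF (TERM .GT. 1.0E-7) GO TO 20
      PVALF = 1.0 - C * Z
      IF (PVALF .LT. 0.0) PVALF = 0.0
      RETURN
      END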
Subroutine RXCPRB
Subroutine RXCPRB was the procedure used to perform
Fisher's exact probabilities test on each simulated
contingency table in the study. The FORTRAN listing in
Appendix H shows that the original version of RXCPRB
published by Hancock (2) was modified for use in this
study.
Hancock's version of RXCPRB was based on March's
algorithm (3) called CONP. Hancock's changes almost
doubled the speed of the procedure for 3 X 3 and larger
contingency tables. Although the network algorithm of
Mehta and Patel (5) was even faster in the computation of
Fisher's exact probabilities test, RXCPRB was chosen
because it generated the related contingency tables needed
for performing Cochran's and Mantel's chi-squared
continuity corrections. RXCPRB did this by calling a
subroutine, INIT, a required part of the Fisher's exact
probabilities algorithm. Each call to INIT
returned a new contingency table to RXCPRB, one which fit
the marginal constraints of the original table. RXCPRB
evaluated each of these new tables to determine whether or
not its frequency distribution was more extreme than that
of the original table. If it was, it was used in
calculating the exact probability of the original table.
Since Cochran's continuity corrections required
evaluation of the contingency table having the next less
extreme frequency distribution than the original table, it
was convenient to modify RXCPRB so that each of the tables
returned by INIT was tested to see if it were the one
meeting this requirement. When this table was found, it
was stored in a special matrix and later returned to the
main routine for use with another subroutine which
calculated Cochran's corrections. This subroutine is
described later.
RXCPRB was also modified to include steps to calculate
Mantel's correction to chi-squared. As Haber (1)
explained, Mantel's correction was found by averaging
Yates* corrected chi-squared value with the one from the
contingency table having the next smaller Yates' corrected
chi-squared, but with the same marginal totals. Yates'
corrected chi-squared value was found for each of the
tables returned by INIT; these were compared to the value
for the original table, and the arithmetic was performed
when the appropriate value was found. RXCPRB returned
Mantel's corrected chi-squared value to the main routine
when it terminated. The actual procedures for performing
Yates' correction were included in another subroutine which
is described later.
The method employed in RXCPRB for calculating the
probability for an observed frequency distribution in a
given contingency table was described in Chapter II. It
used the relationship based on the hypergeometric
distribution that
P_x = \frac{\prod_{i=1}^{r} (r_{i.}!) \prod_{j=1}^{c} (c_{.j}!)}{N! \prod_{i=1}^{r} \prod_{j=1}^{c} (x_{ij}!)} ,
where r_{i.} and c_{.j} were the row and column totals,
respectively, N was the sample size, and x_{ij} was the individual cell
frequency. This relationship was previously given as
Equation 2. Fisher's exact probabilities test required
that this probability be calculated for the observed table
and for all other tables having the same marginal totals
but more extreme (and therefore less probable) frequency
distributions. The sum of these probabilities gave the
exact probability for the observed table.
Subroutine RXCPRB calculated the factorials by
accessing a function subprogram called FACLOG. This
subprogram generated a table of the logarithms of the
factorials of the numbers 0 through 100. For numbers
greater than 100, FACLOG used Stirling's approximation, a
well-known mathematical technique for approximating the
logarithm of the factorial. In this study there were no
numbers greater than 100 because the largest sample size
used was 30. The use of logarithms in RXCPRB kept the
numerator and the denominator of the probability equation
small enough to avoid exceeding the floating-point range of
the computer. The probability was finally determined
by taking the anti-logarithm once the numerator and
denominator had been evaluated.
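The log-factorial device can be illustrated directly against Equation 2; every name in the sketch below is assumed, and Stirling's approximation is omitted because no factorial beyond 100! was ever required in this study.

      SUBROUTINE TABPRB(NF, NROWT, NCOLT, NR, NC, N, PX)
C     Probability of one table under Equation 2, computed with
C     logarithms of factorials so that neither the numerator
C     nor the denominator is ever formed explicitly.
      INTEGER NR, NC, N, NF(NR,NC), NROWT(NR), NCOLT(NC)
      INTEGER I, J, K
      REAL PX, FLOG(0:100), S
C     Table of log(K!) for K = 0, ..., 100.
      FLOG(0) = 0.0
      DO 10 K = 1, 100
   10 FLOG(K) = FLOG(K-1) + LOG(REAL(K))
C     Sum of the logarithms in Equation 2.
      S = -FLOG(N)
      DO 20 I = 1, NR
   20 S = S + FLOG(NROWT(I))
      DO 30 J = 1, NC
   30 S = S + FLOG(NCOLT(J))
      DO 50 I = 1, NR
         DO 40 J = 1, NC
   40    S = S - FLOG(NF(I,J))
   50 CONTINUE
      PX = EXP(S)
      RETURN
      END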
Subroutine RXCPRB was quite slow in its execution. In
general, it would no longer be used for Fisher's exact
probabilities test because of the availability of algorithms
like Mehta's and Patel's. However, it was appropriate for
this experiment because enumeration of the related
contingency tables was useful, and even necessary, for
other steps in the study. The FORTRAN listing of RXCPRB in
Appendix H includes subroutines INIT and MATFIX and
function FACLOG. These subprograms were used only by
RXCPRB and were therefore considered a part of it.
Subroutine COCHR
The calculation of the two continuity corrections
attributed to Cochran was performed in subroutine COCHR,
which was written especially for this study. The inputs
for this subroutine were the matrix of expected
frequencies, the number of columns, the uncorrected chi-
squared value for the original table, and the matrix of
frequencies having the same marginal totals as the original
table but with the next less extreme distribution. This
matrix is referred to here as the "second" table.
COCHR began by calling subroutine CHISQ to determine
the chi-squared statistic for the second table. Then it
found the first continuity correction by averaging the chi-
squared statistics for the two tables, as defined by
Equation 5. This correction method was called the S-
method.
Cochran's second correction method, the C-method, was
then applied as defined by Equation 6. First, the square
roots of the two chi-squared statistics were evaluated and
then averaged. This average was squared to give the C-
method statistic.
Subroutine COCHR called subroutine PVAL to determine
the probabilities of the two corrected chi-squared
statistics. These two probabilities were returned to the
main routine. COCHR is listed in Appendix I.
Subroutine YATES
This subroutine, which computed Yates' corrected chi-
squared statistic for the input contingency table, used the
familiar algorithm, Equation 4, in which 0.5 was subtracted
from the differences between observed and expected
frequencies in each contingency table cell before those
differences were squared in the chi-squared calculation.
After YATES had determined the corrected chi-squared value,
it called subroutine PVAL to find the associated
probability. The subroutine then returned this probability
to the calling routine, RXCPRB. YATES is described
independently of RXCPRB, even though it was called only by
RXCPRB, because it performed one of the corrections to chi-
squared evaluated in the study. YATES is listed in
Appendix J.
Subroutine RATIOS
After the main routine had completed evaluating all
the contingency tables of a given dimension and sample
size, it categorized each one according to several
parameters. First, it classified tables into one of two
groups—those whose exact probability was less than or
equal to 0.5 and those whose exact probability was
greater. Within each of these groups tables were
classified according to sample size and minimum expected
frequency. Once the 2,500 tables were classified, the main
routine called subroutine RATIOS to complete the data
needed for tabulating and reporting the results. RATIOS
calculated the ratio of the probability for the chi-squared
statistic found for each of the compared methods to
Fisher's exact probability for the observed table. It
summed these ratios for each category of tables, and it
returned the average, the minimum, and the maximum ratio
for each category to the main routine. Appendix K holds
the FORTRAN listing for RATIOS.
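The accumulation RATIOS performs can be sketched as follows;
the name RATSKT and its argument list are illustrative, not
those of the Appendix K listing.

C     ILLUSTRATIVE SKETCH ONLY (NOT THE APPENDIX K LISTING):
C     ACCUMULATE ONE PERFORMANCE RATIO PA/PE INTO A CATEGORY'S
C     RUNNING MINIMUM, MAXIMUM, AND SUM.  THE CATEGORY MEAN IS
C     THE FINAL SUM DIVIDED BY THE NUMBER OF TABLES.
      SUBROUTINE RATSKT(PA,PE,RMIN,RMAX,RSUM)
      R=PA/PE
      RSUM=RSUM+R
      IF (R .LT. RMIN) RMIN=R
      IF (R .GT. RMAX) RMAX=R
      RETURN
      END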
Summary
The methods used in this study were essentially the
same as those employed by Haber (1) in his study for 2 X 2
contingency tables. Using Fisher's exact probability as a
reference, chi-squared based statistics for independence
were compared. The results of the comparisons were
reported for groups of contingency tables of a given
dimension and sample size having similar exact
probabilities and minimum expected frequencies. The
computer subroutines used to compute the chi-squared based
statistics were written especially for this experiment or
were added to the Fisher's exact probabilities subroutine,
RXCPRB. The RXCPRB subroutine was written by Hancock
(2). The FORTRAN listings for all the program modules are
included in the appendices. In the next chapter the
results of the experiment are analyzed.
CHAPTER BIBLIOGRAPHY
1. Haber, Michael, "A Comparison of Some Continuity Corrections for the Chi-Squared Test on 2 X 2 Tables," Journal of the American Statistical Association, LXXV (September, 1980), 510-515.
2. Hancock, T. W., "Remark on Algorithm 434," Communications of the Association for Computing Machinery, XVIII (February, 1975), 117-119.
3. March, David L., "Algorithm 434: Exact Probabilities for R x C Contingency Tables," Communications of the Association for Computing Machinery, XV (November, 1972), 991-992.
4. McLeod, A. Ian, "A Remark on Algorithm AS 183. An Efficient and Portable Pseudo-random Number Generator," Applied Statistics, XXXIV (Summer, 1985), 198-200.
5. Mehta, Cyrus R. and Nitin R. Patel, "A Network Algorithm for Performing Fisher's Exact Test in r x c Contingency Tables," Journal of the American Statistical Association, LXXVIII (June, 1983), 427-434.
6. Nanney, T. Ray, Computing: A Problem-Solving Approach with FORTRAN 77, Englewood Cliffs, New Jersey, Prentice-Hall, 1981.
7. Patefield, W. M., "An Efficient Method of Generating Random R x C Tables with Given Row and Column Totals," Applied Statistics, XXX (1981), 91-97.
8. Poole, Lon and Mary Borchers, Some Common Basic Programs, Berkeley, California, Adam Osborne & Associates, Inc., 1977.
9. Wichmann, B. A. and I. D. Hill, "Algorithm AS 183: An Efficient and Portable Pseudo-random Number Generator," Applied Statistics, XXXI (1982), 188-190.
CHAPTER IV
ANALYSIS OF DATA
Introduction
In this chapter the results of the study are evaluated
in terms of the questions introduced in Chapter I. Graphs
of the ratios of corrected and uncorrected chi-squared
probability estimates to exact probabilities are used to
illustrate relationships between these quantities and sample
size, minimum expected frequencies, and contingency table
dimensions.
Questions of the Study
The following three questions were stated in Chapter I
to define the purposes of this study.
1. What is the effect of the small sample size on the
chi-squared based statistics in the two-way contingency
tables used in the study?
2. Does a small expected frequency (0.05, for example)
in the contingency table influence the accuracy of the test
statistic?
3. Is there any pattern or trend indicated in the
accuracy of the probability estimate based on the chi-
squared statistic as compared to the exact probability as
the table dimensions increase from 3 X 3 to 3 X 4 to 3 X 5?
In the following paragraphs these questions are
considered individually. The data produced by the
simulation program, XTAB, are given in a series of tables.
A fourth, related question is suggested upon viewing these
data, that is, does the exact probability of a contingency
table's frequency distribution affect or influence the
performance of chi-squared-based test statistics? In other
words, would a highly skewed frequency distribution within a
contingency table cause the chi-squared (uncorrected or
corrected) statistic to perform better or worse, or would it
have no effect? The answer cannot affect any practical
decision, because a researcher who knows the exact
probability has no need for a chi-squared-based test. It
could, however, guide the
recommendation of a test method based on the analysis of
these data.
In the discussion that follows, "better" and "worse"
are evaluated by comparing the chi-squared-based test
statistics to Fisher's exact probabilities. A mathematical
ratio is derived and symbolized by PA/PE, where PA
represents the probability value corresponding to the chi-
squared-based test statistic and PE represents the exact
probability. A ratio of 1.0 indicates that the chi-squared-
based test produced the same probability as Fisher's exact
probabilities test. The "ideal" performance ratio,
therefore, is 1.0.
To facilitate analysis, the data are graphed and
presented as Figures 1 through 14. Each figure represents
performance data for a particular combination of minimum
expected frequency range, exact probability range, and
contingency table dimension.
The data tabulated in Tables II, III, and IV are those
generated by the nine runs of program XTAB. Each table
contains information about a specified contingency table
dimension. The ratios of P^ to P^ are identified in the
tables and graphs by the identifiers used in XTAB: U is the
uncorrected chi-squared test, Y is Yates' correction, C is
Cochran's correction using the square roots of chi-squared
for two related tables, S is Cochran's correction based on
the average chi-squared value for the same two tables, and M
is Mantel's correction. Each table contains the results for
7,500 simulated contingency tables. These 7,500 tables are
categorized by the sample size, the range of the smallest
expected frequency, e, and the range of the exact
probability, PE. The letter T designates the number of
tables in each category.
XTAB produces the means and the ranges of the
probability ratios for the contingency tables. For the data
tables shown here, the mean is listed first, followed in
parentheses by the range.
3 X 3 Contingency Table Results
Table II shows the performance ratio means and ranges
for 3 X 3 contingency tables.
TABLE II
PERFORMANCE RATIO MEANS AND RANGES, 3 X 3 TABLES
 ------------------------------------------------------------------
                     Range of smallest expected frequency
  PE      Method       e < 0.5                 e ≥ 0.5
 ------------------------------------------------------------------
  N = 10               T = 872                 T = 40
  ≤0.5      U      0.71 (0.02- 5.32)       0.81 (0.37- 4.34)
            Y      4.15 ( .38-11.81)       4.52 (1.80-12.33)
            C       .90 ( .06- 6.36)        .90 ( .37- 3.28)
            S       .87 ( .06- 6.35)        .89 ( .37- 3.27)
            M      4.09 ( .38-10.44)       4.50 (1.80-10.76)

                       T = 1520                T = 68
  >0.5      U       .69 ( .39- 1.07)        .72 ( .49-  .99)
            Y       .93 ( .14- 1.80)       1.11 ( .91- 1.43)
            C       .42 ( .00- 1.18)        .54 ( .01-  .95)
            S       .40 ( .00- 1.08)        .51 ( .00-  .94)
            M       .93 ( .14- 1.80)       1.11 ( .91- 1.43)

  N = 20               T = 943                 T = 147
  ≤0.5      U      0.98 (0.00-11.36)       0.95 (0.47- 3.01)
            Y      5.18 ( .01-12.07)       3.78 (1.56-13.14)
            C      1.02 ( .00-11.56)        .99 ( .22- 3.22)
            S       .99 ( .00-11.24)        .97 ( .21- 3.13)
            M      4.98 ( .01-11.70)       3.77 (1.56-12.96)

                       T = 1196                T = 214
  >0.5      U       .85 ( .34- 1.54)        .86 ( .51- 1.10)
            Y       .92 ( .04- 1.94)       1.26 ( .95- 1.82)
            C       .59 ( .00- 1.61)        .74 ( .00- 1.11)
            S       .58 ( .00- 1.58)        .72 ( .00- 1.10)
            M       .92 ( .04- 1.94)       1.26 ( .95- 1.82)

  N = 30               T = 845                 T = 245
  ≤0.5      U      0.98 (0.00- 5.35)       1.05 (0.32- 2.62)
            Y      3.63 ( .01-76.28)       3.51 (1.55-18.19)
            C      1.02 ( .00- 6.19)        .99 ( .22- 3.49)
            S       .99 ( .00- 6.18)        .98 ( .21- 3.48)
            M      3.61 ( .01-76.28)       3.50 (1.55-18.18)

                       T = 1117                T = 293
  >0.5      U       .90 ( .40- 1.57)        .91 ( .63- 1.16)
            Y       .92 ( .00- 1.91)       1.26 ( .81- 1.77)
            C       .62 ( .00- 1.71)        .78 ( .00- 1.31)
            S       .61 ( .00- 1.69)        .77 ( .00- 1.30)
            M       .92 ( .00- 1.91)       1.26 ( .81- 1.76)
 ------------------------------------------------------------------
Each of the major sections of Table II contains the means
and the ranges of the performance ratios, categorized
according to the sample size, N, the range of the minimum
expected frequency, e, and the exact probability, PE.
In the graphs for these data, Figures 1 through 4, the
symbol "e" represents minimum expected frequency, and "PE"
represents the exact probability as calculated using
Fisher's exact probabilities test. The term "PA" stands for
the average probability for one of the five chi-squared
based tests for a specific group of contingency tables. The
symbols U, Y, C, S, and M represent the five chi-squared
based methods which were compared to Fisher's exact method
by XTAB. Mantel's correction, symbolized by M in the XTAB
data, is not shown on the graphs. Tables II, III, and IV
show that Mantel's correction gives essentially the same
result as Yates' for contingency tables with the dimensions
used in this study. Haber (1) noted that this would be the
case in all two-sided tests. The only situations in which a
difference exists between Mantel's and Yates' corrections
are those in which tables with frequency distributions more
extreme than the observed distribution can occur in only one
way. As the tables indicate, this rarely happens in 3 X 3
and larger tables. In the graphs, then, Y and M correspond,
so only Y has been entered.
First, only the effects of sample size are considered.
The samples of ten, twenty, and thirty used in this
simulation are, at best, moderately sized, but for most of
the contingency tables in this simulation the samples are
considered small. As a rule of thumb, small samples for a
contingency table are those less than twice the number of
cells in the table. Moderate samples are smaller than four
times the number of cells. In this study, the best ratio of
sample size to number of cells is for the 3 X 3 table when
sample size is thirty. In that case, the sample size is 3.3
times the number of cells. The worst ratio, 0.67, occurs
for the 3 X 5 table with a sample of ten.
The first graph, Figure 1, is a plot of the average
performance ratios for 3 X 3 tables when the minimum
expected frequencies are less than 0.5 and the exact
probabilities are less than or equal to 0.5.
[Plot omitted: average PA/PE against sample size N for
methods U, C, and S; the Y (and M) ratios lie above the top
of the scale, at values greater than 3 or 4.]
Fig. 1—Ratio of PA to PE for e<0.5, PE≤0.5, 3 X 3 tables
Since a ratio of 1.0 implies equality with the exact
probability, Figure 1 shows that methods U, C, and S all
approach this equality as sample size increases. Notice
that the Y and M methods (both indicated by Y) produce large
ratios under the conditions represented here.
Figure 2 illustrates the results when the exact
probability P E is greater than 0.5 but with all other
conditions the same as in Figure 1.
[Plot omitted: average PA/PE against sample size N for
methods U, Y, C, and S.]
Fig. 2—Ratio of PA to PE for e<0.5, PE>0.5, 3 X 3 tables
In this figure the Y method (and the M method, which has the
same averages) is nearest the "ideal" 1.0, and it is
consistently close to that ideal for all sample sizes.
Averages for the U method and for the C and S methods
increase toward 1.0 as sample size increases, but U is
closer to the ideal in every case.
For the tables analyzed for Figure 3 the minimum
expected frequencies are greater than or equal to 0.5, and
the exact probabilities are less than or equal to 0.5.
Still, only 3 X 3 tables are considered. Under these
conditions for e and PE, methods C, S, and U yield average
performance ratios more closely clustered about the ideal
1.0 than in any other of the 3 X 3 contingency table
analyses. However, Yates' and Mantel's correction methods
give average ratios which are, on the average, out of the
range of the graph.
[Plot omitted: average PA/PE against sample size N for
methods U, C, and S; the Y (and M) ratios lie above the top
of the scale, at values greater than 3 or 4.]
Fig. 3—Ratio of PA to PE for e≥0.5, PE≤0.5, 3 X 3 tables
In Figure 3, as in Figure 1, the Y and M methods produce
probabilities several times greater than Fisher's exact
probability. As noted previously, average ratios for the
other three methods cluster between 0.8 and 1.1, quite near
the ideal ratio.
The last analysis of 3 X 3 tables is shown in Figure
4. For the tables represented in this figure, the minimum
expected frequencies are greater than or equal to 0.5 and
the exact probabilities are all greater than 0.5. In terms
of minimum expected frequencies and skewness of the
frequency distribution, therefore, these tables represent
the least extreme group.
[Plot omitted: average PA/PE against sample size N for
methods U, Y, C, and S.]
Fig. 4—Ratio of PA to PE for e≥0.5, PE>0.5, 3 X 3 tables
Figure 4 shows improvement in the U, C, and S methods as
sample size, N, increases. Still, the U method is
consistently better (in terms of nearness to the 1.0 ratio)
than either C or S. The Y and M methods produce ratios
which steadily increase away from 1.0 as N increases.
While the graphs illustrate the averages of the
performance ratios, the tables give a somewhat more detailed
record of the overall results. The ranges of those ratios
are shown to vary widely for all five of the chi-squared-
based methods tested. In general, Tables II through IV show
that the ranges are greatest for contingency tables with low
exact probabilities, especially for methods Y and M.
3 X 4 Contingency Table Results
Table III contains the data for 3 X 4 contingency
tables.
TABLE III
PERFORMANCE RATIO MEANS AND RANGES, 3 X 4 TABLES
 ------------------------------------------------------------------
                     Range of smallest expected frequency
  PE      Method       e < 0.5                 e ≥ 0.5
 ------------------------------------------------------------------
  N = 10               T = 938                 T = 4
  ≤0.5      U      0.76 (0.04- 5.10)       0.67 (0.39-  .85)
            Y      4.96 ( .66-12.71)       5.38 (2.44-11.82)
            C       .91 ( .10- 7.13)        .81 ( .57-  .99)
            S       .89 ( .09- 6.55)        .80 ( .56-  .98)
            M      4.93 ( .33-12.71)       5.38 (2.44-11.82)

                       T = 1552                T = 6
  >0.5      U       .70 ( .14- 1.10)        .71 ( .55-  .77)
            Y       .91 ( .12- 1.91)       1.09 (1.00- 1.38)
            C       .51 ( .00- 1.19)        .68 ( .53-  .79)
            S       .49 ( .00- 1.17)        .67 ( .51-  .79)
            M       .91 ( .12- 1.91)       1.09 (1.00- 1.38)

  N = 20               T = 1041                T = 51
  ≤0.5      U      1.01 (0.00-16.72)       0.98 (0.27- 2.55)
            Y      4.50 ( .00-27.37)       4.31 (1.86-15.25)
            C      1.00 ( .00-12.82)        .87 ( .38- 2.12)
            S       .97 ( .00-12.81)        .86 ( .36- 2.12)
            M      4.46 ( .00-27.09)       4.31 (1.86-15.25)

                       T = 1343                T = 65
  >0.5      U       .88 ( .23- 1.44)        .86 ( .61- 1.09)
            Y       .88 ( .00- 1.94)       1.27 ( .97- 1.90)
            C       .66 ( .00- 1.76)        .73 ( .00- 1.08)
            S       .65 ( .00- 1.69)        .72 ( .00- 1.08)
            M       .88 ( .00- 1.93)       1.27 ( .97- 1.90)

  N = 30               T = 1039                T = 113
  ≤0.5      U      1.06 (0.00-  7.60)      1.15 (0.44- 12.99)
            Y      3.91 ( .00-122.34)      8.35 (1.62-113.81)
            C      1.01 ( .00- 11.65)       .99 ( .32-  8.03)
            S      1.00 ( .00- 10.91)       .98 ( .32-  7.98)
            M      3.90 ( .00-121.61)      8.35 (1.62-113.81)

                       T = 1237                T = 111
  >0.5      U       .94 ( .26- 1.72)        .91 ( .59- 1.13)
            Y       .89 ( .00- 1.92)       1.31 ( .99- 1.82)
            C       .75 ( .00- 1.73)        .83 ( .00- 1.24)
            S       .74 ( .00- 1.73)        .82 ( .00- 1.22)
            M       .89 ( .00- 1.92)       1.31 ( .99- 1.82)
 ------------------------------------------------------------------
The next four figures, Figures 5 through 8, graph the
results obtained for 3 X 4 tables. The graphs are
remarkably similar to those for the 3 X 3 tables. As
before, the four graphs represent different combinations of
minimum expected frequency range and exact probability
range. In Figure 5 the minimum expected frequencies are
less than 0.5, and the exact probabilities are less than or
equal to 0.5.
[Plot omitted: average PA/PE against sample size N for
methods U, C, and S; the Y (and M) ratios lie above the top
of the scale, at values greater than 3 or 4.]
Fig. 5—Ratio of PA to PE for e<0.5, PE≤0.5, 3 X 4 tables
The graph looks very much like the one in Figure 1, which is
a plot made under the same conditions of minimum expected
frequency and exact probability, but for 3 X 3 tables
instead of the 3 X 4 tables evaluated here. Yates' (and
Mantel's) method produces average performance ratios which
are out of the range of the graph's ordinate scale, while
the other methods, U, C, and S, converge on the ideal 1.0 as
sample size increases from ten to twenty to thirty.
Figure 6 shows the results when the range of PE is
greater than 0.5. Other conditions are the same as in
Figure 5; that is, the minimum expected frequencies are
still less than 0.5 and the table dimension is still 3 X 4 .
[Plot omitted: average PA/PE against sample size N for
methods U, Y, C, and S.]
Fig. 6—Ratio of PA to PE for e<0.5, PE>0.5, 3 X 4 tables
In Figure 7, the minimum expected frequencies are all
greater than or equal to 0.5, and the exact probabilities
are less than or equal to 0.5.
[Plot omitted: average PA/PE against sample size N for
methods U, C, and S; the Y (and M) ratios lie above the top
of the scale, at values greater than 3, 4, or 7.]
Fig. 7—Ratio of PA to PE for e≥0.5, PE≤0.5, 3 X 4 tables
The figure reveals the same pattern shown in Figures 1, 3,
and 5, which, like Figure 7, illustrate tables having PE
less than or equal to 0.5. Except for those produced by
methods Y and M, the probabilities all improve as sample
size increases.
The last plot of 3 X 4 table probability ratios is in
Figure 8. Except for PE, which is now greater than 0.5, the
conditions are the same as in Figure 7.
[Plot omitted: average PA/PE against sample size N for
methods U, Y, C, and S.]
Fig. 8—Ratio of PA to PE for e≥0.5, PE>0.5, 3 X 4 tables
Once again, a pattern is evident in Figures 2, 4, 6, and 8,
the plots in which Fisher's exact probability exceeds 0.5
for all tables evaluated. On the average, methods U, C, and
S yield improved probability estimates, as indicated by the
performance ratios, as sample size increases, but methods Y
and M produce better probability estimates for the smallest
sample size, ten.
Again, it is important to consider not only the average
ratios, but also the range. Table III reveals ranges
similar to those observed for the 3 X 3 tables reported in
Table II. The ranges vary greatly, and no method appears to
be better than another at minimizing the variation. As
before, low exact probabilities, indicating more extremely
skewed frequency distributions, produce wider ranges of
performance ratios, and the Y and M methods are the most
severely affected. Performance ratios for those methods
range from less than 0.005 to more than 122. The minimum
ratio for all methods, in fact, is less than 0.005 under
some of the conditions of minimum expected frequency and
exact probability ranges given in Table III. The maximum
ratio for the U method is 16.72, for the C method it is
12.82, and for the S method it is 12.81.
Interestingly, Haber (1) reported similar patterns in
the performance ratio ranges in his study of 2 X 2 tables.
His simulation produced the greatest extremes in range for
the Y method and the least for the S method. This led him
to conclude that both the uncorrected chi-squared and the
traditional Yates correction were inadequate.
3 X 5 Contingency Table Results
The 3 X 5 simulated tables produced the data in Table
IV. Sample size is ten in the first one-third of the
table. With this sample size and table dimension, minimum
expected frequencies are always less than 0.5.
TABLE IV
PERFORMANCE RATIO MEANS AND RANGES, 3 X 5 TABLES
 ------------------------------------------------------------------
                     Range of smallest expected frequency
  PE      Method       e < 0.5                 e ≥ 0.5
 ------------------------------------------------------------------
  N = 10               T = 828                 T = 0
  ≤0.5      U      0.87 (0.08- 12.42)
            Y     10.87 ( .74-136.45)
            C      1.07 ( .09- 20.50)
            S      1.05 ( .09- 19.52)
            M     10.87 ( .74-136.45)

                       T = 1672                T = 0
  >0.5      U       .65 ( .20-  1.11)
            Y       .87 ( .10-  1.86)
            C       .54 ( .00-  1.31)
            S       .53 ( .00-  1.30)
            M       .87 ( .10-  1.86)

  N = 20               T = 1123                T = 17
  ≤0.5      U      1.04 (0.00- 17.45)      1.34 (0.45-  6.29)
            Y      5.36 ( .00-116.31)     12.64 (1.97-163.28)
            C      1.00 ( .00- 14.05)       .96 ( .30-  4.04)
            S       .98 ( .00- 14.03)       .95 ( .30-  4.02)
            M      5.36 ( .00-116.31)     12.64 (1.97-163.28)

                       T = 1349                T = 11
  >0.5      U       .88 ( .27-  1.55)       .85 ( .57-   .95)
            Y       .80 ( .00-  1.96)      1.15 (1.02-  1.50)
            C       .71 ( .00-  1.79)       .88 ( .52-  1.05)
            S       .70 ( .00-  1.70)       .88 ( .52-  1.04)
            M       .80 ( .00-  1.96)      1.15 (1.02-  1.50)

  N = 30               T = 1145                T = 56
  ≤0.5      U      1.10 (0.00- 10.55)      1.01 (0.46-  2.96)
            Y      5.72 ( .00-114.71)      5.89 (1.69-130.08)
            C      1.02 ( .00-  8.36)       .88 ( .29-  1.81)
            S      1.00 ( .00-  8.34)       .87 ( .26-  1.81)
            M      5.72 ( .00-114.71)      5.89 (1.69-130.08)

                       T = 1239                T = 60
  >0.5      U       .96 ( .34-  1.67)       .90 ( .65-  1.31)
            Y       .81 ( .00-  1.99)      1.39 (1.00-  1.83)
            C       .80 ( .00-  1.75)       .81 ( .11-  1.24)
            S       .80 ( .00-  1.74)       .79 ( .02-  1.24)
            M       .81 ( .00-  1.99)      1.39 (1.00-  1.83)
 ------------------------------------------------------------------
Figure 9 is the first of four plots based on the
results for 3 X 5 tables, plotted from the data in Table
IV. Minimum expected frequencies in the contingency tables
plotted here are all less than 0.5, and Fisher's exact
probability equals or is less than 0.5. It is interesting
to note that Cochran's corrections, methods C and S follow a
somewhat different pattern from that in other graphs which
also have exact probabilities less than or equal to 0.5.
[Plot omitted: average PA/PE against sample size N for
methods U, C, and S; the Y (and M) ratios lie above the top
of the scale, at values greater than 4, 10, and 7.]
Fig. 9—Ratio of PA to PE for e<0.5, PE≤0.5, 3 X 5 tables
Still, methods U, C, and S all converge toward the ideal 1.0
as N increases, and methods Y and M yield poor estimates.
Figure 10 shows the effect of having exact
probabilities greater than 0.5. The minimum expected
frequencies in the contingency tables are still less than
0.5, just as they are for those tables whose ratios are
plotted in Figure 9, above.
[Plot omitted: average PA/PE against sample size N for
methods U, Y, C, and S.]
Fig. 10—Ratio of PA to PE for e<0.5, PE>0.5, 3 X 5 tables
Figure 10 gives a pattern quite similar to the ones in
Figures 2 and 6, which were plotted for the same conditions
of minimum expected frequencies and exact probabilities, but
for 3 X 3 and 3 X 4 contingency tables, respectively.
Methods Y and M are consistently close to the exact
probabilities for all three sample sizes, and methods U, C,
and S all give improved estimates as N increases. Of these
three, U is always nearest 1.0.
In Figure 11, the contingency tables tested have
minimum expected frequencies equal to or greater than 0.5
and exact probabilities less than or equal to 0.5. Note
that for sample size 10, there are no 3 X 5 tables with
minimum expected frequencies greater than or equal to 0.5.
Although it is theoretically possible for minimum expected
frequencies to range from 0.1 through 0.6 under these
conditions (see Table I), none of the 2,500 tables produced
in the simulation has a minimum expectation equal to or
greater than 0.5. The same situation applies to Figure 12,
in which all tables have PE greater than 0.5.

[Plot omitted: average PA/PE against sample size N (20 and
30 only) for methods U, C, and S; the Y (and M) ratios lie
above the top of the scale, at values greater than 10 and 6.]
Fig. 11—Ratio of PA to PE for e≥0.5, PE≤0.5, 3 X 5 tables
The fact that no 3 X 5 contingency tables with minimum
expected frequencies equal to or greater than 0.5 were
generated by XTAB for sample size ten is a result of the
experiment design. Row totals and column totals for a
contingency table of a specified dimension and sample size
were independently and randomly selected. The row total and
column total vectors are the only quantities which affect
the maximum likelihood calculation of minimum expected
frequency. The random distribution of each of these vector
populations is skewed toward the smaller minimum
expectations. This was verified during the testing of the
random number generator used to select the row and column
totals for the hypergeometric sampling procedure.
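A short worked example, not drawn from the simulation data,
illustrates the effect. The most balanced margins a 3 X 5
table with N = 10 can have are row totals (4, 3, 3) and
column totals (2, 2, 2, 2, 2), which give a minimum
expectation of (3 x 2)/10 = 0.6. Any less balanced set of
margins drives the minimum expectation below that value, and
randomly drawn margins are rarely this balanced.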
[Plot omitted: average PA/PE against sample size N (20 and
30 only) for methods U, Y, C, and S.]
Fig. 12—Ratio of PA to PE for e≥0.5, PE>0.5, 3 X 5 tables
Figure 12 shows a similar trend to those observed in Figures
2, 4, 6, 8, and 10, all of which are plots of performance
ratios for tables whose exact probabilities are greater than
0.5.
Table IV shows that the chi-squared-based tests on the
3 X 5 contingency tables suffer from the same extremes in
the ranges of probability estimates as they did for the
other table dimensions. Once again, the more highly skewed
frequency distributions, as indicated by exact probabilities
less than or equal to 0.5, produce wider ranges in the
performance ratios. Methods Y and M still demonstrate the
widest ranges. Performance ratios based on the estimates
of those methods ranged from less than 0.005 to slightly
more than 163, while the maximum for all the other methods
was less than 18.
Effects of Sample Size
To summarize the answer to the first question, which
has to do with the effects of sample size, several
observations based on the average performance ratio are
apparent. Under all conditions methods U, C, and S yield
better probability estimates as sample size increases from
ten to twenty to thirty. Methods Y and M give worse
estimates as N increases. In fact, methods Y and M are
always the worst estimators of the exact probability except
for samples of size ten when exact probabilities are greater
than 0.5, in which case they are the best. For sample size
ten and exact probabilities equal to or less than 0.5,
method C is most accurate. For samples of twenty and
thirty, method U, the uncorrected chi-squared statistic,
produces the closest estimate of Fisher's exact probability,
on the average, in almost every case.
Sample size appears to have no effect on the ranges of
the performance ratios. For a given table dimension, a
method's range is seen to be consistent.
Effects of Expected Frequency Range
The second question of the study concerns the effects
of small expected frequencies. Haber placed a lower limit
of one on the minimum expected frequencies in his research.
No limitations were set in the study reported here. Some
information about the effects of small expected frequencies
can be gained by examining graphs having the same exact
probability ranges and table dimensions at equal sample
sizes. Six pairs of graphs can be used: Figures 1 and 3,
Figures 2 and 4, Figures 5 and 7, Figures 6 and 8, Figures 9
and 11, and Figures 10 and 12. Examining these pairs of
graphs at corresponding sample sizes discloses only slight
effects.
When exact probabilities are less than or equal to 0.5,
as they are in Figures 1 and 3, in Figures 5 and 7, and in
Figures 9 and 11, the range of the minimum expected
frequencies has little or no effect on the accuracy of the
probability estimates. For exact probabilities greater than
0.5, as in Figures 2 and 4, in Figures 6 and 8, and in
Figures 10 and 12, methods C and S give slightly better
estimates of PE, and methods Y and M give slightly worse
estimates. Examination of the six pairs of graphs reveals
no effect of the range of minimum expected frequency on the
probability estimates produced by method U.
These observations are viewed from a different
perspective in Figures 13 and 14. The evaluation of the
effects of minimum expected frequency confirms the
perspective given by the six figure pairs investigated
previously. Figures 13 and 14 are plots of exactly the same
data, contingency table simulations in this case, as those
plotted in Figures 1 through 12. Alterations to the main
program routine, XTAB, produced the data tables in Appendix
L. They differ from those in Tables II, III, and IV in the
ranges of the minimum expected frequency, e, the table
layout, and in the fact that only the average performance
ratios are presented.
In each of the two figures a test method's performance
ratio, PA/PE, is plotted against minimum expected
frequency. The performance ratio is plotted for
the three different table dimensions, and the other
conditions, sample size and exact probability range, are
held constant. In Figure 13, the performances of method U
are graphed for samples of size twenty and for tables in
which the exact probability exceeds 0.5.
[Plot omitted: PA/PE against minimum expected frequency e,
from 0.1 to 1.3; x marks 3 X 3 tables, * marks 3 X 4 tables,
o marks 3 X 5 tables.]
Fig. 13—Performance of Method U for N = 20, PE > 0.5
The data in the tables in Appendix L show the results
graphed in Figure 13 to be typical of the six possible
graphs for method U. Over the entire range of minimum
expected frequencies possible in this study the performance
ratios for method U vary insignificantly, just as they
appear to do in Figure 13. The data tables in Appendix L
display the narrow range of variation in the performance
ratio for all possible values of minimum expected frequency.
Figure 14 is a graph of the performance of method C for
samples of size thirty and for tables having exact
probabilities equal to or less than 0.5. For one thing,
Figure 14 demonstrates the same good performance of method C
for tables with low exact probabilities as previously
demonstrated in Figures 1, 3, 5, 7, and 9.
[Plot omitted: PA/PE against minimum expected frequency e,
from 0.1 to 1.3; x marks 3 X 3 tables, * marks 3 X 4 tables,
o marks 3 X 5 tables.]
Fig. 14—Performance of Method C for N = 30, PE ≤ 0.5
More importantly at this point in the data analysis, Figure
14 shows that varying the minimum expected frequency
produces no trend or pattern in the quality of the method's
performance for the conditions included in the graph. Both
in Figure 13 and in Figure 14 the performance plot for 3 X 3
contingency tables varies less than the plots for the other
two table dimensions. This is an artifact of the data which
is easily discovered by examining the tables in Appendix L.
The minimum expected frequency in a 3 X 3 contingency table
with sample size twenty or thirty has a much wider range of
possible values than is possible for 3 X 4 or 3 X 5 tables.
Table I lists the respective ranges. The plots for the 3 X
3 tables cover only about one-half the total domain of
possible minimum expected frequencies. An examination of
the data in Appendix L verifies that performance ratio
varies in 3 X 3 tables also, and that the range of variation
is almost the same as it is in the other two sizes of
contingency tables.
In general, the range of minimum expected frequencies
used in this simulation seems to have only slight (or no)
effect on the probability estimators tested.
Effects of Table Dimension
To answer the third question of the study, regarding
the effects of table dimension, trios of graphs chosen from
Figures 1 through 12 can be used. In each trio, the graphs
must have the same minimum expected frequency range and the
same exact probability range, and the probability ratios
must be evaluated at equal sample sizes. Four trios of
graphs satisfying these conditions are in Figures 1, 5, and
9, in Figures 2, 6, and 10, in Figures 3, 7, and 11, and in
Figures 4, 8, and 12. Studying these trios reveals no
apparent effect of table dimension on the accuracy of the
probability estimators. The patterns of the probability
estimator performance ratios are consistent within each of
these trios.
Figures 13 and 14 confirm this analysis of the effects
of table dimension. In both figures, the performance of a
test method is shown for all three contingency table sizes
with all other variables held constant. The graphs show
that the performance ratio varies similarly, in terms of
range and direction, for all table sizes. This can be
further verified by examining the data in Appendix L.
Effects of Exact Probability Range
The related question concerning the effects of exact
probability is most easily answered by comparing Figures 1,
3, 5, 7, 9, and 11 with Figures 2, 4, 6, 8, 10, and 12. The
uncorrected chi-squared test is the least affected by the
exact probability of the table's frequency distribution. The
corrected chi-squared tests, on the other hand, demonstrate
some significant effects. The Y and M methods show the most
pronounced effects. For both methods, tables with exact
probabilities equal to or less than 0.5 are indicated to
have probabilities from 1.95 to more than 10 times greater
than the exact probability. The mean is approximately five
times greater. When the tables have exact probabilities
greater than 0.5, methods Y and M perform more consistently
near the ideal 1.0 ratio.
Cochran's corrections exhibit the opposite effect. For
low exact probabilities methods C and S are consistently
near an average performance ratio of 1.0. In fact, they are
consistently the best estimators of the exact probability.
When the exact probability exceeds 0.5, however, methods C
and S are almost always the worst estimators.
Summary
As might have been expected, the only real effects on
the accuracies of the methods tested in this study are those
of sample size. Researchers and statisticians, since the
introduction of chi-squared tests, have been aware of the
asymptotic properties of the test and have recommended large
samples for greatest accuracy of its applications. This
study shows that, in terms of its estimation of the exact
probability of independence of two variables, chi-squared
produces widely varying estimates in these small sample
situations. Conclusions regarding these findings are given
in Chapter V.
CHAPTER V
SUMMARY OF FINDINGS, CONCLUSIONS, AND RECOMMENDATIONS
FOR FURTHER RESEARCH
In this chapter the findings established by the data
are summarized and conclusions drawn from these findings are
given. Recommendations for further research are then
offered.
Summary of Findings
Chapter IV reported the findings resulting from this
simulation study. Those findings are summarized in the
following paragraphs.
Findings Regarding the Effects of Sample Size
With regard to the effects of sample size in the
simulated contingency tables, the following findings are
supported by the data of Table II, Table III, and Table IV.
1. On the average, the uncorrected chi-squared
statistic and the statistic corrected by Cochran's methods
are improved by increasing the sample size.
2. On the average, Yates' and Mantel's correction
methods yield poorer estimates of the exact probability as
sample size increases.
3. The range of the estimates for all methods is not
improved as sample size is increased from ten to twenty to
thirty. In fact, the data show that, in general, the sample
size has little effect on the range of estimates produced by
any method.
The implications of this finding can be seen by
examining the middle section of Table II, for example. The
average estimate produced by Cochran's S method for 943
contingency tables is 0.99 of the exact probability value.
Within that set of tables, however, the S method gives
estimates ranging from less than 0.01 to 11.24 times the
exact probability.
Within that same set of tables the exact probability
value is always less than or equal to 0.5. If Cochran's
correction is more than eleven times greater than the exact
probability in at least one case then, since a probability
cannot exceed 1.0, the exact probability in that case can be
no greater than about 0.09. Assume, for
purposes of example, that the exact probability is 0.05.
The chi-squared test corrected by Cochran's S method yields
a probability of more than 0.55. A result considered
significant at the 0.05 level would be judged as
insignificant because of the error introduced by the
corrected chi-squared test. Close examination of the data
tables reveals that this extreme variation in range is not
an exceptional result.
Findings Regarding the Effects of Expected Frequency
The data tables in Appendix L give the best perspective
for analyzing the effects of expected frequency. These
effects can be summarized very simply as follows.
The value of the minimum expected frequency has no
influence on the accuracy of any of the tested methods'
estimation of the exact probability.
Findings Regarding the Effects of Table Dimension
The data tables clearly demonstrate the following fact
concerning the effects of contingency table dimension.
For the five chi-squared-based methods tested, table
dimension does not affect the accuracy of the estimation of
the exact probability.
Conclusions
This study of chi-squared based tests has considered a
narrowly defined class of contingency tables and sample
conditions and, therefore, the conclusions drawn here have a
correspondingly narrow interpretation. As previously noted,
practically all research reported in the statistical
literature has dealt with 2 X 2 contingency tables or with
cross-classifications of large samples. The samples of
sizes ten, twenty, and thirty classified into 3 X 3 , 3 X 4 ,
and 3 X 5 contingency tables in this simulation study allow
analysis of conditions not frequently considered. These
conclusions apply only to contingency tables meeting the
same conditions specified in this study. Fortunately,
besides the sample size and table dimension conditions
already stated, no other strict limitations are required,
except that no structural zeroes are allowed in the
categories of the cross-classifications. Sampling zeroes,
on the other hand, are permitted because they were allowed
in the simulation.
This study, like the one by Haber (2) on which it was
modeled, addresses the issue Miettinen (4) called the
"second line of inquiry" regarding continuity corrections to
chi-squared, that of determining which statistic agrees best
with Fisher's exact probability. Using the exact
probability as a reference, the probability associated with
the uncorrected chi-squared statistic was, on the average,
never more than 35 per cent in error, and in two-thirds of
the cases, those based on samples of twenty and thirty, it
deviated less than 15 per cent. Because of this it is
tempting to conclude that uncorrected chi-squared tests are
useful even in the extreme sample size and minimum expected
frequency conditions tested in this study. Similar
statements could be made regarding some of the other methods
tested.
However, a conclusion based on the average of the
results is a dangerous one. Although the averages may not
indicate extreme errors, especially for the larger sample
sizes, Tables II, III, and IV show that the ranges of the
estimates vary widely. The conclusions, therefore, are
limited to the following.
1. For contingency tables with the dimensions and
sample sizes used in this study, chi-squared-based tests are
not dependable estimators of the exact probability of
independence.
2. Lower limits on the minimum expected frequencies in
the contingency tables are unnecessary. The data for these
sparse contingency tables show no pattern or trend in the
variation of the probability estimates as minimum
expectations are allowed to vary from as low as 0.033 to as
high as 2.0. Again, it should be noted that structural
zeroes are not allowed in this study, so the minimum
expected frequencies are always greater than zero, but they
are as small as 0.033 in some cases.
3. Probability estimates based on the chi-squared
statistics tested are not affected by table dimension. The
results were the same for the 3 X 3 , 3 X 4 , and 3 X 5 tables
tested.
Recommendations for Further Research
Two major areas for further research are suggested by
the results of this study. These are discussed in the
following paragraphs.
First, there are those questions dealing with
extensions of the contingency table parameters used in this
study. Obviously, the sample sizes and table dimensions
investigated in this study represent a minute fraction of
the many possibilities. Certainly, some of the other
possibilities warrant investigation, particularly since they
generally have been ignored in the literature.
Of somewhat greater interest might be a study of three-
way contingency tables, or four-way tables, and so on. The
conceptual difficulties accompanying the treatment of multi-
way tables are overcome to a certain extent by the use of
computers.
Also, there is the possibility of investigating
contingency tables having structural zeroes. The
implications of structural zeroes are largely unexplored in
contingency table research.
The second major area for additional research suggested
by this study has to do not with contingency tables and
their properties directly, but rather with related tests
that are used with categorical data. Bishop, Fienberg, and
Holland (1, pp. 123-131) discuss the likelihood ratio
statistic, G², which is not so familiar as the Pearson chi-
squared statistic used in this study. They suggest that G²
possesses certain advantages, as compared with chi-squared,
which might make it a more attractive goodness-of-fit
statistic in some situations. A comparison of the two under
conditions similar to those used in this study might be
instructive.
Finally, and perhaps most interesting in light of the
current research, there is the possibility of studying the
implications of sparse data sets, like the samples of ten,
twenty, and thirty used in this study, for log-linear
models. Small samples are sometimes the only practical
samples, particularly in psychological and educational
research, and small samples almost always produce sampling
zeroes. Whether or not these small samples affect log-
linear models in the same way they affect the chi-squared
statistic merits investigation.
Summary
For this student the research reported here has been
exciting and the results are comforting. The familiar chi-
squared test has faced the challenge of small sample sizes
and fractional expected frequencies and has emerged with a
somewhat predictable, albeit less than spectacular,
performance. Moreover, the discovery of efficient computer
algorithms for Fisher's exact probabilities test has opened
a window into a new area which promises opportunity for much
learning. While the results reported here may not be earth-
shattering, they are important in their own right, and
perhaps other researchers will be able to proceed with
confidence into the situations studied here. At least,
something is now known about those situations.
An underlying motivation in this study has been to
establish the usefulness of some test of independence for
small samples, even as small as the sample available to the
educational researcher who is limited to using the students
in a single course of study. The chi-squared-based tests
have proved to be unreliable in this situation. Fisher's
exact probabilities test is an alternative, one which is
becoming more and more feasible in light of the work of
Mehta and Patel. The popularity of Fisher's exact
probabilities test is likely to increase and its application
will be better understood as it becomes a part of more
computer statistics software packages.
CHAPTER BIBLIOGRAPHY
1. Bishop, Yvonne M. M., Stephen E. Fienberg, and Paul W. Holland, Discrete Multivariate Analysis, Cambridge, Massachusetts, The MIT Press, 1975.
2. Haber, Michael, "A Comparison of Some Continuity Corrections for the Chi-Squared Test on 2 X 2 Tables," Journal of the American Statistical Association, LXXV (September, 1980), 510-515.
3. Mehta, Cyrus R. and Nitin R. Patel, "A Network Algorithm for Performing Fisher's Exact Test in r x c Contingency Tables," Journal of the American Statistical Association, LXXVIII (June, 1983), 427-434.
4. Miettinen, Olli S., "Comment," Journal of the American Statistical Association, LXIX (June, 1974), 380-383.
APPENDICES
APPENDIX A
THE MAIN ROUTINE
It was the function of this routine to perform the housekeeping tasks like dimensioning arrays, declaring variable types, establishing output file specifications, etc. It then called the subroutines to perform the statistical tests and to calculate the associated probabilities. Finally, it presented the data in summary tables.
This appendix is a listing of the FORTRAN source program. The version of the listing given here is for 3 X 3 tables when the sample size is ten.
      PROGRAM XTAB
C
C***************************************************
C
C     COMPARE CORRECTIONS TO CHI-SQUARED FOR 3X3, 3X4,
C     AND 3X5 CONTINGENCY TABLES USING SAMPLES
C     OF SIZES 10, 20, AND 30.
C
C***************************************************
C
      REAL U(14), Y(14), C(14), S(14), M(14), RKMIN(14),
     X     RKMAX(14), RYMIN(14), RYMAX(14), RCMIN(14),
     X     RCMAX(14), RSMIN(14), RSMAX(14), RMMIN(14),
     X     RMMAX(14), RKSUM(14), RCSUM(14), RSSUM(14), RMSUM(14),
     X     RYSUM(14), EXVAL(3,3)
      CHARACTER PRN
      DIMENSION MATEMP(3,3), NT(14)
      DIMENSION MATR(4,4), NROWT(3), NCOLT(3), MATRIX(3,3)
      COMMON /RAND/ IX,IY,IZ
      COMMON ISEED
      ICOUNT=0
      ISEED=1733
      IX=8351
      IY=3317
      IZ=1773
      OPEN(2,FILE='PRN')
      WRITE(2,8100)
      WRITE(2,8200)
      NROW=3
      NCOL=3
      N=10
      DO 50 MM=1,14
      RKMIN(MM)=9.0
      RKMAX(MM)=0.0
      RKSUM(MM)=0.0
      RCMIN(MM)=9.0
      RCMAX(MM)=0.0
      RCSUM(MM)=0.0
      RSMIN(MM)=9.0
      RSMAX(MM)=0.0
      RSSUM(MM)=0.0
      RMMIN(MM)=9.0
      RMMAX(MM)=0.0
      RMSUM(MM)=0.0
      RYMIN(MM)=9.0
      RYMAX(MM)=0.0
      RYSUM(MM)=0.0
      NT(MM)=0
   50 CONTINUE
      DO 8000 NTABL=1,2500
      ICOUNT=ICOUNT+1
      CALL MART(N,NROW,NCOL,NROWT,NCOLT)
      CALL EV(N,NROWT,NCOLT,NCOL,EXVAL,EVMIN)
      CALL RCONT2(NROW,NCOL,NROWT,NCOLT,JWORK,MATRIX,KEY,IFAULT)
      WRITE(*,60) ICOUNT
   60 FORMAT(3X,'ICOUNT = ',I5)
      CALL CHISQ(MATRIX,EXVAL,NCOL,XSQ)
      CALL PVAL(NCOL,XSQ,PK)
      DO 200 I=1,3
      DO 100 J=1,NCOL
      MATR(I,J)=MATRIX(I,J)
  100 CONTINUE
  200 CONTINUE
      DO 400 I=1,3
      DO 300 J=1,NCOL
      MATR(I,NCOL+1)=NROWT(I)
      MATR(4,J)=NCOLT(J)
  300 CONTINUE
  400 CONTINUE
      NR=NROW+1
      NC=NCOL+1
      MATR(4,NC)=N
      CALL RXCPRB(MATR,NR,NC,NCOL,EXVAL,PT,PS,PM,MATEMP,PY,
     X            MAFX)
      CALL COCHR(MATEMP,EXVAL,XSQ,NCOL,PCC,PCS)
C
      IF (PS .GT. 0.5) GOTO 3000
      IF (EVMIN .LT. 0.2) THEN
      NT(1)=NT(1)+1
      ICAT=1
      CALL RATIOS(PK,PS,RKMIN(1),RKMAX(1),RKSUM(1),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(1),RCMAX(1),RCSUM(1),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(1),RSMAX(1),RSSUM(1),ICAT)
      CALL RATIOS(PM,PS,RMMIN(1),RMMAX(1),RMSUM(1),ICAT)
      CALL RATIOS(PY,PS,RYMIN(1),RYMAX(1),RYSUM(1),ICAT)
      ELSEIF (EVMIN .GE. 0.2 .AND. EVMIN .LT. 0.3) THEN
      NT(2)=NT(2)+1
      ICAT=2
      CALL RATIOS(PK,PS,RKMIN(2),RKMAX(2),RKSUM(2),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(2),RCMAX(2),RCSUM(2),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(2),RSMAX(2),RSSUM(2),ICAT)
      CALL RATIOS(PM,PS,RMMIN(2),RMMAX(2),RMSUM(2),ICAT)
      CALL RATIOS(PY,PS,RYMIN(2),RYMAX(2),RYSUM(2),ICAT)
      ELSEIF (EVMIN .GE. 0.3 .AND. EVMIN .LT. 0.4) THEN
      NT(3)=NT(3)+1
      ICAT=3
      CALL RATIOS(PK,PS,RKMIN(3),RKMAX(3),RKSUM(3),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(3),RCMAX(3),RCSUM(3),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(3),RSMAX(3),RSSUM(3),ICAT)
      CALL RATIOS(PM,PS,RMMIN(3),RMMAX(3),RMSUM(3),ICAT)
      CALL RATIOS(PY,PS,RYMIN(3),RYMAX(3),RYSUM(3),ICAT)
      ELSEIF (EVMIN .GE. 0.4 .AND. EVMIN .LT. 0.5) THEN
      NT(4)=NT(4)+1
      ICAT=4
      CALL RATIOS(PK,PS,RKMIN(4),RKMAX(4),RKSUM(4),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(4),RCMAX(4),RCSUM(4),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(4),RSMAX(4),RSSUM(4),ICAT)
      CALL RATIOS(PM,PS,RMMIN(4),RMMAX(4),RMSUM(4),ICAT)
      CALL RATIOS(PY,PS,RYMIN(4),RYMAX(4),RYSUM(4),ICAT)
      ELSEIF (EVMIN .GE. 0.5 .AND. EVMIN .LT. 0.6) THEN
      NT(5)=NT(5)+1
      ICAT=5
      CALL RATIOS(PK,PS,RKMIN(5),RKMAX(5),RKSUM(5),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(5),RCMAX(5),RCSUM(5),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(5),RSMAX(5),RSSUM(5),ICAT)
      CALL RATIOS(PM,PS,RMMIN(5),RMMAX(5),RMSUM(5),ICAT)
      CALL RATIOS(PY,PS,RYMIN(5),RYMAX(5),RYSUM(5),ICAT)
      ELSEIF (EVMIN .GE. 0.6 .AND. EVMIN .LT. 0.7) THEN
      NT(6)=NT(6)+1
      ICAT=6
      CALL RATIOS(PK,PS,RKMIN(6),RKMAX(6),RKSUM(6),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(6),RCMAX(6),RCSUM(6),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(6),RSMAX(6),RSSUM(6),ICAT)
      CALL RATIOS(PM,PS,RMMIN(6),RMMAX(6),RMSUM(6),ICAT)
      CALL RATIOS(PY,PS,RYMIN(6),RYMAX(6),RYSUM(6),ICAT)
      ELSEIF (EVMIN .GE. 0.7) THEN
      NT(7)=NT(7)+1
      ICAT=7
      CALL RATIOS(PK,PS,RKMIN(7),RKMAX(7),RKSUM(7),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(7),RCMAX(7),RCSUM(7),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(7),RSMAX(7),RSSUM(7),ICAT)
      CALL RATIOS(PM,PS,RMMIN(7),RMMAX(7),RMSUM(7),ICAT)
      CALL RATIOS(PY,PS,RYMIN(7),RYMAX(7),RYSUM(7),ICAT)
      ENDIF
      GOTO 8000
C
C     HERE IF EXACT PROBABILITY .GT. 0.5
C
 3000 IF (EVMIN .LT. 0.2) THEN
      NT(8)=NT(8)+1
      ICAT=8
      CALL RATIOS(PK,PS,RKMIN(8),RKMAX(8),RKSUM(8),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(8),RCMAX(8),RCSUM(8),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(8),RSMAX(8),RSSUM(8),ICAT)
      CALL RATIOS(PM,PS,RMMIN(8),RMMAX(8),RMSUM(8),ICAT)
      CALL RATIOS(PY,PS,RYMIN(8),RYMAX(8),RYSUM(8),ICAT)
      ELSEIF (EVMIN .GE. 0.2 .AND. EVMIN .LT. 0.3) THEN
      NT(9)=NT(9)+1
      ICAT=9
      CALL RATIOS(PK,PS,RKMIN(9),RKMAX(9),RKSUM(9),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(9),RCMAX(9),RCSUM(9),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(9),RSMAX(9),RSSUM(9),ICAT)
      CALL RATIOS(PM,PS,RMMIN(9),RMMAX(9),RMSUM(9),ICAT)
      CALL RATIOS(PY,PS,RYMIN(9),RYMAX(9),RYSUM(9),ICAT)
      ELSEIF (EVMIN .GE. 0.3 .AND. EVMIN .LT. 0.4) THEN
      NT(10)=NT(10)+1
      ICAT=10
      CALL RATIOS(PK,PS,RKMIN(10),RKMAX(10),RKSUM(10),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(10),RCMAX(10),RCSUM(10),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(10),RSMAX(10),RSSUM(10),ICAT)
      CALL RATIOS(PM,PS,RMMIN(10),RMMAX(10),RMSUM(10),ICAT)
      CALL RATIOS(PY,PS,RYMIN(10),RYMAX(10),RYSUM(10),ICAT)
      ELSEIF (EVMIN .GE. 0.4 .AND. EVMIN .LT. 0.5) THEN
      NT(11)=NT(11)+1
      ICAT=11
      CALL RATIOS(PK,PS,RKMIN(11),RKMAX(11),RKSUM(11),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(11),RCMAX(11),RCSUM(11),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(11),RSMAX(11),RSSUM(11),ICAT)
      CALL RATIOS(PM,PS,RMMIN(11),RMMAX(11),RMSUM(11),ICAT)
      CALL RATIOS(PY,PS,RYMIN(11),RYMAX(11),RYSUM(11),ICAT)
      ELSEIF (EVMIN .GE. 0.5 .AND. EVMIN .LT. 0.6) THEN
      NT(12)=NT(12)+1
      ICAT=12
      CALL RATIOS(PK,PS,RKMIN(12),RKMAX(12),RKSUM(12),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(12),RCMAX(12),RCSUM(12),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(12),RSMAX(12),RSSUM(12),ICAT)
      CALL RATIOS(PM,PS,RMMIN(12),RMMAX(12),RMSUM(12),ICAT)
      CALL RATIOS(PY,PS,RYMIN(12),RYMAX(12),RYSUM(12),ICAT)
      ELSEIF (EVMIN .GE. 0.6 .AND. EVMIN .LT. 0.7) THEN
      NT(13)=NT(13)+1
      ICAT=13
      CALL RATIOS(PK,PS,RKMIN(13),RKMAX(13),RKSUM(13),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(13),RCMAX(13),RCSUM(13),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(13),RSMAX(13),RSSUM(13),ICAT)
      CALL RATIOS(PM,PS,RMMIN(13),RMMAX(13),RMSUM(13),ICAT)
      CALL RATIOS(PY,PS,RYMIN(13),RYMAX(13),RYSUM(13),ICAT)
      ELSEIF (EVMIN .GE. 0.7) THEN
      NT(14)=NT(14)+1
      ICAT=14
      CALL RATIOS(PK,PS,RKMIN(14),RKMAX(14),RKSUM(14),ICAT)
      CALL RATIOS(PCC,PS,RCMIN(14),RCMAX(14),RCSUM(14),ICAT)
      CALL RATIOS(PCS,PS,RSMIN(14),RSMAX(14),RSSUM(14),ICAT)
      CALL RATIOS(PM,PS,RMMIN(14),RMMAX(14),RMSUM(14),ICAT)
      CALL RATIOS(PY,PS,RYMIN(14),RYMAX(14),RYSUM(14),ICAT)
      ENDIF
 8000 CONTINUE
      DO 8080 L=1,14
      IF (NT(L) .EQ. 0) GOTO 8080
      U(L)=RKSUM(L)/NT(L)
      Y(L)=RYSUM(L)/NT(L)
      C(L)=RCSUM(L)/NT(L)
      S(L)=RSSUM(L)/NT(L)
      M(L)=RMSUM(L)/NT(L)
 8080 CONTINUE
 8100 FORMAT('1',20X,'Range of the Smallest Expected Frequency, e')
 8200 FORMAT(7X,'METHOD',4X,'0.1=<e<0.2',7X,'0.2=<e<0.3',7X,
     X       '0.3=<e<0.4',7X,'0.4=<e<0.5')
 8300 FORMAT('1',10X,'a. Exact Significance Probability Less Than
     X or Equal To 0.5')
 8350 FORMAT('1',10X,'b. Exact Significance Probability Greater
     X Than 0.5')
 8400 FORMAT(2X,'N=',I2,12X,'T=',I4,11X,'T=',I4,11X,
     X       'T=',I4,11X,'T=',I4)
 8500 FORMAT(10X,A1,3X,F4.2,'(',F4.2,'-',F4.2,')',2X,F4.2,'(',
     X       F4.2,'-',F4.2,')',2X,F4.2,'(',F4.2,'-',F4.2,')',
     X       2X,F4.2,'(',F4.2,'-',F4.2,')')
 8600 FORMAT('1',18X,'0.5=<e<0.6',7X,'0.6=<e<0.7',10X,'e>0.7')
      WRITE(2,8300)
      DO 8700 I=1,8,7
      WRITE(2,8400) N,NT(I),NT(I+1),NT(I+2),NT(I+3)
      WRITE(2,8500) 'U',U(I),RKMIN(I),RKMAX(I),U(I+1),
     X       RKMIN(I+1),RKMAX(I+1),U(I+2),RKMIN(I+2),
     X       RKMAX(I+2),U(I+3),RKMIN(I+3),RKMAX(I+3)
      WRITE(2,8500) 'Y',Y(I),RYMIN(I),RYMAX(I),Y(I+1),
     X       RYMIN(I+1),RYMAX(I+1),Y(I+2),RYMIN(I+2),
     X       RYMAX(I+2),Y(I+3),RYMIN(I+3),RYMAX(I+3)
      WRITE(2,8500) 'C',C(I),RCMIN(I),RCMAX(I),C(I+1),
     X       RCMIN(I+1),RCMAX(I+1),C(I+2),RCMIN(I+2),
     X       RCMAX(I+2),C(I+3),RCMIN(I+3),RCMAX(I+3)
      WRITE(2,8500) 'S',S(I),RSMIN(I),RSMAX(I),S(I+1),
     X       RSMIN(I+1),RSMAX(I+1),S(I+2),RSMIN(I+2),
     X       RSMAX(I+2),S(I+3),RSMIN(I+3),RSMAX(I+3)
      WRITE(2,8500) 'M',M(I),RMMIN(I),RMMAX(I),M(I+1),
     X       RMMIN(I+1),RMMAX(I+1),M(I+2),RMMIN(I+2),
     X       RMMAX(I+2),M(I+3),RMMIN(I+3),RMMAX(I+3)
      WRITE(2,8600)
      WRITE(2,8400) N,NT(I+4),NT(I+5),NT(I+6)
      WRITE(2,8500) 'U',U(I+4),RKMIN(I+4),RKMAX(I+4),
     X       U(I+5),RKMIN(I+5),RKMAX(I+5),U(I+6),
     X       RKMIN(I+6),RKMAX(I+6)
      WRITE(2,8500) 'Y',Y(I+4),RYMIN(I+4),RYMAX(I+4),
     X       Y(I+5),RYMIN(I+5),RYMAX(I+5),Y(I+6),
     X       RYMIN(I+6),RYMAX(I+6)
      WRITE(2,8500) 'C',C(I+4),RCMIN(I+4),RCMAX(I+4),
     X       C(I+5),RCMIN(I+5),RCMAX(I+5),C(I+6),
     X       RCMIN(I+6),RCMAX(I+6)
      WRITE(2,8500) 'S',S(I+4),RSMIN(I+4),RSMAX(I+4),
     X       S(I+5),RSMIN(I+5),RSMAX(I+5),S(I+6),
     X       RSMIN(I+6),RSMAX(I+6)
      WRITE(2,8500) 'M',M(I+4),RMMIN(I+4),RMMAX(I+4),
     X       M(I+5),RMMIN(I+5),RMMAX(I+5),M(I+6),
     X       RMMIN(I+6),RMMAX(I+6)
      WRITE(2,8350)
 8700 CONTINUE
      STOP
      END
C
C*************************************************************
C
      SUBROUTINE MATFIX(MATR,NCOL,NC,MAFX)
C
C     DELETES ROW/COL TOTALS FROM MATRIX
C
      DIMENSION MATR(4,NC), MAFX(3,NCOL)
      DO 20 I=1,3
      DO 10 J=1,NCOL
      MAFX(I,J)=MATR(I,J)
   10 CONTINUE
   20 CONTINUE
      RETURN
      END
APPENDIX B
RANDOM NUMBER GENERATOR ROUTINES
The FORTRAN subprogram listings for IRAND and RANDOM are contained in this appendix. Complete descriptions of the use of these two random number generators are given in Chapter III. Their basic structures are also described there.
C
C     FUNCTION IRAND
C     RANDOM INTEGER FUNCTION
C
      INTEGER FUNCTION IRAND(IBEG,ITER,I)
      INTEGER I,L,K,P
      REAL X
      DATA L,K,P /5243,55397,262139/
      I=MOD(I*L+K,P)
      X=(REAL(I)+0.5)/REAL(P)
      IRAND=X*(ITER-IBEG+1)+IBEG
      RETURN
      END
C
C***********************************************************
C
      FUNCTION RANDOM(L)
C
C     ALGORITHM AS 183  APPL. STATIST. (1982) VOL.31, NO.2
C
C     RETURNS A PSEUDO-RANDOM NUMBER RECTANGULARLY DISTRIBUTED
C     BETWEEN 0 AND 1.
C
C     IX, IY AND IZ SHOULD BE SET TO INTEGER VALUES BETWEEN
C     1 AND 30000 BEFORE FIRST ENTRY.
C
C     INTEGER ARITHMETIC UP TO 30323 IS REQUIRED
C
COMMON /RAND/IX, IY, IZ
      IX = 171*MOD(IX,177) - 2*(IX/177)
      IY = 172*MOD(IY,176) - 35*(IY/176)
      IZ = 170*MOD(IZ,178) - 63*(IZ/178)
C
      IF (IX .LT. 0) IX = IX + 30269
      IF (IY .LT. 0) IY = IY + 30307
      IF (IZ .LT. 0) IZ = IZ + 30323
C
      RANDOM = AMOD(FLOAT(IX) / 30269.0 + FLOAT(IY) / 30307.0 +
     #         FLOAT(IZ) / 30323.0, 1.0)
      IF (RANDOM .GT. 0.0) RETURN
      RANDOM = DMOD(DBLE(FLOAT(IX))/30269.0D0 +
     #         DBLE(FLOAT(IY))/30307.0D0 +
     #         DBLE(FLOAT(IZ))/30323.0D0, 1.0D0)
      IF (RANDOM .GE. 1.0) RANDOM = 0.999999
      RETURN
      END
C
C***************************************************************
C
APPENDIX C
SUBROUTINE MART
Subroutine MART randomly selects marginal totals for a contingency table, given the table's dimensions and the sample size. MART accesses random number generator IRAND when it is executed.
      SUBROUTINE MART(N,NROW,NCOL,NROWT,NCOLT)
C
C     RANDOMLY SELECT MARGINAL TOTALS FOR A MATRIX.
C     INPUTS:
C       N=SAMPLE SIZE
C       NROW=NUMBER OF ROWS
C       NCOL=NUMBER OF COLUMNS
C     OUTPUTS:
C       NROWT=VECTOR OF ROW TOTALS
C       NCOLT=VECTOR OF COLUMN TOTALS
C     EXTERNALS:
C       IRAND(IBEG,ITER,ISEED)  FUNCTION WHICH RETURNS A
C       RANDOM INTEGER BETWEEN IBEG & ITER.
C
      DIMENSION NROWT(NROW), NCOLT(NCOL)
      COMMON ISEED
C
C     NUMBER OF ROWS IS ALWAYS 3.  FIND ROW TOTALS FIRST.
C
      NROWT(1)=IRAND(1,N-2,ISEED)
      NROWT(2)=IRAND(1,N-1-NROWT(1),ISEED)
      NROWT(3)=N-NROWT(1)-NROWT(2)
C
C     CHOOSE COLUMN TOTALS.
C
      NCOLT(1)=IRAND(1,N-(NCOL-1),ISEED)
      NCOLT(2)=IRAND(1,N-(NCOL-2)-NCOLT(1),ISEED)
      IF (NCOL.EQ.3) THEN
      NCOLT(3)=N-NCOLT(1)-NCOLT(2)
      RETURN
      ELSE
      ITER=N-(NCOL-3)-NCOLT(1)-NCOLT(2)
      NCOLT(3)=IRAND(1,ITER,ISEED)
      ENDIF
      IF (NCOL.EQ.4) THEN
      NCOLT(4)=N-NCOLT(1)-NCOLT(2)-NCOLT(3)
      RETURN
      ELSE
      ITER=N-(NCOL-4)-NCOLT(1)-NCOLT(2)-NCOLT(3)
      NCOLT(4)=IRAND(1,ITER,ISEED)
      ENDIF
      NCOLT(5)=N-NCOLT(1)-NCOLT(2)-NCOLT(3)-NCOLT(4)
      RETURN
      END
C
C********************************************************
C
APPENDIX D
SUBROUTINE EV
Expected values for the simulated contingency table are computed by this subroutine. The maximum likelihood formula is employed. The subroutine must be passed the marginal totals and the sample size.
      SUBROUTINE EV(N,NROWT,NCOLT,NCOL,EXVAL,EVMIN)
C
C     CALCULATE EXPECTED VALUES AND DETERMINE THE MINIMUM
C     EXPECTATION.  THERE ARE ALWAYS 3 ROWS.
C     INPUTS:
C       N=SAMPLE SIZE
C       NROWT=VECTOR OF ROW TOTALS
C       NCOLT=VECTOR OF COLUMN TOTALS
C       NCOL=NUMBER OF COLUMNS (3-5)
C     OUTPUTS:
C       EXVAL=MATRIX OF EXPECTED VALUES
C       EVMIN=MINIMUM EXPECTED VALUE
C
      DIMENSION EXVAL(3,NCOL), NROWT(3), NCOLT(NCOL)
      EVMIN=40.0
      DO 200, L=1,3
      DO 100, M=1,NCOL
      RW=REAL(NROWT(L))
      CL=REAL(NCOLT(M))
      SS=REAL(N)
      EXVAL(L,M)=RW*CL/SS
      IF (EXVAL(L,M).LT.EVMIN) EVMIN=EXVAL(L,M)
  100 CONTINUE
  200 CONTINUE
      RETURN
      END
C
C********************************************************
C
APPENDIX E
SUBROUTINE RCONT2
This subroutine was published in Applied Statistics as AS 159 by W. M. Patefield. It accesses random number generator RANDOM as it simulates the observations in the contingency tables.
C C
SUBROUTINE RCONT2{NROW,NCOL,NROWT,NCOLT,JWORK,MATRIX, X KEY,IFAULT)
C C ALGORITHM AS 159 APPL. STATIST. {1981) VOL.30, NO.l C C GENERATE RANDOM TWO-WAY TABLE GIVEN MARGINAL TOTALS
c DIMENSION NROWT(NROW), NCOLT(NCOL), MATRIX(NROW,NCOL) DIMENSION JWORK(NCOL) INTEGER DUMMY REAL FACT(5001) LOGICAL KEY LOGICAL LSP,LSM COMMON /B/ NTOTAL, NROWM, NCOLM, FACT COMMON /RAND/ IX,IY,IZ DATA MAXTOT /5000/ DUMMY=0
C
      IFAULT = 0
      IF (KEY) GOTO 103
C
C     SET KEY FOR SUBSEQUENT CALLS
C
      KEY = .TRUE.
C
C     CHECK FOR FAULTS AND PREPARE FOR FUTURE CALLS
C
      IF (NROW .LE. 1) GOTO 212
      IF (NCOL .LE. 1) GOTO 213
      NROWM = NROW - 1
      NCOLM = NCOL - 1
      DO 100 I = 1, NROW
      IF (NROWT(I) .LE. 0) GOTO 214
  100 CONTINUE
      NTOTAL = 0
      DO 101 J = 1, NCOL
      IF (NCOLT(J) .LE. 0) GOTO 215
      NTOTAL = NTOTAL + NCOLT(J)
  101 CONTINUE
      IF (NTOTAL .GT. MAXTOT) GOTO 216
C
C     CALCULATE LOG-FACTORIALS
C
      X = 0.0
      FACT(1) = 0.0
      DO 102 I = 1, NTOTAL
      X = X + ALOG(FLOAT(I))
      FACT(I+1) = X
  102 CONTINUE
C
C     CONSTRUCT RANDOM MATRIX
C
  103 DO 105 J = 1, NCOLM
  105 JWORK(J) = NCOLT(J)
      JC = NTOTAL
C
      DO 190 L = 1, NROWM
      NROWTL = NROWT(L)
      IA = NROWTL
      IC = JC
      JC = JC - NROWTL
      DO 180 M = 1, NCOLM
      ID = JWORK(M)
      IE = IC
      IC = IC - ID
      IB = IE - IA
      II = IB - ID
C
C     TEST FOR ZERO ENTRIES IN MATRIX
C
      IF (IE .NE. 0) GOTO 130
      DO 121 J = M, NCOL
  121 MATRIX(L,J) = 0
      GOTO 190
C
C     GENERATE PSEUDO-RANDOM NUMBER
C
  130 RAND = RANDOM(DUMMY)
C
C     COMPUTE CONDITIONAL EXPECTED VALUE OF MATRIX(L,M)
C
  131 NLM = FLOAT(IA*ID) / FLOAT(IE) + 0.5
      IAP = IA + 1
      IDP = ID + 1
      IGP = IDP - NLM
      IHP = IAP - NLM
      NLMP = NLM + 1
      IIP = II + NLMP
      X = EXP(FACT(IAP) + FACT(IB+1) + FACT(IC+1) + FACT(IDP)
     X    - FACT(IE+1) - FACT(NLMP) - FACT(IGP) - FACT(IHP)
     X    - FACT(IIP))
      IF (X .GE. RAND) GOTO 160
      SUMPRB = X
      Y = X
      NLL = NLM
      LSP = .FALSE.
      LSM = .FALSE.
C
C     INCREMENT ENTRY IN ROW L, COLUMN M
C
  140 J = (ID - NLM) * (IA - NLM)
      IF (J .EQ. 0) GOTO 156
      NLM = NLM + 1
      X = X * FLOAT(J) / FLOAT(NLM * (II + NLM))
      SUMPRB = SUMPRB + X
      IF (SUMPRB .GE. RAND) GOTO 160
  150 IF (LSM) GOTO 155
C
C     DECREMENT ENTRY IN ROW L, COLUMN M
C
      J = NLL * (II + NLL)
      IF (J .EQ. 0) GOTO 154
      NLL = NLL - 1
      Y = Y * FLOAT(J) / FLOAT((ID - NLL) * (IA - NLL))
      SUMPRB = SUMPRB + Y
      IF (SUMPRB .GE. RAND) GOTO 159
      IF (.NOT. LSP) GOTO 140
      GOTO 150
  154 LSM = .TRUE.
  155 IF (.NOT. LSP) GOTO 140
      RAND = SUMPRB * RANDOM(DUMMY)
      GOTO 131
  156 LSP = .TRUE.
      GOTO 150
  159 NLM = NLL
  160 MATRIX(L,M) = NLM
      IA = IA - NLM
      JWORK(M) = JWORK(M) - NLM
  180 CONTINUE
      MATRIX(L,NCOL) = IA
  190 CONTINUE
C
C     COMPUTE ENTRIES IN LAST ROW OF MATRIX
C
      DO 192 M = 1, NCOLM
  192 MATRIX(NROW,M) = JWORK(M)
      MATRIX(NROW,NCOL) = IB - MATRIX(NROW,NCOLM)
      RETURN
C
C     SET FAULTS
C
  212 IFAULT = 1
      RETURN
  213 IFAULT = 2
      RETURN
  214 IFAULT = 3
      RETURN
  215 IFAULT = 4
      RETURN
  216 IFAULT = 5
      RETURN
      END
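The fragment below, again not part of the original program, sketches one way RCONT2 might be invoked; the marginal totals and seed values are illustrative. KEY must be .FALSE. on the first call so that the table of log-factorials is constructed, and the function RANDOM listed earlier must be linked in:

C     ILLUSTRATIVE CALL OF RCONT2 (NOT IN THE ORIGINAL PROGRAM).
      PROGRAM TRYRC2
      INTEGER NROWT(3), NCOLT(4), JWORK(4), MATRIX(3,4)
      REAL FACT(5001)
      LOGICAL KEY
      COMMON /B/ NTOTAL, NROWM, NCOLM, FACT
      COMMON /RAND/ IX, IY, IZ
      DATA NROWT /8, 7, 5/
      DATA NCOLT /6, 6, 4, 4/
C     ARBITRARY SEEDS FOR THE AS 183 GENERATOR
      IX = 1714
      IY = 2061
      IZ = 9999
      KEY = .FALSE.
      CALL RCONT2(3, 4, NROWT, NCOLT, JWORK, MATRIX, KEY, IFAULT)
      IF (IFAULT .NE. 0) WRITE (*,*) 'FAULT CODE', IFAULT
      END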
C
C **************************************************************
C
APPENDIX F
SUBROUTINE CHISQ
The subroutine CHISQ calculates the chi-squared statistic for a contingency table, given the table and the corresponding table of expected frequencies.
      SUBROUTINE CHISQ(MATRIX, EXVAL, NCOL, XSQ)
C
C     CALCULATE CHI-SQUARED ESTIMATE FOR OBSERVED TABLE.
C     INPUTS:
C       MATRIX = THE OBSERVED CONTINGENCY TABLE
C       EXVAL  = THE EXPECTED VALUES
C       NCOL   = NUMBER OF COLUMNS
C     OUTPUTS:
C       XSQ    = ESTIMATE OF CHI-SQUARED
C
      DIMENSION MATRIX(3,NCOL), EXVAL(3,NCOL)
C
C     CALCULATE CHI-SQUARED STATISTIC
C
      XSQ = 0.0
      DO 200, I = 1, 3
      DO 100, J = 1, NCOL
      DIFF = FLOAT(MATRIX(I,J)) - EXVAL(I,J)
      XSQ = XSQ + DIFF**2 / EXVAL(I,J)
  100 CONTINUE
  200 CONTINUE
      RETURN
      END
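For a single cell with an observed frequency of 6 and an expected value of 4.0 (illustrative figures), the contribution accumulated into XSQ is

      (6.0 - 4.0)**2 / 4.0 = 1.0

and the statistic is the sum of this quantity over all 3 X NCOL cells.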
C
C **************************************************************
C
APPENDIX G
SUBROUTINE PVAL
This appendix lists the FORTRAN translation of a BASIC program designed to calculate the probability value associated with a chi-squared test statistic. The BASIC program was published in the book Some Common Basic Programs.
      SUBROUTINE PVAL(NCOL, XSQ, PK)
C
C     CALCULATE P-VALUE FOR A GIVEN CHI-SQUARED STATISTIC.
C
C     INPUTS:
C       NCOL = NUMBER OF COLUMNS IN TABLE
C       XSQ  = OBSERVED CHI-SQUARED ESTIMATE
C     OUTPUTS:
C       PK   = PROBABILITY ASSOCIATED WITH XSQ
C
      NDF = 2 * (NCOL - 1)
      PD = 1
      DO 300, I = NDF, 2, -2
      PD = PD * I
  300 CONTINUE
      PN = XSQ**(INT((NDF+1)/2)) * EXP(-XSQ/2) / PD
      IF (INT(NDF/2) .EQ. (FLOAT(NDF)/2)) GOTO 400
      F = SQRT(2 / (XSQ * 3.14159265359))
      GOTO 500
  400 F = 1.0
  500 G = 1.0
      H = 1.0
  550 NDF = NDF + 2
      H = H * XSQ / NDF
      IF (H .LT. 0.00000001) GOTO 600
      G = G + H
      GOTO 550
  600 PK = 1.0 - F * PN * G
      RETURN
      END
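As a check on the translation (9.488 is the standard upper 5 percent point of chi-squared with four degrees of freedom, not output from the study), a 3 X 3 table has 2 X (3 - 1) = 4 degrees of freedom, so the call

      CALL PVAL(3, 9.488, PK)

should return PK near 0.05.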
C
C **************************************************************
C
APPENDIX H
SUBROUTINE RXCPRB
Calculation of Fisher's exact probability for a given contingency table is the function of the subroutine listed here. It is a modification of a FORTRAN subroutine published in Communications of the Association for Computing Machinery by T. W. Hancock, who had in turn modified an earlier version, Algorithm 434, published in the same periodical by David L. March.
The modifications added here permit evaluation of the chi-squared statistic using the continuity correction techniques suggested by Cochran and Mantel.
      SUBROUTINE RXCPRB(MATR, NR, NC, NCOL, EXVAL, PT, PS, PM,
     X                  MATEMP, PY, MAFX)
C
C     COMPUTES EXACT PROBABILITY OF R X C CONTINGENCY TABLE.
C
C     INPUTS:
C       MATR   = THE OBSERVED TABLE
C       NR     = NUMBER OF ROWS IN THE MATRIX
C       NC     = NUMBER OF COLUMNS IN THE MATRIX
C       EXVAL  = THE EXPECTED VALUES
C     OUTPUTS:
C       PT     = PROBABILITY OF THE OBSERVED TABLE
C       PS     = PROBABILITY OF A TABLE AS OR LESS PROBABLE
C                THAN THE OBSERVED TABLE
C       PM     = PROBABILITY USING MANTEL'S CORRECTION
C       MATEMP = MATRIX USED IN COCHRAN'S AND MANTEL'S METHODS
C     EXTERNALS:
C       INIT   = SUBROUTINE WHICH RETURNS THE NEXT MATRIX TO
C                SATISFY THE MARGINALS
C       FACLOG = FUNCTION TO RETURN THE LOG OF A FACTORIAL
C       YATES  = SUBROUTINE USED IN MANTEL'S CORRECTION
C
      DIMENSION MATR(NR,NC), MATEMP(3,NCOL), EXVAL(3,NCOL),
     X          MAFX(3,NCOL)
      INTEGER R, C
      R = NR - 1
      C = NC - 1
      PCH = 1.0
      PYP = 0.0
C     COMPUTE LOG OF CONSTANT NUMERATOR
      QXLOG = -FACLOG(MATR(NR,NC))
      DO 10 I = 1, R
      QXLOG = QXLOG + FACLOG(MATR(I,NC))
   10 CONTINUE
      DO 20 J = 1, C
      QXLOG = QXLOG + FACLOG(MATR(NR,J))
   20 CONTINUE
C     COMPUTE PROBABILITY OF THE GIVEN TABLE.
      RXLOG = 0.0
      DO 40 I = 1, R
      DO 30 J = 1, C
      RXLOG = RXLOG + FACLOG(MATR(I,J))
   30 CONTINUE
   40 CONTINUE
      PT = 10.0**(QXLOG - RXLOG)
      PS = 0.0
      CALL MATFIX(MATR, NCOL, NC, MAFX)
      CALL YATES(MAFX, EXVAL, C, PY)
C     ALL CELL VALUES INITIALLY SET TO ZERO
      DO 60 I = 1, R
      DO 50 J = 1, C
      MATR(I,J) = 0
   50 CONTINUE
   60 CONTINUE
C     EACH CYCLE STARTS HERE
   70 KEY = 1
      MATR(2,2) = -1
C     GENERATING SET OF FREQUENCIES PROGRESSIVELY IN
C     LOWER RIGHT (R-1) X (C-1) CELLS.
      DO 160 I = 2, R
      DO 150 J = 2, C
      MATR(I,J) = MATR(I,J) + 1
C     CHECKING SUMMATIONS .LE. RESPECTIVE MARGINALS
      ISUM = 0
      JSUM = 0
      DO 80 M = J, C
      ISUM = ISUM + MATR(I,M)
   80 CONTINUE
      IF (ISUM .GT. MATR(I,NC)) GOTO 130
      DO 90 K = I, R
      JSUM = JSUM + MATR(K,J)
   90 CONTINUE
      IF (JSUM .GT. MATR(NR,J)) GOTO 130
C     JUMP TO STMT 170 WHERE ALL CELLS PRIOR TO MATR(I,J)
C     ARE SET TO ZERO.
      IF (KEY .EQ. 2) GOTO 170
      IP = I
      JP = J
C     CALL SUBR INIT TO FIND NEXT BALANCED MATRIX
      CALL INIT(MATR, NR, NC)
C     COMPUTE LOG OF THE DENOMINATOR
      RXLOG = 0.0
      DO 110 K = 1, R
      DO 100 M = 1, C
      RXLOG = RXLOG + FACLOG(MATR(K,M))
  100 CONTINUE
  110 CONTINUE
      CALL MATFIX(MATR, NCOL, NC, MAFX)
      CALL YATES(MAFX, EXVAL, C, PYY)
      IF ((PYY .LE. PY) .AND. (PYY .GT. PYP)) PYP = PYY
C     COMPUTE PX.  ADD TO PS IF PX .LE. PT
      PX = 10.0**(QXLOG - RXLOG)
      IF ((PT/PX) .GT. 0.99999) THEN
         PS = PS + PX
      ELSEIF (PX .LT. PCH) THEN
         PCH = PX
         DO 117 ITM = 1, R
         DO 115 JTM = 1, C
         MATEMP(ITM,JTM) = MATR(ITM,JTM)
  115    CONTINUE
  117    CONTINUE
      ENDIF
C     IF POSSIBLE, A SEQUENCE OF MATRICES AND
C     ASSOCIATED PROBABILITIES ARE GENERATED
  120 IF (MATR(1,2) .LT. 1 .OR. MATR(2,1) .LT. 1) GOTO 140
      MATR(1,1) = MATR(1,1) + 1
      MATR(2,2) = MATR(2,2) + 1
      PX = PX * FLOAT(MATR(1,2)) * FLOAT(MATR(2,1))
     X     / FLOAT(MATR(1,1)) / FLOAT(MATR(2,2))
      CALL MATFIX(MATR, NCOL, NC, MAFX)
      CALL YATES(MAFX, EXVAL, C, PYY)
      IF ((PYY .LE. PY) .AND. (PYY .GT. PYP)) PYP = PYY
      IF ((PT/PX) .GT. 0.99999) THEN
         PS = PS + PX
      ELSEIF (PX .LT. PCH) THEN
         PCH = PX
         DO 127 ITM = 1, R
         DO 123 JTM = 1, C
         MATEMP(ITM,JTM) = MATR(ITM,JTM)
  123    CONTINUE
  127    CONTINUE
      ENDIF
      MATR(1,2) = MATR(1,2) - 1
      MATR(2,1) = MATR(2,1) - 1
      GOTO 120
  130 IP = I
      JP = J
C     SET KEY TO 2 AS CYCLE COMPLETED
  140 KEY = 2
  150 CONTINUE
  160 CONTINUE
      PM = (PY + PYP) / 2
      RETURN
C     ALL CELLS OF MATR PRIOR TO THE (I,J)TH ARE SET TO 0
  170 DO 180 M = 2, JP
      MATR(IP,M) = 0
  180 CONTINUE
      IP = IP - 1
      DO 200 K = 1, IP
      DO 190 M = 2, C
      MATR(K,M) = 0
  190 CONTINUE
  200 CONTINUE
      GOTO 70
      END
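The statement PM = (PY + PYP) / 2 implements Mantel's correction as the average of PY, the Yates-corrected probability for the observed table, and PYP, the largest Yates-corrected probability among the enumerated tables that does not exceed PY. With illustrative figures PY = 0.080 and PYP = 0.060, the correction yields PM = 0.070.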
C
C **************************************************************
C
      SUBROUTINE INIT(MATR, NR, NC)
C
C     RETURNS THE NEXT MATRIX TO SATISFY THE MARGINALS AND THE
C     SEQUENCE OF GENERATION DEFINED IN SUBR RXCPRB.
C
      DIMENSION MATR(NR,NC), MROW(4), MCOL(6)
      INTEGER R, C
      R = NR - 1
      C = NC - 1
C     EQUIVALENCE MROW AND MCOL TO ROW AND COLUMN MARGINALS
      DO 10 K = 1, R
      MATR(K,1) = 0
      MROW(K) = MATR(K,NC)
   10 CONTINUE
      DO 20 M = 1, C
      MCOL(M) = MATR(NR,M)
   20 CONTINUE
C     FOR EACH ROW, SUBTRACT ELEMENTS 2 TO C FROM MROW
      DO 40 K = 2, R
      DO 30 M = 2, C
      MROW(K) = MROW(K) - MATR(K,M)
   30 CONTINUE
   40 CONTINUE
C     FOR EACH COLUMN, SUBTRACT ELEMENTS 2 TO R FROM MCOL
      DO 60 M = 2, C
      DO 50 K = 2, R
      MCOL(M) = MCOL(M) - MATR(K,M)
   50 CONTINUE
   60 CONTINUE
C     FORMING NEXT BALANCED ARRAY
      DO 90 I = 1, R
      IR = NR - I
      DO 80 J = 1, C
      MIN = MIN0(MROW(IR), MCOL(J))
      IF (MIN .EQ. 0) GOTO 70
      MATR(IR,J) = MATR(IR,J) + MIN
      MROW(IR) = MROW(IR) - MIN
      MCOL(J) = MCOL(J) - MIN
   70 IF (MROW(IR) .EQ. 0) GOTO 90
   80 CONTINUE
   90 CONTINUE
      RETURN
      END
C
C **************************************************************
C
      FUNCTION FACLOG(N)
C
C     INPUT:
C       N = INTEGER .GE. ZERO
C     RESULT:
C       FACLOG = BASE 10 LOG OF N FACTORIAL
C
      DIMENSION TABLE(101)
      DATA TPILOG /0.3990899342/
      DATA ELOG /0.4342944819/
      DATA IFLAG /0/
C     USE STIRLING'S APPROXIMATION IF N .GT. 100
      IF (N .GT. 100) GOTO 20
C     LOOK UP ANSWER IF TABLE WAS GENERATED
      IF (IFLAG .EQ. 0) GOTO 30
   10 FACLOG = TABLE(N+1)
      RETURN
C     HERE FOR STIRLING'S APPROXIMATION
   20 X = FLOAT(N)
      FACLOG = (X + 0.5) * ALOG10(X) - X * ELOG + TPILOG
     X         + ELOG / (12.0 * X) - ELOG / (360.0 * X * X * X)
      RETURN
C     HERE TO GENERATE LOG FACTORIAL TABLE
   30 TABLE(1) = 0.0
      DO 40 I = 2, 101
      X = FLOAT(I - 1)
      TABLE(I) = TABLE(I-1) + ALOG10(X)
   40 CONTINUE
      IFLAG = 1
      GOTO 10
      END
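Two arithmetic checks of FACLOG (rounded values; they are standard logarithms, not output from the study): for N = 5 the table branch returns log10(120) = 2.0792, while for N = 200 the Stirling branch returns approximately 374.90, the base 10 logarithm of 200 factorial.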
C
C **************************************************************
C
APPENDIX I
SUBROUTINE COCHR
Cochran's continuity corrections are applied to the contingency tables supplied to this subroutine. It computes the corrected chi-squared statistic for both the C method and the S method.
      SUBROUTINE COCHR(MATEMP, EXVAL, XSQ, NCOL, PCC, PCS)
C
C     CALCULATES CHI-SQUARED CORRECTED BY TWO METHODS ATTRIBUTED
C     TO COCHRAN.  CALLS SUBROUTINE PVAL TWICE.
C     INPUTS:
C       MATEMP = SECOND CONTINGENCY TABLE USED
C       EXVAL  = THE EXPECTED VALUES
C       XSQ    = CHI-SQUARED FOR THE OBSERVED TABLE
C       NCOL   = NUMBER OF COLUMNS
C     OUTPUTS:
C       PCC    = PROBABILITY BY METHOD C
C       PCS    = PROBABILITY BY METHOD S
C     EXTERNALS:
C       SUBROUTINE CHISQ
C       SUBROUTINE PVAL
C
      DIMENSION MATEMP(3,NCOL), EXVAL(3,NCOL)
      CALL CHISQ(MATEMP, EXVAL, NCOL, XSQK)
C
C     S-METHOD
C
      XSQS = (XSQ + XSQK) / 2
      CALL PVAL(NCOL, XSQS, PCS)
C
C     C-METHOD
C
      XO = SQRT(XSQ)
      XC = SQRT(XSQK)
      XSQC = ((XO + XC) / 2)**2
      CALL PVAL(NCOL, XSQC, PCC)
      RETURN
      END
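With illustrative statistics XSQ = 6.25 and XSQK = 4.00, the two corrections give

      XSQS = (6.25 + 4.00) / 2 = 5.125
      XSQC = ((SQRT(6.25) + SQRT(4.00)) / 2)**2 = 5.0625

so the S and C methods produce slightly different corrected statistics, and hence slightly different probabilities, from the same pair of tables.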
C
C **************************************************************
C
APPENDIX J
SUBROUTINE YATES
This appendix contains the FORTRAN listing for the subroutine used to perform Yates' continuity correction for the simulated contingency table.
      SUBROUTINE YATES(MATRIX, EXVAL, NCOL, PY)
C
C     CALCULATES CHI-SQUARED WITH YATES' CORRECTION.
C     CALLS PVAL; CALLED BY RXCPRB.
C     INPUTS:
C       MATRIX = THE OBSERVED CONTINGENCY TABLE
C       EXVAL  = THE EXPECTED VALUES
C       NCOL   = THE COLUMN DIMENSION
C     OUTPUTS:
C       PY     = PROBABILITY ASSOCIATED WITH YATES' CORRECTION
C
      DIMENSION MATRIX(3,NCOL), EXVAL(3,NCOL)
      X = 0.0
      DO 200, I = 1, 3
      DO 100, J = 1, NCOL
      RM = REAL(MATRIX(I,J))
      EM = EXVAL(I,J)
      DIFF = ABS(RM - EM)
      CORR = DIFF - 0.5
      SQD = CORR**2
      ADX = SQD / EM
      X = X + ADX
  100 CONTINUE
  200 CONTINUE
      CALL PVAL(NCOL, X, PY)
      RETURN
      END
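For a single cell with an observed frequency of 6 and an expected value of 4.0 (illustrative figures), the corrected contribution is

      (ABS(6.0 - 4.0) - 0.5)**2 / 4.0 = 2.25 / 4.0 = 0.5625

compared with 1.0 from the uncorrected statistic computed by CHISQ.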
C
C **************************************************************
C
APPENDIX K
SUBROUTINE RATIOS
The main routine calls subroutine RATIOS to compute the performance ratios for each of the chi-squared-based tests. It is used with groups of contingency tables categorized by table dimension, sample size, range of minimum expected frequency, and range of exact probability.
      SUBROUTINE RATIOS(PROB, PS, RMIN, RMAX, RSUM, I)
C
C     CALCULATE THE RATIO OF A PROBABILITY FOR A GIVEN METHOD
C     TO THE EXACT PROBABILITY.  FIND THE RANGE OF RATIOS FOR
C     A GIVEN METHOD.
C
C     INPUTS:
C       PROB = PROBABILITY FOR A GIVEN METHOD
C       PS   = EXACT PROBABILITY FOR THE OBSERVED TABLE
C       RMIN = MINIMUM RATIO FOR THE CALLING METHOD
C       RMAX = MAXIMUM RATIO FOR THE CALLING METHOD
C       RSUM = SUM OF RATIOS FOR THE CALLING METHOD
C     OUTPUTS:
C       RMIN = MINIMUM RATIO FOR THE CALLING CATEGORY
C       RMAX = MAXIMUM RATIO FOR THE CALLING CATEGORY
C       RSUM = SUM OF RATIOS FOR THE CALLING CATEGORY
C
      RA = PROB / PS
      RMIN = MIN(RA, RMIN)
      RMAX = MAX(RA, RMAX)
      RSUM = RSUM + RA
      RETURN
      END
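With illustrative figures, PROB = 0.060 measured against PS = 0.050 yields RA = 1.2; mean ratios above 1.0 therefore mark a method whose probabilities exceed the exact probability on the average, and ratios below 1.0 mark the reverse.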
C
C **************************************************************
C
APPENDIX L
DATA TABLES
The tables in this appendix were generated by XTAB
during an analysis of the same contingency tables which
produced the data in Table II, Table III, and Table IV,
included in Chapter IV. The tables here differ from those
in Chapter IV only in the ranges of the categories used for
e, the minimum expected frequency values. This
reclassification permits a more accurate analysis of the
effects of the minimum expected frequency on the measures of
independence compared in this study.
The category limits for e selected for these tables
were determined by dividing the range of possible minimum
expected frequencies into reasonably sized groups. The
ranges of possible values are given in Table I, which is
found in Chapter III. In most cases, the ranges were
divided into seven categories.
The layout of the tables varies somewhat from that of
Tables II, III, and IV. It should be emphasized, though,
that the contingency tables producing these data are exactly
the same tables used previously.
TABLE V

PERFORMANCE RATIO MEANS, 3 X 3, N = 10

(T = number of tables in the category; PE = exact probability;
U = uncorrected chi-squared; Y = Yates; C = Cochran's C method;
S = Cochran's S method; M = Mantel.)

Range of e         PE        T      U      Y      C      S      M

0.1 < e ≤ 0.2     ≤ 0.5    418   0.59   3.34   0.81   0.75   3.26
                  > 0.5    630   0.78   0.65   0.41   0.38   0.65

0.2 < e ≤ 0.3     ≤ 0.5    310   0.78   4.98   0.99   0.98   4.96
                  > 0.5    614   0.63   1.10   0.44   0.43   1.10

0.3 < e ≤ 0.4     ≤ 0.5     73   0.96   4.96   1.11   1.10   4.95
                  > 0.5    151   0.54   1.13   0.28   0.26   1.13

0.4 < e ≤ 0.5     ≤ 0.5     71   0.90   4.36   0.88   0.87   4.34
                  > 0.5    125   0.69   1.19   0.51   0.49   1.19

0.5 < e ≤ 0.6     ≤ 0.5      0     --     --     --     --     --
                  > 0.5      0     --     --     --     --     --

0.6 < e ≤ 0.7     ≤ 0.5     38   0.81   4.60   0.90   0.89   4.59
                  > 0.5     57   0.72   1.13   0.51   0.48   1.13

e > 0.7           ≤ 0.5      2   0.81   2.86   0.97   0.93   2.86
                  > 0.5     11   0.72   1.05   0.71   0.68   1.05
TABLE VI

PERFORMANCE RATIO MEANS, 3 X 3, N = 20

Range of e         PE        T      U      Y      C      S      M

0.05 < e ≤ 0.3    ≤ 0.5    701   0.92   3.61   1.01   0.98   3.57
                  > 0.5    962   0.85   0.81   0.51   0.50   0.81

0.3 < e ≤ 0.55    ≤ 0.5     --   1.03   4.27   1.01   1.00   4.24
                  > 0.5    258   0.80   1.28   0.67   0.67   1.28

0.55 < e ≤ 0.8    ≤ 0.5     85   0.94   3.95   0.97   0.96   3.94
                  > 0.5     99   0.85   1.24   0.72   0.71   1.24

0.8 < e ≤ 1.05    ≤ 0.5     58   0.97   4.12   0.98   0.97   4.10
                  > 0.5     52   0.86   1.18   0.69   0.67   1.18

1.05 < e ≤ 1.3    ≤ 0.5     11   0.89   3.07   0.78   0.77   3.07
                  > 0.5     20   0.88   1.28   0.77   0.75   1.28

1.3 < e ≤ 1.55    ≤ 0.5     --   0.79   1.95   0.78   0.78   1.95
                  > 0.5      3   0.87   1.37   0.75   0.75   1.37

e > 1.55          ≤ 0.5      1   0.80   2.00   0.62   0.62   2.00
                  > 0.5      2   0.94   1.17   1.00   0.99   1.17
TABLE VII

PERFORMANCE RATIO MEANS, 3 X 3, N = 30

Range of e         PE        T      U      Y      C      S      M

0.033 < e ≤ 0.5   ≤ 0.5    845   0.98   3.63   1.02   0.99   3.61
                  > 0.5   1117   0.90   0.92   0.62   0.61   0.92

0.5 < e ≤ 1.0     ≤ 0.5    171   1.05   3.51   0.99   0.98   3.50
                  > 0.5    194   0.90   1.26   0.76   0.75   1.26

1.0 < e ≤ 1.5     ≤ 0.5     50   0.94   3.18   0.98   0.97   3.17
                  > 0.5     70   0.91   1.29   0.78   0.77   1.29

1.5 < e ≤ 2.0     ≤ 0.5     17   1.13   4.40   1.14   1.13   4.40
                  > 0.5     19   0.95   1.25   0.81   0.81   1.24

2.0 < e ≤ 2.5     ≤ 0.5      6   0.84   2.30   0.78   0.77   2.29
                  > 0.5      9   0.94   1.24   0.79   0.76   1.23

2.5 < e ≤ 3.0     ≤ 0.5      1   1.00   3.15   1.45   1.44   3.15
                  > 0.5      1   0.96   1.02   0.91   0.91   1.02
TABLE VIII

PERFORMANCE RATIO MEANS, 3 X 4, N = 10

Range of e         PE        T      U      Y      C      S      M

0.1 < e ≤ 0.15    ≤ 0.5    564   0.65   3.46   0.79   0.77   3.42
                  > 0.5    930   0.74   0.73   0.52   0.50   0.73

0.15 < e ≤ 0.25   ≤ 0.5    231   0.90   6.97   1.06   1.05   6.96
                  > 0.5    506   0.63   1.17   0.52   0.50   1.17

0.25 < e ≤ 0.35   ≤ 0.5     65   1.13   7.27   1.27   1.26   7.27
                  > 0.5     85   0.57   1.18   0.36   0.35   1.18

0.35 < e ≤ 0.45   ≤ 0.5     28   0.77   9.61   1.03   1.00   9.61
                  > 0.5     31   0.72   1.21   0.55   0.53   1.21

0.45 < e ≤ 0.55   ≤ 0.5      0     --     --     --     --     --
                  > 0.5      0     --     --     --     --     --

0.55 < e ≤ 0.6    ≤ 0.5      0     --     --     --     --     --
                  > 0.5     --     --     --     --     --     --

e > 0.6           ≤ 0.5      4   0.67   5.38   0.81   0.80   5.38
                  > 0.5      6   0.71   1.09   0.68   0.67   1.09
TABLE IX

PERFORMANCE RATIO MEANS, 3 X 4, N = 20

Range of e         PE        T      U      Y      C      S      M

0.05 < e ≤ 0.2    ≤ 0.5    712   0.97   3.96   0.95   0.92   3.90
                  > 0.5    922   0.92   0.70   0.65   0.64   0.70

0.2 < e ≤ 0.35    ≤ 0.5    276   1.12   5.91   1.14   1.13   5.91
                  > 0.5    367   0.78   1.28   0.66   0.66   1.28

0.35 < e ≤ 0.5    ≤ 0.5     53   1.01   4.40   0.86   0.86   4.40
                  > 0.5     54   0.85   1.38   0.75   0.74   1.38

0.5 < e ≤ 0.65    ≤ 0.5     39   0.99   4.29   0.92   0.91   4.28
                  > 0.5     54   0.86   1.28   0.73   0.71   1.28

0.65 < e ≤ 0.8    ≤ 0.5      5   1.07   5.87   0.77   0.76   5.87
                  > 0.5     --   0.86   1.22   0.77   0.76   1.22

0.8 < e ≤ 0.95    ≤ 0.5      6   0.80   3.20   0.60   0.59   3.20
                  > 0.5     --   0.84   1.21   0.70   0.69   1.21

e > 0.95          ≤ 0.5      1   0.82   2.35   0.42   0.40   2.35
                  > 0.5      0     --     --     --     --     --
TABLE X

PERFORMANCE RATIO MEANS, 3 X 4, N = 30

Range of e         PE        T      U      Y      C      S      M

0.033 < e ≤ 0.2   ≤ 0.5    697   1.00   2.97   0.94   0.92   2.96
                  > 0.5    845   0.97   0.70   0.74   0.73   0.70

0.2 < e ≤ 0.35    ≤ 0.5    281   1.18   5.59   1.20   1.19   5.59
                  > 0.5    333   0.85   1.29   0.76   0.76   1.29

0.35 < e ≤ 0.5    ≤ 0.5     61   1.12   6.79   1.03   1.02   6.79
                  > 0.5     59   0.92   1.38   0.79   0.78   1.38

0.5 < e ≤ 0.65    ≤ 0.5     57   1.27  12.33   1.06   1.05  12.33
                  > 0.5     --   0.89   1.35   0.88   0.87   1.35

0.65 < e ≤ 0.8    ≤ 0.5     23   1.03   3.87   0.94   0.94   3.87
                  > 0.5     21   0.91   1.30   0.80   0.80   1.30

0.8 < e ≤ 0.95    ≤ 0.5     18   0.91   3.01   0.82   0.81   3.01
                  > 0.5     --   0.90   1.25   0.76   0.75   1.25

e > 0.95          ≤ 0.5     15   1.05   4.80   0.91   0.91   4.80
                  > 0.5     15   0.92   1.28   0.90   0.90   1.28
TABLE XI

PERFORMANCE RATIO MEANS, 3 X 5, N = 10

Range of e         PE        T      U      Y      C      S      M

0.1 < e ≤ 0.15    ≤ 0.5    514   0.73   4.74   0.93   0.90   4.74
                  > 0.5   1109   0.68   0.71   0.56   0.54   0.71

0.15 < e ≤ 0.25   ≤ 0.5    264   1.01  14.73   1.24   1.22  14.73
                  > 0.5    440   0.60   1.18   0.53   0.52   1.18

0.25 < e ≤ 0.35   ≤ 0.5     48   1.52  17.13   1.73   1.73  17.13
                  > 0.5    120   0.54   1.20   0.39   0.38   1.20

e > 0.35          ≤ 0.5      2   0.73   4.28   0.66   0.65   4.28
                  > 0.5      3   0.61   1.17   0.66   0.66   1.17
TABLE XII

PERFORMANCE RATIO MEANS, 3 X 5, N = 20

Range of e         PE        T      U      Y      C      S      M

0.05 < e ≤ 0.2    ≤ 0.5    831   0.96   4.24   0.95   0.92   4.24
                  > 0.5   1036   0.92   0.65   0.72   0.71   0.65

0.2 < e ≤ 0.35    ≤ 0.5    277   1.28   8.77   1.17   1.17   8.76
                  > 0.5    299   0.76   1.28   0.67   0.66   1.28

0.35 < e ≤ 0.5    ≤ 0.5     15   0.97   4.45   0.79   0.78   4.45
                  > 0.5     14   0.83   1.40   0.79   0.79   1.40

0.5 < e ≤ 0.65    ≤ 0.5     16   1.34  12.91   0.96   0.95  12.91
                  > 0.5     11   0.85   1.15   0.88   0.88   1.15

0.65 < e ≤ 0.8    ≤ 0.5      0     --     --     --     --     --
                  > 0.5      0     --     --     --     --     --

e > 0.8           ≤ 0.5      1   0.66   3.76   0.63   0.63   3.76
                  > 0.5      0     --     --     --     --     --
TABLE XIII

PERFORMANCE RATIO MEANS, 3 X 5, N = 30

Range of e         PE        T      U      Y      C      S      M

0.033 < e ≤ 0.2   ≤ 0.5    264   1.02   0.36   0.97   0.92   0.36
                  > 0.5    328   1.08   0.09   0.77   0.76   0.09

0.2 < e ≤ 0.35    ≤ 0.5    616   1.07   7.63   1.00   0.98   7.63
                  > 0.5    622   0.95   0.94   0.83   0.82   0.94

0.35 < e ≤ 0.5    ≤ 0.5    265   1.26   6.62   1.13   1.12   6.62
                  > 0.5    289   0.83   1.34   0.77   0.77   1.34

0.5 < e ≤ 0.65    ≤ 0.5     38   1.05   7.34   0.86   0.85   7.34
                  > 0.5     --   0.90   1.41   0.81   0.81   1.41

0.65 < e ≤ 0.8    ≤ 0.5     12   0.91   2.51   0.91   0.90   2.51
                  > 0.5     --   0.88   1.36   0.76   0.75   1.36

0.8 < e ≤ 0.95    ≤ 0.5      6   1.01   3.44   1.01   1.01   3.44
                  > 0.5      8   0.92   1.37   0.86   0.86   1.37

e > 0.95          ≤ 0.5      0     --     --     --     --     --
                  > 0.5      1   0.98   1.02   0.99   0.99   1.02
BIBLIOGRAPHY
Books
Bishop, Yvonne M. M., Stephen E. Fienberg, and Paul W. Holland, Discrete Multivariate Analysis, Cambridge, Massachusetts, The MIT Press, 1975.
Bradley, James V., Distribution-Free Statistical Tests, Englewood Cliffs, New Jersey, Prentice-Hall, Inc., 1968.
Cohen, Jacob, Statistical Power Analysis for the Behavioral Sciences, New York, Academic Press, 1969.
Conover, W. J., Practical Nonparametric Statistics, New York, John Wiley and Sons, Inc., 1971.
Cramer, H., Mathematical Methods of Statistics, Princeton, New Jersey, Princeton University Press, 1946.
Everitt, B. S., The Analysis of Contingency Tables, London, Chapman and Hall, 1977.
Ferguson, George A., Statistical Analysis in Psychology and Education, 5th ed., New York, McGraw-Hill Book Company, 1981.
Fienberg, Stephen E., The Analysis of Cross-Classified Categorical Data, Cambridge, Massachusetts, The MIT Press, 1977.
Fisher, Ronald A., The Design of Experiments, Edinburgh, Oliver and Boyd, 1935.
, Statistical Methods for Research Workers, 14th ed., New York, Hafner Publishing Company, 1973.
Guilford, J. P., Fundamental Statistics in Psychology and Education, 4th ed., New York, McGraw-Hill, 1965.
Kendall, Maurice G. and Alan Stuart, The Advanced Theory of Statistics, Vol. 2, 2nd. ed., New York, Hafner Publishing Company, 1967.
Lancaster, H., The Chi Squared Distribution, New York, John Wiley and Sons, 1969.
McNemar, Quinn, Psychological Statistics, 3rd. ed., New York, John Wiley and Sons, 1962.
Nanney, T. Ray, Computing: A Problem-Solving Approach with FORTRAN 77, Englewood Cliffs, New Jersey, Prentice-Hall, 1981.
Pearson, Karl, On the Theory of Contingency and Its Relation to Association and Normal Correlation, London, Drapers' Company, 1904.
Poole, Lon and Mary Borchers, Some Common Basic Programs, Berkeley, California, Adam Osborne & Associates, Inc., 1977.
Reynolds, Henry T., The Analysis of Cross-Classifications, New York, The Free Press, 1977.
Articles
Berry, Kenneth J. and Paul W. Mielke, Jr., "Subroutines for Computing Exact Chi-Square and Fisher's Exact Probability Tests," Educational and Psychological Measurement, XLV (Spring, 1985), 153-159.
Boulton, D. M. and C. S. Wallace, "Occupancy of a Rectangular Array," Computer Journal, XVI (January, 1973), 57-63.
Cochran, William G., "The Chi-Squared Test of Goodness of Fit," Annals of Mathematical Statistics, XXIII (Spring, 1952), 315-345.
Conover, W. J., "Some Reasons for Not Using the Yates Continuity Correction on 2 X 2 Contingency Tables," Journal of the American Statistical Association, LXIX (June, 1974), 374-376.
Cox, M. A. A. and R. L. Plackett, "Small Samples in Contingency Tables," Biometrika, LXVII (January, 1980), 1-13.
Fisher, Ronald A., "The Conditions Under Which Chi Square Measures the Discrepancy Between Observation and Hypothesis," Journal of the Royal Statistical Society, LXXXVII (Winter, 1924), 442-450.
"The Significance of Deviations from Expectation in a Poisson Series," Biometrics, VI (Spring, 1950), 17-24.
Garside, G. R. and C. Mack, "Actual Type 1 Error Probabilities for Various Tests in the Homogeneity Case of the 2 X 2 Contingency Table," The American Statistician, XXX (February, 1976), 18-21.
Greenwood, M. and G. U. Yule, "The Statistics of Anti-Typhoid and Anti-Cholera Inoculations and the Interpretation of Such Statistics in General," Proceedings of the Royal Society of Medicine, VIII (Spring, 1915), 113-190.
Grizzle, James E., "Continuity Correction in the Chi-Squared Test for 2 X 2 Tables," The American Statistician, XXI (October, 1967), 28-32.
Haber, Michael, "A Comparison of Some Continuity Corrections for the Chi-Squared Test on 2 X 2 Tables," Journal of the American Statistical Association, LXXV (September, 1980), 510-515.
Hancock, T. W., "Remark on Algorithm 434," Communications of the Association for Computing Machinery, XVIII (February, 1975), 117-119.
Lewontin, R. C. and J. Felsenstein, "The Robustness of Homogeneity Tests in 2 X N Tables," Biometrics, XXI (March, 1965), 19-33.
Mantel, Nathan, "Comment and a Suggestion," Journal of the American Statistical Association, LXIX (June, 1974), 378-380.
, "The Continuity Correction," The American Statistician, XXX (May, 1976), 103-104.
, and Samuel W. Greenhouse, "What Is the Continuity Correction?" The American Statistician, XXII (December, 1968), 27-30.
March, David L., "Algorithm 434: Exact Probabilities for R x C Contingency Tables," Communications of the Association for Computing Machinery, XV (November, 1972), 991-992.
McLeod, A. Ian, "A Remark on Algorithm AS 183. An Efficient and Portable Pseudo-random Number Generator," Applied Statistics, XXXIV (Summer, 1985), 198-200.
Mehta, Cyrus R. and Nitin R. Patel, "A Network Algorithm for Performing Fisher's Exact Test in r x c Contingency Tables," Journal of the American Statistical Association, LXXVIII (June, 1983), 427-434.
, "A Hybrid Algorithm for Fisher's Exact Test in Unordered RXC Contingency Tables," Communications in Statistics -Theory and Methods, XV (April, 1986), 387-403.
Miettinen, Olli S., "Comment," Journal of the American Statistical Association, LXIX (June, 1974), 380-
Mosteller, Frederick, "Association and Estimation in Contingency Tables," Journal of the American Statistical Association, LXIII (January, 1968), 1-28.
Pagano, M. and K. Halvorsen, "An Algorithm for Finding the Exact Significance Levels of r x c Contingency Tables," Journal of the American Statistical Association, LXXVI (November, 1981), 931-934.
Patefield, W. M., "An Efficient Method of Generating Random R x C Tables with Given Row and Column Totals," Applied Statistics, XXX (1981), 91-97.
Pearson, Karl, "On the Criterion That a Given System of Deviations From the Probable in the Case of a Correlated System of Variables Is Such That It Can Be Reasonably Supposed to Have Arisen From Random Sampling," Philosophical Magazine, Series 5, L (Spring, 1900), 157-172.
Plackett, R. L., "The Continuity Correction in 2 x 2 Tables," Biometrika, LI (May, 1964), 327-337.
Starmer, C. Frank, James E. Grizzle, and P. K. Sen, "Comment," Journal of the American Statistical Association, LXIX (June, 1974), 376-378.
Tocher, K. D., "Extension of the Neyman-Pearson Theory of Tests to Discontinuous Variates," Biometrika, XXXVII (February, 1950), 130-144.
Upton, Graham J. G., "A Comparison of Alternative Tests for the 2 x 2 Comparative Trial," Journal of the Royal Statistical Society, CXLV (Spring, 1982), 86-105.
Wichmann, B. A. and I. D. Hill, "Algorithm AS 183: An Efficient and Portable Pseudo-random Number Generator," Applied Statistics, XXXI (1982), 188-190.
Yates, Frank, "Contingency Tables Involving Small Numbers and the Chi-Squared Test," Journal of the Royal Statistical Society, Series B, Supp. Vol. 1, II (Spring, 1934), 217-235.