Tutorial: Chi-Square Distribution
Presented by: Nikki Natividad
Course: BIOL 5081 - Biostatistics

Purpose
To analyze discontinuous (categorical or binned) data, in which a number of subjects fall into categories.
We want to compare our observed data to what we expect to see. Are the differences due to chance? Due to association?
When can we use the Chi-Square Test?
◦ Testing the outcome of Mendelian crosses
◦ Testing independence: is one factor associated with another?
◦ Testing a population for expected proportions

Assumptions:
◦ 2 or more categories
◦ Independent observations
◦ A sample size of at least 10
◦ Random sampling
◦ All observations must be used
◦ For the test to be accurate, the expected frequency in each category should be at least 5

Conducting Chi-Square Analysis
1) Make a hypothesis based on your basic biological question
2) Determine the expected frequencies
3) Create a table with observed frequencies, expected frequencies, and chi-square values using the formula $\frac{(O-E)^2}{E}$, then sum the chi-square values to get your calculated chi-square value
4) Find the degrees of freedom: (c-1)(r-1) for a table with c columns and r rows; for a single row of k categories, k-1
5) Find the critical value in the Chi-Square Distribution table
6) If the critical value > your calculated chi-square value, you do not reject your null hypothesis, and vice versa.
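
Steps 2 to 4 can be written out as equations. The symbols $O_i$, $E_i$, $p_i$, $R_i$, $C_j$ and $n$ are notation introduced here for illustration (observed count, expected count, hypothesized proportion, row total, column total, and total sample size); they do not appear on the original slides.

```latex
% Step 3: sum the contribution of every category (or table cell)
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}

% Step 2 when testing proportions: k categories, hypothesized proportion p_i
E_i = n \, p_i, \qquad \mathrm{df} = k - 1

% Step 2 when testing independence in a table with r rows and c columns
E_{ij} = \frac{R_i \, C_j}{n}, \qquad \mathrm{df} = (c-1)(r-1)
```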

Example 1: Testing for Proportions
H0: Horned lizards eat equal amounts of leaf cutter, carpenter and black ants.
HA: Horned lizards eat more of some ant species than of others.

| | Leaf Cutter Ants | Carpenter Ants | Black Ants | Total |
|---|---|---|---|---|
| Observed | 25 | 18 | 17 | 60 |
| Expected | 20 | 20 | 20 | 60 |
| O-E | 5 | -2 | -3 | 0 |
| $\frac{(O-E)^2}{E}$ | 1.25 | 0.2 | 0.45 | $\chi^2$ = 1.90 |
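
The Example 1 table can be reproduced with a few lines of Python. This is a minimal sketch using only the counts from the table; the variable names are chosen here for illustration.

```python
# Observed counts from the Example 1 table (leaf cutter, carpenter, black ants).
observed = [25, 18, 17]

# Under H0 the lizards eat equal amounts, so each species is expected n / k times.
n = sum(observed)
k = len(observed)
expected = [n / k] * k

# Per-category contribution (O - E)^2 / E, then the total chi-square value.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

print([round(c, 2) for c in contributions])  # [1.25, 0.2, 0.45]
print(round(chi_square, 2))                  # 1.9
```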

Critical value from the Chi-Square Distribution table (df = 3 - 1 = 2, α = 0.05): $\chi^2_{\alpha=0.05} = 5.991$

Critical value: $\chi^2$ = 5.991. Our calculated value: $\chi^2$ = 1.90.
*If the critical value > your calculated value, then you do not reject your null hypothesis: the difference between observed and expected counts is not significant and can be attributed to chance.
5.991 > 1.90 ∴ We do not reject our null hypothesis.
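
The table lookup can be cross-checked in this particular case. With 2 degrees of freedom (and only then) the chi-square upper-tail probability has the closed form exp(-x/2), so the critical value is -2 ln(α). The function name below is hypothetical, and the sketch does not generalize to other degrees of freedom.

```python
import math

def critical_value_df2(alpha):
    """Upper-tail critical value of the chi-square distribution with 2 df.

    For df = 2 only, P(X > x) = exp(-x / 2), so x = -2 * ln(alpha).
    """
    return -2.0 * math.log(alpha)

critical = critical_value_df2(0.05)
calculated = 1.90  # chi-square value from the Example 1 table

print(round(critical, 3))  # 5.991
decision = "do not reject H0" if critical > calculated else "reject H0"
print(decision)            # do not reject H0
```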

SAS: Example 1
[The SAS program and its output appeared as images on the original slides. Annotations on the program:]
◦ Included to format the table
◦ Define your data
◦ Indicate what you want in your output

SAS: What does the p-value mean?
“The exact p-value for a nondirectional test is the sum of probabilities for the table having a test statistic greater than or equal to the value of the observed test statistic.”
High p-value: If the null hypothesis is true, there is a high probability of getting a test statistic ≥ the observed test statistic. Do not reject the null hypothesis.
Low p-value: If the null hypothesis is true, there is a low probability of getting a test statistic ≥ the observed test statistic. Reject the null hypothesis.

SAS: Example 1
High probability that, by chance alone, a chi-square statistic would be ≥ our calculated chi-square statistic.
We do not reject our null hypothesis.
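
The p-value for Example 1 can be approximated by hand, because with 2 degrees of freedom the chi-square upper-tail probability is exp(-x/2). This is a sketch under that assumption, with a hypothetical function name. The SAS output itself was an image on the slide, so the value below is computed here rather than copied from it.

```python
import math

def p_value_df2(chi_square):
    """P(X >= chi_square) for a chi-square distribution with 2 df (closed form)."""
    return math.exp(-chi_square / 2.0)

p = p_value_df2(1.90)  # calculated chi-square value from Example 1
print(round(p, 4))     # 0.3867

# A p-value above alpha = 0.05 leads to the same decision as the table lookup.
print(p > 0.05)        # True
```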

Example 2: Testing Association
H0: Gender and eye colour are not associated with each other.
HA: Gender and eye colour are associated with each other.
Options used in the tables statement of proc freq:
◦ cellchi2 = displays how much each cell contributes to the overall chi-square value
◦ nocol = do not display column percentages
◦ norow = do not display row percentages
◦ chisq = display chi-square statistics

Example 2: More SAS Examples
Degrees of freedom: (2-1)(3-1) = 1*2 = 2
High probability (p = 78.25%) that, by chance alone, a chi-square statistic would be ≥ our calculated chi-square statistic.
We do not reject our null hypothesis.
If there were an association, we could check which combinations of categories drive it by looking at how much each cell contributes to the overall chi-square value.

Limitations
◦ No expected frequency should be less than 1
◦ No more than 1/5 of the expected frequencies should be less than 5
◦ To correct for this, collect larger samples or combine the categories with small expected frequencies until their combined value is 5 or more
Yates Correction*
◦ When there is only 1 degree of freedom, the regular chi-square test should not be used
◦ Apply the Yates correction by subtracting 0.5 from the absolute value of each calculated O-E term, then continue as usual with the new corrected values

What do these mean?

Likelihood Ratio Chi-Square
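
The body of this slide was an image and is not recoverable. As a hedged illustration, the likelihood ratio chi-square is commonly defined as G² = 2 Σ O ln(O/E). The sketch below applies that textbook definition (an assumption, not taken from the slide) to the Example 1 ant counts and compares it with the Pearson value.

```python
import math

observed = [25, 18, 17]  # Example 1 counts
expected = [20, 20, 20]

# Likelihood ratio statistic: G^2 = 2 * sum(O * ln(O / E)).
# Cells with O = 0 contribute nothing and are skipped to avoid log(0).
g_squared = 2.0 * sum(o * math.log(o / e)
                      for o, e in zip(observed, expected) if o > 0)

# Pearson chi-square for comparison.
pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(g_squared, 2), round(pearson, 2))  # 1.84 1.9
```

When the observed counts are close to the expected ones, as here, the two statistics are usually similar.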

Continuity-Adjusted Chi-Square Test

Mantel-Haenszel Chi-Square Test
$Q_{MH} = (n-1)r^2$
◦ r is the Pearson correlation coefficient between the row and column variables (which also measures the linear association between row and column)
◦ http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_freq_a0000000659.htm
◦ Tests the alternative hypothesis that there is a linear association between the row and column variables
◦ Follows a chi-square distribution with 1 degree of freedom
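
To make the formula concrete, the sketch below computes r and $Q_{MH}$ for the 2 x 2 heart-disease counts used in the Yates example later in this tutorial. Scoring the row and column levels as 1 and 2 is an assumption made here for illustration.

```python
import math

# Rows: heart disease / no heart disease. Columns: high / low cholesterol.
table = [[15, 7],
         [8, 10]]

# Assumption: integer scores 1, 2 for the row and column levels.
row_scores = [1, 2]
col_scores = [1, 2]

n = sum(sum(row) for row in table)
cells = [(row_scores[i], col_scores[j], table[i][j])
         for i in range(len(table)) for j in range(len(table[0]))]

# Weighted means and sums of squares, each cell weighted by its count.
mean_x = sum(x * w for x, _, w in cells) / n
mean_y = sum(y * w for _, y, w in cells) / n
ss_xy = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in cells)
ss_xx = sum(w * (x - mean_x) ** 2 for x, _, w in cells)
ss_yy = sum(w * (y - mean_y) ** 2 for _, y, w in cells)

r = ss_xy / math.sqrt(ss_xx * ss_yy)  # Pearson correlation of the scores
q_mh = (n - 1) * r ** 2

print(round(r, 3), round(q_mh, 2))  # 0.239 2.23
```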

Phi Coefficient

Contingency Coefficient

Cramer’s V
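
The definitions on these three slides were images and did not survive extraction. The sketch below uses the commonly given formulas, which are an assumption here rather than the slides' own wording: phi = sqrt(χ²/n), contingency coefficient = sqrt(χ²/(χ² + n)), and Cramer's V = sqrt(χ²/(n · min(r-1, c-1))). It applies them to the heart-disease counts from the Yates example later in this tutorial. SAS may report phi with a sign for 2 x 2 tables; only the magnitude is computed here.

```python
import math

table = [[15, 7],
         [8, 10]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Pearson chi-square (no continuity correction).
chi_square = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(len(row_totals)) for j in range(len(col_totals))
)

phi = math.sqrt(chi_square / n)
contingency = math.sqrt(chi_square / (chi_square + n))
cramers_v = math.sqrt(
    chi_square / (n * min(len(row_totals) - 1, len(col_totals) - 1))
)

print(round(phi, 3), round(contingency, 3), round(cramers_v, 3))
```

For a 2 x 2 table, Cramer's V has the same magnitude as phi, which the output reflects.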

Yates & 2 x 2 Contingency Tables
H0: Heart disease is not associated with cholesterol levels.
HA: Heart disease is more likely in patients with a high cholesterol diet.
Calculate degrees of freedom: (c-1)(r-1) = 1*1 = 1
We need to use the YATES CORRECTION.
Without the correction:

| | High Cholesterol | Low Cholesterol | Total |
|---|---|---|---|
| Heart Disease | 15 | 7 | 22 |
| Expected | 12.65 | 9.35 | 22 |
| Chi-Square | 0.44 | 0.59 | 1.03 |
| No Heart Disease | 8 | 10 | 18 |
| Expected | 10.35 | 7.65 | 18 |
| Chi-Square | 0.53 | 0.72 | 1.25 |
| TOTAL | 23 | 17 | 40 |
| Chi-Square Total | | | 2.28 |
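
The expected counts and uncorrected cell values in this table follow from the row and column totals. A minimal sketch, with a hypothetical helper name:

```python
def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: heart disease / no heart disease. Columns: high / low cholesterol.
table = [[15, 7],
         [8, 10]]

expected = expected_counts(table)
cell_chi2 = [[(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
             for obs_row, exp_row in zip(table, expected)]
total = sum(sum(row) for row in cell_chi2)

print([[round(e, 2) for e in row] for row in expected])   # [[12.65, 9.35], [10.35, 7.65]]
print([[round(c, 2) for c in row] for row in cell_chi2])  # [[0.44, 0.59], [0.53, 0.72]]
print(round(total, 2))                                    # 2.28
```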

With the Yates correction, each cell's chi-square value becomes $\frac{(|O-E| - 0.5)^2}{E}$. For example, for heart disease with high cholesterol: $\frac{(|15-12.65| - 0.5)^2}{12.65} = 0.27$

| | High Cholesterol | Low Cholesterol | Total |
|---|---|---|---|
| Heart Disease | 15 | 7 | 22 |
| Expected | 12.65 | 9.35 | 22 |
| Chi-Square | 0.27 | 0.37 | 0.64 |
| No Heart Disease | 8 | 10 | 18 |
| Expected | 10.35 | 7.65 | 18 |
| Chi-Square | 0.33 | 0.45 | 0.78 |
| TOTAL | 23 | 17 | 40 |
| Chi-Square Total | | | 1.42 |
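
The corrected cell values can be checked the same way. In this sketch the expected counts are copied from the table. The table's total of 1.42 is the sum of the rounded cell values; summing the unrounded values gives about 1.41, which does not change the decision.

```python
# Rows: heart disease / no heart disease. Columns: high / low cholesterol.
observed = [[15, 7],
            [8, 10]]
expected = [[12.65, 9.35],
            [10.35, 7.65]]  # expected counts copied from the table

# Yates correction: subtract 0.5 from |O - E| before squaring.
corrected = [[(abs(o - e) - 0.5) ** 2 / e for o, e in zip(obs_row, exp_row)]
             for obs_row, exp_row in zip(observed, expected)]

rounded = [[round(c, 2) for c in row] for row in corrected]
total_unrounded = sum(sum(row) for row in corrected)
total_of_rounded = sum(sum(row) for row in rounded)

print(rounded)                     # [[0.27, 0.37], [0.33, 0.45]]
print(round(total_unrounded, 2))   # 1.41
print(round(total_of_rounded, 2))  # 1.42
```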

Critical value from the Chi-Square Distribution table (df = 1, α = 0.05): $\chi^2_{\alpha=0.05} = 3.841$
3.841 > 1.42 ∴ We do not reject our null hypothesis.
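
The same decision can be reached through a p-value. With 1 degree of freedom the chi-square variable is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function. This is a sketch for df = 1 only, with a hypothetical function name.

```python
import math

def p_value_df1(chi_square):
    """P(X >= chi_square) for a chi-square distribution with 1 df.

    With 1 df, X = Z^2 for a standard normal Z, so
    P(X >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2.0))

# Sanity check: the tabulated critical value should give a p-value near 0.05.
print(round(p_value_df1(3.841), 3))  # 0.05

p = p_value_df1(1.42)  # Yates-corrected chi-square total from the table
print(p > 0.05)        # True, so we do not reject H0
```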

Fisher’s Exact Test
◦ Left: Use this one-sided test when the alternative to independence is negative association between the variables. These observations tend to lie in the lower left and upper right cells of the table. Small p-value = likely negative association.
◦ Right: Use this one-sided test when the alternative to independence is positive association between the variables. These observations tend to lie in the upper left and lower right cells of the table. Small p-value = likely positive association.
◦ Two-Tail: Use this when there is no prior alternative.
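
The three p-values can be sketched directly from the hypergeometric distribution of the upper-left cell when the row and column totals are held fixed. The function below is a hypothetical helper, not SAS's implementation. For the two-tail value it sums the probabilities of all tables that are no more likely than the observed one, which is one common convention.

```python
from math import comb

def fisher_exact_2x2(table):
    """Return (left, right, two_tail) p-values for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    lo = max(0, col1 - row2)   # smallest possible upper-left count
    hi = min(row1, col1)       # largest possible upper-left count
    denom = comb(n, col1)

    # Hypergeometric probability of k in the upper-left cell, margins fixed.
    probs = {k: comb(row1, k) * comb(row2, col1 - k) / denom
             for k in range(lo, hi + 1)}

    left = sum(p for k, p in probs.items() if k <= a)
    right = sum(p for k, p in probs.items() if k >= a)
    # Small relative tolerance so floating-point ties count as "as extreme".
    two_tail = sum(p for p in probs.values() if p <= probs[a] * (1 + 1e-7))
    return left, right, two_tail

# Rows: heart disease / no heart disease. Columns: high / low cholesterol.
left, right, two_tail = fisher_exact_2x2([[15, 7], [8, 10]])
print(round(left, 4), round(right, 4), round(two_tail, 4))
```

For the heart-disease table the alternative is a positive association (more heart disease with high cholesterol), so the right-sided p-value is the relevant one.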

Yates & 2 x 2 Contingency Tables
H0: Heart disease is not associated with cholesterol levels.
HA: Heart disease is more likely in patients with a high cholesterol diet.

Conclusion
◦ The chi-square test is important in testing the association between variables and/or checking whether one’s expected proportions match the reality of one’s experiment
◦ There are multiple chi-square tests, each suited to a specific sample size, degrees of freedom, and number of categories
◦ We can use SAS to conduct chi-square tests on our data with proc freq

References
Chi-Square Test Descriptions:
◦ http://www.enviroliteracy.org/pdf/materials/1210.pdf
◦ http://129.123.92.202/biol1020/Statistics/Appendix%206%20%20The%20Chi-Square%20TEst.pdf
◦ Ozdemir T and Eyduran E. 2005. Comparison of chi-square and likelihood ratio chi-square tests: power of test. Journal of Applied Sciences Research. 1(2):242-244.
SAS Support website, “FREQ procedure”: http://www.sas.com/index.html
YouTube Chi-square SAS Tutorial (user: mbate001): http://www.youtube.com/watch?v=ACbQ8FJTq7k