HYPOTHESIS TESTING BETWEEN TWO OR MORE CATEGORICAL VARIABLES
The Chi-Square Distribution and Test for Independence
Agenda
Wrapping up the t-test program example
Chi-Square and Chi-Square test of independence
Chi-Square Distribution
The chi-square distribution results when independent variables with standard normal distributions are squared and summed.
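The definition above can be checked with a small simulation. This is a minimal sketch using only the Python standard library; the function name `simulate_chi_square`, the seed, and the number of draws are illustrative choices, not part of the slides. It relies on the standard result that a chi-square variable with k degrees of freedom has mean k and variance 2k.

```python
import random
import statistics


def simulate_chi_square(df, n_draws, seed=0):
    """Return n_draws values, each the sum of df squared standard normal variates."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]


# Illustrative settings (assumptions): 3 degrees of freedom, 20,000 draws.
draws = simulate_chi_square(df=3, n_draws=20000)
sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

# Theory: mean = df = 3 and variance = 2 * df = 6.
print(f"mean = {sample_mean:.2f}, variance = {sample_var:.2f}")
```

With enough draws the sample mean and variance should land close to 3 and 6, and every draw is non-negative because it is a sum of squares.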
Chi-Square Degrees of Freedom
df = (r - 1)(c - 1)
where r = number of rows, c = number of columns
Thus, in any 2x2 contingency table, the degrees of freedom = 1.
As the degrees of freedom increase, the distribution shifts to the right and the critical values of chi-square become larger.
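The degrees-of-freedom rule and the growth of the critical values can be sketched as follows. The helper name `chi_square_df` is hypothetical, and the critical values are the standard upper-tail values at alpha = 0.05 taken from chi-square tables, hard-coded here rather than computed.

```python
def chi_square_df(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table: (r - 1)(c - 1)."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)


# Standard table values for alpha = 0.05 (upper tail), keyed by degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

for shape in [(2, 2), (2, 3), (3, 3)]:
    df = chi_square_df(*shape)
    print(f"{shape[0]}x{shape[1]} table: df = {df}, critical value = {CRITICAL_05[df]}")
```

A 2x2 table gives df = 1, a 2x3 table gives df = 2, and a 3x3 table gives df = 4, with the critical value rising at each step.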
Chi-Square Test of Independence
Using the Chi-Square Test
Often used with contingency tables (i.e., crosstabulations), e.g., gender x student.
The chi-square test of independence tests whether the column variable is contingent on the row variable in the table.
In this case, the null hypothesis is that there is no relationship between the row and column variables. H0: The two variables are independent.
Requirements for Chi-Square test
Must be a random sample from the population
Data must be raw frequencies (counts), not percentages or proportions
Observations must be independent of one another
Categories for each variable must be mutually exclusive and exhaustive
Example Crosstab: Gender x Student
            Student       Not Student   Total
Males       46 (40.97)    71 (76.03)    117
Females     37 (42.03)    83 (77.97)    120
Total       83            154           237
Cell entries: Observed (Expected)
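The expected counts in the crosstab come from (row total x column total) / grand total, and the test statistic sums (observed - expected)^2 / expected over the cells. The sketch below works this through for the Gender x Student counts using only the standard library. The p-value line uses the identity that holds for 1 degree of freedom only, P(X > x) = erfc(sqrt(x / 2)). No continuity correction is applied, which is an assumption about how the test would be run.

```python
import math

# Observed counts from the slide: rows = Males, Females; columns = Student, Not Student.
observed = [
    [46, 71],
    [37, 83],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic (no continuity correction).
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail p-value; this closed form is valid only when df == 1.
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

The statistic comes out at about 1.87, below the 3.841 critical value for df = 1 at alpha = 0.05, so these counts give no grounds to reject the null hypothesis of independence.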
Special Cases
Fisher’s Exact Test: used when you have a 2 x 2 table with expected frequencies less than 5.
Strength of Association: some use Cramer’s V (for any two nominal variables) or Phi (for 2 x 2 tables) to give a value of association between the variables.
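Both special cases can be sketched in a few lines of standard-library Python. The function names are hypothetical. The Fisher routine uses one common two-sided convention (summing the probabilities of all tables no more likely than the observed one), and the small 2 x 2 counts passed to it are illustrative, not from the slides. Cramer's V is computed as sqrt(chi2 / (n * min(r - 1, c - 1))), which for a 2 x 2 table equals the magnitude of Phi.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V from a Pearson chi-square; for a 2 x 2 table this is |Phi|."""
    return math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        prob(x) for x in range(low, high + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Chi-square of about 1.8733 worked out by hand from the Gender x Student counts (n = 237).
v = cramers_v(1.8733, 237, 2, 2)

# Illustrative small-sample table (assumed data): every expected count is below 5.
p_fisher = fisher_exact_two_sided(3, 1, 1, 3)

print(f"Cramer's V = {v:.3f}, Fisher p = {p_fisher:.4f}")
```

A V near 0.09 indicates a weak association, consistent with the non-significant chi-square for the Gender x Student table.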
Practical Examples:
chi2dist.do
chisquare.do
Auto.dta