HYPOTHESIS TESTING BETWEEN TWO OR MORE CATEGORICAL VARIABLES The Chi-Square Distribution and Test for Independence

Upload: sydney-bradford

Post on 05-Jan-2016


TRANSCRIPT

Page 1

HYPOTHESIS TESTING BETWEEN TWO OR MORE CATEGORICAL VARIABLES

The Chi-Square Distribution and Test for Independence

Page 2

Agenda

Wrapping up the t-test program example

Chi-Square and Chi-Square test of independence


Page 3

Chi-Square Distribution

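The chi-square distribution results when k independent variables, each with a standard normal distribution, are squared and summed. The number of variables summed, k, is the degrees of freedom of the resulting distribution.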
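A minimal simulation sketch of this definition, using only the Python standard library. The function name `simulate_chi_square`, the seed, and the number of draws are illustrative assumptions, not part of the course materials.

```python
import random

def simulate_chi_square(df, n_draws, seed=0):
    """Draw n_draws values, each the sum of df squared standard normal draws."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        for _ in range(n_draws)
    ]

draws = simulate_chi_square(df=3, n_draws=20000)
sample_mean = sum(draws) / len(draws)

# The theoretical mean of a chi-square distribution equals its degrees of
# freedom, so the sample mean should land close to 3 here.
print(round(sample_mean, 2))
```

Every simulated value is non-negative because it is a sum of squares, which is why the chi-square distribution has no mass below zero.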

Page 4

Chi-Square Degrees of Freedom

df = (r - 1)(c - 1)

where r = number of rows and c = number of columns

Thus, in any 2x2 contingency table, the degrees of freedom = 1.

As the degrees of freedom increase, the distribution shifts to the right and the critical values of chi-square become larger.
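The relationship between table size, degrees of freedom, and critical values can be sketched as follows. This is an illustration under stated assumptions: the helper names (`degrees_of_freedom`, `chi2_sf`, `critical_value`) are made up for this example, the upper-tail probability uses the standard closed-form series for integer degrees of freedom, and the critical value is found by simple bisection rather than by any particular statistics package.

```python
import math

def degrees_of_freedom(n_rows, n_cols):
    """df = (r - 1)(c - 1) for an r x c contingency table."""
    return (n_rows - 1) * (n_cols - 1)

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal-tail part plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for r in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * r + 1)
    return total

def critical_value(alpha, df):
    """Smallest x with P(X > x) <= alpha, located by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Larger tables give more degrees of freedom and larger critical values.
for rows, cols in [(2, 2), (2, 3), (3, 3)]:
    df = degrees_of_freedom(rows, cols)
    print(rows, "x", cols, "df =", df, "critical value =", round(critical_value(0.05, df), 3))
```

The three tables in the loop have 1, 2, and 4 degrees of freedom, and the printed critical values at the 0.05 level rise with each step.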

Page 5

Chi-Square Test of Independence


Page 6

Using the Chi-Square Test

Often used with contingency tables (i.e., crosstabulations), e.g., gender x student.

The chi-square test of independence tests whether the columns are contingent on the rows in the table.

In this case, the null hypothesis is that there is no relationship between row and column frequencies. H0: The 2 variables are independent.
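To make the idea of a crosstabulation concrete, here is a small sketch that builds one from raw paired observations. The six observations and the `crosstab` helper are hypothetical and exist only to show the counting step; they are not data from the course.

```python
from collections import Counter

# Hypothetical raw data: one (gender, status) pair per respondent.
observations = [
    ("male", "student"), ("male", "not student"), ("female", "student"),
    ("female", "not student"), ("female", "not student"), ("male", "student"),
]

def crosstab(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table

rows, cols, table = crosstab(observations)

# Print one line per row category with its raw frequencies.
print("columns:", cols)
for label, row in zip(rows, table):
    print(label, row)
```

Each cell holds a raw frequency, which is the form of data the chi-square test of independence works on.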

Page 7

Requirements for Chi-Square test


Must be a random sample from population

Data must be in raw frequencies

Observations must be independent of one another

Categories for each variable must be mutually exclusive and exhaustive

Page 8

Example Crosstab: Gender x Student


Observed counts, with expected counts in parentheses:

           Student       Not Student   Total
Males      46 (40.97)    71 (76.03)    117
Females    37 (42.03)    83 (77.97)    120
Total      83            154           237
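The expected counts in the gender x student crosstab can be reproduced from the row and column totals, and the same counts feed the test statistic. The sketch below assumes the usual Pearson statistic (the sum of (observed - expected)^2 / expected over all cells) with no continuity correction; the slides do not state the formula, so treat that choice and the helper names as assumptions.

```python
import math

# Observed counts from the crosstab: rows are Males, Females;
# columns are Student, Not Student.
observed = [[46, 71], [37, 83]]

def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Pearson chi-square statistic (assumed form, no continuity correction)."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

expected = expected_counts(observed)
statistic = chi_square_statistic(observed)

# For df = 1 only, the upper-tail probability has the closed form
# erfc(sqrt(x / 2)); larger tables need a general chi-square routine.
p_value = math.erfc(math.sqrt(statistic / 2.0))

for row in expected:
    print([round(e, 2) for e in row])
print("chi-square:", round(statistic, 3), "p-value:", round(p_value, 3))
```

Under these assumptions the statistic comes out well below 3.841, the critical value for df = 1 at the 0.05 level, so this table would not lead to rejecting H0.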

Page 9

Special Cases

Fisher’s Exact Test: use when you have a 2 x 2 table with expected frequencies less than 5.

Strength of Association: some use Cramér’s V (for any two nominal variables) or Phi (for 2 x 2 tables) to give a value of association between the variables.
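Both special cases can be sketched with the standard library. Two assumptions to flag: the two-sided Fisher p-value below uses the common convention of summing the probabilities of all tables (with the same margins) that are no more likely than the observed one, and Cramér's V uses the usual definition sqrt(chi-square / (n * (min(r, c) - 1))). The function names and the small example table are hypothetical.

```python
import math

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return math.comb(row1, k) * math.comb(row2, col1 - k) / math.comb(n, col1)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )

def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramér's V; for a 2 x 2 table this equals Phi = sqrt(chi_square / n)."""
    return math.sqrt(chi_square / (n * (min(n_rows, n_cols) - 1)))

# Hypothetical small table where every expected count is below 5.
small_table = [[3, 1], [1, 3]]
print("Fisher p-value:", round(fisher_exact_two_sided(small_table), 4))

# Strength of association for a 2 x 2 table with chi-square 1.873 and n = 237
# (values taken from the gender x student example, Pearson form assumed).
print("Cramér's V:", round(cramers_v(1.873, 237, 2, 2), 3))
```

A value of V near 0 indicates a weak association and a value near 1 a strong one, regardless of sample size, which is why it is reported alongside the test rather than in place of it.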


Page 10

Practical Examples:

chi2dist.do

chisquare.do

Auto.dta