Chi-Square Test
TRANSCRIPT
-
8/8/2019 Chi_ Squre Test
PG students: Dr Amit Gujarathi
Dr Naresh Gill
-
History
• The chi-square test (chi, the Greek letter χ, pronounced "kye") is a nonparametric statistical technique used to determine whether a distribution of observed frequencies differs from the theoretically expected frequencies.
• It was developed by Prof. Karl Pearson in 1900.
-
Introduction
• The relationship between two or more continuous variables can be studied by correlation and regression.
• In medical research, however, the association between two discrete variables is tested with the chi-square test.
• Examples: the association between a continuous variable grouped into categories (Hb level: mild, moderate, severe anemia) and a discontinuous variable (socioeconomic status), or between two continuous variables grouped into categories (Hb level: mild, moderate, severe anemia; number of ANC visits: none, 1-3, or >3).
-
Applications of the chi-square test
• Test of proportions.
• Test of association.
• Goodness-of-fit test.
-
Test of proportion
• As an alternative test for the significance of the difference between two or more proportions.
• For comparing the values of two binomial samples even if they are small (fewer than 30), provided the Yates correction is applied and the expected value is not less than 5 in any cell.
• For comparing the frequencies of multinomial samples.
-
Test of association
• The most important application of the test.
• Can be used between two discrete events in binomial or multinomial samples.
• It measures the probability that an observed association between two discrete variables arose by chance.
• Two possibilities:
• Either the variables influence each other (dependent),
• or they do not influence each other (independent), i.e. there is no association.
-
Contd.
• The assumption of independence is the null hypothesis: there is no association between the two events.
• The chi-square test thus measures the probability (p), or relative frequency, of the observed association arising by chance, and so indicates whether the two events are dependent on (associated with) each other.
• Added advantage: it can be used to find an association between two discrete variables even when they are categorized into more than two classes, as happens in multinomial samples.
-
Test of goodness of fit
• The idea behind the chi-square goodness-of-fit test is to see whether a sample comes from a population with the claimed distribution.
• The test is used to determine whether a population has a certain hypothesized distribution, expressed as the proportions of individuals in the population falling into the various outcome categories.
-
Hypotheses for the goodness-of-fit test
• Suppose the hypothesized distribution has k outcome categories.
• H0: the actual population proportions are equal to the hypothesized proportions.
• Ha: the actual proportions differ from the hypothesized proportions, that is, at least two of the actual population proportions differ from their hypothesized values.
• To test H0 against Ha, first calculate the chi-square value, which under H0 follows a χ² distribution with (k − 1) degrees of freedom.
-
Chi-square distribution
• Chi-square distributions are a family of distributions that take only positive values and are skewed to the right. A specific chi-square distribution is specified by one parameter, called the degrees of freedom.
-
Requirements
• A random sample.
• Qualitative data.
• The lowest expected frequency in any cell should not be less than 5 (so that the chi-square distribution is an adequate approximation).
• If the smallest expected frequency is less than 5, Fisher's exact probability test should be used instead.
• Chi-square should be calculated from frequencies only, not from rates, proportions, or percentages.
• Frequencies should be independent.
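The expected-frequency rule above can be checked mechanically. The following is a minimal Python sketch, assuming the expected counts have already been computed. The function name `recommended_test`, the `minimum` argument, and the second set of numbers are illustrative only and do not come from the slides.

```python
def recommended_test(expected_counts, minimum=5.0):
    """Return the test that the rule of thumb above points to.

    expected_counts: flat list of expected cell frequencies
    (counts, not rates, proportions, or percentages).
    """
    if min(expected_counts) < minimum:
        return "Fisher's exact test"
    return "chi-square test"


# Every expected count is at least 5.
print(recommended_test([18, 82, 18, 82]))
# Hypothetical table in which two expected counts fall below 5.
print(recommended_test([6.0, 4.0, 3.0, 7.0]))
```

The check covers only the expected-frequency requirement; random sampling and independence of the frequencies still have to be judged from the study design.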
-
• Sum the χ² values of all the cells to get the total chi-square value.
• χ²(df) denotes the total χ² value at a particular number of degrees of freedom.
• Calculate the degrees of freedom:
df = (c − 1)(r − 1), where c is the number of columns and r the number of rows.
• Refer to Fisher's χ² table. Compare the calculated value with the highest value obtainable by chance at the desired degrees of freedom, given in the table under different probabilities such as 0.05, 0.01, 0.001, etc.
• If the calculated value of χ² is higher than the value given in the table, it is significant at that particular level of significance.
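The degrees-of-freedom rule and the table lookup can be sketched in a few lines of Python. This is an illustration under stated assumptions: the dictionary holds only a small excerpt of the standard upper-tail critical values (df 1 to 3), and the names `CRITICAL`, `degrees_of_freedom`, and `is_significant`, as well as the test value 4.2, are hypothetical.

```python
# Excerpt of a chi-square table: upper-tail critical values,
# keyed by degrees of freedom, then by significance level.
CRITICAL = {
    1: {0.05: 3.841, 0.01: 6.635, 0.001: 10.828},
    2: {0.05: 5.991, 0.01: 9.210, 0.001: 13.816},
    3: {0.05: 7.815, 0.01: 11.345, 0.001: 16.266},
}


def degrees_of_freedom(n_rows, n_cols):
    """df = (c - 1)(r - 1) for an r x c contingency table."""
    return (n_cols - 1) * (n_rows - 1)


def is_significant(chi_square, df, alpha=0.05):
    """True if the calculated value exceeds the tabulated value."""
    return chi_square > CRITICAL[df][alpha]


df = degrees_of_freedom(2, 2)           # a 2 x 2 table
print(df)
print(is_significant(4.2, df, 0.05))    # 4.2 is an illustrative value
print(is_significant(4.2, df, 0.01))
```

With the illustrative value 4.2 at df = 1, the result is significant at the 0.05 level but not at the 0.01 level, which is why the chosen level has to be stated before the comparison.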
-
Example
• A student wants to see whether the food preferences of males and females differ: specifically, whether males and females differ in their preference for cooked versus raw foods. A survey was conducted with the following results:
• Twelve males preferred cooked foods.
• Eight males preferred raw foods.
• Five females preferred cooked foods.
• Five females preferred raw foods.
-
Step 1: State the null hypothesis and the alternative hypothesis.
• H0: There is no significant difference between the food preferences of males and females; or, equivalently, food preference is independent of gender.
• Ha: There is a significant difference between the food preferences of males and females; or, equivalently, food preference is affected by gender.
-
Step 2: State the level of significance.
• α = 0.05
• 0.05 is the level of significance used in most scientific experiments.
-
Step 3: Set up a contingency table

Preference       Male   Female   Total (row)
Cooked           12     5        17
Raw              8      5        13
Total (column)   20     10       30
-
Step 4: Compute the expected frequencies
The chi-square test for independence usually uses the third method of getting expected frequencies:

$$E = \frac{R \times C}{N}$$

where R is the row total and C the column total of the cell, and N is the grand total. This expected frequency is computed for EACH cell.

Preference       Male                    Female                  Total (row)
Cooked food      (20)(17)/30 = 11.33     (10)(17)/30 = 5.67      17
Raw food         (13)(20)/30 = 8.67      (13)(10)/30 = 4.33      13
Total (column)   20                      10                      30
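The expected-frequency rule (row total × column total ÷ grand total) can be reproduced with a short Python sketch. The function name `expected_frequencies` is an illustrative choice; the counts are those of the food-preference table.

```python
def expected_frequencies(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: cooked, raw. Columns: male, female.
observed = [[12, 5],
            [8, 5]]

for row in expected_frequencies(observed):
    print([round(e, 2) for e in row])
```

Rounded to two decimals, the result matches the table: 11.33 and 5.67 for cooked food, 8.67 and 4.33 for raw food.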
-
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

• where O is the observed frequency, E is the expected frequency, and χ² is the chi-square value.
-
Step 5: Rearrange the table to show the observed and expected frequencies in the columns and the subcategories in the rows.

Preference            Observed   Expected   Chi-square
Cooked food, male     12         11.33      0.0396
Cooked food, female   5          5.67       0.0792
Raw food, male        8          8.67       0.0518
Raw food, female      5          4.33       0.1037
Total                                       0.2743
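The per-cell values in the last column are (O − E)² / E, and their sum is the chi-square value. A minimal Python sketch that recomputes them from the observed counts and the rounded expected counts shown in the table (the function name `chi_square_contributions` is illustrative):

```python
def chi_square_contributions(observed, expected):
    """Per-cell contribution (O - E)^2 / E to the chi-square value."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Order: cooked/male, cooked/female, raw/male, raw/female.
observed = [12, 5, 8, 5]
expected = [11.33, 5.67, 8.67, 4.33]   # rounded to two decimals, as in the table

parts = chi_square_contributions(observed, expected)
total = sum(parts)

print([round(p, 4) for p in parts])
print(round(total, 4))
```

With the rounded expected counts the four contributions add up to about 0.274. Using unrounded expected counts (for example 11.3333 instead of 11.33) gives about 0.271, so small differences in the last digits come from rounding and do not affect the conclusion.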
-
Step 7: Compare the calculated chi-square value with the tabular value for your df and level of significance.
• From the table, the tabular chi-square value for df = 1 and α = 0.05 is 3.841.
• Since our calculated chi-square is less than this, the difference is not significant and can be attributed to chance; the null hypothesis is retained (not rejected).
• Hence the data are consistent with food preference being independent of gender.
• If it were greater, we would reject the null hypothesis.
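Instead of a table lookup, the probability itself can be computed when df = 1. The sketch below relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(√(χ²/2)). It is valid only for df = 1, and the function name `p_value_df1` is an assumption made for illustration.

```python
import math


def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square value with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# The tabulated critical value 3.841 corresponds to p of about 0.05.
print(round(p_value_df1(3.841), 3))

# A small chi-square value, of the size found in the food-preference
# example, gives a large p, far above 0.05.
print(round(p_value_df1(0.27), 2))
```

A calculated value below 3.841 therefore always gives p above 0.05 at df = 1, which is the same decision as the table comparison.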
-
• Example 2:
Attack rates among those vaccinated and unvaccinated against measles. Does vaccination have a protective value?

Group          Attacked   Not attacked   Total
Vaccinated     10         90             100
Unvaccinated   26         74             100
Total          36         164            200
-
The null hypothesis (H0) states that there is no difference in attack rate between the two groups, while the alternative hypothesis (Ha) states that there is a significant difference in attack rates between the two groups.

Group                     Attacked   Not attacked   Total
Vaccinated     Observed   10         90             100
               Expected   18         82
Unvaccinated   Observed   26         74             100
               Expected   18         82
Total                     36         164            200
-
After putting these values into the formula, the chi-square value is 8.67.
df = 1.
The table value of chi-square at df = 1 and the 5% significance level is 3.84.
The calculated value of chi-square is greater than the table value, hence it is significant.
The null hypothesis is rejected and the alternative hypothesis is accepted.
-
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
-
Test of goodness of fit: example
• Question: In a sample of 100 persons, the observed and expected blood group frequencies are given below. Find whether the observed distribution fits the hypothetical (expected) distribution.

Blood group   A    B    AB   O
Observed      23   35   5    37
Expected      42   9    3    46
-
$$\chi^2 = \sum \frac{(\text{obs} - \text{exp})^2}{\text{exp}}$$

There are 4 classes (k), hence df = k − 1 = 3.

$$\chi^2_3 = 7.82 \text{ at the 5\% level of significance}$$

Chi-square value calculated = 86.8
At the calculated value of 86.8, P is far less than 0.001. Hence the value is highly significant.
The observed distribution does not fit the hypothetical distribution.
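The blood-group calculation can be checked with a few lines of Python. This is a sketch using the observed and expected counts from the question; the variable names are illustrative, and the two critical values (7.815 at the 5% level, 16.266 at the 0.1% level, both for df = 3) are standard table entries rather than figures from the slides.

```python
observed = {"A": 23, "B": 35, "AB": 5, "O": 37}
expected = {"A": 42, "B": 9, "AB": 3, "O": 46}

# Sum of (obs - exp)^2 / exp over the k = 4 blood groups.
chi_square = sum(
    (observed[g] - expected[g]) ** 2 / expected[g] for g in observed
)
df = len(observed) - 1   # k - 1

print(round(chi_square, 1), df)
print(chi_square > 7.815)    # 5% critical value for df = 3
print(chi_square > 16.266)   # 0.1% critical value for df = 3
```

Most of the total comes from blood group B, where 35 persons were observed against 9 expected.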
-
Contd.
4. Interpret the χ² test with caution if the sample total (the total of the values in all cells) is less than 50.
5. The χ² test tells the presence or absence of an association between two events but does not measure the strength of the association.
6. The statistical finding or relationship does not indicate cause and effect.
-
Alternative formula for calculating the chi-square value

Group          Results (measles)           Total
               Attacked   Not attacked
Vaccinated     a          b                a+b
Unvaccinated   c          d                c+d
Total          a+c        b+d              a+b+c+d
-
$$\chi^2 = \frac{(ad - bc)^2 \, (a+b+c+d)}{(a+b)(c+d)(a+c)(b+d)}$$
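The shortcut formula for a 2 × 2 table can be tried on the measles counts from Example 2 (a = 10, b = 90, c = 26, d = 74). A minimal Python sketch, with the function name `chi_square_2x2` chosen for illustration:

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for a 2 x 2 table without computing expected counts."""
    n = a + b + c + d
    numerator = (a * d - b * c) ** 2 * n
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Vaccinated: 10 attacked, 90 not. Unvaccinated: 26 attacked, 74 not.
print(round(chi_square_2x2(10, 90, 26, 74), 2))
```

The result agrees, to two decimals, with the value 8.67 obtained from the observed and expected frequencies.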
-
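Formulae with Yates correction:

Without correction:

$$\chi^2 = \frac{(ad - bc)^2 \times N}{(a+b)(c+d)(a+c)(b+d)}$$

With Yates correction:

$$\chi^2 = \frac{\left(|ad - bc| - N/2\right)^2 \times N}{(a+b)(c+d)(a+c)(b+d)}$$

where N = a + b + c + d.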
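To see the effect of the Yates correction, the corrected formula can be applied to the same measles counts (a = 10, b = 90, c = 26, d = 74). This is a sketch that follows the formula as written; the function name `chi_square_2x2_yates` is illustrative.

```python
def chi_square_2x2_yates(a, b, c, d):
    """Chi-square for a 2 x 2 table with Yates continuity correction."""
    n = a + b + c + d
    numerator = (abs(a * d - b * c) - n / 2) ** 2 * n
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


print(round(chi_square_2x2_yates(10, 90, 26, 74), 2))
```

For these counts the corrected value is about 7.62, lower than the uncorrected 8.67 but still above the table value of 3.84 at df = 1, so the conclusion is unchanged.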
-