chi square test for cross tab - session 9 & 10
TRANSCRIPT
![Page 1: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/1.jpg)
Cross-tabulation and Chi-square test
Business Research Methodology
Dr. Gunjan Malhotra
Assistant Professor
mailforgunjan@gmail
![Page 2: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/2.jpg)
Simple Tabulation for Ranking Type Questions – Bivariate variables
• Suppose we have ordinal scale questions.
• Q. Rank the 5 brands of refrigerators shown below on a scale of 1 to 5 (1=Best and 5=Worst), according to your opinion.
BRAND        RANK
Whirlpool    ___
Kelvinator   ___
Godrej       ___
Samsung      ___
Videocon     ___
![Page 3: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/3.jpg)
Output table formulation
Table 1
BRAND        RANK 1   RANK 2   RANK 3   RANK 4   RANK 5
Whirlpool    x        x        x        x        x
Kelvinator   x        x        x        x        x
Godrej       x        x        x        x        x
Samsung      x        x        x        x        x
Videocon     x        x        x        x        x
![Page 4: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/4.jpg)
Univariate tables
• For constructing univariate tables, take up one column at a time and do separate frequency tables or charts. E.g.
BRAND No. of People who Ranked it No. 1
Whirlpool 90
Kelvinator 60
Godrej 70
Samsung 32
Videocon 45
TOTAL 297
• We can calculate the percentage on the total for each brand. E.g. 90/297 works out to .303, or 30.3% who ranked Whirlpool as no. 1, and so on.
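The percentage calculation can be sketched in a few lines of Python. This is only an illustration: the counts are copied from the table above, and the variable names are assumptions introduced here, not part of the slides.

```python
# First-rank counts copied from the univariate table above.
first_rank_counts = {
    "Whirlpool": 90,
    "Kelvinator": 60,
    "Godrej": 70,
    "Samsung": 32,
    "Videocon": 45,
}

total = sum(first_rank_counts.values())  # 297 respondents in all

# Share of respondents who ranked each brand no. 1, in percent.
first_rank_share = {
    brand: round(100 * count / total, 1)
    for brand, count in first_rank_counts.items()
}

for brand, share in first_rank_share.items():
    print(f"{brand}: {share}%")
```

For Whirlpool this reproduces the 30.3% quoted above.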
![Page 5: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/5.jpg)
Simple Tabulation for Rating Type Questions
Q. Rate the following attributes of LIRIL soap on a scale of 1 to 5 (1=Very Unsatisfactory to 5=Very Satisfactory).
Lather __________________________________
1 2 3 4 5
Fragrance __________________________________
1 2 3 4 5
• For each attribute, the number of people who rated it as 1, 2, 3, 4 or 5 can be tabulated in separate tables like:
RATING Lather
1 30
2 25
3 50
4 76
5 22
TOTAL 203
![Page 6: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/6.jpg)
Alternatively, we can tabulate ratings for all attributes as follows:
RATING   LATHER   FRAGRANCE   ATR.3   ATR.4   ATR.5
1        x        x           x       x       x
2        x        x           x       x       x
3        x        x           x       x       x
4        x        x           x       x       x
5        x        x           x       x       x
![Page 7: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/7.jpg)
Second Stage Analysis – Cross Tabulation
• A cross-tabulation can be done by combining any two of the questions and tabulating the data together. This is a 2-variable cross tabulation.
• E.g. a cross-tabulation between Brand Preference for brands of tea and Region to which Respondent belongs.
Regionwise Buyers (No.)
BRAND         North       South   East   West        Total
Brooke Bond   25 (50%)    20      20     15 (30%)    80 (40%)
Lipton        10 (20%)    15      20     5 (10%)     50 (25%)
Tata          15 (30%)    15      10     30 (60%)    70 (35%)
Total         50 (100%)   50      50     50 (100%)   200 (100%)
– An extension of this could be adding percentages.
![Page 8: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/8.jpg)
Calculating Percentages in a Cross Tabulation
• In the above example, we can compute percentages
  • row-wise,
  • column-wise or
  • on the total sample of 200.
• The general rule is to calculate percentages across the dependent variable (across Brand categories).
• Assume that brand preference depends on the region to which respondents belong, i.e. "Brand" is the dependent variable and "Region" is the independent variable.
• The interpretation is: "Out of 50 respondents from the Northern Region, 50% buy Brooke Bond, 20% buy Lipton, and 30% buy Tata Tea".
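As a rough sketch of this rule, the snippet below computes the percentage of each brand within each region (the independent variable), using the counts from the tea cross-tabulation. The names `counts`, `column_totals` and `column_pct` are illustrative assumptions, not from the slides.

```python
regions = ["North", "South", "East", "West"]

# Buyers per brand, one entry per region, copied from the cross-tabulation.
counts = {
    "Brooke Bond": [25, 20, 20, 15],
    "Lipton": [10, 15, 20, 5],
    "Tata": [15, 15, 10, 30],
}

# Respondents per region (50 in every region here).
column_totals = [
    sum(row[j] for row in counts.values()) for j in range(len(regions))
]

# Percentage of each brand within a region, i.e. across the dependent variable.
column_pct = {
    brand: [round(100 * n / column_totals[j]) for j, n in enumerate(row)]
    for brand, row in counts.items()
}

for brand, pcts in column_pct.items():
    print(brand, dict(zip(regions, pcts)))
```

The North column gives 50, 20 and 30 percent, which is the interpretation quoted above.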
![Page 9: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/9.jpg)
Chi-square test
1. Univariate - Chi-square test for goodness of fit
• Test for significance in the analysis of frequency distributions.
• Each question represents a variable under study.
• Compare observed frequencies with expected frequencies.
2. Bivariate - Chi-square test for relatedness or independence
– Chi-Square allows testing for significant differences between groups.
[Two different questions in a questionnaire may represent two variables.]
![Page 10: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/10.jpg)
Chi-square test for Goodness of Fit
• It is used to analyze probabilities of multinomial distribution trials along a single dimension.
• The Chi-square goodness-of-fit test compares the expected (theoretical) frequencies of categories from a population distribution to the observed (actual) frequencies from a distribution to determine whether there is a difference between what was expected and what was observed.
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
where $O_i$ is the observed frequency and $E_i$ the expected frequency of category $i$.
![Page 11: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/11.jpg)
Example 1: Chi Square test for goodness of fit - Equal expected frequency
• The table outlines the attitudes of 60 people towards US military bases in Australia. A chi-square test for goodness of fit will allow us to determine if differences in frequency exist across response categories.
• Ho: There is no significant difference across frequency of attitudes towards military bases in Australia.
Attitude towards US Military bases in Australia    Frequency of Response (Observed frequencies)
In favour 8
Against 20
Undecided 32
![Page 12: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/12.jpg)
Output 1: Chi‐Square test – equal expected frequencies
![Page 13: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/13.jpg)
Interpretation 1: Chi-square test – equal expected frequencies
• The output shows that the chi-square value is significant (p < .05). (Ho: rejected).
• Therefore, it can be concluded that there are significant differences in the frequency of attitudes towards military bases in Australia.
• The results show that people are largely undecided on this issue, chi-square (2, N=60) = 14.4, p < .05.
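The reported statistic can be checked by hand. The sketch below is a minimal illustration and not the SPSS procedure used for the slides: the function name `chi_square_gof` is an assumption, and the p-value line relies on the closed form exp(-x/2), which is valid only for 2 degrees of freedom.

```python
import math


def chi_square_gof(observed, expected):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [8, 20, 32]          # In favour, Against, Undecided
n = sum(observed)               # 60 respondents
expected = [n / len(observed)] * len(observed)  # equal expected frequency: 20 each

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1          # 3 categories give 2 degrees of freedom

# Upper-tail probability of the chi-square distribution.
# This shortcut holds for df = 2 only.
p_value = math.exp(-chi2 / 2)

print(round(chi2, 3), df, p_value < 0.05)
```

The statistic comes out at 14.4 with 2 degrees of freedom, and the p-value is well below .05, which agrees with the interpretation above.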
![Page 14: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/14.jpg)
Example 2: Chi-square test for goodness of fit – Unequal expected frequencies
• Sometimes the expected frequencies are not evenly balanced across categories.
• E.g. the expected frequencies for the three categories were 15, 15 and 30.
Attitude towards US Military bases in Australia    Frequency of Response (Observed frequencies)    Expected Frequency of responses
In favour    8     15
Against      20    15
Undecided    32    30
![Page 15: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/15.jpg)
Output 2: Chi‐square test – unequal expected frequencies
![Page 16: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/16.jpg)
Interpretation 2: Chi‐square test – unequal expected frequencies
• The output shows that the chi-square value is not significant (p = .079 > .05). (Ho: not rejected)
• Therefore, it can be concluded that there are no significant differences in the frequency of attitudes towards military bases in Australia.
• The results show that people are largely undecided on this issue, chi-square (2, N=60) = 5.067, p > .05.
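As a worked check of these figures, the statistic can be computed term by term from the observed counts (8, 20, 32) and the unequal expected counts (15, 15, 30):

$$\chi^2 = \frac{(8-15)^2}{15} + \frac{(20-15)^2}{15} + \frac{(32-30)^2}{30} = \frac{49}{15} + \frac{25}{15} + \frac{4}{30} \approx 5.067$$

With three categories there are 2 degrees of freedom, and for that case only the upper-tail probability has a simple closed form:

$$p = e^{-\chi^2/2} \approx e^{-2.533} \approx 0.079$$

This matches the reported p = .079, which is above .05.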
![Page 17: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/17.jpg)
Chi-square test of Independence
• Qualitative Variables - Nominal data
• Used to test if the two variables are statistically associated with each other significantly.
• Used to analyze the frequencies of two variables with multiple categories to determine whether the two variables are independent.
• It is possible to do a cross-tabulation (and a chi-squared test – with given table value, df, confidence level) for any two nominal variables in the survey.
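For reference, the standard form of the test statistic for a cross-tabulation is sketched below. The notation is introduced here for illustration and does not appear on the slides: $O_{ij}$ is the observed count in row $i$ and column $j$, $R_i$ and $C_j$ are the row and column totals, $N$ is the sample size, and $r$ and $c$ are the numbers of rows and columns.

$$E_{ij} = \frac{R_i \, C_j}{N}, \qquad \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad df = (r-1)(c-1)$$

The expected count $E_{ij}$ is what each cell would contain if the two variables were independent.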
![Page 18: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/18.jpg)
Example 1: Chi-square test for cross-tab
• Let us assume that we have conducted a consumer survey for a brand of detergent. One of the questions dealt with the income category of the respondent. Another asked the respondent to rate his purchase intentions.
• Ho: There is no significant association between Respondent Income and Purchase Intention.
![Page 19: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/19.jpg)
S. No.   INCOME           CODE   INTENT      INTCODE
1        Less Than 5000   1      NONE        1
2        Less Than 5000   1      LOW         2
3        Less Than 5000   1      LOW         2
4        Less Than 5000   1      NONE        1
5        Less Than 5000   1      HIGH        3
6        5001-10000       2      LOW         2
7        5001-10000       2      HIGH        3
8        5001-10000       2      VERY HIGH   4
9        5001-10000       2      HIGH        3
10       5001-10000       2      LOW         2
11       10001-20000      3      HIGH        3
12       10001-20000      3      VERY HIGH   4
13       10001-20000      3      CERTAIN     5
14       10001-20000      3      HIGH        3
15       10001-20000      3      VERY HIGH   4
16       Above 20000      4      HIGH        3
17       Above 20000      4      CERTAIN     5
18       Above 20000      4      VERY HIGH   4
19       Above 20000      4      CERTAIN     5
20       Above 20000      4      CERTAIN     5
![Page 20: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/20.jpg)
Both variables are coded.
Income codes and their equivalent incomes are:
Code   Income in Rs. per Month
1      Less than 5000
2      5001 to 10,000
3      10,001 to 20,000
4      Above 20,000
Purchase Intention codes are as follows:
Code   Explanation (Value Labels for the Variable)
1      None – No intention to buy
2      Low – Low intention to buy
3      High – High intention
4      Very High – Very high intention
5      Certain – Certain to buy
![Page 21: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/21.jpg)
INCOME Per Month by PURCHASE INTENTION
Income per Month in Rs.
Purchase Intent   Code   Less than 5000   5001-10000   10001-20000   Above 20000   TOTAL
None              1      2                0            0             0             2
Low               2      2                2            0             0             4
High              3      1                2            2             1             6
V. High           4      0                1            2             1             4
Certain           5      0                0            1             3             4
TOTAL                    5                5            5             5             20
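The cross-tabulation above can be rebuilt from the coded input data. The following is a minimal sketch using only the Python standard library. The variable names are assumptions introduced here, and the `records` list is the CODE and INTCODE columns of the 20 respondents typed in by hand.

```python
from collections import Counter

# (income code, purchase-intention code) for respondents 1 to 20, in S. No. order.
records = [
    (1, 1), (1, 2), (1, 2), (1, 1), (1, 3),
    (2, 2), (2, 3), (2, 4), (2, 3), (2, 2),
    (3, 3), (3, 4), (3, 5), (3, 3), (3, 4),
    (4, 3), (4, 5), (4, 4), (4, 5), (4, 5),
]

cell_counts = Counter(records)  # number of respondents in each cell

income_codes = [1, 2, 3, 4]     # columns of the cross-tabulation
intent_codes = [1, 2, 3, 4, 5]  # rows of the cross-tabulation

# One row per purchase-intention code, one column per income code.
table = [
    [cell_counts[(income, intent)] for income in income_codes]
    for intent in intent_codes
]

for intent, row in zip(intent_codes, table):
    print(intent, row, sum(row))
```

Each printed row shows the intention code, the four cell counts and the row total, which can be compared line by line with the table above.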
![Page 22: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/22.jpg)
Cross‐tabulation of code (column‐income per month) and Intcode (row – purchase intent).
![Page 23: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/23.jpg)
Result 1: Chi-Square test for cross-tab
![Page 24: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/24.jpg)
![Page 25: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/25.jpg)
Interpretation 1: Chi‐square test for cross‐tab
• The cross-tabulation shows the number of respondents falling into each cell (a cell is the combination of one INCOME category with one PURCHASE INTENTION category).
• The first line of the chi-squared test reads a significance level of 0.097. This means the chi-squared test is showing a significant association between these two variables at a 90 percent confidence level (equivalent to a 0.10 significance level).
• Thus, we conclude that at the 90 percent confidence level, PURCHASE INTENTION and INCOME are associated significantly with each other. This may lead us to conclude that the price of the detergent is important in its purchase.
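The significance level of 0.097 can be reproduced from the cross-tabulation without SPSS. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the p-value helper uses a closed-form series that holds only when the degrees of freedom are even (here df = 12).

```python
import math


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count under independence
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; this series is valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut needs a positive, even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i       # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


# Rows: None, Low, High, V. High, Certain; columns: the four income codes.
table = [
    [2, 0, 0, 0],
    [2, 2, 0, 0],
    [1, 2, 2, 1],
    [0, 1, 2, 1],
    [0, 0, 1, 3],
]

chi2, df = chi_square_independence(table)
p_value = chi2_upper_tail_even_df(chi2, df)
print(round(chi2, 3), df, round(p_value, 3))
```

This gives a chi-square of about 18.67 on 12 degrees of freedom and a p-value of about 0.097, in line with the figure quoted above. Every expected count in this table is below 5 (they range from 0.5 to 1.5), so the chi-square approximation should be read with caution for a sample of 20.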
![Page 26: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/26.jpg)
Example 2: Chi-square test for Cross-tabs
• Suppose the researcher wants to find the association between the educational background (independent variable) of PGDM students and their performance in terms of grade (dependent variable) secured.
• A bivariate cross-tabulation has been done by combining the above two variables and tabulating the data together.
• Here an assumption is made by our group based on information extracted from the database (performance) of B-schools.
![Page 27: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/27.jpg)
• We want to test, at the 90% and 95% confidence levels, the significance of the association between the EDUCATIONAL BACKGROUND of PGDM students and their PERFORMANCE in terms of GRADE.
![Page 28: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/28.jpg)
• Further, the variables are coded.
• Educational backgrounds and their equivalent codes are:
Educational background   Code
B.Com                    1
B.E.                     2
B.Sc.                    3
B.B.A.                   4
B.A.                     5
• Grade codes are as follows:
Grade Obtained   Grade Code
A                1
B                2
C                3
![Page 29: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/29.jpg)
• These two variables were cross-tabulated for twenty-five observations.
• A cross-tabulation with a Chi-squared test was performed using the SPSS package.
![Page 30: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/30.jpg)
Input data table
S.No.   Roll No.   Background   Code   Grade   Grdcode
1       1          B.Com        1      B       2
2       2          B.Com        1      C       3
3       3          B.Com        1      A       1
4       4          B.Com        1      C       3
5       5          B.Com        1      B       2
6       6          B.E.         2      A       1
7       7          B.E.         2      A       1
8       8          B.E.         2      A       1
9       9          B.E.         2      B       2
10      10         B.E.         2      A       1
11      11         B.Sc.        3      B       2
12      12         B.Sc.        3      B       2
13      13         B.Sc.        3      C       3
14      14         B.Sc.        3      C       3
15      15         B.Sc.        3      C       3
16      16         BBA          4      A       1
17      17         BBA          4      B       2
18      18         BBA          4      C       3
19      19         BBA          4      C       3
20      20         BBA          4      B       2
21      21         B.A.         5      C       3
22      22         B.A.         5      C       3
23      23         B.A.         5      C       3
24      24         B.A.         5      C       3
25      25         B.A.         5      B       2
![Page 31: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/31.jpg)
Output table 2: Grades Vs Entry Qualification
![Page 32: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/32.jpg)
Result 2: Chi-Square test for cross-tab
![Page 33: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/33.jpg)
![Page 34: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/34.jpg)
Interpretation 2: Chi-Square test for cross-tab
• The Chi-square test revealed a significant association between the educational background of the students and their performance in terms of grade.
• A significance level of 0.089 (Pearson's) has been achieved. This means the Chi-square test is showing a significant association between the above two variables at the 91.1% confidence level (100 – 8.9).
• Thus we conclude that at the 90% confidence level, the educational background of PGDM students and their performance in terms of grade are associated significantly with each other, whereas this is not significant at the 95% confidence level.
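The Pearson statistic behind this result can be worked out by hand. The cell counts used here are tallied from the input data table rather than read from the SPSS output: grade A was obtained by 6 students, B by 8 and C by 11. Each of the five backgrounds has 5 students, so the expected count in every cell of a grade column is that column's total divided by 5, giving 1.2, 1.6 and 2.2. Summing the contributions grade by grade, with backgrounds in the order B.Com, B.E., B.Sc., BBA, B.A.:

$$\chi^2_A = \frac{(1-1.2)^2 + (4-1.2)^2 + (0-1.2)^2 + (1-1.2)^2 + (0-1.2)^2}{1.2} = \frac{10.8}{1.2} = 9.0$$

$$\chi^2_B = \frac{(2-1.6)^2 + (1-1.6)^2 + (2-1.6)^2 + (2-1.6)^2 + (1-1.6)^2}{1.6} = \frac{1.2}{1.6} = 0.75$$

$$\chi^2_C = \frac{(2-2.2)^2 + (0-2.2)^2 + (3-2.2)^2 + (2-2.2)^2 + (4-2.2)^2}{2.2} = \frac{8.8}{2.2} = 4.0$$

$$\chi^2 = 9.0 + 0.75 + 4.0 = 13.75, \qquad df = (5-1)(3-1) = 8$$

A chi-square of 13.75 on 8 degrees of freedom has an upper-tail probability of about 0.089, which is consistent with the significance level quoted above.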
![Page 35: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/35.jpg)
• From the obtained contingency coefficient (C) of 0.596, it can be inferred that the association between the dependent and independent variables is fairly strong, as the value 0.596 is closer to 1 than to 0.
• From the Lambda asymmetric value (with grade code dependent) of 0.286, we conclude that there is a moderate level of association between the above two variables. This lambda value tells us that there is a 28.6% reduction in error in predicting the grade of a student when we know his educational background.
• This leads us to conclude that educational background plays a vital role in the performance of the students of PGDM course.
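Both measures of association can be recomputed from the cross-tabulation. The sketch below is illustrative only: the cell counts are tallied from the input data table, the variable names are assumptions, and the formulas used are the usual textbook definitions of the contingency coefficient and of Goodman and Kruskal's lambda (not confirmed by the slides).

```python
import math

# Rows: B.Com, B.E., B.Sc., BBA, B.A.; columns: grade A, B, C.
table = [
    [1, 2, 2],
    [4, 1, 0],
    [0, 2, 3],
    [1, 2, 2],
    [0, 1, 4],
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)  # 25 students

# Pearson chi-square, with expected count = row total * column total / n.
chi2 = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(len(table))
    for j in range(len(col_totals))
)

# Contingency coefficient: C = sqrt(chi2 / (chi2 + n)).
contingency_c = math.sqrt(chi2 / (chi2 + n))

# Lambda with grade as the dependent variable:
# errors when always guessing the overall modal grade, versus
# errors when guessing the modal grade within each background.
errors_without = n - max(col_totals)
errors_with = n - sum(max(row) for row in table)
lambda_grade = (errors_without - errors_with) / errors_without

print(round(chi2, 2), round(contingency_c, 3), round(lambda_grade, 3))
```

The results agree with the reported values: C is about 0.596 and lambda is about 0.286. The lambda figure comes from 14 prediction errors without knowledge of background against 10 errors with it.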
![Page 36: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/36.jpg)
Example 3: Chi-square test for cross tab - 3
• A manufacturer was interested in assessing how children ages four, five and six play with one of the manufacturer's toys. Each child was asked 15 questions. Following the child's completed interview, the parent was asked the same 15 questions to validate the child's answers. The following table lists the number of responses to selected items from the survey. One hundred interviews were conducted with both the parent and the child. Notice that item response rates varied from question to question. For each question, state at least one method that could be used to attempt to correct for this item nonresponse bias.
Question                           # Children Responding   # Parents Responding
Age of child                       95                      100
Location of Play                   80                      85
How much the child liked the toy   30                      50
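To see how much the item response rates differ, the nonresponse percentages can be tabulated from the counts above. This is a small sketch with illustrative variable names. It only quantifies the nonresponse and does not implement any correction method.

```python
interviews = 100  # interviews conducted with both parent and child

# (children responding, parents responding) for each selected item.
responses = {
    "Age of child": (95, 100),
    "Location of play": (80, 85),
    "How much the child liked the toy": (30, 50),
}

# Item nonresponse in percent, as (children, parents).
nonresponse_pct = {
    question: (
        100 * (interviews - children) / interviews,
        100 * (interviews - parents) / interviews,
    )
    for question, (children, parents) in responses.items()
}

for question, (child_pct, parent_pct) in nonresponse_pct.items():
    print(f"{question}: children {child_pct}%, parents {parent_pct}%")
```

The last item stands out, with 70% of children and 50% of parents not answering.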
![Page 37: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/37.jpg)
Result 3: Chi-square test for cross-tab
![Page 38: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/38.jpg)
![Page 39: Chi Square Test for Cross Tab - Session 9 & 10](https://reader033.vdocuments.net/reader033/viewer/2022051613/5513193e4a7959c4028b4b67/html5/thumbnails/39.jpg)
• Thank you…