DTC Quantitative Research Methods
Three (or more) Variables: Extensions to Cross-tabular Analyses
Thursday 13th November 2014

Upload: louisa-miles

Post on 21-Dec-2015




Multivariate analysis

• So far we have tended to concentrate on two-way relationships (e.g. between gender and participation in sports), but we have started to look at three-way relationships (e.g. the gendering of the relationship between age and participation in sports).

• Social relationships and phenomena are usually more complex than is allowed for in a bivariate analysis.

• Multivariate analyses are thus commonly used as a reflection of this complexity.

• Hence, this week we will look briefly at the rationale for multivariate analysis and think about cross-tabular techniques for conducting this form of analysis.


Multivariate analysis

De Vaus (1996: 198) suggests that we can use multivariate analysis to elaborate bivariate relationships, in order to answer the following questions:

1. Why does the relationship [between two variables] exist? What are the mechanisms and processes by which one variable is linked to another?

2. What is the nature of the relationship? Is it causal or non-causal?

3. How general is the relationship? Does it hold for people in general, or is it specific to certain subgroups?

This is because multivariate analysis enables the identification of:

• Spurious relationships

• Intervening variables

• The replication of relationships

• The specification of relationships


Spurious relationships

• A spurious relationship exists where two variables are not related but a relationship between them is generated by their relationships with a third variable.

• For example:

[Diagram: Age is linked to both Height and Reading ability; the link between Height and Reading ability is labelled "Spurious relationship".]
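The height and reading ability example can be made concrete with a small sketch. The counts below are invented purely for illustration (they are not from any survey), and the names `younger`, `older` and `share_high` are hypothetical. Within each age group, height makes no difference to the share of good readers, yet pooling the two groups produces an apparent association.

```python
# Hypothetical counts of children, invented for illustration only.
# Keys are (height, reading ability); values are numbers of children.
younger = {("short", "low"): 40, ("short", "high"): 10,
           ("tall", "low"): 8, ("tall", "high"): 2}
older = {("short", "low"): 2, ("short", "high"): 8,
         ("tall", "low"): 8, ("tall", "high"): 32}

def share_high(counts, height):
    """Proportion of children of the given height with high reading ability."""
    high = counts[(height, "high")]
    low = counts[(height, "low")]
    return high / (high + low)

# Ignore age by adding the two layers together cell by cell.
pooled = {cell: younger[cell] + older[cell] for cell in younger}

for label, counts in [("Pooled", pooled), ("Younger", younger), ("Older", older)]:
    print(label,
          "short:", round(share_high(counts, "short"), 2),
          "tall:", round(share_high(counts, "tall"), 2))
```

In the pooled table taller children look like better readers, but the difference vanishes within each age group, which is the signature of a spurious relationship generated by age.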


Intervening variables

• Sometimes, although there is a real (non-spurious) relationship between two variables, we want to establish why that relationship exists.

• For example, if we discover that there is a relationship between risk of unemployment and ethnicity, we want to know why that is the case. One possibility is that some ethnic groups have lower educational levels and that this has implications for their ability to get work. In this case education would be an intervening variable.

• Intervening variables enable us to answer questions about the bivariate relationship between two variables – suggesting that (in this case) the relationship between ethnicity and unemployment is not direct but (at least in part) occurs via educational levels.

[Diagram: Ethnicity → Education → Unemployment]


Is it spurious or intervening?

When we do statistical tests we will obtain similar results for a spurious relationship and for an intervening variable: in both cases the effect of the independent variable on the dependent variable is weakened once we control for the third variable. So how do we know whether this third variable provides evidence of a spurious relationship or is an intervening variable?

– There is no hard-and-fast statistical rule for deciding this.

– But if we are suggesting that a variable is intervening, the logic of the process must make sense, i.e. you must have a cogent theoretical reason for thinking that your independent variable affects the intervening variable, which in turn affects the dependent variable.

– This kind of causal process is easiest to argue for when the timing of events supports it, i.e. when the intervening variable can be seen to occur in between the independent and dependent variables (e.g. education in the earlier example of the relationship between ethnicity and unemployment).


Replication

• Sometimes when we have found a basic (‘zero-order’) relationship between two variables (e.g. ethnicity and unemployment), we want to demonstrate that this relationship exists within different subgroups of the population (e.g. for both men and women; for those of different ages…).

• Where the relationship is replicated we can rule out the possibility that it is produced by the variable in question, either as an intervening variable or in a spurious way.


Specification

• Sometimes a particular variable only has an effect in specific situations. The variable that determines these situations is said to interact with the independent variable.

• For example, De Vaus’s book suggests that going to a religious school makes boys more religious but has little or no effect on girls.

• In this case type of school interacts with gender: religious education only affects students’ religiosity in combination with being male.


Specification (interactions)

Graphical representation of the relationship between religious education and religiousness, controlling for sex:

[Figure: two panels, "Interaction between sex and religiousness of school" and "No interaction". Each plots Religiousness (low to high) against "How religious was your education?" (Not at all to Very), with separate lines for boys and girls.]


Using Cramér’s V to classify a multivariate situation

If we use SPSS to produce a cross-tabulation of two variables, then we can elaborate this relationship by introducing a third variable as a layer variable. Examining the Cramér’s V values for the original cross-tabulation and for the layers of the elaborated cross-tabulation tells us what kind of situation we are looking at:

• If the Cramér’s V values for the layers are all similar, then we have a situation of replication.

• If the Cramér’s V values are smaller for the layered cross-tabulation than the value for the original cross-tabulation, then we either have a situation where the third variable is acting as an intervening variable, or one where it is inducing a spurious relationship between the original two variables. Deciding between these two options involves reflecting on whether the third variable makes sense conceptually as part of some causal mechanism linking the original two variables.


Using Cramér’s V to classify a multivariate situation (continued)

• If the Cramér’s V values for the layered cross-tabulation vary in size, perhaps with some being smaller than the original value and some being as large as or larger than it, then the situation is one of specification.

• However, if one or more of the Cramér’s V values is larger than the original value, then a failure to take account of the third variable in the first instance may also have been suppressing an underlying relationship between the two variables.

• This latter situation is a variation on the theme of spuriousness: in this case, the absence of a bivariate relationship is spurious rather than the presence of one!
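The rules of thumb for reading layered Cramér’s V values can be sketched as a small function. This is only an illustration: the function name `classify_layers` and the tolerance `tol` (how close two values must be to count as "similar") are assumptions, not part of the slides or of SPSS, and treating "all layers larger than the original" as suppression is one possible reading of the rules rather than a fixed definition.

```python
def classify_layers(original_v, layer_vs, tol=0.05):
    """Label a three-variable situation from Cramér's V values.

    original_v: V for the original two-way cross-tabulation.
    layer_vs:   V for each layer of the third variable.
    tol:        assumed cut-off for treating two V values as 'similar'.
    """
    diffs = [v - original_v for v in layer_vs]
    if all(abs(d) <= tol for d in diffs):
        return "replication"
    if all(d < -tol for d in diffs):
        # Statistics alone cannot separate these two; theory has to decide.
        return "intervening or spurious"
    if all(d > tol for d in diffs):
        return "suppression"
    return "specification"

print(classify_layers(0.30, [0.29, 0.31]))
print(classify_layers(0.30, [0.10, 0.12]))
print(classify_layers(0.05, [0.30, 0.28]))
print(classify_layers(0.30, [0.45, 0.10]))
```

Any such cut-off is arbitrary, so the label is a prompt for interpretation rather than a test result.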


More generally…

• Multivariate analyses can utilise a variety of techniques, depending on the form of the data, the research questions to be addressed, etc. We will be looking at multiple (linear) regression, but other ‘popular’ techniques include logistic regression and log-linear models. The aim in each case is to determine whether the relationship between two variables persists or is altered when we ‘control for’ a third (or fourth, or fifth...) variable.

• Multivariate analysis can also enable us to establish which variable(s) has/have the greatest impact on a dependent variable, e.g. is sex more important than ‘race’ in determining income?

• It is often important for a multivariate analysis to check for interactions between the effects of independent variables, as discussed earlier under the heading of specification.


An example (from BSA 2006)

View on whether pre-marital sex wrong

Has religion?   Always   Mostly   Sometimes   Rarely   Not at all    Total
No                   7       13          41       52          361      474
                  1.5%     2.7%        8.6%    11.0%        76.2%   100.0%
Yes                 51       59          87       64          296      557
                  9.2%    10.6%       15.6%    11.5%        53.1%   100.0%
Total               58       72         128      116          657     1031
                  5.6%     7.0%       12.4%    11.3%        63.7%   100.0%

χ² (4 d.f.) = 80.81 (p < 0.001); Cramér’s V = 0.280
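The chi-square and Cramér’s V figures for this table can be reproduced by hand. The sketch below uses only the observed counts from the table and the standard definition V = sqrt(χ² / (n × min(r − 1, c − 1))), where r and c are the numbers of rows and columns. The helper names `chi_square` and `cramers_v` are illustrative; this is not SPSS output.

```python
from math import sqrt

# Observed counts from the table above (rows: has religion No / Yes;
# columns: Always, Mostly, Sometimes, Rarely, Not at all).
observed = [
    [7, 13, 41, 52, 361],
    [51, 59, 87, 64, 296],
]

def chi_square(table):
    """Return the Pearson chi-square statistic and the sample size."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (count - expected) ** 2 / expected
    return stat, n

def cramers_v(table):
    """Cramér's V: chi-square rescaled by sample size and table shape."""
    stat, n = chi_square(table)
    k = min(len(table), len(table[0])) - 1
    return sqrt(stat / (n * k))

chi2, n = chi_square(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
v = cramers_v(observed)
print(f"n = {n}, df = {df}, chi-square = {chi2:.2f}, Cramér's V = {v:.3f}")
```

Because the table has only two rows, min(r − 1, c − 1) is 1 and V reduces to sqrt(χ² / n).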


But if we split the crosstabulation by age...

Under 45: χ² (4 d.f.) = 53.52 (p < 0.001); Cramér’s V = 0.334

45 or over: χ² (4 d.f.) = 22.27 (p < 0.001); Cramér’s V = 0.201

Hence a small part of the bivariate relationship was a spurious consequence of age (since 53.52 + 22.27 = 75.79, which is less than 80.81). The Cramér’s V values show elements both of replication (since there is a statistically significant relationship for both age groups) and of specification (since the relationship appears weaker for the older age group, i.e. the effects of religion and age interact).
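The arithmetic behind this comparison can be laid out as a short sketch. It uses only the statistics reported on the slides; the variable names are illustrative. It sums the chi-square values for the two age layers, compares the sum with the original value, and reports which layer has the smaller Cramér’s V.

```python
# Statistics reported on the slides: (chi-square, Cramér's V).
original_chi2, original_v = 80.81, 0.280
layers = {
    "Under 45": (53.52, 0.334),
    "45 or over": (22.27, 0.201),
}

# Part of the original chi-square not reproduced within the age layers.
layer_sum = sum(chi2 for chi2, _ in layers.values())
shortfall = original_chi2 - layer_sum

# Layer in which the religion / attitude relationship is weakest.
weakest = min(layers, key=lambda label: layers[label][1])

print(f"Sum of layer chi-squares: {layer_sum:.2f}")
print(f"Shortfall relative to the original: {shortfall:.2f}")
print(f"Weakest relationship: {weakest} (V = {layers[weakest][1]:.3f})")
for label, (_, v) in layers.items():
    direction = "larger" if v > original_v else "smaller"
    print(f"{label}: V = {v:.3f}, {direction} than the original {original_v:.3f}")
```

One layer sits above the original V and one below it, which is the mixed pattern described earlier as specification.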


Testing for interactions

• Unfortunately, as mentioned earlier, testing for an interaction in a three-way cross-tabulation requires knowledge of an additional technique (hierarchical log-linear modelling).