Workshop 3 on ‘Focusing on the Case’: Qualitative Modelling

Upload: sophia-hagan

Post on 28-Mar-2015


Page 1: Workshop 3 on Focusing on the Case Qualitative Modelling

Workshop 3 on ‘Focusing on the Case’

Qualitative Modelling

Page 2: Workshop 3 on Focusing on the Case Qualitative Modelling

Topics Covered Herein

Qualitative Regression Models
  The Odds Ratio – an online ppt to learn about it
  The log-odds as a Dependent Variable

Alternatives:
  Dummy variables as independent variables
  Log-Linear analysis without a dep/indep causal structure being superimposed

Discussion of Causality in Multi-Level Situations
  Nested multi-level situations
  Non-Nested multi-level stratified reality

Plurality of Causes – Is it operationalisable?

Suggested Readings

Page 3: Workshop 3 on Focusing on the Case Qualitative Modelling

Qualitative Regression Models

Consider the participation of men and women in the labour force.

In logistic regression, in particular, the measurement mode is not assigned a priori except in so far as cases (in this case people) are identified as bearers of data. Even here, the ‘case’ may bear not only the weight of causes that operate on or through persons, but also structural relations between larger classes of people and institutional factors that affect such a person. The variables in the regression are of a variety of levels of measurement (not all are continuous). The dependent variable in particular is qualitative in character.

Page 4: Workshop 3 on Focusing on the Case Qualitative Modelling

We make two translations of the act of ‘entering’ a labour market. First, we allow people to declare how they have done that; secondly, we group these declarations into the larger categories ‘active’ and ‘inactive’.

We then transform this binary categorisation into a new continuous variable, the logit of activity. The logit is defined as the log of the odds of being active.

Page 5: Workshop 3 on Focusing on the Case Qualitative Modelling

The odds of being active are defined as the ratio of the probability of being active to the probability of being inactive.

We take its logarithm, giving a new number on a wider scale ranging from negative to positive values. The logit, i.e. the log odds, is not constrained to be between 0 and 1.
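
The definitions of the odds and the logit can be sketched in a few lines of Python. This is a minimal illustration only; the function names `odds`, `logit` and `inverse_logit` are chosen here and do not come from the workshop materials.

```python
import math


def odds(p):
    """Odds of being active: P(active) / P(inactive)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def logit(p):
    """Log of the odds; unlike p, it can take any negative or positive value."""
    return math.log(odds(p))


def inverse_logit(x):
    """Map a log-odds value back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        print(f"p={p:.1f}  odds={odds(p):.3f}  logit={logit(p):+.3f}")
```

A probability of 0.5 gives odds of 1 and a logit of 0; probabilities below 0.5 give negative logits and those above 0.5 give positive ones.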

Page 6: Workshop 3 on Focusing on the Case Qualitative Modelling

One equation, for instance, summarising some results in this particular case, can be represented as follows:

  log of the odds of employment = -1.47*LTLI + 0.27*London + 0.61*Degree - 0.76*Noqual + 0.92*Wife + 0.61

Each number shows whether the odds of being employed are raised or lowered by the presence of a given characteristic. In this equation, the following definitions are used.

Page 7: Workshop 3 on Focusing on the Case Qualitative Modelling

  LTLI = Long-term limiting illness (specifically, the person reports that they are unable to do some forms of work due to an illness or other disabling condition)

  London = Lives in Inner or Outer London or the rest of the Southeast

  Degree = Has a degree and/or a higher degree

  Noqual = Has no qualifications, i.e. no CSEs or O-Levels or other qualifications

  Wife = Is married or cohabiting, and is female
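
As a rough sketch of how the equation is read, the following Python applies the coefficients quoted on the slide to a person described by 0/1 indicators. The coefficients and the constant are copied from the equation; the function names and the example people are hypothetical, and each characteristic is assumed to be coded 1 if present and 0 if absent.

```python
import math

# Coefficients copied from the slide's equation (log-odds scale).
COEFFICIENTS = {
    "LTLI": -1.47,
    "London": 0.27,
    "Degree": 0.61,
    "Noqual": -0.76,
    "Wife": 0.92,
}
INTERCEPT = 0.61  # constant term of the equation


def log_odds(person):
    """Constant plus the coefficient of every characteristic coded 1."""
    return INTERCEPT + sum(b * person.get(name, 0) for name, b in COEFFICIENTS.items())


def probability(person):
    """Convert the log-odds of employment back to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds(person)))


def odds_ratios():
    """exp(coefficient): the factor by which each characteristic multiplies the odds."""
    return {name: math.exp(b) for name, b in COEFFICIENTS.items()}


if __name__ == "__main__":
    # Hypothetical people, for illustration only.
    baseline = {}
    graduate_in_london = {"London": 1, "Degree": 1}
    ill_without_qualifications = {"LTLI": 1, "Noqual": 1}
    for label, person in [
        ("baseline", baseline),
        ("graduate in London", graduate_in_london),
        ("LTLI, no qualifications", ill_without_qualifications),
    ]:
        print(f"{label}: log-odds={log_odds(person):+.2f}, p={probability(person):.3f}")
```

A negative coefficient corresponds to an odds ratio below 1 (the odds are lowered) and a positive coefficient to an odds ratio above 1 (the odds are raised).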

Page 8: Workshop 3 on Focusing on the Case Qualitative Modelling

In conclusion, critics of statistics argue that statisticians seek only regularities and assume closure in reality. In this section we showed that methodological closure can be assumed, and regularities within the data-set can be sought, without assuming closure in reality.

Look at the Appendix to the handout to see whether the coefficients exhibit regularity.

Some do.

Some don’t.

What would you do?

Page 9: Workshop 3 on Focusing on the Case Qualitative Modelling

Transition to Multi-Level Models

Page 10: Workshop 3 on Focusing on the Case Qualitative Modelling

H1 - wage variation across occupational groups, acting as a proxy for a variety of labour-market segmenting practices, notably boundaries limiting potential entrants into occupations, is considerable even after person-level characteristics and occupation-specific characteristics are accounted for.

Page 11: Workshop 3 on Focusing on the Case Qualitative Modelling

A graph of these advantages, with a constant slope for educationyears, is called a random-intercept model. It is also called a fixed-effects model, since the effects of education and segpoint are fixed effects rather than slopes that vary from one SOC2 group to another.
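
In conventional multilevel notation, a random-intercept model of this kind can be written roughly as follows. The symbols are assumptions introduced here, not taken from the slides: i indexes individuals, j indexes SOC2 groups, and any further controls are omitted.

```latex
\ln(\text{wage}_{ij}) = \beta_0 + \beta_1\,\text{education}_{ij} + \beta_2\,\text{segpoint}_{j} + u_{0j} + e_{ij},
\qquad u_{0j} \sim N(0, \sigma^2_{u0}), \quad e_{ij} \sim N(0, \sigma^2_{e})
```

Only the intercept shift u_{0j} differs between occupational groups; the slopes \beta_1 and \beta_2 are the same for every group.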

Page 12: Workshop 3 on Focusing on the Case Qualitative Modelling
Page 13: Workshop 3 on Focusing on the Case Qualitative Modelling
Page 14: Workshop 3 on Focusing on the Case Qualitative Modelling

The model with the slope of lnwage on education varying by SOC2 occupational group looks like this. The slope differentials are only just statistically significant (t = 2.5 or so):
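
As a notational aid (separate from the graph on the slide), a model whose education slope varies by group can be sketched as below. The symbols are assumptions introduced here: i indexes individuals, j indexes SOC2 groups, u_{0j} is the group's intercept shift and u_{1j} its slope shift; other covariates are omitted.

```latex
\ln(\text{wage}_{ij}) = \beta_0 + (\beta_1 + u_{1j})\,\text{education}_{ij} + u_{0j} + e_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix} \sim
N\!\left( \mathbf{0},\;
\begin{pmatrix} \sigma^2_{u0} & \sigma_{u01} \\ \sigma_{u01} & \sigma^2_{u1} \end{pmatrix} \right)
```

Setting u_{1j} = 0 for every group gives back a model in which only the intercepts vary.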

Page 15: Workshop 3 on Focusing on the Case Qualitative Modelling

H1: supported by the change in variance explained by the SOC2 categories in themselves.

MODEL 1: Empty model showing the gross variance of LnWage
MODEL 2: Empty model with Gender in
MODEL 3: Complex model with Gender and other factors in

                                                   MODEL 1   MODEL 2                       MODEL 3
Between-occupation correlation of wages            .338      .287                          .274
Change in the variance explained rel. to Model 1   -         .051 (falls by 15% of .338)   .064 (falls by 19% of .338)
Statistical test of the change in Log Likelihood   -
-2LL                                               6361      6289                          3675

Between-occupation correlation of wages is defined as the variance of the intercepts for the different occupations, relative to the total variance. The statistical test is of the change in the Log Likelihood corresponding to the change from Model 1.
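
The arithmetic behind these figures can be checked with a short Python sketch. The correlations and -2LL values are copied from the slide; the function names are hypothetical. Treating the change in -2LL as a likelihood-ratio statistic is an assumption here (it requires nested models fitted by maximum likelihood to the same cases), and so is the single degree of freedom used for adding Gender in Model 2.

```python
import math

# Figures copied from the slide.
BETWEEN_OCC_CORRELATION = {"Model 1": 0.338, "Model 2": 0.287, "Model 3": 0.274}
MINUS_2LL = {"Model 1": 6361, "Model 2": 6289, "Model 3": 3675}


def fall_relative_to_model_1(model):
    """Absolute fall in the between-occupation correlation and its share of Model 1's value."""
    base = BETWEEN_OCC_CORRELATION["Model 1"]
    drop = base - BETWEEN_OCC_CORRELATION[model]
    return drop, drop / base


def deviance_change(model):
    """Fall in -2LL relative to Model 1."""
    return MINUS_2LL["Model 1"] - MINUS_2LL[model]


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


if __name__ == "__main__":
    for model in ("Model 2", "Model 3"):
        drop, share = fall_relative_to_model_1(model)
        print(f"{model}: falls by {drop:.3f} ({share:.0%} of .338), "
              f"-2LL falls by {deviance_change(model)}")
    # Assumption: Model 2 adds one parameter (Gender), so 1 degree of freedom.
    print("p-value for Model 2 vs Model 1:", chi2_sf_1df(deviance_change("Model 2")))
```

The degrees of freedom for Model 3 depend on how many "other factors" were added, which the slide does not state, so no p-value is computed for it.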

Page 16: Workshop 3 on Focusing on the Case Qualitative Modelling
Page 17: Workshop 3 on Focusing on the Case Qualitative Modelling

Under this particular model, however, the obvious association of the predicted wage with the 'numerical label' of the SOC2 category disappears. Notice:

Page 18: Workshop 3 on Focusing on the Case Qualitative Modelling

                     MODEL 1   MODEL 2   MODEL 3
-2LL                 6361      6289      3675
n of raw cases       4050      4050      4050
n of level 2 cases   77        77        77

Page 19: Workshop 3 on Focusing on the Case Qualitative Modelling

H2b) Furthermore, the interaction effect of SEGPOINT with GENDER in linear regression is not significant. (Need to test again for multilevel.)

THE TEST: basic two-level model, lacking some controls:

SAME MODEL but with SEGFEM = SEGPOINT * FEMALE:

Page 20: Workshop 3 on Focusing on the Case Qualitative Modelling

It appears that women gain less than men from being in male-dominated occupations; equivalently, since the statistical test is symmetric, men lose more from being in women-dominated occupations than women do. The graph illustrates it; note that the y-intercept has not been corrected for other factors, but the slopes show the differentiated SEGPOINT effect. The slope for women is lower. The slope for men is relatively higher; note that in the equation, the direct impact of SEGPOINT (which now applies to males only) is 0.025, or 2.5% per 10% change of SEGPOINT. Men lose 2.5% for each 10% downward they go toward a female-dominated occupation. Women, relative to the mean, lose only (0.025 - 0.011) = 0.014, or 1.4%.
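
The way the interaction term splits the SEGPOINT slope by gender can be sketched as follows. The two coefficients are the ones quoted in the text; the negative sign on the SEGFEM coefficient is inferred from the subtraction (0.025 - 0.011), and the function names are hypothetical. The percentage reading treats a change in lnwage as an approximate proportional change in wages.

```python
B_SEGPOINT = 0.025  # slope for men, per 10-point step in SEGPOINT
B_SEGFEM = -0.011   # interaction SEGPOINT * FEMALE; sign inferred from (0.025 - 0.011)


def segpoint_slope(female):
    """Slope of lnwage on SEGPOINT: the interaction applies only when FEMALE = 1."""
    return B_SEGPOINT + B_SEGFEM * (1 if female else 0)


def approx_percent_change(female, steps_of_ten_points):
    """Approximate % change in wage for a move of the given number of 10-point steps."""
    return 100.0 * segpoint_slope(female) * steps_of_ten_points


if __name__ == "__main__":
    # One 10-point step toward a female-dominated occupation (SEGPOINT falls).
    print("men:  ", approx_percent_change(False, -1))
    print("women:", approx_percent_change(True, -1))
```

A step of -1 represents a 10-point move toward a female-dominated occupation, so both results are negative, with the loss for women smaller in size than the loss for men.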

Page 21: Workshop 3 on Focusing on the Case Qualitative Modelling
Page 22: Workshop 3 on Focusing on the Case Qualitative Modelling


Page 23: Workshop 3 on Focusing on the Case Qualitative Modelling
Page 24: Workshop 3 on Focusing on the Case Qualitative Modelling

The use of SOC2 as a level assumes, in statistical discourse, that the occupational categories are a 'random sample of a global population of occupational categories'. The population does not explicitly refer to any population outside Great Britain or in a year other than 1999/2000, but rather is a hypothetical construction to enable us to use inferential discourse in deriving statistical significance values from the variables that exist at SOC2 level, notably 'SEGPOINT'. Here, inference at Level 2 means inferring from the sample data to the population. (Since this population is purely hypothetical, and we actually have a full account of every SOC2 category for 1999/2000, this is slightly misleading. By comparison, the sampling at Level 1, for individuals, and the weighting of the sample cases make a lot of sense and are based in reality.)

Page 25: Workshop 3 on Focusing on the Case Qualitative Modelling

This graph shows merely that for a given occupation, whose mean education is on the x-axis, the predicted level of lnwage is the height of the point on the y-axis.

Page 26: Workshop 3 on Focusing on the Case Qualitative Modelling

The corresponding multi-level random effects model, with tiny variations in slopes and a better fit than EDMEAN alone, looks like this (very similar to the original SOC2 with EDSCALE model):

Page 27: Workshop 3 on Focusing on the Case Qualitative Modelling
Page 28: Workshop 3 on Focusing on the Case Qualitative Modelling
Page 29: Workshop 3 on Focusing on the Case Qualitative Modelling

Summary

Discuss topics covered
Reiterate welcome
Reminder to submit paperwork
Wrap-up