Missing income data in the Millennium Cohort Study: Evidence from the first two sweeps

Page 1: Missing income data in the millennium cohort study: Evidence from the first two sweeps

Missing income data in the Millennium Cohort Study:

Evidence from the first two sweeps

Authors: Denise Hawkes and Ian Plewis

Discussant: Nicholas Biddle

Page 2

Introduction and overview

Data – Millennium Cohort Study

Research questions – What are the factors associated with non-response? More specifically:
- Are there within-household and individual correlations for missing income data?
- Is the sex of the interviewer an important explanatory variable?
- How is missing data in sweep one related to missing data in sweep two?
- Is attrition at sweep two related to the level of household income or the failure to provide data in sweep one?

Method:
- Descriptive analysis
- Binary and multinomial logit models with non-response as the dependent variable
- Binary logit with attrition between sweep one and sweep two as the dependent variable
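As a rough illustration of the binary logit idea: with a single binary covariate (say, self-employed versus employee), the odds ratio such a model estimates equals the cross-product ratio of the 2x2 table of covariate by non-response. The sketch below uses hypothetical counts, not Millennium Cohort Study data, and the function name is an assumption made for the example.

```python
import math

def odds_ratio(missing_exposed, responded_exposed, missing_ref, responded_ref):
    """Odds ratio of income non-response: exposed group vs. reference group."""
    odds_exposed = missing_exposed / responded_exposed
    odds_ref = missing_ref / responded_ref
    return odds_exposed / odds_ref

# Hypothetical counts (NOT taken from the Millennium Cohort Study).
or_self_emp = odds_ratio(missing_exposed=60, responded_exposed=240,
                         missing_ref=100, responded_ref=2400)

# A binary logit with this one dummy variable would estimate log(odds ratio) as its slope.
beta = math.log(or_self_emp)
print(f"odds ratio = {or_self_emp:.2f}, logit coefficient = {beta:.3f}")
```

With several covariates the odds ratios are adjusted for one another and have to come from a fitted model rather than from a single table.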

Page 3

Data

Millennium Cohort Study
- First sweep – 18,819 babies born in the UK from 1st September 2000 (from 18,552 families). Interviewed when baby was 9 months old.
- Second sweep – 14,898 families from the original sample and 692 new families. Interviewed when children were around 3 years old.
- Information from main respondent (usually mother) and partner of respondent (usually father)

Incomplete information on income through:
- Unit non-response (response rate 72% in first sweep)
- Partner non-response (88% of families with partners responded)
- Item non-response for income (6% of main respondents and partners did not provide income data)
- Attrition between sweeps (79% of eligible families responded in sweep two)

Income information:
- Collected from those currently doing paid work, those who have a paid job but are on leave, and those who have worked in the past but have no current job
- For employees – total take-home pay and gross pay
- For self-employed – ‘amount you personally took out of the business after all taxes and costs’
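The sample counts on this slide can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are made up for the example, and the crude retention share uses all sweep-one families as its denominator, so it need not match the 79% quoted for eligible families.

```python
# Counts as quoted on the slide.
babies_sweep1 = 18_819
families_sweep1 = 18_552
original_families_sweep2 = 14_898
new_families_sweep2 = 692

# Babies beyond one per family (families with more than one cohort baby).
extra_babies = babies_sweep1 - families_sweep1

# Total families interviewed at sweep two.
families_sweep2_total = original_families_sweep2 + new_families_sweep2

# Crude share of sweep-one families seen again at sweep two.
# Denominator is ALL sweep-one families, not the "eligible" families of the slide.
retained_share = original_families_sweep2 / families_sweep1

print(extra_babies, families_sweep2_total, f"{retained_share:.1%}")
```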

Page 6

Patterns of income response

Original sample (paper has information on new families and proxies)

| | Sweep one: Main | Sweep one: Partner | Sweep two: Main | Sweep two: Partner |
|---|---|---|---|---|
| Income response | 45.9% | 64.7% | 50.6% | 62.9% |
| Don’t know | 1.8% | 2.1% | | |
| Refusal | 0.9% | 2.1% | | |
| Total non-response | 2.7% | 4.3% | 4.4% | 8.7% |
| Not applicable | 51.5% | 31.0% | 45.1% | 28.4% |

Sample: 18,552 (sweep one); 14,898 (sweep two)
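Because roughly half of main respondents fall under "not applicable", the non-response percentages in this table understate non-response among those who were actually asked about income. A minimal sketch of that conversion, using the percentages from the table (the function name is an assumption for illustration):

```python
# (income response %, total non-response %) as given in the table.
patterns = {
    ("sweep one", "main"): (45.9, 2.7),
    ("sweep one", "partner"): (64.7, 4.3),
    ("sweep two", "main"): (50.6, 4.4),
    ("sweep two", "partner"): (62.9, 8.7),
}

def conditional_nonresponse(responded_pct, nonresponse_pct):
    """Percent of those asked about income who did not provide it."""
    return 100 * nonresponse_pct / (responded_pct + nonresponse_pct)

rates = {key: conditional_nonresponse(*vals) for key, vals in patterns.items()}

for (sweep, who), rate in rates.items():
    print(f"{sweep:9s} {who:7s} {rate:.1f}%")
```

On this calculation the sweep-one rates come out at about 6% for both main respondents and partners, in line with the item non-response figure quoted on the data slide.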

Page 8

Modelling non-response – Main respondent

Columns: Sweep one; Sweep two Spec. (I), Spec. (II), Spec. (III). Values are listed in slide order; rows with fewer than four values had blank cells on the slide.

Self employed: 6.4, 6.8, 6.6, 6.7
Has a partner: 0.58, 0.57, 0.56

Social class (reference: managerial and professional)
- Intermediate: 1.6
- Small employers and self employment: 1.8
- Lower supervisors and technical
- Semi routine and routine

Ethnicity (reference: white)
- Mixed
- Indian: 2.4, 2.3, 2.3
- Pakistani and Bangladeshi
- Black or Black British: 1.6
- Other ethnic group: 2.3

Country (reference: England)
- Wales
- Scotland
- Northern Ireland: 1.7, 1.5

Respondent did not respond in sweep one: -, -, 3.0, 3.0
Respondent same in sweep one and two: -, -, -, 5.3

Sample size: 8,190, 5,800, 5,800, 5,800
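The transcript does not label the figures in this table; since the summary slide speaks of "higher odds", they are read here as odds ratios, which is an assumption. Under that reading, the sketch below shows how an odds ratio translates into a probability once a baseline is fixed. The 5% baseline is hypothetical and chosen only for illustration.

```python
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)

baseline = 0.05  # hypothetical non-response probability for the reference group

# Sweep-one figures from the table, read as odds ratios (assumption).
p_self_employed = apply_odds_ratio(baseline, 6.4)
p_has_partner = apply_odds_ratio(baseline, 0.58)

print(f"self employed: {p_self_employed:.3f}, has a partner: {p_has_partner:.3f}")
```

The point of the sketch is that an odds ratio of 6.4 does not mean a probability 6.4 times as large; the implied probability depends on the baseline.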

Page 10

Modelling non-response – Partner (I)

Columns: Sweep one; Sweep two Spec. (I), Spec. (II), Spec. (III). Values are listed in slide order; rows with fewer than four values had blank cells on the slide.

Self employed: 1.7, 3.6, 3.6, 3.6

Social class (reference: managerial and professional)
- Intermediate
- Small employers and self employment: 3.0
- Lower supervisors and technical: 0.68
- Semi routine and routine: 0.66

NVQ levels (reference: none)
- NVQ Level 1
- NVQ Level 2: 0.63
- NVQ Level 3: 0.59
- NVQ Level 4: 0.47
- NVQ Level 5: 0.34
- Other/overseas qual only

Ethnicity (reference: white)
- Mixed: 2.3, 2.4, 2.5
- Indian: 1.8, 2.5, 2.3, 2.3
- Pakistani and Bangladeshi: 2.2, 2.4, 2.2, 2.2
- Black or Black British
- Other ethnic group: 2.0

Owner occupier: 0.76, 0.76, 0.77
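Values below 1 are easier to read as a percentage reduction in odds, or as the reciprocal when the comparison is reversed. The following sketch assumes the NVQ figures in this table are odds ratios relative to partners with no qualifications; the helper name is hypothetical.

```python
def pct_change_in_odds(odds_ratio):
    """Percentage change in odds relative to the reference category."""
    return (odds_ratio - 1) * 100

# NVQ figures from the table, read as odds ratios (assumption).
nvq = {"NVQ Level 2": 0.63, "NVQ Level 3": 0.59,
       "NVQ Level 4": 0.47, "NVQ Level 5": 0.34}

changes = {level: pct_change_in_odds(value) for level, value in nvq.items()}
for level, change in changes.items():
    print(f"{level}: {change:+.0f}% odds of non-response vs. no qualifications")

# Reversing the comparison: no qualifications relative to NVQ Level 5.
reversed_or = 1 / nvq["NVQ Level 5"]
print(f"no qualifications vs. NVQ Level 5: odds ratio {reversed_or:.2f}")
```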

Page 12

Modelling non-response – Partner (II)

Columns: Sweep one; Sweep two Spec. (I), Spec. (II), Spec. (III)

Country (reference: England)
- Wales
- Scotland
- Northern Ireland: 1.9, 1.5, 1.6, 1.6

Respondent did not respond in sweep one: -, -, 4.6, 4.5
Respondent same in sweep one and two: -, -, -, 0.39

Sample size: 10,754, 7,893, 7,893, 7,893

Page 13

Other modelling – Multinomial logit and attrition

Multinomial logit – Response vs. don’t know vs. refuse

Main respondent:
- Self employed only significantly more likely to be ‘don’t know’, not ‘refusal’
- Same with social class variables
- Black or Black British as well as Northern Ireland more likely to refuse

Partner respondent:
- Self employed significantly more likely to refuse and not know
- NVQ levels and ethnicity both associated with refusal

Attrition at sweep two:
- Higher income in sweep one associated with lower odds of attrition between sweep one and sweep two
- Main income and partner income non-response in sweep one associated with higher odds of attrition between sweep one and sweep two
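To make the three-outcome setup concrete, here is a minimal sketch of how a multinomial logit turns outcome-specific linear predictors into probabilities for response, don't know and refusal. All coefficients are hypothetical, picked so that self-employment raises 'don't know' much more than 'refusal'; they are not estimates from the paper.

```python
import math

def multinomial_logit_probs(linear_predictors):
    """Softmax over outcome-specific linear predictors (base outcome fixed at 0)."""
    exps = {outcome: math.exp(eta) for outcome, eta in linear_predictors.items()}
    total = sum(exps.values())
    return {outcome: value / total for outcome, value in exps.items()}

# Hypothetical (intercept, self-employed coefficient) pairs; 'response' is the base outcome.
coefs = {"response": (0.0, 0.0), "dont_know": (-3.5, 1.9), "refusal": (-4.0, 0.1)}

def probs_for(self_employed):
    """Outcome probabilities for an employee (0) or a self-employed respondent (1)."""
    eta = {outcome: a + b * self_employed for outcome, (a, b) in coefs.items()}
    return multinomial_logit_probs(eta)

employee = probs_for(0)
self_emp = probs_for(1)
print({k: round(v, 3) for k, v in employee.items()})
print({k: round(v, 3) for k, v in self_emp.items()})
```

Separating the two kinds of non-response in this way is what allows a covariate to matter for one outcome and not the other.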

Page 15

Summary

Household and individual correlations for missing income data
- Self employment, some ethnic groups (though not consistent), Northern Ireland
- The sex of the interviewer is not an important explanatory variable in explaining income non-response
- Some variables are associated only with ‘don’t know’ or only with ‘refusal’

Missing data in sweep one associated with higher odds of missing data in sweep two
- Especially amongst partner respondents

Higher household income in sweep one associated with lower attrition in sweep two

Missing data in sweep one associated with higher attrition in sweep two

Page 16

Suggested further work and information

Models for non-response
- More diagnostic information (e.g. tests of group significance)
- Information on the child?

Interviewer bias
- Multilevel model?
- Interactions or other information on the interviewer

Implications for survey design
- Difference between don’t know and refusal
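One way a multilevel model could quantify interviewer effects is through the intraclass correlation of a random-intercept logit, using the latent-variable convention in which the respondent-level variance is that of the standard logistic distribution, pi^2/3. The sketch below is an assumption about how such a check might look; the interviewer-level variance of 0.5 is hypothetical.

```python
import math

def latent_icc(interviewer_variance):
    """Intraclass correlation for a random-intercept logit (latent-variable form)."""
    respondent_variance = math.pi ** 2 / 3  # variance of the standard logistic distribution
    return interviewer_variance / (interviewer_variance + respondent_variance)

# Hypothetical interviewer-level variance, for illustration only.
icc = latent_icc(0.5)
print(f"share of latent variance between interviewers: {icc:.3f}")
```

A value near zero would suggest that interviewers differ little in how often their respondents fail to report income.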
