Page 1: Matching PLASC and ALSPAC PLASC/NPD User Group Workshop 13 th September 2006 Andy Boyd (a.w.boyd@bristol.ac.uk) David Herrick (david.herrick@bristol.ac.uk)

Matching PLASC and ALSPAC

PLASC/NPD User Group Workshop, 13th September 2006

Andy Boyd (a.w.boyd@bristol.ac.uk)

David Herrick (david.herrick@bristol.ac.uk)

What is ALSPAC?

“Avon Longitudinal Study of Parents and Children”

Cohort study of children and their parents, based in south-west England

Designed to determine ways in which the individual’s genotype combines with environmental pressures to influence health and development

Study design

Eligibility criteria: Mothers had to be resident in Avon and have an expected date of delivery between 1st April 1991 and 31st December 1992

Avon was broadly representative of the UK as a whole and had a relatively stable population

Enrolled sample of 14,541 pregnancies resulting in 14,062 live born children

Data

Self Completion Questionnaires
Hands on Measurements
Biological Samples
Health Records
Education Records
Direct School Contact

ALSPAC at School

Educational Data - Primary

Contact with ~350 primary schools in the four local LEAs:
• Bristol
• South Gloucestershire
• North Somerset
• Bath and North East Somerset

Private & special schools included

Parental contact for out of area cases

Educational Data - Primary

Questionnaires in Year 3 & Year 6:
• School (Head teacher)
• Class (Class teacher)
• Child (Class teacher)

Year 4 test: Maths

Year 6 tests: Maths, Spelling, Science

Educational Data - Secondary

Questionnaire for maths teachers in 2002/3 (Year 7) & 2004/5 (Years 7, 8 & 9) and associated class lists

Year 6 maths test repeated in Year 8

Moving away from direct school contact

Educational Data - SATs

Entry Assessment & KS1 data on eligible children at local schools acquired directly from the LEAs

Linkage to NPD:
• Increased coverage
• Easier linking (UPN)
• PLASC as well

Study Approval & Cohort Matching

Ethics & study approval
The Fischer Trust
Validating the cohort match
Anonymizing the data set
Issues encountered

Ethics & Study Approval

ALSPAC Ethics & Law committee
LREC (NHS research ethics committee)
‘Eligible’ vs. ‘Enrolled’ cohort
Final research file to be anonymous
DfES commissioned a third party, The Fischer Trust, to conduct the cohort/data match

The Fischer Trust

An intermediary between ALSPAC and the DfES

FT received both ALSPAC and NPD datasets and conducted the cohort match.

FT created its own ID (however, we were also provided with the UPN)

Cohort match variables

Details for 20551 children provided:
• Child Surname
• Child Forename
• Child Date of Birth
• Home Postcode
• School Indicator (name & address) from ALSPAC schools data collection

Validating the cohort match

To satisfy our methodology and study requirements, we wanted to reverse check the match

FT matched 86% of the cases provided (17671 cases)

Very few errors found (<0.5%)
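A reverse check of this kind could be sketched as below. This is only an illustration, not the procedure ALSPAC actually used: the field names, the normalisation rule and the example values are all assumptions. It compares the study's own record with the record returned for a matched case on each match variable, and reproduces the 86% figure from the counts given on these slides (17671 of 20551).

```python
# Hypothetical field names; the real files will use different ones.
MATCH_FIELDS = ("surname", "forename", "dob", "postcode")


def normalise(value):
    """Remove all whitespace and upper-case, so 'zz1 1zz' equals 'ZZ11ZZ'."""
    return "".join(str(value).split()).upper()


def agreement(study_row, returned_row):
    """List the match fields on which the two records agree."""
    return [f for f in MATCH_FIELDS
            if normalise(study_row[f]) == normalise(returned_row[f])]


def match_rate(matched, provided):
    """Matched cases as a whole-number percentage of cases provided."""
    return round(100 * matched / provided)


# Made-up example: forename differs (familiar name), the rest agree.
study = {"surname": "Smith", "forename": "Katherine",
         "dob": "1991-06-02", "postcode": "zz1 1zz"}
returned = {"surname": "SMITH", "forename": "Kate",
            "dob": "1991-06-02", "postcode": "ZZ11ZZ"}
agreed = agreement(study, returned)
```

A matched case that agrees on few fields would be a candidate for manual review.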

FT matches by variable

FT 'match level' variable

Problems with the match variables

Child Surname (change over time)
Child Forename (familiar names)
Child Date of Birth
Home Postcode (out of date and lost cases)
School Indicator (name & address) from ALSPAC schools data collection (depended on school participation and out of date information)

Anonymizing the data set

UPN transferred to new internal ID and then to new collaborator ID

Personal variables dropped (DoB, names, postcode, age at census)

Identifying variables dropped (care authority)

Variables recoded (ethnicity, SEN)

LEA & Estab IDs recoded into our own unique ALSPSCHL_ID
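The steps on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the ALSPAC procedure itself: the record layout, the field names (`upn`, `lea`, `estab`, `surname` and so on), the ID prefixes and the use of shuffled sequence numbers are all hypothetical. Only the overall flow (UPN to internal ID to collaborator ID, dropping personal and identifying variables, recoding LEA and Estab into one school ID) follows the slide.

```python
import random

# Hypothetical names for the personal/identifying variables to drop.
DROP_FIELDS = {"surname", "forename", "dob", "postcode",
               "age_at_census", "care_authority"}


def build_id_map(keys, prefix, rng):
    """Give each distinct key an arbitrary label unrelated to its value."""
    distinct = sorted(set(keys))
    numbers = list(range(1, len(distinct) + 1))
    rng.shuffle(numbers)
    return {k: f"{prefix}{n:05d}" for k, n in zip(distinct, numbers)}


def anonymise(records, rng):
    """Return anonymised copies of records plus the two lookup tables."""
    upn_to_internal = build_id_map((r["upn"] for r in records), "INT", rng)
    internal_to_collab = build_id_map(upn_to_internal.values(), "COL", rng)
    school_map = build_id_map(((r["lea"], r["estab"]) for r in records),
                              "SCH", rng)
    removed = DROP_FIELDS | {"upn", "lea", "estab"}
    out = []
    for r in records:
        clean = {k: v for k, v in r.items() if k not in removed}
        clean["collab_id"] = internal_to_collab[upn_to_internal[r["upn"]]]
        clean["ALSPSCHL_ID"] = school_map[(r["lea"], r["estab"])]
        out.append(clean)
    return out, upn_to_internal, internal_to_collab
```

The lookup tables would stay with the data holder; only the anonymised records, keyed on the collaborator ID, would be released.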

Issues encountered

Cases not covered by NPD
REE – not including old schools
Primary to junior succession
Children who resit years or are in a non natural school year
Historical records of school movement

Issues - UPN

We discovered that the U in UPN isn’t that unique!

215 ALSPAC cases have multiple UPNs (with no clear pattern as to why)

PLASC 2004 has two ALSPAC children with the same UPN
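Both kinds of problem can be found with one pass over the (case, UPN) pairs. The sketch below assumes the matched data can be reduced to such pairs; the identifiers in the example are made up.

```python
from collections import defaultdict


def upn_conflicts(pairs):
    """Find cases with several UPNs and UPNs shared by several cases.

    pairs: iterable of (case_id, upn) tuples, one per matched record.
    """
    upns_by_case = defaultdict(set)
    cases_by_upn = defaultdict(set)
    for case_id, upn in pairs:
        upns_by_case[case_id].add(upn)
        cases_by_upn[upn].add(case_id)
    multi_upn_cases = {c: u for c, u in upns_by_case.items() if len(u) > 1}
    shared_upns = {u: c for u, c in cases_by_upn.items() if len(c) > 1}
    return multi_upn_cases, shared_upns


# Made-up example: case "a" has two UPNs; UPN "U9" is shared by "b" and "c".
example = [("a", "U1"), ("a", "U2"), ("b", "U9"), ("c", "U9"), ("d", "U4")]
multi, shared = upn_conflicts(example)
```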

Sample

At least 1 PLASC return identified for 11,997 (85%) of the 14,062 enrolled live births:
• 2002 - 11,850 (84%)
• 2003 - 11,731 (83%)
• 2004 - 11,473 (82%)

Balance:
• Private schools
• Home educated
• Outside England
• Not identified

Supplied Documentation

ALSPAC & PLASC

Editing (1)

Convert string variables to numeric, label and sort missing values and write documentation.

Calculate age at census.

From date of entry derive age on starting at current school and length of time at current school.

Derive expected NCYG (National Curriculum Year Group).
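The derivations listed above might look like the sketch below. It rests on two assumptions: that ages are wanted in completed years, and that the expected year group follows the usual English rule (school years run from 1 September to 31 August, so a child aged 5 on 31 August enters Year 1). The function names and the example dates, including the census date, are illustrative rather than taken from the PLASC specification.

```python
from datetime import date


def completed_years(start, end):
    """Whole years elapsed from start to end."""
    years = end.year - start.year
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


def expected_ncyg(dob, census):
    """Expected National Curriculum Year Group at the census date.

    Uses the child's age on 31 August before the academic year that
    contains the census: age 4 gives Reception (0), age 5 Year 1, etc.
    """
    ay_start = census.year if census.month >= 9 else census.year - 1
    return completed_years(dob, date(ay_start, 8, 31)) - 4


# Illustrative values only.
dob = date(1991, 4, 15)
entry = date(1995, 9, 4)      # date of entry to current school
census = date(2002, 1, 17)    # assumed census date

age_at_census = completed_years(dob, census)
age_on_starting = completed_years(dob, entry)
years_at_school = completed_years(entry, census)
year_group = expected_ncyg(dob, census)
```

Comparing `expected_ncyg` with the recorded NCYG is one way to pick out children outside their natural school year.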

Editing (2)

Ethnicity: 39 cases had new ethnicity codes in 2002 – these were mapped back to old codes and an equivalent to main category derived. Also derive white/non-white indicators.

Care: In 2003, 17 of the 34 cases marked as currently in care were marked as N for ever in care. This did not occur in 2004.
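The care inconsistency is simple to screen for: anyone currently in care should also be flagged as ever in care. A minimal sketch, assuming hypothetical field names `in_care` and `ever_in_care` and a `Y` code for "currently in care" (only the `N` code appears on the slide):

```python
def care_inconsistencies(records):
    """Records marked as currently in care but 'N' for ever in care."""
    return [r for r in records
            if r["in_care"] == "Y" and r["ever_in_care"] == "N"]


# Made-up records for illustration.
sample = [
    {"id": 1, "in_care": "Y", "ever_in_care": "Y"},
    {"id": 2, "in_care": "Y", "ever_in_care": "N"},   # inconsistent
    {"id": 3, "in_care": "N", "ever_in_care": "N"},
]
flagged = care_inconsistencies(sample)
```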

Unanswered questions

6.6% of children were not in the expected NCYG in 2002 compared with 0.7% in 2003 and 2004.

Large increase in use of code T for ethnicity source between 2003 & 2004, even if restricted to Year 7 only.

Ethnicity Source (1)

                         2003             2004
                         N        %       N        %
C  Child                 625      3.8     968      6.1
P  Parent                12418    76.1    9998     62.5
S  Current school        3178     19.5    2530     15.8
T  Previous school       17       0.1     2354     14.7
O  Other                 89       0.5     150      0.9
Total                    16327            16000

Ethnicity Source (2)

                         2003 (Yr 7)      2004 (Yr 7)
                         N        %       N        %
C  Child                 534      13.8    472      5.2
P  Parent                2517     65.1    4956     54.6
S  Current school        796      20.6    1229     13.5
T  Previous school       5        0.1     2326     25.6
O  Other                 17       0.4     89       1.0
Total                    3869             9072
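The percentages in the Year 7 table can be recomputed from the counts with a few lines. The counts are the ones shown on the slide; the function name is illustrative.

```python
def shares(counts):
    """Each code's count as a percentage of the column total, to 1 d.p."""
    total = sum(counts.values())
    return {code: round(100 * n / total, 1) for code, n in counts.items()}


# Year 7 counts by ethnicity source code, as given on the slide.
yr7_2003 = {"C": 534, "P": 2517, "S": 796, "T": 5, "O": 17}
yr7_2004 = {"C": 472, "P": 4956, "S": 1229, "T": 2326, "O": 89}

pct_2003 = shares(yr7_2003)
pct_2004 = shares(yr7_2004)
```

On these counts, code T (previous school) rises from 0.1% to 25.6% of Year 7 records between 2003 and 2004.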

Age on starting current school, PLASC 2002

[Bar chart: horizontal axis 0 to 10 (age on starting current school), vertical axis 0 to 8000. Bar value labels as extracted, some fused together: 26 2 17571 6984 736 731 5034 1222 988 171]

Illegal Values (1)

Numeric codes in Boarder field (should be only ‘B’ or ‘N’) – 2 cases in 2002, 7 in 2003 and 13 in 2004.

Code ‘1’ for NCYG in 2003 for a child in secondary school who was expected to be in Year 7 and who was recorded as in Year 6 in 2002 and Year 8 in 2004.

Illegal Values (2)

X in NCYG in 2004 – 2 cases.

A small number of cases are missing important fields like date of entry, NCYG.

3 cases had the same code for primary and secondary SEN types.
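Checks of this kind can be gathered into one per-record validation pass. The sketch below is an assumption-laden illustration: the field names are hypothetical, and the set of valid NCYG codes is supplied by the caller because the full code list is not given on these slides. Only the rule that Boarder should be ‘B’ or ‘N’ comes from the slides.

```python
def check_record(rec, valid_ncyg):
    """Return a list of problems found in one PLASC-style record."""
    problems = []
    # Boarder should only ever be 'B' or 'N'.
    if rec.get("boarder") not in {"B", "N"}:
        problems.append("boarder")
    # Important fields that should never be empty.
    for field in ("date_of_entry", "ncyg"):
        if rec.get(field) in (None, ""):
            problems.append(f"missing {field}")
    # NCYG present but not in the caller-supplied list of valid codes.
    ncyg = rec.get("ncyg")
    if ncyg not in (None, "") and ncyg not in valid_ncyg:
        problems.append("ncyg")
    # Primary and secondary SEN types should differ.
    primary, secondary = rec.get("sen_primary"), rec.get("sen_secondary")
    if primary and primary == secondary:
        problems.append("duplicate SEN type")
    return problems


# Placeholder code list and made-up records, for illustration only.
VALID_NCYG = {str(n) for n in range(1, 12)}
good = {"boarder": "N", "date_of_entry": "2002-09-03", "ncyg": "7",
        "sen_primary": "T1", "sen_secondary": "T2"}
bad = {"boarder": "3", "date_of_entry": "", "ncyg": "X",
       "sen_primary": "T1", "sen_secondary": "T1"}
```

Running `check_record` over each year's file and counting the problems by type gives the kind of per-year tallies reported on these slides.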

Uses

Identifying Developmental Impairments:

• Investigating the use of early life parental questionnaires to predict later problems.

• SEN types used to identify autism, speech/language problems and possible learning difficulties.

• Twin approach with medical database searches.

Autism project.

Ethnicity.

Wish List

Detailed documentation describing how different fields relate (especially for SATs).

Numeric fields supplied as numeric rather than string.