data linkage: the key to long term outcomes

Post on 03-Jan-2016


DESCRIPTION

Data linkage: the key to long term outcomes. Professor Ronan Lyons, Farr Institute – CIPHER, Centre for Improvement in Population Health through E-records Research, Swansea University. Biennial Scientific Meeting, Congenital Anomaly Registers: Utilizing a valuable resource

TRANSCRIPT

Data linkage: the key to long term outcomes

Professor Ronan Lyons Farr Institute – CIPHER

Centre for Improvement in Population Health through E-records Research. Swansea University

Biennial Scientific Meeting, Congenital Anomaly Registers: Utilizing a valuable resource

Tuesday 7th October 2014, Dylan Thomas Centre, Swansea

• Farr Institute

• Data linkage in the UK

• What is possible now and in the future

• Long term outcomes

Content of Presentation

Historical research

MRC’s vision for UK medical bioinformatics research

Enabling technologies & infrastructure

Developing capacity & expertise

Funding for innovative research

High throughput data

Cohorts

Trials

BioBanks

Educational / Environmental / Social data

NHS clinical data

Patient groups

Demographic data

Farr UCL Partners

Farr Scotland

Farr - CIPHER

Farr N8 Manchester

Strengthening health informatics research

• MRC coordinated 10-partner £19m call for e-health informatics research centres across the UK

Cutting edge research using data linkage

capacity building

• Additional £20m capital to create Farr Institute

• UK Health Informatics Research Network

Coordinate training, share good practice and develop methodologies

Engage with the public, collaborate with industry and the NHS

“To harness health data for patient and public benefit by setting the international standard in trustworthy reuse of electronic patient records and related linkable data for large-scale research.”

Our Vision

Our Ten Key Activities

1. Collaborative Leadership
2. Cutting edge Research
3. Public engagement
4. Governance (safe havens)
5. Methods development
6. Meta Data and Enabling Datasets
7. Harmonised eInfrastructure
8. Partnerships
9. Training/Capacity Building
10. Communications

To deliver impact nationally and internationally

Various developments across the UK

• Considerable number of initiatives

• UK
– Farr Institute
– Administrative Data Research Centres/Network

• England
– Health and Social Care Information Centre
– Clinical Practice Research Datalink

• Northern Ireland
– Northern Ireland Longitudinal Study

• Scotland
– Information Services Division, ISD Scotland
– Electronic Data Research and Innovation Service (eDRIS)

• Wales
– SAIL databank

Steps in utilising health information for research

1. Building trust, partnerships and collaboration

2. Development of anonymisation and linkage techniques

3. Quality assessment and appraisal of datasets

4. Use of datasets to support research

SAIL uses a split-file, trusted third party (TTP) approach with multi-stage encryption, and a stepwise, restricted-field remote access analysis system, to ensure privacy protection.

Lyons RA, et al. The SAIL databank: linking multiple health and social care datasets. BMC Med Inform Decis Mak. 2009 Jan 16;9:3. http://www.biomedcentral.com/1472-6947/9/3

Secure Anonymised Information Linkage (SAIL) databank

SAIL: a multi-sourced data bank of linkable anonymised data on the population of Wales:

• health service operational systems
• national databases
• clinical and biological data
• education, housing, social care, etc.

Uses a trusted third party, split file and multiple encryption technologies to create Anonymised Linkage Fields (ALFs) for individuals and residences

SAIL Gateway is a remote access facility for analysing curtailed data.

SAIL split file/trusted third party methodology

Anonymisation process

HIRU (Blue C)

Demographic data only

Clinical / activity data

Recombine

Other recombined data

Validated, anonymised data

Encrypt and load

Operational system

NHS Wales Informatics Service

Data Provider

HIRU (Blue C)

Construct ALF

Validate

Trace & Geo-code
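The split file/trusted third party flow in the diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which ciphers, keys or file layouts SAIL uses, so the HMAC-SHA256 keyed hash, the two key constants, the `join_key` field and all function names below are hypothetical stand-ins. The mapping of steps to organisations (data provider splits, trusted third party constructs the ALF, databank recombines and encrypts again) is one reading of the diagram labels.

```python
import hashlib
import hmac
import secrets

# Hypothetical keys: the slides do not say which ciphers or keys SAIL uses.
TTP_KEY = b"known-only-to-the-trusted-third-party"
DATABANK_KEY = b"known-only-to-the-databank"


def keyed_hash(key: bytes, value: str) -> str:
    """One-way keyed hash (HMAC-SHA256), standing in for an encryption stage."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def split_file(records):
    """Data provider: separate demographics from clinical content.

    Each half carries only a random per-record join key, so neither
    recipient sees both identity and clinical detail.
    """
    demographic, clinical = [], []
    for record in records:
        join_key = secrets.token_hex(8)
        demographic.append({"join_key": join_key, "nhs_number": record["nhs_number"]})
        clinical.append({"join_key": join_key, "diagnosis": record["diagnosis"]})
    return demographic, clinical


def construct_alf(demographic):
    """Trusted third party: replace identifiers with an anonymised linkage field."""
    return [
        {"join_key": row["join_key"], "alf": keyed_hash(TTP_KEY, row["nhs_number"])}
        for row in demographic
    ]


def recombine(alf_rows, clinical):
    """Databank: rejoin the two halves and encrypt the ALF again before loading."""
    alf_by_key = {row["join_key"]: row["alf"] for row in alf_rows}
    return [
        {
            "alf_e": keyed_hash(DATABANK_KEY, alf_by_key[row["join_key"]]),
            "diagnosis": row["diagnosis"],
        }
        for row in clinical
    ]


# Made-up records; the NHS numbers are illustrative strings only.
records = [
    {"nhs_number": "0000000001", "diagnosis": "Q21.0"},
    {"nhs_number": "0000000002", "diagnosis": "Q35.9"},
    {"nhs_number": "0000000001", "diagnosis": "J21.9"},
]
demographic, clinical = split_file(records)
linked = recombine(construct_alf(demographic), clinical)
```

In this sketch the first and third records end up with the same `alf_e`, so they can be linked as one person, while no NHS number reaches the recombined data.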

Datasets in SAIL (incomplete coverage)

Administrative Health:
• Population
• Inpatients
• Outpatients
• Emergency Department
• Child Health Database Wales
• NHS Direct Wales

Administrative Non-Health:
• Births
• Deaths
• Educational Attainment
• Social Services
• Housing

Clinically rich databases:

Specialty specific
• Cancer Incidence
• Cancer Screening
• Congenital Anomalies
• Arthropathies
• Myocardial Infarction
• Diabetes
• Etc.

General
• GP Data
• Laboratory systems

Study specific
• Embedded trials and cohorts

Patient Journey Analysis - Health and Social Care

• Fetal deaths common with more severe malformations
• Fetus does not have an ‘identity’ such as an NHS number
• There may be multiple fetuses
• Babies often leave hospital with incomplete name – ‘Baby Surname’
• Early neonatal deaths - not registered with GP

• However, possible to link maternal and baby NHS numbers if systems like National Community Child Health Databases in Wales exist

• NN4B

Particular difficulties with congenital anomaly research

• Modern cohorts/registries designed for multi-modal data linkage
– Huge amounts of data
– Different database structures/sizes
– Major challenges when creating cross/cohort/platform analyses
– Semantic interoperability/data harmonisation issues
• Original metadata - standards
• Variable definitions from baseline/laboratory results
• Variable definitions from routine GP/hospital data
– GP Read codes: UK/NZ, user variation+++
– UK Inpatient data – different in Wales/England/Scotland
– Too difficult to move very large and complex data
• Recipients would need to design/implement very complex data structures just to receive data

• Privacy protection essential
– Potential for ‘jigsaw’ attacks, threat from reidentification scientists

• World-wide shortage of skills and expertise in managing these challenges
– No single institution with all necessary skills
– Need for international collaboration
– Build upon existing expertise, developments and investments

Informatics challenges

• 22 cohorts involved
• UK Biobank – greatest variety
– Baseline survey
– Baseline anthropometrics/physiological measurements (continuous/categorical)
– Baseline biochemistry/haematology
– Genomics – 821,000 SNPs
– Imaging: retinal/MRI/US
– Accelerometer data
– Follow up
• Death and cancer registry
• Primary care
• Hospital data
• Disease registries
• Self reported conditions/status
• Functional/cognitive impairment

Cohort Data in UK Dementia Platform

• Built upon SAIL Gateway developments www.saildatabank.com

• Built with MRC capital infrastructure for Farr Institute
– bid supported by ALSPAC, UK Biobank, LifeStudy cohorts

• A national / international resource delivered through FARR
– A secure environment to enable research groups to conform to best practices of data management, security and information governance
– A remote access large scale IT infrastructure with standard and bespoke analytical tools

• Leaves data ownership with the cohorts
– devolved account and access control
– information governance responsibility & control with projects

• Researchers focus on the science

Remote analysis platform for multiple cohorts: UK Secure e-Research Platform (UK SeRP)

• Multidisciplinary collaborative project

• Platform for translating routinely collected data into an anonymised population level child e-cohort

• Investigate the widest possible range of social and environmental determinants of child health and social outcomes

• Inform the development of interventions to reduce health inequalities of children in Wales

• Two phases:
– Phase 1: proof of concept
– Phase 2: dynamic capabilities

Wales Electronic Cohort for Children (WECC)

Birth records (ONS births)

Mortality records (ONS deaths)

Wales Electronic Cohort for Children

N=981,404

WECC eligibility criteria applied

Data cleaning: rules for removal of duplicates and errors

WDS

Child Health (NCCHD)

ALF_E

WDS: Welsh Demographic Service, NCCHD: National Community Child Health Database, ONS: Office for National Statistics

WECC development

• Links with health and education data via ALF_E
• Links with maternal health data via mALF_E
• Links with SAIL eGIS data via ALF_E/RALF_E

WECC core

n = 981,404

♂: 500,181 (51.0%)
♀: 481,205 (49.0%)

Inpatient

GP consultations

Perinatal and Child health

Environment

House Moves

Non-Welsh births

n = 215,095
♂: 107,222 (49.8%)
♀: 107,872 (50.2%)

Born in Wales

n = 766,309
♂: 392,959 (51.3%)
♀: 373,333 (48.7%)

WECC derived tables

National dataset

Education
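The ALF_E, mALF_E and RALF_E links listed for the WECC core can be pictured as simple key lookups. The sketch below is a hedged illustration only: every identifier and value is invented, the table layouts are assumptions, and RALF_E is assumed here to be the residence-level linkage field. It shows the same inpatient table being reached once through the child's own ALF_E and once through the mother's.

```python
from collections import defaultdict

# All identifiers and values below are invented for illustration.
wecc_core = [
    {"alf_e": "c1", "malf_e": "m1", "ralf_e": "r1"},
    {"alf_e": "c2", "malf_e": "m1", "ralf_e": "r1"},
    {"alf_e": "c3", "malf_e": "m2", "ralf_e": "r2"},
]
inpatient = [
    {"alf_e": "c1", "admission": "respiratory"},
    {"alf_e": "c1", "admission": "injury"},
    {"alf_e": "m2", "admission": "maternity"},
]


def index_by(rows, key):
    """Group rows by the value of one field."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return grouped


def link(core, table, core_key, table_key="alf_e"):
    """Left join: for every child, the table rows matching the chosen core key.

    Children with no matching rows keep an empty list, so nobody is
    dropped from the cohort by the linkage step.
    """
    indexed = index_by(table, table_key)
    return {child["alf_e"]: indexed.get(child[core_key], []) for child in core}


child_admissions = link(wecc_core, inpatient, core_key="alf_e")
maternal_admissions = link(wecc_core, inpatient, core_key="malf_e")
households = index_by(wecc_core, "ralf_e")  # children sharing a residence
```

Keeping unmatched children in the result matters for cohort work, since the absence of an admission is itself an outcome.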

I. Influence of maternal and child health factors on time to first admission with a respiratory disorder

(Paranjothy S. et al (2013) Pediatrics 132:6 e1562-e1569)

II. Influence of head injuries on educational attainment at age 7 (Gabbe B.J. et al (2014) J Epidemiol Community Health 68:5 466-470)

III. Educational outcomes for frequent movers (Hutchings H. et al (2013) PLoS One. 8(8) e70601)

IV. Influence of the physical and social environment on childhood obesity

Examples of analyses

Background to WECC phase 2

Poor educational attainment → unemployment and/or low salary → ill-health

A greater understanding of factors underlying education inequalities is necessary to target interventions to protect future generations from poverty and ill health.

Health of the child

Environment

Family size

Household illness

Unemployment

Ill health

Low salary

Educational attainment

1. Does moving to a less deprived community influence child health and educational outcomes?

2. To what extent do serious childhood or family health conditions affect educational outcomes?

3. Is poor educational attainment a risk factor for adverse health in adolescence?

4. Can a novel hybrid cohort study, embedding a traditional detailed survey cohort (e.g. the Millennium Cohort Study, MCS) within D-WECC, be used to evaluate the strengths and weaknesses of using e-cohorts for epidemiological studies?

Research questions

• Individual linkage
– Mortality data: survival and cause of death
– GP and hospital activity: health service impact/comorbidity
– Laboratory and imaging systems: severity of condition/comorbidity
– Education attainment: social impact of condition
– Work and benefits: social impact/disability

• Family/household linkage
– Impact on the wider family
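As a rough illustration of the first individual linkage item, survival can be derived by looking up each register entry in linked mortality data. The sketch below assumes an anonymised key called `alf_e`, a hypothetical study end date used for censoring, and invented records; none of these come from the slides.

```python
from datetime import date

# Invented register entries and mortality records, keyed by an anonymised field.
register = [
    {"alf_e": "a1", "born": date(2010, 1, 10)},
    {"alf_e": "a2", "born": date(2010, 3, 5)},
    {"alf_e": "a3", "born": date(2010, 6, 1)},
]
deaths = {"a2": {"died": date(2010, 4, 4), "cause": "Q24.9"}}
STUDY_END = date(2012, 12, 31)  # hypothetical censoring date


def follow_up(child, deaths, study_end):
    """Follow each child to death if one is linked, otherwise to study end."""
    death = deaths.get(child["alf_e"])
    end = death["died"] if death else study_end
    return {
        "alf_e": child["alf_e"],
        "days_followed": (end - child["born"]).days,
        "died": death is not None,
        "cause": death["cause"] if death else None,
    }


outcomes = [follow_up(child, deaths, STUDY_END) for child in register]

# Children not known to have died before their first birthday.
survived_first_year = sum(
    1 for o in outcomes if not (o["died"] and o["days_followed"] < 365)
)
```

The same pattern extends to the other items in the list by swapping the mortality lookup for GP, hospital, education or benefits data.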

Data linkage and long term outcomes

Time to the first emergency respiratory hospital admission

• Risk decreased with each successive week in gestation up to 40 – 42 weeks.

• Risk further increased for babies that were small for gestational age.

• The increased risk is small for late preterm infants but the number affected is large and will impact on healthcare services.

Head injury and school performance

J Epidemiol Community Health 2014;68:466-470 doi:10.1136/jech-2013-203427

For children entering school, what is the association between preceding head injury and KS1 (age 5-7 years) performance?

n=116,154 Born in Wales Sept 1998-Aug 2001

n=101,892 Remaining in Wales

n=14,262 Left Wales

n=90,661 Valid KS1 result

n=290 Head injury admission

n=90,371 No head injury

Association between head injury and satisfactory performance on KS1

Predictor                               OR (95% CI)          AOR (95% CI)
Head injury
  None (reference)                      1                    1
  Skull fracture                        0.73 (0.50, 1.09)    0.79 (0.52, 1.18)
  Concussion                            0.85 (0.33, 2.16)    0.87 (0.31, 2.49)
  Intracranial injury                   0.50 (0.33, 0.75)    0.46 (0.30, 0.72)
Gender
  Male (reference)                      -                    1
  Female                                -                    1.95 (1.87, 2.03)
Townsend deprivation index quintile
  1 (Least deprived) (reference)        -                    1
  2                                     -                    0.64 (0.59, 0.69)
  3                                     -                    0.49 (0.45, 0.52)
  4                                     -                    0.38 (0.35, 0.41)
  5 (Most deprived)                     -                    0.26 (0.24, 0.28)
Age at KS1 assessment (years)           -                    2.77 (2.60, 2.97)
Birth weight (kg)                       -                    1.41 (1.35, 1.47)
Gestational age (weeks)                 -                    1.01 (1.00, 1.03)
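One way to read the adjusted odds ratios on this slide is to check whether each 95% confidence interval excludes 1, the value that means no association. The short sketch below does that for a few of the reported figures. The helper names are hypothetical, and the back-calculated standard error assumes the intervals are symmetric on the log scale, which the slides do not state.

```python
import math

# Adjusted odds ratios and 95% confidence intervals as reported on the slide.
adjusted = {
    "Skull fracture": (0.79, 0.52, 1.18),
    "Concussion": (0.87, 0.31, 2.49),
    "Intracranial injury": (0.46, 0.30, 0.72),
    "Female": (1.95, 1.87, 2.03),
    "Most deprived quintile": (0.26, 0.24, 0.28),
}


def excludes_one(lower, upper):
    """True when the whole interval lies on one side of 1 (no association)."""
    return upper < 1.0 or lower > 1.0


def log_scale_se(lower, upper, z=1.96):
    """Approximate standard error of log(OR), assuming a symmetric log-scale CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


for label, (aor, lower, upper) in adjusted.items():
    verdict = "excludes 1" if excludes_one(lower, upper) else "includes 1"
    se = log_scale_se(lower, upper)
    print(f"{label}: AOR {aor:.2f}, interval {verdict}, log-scale SE about {se:.2f}")
```

Of the three head injury categories, only intracranial injury has an interval that lies wholly below 1. The wide interval for concussion reflects how few of the 290 head injury admissions fall into each category.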

Household level linkage

Soon - a tidal wave of data…

• Full genome sequence ~£3,000
• Dropping in price 10x every 2-4 years
• Existing NHS genetic test ~£1,000
• Disk cost to store individuals variations ~10p

• Development of continuous monitoring and remote sensors

• Data from many other sources
• New approaches needed for accessing, manipulating, visualizing
• Requires entirely new perspective
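The pricing bullets imply a simple exponential decline, which can be worked through as a back-of-envelope calculation. The sketch below takes the slide's starting price of about £3,000 and its "10x every 2-4 years" rate at face value and assumes the decline is smooth and continues unchanged. That is an extrapolation for illustration, not a forecast from the presentation, and the function names are hypothetical.

```python
import math

GENOME_COST_GBP = 3000.0    # "Full genome sequence ~£3,000" on the slide
NHS_TEST_COST_GBP = 1000.0  # "Existing NHS genetic test ~£1,000" on the slide


def projected_cost(years, years_per_tenfold_drop, start=GENOME_COST_GBP):
    """Cost after `years` if the price falls tenfold every given period."""
    return start * 10 ** (-years / years_per_tenfold_drop)


def years_until(target, years_per_tenfold_drop, start=GENOME_COST_GBP):
    """Years for the cost to fall from `start` to `target` at that rate."""
    return years_per_tenfold_drop * math.log10(start / target)


for period in (2, 4):  # the two ends of the "every 2-4 years" range
    cost = projected_cost(4, period)
    parity = years_until(NHS_TEST_COST_GBP, period)
    print(
        f"Tenfold drop every {period} years: about £{cost:,.0f} after 4 years, "
        f"level with the NHS test after about {parity:.1f} years"
    )
```

Under these assumptions a full sequence would match the price of the existing single test within roughly one to two years, which is the sense in which the slide anticipates a tidal wave of data.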

• Expect further development of data linkage capabilities across the UK

• However, capacity is a major issue

• Amount of work needed is often underestimated

• Ensuring that privacy is protected and that the public are engaged with and accept this research approach are key activities

The future is bright
