Data 101
Stuart Harris
Principal Public Health Intelligence Analyst
2
In this session…
- Types of data
- Data quality
- Choosing outcome indicators
Introduction to primary data sources
3
Data through our lives
Census; local authority data; deprivation indices; surveys
PH mortality file; vital statistics; compendium
PH births file; vital statistics; compendium
Inpatient data (SUS; HES); GP data; community information systems; screening uptake (IC)
Healthcare + immunisations uptake; maternity data; HV needs assessment
4
Data types
• Person/event based
• Aggregated
• Collections and tools
For each type – examples and resources
5
Person/event based data
• Only available under confidentiality agreements to appropriate organisations
• Individual records, e.g.
  - Births
  - Mortality
  - HES
  - A&E attendances
• May be anonymised
• Record linkage increasingly difficult
• May or may not have NHS number
• Confidentiality issues
6
Common characteristics
Each line is one event (birth, death, episode of care)
Usually includes:
• relevant dates, e.g. birth, death, admission
• geographical data, e.g. postcode, LA
• appropriate details of event e.g. cause of death, place of death, diagnosis, procedure
Some socioeconomic data may be inferred
7
ONS mortality file – some fields
• Date of Birth
• Date of registration
• Date of Death
• Sex
• Age
• Causes of death (15 occurrences)
• Age Unit (1=years, 2=months, 3=weeks, 4=days)
• Underlying cause of death (non-neonatal)
• Postcode of residence
• Secondary cause of death
• Communal Establishment Code (H=Home, E=Elsewhere)
• Neonatal indicator
• Establishment where death occurred
• Country of place of birth
• Place of residence (LA, Ward, Region)
• Standard Occupation Classification
• LA place of death
• Place of death
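Because age is stored as a value plus an Age Unit code, it usually has to be converted to a single scale before analysis. The sketch below shows one way this might be done. The unit codes come from the field list above; the divisors are ordinary calendar approximations rather than an ONS rule, and the function name and example records are hypothetical.

```python
# Divisors that convert each Age Unit code into approximate years.
# Codes: 1=years, 2=months, 3=weeks, 4=days (from the field list above).
# The divisors are calendar approximations, assumed for illustration.
UNIT_DIVISOR = {1: 1, 2: 12, 3: 52, 4: 365}


def age_in_years(age, age_unit):
    """Return age at death in years from the Age and Age Unit fields."""
    if age_unit not in UNIT_DIVISOR:
        raise ValueError(f"Unknown age unit code: {age_unit}")
    return age / UNIT_DIVISOR[age_unit]


# Hypothetical records as (age, age_unit) pairs
records = [(82, 1), (6, 2), (3, 3), (10, 4)]
ages = [age_in_years(age, unit) for age, unit in records]

# Deaths under one year of age
infant_deaths = sum(1 for a in ages if a < 1)
```

Ignoring the unit code would treat a death at 6 months as a death at age 6, so this conversion is worth doing before any age banding.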
8
HES has many more fields
Refer to data dictionary for full list: http://www.hscic.gov.uk/hesdatadictionary
9
Advantages of individual level data
Far more flexibility
• Aggregate by numerous characteristics
• Possible to construct trend data
• Possible to link to other data sets
• Data may be processed as required (e.g. standardised)
However, can require considerable effort and time to produce required outputs.
May not be comparable with routinely produced statistics
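As an illustration of processing individual records "as required", the sketch below aggregates event records by age band and then applies direct age standardisation, where each age-specific rate is weighted by a standard population. All figures, age bands and names are hypothetical, and only two age bands are used to keep the arithmetic visible.

```python
from collections import Counter

# Hypothetical individual death records, reduced to the age band of each event
deaths = ["0-64", "65+", "65+", "0-64", "65+", "65+"]

# Hypothetical local population and standard population by age band
local_pop = {"0-64": 8000, "65+": 2000}
standard_pop = {"0-64": 85000, "65+": 15000}


def directly_standardised_rate(events, population, standard, per=100_000):
    """Weight each age-specific rate by the standard population."""
    counts = Counter(events)  # aggregate the individual records
    expected = sum(
        counts[band] / population[band] * standard[band] for band in standard
    )
    return expected / sum(standard.values()) * per


crude_rate = len(deaths) / sum(local_pop.values()) * 100_000
dsr = directly_standardised_rate(deaths, local_pop, standard_pop)
```

With these made-up figures the crude rate is 60 per 100,000 and the standardised rate is about 51 per 100,000, because the hypothetical local population is older than the standard. A rate built this way may differ from routinely published figures if the age bands or standard population differ.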
10
Elementary record linkage
• If two sets of records share a common field, you can link them
• e.g. NHS number (link HES and mortality records) – with permission!
• e.g. postcode (link mortality records to Gridlink, then to IMD, for Lower Layer Super Output Area (LSOA) deprivation scores)
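The postcode route can be sketched as two successive lookups. This is a minimal illustration using plain dictionaries; the postcodes, area codes, scores and field names are all placeholders, and real Gridlink and IMD files have their own layouts.

```python
# Hypothetical mortality records (placeholder postcodes)
mortality = [
    {"id": 1, "postcode": "AB1 2CD"},
    {"id": 2, "postcode": "EF3 4GH"},
    {"id": 3, "postcode": "ZZ9 9ZZ"},  # not present in the postcode lookup
]

# Hypothetical stand-ins for a postcode-to-LSOA file and an IMD file
postcode_to_lsoa = {"AB1 2CD": "LSOA_A", "EF3 4GH": "LSOA_B"}
lsoa_to_imd = {
    "LSOA_A": {"imd_score": 41.2, "quintile": 1},
    "LSOA_B": {"imd_score": 8.7, "quintile": 5},
}


def link_deprivation(records, postcode_lookup, imd_lookup):
    """Attach LSOA and IMD fields to each record via its postcode."""
    linked, unmatched = [], []
    for rec in records:
        lsoa = postcode_lookup.get(rec["postcode"])
        imd = imd_lookup.get(lsoa)
        if imd is None:
            unmatched.append(rec)  # keep for checking, do not silently drop
        else:
            linked.append({**rec, "lsoa": lsoa, **imd})
    return linked, unmatched


linked, unmatched = link_deprivation(mortality, postcode_to_lsoa, lsoa_to_imd)
```

Returning the unmatched records separately makes it easy to report how many events could not be linked, which matters when judging the quality of the linked data.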
11
Principles of record linkage
HES record
ID    procedure    postcode
                   XX1 2XX

Gridlink
Postcode    SOA
XX1 2XX     YYYYYYY

IMD
SOA        IMD score    quintile
YYYYYYY    ZZZZZ        5

Admissions numbers / population = Rate
              Admissions numbers    population    Rate
Quintile 1    000000                00000         000
Quintile 2    000000                00000         000
Quintile 3    000000                00000         000
Quintile 4    000000                00000         000
Quintile 5    000000                00000         000
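Once each admission carries a deprivation quintile, the final step is to count admissions per quintile and divide by the matching population. A minimal sketch, with invented counts and populations and a hypothetical function name:

```python
from collections import Counter

# Hypothetical: the deprivation quintile attached to each linked admission
linked_admissions = [1, 1, 1, 2, 2, 3, 4, 5, 5]

# Hypothetical resident population of each quintile
population = {1: 1500, 2: 1000, 3: 1000, 4: 2000, 5: 1000}


def rates_by_quintile(quintiles, population, per=1000):
    """Admissions divided by population, per `per` residents, by quintile."""
    admissions = Counter(quintiles)
    return {q: admissions[q] / population[q] * per for q in sorted(population)}


rates = rates_by_quintile(linked_admissions, population)
for quintile, rate in rates.items():
    print(f"Quintile {quintile}: {rate:.1f} per 1,000")
```

The population denominators must use the same quintile assignment as the admissions, otherwise the numerator and denominator describe different groups of areas.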
12
Confidentiality
• ONS releases data under one of two acts:
• National Health Service Act 1977 (Section 124A as amended by the Health Act 1999)
• Census Act 1920 (Section 5)
• Data supplied under confidentiality declarations
• No statistics may be published which will reveal personal information
• In practice this means no numbers under 5 (including zeros)
• Avoid indirect disclosure (disclosure by differencing), e.g. subtracting the count for males from a published total can reveal that the count for females is under 5
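The differencing risk can be checked with simple arithmetic before a table is released. The sketch below assumes a published total and exactly one withheld category; the function name and the counts are hypothetical.

```python
THRESHOLD = 5  # counts under 5 (including zeros) must not be revealed


def revealed_by_differencing(total, published_counts, threshold=THRESHOLD):
    """Return the withheld count if it can be derived and is disclosive.

    Assumes exactly one category has been withheld, so its value is the
    published total minus the sum of the published categories.
    """
    derived = total - sum(published_counts)
    return derived if derived < threshold else None


# Hypothetical table: total of 27 published, males (24) published,
# females withheld. Anyone can compute 27 - 24 = 3.
leak = revealed_by_differencing(27, [24])
no_leak = revealed_by_differencing(40, [24])
```

Here `leak` is 3, so withholding the female count alone does not protect it; the male count or the total would also need to be withheld or the table redesigned.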
13
Confidentiality
Methods that can be used to protect tables:
• Table re-design
  - Grouping categories within a table
  - Aggregating across a number of time periods
  - Using a higher level of geography
• Suppression
  - Suppression of rows and/or columns
  - Suppression of cells
  - In both row and cell suppression, some secondary suppression is usually necessary
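Cell suppression with secondary suppression can be sketched for a single table row whose total is also published. This is a simplified illustration rather than a complete disclosure control method: it ignores column totals and other tables, and the function name, the "*" marker and the counts are assumptions.

```python
THRESHOLD = 5  # counts under 5 (including zeros) are not published


def suppress_row(counts, threshold=THRESHOLD):
    """Suppress small cells in one row whose total is published.

    Primary suppression hides every count below the threshold. If that
    leaves exactly one hidden cell, it could be recovered from the total,
    so the smallest remaining cell is hidden as well (secondary suppression).
    """
    suppressed = {name for name, n in counts.items() if n < threshold}
    if len(suppressed) == 1 and len(counts) > 1:
        remaining = {k: v for k, v in counts.items() if k not in suppressed}
        suppressed.add(min(remaining, key=remaining.get))
    return {k: ("*" if k in suppressed else v) for k, v in counts.items()}


# Hypothetical counts by age group
row = suppress_row({"Under 40": 20, "40-64": 3, "65+": 8})
unchanged = suppress_row({"Under 40": 20, "65+": 30})
```

In the first example the "40-64" cell is hidden by primary suppression and "65+" by secondary suppression. When a row has many small cells, grouping categories or using a higher level of geography may preserve more information than suppression.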
14
Aggregate data
Census
HSCIC Indicator Portal
Neighbourhood Statistics (NeSS)
ONS births/deaths statistics
• These do not include individual data
• Rates may already be calculated
• Quick for standard queries
15
Census
• Conducted every 10 years by ONS (latest 2011)
• Statutory survey undertaken at household level
• Covers a wide range of topics (demography, household structure, amenities, employment, health/disability, migration)
• Extensive range of outputs across a wide variety of types of area
• Can be difficult to find/extract required statistics
16
HSCIC Indicator Portal
Contains a variety of indicator sets, including:
• Clinical Commissioning Group Indicators
• Compendium of Population Health Indicators
• Local Basket of Inequalities Indicators
• GP Practice Data
• Social Care Indicators
• NHS Outcome Framework Indicators
Collections of Excel spreadsheets
17
HSCIC Indicator Portal
https://indicators.ic.nhs.uk/webview/
• Contains hundreds of Excel spreadsheets
• The Compendium is the best collection of data on health and disease (including mortality rates)
• LBOI indicators cover wider determinants
• Data available for Local Authorities, CCGs
19
Neighbourhood Statistics (NeSS)
• NeSS established in 2001 (http://www.neighbourhood.statistics.gov.uk)
• Contains over 300 datasets, covering Health, Housing, Education, Deprivation, Age, Ethnicity and Census data
• Around 1 billion counts of information, to neighbourhood level
• On-line guidance available to help users through the site
• Can search by postcode or area of interest, down to small area geographies
22
NOMIS
• NOMIS website presents official labour market statistics (www.nomisweb.co.uk)
• Can obtain profiles by local authority and ward
• Alternatively, can construct more detailed queries
• Good source for benefits data (Jobseeker's Allowance, Incapacity Benefit, Employment and Support Allowance)
• Also good source from which to extract Census data
24
Collections and tools
PHE produces and maintains a large number of indicator collections and tools.
All of these collections and tools can be accessed online via the PHE Data Gateway (https://www.gov.uk/guidance/phedata-and-analysis-tools)
Most of the collections and tools are interactive, and users can determine the types of outputs they wish to produce
25
Collections and tools
Examples include:
• Health profiles
• Public Health Outcomes Framework
• GP Practice Profiles
• Single topic profiles
• Spend and Outcome Tool
• Return on Investment Tools (Smoking, Alcohol, Physical Activity)
26
Health Profiles
• A member of the ‘official statistics’ collection
• Health profiles provide a general overview of the health situation in an area
• Available in PDF format or interactively online
• Web link - http://www.apho.org.uk/default.aspx?QN=P_HEALTH_PROFILES
28
Public Health Outcomes Framework
• A member of the ‘official statistics’ collection
• Contains a wider range of indicators relating to key public health issues
• Indicators across five domains: overarching indicators, wider determinants of health, health improvement, health protection, healthcare and premature mortality
• Available in PDF format or interactively online
• Web link - http://www.phoutcomes.info/
30
National General Practice Profiles
• Contains data at individual GP practice level
• Practice figures may be benchmarked against host CCG or national deprivation decile
• Mainly web based, but summary PDFs can be downloaded
• Many indicators are QOF based
• Web link - http://fingertips.phe.org.uk/profile/general-practice
32
Single topic profiles
Many are available, including:
• Breastfeeding
• Cardiovascular disease
• Health protection profile
• Injury profiles
• Liver disease profiles
• Local tobacco control profiles
• Sexual and reproductive health profiles
33
Spend and Outcome Tool
• Operates at LA and CCG level
• Provides an overview of spend and outcomes across key areas of business (not just public health)
• Excel based tool, downloaded from the website (LA PDF factsheets can also be downloaded)
• Training video and case studies available on website
• Web link - http://www.yhpho.org.uk/default.aspx?RID=49488
35
NICE Return on Investment Tools
• Tools are available for tobacco, physical activity and alcohol
• Provide estimates of costs and savings (healthcare and wider) from different mixes and levels of interventions
• Operate at LA and CCG level
• Excel based tools, downloaded from the website
• Training videos and guides available on website
• Web link - https://www.nice.org.uk/About/What-we-do/Into-practice/Return-on-investment-tools
37
Types of data not covered
Survey data – apart from census
Other examples include:
• Integrated Household Survey
• Labour Force Survey
Modelled data – e.g. synthetic estimates
38
Support Available
PHE’s local South West Knowledge and Intelligence Service is able to provide advice and assistance in using many data sets, profiles and tools.
Our Principal Knowledge Transfer Facilitator is Nicola Bowtell ([email protected]).
For specific support in using the Return on Investment tools contact [email protected]
39
Data quality
What is ‘quality’ data?
• Accurate
• Timely
• Appropriate measure
• Consistent – between areas, over time
40
Example – population counts
Different sources
• Census
• Population estimates
• Electoral registers
• GP practice registers
Each has particular strengths and weaknesses
41
Example – hospital admissions
Different sources
• Patient Administration System (PAS)
• Secondary Uses Service (SUS)
• Hospital Episode Statistics (HES)
Often a trade-off between timeliness and data quality (accuracy)
42
Example – mortality statistics
Some elements of the data are more likely to be accurate than others
• Age, gender, counts
• Cause of death – diagnosis, changes in coding
• Occupation
• Consistency – compiled from registration office returns across the country
43
Considerations regarding data quality
Some things to consider when evaluating data quality
• How is it collected? (statutory return, voluntary return, survey)
• Why is it collected? (HES, QOF data)
• Is data coverage complete? (missing values)
• Are definitions clear? Are they consistently applied across areas and over time?
• Is there a published quality assurance process?
• Is it a new or an established collection?
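The coverage question lends itself to a quick check: for each field, what share of records holds a usable value? A minimal sketch, assuming records are held as dictionaries and that missing values appear as None or an empty string; the field names and values are placeholders.

```python
def completeness(records, fields, missing=(None, "")):
    """Percentage of records with a usable value in each field."""
    if not records:
        raise ValueError("No records to assess")
    total = len(records)
    return {
        field: 100 * sum(1 for r in records if r.get(field) not in missing) / total
        for field in fields
    }


# Hypothetical extract with placeholder values
records = [
    {"postcode": "AB1 2CD", "nhs_number": "", "occupation": None},
    {"postcode": "EF3 4GH", "nhs_number": "placeholder-1", "occupation": "Teacher"},
    {"postcode": "", "nhs_number": "placeholder-2", "occupation": None},
    {"postcode": "IJ5 6KL", "nhs_number": "placeholder-3", "occupation": None},
]

coverage = completeness(records, ["postcode", "nhs_number", "occupation"])
```

In this invented extract occupation is only 25% complete, which would argue against using it for analysis. Running the same check by area or by year also helps show whether coverage is consistent.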
44
Example – QOF based prevalence estimates
Widely used e.g. in GP Practice Profiles
• Why is it collected? – set GP payment levels
• How is it collected? – GPs compile registers of patients with certain conditions, submit annual returns
• Is it accurate? – Limited quality control/checking.
• Is it complete? – Only includes diagnosed patients, excludes exceptions
• Is it appropriate? – does it actually measure prevalence?
45
Pointers for assessing data quality
Some suggestions
• Examine metadata and/or technical guides – these should explain how the data are collected, how the indicator is calculated, and any caveats regarding the data
• Ask an expert – people who have thorough knowledge of data sets and extensive experience of using them are best placed to advise on strengths and weaknesses
Often it is not the data sets or indicators themselves that are the problem, but the manner in which they are used!
46
Choosing outcome indicators wisely
Not specifically covered here, but many of the themes covered in the assessing data quality section are relevant
Key publication – ‘The Good Indicators Guide’ – download from APHO website - http://www.apho.org.uk/resource/item.aspx?RID=44584