research outputs for small areas: initial analysis and findings

This SlideShare highlights factors around the differences between administrative based population estimates (Research Outputs) and official population estimates at Lower Layer Super Output Area (LSOA) level. Please note that these Research Outputs are NOT official statistics.

Upload: office-for-national-statistics

Post on 15-Apr-2017


TRANSCRIPT

Page 1: Research Outputs for small areas: initial analysis and findings

Research Outputs for small areas: initial analysis and findings

This SlideShare highlights factors around the differences between administrative based population estimates (Research Outputs) and official population estimates at Lower Layer Super Output Area (LSOA) level.

Please note that these Research Outputs are NOT official statistics on the population.

Page 2: Research Outputs for small areas: initial analysis and findings

Research Outputs - Background

• In October 2015, we published the first set of Administrative Data Census Research Outputs and provided population estimates by five year age group and sex at local authority (LA) level

• Using a Statistical Population Dataset (SPD) we matched individual records across multiple data sources into a single, coherent dataset that forms the basis for estimating the population

Page 3: Research Outputs for small areas: initial analysis and findings

Research Outputs - Background

• Now we are producing administrative data population estimates at small area level using the same method

• This analysis is mainly based on a comparison with 2011 Census estimates and uses SPD v2.0 estimates

• We have also published information on the methodology used to produce the Research Outputs

Page 4: Research Outputs for small areas: initial analysis and findings

Useful links

• Administrative Data Research Outputs – current release

• Information on our methodology used to produce the Research Outputs

• Feedback survey - Although we can explain some of the differences in the estimates from the examples given in this presentation, we require other data sources and local knowledge to help improve the performance of the SPD estimates at the small area level. We would welcome your feedback

Page 5: Research Outputs for small areas: initial analysis and findings

Constructing the SPD to produce the population estimates

NHS Patient Register (PR)

DWP/HMRC Customer Information System (CIS)

Higher Education Statistics Agency (HESA) data - students

School Census data

SPD population estimates

Statistical Population Dataset – SPD

Matched records from the various administrative data sources are included in the Statistical Population Dataset (SPD)

Aggregate totals of Armed Forces personnel are added in to the estimates

A future plan is to use a coverage survey to adjust for biases in the SPD

The SPD estimates have been produced by matching individual records across the administrative data sources. To protect the privacy of individuals, the process involves replacing identifying fields (names, dates of birth and addresses) with one or more artificial identifiers.
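The idea of replacing identifying fields with an artificial identifier before matching can be sketched as below. This is a minimal illustration only and not the method used to build the SPD: the secret key, the normalisation of fields, the choice of HMAC-SHA256, the use of a single identifier per person and the rule of keeping people seen on two or more sources are all assumptions made for the example, and every record is invented.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be held securely, never in code.
SECRET_KEY = b"example-key-for-illustration-only"

def pseudonymise(name: str, date_of_birth: str, postcode: str) -> str:
    """Replace identifying fields with a single artificial identifier."""
    # Normalise so trivial formatting differences do not block a match.
    normalised = "|".join(
        part.strip().upper().replace(" ", "")
        for part in (name, date_of_birth, postcode)
    )
    return hmac.new(SECRET_KEY, normalised.encode("utf-8"), hashlib.sha256).hexdigest()

def link_sources(sources: dict[str, list[tuple[str, str, str]]]) -> dict[str, set[str]]:
    """Map each artificial identifier to the set of sources it appears on."""
    linked: dict[str, set[str]] = {}
    for source_name, records in sources.items():
        for name, dob, postcode in records:
            linked.setdefault(pseudonymise(name, dob, postcode), set()).add(source_name)
    return linked

# Invented records for three of the sources named on this slide.
sources = {
    "PR": [("Ann Example", "1990-01-31", "AB1 2CD"), ("Bob Sample", "1985-06-15", "EF3 4GH")],
    "CIS": [("ann example", "1990-01-31", "ab12cd")],
    "HESA": [("Cat Test", "1995-03-02", "IJ5 6KL")],
}
linked = link_sources(sources)
# Assumed inclusion rule for this sketch: keep identifiers seen on two or more sources.
spd_ids = {pid for pid, seen_on in linked.items() if len(seen_on) >= 2}
print(len(linked), len(spd_ids))
```

Matching on the keyed hash means the linked dataset never needs to hold names, dates of birth or addresses, although exact-match hashing like this would miss records with spelling or address differences.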

Page 6: Research Outputs for small areas: initial analysis and findings

What is an LSOA?

Geography   Minimum population   Maximum population   Minimum number of households   Maximum number of households
LSOA        1,000                3,000                400                            1,200

Geography   England   Wales
LSOA        32,844    1,909

A Lower Layer Super Output Area (LSOA) is a geographic area forming part of a geographic hierarchy designed to improve the reporting of small-area statistics in England and Wales.
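As a small worked illustration of the figures in the tables above, the thresholds can be written as a check and the England and Wales counts summed. Treating the thresholds as inclusive bounds is an assumption, and the function name is hypothetical.

```python
# Thresholds taken from the table above: (minimum, maximum).
LSOA_THRESHOLDS = {"population": (1_000, 3_000), "households": (400, 1_200)}

def within_lsoa_thresholds(population: int, households: int) -> bool:
    """Check an area against the LSOA thresholds (bounds assumed to be inclusive)."""
    min_pop, max_pop = LSOA_THRESHOLDS["population"]
    min_hh, max_hh = LSOA_THRESHOLDS["households"]
    return min_pop <= population <= max_pop and min_hh <= households <= max_hh

# Number of LSOAs in England plus Wales, from the second table.
total_lsoas = 32_844 + 1_909
print(total_lsoas)
print(within_lsoa_thresholds(1_500, 600))
```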

Page 7: Research Outputs for small areas: initial analysis and findings

LSOA analysis v Local Authority (LA) analysis

Advantages
- Small area analysis allows greater potential to understand the key issues
- Ability to develop strong evidence based explanations for the differences
- Can help to explain some differences seen at LA level

Disadvantages
- Scale - there are almost 35k LSOAs in E&W
- Reduced understanding and analysis around LSOA level data, in particular how the quality of official estimates changes through the decade post census

Page 8: Research Outputs for small areas: initial analysis and findings

Differences between the estimates – possible scenarios

SPD estimate higher than official estimate (SPD overestimated)

SPD estimate lower than official estimate (SPD underestimated)

SPD estimates = official estimates (both correct)*

SPD estimate higher than official estimate (official estimate overestimated)

SPD estimate lower than official estimate (official estimate underestimated)

SPD estimate = official estimate (both overestimated)

SPD estimate = official estimate (both underestimated)

* While the population estimates may be the same, either in total or for a particular age/sex group, it could still be possible that when characteristics are added they are not found to be representative of the same people.

Page 9: Research Outputs for small areas: initial analysis and findings

Distribution of SPD estimates vs. census estimates 2011 - differences in 1% increments

[Chart: LSOA distribution of difference between SPD estimates and census estimates 2011. X-axis: difference in 1% increments, from <-10% (SPD estimate lower than census estimate) to >10% (SPD estimate higher than census estimate). Y-axis: proportion of LSOAs in SPD, 0% to 14%.]

The SPD estimate can be higher or lower than the official population estimate

For approximately 80% of LSOAs, the difference between the estimates was relatively small (+/- 5%)

The LSOAs at the extremes of the distribution were of most interest to us as they showed the largest differences.
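A distribution of this kind can be sketched as follows. The example assumes the difference is expressed as a percentage of the census estimate and that each value is rounded to the nearest whole percent to form the 1% bins. The slide does not state either choice, and the estimates used here are invented, so the resulting shares say nothing about real LSOAs.

```python
from collections import Counter

def percent_difference(spd: float, census: float) -> float:
    """Difference of the SPD estimate from the census estimate, as a % of the census."""
    return (spd - census) / census * 100.0

def bin_label(pct: float) -> str:
    """Assign a difference to a 1% bin, with open-ended tails beyond +/-10%."""
    if pct < -10.0:
        return "<-10%"
    if pct > 10.0:
        return ">10%"
    # Assumed binning: nearest whole percent (Python's round sends halves to even).
    return f"{round(pct)}%"

# Invented (SPD, census) pairs for six hypothetical LSOAs.
estimates = [(1500, 1500), (1530, 1500), (1440, 1500),
             (1700, 1500), (1300, 1500), (1560, 1500)]

diffs = [percent_difference(spd, census) for spd, census in estimates]
distribution = Counter(bin_label(d) for d in diffs)
share_within_5 = sum(abs(d) <= 5.0 for d in diffs) / len(diffs)

print(dict(distribution))
print(f"{share_within_5:.0%} of these made-up areas are within +/- 5%")
```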

Page 10: Research Outputs for small areas: initial analysis and findings

Initial analysis

• Carried out at LSOA level by broad age-group and sex

• Identified samples of LSOAs with the largest differences between census and SPD estimates (2011)

• Selected 800 LSOAs (about 2% of total)

• Then conducted analysis to understand why these areas had differences

• A number of factors were identified to explain the differences which we will explore – but we don’t have all the answers

• This work has highlighted the need for additional data sources
  - to provide ‘activity’ indicators (see note 1) to confirm usual residency in the population
  - local data could help us explain the differences in the small area estimates
  - it is also likely that a Population Coverage Survey (PCS) will be needed to collect information that can evaluate the quality of the SPD and adjust accordingly for coverage errors

• An LSOA analysis tool for the SPD V2.0 estimates for England and Wales provides interactive summary statistics and information for reference years 2011 and 2015

1. ‘Activity’ can be defined as an individual interacting with an administrative system, for example for National Insurance or tax purposes, when claiming a benefit, attending hospital appointments or updating information on government systems in some other way. Only demographic information (such as name, date of birth and address) and dates of interaction are needed from such data sources to improve the coverage of our population estimates.
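The selection step described on this slide, picking out the areas with the largest differences, can be sketched as below. This is an illustration under assumptions: it ranks by absolute percentage difference on total population, whereas the analysis was carried out by broad age-group and sex, and the area codes and estimates are invented. The final line checks the "about 2%" figure against the LSOA counts given on slide 6.

```python
def largest_differences(estimates: dict[str, tuple[float, float]], n: int) -> list[str]:
    """Return the n area codes with the largest absolute % difference (SPD vs census)."""
    def abs_pct_diff(item):
        _, (spd, census) = item
        return abs(spd - census) / census * 100.0

    ranked = sorted(estimates.items(), key=abs_pct_diff, reverse=True)
    return [code for code, _ in ranked[:n]]

# Invented estimates keyed by hypothetical area codes: (SPD 2011, census 2011).
estimates = {"A": (1500, 1500), "B": (1700, 1500), "C": (1200, 1500), "D": (1530, 1500)}
print(largest_differences(estimates, 2))

# 800 selected LSOAs as a share of all LSOAs in England and Wales (32,844 + 1,909).
sample_share = 800 / (32_844 + 1_909)
print(f"{sample_share:.1%}")
```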


Page 11: Research Outputs for small areas: initial analysis and findings

What factors can help explain the large differences at LSOA level

Armed Forces – personnel and dependents

Prisoners – special populations

Students/Graduates

Seasonal Workers

Deprivation level and interaction with health and benefit systems

Real time change – housing development

Other factors?

At the LA level we may not see a large difference in the estimates

BUT analysis at LSOA level can show underlying large differences

This could be due to one or more of these factors

LA District

LSOA

Let’s see the effect these can have on the SPD estimates
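The point that a small LA-level difference can hide large LSOA-level differences is simple arithmetic: differences of opposite sign cancel when areas are summed. A minimal sketch with invented numbers for one hypothetical local authority made up of three hypothetical LSOAs:

```python
def pct_diff(spd: float, census: float) -> float:
    """SPD estimate minus census estimate, as a % of the census estimate."""
    return (spd - census) / census * 100.0

# Invented (SPD, census) estimates for three hypothetical LSOAs in one LA.
lsoa_estimates = {
    "LSOA_1": (1900, 1500),  # SPD higher than census
    "LSOA_2": (1150, 1500),  # SPD lower than census
    "LSOA_3": (1480, 1500),  # close agreement
}

la_spd = sum(spd for spd, _ in lsoa_estimates.values())
la_census = sum(census for _, census in lsoa_estimates.values())
la_diff = pct_diff(la_spd, la_census)
lsoa_diffs = {code: pct_diff(spd, census) for code, (spd, census) in lsoa_estimates.items()}

print(f"LA level: {la_diff:+.1f}%")
for code, diff in lsoa_diffs.items():
    print(f"{code}: {diff:+.1f}%")
```

In this made-up case the LA totals differ by less than 1%, while two of the three LSOAs differ by more than 20% in opposite directions.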

Page 12: Research Outputs for small areas: initial analysis and findings

1. Armed forces (AF) personnel

• The official population estimates and SPD estimates both include armed forces personnel at their place of residence which may or may not be on the base

• AF personnel may not be represented on GP patient register data because they use medical facilities on site and therefore they are excluded from the SPD

• But medical coverage can vary by base and for AF dependents (families)

• To overcome this we add in aggregate armed forces statistics because without their inclusion noticeably low estimates could be observed in the LAs containing large military bases

Page 13: Research Outputs for small areas: initial analysis and findings

North Kesteven – Armed forces example

LSOA E01026184 contains RAF Waddington. There is a station medical centre on the base which provides care to AF personnel but NOT their dependents.

LSOA E01026198 contains RAF Cranwell. It also has a medical centre, which provides medical care for personnel AND their dependents.

North Kesteven is a local authority in Lincolnshire and includes RAF Cranwell and RAF Waddington, with a large number of military personnel living in the area

These two bases are reasonably close together

Leaflet | © OpenStreetMap contributors, CC-BY-SA, Nomis

Page 14: Research Outputs for small areas: initial analysis and findings

North Kesteven – Armed forces example

LSOA E01026184 contains RAF Waddington. AF personnel appear to register with the on-base medical centre and do not appear on the NHS Patient Register (PR)

North Kesteven is a local authority in Lincolnshire and includes RAF Waddington and RAF Cranwell resulting in a large number of military personnel living in the area

For this LSOA the results look very close after adding the aggregated AF personnel data to the SPD

[Chart: E01026184: Males. Population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+), SPD estimates 2011 vs Census estimates 2011. Y-axis: 0 to 450.]

Page 15: Research Outputs for small areas: initial analysis and findings


North Kesteven – Armed forces example

LSOA E01026198 contains RAF Cranwell. AF personnel appear to register with the on-base medical centre and a local GP and therefore are appearing on the PR.

North Kesteven is a local authority in Lincolnshire which includes RAF Cranwell and RAF Waddington resulting in a large number of military personnel living in the area

Adding in the aggregate AF data to the SPD appears to result in double counting for younger males in this LSOA

Additional data on AF interaction with health service systems could help solve this
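The contrast between the two bases can be illustrated with a small sketch of adding aggregate armed forces totals to record-based counts. All numbers are invented and the function name is hypothetical. The sketch only shows why the same addition can bring one area close to the census while double counting in another where personnel already appear on the PR.

```python
AGE_GROUPS = ["0 to 14", "15 to 29", "30 to 44", "45 to 64", "65+"]

def add_armed_forces(spd_counts: dict[str, int], af_totals: dict[str, int]) -> dict[str, int]:
    """Add aggregate armed forces totals to record-based SPD counts, by age group."""
    return {age: spd_counts.get(age, 0) + af_totals.get(age, 0) for age in AGE_GROUPS}

# Invented aggregate AF totals (males) for a hypothetical base LSOA.
af_totals = {"15 to 29": 200, "30 to 44": 100}

# Case 1: personnel use only the on-base medical centre, so they are missing from the PR.
spd_without_af = {"0 to 14": 150, "15 to 29": 100, "30 to 44": 120, "45 to 64": 90, "65+": 20}
# Case 2: personnel are also registered with a local GP, so the PR already counts them.
spd_with_af = {"0 to 14": 150, "15 to 29": 300, "30 to 44": 220, "45 to 64": 90, "65+": 20}

adjusted_case_1 = add_armed_forces(spd_without_af, af_totals)
adjusted_case_2 = add_armed_forces(spd_with_af, af_totals)

print(adjusted_case_1["15 to 29"])  # personnel counted once
print(adjusted_case_2["15 to 29"])  # the same personnel counted twice
```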

[Chart: E01026198: Males. Population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+), SPD estimates 2011 vs Census estimates 2011. Y-axis: 0 to 600.]

Page 16: Research Outputs for small areas: initial analysis and findings

2. Prisoners

Inclusion of prisoners in the official estimates and the SPD-based estimates

[Table: Census | Official estimates | SPD estimates]

Institutions housing special population groups, for example prisons, provide independent health services on site, meaning these populations will not be recorded on the NHS Patient Register (PR)

This means areas with large numbers of prisoners will be underestimated in the SPD because prisoners are unlikely to be fully represented on the PR

Page 17: Research Outputs for small areas: initial analysis and findings

[Chart: E01024618: Males. Population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+), SPD estimates 2011 vs Census estimates 2011. Y-axis: 0 to 700.]

Example: the ‘Sheppey prison cluster’

The Sheppey cluster is an amalgamation of the three prisons: Elmley, Standford Hill and Swaleside.

The cluster falls within LSOA E01024618 which is in Swale in Kent

Because prisoners are not added to the SPD our estimates are lower than the official estimates in areas housing a prison

Answer? Obtain additional data to allow correct addition of prisoners in the SPD

HM Prison Swaleside Wikipedia Creative Commons License

Page 18: Research Outputs for small areas: initial analysis and findings

3. Students/Graduates

Explaining the differences between SPD estimates and official estimates in an area populated with university students.

These examples may help explain why there are differences:

Using postcodes, halls of residence can be incorrectly allocated to a small area – where this occurs, neighbouring small areas may be significantly over- and underestimated in the two sets of estimates

Foreign students are more likely to be excluded from the SPD estimates having not registered with a GP or applied for a national insurance number if there is no intention to work during their period of study

The official mid-year estimates of internal migration rely on moving graduates out of their local authority of study by detecting moves between updates of the PR. Delays in re-registering with a new GP have the potential to affect the estimates

List cleaning of the GP patient register can cause SPD estimates for some LAs to decrease in size between years

These examples can be more apparent at the LSOA level of detail

Page 19: Research Outputs for small areas: initial analysis and findings

Example of the student factor - University of Hertfordshire (Welwyn Hatfield)

[Chart: SPD V2.0 and 2011 Census population estimates by five-year age group, Welwyn Hatfield, 2011. Total population. X-axis: age, 0 to 4 through 90+. Y-axis: thousands, 0 to 14. Series: SPD estimate, Census estimate. Source: Office for National Statistics]

At the LA level the results look similar for student ages

Page 20: Research Outputs for small areas: initial analysis and findings

Two LSOAs covering University of Hertfordshire (Welwyn Hatfield)

The 2011 Census estimate shows a higher proportion of students resident in this LSOA when compared with the SPD estimates

The SPD estimate shows a higher proportion of students resident in this LSOA when compared with the 2011 Census estimate

E01023938

E01023937

Neighbouring small areas can show up inconsistencies in student residency when looking at the official estimates and the SPD estimates (2011)

We can see this more clearly in the next slide


Page 21: Research Outputs for small areas: initial analysis and findings

Students at LSOA level – University of Hertfordshire

[Chart: E01023937: Total population. Population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+), SPD estimates 2011 vs Census estimates 2011. Y-axis: 0 to 3,000.]

[Chart: E01023938: Total population. Population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+), SPD estimates 2011 vs Census estimates 2011. Y-axis: 0 to 1,800.]

When we look at the age distributions for the two LSOAs covering the University of Hertfordshire we can see the full picture

The majority of the students have been allocated to different LSOAs in the SPD and the 2011 Census

At this small area level the postcode used to determine term-time location in the census may differ from that used in the SPD, which utilises the HESA address data. Further quality checking against the census address could provide more information at this level of detail.

Page 22: Research Outputs for small areas: initial analysis and findings

4. Seasonal workers

It’s reasonable to expect that administrative data sources accumulate records for people who are only temporarily resident due to seasonal working patterns

Higher SPD estimates are likely to reflect the general tendency for younger people to take longer to update their health or tax records when they leave their area of seasonal employment

Example of the seasonal effect on the SPD estimates

• Swale, North Kent, E01024556

Page 23: Research Outputs for small areas: initial analysis and findings

Seasonal workers example, Swale

Farms with accommodation for seasonal workers

Caravans form temporary accommodation for seasonal farm workers

Young workers will appear on the administrative data sources used to create the SPD estimates

E01024556

Difference between the Research Output and the 2011 Census estimate – LSOA E01024556

Contains Ordnance Survey Data © Crown copyright and database right 2015

[Chart: E01024556: Males. Population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+), SPD estimates 2011 vs Census estimates 2011. Y-axis: 0 to 250.]

[Chart: E01024556: Females. Population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+), SPD estimates 2011 vs Census estimates 2011. Y-axis: 0 to 250.]

Activity data could help establish if people have moved and reduce the accumulation effect
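One way activity data might be used to reduce the accumulation effect is to keep only records with a recent interaction before the reference date. The sketch below is an assumption-laden illustration, not a described method: the 365-day window, the record layout and the function name are all hypothetical, and the records are invented.

```python
from datetime import date

def recently_active(records: list[dict], reference_date: date,
                    max_days_since_activity: int = 365) -> list[dict]:
    """Keep records whose latest interaction falls within the window before reference_date."""
    return [
        r for r in records
        if 0 <= (reference_date - r["last_activity"]).days <= max_days_since_activity
    ]

# Invented records for one hypothetical area; only an id and a date of last interaction.
records = [
    {"id": "p1", "last_activity": date(2011, 1, 10)},
    {"id": "p2", "last_activity": date(2008, 9, 1)},   # e.g. a past seasonal worker
    {"id": "p3", "last_activity": date(2010, 6, 30)},
]

reference = date(2011, 3, 27)  # 2011 Census day
kept = recently_active(records, reference)
print([r["id"] for r in kept])
```

A shorter window removes more stale records but risks dropping genuine residents who rarely interact with administrative systems, so the choice of window would need testing against other evidence.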

Imagery © 2016 DigitalGlobe

Page 24: Research Outputs for small areas: initial analysis and findings

5. Deprivation level and interaction with health and benefit systems

SPD estimates appear better quality in areas of high need/use of public services and health systems

Theory: more interaction with systems = less error in the SPD estimate

Why?

In more deprived areas, interaction with health and benefit systems is likely to be higher; inward and outward migration is picked up in the administrative records as people update their records

In areas that are relatively affluent the reverse may be true with less interaction with health and benefit systems and associated delays in updates to administrative records when people move into these areas
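The theory above could be checked by averaging the size of the difference within each deprivation quintile. The sketch below uses the absolute percentage difference, which is an assumption made so that over- and underestimates do not cancel. The rows are invented purely to show the calculation and are not evidence for or against the theory.

```python
from collections import defaultdict
from statistics import mean

def mean_abs_diff_by_quintile(rows: list[tuple[int, float, float]]) -> dict[int, float]:
    """Average absolute % difference (SPD vs census) for each deprivation quintile."""
    grouped = defaultdict(list)
    for quintile, spd, census in rows:
        grouped[quintile].append(abs(spd - census) / census * 100.0)
    return {q: mean(values) for q, values in sorted(grouped.items())}

# Invented rows: (deprivation quintile, SPD estimate, census estimate).
# Quintile 1 is taken to be the most deprived (an assumption for this sketch).
rows = [
    (1, 1510, 1500),
    (1, 1480, 1500),
    (3, 1560, 1500),
    (5, 1650, 1500),
    (5, 1380, 1500),
]

by_quintile = mean_abs_diff_by_quintile(rows)
for quintile, avg in by_quintile.items():
    print(f"Quintile {quintile}: {avg:.1f}%")
```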

Page 25: Research Outputs for small areas: initial analysis and findings

Deprivation example Hammersmith and Fulham

[Chart: Deprivation quintiles: Hammersmith & Fulham LSOAs, 2011. Total population. X-axis: deprivation quintiles, from LSOAs with high deprivation to LSOAs with low deprivation. Y-axis: average % difference between SPD V2.0 and Census 2011, -16% to 16%.]

The more deprived the LSOA the smaller the difference between the SPD estimate and the 2011 Census estimate

Page 26: Research Outputs for small areas: initial analysis and findings

6. Real time change – housing development

Catching up with real time change

In areas of rapid growth the administrative data will take time to catch up with reality due to delays in registration with GPs (affecting both official estimates and administrative data outputs) and the CIS

Change in the years following the 2011 Census gives an indication of the pace at which administrative data catches up with reality

The same is true for areas where housing development results in a reduction in the housing stock at LSOA level, for example with the demolition of a tower block

Page 27: Research Outputs for small areas: initial analysis and findings

[Chart: E01033424: Total population. Population by age group (0 to 14, 15 to 29, 30 to 44, 45 to 64, 65+). Series: SPD estimate 2011, Census 2011, SPD estimate 2015, Official estimate 2015. Y-axis: 0 to 1,000.]

Real time change – housing development

An example, E01033424, Wembley (Brent)

The post 2011 change for this LSOA is being picked up in each series, reflecting the population expansion in the area, but the SPD estimate still lags behind the census and official estimate

Wembley is one of the largest regeneration projects in the country.

According to the Mayor of London it can accommodate approximately 11,500 new homes and 10,000 new jobs through the development of sites along Wembley High Road and the land around Wembley Stadium.

Answer? Access to accurate local data could help improve the SPD estimates


Page 28: Research Outputs for small areas: initial analysis and findings

Other factors affecting the estimates?

There are likely to be differences in areas that contain:

- Boarding schools
- Homeless shelters, or
- Other communal establishments

and there will be factors we have not yet found

Although we can explain some of the differences in the estimates from the examples given, we require other data sources and local knowledge to help improve the performance of the SPD estimates at the small area level – we would welcome your feedback (see slide 4)
