
Upload: tobias-tucker

Post on 27-Dec-2015


Quality Measures for ONS population estimates: Introduction

Local Insight Reference Panels, Autumn 2014

1

Session Summary

• Plausibility ranges

• Administrative sources and demographic analysis comparison tool

• Measures of Uncertainty

• Visualisation tool

2

Reference periods for the measures

Measure: years ticked (of 2001 to 2013)

Plausibility Ranges: 2 years

Admin/demographic indicators: 3 years

Uncertainty: all 13 years

Visualisation Tool: 11 years

3

How accurate do you think our estimates are?

A. A perfect measure of the population

B. Within +/- 1%

C. Within +/- 2%

D. Within +/- 5%

E. Within +/- 10%

F. Within +/- 15%

G. Within +/- 20%

H. >20% little relation to the true population

For:

• 2011 Census (all persons)

• 2011 Rolled forward (all persons)

• 2011 Census (25-29 year olds)

• 2011 Census (25-29 year old males)

4

Accuracy of Population Estimates for 2011

5

Note: Average = average weighted by population size

Your awareness of quality tools

• Had you heard of the Quality tools before this session?

• Have you used any of them?

6

Part 1

Using Administrative Data to Set Plausibility Ranges for Population Estimates- Assessment Following the 2011 Census

7

Background

Update the work carried out in 2012 which used the 2009 Mid-Year Estimates.

One of several initiatives taken forward for the quality assurance of Mid-Year Estimates.

- Release of 2011 Census estimates allowed methods to be evaluated.

Same methodology as the 2012 report.

- How the ranges performed against both the 2011 Census estimates and MYEs for 2011.

Only for those aged 0-15

8

What are plausibility ranges?

Definition: A plausibility range is a pair of upper and lower limits, calculated using administrative data, within which the population estimate could reasonably be expected to fall.

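[Diagram: a line marked with Lower and Upper limits. Values between the limits are within range; values below the Lower limit or above the Upper limit are outside range.]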
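The idea of checking an estimate against a range can be sketched in a few lines of Python. This is a minimal illustration, not the ONS method: the rule that the range spans the smallest and largest administrative count is an assumption, the counts are hypothetical, and the "lower 25%" and "upper 25%" labels simply mirror the categories used in the validation chart.

```python
from dataclasses import dataclass


@dataclass
class PlausibilityRange:
    lower: float
    upper: float


def range_from_admin_counts(counts):
    """Assumed rule: the range spans the smallest and largest admin count."""
    return PlausibilityRange(lower=min(counts), upper=max(counts))


def classify(estimate, pr):
    """Place a population estimate relative to a plausibility range."""
    if estimate < pr.lower:
        return "below range"
    if estimate > pr.upper:
        return "above range"
    width = pr.upper - pr.lower
    if width == 0:
        return "within range"
    position = (estimate - pr.lower) / width
    if position <= 0.25:
        return "within range, lower 25%"
    if position >= 0.75:
        return "within range, upper 25%"
    return "within range"


# Hypothetical School Census, Patient Register and Child Benefit counts
# for one age group in one local authority (not real data).
example_range = range_from_admin_counts([1000, 1100, 1200])
example_label = classify(1210, example_range)  # estimate above the upper limit
```

An estimate labelled "above range" or "below range" would be the kind of case flagged for further quality assurance.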

9

What are plausibility ranges?

[Diagram labels: Plausibility range; MYE; Census estimate; Census Confidence interval; SC, PR, CB]

SC = School Census, PR = Patient Register, CB = Child Benefit

10

Does the Census Validate the ranges?

[Chart: Number of LAs by age group (Under 1s, 1-4yrs, 5-7yrs, 8-11yrs, 12-15yrs). Categories: Below Plausibility Range; Within range lower 25%; Within range; Within range upper 25%; Above Plausibility Range. Data labels: 264, 201, 266, 280, 280.]

11

Summary of findings

Absolute Approach Findings Relative Approach Findings

• Around 1/5th of LAs’ Census estimates fell outside of the plausibility ranges

• Those that fell within the ranges: most agreement between MYE11s and Census estimates

• Those that fell outside of the ranges: on average the Census estimates were closer to the plausibility ranges than the MYEs.

• Around 12% (42 LAs) were false positives: LAs which were not a problem but were flagged as areas of concern.

• Little useful information about the under 1 and 1-4 year age group.

• Around 14% (49 LAs) were false negatives: LAs which potentially should have been flagged but were not.

• Some useful information about the 5-15 year age group.

12

Limitations

• Data sources- ranges only as good as the administrative sources used to calculate them.

• Census variability- current methodology compares a point with a range rather than comparing a range with a range.

• Methodology- more sensitive at picking up over estimates than under estimates.

• Age grouping- following a cohort is difficult, different sized age groups.

• Specific areas- for example armed forces and areas with high levels of independent schools.

13

Plausibility Ranges

• 5 mins discussion

• Were you aware of the original report?

• Have you seen the revised report?

• Do you agree with our conclusions?

14

Key Points

• Around 1/5th of LAs’ plausibility ranges not validated by the 2011 Census.

• Some useful information (5-15 years), little useful information for 0-4 year olds.

• Plausibility ranges not advised for future use, as they currently stand.

• The use of tolerance ranges (as in Census) not ruled out for future use

15

Part 2

Mid-year estimate QA tool

16

Background to MYE QA tool

• Quality assurance of the 2011 Census made extensive use of admin data and demographic analysis

• MYE QA made some use

• Wanted to carry out something similar for the MYEs, but taking into account the speed of release and the resources available.

• Solution: take the most useful and appropriate elements and use those.

• Mixed mode approach

• Carry out the sort of analysis our stakeholders do

17

The MYE QA Tool

Comparing MYEs for 2013 with.....

[Chart 1, Data comparison for Ceredigion, Males. x-axis: Age (0, 1-4, 5-9, ... 85+); y-axis: Number of males (0 to 6,000). Series: MYE, Patient register (PR), Child Benefit (CB), School Census (SC), State pensions (SP) and Births (Bths) for 2013 and 2011; MYE for 2012; and series labelled (A) for MYE11, PR 11, CB11, SC 11, SP11 and MYE12.]

18

Comparing MYEs for 2013 with.....

Admin data

[Chart 1, Data comparison for Ceredigion, Males: number of males by age group, with the MYE and administrative data series highlighted.]

19

Understanding the quality of admin data

[Chart 2, Distribution of percentage differences for Patient register and MYEs for 40 - 44 year olds (Males), in 2011 and 2013 highlighting relationship for Ceredigion, Cohort. x-axis: Per cent Diff Patient register to MYE, age 38-42, 2011 (-30 to 70); y-axis: Per cent Diff Patient register to MYE, age 40-44, 2013 (-30 to 70). Series: Local Authorities, Ceredigion, Linear (2011 = 2013).]
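The quantity plotted in Chart 2 can be sketched as follows. This is an illustrative calculation only: the direction of the difference (Patient register minus MYE, relative to the MYE) is an assumption, the function names are invented for the example, and the counts are hypothetical. The cohort comparison follows the chart's axes, where males aged 40-44 in 2013 are compared with the same cohort aged 38-42 in 2011.

```python
def percent_diff(admin_count, mye):
    """Percentage difference of an admin count relative to the MYE (assumed direction)."""
    if mye == 0:
        raise ValueError("MYE must be non-zero")
    return 100.0 * (admin_count - mye) / mye


def cohort_band(age_low, age_high, year, base_year):
    """Age band that the same cohort occupied in base_year."""
    shift = year - base_year
    return age_low - shift, age_high - shift


# Hypothetical counts for one local authority (not real data).
pr_2013, mye_2013 = 2310, 2100  # males aged 40-44 in 2013
pr_2011, mye_2011 = 2250, 2080  # the same cohort, aged 38-42 in 2011

band_2011 = cohort_band(40, 44, year=2013, base_year=2011)
diff_2011 = percent_diff(pr_2011, mye_2011)
diff_2013 = percent_diff(pr_2013, mye_2013)
```

A local authority whose two differences are similar sits close to the 2011 = 2013 line; a large gap between them suggests the relationship between the register and the estimate has shifted since the Census year.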

20

Coherence between counts from admin sources and MYEs

• Coverage and definitional differences

- School census under-represents resident population

- Changes to eligibility for child benefit

- Areas with special populations

- Timing

• PR list inflation

• PR list cleaning – reduces list inflation

21

The MYE QA Tool

Comparing MYEs for 2013 with.....

[Chart 1, Data comparison for Ceredigion, Males: number of males by age group, all series shown.]

22

Comparing MYEs for 2013 with.....

2011 and 2012 MYEs on a period basis

[Chart 1, Data comparison for Ceredigion, Males: number of males by age group, with the MYE series for 2011, 2012 and 2013 highlighted.]

23

Comparing MYEs for 2013 with.....

2011 MYEs on a cohort basis

[Chart 1, Data comparison for Ceredigion, Males: number of males by age group, with the 2013 MYE and the 2011 series labelled (A) highlighted.]

24

Quality assuring estimates for women to quality assure estimates for men (1)

• Admin data for working age males generally weaker than for working age females.

• Availability of data on fertility provides an additional means of checking estimates of females.

• QA of estimates for females more comprehensive than for males.

• Use confidence around estimates for females to allow QA of males – sex-ratios.

25

Quality assuring estimates for women to quality assure estimates for men (2)

[Chart 4, Total fertility rate in 2011 and 2013, Ceredigion highlighted. x-axis: 2011 TFR (0.0 to 3.0); y-axis: 2013 TFR (0.0 to 3.0). Series: All Local authorities, Ceredigion, 2011=2013.]

26

Using sex-ratios for QA

• Sex-ratio = males/females

• Analysis of sex-ratios in the decade between 2001 and 2011 shows these can be a strong indication of issues with the MYEs.

• Comparison of sex-ratios for 2001 and 2011 (Census based) shows the distribution of sex-ratios is broadly constant.

• Use distribution of sex-ratios in 2011 to evaluate local authorities over the decade.
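The sex-ratio check can be sketched as below. This is a minimal illustration under stated assumptions: the ratio is expressed as males per 100 females (as on the chart axes), the reference distribution is a hypothetical list of 2011 ratios across local authorities for one age group, and the bands (inter-quartile range, middle 90%, middle 98%) follow the chart legend. The function names are invented for the example.

```python
import statistics


def sex_ratio(males, females):
    """Males per 100 females."""
    if females == 0:
        raise ValueError("female count must be non-zero")
    return 100.0 * males / females


def band_for(ratio, reference_ratios):
    """Narrowest band of the reference distribution that contains the ratio."""
    # 99 cut points: cuts[p - 1] is the p-th percentile of the reference data.
    cuts = statistics.quantiles(reference_ratios, n=100, method="inclusive")
    bands = [
        ("inter-quartile range", 25, 75),
        ("middle 90%", 5, 95),
        ("middle 98%", 1, 99),
    ]
    for name, low_pct, high_pct in bands:
        if cuts[low_pct - 1] <= ratio <= cuts[high_pct - 1]:
            return name
    return "outside middle 98%"


# Hypothetical reference distribution of 2011 sex-ratios (not real data).
reference_2011 = list(range(50, 151))
example_band = band_for(sex_ratio(1300, 1000), reference_2011)
```

A local authority and age group falling outside the middle 98% of the 2011 distribution would be a candidate for closer inspection, not proof of an error.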

27

Spread of sex ratios

Given by standard deviation

Note: Excludes Isles of Scilly

28

Using sex-ratios

What does the real distribution of sex-ratios look like?

[Chart 3, Comparison of sex-ratios by age for Ceredigion against the distribution of 2011 Census based MYE sex-ratios (Cohort comparison). x-axis: Age (Cohort), 1-4 to 85+; y-axis: Sex-ratio (males/females*100), 40 to 200. Series: Middle 98%, Middle 90%, Inter-quartile range, Ceredigion 2013, Ceredigion 2011.]

29

Using sex-ratios

How does Ceredigion compare?

[Chart 3, Comparison of sex-ratios by age for Ceredigion against the distribution of 2011 Census based MYE sex-ratios (Cohort comparison), with the Ceredigion 2013 and Ceredigion 2011 series highlighted.]

30

Summary of MYE QA tool

• Necessity of a mixed mode approach

• Patient register, child benefit, state pensions, school census.

• Sex-ratios

• Fertility

• Change over time

• Present data on period and cohort basis

• Published alongside MYEs on day of release

• Access to the same data for each local authority (lower & upper tier), regions and England and Wales.

31

Improving the process

• The 2013 MYEs represent the first time we’ve run through this QA process.

• The main issue is time: the volume of estimates to QA more than fills the time available to do it.

• Increasing the amount of time to allow for contingency would be useful.

• For 2014 it is hoped to implement some prioritisation of “more tricky” local authorities.

• The potential to automate some of the checks.

• More resources

32

Evaluation

• As part of the development of the tool we talked to stakeholders; future developments require further engagement.

• Usefulness to stakeholders outside of ONS

• What else could be included?

• What could be clarified?

• Via StatUserNet, Population Statistics Community

• Via LIRPs!

33

What do you think?

• Have you looked at or used the MYE QA tool?

• Your experiences?

• From what you’ve seen today is this something you would find useful?

• What else would you like to see?

• Do you do something similar?

34

Part 3

Measuring uncertainty in the ONS mid-year estimates

35

Uncertainty measures- work in progress

• Measuring uncertainty around the mid-year population estimates allows users to evaluate change over time

• We have already published uncertainty measures for 2002-10 as research statistics

- Uses modelled immigration

- Uses school boarder adjustment

• We are now reviewing the methods to take into account (1) recent changes in the way the MYEs are calculated and (2) new information from the 2011 Census

36

Summary of uncertainty measures (as % of MYE)

37

Uncertainty measures- our approach

• Methods have been developed in collaboration with academics at Southampton University

• We use a simulations-based approach to measure variability around the MYEs

• Our methods mirror the complexity of current population estimates, which involve using administrative, survey and census data and a range of statistical techniques

38

MYEs = Base population +/- Natural change +/- International migration +/- Internal migration +/- Other changes

(Cohort component method)

Uncertainty estimates = Base population +/- International migration +/- Internal migration +/- Natural change +/- Other changes

- Natural change and Other changes: assume no variance

- Bootstrapping to create 1,000 simulations to derive 95% CIs for MYEs
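The accounting identity behind the components listed above can be sketched for a single total. This is a simplified illustration, not the published method: the real cohort component method also ages the population on by single year of age and sex, and the parameter names and figures here are hypothetical.

```python
def roll_forward(base, births, deaths, intl_in, intl_out,
                 internal_in, internal_out, other_changes=0):
    """Roll a population total forward one year by adding components of change."""
    natural_change = births - deaths
    net_international = intl_in - intl_out
    net_internal = internal_in - internal_out
    return base + natural_change + net_international + net_internal + other_changes


# Hypothetical components for one local authority (not real data).
next_mye = roll_forward(
    base=75_000,
    births=600, deaths=750,
    intl_in=900, intl_out=700,
    internal_in=6_500, internal_out=6_200,
    other_changes=-20,
)
```

In the uncertainty work, the deterministic inputs for the base population and the migration components are replaced by simulated values, while natural change and other changes are treated as having no variance.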

39

Bootstrapping International Immigration

MYEs =

- International Passenger Survey National Estimate

- Split by type: Workers, Students, Others, UK Returners

- Use admin data to distribute to LAs: MWS, HESA/BIS/WAG, PRDS, Census

- Recombine to create LA totals: 348 Local Authority estimates of international immigrants

Uncertainty estimates =

- 1,000 simulations from IPS

- 1,000 simulated admin counts for each migrant type in each LA

- Apply admin-based proportions to IPS estimate for each migrant type to derive counts of each migrant type in each LA: 1,000 Worker counts, 1,000 Student counts, 1,000 ‘Other’ counts, 1,000 UK Returner counts

- Sum these to produce 1,000 LA totals: 1,000 simulated international immigrant counts for each LA. 26th and 975th ranked values provide uncertainty interval
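The flow above can be sketched in Python. This is a simplified illustration under stated assumptions, not the ONS implementation: the survey records, weights and administrative shares are hypothetical, only the survey side is resampled (the source also simulates the admin counts, which is omitted here), and the function names are invented for the example. The interval uses the 26th and 975th ranked values of the 1,000 simulated totals, as on the slide.

```python
import random


def bootstrap_totals(records, shares, n_sims=1000, seed=0):
    """Simulate LA immigration totals by resampling survey records.

    records: list of (migrant_type, weight) survey records (hypothetical).
    shares:  {la: {migrant_type: share of the national flow}}, assumed to
             come from admin data and held fixed in this sketch.
    Returns {la: sorted list of n_sims simulated totals}.
    """
    rng = random.Random(seed)
    sims = {la: [] for la in shares}
    for _ in range(n_sims):
        # Bootstrap step: resample the records with replacement.
        resample = rng.choices(records, k=len(records))
        national = {}
        for migrant_type, weight in resample:
            national[migrant_type] = national.get(migrant_type, 0.0) + weight
        # Apportion each national stream to LAs and recombine into LA totals.
        for la, la_shares in shares.items():
            total = sum(national.get(t, 0.0) * s for t, s in la_shares.items())
            sims[la].append(total)
    return {la: sorted(values) for la, values in sims.items()}


def uncertainty_interval(sorted_totals):
    """26th and 975th ranked values of 1,000 sorted simulations."""
    if len(sorted_totals) != 1000:
        raise ValueError("expected exactly 1,000 simulations")
    return sorted_totals[25], sorted_totals[974]


# Hypothetical survey records and admin-based shares (not real data).
example_records = [("Workers", 120.0), ("Students", 80.0),
                   ("Others", 60.0), ("UK Returners", 40.0)] * 25
example_shares = {"LA1": {"Workers": 0.02, "Students": 0.05,
                          "Others": 0.01, "UK Returners": 0.01}}
example_sims = bootstrap_totals(example_records, example_shares)
example_interval = uncertainty_interval(example_sims["LA1"])
```

Fixing the seed makes the sketch reproducible; a fuller version would also draw simulated admin counts for each stream before computing the shares.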

40

What is bootstrapping?

International immigration over time

41

International immigration bootstrap

42

2 bootstraps

43

More bootstraps

44

Lots of bootstraps

45

International Immigration

46

International Immigration

47

Apportionment to LA - streams

48

Apportionment to LA - streams

49

e.g. Foreign-born worker in-migrants

1 bootstrap

2 bootstraps

More bootstraps

Lots of bootstraps

Summary of all bootstraps for allocation of IPS to LA

Summary of resultant estimates by LA

Summary of resultant estimates by LA

57

Key Points

• Produces 95% confidence intervals around overall estimates for each LA (no age/sex breakdown).

• Uses bootstrapping (simulation)

• Only covers variance due to 2001 Census, internal migration and international migration.

58

What do you think?

• Your experiences of looking at/using data from the previous release?

• Is it useful to have the Confidence intervals around total population?

59

Part 4

Visualising the causes of discrepancies between rolled forward and Census-based mid-year estimates

60

Background

• Based on QA work for 2011 Census and from dealing with LA queries on differences between Census estimates and MYEs

• Consistent approach was needed for all LAs:

- explores MYEs at component level

- seeks to understand processes used to derive MYEs

- uses comparators where available

61

Aims of work

1. To provide a consistent way of explaining the most likely causes of discrepancies between Census based and rolled forward mid-year estimates

2. To provide a consistent way of understanding the most likely causes of bias in the mid-year estimates rolled forward from 2011

62

Basic Principles

• Within MYEs overall discrepancies are a result of discrepancies at a component of change level

• Multiple components set up a complex web of effects and may cause compensating differences

• Comparators only allow us to indicate potential discrepancies

• Highlighting a potential discrepancy at the component level can improve understanding of the estimates and may aid the improvement of their quality

• Visualisation aims to make a complex analysis quick and easy to understand for a non-technical user

63

Basic principles: Ceredigion

[Screenshot of the visualisation tool for Ceredigion: a grid of scores by quinary age group (1-4 to 85+), with separate panels for Males and Females. Indicator columns: Discrepancy indicated by Census; Impact of 2001 Census Base; Probable bias due to school boarders; Probable discrepancy due to internal migration; Probable discrepancy due to international immigration; Relative size of internal migration flows; Relative size of international immigration flows; Relative size of international emigration flows; Census Response rate (labelled "2002" and "2001" on the slide); Probable bias due to Census Base; Probable bias due to internal migration; Probable bias due to international immigration. Bias scale: 3 (Overestimate) through Neutral to -3 (Underestimate). Flow scale: 3 (High Flow/low response rate) down to Low Flow/High response rate. A "Flow/bias combination" key plots Flow (0 to 3) against Bias (-3 to 3).]

Example auto-text, for the selection Ceredigion, 5-9, "Probable discrepancy due to international immigration - M":

The probable discrepancy due to international immigration (males) for those aged 5-9 in Ceredigion is negligible.

The probable negative impact of the mis-measurement of the international inflows is assessed by comparing the published international inflows against the number of patient register records with a 'Flag 4', indicating that someone's previous place of residence was outside the United Kingdom. Where there is a strong correspondence between the number of moves as estimated by the IPS and the number of Flag 4s, we would conclude that there is a low probability of over or under estimation due to international migration. If the number of Flag 4s is much higher, it suggests that the international inflow may be underestimated; if it is much lower, it suggests that the international inflow may be overestimated.

Callouts on the slide:

- Tool to explain risk/flow interaction

- Auto-text to explain method and output to user
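The Flag 4 comparison described in the auto-text can be sketched as a simple rule. This is an illustration only: the slide does not say how "much higher" or "much lower" is judged, so the `tolerance` parameter and its default of 20% are hypothetical, as are the function name and the example counts.

```python
def flag4_signal(ips_inflow, flag4_count, tolerance=0.2):
    """Compare an IPS-based inflow with the Flag 4 count for the same group.

    tolerance is a hypothetical threshold for 'much higher' or 'much lower'.
    """
    if ips_inflow <= 0:
        raise ValueError("IPS inflow must be positive")
    ratio = flag4_count / ips_inflow
    if ratio > 1 + tolerance:
        return "international inflow may be underestimated"
    if ratio < 1 - tolerance:
        return "international inflow may be overestimated"
    return "low probability of over or under estimation"


# Hypothetical counts for one LA, sex and age group (not real data).
example_signal = flag4_signal(ips_inflow=100, flag4_count=150)
```

In the tool itself the comparison is turned into a score by age and sex; this sketch only shows the direction of the inference.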

64

Basic Principles

• Develop an understanding of how each component may be deficient or how it compares to an external benchmark.

• Construct an alternative population estimate for each issue.

• Compare alternative to published series to determine impact.

• Calculate Z-scores to indicate whether each LA/Sex/age group for each component is unusual.

65

Basic Principles

• Values by age and sex are compared against a mean across LAs using a standard deviation across LAs.

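[Chart: normal distribution curve, marked in standard deviations (S.D.) from -3 (neg.) to 3 (pos.), with bands labelled 34.1%, 13.6%, 2.2% and 0.1% moving outwards from the mean.]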

• Values at the ends of the distributions are most likely to show a potential discrepancy.

• Uses survey and administrative data: 2001/2011 Census, 2001/2011 Census Response Rates, MYEs 2001-2011, Patient Register 2001-2011, Flag4 2006-2010, IPS 2001-2011, Internal Migration 2001-2011
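The Z-score step can be sketched as follows. This is a minimal illustration under stated assumptions: the values are hypothetical indicator values for one component, sex and age group across local authorities, the population standard deviation is used (the slides do not say which form is applied), and mapping a Z-score to the -3 to 3 scale by truncation is an assumption made for the example.

```python
import statistics


def z_scores(values_by_la):
    """Z-score of each LA's value against the mean and SD across LAs."""
    values = list(values_by_la.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    if sd == 0:
        raise ValueError("no variation across LAs")
    return {la: (value - mean) / sd for la, value in values_by_la.items()}


def score_band(z):
    """Assumed mapping of a Z-score to the -3..3 scale (truncate, then clamp)."""
    return max(-3, min(3, int(z)))


# Hypothetical indicator values for one component, sex and age group.
example_values = {"LA A": 10.0, "LA B": 20.0, "LA C": 30.0}
example_z = z_scores(example_values)
example_bands = {la: score_band(z) for la, z in example_z.items()}
```

Values in the tails of the distribution receive the largest scores in either direction, which is where a potential discrepancy is most likely to show.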

66

Demonstration

• Compounding

• Compensating

• Messaging

67

Moving Forward...

• Ongoing work includes:

• Using the 2011 Census as a base and exploring current and forthcoming mid-year estimates

• Conversion of the processing element to SAS

• Auto-text to aid users in understanding of output

• Fine tuning

• Differential sensitivity of indicators (school boarders more sensitive than other components)

• Utilising other intelligence (such as the characteristics of each LA).

68

Key Points

• Aims to provide reasonably informed intelligence about where there may be issues with the mid-year estimates.

• Based on intelligence gathered from Census QA and dealing with stakeholder queries

• Articulating complexity of compounding and compensating errors.

• Provides intelligence by sex and quinary age group.

• Feeds into development of improved methods

69

What do you think?

• This provides indicative rather than definitive intelligence about potential issues

• Is this type of information useful to you?

• How would you use it?

• We intend to publish the interactive tool along with papers outlining the methods

• What do you think?

• Does this look plausible to you?

• Would you like to help?

70

Quality tools Summary

• ONS have pursued four very different ways of assessing the quality of MYEs

• Plausibility ranges no longer being taken forward

• MYE QA tool for 2013 MYEs available now

• New Confidence intervals coming soon

• Visualisation tool in development

71

Different tools for different roles

• Confidence intervals aim to provide a statistically robust measure of accuracy.

• MYE QA tool aims to find where MYEs look unusual against administrative and demographic comparators.

• Visualisation tool aims to show which components are likely to be contributing to discrepancies between MYEs and reality.

72

What do you think?

• Are these quality tools what you were expecting?

• Coherence between tools

• Usefulness of different approaches

• How would you use these tools?

73

Where to find further information.......

• Plausibility ranges: http://www.ons.gov.uk/ons/guide-method/method-quality/specific/population-and-migration/population-statistics-research-unit--psru-/latest-publications-from-the-population-statistics-research-unit/index.html

• MYE QA Tool: http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-322718

• Uncertainty in LA MYEs: http://www.ons.gov.uk/ons/guide-method/method-quality/imps/latest news/uncertainty-in-la-mypes/index.html

• Visualisation tool (coming soon): [email protected]

74