SGA 2011: Deliverable 6.1

Version 1.0

ESSNET

USE OF ADMINISTRATIVE AND ACCOUNTS DATA

IN BUSINESS STATISTICS

WP6 Quality Indicators when using Administrative Data

in Statistical Outputs

List of Quality Indicators

May, 2012

ESSnet – Admin data: Quality Indicators 2

Contents

1. Executive summary
2. Introduction
3. Glossary
4. List of basic quality indicators
   4.1. Background Information indicators
   4.2. Quality Indicators
5. Examples of calculating the indicators

1. Executive Summary: A Guide to the Quality Indicators

What are the quality indicators?

The European Statistical System Network project on administrative data (ESSnet AdminData) has developed a list of quantitative quality indicators for use with business statistics involving admin data. The indicators provide a measure of the quality of the statistical output, taking input and process into account. They are based on the ESS dimensions of statistical output quality.

Who are they for?

The list of quality indicators has been developed primarily for producers of statistics, within the ESS and more widely. The indicators can also be used for quality reporting, thus benefiting users of the statistical outputs. They provide the user with an indication of the quality of the output, and an awareness of how the admin data have been used in the production of the output.

When can they be used?

The list of quality indicators is particularly useful for two broad situations:

1. When planning to start using admin data as a replacement for, or to supplement, survey data. In this scenario, the indicators can be used to assess the feasibility of moving to admin data, and the impact on output quality.

2. When admin data are already being used to produce statistical outputs. In this scenario, the indicators can be used to gauge and report on the quality of the output, and to monitor it over time. Certain indicators will be suitable to report to users, whilst others will be most useful for the producers of the statistics only.

How should they be used?

There are 23 basic quality indicators in total, but a statistical producer need only use the indicators relevant to their output. The table below shows which of the indicators relates to which dimension or ‘theme’ of quality, which may be useful in identifying which indicators to use. For more information about each of the themes, please refer to the ‘Introduction’ section.

Indicators 1 to 8 are background indicators, which provide general information on the use of administrative data in the statistical output in question but do not, directly, relate to the quality of the statistical output.

Indicators 9 to 23 provide information directly addressing the quality of the statistical output.

Quality theme                 Indicators relevant to that theme
Accuracy                      9, 10, 11, 12, 13, 14, 15, 16, 17
Timeliness and punctuality    4, 18
Comparability                 19
Coherence                     5, 6, 20, 21
Cost and efficiency           7, 8, 22, 23
Use of administrative data    1, 2, 3

Further information

For more detailed information about the indicators and how to use them, please consult the

‘Introduction’ section.

Quality Indicators when using Administrative Data in Statistical Outputs

2. Introduction

One of the aims of the European Statistical System Network project on administrative data (ESSnet AdminData) is the development of quality indicators for business statistics involving administrative data, with a particular focus on developing quantitative quality indicators.

Some work has already been done in the area of quality of business statistics involving administrative data and some indicators have been produced, namely under the preparation of the Quality Report Framework for Business Statistics under Regulation (EC) No 295/2008. However, the work conducted thus far refers to qualitative indicators or is based more on a descriptive analysis of administrative data (see Eurostat, 2003). The quantitative indicators that have been produced either concern the quality of the administrative sources themselves (Daas, Ossen & Tennekes, 2010) or form part of a quality framework for the evaluation of administrative data (Ossen, Daas & Tennekes, 2011). Neither addresses the quality of the production of the statistical output. In fact, almost no work has been done on quantitative indicators of business statistics involving administrative data, which is the main focus of this project (for further discussion on this topic see Frost, Green, Pereira, Rodrigues, Chumbau & Mendes, 2010).

The ESSnet aims to develop quality indicators of statistical outputs that involve administrative data. These indicators are for the use of members of the European Statistical System, that is, producers of statistics. Therefore, the list contains indicators on input and process because these are critical to the work of the National Statistical Institutes and it is the input and process in particular that are different when using administrative data. Moreover, the list of indicators developed is specifically in relation to business statistics involving administrative data. Indicators (e.g. on accessibility) that do not differ for administrative vs. survey based statistics are not included in this work because they fall outside the remit of this section of the ESSnet AdminData project.

To address some issues of terminology, a few definitions are provided below to clarify how these terms are used in this document and throughout the ESSnet AdminData. Further information on terminology is included in the glossary in Section 3.

What is administrative data? Administrative data are data derived from an administrative source, before any processing or validation by the NSI.

What is an administrative source? A data holding containing information collected and maintained for the purpose of implementing one or more administrative regulations.

The list of quality indicators

A list of quantitative quality indicators has been developed on the basis of research which took stock of work being conducted in this field across Europe. This list was then user-tested within five European National Statistical Institutes (NSIs), before testing across Europe. Feedback from user testing was used to improve the list of quality indicators.

The current list of indicators has been grouped into two main areas:

o Background Information – these are ‘indicators’ in the loosest sense. They provide general information on the use of administrative data in the statistical output in question but do not, directly, relate to the quality of the statistical output. This information is often crucial in understanding better those indicators that measure quality more directly.

o Quality Indicators – these provide information directly addressing the quality of the statistical output.

The background information indicators and the quality indicators are further grouped by quality ‘theme’. These quality themes are based on the ESS dimensions of output quality, with some additional themes which relate specifically to administrative data. These themes also appear in the composite quality indicators that are being developed by WP6 (see ‘Future work’). The quality themes are:

Quality theme: Description

Accuracy: The closeness between an estimated result and the unknown true value.

Timeliness and punctuality: The lapse of time between publication and the period to which the data refer, and the time lag between actual and planned publication dates.

Comparability: The degree to which data can be compared over time and domain.

Coherence: The degree to which data that are derived from different sources or methods, but which refer to the same phenomenon, are similar.

Cost and efficiency: The cost of incorporating admin data into statistical systems, and the efficiency savings possible when using admin data in place of survey data.

Use of administrative data: Background information relating to admin data inputs.

A short description of each indicator is included in the attached list along with a formula on how to calculate the indicator (if applicable), and example calculations (see Section 5).

The indicators have been developed so that a low indicator score denotes high quality, and a high indicator score denotes low quality.

This is consistent with the concept of error, where high errors signify low quality. The exceptions to this rule are the background indicators (1 to 8), where the score provides information rather than a quality ‘rating’; and indicators 20 and 23, where a high indicator score denotes high quality, and a low indicator score denotes low quality.

Using the list of quality indicators

Throughout the list, there are words in blue which take you to the definition of that particular word in the glossary. There are also links in the ‘How to calculate’ section, which take you to examples of how to calculate each indicator.

A framework for the basic quantitative quality indicator examples

The calculation of an indicator needs some preliminary steps. Some or all of these steps will be used for each example of the indicators to ensure consistency of the examples, and to aid understanding of the indicators themselves (see Section 5 for a list of examples).

A. Define the statistical output

B. Define the relevant units

C. Define the relevant variables

D. Adopt a schema for calculation

E. Declare the tolerance method for quantitative and qualitative variables
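
Step E can be made concrete with a small helper. The sketch below is only one possible reading of a "tolerance method": it assumes quantitative values count as consistent when they agree within a relative tolerance, and qualitative values only when they match exactly. The function name and the 5% default are hypothetical, not taken from the ESSnet material.

```python
import math


def is_consistent(value_a, value_b, rel_tol=0.05):
    """Return True if two items from different sources agree.

    Assumption: numbers are compared within a relative tolerance,
    anything else (e.g. an activity code) must match exactly.
    """
    numeric = (int, float)
    if isinstance(value_a, numeric) and isinstance(value_b, numeric):
        return math.isclose(value_a, value_b, rel_tol=rel_tol)
    return value_a == value_b


# Hypothetical turnover values from an admin source and a survey.
print(is_consistent(100.0, 104.0))      # within 5%
print(is_consistent(100.0, 110.0))      # outside 5%
print(is_consistent("25.11", "25.11"))  # qualitative: exact match
```

Declaring the tolerance once, before any indicator is calculated, keeps the definition of a consistent item the same across all examples for a given output.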

Links between this and other work on Quality

The work being carried out under this project should not be seen as independent of other work already in place. When analysing the list of indicators, one can conclude that some other information is useful in regard to the quality of administrative data. However, some of that very useful information cannot be (or has not been) translated into quantitative indicators. The main aim of the current project is not to discuss all the issues related to quality when using administrative data. The aim, at this stage, is to discuss basic quantitative quality indicators.

In addition, these indicators are for the benefit of the members of the European Statistical System (ESS), that is, the producers of statistics. Consequently, the end result of the ESSnet AdminData work in this area should be integrated with the work already in place on the production of Eurostat Quality Reports.

Future work

In addition to the list of basic indicators, the ESSnet also aims to develop composite quality indicators, which draw together certain basic quality indicators into ‘themes’ in line with the ESS dimensions of output quality, to provide a more holistic view of the quality of a statistical output. In addition to this, investigative work is underway to develop quality guidance for situations where survey and administrative data are combined. Work is also planned to develop qualitative quality indicators to complement the list of basic quantitative quality indicators.

References

Daas, P.J.H., Ossen, S.J.L. & Tennekes, M. (2010). Determination of administrative data quality: recent results and new developments. Paper and presentation for the European Conference on Quality in Official Statistics 2010. Helsinki, Finland.

Eurostat, (2003). Item 6: Quality assessment of administrative data for statistical purposes. Luxembourg, Working group on assessment of quality in statistics, Eurostat.

Frost, J.M., Green, S., Pereira, H., Rodrigues, S., Chumbau, A. & Mendes, J. (2010). Development of quality indicators for business statistics involving administrative data. Paper presented at the Q2010 European Conference on Quality in Official Statistics. Helsinki, Finland.

Ossen, S.J.L., Daas, P.J.H. & Tennekes, M. (2011). Overall Assessment of the Quality of Administrative Data Sources. Paper accompanying the poster at the 58th Session of the International Statistical Institute. Dublin, Ireland.

3. Glossary¹

Term Definition

1. administrative data Data derived from an administrative source, before any processing or validation by the NSI.

2. administrative source A data holding containing information collected and maintained for the purpose of implementing one or more administrative regulations.

3. common units Units that are included in more than one source.

4. consistent items Values for a variable in a specific unit that are the same across different sources, within a certain tolerance.

5. item A ‘value’ for a variable for a specific unit.

6. key variables Variables that are the most important and have the largest impact on the statistical output (e.g. turnover, number of employees, wages and salaries, etc.)

7. reference population The set of units about which information is wanted and estimates are required. This might be the entire Business Register (BR) or some part of the BR, e.g. manufacturing sector.

8. relevant units Businesses that are within the scope of the statistical output (e.g. units from the services sector should be excluded from manufacturing statistics).

9. relevant items ‘Values’ for units on relevant variables that should be included in calculating the statistical output.

10. required period The reporting period used within the statistical output.

¹ Work Package 1 (WP1) of the ESSnet AdminData has developed an ‘Admin Data Glossary’. To access the glossary, please follow this link: http://essnet.admindata.eu/WorkPackage?objectId=4251

11. required variables Variables necessary to calculate the statistical output.

12. statistical output A statistic produced by the NSI – whether based on a specific variable (e.g. no. of employees) or a set of related variables (e.g. total turnover; domestic market turnover; external market turnover). In the broadest sense, statistical output would also apply to the whole STS or SBS output.

13. unit Refers to statistical units – enterprise, legal unit, local unit, etc.

14. weighted A number of the quality indicators described in this document can be calculated in unweighted or weighted versions. Formulae are given for the unweighted versions of the indicators. Weighting can be beneficial as the weighted indicator will often better describe the quality of the statistical output. For example, the unweighted item non-response will inform users what proportion of valid units did not respond for a particular variable, whereas the weighted item non-response will estimate the proportion of the output variable affected by non-response. A non-response rate of 30% is of less concern if those 30% of units only cover 1% of the output variable. In practice, we do not know the values of the output variable for non-responders, so we use a related variable instead. Business register variables such as Turnover or Employment are often used as proxies.

The weighted indicators are calculated as follows:
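
The weighting formulae themselves are not reproduced in this text. As a minimal sketch only, the code below assumes that a weighted indicator is the share of a proxy variable (here, register turnover) held by the affected units, while the unweighted version is the share of units. The function names and the sample figures are hypothetical; they were chosen to mirror the case described above, where 30% of units cover only 1% of the output variable.

```python
def unweighted_rate(flags):
    """Percentage of units that are flagged (e.g. have a missing value)."""
    return 100.0 * sum(flags) / len(flags)


def weighted_rate(flags, weights):
    """Percentage of the proxy total (e.g. register turnover) held by flagged units.

    Assumption: weights are non-negative proxy values from the Business Register.
    """
    total = sum(weights)
    flagged = sum(w for f, w in zip(flags, weights) if f)
    return 100.0 * flagged / total


# Hypothetical data: 10 units, 3 of them non-responders with small turnover.
missing = [True, True, True, False, False, False, False, False, False, False]
turnover = [1, 1, 1, 50, 50, 50, 50, 50, 24, 23]

print(unweighted_rate(missing))          # 30.0 (share of units)
print(weighted_rate(missing, turnover))  # 1.0 (share of proxy turnover)
```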

4. List of Basic Quality Indicators

4.1 Background Information

Use of administrative data:

Indicator Description How to calculate

1 Number of admin sources used

This indicator provides information on the number of administrative sources used in each statistical output. The number of sources should include all those used in the statistical output whether the admin data are used as raw data, in imputation or to produce estimations.

Note. Where relevant, a list of the admin sources may also be helpful for users, along with a list of the variables included in each source. Alternatively, the number of admin sources used can be specified by variable.

Examples of indicator

2 % of items obtained exclusively from admin data

This indicator provides information on the proportion of items only obtained from admin data, whether directly or indirectly, and where survey data are not collected. This includes where admin data are used as raw data, as proxy data, in calculations, etc. This indicator should be calculated on the basis of the statistical output – the number of items obtained exclusively from admin data (not by survey) should be considered.

$$\frac{\text{No. of items obtained exclusively from admin data}}{\text{Total no. of items}} \times 100\%$$

This indicator could also be weighted in terms of whether or not the variables are key to the statistical output. Examples of indicator

3 % of required variables derived from admin data that are used as a proxy

This indicator provides information on the extent that admin data are used in the statistical output as a proxy or are used in calculations rather than as raw data. This indicator should be calculated on the basis of the statistical output – the number of required variables derived indirectly from admin data (because not available directly from admin or survey data) should be considered.

$$\frac{\text{No. of required variables derived from admin data used as a proxy}}{\text{No. of required variables}} \times 100\%$$

Note. If a combination of survey and admin data is used, this indicator would need to be weighted (by number of units). If double collection is necessary (e.g. to check quality of admin data), some explanation should be provided. This indicator could also be weighted in terms of whether or not the variables are key to the statistical output. Examples of indicator

Timeliness and punctuality:

Indicator Description How to calculate

4 Periodicity (frequency of arrival of the admin data)

This indicator provides information about how often the admin data are received by the NSI. This indicator should be provided for each admin source.

Note. If data are provided via continuous feed from the admin source, this should be stated in answer to this indicator. Only data you receive for statistical purposes should be considered. Examples of indicator

Coherence:

Indicator Description How to calculate

5 % of common units across two or more admin sources

This indicator relates to the combination of one or more admin sources. This indicator provides information on the proportion of common units across two or more admin sources. Only units relevant to the statistical output should be considered. This indicator should be calculated pairwise for each pair of admin sources and then averaged. If only one admin source is available, this indicator is not relevant.

$$\frac{\text{No. of relevant common units in the admin sources}}{\text{No. of relevant unique units}} \times 100\%$$

Note. The “unique units” in the denominator means that units should only be counted once, even if they appear in multiple sources. This indicator should be calculated separately for each variable. This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution of these units to the statistical output. Examples of indicator
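
The pairwise calculation for Indicator 5 can be sketched with set operations. This is an illustration under stated assumptions: the "relevant unique units" of a pair are taken to be the union of the two sources, each source is already restricted to relevant units, and the source names and unit identifiers are hypothetical.

```python
from itertools import combinations


def common_units_pct(source_a, source_b):
    """% of relevant unique units that appear in both sources of a pair."""
    a, b = set(source_a), set(source_b)
    unique_units = a | b  # each unit counted once, even if in both sources
    return 100.0 * len(a & b) / len(unique_units)


def indicator_5(sources):
    """Unweighted average of the pairwise percentages over all pairs of sources."""
    pairs = list(combinations(sources, 2))
    if not pairs:
        raise ValueError("Indicator 5 is not relevant with a single admin source")
    return sum(common_units_pct(a, b) for a, b in pairs) / len(pairs)


# Hypothetical unit identifiers held by three admin sources.
vat = {"u1", "u2", "u3", "u4"}
social_security = {"u3", "u4", "u5", "u6"}
accounts = {"u1", "u2", "u3", "u4"}

print(round(indicator_5([vat, social_security, accounts]), 1))  # 55.6
```

A weighted variant would replace the unit counts with sums of a register variable such as turnover over the same sets.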

6 % of common units when combining admin and survey data

This indicator relates to the combination of admin and survey data. This indicator provides information on the proportion of common units across admin and survey data. Linking errors should be detected and resolved before this indicator is calculated. This indicator should be calculated for each admin source and then aggregated based on the number of common units (weighted by turnover) in each source.

$$\frac{\text{No. of common units in admin and survey data}}{\text{No. of units in survey}} \times 100\%$$

Note. If there are few common units due to the design of the statistical output (e.g. a combination of survey and admin data), this should be explained. This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution of these units to the statistical output. Examples of indicator

Cost and efficiency:

Indicator Description How to calculate

7 % of items obtained from admin source and also collected by survey

This indicator relates to the combination of admin and survey data. This indicator provides information on the double collection of data, both admin source and surveys. Thus, it provides an idea of redundancy as the same data items are being obtained more than once. This indicator should be calculated for each admin source and then aggregated.

$$\frac{\text{No. of relevant common items obtained by admin and survey data}}{\text{No. of relevant items in survey}} \times 100\%$$

Note. Double collection is sometimes conducted for specific reasons, e.g. to measure quality. If this is the case, this should be explained. Only admin data which meet the definitions and timeliness requirements of the output should be included. Examples of indicator

8 % reduction of survey sample size when moving from survey to admin data

This indicator relates to the combination of admin and survey data. This indicator provides information on the reduction in survey sample size because of an increased use of admin data. Only changes to the sample size due to using admin data should be included in this calculation. The indicator should be calculated for each survey and then aggregated (if applicable).

$$\frac{\text{Sample size before increase in use of admin data} - \text{sample size after}}{\text{Sample size before increase in use of admin data}} \times 100\%$$

Note. This indicator is likely to be calculated once, when making the change from survey to admin data.

Examples of indicator

4.2. Quality indicators

Accuracy:

Indicator Description How to calculate

9 Item non-response (% of units with missing values for key variables)

Although there are technically no ‘responses’ when using admin data, non-response (missing values at item or unit level) is an issue in the same way as with survey data. This indicator provides information on the extent of missing values for the key variables. This indicator should be calculated for each of the key variables and for each admin source and then aggregated based on the contributions of the variables to the overall output.

$$\frac{\text{No. of relevant units in the admin data with missing value for variable X}}{\text{No. of units relevant for variable X}} \times 100\%$$

This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output Examples of indicator
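
As a minimal sketch of the unweighted calculation for Indicator 9, the code below assumes the admin data have already been filtered to relevant units and that a missing value is represented by `None`. The record layout, variable names and figures are hypothetical.

```python
def item_nonresponse_pct(records, variable):
    """% of relevant units whose value for `variable` is missing (None).

    Assumption: every record in `records` is relevant for `variable`.
    """
    missing = sum(1 for record in records if record.get(variable) is None)
    return 100.0 * missing / len(records)


# Hypothetical admin records for four relevant units.
records = [
    {"unit": "u1", "turnover": 120.0, "employees": 4},
    {"unit": "u2", "turnover": None, "employees": 10},
    {"unit": "u3", "turnover": 75.5, "employees": None},
    {"unit": "u4", "turnover": None, "employees": 2},
]

for key_variable in ("turnover", "employees"):
    print(key_variable, item_nonresponse_pct(records, key_variable))
# turnover 50.0
# employees 25.0
```

The per-variable rates would then be aggregated according to each variable's contribution to the output, a step this sketch leaves out.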

10 Misclassification rate

This indicator provides information on the proportion of units in the admin data which are incorrectly coded. For simplicity and clarity, activity coding as recorded on the Business Register (BR) is considered to be correct. The level of coding used should be at a level consistent with the level used in the statistical output (e.g. if the statistical output is produced at the 3-digit level, then the accuracy of the coding should be measured at this level). This indicator should be calculated for each admin source and then aggregated based on the number of relevant units (weighted by turnover) in each source.

$$\frac{\text{No. of relevant units in admin data with different NACE code to BR}}{\text{No. of relevant units in admin data}} \times 100\%$$

Note. If the activity code from the admin data is not used by the NSI (e.g. if coding from BR is used), this indicator is not relevant.

If a survey is conducted to check the rate of misclassification, the rate from this survey should be provided and a note added to the indicator. This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output. Examples of indicator
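
The requirement to compare codes at the level used in the output can be illustrated as follows. This is a sketch under assumptions: codes are compared on their leading digits with separators ignored, only admin units that can be matched to the Business Register are counted, and the unit identifiers and codes are hypothetical.

```python
def nace_at_level(code, digits):
    """Keep the first `digits` digits of a NACE code, ignoring separators."""
    return "".join(ch for ch in code if ch.isdigit())[:digits]


def misclassification_pct(admin_codes, br_codes, digits):
    """% of relevant admin units whose code differs from the BR at the given level."""
    units = [unit for unit in admin_codes if unit in br_codes]
    differing = sum(
        1
        for unit in units
        if nace_at_level(admin_codes[unit], digits) != nace_at_level(br_codes[unit], digits)
    )
    return 100.0 * differing / len(units)


# Hypothetical activity codes; the BR coding is treated as correct.
admin = {"u1": "25.11", "u2": "25.12", "u3": "47.19", "u4": "10.71"}
register = {"u1": "25.11", "u2": "25.11", "u3": "46.19", "u4": "10.72"}

print(misclassification_pct(admin, register, digits=3))  # 25.0
print(misclassification_pct(admin, register, digits=4))  # 75.0
```

The same data give very different rates at the 3-digit and 4-digit levels, which is why the level should match that of the statistical output.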

11 Undercoverage

This indicator provides information on the undercoverage of the admin data. That is, units in the reference population that should be included in the admin data but are not (for whatever reason). This indicator should be calculated for each admin source and then aggregated based on the number of relevant units (weighted by turnover) in each source.

$$\frac{\text{No. of relevant units in reference population but NOT in admin data}}{\text{No. of relevant units in reference population}} \times 100\%$$

Note. This could be calculated for each relevant publication of the statistical output, e.g. first and final publication. This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output. Examples of indicator

12 Overcoverage

This indicator provides information on the overcoverage of the admin data. That is, units that are included in the admin data but should not be (e.g. are out-of-scope, outside the reference population). This indicator should be calculated for each admin source and then aggregated based on the number of relevant units (weighted by turnover) in each source.

$$\frac{\text{No. of units in admin data but NOT in reference population}}{\text{No. of units in reference population}} \times 100\%$$

This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output. Examples of indicator
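
Indicators 11 and 12 both reduce to set differences between the reference population and the admin data, each divided by the size of the reference population. The sketch below shows the unweighted versions for a single admin source; the unit identifiers are hypothetical.

```python
def undercoverage_pct(reference_population, admin_units):
    """% of reference-population units that are missing from the admin data."""
    ref, admin = set(reference_population), set(admin_units)
    return 100.0 * len(ref - admin) / len(ref)


def overcoverage_pct(reference_population, admin_units):
    """Units in the admin data but outside the reference population, as % of the population."""
    ref, admin = set(reference_population), set(admin_units)
    return 100.0 * len(admin - ref) / len(ref)


# Hypothetical identifiers: 8 units in the reference population,
# 6 of them in the admin source plus 1 out-of-scope unit.
reference = {"u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"}
admin_source = {"u1", "u2", "u3", "u4", "u5", "u6", "x1"}

print(undercoverage_pct(reference, admin_source))  # 25.0
print(overcoverage_pct(reference, admin_source))   # 12.5
```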

13 % of units in the admin source for which reference period differs from the required reference period

This indicator provides information on the proportion of units that provide data for a different reporting period than the required period for the statistical output. If the periods are not those required, then some imputation is necessary, which may impact quality. This indicator should be calculated for each admin source and then aggregated based on the number of relevant units (weighted by turnover) in each source.

$$\frac{\text{No. of relevant units in admin data with reporting period different from required period}}{\text{No. of relevant units in admin data}} \times 100\%$$

This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output.

Examples of indicator

14 Size of revisions from the different versions of the admin data: RAR – Relative Absolute Revisions

This indicator assesses the size of revisions from different versions of the admin data, providing information on the reliability of the data received. With this indicator it is possible to understand the impact of the different versions of admin data on the results for a certain reference period. When data is revised based on other information (e.g. survey data) this should not be included in this indicator. The indicator should be calculated for each admin source and then aggregated. If only one version of the admin data is received, this indicator is not relevant.

$$\mathrm{RAR} = \frac{\sum_{t=1}^{T} \left| X_{Lt} - X_{Pt} \right|}{\sum_{t=1}^{T} \left| X_{Pt} \right|} \times 100\%$$

$X_{Lt}$ = Latest data for X variable

$X_{Pt}$ = First data for X variable

Note. This indicator should only be calculated for estimates based on the same units (not including any additional units added in a later draft). This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output.

Examples of indicator
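
A small worked sketch of the Relative Absolute Revisions calculation follows. It assumes the indicator is the sum of absolute differences between the latest and first versions, divided by the sum of absolute first-version values over the same T periods; the absolute values are inferred from the indicator's name, and the figures are hypothetical.

```python
def relative_absolute_revisions(first, latest):
    """RAR in %: sum of |latest - first| over sum of |first| across T periods."""
    if len(first) != len(latest):
        raise ValueError("Both versions must cover the same periods")
    numerator = sum(abs(l - p) for p, l in zip(first, latest))
    denominator = sum(abs(p) for p in first)
    return 100.0 * numerator / denominator


# Hypothetical turnover totals for four periods, based on the same units.
first_version = [100.0, 200.0, 300.0, 400.0]
latest_version = [110.0, 190.0, 300.0, 420.0]

print(relative_absolute_revisions(first_version, latest_version))  # 4.0
```

Because absolute differences are summed, upward and downward revisions do not cancel out: here revisions of +10, -10, 0 and +20 give 40 out of 1000, or 4%.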

15 % of units in admin data which fail checks

This indicator provides information on the extent to which data fail some elements of the checks (automatic or manual) and are flagged by the NSI as suspect. This does not mean that the data are necessarily adjusted (see Indicator 16), simply that they fail one or more check(s). This checking can either be based on a model, checking against other data sources (admin or survey), internet research or through direct contact with the businesses. This indicator should be calculated for each of the key variables and aggregated based on the number of relevant units (weighted by turnover) in each source.

$$\frac{\text{No. of relevant units in admin data checked and failed}}{\text{Total no. of relevant units checked}} \times 100\%$$

Note. If the validation is done automatically and the system does not flag or record this in some way, this should be noted. This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output. Users should state the number of checks done, and the proportion of data covered by these checks. Examples of indicator

16

% of units for which data have been adjusted

This indicator provides information about the proportion of units for which the data have been adjusted (a subset of the units included in Indicator 15). These are units that are considered to be erroneous and are therefore adjusted in some way (missing data should not be included in this indicator – see Indicator 9). Any changes to the admin data before arrival with the NSI should not be considered in this indicator. This indicator should be calculated for each of the key variables and aggregated based on the number of relevant units (weighted by turnover) in each source.

(No. of relevant units in the admin data with adjusted data / No. of relevant units in admin data) * 100%

This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output. Examples of indicator

17

% of imputed values (items) in the admin data

This indicator provides information on the impact of the values imputed by the NSI. These values are imputed because data are missing (see Indicator 9) or data items are unreliable (see Indicator 16). This indicator should be calculated by variable for each admin source and then aggregated based on the contributions of the variables to the overall output.

(No. of imputed items in the relevant admin data / No. of relevant items in admin data) * 100%

This indicator should be weighted (e.g. by turnover or number of employees) in terms of the % contribution of the imputed values to the statistical output. Examples of indicator


Timeliness and punctuality:

Indicator Description How to calculate

18

Delay to accessing / receiving data from Admin Source

This indicator provides information on the proportion of the time from the end of the reference period to the publication date that is taken up waiting to receive the admin data. This is calculated as a proportion of the overall time between reference period and publication date to provide comparability across statistical outputs. This indicator should be calculated for each admin source and then aggregated.

(Time from the end of reference period to receiving admin data / Time from the end of reference period to publication date) * 100%

Note. Include only the final dataset used for the statistical output. If a continuous feed of data is received, the ‘last’ dataset used to calculate the statistical output should be used in this indicator. If more than one source is used, an average should be calculated, weighted by the sources’ contributions to the final estimate. If the admin data are received before the end of the reference period, this indicator would be 0. This indicator applies to the first publication only, not to revisions.
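The calculation and the special cases in the note can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names `delay_share` and `aggregate`, the dates and the contribution weights are all hypothetical and do not come from the report.

```python
from datetime import date


def delay_share(reference_end, received, published):
    """Share (%) of the reference-end-to-publication period spent waiting for admin data."""
    total_days = (published - reference_end).days
    if total_days <= 0:
        raise ValueError("publication date must fall after the end of the reference period")
    # Data received before the end of the reference period give an indicator of 0.
    waiting_days = max((received - reference_end).days, 0)
    return 100.0 * waiting_days / total_days


def aggregate(shares_and_weights):
    """Average per-source shares, weighted by each source's contribution to the estimate."""
    total_weight = sum(weight for _, weight in shares_and_weights)
    return sum(share * weight for share, weight in shares_and_weights) / total_weight


# Hypothetical dates and contribution weights, for illustration only.
reference_end = date(2011, 12, 31)
published = date(2012, 4, 29)
source_1 = delay_share(reference_end, date(2012, 2, 29), published)   # 60 of 120 days
source_2 = delay_share(reference_end, date(2011, 12, 15), published)  # received early
overall = aggregate([(source_1, 0.75), (source_2, 0.25)])
print(source_1, source_2, overall)
```

Only the final dataset used for the first publication should be passed in as the `received` date.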

Examples of indicator


Comparability:

Indicator Description How to calculate

19

Discontinuity in estimate when moving from a survey-based output to an admin-based output

This indicator measures the impact on the level of the estimate when changing from a survey-based output to an admin-based output. This indicator is likely to be calculated once, when making the change from survey to admin data. This indicator should be calculated separately for each key estimate included in the output.

Note. This indicator should be calculated using survey and admin data which refer to the same period. Examples of indicator

Coherence:

Indicator Description How to calculate

20

% of consistent items for common variables in more than one source2

This indicator provides information on consistent items for any common variables across sources (either admin or survey). Only variables directly required for the statistical output should be considered – basic information (e.g. business name and address) should be excluded. Values within a tolerance should be considered consistent – the width of this tolerance (1%, 5%, 10%, etc.) would depend on the variables and methods used in calculating the statistical output. This indicator should be calculated for each of the key variables and aggregated based on the contributions of the variables to the overall output.

(No. of consistent items (within tolerance) for variable X / Total no. of items required for variable X) * 100%

Note. If only one source is available or there are no common variables, this indicator is not relevant. Please state the tolerance used. This indicator could also be weighted (e.g. by turnover or number of employees) in terms of the % contribution to the output. Examples of indicator
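A tolerance check of this kind might be sketched in Python as below. The report does not say how the relative difference is measured, so comparing the absolute difference with a fraction of the larger of the two values is an assumption here, as are the function names and the example values.

```python
def is_consistent(value_a, value_b, tolerance):
    """True when the two values differ by at most `tolerance` (a fraction) of the larger one."""
    if value_a == value_b:
        return True
    scale = max(abs(value_a), abs(value_b))
    return abs(value_a - value_b) <= tolerance * scale


def consistent_share(pairs, tolerance):
    """% of items whose values in the two sources agree within the tolerance."""
    consistent = sum(1 for a, b in pairs if is_consistent(a, b, tolerance))
    return 100.0 * consistent / len(pairs)


# Hypothetical values of one variable for the same four units in two sources.
pairs = [(100, 100), (100, 104), (200, 230), (50, 51)]
share_5 = consistent_share(pairs, 0.05)    # 5% tolerance
share_15 = consistent_share(pairs, 0.15)   # 15% tolerance
print(share_5, share_15)
```

The result depends directly on the tolerance chosen, which is why the tolerance should be stated alongside the indicator.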

2 Indicators 20 and 23 are the only indicators in Section 4.2 for which a high indicator score denotes high quality and a low indicator score denotes low quality.


21

% of relevant units in admin data which have to be adjusted to create statistical units

This indicator provides information on the proportion of units that have to be adjusted in order to create statistical units. For example, the proportion of data at enterprise group level which therefore need to be split to provide reporting unit data.

UoS / (US + UoS)

UoS = Relevant units in the reference population that are adjusted to the statistical concepts by the use of statistical methods

US = Relevant units in the reference population that correspond to the statistical concepts

This indicator should be weighted (e.g. by turnover or number of employees) in terms of the % contribution of these units to the statistical output. Examples of indicator


Cost and efficiency:

Indicator Description How to calculate

22

Cost of converting admin data to statistical data

This indicator provides information on the estimated cost (in person hours) of converting admin data to statistical data. The indicator should be calculated for each admin source and then aggregated based on the contribution of the admin source to the statistical output.

(Estimated) Cost of conversion in person hours

Note. This should only be calculated for parts of the admin data relevant to the statistical output. Examples of indicator

23

Efficiency gain in using admin data3

This indicator provides information on the efficiency gain in using admin data rather than simply using survey data. For example, collecting admin data is usually cheaper than collecting data through a survey but this benefit might be offset by higher processing costs. Production cost should include all costs the NSI is able to attribute to the production of the statistical output.

[(Production cost of survey based statistic - Production cost of admin based statistic) / Production cost of survey based statistic] * 100%

Note. Estimated costs are acceptable.

This indicator is likely to be calculated once, when making the change from survey to admin data. Examples of indicator

3 Indicators 20 and 23 are the only indicators in Section 4.2 for which a high indicator score denotes high quality and a low indicator score denotes low quality.


5. Examples of calculating the quality indicators

A framework for the basic quantitative quality indicator examples

The calculation of an indicator needs some preliminary steps. Some or all of these steps will be used for each example of the indicators to ensure consistency of the examples, and to aid understanding of the indicators themselves.

A. Define the statistical output

B. Define the relevant units

C. Define the relevant variables

D. Adopt a schema for calculation

E. Declare the tolerance method for quantitative and qualitative variables

QI n. 1 – Number of administrative sources

Back to indicator 1

Example 1

A. Statistical output: The BR Enterprise units updating/identification

B. Relevant units: Enterprises with 10+ employees (relevant for a specific survey or as a base for the HG firms)

D. Steps for calculation: Identify the relevant admin sources.

Let S1 be the Fiscal Register source

Let S2 be the Chamber of Commerce source

Let S3 be the Social Security source

Let S4 be the Yellow Pages source

I(1) = 4 sources.

Example 2

A. Statistical output: The BR Local units updating/identification

B. Relevant units: The local units of enterprises with more than one local unit

D. Steps for calculation: Identify the relevant admin sources.

Let S1 be the Chamber of Commerce source

Let S2 be the Social Security source

I’(1) = 2 sources.


QI n. 2 – Percentage of items obtained exclusively from admin data

Back to indicator 2

Example 1

A. Statistical output: The BR Enterprise units updating/identification

B. Relevant units: Enterprises with 10 or fewer employees (relevant for a specific survey)

C. Relevant variables: Date of commencement of activities; Date of final cessation of activities; Principal activity code at NACE 4-digit level; Number of persons employed; Number of employees; Turnover.

D. Steps for calculation:

D1. From the BR take the population of units with 10 or fewer employees;

D2. For each relevant variable, calculate the proportion of items for which the variable is obtained exclusively from admin data (items with non-missing variable);

D3. Divide the sum of the numbers of items for which the variables are obtained exclusively from admin data by the sum of the numbers of items for which the variable is not missing;

D4. Calculate the indicator as follows:

Let INIT be the date of commencement of activities;

Let END be the date of final cessation of activities;

Let NACE be the Principal Activity Code;

Let PER be the number of persons employed;

Let EMP be the number of employees;

Let TUR be the Turnover, obtained as Proxy and included in this indicator;

| Variables | (1) Number of items for which the variable is not missing in the relevant items | (2) Number of employees of (1) | (3) Number of items for which the variable is obtained exclusively from admin data | (4) Number of employees of (3) | (5)=[(3)/(1)]*100 Proportion of items for which variables are obtained exclusively from admin data | (6)=[(4)/(2)]*100 Proportion of items for which variables are obtained exclusively from admin data weighted by employees |
|---|---|---|---|---|---|---|
| INIT | 4,360,685 | 3,501,511 | 4,349,379 | 3,485,361 | 99.7 | 99.5 |
| END | 232,594 | 89,359 | 232,015 | 88,633 | 99.8 | 99.2 |
| NACE | 4,360,685 | 3,501,511 | 4,331,316 | 3,460,597 | 99.3 | 98.8 |
| PER | 4,360,685 | 3,501,511 | 4,281,650 | 3,294,864 | 98.2 | 94.1 |
| EMP | 1,405,754 | 3,501,511 | 1,329,071 | 3,299,878 | 94.5 | 94.2 |
| TUR | 4,282,711 | 3,367,764 | 4,282,711 | 3,367,764 | 100.0 | 100.0 |
| Total | 19,003,114 | 17,463,167 | 18,806,142 | 16,997,097 | 99.0 | 97.3 |

I(2) = (# items obtained exclusively from admin data / Total # of items) * 100 = (18,806,142 / 19,003,114) * 100 = 99.0%
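The totals in this example can be reproduced with a short Python sketch. The counts are those of the example table; the dictionary layout and the helper name `share` are illustrative assumptions, not part of the report.

```python
# variable -> (items not missing, employees of those items,
#              items obtained exclusively from admin data, employees of those items)
counts = {
    "INIT": (4_360_685, 3_501_511, 4_349_379, 3_485_361),
    "END": (232_594, 89_359, 232_015, 88_633),
    "NACE": (4_360_685, 3_501_511, 4_331_316, 3_460_597),
    "PER": (4_360_685, 3_501_511, 4_281_650, 3_294_864),
    "EMP": (1_405_754, 3_501_511, 1_329_071, 3_299_878),
    "TUR": (4_282_711, 3_367_764, 4_282_711, 3_367_764),
}


def share(numerator, denominator):
    """Percentage of `numerator` in `denominator`."""
    return 100.0 * numerator / denominator


total_items = sum(c[0] for c in counts.values())
total_employees = sum(c[1] for c in counts.values())
admin_items = sum(c[2] for c in counts.values())
admin_employees = sum(c[3] for c in counts.values())

i2 = share(admin_items, total_items)                   # unweighted indicator
i2_weighted = share(admin_employees, total_employees)  # weighted by employees
print(f"I(2) = {i2:.1f}%, weighted by employees = {i2_weighted:.1f}%")
```

Summing the counts before dividing, as step D3 describes, gives each variable an influence proportional to its number of non-missing items.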


QI n. 3 – % of required variables derived from admin data that are used as a proxy

Back to indicator 3

Example 1

A. Statistical output: The BR Enterprise units

B. Relevant unit: Enterprise with turnover

C. Let the list of relevant variables4 be as follows:

1) Date of commencement of activities;

2) Date of final cessation of activities;

3) Principal activity code at NACE 4-digit level;

4) Number of persons employed;

5) Number of employees;

6) Turnover;

7) Identification number of the resident/truncated enterprise group to which the enterprise belongs

The relevant variable obtained from the Fiscal source is the VAT turnover, a proxy for Turnover

D: Steps for calculation:

D1: Number of required variables derived from admin data

D2: Number of variables of D1 used as a proxy (i.e. the variable is derived indirectly from admin data) (num)

D3: Number of required variables by the BR Regulation (denom)

Formula:

I(3)= (Num/Den)*100=(1/7)*100=14%

QI n. 4 – Periodicity (frequency of arrival of admin data)

Back to indicator 4

Example 1

A. Statistical output: The BR Enterprise units

4 The variables are required by the Regulation (EC) No 177/2008 of the European Parliament and of the Council, but each country will decide which of the required variables will be relevant for this indicator.


D: Steps for calculation: Record periodicity for each source

Let S1 be the Fiscal Register source

Let S2 be the Chamber of Commerce source

Let S3 be the Social Security source

Let S4 be the Yellow Pages source

IS1(4) = 1; IS2(4) = 2; IS3(4) = 2; IS4(4) = 1

Example 2

A. Statistical output: OROS Survey (Employment, earnings and social security contributions) based on the Social Security administrative data.

B. Relevant units: small enterprises with employees

D: Steps for calculation: Record periodicity for each source

Let S1 be the Fiscal Register source

Let S2 be the Social Security source

I’S1(4) = 4; I’S2(4) = 4

QI n. 4, Example 1: frequency of arrival by source

| Type of admin data | Frequency of arrival of the admin data (with respect to BR reference year), per year |
|---|---|
| Yellow Pages data | 1 |
| Chamber of Commerce data | 2 |
| Social Security data | 2 |
| Fiscal Register data | 1 |

QI n. 4, Example 2: frequency of arrival by source

| Type of admin data | Frequency of arrival of the admin data, per year |
|---|---|
| Fiscal Register data | 4 |
| Social Security data | 4 |

QI n. 5 – % of common units across two or more administrative sources

Back to indicator 5

Example 1

A. Statistical output: The BR Enterprise units updating/identification

B. Relevant units: The NACE sector = Construction


D. Steps for calculation:

D1. Identify the statistical unit (enterprise) for each source (i.e. group the administrative records in one source at id code level)

D2. Match all sources with each other by id code

D3. Attribute a “presence(1) / absence(0)” indicator to the unit with regard to the specific source

D4. Calculate the number of possible pairings between sources (i.e. when there are n sources, the number of combinations of n sources taken k=2 at a time): Cn,k = n!/[(n-k)!*k!]

With 4 sources, the possible combinations are: C4,2 = 24/4 = 6

D5. Multiply the “presence(1) / absence(0)” indicators of the two sources in each pair to obtain the presence(1) / absence(0) indicator for that pair

D6. Sum up the “presence(1) / absence(0)” indicators at pair level and divide by Cn,k * #relevant units

Let A be the Social Security source

Let B be the Chamber of Commerce source

Let C be the Yellow Pages source

Let D be the Fiscal Register source

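NACE sector = Construction

Presence(1)/Absence(0) of the unit in the source:

Ind(XiA)=1 if Xi is present in the source A; Ind(XiA)=0 if Xi is absent from the source A

| UNIT | A | B | C | D | AB | AC | AD | BC | BD | CD | Sum |
|---|---|---|---|---|---|---|---|---|---|---|---|
| X1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| X2 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| X3 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 3 |
| X4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 6 |
| X5 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| X6 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 6 |
| X7 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 6 |
| X8 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| X9 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 3 |
| X10 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 6 |
| Sum |  |  |  |  | 5 | 4 | 5 | 5 | 9 | 6 | 34 |

Num = ∑ij ind(Xij) = 34

Denom = m*Cn,k = 10*6 = 60

I(5) = (Num/Denom)*100 = (34/60)*100 = 57%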
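Steps D3 to D6 can be sketched in Python as follows. The presence flags are those of the example, and the turnover values are the weights used in the turnover-weighted version of the same example; the function name `common_units` and the data layout are illustrative assumptions.

```python
from itertools import combinations

# Presence (1) / absence (0) of each unit in sources A, B, C, D.
presence = {
    "X1": (0, 0, 1, 1), "X2": (0, 1, 0, 1), "X3": (0, 1, 1, 1),
    "X4": (1, 1, 1, 1), "X5": (0, 1, 0, 1), "X6": (1, 1, 1, 1),
    "X7": (1, 1, 1, 1), "X8": (0, 1, 0, 1), "X9": (1, 1, 0, 1),
    "X10": (1, 1, 1, 1),
}
# Turnover of each unit, used as the weight.
turnover = {
    "X1": 15_020, "X2": 28_340, "X3": 57_812, "X4": 1_167_584,
    "X5": 21_333, "X6": 5_767_853, "X7": 153_000, "X8": 63_021,
    "X9": 184_818, "X10": 56_023,
}


def common_units(presence, weights=None):
    """Return (numerator, denominator, indicator in %) for the pairwise overlap."""
    n_sources = len(next(iter(presence.values())))
    pairs = list(combinations(range(n_sources), 2))  # the C(n, 2) source pairs
    numerator = 0
    denominator = 0
    for unit, flags in presence.items():
        weight = 1 if weights is None else weights[unit]
        # A unit counts for a pair only when it is present in both sources.
        numerator += weight * sum(flags[i] * flags[j] for i, j in pairs)
        denominator += weight * len(pairs)
    return numerator, denominator, 100.0 * numerator / denominator


num, den, i5 = common_units(presence)
num_w, den_w, i5_w = common_units(presence, turnover)
print(f"I(5) = {num}/{den} = {i5:.0f}%; weighted = {i5_w:.0f}%")
```

The weighted figure is much higher than the unweighted one because the units present in all four sources account for most of the turnover.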


The following picture illustrates the meaning of the result:

And, weighting by Turnover

Num=∑ijind(Xij)*wi=43,722,364

Den= Cn,k*∑wi=6*7,514,804=45,088,824

I(5)=(num/den)*100=97%

QI n. 5, Example 1: pairwise presence indicators weighted by turnover

| UNIT | (2) Turnover | AB*(2) | AC*(2) | AD*(2) | BC*(2) | BD*(2) | CD*(2) | Sum |
|---|---|---|---|---|---|---|---|---|
| X1 | 15,020 | 0 | 0 | 0 | 0 | 0 | 15,020 | 15,020 |
| X2 | 28,340 | 0 | 0 | 0 | 0 | 28,340 | 0 | 28,340 |
| X3 | 57,812 | 0 | 0 | 0 | 57,812 | 57,812 | 57,812 | 173,436 |
| X4 | 1,167,584 | 1,167,584 | 1,167,584 | 1,167,584 | 1,167,584 | 1,167,584 | 1,167,584 | 7,005,504 |
| X5 | 21,333 | 0 | 0 | 0 | 0 | 21,333 | 0 | 21,333 |
| X6 | 5,767,853 | 5,767,853 | 5,767,853 | 5,767,853 | 5,767,853 | 5,767,853 | 5,767,853 | 34,607,118 |
| X7 | 153,000 | 153,000 | 153,000 | 153,000 | 153,000 | 153,000 | 153,000 | 918,000 |
| X8 | 63,021 | 0 | 0 | 0 | 0 | 63,021 | 0 | 63,021 |
| X9 | 184,818 | 184,818 | 0 | 184,818 | 0 | 184,818 | 0 | 554,454 |
| X10 | 56,023 | 56,023 | 56,023 | 56,023 | 56,023 | 56,023 | 56,023 | 336,138 |
| Sum | 7,514,804 | 7,329,278 | 7,144,460 | 7,329,278 | 7,202,272 | 7,499,784 | 7,217,292 | 43,722,364 |

QI n. 6 – % of common units when combining admin and survey data

Back to indicator 6

Example 1

A. Statistical output: A sectoral survey

B. Relevant units: The units in the survey(s)


C. Relevant variables: NACE activity code

D. Steps for calculation:

D1. Match each source with the survey(s) by the common id code

D2. Attribute a “presence(1) / absence(0)” indicator to the unit if it belongs to at least one survey (sum up to obtain the denominator)

D3. Attribute a “presence(1) / absence(0)” indicator to the unit if it belongs both to the survey and to each source (sum up by source to obtain the numerator)

D4. Calculate the aggregate indicator as follows:

Let A be the Chamber of Commerce source

Let B be the Social Security source:

I(6) = [ #units(A∩Survey)/#units(Survey) + #units(B∩Survey)/#units(Survey) - #units(A∩B∩Survey)/#units(Survey) ] * 100

And, if we have three sources:

Let C be the Yellow Pages source

I(6) = [ #units(A∩Survey)/#units(Survey) + #units(B∩Survey)/#units(Survey) + #units(C∩Survey)/#units(Survey) - #units(A∩B∩Survey)/#units(Survey) - #units(A∩C∩Survey)/#units(Survey) - #units(B∩C∩Survey)/#units(Survey) + #units(A∩B∩C∩Survey)/#units(Survey) ] * 100

and so on

I(6) = [ (ΣInd_survey∩A + ΣInd_survey∩B + ΣInd_survey∩C)/ΣInd_survey - (Common A∩B∩Survey + Common A∩C∩Survey + Common B∩C∩Survey)/ΣInd_survey + Common A∩B∩C∩Survey/ΣInd_survey ] * 100 = [ (3+2+2)/5 - (1+1+1)/5 + 0/5 ] * 100 = (4/5)*100 = 80%

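| UNIT | Source A | Source B | Source C | Survey 1 | Survey 2 | (0) Ind_Survey | (1) Ind_survey ∩ A | (2) Ind_survey ∩ B | (3) Ind_survey ∩ C | (4) Common A∩B∩Survey | (5) Common A∩C∩Survey | (6) Common B∩C∩Survey | (7) Common A∩B∩C∩Survey | Turnover |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| X1 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 35,147 |
| X2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1,507,231 |
| X3 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 627,432 |
| X4 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 18,150 |
| X5 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 57,442 |
| X6 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 159,630 |
| X7 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 68,000 |
| X8 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 34,123 |
| X9 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 22,365 |
| X10 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 18,130 |
| X11 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 59,458 |
| X12 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 39,658 |
| Sum |  |  |  |  |  | 5 | 3 | 2 | 2 | 1 | 1 | 1 | 0 | 2,646,766 |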


And, weighting by turnover:

I(6)=[(35,147+627,432+57,442+57,442+159,630+35,147+159,630)/(35,147+627,432+57,442+159,630+18,130)-(57,442+35,147+159,630)/(35,147+627,432+57,442+159,630+18,130)]*100=98%
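The inclusion-exclusion calculation used in this example can be sketched in Python as below, together with a direct count of survey units found in at least one source as a cross-check. The flags and turnover values are those of the example table; the function names and the data layout are illustrative assumptions.

```python
from itertools import combinations

# unit -> (in source A, in source B, in source C, in at least one survey, turnover)
units = {
    "X1": (1, 0, 1, 1, 35_147), "X2": (1, 1, 0, 0, 1_507_231),
    "X3": (1, 0, 0, 1, 627_432), "X4": (1, 1, 1, 0, 18_150),
    "X5": (1, 1, 0, 1, 57_442), "X6": (0, 1, 1, 1, 159_630),
    "X7": (1, 0, 0, 0, 68_000), "X8": (0, 1, 0, 0, 34_123),
    "X9": (1, 0, 1, 0, 22_365), "X10": (0, 0, 0, 1, 18_130),
    "X11": (1, 0, 0, 0, 59_458), "X12": (1, 0, 0, 0, 39_658),
}


def survey_units(units, weighted):
    """Source flags and weight of every unit that belongs to a survey."""
    return [(row[:3], row[4] if weighted else 1)
            for row in units.values() if row[3] == 1]


def indicator_6(units, weighted=False):
    """Inclusion-exclusion over the sources, as in the formula of the example."""
    survey = survey_units(units, weighted)
    denominator = sum(weight for _, weight in survey)
    numerator = 0
    for k in range(1, 4):
        sign = 1 if k % 2 == 1 else -1  # add single sources, subtract pairs, add triples
        for combo in combinations(range(3), k):
            numerator += sign * sum(weight for flags, weight in survey
                                    if all(flags[i] for i in combo))
    return 100.0 * numerator / denominator


def indicator_6_direct(units, weighted=False):
    """Cross-check: survey units present in at least one admin source."""
    survey = survey_units(units, weighted)
    covered = sum(weight for flags, weight in survey if any(flags))
    return 100.0 * covered / sum(weight for _, weight in survey)


print(indicator_6(units), indicator_6(units, weighted=True))
```

Both functions give the same result, since inclusion-exclusion is simply a way of counting the union of the sources within the survey population.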

QI n. 7 – % of items obtained from admin source and also collected by survey

Back to indicator 7

Example 1

A. Statistical output: A survey on the commerce sector

B. Relevant units: Units in the survey

C. Relevant variables: Economic activity code (NACE) (var1) and legal status (var2)

D. Steps for calculation:

D1. Match each source with survey(s) by the common id code

D2. Attribute a “presence(1) / absence(0)” indicator to items of var1 and var2 in the survey (sum up to obtain the denominator)

D3. Attribute a value of 1 (0) to each item that is (is not) common to the survey and the source (sum up to obtain the numerator)

D4. Calculate the indicator as follows:

Let CC be the Chamber of Commerce source

Let SBS and GI be two surveys

Let Var1 be the ATECO (5-digit Italian version of NACE)

Let Var2 be the Legal Status.

I(7) = [# of relevant common items obtained by admin and survey data / # of relevant items in survey(s)] * 100


QI n. 7, Example 1: items in the source (CC) and in the surveys (SBS, GI)

| Units | Number of employees | Variable 1: ATECO codes recorded (CC-ATECO, SBS-ATECO, GI-ATECO) | # items in surveys: ATECO | Presence/Absence (1/0) of items in source and survey: ATECO | Variable 2: Legal status codes recorded (CC-Legal status, SBS-Legal status) | # items in surveys: Legal status | Presence/Absence (1/0) of items in source and survey: Legal status |
|---|---|---|---|---|---|---|---|
| X1 | 15 | 46520, 46510, 46520 | 1 | 1 | 1320, 1320 | 1 | 1 |
| X2 | 10 | 10840 | 0 | 0 | 1220 | 0 | 0 |
| X3 | 150 | 47112, 47111 | 1 | 1 | 1310, 1310 | 1 | 1 |
| X4 | 0 | 68200, 47112 | 1 | 1 | 1310 | 0 | 0 |
| X5 | 28 | 47113 | 0 | 0 | 1320 | 0 | 0 |
| X6 | 237 | 47112, 47112 | 1 | 1 | 1320, 1330 | 1 | 1 |
| X7 | 58 | 10120, 46321, 47112 | 1 | 1 | 1320, missing | 1 | 0 |
| X8 | 76 | 47112, 47111 | 1 | 1 | 1320, 1320 | 1 | 1 |
| X9 | 199 | 47111, 47111, 47111 | 1 | 1 | missing, 1310 | 0 | 0 |
| X10 | 15 | 46411 | 0 | 0 | 1330 | 0 | 0 |
| X11 | 0 | 47114 | 1 | 0 |  | 0 | 0 |
| X12 | 11 | 47112 | 0 | 0 | 1310 | 0 | 0 |
| Sum | 799 |  | 8 | 7 |  | 5 | 4 |

I7(ATECO) = 7/8 * 100% = 87.5%

I7w(ATECO) = (15 + 150 + 0 + 237 + 58 + 76 + 199)/(15 + 150 + 0 + 237 + 58 + 76 + 199 + 0)*100% = 100%

I7(Legal status) = 4/5 * 100% = 80%

I7w(Legal status) = (15 + 150 + 237 + 76)/(15 + 150 + 237 + 58 + 76) * 100% = 89%

QI n. 8 – % reduction of survey sample size when moving from survey to admin data

Back to indicator 8

A. Statistical output: A sectoral survey

B. Relevant units: Enterprises with commercial area greater than 400 m²

D. Steps for calculation:

D1. Identify sample size before use of admin data

D2. Identify sample size after use of admin data

D3. Calculate the indicator as follows:

Sample size before increase in use of admin data: 1482

Sample size after increase in use of admin data: 950

I(8) = [(Sample size before increase in use of admin data - Sample size after) / Sample size before increase in use of admin data] * 100 = [(1482 - 950)/1482] * 100 = 35.9%


QI n. 9 – Item non-response (% of units with missing values for key variables)

Back to indicator 9

Example 1

A. Statistical output: BR for SBS

B. Relevant units: Units with 100+ employees

C. Relevant variables: number of employees

D. Steps for calculation:

D1. from BR take the population of units with 100+ employees

D2. Match source A with BR100+ by the common id code

D3. Calculate number of common units in A with missing value

D4. Calculate the indicator as follows:

Let A be the Social Security source:

I(9) = (# units in source A with missing value for employees / # relevant units) * 100

|  | n. units | n. employees |
|---|---|---|
| (num) n. relevant units in A source with missing data | 10 | 19,691 |
| (den) n. relevant units "enterprises with 100+ employees" | 12,277 | 5,394,055 |

I(9) = (num/den)*100 = (10/12,277)*100 = 0.08%

Weighted for employment:

I(9)w = (num/den)*100 = (19,691/5,394,055)*100 = 0.37%

QI n. 10 – Misclassification rate

Back to indicator 10

Example 1

A. Statistical output: BR unit

B. Relevant units: Units in Construction sector

C. Relevant variables: Economic Activity code (NACE, 4 digits or 3 digits)

E. Tolerance: “consistency” means equal NACE at 4 digits

D. Steps for calculation:


D1. Match each source (VAT (file of the “Value-Added Tax” model) and/or CCIAA (file of declarations to the Chambers of Commerce)) with the BR by the common id code

D2. Attribute a “presence(1) / absence(0)” indicator to items of the variable in each admin source (sum up to obtain the denominator)

D3. Attribute a value of 1 (0) to each item that is inconsistent (consistent) between the BR and the source (sum up to obtain the numerator)

D4. Calculate the indicator (simple or aggregated, weighted or not weighted) as follows:

I(10) Source(VAT or CCIAA) = [# inconsistencies of variable (NACE 4 or 3 digit) between BR and Source (VAT or CCIAA) / # items in VAT or CCIAA source] * 100

I(10)VAT=(2/3)*100=67%

I(10)VATw=(2+5.42)/(2+16+5.42)*100=32%

I(10)CCIAA=(3/5)*100=60%

I(10)CCIAA w=(2+1+5.42)/(2+2.25+1+16+5.42)*100=32%

I(10)aggregated VAT – CCIAA =(67*3+60*5)/(3+5)=63%
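The simple, weighted and aggregated rates can be reproduced with the Python sketch below. The item and inconsistency flags and the persons employed are those of the example's data table; the row layout and the function name `rate` are illustrative assumptions, and a unit with no item in a source is given an inconsistency flag of 0.

```python
# (unit, VAT item present, VAT inconsistent with BR,
#  CCIAA item present, CCIAA inconsistent with BR, persons employed)
rows = [
    ("X1", 1, 1, 1, 1, 2.0),
    ("X2", 0, 0, 0, 0, 1.0),
    ("X3", 0, 0, 1, 0, 2.25),
    ("X4", 0, 0, 0, 0, 2.0),
    ("X5", 0, 0, 1, 1, 1.0),
    ("X6", 1, 0, 1, 0, 16.0),
    ("X7", 0, 0, 0, 0, 1.0),
    ("X8", 1, 1, 1, 1, 5.42),
    ("X9", 0, 0, 0, 0, 1.0),
    ("X10", 0, 0, 0, 0, 1.0),
]


def rate(rows, present_col, inconsistent_col, weighted=False):
    """Misclassification rate (%) for one source, optionally weighted by persons employed."""
    numerator = 0.0
    denominator = 0.0
    for row in rows:
        weight = row[5] if weighted else 1.0
        denominator += weight * row[present_col]
        numerator += weight * row[inconsistent_col]
    return 100.0 * numerator / denominator


vat = rate(rows, 1, 2)
cciaa = rate(rows, 3, 4)
vat_w = rate(rows, 1, 2, weighted=True)
cciaa_w = rate(rows, 3, 4, weighted=True)

# Aggregate the two sources, weighting each rate by its number of items.
n_vat = sum(row[1] for row in rows)
n_cciaa = sum(row[3] for row in rows)
aggregated = (vat * n_vat + cciaa * n_cciaa) / (n_vat + n_cciaa)
print(f"{vat:.1f} {vat_w:.1f} {cciaa:.1f} {cciaa_w:.1f} {aggregated:.1f}")
```

Working from unrounded rates, the aggregated value is 62.5%; the 63% quoted in the example comes from aggregating the already rounded 67% and 60%.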

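QI n. 10, Example 1: NACE codes in the BR and in the VAT and CCIAA sources

| Unit | NACE-BR | NACE-Source VAT | # items in VAT Source | Inconsistency VAT-BR (4 digits) | NACE-Source CCIAA | # items in CCIAA Source | Inconsistency CCIAA-BR (4 digits) | Persons employed |
|---|---|---|---|---|---|---|---|---|
| X1 | 43910 | 41200 | 1 | 1 | 412 | 1 | 1 | 2 |
| X2 | 41200 |  | 0 |  |  | 0 |  | 1 |
| X3 | 41200 |  | 0 |  | 412 | 1 | 0 | 2.25 |
| X4 | 41200 |  | 0 |  |  | 0 |  | 2 |
| X5 | 43390 |  | 0 |  | 41 | 1 | 1 | 1 |
| X6 | 43290 | 43290 | 1 | 0 | 43290 | 1 | 0 | 16 |
| X7 | 43220 |  | 0 |  |  | 0 |  | 1 |
| X8 | 43120 | 41200 | 1 | 1 | 412 | 1 | 1 | 5.42 |
| X9 | 432 |  | 0 |  |  | 0 |  | 1 |
| X10 | 43290 |  | 0 |  |  | 0 |  | 1 |
| Sum |  |  | 3 | 2 |  | 5 | 3 | 32.67 |

QI n. 11 – Undercoverage

Back to indicator 11

Example 1

A. Statistical output: BR for SBS

B. Relevant units: Units with 100+ employees

C. Relevant variables: number of employees

D. Steps for calculation:

D1. From BR take the population of units with 100+ employees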


D2. Match source A with BR100+ by the common id code

D3. Calculate number of units in relevant population BUT not present in A

D4. Calculate the indicator as follows:

Let A be the Social Security source:

I(11) = (# relevant units NOT in source A / # relevant units) * 100

|  | n. units | n. employees |
|---|---|---|
| (num) n. relevant units not included in A source | 29 | 44,173 |
| (den) n. relevant units "enterprises with 100+ employees" | 12,277 | 5,394,055 |

I(11) = (num/den)*100 = (29/12,277)*100 = 0.24%

Weighted for employment:

I(11)w = (num/den)*100 = (44,173/5,394,055)*100 = 0.82%

QI n. 12 – Overcoverage

Back to indicator 12

Example 1

A. Statistical output: BR enterprises with turnover less than 7,500,000 Euro

B. Relevant units: Units with turnover less than 7,500,000 Euro

C. Relevant variables: Any variable of interest

D. Steps for calculation: Source: Statistics-Based Tax Assessment (SBTASS), a survey managed by the Italian Tax Authority

D1. Match each source (SBTASS) with BR by the common id code

D2. Identify relevant units

D3. Calculate number of units BR∩SBTASS out of scope, that is units with turnover greater than 7,500,000 Euro

D4. Calculate the indicator as follows:

I(12) = (No. of relevant units in admin data but not in reference population / No. of relevant units in reference population) * 100


|  | Units | n. pers. empl. |
|---|---|---|
| (num) BR∩SBTASS with turnover greater than 7,500,000 Euro | 2,074 | 63,279.05 |
| (den) BR with turnover less than 7,500,000 Euro (excluding missing value) | 3,305,497 | 9,076,661.77 |

I(12) = (2,074/3,305,497)*100 = 0.06%

I(12)w (for pers. empl.) = (63,279.05/9,076,661.77)*100 = 0.70%

QI n. 13 – % of units in the admin source for which reference period differs from the required reference period

Back to indicator 13

Example 1

A. Statistical output: The BR Enterprise units

B. Relevant units: Corporations in Admin data with different reporting period from required BR period

D: Steps for calculation:

D1. From the Balance Sheet source take all corporations whose reporting period differs from the required BR period. The required BR period is 01.01.2009-31.12.2009, while a different reporting period is, for example, 30.06.2008-30.06.2009

D2. Match all BR corporations with Balance Sheet by the common id code

D3. Calculate the indicator as follows:

Let A be the Balance Sheet source

|  | Units | Employees | Turnover |
|---|---|---|---|
| No of BR corporations present in Balance Sheet source (1) | 607,899 | 7,980,361 | 2,088,997,442,877 |
| No of BR corporations present in Balance Sheet for which reference period is different from BR required reference period (2) | 11,341 | 410,777 | 140,954,727,363 |

I(13)=(Num/Den)*100=(11,341/607,899)*100=1.87%

Num: No of relevant corporations with a reporting period different from the required BR period;

Den: No of relevant corporations in BR.

I(13)w(E)=[Employees(2)/Employees(1)]*100=(410,777/7,980,361)*100=5.15%

I(13)w(T)=[Turnover(2)/Turnover(1)]*100=(140,954,727,363/2,088,997,442,877)*100=6.75%


QI n. 14 – Size of revisions from the different versions of the admin data (RAR – Relative Absolute Revisions)

Back to indicator 14

Example 1

A. Statistical output: The BR Enterprise units updating/identification.

B. Relevant units: Units with 100 or more employees in an ATECO (5-digit Italian version of NACE) activity code.

C. Relevant variables: Number of employees.

D. Steps for calculation:

D1. Identify the statistical unit (enterprise) in the first and in the second version of data coming from the same source;

D2. Take the units with 100 or more employees which are included in the ATECO activity code;

D3. Take the non-missing values (XPt) from the first data version;

D4. Take the non-missing values (XLt) from the second data version for the same units received in the first data version;

D5. Calculate the difference (absolute value) between the latest data and the first data version for each unit;

D6. Sum up the differences and divide the result by the sum of the absolute values of the first data.

D7. Calculate the indicator as follows:

I(14) = [ Σ(t=1..T) |XLt - XPt| / Σ(t=1..T) |XPt| ] * 100

| Unit | Number of employees in the first data (1) | Number of employees in the second data (2) | Absolute values of (2)-(1) | Turnover |
|---|---|---|---|---|
| X1 | 150 | 150 | 0 | 150,322 |
| X2 | 227 | 227 | 0 | 273,200 |
| X3 | 125 | 127 | 2 | 100,233 |
| X4 | 8,023 | 8,218 | 195 | 11,027,323 |
| X5 | 1,312 | 1,315 | 3 | 7,182,325 |
| X6 | Absent | Absent |  |  |
| X7 | 58 | 123 | 65 | 78,000 |
| X8 | 887 | 887 | 0 | 532,233 |
| X9 | 24 | 118 | 94 | 21,452 |
| X10 | 533 | 533 | 0 | 1,125,328 |
| Total | 11,339 | 11,698 | 359 | 20,490,416 |


I(14)=[(|150-150|+|227-227|+|127-125|+|8,218-8,023|+|1,315-1,312|+|123-58|+|887-887|+|118-24|+|533-533|)/(|150|+|227|+|125|+|8,023|+|1,312|+|58|+|887|+|24|+|533|)]*100=(359/11,339)*100=3%
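The same calculation can be sketched in Python. The employee counts are those of the example (unit X6 is absent from both versions and is therefore skipped); the function name `relative_absolute_revision` and the use of `None` for absent values are illustrative assumptions.

```python
# Number of employees per unit in the first and in the latest version of the data.
first = {"X1": 150, "X2": 227, "X3": 125, "X4": 8023, "X5": 1312,
         "X6": None, "X7": 58, "X8": 887, "X9": 24, "X10": 533}
latest = {"X1": 150, "X2": 227, "X3": 127, "X4": 8218, "X5": 1315,
          "X6": None, "X7": 123, "X8": 887, "X9": 118, "X10": 533}


def relative_absolute_revision(first, latest):
    """Return (sum of |latest - first|, sum of |first|, RAR in %) over units in both versions."""
    common = [unit for unit in first
              if first[unit] is not None and latest.get(unit) is not None]
    revisions = sum(abs(latest[unit] - first[unit]) for unit in common)
    base = sum(abs(first[unit]) for unit in common)
    return revisions, base, 100.0 * revisions / base


revisions, base, rar = relative_absolute_revision(first, latest)
print(f"I(14) = {revisions}/{base} = {rar:.1f}%")
```

Restricting the sums to units present in both versions follows the note to the indicator, which excludes units added in a later version.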

QI n.15 – % of units in admin data which fail checks

Back to indicator 15

Example 1

A. Statistical output: The BR Enterprise updating/identification.

B. Relevant units: all the units in the register

C. Relevant variables: The key variables: NACE activity Code; state of activity; number of

employees.

D. Steps for calculation:

D1: calculate for each key variable the number of units that come from admin data;

D2: Identify for each key variable the number of units that fail checks and come from admin

data;

D3. Average the proportions of units that fail checks by weighting by the numbers of units.
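The steps above can be sketched as follows, using the figures from the table for this example. The dictionary layout and the helper name `pct` are assumptions made for illustration. The sketch also shows that averaging the per-variable proportions weighted by the number of units is the same as dividing the total number of failing units by the total number of units.

```python
# Per key variable: (units failing checks, turnover of those units,
#                    units with admin data, turnover of those units)
rows = {
    "A": (60_277, 71_131_871_029, 4_486_410, 1_494_887_680_425),
    "B": (48_505, 15_367_168_813, 4_497_993, 1_518_016_883_187),
    "C": (81_298, 260_158_056_079, 1_605_267, 1_169_124_499_951),
}


def pct(numerator, denominator):
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


# D1-D2: proportion of failing units for each key variable.
per_variable = {k: pct(fail, units) for k, (fail, _, units, _) in rows.items()}

# D3: average of the proportions, weighted by the number of units.
total_units = sum(units for _, _, units, _ in rows.values())
i15 = sum(per_variable[k] * rows[k][2] for k in rows) / total_units

# Turnover-weighted variant: turnover of failing units over total turnover.
i15_w = pct(sum(ft for _, ft, _, _ in rows.values()),
            sum(t for _, _, _, t in rows.values()))

print(f"I(15) = {i15:.1f}%, I(15)w = {i15_w:.1f}%")
```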

Let A be the NACE activity code;

Let B be the state of activity;

Let C be the number of employees.

Variables   (1)              (2)                  (3)              (4)                    [(1)/(3)]*100       [(2)/(4)]*100
            Number of units  Turnover of (1)      Number of units  Turnover of (3)        % of units in       Weighted by
            with admin data                       with admin data                         admin data which    turnover
            which fail checks                                                             fail checks
A                    60,277     71,131,871,029        4,486,410    1,494,887,680,425                  1.3              4.8
B                    48,505     15,367,168,813        4,497,993    1,518,016,883,187                  1.1              1.0
C                    81,298    260,158,056,079        1,605,267    1,169,124,499,951                  5.1             22.3
Total               190,080    346,657,095,921       10,589,670    4,182,029,063,563                  1.8              8.3

I(15) = (190,080/10,589,670)*100 = 1.8%

I(15)w = (346,657,095,921/4,182,029,063,563)*100 = 8.3%

QI n.16 – % of units for which data have been adjusted

Example 1

A. Statistical output: The BR Enterprise updating/identification.

B. Relevant units: all the units in the register


C. Relevant variables: The key variables NACE activity Code; state of activity;

number of employees.

D. Steps for calculation:

D1: calculate for each key variable the number of units that come from admin data;

D2: Identify for each key variable the number of units for which data have been

adjusted;

D3: For each variable, divide the number of units of D2 by the number of units of D1;

D4: Average the simple indexes weighting by the units of D1.

Let A be the NACE activity code;

Let B be the state of activity;

Let C be the number of employees.

Variables   (1)                  (2)                  (3)              (4)                    [(1)/(3)]*100        [(2)/(4)]*100
            Number of units      Turnover of (1)      Number of units  Turnover of (3)        % of units in        % weighted by
            with admin data for                       with admin data                         admin data for       turnover
            which data have                                                                   which data have
            been adjusted                                                                     been adjusted
A                    18,736     46,844,755,414        4,486,410    1,494,887,680,425                  0.4              3.1
B                    48,505     15,367,168,813        4,497,993    1,518,016,883,187                  1.1              1.0
C                    81,298    260,158,056,079        1,605,267    1,169,124,499,951                  5.1             22.3
Total               148,539    322,369,980,306       10,589,670    4,182,029,063,563                  1.4              7.7

I(16) = (148,539/10,589,670)*100 = 1.4%

I(16)w = (322,369,980,306/4,182,029,063,563)*100 = 7.7%

QI n.17 – % of imputed values (items) in the admin data

Example 1

A. Statistical output: The results of a sectoral survey.

B. Relevant units: all the units in a specific NACE activity code.

C. Relevant variables: The variables NACE activity code; number of employees; turnover.

D. Steps for calculation:

D1. For each source identify the variables which are used for the statistical output.

D2. For each variable in the source calculate the number of items in admin data.

D3. For each variable in the source identify all the units with items present in admin data which are afterwards imputed;

D4. For each variable in the source calculate the non-missing items in the statistical output.

D5. For each variable calculate the proportion of D3 to D2;


D6. Calculate the indicator for each source weighting the proportions with the items of D4.

D7. Calculate the general indicator weighting the indicators of D6 for the data.
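Steps D5 to D7 can be sketched as below, using the counts from the last row of this example's table (imputed items and total items per variable). The data layout and the function name `source_indicator` are assumptions made for illustration; the general indicator is weighted by the number of units in each source, as in the worked figures for this example.

```python
# Per source and variable: (number of imputed items, number of items).
sources = {
    "A": {"NACE activity code": (3, 10), "Number of employees": (1, 10)},
    "B": {"NACE activity code": (2, 8), "Turnover": (1, 8)},
}
# Number of units in each source, used as weights in D7.
units = {"A": 10, "B": 8}


def source_indicator(variables):
    """D6: imputed items over all items in the source, as a percentage."""
    imputed = sum(i for i, _ in variables.values())
    items = sum(n for _, n in variables.values())
    return 100.0 * imputed / items


# D5: proportion of imputed items for each variable in each source.
per_variable = {
    (s, v): 100.0 * i / n
    for s, variables in sources.items()
    for v, (i, n) in variables.items()
}

# D6: one indicator per source.
per_source = {s: source_indicator(v) for s, v in sources.items()}

# D7: general indicator, weighting each source by its number of units.
overall = sum(per_source[s] * units[s] for s in sources) / sum(units.values())

print(per_source, round(overall, 1))
```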

Units            Nace activity  Nace activity  Numbers of  Number of     Nace activity  Nace activity  Turnover   Turnover
                 code           code source A  employees   employees     code           code source B  Source B   Source B
                 source A       afterwards     Source A    source A      source B       afterwards                afterwards
                                imputed                    afterwards                   imputed                   imputed
                                                           imputed
X1               16231          -              2           -             Absent in the source          Absent in the source
X2               16232          -              0           -             16232          16231          80,305     -
X3               missing        16231          3           -             16231          -              127,118    -
X4               16231          -              0           -             Absent in the source          Absent in the source
X5               17110          16231          10          -             16231          -              335,550    -
X6               16231          -              15          -             missing        -              25,332     -
X7               16231          -              1           -             47112          16231          118,125    -
X8               missing        16231          0           5             16231          -              63,212     -
X9               16231          -              0           -             16231          -              missing    7550
X10              16291          -              0           -             16291          -              18,123     -
Number of units  10             3              10          1             8              2              8          1

Percentage of units in Source A with Nace activity codes imputed: (3/10)*100=30%

Percentage of units in Source A with number of employees imputed: (1/10)*100=10%

Percentage of units in Source B with Nace activity codes imputed: (2/8)*100=25%

Percentage of units in Source B with Turnover imputed: (1/8)*100=12.5%

I(17)Source A=[(3+1)/(10+10)]*100=20%

I(17)Source B=[(2+1)/(8+8)]*100=18.75%

I(17)Sources A and B=(20*10+18.75*8)/(10+8)=19.4%

QI n.18 – Delay to accessing/receiving data from admin source

Example 1

A. Statistical output: The BR Enterprise units updating/identification

B. Relevant units: Enterprises with 10+ employees (relevant for a specific survey or as base for the HG firms)

D. Steps for calculation:

D1. From the BR, take the population of units with 10+ employees

D2. Match each source with BR10+ by the common id code, obtaining the number of common units;

D3. Calculate for each source the number of months from the end of the reference period to the arrival of Admin data;


D4. Calculate the number of months from the end of the reference period to the dissemination date

D5. Calculate the indicator as follows

Let A be the Fiscal Register source;

Let B be the Archive of the Chamber of Commerce;

Let C be the Social Security source;

Let D be the Yellow Pages

Source   Common units in source   Months from the end of    Months from the end of    Turnover
         and relevant units       reference period to       reference period to
                                  receiving admin data      publication date
A                       186,605                          6                        15    2,081,725,436,855
B                       184,356                          8                        15    2,076,137,208,748
C                       186,549                          6                        15    2,081,236,000,674
D                       121,759                          0                        15    1,596,790,158,244

Numerator:

For each source divide the number of months from the end of the reference period to the arrival of Admin Data by the number of months from the end of reference period to publication date; then sum up the source indicators.

Source A: (6/15)*100=40%

Source B: (8/15)*100=53.3%

Source C: (6/15)*100=40%

Source D: (0/15)*100=0%

Denominator: the number of sources;

I(18)=num/den=(40%+53.3%+40%+0%)/4=33.3%

Weighted indicator: (weighted by the contribution of each source to the final result)

I(18)=(40%*186,605+53.3%*184,356+40%*186,549+0%*121,759)/(186,605+184,356+186,549+121,759)=36.4%

Weighting by turnover:

I(18)=(40%*2,081,725,436,855+53.3%*2,076,137,208,748+40%*2,081,236,000,674+0%*1,596,790,158,244)/(2,081,725,436,855+2,076,137,208,748+2,081,236,000,674+1,596,790,158,244)=35.37%
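The three variants of I(18) can be reproduced with a short Python sketch. The names `delay_pct` and `weighted_mean` are assumptions introduced here for illustration. The sketch keeps the exact ratio 8/15 for Source B instead of the rounded 53.3%, so the second decimal of the turnover-weighted result may differ slightly from the rounded figure in the text.

```python
# Per source: (common units, months to receipt, months to publication, turnover)
sources = {
    "A": (186_605, 6, 15, 2_081_725_436_855),
    "B": (184_356, 8, 15, 2_076_137_208_748),
    "C": (186_549, 6, 15, 2_081_236_000_674),
    "D": (121_759, 0, 15, 1_596_790_158_244),
}


def delay_pct(months_to_receipt, months_to_publication):
    """Delay of one source as a percentage of the time to publication."""
    return 100.0 * months_to_receipt / months_to_publication


def weighted_mean(values, weights):
    """Weighted arithmetic mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


delays = [delay_pct(r, p) for _, r, p, _ in sources.values()]

# Unweighted indicator: sum of the source indicators over the number of sources.
simple = sum(delays) / len(delays)

# Weighted by the number of common units in each source.
by_units = weighted_mean(delays, [u for u, _, _, _ in sources.values()])

# Weighted by the turnover covered by each source.
by_turnover = weighted_mean(delays, [t for _, _, _, t in sources.values()])

print(f"{simple:.1f}% {by_units:.1f}% {by_turnover:.1f}%")
```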


QI n.19 – Discontinuity in estimate when moving from a survey-based output to an admin-based output


A. Statistical output: A sectoral survey

B. Relevant units: Enterprises of construction section (NACE Rev2, Section F)

C. Relevant variables: Number of employees.

D. Steps for calculations:

D1. Compute the estimate of the variable(s) for the survey-based

output

D2. Compute the estimate of the variable(s) for the admin-based

output

D3. Calculate the indicator as follows:

Estimate of total number of employees for the enterprises of Section F using admin data: 1,158,542

Estimate of total number of employees for the enterprises of Section F using survey data: 1,152,487

$$I(19) = \frac{\text{Estimate from admin data} - \text{Estimate from survey}}{\text{Estimate from survey}} \times 100 = \frac{1{,}158{,}542 - 1{,}152{,}487}{1{,}152{,}487} \times 100 = 0.5\%$$

This indicates that the admin-based output will be 0.5% higher than the survey-based output.

QI n. 20 – % of consistent items for common variables in more than one source

Example 1

A. Statistical output: A sectoral survey

B. Relevant units: Units in the survey(s)

C. Relevant variables: ATECO, 5-digit, Italian version of NACE (var1) and legal status (var2)

E. Tolerance: ATECO (var1) equal at 4 digits; and legal status (var2) equal at 4 digits

D. Steps for calculation:

D1. Match each source with survey(s) by the common id code


D2. Assign a presence (1) / absence (0) indicator to the items of var1 and var2 in the survey; sum these indicators to obtain the denominator.

D3. Assign the value 1 to each item that is consistent between the survey and the source, and 0 otherwise (an item is considered consistent if the value in the source equals the value in the survey, within the tolerance); sum these values to obtain the numerator.

D4. Calculate the indicator as follows

$$I(20) = \frac{\#\ \text{consistent items of var1 (var2) in Survey and A}}{\#\ \text{items in survey}} \times 100$$
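The steps above can be sketched in Python using the values from this example's table. Several things here are assumptions made for illustration rather than rules stated in the source: codes are held as strings, a unit counts as one item when at least one survey reports the variable, and an item is treated as consistent when any survey value agrees with Source A on the first 4 digits. Unit X11 is taken to be present in a survey but absent from Source A.

```python
def consistent_share(records, digits=4):
    """I(20): consistent items over items present in the survey(s), times 100.

    Each record is (source_value, [survey_values]); None marks a missing value.
    """
    items = 0
    consistent = 0
    for source_value, survey_values in records:
        present = [v for v in survey_values if v is not None]
        if not present:
            continue  # D2: no item in the survey(s) for this unit
        items += 1
        # D3: consistent if the source agrees with a survey within the tolerance
        if source_value is not None and any(
            v[:digits] == source_value[:digits] for v in present
        ):
            consistent += 1
    return 100.0 * consistent / items


# Variable 1 (ATECO) for units X1..X11: Source A value, survey values.
ateco = [
    ("27200", ["27200", "27320"]), ("10840", []), ("68200", ["47112"]),
    ("68100", []), ("27330", ["28111"]), ("47112", ["47112", "47112"]),
    ("47113", []), ("68200", []), ("47114", []),
    ("47113", ["68200", "47113"]), (None, ["47114"]),
]
# Variable 2 (legal status) for units X1..X10: Source A value, SBS value.
legal = [
    ("1320", ["1320"]), ("1120", []), ("1440", ["1440"]), ("1120", []),
    ("1120", []), ("1330", ["1320"]), ("1320", []), ("1120", []),
    ("1210", []), ("1220", [None]),
]

i20_ateco = consistent_share(ateco)
i20_legal = consistent_share(legal)
print(f"ATECO: {i20_ateco:.1f}%, legal status: {i20_legal:.1f}%")
```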

         Variable 1: ATECO                                                          Variable 2: Legal status
Unit     Source A, ATECO,   SBS / GI -        # items in   Consistent   Source A,   SBS - legal   # items in   Consistent -
         national cl.       ATECO             surveys -    ATECO        legal       status        surveys -    Legal
         for NACE                             ATECO                     status                    legal        status
X1       27200              27200 / 27320     1            1            1320        1320          1            1
X2       10840                                0                         1120                      0
X3       68200              47112             1            0            1440        1440          1            1
X4       68100                                0                         1120                      0
X5       27330              28111             1            0            1120                      0
X6       47112              47112 / 47112     1            1            1330        1320          1            0
X7       47113                                0                         1320                      0
X8       68200                                0                         1120                      0
X9       47114                                0                         1210                      0
X10      47113              68200 / 47113     1            1            1220        missing       0
X11                         47114             1
Total                                         6            3                                      3            2

I(20), ATECO = (n. of consistent items in admin and survey data for ATECO / n. of items in survey for ATECO)*100 = (3/6)*100 = 50.0%

I(20), legal status = (n. of consistent items in admin and survey data for legal status / n. of items in survey for legal status)*100 = (2/3)*100 = 66.7%

QI n. 21 – % of relevant units in admin data which have to be adjusted to create statistical units

Example 1

A. Statistical output: A sectoral survey

B. Relevant units: Enterprises with commercial area greater than 400 m2

C. Relevant variables: Retail area of the enterprise.

D. Steps for calculation:


D1. Identify the units in the admin data which need to be adjusted in order to obtain the relevant statistical units

D2. Identify the relevant units in admin data that correspond to the statistical concepts.

D3. Divide #D1 by (#D1+#D2)

Let S1 be an admin database on Commerce (e.g. Nielsen data) in which the admin units do not correspond to the statistical concept (the enterprise), i.e. n admin units (n>=1) correspond to one enterprise.

                                                                          Units    m2
Number of enterprises with more than one local unit in the admin data:     400      257,428
Number of enterprises with only one local unit in the admin data:        3,523    1,755,450

$$I(21) = \frac{\text{Relevant units that have been adjusted to the statistical concepts}}{\text{Relevant units that have been adjusted} + \text{relevant units corresponding to the statistical units}} \times 100 = \frac{400}{400 + 3{,}523} \times 100 = 10.2\%$$

$$I(21)_w = \frac{257{,}428}{257{,}428 + 1{,}755{,}450} \times 100 = 12.8\%$$

QI n. 22 – Cost of converting admin data to statistical data

Example 1

A. Statistical output: A sectoral survey

B. Relevant units: Enterprises with commercial area greater than 400 m2

D. Steps for calculation:

D1. Identify the time in person hours necessary to convert the admin data in order to obtain statistical data as a function of admin source size and complexity in the treatment of admin data.

Let c1=number of records in admin data

Let c2=number of records processed per hour=complexity coefficient


I(22) = Cost of conversion in person hours = f(# of records in admin data, # of records processed per hour)

$$I(22) = \text{Number of person hours} = \frac{c_1}{c_2} = \frac{3{,}000}{83} \approx 36\ \text{H}$$

QI n. 23 – Efficiency gain in using admin data


Example 1

A. Statistical output: A survey on small enterprises

B. Relevant units: Corporations with 10 or fewer employees.

D. Steps for calculation:

D1. Quantify costs of survey-based statistic (total cost of the survey including questionnaires, mailing, re-contacting, staff etc.)

D2. Quantify cost of survey when based on admin data (Balance Sheets): cost of admin source acquisition; processing costs; staff etc.

Production cost of Survey-based statistic 36,150

Production cost of Admin-based statistic 22,500

$$I(23) = \frac{\text{Production cost of survey-based statistic} - \text{Production cost of admin-based statistic}}{\text{Production cost of survey-based statistic}} \times 100 = \frac{36{,}150 - 22{,}500}{36{,}150} \times 100 = 37.8\%$$
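As a quick check of the arithmetic, the efficiency gain can be computed as below. The function name `efficiency_gain` is an assumption introduced for this sketch; the two cost figures are those given in the example.

```python
def efficiency_gain(survey_cost, admin_cost):
    """I(23): relative saving of the admin-based statistic, as a percentage
    of the production cost of the survey-based statistic."""
    return 100.0 * (survey_cost - admin_cost) / survey_cost


gain = efficiency_gain(survey_cost=36_150, admin_cost=22_500)
print(f"I(23) = {gain:.1f}%")
```

A negative result would indicate that the admin-based statistic is more expensive to produce than the survey-based one.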