Impact of using fiscal data on the imputation strategy of the Unified Enterprise Survey of Statistics Canada Ryan Chepita, Yi Li, Jean-Sébastien Provençal, Chi Wai Yeung Statistics Canada ICES III, Montréal, June 2007

Page 1: Goals

Impact of using fiscal data on the imputation strategy of the

Unified Enterprise Survey of Statistics Canada

Ryan Chepita, Yi Li, Jean-Sébastien Provençal, Chi Wai Yeung

Statistics Canada

ICES III, Montréal, June 2007

Page 2: Goals

Goals

• To illustrate the challenges of applying a centralized E and I strategy to a broad range of industrial sectors

• To discuss the changes put in place due to the increasing use of fiscal data

• To discuss one approach used to quantify the overall E and I effect

Page 3: Goals

Outline

• Overview of the Unified Enterprise Survey (UES)

• Survey content

• Imputation strategy

• Use of fiscal data

• Challenges

• Diagnostic tool

• Conclusion

Page 4: Goals

Overview of the UES

• Annual business survey

• Initiated with 7 industries in 1997

• Presently integrates over 40 industries covering the major sectors of the economy

– 950K establishments in the population

– 127K establishments in the sample

Page 5: Goals

Overview of the UES

• Stratified sampling design

– Strata defined by NAICS, province, and size in terms of revenue

• Data collection

– Mail-out survey, with fax and phone follow-up

• Edit and imputation

• Estimation

– Horvitz-Thompson (H.-T.) estimator for totals and for provincial and industrial breakdowns
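As a rough illustration of the estimation step, a Horvitz-Thompson (H.-T.) total weights each sampled establishment by the inverse of its inclusion probability and sums the result, either overall or within a domain such as a province. The Python sketch below assumes hypothetical field names (`inclusion_prob`, `value`, `province`) and made-up figures; none of them come from the UES itself.

```python
from collections import defaultdict


def horvitz_thompson_totals(records, domain_key=None):
    """Sum value / inclusion probability, overall or per domain.

    The record layout is an assumption made for this example.
    """
    totals = defaultdict(float)
    for rec in records:
        weight = 1.0 / rec["inclusion_prob"]  # design weight
        domain = rec[domain_key] if domain_key else "all"
        totals[domain] += weight * rec["value"]
    return dict(totals)


# Hypothetical sample of three establishments.
sample = [
    {"province": "QC", "inclusion_prob": 0.5, "value": 100.0},
    {"province": "QC", "inclusion_prob": 0.25, "value": 40.0},
    {"province": "ON", "inclusion_prob": 1.0, "value": 300.0},
]

overall = horvitz_thompson_totals(sample)
by_province = horvitz_thompson_totals(sample, "province")
```

Passing a different `domain_key` (for instance an industry code) would give the industrial breakdown in the same way.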

Page 6: Goals

Survey content

• 2 or 3 key variables

– Total revenue and total expenses

– Similar concepts from one industry to another

• A lot of details (over 50 variables)

– Breakdowns of totals

– By province, type of expenses or source of revenue

– Industry specific

• Can be revised from year to year

Page 7: Goals

Survey content

• Example: manufacturing sector

VARIABLES

Sales of other goods and services produced

Total sales of goods purchased for resale

Amount received for custom work

Amount received for repair work

Stumpage sales

Total sales of goods and services produced

Sales of logs and wood residue

Total sales

Key

Details

Page 8: Goals

Imputation Strategy

• Categories of non-response

– Category 1: Partial response with at least 1 key variable reported

– Category 2: Total non-response with historical data

– Category 3: Total non-response without historical data

Page 9: Goals

Imputation Strategy

• Historical data for some records

– Records sampled the year before

– Same questionnaire

• Administrative data for all records

– Stratification information

– NAICS, province, size in terms of revenue

Page 10: Goals

Imputation Strategy

• Category 1 and category 2 non-response

• Missing key variables

– Historical trend

– Ratio using current survey information

• Missing details

– Historical distribution

– Distribution from all respondents within a homogeneous group

– Distribution from a single donor
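The methods listed on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how the trend or the ratio is computed, so the sketch assumes the trend is a single year-over-year factor and the ratio is the respondents' total of the missing key variable divided by their total of an auxiliary variable the unit did report. All function names, field names and figures are hypothetical.

```python
def impute_by_historical_trend(previous_value, trend):
    """Scale the unit's value from last year by a year-over-year trend factor."""
    return previous_value * trend


def impute_by_ratio(reported_aux, respondents):
    """Apply the respondents' ratio (key / auxiliary) to the unit's own auxiliary value."""
    ratio = sum(r["key"] for r in respondents) / sum(r["aux"] for r in respondents)
    return reported_aux * ratio


def distribution_from(details):
    """Turn a set of detail amounts into shares that sum to 1."""
    total = sum(details.values())
    return {name: amount / total for name, amount in details.items()}


def prorate(total, shares):
    """Split a total across the detail variables using the given shares."""
    return {name: total * share for name, share in shares.items()}


# Hypothetical figures, for illustration only.
trend_value = impute_by_historical_trend(previous_value=500.0, trend=1.04)

respondents = [{"key": 80.0, "aux": 100.0}, {"key": 40.0, "aux": 50.0}]
ratio_value = impute_by_ratio(reported_aux=200.0, respondents=respondents)

# Historical distribution: shares taken from the same unit's details last year.
last_year = {"custom_work": 30.0, "repair_work": 10.0, "goods_produced": 60.0}
details = prorate(total=200.0, shares=distribution_from(last_year))
```

The other two options for details reuse the same helpers with a different input to `distribution_from`: the summed details of all respondents in a homogeneous group, or the details of a single donor.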

Page 11: Goals

Imputation Strategy

• Category 3 non-response

• Donor imputation

• Closest neighbour based on administrative data
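A minimal Python sketch of closest-neighbour donor imputation follows. The slides only say that the neighbour is chosen using administrative data, so the details here are assumptions: donors are restricted to the same NAICS code and province, distance is the absolute difference in administrative revenue, and the donor's reported values are copied unchanged. Field names and figures are hypothetical.

```python
def nearest_donor(recipient, donors):
    """Return the respondent closest to the recipient on administrative revenue,
    restricted to the same NAICS code and province; None if there is no candidate."""
    candidates = [
        d for d in donors
        if d["naics"] == recipient["naics"] and d["province"] == recipient["province"]
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda d: abs(d["admin_revenue"] - recipient["admin_revenue"]),
    )


def impute_from_donor(recipient, donors):
    """Copy the chosen donor's reported values (assumed: no rescaling)."""
    donor = nearest_donor(recipient, donors)
    if donor is None:
        return None
    return dict(donor["reported"])


# Hypothetical respondents and one total non-respondent without historical data.
donors = [
    {"id": "A", "naics": "3211", "province": "QC", "admin_revenue": 900.0,
     "reported": {"total_revenue": 950.0, "total_expenses": 800.0}},
    {"id": "B", "naics": "3211", "province": "QC", "admin_revenue": 400.0,
     "reported": {"total_revenue": 420.0, "total_expenses": 390.0}},
    {"id": "C", "naics": "3211", "province": "ON", "admin_revenue": 510.0,
     "reported": {"total_revenue": 500.0, "total_expenses": 450.0}},
]
recipient = {"naics": "3211", "province": "QC", "admin_revenue": 500.0}

chosen = nearest_donor(recipient, donors)
imputed = impute_from_donor(recipient, donors)
```

Here donor C is closest on revenue but is excluded because it is in another province, so donor B is selected.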

Page 12: Goals

Use of fiscal data

• Use fiscal data as a proxy value for total non-response

• Use fiscal data as a proxy value for simple units randomly selected at the sampling stage

• Use fiscal data to update the initial size in terms of revenue

• The number of survey variables for which we use fiscal data as a proxy ranges from 7 to 25

Page 13: Goals

Challenges

• Conceptual differences

– Questionnaire content review

• Variables for which there is no proxy value in the fiscal database

– Modeling

• Industry-specific needs

– Tailored strategy

Page 14: Goals

Challenges

• Monitoring the effect

– Creation of a distinct path for records where we used fiscal data (category 4 of non-response)

– Creation of a diagnostic tool

Page 15: Goals

Diagnostic tool

• Identification section

– Industry, province, variable description

• Weighted sums, shares and percentages by category of non-response

Variable X           Resp.   Cat.2   Cat.3   Cat.4   Total
Sums                 30M     5M      5M      10M     50M
Share                60%     10%     10%     20%     100%
Percentages          20%     20%     25%     18%     20%
Variable Y (Total)   150M    25M     20M     55M     250M
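The figures on this slide are consistent with two simple calculations: the share is each category's weighted sum of Variable X divided by the overall sum of X, and the percentage is the weighted sum of X divided by the weighted sum of Variable Y in the same column. That reading is inferred from the numbers, not stated on the slide. The Python sketch below reproduces both rows; the function name and data layout are hypothetical.

```python
def diagnostic_rows(sums_x, sums_y):
    """Compute share (X / total X) and percentage (X / Y) per category.

    Both inputs map a non-response category to a weighted sum.
    """
    total_x = sum(sums_x.values())
    total_y = sum(sums_y.values())
    share = {cat: value / total_x for cat, value in sums_x.items()}
    share["Total"] = 1.0
    percentage = {cat: sums_x[cat] / sums_y[cat] for cat in sums_x}
    percentage["Total"] = total_x / total_y
    return share, percentage


# Weighted sums from the slide, in millions.
sums_x = {"Resp.": 30.0, "Cat.2": 5.0, "Cat.3": 5.0, "Cat.4": 10.0}
sums_y = {"Resp.": 150.0, "Cat.2": 25.0, "Cat.3": 20.0, "Cat.4": 55.0}

share, percentage = diagnostic_rows(sums_x, sums_y)

for cat in ["Resp.", "Cat.2", "Cat.3", "Cat.4", "Total"]:
    print(f"{cat:6s} share {share[cat]:.0%}  percentage {percentage[cat]:.0%}")
```

A category whose percentage departs noticeably from the respondents' percentage (Cat.3 at 25% against 20% here) is the kind of signal such a tool would surface for review.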

Page 16: Goals

Conclusion

• Centralized E and I strategy vs industry specific needs

• Diagnostic tool

• Modeling

Page 17: Goals

Thank you!

Questions?