1
Business Register: Quality Practices
Eddie Salyers
301-763-2638
2
An Assessment of Current Quality Assurance Practices and Ongoing Work to Develop a Comprehensive Quality Plan for the U.S. Census Bureau Business Register
3
Business Register: Quality Practices
• Introduction
  – Database Redesign
  – Quality Assurance Team
• Business Register Overview
• Quality Assurance
  – Migration
  – Administrative Records
  – Census Bureau Data Collections
  – Recommendations
• Conclusion
4
BR Database Redesign
• Complete redesign
• Old Standard Statistical Establishment List (SSEL): VAX RDB
• New Business Register (BR): Oracle
• All software rewritten
• New BR production: Fall 2002
5
Quality Assurance Team
Mission: Assure that the quality of the new BR is, at a minimum, commensurate with that of the old SSEL it replaces, and establish a complete quality framework.
6
Quality Assurance Team
Definitions:
Quality - "The totality of features and characteristics of a product or service that bear on its ability to satisfy specified or implied needs." (ISO, 1986)
Reliability - "The ability of a system or component to perform its required functions under stated conditions for a specified period of time." [IEEE 90]
Integrity - Information in the system follows designated standards and is consistent both within an individual table and between associated tables.
7
Business Register Overview
• Primary Functions
  – Economic Census enumeration list
  – Survey sampling frames
  – Central storage of administrative data
  – Control file for data collection/processing
  – Data for statistical products
  – Data for economic research
8
Key Concepts and Definitions
The BR's Units
Business/Statistical
  – Establishment (standard statistical unit)
  – Enterprise (standard statistical unit)
  – Enterprise segment, e.g., alternate reporting unit (variable)
Administrative (mainly for IRS tax reporting)
  – EIN unit
  – SSN unit
9
Business Organization
Basic Types
• Single-establishment enterprise
  – An enterprise that operates just one establishment (i.e., at one physical location): a single unit, or SU
• Multi-establishment enterprise
  – An enterprise that operates two or more establishments (2-plus locations)
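The two basic types can be told apart by counting establishments per enterprise. The following is a minimal sketch only; the field name `enterprise_id` and the sample records are hypothetical and are not taken from the BR's table structures.

```python
from collections import Counter


def classify_enterprises(establishments):
    """Label each enterprise SU (one establishment) or MU (two or more).

    `establishments` is a list of dicts, one per establishment, each carrying
    a hypothetical `enterprise_id` field naming the owning enterprise.
    """
    counts = Counter(e["enterprise_id"] for e in establishments)
    return {ent: ("SU" if n == 1 else "MU") for ent, n in counts.items()}


# Illustrative data only.
sample_establishments = [
    {"establishment_id": "EST1", "enterprise_id": "E1"},
    {"establishment_id": "EST2", "enterprise_id": "E2"},
    {"establishment_id": "EST3", "enterprise_id": "E2"},
    {"establishment_id": "EST4", "enterprise_id": "E2"},
]
enterprise_types = classify_enterprises(sample_establishments)
```

Here `E1` operates one establishment and `E2` operates three, so the first is a single unit and the second a multiunit.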
10
Multiunit
[Diagram: organization chart of a multiunit. Boxes are labeled Enterprise (Parent), Subsidiary, EIN (Consolidated Income Tax), EIN (Payroll Only) (three boxes), and Establishment (multiple boxes).]
A more complex MU may have:
• Multiple EIN units
• One or more subsidiary enterprises
11
Complex Multiunits
The largest U.S. multiunits may have:
• Several thousand EINs
• More than 10,000 establishments
12
System
• Oracle Database
• Many Related Tables
• Interactive Web-Based Interface built with Oracle Forms & PL/SQL
• Interface used for research and updates
• Software for interactive and batch updates and edits
13
Migration
• Complete Redesign
  – New IDs
  – New table structures
  – All new software
  – Copy existing data: 2001
  – Load "new" data: 2002
14
Migration
• Quality Checks
  – Create SAS datasets from the old SSEL and the new BR for 2001 records
  – Record-to-record match of 2001 SSEL and 2001 BR
    • After accounting for differences caused by design, no significant differences were found
  – Comparison of 2001 BR to 2002 BR
    • Checks both the migration and the software used to load 2002 records
    • Year-to-year changes as expected
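A record-to-record match of this kind can be sketched as follows. This is an illustration under stated assumptions, not the team's SAS procedure: it assumes a shared key (here a hypothetical `legacy_id`) survives the migration even though the BR uses new IDs, and it lets the caller exclude fields that are expected to differ by design.

```python
def match_records(old_records, new_records, key, fields, expected_diff_fields=()):
    """Compare two record sets on a shared key.

    Returns keys found only in the old set, keys found only in the new set,
    and field-level mismatches for keys present in both. Fields listed in
    `expected_diff_fields` (differences caused by design) are not compared.
    """
    old_by_key = {r[key]: r for r in old_records}
    new_by_key = {r[key]: r for r in new_records}
    compare = [f for f in fields if f not in expected_diff_fields]

    mismatches = []
    for k in sorted(old_by_key.keys() & new_by_key.keys()):
        for f in compare:
            old_value, new_value = old_by_key[k].get(f), new_by_key[k].get(f)
            if old_value != new_value:
                mismatches.append((k, f, old_value, new_value))

    return {
        "only_old": sorted(old_by_key.keys() - new_by_key.keys()),
        "only_new": sorted(new_by_key.keys() - old_by_key.keys()),
        "mismatches": mismatches,
    }


# Illustrative data only; field names are hypothetical.
ssel_2001 = [
    {"legacy_id": "A1", "state": "MD", "employment": 12, "format_code": "old"},
    {"legacy_id": "A2", "state": "VA", "employment": 40, "format_code": "old"},
    {"legacy_id": "A3", "state": "DC", "employment": 7, "format_code": "old"},
]
br_2001 = [
    {"legacy_id": "A1", "state": "MD", "employment": 12, "format_code": "new"},
    {"legacy_id": "A2", "state": "VA", "employment": 41, "format_code": "new"},
]
match_result = match_records(
    ssel_2001, br_2001, key="legacy_id",
    fields=["state", "employment", "format_code"],
    expected_diff_fields=["format_code"],
)
```

In this toy data, `A3` is missing from the new set and `A2` differs on employment, while the `format_code` change is ignored because it is declared as a design difference.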
15
Administrative Records
• Internal Revenue Service:
  – Business Master File (BMF)
  – Payroll tax returns
  – Business income tax returns
• Bureau of Labor Statistics (BLS):
  – Description: Industrial classification assigned by State Employment Security Agencies as part of Covered Employment and Wages
• Social Security Administration:
  – Applications for new Employer Identification Number (EIN)
17
Administrative Records Quality Assurance
Current Practices:
• Stage 1:
  – Tabulate distributions of variables on incoming files and compare to expected values
  – Unchanged with redesign; works on inputs
• Stage 2:
  – Basic validity test: edits to assure each item has a valid form (valid states, data type, etc.)
  – Ratio edits: examine consistency of correlated data, e.g., payroll per employee
  – Data failing edits are replaced with imputed values and referred to an analyst for review
  – Done as part of load to BR database
  – Process is similar to the old one, but all software was rewritten for the new BR
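The Stage 2 edits can be illustrated with a short sketch. Everything specific here is an assumption made for the example: the field names, the state list, the payroll-per-employee bounds, and the imputation rule (employment times a fixed ratio) are hypothetical and do not describe the BR's actual edit parameters.

```python
VALID_STATES = {"MD", "VA", "DC"}        # illustrative subset, not a full list
RATIO_BOUNDS = (5_000.0, 200_000.0)      # hypothetical payroll-per-employee bounds
IMPUTE_RATIO = 40_000.0                  # hypothetical ratio used for imputation


def validity_failures(record):
    """Basic validity test: return the names of items with an invalid form."""
    failures = []
    if record.get("state") not in VALID_STATES:
        failures.append("state")
    for field in ("employment", "annual_payroll"):
        value = record.get(field)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value < 0:
            failures.append(field)
    return failures


def ratio_failure(record, bounds=RATIO_BOUNDS):
    """Ratio edit: is payroll per employee outside the expected range?"""
    employment, payroll = record["employment"], record["annual_payroll"]
    if employment == 0:
        return payroll != 0
    low, high = bounds
    return not (low <= payroll / employment <= high)


def edit_record(record):
    """Return (edited_record, referral_reasons); any reason means analyst review."""
    reasons = validity_failures(record)
    edited = dict(record)
    if reasons:
        return edited, reasons           # invalid items are left for the analyst
    if ratio_failure(record):
        edited["annual_payroll"] = record["employment"] * IMPUTE_RATIO
        reasons.append("payroll_per_employee")
    return edited, reasons


ok_record, ok_reasons = edit_record(
    {"state": "VA", "employment": 10, "annual_payroll": 400_000})
fixed_record, fixed_reasons = edit_record(
    {"state": "MD", "employment": 10, "annual_payroll": 5_000_000})
bad_record, bad_reasons = edit_record(
    {"state": "ZZ", "employment": 10, "annual_payroll": 400_000})
```

The second record passes the validity test but fails the ratio edit, so its payroll is replaced with an imputed value and the record is flagged for review.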
18
Administrative Records Quality Assurance
• Current Practices:
  – Strengths:
    • Identifies systematic file errors well
  – Weaknesses:
    • Lack of macro-level post-processing quality assurance
    • Communication
    • Identifying significant problems with large cases
19
Administrative Records Quality Assurance
• Recommendations
  – Perform a routine macro-level review using the SAS datasets that are created monthly from the BR
  – Create a centralized administrative record tracking system
  – Standardize and automate all current QA reports
  – Increase the ability to identify important companies with missing or inaccurate administrative records
  – Develop a systematic review of post-processing administrative record QA
  – Monitor the cost of current administrative record quality assurance activities
20
Census Bureau Data Collections
Company Organization Survey
Description: Register proving survey directed to selected multiunit enterprises
Content:
  – Ownership or control by a United States parent
  – Ownership or control by a foreign parent
  – Inventory of establishments, verifying or collecting the following for each:
    • Primary and secondary name
    • Physical location
    • EIN used for payroll tax reporting
    • SIC
    • Employment for pay period including March 12
    • First quarter and annual payroll
    • Year-end operating status
21
Census Bureau Data Collections
Economic Census
Description: Enumeration of establishments in covered industries
Content for each establishment:
  – Ownership or control by a parent enterprise
  – Locations of operation
  – Primary and secondary name
  – Physical location address
  – EIN used for payroll tax reporting
  – SIC and type of operation
  – Employment for pay period including March 12
  – First quarter and annual payroll
  – Dollar volume of business (value of shipments, sales, receipts, revenue)
  – Year-end operating status
  – Value of products and services by category (selectively)
  – Other industry-specific content
22
Census Bureau Data Collections Quality Assurance
Current Practices:
• Data Entry
  – Independent verification of samples
  – Data are re-keyed and differences adjudicated
  – Lots accepted or rejected based on error rates
• Batch Update Operations
  – Basic validity test: edits to assure each item has a valid form (valid states, data type, etc.)
  – Ratio edits: examine consistency of correlated data, e.g., payroll per employee
  – Data failing edits are replaced with imputed values and referred to an analyst for review
  – Done as part of load to BR database
  – Process is similar to the old one, but all software was rewritten for the new BR
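The data entry check (sample a lot, re-key it independently, accept or reject on the error rate) can be sketched as below. This is a simplified illustration: the sample size, the acceptance threshold, and the record layout are hypothetical, and the adjudication step is skipped by counting every difference as an error in the original keying.

```python
import random


def verify_lot(original, rekeyed, sample_size, max_error_rate, seed=0):
    """Accept or reject a keyed lot from an independently re-keyed sample.

    `original` and `rekeyed` map a record ID to the keyed values. Simplifying
    assumption: any difference counts as an error in the original keying.
    """
    ids = sorted(original)
    if not ids:
        raise ValueError("lot is empty")
    rng = random.Random(seed)                      # seeded for repeatability
    sample_ids = rng.sample(ids, min(sample_size, len(ids)))
    errors = sorted(i for i in sample_ids if original[i] != rekeyed[i])
    error_rate = len(errors) / len(sample_ids)
    return {
        "sampled": len(sample_ids),
        "errors": errors,
        "error_rate": error_rate,
        "accepted": error_rate <= max_error_rate,
    }


# Illustrative lot of ten keyed records; record 3 was keyed differently.
keyed_lot = {i: {"employment": 10 + i} for i in range(10)}
rekeyed_lot = {i: dict(v) for i, v in keyed_lot.items()}
rekeyed_lot[3]["employment"] = 99

strict_result = verify_lot(keyed_lot, rekeyed_lot, sample_size=10, max_error_rate=0.05)
lenient_result = verify_lot(keyed_lot, rekeyed_lot, sample_size=10, max_error_rate=0.10)
```

With the whole lot sampled, one difference in ten records gives a 10% error rate, so the lot is rejected under the 5% threshold and accepted under the 10% threshold.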
23
Census Bureau Data Collections Quality Assurance
Current Practices:
• Clerical Operations
  – A second person who is qualified as a verifier selects and inspects a sample of the referrals from each completed work unit (dependent verification)
  – Rejected work units are subjected to 100% re-inspection
  – Note: the "old" SSEL had functionality to hold corrections until they passed inspection
Additional QA Team Recommendations
• Improve error tracking
• Improve imputation for missing employment and payroll values
• Evaluate ORACLE DQI (Data Quality Inspector) as a way to identify problems
• Expand use of SAS datasets built from the BR to assess quality
• Review and document user needs and how the BR meets those needs
• Compare to the Bureau of Labor Statistics (BLS) Business Establishment List (BEL)