1
Record Linkage for Epidemiologic Research: Accessing Linked Data at the NCHS Research Data Center
Christine S. Cox
NCHS Data Users Conference
July 12, 2006
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
Centers for Disease Control and Prevention
National Center for Health Statistics
2
What is Record Linkage?
[Diagram: NCHS Surveys and Administrative records combine into a Linked Data File]
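As a rough illustration of the idea on this slide, the sketch below combines survey records with administrative records into a linked file. It assumes exact matching on a single shared identifier; the slides do not describe the matching method NCHS uses, and every field name and value here (`person_id`, `vital_status`, and so on) is hypothetical.

```python
# Minimal sketch of record linkage by exact match on one identifier.
# All records, field names, and values are hypothetical illustrations.
survey_records = [
    {"person_id": "A1", "age": 67, "smoker": True},
    {"person_id": "B2", "age": 45, "smoker": False},
    {"person_id": "C3", "age": 72, "smoker": False},
]
admin_records = [
    {"person_id": "A1", "vital_status": "deceased"},
    {"person_id": "C3", "vital_status": "alive"},
]


def link_records(survey, admin, key="person_id"):
    """Attach administrative fields to each survey record that has a match."""
    admin_by_key = {rec[key]: rec for rec in admin}
    linked = []
    for rec in survey:
        match = admin_by_key.get(rec[key])
        merged = dict(rec)
        # Flag whether an administrative record was found for this respondent.
        merged["linked"] = match is not None
        if match is not None:
            merged.update({k: v for k, v in match.items() if k != key})
        linked.append(merged)
    return linked


linked_file = link_records(survey_records, admin_records)
for row in linked_file:
    print(row)
```

Every survey respondent stays in the linked file, with a `linked` flag marking those for whom an administrative record was found, so unmatched respondents are not silently dropped from the analysis.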
3
NCHS Linked Data: Major Activities
Mortality: National Death Index
Health Care Utilization and Costs: Medicare Data
Retirement and Disability: Social Security Data
4
NCHS Linked Data: Mortality
Eligibility status
Assigned vital status
Date of death
Age at death
Underlying and multiple causes of death
Adjusted sample weights
5
Research Potential of Linked Mortality Data
Living and Dying in the USA: Behavioral, Health, and Social Differentials of Adult Mortality
  RG Rogers, CB Nam, RA Hummer
A Semiparametric Analysis of the Body Mass Index’s Relationship to Mortality
  JT Gronniger
The Income-Associated Burden of Disease in the United States
  P Muennig, P Franks, H Jia, E Lubetkin, and MR Gold
Excess Deaths Associated with Underweight, Overweight, and Obesity
  KM Flegal, BI Graubard, DF Williamson, MH Gail. JAMA. 2005;293:1861-1867.
6
NCHS Linked Data: Medicare
Medicare entitlement and health care utilization and payment data for 1991-2000
  Denominator file
  MEDPAR Inpatient hospitalization
  MEDPAR Skilled nursing facility
  Hospital outpatient
  Home Health Care
  Hospice
  Carrier (physician/supplier Part B file)
  Durable Medical Equipment
7
Research Potential of Linked Medicare Data
Examine risk factors for health conditions
Examine reliability of survey data
  Examine survey report of disability with program participation eligibility criteria
  Compare survey reported health conditions to claims records
Examine disparities in Medicare service utilization
8
NCHS Linked Data: Retirement/Disability
Social Security data from Retirement, Survivors, and Disability Insurance (RSDI) and Supplemental Security Income (SSI) programs
  Master Beneficiary Record (MBR): 1962-2003
  Payment History Update System (PHUS): 1984-2003
  Supplemental Security Record (SSR): 1974-2003
9
Research Potential of Linked Social Security Data
Examine reliability of survey information for SSA program participation and benefits
Compare the health characteristics of those who take early (age 62) Social Security benefits to those who postpone benefits
Policy analysis using validated survey data
  Predicting the number of people who will become disabled based upon survey reported health conditions
  Determining whether current disability entitlement funding levels will be adequate as the population ages
10
Summary NCHS Data Linkage
Linked data sources (table columns): Retirement & Disability (SSA), Medicare (CMS), Mortality (NDI)
Survey: linkages marked
  NNHS 1985: XX
  NHANES III: XXX
  NHANES II: XX
  NHANES I: XXX
  LSOA II: XXX
  NHIS 1994-1998: XXX
  NHIS 1986-2000: X
12
Why can’t you just give me the data?
NCHS does not “own” the linked administrative data
NCHS data confidentiality rules prohibit the release of potentially identifiable data; special considerations apply to the protection of linked data
The RDC is the only option for access for now….
13
Overview: Data Access Procedures
Proposal Requirements
Access Methods
Helpful Tips
Where to get help?
14
Proposal Requirements
Proposal is evaluated by review committee
Review criteria
  Scientific and technical feasibility
  Availability of RDC resources
  Disclosure risk for restricted information
  The extent to which project is in accordance with the mission of NCHS
Special note: NCHS does not try to determine if proposals are duplicative
15
Proposal Requirements
Cover letter
Project title
Abstract (maximum 300 words summarizing project)
Full contact information
  Institutional affiliation
  Mail address, phone, email
Dates of proposed time at RDC (or indication of using remote access)
Source of funding for proposed research
16
Proposal Requirements
Study background
  Key study questions or hypotheses
  Public health benefits
Methods
  Analytic approach and statistical methods
  Statistical software requirements
Description of intended output for nondisclosure review, e.g.
  Table shells
  Model equations
  Test statistics that researcher plans to remove from RDC
17
Proposal Requirements
Explanation of why restricted data are needed, e.g. describe why publicly available data are insufficient
Summary of data requirements to be included in analytic file
  Identification of sample
  Identification of variables
Description of additional data to be supplied by researcher to be merged with NCHS or other data source (must clearly identify source of other data)
18
Proposal Requirements: Appendices
Current Curriculum Vitae or resume for each investigator
Data dictionary: complete listing of the specific data requested and its source(s), indicating whether variables are public use or restricted access
  specific files and years
  sample
  variables (dependent, independent, matching/linking)
19
Proposal Requirements: Appendices
For remote-access applicants
  Description of the computer and email system to be used to receive output
  Security provisions for the computer and email systems
For students
  Letter from department chair or academic advisor stating that student is working under the direction of the department
20
Overview: RDC Data Access Procedures
Proposal Requirements
Access Methods
Helpful Tips
Where to get help?
21
Access Methods
Once approved, three methods to access restricted data
  on-site: use local computing resources in the NCHS RDC, Hyattsville, MD
  remote: submit programs electronically to be executed in the RDC with output returned by email
  staff assisted: RDC staff provide on-site programming for off-site approved researchers
For all methods of access, restricted data files remain in RDC and output is inspected for disclosure violations
22
On-Site Access
RDC staff constructs necessary data files, including merged user data
Most statistical packages available with sufficient lead time
Output subject to disclosure review
Open only during normal working hours
23
Remote Access Method
RDC staff constructs necessary data files, including merged user data
SAS programs only (certain procedures and functions not allowed); additional software options expected
Both submitted programs and output undergo a programmed disclosure limitation review
24
RDC Staff-assisted Programming Method
Subcontract with the RDC staff to perform programming tasks
Useful for those planning to use statistical software not available for the remote system and who are not able to travel to the RDC facility
Cost is estimated for each research project
25
Overview: RDC Data Access Procedures
Proposal Requirements
Access Methods
Helpful Tips
Where to get help?
26
RDC Helpful Tips
Be clear about research and data requirements (helps to determine feasibility of project)
  Clearly identify the sample to be included in the analytic file
  Provide data dictionaries for both
    Public use data
    Restricted data
  Provide examples of expected output
27
Overview: RDC Data Access Procedures
Proposal Requirements
Access Methods
Helpful Tips
Where to get help?