
1

Record Linkage for Epidemiologic Research: Accessing Linked Data at the NCHS Research Data Center
Christine S. Cox
NCHS Data Users Conference
July 12, 2006

U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
Centers for Disease Control and Prevention
National Center for Health Statistics

2

What is Record Linkage?

[Diagram: NCHS Surveys and Administrative records combine into a Linked Data File]
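The idea on this slide can be illustrated with a minimal sketch of an exact-match (deterministic) linkage. This is only an illustration: the slides do not describe the matching procedure NCHS uses, and the `person_id` key, the field names, and the `link_records` helper are all hypothetical.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def link_records(survey: List[Record], admin: List[Record],
                 key: str = "person_id") -> List[Record]:
    """Attach administrative fields to survey records by an exact key match.

    Every survey record is kept; a 'linked' flag records whether a
    matching administrative record was found.
    """
    admin_by_key = {row[key]: row for row in admin}
    linked = []
    for row in survey:
        merged = dict(row)  # copy so the input survey record is not modified
        match = admin_by_key.get(row[key])
        merged["linked"] = match is not None
        if match is not None:
            merged.update({k: v for k, v in match.items() if k != key})
        linked.append(merged)
    return linked


# Hypothetical example data, not real survey or administrative records.
survey_rows = [
    {"person_id": 1, "bmi": 24.1},
    {"person_id": 2, "bmi": 31.7},
]
admin_rows = [{"person_id": 2, "vital_status": "deceased"}]

linked_file = link_records(survey_rows, admin_rows)
for record in linked_file:
    print(record)
```

Keeping unmatched survey records, with a flag rather than dropping them, lets an analyst see how many respondents could not be linked.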

3

NCHS Linked Data: Major Activities

Mortality: National Death Index
Health Care Utilization and Costs: Medicare Data
Retirement and Disability: Social Security Data

4

NCHS Linked Data: Mortality

Eligibility status
Assigned vital status
Date of death
Age at death
Underlying and multiple causes of death
Adjusted sample weights

5

Research Potential of Linked Mortality Data

Living and Dying in the USA: Behavioral, Health, and Social Differentials of Adult Mortality. RG Rogers, CB Nam, RA Hummer

A Semiparametric Analysis of the Body Mass Index’s Relationship to Mortality. JT Gronniger

The Income-Associated Burden of Disease in the United States. P Muennig, P Franks, H Jia, E Lubetkin, and MR Gold

Excess Deaths Associated with Underweight, Overweight, and Obesity. KM Flegal, BI Graubard, DF Williamson, MH Gail. JAMA. 2005;293:1861-1867.

6

NCHS Linked Data: Medicare

Medicare entitlement and health care utilization and payment data for 1991-2000
  Denominator file
  MEDPAR Inpatient hospitalization
  MEDPAR Skilled nursing facility
  Hospital outpatient
  Home Health Care
  Hospice
  Carrier (physician/supplier Part B file)
  Durable Medical Equipment

7

Research Potential of Linked Medicare Data

Examine risk factors for health conditions
Examine reliability of survey data
Examine survey report of disability with program participation eligibility criteria
Compare survey reported health conditions to claims records
Examine disparities in Medicare service utilization
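One common way to compare survey-reported conditions with claims records is a chance-corrected agreement statistic such as Cohen's kappa. The slides do not name a particular measure, so the sketch below is only one possible approach; the `cohen_kappa` helper and the example indicators are hypothetical.

```python
from typing import Sequence


def cohen_kappa(survey: Sequence[int], claims: Sequence[int]) -> float:
    """Cohen's kappa for two binary (0/1) indicators on the same people."""
    n = len(survey)
    if n == 0 or n != len(claims):
        raise ValueError("inputs must be non-empty and of equal length")
    # Share of people on whom the two sources agree.
    observed = sum(s == c for s, c in zip(survey, claims)) / n
    # Agreement expected by chance, from each source's marginal rate.
    p_survey = sum(survey) / n
    p_claims = sum(claims) / n
    expected = p_survey * p_claims + (1 - p_survey) * (1 - p_claims)
    if expected == 1:
        raise ValueError("kappa is undefined when both sources are constant")
    return (observed - expected) / (1 - expected)


# Hypothetical indicators: 1 = condition reported / present in claims.
survey_reports = [1, 1, 0, 0]
claims_records = [1, 0, 0, 0]
kappa = cohen_kappa(survey_reports, claims_records)
print(round(kappa, 3))
```

An analysis of survey data would normally also account for sample weights, which this sketch ignores.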

8

NCHS Linked Data: Retirement/Disability

Social Security data from Retirement, Survivors, and Disability Insurance (RSDI) and Supplemental Security Income (SSI) programs
  Master Beneficiary Record (MBR): 1962-2003
  Payment History Update System (PHUS): 1984-2003
  Supplemental Security Record (SSR): 1974-2003

9

Research Potential of Linked Social Security Data

Examine reliability of survey information for SSA program participation and benefits
Compare the health characteristics of those who take early (age 62) Social Security benefits to those who postpone benefits
Policy analysis using validated survey data
  Predicting the number of people who will become disabled based upon survey reported health conditions
  Determining whether current disability entitlement funding levels will be adequate as the population ages

10

Summary: NCHS Data Linkage

Linked sources: Mortality (NDI), Medicare (CMS), Retirement & Disability (SSA)

Survey            Linked sources (one X each)
NHIS 1986-2000    X
NHIS 1994-1998    X X X
LSOA II           X X X
NHANES I          X X X
NHANES II         X X
NHANES III        X X X
NNHS 1985         X X

11

www.cdc.gov/nchs/r&d/nchs_datalinkage/data_linkage_activities.htm

12

Why can’t you just give me the data?

NCHS does not “own” the linked administrative data
NCHS data confidentiality rules prohibit the release of potentially identifiable data, with special considerations concerning the protection of linked data
The RDC is the only option for access for now…

13

Overview: Data Access Procedures

Proposal Requirements
Access Methods
Helpful Tips
Where to get help?

14

Proposal Requirements

Proposal is evaluated by review committee
Review criteria
  Scientific and technical feasibility
  Availability of RDC resources
  Disclosure risk for restricted information
  The extent to which project is in accordance with the mission of NCHS
Special note: NCHS does not try to determine if proposals are duplicative

15

Proposal Requirements

Cover letter
Project title
Abstract (maximum 300 words summarizing project)
Full contact information
  Institutional affiliation
  Mail address, phone, email
Dates of proposed time at RDC (or indication of using remote access)
Source of funding for proposed research

16

Proposal Requirements

Study background
  Key study questions or hypotheses
  Public health benefits
Methods
  Analytic approach and statistical methods
  Statistical software requirements
Description of intended output for nondisclosure review, e.g.
  Table shells
  Model equations
  Test statistics that researcher plans to remove from RDC

17

Proposal Requirements

Explanation of why restricted data are needed, e.g. describe why publicly available data are insufficient
Summary of data requirements to be included in analytic file
  Identification of sample
  Identification of variables
Description of additional data to be supplied by researcher to be merged with NCHS or other data source (must clearly identify source of other data)

18

Proposal Requirements: Appendices

Current Curriculum Vitae or resume for each investigator
Data dictionary: complete listing of specific data requested and its source(s), indicating whether variables are public use or restricted access
  specific files and years
  sample
  variables (dependent, independent, matching/linking)
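A data dictionary of the kind described above could be drafted as structured entries and checked for completeness before submission. The sketch below is hypothetical throughout: the slides do not prescribe a format, and the field names, variable names, file names, and years are invented for illustration.

```python
from dataclasses import dataclass
from typing import List

# Categories taken from the slide wording; the exact labels are an assumption.
ALLOWED_ACCESS = {"public use", "restricted"}
ALLOWED_ROLES = {"dependent", "independent", "matching/linking"}


@dataclass(frozen=True)
class DictionaryEntry:
    variable: str      # hypothetical variable name
    source_file: str   # specific file the variable comes from
    years: str         # years of that file being requested
    access: str        # "public use" or "restricted"
    role: str          # "dependent", "independent", or "matching/linking"


def check_entries(entries: List[DictionaryEntry]) -> List[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for entry in entries:
        if not entry.source_file or not entry.years:
            problems.append(f"{entry.variable}: missing source file or years")
        if entry.access not in ALLOWED_ACCESS:
            problems.append(f"{entry.variable}: unknown access '{entry.access}'")
        if entry.role not in ALLOWED_ROLES:
            problems.append(f"{entry.variable}: unknown role '{entry.role}'")
    return problems


# Hypothetical entries; none of these names refer to real NCHS variables.
entries = [
    DictionaryEntry("example_bmi", "Survey file A", "2000-2001",
                    "public use", "independent"),
    DictionaryEntry("example_death_date", "Linked file B", "2000-2001",
                    "restricted", "dependent"),
    DictionaryEntry("example_link_id", "Linked file B", "2000-2001",
                    "restricted", "matching/linking"),
]

for problem in check_entries(entries):
    print(problem)
```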

19

Proposal Requirements: Appendices

For remote-access applicants
  Description of the computer and email system to be used to receive output
  Security provisions for the computer and email systems
For students
  Letter from department chair or academic advisor stating that student is working under the direction of the department

20

Overview: RDC Data Access Procedures

Proposal Requirements
Access Methods
Helpful Tips
Where to get help?

21

Access Methods

Once approved, three methods to access restricted data
  on-site: use local computing resources in the NCHS RDC, Hyattsville, MD
  remote: submit programs electronically to be executed in the RDC with output returned by email
  staff assisted: RDC staff provide on-site programming for off-site approved researchers
For all methods of access, restricted data files remain in RDC and output is inspected for disclosure violations

22

On-Site Access

RDC staff constructs necessary data files, including merged user data
Most statistical packages available with sufficient lead time
Output subject to disclosure review
Open only during normal working hours

23

Remote Access Method

RDC staff constructs necessary data files, including merged user data
SAS programs only (certain procedures and functions not allowed); additional software options expected
Both submitted programs and output undergo a programmed disclosure limitation review
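To give a sense of what an automated output check might look for, here is a minimal sketch that flags table cells with small counts. The slides do not state the RDC's actual disclosure rules, so the threshold of 5, the `flag_small_cells` helper, and the example table are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, str]  # (row label, column label)


def flag_small_cells(table: Dict[Cell, int], min_count: int = 5) -> List[Cell]:
    """Return the cells whose count falls below min_count, in sorted order.

    min_count=5 is an assumed threshold, not a documented RDC rule.
    """
    return sorted(cell for cell, count in table.items() if count < min_count)


# Hypothetical frequency table of the kind a researcher might request.
table = {
    ("age 65-74", "deceased"): 42,
    ("age 65-74", "alive"): 310,
    ("age 85+", "deceased"): 3,
    ("age 85+", "alive"): 17,
}

flagged = flag_small_cells(table)
for row_label, column_label in flagged:
    print(f"small cell: {row_label} / {column_label}")
```

Running a check like this before submitting output can help a researcher anticipate which tables might need regrouped categories.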

24

RDC Staff-assisted Programming Method

Subcontract with the RDC staff to perform programming tasks
Useful for those planning to use statistical software not available for the remote system and who are not able to travel to the RDC facility
Cost is estimated for each research project

25

Overview: RDC Data Access Procedures

Proposal Requirements
Access Methods
Helpful Tips
Where to get help?

26

RDC Helpful Tips

Be clear about research and data requirements (helps to determine feasibility of project)
  Clearly identify the sample to be included in the analytic file
  Provide data dictionaries for both public use data and restricted data
Provide examples of expected output

27

Overview: RDC Data Access Procedures

Proposal Requirements
Access Methods
Helpful Tips
Where to get help?

28

Visit the RDC at: www.cdc.gov/nchs/r&d/rdc.htm or email: [email protected]