Linking Electronic Patient Records and Death Records: Challenges and Opportunities

Mike Hogarth, MD, FACP, FACMI
[email protected]
http://hogarth.ucdavis.edu
March 21, 2017



Overview
• Problem statement
• Review of death certificate data
• Overview of CA-EDRS
• Sources for “death records”
• Entity matching
  – Deterministic, probabilistic
• Matching modes – supervised vs. unsupervised
• Where to match – front office vs. back office
• Things to consider – the impact of “false” and where
• Matching tools

Problem
• The EHR records have a large number of falsely “alive” patients
  – Most patients pass away outside of the healthcare organization’s hospitals/clinics
  – There is no process to notify healthcare organizations that a patient previously seen has expired
  – Healthcare organizations do not have a systematic and reliable source for death information about their patients
    • Those who know of the expiration do not know the healthcare org cared for the patient
• Healthcare orgs are left to perform “matching”

Typical death certificate data

California EDRS (CA-EDRS)

                      CA-EDRS        CA-FDRS
Deaths/Year           ~250,000       2,500
Go Live               Jan 1, 2005    May 1, 2013
Users                 4,438          1,871
  Funeral Directors   753            271
  FH Staff            1,860          748
  Certifiers          0              0
  MF Staff            566            251
  ME/Coroners         491            251
Organizations         1,512          1,512
  Funeral Homes       1,177          1,177
  Hospitals           208            208
  ME/Coroner          58             58

Two national death files: CDC NDI and SSA DMF
• National Death Index (CDC)
  – Application form
  – 2-3 months review
  – Study subject matching against CDC’s national death records
  – $$
• “Death Master File” (SSA)
  – 1962-present (83 million)
  – 2011 – no longer includes ‘protected’ state records (removed 4.2M records)
  – Since 2011, has 1M fewer records per year (about 40% of all annual deaths are no longer in the DMF)
  – ~$60,000 license fee

Using the CDC NDI
• Requires you to submit your data to CDC
• Only approved for use in matching clinical trial/research data
• Can be delayed – up to 24mo before all deaths from a year are in the file
• Approval takes ~2 months
• Costs

Issues with Death Master File

• Has all deaths prior to 2011, but since then is missing significant numbers of deaths
• Updated annually
• Can be ~24mo behind
• No longer includes all deaths in the US annually
  – Only about 50% of deaths per year are in DMF today

California’s Fact of Death File

• As of 2016, California Dept. of Public Health (CDPH) has made a fact of death file available to healthcare organizations to match against their records

• Provided monthly
• Data elements for matching in the file
  – First name
  – Middle name
  – Last name
  – Gender
  – Date of birth

• Does not include SSN or cause of death

California Law Regarding Preservation and Release of Vital Records Data (Health and Safety Code – 102230)

California Research Files

• CDPH has a process for applying for identified death files with data beyond the fact of death file
  – Requires IRB review and Vital Statistics Advisory Committee (VSAC) approval
  – Used to be a “one time” file, but they will consider on-going distribution on a monthly basis

Matching records – entity matching and ‘record linkage’

• There are two ways to link/match records
  – Deterministic matching
  – Probabilistic matching

• Probabilistic matching allows one to assign weights to different data elements used in the matching and use a threshold rather than an “all or none” determination on matching
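The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the field names, the weights, the threshold, and the two sample records are made-up assumptions, not values from CA-EDRS or from any production matcher.

```python
# Hypothetical field names; weights and threshold are invented for illustration.
FIELDS = ("first", "middle", "last", "gender", "dob")
WEIGHTS = {"first": 3.0, "middle": 1.0, "last": 4.0, "gender": 0.5, "dob": 5.0}
THRESHOLD = 10.0


def deterministic_match(a: dict, b: dict) -> bool:
    """All-or-none: every field must agree exactly."""
    return all(a.get(f) == b.get(f) for f in FIELDS)


def weighted_score(a: dict, b: dict) -> float:
    """Add up the weights of the fields that are present and agree."""
    return sum(
        WEIGHTS[f]
        for f in FIELDS
        if a.get(f) is not None and a.get(f) == b.get(f)
    )


def threshold_match(a: dict, b: dict, threshold: float = THRESHOLD) -> bool:
    """Weighted matching: declare a match once the score reaches the threshold."""
    return weighted_score(a, b) >= threshold


# Synthetic records: the EHR record is missing the middle name.
ehr = {"first": "MARIA", "middle": None, "last": "LOPEZ",
       "gender": "F", "dob": "1931-04-02"}
edrs = {"first": "MARIA", "middle": "ELENA", "last": "LOPEZ",
        "gender": "F", "dob": "1931-04-02"}

print(deterministic_match(ehr, edrs))  # the missing middle name breaks the exact match
print(threshold_match(ehr, edrs))      # the weighted score still clears the threshold
```

The missing middle name is enough to defeat the deterministic rule, while the weighted score still clears the threshold on the strength of the other four fields.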

Why use Probabilistic Matching?

• Can handle missing data in a weighted fashion
• One can have “possible matches”, in addition to “matches” and “non-matches”
• Can adjust the thresholds for matches and possible matches
• Can be ‘trained’ to perform with less human “custom rule making”

Fellegi-Sunter: Probabilistic record linkage
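In the Fellegi-Sunter model each field has two probabilities: m, the chance the field agrees when the two records are a true match, and u, the chance it agrees when they are not. Agreement adds log2(m/u) to the score and disagreement adds log2((1-m)/(1-u)); the total is compared against an upper and a lower threshold. The sketch below follows that scheme. The m/u values and thresholds are assumptions chosen for illustration (in practice they are estimated from data), and scoring a missing field as 0 is one common convention rather than part of the model itself.

```python
import math

# Assumed (m, u) probabilities per field, for illustration only.
M_U = {
    "first": (0.95, 0.01),
    "middle": (0.80, 0.05),
    "last": (0.95, 0.005),
    "gender": (0.99, 0.50),
    "dob": (0.97, 0.0001),
}
UPPER, LOWER = 15.0, 5.0  # assumed thresholds


def field_weight(field, a, b):
    """Log-likelihood-ratio weight contributed by one field."""
    m, u = M_U[field]
    if a is None or b is None:
        return 0.0                        # missing value: no evidence either way
    if a == b:
        return math.log2(m / u)           # agreement weight (positive)
    return math.log2((1 - m) / (1 - u))   # disagreement weight (negative)


def score(rec_a, rec_b):
    """Total match weight across all fields."""
    return sum(field_weight(f, rec_a.get(f), rec_b.get(f)) for f in M_U)


def classify(total):
    """Three-way decision: match, possible match, or non-match."""
    if total >= UPPER:
        return "match"
    if total >= LOWER:
        return "possible match"
    return "non-match"
```

With these assumed numbers, agreement on date of birth is worth far more than agreement on gender, because a chance agreement on date of birth is much rarer.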

“WHERE” to match and its value/rationale

• Where to implement the match can vary dramatically in terms of tolerance to incorrect matching and its impact on the person and/or institution

• “Front office” (EHR)
  – To avoid scheduling deceased patients
  – To express condolences to the family
  – To prevent fraud
• “Back office” or “Data Warehouse” (Population Analytics, Quality metrics)
  – To improve accuracy of population/quality metrics
  – Incorrect quality reporting could have a significant impact

• Quality metrics must be reported to CMS under MACRA and will be counted toward the composite performance score (CPS)

Matching modes
• Fully automated matching without confirmation
  – A software matching system is employed and changes the vital status field automatically and without confirmation by a human
• Supervised matching
  – Software is used for matching but results are confirmed before the system flag is set
  – In other words, the software is used as a ‘screening’ to find record matches that should be further explored and confirmed
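The two modes differ only in what happens after the software reports a match. A minimal sketch, assuming a hypothetical `Patient` record and `MatchProcessor` class (neither name comes from any EHR product):

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    mrn: str                      # hypothetical medical record number
    vital_status: str = "alive"


@dataclass
class MatchProcessor:
    mode: str                     # "automated" or "supervised"
    review_queue: list = field(default_factory=list)

    def handle_match(self, patient: Patient, edrs_id: str) -> None:
        """Called when the matching software links a patient to a death record."""
        if self.mode == "automated":
            # Fully automated: set the flag with no human confirmation.
            patient.vital_status = "deceased"
        elif self.mode == "supervised":
            # Supervised: the match is only a screening result; queue it.
            self.review_queue.append((patient.mrn, edrs_id))
        else:
            raise ValueError(f"unknown mode: {self.mode!r}")

    def confirm(self, patient: Patient, edrs_id: str) -> None:
        """A human reviewer confirms a queued match; only now is the flag set."""
        self.review_queue.remove((patient.mrn, edrs_id))
        patient.vital_status = "deceased"
```

In supervised mode the vital status stays "alive" until `confirm` is called, which is where staff time is spent.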

Vital status - how to think about it
• If we consider the vital status flag as “truth” and “alive” as having the condition (of being alive), then:
  – True positive (TP) – when your vital status flag is “correct” as indicating the patient is “alive”
  – True negative (TN) – when your vital status flag is “correct” as indicating the patient is “deceased”
  – False positive (FP) – when your vital status flag has the patient “alive” but they are actually deceased
  – False negative (FN) – when your vital status flag has the patient “deceased” but they are actually alive
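These four definitions can be written down as a small function, with “alive” as the positive condition exactly as above. The function and variable names are illustrative, and the sample pairs are synthetic.

```python
from collections import Counter

VALID = {"alive", "deceased"}


def classify_flag(flag: str, actual: str) -> str:
    """Compare the system's vital status flag with the patient's true status.

    'alive' is treated as the positive condition, following the slide.
    """
    if flag not in VALID or actual not in VALID:
        raise ValueError("status must be 'alive' or 'deceased'")
    if flag == "alive":
        return "TP" if actual == "alive" else "FP"
    return "TN" if actual == "deceased" else "FN"


# Synthetic (flag, actual) pairs for illustration.
pairs = [
    ("alive", "alive"),
    ("alive", "deceased"),     # falsely "alive"
    ("alive", "deceased"),
    ("deceased", "deceased"),
]
counts = Counter(classify_flag(flag, actual) for flag, actual in pairs)
print(counts)
```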

Vital Status and “False”
• It is not possible to have 100% correct status in your system because you are doing matching at a later date with a source data set and matching approach that cannot guarantee 100% TP and TN
  – You will have to deal with some degree of incorrectness
  – So, it is inevitable to have FPs and FNs!
• Two possibilities
  – False Positive (FP): Patient is deceased, but your system shows them alive
  – False Negative (FN): Patient is alive, but your system shows them deceased

What is done today?
• Today, few if any healthcare systems have access to a file for matching against the EHR
• Healthcare systems “learn” of patient deaths because they “hear” about them from family or their providers
  – Similar to “supervised matching” in that the family notification invokes a process to confirm the status, if possible.
• Some patients pass in the hospital so the vital status is set by staff – the minority of the deceased in your databases

What do you have today in your systems?

• You have a significant rate of False Positives in the EHR and the Clinical Data Warehouse, which receives its vital status from the EHR
  – You have a low rate of False Negatives
• What is your FP rate (how incorrect are you)?
  – Depends on the age group
    • the older the patient age group, the higher the error (higher FP rate)

UC Health Patients Alive and >85

There were only 600,000 Californians over 85 in 2010!

1.8M non-deceased and over 85 across UC Health
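The two figures above allow a rough back-of-the-envelope bound. The sketch assumes, generously, that every Californian over 85 is a UC Health patient and is counted once. It ignores out-of-state patients, patients counted at more than one UC site, and population change since 2010, so the result is an order-of-magnitude illustration rather than a measured error rate.

```python
# Both inputs come from the slide; everything derived is a rough estimate.
flagged_alive_over_85 = 1_800_000   # non-deceased and over 85 across UC Health
ca_population_over_85 = 600_000     # Californians over 85 in 2010

# Even if every Californian over 85 were a UC Health patient,
# the remainder of the "alive" records could not all be correct.
min_falsely_alive = flagged_alive_over_85 - ca_population_over_85
min_fp_share = min_falsely_alive / flagged_alive_over_85

print(f"At least {min_falsely_alive:,} records, about {min_fp_share:.0%} of the total")
# prints: At least 1,200,000 records, about 67% of the total
```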

Things to consider
• You do NOT have to implement automated matching in both front office and back office
• You CAN start with automated unsupervised matching in the Clinical Data Warehouse where you have low effort, low risk, high value
  – Your quality metrics will be more correct
  – You can tolerate some “false negatives”, which would have no impact on the front office, or patient
• If you have enough staff, and a high fidelity matching process, you CAN consider implementing supervised matching in the front office (EHR)
  – You will still be VERY unlikely to have False Negatives from a poorly performing matching system

Most likely errors of an entity matching system

• “False Positive” is by far the most common error by a matching system
  – FP – it fails to detect a match that is there, so the record continues as “alive” when the person is deceased
• “False Negatives” are quite uncommon because of how rare it is to have two individuals with exactly the same name (first, middle, and last), gender, and date of birth
  – It is possible but not common
  – One can require ‘supervised confirmation’ if you have two records in your EHR/CDW that match an EDRS record.
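The last point, holding back any EDRS record that matches more than one EHR/CDW record, can be sketched as a simple grouping step. The function name and the identifier formats are hypothetical.

```python
from collections import defaultdict


def split_matches(pairs):
    """Separate unambiguous matches from those needing supervised confirmation.

    pairs: iterable of (edrs_id, ehr_mrn) tuples produced by a matcher.
    Returns (unambiguous, needs_review):
      unambiguous  maps edrs_id -> the single matching MRN
      needs_review maps edrs_id -> sorted list of all competing MRNs
    """
    by_edrs = defaultdict(set)
    for edrs_id, mrn in pairs:
        by_edrs[edrs_id].add(mrn)

    unambiguous = {e: next(iter(m)) for e, m in by_edrs.items() if len(m) == 1}
    needs_review = {e: sorted(m) for e, m in by_edrs.items() if len(m) > 1}
    return unambiguous, needs_review


# Synthetic example: death record E2 matches two different patients.
matched = [("E1", "MRN-100"), ("E2", "MRN-200"), ("E2", "MRN-201")]
auto_ok, review = split_matches(matched)
```

Only the entries in `auto_ok` would be safe to flag automatically; `review` holds the cases where one death record points at two patients.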

Where are we with the file today
• We have an existing agreement with CDPH for the “fact of death” file (2005 – present)
  – Available to all UC Health sites
  – The California fact of death file is available through a secure site hosted by UCSD – requires 2-factor RSA authentication in addition to login/pw
  – UCD requires an MOU to be signed with UCD before I can provide you the file (because you have to agree not to misuse the file; misuse is a misdemeanor per the CDPH agreement)
• We are applying for a file that includes SSN and cause of death – through the VSAC process

Getting Started

• You can start by performing automated unsupervised matching in the clinical data warehouse
  – A “false negative”, even if it happened, would have no impact on the EHR and/or patient
  – The “false positive” rate for “alive” is so high in the clinical data warehouse that even a poorly performing match (one that uses a small number of common data elements and no SSN) is likely to help you get “more correct” than you are today
    • Remember – 100% perfection is not realistic or possible

The DecEnt Matching Tool
• We have a simple command line Java tool we developed that uses Oyster, an open source implementation of probabilistic matching based on Fellegi-Sunter
• It loads EDRS data we furnish and performs matching on first, middle, last, gender, DOB

IBM Initiate

• A sophisticated matching system designed for healthcare and identifying duplicate records in different clinical databases (matching)

• Used in many healthcare systems already (over 60% of the market)

• Requires SSN

IBM Initiate - built to find patients in two clinical data sets -

IBM Initiate workflow

IBM Initiate – standardizing data before matching (comparison)
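Whatever tool is used, fields are normally standardized before they are compared, so that formatting differences do not count as disagreements. The sketch below is a generic illustration of that idea in Python; it is not how IBM Initiate does it, and the function names, accepted date formats, and gender codes are all assumptions.

```python
import re
import unicodedata
from datetime import datetime

DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")  # assumed input formats
GENDER_CODES = {"M": "M", "MALE": "M", "F": "F", "FEMALE": "F"}


def standardize_name(name):
    """Upper-case, strip accents and punctuation, collapse whitespace."""
    if not name or not name.strip():
        return None
    # Decompose accented letters, then drop anything that is not ASCII.
    text = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    text = text.replace("'", "").replace(".", "")   # O'Brien -> OBrien
    text = re.sub(r"[^A-Za-z]+", " ", text)         # hyphens, digits -> space
    return " ".join(text.upper().split()) or None


def standardize_dob(text):
    """Return the date as YYYY-MM-DD, or None if no known format fits."""
    if not text:
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def standardize_gender(text):
    """Map free-text gender to M, F, or U (unknown)."""
    if not text:
        return "U"
    return GENDER_CODES.get(text.strip().upper(), "U")
```

After this step, “ José  O'Brien-Smith ” and “JOSE OBRIEN SMITH” compare as equal, as do “04/02/1931” and “1931-04-02”.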

IBM Initiate “bucket functions” for efficient searching -- note phonetic and equivalence functions
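The general idea behind bucketing (often called blocking) is to compare only records that share a coarse key, instead of comparing every EHR record with every death record. The sketch below is a generic illustration, not IBM Initiate's implementation: it uses American Soundex as the phonetic function and a tiny nickname table as the equivalence function. The nickname table and the choice of bucket key are assumptions.

```python
from collections import defaultdict

# Soundex digit for each consonant; vowels, H, W and Y have no digit.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit

# Hypothetical equivalence table: nickname -> canonical first name.
NICKNAMES = {"BILL": "WILLIAM", "BOB": "ROBERT", "PEGGY": "MARGARET"}


def soundex(name: str) -> str:
    """American Soundex: first letter plus three digits, zero-padded."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue                  # H and W do not separate equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code                   # a vowel resets prev to ""
    return (result + "000")[:4]


def canonical_first(name: str) -> str:
    """Equivalence function: map a known nickname to its canonical form."""
    key = name.strip().upper()
    return NICKNAMES.get(key, key)


def bucket_key(rec: dict) -> tuple:
    """Records with the same key land in the same bucket."""
    return (soundex(rec["last"]), soundex(canonical_first(rec["first"])))


def candidate_pairs(ehr_records, edrs_records):
    """Yield only the (ehr, edrs) pairs that share a bucket."""
    buckets = defaultdict(list)
    for rec in edrs_records:
        buckets[bucket_key(rec)].append(rec)
    for rec in ehr_records:
        for other in buckets.get(bucket_key(rec), []):
            yield rec, other
```

With this key, “Bill Smyth” and “William Smith” fall into the same bucket and get compared in full, while “Maria Lopez” is never compared with either of them.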