Patient Matching EHR Ailments: Going from Placebo to Cure
Tuesday, March 1st 2016 Adam W. Culbertson, Innovator-in-Residence HHS, HIMSS
Keith J. Miller, Chief Scientist for Identity Intelligence, MITRE
Approved for Public Release; Distribution Unlimited. Case Number 15-4026 ©2016 The MITRE Corporation. ALL RIGHTS RESERVED.
Conflict of Interest
Adam W. Culbertson, MS, and Keith J. Miller, PhD, have no real or apparent conflicts of interest to report.
Agenda
• Background
– History of Matching
– What is Patient Matching
• Challenges in Matching
– Data Availability
– Data Quality
• Test Evaluation Framework
• Metrics for Algorithm Performance
• Creating Test Data Sets
• Explain why patient matching is a multi-step process requiring a strategy and not a “one size fits all” solution, the main steps in developing this strategy, and why determining quality of the data is key for effective patient matching
• Demonstrate how the framework helps address the multiple steps needed for an effective patient matching strategy, such as an understanding of the data and the tradeoffs involved in a good matching strategy, and why different matching strategies may be needed for different populations
• Demonstrate how an organization can gain a better understanding of their data through use of the “Data Variant Taxonomy” and data characterization tool suite without requiring ongoing hands-on access
• Describe why a gold standard data set is required for a good test framework, allowing for an “apples-to-apples” comparison of patient matchers and the issues involved in producing this data set using the data variant taxonomy
Learning Objectives
• Electronic: Patient matching has been identified as a key barrier to Interoperability in ONC’s nationwide Health IT Roadmap
• Prevention & Patient Education: Reduction in patient safety events caused by missing or incorrectly matched records
• Patient Engagement/Population Management: More complete records gathered across disparate health systems
• Savings: Missing information and reordered tests cost over $8 Billion annually. Improvement in patient matching can reduce this cost.
• Improvements in patient matching can reduce deaths, healthcare costs, and fraud caused by incorrectly matched data
How Patient Matching Benefits Health IT
Background
Significant Dates in (Patient) Matching
A Framework for Cross-Organizational Patient Identity Management
2015
Kho, Abel N., et al. Design and Implementation of a Privacy Preserving Electronic Health Record Linkage Tool
HIMSS Patient Identity Integrity
Grannis, et al. Privacy and Security Solutions for Interoperable Health Information Exchange
2009
Joffe et al. A Benchmark Comparison of Deterministic and Probabilistic Methods for Defining Manual Review Datasets in Duplicate Records Reconciliation
Dusetzina, Stacie B., et al. Linking Data for Health Services Research: A Framework and Instructional Guide
HIMSS hires Innovator In Residence (IIR) focused on Patient Matching
Audacious Inquiry and ONC Patient Identification and Matching Final Report
2014
HIMSS Patient Identity Integrity Toolkit, Patient Key Performance Indicators
Winkler Matching and Record Linkage
2011
Newcombe, Kennedy, & Axford Automatic Linkage of Vital Records
1959
Dunn Record Linkage
1946
Soundex US Patent 1261167
1918
Fellegi & Sunter A Theory of Record Linkage
1969
Grannis, et al. Analysis of Identifier Performance Using a Deterministic Linkage Algorithm
2002
Campbell, K et al. A Comparison of Link Plus, The Link King, and a “Basic” Deterministic Algorithm
RAND Health Report Identity Crisis: An Examination of the Costs and Benefits of a Unique Patient Identifier for the US Health Care System
2008
Patient Matching Definition
Patient matching: Comparing data from multiple sources to identify records that represent the same patient.
• In healthcare, this involves matching varied demographic fields from different health data stores to create a unified view of a patient.
Identity Matching / Identity Resolution
Identity analysis: link analysis, data mining
Identity resolution: merge/dedupe records
Identity matching: measure record similarity, search/retrieval
Attribute matching: compare name, DOB, COB, address, etc.
Identity data repository: structured and unstructured data sources
“Patient had an onset of diabetes, which is accompanied by an odd change in race, and the medication worked extremely well, and in subsequent visits no longer occurred.”
Wes Rishel: ONC HIT Privacy and Security Tiger Team Hearing, December 2010
Challenges in Matching
Challenges
• Lack of adoption of metrics
• Data availability
• Patient records are scattered across the health care system in various data silos, including laboratory systems, hospitals, and primary care provider EMRs.
• Differences in electronic health record vendors
– Data attributes collected
– Variation in output formats
– E.g., a date may appear as 12/01/1985, 12-01-1985, or 01-12-1985
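The date-format variation above can be made concrete with a small sketch. This is an illustration only, not a method from the presentation: the list of candidate layouts and the helper names `possible_dates` and `may_be_same_date` are assumptions. The point it shows is that a hyphenated date can be ambiguous between month-first and day-first order, so a comparison should consider every plausible reading.

```python
from datetime import date, datetime

# Candidate layouts to try. This list is an assumption made for the
# example; a real site would derive it from its own source systems.
FORMATS = ("%m/%d/%Y", "%m-%d-%Y", "%d-%m-%Y", "%Y-%m-%d")


def possible_dates(text):
    """Return every calendar date the string could denote under FORMATS."""
    found = set()
    for fmt in FORMATS:
        try:
            found.add(datetime.strptime(text.strip(), fmt).date())
        except ValueError:
            continue  # the string does not fit this layout
    return found


def may_be_same_date(a, b):
    """True when the two strings share at least one plausible reading."""
    return bool(possible_dates(a) & possible_dates(b))


# The slash form has a single reading under the assumed layouts.
print(possible_dates("12/01/1985"))
# The hyphen forms are ambiguous: December 1 or January 12.
print(sorted(possible_dates("12-01-1985")))
print(may_be_same_date("12/01/1985", "01-12-1985"))
```

Whether two ambiguous strings should count as agreeing is a policy decision for the matching strategy; the sketch only exposes the ambiguity.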
Availability of Data Attributes
% Availability of Attributes Over Region
[Bar chart: percent availability (0% to 100%) of each attribute at Site A, Site B, and Site C. Attributes on the x-axis: First Name, Middle Name, Last Name, Date of Birth, Birth Year, Gender, Social Security Number, Address (full), Street Address Line 1, City, State, Postal Code, Country Abbreviation, Country Full Name, Phone Number (any), Home Phone Number, Cell Phone Number, Work Phone Number, Email Address, Nickname, Insurance Number (free text), Drivers License Number, Race (OMB), Race (free text), Ethnicity, Language, Occupation, Income, Marital Status, Height (cm), Height (m), Height (in), Height (ft), Weight (lbs), Weight (kg), Blood Type.]
Data Quality
• Data quality is key
– Garbage in, garbage out
• Data entry errors compound data matching complexity
– Various algorithmic solutions address these, but none is perfect
• Types of errors:
– Missing or incomplete values
– Inaccurate data
– Fat-finger errors
– Information is out of date
– Transposed names
– Misspelled names
Data Quality
• Transposition errors
– Mary Sue vs Sue Marie
– Smitty, John vs John, Smitty
• Names change over time
– Marriage, divorce
• More than one way to spell a name
– Jon, John
• Data entry
– Fat-finger = typo, transposition, etc.
• Phonetic variation
Data Quality
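As one illustration of how spelling, phonetic, and transposition variants can be absorbed, the sketch below encodes each name token with Soundex (the 1918 patent listed in the timeline) and ignores token order. The implementation follows the commonly described American Soundex rules and is not taken from the presentation; `name_key` is a hypothetical helper introduced for this example.

```python
import re

# Consonant groups of American Soundex; vowels, Y, H and W carry no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Four-character Soundex code for a single name token."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


def name_key(full_name):
    """Order-insensitive key: sorted Soundex codes of the name tokens."""
    tokens = re.findall(r"[A-Za-z]+", full_name)
    return tuple(sorted(soundex(t) for t in tokens))


print(soundex("Jon"), soundex("John"))                       # spelling variants
print(name_key("Mary Sue") == name_key("Sue Marie"))         # transposition
print(name_key("Smitty, John") == name_key("John, Smitty"))  # field swap
```

A key this coarse trades precision for recall: it also groups names that merely sound alike, so it is better suited to selecting candidate pairs than to making final match decisions.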
Variant Taxonomy: Names
• Element Variation
– Data errors: OCR, typos, truncations
– Short forms: abbreviations, initials
– Spelling variations: alternate spellings, transliterations
– Particles: particle segmentation, particle omission
– Nicknames & diminutives
– Translation variants
– Non-word characters
– Presence/absence of TAQ
– Case variation
• Structural Variation
– Additions/deletions
– Fielding variations
– Permutations
– Placeholders
– Element segmentation
© 2014 The MITRE Corporation. All rights reserved.
Variant Taxonomy: Addresses
• Element Variation
– Data errors: OCR, typos, truncations, removals
– Short forms: abbreviations, initials, numerals, symbols
– Spelling variations: alternate spellings, transliterations, segmentation
– Translation variants
– Aliases
– Substitutions
– Element length
– Case variation
• Structural Variation
– Additions/deletions
– Fielding variations
– Permutations
– Placeholders
Variant Taxonomy: Dates (of birth)
• Element Variation
– Data errors: OCR, typos, truncations
– Particles: particle substitutions, particle omission
– Short forms: abbreviations, month numbers, dropping leading zeros, dropping leading year digits
• Structural Variation
– Additions/deletions
– Fielding variations
– Placeholders
– Element segmentation
Variant Taxonomy: IDs (SSN/other)
• Element Variation
– Data errors: OCR, typos, dropping leading zeros
– Particles: particle substitutions, particle omission
– Short forms
• Structural Variation
– Missing data/deletions
– Fielding variations
– Placeholders
– Element segmentation
Variant Taxonomy
Paper / Poster presented at AMIA 2013 Summit on Clinical Research Informatics
Test Evaluation Framework
• What is the question you are trying to answer?
• What data attributes do you have?
• What is the quality of these attributes?
• What is the matching method you want to use?
• How effective is your matching method?
Framework Applied to Patient Matching
Metrics for Algorithm Performance
• The ideal outcome of any matching exercise is correctly answering one question hundreds or thousands of times: are these two things the same thing?
– That means correctly identifying all the true positives and true negatives while minimizing the number of errors (false positives and false negatives)
Patient Matching Goal
• True Positive: The two records represent the same patient, and the algorithm links them
• True Negative: The two records don't represent the same patient, and the algorithm does not link them
Patient Matching Terminology
• False Negative: The algorithm fails to link two records that should be matched
• False Positive: The algorithm links two records that don’t actually match
Patient Matching Terminology
Evaluation

| EHR A | EHR B | Truth (Gold Standard) | Algorithm | Match Type | Outcome |
|---|---|---|---|---|---|
| Jonathan | Jonathan | Match | Match | True Positive | Good |
| Jonathan | Sally | Non-Match | Non-Match | True Negative | Good |
| Jonathan | Sally | Non-Match | Match | False Positive | Bad |
| Jonathan | Jon | Match | Non-Match | False Negative | Bad |
Evaluation

| | Truth: Positive | Truth: Negative |
|---|---|---|
| Algorithm: Positive | True Positive | False Positive |
| Algorithm: Negative | False Negative | True Negative |

Precision = True Positives / (True Positives + False Positives)
Recall = True Positives / (True Positives + False Negatives)
• Calculation
– Precision = True Positives / (True Positives + False Positives)
– Recall = True Positives / (True Positives + False Negatives)
• Tradeoffs between Precision and Recall
– F Measure
Evaluation
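The calculation above can be sketched in a few lines. The function names are hypothetical, and the F measure is written in its standard form, F_beta = (1 + beta²) · P · R / (beta² · P + R), which the slides name but do not spell out; beta = 1 weights precision and recall equally.

```python
from collections import Counter


def classify(truth_is_match, algorithm_says_match):
    """Label one record pair as TP, FP, FN or TN."""
    if algorithm_says_match:
        return "TP" if truth_is_match else "FP"
    return "FN" if truth_is_match else "TN"


def precision_recall_f(pairs, beta=1.0):
    """pairs: iterable of (truth_is_match, algorithm_says_match) booleans."""
    counts = Counter(classify(t, a) for t, a in pairs)
    tp, fp, fn = counts["TP"], counts["FP"], counts["FN"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta * beta
    denom = b2 * precision + recall
    f = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, f


# The four example rows from the evaluation table: (truth, algorithm).
rows = [(True, True), (False, False), (False, True), (True, False)]
print(precision_recall_f(rows))  # one TP, one FP, one FN
```

True negatives do not enter either formula, which is why precision and recall stay informative even when the vast majority of record pairs are non-matches.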
Creating Test Data Sets
Development of Test Data Set
Patient Database
Select Potential Matches (aka Adjudication Pool)
Compare Algorithm and Test Data Set
Human-Reviewed Match Decisions (Answer Key == Ground Truth Data Set)
Manual Reviewer 1
Manual Reviewer 2
Manual Reviewer 3
Development of Ground Truth Sets
• Identify data set that reflects real-world use case
• Develop potential duplicates
• Human adjudication review and classification – Match or Non-Match
• Estimate truth
– Pooled methods using multiple matching methods
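A minimal sketch of the pooling and adjudication steps follows, under stated assumptions. The adjudication pool is taken to be the union of candidate pairs proposed by several matchers, and reviewer decisions are combined by simple majority. The majority rule, the function names, and the record IDs are assumptions for illustration; the slides do not say how reviewer disagreements are resolved.

```python
from collections import Counter


def adjudication_pool(*matcher_outputs):
    """Union of candidate pairs proposed by any matcher (pair order ignored)."""
    pool = set()
    for pairs in matcher_outputs:
        pool.update(frozenset(pair) for pair in pairs)
    return pool


def majority_label(votes):
    """Return the label most reviewers chose, or None on a tie or no votes."""
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: send the pair back for further review
    return ranked[0][0]


# Hypothetical output of two matchers run over the same patient database.
matcher_a = [("r1", "r2"), ("r3", "r4")]
matcher_b = [("r2", "r1"), ("r5", "r6")]
pool = adjudication_pool(matcher_a, matcher_b)
print(len(pool))  # ("r1", "r2") and ("r2", "r1") are the same pair

# Three manual reviewers judge one pair from the pool.
print(majority_label(["Match", "Match", "Non-Match"]))
```

Pairs that no matcher proposes never reach the reviewers, so recall measured against a pooled answer key is an estimate rather than an absolute figure.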
Issues In Establishing Ground Truth
Examples
Names: B Smith, Bill Smythe, William Smythe, W Smith ??
DOB: 10/12/1972; October 11, 1972; December 10, 1972; 12/10/72; October 12, 1927
Activity: Patient Names
/li/
‘Li’ ‘Lee’
‘Leah’
‘Leigh’ /le.ɑ/
/li.ɑ/
/lei̯/
‘Lay’
‘Laye’
/lai̯/ ‘Lie’
‘Ligh’
Quoi?
Patient Names (Answers)
Jean Rimbaud (OK, or John….)
Leigh Cramer
Alice Slawson
I don’t know what your neighbors’ names are… … but did you get them right? … did you get the *whole* name right?
Identity Matching Adjudication Collector (IMAC) User Interface
One screen of the Adjudication Collector continually presents questions for the adjudicator to answer. Each question is first asked with no dates provided and then asked again with dates shown.
Issues In Establishing Ground Truth
• Different truth for different applications
– Credit check
– Security applications
– Customer support
– De-duplication of mailing lists
• What is the cost of missing a match?
– New record entered into database
– Irritated customer
– Lives are lost
• Criteria for truth must be carefully established and well-understood by annotators
– Question posed to annotators must be carefully phrased
Issues In Establishing Ground Truth
• How much time / expertise is available to judge (/discount) false positives?
• Needs to reflect real-world test use case
• Evaluation results are only as good as the truth on which they are based
– And only as appropriate as the evaluation is to the task that will be performed with the operational system
• Absolute recall is impossible to measure without a completely known test set (i.e., “You don’t know what you’re missing.”)
– Estimate with pooled results
Issues In Establishing Ground Truth
• First step in evaluation is to determine why the evaluation is being conducted
• Different truth for different applications
– Security applications vs. patient health record
• What is the cost of missing a match?
– Security: Lives are lost
– Health: Patient safety event, missed medications, allergies, etc., or death. But this is the situation today.
• What is the cost of wrongly identifying a match?
– Security: Passenger is inconvenienced / delayed
– Health: Patient safety event, wrong medication, treatment, liability, death
• Criteria for truth must be carefully established and well-understood
– E.g., the question posed to annotators must be carefully phrased
Summary for Healthcare Use Case
Next Steps
• Build ground truth dataset to enable evaluation of complementary approaches to patient matching
• Complete the attribute study looking at how variables change over time and region
• Encourage adoption and understanding of metrics-based decisions with respect to implementation of patient matching systems
Next Steps
Summary
• Patient matching is an old problem
• Need to understand data attributes available for matching
• Understand their quality
• Follow a systematic approach to evaluation
• Methodology to create ground truth data
• Metrics
– Precision
– Recall
• If you don’t measure it, you can’t improve it!
Questions?
Back-Up
What is the first step in an effective patient matching strategy?
A. Understanding your data.
B. Understanding the question you are trying to answer for patient matching in your organization.
C. Implementing a patient matcher software solution.
D. Improving data entry processes.
Question 1
Correct Answer:
B. Understand the question you are trying to answer. The approach you take will be dependent upon the question, as this will determine how you address tradeoffs that will be needed, for instance in timeliness of a response vs. accuracy.
Incorrect Answers:
A: Understanding your data is the next step. The first step is understanding what exactly you want to do.
C, D: Implementing a patient matching solution should happen only after understanding your use cases and your data, and at the same time as improving data entry processes.
Answer 1
What is the data variant taxonomy?
A. A taxonomy used to describe the way errors can happen in the demographic data.
B. A taxonomy for describing how patient health data varies between patients.
C. A taxonomy for describing the cultural variation in patient populations.
D. A taxonomy for describing errors in the collection of patient health data.
Question 2
![Page 56: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/56.jpg)
Answer 2
Correct Answer:
A. A taxonomy used to describe the way errors can happen in the demographic data. This variant taxonomy provides a unified way to describe errors that can happen in patient demographic data, for instance, truncation of dates.
Incorrect Answers:
B. Incorrect because the taxonomy is not related to health data.
C. Incorrect because the taxonomy describes the types of errors commonly seen in data, not the cultural make-up of the population that is being matched.
D. Incorrect because the taxonomy covers errors in demographic data, not health data.
![Page 57: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/57.jpg)
Question 3
True or False: You need to understand your data because (1) the approach you take varies depending upon the mix of cultures and naming conventions, (2) some matchers are better than others at dealing with different types of errors in the data, and (3) demographics such as the predominance of certain age groups can change your matching approach.
Answer: TRUE - all of the above are reasons to understand the data before undertaking patient matching.
![Page 58: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/58.jpg)
The Trade-off Between False Positive and False Negative Matches
• As the match score threshold is raised, the number of false positives decreases but the number of false negatives increases (this favors precision at the expense of recall).
• As the match score threshold is lowered, the number of false negatives decreases but the number of false positives increases (this favors recall at the expense of precision).
Source: Grannis, S. Introduction to Record Linkage. September 27, 2012
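The trade-off above can be made concrete with a small sketch. The scored pairs below are invented for illustration (they are not from the presentation or from Grannis), and the function names are hypothetical; the only assumption is that a matcher emits a numeric score per candidate pair and declares a match when the score meets the threshold.

```python
from typing import List, Tuple

# Hypothetical scored candidate pairs: (match_score, is_true_match).
# Scores and labels are made up purely to illustrate the trade-off.
SCORED_PAIRS: List[Tuple[float, bool]] = [
    (0.95, True), (0.90, True), (0.80, False), (0.75, True),
    (0.60, False), (0.55, True), (0.40, False), (0.30, False),
]


def confusion_counts(pairs: List[Tuple[float, bool]], threshold: float) -> Tuple[int, int, int]:
    """Return (true_pos, false_pos, false_neg) when pairs scoring at or
    above `threshold` are declared matches."""
    tp = sum(1 for score, is_match in pairs if score >= threshold and is_match)
    fp = sum(1 for score, is_match in pairs if score >= threshold and not is_match)
    fn = sum(1 for score, is_match in pairs if score < threshold and is_match)
    return tp, fp, fn


def precision_recall_at(pairs: List[Tuple[float, bool]], threshold: float) -> Tuple[float, float]:
    """Precision and recall at a given threshold (1.0 when undefined)."""
    tp, fp, fn = confusion_counts(pairs, threshold)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Sweep the threshold upward and watch false positives fall while
# false negatives rise.
for t in (0.5, 0.7, 0.85):
    tp, fp, fn = confusion_counts(SCORED_PAIRS, t)
    p, r = precision_recall_at(SCORED_PAIRS, t)
    print(f"threshold={t:.2f} TP={tp} FP={fp} FN={fn} precision={p:.2f} recall={r:.2f}")
```

With these made-up scores, raising the threshold from 0.5 to 0.85 moves precision from 4/6 to 2/2 while recall drops from 4/4 to 2/4.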
![Page 59: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/59.jpg)
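Basic IR Metrics: Precision and Recall
[Figure: a “Subject” record (MAHMOUD ABDUL HAMEED, 12/10/1945) is searched against a “Target List” containing MOREY APPLEBAUM, MOHAMMED ABDUL HAMID, MAHMOUD ABD EL HAMEED, MAKMUD ABDUL HAMID, and MAHMOUD ABD ALHAMID. Two overlapping regions show the ‘true’ answers (Z) and the records the system returns (Y); their overlap is the true positives (X). Returned records outside the overlap are false positives; ‘true’ answers the system misses are false negatives.]
Precision (P) = X/Y (2/4)
Recall (R) = X/Z (2/3)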
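A minimal sketch of the two ratios, assuming the figure's 2/4 and 2/3 are the precision and recall for the example: four records returned, three 'true' answers, two in common. The record identifiers `r1`..`r5` and the function name are hypothetical placeholders, since the transcript does not say which names on the target list are the true answers.

```python
from fractions import Fraction
from typing import Set, Tuple


def precision_recall_from_sets(returned: Set[str], truth: Set[str]) -> Tuple[Fraction, Fraction]:
    """Compute precision (X/Y) and recall (X/Z) from two sets of record IDs.

    X = records both returned and true (true positives)
    Y = all records the system returned
    Z = all 'true' answers
    """
    x = len(returned & truth)
    y = len(returned)
    z = len(truth)
    precision = Fraction(x, y) if y else Fraction(0)
    recall = Fraction(x, z) if z else Fraction(0)
    return precision, recall


# Hypothetical IDs: the system returns four records, three are truly the
# subject, and two of the returned records are among the true ones.
returned = {"r1", "r2", "r3", "r4"}
truth = {"r1", "r2", "r5"}

precision, recall = precision_recall_from_sets(returned, truth)
print("false positives:", sorted(returned - truth))  # returned but not true
print("false negatives:", sorted(truth - returned))  # true but not returned
print("precision:", precision, "recall:", recall)
```

`Fraction` reduces 2/4 to 1/2 when printed; the value is the same ratio shown on the slide.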
![Page 60: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/60.jpg)
Precision and Recall Inversely Related (1)
[Figure: the database, with the set of ‘true’ hits and the set of system returns.]
Recall Increased, but Precision Fell
The ‘Low Hanging Fruit’ phenomenon – more false hits will come in for every true one
![Page 61: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/61.jpg)
Precision and Recall Inversely Related (2)
[Figure: the database, with the set of ‘true’ hits and the set of system returns.]
Precision Increased, but Recall Fell
More selective matching
![Page 62: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/62.jpg)
What Makes a Good Evaluation?
• Objective – gives unbiased results
• Replicable – gives same results for same inputs
• Diagnostic – can give information about system improvement
• Cost-efficient – does not require extensive resources to repeat
• Understandable – results are meaningful in some way to appropriate people
• Well-documented – also contextualizes results in terms of purpose of the evaluation and task
![Page 63: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/63.jpg)
Problem Statement
• Lack of transparency in how patient matching algorithms perform
• Varied claims about algorithm performance
• Need for greater transparency in system performance
• Need for better education around patient matching and understanding the science
• Little work done to quantify match rates on real-world clinical data sets
• Need for reporting of match rates in terms of precision and recall
![Page 64: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/64.jpg)
IMAC – Admin Interface
An administrative screen is used to manage IMAC users and the questions asked of them. This includes setting the priority of questions and the number of judges to be used for each question.
![Page 65: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/65.jpg)
IMAC – Admin Interface (2)
Viewing and resolution of conflicting adjudications can also be performed from the administrative screen.
![Page 66: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/66.jpg)
Evaluation: Like IR Tasks
• Metrics
– F-measure - weighted harmonic mean of precision and recall
• F = (β² + 1) P R / (β² P + R), where
P = precision = correct system responses / all system responses
R = recall = correct system responses / all correct reference responses
β = beta factor – provides a means to control the importance of recall relative to precision
– Additional Measures
• False positives – items that are identified as correct responses that are not correct responses (as a share of all system responses, = 1 – Precision)
• False negatives – correct responses not identified (as a share of all correct reference responses, = 1 – Recall)
• Fallout = non-relevant responses / all non-relevant reference responses (related to, but not directly calculable from, precision / recall)
Issue:
• Annotation Standard for Development of Ground Truth
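The F-measure formula above can be checked numerically with a short sketch. The function name `f_measure` and the sample values (precision 1/2, recall 2/3) are illustrative assumptions, not part of the presentation; the formula itself is the one given on the slide.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    F = (beta^2 + 1) * P * R / (beta^2 * P + R)
    beta > 1 weights recall more heavily; beta < 1 weights precision more.
    Returns 0.0 when the denominator is zero (both P and R are zero).
    """
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        return 0.0
    return (beta_sq + 1) * precision * recall / denominator


# Illustrative values only: precision 1/2 and recall 2/3.
p, r = 0.5, 2 / 3
for beta in (0.5, 1.0, 2.0):
    print(f"beta={beta}: F={f_measure(p, r, beta):.4f}")
```

Because recall is the higher of the two values here, the score rises as β grows and recall is given more weight. With β = 1 the formula reduces to the familiar 2PR / (P + R).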
![Page 67: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/67.jpg)
Algorithm Tuning
• Large effects on performance due to algorithm tuning
• Tuning is need-specific
• Setting Cut-offs
– Upper Thresholds
– Feature Selection
– Feature Weighting
• Blocking
![Page 68: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/68.jpg)
Algorithm Performance
[Figure: diagram with the labels Algorithm, Algorithm Tuning, and Data Quality.]
![Page 69: Patient Matching EHR Ailments: Going from Placebo to Cure ...€¦ · • Patient records are scattered across the health care system in various data silos including; laboratory systems,](https://reader034.vdocuments.net/reader034/viewer/2022052021/603621d90683ec40db771eb5/html5/thumbnails/69.jpg)
Framework for Evaluation: EAGLES 7-Step Recipe/ISLE FEMTI*
1. Define purpose of evaluation – why doing the evaluation
2. Elaborate a task model – what tasks are to be performed with the data
3. Define top-level quality characteristics
4. Produce detailed system requirements
5. Define metrics to measure requirements
6. Define technique to measure metrics
7. Carry out and interpret evaluation
Originally developed as an evaluation framework for Machine Translation, but the authors note that it should also be usable as a generic evaluation framework.
*Acronyms:
EAGLES – European Advisory Group on Language Engineering Standards
ISLE – International Standards for Language Engineering
FEMTI – Framework for the Evaluation of Machine Translation in ISLE (http://www.issco.unige.ch/femti)