Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Post on 22-May-2015

DESCRIPTION

Presented at Epic's Research Advisory Council, April 3, 2014, Verona, WI. See a novel approach to query expansion based on pre-existing structured information within the EHR. Presenters adopted over-representation analysis to find statistically significant associations among the clinical terms extracted from Clarity reports. The study population consisted of over 7,000 patients and their 12 million observations, including labs, medications, phenotypes, diseases, and procedures. See the detailed findings and discuss computational and terminology challenges.

TRANSCRIPT

Creating Dynamic Groupers Using Overrepresentation of Clinical Terms

Tomasz Adamusiak MD PhD

Froedtert & Medical College of Wisconsin

2

Conflict of interest disclosure

Tomasz Adamusiak has no real or apparent conflicts of interest to report

3

Learning objectives

• Recognize the value of structured clinical information

• Identify computational and terminology challenges in big data analytics

• Evaluate how this approach applies to different use cases

4

What is a grouper?

Lists of specific values derived from standard vocabularies used to define clinical concepts, e.g. patients with diabetes

• SNOMED CT concepts

• ICD-9/10 codes

• EDG terms

• CQM Value Sets

5

Diabetes: Eye Exam (CMS eMeasure: CMS131v2)

Value Set Name: Diabetes

Type: Grouping

Steward: National Committee for Quality Assurance

Program: CMS, MU2 EP Update 2013-06-14

Code | Description | Code System
… | … | …
190330002 | Diabetes mellitus, juvenile type, with hyperosmolar coma (disorder) | SNOMEDCT
250 | Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled | ICD9CM
E10.10 | Type 1 diabetes mellitus with ketoacidosis without coma | ICD10CM

6

Mining associations in EHR data

Glucohemoglobin measurement | Diabetes mellitus: Yes | Diabetes mellitus: No
Yes | 1509 | 5442
No | 881 | 99

7

Positive association

Background reference

Dynamic = expansion + association

8

CPT-4 83036

ICD10 E08-E13

Extract-Load-Transform

9

Transformation in ClinMiner https://clinminer.hmgc.mcw.edu user:epicdemo pass:epicdemo

10

This image by Tomasz Adamusiak is licensed under a CC BY 3.0 US license

ClinMiner is non-commercial prototype software

Pilot: test all possible diabetes associations

11

8k patients

12M observations

Labs (CPT-4/LOINC)

Medications (RxNorm)

Problems (ICD-9)

Procedures (CPT-4)

18 764 terms, 162 significant associations

Summarize, but normalize per patient: 1 + 1 = 1
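One reading of "1 + 1 = 1" is that repeated observations of the same term in one patient are counted once, so that the summary counts patients rather than events. The sketch below illustrates that reading only; the function name, the `(patient_id, term)` input shape, and the sample identifiers are assumptions, not part of ClinMiner.

```python
from collections import defaultdict

def patients_per_term(observations):
    """Count distinct patients per term.

    observations: iterable of (patient_id, term) pairs; the same pair may
    repeat many times (hypothetical input shape chosen for illustration).
    """
    patients_by_term = defaultdict(set)
    for patient_id, term in observations:
        # A set absorbs repeats, so two observations of one term in one
        # patient still contribute a single patient: 1 + 1 = 1.
        patients_by_term[term].add(patient_id)
    return {term: len(patients) for term, patients in patients_by_term.items()}

# Made-up sample data: patient p1 has the same lab recorded twice.
sample = [
    ("p1", "CPT-4:83036"),
    ("p1", "CPT-4:83036"),
    ("p2", "CPT-4:83036"),
    ("p2", "ICD-10:E11"),
]
counts = patients_per_term(sample)
```

Here `counts` reports two patients for the lab term, even though it was observed three times.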

12

Parent Concepts

ICD-10-CM

Relatively straightforward in ICD

13

Parent Concepts

ICD-10-CM

Caveat: flat hierarchy results in disconnected clinical contexts

Q: All tuberculosis codes

• 010-018.99 TUBERCULOSIS

• 137 Late effects of tuberculosis

• 647.3 Tuberculosis complicating pregnancy, childbirth, or the puerperium

14

Expansion has to take into account multiple inheritance in SNOMED CT
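With multiple inheritance a concept can have several parents, so expanding to "all ancestors" means walking every parent path rather than a single chain. The following is a minimal sketch under stated assumptions: the `parents` dictionary and its labels are a toy hierarchy invented for illustration and do not reproduce actual SNOMED CT relationships.

```python
def ancestors(concept, parents):
    """Return every ancestor of `concept` in a multi-parent hierarchy.

    parents: dict mapping a concept to a list of its direct parents
    (hypothetical structure; a real terminology would be loaded from files).
    """
    found = set()
    to_visit = [concept]
    while to_visit:
        current = to_visit.pop()
        for parent in parents.get(current, ()):
            if parent not in found:
                # Each ancestor is recorded once, even when it is
                # reachable through more than one parent path.
                found.add(parent)
                to_visit.append(parent)
    return found

# Toy hierarchy (illustrative labels only): one concept with two parents.
toy_parents = {
    "Diabetic retinopathy": ["Diabetes complication", "Retinal disorder"],
    "Diabetes complication": ["Diabetes mellitus"],
    "Retinal disorder": ["Eye disorder"],
    "Diabetes mellitus": ["Endocrine disorder"],
}
expanded = ancestors("Diabetic retinopathy", toy_parents)
```

Computing ancestors on demand like this trades time for memory; materializing the full closure for every concept is the memory-heavy alternative.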

15

SNOMED CT

Parent Concepts

Pieter Brueghel the Elder (1526/1530–1569) [Public domain], via Wikimedia Commons

In pursuit of a single language

16

Integrating terminologies with UMLS

Donald A.B. Lindberg, M.D.

Clinical

Terminologies

UMLS

17

UMLS is ideal for integration of heterogeneous clinical data

• Single entry point to MU terminologies

• Cross-walk between MU terms

• Terminology-agnostic

• Text-mining

18

UMLS

Exanthema C0015230

SNOMED CT

ICD-10-CM

UMLS establishes equivalence mappings across biomedical terminologies

SNOMED CT

rash NOS

ICD-10:R21

Cutaneous eruption

SCT:112625008

Eruption

SCT:1806006
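Reading the slide as mapping all three source codes to the single UMLS concept C0015230, the equivalence can be sketched as a lookup from (source, code) to concept identifier. The dictionary is hand-built from the slide for illustration; the function names are assumptions, and in practice such a table would be derived from the UMLS Metathesaurus release files rather than typed in.

```python
from collections import defaultdict

# (source, code) -> UMLS concept identifier, as read from the slide.
CODE_TO_CUI = {
    ("ICD-10", "R21"): "C0015230",      # rash NOS
    ("SCT", "112625008"): "C0015230",   # Cutaneous eruption
    ("SCT", "1806006"): "C0015230",     # Eruption
}

def group_by_cui(code_to_cui):
    """Invert the lookup: concept identifier -> set of equivalent source codes."""
    groups = defaultdict(set)
    for source_code, cui in code_to_cui.items():
        groups[cui].add(source_code)
    return dict(groups)

def same_concept(a, b, code_to_cui=CODE_TO_CUI):
    """True when both source codes are known and share one concept identifier."""
    cui_a = code_to_cui.get(a)
    return cui_a is not None and cui_a == code_to_cui.get(b)

groups = group_by_cui(CODE_TO_CUI)
```

Grouping by concept identifier is what makes the approach terminology-agnostic: downstream counting can work on one identifier per concept regardless of the coding system in the source data.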

Six degrees of terminological Kevin Bacon

Acute myocardial infarction

Myocardial ischemia

Vascular Diseases

Disorder of soft tissue

Collagen Diseases

Connective Tissue Diseases

Epidermal and dermal conditions

Skin and subcutaneous tissue disorders

Dermatologic disorders

21

Expansion limited to MU terminologies and by semantic type

22

Finding

Disease or Syndrome

Ignore

Open issue: cycles due to subtle differences in meaning
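The slides leave cycles as an open issue. One modest step is to detect them before expansion, so that offending relations can be reviewed. The sketch below uses a depth-first search over a parent map; the graph and labels are hypothetical, and this is not presented as the approach used in ClinMiner.

```python
def find_cycle(parents):
    """Return one cycle as a list of concepts (first == last), or None.

    parents: dict mapping a concept to a list of its direct parents
    (hypothetical structure used for illustration).
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(node, path):
        state[node] = IN_PROGRESS
        path.append(node)
        for parent in parents.get(node, ()):
            parent_state = state.get(parent, UNSEEN)
            if parent_state == IN_PROGRESS:
                # The parent is already on the current path: a cycle.
                return path[path.index(parent):] + [parent]
            if parent_state == UNSEEN:
                cycle = visit(parent, path)
                if cycle:
                    return cycle
        path.pop()
        state[node] = DONE
        return None

    for node in list(parents):
        if state.get(node, UNSEEN) == UNSEEN:
            cycle = visit(node, [])
            if cycle:
                return cycle
    return None

# Toy graphs with placeholder labels.
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
acyclic = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
```

The recursive form is fine for small examples; a very deep hierarchy would call for an explicit stack instead.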

23

Immune System

Endocrine System

Expansion in UMLS across MU sources

24

Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled

ICD-9

ICD-10

SNOMED CT

NDF-RT

Situation with explicit context

Metabolic diseases

roots:

Statistical methods for establishing over/under-representation

• Serial contingency tables

• Chi-squared test with Bonferroni correction

• RR estimate of effect size

• Test diabetes in all 18 764 concept pairs
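The steps in this list can be sketched with the Python standard library alone. This is a minimal sketch, not the ClinMiner implementation: the function names and the example counts are illustrative assumptions, the chi-squared statistic is computed without continuity correction, and the p-value for one degree of freedom uses the identity p = erfc(sqrt(chi2 / 2)). The cell convention assumed here is a = term and diabetes, b = term without diabetes, c = diabetes without term, d = neither.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic for a 2x2 table (no continuity correction)."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero; the test is undefined")
    return n * (a * d - b * c) ** 2 / denominator

def p_value_1df(chi2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def relative_risk(a, b, c, d):
    """Risk of diabetes with the term present divided by risk with it absent."""
    if c == 0:
        raise ValueError("no diabetic patients without the term; RR is undefined")
    return (a / (a + b)) / (c / (c + d))

def is_significant(a, b, c, d, n_tests, alpha=0.05):
    """Bonferroni correction: compare the p-value against alpha / n_tests."""
    return p_value_1df(chi2_2x2(a, b, c, d)) < alpha / n_tests

# Hypothetical counts for illustration only (not the figures from the slides).
table = (150, 50, 100, 700)
N_TESTS = 18764  # number of concept pairs named in the list above

statistic = chi2_2x2(*table)
effect = relative_risk(*table)
significant = is_significant(*table, n_tests=N_TESTS)
```

A relative risk above 1 indicates over-representation and below 1 under-representation; the Bonferroni threshold keeps the family-wise error rate at `alpha` across all tested pairs, at the cost of being conservative.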

25

EHR-based association rule mining

Glucohemoglobin measurement (C0202054) | Diabetes mellitus (C0011849): Yes | Diabetes mellitus (C0011849): No
Yes | 1509 | 5442
No | 881 | 99

26

Positive association

Background reference

Other positive associations

• C0785704 Blood glucose monitoring equipment

• C0935929 Antidiabetics

• C0304870 Insulin, Long-Acting

• C0770893 Metformin hydrochloride

• C0011882 Diabetic Neuropathies

• C0011880 Diabetic Ketoacidosis

• C0011884 Diabetic Retinopathy

27

Expansion generalization on class or system level

A non-representative control background can bias the findings

Diabetes inversely associated with

• C1314183 Special EEG tests

• C0242953 Barbiturate hypnotic

• C0064636 lamotrigine

• C1719410 Epilepsy and recurrent seizures

28

Open issue: reconciling lab orders with results

Clinical Laboratory

Hemoglobin A1c/Hemoglobin.total in Blood by HPLC

LOINC:17856-6

Hemoglobin; glycosylated (A1C)

CPT-4:83036

29

Challenges

• Availability of correctly and exhaustively coded data

• Expansion with multiple inheritance memory intensive

• Testing all possible (180M) combinations computationally expensive
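The slides do not say how the 180M figure was derived. One reading that lands close to it is the number of unordered pairs among the 18 764 terms reported for the pilot; the short check below works through that arithmetic under this assumption.

```python
import math

N_TERMS = 18764  # terms reported for the pilot

def unordered_pairs(n):
    """Number of distinct pairs that can be formed from n terms: n * (n - 1) / 2."""
    return math.comb(n, 2)

pair_count = unordered_pairs(N_TERMS)
print(pair_count)  # 176034466, roughly 180M when rounded
```

At that scale each pair needs its own contingency table and test, which is why the all-against-all analysis is expensive compared with testing one condition against every other term.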

30

What can we learn from other industries?

31

Thank You!

Tomasz Adamusiak MD PhD

Human and Molecular Genetics Center

Medical College of Wisconsin

tomasz@mcw.edu

@7omasz

For more information

• Next-generation phenotyping using the Unified Medical Language System (UMLS). Adamusiak T, Shimoyama N, Shimoyama M. JMIR Med Inform. doi:10.2196/medinform.3172

• EHR-based phenome wide association study in pancreatic cancer. Adamusiak T, Shimoyama M. AMIA Summits Transl Sci Proc. 2014 (in press)
