HxR 2016: Data Insights: Mining, Modeling, and Visualizations - Niraj Katwala
TRANSCRIPT
© 2016 Talix. Confidential and Proprietary.
Data Insight: Mining, Modeling and Visualizations
Niraj Katwala, EVP & CTO, Talix
April 5, 2016
About Talix
– PEOPLE: Sixty passionate Medical Professionals, Informaticists, Software Engineers, and Product Leaders
– PRODUCTS:
  Coding InSight: Improves risk management, leading to better patient outcomes and optimized reimbursement for healthcare providers and payers
  HealthSearch: Powers search and discovery for healthcare professionals and patients
– ACCOMPLISHMENTS: Serving major Healthcare brands through products powered by our HealthData Engine
HealthData Engine: Unlocking the Value of Unstructured Data
INPUTS
– Unstructured Patient Data (e.g., Treatment Authorization Request)
– Structured Patient Data (e.g., EMR Data, Formulary Lists of Drugs)
HealthData Engine: Clinical Rules, NLP, HealthTaxonomy
OUTPUTS
– Patient data and other clinical data extracted, normalized, categorized, and enriched
WHAT IT ENABLES
– Pharmaceuticals: Pharma insights on clinical trial, prescription and other data
– Publishers: Content search across large bodies of content
– Providers: Group patients into cohorts based on key attributes / factors
– Health Plans: Match treatment authorization requests to clinical policy bulletins
HealthData Engine: Leveraging Taxonomy, NLP, Clinical Rules
Patient Data Acquisition Module (1 – 6)
– Client Clinical Data Systems (EMR, Claims, Rx Systems, Laboratory, Radiology, Social Networks)
– Structured Client Patient Data Capture Methods/Adaptors
– Unstructured Client Patient Data Capture Methods/Adaptors, Including OCR/Speech Recognition
– Structured Client Patient Data
– Unstructured Client Patient Data in Text Format
– Talix HealthTaxonomy
NLP Engine for Patient Data Enrichment and Normalization (6 – 12)
– Talix NLP Engine Master
– Talix HealthTaxonomy
– Non-Healthcare and Use Case-Specific Knowledge bases
– Healthcare and General Annotators
– Use Case and Document Specific Parsers
– Final Structured Client Patient Data Records Enriched and Normalized (LPR)
Clinical Rules Assembly Module (12 – 21)
– Clinical Rules (Diagnostic and Treatment Rules) Capture Mechanisms:
  Automated Machine Learning from Published Literature, Claims, EMR Data
  Clinical Guidelines Capture through Clinical Guidelines Editor
  Manual Entry by Smart Tags Editor
– Talix HealthTaxonomy
– Raw Rules Database
– Rules Enrichment Process
– Enriched Rules Database
– Rules Codifier using HL-DSL
– Encoded Rules in HL-DSL with Confidence Scores
Clinical Inference Engine using Clinical Rules on Patient Data (22 – 25)
– Configuration Database for Selecting Encoded Rules By Use Case (Coding, UM, F/W/A, Re-Admission)
– Patient Data Inference Engine Master
– Loop over Individual Patient Data Records
– Patient Data Inference Visualizer (Outcomes with Confidence Score)
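The inference stage in this diagram (select encoded rules by use case, loop over individual patient records, report outcomes with a confidence score) can be sketched roughly as follows. This is only an illustration: the slide does not show HL-DSL syntax, so rules are modeled here as plain Python predicates, and the names `EncodedRule`, `Outcome`, `run_inference`, the BMI threshold, and the confidence values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EncodedRule:
    name: str
    use_case: str                    # e.g. "coding", "um", "re-admission"
    confidence: float                # hypothetical confidence score of the rule
    matches: Callable[[Dict], bool]  # predicate evaluated on one patient record


@dataclass
class Outcome:
    patient_id: str
    rule: str
    confidence: float


def run_inference(records: List[Dict], rules: List[EncodedRule], use_case: str) -> List[Outcome]:
    """Select the rules for one use case, then loop over individual patient records."""
    selected = [rule for rule in rules if rule.use_case == use_case]
    outcomes = []
    for record in records:
        for rule in selected:
            if rule.matches(record):
                outcomes.append(Outcome(record["patient_id"], rule.name, rule.confidence))
    return outcomes


# Illustrative rules and records (invented for this sketch).
rules = [
    EncodedRule("possible_obesity_code", "coding", 0.8, lambda rec: rec.get("bmi", 0) >= 30),
    EncodedRule("review_readmission_risk", "re-admission", 0.6, lambda rec: rec.get("admissions", 0) > 1),
]
records = [
    {"patient_id": "p1", "bmi": 38.86, "admissions": 0},
    {"patient_id": "p2", "bmi": 24.0, "admissions": 2},
]

outcomes = run_inference(records, rules, "coding")
for outcome in outcomes:
    print(outcome)
```

Keeping rule selection separate from the record loop mirrors the diagram's split between the configuration database and the inference engine master.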
HealthData Engine: Leveraging Taxonomy, NLP, Clinical Rules
Back-End Terminology Integration Engine
1. Standard terminologies: CPT-4, LOINC, OMIM, NDC, ICD-10, NCI-T, HCPCS, Gene Ont., ICD-9, SNOMED
2. Medical Informatics Team (Human): (1) Semantic Relationship Build-out, (2) Consumer-Friendly Names, (3) Clinical Quality Control, (4) Acronym and Abbreviations, (5) Stemming Correction Lists, (6) Homonyms and Negation, (7) Term and Query Specific Rules
3. Data Mining from Published Literature (Suggestions Only)
4. Organization-Specific Terminologies
5. Terminology Editor
6. Talix Health Taxonomy
7. Clinical Rules Database
8. Talix NLP Engine (Process Steps in next diagram)
9. Unstructured Content and Data
10. Structured and Enriched Content and Data
11. Organization-Specific Clinical Guidelines
12. Clinical Guidelines Editor (Machine Readable Form)
13. Clinical Rules Database
Health Taxonomy: A Robust Knowledge Base
INDUSTRY STANDARDS CALIBRATION
Most precise and comprehensive healthcare taxonomy for all healthcare segments, mapping multiple industry standard terminologies including
• ICD-9, ICD-10
• MESH, NCI THESAURUS, GENE ONTOLOGY, OMIM
• SNOMED, LOINC, HCPCS, DRG
• RXNORM, NDC
1+ MILLION CONCEPTS BASED ARCHITECTURE
Concepts with many attributes including
• Synonyms, Abbreviations, Acronyms
• Misspellings
• Homonym Identification
• Stemming Correction Lists
2+ MILLION SEMANTIC RELATIONSHIPS WITH RANKINGS
Unique in the industry and include
• Disease to Drugs
• Disease to Symptoms
• Disease to Treatments
• Disease to Diagnostic Procedures
• And many others with Ranking strength
HISTORY OF THE TAXONOMY
Highly iterative effort over 15 years in development by a dedicated team of Medical Professionals and Data Scientists, with tens of millions of R&D dollars in multiple implementations & domains. 3rd party verifications.
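One way to picture a taxonomy entry of this kind is a concept record that carries synonyms, cross-terminology codes, and ranked semantic relationships. The sketch below is an assumption about shape only, not the Talix data model: the `Concept` class, its field names, and the ranking strengths are invented for illustration, and the drug names are borrowed from the patient example later in the deck.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Concept:
    name: str
    synonyms: List[str] = field(default_factory=list)
    codes: Dict[str, str] = field(default_factory=dict)  # terminology -> code
    # relationship type -> list of (target concept, ranking strength)
    relations: Dict[str, List[Tuple[str, float]]] = field(default_factory=dict)

    def related(self, relation: str, top_n: int = 3) -> List[str]:
        """Return the strongest targets for one relationship type."""
        ranked = sorted(self.relations.get(relation, []), key=lambda pair: pair[1], reverse=True)
        return [target for target, _ in ranked[:top_n]]


def index_by_surface_form(concepts: List[Concept]) -> Dict[str, Concept]:
    """Map every name and synonym (lowercased) to its concept."""
    index = {}
    for concept in concepts:
        for surface in [concept.name, *concept.synonyms]:
            index[surface.lower()] = concept
    return index


# Illustrative entry; the ranking strengths are made up.
diabetes = Concept(
    name="Type 2 Diabetes Mellitus",
    synonyms=["T2DM", "Diabetes Type II"],
    codes={"ICD-10": "E11"},
    relations={"disease_to_drugs": [("Metformin", 0.9), ("Onglyza", 0.6), ("Actos", 0.7)]},
)

index = index_by_surface_form([diabetes])
print(index["t2dm"].codes["ICD-10"])
print(diabetes.related("disease_to_drugs", top_n=2))
```

Storing a strength with each relationship is what allows a "Disease to Drugs" lookup to return a ranked list rather than an unordered set.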
Clinical Rules
Talix HealthData Engine (HDE)
PDF Clinical Guideline
Machine Readable Guideline
Enriched Clinical Guideline Node Data
NODE 143
ER Positive: ICD 9: V86.0, ICD 10: Z17.0, CPT4: 3315F, Synonyms: Estrogen Receptor Positive, ER+, ER +ve
PR Positive: CPT4: 3315F, Synonyms: Progesterone Receptor Positive, PR+, PR +ve
NODE 134
Tubular Mucinous: ICD 9: 189, ICD 10: C64
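The enriched node data on this slide pairs each guideline term with codes and synonyms, which makes it possible to resolve a surface form found in a document to its codes. A minimal sketch of that lookup, using only the NODE 143 values shown on the slide; the dictionary layout and the function names `build_lookup` and `codes_for` are hypothetical.

```python
from typing import Dict, Optional

# Values copied from the slide's NODE 143; the structure itself is an assumption.
NODE_143 = {
    "ER Positive": {
        "codes": {"ICD-9": "V86.0", "ICD-10": "Z17.0", "CPT4": "3315F"},
        "synonyms": ["Estrogen Receptor Positive", "ER+", "ER +ve"],
    },
    "PR Positive": {
        "codes": {"CPT4": "3315F"},
        "synonyms": ["Progesterone Receptor Positive", "PR+", "PR +ve"],
    },
}


def build_lookup(node: Dict) -> Dict[str, str]:
    """Map each term and synonym (lowercased) to the node's canonical term."""
    lookup = {}
    for term, data in node.items():
        for surface in [term, *data["synonyms"]]:
            lookup[surface.lower()] = term
    return lookup


def codes_for(text: str, node: Dict) -> Optional[Dict[str, str]]:
    """Return the codes for a surface form, or None if the node does not know it."""
    term = build_lookup(node).get(text.strip().lower())
    return node[term]["codes"] if term else None


print(codes_for("ER +ve", NODE_143))
print(codes_for("Progesterone Receptor Positive", NODE_143))
```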
Beyond NLP: Natural Language Understanding (NLU)
Input: Various Types of Documents – Clinical Policy Bulletins, Clinical Guidelines, Input Patient History with Treatment Recommendations
Knowledge repositories: Talix Health Taxonomy, 3rd Party Dictionaries, Clinical Rules Database
1. Document Preprocessor (CIP Conversion): Produces various views of any document, e.g., Section, Paragraph, Sentence, other defined fields.
2. Tokenizer: Produces various views of the document, e.g., Paragraph, Sentence, Word, etc.
3. Concept Mapper: Finds concepts in document by looking up graph view of taxonomy, sets base and relationship scores, normalizes and adjusts scores.
4. Various Named Entity Extractors: Dictionary based named entity extraction: names, geographical entities, molecules, etc.
5. Rule Based Annotators (next slide): Drug Dosage Module, Laboratory Test with Values, Family History, Negation, Demographics, Past Medical History, etc.
6. Word Sense Disambiguator: Uses taxonomy to disambiguate ambiguous terms, e.g., disambiguate between Cold Temperature and Common “Cold.”
7. Coordinate Expansion: Expands “Diabetes Type I and II” into “Diabetes Type I” and “Diabetes Type II.”
8. In-Document Abbreviation Recognizer: Maps abbreviations to concepts per document.
9. Summary Generators: Automatically generates document summaries using concept aggregation.
Output: Document annotations are written out into Cassandra and Solr (Cassandra Data store, Lucene Search Index), from which search and analytics toolkits can consume this data.
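The coordinate expansion step described on this slide can be illustrated with a small pattern-based sketch. This is a deliberately naive stand-in, not the Talix implementation: it handles only the "head X and Y" shape, and the optional `known` set is a hypothetical placeholder for a taxonomy check that would stop bad expansions such as "chest pain and fever".

```python
import re
from typing import List, Optional, Set

# "<head> <first> and <second>", e.g. "Diabetes Type I and II"
_COORD = re.compile(r"^(?P<head>.*\S)\s+(?P<first>\S+)\s+and\s+(?P<second>\S+)$", re.IGNORECASE)


def expand_coordination(phrase: str, known: Optional[Set[str]] = None) -> List[str]:
    """Expand a coordinated phrase into its two full forms when the pattern applies."""
    match = _COORD.match(phrase.strip())
    if not match:
        return [phrase]
    candidates = [
        f"{match['head']} {match['first']}",
        f"{match['head']} {match['second']}",
    ]
    # If a vocabulary is supplied, expand only when both results are known concepts.
    if known is not None and not all(c.lower() in known for c in candidates):
        return [phrase]
    return candidates


vocabulary = {"diabetes type i", "diabetes type ii"}
print(expand_coordination("Diabetes Type I and II", vocabulary))
print(expand_coordination("chest pain and fever", vocabulary))
```

Gating the expansion on a concept vocabulary is one plausible reason such a step would sit after concept mapping in the pipeline.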
NLP Stack: Annotators and Knowledge Repositories
Levels
– Level 1: Document Section Specific Annotators
– Level 2: Semantic Type Annotators
– Level 3: Base Term Type Annotators
– Level 4: Knowledge Bases
Access links shown in the diagram: Access 1, Access 2, Access 3
Annotators
– Chief Complaint Annotator
– Laboratory Test and Results Annotator
– Drug and Dosage Annotator
– Past Medical History Annotator
– Social History Annotator
– Family History Annotator
– Pre and Post Surgery Observations
– Vital Signs and Observations Annotator
– Conditions Annotator
– Treatment Procedures Annotator
– Negation Annotator
– Age Group Annotator
– Gender Annotator
– Geographic Annotator
– Temporal Value Annotator
Knowledge Repositories
– Code Translations (ICD9, CPT4, RxNORM, etc.)
– Semantic Type Concepts (e.g., Diseases, Labs, Drugs)
– Regular Expression Patterns (e.g., Drug Dosage Patterns)
– Temporal Values, Age Values, Geographic Entities, etc.
– Document Types and Sub-Headings
– Stemming Corrections, Homonyms
– Condition Specific Rules and Patterns
– Use Case Specific Data, Rules and Patterns
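As an illustration of the "Regular Expression Patterns (e.g. Drug Dosage Patterns)" entry, a drug and dosage annotator could be approximated with a single regular expression. The pattern, the unit and form lists, and the `DosageAnnotation` type below are assumptions made for this sketch; a production annotator would presumably draw drug names from the taxonomy rather than from capitalization.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DosageAnnotation:
    drug: str
    dose: float
    unit: str
    form: Optional[str]


# Hypothetical pattern: capitalized drug name, numeric dose, unit, optional dosage form.
_DOSAGE = re.compile(
    r"\b(?P<drug>[A-Z][A-Za-z]+)\s+"
    r"(?P<dose>\d[\d,]*(?:\.\d+)?)\s*"
    r"(?P<unit>mg|mcg|g|ml|units)\b"
    r"(?:\s+(?P<form>tablet|capsule|injection))?"
)


def annotate_dosages(text: str) -> List[DosageAnnotation]:
    """Find drug/dose/unit/form mentions in free text."""
    annotations = []
    for match in _DOSAGE.finditer(text):
        dose = float(match["dose"].replace(",", ""))  # "1,000" -> 1000.0
        annotations.append(DosageAnnotation(match["drug"], dose, match["unit"], match["form"]))
    return annotations


note = "Continue Metformin 1,000 mg tablet and Actos 30 mg tablet daily."
for annotation in annotate_dosages(note):
    print(annotation)
```

The two medications in the sample note are the ones that appear in the deck's patient data example.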
Use Case: Leveraging Data Analytics for Risk Adjustment
INPUTS
– EHR / Clinical Systems
– Unstructured Patient Data
– Semi-Structured Patient Data
– Structured Patient Data
Clinical Rules, NLP, HealthTaxonomy
Risk Adjustment Model
– RISK COMPUTATION
– CODING OPTIMIZATION
– ANALYTICS & REPORTING
The Challenges of Risk Adjustment
• Time-consuming, inefficient and error-prone
• Retrospective rather than prospective
• Significant impact on reimbursement and patient care delivery
• Overlooked clinical factors in unstructured narratives and patient histories
• Inferior analytics technology, leading to a significant number of missed or inaccurate codes
• Not integrated at the point of care
Coding InSight Application: Addressing Risk Adjustment
• Automate coding gap detection for more accurate coding and risk scoring
• Conduct prospective and retrospective coding optimization
• Analyze projected coding patterns and provider documentation gaps
• Integrate into the physician workflow at the point of care
• Improve care planning and patient outcomes
Analyzing Unstructured Patient Data
Terms extracted from the patient narrative, by category:
• Complication: Peripheral Neuropathy
• Medication: Novolog Mix 70-30, Flexpen
• Treatment Procedure: Insulin Injection
• Lab Result: HbA1c 7.3
• Medications: Metformin 1,000 mg tablet, Actos 30 mg tablet
• Specialist: Endocrinologist
• Medication: Onglyza
• Risk Factor: BMI 38.86
• Diagnostic Procedure: Hemoglobin A1c
Optimizing CMS Payments
Scenario 1: What Was Coded
Condition                                             ICD-10 Code   HCC Risk Score
Diabetes Mellitus with diabetic nephropathy           E11.21        0.368
Peripheral Vascular Disease, unspecified              I73.9         0.299
Chronic Obstructive Pulmonary Disease, unspecified    J44.9         0.346
RAF Score: 1.013
Total Payment: $10,130
Scenario 2: What Should Have Been Coded
Condition                                             ICD-10 Code   HCC Risk Score
Diabetes Mellitus with diabetic nephropathy           E11.21        0.368
Peripheral Vascular Disease, unspecified              I73.9         0.299
Chronic Obstructive Pulmonary Disease, unspecified    J44.9         0.346
Sick Sinus Syndrome                                   I49.5         0.295
Chronic Viral Hepatitis C                             B18.2         0.251
BMI 40.0-44.9, adult                                  Z68.41        0.365
RAF Score: 1.924
Total Payment: $19,240
Source: Data based on a Talix customer
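The arithmetic behind the two scenarios can be checked with a short sketch. On this slide each RAF score equals the plain sum of the listed HCC risk scores, and each payment equals the RAF score times $10,000; that base amount is inferred from the slide's figures ($10,130 / 1.013), not stated in the source. The function names are hypothetical, and a real risk adjustment calculation involves more than this sum.

```python
from decimal import Decimal
from typing import Iterable, List

# HCC risk scores as shown on the slide, keyed by ICD-10 code.
HCC_SCORES = {
    "E11.21": Decimal("0.368"),
    "I73.9": Decimal("0.299"),
    "J44.9": Decimal("0.346"),
    "I49.5": Decimal("0.295"),
    "B18.2": Decimal("0.251"),
    "Z68.41": Decimal("0.365"),
}

# Assumption: base payment per 1.000 of RAF, inferred from the slide's totals.
BASE_PAYMENT = Decimal("10000")


def raf_score(codes: Iterable[str]) -> Decimal:
    """Sum the HCC risk scores of the given codes (Decimal avoids float rounding)."""
    return sum((HCC_SCORES[code] for code in codes), Decimal("0"))


def payment(codes: Iterable[str]) -> Decimal:
    return raf_score(codes) * BASE_PAYMENT


def coding_gaps(coded: List[str], supported: List[str]) -> List[str]:
    """Codes supported by the record but missing from what was coded."""
    return [code for code in supported if code not in coded]


coded = ["E11.21", "I73.9", "J44.9"]                  # Scenario 1
supported = coded + ["I49.5", "B18.2", "Z68.41"]      # Scenario 2

print(raf_score(coded), payment(coded))
print(raf_score(supported), payment(supported))
print(coding_gaps(coded, supported))
```

The three codes returned by `coding_gaps` account for the full difference between the two payments on the slide.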