content analytics for healthcare

Page 1: Content analytics for healthcare

© 2014 IBM Corporation

Xavier Constant Núñez – Business Analytics Architect

October 2014

Advanced Analytics for Healthcare

Xavier [email protected]

WHO-FIC Network Annual Meeting, October 2014

Page 2: Content analytics for healthcare

For Healthcare Institutions, harvesting the data profusion with analytics opens up new sources of value

What patient-centric factors are most predictive of outcomes from a treatment regimen?

What patient-level interventions could help improve treatment adherence?

How can you provide clinicians with targeted assistance?

How can you best meet the information and support needs of patients and caregivers?

What channels of communication work best?

… and many more

• Physician notes and discharge summaries
• Patient history and symptoms
• Pathology reports
• Tweets, text messages and online forums
• Satisfaction surveys
• Claims and case management data
• Forms-based data and comments
• Emails and correspondence
• Trusted reference journals and portals
• Paper based records and documents

Over 80% of stored health information is unstructured*

* AIIM website, accepted industry percentage

Data from a variety of sources is now available for harvesting.

These sources generate huge volumes of data …

… and can help identify significant points of leverage

15 petabytes

Amount of new information created each day - eight times more than the information in all US libraries

Clinical Outcomes

Operational Outcomes

Health data growing 35% per year*

Page 3: Content analytics for healthcare

If we could only activate the relevant information to bring insights to the point of care when needed most, time spent manually interpreting data would become time spent healing patients.

• Aggregate, activate and enrich relevant patient information beyond what is known

• Surface earlier, more accurate insights to drive incipient intervention opportunities

• Adaptive and proactive delivery drives individualized, patient-centered care

Confirm what I think or suspect?

Show me something new or unexpected?

How many are being missed?

How do we move faster and anticipate change?

Knowledge, guidelines and best practice measures

Adapt care to changing conditions and new information

Identify intervention opportunities

Longitudinal “data driven” insights

Information should aid us, not lie hidden and dormant

Page 4: Content analytics for healthcare

Carilion Clinic flags patients at risk for developing Congestive Heart Failure (CHF)

Page 5: Content analytics for healthcare

Use of Content Analytics to enrich the Predictive Models used to identify patients at risk for developing CHF

Page 6: Content analytics for healthcare

Technical Solution Overview: Capture, Analyze, Activate

Raw Information (e.g. EMR and Claims), 10's of thousands of patients

A 65-year old white male has been diagnosed with stage 2 melanoma. He is widowed and lives alone. AJCC: T2

Content Analytics

Patient: Age: 65, Gender: Male, Race: White

Diagnosis: Melanoma, Stage: 2

Social: Marital status: single

Labs: AJCC: T2

Predictive Analytics

Risk of metastasis: 47%

Similarity Analytics

Recommended Add'l Treatment: DTIC

Care Manager

Goals: Avoid remission

Activities: Avoid UV radiation, Regular screening, Transportation assistance

Patient scale labels on the diagram: 100's or 1000's of patients, 100's or 1000's of patients, One Patient

Page 7: Content analytics for healthcare

Examples of NLP Challenges in Healthcare

• Accurately identify and extract facts from text, including negation
  “55%” = LVEF
  “Patient does not show signs” = Negative Symptom

• Accurately interpret and assign values to ambiguous statements
  “around 55%” = LVEF
  “Shows slightly elevated levels” = if condition A = 10%, if condition B = 20%

• Infer meaning from non-contextual content
  “Cut back from two packs to one per day” = Smoker

• Cleanse, enhance and normalize raw data
  “Myocardial infarction” and “heart attack” = the same thing
  Correct misspellings and abbreviations through NLP
  Enhance or augment by assigning correct RxNorm, SNOMED, ICD-10 or other codes / terminology

• Preserve and structure facts and concepts from contextual content:

A 42-year old white male presents for a physical. He recently had a right hemicolectomy. Invasive grade 2 (of 4) adenocarcinoma in the ileocecal valve was found and excised. At the same time he had an appendectomy. The appendix showed no diagnostic abnormality.

Content Analytics

Patient: Age: 42, Gender: Male, Race: White

Procedure: hemicolectomy; diagnosis: invasive adenocarcinoma; anatomical site: ileocecal valve; grade: 2 (of 4)

Procedure: appendectomy; diagnosis: normal; anatomical site: appendix
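The challenges on this slide can be illustrated with a few toy rules. The sketch below is only an illustration in Python, not the rules used by IBM Content Analytics: the cue words, patterns and synonym table are all assumptions, and unlike the slide's first example it only recognizes a percentage as LVEF when the keyword appears next to it.

```python
import re

# Hypothetical cue lists and patterns, chosen for illustration only.
NEGATION = re.compile(r"\b(?:no|not|denies|without)\b", re.IGNORECASE)
LVEF = re.compile(
    r"(?:lvef|ejection fraction)\D{0,20}?(around\s+)?(\d{1,3})\s*%",
    re.IGNORECASE,
)
SMOKING = re.compile(r"\bpacks?\b.*\b(?:per|a) day\b", re.IGNORECASE)
SYNONYMS = {
    "heart attack": "myocardial infarction",
    "myocardia infarction": "myocardial infarction",  # misspelling
}


def extract_lvef(text):
    """Return (value, is_approximate) for the first LVEF mention, or None."""
    match = LVEF.search(text)
    if match is None:
        return None
    return int(match.group(2)), match.group(1) is not None


def is_negated(sentence, term):
    """True if a negation cue appears before the term in the sentence."""
    lowered = sentence.lower()
    position = lowered.find(term.lower())
    if position < 0:
        return False
    return NEGATION.search(lowered[:position]) is not None


def infer_smoker(text):
    """Infer smoking status from an indirect mention such as packs per day."""
    return SMOKING.search(text) is not None


def normalize(text):
    """Lower-case the text and map known variants to one canonical term."""
    lowered = text.lower()
    for variant, canonical in SYNONYMS.items():
        lowered = lowered.replace(variant, canonical)
    return lowered


print(extract_lvef("LVEF around 55%"))
print(is_negated("Patient does not show signs of edema", "edema"))
print(infer_smoker("Cut back from two packs to one per day"))
print(normalize("History of heart attack"))
```

Rules this simple break quickly on real notes (for example, negation scope across clauses), which is why the slide lists these as challenges.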

Page 8: Content analytics for healthcare

Content Analytics Applied to Improve Patient Outcomes

Content Analytics

Language Identification

Spell checking

Lexical analysis

Part of Speech, Disambiguation

Named Entity Recognition

Part of a Sentence

Semantics (Relationships), Disambiguation

Synonyms, Concepts

Objectiveness: opinion, doubt, fact, question

Ideas

Indexing

Evolving topics

Sentiment Analysis

Interest analysis

life-events

QA

Deep QA (Watson)

Brand Analytics

Classification

Custom applications

text-analytics

NLP

Document Summarization

Page 9: Content analytics for healthcare

Enrich & Improve Predictive Models with information trapped in unstructured data

• Content Analytics capabilities:
  • Trend, Pattern, Anomaly, Deviation and Context Analysis
  • Medical Fact, Relationship and Outcome Annotation

• Healthcare Accelerators speed time to value:
  • Annotators focused on extracting medical terms
  • Approximately 800 pre-built rules developed in IBM Content Analytics Studio
  • Includes diagnoses, procedures, labs, many drugs
  • Transforming unstructured data to CPT, ICD-9, and SNOMED concept ID outputs
  • Detecting negations
  • Detecting family histories

Content Analytics
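To make the idea of an annotator concrete, here is a minimal Python sketch of a dictionary-based annotator that maps medical terms to concept IDs and flags family-history mentions. It is an assumption-laden illustration and not the IBM Content Analytics Studio rule set: the lexicon, the cue words and the "EX-" concept IDs are all made up and are not real CPT, ICD-9 or SNOMED codes.

```python
import re
from dataclasses import dataclass

# Placeholder lexicon: the IDs are invented for this example and are
# NOT real CPT, ICD-9 or SNOMED identifiers.
LEXICON = {
    "heart failure": "EX-0001",
    "diabetes": "EX-0002",
    "appendectomy": "EX-0003",
}

# Hypothetical cue words that mark a sentence as family history.
FAMILY_CUES = re.compile(
    r"\b(?:mother|father|sister|brother|family history)\b", re.IGNORECASE
)


@dataclass(frozen=True)
class Annotation:
    term: str
    concept_id: str
    family_history: bool


def annotate(note):
    """Return one Annotation per lexicon term found in each sentence."""
    annotations = []
    # Use the sentence as the scope of a family-history cue.
    for sentence in re.split(r"(?<=[.!?])\s+", note.strip()):
        lowered = sentence.lower()
        in_family_context = FAMILY_CUES.search(lowered) is not None
        for term, concept_id in LEXICON.items():
            if term in lowered:
                annotations.append(
                    Annotation(term, concept_id, in_family_context)
                )
    return annotations


for annotation in annotate(
    "Father had heart failure. Patient has diabetes. Appendectomy in 2010."
):
    print(annotation)
```

Separating the patient's own conditions from family history matters downstream, because the two usually feed different predictors.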

Page 10: Content analytics for healthcare

Enrich & Improve Predictive Models with information trapped in unstructured data

Content Analytics

Page 11: Content analytics for healthcare

Real case: Readmissions at Seton

The Data We Thought Would Be Useful … Wasn’t

• 113 candidate predictors from structured and unstructured data sources

• Structured data not available, not accurate enough without the unstructured content

• Unexpected Indicators Emerged … Readmission is a Highly Predictive Problem!

• 18 population specific predictors surface previously unknown intervention opportunities

Unstructured data unlocks hidden insights

Predictor Analysis | % Encounters, Structured Data | % Encounters, Unstructured Data
Ejection Fraction (LVEF) | 2% | 74%
Smoking Indicator | 35% (65% Accurate) | 81% (95% Accurate)
Living Arrangements | <1% | 73% (100% Accurate)
Drug and Alcohol Abuse | 16% | 81%
Assisted Living | 0% | 13%

Content Analytics

Page 12: Content analytics for healthcare

1. Learning

Collect real data

Medical health records / provided services / lab tests

Training

Statistical Prediction Models

Periodic Training: Learn from the latest data

2. Analysis – runtime process

A new Patient / change in disease state / new condition

Analysis

Output: alerts, predictions, recommendations, categorization, visualization …

Predictive Analytics
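The two phases on this slide (learning, then runtime analysis) can be sketched in a few lines of Python. The deck does not name the statistical model, so logistic regression is an assumption made for this example, and the two features and four training rows are made up.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, labels, epochs=2000, learning_rate=0.1):
    """Phase 1 (learning): fit logistic regression by batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            error = sigmoid(z) - y
            for i, v in enumerate(x):
                grad_w[i] += error * v
            grad_b += error
        weights = [
            w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)
        ]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def predict_risk(model, x):
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def analyze(model, x, alert_threshold=0.5):
    """Phase 2 (runtime): score one patient and decide whether to alert."""
    risk = predict_risk(model, x)
    return {"risk": risk, "alert": risk >= alert_threshold}


# Made-up training rows: [smoker, low_lvef] -> readmitted (1) or not (0).
TRAINING_ROWS = [[0, 0], [1, 0], [0, 1], [1, 1]]
TRAINING_LABELS = [0, 0, 1, 1]

model = train(TRAINING_ROWS, TRAINING_LABELS)
print(analyze(model, [0, 1]))  # a new patient arriving at runtime
```

Periodic training then amounts to calling `train` again on the extended data set and swapping in the new model for later `analyze` calls.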

Page 13: Content analytics for healthcare

Representing Patients using Information Obtained from Multiple Sources of Data

Feature Extraction

Feature Extraction

Patient Feature Vector: (x1, x2, …, xN)

Patient

Predictive Analytics
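As a rough illustration of building one patient feature vector from several sources, the Python sketch below merges features from a structured record with features extracted from a free-text note. The feature names, the record layout and the text rules are assumptions made for this example.

```python
import re

# Hypothetical, fixed feature order: position i in the vector is x(i+1).
FEATURE_NAMES = ["age", "is_male", "smoker", "lvef_percent", "lives_alone"]


def structured_features(record):
    """Features taken directly from structured fields (assumed layout)."""
    return {
        "age": float(record["age"]),
        "is_male": 1.0 if record["gender"] == "male" else 0.0,
    }


def text_features(note):
    """Features extracted from free text with toy rules."""
    lowered = note.lower()
    features = {
        "smoker": 1.0 if "packs" in lowered or "smoker" in lowered else 0.0,
        "lives_alone": 1.0 if "lives alone" in lowered else 0.0,
    }
    match = re.search(r"lvef\D{0,20}?(\d{1,3})\s*%", lowered)
    if match:
        features["lvef_percent"] = float(match.group(1))
    return features


def patient_vector(record, note, missing=0.0):
    """Merge both sources into one vector; absent features get `missing`."""
    merged = {**structured_features(record), **text_features(note)}
    return [merged.get(name, missing) for name in FEATURE_NAMES]


vector = patient_vector(
    {"age": 65, "gender": "male"},
    "He is widowed and lives alone. LVEF 40%. "
    "Cut back from two packs to one per day.",
)
print(dict(zip(FEATURE_NAMES, vector)))
```

Filling a missing value with 0.0 is a simplification; a real model would need an explicit strategy for missing measurements such as LVEF.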

Page 14: Content analytics for healthcare

Similarity Visualization

Patient Representation

Similarity Identification

Patient Similarity Analytics - Given an index patient, find clinically similar patients

Similarity Analytics
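Given feature vectors, finding clinically similar patients for an index patient can be sketched as a nearest-neighbour search. The deck does not say which similarity measure is used, so cosine similarity is an assumption here, and the patient IDs and vectors are made up.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(index_vector, cohort, k=2):
    """Return the k (patient_id, score) pairs most similar to the index patient."""
    scored = [
        (cosine_similarity(index_vector, vector), patient_id)
        for patient_id, vector in cohort.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(patient_id, score) for score, patient_id in scored[:k]]


# Made-up cohort of already-scaled feature vectors.
COHORT = {
    "p1": [1.0, 0.0, 1.0],
    "p2": [0.0, 1.0, 0.0],
    "p3": [1.0, 0.0, 0.9],
}

print(most_similar([1.0, 0.0, 1.0], COHORT))
```

Features on very different scales (age in years next to 0/1 flags) would dominate the score, so vectors are normally scaled before comparison.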

Page 15: Content analytics for healthcare

Congestive Heart Failure (CHF) Onset Prediction (Results achieved in 6 weeks)

Page 16: Content analytics for healthcare

IBM Advanced Care Insights provides analytical capabilities enabling holistic, individualized approaches to Smarter Care

Patient Similarity

Objective: Find clinically similar patients for decision support and Comparative Effectiveness

Personalized Comparative Effectiveness

Objective: Identify most effective treatment option for a given patient

Predict Patient Clinical Pathway

Patient / Provider Matching

Visualize Population Cohorts

Visualize Disease Pathways

Predictive Analytics

Multimodal Longitudinal Patient Data (e.g. Structured + Unstructured [text, image, genetics, …], potentially social media)

Chance of Adverse Event = 80%

X months

Objective: Analyze patient’s longitudinal records to model the future risk of developing adverse conditions

Objective: Predict and visualize patient disease progression

Objective: Visualize populations through interactive multi-dimensional exploration of inter-cluster and intra-cluster relationships

Objective: Match patients with providers based on similarity analytics and optimal performance characteristics

Differentiating

Identifying

Challenging Patients

Page 17: Content analytics for healthcare

Thank You