tech natives 22042013_bartde_witte_watson_v01

© 2013 International Business Machines Corporation Watson from Jeopardy to Healthcare BDent, Bart de Witte, MAppSc Healthcare Industry Leader CEE / ALPS April 2013 – Tech Natives Event, Wirtschaftskammer, Wien Follow us @IBMWatson Follow me @swisshealth20

Upload: bart-de-witte

Post on 07-May-2015

DESCRIPTION

IBM Watson

TRANSCRIPT

Page 1: Tech natives 22042013_bartde_witte_watson_v01

© 2013 International Business Machines Corporation

Watson from Jeopardy to Healthcare

BDent, Bart de Witte, MAppSc

Healthcare Industry Leader CEE / ALPS

April 2013 – Tech Natives Event, Wirtschaftskammer, Wien

Follow us @IBMWatson

Follow me @swisshealth20

Page 2: Tech natives 22042013_bartde_witte_watson_v01

watson - jeopardy

healthcare & data

watson in healthcare

Page 3: Tech natives 22042013_bartde_witte_watson_v01

Jeopardy

Broad/Open Domain

Complex Language

High Precision

Accurate Confidence

High Speed

Human Language

Words by themselves have no meaning

Only grounded in human cognition

Words navigate, align and communicate an infinite space of intended meaning

Computers cannot ground words to human experiences to derive meaning

Page 4: Tech natives 22042013_bartde_witte_watson_v01

Why Jeopardy?

Grand Challenge

Page 5: Tech natives 22042013_bartde_witte_watson_v01

The world is “dying of thirst in an ocean of data”

80% of the world’s data today is unstructured

90% of the world’s data was created in the last two years

20%: amount of data traditional systems leverage today

Page 6: Tech natives 22042013_bartde_witte_watson_v01

Easy Question

(LN (12,546,798 * π)) ^ 2 / 34,576.46 = 0.00885

Select Payment where Owner = “David Jones” and Type (Product) = “Laptop”

Owner          Serial Number
David Jones    45322190-AK

Serial Number  Type     Invoice #
45322190-AK    LapTop   INV10895

Invoice #      Vendor   Payment
INV10895       MyBuy    $104.56
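
Both "easy" questions are easy for a computer because every step is explicit. A minimal sketch in Python, assuming hypothetical table and column names (ownership, products, invoices) that do not appear on the slide, and using an in-memory SQLite database purely for illustration:

```python
import math
import sqlite3


def easy_math():
    """Evaluate the slide's expression: (LN(12,546,798 * pi))^2 / 34,576.46."""
    return math.log(12_546_798 * math.pi) ** 2 / 34_576.46


def payment_for(owner, product_type):
    """Follow Owner -> Serial Number -> Invoice # -> Payment across three tables."""
    conn = sqlite3.connect(":memory:")
    # Table and column names are illustrative assumptions; the rows are from the slide.
    conn.executescript(
        """
        CREATE TABLE ownership (owner TEXT, serial_number TEXT);
        CREATE TABLE products  (serial_number TEXT, type TEXT, invoice TEXT);
        CREATE TABLE invoices  (invoice TEXT, vendor TEXT, payment REAL);
        INSERT INTO ownership VALUES ('David Jones', '45322190-AK');
        INSERT INTO products  VALUES ('45322190-AK', 'LapTop', 'INV10895');
        INSERT INTO invoices  VALUES ('INV10895', 'MyBuy', 104.56);
        """
    )
    row = conn.execute(
        """
        SELECT i.payment
        FROM ownership o
        JOIN products p ON p.serial_number = o.serial_number
        JOIN invoices i ON i.invoice = p.invoice
        -- The query says "Laptop" and the table says "LapTop", so compare case-insensitively.
        WHERE o.owner = ? AND LOWER(p.type) = LOWER(?)
        """,
        (owner, product_type),
    ).fetchone()
    conn.close()
    return row[0] if row else None
```

Calling `payment_for("David Jones", "Laptop")` walks the same three lookups a person would do by hand; nothing in the question has to be interpreted.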

Page 7: Tech natives 22042013_bartde_witte_watson_v01

Computer programs are natively explicit, fast and exacting in their calculation over numbers and symbols... But Natural Language is implicit, highly contextual, ambiguous and often imprecise.

Where was X born?

One day, from among his city views of Ulm, Otto chose a water color to send to Albert Einstein as a remembrance of Einstein's birthplace.

X ran this?

If leadership is an art then surely Jack Welch has proved himself a master painter during his tenure at GE.

Structured

Unstructured

Hard Question

Page 8: Tech natives 22042013_bartde_witte_watson_v01

Informed Decision Making: Search vs. Expert Q&A

Decision Maker

Has Question

Distills to 2-3 Keywords

Reads Documents, Finds Answers

Finds & Analyzes Evidence

Search Engine

Finds Documents containing Keywords

Delivers Documents based on Popularity

Decision Maker

Asks NL Question

Considers Answer & Evidence

Expert

Understands Question

Produces Possible Answers & Evidence

Analyzes Evidence, Computes Confidence

Delivers Response, Evidence & Confidence

Page 9: Tech natives 22042013_bartde_witte_watson_v01

Why is Jeopardy! so Difficult? Answering complex natural language questions requires more than keyword evidence

Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India

Reference Text: In May, Gary arrived in India after he celebrated his anniversary in Portugal

Keyword “Hits” (question / reference text):

In May 1898 / In May

Portugal / in Portugal

celebrated / celebrated

400th anniversary / anniversary

arrival in / arrived in

India / India

explorer / Gary

This evidence suggests “Gary” is the answer BUT the system must learn that keyword matching may be weak relative to other types of evidence

Legend: Keyword “Hit”, Reference Text, Answer, Weak evidence (Red Text)
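
The weakness of keyword evidence can be made concrete with a toy scorer. The following is a minimal sketch, not Watson's method: the stopword list and the scoring formula (share of question keywords found in the passage) are assumptions chosen for illustration. The second passage is the Vasco da Gama sentence used elsewhere in this deck.

```python
import re

# A small, assumed stopword list; a real system would use something far richer.
STOPWORDS = {"in", "on", "the", "of", "this", "s", "he", "his", "after"}

QUESTION = ("In May 1898 Portugal celebrated the 400th anniversary "
            "of this explorer's arrival in India")
GARY = "In May, Gary arrived in India after he celebrated his anniversary in Portugal"
VASCO = "On the 27th of May 1498, Vasco da Gama landed in Kappad Beach"


def keywords(text):
    """Lowercase the text, split on non-alphanumerics, drop stopwords."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def keyword_overlap(question, passage):
    """Fraction of question keywords that literally appear in the passage."""
    q = keywords(question)
    return len(q & keywords(passage)) / len(q)
```

Under these assumptions the misleading "Gary" passage shares five keywords with the question while the correct passage shares only one ("may"), so a pure keyword ranker would prefer the wrong answer.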

Page 10: Tech natives 22042013_bartde_witte_watson_v01

What It Takes to compete against Top Human Jeopardy! Players

[Chart: each dot is an actual historical human Jeopardy! game. Marked regions: Winning Human Performance, Grand Champion Human Performance, 2007 QA Computer System. Axis labeled from More Confident to Less Confident.]

Top human players are remarkably good.

Page 11: Tech natives 22042013_bartde_witte_watson_v01

Leveraging Algorithms for Deeper Evidence

Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.

Reference Text: On the 27th of May 1498, Vasco da Gama landed in Kappad Beach

Temporal Reasoning (DateMatch): May 1898, 400th anniversary / 27th May 1498

Statistical Paraphrasing: arrival in / landed in

GeoSpatial Reasoning: India / Kappad Beach

Answer: explorer / Vasco da Gama

Search far and wide

Explore many hypotheses

Find and judge evidence

Many inference algorithms

Stronger evidence can be much harder to find and score...

...and the evidence is still not 100% certain

Legend: Temporal Reasoning, Statistical Paraphrasing, GeoSpatial Reasoning, Reference Text, Answer
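
The three kinds of deeper evidence named on this slide can be sketched as separate scorers. This is an illustrative sketch only: the paraphrase table and the gazetteer below are tiny hypothetical stand-ins for the statistical and geographic resources a real system would need, and the 0/1 scores are a simplification.

```python
# Hypothetical lookup tables; a real system would learn or load these.
PARAPHRASES = {("arrival in", "landed in"), ("arrival in", "arrived in")}
GAZETTEER = {"Kappad Beach": "India"}  # place -> containing country (assumed)


def temporal_match(event_year, anniversary, passage_year):
    """DateMatch: a 400th anniversary in 1898 points at an event in 1498."""
    return 1.0 if event_year - anniversary == passage_year else 0.0


def paraphrase_match(question_phrase, passage_phrase):
    """Treat two phrases as equivalent if identical or listed as paraphrases."""
    same = question_phrase == passage_phrase
    return 1.0 if same or (question_phrase, passage_phrase) in PARAPHRASES else 0.0


def geospatial_match(question_place, passage_place):
    """A passage place supports the question place if it is that place or lies in it."""
    inside = GAZETTEER.get(passage_place) == question_place
    return 1.0 if passage_place == question_place or inside else 0.0


def score_vasco_da_gama():
    """Score the Kappad Beach passage against the 1898 anniversary question."""
    return {
        "temporal": temporal_match(1898, 400, 1498),
        "paraphrase": paraphrase_match("arrival in", "landed in"),
        "geospatial": geospatial_match("India", "Kappad Beach"),
    }
```

Each scorer looks at a different relation between question and passage, which is why a passage with almost no shared keywords can still collect strong support.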

Page 12: Tech natives 22042013_bartde_witte_watson_v01

Watson is a Massively Parallel Probabilistic Evidence-Based Architecture

DeepQA generates and scores many hypotheses using an extensible collection of Natural Language Processing, Machine Learning and Reasoning Algorithms. These gather and weigh evidence over both structured and unstructured content to determine the answer with the best confidence.

Inquiry

Inquiry/Topic Analysis

Inquiry Decomposition

Hypothesis Generation: Primary Search (Answer Sources), Candidate Answer Generation

Hypothesis and Evidence Scoring: Answer Scoring, Evidence Retrieval (Evidence Sources), Deep Evidence Scoring

Synthesis

Final Confidence Merging & Ranking

Responses with Confidence

Learned Models help combine and weigh the Evidence
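
The final merging and ranking step can be illustrated with a small sketch. It assumes a logistic model over per-scorer evidence values; the slide says only that learned models combine and weigh the evidence, so the model form, the weights, the bias and the example scores below are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    answer: str
    evidence: dict  # scorer name -> score in [0, 1]


# Hypothetical "learned" weights: keyword evidence counts for little,
# deeper evidence counts for more.
WEIGHTS = {"keyword": 0.5, "temporal": 2.0, "paraphrase": 1.5, "geospatial": 1.5}
BIAS = -2.0


def confidence(hypothesis):
    """Combine weighted evidence scores into a confidence between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * score for name, score in hypothesis.evidence.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(hypotheses):
    """Order candidate answers from most to least confident."""
    return sorted(hypotheses, key=confidence, reverse=True)


candidates = [
    Hypothesis("Gary", {"keyword": 5 / 9}),
    Hypothesis("Vasco da Gama",
               {"keyword": 1 / 9, "temporal": 1.0, "paraphrase": 1.0, "geospatial": 1.0}),
]
best = rank(candidates)[0]
```

With these assumed weights the candidate backed by several independent kinds of evidence outranks the one backed by keyword matches alone, and each response carries a confidence rather than a bare answer.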

Page 13: Tech natives 22042013_bartde_witte_watson_v01

DeepQA: Incremental Progress in Answering Precision on the Jeopardy Challenge: 6/2007-11/2010

[Chart: successive system versions: Baseline 12/06, v0.1 12/07, v0.2 05/08, v0.3 08/08, v0.4 12/08, v0.5 05/09, v0.6 10/09, v0.7 04/10, v0.8 11/10.]

IBM Watson: Playing in the Winners Cloud

Page 14: Tech natives 22042013_bartde_witte_watson_v01

Healthcare & Data

Page 15: Tech natives 22042013_bartde_witte_watson_v01

Our Watson Healthcare strategy solves 3 problems in clinical practice

3 related problems

Medicine is a science but practiced as an art

The amount of untapped information that can be used as a source of knowledge is growing exponentially

Impossible to keep up and have access to existing knowledge

Page 16: Tech natives 22042013_bartde_witte_watson_v01

Our Watson Healthcare strategy solves 3 problems in clinical practice

Medicine is a science but practiced as an art

Estimated 30-40% of care in UK not based on available scientific evidence, Grol, R. and Grimshaw, J. (2003)

5 year gap between publication of guidelines and changes in routine practice in Western healthcare systems, Lomas et al (1993)

1 out of 5 diagnoses are wrong

Unprecedented research commissioned by the EU has found that 23% of EU citizens have been a victim or the member of a family who has been a victim of a “serious medical error in a local hospital” or a “serious medical error from a medicine that was prescribed by a doctor”.

In all, only 17% of Austrians and Germans said that hospital patients were very likely or fairly likely to be able to avoid a serious medical error.

Page 17: Tech natives 22042013_bartde_witte_watson_v01

Our Watson Healthcare strategy solves 3 problems in clinical practice

Impossible to keep up and have access to existing knowledge

Medical knowledge doubles every five years

81% of the physicians in the US report spending 5 hours or less a month reading medical journals

Medicine has become too complex and only 20% of the knowledge clinicians use is evidence based

Page 18: Tech natives 22042013_bartde_witte_watson_v01

Our Watson Healthcare strategy solves 3 problems in clinical practice

The amount of untapped information that can be used as a source of knowledge is growing exponentially

16,000 hospitals worldwide collect data

80% of the data is unstructured and stored in hundreds of forms such as lab results, images and medical transcripts

Data will grow 800% over the next five years

90% of the digital data has been generated in the last 2 years

Unstructured data will grow 50 times faster than structured data

Patient monitoring equipment pumps out on average 1,000 readings per second, or 86.4 million readings a day

Data is getting more social...

20M articles on Wikipedia, 30B pieces of Facebook content are shared monthly

There are 156M public blogs, 12 terabytes of tweets are generated every day

70 percent of physicians report that at least one of their patients is sharing health measurement data with them

Page 19: Tech natives 22042013_bartde_witte_watson_v01

Big Data: this is just the beginning

[Chart: Volume in Exabytes (axis 3000 to 9000) and Percentage of uncertain data (axis 0 to 100), 2010 to 2015, for Sensors & Devices, VoIP, Enterprise Data and Social Media. Marker: “You are here”.]

Source: IBM Global Technology Outlook - 2012

Page 20: Tech natives 22042013_bartde_witte_watson_v01

Healthcare industry is beset with some of the most complex information challenges we collectively face

“Medicine has become too complex. Only about 20% of the knowledge clinicians use today is evidence-based.”

Steven Shapiro, Chief Medical & Scientific Officer, UPMC

Page 21: Tech natives 22042013_bartde_witte_watson_v01

Watson in Healthcare Oncology Advisor

Page 22: Tech natives 22042013_bartde_witte_watson_v01

NEJM Medical Concept Annotations – Attribute extractions

Medications

Symptoms

Diseases

Modifiers

Page 23: Tech natives 22042013_bartde_witte_watson_v01

Putting the proper pieces together at the point of impact can be life changing

Symptoms: difficulty swallowing, dizziness, anorexia, fever, dry mouth, thirst, frequent urination; no abdominal pain, no back pain, no cough, no diarrhea

Family History: Graves’ Disease, Oral cancer, Bladder cancer, Hemochromatosis, Purpura

Patient History: frequent UTI, cutaneous lupus, hyperlipidemia, osteoporosis, hypothyroidism

Medications: pravastatin, Alendronate, levothyroxine, hydroxychloroquine

Findings: supine 120/80 mm HG, urine dipstick: leukocyte esterase, urine culture: E. Coli, heart rate: 88 bpm

Diagnosis Models (Confidence): UTI, Diabetes, Influenza, Hypokalemia, Renal Failure, (Thyroid Autoimmune), Esophagitis
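
Before pieces like these can be put together, the concepts have to be extracted from free text and tagged with attributes such as negation. A minimal dictionary-based sketch follows; it is an assumption for illustration, not the annotator used in Watson. The lexicon is a hypothetical hand-made list seeded with terms from this slide, and negation is detected only by a leading "no".

```python
import re

# Hypothetical lexicon: category -> terms (taken from the slide for illustration).
LEXICON = {
    "Symptoms": ["difficulty swallowing", "dizziness", "anorexia", "fever",
                 "dry mouth", "thirst", "frequent urination", "abdominal pain",
                 "back pain", "cough", "diarrhea"],
    "Medications": ["pravastatin", "alendronate", "levothyroxine",
                    "hydroxychloroquine"],
    "Diseases": ["cutaneous lupus", "hyperlipidemia", "osteoporosis",
                 "hypothyroidism"],
}


def annotate(text):
    """Return (term, category, negated) tuples in order of appearance."""
    lowered = text.lower()
    found = []
    for category, terms in LEXICON.items():
        for term in terms:
            # An optional leading "no" acts as a negation modifier.
            pattern = r"\b(no\s+)?" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, lowered):
                found.append((match.start(), term, category, match.group(1) is not None))
    return [(term, category, negated) for _, term, category, negated in sorted(found)]


note = ("Patient reports fever and thirst, no cough, no diarrhea; "
        "takes levothyroxine for hypothyroidism.")
annotations = annotate(note)
```

Keeping the negation flag next to each concept matters because "no cough" and "cough" are opposite pieces of evidence for a diagnosis model.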

Page 24: Tech natives 22042013_bartde_witte_watson_v01

Watson in Healthcare Project Goals

Build an intelligence engine to provide patient-specific diagnostic test and treatment recommendations

Provide actionable treatment recommendations

Built on the cognitive computing technologies developed in Watson by IBM Research

Developed and Trained in collaboration with partners who are experts in their domain

Page 25: Tech natives 22042013_bartde_witte_watson_v01

Working Together to Beat Cancer

Cancer is an insidious disease and the second highest cause of death

1 in 4 individuals will die from cancer

20% of cancer cases receive the wrong diagnosis initially, with some as high as 44%

$263.8B overall costs of cancer in the US in 2010

3X: rate cancer cost climbs vs. std. health costs, or 15-18% / yr.

Source: American Cancer Society, National Health Institute

Page 26: Tech natives 22042013_bartde_witte_watson_v01

Creating a Corpus of Knowledge for Cancer Care

Ingestion of NCCN guidelines for breast cancer and lung cancer:

Roughly 500,000 unique combinations of breast cancer patient attributes.

Roughly 50,000 unique combinations of lung cancer patient attributes.

Over 600,000 pieces of evidence ingested, from 42 different publications/publishers, including:

The Breast Journal, National Comprehensive Cancer Network (Clinical Practice Guidelines, Drug and Biologics compendium, et al.), American Journal Of Hematology, Annals Of Neurology, CA: A Cancer Journal For Clinicians, Cancer Journal, Cochrane, EBSCO, Hematological Oncology, Hepatology, International Journal Of Cancer, Journal Of Gene Medicine, Journal of Clinical Oncology, Journal of Oncology Practice, Massachusetts Medical Society Journal Watch, Massachusetts Medical Society New England Journal Of Medicine, Merck, Nephrology, UptoDate, Clinical Lung Cancer, Current Problems in Cancer, Cancer Treatment Reviews, Elsevier's Monographs in Cancer (multiple), Clinical Breast Cancer, European Journal of Cancer, Lung Cancer (the journal).

Watson has received 14,700 hours of training from clinicians

Accurate: in the cases run, it is 90% accurate; the goal is 100% accuracy. Today physicians are about 50% accurate.

Page 32: Tech natives 22042013_bartde_witte_watson_v01

Watson’s Five Core Capabilities

Analyzes large volumes of unstructured and structured data

Combines large amounts of unstructured data with structured data to be analyzed together

Interprets and understands natural language questions

Understands ambiguous and imprecise questions using sophisticated natural language algorithms

Generates and evaluates hypotheses and quantifies confidence in answers

Identifies many answers to questions with evidence to "explain" rationale for answers

Supports iterative dialogue to refine results

Enables iterative and interactive question and answering to refine and improve results

Adapts and learns to improve results over time

Learns from additional evidence, additional questions and mistakes to improve accuracy over time

Page 33: Tech natives 22042013_bartde_witte_watson_v01

“The application of what we know will have a bigger impact than any drug or technology likely to be introduced in the next decade.”

Sir Muir Gray, Director NHS National Knowledge Service & NHS Chief Knowledge Officer
