tech natives 22042013_bartde_witte_watson_v01
IBM Watson
© 2013 International Business Machines Corporation
Watson from Jeopardy to Healthcare
BDent, Bart de Witte, MAppSc
Healthcare Industry Leader CEE / ALPS
April 2013 – Tech Natives Event, Wirtschaftskammer, Wien
Follow us @IBMWatson
Follow me @swisshealth20
watson - jeopardy
healthcare & data
watson in healthcare
Jeopardy
Broad/Open Domain
Complex Language
High Precision
Accurate Confidence
High Speed
Human Language
Words by themselves have no meaning
Only grounded in human cognition
Words navigate, align and communicate an infinite space of intended meaning
Computers cannot ground words to human experiences to derive meaning
Why Jeopardy?
Grand Challenge
The world is “dying of thirst in an ocean of data”
80% of the world’s data today is unstructured
90% of the world’s data was created in the last two years
20% is the amount of data traditional systems leverage today
Easy Question
(LN(12,546,798 * π))^2 / 34,576.46 = 0.00885
Select Payment where Owner = “David Jones” and Type (Product) = “Laptop”

Owner          Serial Number
David Jones    45322190-AK

Serial Number  Type     Invoice #
45322190-AK    LapTop   INV10895

Invoice #      Vendor   Payment
INV10895       MyBuy    $104.56
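The easy question can be sketched in a few lines of Python: the arithmetic is a direct evaluation, and the lookup is a join that follows the keys across the three tables on the slide. The table and column names below (owners, products, invoices and so on) are assumptions made for illustration; the slide shows only the column headings.

```python
import math
import sqlite3

# Evaluate the arithmetic expression from the slide (LN is the natural log).
value = math.log(12_546_798 * math.pi) ** 2 / 34_576.46

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE owners   (owner TEXT, serial_number TEXT);
    CREATE TABLE products (serial_number TEXT, type TEXT, invoice_no TEXT);
    CREATE TABLE invoices (invoice_no TEXT, vendor TEXT, payment TEXT);
    INSERT INTO owners   VALUES ('David Jones', '45322190-AK');
    INSERT INTO products VALUES ('45322190-AK', 'LapTop', 'INV10895');
    INSERT INTO invoices VALUES ('INV10895', 'MyBuy', '$104.56');
""")

# Follow the keys: Owner -> Serial Number -> Invoice # -> Payment.
row = conn.execute(
    """
    SELECT i.payment
    FROM owners o
    JOIN products p ON p.serial_number = o.serial_number
    JOIN invoices i ON i.invoice_no = p.invoice_no
    WHERE o.owner = ? AND lower(p.type) = lower(?)
    """,
    ("David Jones", "Laptop"),
).fetchone()
payment = row[0]

print(round(value, 5), payment)
```

Both answers are exact and need no interpretation of language, which is what makes this kind of question easy for a computer.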
Computer programs are natively explicit, fast and exacting in their calculation over numbers and symbols. But natural language is implicit, highly contextual, ambiguous and often imprecise.
Where was X born?
One day, from among his city views of Ulm, Otto chose a water color to send to Albert Einstein as a remembrance of Einstein's birthplace.
X ran this?
If leadership is an art then surely Jack Welch has proved himself a master painter during his tenure at GE.
Structured
Unstructured
Hard Question
Informed Decision Making: Search vs. Expert Q&A
Search Engine
Decision Maker: Has Question
Decision Maker: Distills to 2-3 Keywords
Search Engine: Finds Documents containing Keywords
Search Engine: Delivers Documents based on Popularity
Decision Maker: Reads Documents, Finds Answers
Decision Maker: Finds & Analyzes Evidence
Expert
Decision Maker: Asks NL Question
Expert: Understands Question
Expert: Produces Possible Answers & Evidence
Expert: Analyzes Evidence, Computes Confidence
Expert: Delivers Response, Evidence & Confidence
Decision Maker: Considers Answer & Evidence
Why is Jeopardy! so Difficult?
Answering complex natural language questions requires more than keyword evidence
Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India
Passage: In May, Gary arrived in India after he celebrated his anniversary in Portugal
Keyword hits: In May, India, celebrated, anniversary, Portugal; "arrived in" against "arrival in"
This evidence suggests “Gary” is the answer BUT the system must learn that keyword matching may be weak relative to other types of evidence
Legend: Keyword “Hit”, Reference Text, Answer, Weak evidence (Red Text)
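A minimal sketch of why keyword evidence misleads here: counting shared keywords ranks the "Gary" sentence well above a sentence about Vasco da Gama's landing in 1498. The stopword list and the tokenizer are assumptions made for this illustration, not part of Watson.

```python
import re

# Hypothetical stopword list, chosen only for this illustration.
STOPWORDS = {"in", "the", "of", "this", "s", "on", "he", "his", "after"}

def keywords(text: str) -> set:
    """Lowercase the text and return its non-stopword tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS

def keyword_overlap(question: str, passage: str) -> int:
    """Count the distinct keywords shared by the question and the passage."""
    return len(keywords(question) & keywords(passage))

question = ("In May 1898 Portugal celebrated the 400th anniversary "
            "of this explorer's arrival in India")
gary = ("In May, Gary arrived in India after he celebrated "
        "his anniversary in Portugal")
vasco = "On the 27th of May 1498, Vasco da Gama landed in Kappad Beach"

gary_score = keyword_overlap(question, gary)
vasco_score = keyword_overlap(question, vasco)
print(gary_score, vasco_score)
```

With these inputs the "Gary" passage shares five keywords with the question and the Vasco da Gama passage shares only one, so a pure keyword scorer prefers the wrong answer.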
What It Takes to compete against Top Human Jeopardy! Players
[Figure: Winning Human Performance, Grand Champion Human Performance and the 2007 QA Computer System. Each dot is an actual historical human Jeopardy! game. The axis runs from More Confident to Less Confident.]
Top human players are remarkably good.
Leveraging Algorithms for Deeper Evidence
Question: In May 1898 Portugal celebrated the 400th anniversary of this explorer’s arrival in India.
Passage: On the 27th of May 1498, Vasco da Gama landed in Kappad Beach
Temporal Reasoning (DateMatch): May 1898 and 400th anniversary against 27th May 1498
Statistical Paraphrasing: "arrival in" against "landed in"
GeoSpatial Reasoning: India against Kappad Beach
Stronger evidence can be much harder to find and score…
…and the evidence is still not 100% certain
Search far and wide
Explore many hypotheses
Find & judge evidence
Many inference algorithms
Legend: Temporal Reasoning, Statistical Paraphrasing, GeoSpatial Reasoning, Reference Text, Answer
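The temporal step can be sketched as simple date arithmetic: a 400th anniversary celebrated in 1898 points to an event in 1498, which is the year in the Vasco da Gama passage. The function names and regular expressions below are assumptions for illustration and do not reproduce Watson's actual DateMatch scorer.

```python
import re
from typing import List, Optional

def years_in(text: str) -> List[int]:
    """Return every standalone four-digit number in the text, read as a year."""
    return [int(y) for y in re.findall(r"\b(\d{4})\b", text)]

def anniversary_number(text: str) -> Optional[int]:
    """Return N from a phrase such as '400th anniversary', if present."""
    match = re.search(r"(\d+)(?:st|nd|rd|th) anniversary", text)
    return int(match.group(1)) if match else None

def date_match(question: str, passage: str) -> bool:
    """True if a passage year equals a question year minus the anniversary."""
    n = anniversary_number(question)
    if n is None:
        return False
    targets = {year - n for year in years_in(question)}
    return any(year in targets for year in years_in(passage))

question = ("In May 1898 Portugal celebrated the 400th anniversary "
            "of this explorer's arrival in India")
vasco = "On the 27th of May 1498, Vasco da Gama landed in Kappad Beach"
gary = ("In May, Gary arrived in India after he celebrated "
        "his anniversary in Portugal")

print(date_match(question, vasco), date_match(question, gary))
```

This is one narrow kind of evidence; paraphrase and geospatial scorers would each need their own resources, and none of them is certain on its own.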
Watson is a Massively Parallel Probabilistic Evidence-Based Architecture
DeepQA generates and scores many hypotheses using an extensible collection of Natural Language Processing, Machine Learning and Reasoning Algorithms. These gather and weigh evidence over both structured and unstructured content to determine the answer with the best confidence.
[Figure: DeepQA architecture. Inquiry, Inquiry/Topic Analysis, Inquiry Decomposition, Hypothesis Generation (Primary Search over Answer Sources, Candidate Answer Generation), Hypothesis and Evidence Scoring (Answer Scoring, Evidence Retrieval from Evidence Sources, Deep Evidence Scoring), Synthesis, Final Confidence Merging & Ranking, Responses with Confidence.]
Learned Models help combine and weigh the Evidence
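The final merging and ranking stage can be sketched as a weighted combination of evidence scores per candidate answer, turned into a confidence and then sorted. The logistic form, the weights, the bias and the feature values below are all assumptions for illustration; the slide says only that learned models help combine and weigh the evidence.

```python
import math

# Hypothetical weights and bias; a real system would learn these from data.
WEIGHTS = {"keyword_overlap": 0.5, "date_match": 2.0, "paraphrase": 1.5}
BIAS = -2.0

def confidence(features: dict) -> float:
    """Combine evidence scores into one confidence with a logistic function."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical evidence scores for two candidate answers.
candidates = {
    "Gary": {"keyword_overlap": 5 / 9, "date_match": 0.0, "paraphrase": 0.0},
    "Vasco da Gama": {"keyword_overlap": 1 / 9, "date_match": 1.0, "paraphrase": 1.0},
}

# Rank candidates from most to least confident.
ranked = sorted(
    ((name, confidence(feats)) for name, feats in candidates.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, conf in ranked:
    print(f"{name}: {conf:.2f}")
```

Under these assumed weights the candidate with strong keyword overlap but no deeper evidence ends up ranked below the candidate supported by date and paraphrase evidence.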
DeepQA: Incremental Progress in Answering Precision on the Jeopardy Challenge: 6/2007-11/2010
[Figure: precision curves for successive versions: Baseline 12/06, v0.1 12/07, v0.2 05/08, v0.3 08/08, v0.4 12/08, v0.5 05/09, v0.6 10/09, v0.7 04/10, v0.8 11/10.]
IBM Watson: Playing in the Winners Cloud
Healthcare & Data
Our Watson Healthcare strategy solves 3 problems in clinical practice
3 related problems
Medicine is a science but practiced as an art
The amount of untapped information that can be used as a source of knowledge is growing exponentially
Impossible to keep up and have access to existing knowledge
Our Watson Healthcare strategy solves 3 problems in clinical practice
Medicine is a science but practiced as an art
An estimated 30-40% of care in the UK is not based on available scientific evidence, Grol, R. and Grimshaw, J. (2003)
5-year gap between publication of guidelines and changes in routine practice in Western healthcare systems, Lomas et al. (1993)
1 out of 5 diagnoses are wrong
Unprecedented research commissioned by the EU has found that 23% of EU citizens have been a victim, or a member of a family who has been a victim, of a “serious medical error in a local hospital” or a “serious medical error from a medicine that was prescribed by a doctor”.
In all, only 17% of Austrians and Germans said that hospital patients were very likely or fairly likely to be able to avoid a serious medical error.
Our Watson Healthcare strategy solves 3 problems in clinical practice
Impossible to keep up and have access to existing knowledge
Medical knowledge doubles every five years
81% of the physicians in the US report spending 5 hours or less a month reading medical journals
Medicine has become too complex and only 20% of the knowledge clinicians use is evidence based
Our Watson Healthcare strategy solves 3 problems in clinical practice
The amount of untapped information that can be used as a source of knowledge is growing exponentially
16,000 hospitals worldwide collect data
80% of the data is unstructured and stored in hundreds of forms such as lab results, images and medical transcripts
Data will grow 800% over the next five years
90% of the digital data has been generated in the last 2 years
Unstructured data will grow 50 times faster than structured data
Patient monitoring equipment pumps out on average 1,000 readings per second, or 86.4 million readings a day
Data is getting more social...
20M articles on Wikipedia, 30B pieces of Facebook content are shared monthly
There are 156M public blogs, and 12 terabytes of tweets are generated every day
70 percent of physicians report that at least one of their patients is sharing health measurement data with them
Big Data: this is just the beginning
[Figure: Volume in Exabytes (3000 to 9000) and Percentage of uncertain data (0 to 100), 2010 to 2015, for Sensors & Devices, VoIP, Enterprise Data and Social Media, with a "You are here" marker.]
Source: IBM Global Technology Outlook - 2012
Healthcare industry is beset with some of the most complex information challenges we collectively face
“Medicine has become too complex. Only about 20% of the knowledge clinicians use today is evidence-based.”
Steven Shapiro, Chief Medical & Scientific Officer, UPMC
Watson in Healthcare Oncology Advisor
NEJM Medical Concept Annotations – Attribute extractions
Medications
Symptoms
Diseases
Modifiers
Putting the proper pieces together at the point of impact can be life changing
[Figure: Diagnosis Models ranked by Confidence, built up from Symptoms, Family History, Patient History, Medications and Findings.]
Symptoms: difficulty swallowing, dizziness, anorexia, fever, dry mouth, thirst, frequent urination; no abdominal pain, no back pain, no cough, no diarrhea
Family History: Graves’ Disease, Oral cancer, Bladder cancer, Hemochromatosis, Purpura
Patient History: frequent UTI, cutaneous lupus, hyperlipidemia, osteoporosis, hypothyroidism
Medications: pravastatin, Alendronate, levothyroxine, hydroxychloroquine
Findings: supine 120/80 mm Hg, urine dipstick: leukocyte esterase, urine culture: E. coli, heart rate: 88 bpm
Diagnosis Models: UTI, Diabetes, Influenza, Hypokalemia, Renal Failure, (Thyroid Autoimmune), Esophagitis
Watson in Healthcare Project Goals
Build an intelligence engine to provide patient-specific diagnostic test and treatment recommendations
Provide actionable treatment recommendations
Built on the cognitive computing technologies developed in Watson by IBM Research
Developed and trained in collaboration with partners who are experts in their domain
Working Together to Beat Cancer
Cancer is an insidious disease and the second highest cause of death
1 in 4 individuals will die from cancer
20% of cancer cases receive the wrong diagnosis initially, with some as high as 44%
$263.8B overall costs of cancer in the US in 2010
3X rate cancer cost climbs vs. std. health costs, or 15-18% / yr.
Source: American Cancer Society, National Health Institute
Creating a Corpus of Knowledge for Cancer Care
Ingestion of NCCN guidelines for breast cancer and lung cancer:
Roughly 500,000 unique combinations of breast cancer patient attributes.
Roughly 50,000 unique combinations of lung cancer patient attributes.
Over 600,000 pieces of evidence ingested, from 42 different publications/publishers, including:
The Breast Journal, National Comprehensive Cancer Network (Clinical Practice Guidelines, Drug and Biologics compendium, et al.), American Journal Of Hematology, Annals Of Neurology, CA: A Cancer Journal For Clinicians, Cancer Journal, Cochrane, EBSCO, Hematological Oncology, Hepatology, International Journal Of Cancer, Journal Of Gene Medicine, Journal of Clinical Oncology, Journal of Oncology Practice, Massachusetts Medical Society Journal Watch, Massachusetts Medical Society New England Journal Of Medicine, Merck, Nephrology, UptoDate, Clinical Lung Cancer, Current Problems in Cancer, Cancer Treatment Reviews, Elsevier's Monographs in Cancer (multiple), Clinical Breast Cancer, European Journal of Cancer, Lung Cancer (the journal).
Watson has received 14,700 hours of training from clinicians
Accurate: in the cases run, it's 90% accurate. The goal is 100% accuracy; today physicians are about 50% accurate.
Watson’s Five Core Capabilities
Analyzes large volumes of unstructured and structured data
Combines large amounts of unstructured data with structured data to be analyzed together
Interprets and understands natural language questions
Understands ambiguous and imprecise questions using sophisticated natural language algorithms
Generates and evaluates hypotheses and quantifies confidence in answers
Identifies many answers to questions with evidence to "explain" rationale for answers
Supports iterative dialogue to refine results
Enables iterative and interactive question and answering to refine and improve results
Adapts and learns to improve results over time
Learns from additional evidence, additional questions and mistakes to improve accuracy over time
“The application of what we know will have a bigger impact than any drug or technology likely to be introduced in the next decade.”
Sir Muir Gray, Director NHS National Knowledge Service & NHS Chief Knowledge Officer