MSIM 111 Session 5 (IBM Watson by Armen Pischdotchian)
-
2015 IBM Corporation
IBM Watson and Old Dominion University
Watson from DeepQA to Deep Learning
By: Armen Pischdotchian
-
Agenda
About cognitive systems
The statistics behind DeepQA
The DeepQA Pipeline in Detail
From DeepQA to Deep Learning
-
About Cognitive Systems
-
What is common amongst cognitive systems
The three L's:
Language: are you leveraging an NLP stack?
Levels: do you score or rank returned responses?
Learning: do you employ machine learning technologies?
Coming soon to the three L's is the fourth L:
Limbs: robotics
-
Natural Language Processing Challenges
-
Deterministic vs. Probabilistic Systems
-
Linear Regression
Logistic Regression
-
NLP terminology
-
When recall is more important than precision
5 relevant documents (red fish)
5 irrelevant documents (blue fish)
The search has retrieved 3 relevant documents out of a total of 5 relevant documents from the corpus and 1 irrelevant document.
Recall = 3 / 5 = 0.6
Precision = 3 / 4 = 0.75 (the blue fish that were not retrieved are not part of the equation at all).
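The arithmetic on this slide can be sketched in a few lines of Python. The document ids below are hypothetical stand-ins for the red and blue fish; only the counts come from the slide.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a collection of retrieved document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ids: red fish are relevant, blue fish are irrelevant.
relevant = {"red1", "red2", "red3", "red4", "red5"}
retrieved = {"red1", "red2", "red3", "blue1"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision}, recall={recall}")
```

The four blue fish left in the sea never enter either ratio, which is why neither measure rewards a system for correctly ignoring irrelevant documents.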
These images are from www.lucidata.inc
-
The case of 100% recall and low precision
5 relevant documents (red fish)
5 irrelevant documents (blue fish)
In Watson Discovery Advisor, this is the preferred scenario even though there may be some irrelevant documents with a high score.
The algorithm team will then work on increasing the precision of this system.
What would be the preferred outcome for the Watson Engagement Advisor?
-
The case of 100% precision and low recall
5 relevant documents (red fish)
5 irrelevant documents (blue fish)
Zero false positives, 100% precision
No blue fish in the net
But there are many false negatives
Many red fish in the sea
There are potentially many relevant documents that we will never consider.
Perfect precision with poor recall is of no value to a DeepQA system.
These images are from www.lucidata.inc
-
Precision and accuracy in Jeopardy!
-
Stage 2: Hypothesis Generation - Precision vs. Percentage attempted
Copyright 2010, Association for the Advancement of Artificial Intelligence. All rights reserved. ISSN 0738-4602
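A precision vs. percentage attempted curve can be sketched as follows, assuming the system answers only the questions where it is most confident. The confidence values and correctness flags are made-up illustrative data, and the function name is hypothetical.

```python
def precision_at_percent_attempted(results, percent):
    """results: list of (confidence, is_correct); percent in (0, 100].

    Answer only the top `percent` of questions by confidence and
    report the fraction of those answers that were correct.
    """
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    attempted = max(1, round(len(ranked) * percent / 100))
    answered = ranked[:attempted]
    correct = sum(1 for _, is_correct in answered if is_correct)
    return correct / attempted


# Made-up data: one (confidence, was the answer correct?) pair per question.
results = [(0.95, True), (0.80, True), (0.65, False), (0.40, True), (0.10, False)]

for percent in (40, 100):
    print(percent, precision_at_percent_attempted(results, percent))
```

When confidence is well calibrated, precision is higher at low percentages attempted and falls as the system is forced to answer more questions.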
-
Search Engine vs. Question Answering System
A QA system demands more processing from the system and less analysis from the user compared to a search engine.
-
The DeepQA Pipeline
-
An example Jeopardy! question
IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE
Question Analysis
Keywords: 1698, comet, paramour, pink
AnswerType(comet discoverer)
Date(1698)
Took(discoverer, ship)
Called(ship, Paramour Pink)
Primary Search
Related Content (Structured & Unstructured)
Candidate Answer Generation
Wilhelm Tempel
HMS Paramour
Isaac Newton
Halley's Comet
Pink Panther
Christiaan Huygens
Peter Sellers
Edmond Halley
Evidence Retrieval
Evidence Scoring
[0.58 0 -1.3 0.97] [0.71 1 13.4 0.72] [0.12 0 2.0 0.40] [0.84 1 10.6 0.21]
[0.33 0 6.3 0.83] [0.21 1 11.1 0.92] [0.91 0 -8.2 0.61] [0.91 0 -1.7 0.60]
Spatial, Temporal, Lexical, Taxonomic
Models
Merging & Ranking
1) Edmond Halley (0.85)
2) Christiaan Huygens (0.20)
3) Peter Sellers (0.05)
-
How Watson responds to a Question
[Diagram: Question; Question Analysis; Primary Search (Wikipedia etc.); Candidate Answer Generation; Answer Scoring; Evidence Retrieval; Contextual Answer Scoring; Final Merging and Ranking (Trained Models); Answer, Confidence, Evidence]
-
Question Analysis (QA) Overview
What is Question Analysis?
Question Analysis is the first stage in the Watson pipeline
Ultimate goal: Understand what is being asked
Various algorithms and technologies to identify as much as possible about the input question
Named Entity Detection
Natural Language Processing (NLP)
Shallow and Deep Semantic Relation Detection
All downstream components rely on the annotations produced by QA
-
Stage 1: Question Analysis
Question analysis technologies include:
Part of speech parsing technology
Named Entity Detection
Relation Extraction
Inverse Document Frequency (IDF)
-
Stage 2: Hypothesis Generation
[Diagram: Question; Question Analysis; Primary Search]
-
Stage 2: Hypothesis Generation - Primary search
Who is the 44th President of the United States?
[Diagram: Question; Question Analysis; Primary Search]
Keywords: 44th President United States
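As a rough illustration of how a question could be reduced to search keywords, here is a minimal sketch that simply drops stop words. The stop-word list and function name are assumptions made for this example; Watson's question analysis relies on a much richer NLP stack.

```python
import re

# Tiny illustrative stop-word list (an assumption, not Watson's actual list).
STOP_WORDS = {"who", "what", "is", "the", "of", "a", "an", "in", "on"}


def extract_keywords(question):
    """Split the question into word tokens and drop stop words."""
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    return [t for t in tokens if t.lower() not in STOP_WORDS]


keywords = extract_keywords("Who is the 44th President of the United States?")
print(keywords)
```

The surviving tokens are what a primary search would send to the underlying search engine.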
-
Stage 2: Hypothesis Generation - Candidate Answer Gen
Who is the 44th President of the United States?
[Diagram: Question; Question Analysis; Primary Search; Candidate Answer Generation]
Barack Obama
George W. Bush
Harvard Law School
Illinois
-
Stage 3: Hypothesis Scoring
What is Hypothesis Scoring?
Enumeration of annotators responsible for scoring previously generated candidate answers
The results produced by these scorers are ranked by the Merging and Ranking components to produce a ranked list of answers.
Outcome: a confidence level of a generated hypothesis
Scorers can produce results in any (reasonable) range
In the final merging step, scorers are normalized according to how well their scoring heuristic correlates to the correct answer
Normalized to [0..1] in final merging
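To show what bringing a scorer's output into [0..1] can look like, here is a minimal sketch using plain min-max scaling. This is a swapped-in simplification: it only rescales the range and does not model how well a scorer correlates with correct answers, which the slide describes as the basis of the real normalization. The raw values are illustrative.

```python
def min_max_normalize(raw_scores):
    """Rescale raw scorer outputs linearly so they fall in [0, 1]."""
    lo, hi = min(raw_scores), max(raw_scores)
    if hi == lo:
        # All candidates scored the same; no ranking signal to preserve.
        return [0.0 for _ in raw_scores]
    return [(s - lo) / (hi - lo) for s in raw_scores]


# Illustrative raw outputs of one scorer over four candidate answers.
raw = [-1.3, 13.4, 2.0, 10.6]
normalized = min_max_normalize(raw)
print(normalized)
```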
-
Hypothesis Scoring - components
[Diagram: Question; Question/Topic Analysis; Hypothesis Generation; Hypothesis & Evidence Scoring; Final Merging & Ranking (Trained Models); Answer, Confidence, Evidence]
Hypothesis & Evidence Scoring
Hypotheses; Evidence Features
Textual Alignment
Term and nGram Matching
Logical Form Analysis
. . .
-
AnswerIdf scorer
Context Independent scorer
Uses concept referred to as Inverse Document Frequency
Ratio of total documents versus documents containing target text
Target text = candidate answer text
Large corpus (e.g., Wikipedia)
Lucene formula
Log scale
Scores in range (0..inf)
Higher score indicates more informativeness (answer text appears in few documents)
Example
10,000 documents
Answer text appears in only 10 documents
Log (10,000 / 10) = Log (1,000) = 3
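The worked example can be reproduced with a short sketch. It follows the plain base-10 log ratio used in the slide's example rather than any specific Lucene formula, and the function name is hypothetical.

```python
import math


def answer_idf(total_docs, docs_with_answer):
    """Base-10 log of (total documents / documents containing the answer text)."""
    if docs_with_answer <= 0:
        raise ValueError("answer text must appear in at least one document")
    return math.log10(total_docs / docs_with_answer)


score = answer_idf(10_000, 10)
print(score)

# An answer that appears in many documents is less informative.
common = answer_idf(10_000, 5_000)
print(common)
```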
-
Textual Alignment Answer Scorer
Surface similarity measurement
Question
Supporting passage
Dynamic programming for subsequence alignment
Consider the following example:
Who led the Allied forces on the European front during World War 2?
Dwight D. Eisenhower was supreme commander of Allied forces during the D-Day invasion and European front during World War 2.
--Overlap is significant
Now, consider the example:
In 1698, what comet discoverer took a ship called the Paramour Pink on the first purely scientific sea voyage?
Edmund Halley made probably the first primarily scientific voyage to study the variation of the magnetic compass
--Fewer textual overlaps, likely with lower IDF scores
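One simple form of dynamic programming for subsequence alignment is the longest common subsequence (LCS) over word tokens. The sketch below uses it as a stand-in for the scorer, which is an assumption: the actual alignment algorithm and its scoring are not specified on the slide. It scores the two examples by the fraction of question tokens that align, in order, with the passage.

```python
import re


def tokens(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def alignment_score(question, passage):
    """Fraction of question tokens aligned, in order, with the passage."""
    q, p = tokens(question), tokens(passage)
    return lcs_length(q, p) / len(q) if q else 0.0


score_ww2 = alignment_score(
    "Who led the Allied forces on the European front during World War 2?",
    "Dwight D. Eisenhower was supreme commander of Allied forces during "
    "the D-Day invasion and European front during World War 2.",
)
score_halley = alignment_score(
    "In 1698, what comet discoverer took a ship called the Paramour Pink "
    "on the first purely scientific sea voyage?",
    "Edmund Halley made probably the first primarily scientific voyage "
    "to study the variation of the magnetic compass",
)
print(score_ww2, score_halley)
```

The first pair scores higher than the second, matching the slide's observation that surface overlap is significant in one case and sparse in the other, even though both passages support a correct answer.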
-
Who is the 44th President of the United States?
[Diagram: Question; Question Analysis; Search; Scoring; Contextual Answer Scoring]
Barack Obama is the 44th President of the United States
George W. Bush is the 44th President of the United States
Harvard Law School is the 44th President of the United States
Illinois is the 44th President of the United States
Barack Hussein Obama II (born August 4, 1961) is the 44th and current President of the United States.
George Walker Bush (born July 6, 1946) is an American politician who served as the 43rd President of the United States from 2001 to 2009 and the 46th Governor of Texas from 1995 to 2000.
Barack Obama .95
George W. Bush .80
Harvard Law School .05
Illinois .10
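The idea of substituting each candidate into the question and checking the resulting statement against a retrieved passage can be sketched as follows. The substitution rule and the token-overlap score are naive assumptions made for illustration; they are not the scorers Watson uses and are not meant to reproduce the scores on the slide.

```python
import re


def token_set(text):
    """Set of lowercase word tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def make_hypothesis(question, candidate):
    """Naively replace a leading 'Who' with the candidate answer."""
    statement = re.sub(r"^\s*who\b", lambda m: candidate, question,
                       flags=re.IGNORECASE)
    return statement.rstrip("?")


def support(hypothesis, passage):
    """Fraction of hypothesis tokens that also occur in the passage."""
    h = token_set(hypothesis)
    return len(h & token_set(passage)) / len(h) if h else 0.0


question = "Who is the 44th President of the United States?"
obama_passage = ("Barack Hussein Obama II (born August 4, 1961) is the 44th "
                 "and current President of the United States.")
bush_passage = ("George Walker Bush (born July 6, 1946) is an American "
                "politician who served as the 43rd President of the United "
                "States from 2001 to 2009 and the 46th Governor of Texas "
                "from 1995 to 2000.")

obama_support = support(make_hypothesis(question, "Barack Obama"), obama_passage)
bush_support = support(make_hypothesis(question, "George W. Bush"), bush_passage)
print(obama_support, bush_support)
```

The Bush hypothesis loses support because "44th" never appears in his passage, which is the kind of contextual signal this stage is meant to capture.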
-
Stage 4: Final Merger and Ranking
[Diagram: Question; Question Analysis; Primary Search (Wikipedia etc.); Candidate Answer Generation; Answer Scoring; Contextual Answer Scoring; Final Merging and Ranking (Trained Models); Answer, Confidence, Evidence]
-
Challenge: Heterogeneous feature types and values
-
Stage 4: Final Merger and Ranking - confidence scoring
Who is the 44th President of the United States?
[Diagram: Evidence Retrieval]
Candidate Answer      Answer Scoring   Contextual Answer Scoring   Confidence
Barack Obama          0.90             0.90                        0.95
George W. Bush        0.90             0.80                        0.65
Harvard Law School    0.10             0.05                        0.05
Illinois              0.15             0.10                        0.10
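One way to picture the final merging step is a logistic-regression-style combiner that turns the two scores for each candidate into a single confidence. The sketch below assumes that form; the weights and bias are made-up values, not trained model parameters, so the resulting confidences illustrate the ranking rather than reproduce the numbers in the table.

```python
import math


def confidence(features, weights, bias):
    """Logistic function applied to a weighted sum of scorer outputs."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


# (answer score, contextual answer score) per candidate, from the table.
candidates = {
    "Barack Obama": (0.90, 0.90),
    "George W. Bush": (0.90, 0.80),
    "Harvard Law School": (0.10, 0.05),
    "Illinois": (0.15, 0.10),
}

# Hypothetical parameters; a real system would learn these from training data.
WEIGHTS = (3.0, 5.0)
BIAS = -4.0

ranked = sorted(
    candidates,
    key=lambda name: confidence(candidates[name], WEIGHTS, BIAS),
    reverse=True,
)
for name in ranked:
    print(name, round(confidence(candidates[name], WEIGHTS, BIAS), 3))
```

Giving the contextual score a larger weight is what separates the two candidates that tie on answer scoring.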
-
Watson is Deep Learning
-
University of Texas Watson university competition demo
-
Watson is going Deep Learning