Smart Data Webinar: Deep QA (Question/Answer) - Lessons from Watson and Jeopardy!

Copyright (c) 2016 by STORM Insights Inc. All Rights reserved. 7/17/2015 Deep QA (Question/Answer) Lessons From Watson and Jeopardy! October 13, 2016 Adrian Bowles, PhD Founder, STORM Insights, Inc. [email protected]

Upload: dataversity

Post on 13-Apr-2017


Deep QA (Question/Answer) - Lessons From Watson and Jeopardy!

October 13, 2016

Adrian Bowles, PhD
Founder, STORM Insights, Inc.

[email protected]

Deep Question/Answering - Lessons from Watson & Jeopardy!

The Game
The Challenge
Scope of the problem

DeepQA Architecture & Processes

Software, Hardware & Resources

Next Steps

Playing the Game:

Answers must be given in the form of a question
Last contestant to answer correctly chooses the next question
Correct responses must satisfy the demands of both the clue and the category

Jeopardy
Six categories, 5 questions for each category, $100-500 based on difficulty

Double Jeopardy
Six categories, 5 questions for each category, $200-1,000 based on difficulty, and 3 hidden questions allow the person who chooses them to bet everything they have at that point in the game

Final Jeopardy
Player must have a positive balance from the previous round to play
Players see the category and then decide - secretly - how much to wager
The question is presented
30 seconds to answer

Wikipedia, The Free Encyclopedia. October 12, 2016, 02:40 UTC. Available at: https://en.wikipedia.org/w/index.php?title=Jeopardy!&oldid=743931483. Accessed October 12, 2016.

Challenges of Jeopardy! for Machines:

Open domain, broad use of language - Jeopardy! questions often involve puns, ambiguity…
IBM reviewed a sample of 20,000 questions and found 2,500 distinct lexical answer types (LATs)
No single LAT accounted for more than 3% of the total
For each category, there could be thousands of questions
Best players provide correct answers ~85% of the time
Best players know what they don’t know - base their bets on their confidence
~3 seconds to answer questions

Constraint: Players may only use the data/knowledge they have on arrival - no lifelines, resources…

Winning Jeopardy! requires a contestant to answer ~70% of the questions, with 80%+ precision.
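The trade-off between how many questions are attempted and how precise the answers are can be sketched in a few lines. This is only an illustration: the `results` data and the `precision_at_threshold` helper are hypothetical, not Watson figures. The idea is that a player who buzzes only when confidence clears a threshold attempts fewer clues but gets a larger share of them right.

```python
from typing import List, Tuple


def precision_at_threshold(
    results: List[Tuple[float, bool]], threshold: float
) -> Tuple[float, float]:
    """Return (fraction_attempted, precision) when answering only clues
    whose confidence is >= threshold.

    `results` holds (confidence, was_correct) pairs; the data is made up.
    """
    attempted = [correct for conf, correct in results if conf >= threshold]
    if not attempted:
        return 0.0, 0.0
    fraction_attempted = len(attempted) / len(results)
    precision = sum(attempted) / len(attempted)
    return fraction_attempted, precision


# Ten hypothetical clues: (confidence, was the top answer correct?)
results = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, True), (0.75, False),
    (0.70, True), (0.60, True), (0.40, False), (0.30, True), (0.10, False),
]

for threshold in (0.0, 0.5, 0.8):
    attempted, precision = precision_at_threshold(results, threshold)
    print(f"threshold={threshold:.1f} attempted={attempted:.0%} precision={precision:.0%}")
```

With this made-up data, raising the threshold from 0.0 to 0.5 drops the attempted share from 100% to 70% while precision rises from 70% to about 86%.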

Predicting lexical answer types in open domain question and answering (qa) systems US 20130035931 A1 2013, Ferrucci, Gliozzo, Kalyanpur

Business Constraints: Quality, Speed, Cost

Jeopardy! Constraints: Precision, Speed, Confidence

Formalizing the Decision-Making Process

Boyd’s Loop, John Boyd (1927-1997): Observe, Orient, Decide, Act

Continuous Learning:
Look for Similar Solved Problems
Accept or Create Problem Statement
Generate Hypotheses
Identify Evidence in Corpus
Score Evidence
Score Hypotheses
Present Results
Get Feedback
Train Model

World Model

Automating QA

Machine Learning, NLU, NLG, Information Retrieval, Reasoning, Knowledge Representation

Evidence, Gather, Decide, Evaluate, Weigh, Generate Hypotheses

What they didn’t do:

Build a database of question/answer pairs
Build a formal model of the world
Build a search engine

What they did:

DeepQA - “a massively parallel probabilistic evidence-based architecture.”*

Develop reusable NLU tech to analyze text
Analyze sources - structured and unstructured - to capture background knowledge
Apply knowledge representation and reasoning (KRR) to the resulting structured knowledge
Use machine learning to generate and score hypotheses

* Building Watson: An Overview of the DeepQA Project, AI Magazine, Fall 2010 Issue, Ferrucci, Brown, Chu-Carroll, Fan, Gondek, Kalyanpur, Lally, Murdock, Nyberg, Prager, Schlaefer, Welty.

Massively Parallel Probabilistic Evidence-based Architecture

Content Acquisition
Building the corpus

For Jeopardy! this had to be completed before the game commenced.
Ingested encyclopedias, dictionaries, thesauri, newswire articles, literary works, databases, taxonomies, ontologies…

IRL, we can identify and use new resources based on the problem at hand.

Question Analysis
What is being asked?

Question classification:
Any words with double meanings?
Puzzle question, factoid…?

Detect: focus, LAT, relations
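LAT detection can be pictured with a toy heuristic. The sketch below is an assumption for illustration only, not Watson's method: it treats the single word following "this" or "these" as the lexical answer type, and the `guess_lat` name is hypothetical. It is tried on two clues from this deck.

```python
import re
from typing import Optional

# Simplifying assumption: the LAT is the single word right after
# "this" or "these". Real clues need far richer parsing.
_LAT_PATTERN = re.compile(r"\b(?:this|these)\s+([A-Za-z]+)", re.IGNORECASE)


def guess_lat(clue: str) -> Optional[str]:
    """Return a guessed lexical answer type, or None if the pattern is absent."""
    match = _LAT_PATTERN.search(clue)
    return match.group(1).lower() if match else None


print(guess_lat("Chile shares its longest land border with this country."))
# country
print(guess_lat(
    "They're the two states you could be reentering "
    "if you're crossing Florida's northern border."
))
# None
```

The second clue shows the limit of the heuristic: the answer type ("states") is there, but not in a "this ..." phrase, so a pattern this simple misses it.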

Relation-detection

“They’re the two states you could be reentering if you’re crossing Florida’s northern border.”

Category: Head North

borders(Florida, ?x, north)
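Once a relation like the one above has been detected, it can be answered by lookup against structured knowledge. The following is a minimal sketch under stated assumptions: the `Border` record, the hand-built `FACTS` table, and the `borders` function are all hypothetical stand-ins for a real knowledge source, with `None` playing the role of the unbound variable.

```python
from typing import List, NamedTuple, Optional


class Border(NamedTuple):
    region: str
    neighbor: str
    direction: str  # side of `region` on which `neighbor` lies


# Tiny hand-built fact table (an assumption for illustration).
FACTS = [
    Border("Florida", "Georgia", "north"),
    Border("Florida", "Alabama", "north"),
    Border("Georgia", "Tennessee", "north"),
    Border("Georgia", "Florida", "south"),
]


def borders(region: str, neighbor: Optional[str], direction: str) -> List[str]:
    """Answer borders(region, ?x, direction); neighbor=None leaves ?x unbound."""
    return [
        fact.neighbor
        for fact in FACTS
        if fact.region == region
        and fact.direction == direction
        and (neighbor is None or fact.neighbor == neighbor)
    ]


print(borders("Florida", None, "north"))  # ['Georgia', 'Alabama']
```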

Hypothesis Generation & Scoring

Pair each candidate answer with the question, then try to prove it correct with a degree of confidence supported by the evidence.

Scoring may use a variety of relationships:

temporal
spatial
geospatial
taxonomic classification
correlation between candidate and question…
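Scoring a candidate along several kinds of relationship can be sketched as a set of independent scorers, each returning a value between 0 and 1. Everything named here is hypothetical: the lookup sets stand in for real evidence sources, and the two scorers cover only the taxonomic and geospatial cases for the clue "Chile shares its longest land border with this country."

```python
from typing import Callable, Dict

# Hypothetical lookup tables standing in for real evidence sources.
KNOWN_COUNTRIES = {"Argentina", "Bolivia", "Peru"}
BORDERS_CHILE = {"Argentina", "Bolivia", "Peru"}


def taxonomic_score(candidate: str) -> float:
    """1.0 if the candidate matches the answer type 'country', else 0.0."""
    return 1.0 if candidate in KNOWN_COUNTRIES else 0.0


def geospatial_score(candidate: str) -> float:
    """1.0 if the candidate shares a land border with Chile, else 0.0."""
    return 1.0 if candidate in BORDERS_CHILE else 0.0


SCORERS: Dict[str, Callable[[str], float]] = {
    "taxonomic": taxonomic_score,
    "geospatial": geospatial_score,
}


def score_candidate(candidate: str) -> Dict[str, float]:
    """Run every scorer and keep the scores separate, one per dimension."""
    return {name: scorer(candidate) for name, scorer in SCORERS.items()}


for candidate in ("Argentina", "Santiago"):
    print(candidate, score_candidate(candidate))
```

Keeping the scores separate rather than summing them right away leaves the weighting of each dimension to a later step.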

Evaluating Potential Answers

Watson scores evidence in multiple dimensions

What works for a factoid question may not work for a puzzle question.

“Chile shares its longest land border with this country.”

Merging & Ranking

Identifying the most likely answer based on confidence scores.

Answer scores are merged before ranking and confidence estimation.

Uses a machine learning approach, comparing against training-set data, when confidence scores in different categories are “too close to call.”
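The merge-then-rank order described above can be sketched as follows. This is an illustration under assumptions, not the DeepQA implementation: the feature names, `WEIGHTS`, and `BIAS` are made up (a trained system would learn them from training data), equivalent answers are detected only by case and whitespace, and a logistic function is used here simply as one common way to turn a weighted sum into a confidence.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical weights; a trained system would learn these from data.
WEIGHTS = {"taxonomic": 2.0, "geospatial": 1.5, "popularity": 0.5}
BIAS = -2.0


def canonical(answer: str) -> str:
    """Very rough answer normalization (assumption: case/whitespace only)."""
    return answer.strip().lower()


def merge(candidates: List[Tuple[str, Dict[str, float]]]) -> Dict[str, Dict[str, float]]:
    """Merge equivalent answers, keeping the maximum score per feature."""
    merged: Dict[str, Dict[str, float]] = defaultdict(dict)
    for answer, features in candidates:
        slot = merged[canonical(answer)]
        for name, value in features.items():
            slot[name] = max(slot.get(name, 0.0), value)
    return dict(merged)


def confidence(features: Dict[str, float]) -> float:
    """Logistic combination of feature scores into a 0..1 confidence."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def rank(candidates: List[Tuple[str, Dict[str, float]]]) -> List[Tuple[str, float]]:
    """Merge first, then estimate confidence and sort, highest first."""
    merged = merge(candidates)
    scored = [(answer, confidence(features)) for answer, features in merged.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = [
    ("Argentina", {"taxonomic": 1.0, "geospatial": 0.9}),
    ("argentina ", {"popularity": 0.4}),
    ("Bolivia", {"taxonomic": 1.0, "geospatial": 0.6, "popularity": 0.8}),
    ("Santiago", {"popularity": 0.9}),
]

for answer, conf in rank(candidates):
    print(f"{answer}: {conf:.2f}")
```

Merging before ranking means the two spellings of the same answer pool their evidence instead of competing with each other.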

Wikipedia, The Free Encyclopedia. October 12, 2016, 17:06 UTC. Available at: https://en.wikipedia.org/w/index.php?title=Watson_(computer)&oldid=744021754. Accessed October 12, 2016.

Software

Apache Hadoop
http://hadoop.apache.org

Apache UIMA - Unstructured Information Management Architecture
http://uima.apache.org

IBM DB2

Linux (SUSE Linux Enterprise Server 11)

Resources

WordNet(R) Princeton University "About WordNet." WordNet. Princeton University. 2010. <http://wordnet.princeton.edu>

Hardware

IBM Power 750
90 servers, 32 cores/server, 2,880 cores in 10 racks

16 TB RAM

~80 teraFLOPS

80,000,000,000,000 FLOPS
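The hardware figures are easy to cross-check with a little arithmetic. The snippet below uses only the numbers on the slide; the per-core figure it prints is a derived average, not a published specification.

```python
servers = 90
cores_per_server = 32

# Total core count across all servers.
total_cores = servers * cores_per_server
print(total_cores)  # 2880

# 1 teraFLOPS = 10**12 floating-point operations per second.
teraflops = 80
flops = teraflops * 10**12
print(f"{flops:,}")  # 80,000,000,000,000

# Rough average throughput per core, derived from the slide's numbers.
print(f"{flops / total_cores:.3e} FLOPS per core")
```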

Next Steps…

Presenter
Presentation Notes
We’re going to open it up for questions now, so I’ll turn it back to Shannon and leave you with my contact info and information about upcoming webinars in this series.

For more information:

[email protected]

Twitter: @ajbowles
Skype: ajbowles

Upcoming Webinar Dates & Topics

November 10: Emerging Hardware Choices for Modern AI Data Management
December 8: Leverage the IOT to Build a Smart Data Ecosystem

2017 Webinar Themes

Technology Trends
Market Trends

Communicating, Learning, Understanding, Reasoning, Planning
