TRANSCRIPT
AQUAINT R&D Program: Phase II Plans and Schedule
Dr. John D. Prange, AQUAINT Program Manager
http://www.ic-arda.org
2
“Providing answers to the questions analysts ask”
AQUAINT Program Goals
• Accept complex “Questions” in a form natural to the analyst
• Translate “Question” into multiple queries appropriate to the various data sets to be searched
• Find relevant information in distributed, multimedia, multilingual, multi-agency data sets
• Analyze, fuse and summarize information into a coherent “Answer”
• Provide “Answer” to analyst in the form they want
I have plenty of Questions; It’s Answers that I need!!
IC ANALYST
Still Valid !
4
AQUAINT: R&D Focused on Three Functional Components
[Flow diagram from QUESTION???? to FINAL ANSWER; recoverable labels follow]
Question Understanding and Interpretation
• Natural Statement of Question; Use of Multimedia Examples
• Question & Requirement Context; Analyst Background Knowledge
• Clarification; Other Analysts
• Query Assessment, Advisor, Collaboration
• Knowledge Bases; Technical Databases
• Question & Answer Context
Determine the Answer
• Queries; KB Queries; Multiple Source Specific Queries
• Translate Queries into Source Specific Retrieval Languages
• Multiple Sources; Multiple Media; Multi-Lingual; Multiple Agencies
• Partially Annotated & Structured Data; Automatic Metadata Creation
• Supplemental Use
• Multiple Ranked Lists; Single, Merged Ranked List of Relevant “Documents”
• Relevant “Documents”; Relevant “Knowledge”
• Relevant information extracted and combined where possible
• Accumulation of Knowledge across “Documents”
• Cross “Document” Summaries created
• Language/Media Independent Concept Representation
• Inconsistencies noted
• Proposed Conclusions and Inferences Generated
• Query Refinement based on Analyst Feedback
• Iterative Refinement of Results based on Analyst Feedback
Answer Formulation
• Results of Analysis; Proposed Answer; Answer Context
• Formulate Answer for Analyst in form they want
• Multimedia Navigation Tools for Analyst Review
• Analyst Feedback
FINAL ANSWER
Still Valid But . . .
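The step from “Multiple Ranked Lists” to a “Single, Merged Ranked List of Relevant Documents” in the diagram above can be illustrated with a minimal sketch. The slides do not say how the merge is done; reciprocal rank fusion is used here purely as a stand-in technique, and the function name, the constant k=60, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def merge_ranked_lists(ranked_lists, k=60):
    """Merge several ranked lists of document IDs into one ranked list.

    Uses reciprocal rank fusion: each document earns 1 / (k + rank) from
    every list it appears in. This is an illustrative choice, not a method
    taken from the AQUAINT slides.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical per-source result lists (one per source-specific query).
newswire = ["doc_a", "doc_b", "doc_c"]
broadcast = ["doc_b", "doc_d", "doc_a"]
knowledge_base = ["doc_b", "doc_c"]

merged = merge_ranked_lists([newswire, broadcast, knowledge_base])
print(merged)
```

A rank-based merge is convenient in this setting because scores returned by different retrieval engines are usually not comparable, while ranks are.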
6
AQUAINT: Separate, Coordinated Activities
[Diagram; recoverable labels follow]
AQUAINT Phase II Solicitation
• QUESTION????
• Question Understanding and Interpretation
• Determine the Answer: Information Retrieval Process; Analysis & Synthesis Process
• Answer Formulation
• FINAL ANSWER
• Cross Cutting/Enabling Technologies Research Issues
Separate Coordinated Activities
• Component Integration and System Architecture Issues
• Component Level / End-to-End Testing & Evaluation
• Annotated and ‘Ground Truthed’ Data
7
AQUAINT: Separate, Coordinated Activities
[Diagram; recoverable labels follow]
AQUAINT Phase I Solicitation
• QUESTION????
• Question Understanding and Interpretation
• Determine the Answer: Information Retrieval Process; Analysis & Synthesis Process
• Answer Formulation
• FINAL ANSWER
• Cross Cutting/Enabling Technologies Research Issues
Separate Coordinated Activities
• Component Integration and System Architecture Issues
• Component Level / End-to-End Testing & Evaluation
• Annotated and ‘Ground Truthed’ Data
Still Valid But . . .
8
Levels of Participation
• For Phase I, the AQUAINT Program supported three distinct levels of participation:
– Full QA Systems
– Emphasis on One or More QA System Components
– Focus on a Cross Cutting / Enabling Technology Area
• For Phase II
– All three levels will remain
FIRST AN OBSERVATION:
– Significant number & variety of Full QA Systems are emerging
SO FOR PHASE II
– System Component Proposals – Must include plans for integrating / testing / evaluating component(s) within a full system environment
– Cross Cutting / Enabling Technologies Proposals – Must STRONGLY justify value / importance to AQUAINT Program. Also, there would be significant value added if proposals include plans / ideas for being responsive to the larger AQUAINT R&D Community
9
Key Themes / Foci of AQUAINT Phase II
1. Significantly More Emphasis on Advanced Question Answering Scenarios
10
Implications of Scenarios
• Certainly Implies Increased Complexity of Questions & Answers . . .
• But Also Implies:
– Increased Complexity of the Question Answering User Environment
[Chart: axes “Complexity of Questions / Answers” and “Complexity of User Environment”; regions labeled Phase I and Phase II]
11
Intelligence Community & The Intelligence Cycle
1 Intelligence Requirements
2 Planning & Direction
3 Collection
4 Processing
5 Analysis
6 Reporting
7 Dissemination
12
The Intelligence Community and the World of Data
Five “V’s” of the Expanding World of Data
• Volume
• Velocity
• Volatility
• Variety (@ Signal / Media Level)
• Variability (@ Content Level)
Massive, Heterogeneous World of Data
IC Collection Sources: SIGINT; IMINT; HUMINT; MASINT; OSINT
16
Types of Intelligence
• Current Intelligence
– Addresses day-to-day events
– Seeks to:
  • Apprise consumers of new developments and related background
  • Assess significance
  • Warn of near-term consequences
  • Signal potentially dangerous situations in the future
– Presented in regular publications (daily, weekly, etc.), in ad hoc memorandums/messages and oral briefings to senior officials
• Estimative Intelligence
– Deals with what might be or what might happen; starts with analysis of available facts then migrates into the unknown or even unknowable
– Goal: Help policy makers / decision makers to navigate the gaps between available facts by suggesting alternative, plausible patterns that fit available facts and to provide informed assessments of the range and likelihood of possible outcomes
– Flagship reporting vehicle is National Intelligence Estimate
17
Types of Intelligence (continued)
• Warning Intelligence
– Sounds an alarm or gives time critical notice; connotes urgency
– Often involves situations that might involve US military forces or might cause their deployment, threats to US facilities and personnel overseas, sudden and deleterious effects on US foreign policy (coups, ethnic violence, etc.)
– Warning analysis involves exploring alternative futures and low probability / high impact scenarios
• Scientific and Technical Intelligence
– Information on technical developments and characteristics, performance, and capabilities of foreign technologies including weapon systems or subsystems
– Generally includes detailed technical measurements
– Derived from analysis of all-source data
– Generally produced in response to specific national requirements derived from weapons acquisition programs, arms control negotiations, or military operations
18
Types of Intelligence (continued)
• Research Intelligence
– Monographs & other in-depth studies
– Two specialized subcategories:
  • Basic Intelligence
    – Geographic, demographic, social, military and political data on foreign countries
  • Intelligence of Operational Support
    – Reports of all types that are tailored, focused, and rapidly produced for planners and operators
    – Example: DIA support to military forces & its operation of JWICS (Joint Worldwide Intelligence Communications System)
• Working Aids
– Created & maintained by individual analysts, groups of analysts or organizations
– Typically focused or organized by topic, theme, category, etc.
– Contain source documents, processing or analytic results, informal reports; may be highly structured
19
Intelligence Community & The Intelligence Cycle
1 Intelligence Requirements
2 Planning & Direction
3 Collection
4 Processing
5 Analysis
6 Reporting
7 Dissemination
[Diagram callout: Focus of Info-X]
20
Breadth of Information Exploitation Applied across all of the “INTs”
[Diagram; recoverable labels follow]
• Presentation & Visualization
• Analytic Knowledge
• Information Retrieval
• Assessment & Interpretation
• Content Data Mark-up
• Content Data Transformation
• Synthesis & Fusion
• IC Analysts
• Data Filtering & Selection
• Reporting & Dissemination
• Information Understanding
• Information Discovery
Expanding World of Data: Volume; Velocity; Volatility; Variety (@ Signal / Media Level); Variability (@ Content Level)
21
Entities of Intelligence Interest / Value
• Humans / Organizations
• Physical Objects
• Location / Spatial Issues
• Time / Temporal Issues
[Diagram: same information exploitation labels as slide 20, with the Expanding World of Data]
22
Data, Information, Intelligence: Synthesis & Fusion of Observables within a Given Context
“Observables” (Who, What, When, Where, How)
• Instances
• Properties / Attributes
• Relationships
• Events / Activities / Processes
• Time / Temporal Issues
• Location / Spatial Issues
[Diagram: same information exploitation labels as slide 20, with Humans / Organizations, Physical Objects, and the Expanding World of Data]
23
Data, Information, Intelligence: Assessments, Interpretations, Judgments & Predictions
“Behavioral Factors” (Why?)
• Goals / Objectives
• Attitudes / Perspectives
• Intentions
• Motivation
• Meaning
• Values / Beliefs
• Time / Temporal Issues
• Location / Spatial Issues
[Diagram: same information exploitation labels as slide 20, with Humans / Organizations, Physical Objects, and the Expanding World of Data]
24
Breadth of Information Exploitation Applied across all of the “INTs”
Data, Information, Intelligence: “Observables” & “Behavioral Factors” are NOT Independent Activities
• Strong Interaction
• Active Cross-Fertilization
[Diagram: same information exploitation labels as slide 20, with Humans / Organizations, Physical Objects, Location / Spatial Issues, and Time / Temporal Issues]
25
“Observables” & “Behavioral Factors”: In Fact They are Really Tightly Intertwined
[Diagram: same information exploitation labels as slide 20, with Humans / Organizations, Physical Objects, Location / Spatial Issues, and Time / Temporal Issues]
26
Data, Information, Intelligence: Combined “Observables” & “Behavioral Factors”
Observables
• Instances
• Properties / Attributes
• Relationships
• Events / Activities / Processes
• Location / Spatial Issues
• Time / Temporal Issues
Behavioral Factors
• Goals / Objectives
• Attitudes / Perspectives
• Intentions
• Motivation
• Values / Beliefs
• Meaning
[Diagram: same information exploitation labels as slide 20, with Humans / Organizations, Physical Objects, and the Expanding World of Data]
Key Themes / Foci of AQUAINT Phase II
1. Significantly More Emphasis on Advanced Question Answering Scenarios
2. Expanded Data Sources; Same emphasis on Newswire Text but Significantly More Interest / Emphasis on Non-Newswire Data Dimensions
28
Implications on Data Dimensions
• Phase I:
– Dimension 1: Focus Data Dimension (Single Media, Single Language, Single Genre English “Newswire”)
– Dimension 2: Multiple Media
– Dimension 3: Multiple Languages
– Dimension 4: Multiple Genre
– Dimension 5: Structured and Unstructured Data Sources
• Phase II:
– Strongly encourage greater exploration of Dimensions 2-5
– Strongly encourage approaches that would more tightly couple/integrate multiple data dimensions
– Special interest in the “Combining” / “Joint” aspects of Dimension 5
29
Key Themes / Foci of AQUAINT Phase II
1. Significantly More Emphasis on Advanced Question Answering Scenarios
2. Expanded Data Sources; Same emphasis on Newswire Text but Significantly More Interest / Emphasis on Non-Newswire Data Dimensions
3. Exploring the Boundaries / Combinations of Knowledge-Based, Statistical and Linguistic approaches to Question Answering
30
Higher Interest Areas
Exploring Intersection of Technical Approaches
• Knowledge-Based Approaches
• Statistically-Based Approaches
• Linguistically-Based Approaches
31
Complex QA: The Need for Ever Increasing Knowledge -- Of All Types
[Diagram: the QA R&D Program moves from simple to advanced along both parts of the QA problem, with Increasing Knowledge Requirements **]
DIMENSIONS OF THE QUESTION PART OF THE QA PROBLEM
• Context
• Judgement
• Scope
• From Simple Factual Question to Advanced
DIMENSIONS OF THE ANSWER PART OF THE QA PROBLEM
• Fusion
• Interpretation
• Multiple Sources
• From Simple Answer, Single Source to Advanced
** Knowledge Requirement would be better represented with a whole “quiver of arrows” of different sizes, lengths and types
32
Increasing Knowledge Requirements
• Types of Knowledge Needed
– Factual Knowledge & Linguistic Knowledge
– Common Sense Knowledge & World Knowledge
– Procedural Knowledge & Explanatory Knowledge
– Domain Knowledge & Modal Knowledge
– Tacit Knowledge
– Etc.
• Sources
– Hand Crafted by experts; supplemented by end-users
– Results from application of:
  • Learning algorithms
  • Bootstrapping / Hill-climbing Methods
– Extracted from large data corpora
– Obtained via “Re-Use”
33
Overarching Context / Operational Requirement
In a foreign news broadcast a team of analysts observe a previously unknown individual conferring with the Foreign Minister. They suspect that he/she is really a new senior advisor.
Information Analysts
• Who is this advisor?
• What do we know about him/her?
• What are his/her views?
• What influence does he/she have on FM?
• Does this signal that other policy changes are coming?
• And still more questions ???
FOCUS: Improved Reasoning & Learning
34
Improved Reasoning & Learning
[Diagram; recoverable labels follow]
New Senior Advisor
• Raw “Bio” Information: Education; Past Positions; Family; Travels; Other Activities
• Summarized Results: “Bio”
• Collected Views: TV & Radio Broadcasts, Newspapers & Other Archives
• Summarized Results: “Views: Past & Present”
• Associates; Follow-up Leads
• Cross Fertilization
Advanced Reasoning:
• Use Multi-level Plans
• Create and evaluate chains of reasoning
• Reason across heterogeneous data sources
• Infer answers from data extracted from multiple sources when the answer is not explicitly stated
• Utilize Link Analysis & Evidence Discovery
• Plus other strategies
Advanced Learning:
• Automatically learn new or modify existing reasoning strategies
35
Unsolved Problems
• Developing / Implementing a Detailed, Complex Plan to Solve the QA Task at Hand
• Decomposing Complex Questions into a series / sequence of Simpler Questions whose Answers can be found
• Selecting the appropriate sources to search
• Knowing when No Answer is Available; Being able to then give a partial, incomplete answer
• Giving understandable explanations of the “Plan”, the “Reasoning Used” and the “Answers Found”
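Two of the problems listed above, decomposing a complex question into simpler ones and knowing when only a partial answer is available, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the decomposition is supplied by hand rather than computed, the lookup table stands in for a real QA component, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioAnswer:
    """Answers found so far, plus the sub-questions that remain open."""
    answers: dict = field(default_factory=dict)
    unanswered: list = field(default_factory=list)

    @property
    def status(self):
        if not self.unanswered:
            return "complete"
        return "partial" if self.answers else "no answer available"


def answer_scenario(sub_questions, lookup):
    """Ask each simpler question in turn and record which ones fail."""
    result = ScenarioAnswer()
    for question in sub_questions:
        found = lookup(question)  # None means no answer was found
        if found is None:
            result.unanswered.append(question)
        else:
            result.answers[question] = found
    return result


# Hand-written decomposition of a complex question (hypothetical example).
sub_questions = [
    "Who is this advisor?",
    "What are his/her views?",
]
# Toy stand-in for a QA component; the stored answer is a placeholder.
known = {"Who is this advisor?": "(placeholder identity)"}

result = answer_scenario(sub_questions, known.get)
print(result.status, result.unanswered)
```

Keeping the unanswered sub-questions alongside the answers gives the analyst an explicit record of what the system could not find, which is one simple way to report a partial, incomplete answer.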
36
Key Themes / Foci of AQUAINT Phase II
1. Significantly More Emphasis on Advanced Question Answering Scenarios
2. Expanded Data Sources; Same emphasis on Newswire Text but Significantly More Interest / Emphasis on Non-Newswire Data Dimensions
3. Exploring the Boundaries / Combinations of Knowledge-Based, Statistical and Linguistic approaches to Question Answering
4. Continue to explore “Uncharted Waters . . .”
37
AQUAINT: Somewhat Oversimplified View of Current Situation
[Flow diagram from QUESTION???? to FINAL ANSWER; recoverable labels follow]
Question Understanding and Interpretation
• Natural Statement of Question
• Clarification
• Query Assessment, Advisor, Collaboration
Determine the Answer
• Queries; Single Source Specific Queries
• Translate Queries into Source Specific Retrieval Languages
• Single Source; Single Media; English; Single Agency
• Single Ranked Lists
• Relevant “Documents”; Relevant “Knowledge”
• Relevant information extracted and combined where possible
• Plus Other Processing
Answer Formulation
• Results of Analysis; Proposed Answer; Answer Context
• Formulate Answer
FINAL ANSWER
38
AQUAINT: Plenty of Very Challenging R&D Areas Still Remain!
[Diagram: repeats the full “Three Functional Components” flow from slide 4, from QUESTION???? through Question Understanding and Interpretation, Determine the Answer, and Answer Formulation to FINAL ANSWER, including multiple sources, media, languages and agencies, merged ranked lists, and analyst feedback]
39
Key Themes / Foci of AQUAINT Phase II
1. Significantly More Emphasis on Advanced Question Answering Scenarios
2. Expanded Data Sources; Same emphasis on Newswire Text but Significantly More Interest / Emphasis on Non-Newswire Data Dimensions
3. Exploring the Boundaries / Combinations of Knowledge-Based, Statistical and Linguistic approaches to Question Answering
4. Continue to explore “Uncharted Waters . . .”
5. Follow AQUAINT’s “Three Commandments”
40
Three Commandments for AQUAINT’s Phase II
I. “High Risk-High Payoff R&D” NOT “Incremental Development”
41
Key Themes of ARDA’s Info-X Programs
• Start with overarching, comprehensive Operational Problems.
• Develop a Vision of Desired Future Operational Capability that addresses these Operational Problems (~ 5-6 year time horizon)
• Identify Possible Incremental Path to reach this vision… (But not a prescriptive one)
• Focus funded research on the Key, Critical Problems that Must be solved
• Tackle Problems that are Higher Up in the “Food Chain”, but attempt not to ignore “Foundational Issues”
• Encourage higher risk, “Fresh Looks” at “Old Problems”
• Encourage synergistic, Cross-Discipline Research
42
Three Commandments for AQUAINT’s Phase II
I. “High Risk-High Payoff R&D” NOT “Incremental Development”
II. “Participation in Project & Program-Level Evaluations”
III. “Testbed Participation -- Integrating Components Into QA System”
43
Other Items
• Two Levels of AQUAINT Program in Phase I
– Original Base: 16 Projects
– Base + Plus Up: Expanded to 23 Projects
– “Base” or “Base + Plus Up”: Level of Phase II has not been finalized
• “Meaningful” Teaming is encouraged
• Phase II BAA will almost certainly be issued by the Dept. of Interior’s Contracting Office at Ft. Huachuca, AZ
• Target date for BAA release is mid-late July
• Projected Contract Award Dates in Late 2003 or Early 2004
• Will exert significant effort to better synchronize contract award dates
44
AQUAINT PHASE II
June Sunrise over Kirkwall Bay in the Orkney Islands of Scotland
Continuing to Pursue our Visions And Dreams for a Brighter, New Day
Your Questions & Comments
45
Contact Information
Dr. John Prange, AQUAINT Program Director
• ARDA Web Pages: http://www.ic-arda.org
• Email [email protected] [email protected]
• Phones: 301-688-7092; 800-276-3747; 301-688-7410 (Fax)
• Mailing: ARDA (RA), Room 12A69 NBP#1, STE 6644, 9800 Savage Road, Fort Meade, MD 20755-6644