Dec. 3-5, 2002 AQUAINT 12-Month Workshop 1 HITIQA: High-Quality Interactive Question Answering 12-Month Review University at Albany, SUNY Rutgers University


Dec. 3-5, 2002 AQUAINT 12-Month Workshop 1

HITIQA: High-Quality Interactive Question Answering

12-Month Review

University at Albany, SUNY
Rutgers University


HITIQA Team

• SUNY Albany:
– Prof. Tomek Strzalkowski, PI/PM
– Prof. Rong Tang
– Prof. Boris Yamrom, consultant
– Ms. Sharon Small, Research Scientist
– Mr. Ting Liu, Graduate Student
– Mr. Nobuyuki Shimizu, Graduate Student
– Mr. Tom Palen, summer intern
– Mr. Peter LaMonica, summer intern/AFRL
• Rutgers:
– Prof. Paul Kantor, co-PI
– Prof. K.B. Ng
– Prof. Nina Wacholder
– Mr. Robert Rittman, Graduate Student
– Ms. Ying Sun, Graduate Student
– Mr. Peng Song, Graduate Student


HITIQA Concept

Question: What recent disasters occurred in tunnels used for transportation?

[Diagram components: USER PROFILE; TASK CONTEXT | QUESTION NL PROCESSING | SEMANTIC PROC. | SEARCH & CATEGORIZE | KB | TEMPLATE SELECTION | QUALITY ASSESSMENT | FUSE & SUMMARIZE | ANSWER GENER. | Focused Information Need | Answer & Justification]

Possible Category Axes Seen: Vehicle type, Losses/Cost, location (labels shown: other, auto, train)

Semantics: What the question “means”:
• to the system
• to the user

Clarification Dialogue:
S: Are you interested in train accidents, automobile accidents or others?
U: Any that involved lost life or a major disruption in communication. Must identify losses.


Key Research Issues

• Question Semantics – how the system “understands” user requests
• Human-Computer Dialogue – how the user and the system negotiate this understanding
• Information Quality Metrics – how some information is better than other
• Information Fusion – how to assemble the answer that fits user needs.


[Diagram: HITIQA system components, from “question” to “answer”. Modules shown: QuestionProcessor, Document Retrieval, Segment/Filter, ClusterSegments, BuildFrames, ProcessFrames, DialogueManager, Query Refinement, AnswerGenerator, Visualization. Resources shown: Wordnet, Gate, DB. Legend: Completed Work; Current Focus.]


Data-Driven NL Semantics

What does the question mean to the user?
– The speech act
– The focus
– User’s task, intention, goal
– User’s background knowledge

What does the question mean to the system?
– Available information
– Information that can be retrieved
– The dimensions of the retrieved information


Data-Driven Semantics

• What’s available?
– Assemble potentially relevant information
– Greedy retrieval to maximize recall
• How does it break down?
– Break the retrieved set into topics and facets
– Passage level clustering using dynamic n-grams
• What does it mean?
– Frame each facet, determine attributes
– Specialized information extraction routines
• What is the answer?
– Match fact frames against the question frames
– Consider full matches and near misses
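The slide does not say how the “passage level clustering using dynamic n-grams” works. As a rough illustration only, the sketch below groups passages by word n-gram overlap (Jaccard similarity) in a single greedy pass. The function names, the fixed n-gram size, and the threshold are all assumptions made for this example, not HITIQA's actual method.

```python
import re


def ngrams(text, n=2):
    """Return the set of word n-grams in a passage (lowercased)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_passages(passages, n=2, threshold=0.2):
    """Greedy single-pass clustering (an assumed stand-in for the slide's method).

    A passage joins the first cluster whose pooled n-gram set overlaps it by
    at least `threshold`; otherwise it starts a new cluster.
    """
    clusters = []  # each cluster: {"members": [passage indices], "grams": set}
    for idx, passage in enumerate(passages):
        grams = ngrams(passage, n)
        for cluster in clusters:
            if jaccard(grams, cluster["grams"]) >= threshold:
                cluster["members"].append(idx)
                cluster["grams"] |= grams
                break
        else:
            clusters.append({"members": [idx], "grams": set(grams)})
    return [c["members"] for c in clusters]


# Toy passages written for this example.
passages = [
    "Iraq failed to disclose its biological weapons program",
    "Inspectors said Iraq hid its biological weapons program",
    "Raw sewage threatens fishing in the Black Sea",
]
groups = cluster_passages(passages)
print(groups)
```

The first two passages share three bigrams and end up in one group, while the third shares none and forms its own. A real system would likely vary the n-gram length and use a less order-dependent clustering scheme.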


Framing a Topical Cluster

Topical cluster (passages):

• Because of Iraq's defiance, ``the council may need to consider, at some stage, that the effect of these actions by Iraq may prove that the commission is obliged to conclude that it is unable to provide 100 percent verification,'' that Iraq has destroyed all its banned weapons, the inspectors said.

• They repeated previous statements that they were close to declaring that Iraq had complied with resolutions regarding its chemical weapons and missiles, but that questions remained as to Iraq's biological weapons program.

• The report cites the biological problem as the reason why Iraq and not inspectors should still be responsible for making disclosures about banned weapons.

• For nearly four years, Iraq failed to tell inspectors that it had a biological weapons program, the inspectors said. Only when forced did Baghdad disclose it, but its reports since then have been ``neither credible nor verifiable.'' How then could inspectors be asked to prove what Iraq has refused to divulge, the inspectors asked in their report. Iraq should continue to be responsible for providing all information about its banned weapons programs, as called for by U.N. resolutions, the inspectors argued.

TextFrame slots: GroupId, Target, subTarget, locations, organizations

Frame text: Because of Iraq's defiance, ``the council may need to consider, at some stage, that the effect of these actions by Iraq may prove that the commission is obliged to conclude that it is unable to provide 100 percent verification,'' that Iraq has destroyed all its banned weapons, the inspectors said.
Slot values shown: Iraq; weapons; Iraq

Frame text: They repeated previous statements that they were close to declaring that Iraq had complied with resolutions regarding its chemical weapons and missiles, but that questions remained as to Iraq's biological weapons program.
Slot values shown: Iraq; biological weapons; U.N.

Other values shown on the slide: 0; Iraq, Iraq

Relevance: Matches on all elements found in GoalFrame = {location, target}

GoalFrame
Target: possessing, weapons, mass destruction, nuclear weapons, biological weapons
locations: Iraq
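The relevance rule on this slide (a text frame is relevant when it matches every element of the GoalFrame) can be sketched as a slot-by-slot comparison. The following is a minimal sketch under stated assumptions: the `Frame` class, the use of sets for slot values, the conflict count, and the zone labels are illustrative choices, not the system's actual data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A simplified frame; slot names follow the slide, the types are assumed."""
    target: set = field(default_factory=set)
    locations: set = field(default_factory=set)
    organizations: set = field(default_factory=set)


def conflicts(text_frame, goal_frame, slots=("target", "locations")):
    """Count goal slots for which the text frame shares no value with the goal."""
    missed = 0
    for slot in slots:
        goal_values = getattr(goal_frame, slot)
        if goal_values and not (getattr(text_frame, slot) & goal_values):
            missed += 1
    return missed


def zone(text_frame, goal_frame):
    """Map a conflict count to a zone label (labels are illustrative)."""
    n = conflicts(text_frame, goal_frame)
    return "match" if n == 0 else "near miss" if n == 1 else "other"


# GoalFrame values copied from the slide.
goal = Frame(
    target={"possessing", "weapons", "mass destruction",
            "nuclear weapons", "biological weapons"},
    locations={"Iraq"},
)
# frame_a mirrors slot values shown on the slide; frame_b is made up.
frame_a = Frame(target={"biological weapons"}, locations={"Iraq"},
                organizations={"U.N."})
frame_b = Frame(target={"sewage"}, locations={"Iraq"})

print(zone(frame_a, goal))  # shares a target and a location with the goal
print(zone(frame_b, goal))  # shares only the location
```

Counting unmatched slots, rather than returning a yes/no answer, keeps the frames that miss on a single attribute available as candidates for clarification.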


Answer Space Topology

[Diagram regions: KERNEL QUESTION MATCH | NEAR MISSES, ALTERNATIVE INTERPRETATIONS | ALL RETRIEVED FRAMES]


Data-Driven Interaction

What does the question mean to the user?
– The speech act
– The focus
– User’s task/intention/goal
– User’s background knowledge

What does the question mean to the system?
– Available information
– Information that can be retrieved
– The dimensions of the retrieved information

Shared Understanding
– Semantic gaps drive the dialogue:
  • to negotiate between user’s meaning and system’s meaning
  • to fill the gaps in the expected answer
  • to resolve ambiguities in the data
  • to reduce dimensionality of the answer space


Dialogue with the System

• Dialogue arises from:
– System’s need to clarify before proceeding
– Analyst’s need to clarify to keep system on target
• Dialogue Strategies:
– Alternative interpretations: narrowing
  • SYSTEM: Ask user to differentiate answers from non-answers
  • USER: confirm, deny, offer extra cues, …
– Off-target interpretations: expanding
  • SYSTEM: Ask user to modify the question
  • USER: confirm, deny, extra cues, new question, …
– More details please: information seeking
  • USER: Ask linked questions to follow through
  • USER: Specify answer characteristics: shape, quality, …
• A dialogue is unique to each analyst-data pair
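One way to picture the choice between the narrowing and expanding strategies is as a decision based on how many frames matched the question and how many came close. The sketch below is purely illustrative: the decision rule, the `max_answers` threshold, and the prompt wording are assumptions, since the slide does not describe how the system actually selects a strategy.

```python
def choose_strategy(num_matches, num_near_misses, max_answers=10):
    """Pick a system dialogue move from the sizes of the match and near-miss sets.

    The thresholds and the decision rule are assumptions for illustration.
    """
    if num_matches == 0:
        return "expanding"   # off-target: ask the user to modify the question
    if num_matches > max_answers or num_near_misses > 0:
        return "narrowing"   # ask the user to tell answers from non-answers
    return "answer"          # nothing to clarify; proceed to the answer


def system_prompt(strategy, topic):
    """Return a hypothetical system utterance for the chosen strategy."""
    prompts = {
        "narrowing": f"Are you interested in seeing information about {topic}?",
        "expanding": f"Nothing matched directly. Would information about {topic} help?",
        "answer": "We have found relevant information for your query.",
    }
    return prompts[strategy]


move = choose_strategy(num_matches=5, num_near_misses=2)
print(move, "->", system_prompt(move, "the Black Sea convention"))
```

Because the counts depend on both the analyst's question and the retrieved data, the same rule yields a different dialogue for each analyst-data pair.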


A Sample Dialogue (actual)

Analyst: “need information on pollution and the fishing industry in regard to the Black Sea. also need related items to pollution and international cooperation.”

HITIQA: <Work in progress messages sent>

HITIQA: “We have found relevant information for your query. (You may ask for answer generation at any point now.) We have also found other information that may be relevant. Are you interested in seeing information about The Black Sea convention as it is related to Black Sea pollution?”

Analyst: “yes”

HITIQA: “Thank you. Are you interested in seeing information about The International Waters Assessment team as it is related to Black Sea pollution?”

Analyst: “Yes particularly about fishing if possible”


Dialogue continues…

HITIQA: “We have more information like this. Would you like us to continue?”

Analyst: “yes”

HITIQA: The environment, wildlife and people linked with the Black Sea are also under threat from large discharges of raw sewage, damaging levels of coastal erosion and the suffocating impact of dumping sludge and mud dredged from ports…


AQUAINT DIALOGUE EVALUATIONS

ALL PARTICIPANTS

• 10 scenarios plus 1 training scenario

• 15 minutes per scenario

• Chat interface

• Wizard control allowed

HITIQA

• No scenario filtering of data - 3 Gigabytes of newswire

• 13% Wizard interruption of system responses


Breakdown of dialogue utterances, Analyst Two

[Chart: scenarios Tr, 9, 4, 6, 2, 8, 1, 10, 3, 7, 5 on one axis; scale 0 to 35 on the other; series: Total, Analyst, System, Wizard]

Share of utterances: Analyst 34%, System 58%, Wizard 8%


Information Quality

Quality Criteria

• CONTENT
– Accuracy and Objectivity
– Completeness; uniqueness
– Importance; Verifiability
• AUTHORITY
– Reliability; credibility
• PRESENTATION
– Clarity and Un-ambiguity
– Style and Gravitas
– Orientation and Level
– Readability and Usability
• TIMELINESS
– Recency
– Currency

Measurable Quality Indicators

• IN/OUT-DEGREE MEASURE
– Number of cites or links to/from
– Credibility of these cites/links
• DOCUMENT SIZE
• STYLISTIC FEATURES
– Typical sentence length
– Use of pronouns, punctuations
• LINGUISTIC FEATURES
– Sentence forms, verbs
– References to names, amounts
• STRUCTURAL FEATURES
– Organization of sections
– Use of section titles, etc.
• COLLECTION FEATURES
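Several of the measurable indicators listed above (document size, typical sentence length, use of pronouns and punctuation) are simple surface counts. The sketch below shows how such counts might be computed with the Python standard library. The function name, the pronoun list, and the crude sentence splitter are assumptions for illustration; the project's actual feature extractor is not described on the slide.

```python
import re
import string

# A small, assumed pronoun list; a real extractor would use a POS tagger.
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "me", "him",
            "her", "us", "them", "my", "your", "his", "its", "our", "their"}


def stylistic_features(text):
    """Compute a few surface indicators of the kind listed on the slide."""
    # Crude sentence split on ., ! and ? (abbreviations will be miscounted).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words)
    return {
        "document_size": n_words,
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        "pronoun_rate": (sum(w.lower() in PRONOUNS for w in words) / n_words
                         if n_words else 0.0),
        "punctuation_count": sum(ch in string.punctuation for ch in text),
    }


sample = "They repeated previous statements. Questions remained, the inspectors said."
features = stylistic_features(sample)
print(features)
```

Counts like these could serve as inputs to a quality model, but which of them actually correlate with human quality judgments is an empirical question.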


Information Quality Research

[Flowchart stages: Document Selection; Focus Group Study; Implement Experimental Systems; Pretests; Quality Judgment Experiment; Textual Feature Extraction; Automated Document Quality Prediction. Markers: “Start”; “We’re here”]


Quality Judgments

• Focus Group:
– Sessions conducted: March-April, 2002
– Results: Nine quality aspects generated
• Expert Sessions:
– Sessions conducted: May-June, 2002
– Results: 100 documents scored twice along 9 quality aspects
• Student Sessions:
– Training and Testing Sessions: June-July, 2002
  • 10 documents judged by experts used for training/testing
– Actual Judgment Sessions: June-August, 2002
  • Qualified students evaluated 10 documents per session
– Results: 900 documents scored twice along 9 quality aspects


Quality Assessment GUI


Factor Analysis of 9 Quality Features

[Figure labels: Appearance; Content]


Modeling Quality of Text

• Kitchen sink approach
– 160 “independent” variables
– Part-of-speech, vocabulary, stylistics, named entities, …
• Statistical pruning
– Statistically significant variables
– May be nonsensical to human
• Human pruning
– Only “sensible” variables retained for each quality
• Pruning improves performance
– Kitchen sink overfits
– Statistics and Human close in performance
– More work needed to understand the relationship


Performance of models

Quality Prediction by Linear Combination of Textual Features (from 5 to 17 variables). Split Half for Training and Testing.

Quality Factors           Prediction Rate
Depth                     67%
Author Credential         55%
Accuracy                  69%
Source                    57%
Objectivity               64%
Grammar                   79%
One Side vs Multi View    70%
Verbosity                 63%
Readability               76%
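To make the setup concrete, here is a minimal sketch of predicting a binary quality judgment from a linear combination of features, trained on one half of the data and tested on the other. Everything in it is an assumption for illustration: the data are SYNTHETIC, the gradient-descent fit stands in for whatever estimation method the authors used, and “prediction rate” is taken to mean the share of held-out items classified correctly, which the slide does not state.

```python
import random


def fit_linear(rows, labels, lr=0.5, epochs=1000):
    """Fit weights (bias first) by batch gradient descent on squared error."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = [1.0] + list(x)  # prepend the bias input
            err = sum(w * v for w, v in zip(weights, z)) - y
            for j, v in enumerate(z):
                grad[j] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict(weights, x):
    """Threshold the linear combination at 0.5 to get a binary judgment."""
    score = weights[0] + sum(w * v for w, v in zip(weights[1:], x))
    return 1 if score >= 0.5 else 0


# SYNTHETIC data: two made-up features and a made-up labeling rule.
rng = random.Random(0)
rows = [(rng.random(), rng.random()) for _ in range(200)]
labels = [1 if a + b > 1.0 else 0 for a, b in rows]

# Split half: first half for training, second half for testing.
half = len(rows) // 2
weights = fit_linear(rows[:half], labels[:half])
hits = sum(predict(weights, x) == y for x, y in zip(rows[half:], labels[half:]))
prediction_rate = hits / (len(rows) - half)
print(f"prediction rate on held-out half: {prediction_rate:.0%}")
```

The number printed here says nothing about the rates in the table above; it only shows the train-on-half, test-on-half procedure end to end.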


Data Fusion

• Use multiple methods to assess the relevance of documents or passages
– For a given question, dialogue, or cluster
– Each method assigns a “score”
• Candidates → points in a “score space”
• Seek patterns to localize the most relevant documents or passages in this “score space”
• Developed interactive data analysis tool
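The idea of turning candidates into points in a “score space” can be illustrated with a toy example: each scoring method contributes one coordinate, and candidates are then compared by their position. The two scoring functions, the query terms, and the distance-to-ideal ranking below are assumptions chosen for simplicity; the slide does not say which methods or patterns the project used.

```python
import math


def to_score_space(candidates, methods):
    """Map each candidate to a point whose coordinates are the method scores."""
    return {name: tuple(m(text) for m in methods)
            for name, text in candidates.items()}


def rank_by_distance_to_ideal(points, ideal):
    """Order candidates by Euclidean distance from an 'ideal' corner of the space."""
    return sorted(points, key=lambda name: math.dist(points[name], ideal))


# Two toy relevance methods (assumptions): query-term overlap and a length prior.
QUERY = {"black", "sea", "pollution"}


def term_overlap(text):
    """Fraction of query terms that appear in the text."""
    return len(set(text.lower().split()) & QUERY) / len(QUERY)


def length_prior(text):
    """Favor longer passages, capped at 1.0 for ten or more words."""
    return min(len(text.split()) / 10.0, 1.0)


# Made-up candidate passages.
candidates = {
    "p1": "pollution in the black sea threatens fishing",
    "p2": "tunnel fire disrupts train traffic",
    "p3": "black sea convention",
}
points = to_score_space(candidates, [term_overlap, length_prior])
ranking = rank_by_distance_to_ideal(points, ideal=(1.0, 1.0))
print(ranking)
```

Ranking by distance to a corner is only one possible pattern. Its equal-relevance contours are circular arcs rather than straight lines, which is one simple instance of a non-linear combination of scores.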


Non-linear “iso-relevance”


Information Visualization

• Supports Evidence Fusion
– Dimensional displays
• Supports Information Quality Decisions
– User interfaces
• Supports Clarification Dialogue
– Multi-media dialogue: “picture = Kilo-word”
• Navigation through information space
– Multiple views and orientation


Visual Dialog in HITIQA

• Display space
– Multi-dimensional
– non-homogeneous
– non-structured
• Mapping
– Documents → Frames → Visuals
– Navigation through changing dimensions
• Selection
– Use of Color and Shape


HITIQA Visual Panel: cluster view


Current Status Summary

• HITIQA 1st Prototype complete
• Data-driven semantics for questions
• Framing and Dialogue
• Good results of pilot evaluation
• Information Quality Experiments
• User studies phase I completed
• 2-D visualization developed
• Information fusion work started


Plans for the next 6 months

• Refine the prototype
– Typed, specialized frames
– More informative dialogue
– Handle series of questions
• Second round of quality experiments
• Answer generation
• Information fusion
• Tests and evaluations