Question Answering Passage Retrieval Using Dependency Parsing

August 17, 2005 Question Answering Passage Retrieval Using Dependency Parsing 1/28 Question Answering Passage Retrieval Using Dependency Parsing Hang Cui Renxu Sun Keya Li Min-Yen Kan Tat-Seng Chua Department of Computer Science National University of Singapore


Page 1: Question Answering Passage Retrieval Using Dependency Parsing

August 17, 2005 Question Answering Passage Retrieval Using Dependency Parsing

1/28

Question Answering Passage Retrieval Using Dependency Parsing

Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, Tat-Seng Chua

Department of Computer Science, National University of Singapore

Page 2: Question Answering Passage Retrieval Using Dependency Parsing

Passage Retrieval in Question Answering

QA System: Document Retrieval → Passage Retrieval → Answer Extraction

• To narrow down the search scope
• Can answer questions with more context

• Lexical density based
• Distance between question words

Page 3: Question Answering Passage Retrieval Using Dependency Parsing

Density Based Passage Retrieval Method

• However, density-based methods can err when …

<Question> What percent of the nation's cheese does Wisconsin produce?

Incorrect: … the number of consumers who mention California when asked about cheese has risen by 14 percent, while the number specifying Wisconsin has dropped 16 percent.

Incorrect: The wry “It's the Cheese” ads, which attribute California's allure to its cheese and indulge in an occasional dig at the Wisconsin stuff … sales of cheese in California grew three times as fast as sales in the nation as a whole, 3.7 percent compared to 1.2 percent, …

Incorrect: Awareness of the Real California Cheese logo, which appears on about 95 percent of California cheeses, has also made strides.

Correct: In Wisconsin, where farmers produce roughly 28 percent of the nation's cheese, the outrage is palpable.

Relationships between matched words differ …

Page 4: Question Answering Passage Retrieval Using Dependency Parsing

Our Solution

• Examine the relationship between words
  – Dependency relations

• Exact match of relations for answer extraction
  – Has low recall because the same relations are often phrased differently

• Fuzzy match of dependency relationship
  – Statistical similarity of relations

Page 5: Question Answering Passage Retrieval Using Dependency Parsing

Measuring Sentence Similarity

Sim (Sent1, Sent2) = ?

• Lexical matching: matched words between Sentence 1 and Sentence 2
• + Similarity of relations between matched words
  – Based on the similarity of individual relations

Page 6: Question Answering Passage Retrieval Using Dependency Parsing

Outline

• Extracting and Pairing Relation Paths

• Measuring Path Match Scores

• Learning Relation Mapping Scores

• Evaluations

• Conclusions

Page 8: Question Answering Passage Retrieval Using Dependency Parsing

What Dependency Parsing is Like

• Minipar (Lin, 1998) for dependency parsing

• Dependency tree
  – Nodes: words/chunks in the sentence
  – Edges (ignoring the direction): labeled by relation types

What percent of the nation's cheese does Wisconsin produce?

[Figure: dependency tree of the question (nodes: Root, produce, Wisconsin, percent, what, of, cheese, nation)]

Page 9: Question Answering Passage Retrieval Using Dependency Parsing

Extracting Relation Paths

• Relation path
  – Vector of relations between two nodes in the tree

[Figure: dependency tree of the question (nodes: Root, produce, Wisconsin, percent, what, of, cheese, nation)]

produce < P1: subj > Wisconsin
percent < P2: prep pcomp-n > cheese

Two constraints for relation paths:
1. Path length (less than 7 relations)
2. Ignore those between two words that are within a chunk, e.g. New York.
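A relation path can be read off the tree by treating it as an undirected graph with labeled edges and walking from one node to the other. The sketch below is only an illustration under stated assumptions: the labels `subj`, `prep` and `pcomp-n` come from the slide, while `obj`, `det` and `gen` are assumed labels added to make the toy tree connected, and the function names are hypothetical. It applies the path length constraint but not the within-chunk constraint.

```python
from collections import deque

# Undirected, labeled edges of the question's dependency tree.
# "subj", "prep" and "pcomp-n" appear on the slide; the labels marked
# "assumed" are illustrative guesses, not Minipar output.
EDGES = [
    ("produce", "Wisconsin", "subj"),
    ("produce", "percent", "obj"),   # assumed
    ("percent", "what", "det"),      # assumed
    ("percent", "of", "prep"),
    ("of", "cheese", "pcomp-n"),
    ("cheese", "nation", "gen"),     # assumed
]

MAX_PATH_LEN = 7  # keep only paths with fewer than 7 relations


def build_graph(edges):
    """Build an adjacency list that ignores edge direction."""
    graph = {}
    for head, dep, rel in edges:
        graph.setdefault(head, []).append((dep, rel))
        graph.setdefault(dep, []).append((head, rel))
    return graph


def relation_path(graph, start, end):
    """Return the list of relation labels between two nodes, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == end:
            return path if len(path) < MAX_PATH_LEN else None
        for neighbour, rel in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [rel]))
    return None  # the two nodes are not connected


graph = build_graph(EDGES)
print(relation_path(graph, "produce", "Wisconsin"))  # ['subj']
print(relation_path(graph, "percent", "cheese"))     # ['prep', 'pcomp-n']
```

Because a tree has exactly one path between any two nodes, breadth-first search returns that path directly; no shortest-path tie-breaking is needed.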

Page 10: Question Answering Passage Retrieval Using Dependency Parsing

Paired Paths from Question and Answer

Question: What percent of the nation's cheese does Wisconsin produce?

Candidate sentence: In Wisconsin, where farmers produce roughly 28 percent of the nation's cheese, the outrage is palpable.

[Figure: dependency trees of the question (nodes: Root, produce, Wisconsin, percent, what, of, cheese, nation) and of the candidate sentence (nodes: Root, in, Wisconsin, produce, farmers, 28 percent, of, cheese, nation)]

Paired Relation Paths
< P1(Q) : subj >
< P1(Sent) : pcomp-n mod i >

$$\mathrm{SimRel}(Q, Sent) = \sum_{i,j} \mathrm{Sim}\big(P_i(Q), P_j(Sent)\big)$$
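The pairing step and the SimRel sum can be sketched as follows. This is a minimal illustration, assuming that paths are paired when they connect the same two matched words in the question and in the sentence. The sentence path for percent/cheese is an assumed value, and `toy_sim` is a hypothetical stand-in for the path similarity defined later in the talk.

```python
def pair_paths(question_paths, sentence_paths):
    """Pair paths that connect the same two matched words in both trees."""
    return [
        (question_paths[key], sentence_paths[key])
        for key in question_paths
        if key in sentence_paths
    ]


def sim_rel(question_paths, sentence_paths, path_sim):
    """Sum the similarity of every paired relation path."""
    pairs = pair_paths(question_paths, sentence_paths)
    return sum(path_sim(q_path, s_path) for q_path, s_path in pairs)


def toy_sim(q_path, s_path):
    """Placeholder similarity: overlap of relation labels (not the talk's model)."""
    q_set, s_set = set(q_path), set(s_path)
    return len(q_set & s_set) / len(q_set | s_set)


# Keys are unordered pairs of matched words; values are relation paths.
question_paths = {
    frozenset({"produce", "Wisconsin"}): ["subj"],
    frozenset({"percent", "cheese"}): ["prep", "pcomp-n"],
}
sentence_paths = {
    frozenset({"produce", "Wisconsin"}): ["pcomp-n", "mod", "i"],
    frozenset({"percent", "cheese"}): ["prep", "pcomp-n"],  # assumed
}

print(sim_rel(question_paths, sentence_paths, toy_sim))  # 1.0
```

With exact label overlap the subj path contributes nothing, which is the low-recall problem that motivates a statistical similarity between different relations.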

Page 11: Question Answering Passage Retrieval Using Dependency Parsing

Outline

• Extracting and Pairing Relation Paths

• Measuring Path Match Scores

• Learning Relation Mapping Scores

• Evaluations

• Conclusions

Page 12: Question Answering Passage Retrieval Using Dependency Parsing

Measuring Path Match Degree

• Employ a variation of IBM Translation Model 1
  – Path match degree (similarity) as translation probability
    • MatchScore (PQ, PS) → Prob (PS | PQ)
    • Relations as words

• Why IBM Model 1?
  – No “word order”: bag of undirected relations
  – No need to estimate “target sentence length”
    • Relation paths are determined by the parsing tree

Page 13: Question Answering Passage Retrieval Using Dependency Parsing

Calculating Translation Probability (Similarity) of Paths

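Given two relation paths from the question and a candidate sentence:

$$\mathrm{Prob}(P_S \mid P_Q) = \frac{\epsilon}{m^n} \prod_{i=1}^{n} \sum_{j=1}^{m} P_t\big(Rel_i^{(S)} \mid Rel_j^{(Q)}\big)$$

where m and n are the numbers of relations in the question path P_Q and the sentence path P_S, and ε is a constant.

Considering the most probable alignment (finding the most probable mapped relations):

$$\mathrm{Prob}(P_S \mid P_Q) = \frac{\epsilon}{m^n} \prod_{i=1}^{n} P_t\big(Rel_i^{(S)} \mid Rel_{A_i}^{(Q)}\big)$$

where A_i is the question relation aligned to the i-th sentence relation.

Take logarithm and ignore the constants (for all sentences, question path length is a constant):

$$\mathrm{MatchScore}(P_Q, P_S) = \mathrm{Prob}'(P_S \mid P_Q) = \frac{1}{n} \sum_{i=1}^{n} \log P_t\big(Rel_i^{(S)} \mid Rel_{A_i}^{(Q)}\big)$$

MatchScores of paths are combined to give the sentence’s relevance to the question.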
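The path match score described on this slide (most probable alignment, logarithm, constants dropped) can be sketched as below. The relation mapping scores in `P_T` are made-up numbers for illustration, `FLOOR` is an assumed score for unseen relation pairs, and dividing by the sentence path length is an assumption about how the score is normalised.

```python
import math

# Hypothetical relation mapping scores P_t(sentence_rel | question_rel).
P_T = {
    ("subj", "subj"): 0.6,
    ("pcomp-n", "subj"): 0.1,
    ("mod", "subj"): 0.05,
    ("i", "subj"): 0.2,
}
FLOOR = 1e-4  # assumed score for relation pairs missing from the table


def match_score(question_path, sentence_path, p_t=P_T, floor=FLOOR):
    """Length-normalised log score under the most probable alignment."""
    if not question_path or not sentence_path:
        return float("-inf")
    total = 0.0
    for s_rel in sentence_path:
        # Most probable alignment: best-scoring question relation for s_rel.
        best = max(p_t.get((s_rel, q_rel), floor) for q_rel in question_path)
        total += math.log(best)
    return total / len(sentence_path)


print(match_score(["subj"], ["subj"]))
print(match_score(["subj"], ["pcomp-n", "mod", "i"]))
```

Unlike exact matching, the second call still yields a finite score: the path pcomp-n mod i is not identical to subj, but each of its relations has some probability of mapping to subj.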

Page 14: Question Answering Passage Retrieval Using Dependency Parsing

Outline

• Extracting and Pairing Relation Paths

• Measuring Path Match Scores

• Learning Relation Mapping Scores

• Evaluations

• Conclusions

Page 15: Question Answering Passage Retrieval Using Dependency Parsing

Training and Testing

Testing
• Sim ( Q, Sent ) = ?
• Similarity between relation vectors: Prob ( PSent | PQ ) = ?
• Similarity between individual relations: P ( Rel (Sent) | Rel (Q) ) = ?

Training
• Q - A pairs → Paired Relation Paths → Relation Mapping Model → Relation Mapping Scores
• Two relation mapping models:
  1. Mutual information (MI) based
  2. Expectation Maximization (EM) based

Page 16: Question Answering Passage Retrieval Using Dependency Parsing

Approach 1: MI Based

• Measures bipartite co-occurrences in training path pairs

• Accounts for path length (penalize those long paths)

• Uses frequencies to approximate mutual information

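$$P_t^{(MI)}\big(Rel_i^{(S)} \mid Rel_j^{(Q)}\big) = \log \frac{\sum \gamma \, \delta\big(Rel_j^{(Q)}, Rel_i^{(S)}\big)}{\big|Rel_j^{(Q)}\big| \times \big|Rel_i^{(S)}\big|}$$

where δ is 1 when the two relations co-occur in a training path pair and 0 otherwise, γ penalizes long paths, and |Rel| is the number of occurrences of a relation in the training paths.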
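The three bullets of the MI-based approach (co-occurrence counting, a path length penalty, frequency normalisation) can be sketched as follows. This is an illustration only: the weight `1 / (len(q_path) + len(s_path))` is an assumed form of the length penalty, and the function name and toy training pairs are hypothetical.

```python
import math
from collections import Counter


def mi_mapping_scores(path_pairs):
    """Estimate log mapping scores from (question_path, sentence_path) pairs."""
    cooc = Counter()     # length-weighted co-occurrence of (s_rel, q_rel)
    q_count = Counter()  # occurrences of each question relation
    s_count = Counter()  # occurrences of each sentence relation
    for q_path, s_path in path_pairs:
        # Assumed penalty: long path pairs contribute less per co-occurrence.
        weight = 1.0 / (len(q_path) + len(s_path))
        q_count.update(q_path)
        s_count.update(s_path)
        for q_rel in set(q_path):
            for s_rel in set(s_path):
                cooc[(s_rel, q_rel)] += weight
    return {
        (s_rel, q_rel): math.log(c / (q_count[q_rel] * s_count[s_rel]))
        for (s_rel, q_rel), c in cooc.items()
    }


training_pairs = [
    (["subj"], ["subj"]),
    (["subj"], ["pcomp-n", "mod", "i"]),
    (["obj"], ["obj"]),
]
scores = mi_mapping_scores(training_pairs)
for pair, score in sorted(scores.items()):
    print(pair, round(score, 3))
```

Relation pairs that never co-occur get no entry at all, so a caller needs a default score for them.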

Page 17: Question Answering Passage Retrieval Using Dependency Parsing

Approach 2: EM Based

• Employ the training method from IBM Model 1
  – Relation mapping scores = word translation probability
  – Utilize GIZA to accomplish training
  – Iteratively boosting the precision of relation translation probability

• Initialization: assign 1 to identical relations and a small constant otherwise
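For readers unfamiliar with IBM Model 1 training, the EM loop can be sketched in a few lines of plain Python. This is not GIZA and not the authors' code: it is a textbook Model 1 update without a NULL word, using the initialization described above, with a hypothetical function name and toy training pairs.

```python
from collections import defaultdict


def train_model1(path_pairs, iterations=5, small=0.01):
    """Estimate t[(s_rel, q_rel)] = P_t(s_rel | q_rel) by EM."""
    q_vocab = {q for q_path, _ in path_pairs for q in q_path}
    s_vocab = {s for _, s_path in path_pairs for s in s_path}
    # Initialization: 1 for identical relations, a small constant otherwise.
    t = {(s, q): (1.0 if s == q else small) for q in q_vocab for s in s_vocab}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each sentence relation over the question relations.
        for q_path, s_path in path_pairs:
            for s in s_path:
                norm = sum(t[(s, q)] for q in q_path)
                for q in q_path:
                    frac = t[(s, q)] / norm
                    count[(s, q)] += frac
                    total[q] += frac
        # M-step: renormalise the expected counts per question relation.
        t = {
            (s, q): (count[(s, q)] / total[q] if total[q] else 0.0)
            for (s, q) in t
        }
    return t


training_pairs = [
    (["subj"], ["subj"]),
    (["subj"], ["pcomp-n", "mod", "i"]),
    (["obj"], ["obj"]),
    (["obj", "prep"], ["obj", "prep"]),
]
t = train_model1(training_pairs)
print(round(t[("subj", "subj")], 3), round(t[("obj", "obj")], 3))
```

After each M-step the scores for a given question relation sum to 1 over all sentence relations, which is what makes them usable as translation probabilities in the path match score.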

Page 18: Question Answering Passage Retrieval Using Dependency Parsing

Outline

• Extracting and Pairing Relation Paths
• Measuring Path Match Scores
• Learning Relation Mapping Scores
• Evaluations
  – Can relation matching help?
  – Can fuzzy match perform better than exact match?
  – Can long questions benefit more?
  – Can relation matching work on top of query expansion?
• Conclusions

Page 19: Question Answering Passage Retrieval Using Dependency Parsing

Evaluation Setup

• Training data
  – 3k corresponding path pairs from 10k QA pairs (TREC-8, 9)

• Test data
  – 324 factoid questions from TREC-12 QA task

• Passage retrieval on the top 200 relevant documents provided by TREC

Page 20: Question Answering Passage Retrieval Using Dependency Parsing

Comparison Systems

• MITRE: baseline
  – Stemmed word overlapping
  – Baseline in previous work on passage retrieval evaluation

• SiteQ: top performing density-based method
  – Using a 3-sentence window

• NUS
  – Similar to SiteQ, but using sentences as passages

• Strict Matching of Relations
  – Simulates strict matching in previous work for answer selection
  – Counts the number of exactly matched paths

• Relation matching is applied on top of MITRE and NUS

Page 21: Question Answering Passage Retrieval Using Dependency Parsing

Evaluation Metrics

• Mean reciprocal rank (MRR)
  – On the top 20 returned passages
  – Mean, over all questions, of the reciprocal of the rank of the first correct answer in the returned ranked list

• Percentage of questions with incorrect answers

• Precision at the top one passage
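The three metrics can be computed from one number per question: the rank of the first correct passage. The sketch below is a minimal illustration; it assumes that a question counts as incorrect when no correct passage appears within the cutoff, and the function name and sample ranks are hypothetical.

```python
def evaluate(ranks, cutoff=20):
    """Compute (MRR, % incorrect, precision at top one).

    ranks: for each question, the 1-based rank of the first correct
    passage, or None when no correct passage was returned.
    """
    n = len(ranks)
    if n == 0:
        raise ValueError("need at least one question")
    found = [r for r in ranks if r is not None and r <= cutoff]
    mrr = sum(1.0 / r for r in found) / n
    pct_incorrect = 100.0 * (n - len(found)) / n
    precision_at_1 = sum(1 for r in found if r == 1) / n
    return mrr, pct_incorrect, precision_at_1


# Four questions: answered at rank 1, rank 2, not answered, rank 4.
print(evaluate([1, 2, None, 4]))  # (0.4375, 25.0, 0.25)
```

Unanswered questions contribute 0 to the MRR sum but still count in the denominator, so MRR and the percentage of incorrect questions move in opposite directions.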

Page 22: Question Answering Passage Retrieval Using Dependency Parsing

Performance Evaluation

• All improvements are statistically significant (p<0.001)
• MI and EM do not make much difference given our training data
  – EM needs more training data
  – MI is more susceptible to noise, so may not scale well

Passage retrieval systems (MRR improvements in %)

System             | MRR    | vs. MITRE | vs. SiteQ | vs. NUS | % Incorrect | Precision at top one passage
MITRE              | 0.2000 | N/A       | N/A       | N/A     | 45.68%      | 0.1235
SiteQ              | 0.2765 | +38.26    | N/A       | N/A     | 37.65%      | 0.1975
NUS                | 0.2677 | +33.88    | N/A       | N/A     | 33.02%      | 0.1759
Rel_Strict (MITRE) | 0.2990 | +49.50    | +8.14     | +11.69  | 41.96%      | 0.2253
Rel_Strict (NUS)   | 0.3625 | +81.25    | +31.10    | +35.41  | 32.41%      | 0.2716
Rel_MI (MITRE)     | 0.4161 | +108.09   | +50.50    | +55.43  | 29.63%      | 0.3364
Rel_EM (MITRE)     | 0.4218 | +110.94   | +52.57    | +57.56  | 29.32%      | 0.3457
Rel_MI (NUS)       | 0.4756 | +137.85   | +72.03    | +77.66  | 24.69%      | 0.3889
Rel_EM (NUS)       | 0.4761 | +138.08   | +72.19    | +77.83  | 24.07%      | 0.3889

Fuzzy matching outperforms strict matching significantly.

[Figure: MRR Comparison (y-axis: MRR, 0.00 to 0.60)]
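The "% MRR improvement" figures on this slide are relative improvements over a baseline MRR. The sketch below recomputes a few of them from the reported MRR values; the function name is hypothetical, and values recomputed from the rounded MRRs can differ from the reported ones in the last digits.

```python
# MRR values as reported on the slide.
MRR = {
    "MITRE": 0.2000,
    "SiteQ": 0.2765,
    "NUS": 0.2677,
    "Rel_Strict(MITRE)": 0.2990,
    "Rel_EM(NUS)": 0.4761,
}


def relative_improvement(baseline, system):
    """Relative improvement of system over baseline, in percent."""
    return 100.0 * (system - baseline) / baseline


for baseline in ("MITRE", "SiteQ", "NUS"):
    gain = relative_improvement(MRR[baseline], MRR["Rel_EM(NUS)"])
    print(f"Rel_EM(NUS) vs. {baseline}: {gain:+.2f}%")
```

Reading the numbers this way makes clear why the gains over MITRE look so large: MITRE has the lowest baseline MRR, so the same absolute gain is a larger percentage.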

Page 23: Question Answering Passage Retrieval Using Dependency Parsing

Performance Variation to Question Length

• Long questions, with more paired paths, tend to improve more
  – Using the number of non-trivial question terms to approximate question length

[Figure: MRR (y-axis, 0.2 to 0.65) against # Question Terms (x-axis, 0 to 8) for Rel_NUS_EM and Rel_MITRE_EM]

Page 24: Question Answering Passage Retrieval Using Dependency Parsing

Error Analysis

• Mismatch of question terms
  – e.g. In which city is the River Seine
  – Introduce question analysis

• Paraphrasing between the question and the answer sentence
  – e.g. write the book → be the author of the book
  – Most current techniques fail to handle it
  – Finding paraphrasing via dependency parsing (Lin and Pantel)

Page 25: Question Answering Passage Retrieval Using Dependency Parsing

Performance on Top of Query Expansion

• On top of query expansion, fuzzy relation matching brings a further 50% improvement
• However
  – Query expansion doesn’t help much on a fuzzy relation matching system
  – Expansion terms do not help in pairing relation paths

Passage Retrieval Systems

System           | MRR    | % MRR improvement over baseline | % MRR improvement over NUS+QE | % Incorrect | Precision at top one passage
NUS (baseline)   | 0.2677 | N/A                             | N/A                           | 33.02%      | 0.1759
NUS+QE           | 0.3293 | +23.00%                         | N/A                           | 28.40%      | 0.2315
Rel_MI (NUS+QE)  | 0.4924 | +83.94%                         | +49.54%                       | 22.22%      | 0.4074
Rel_EM (NUS+QE)  | 0.4935 | +84.35%                         | +49.86%                       | 22.22%      | 0.4074

For comparison, Rel_EM (NUS) without query expansion: MRR 0.4761

Page 26: Question Answering Passage Retrieval Using Dependency Parsing

Outline

• Extracting and Pairing Relation Paths

• Measuring Path Match Scores

• Learning Relation Mapping Scores

• Evaluations

• Conclusions

Page 27: Question Answering Passage Retrieval Using Dependency Parsing

Conclusions

• Proposed a novel fuzzy relation matching method for factoid QA passage retrieval
  – Brings dramatic 70%+ improvement over the state-of-the-art systems
  – Brings further 50% improvement over query expansion
  – Future QA systems should bring in relations between words for better performance

• Query expansion should be integrated with relation matching seamlessly

Page 28: Question Answering Passage Retrieval Using Dependency Parsing

Q & A

Thanks!