![Page 1: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/1.jpg)
LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS
Date: 2012/11/22
Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor
Source: WWW ’12
Advisor: Dr. Jia-Ling Koh
Speaker: Yi-Hsuan Yeh
![Page 2: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/2.jpg)
OUTLINE
Introduction
Description of approach
Stage one: top candidate selection
Stage two: top candidate validation
Experiment
Offline
Online
Conclusion
![Page 3: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/3.jpg)
INTRODUCTION
Users struggle to express their information need as a short query.
![Page 4: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/4.jpg)
INTRODUCTION
Community-based Question Answering (CQA) sites, such as Yahoo! Answers or Baidu Zhidao
Each question has a Title and a Body
15% of the questions remain unanswered
Answer new questions by reusing the answers to past resolved questions
![Page 6: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/6.jpg)
A TWO-STAGE APPROACH
Stage one: find the most similar past question.
Stage two: decide whether or not to serve its answer.
![Page 7: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/7.jpg)
STAGE ONE: TOP CANDIDATE SELECTION
Vector-space unigram model with TF-IDF weights
TF-IDF vectors over terms w1 w2 w3 ... wn (title):
Qnew: 0.1 0.2 0.12 ... 0.8
Qpast 1: 0.3 0.5 0.2 ... 0.1
Qpast 2: 0.2 0 0.1 ... 0.6
...
Qpast n: 0.9 0.3 0.5 ... 0.1
Cosine similarity => threshold α
Ranking: Cos(Qpast title+body, Qnew title+body) => the top candidate past question and its answer A
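A minimal Python sketch of this stage. It assumes whitespace tokenization, a smoothed IDF, and a single cosine score used both for ranking and for the threshold α (the slide does not spell out which fields the threshold applies to). The names `tfidf_vectors`, `cosine`, `select_top_candidate` and the default `alpha=0.3` are illustrative and not taken from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [d.lower().split() for d in docs]  # assumption: whitespace tokens
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    # Assumption: smoothed IDF so that no term gets a zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()}
            for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_top_candidate(new_q, past_qs, alpha=0.3):
    """past_qs is a list of (question_text, answer) pairs.

    Returns (index, score) of the most similar past question,
    or None when nothing reaches the threshold alpha.
    """
    if not past_qs:
        return None
    vecs = tfidf_vectors([new_q] + [q for q, _ in past_qs])
    scores = [cosine(vecs[0], v) for v in vecs[1:]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return (best, scores[best]) if scores[best] >= alpha else None
```

Returning `None` below the threshold keeps the "no candidate" case explicit, so stage two only ever sees a candidate that passed α.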
![Page 8: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/8.jpg)
STAGE TWO: TOP CANDIDATE VALIDATION
Train a classifier that validates whether A can be served as an answer to Qnew.
![Page 9: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/9.jpg)
SURFACE-LEVEL FEATURES
Surface-level statistics
text length, number of question marks, stop word count, maximal IDF within all terms in the text, minimal IDF, average IDF, IDF standard deviation, http link count, number of figures.
Surface-level similarity
TF-IDF weighted word unigram vector space model, cosine similarity between:
Qnew title - Qpast title
Qnew body - Qpast body
Qnew title+body - Qpast title+body
Qnew title+body - Answer
Qpast title+body - Answer
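The surface-level statistics listed above could be computed roughly as follows. This is a sketch under stated assumptions: text length is counted in tokens, "figures" is read as runs of digits, the stop-word list is a tiny illustrative set, and `idf` is a precomputed term-to-IDF mapping supplied by the caller. None of these choices is confirmed by the slides.

```python
import re
import statistics

# Assumption: a tiny illustrative stop-word list, not the one used in the paper.
STOP_WORDS = {"the", "a", "an", "is", "my", "to", "of", "and", "in", "it"}


def surface_stats(text, idf, default_idf=0.0):
    """Compute surface-level statistics for one text (title, body or answer)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Terms missing from the IDF table fall back to default_idf.
    idfs = [idf.get(t, default_idf) for t in tokens] or [0.0]
    return {
        "length": len(tokens),                     # assumption: length in tokens
        "question_marks": text.count("?"),
        "stop_words": sum(t in STOP_WORDS for t in tokens),
        "max_idf": max(idfs),
        "min_idf": min(idfs),
        "avg_idf": statistics.fmean(idfs),
        "std_idf": statistics.pstdev(idfs),
        "http_links": len(re.findall(r"https?://", text)),
        "figures": len(re.findall(r"\d+", text)),  # assumption: figures = numerals
    }
```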
![Page 10: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/10.jpg)
LINGUISTIC ANALYSIS
Latent topic
LDA (Latent Dirichlet Allocation)
Topic distributions for Qnew, Qpast and A:
Topic 1: Qnew 0.3, Qpast 0.1, A 0.25
Topic 2: Qnew 0.03, Qpast 0.1, A 0.02
Topic 3: Qnew 0.15, Qpast 0.08, A 0.12
...
Topic n: Qnew 0.06, Qpast 0.13, A 0.05
• Entropy
• Most probable topic
• JS divergence
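Given topic distributions like the ones above, the three listed quantities can be computed directly. The sketch below assumes each distribution is a plain list of probabilities over the same topics and uses base-2 logarithms. The function names are illustrative.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a topic distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def kl_divergence(p, q):
    """KL(p || q) in bits; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(x * math.log2(x / y) for x, y in zip(p, q) if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded by 1 bit."""
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def most_probable_topic(p):
    """Index of the topic with the highest probability."""
    return max(range(len(p)), key=p.__getitem__)
```

For example, `js_divergence(q_new_topics, q_past_topics)` would give one pairwise feature, and the same call can be repeated for the question-answer pairs.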
![Page 11: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/11.jpg)
Lexico-syntactic analysis
Stanford dependency parser
Main verb, subject, object, the main noun and adjective
Ex:
Q1: Why doesn’t my dog eat?
Main predicate: eat
Main predicate argument: dog
Q2: Why doesn’t my cat eat?
Main predicate: eat
Main predicate argument: cat
![Page 12: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/12.jpg)
RESULT LIST ANALYSIS
Query clarity
Language model & KL divergence
[Figure: example distributions over terms A, B, C, D for Qnew, Qpast1, Qpast2, Qpast3 and Qpastall: (0.5, 0, 0.3, 0.2), (0, 0.5, 0.1, 0.4), (0.1, 0, 0, 0.9), (0.5, 0, 0.3, 0.2)]
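A common way to turn "language model & KL divergence" into a clarity score is to compare a unigram model of the retrieved list with a unigram model of the whole collection. The sketch below assumes maximum-likelihood unigram models, whitespace tokenization, and a result list that is a subset of the collection (so every result term has non-zero collection probability). It is an illustration, not the exact estimator used in the paper.

```python
import math
from collections import Counter


def unigram_lm(texts):
    """Maximum-likelihood unigram language model over a list of texts."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def clarity(result_texts, collection_texts):
    """KL(P_results || P_collection) in bits.

    Higher values mean the retrieved list is more focused than the
    collection as a whole. Assumes result_texts is drawn from
    collection_texts, so no smoothing is needed.
    """
    p = unigram_lm(result_texts)
    q = unigram_lm(collection_texts)
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items())
```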
![Page 13: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/13.jpg)
Query feedback
Informational similarity between two queries can be effectively estimated by the similarity between their ranked document lists.
Result list length
The number of questions that pass the threshold α
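Both features can be illustrated in a few lines. The sketch assumes that ranked lists are lists of question ids and measures list similarity with Jaccard overlap of the top k results. The overlap measure, the cutoff `k` and the function names are assumptions made here for illustration, since the slides do not state which list-similarity measure is used.

```python
def ranked_list_overlap(ranked_a, ranked_b, k=10):
    """Jaccard overlap between the top-k ids of two ranked result lists."""
    top_a, top_b = set(ranked_a[:k]), set(ranked_b[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 0.0


def result_list_length(scores, alpha):
    """Number of past questions whose similarity score passes alpha."""
    return sum(1 for s in scores if s >= alpha)
```

For instance, retrieving with Qnew and with the candidate Qpast as queries and comparing the two lists gives one similarity feature, while `result_list_length` counts how many past questions survived stage one.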
![Page 14: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/14.jpg)
CLASSIFIER MODEL
Random forest classifier
Each tree: random n features & n training past questions
![Page 16: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/16.jpg)
OFFLINE
Dataset
Yahoo! Answers categories: Beauty & Style, Health, and Pets.
Includes only best answers that were chosen by the askers and received at least three stars.
Between Feb and Dec 2010
![Page 17: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/17.jpg)
MTurk
Fleiss' kappa
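For reference, Fleiss' kappa measures agreement among a fixed number of raters per item. The sketch below follows the standard definition and assumes the input is a count matrix where `ratings[i][j]` is the number of raters who put item i in category j, with the same number of raters (at least two) for every item. The function name and input layout are illustrative.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a count matrix of items x categories.

    ratings[i][j] = number of raters who assigned item i to category j.
    Every row must sum to the same number of raters n, with n >= 2.
    Undefined (division by zero) when all ratings fall in one category.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_cats = len(ratings[0])
    # Proportion of all assignments that went to each category.
    p_cat = [sum(row[j] for row in ratings) / (n_items * n_raters)
             for j in range(n_cats)]
    # Per-item agreement: fraction of rater pairs that agree.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in ratings]
    p_bar = sum(p_item) / n_items        # observed agreement
    p_exp = sum(p * p for p in p_cat)    # agreement expected by chance
    return (p_bar - p_exp) / (1 - p_exp)
```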
![Page 18: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/18.jpg)
![Page 19: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/19.jpg)
![Page 20: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/20.jpg)
ONLINE
![Page 21: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/21.jpg)
![Page 23: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/23.jpg)
CONCLUSIONS
Short questions might suffer from vocabulary mismatch problems and sparsity.
Long, cumbersome descriptions introduce many irrelevant aspects, which can hardly be separated from the essential question details (even by a human reader).
Terms that are repeated in the past question and in its best answer should usually be emphasized more, as they relate to the expressed need.
![Page 24: LEARNING FROM THE PAST: ANSWERING NEW QUESTIONS WITH PAST ANSWERS Date: 2012/11/22 Author: Anna Shtok, Gideon Dror, Yoelle Maarek, Idan Szpektor Source:](https://reader033.vdocuments.net/reader033/viewer/2022052707/5a4d1b4c7f8b9ab0599a5d19/html5/thumbnails/24.jpg)
A general informative answer can satisfy a number of topically connected but different questions.
A general social answer may often satisfy a certain type of question.
In future work, we would like to better understand time-sensitive questions, such as those common in the Sports category.