© Johan Bos, April 2008
Question Answering (QA)
Lecture 1
• What is QA?
• Query Log Analysis
• Challenges in QA
• History of QA
• System Architecture
• Methods
• System Evaluation
• State-of-the-art
Lecture 2
• Question Analysis
• Background Knowledge
• Answer Typing
Lecture 3
• Query Generation
• Document Analysis
• Semantic Indexing
• Answer Extraction
• Selection and Ranking
[Figure: Pronto architecture. Labels in the diagram: question, parsing, ccg, boxing, drs, knowledge, WordNet, NomLex, answer typing, query, Indri, Indexed Documents, answer extraction, answer selection, answer reranking, answer]
Question Answering
Lecture 3
• Query Generation
• Document Analysis
• Semantic Indexing
• Answer Extraction
• Selection and Ranking
Query Generation
• Once we have analysed the question, we need to retrieve appropriate documents
• Most QA systems use an off-the-shelf information retrieval (IR) system for this task
• Examples:
– Lemur
– Lucene
– Indri (used by Pronto)
• The input of the IR system is a query; the output is a ranked set of documents
Queries
• Query generation depends on the way documents are indexed
• Based on
– Semantic analysis of the question
– Expected answer type
– Background knowledge
• Computing a good query is hard – we don’t want too few documents, and we don’t want too many!
Generating Query Terms
• Example 1:
– Question: Who discovered prions?
– Text A: Dr. Stanley Prusiner received the Nobel prize for the discovery of prions.
– Text B: Prions are a kind of protein that…
• Query terms?
Generating Query Terms
• Example 2:
– Question: When did Franz Kafka die?
– Text A: Kafka died in 1924.
– Text B: Dr. Franz died in 1971.
• Query terms?
Generating Query Terms
• Example 3:
– Question: How did actor James Dean die?
– Text: James Dean was killed in a car accident.
• Query terms?
Useful query terms
• Ranked by importance:
– Named entities
– Dates or time expressions
– Expressions in quotes
– Nouns
– Verbs
• Queries can be expanded using the local knowledge base created for the question
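The ranking above can be sketched in a few lines of Python. This is only an illustration, not Pronto's implementation: the class labels, the `PRIORITY` table and the function `select_query_terms` are assumptions made for this example, and the terms are tagged by hand rather than by an NLP pipeline.

```python
# Priority of term classes, following the ranking above (lower = more important).
# The label names are hypothetical; any tagger output could be mapped onto them.
PRIORITY = {"named_entity": 0, "date": 1, "quoted": 2, "noun": 3, "verb": 4}

def select_query_terms(tagged_terms, max_terms=3):
    """Return the most important terms; ties keep their order in the question."""
    ranked = sorted(tagged_terms, key=lambda pair: PRIORITY[pair[1]])
    return [term for term, _ in ranked[:max_terms]]

# Hand-tagged terms for "How did actor James Dean die?"
terms = [("actor", "noun"), ("James Dean", "named_entity"), ("die", "verb")]
print(select_query_terms(terms, max_terms=2))
```

With `max_terms=2` the verb is dropped first, which fits Example 3: the text says "was killed", so "die" would not match it anyway.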
Query expansion example
• Question (TREC 44.6, Sacajawea): How much is the Sacajawea coin worth?
• Query: sacajawea
– Returns only five documents
• Use synonyms in query expansion
• New query: sacajawea OR sagajawea
– Returns two hundred documents
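A minimal sketch of this kind of expansion, assuming a small table of known variants is available (the `VARIANTS` dictionary and the function name are hypothetical). The `OR` notation simply follows the slide; a real IR engine has its own query syntax.

```python
def expand_query(term, variants):
    """Join a term and its known variants into a disjunctive (OR) query string."""
    alternatives = [term] + [v for v in variants.get(term, []) if v != term]
    return " OR ".join(alternatives)

# Hypothetical variant table; in a QA system it would come from the knowledge base.
VARIANTS = {"sacajawea": ["sagajawea"]}

print(expand_query("sacajawea", VARIANTS))
print(expand_query("kafka", VARIANTS))  # no variants known: query is unchanged
```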
Question Answering
Lecture 3
• Query Generation
• Document Analysis
• Semantic Indexing
• Answer Extraction
• Selection and Ranking
Document Analysis – Why?
• The aim of QA is to output answers, not documents
• We need document analysis to
– Find the correct type of answer in the documents
– Calculate the probability that an answer is correct
• Semantic analysis is important to get valid answers
Document Analysis – When?
• After retrieval
– token or word based index
– keyword queries
– low precision
• Before retrieval
– semantic indexing
– concept queries
– high precision
– More NLP required
Document Analysis – How?
• Ideally use the same NLP tools as for question analysis
– This will make the semantic matching of Question and Answer easier
– Not always possible: wide-coverage tools are usually good at analysing text, but not at analysing questions
– Questions are often not part of the large annotated corpora on which NLP tools are trained
Documents vs Passages
• Split documents into smaller passages
– This will make the semantic matching faster and more accurate
– In Pronto the passage size is two sentences, implemented by a sliding window
• Passages that are too small risk losing important contextual information
– Pronouns and referring expressions
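A two-sentence sliding window can be sketched as follows. The step size of one sentence and the function name `sliding_passages` are assumptions for this example (the slide only states the window size), and sentence splitting is taken as already done.

```python
def sliding_passages(sentences, size=2):
    """Return every run of `size` consecutive sentences, moving one sentence at a time."""
    if len(sentences) <= size:
        return [" ".join(sentences)] if sentences else []
    return [" ".join(sentences[i:i + size])
            for i in range(len(sentences) - size + 1)]

doc = ["Kafka lived in Austria.", "He died in 1924.",
       "Both Kafka and Lenin died in 1924."]
for passage in sliding_passages(doc):
    print(passage)
```

Because consecutive windows overlap, the pronoun "He" ends up in the same passage as its antecedent "Kafka" at least once, which is the contextual information a one-sentence passage would lose.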
Document Analysis
• Tokenisation
• Part of speech tagging
• Lemmatisation
• Syntactic analysis (Parsing)
• Semantic analysis (Boxing)
• Named entity recognition
• Anaphora resolution
Why semantics is important
• Example:
– Question: When did Franz Kafka die?
– Text A: The mother of Franz Kafka died in 1918.
– Text B: Kafka lived in Austria. He died in 1924.
– Text C: Both Kafka and Lenin died in 1924.
– Text D: Max Brod, who knew Kafka, died in 1930.
– Text E: Someone who knew Kafka died in 1930.
DRS for “The mother of Franz Kafka died in 1918.”
 _____________________
| x3 x4 x2 x1         |
|---------------------|
| mother(x3)          |
| named(x4,kafka,per) |
| named(x4,franz,per) |
| die(x2)             |
| thing(x1)           |
| event(x2)           |
| of(x3,x4)           |
| agent(x2,x3)        |
| in(x2,x1)           |
| timex(x1)=+1918XXXX |
|_____________________|
DRS for: “Kafka lived in Austria. He died in 1924.”
  _______________________   _____________________
 | x3 x1 x2              | | x5 x4               |
 |-----------------------| |---------------------|
(| male(x3)              |+| die(x5)             |)
 | named(x3,kafka,per)   | | thing(x4)           |
 | live(x1)              | | event(x5)           |
 | agent(x1,x3)          | | agent(x5,x3)        |
 | named(x2,austria,loc) | | in(x5,x4)           |
 | event(x1)             | | timex(x4)=+1924XXXX |
 | in(x1,x2)             | |_____________________|
 |_______________________|
DRS for: “Both Kafka and Lenin died in 1924.”
 _____________________
| x6 x5 x4 x3 x2 x1   |
|---------------------|
| named(x6,kafka,per) |
| die(x5)             |
| event(x5)           |
| agent(x5,x6)        |
| in(x5,x4)           |
| timex(x4)=+1924XXXX |
| named(x3,lenin,per) |
| die(x2)             |
| event(x2)           |
| agent(x2,x3)        |
| in(x2,x1)           |
| timex(x1)=+1924XXXX |
|_____________________|
DRS for: “Max Brod, who knew Kafka, died in 1930.”
 _____________________
| x3 x5 x4 x2 x1      |
|---------------------|
| named(x3,brod,per)  |
| named(x3,max,per)   |
| named(x5,kafka,per) |
| know(x4)            |
| event(x4)           |
| agent(x4,x3)        |
| patient(x4,x5)      |
| die(x2)             |
| event(x2)           |
| agent(x2,x3)        |
| in(x2,x1)           |
| timex(x1)=+1930XXXX |
|_____________________|
DRS for: “Someone who knew Kafka died in 1930.”
 _____________________
| x3 x5 x4 x2 x1      |
|---------------------|
| person(x3)          |
| named(x5,kafka,per) |
| know(x4)            |
| event(x4)           |
| agent(x4,x3)        |
| patient(x4,x5)      |
| die(x2)             |
| event(x2)           |
| agent(x2,x3)        |
| in(x2,x1)           |
| timex(x1)=+1930XXXX |
|_____________________|
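To see why these representations help, the DRS conditions can be written down as plain tuples and queried. The sketch below is an illustration only: the tuple encoding and the function `death_dates` are assumptions for this example, not Boxer's output format or Pronto's code. It looks for a `die` event whose agent is the referent named "kafka", and returns the time expression attached to that event.

```python
def death_dates(conditions, name):
    """Return timex values of 'die' events whose agent is named `name`."""
    people = {c[1] for c in conditions if c[0] == "named" and c[2] == name}
    deaths = {c[1] for c in conditions if c[0] == "die"}
    events = {c[1] for c in conditions
              if c[0] == "agent" and c[1] in deaths and c[2] in people}
    times = {c[2] for c in conditions if c[0] == "in" and c[1] in events}
    return sorted(c[2] for c in conditions if c[0] == "timex" and c[1] in times)

# Text A: "The mother of Franz Kafka died in 1918."
TEXT_A = [("mother", "x3"), ("named", "x4", "kafka", "per"),
          ("named", "x4", "franz", "per"), ("die", "x2"), ("of", "x3", "x4"),
          ("agent", "x2", "x3"), ("in", "x2", "x1"), ("timex", "x1", "+1918XXXX")]
# Text B: "Kafka lived in Austria. He died in 1924." (pronoun already resolved to x3)
TEXT_B = [("named", "x3", "kafka", "per"), ("live", "x1"), ("agent", "x1", "x3"),
          ("named", "x2", "austria", "loc"), ("in", "x1", "x2"), ("die", "x5"),
          ("agent", "x5", "x3"), ("in", "x5", "x4"), ("timex", "x4", "+1924XXXX")]
# Text D: "Max Brod, who knew Kafka, died in 1930."
TEXT_D = [("named", "x3", "brod", "per"), ("named", "x3", "max", "per"),
          ("named", "x5", "kafka", "per"), ("know", "x4"), ("agent", "x4", "x3"),
          ("patient", "x4", "x5"), ("die", "x2"), ("agent", "x2", "x3"),
          ("in", "x2", "x1"), ("timex", "x1", "+1930XXXX")]

for label, drs in [("A", TEXT_A), ("B", TEXT_B), ("D", TEXT_D)]:
    print(label, death_dates(drs, "kafka"))
```

All three texts contain "Kafka", "died" and a year, so a keyword match cannot tell them apart. Only Text B links the dying event to the referent named Kafka.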
Document Analysis
• Tokenisation
• Part of speech tagging
• Lemmatisation
• Syntactic analysis (Parsing)
• Semantic analysis (Boxing)
• Named entity recognition
• Anaphora resolution
Recall the Answer-Type Taxonomy
• We divided questions according to their expected answer type
• Simple Answer-Type Taxonomy
– PERSON
– NUMERAL
– DATE
– MEASURE
– LOCATION
– ORGANISATION
– ENTITY
Named Entity Recognition
• In order to make use of the answer types, we need to be able to recognise named entities of the same types in the documents
– PERSON
– NUMERAL
– DATE
– MEASURE
– LOCATION
– ORGANISATION
– ENTITY
Example Text
Italy’s business world was rocked by the announcement last Thursday that Mr. Verdi would leave his job as vice-president of Music Masters of Milan, Inc to become operations director of Arthur Andersen.
Named Entity Recognition
<ENAMEX TYPE="LOCATION">Italy</ENAMEX>’s business world was rocked by the announcement <TIMEX TYPE="DATE">last Thursday</TIMEX> that Mr. <ENAMEX TYPE="PERSON">Verdi</ENAMEX> would leave his job as vice-president of <ENAMEX TYPE="ORGANIZATION">Music Masters of Milan, Inc</ENAMEX> to become operations director of <ENAMEX TYPE="ORGANIZATION">Arthur Andersen</ENAMEX>.
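Once a text carries ENAMEX/TIMEX markup, the typed entities can be read off mechanically. The following sketch assumes the markup uses straight double quotes and matching closing tags; the sample is a shortened version of the example sentence, and the function name is hypothetical.

```python
import re

# Matches <ENAMEX TYPE="...">text</ENAMEX> and <TIMEX TYPE="...">text</TIMEX>.
TAG = re.compile(r'<(ENAMEX|TIMEX) TYPE="([A-Z]+)">(.*?)</\1>')

def extract_entities(tagged):
    """Return (text, type) pairs for every ENAMEX/TIMEX element in the string."""
    return [(m.group(3), m.group(2)) for m in TAG.finditer(tagged)]

sample = ('<ENAMEX TYPE="LOCATION">Italy</ENAMEX>\'s business world was rocked by '
          'the announcement <TIMEX TYPE="DATE">last Thursday</TIMEX> that Mr. '
          '<ENAMEX TYPE="PERSON">Verdi</ENAMEX> would leave his job.')

for text, label in extract_entities(sample):
    print(label, text)
```

The extracted types can then be compared with the expected answer type of the question.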
NER difficulties
• Several types of entities are too numerous to include in dictionaries
• New names turn up every day
• Ambiguities – Paris, Lazio
• Different forms of same entities in same text
– Brian Jones … Mr. Jones
• Capitalisation
NER approaches
• Rule-based approaches
– Hand-crafted rules
– Help from databases of known named entities [e.g. locations]
• Statistical approaches
– Features
– Machine learning
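A toy version of the rule-based approach is sketched below, combining one hand-crafted rule (a title followed by a capitalised word is a PERSON) with a lookup in a small list of known locations. The rule, the `GAZETTEER` entries and the function name are all assumptions for illustration; real systems use far larger rule sets and databases.

```python
import re

GAZETTEER = {"Italy": "LOCATION", "Milan": "LOCATION"}  # hypothetical toy database
TITLE_RULE = re.compile(r"\b(?:Mr|Mrs|Dr)\. ([A-Z][a-z]+)")

def tag_entities(text):
    """Return (start, surface, type) triples found by the rule and the lookup."""
    found = [(m.start(1), m.group(1), "PERSON") for m in TITLE_RULE.finditer(text)]
    for name, label in GAZETTEER.items():
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            found.append((m.start(), name, label))
    return sorted(found)

text = ("Italy's business world was rocked by the announcement that "
        "Mr. Verdi would leave Music Masters of Milan.")
print([(surface, label) for _, surface, label in tag_entities(text)])
```

The lookup tags "Milan" as a LOCATION even though here it is part of an organisation name, which is exactly the kind of ambiguity that makes NER difficult.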
Document Analysis
• Tokenisation
• Part of speech tagging
• Lemmatisation
• Syntactic analysis (Parsing)
• Semantic analysis (Boxing)
• Named entity recognition
• Anaphora resolution
What is anaphora?
• Relation between a pronoun and another element in the same or earlier sentence
• Anaphoric pronouns:
– he, him, she, her, it, they, them
• Anaphoric noun phrases:
– the country
– these documents
– his hat, her dress
Anaphora (pronouns)
• Question: What is the biggest sector in Andorra’s economy?
• Corpus: Andorra is a tiny land-locked country in southwestern Europe, between France and Spain. Tourism, the largest sector of its tiny, well-to-do economy, accounts for roughly 80% of the GDP.
• Answer: ?
Anaphora (definite descriptions)
• Question: What is the biggest sector in Andorra’s economy?
• Corpus: Andorra is a tiny land-locked country in southwestern Europe, between France and Spain. Tourism, the largest sector of the country’s tiny, well-to-do economy, accounts for roughly 80% of the GDP.
• Answer: ?
Anaphora Resolution
• Anaphora Resolution is the task of finding the antecedents of anaphoric expressions
• Example system:
– Mitkov, Evans & Orasan (2002)
– http://clg.wlv.ac.uk/MARS/
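To make the task concrete, here is a deliberately naive resolver: each pronoun is linked to the nearest preceding entity with the same gender. This recency heuristic is an assumption chosen for illustration and is not the algorithm of the system cited above; the mention list and its gender labels are supplied by hand.

```python
def resolve(mentions):
    """Map each pronoun's index to the index of the nearest matching antecedent."""
    links = {}
    for i, (_, kind, gender) in enumerate(mentions):
        if kind != "pronoun":
            continue
        # Walk backwards through earlier mentions until the gender agrees.
        for j in range(i - 1, -1, -1):
            _, prev_kind, prev_gender = mentions[j]
            if prev_kind == "entity" and prev_gender == gender:
                links[i] = j
                break
    return links

# Mentions in "Kafka lived in Austria. He died in 1924."
mentions = [("Kafka", "entity", "male"),
            ("Austria", "entity", "neuter"),
            ("He", "pronoun", "male")]

for pronoun, antecedent in resolve(mentions).items():
    print(mentions[pronoun][0], "->", mentions[antecedent][0])
```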
“Kafka lived in Austria. He died in 1924.”
  _______________________   _____________________
 | x3 x1 x2              | | x5 x4               |
 |-----------------------| |---------------------|
(| male(x3)              |+| die(x5)             |)
 | named(x3,kafka,per)   | | thing(x4)           |
 | live(x1)              | | event(x5)           |
 | agent(x1,x3)          | | agent(x5,x3)        |
 | named(x2,austria,loc) | | in(x5,x4)           |
 | event(x1)             | | timex(x4)=+1924XXXX |
 | in(x1,x2)             | |_____________________|
 |_______________________|
Co-reference resolution
• Question: What is the biggest sector in Andorra’s economy?
• Corpus: Andorra is a tiny land-locked country in southwestern Europe, between France and Spain. Tourism, the largest sector of Andorra’s tiny, well-to-do economy, accounts for roughly 80% of the GDP.
• Answer: Tourism
Question Answering
Lecture 3
• Query Generation
• Document Analysis
• Semantic Indexing
• Answer Extraction
• Selection and Ranking
Semantic indexing
• If we index documents on the token level, we cannot search for specific semantic concepts
• If we index documents on semantic concepts, we can formulate more specific queries
• Semantic indexing requires a complete preprocessing of the entire document collection [can be costly]
Semantic indexing example
• Example NL question: When did Franz Kafka die?
• Term-based
– query: kafka
– Returns all passages containing the term "kafka"
• Concept-based
– query: DATE & kafka
– Returns all passages containing the term "kafka" and a date expression
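The difference between the two query styles can be sketched with a toy index in which every passage is indexed by its tokens plus a concept label. Everything here is an assumption for illustration: a crude four-digit-year pattern stands in for a real date recogniser, and the function names are hypothetical.

```python
import re

YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")  # crude stand-in for DATE recognition

def index_terms(text):
    """Return the index terms of a passage: lower-cased tokens plus concept labels."""
    terms = set(re.findall(r"[a-z0-9]+", text.lower()))
    if YEAR.search(text):
        terms.add("DATE")  # upper case, so it cannot clash with a token
    return terms

def search(passages, query_terms):
    """Return the passages whose index terms include every query term."""
    return [p for p in passages if set(query_terms) <= index_terms(p)]

passages = ["Kafka lived in Austria.", "He died in 1924.",
            "Both Kafka and Lenin died in 1924."]
print(search(passages, ["kafka"]))          # term-based query
print(search(passages, ["DATE", "kafka"]))  # concept-based query
```

The concept-based query filters out the passage that mentions Kafka without any date, at the price of having to run the date recogniser over the whole collection at indexing time.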
Question Answering
Lecture 3
• Query Generation
• Document Analysis
• Semantic Indexing
• Answer Extraction
• Selection and Ranking
Answer extraction
• Passage retrieval gives us a set of ranked documents
• Match answer with question
– DRS for question
– DRS for each possible document
– Score for amount of overlap
• Deep inference or shallow matching
• Use knowledge
Answer extraction: matching
• Given a question and an expression with a potential answer, calculate a matching score S = match(Q,A) that indicates how well Q matches A
• Example
– Q: When was Franz Kafka born?
– A1: Franz Kafka died in 1924.
– A2: Kafka was born in 1883.
Using logical inference
• Recall that Boxer produces first order representations [DRSs]
• In theory we could use a theorem prover to check whether a retrieved passage entails or is inconsistent with a question
• In practice this is too costly, given the high number of possible answer + question pairs that need to be considered
• Also: theorem provers are precise – they don’t give us information if they almost find a proof, although this would be useful for QA
Semantic matching
• Matching is an efficient approximation to the inference task
• Consider a flat semantic representation of the passage and the question
• Matching gives a score of the amount of overlap between the semantic content of the question and a potential answer
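A very shallow form of such a score counts how many predicate symbols of the question also occur in the passage. The sketch below is one possible scoring function under that assumption, not the one used in Pronto. The literals for the question and for "Franz Kafka died in 1924." follow the lecture's Kafka example; the representation `A2` for "Kafka was born in 1883." is hypothetical.

```python
def symbols(literals):
    """Collect the predicate symbols of a flat representation."""
    return {pred for pred, *_ in literals}

def overlap_score(question, passage):
    """Fraction of question symbols (ignoring 'answer') that occur in the passage."""
    q = symbols(question) - {"answer"}
    return len(q & symbols(passage)) / len(q)

Q = [("answer", "X"), ("franz", "Y"), ("kafka", "Y"), ("born", "E"),
     ("patient", "E", "Y"), ("temp", "E", "X")]
A1 = [("franz", "x1"), ("kafka", "x1"), ("die", "x3"),
      ("agent", "x3", "x1"), ("in", "x3", "x2"), ("1924", "x2")]
# Hypothetical representation of "Kafka was born in 1883." (not given in the lecture).
A2 = [("kafka", "x1"), ("born", "x2"), ("patient", "x2", "x1"),
      ("in", "x2", "x3"), ("1883", "x3")]

print(overlap_score(Q, A1), overlap_score(Q, A2))
```

The passage about Kafka's birth shares more symbols with the question than the passage about his death, so it receives the higher score. This version ignores the variables entirely, which is what the unification steps of semantic matching add.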
Matching Example
• Question: When was Franz Kafka born?
• Passage 1: Franz Kafka died in 1924.
• Passage 2: Kafka was born in 1883.
Semantic Matching [1]
Q: answer(X), franz(Y), kafka(Y), born(E), patient(E,Y), temp(E,X)
A1: franz(x1), kafka(x1), die(x3), agent(x3,x1), in(x3,x2), 1924(x2)
Semantic Matching [1]
Q: answer(X), franz(Y), kafka(Y), born(E), patient(E,Y), temp(E,X)
A1: franz(x1), kafka(x1), die(x3), agent(x3,x1), in(x3,x2), 1924(x2)
X=x2
![Page 60: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/60.jpg)
Semantic Matching [1]
Q: answer(x2), franz(Y), kafka(Y), born(E), patient(E,Y), temp(E,x2)
A1: franz(x1), kafka(x1), die(x3), agent(x3,x1), in(x3,x2), 1924(x2)
![Page 61: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/61.jpg)
Semantic Matching [1]
Q: answer(x2), franz(Y), kafka(Y), born(E), patient(E,Y), temp(E,x2)
A1: franz(x1), kafka(x1), die(x3), agent(x3,x1), in(x3,x2), 1924(x2)
Y=x1
![Page 62: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/62.jpg)
Semantic Matching [1]
Q: answer(x2), franz(x1), kafka(x1), born(E), patient(E,x1), temp(E,x2)
A1: franz(x1), kafka(x1), die(x3), agent(x3,x1), in(x3,x2), 1924(x2)
![Page 64: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/64.jpg)
Semantic Matching [1]
Q: answer(x2), franz(x1), kafka(x1), born(E), patient(E,x1), temp(E,x2)
A1: franz(x1), kafka(x1), die(x3), agent(x3,x1), in(x3,x2), 1924(x2)
Match score = 3/6 = 0.50
![Page 65: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/65.jpg)
Semantic Matching [2]
Q: answer(X), franz(Y), kafka(Y), born(E), patient(E,Y), temp(E,X)
A2: kafka(x1), born(x3), patient(x3,x1), in(x3,x2), 1883(x2)
![Page 66: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/66.jpg)
Semantic Matching [2]
Q: answer(X), franz(Y), kafka(Y), born(E), patient(E,Y), temp(E,X)
A2: kafka(x1), born(x3), patient(x3,x1), in(x3,x2), 1883(x2)
X=x2
![Page 67: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/67.jpg)
Semantic Matching [2]
Q: answer(x2), franz(Y), kafka(Y), born(E), patient(E,Y), temp(E,x2)
A2: kafka(x1), born(x3), patient(x3,x1), in(x3,x2), 1883(x2)
![Page 68: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/68.jpg)
Semantic Matching [2]
Q: answer(x2), franz(Y), kafka(Y), born(E), patient(E,Y), temp(E,x2)
A2: kafka(x1), born(x3), patient(x3,x1), in(x3,x2), 1883(x2)
Y=x1
![Page 69: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/69.jpg)
Semantic Matching [2]
Q: answer(x2), franz(x1), kafka(x1), born(E), patient(E,x1), temp(E,x2)
A2: kafka(x1), born(x3), patient(x3,x1), in(x3,x2), 1883(x2)
![Page 70: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/70.jpg)
Semantic Matching [2]
Q: answer(x2), franz(x1), kafka(x1), born(E), patient(E,x1), temp(E,x2)
A2: kafka(x1), born(x3), patient(x3,x1), in(x3,x2), 1883(x2)
E=x3
![Page 71: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/71.jpg)
Semantic Matching [2]
Q: answer(x2), franz(x1), kafka(x1), born(x3), patient(x3,x1), temp(x3,x2)
A2: kafka(x1), born(x3), patient(x3,x1), in(x3,x2), 1883(x2)
![Page 73: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/73.jpg)
Semantic Matching [2]
Q: answer(x2), franz(x1), kafka(x1), born(x3), patient(x3,x1), temp(x3,x2)
A2: kafka(x1), born(x3), patient(x3,x1), in(x3,x2), 1883(x2)
Match score = 4/6 = 0.67
![Page 74: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/74.jpg)
Matching Example
• Question: When was Franz Kafka born?
• Passage 1: Franz Kafka died in 1924. (Match score = 0.50)
• Passage 2: Kafka was born in 1883. (Match score = 0.67)
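The matching walk-through can be sketched in a few lines of Python. This is only an illustration under assumptions, not the PRONTO implementation: literals are `(predicate, arguments)` tuples, terms starting with an upper-case letter are question variables, the answer variable `X` is bound to a candidate before matching (as on the slides), and `answer(X)` counts as matched once `X` is bound. The names `unify`, `best_match` and `match_score` are made up for this sketch.

```python
from typing import Dict, List, Optional, Tuple

Literal = Tuple[str, Tuple[str, ...]]  # (predicate, arguments)
Bindings = Dict[str, str]


def is_var(term: str) -> bool:
    # Assumption: question variables are upper-case (X, Y, E).
    return term[0].isupper()


def unify(q_args, p_args, bindings: Bindings) -> Optional[Bindings]:
    """Extend the bindings so that q_args equals p_args, or return None."""
    new = dict(bindings)
    for q, p in zip(q_args, p_args):
        if is_var(q):
            if new.setdefault(q, p) != p:
                return None
        elif q != p:
            return None
    return new


def best_match(question: List[Literal], passage: List[Literal],
               bindings: Bindings) -> int:
    """Largest number of question literals matched under one consistent binding."""
    if not question:
        return 0
    (pred, args), rest = question[0], question[1:]
    best = best_match(rest, passage, bindings)  # leave this literal unmatched
    for p_pred, p_args in passage:
        if p_pred == pred and len(p_args) == len(args):
            new = unify(args, p_args, bindings)
            if new is not None:
                best = max(best, 1 + best_match(rest, passage, new))
    return best


def match_score(question: List[Literal], passage: List[Literal],
                answer_binding: Bindings) -> float:
    rest = [lit for lit in question if lit[0] != "answer"]
    # answer(X) is counted as matched because X is bound to a candidate.
    matched = 1 + best_match(rest, passage, answer_binding)
    return matched / len(question)


QUESTION = [("answer", ("X",)), ("franz", ("Y",)), ("kafka", ("Y",)),
            ("born", ("E",)), ("patient", ("E", "Y")), ("temp", ("E", "X"))]
PASSAGE_1 = [("franz", ("x1",)), ("kafka", ("x1",)), ("die", ("x3",)),
             ("agent", ("x3", "x1")), ("in", ("x3", "x2")), ("1924", ("x2",))]
PASSAGE_2 = [("kafka", ("x1",)), ("born", ("x3",)), ("patient", ("x3", "x1")),
             ("in", ("x3", "x2")), ("1883", ("x2",))]

score_1 = match_score(QUESTION, PASSAGE_1, {"X": "x2"})
score_2 = match_score(QUESTION, PASSAGE_2, {"X": "x2"})
```

With these inputs the sketch gives 3/6 for Passage 1 and 4/6 for Passage 2, the same scores as on the slides. The exhaustive search is fine for a handful of literals but would need pruning for longer passages.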
![Page 75: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/75.jpg)
Matching Techniques
• Weighted matching
  – Higher weight for named entities
  – Estimate weights using machine learning
• Incorporate background knowledge
  – WordNet [hyponyms]
  – NomLex
  – Paraphrases: BORN(E) & IN(E,Y) & DATE(Y) → TEMP(E,Y)
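A rough Python sketch of two of these techniques follows: applying the paraphrase rule as background knowledge, and weighting named-entity literals more heavily. It makes several assumptions that are not in the slides: the passage carries an explicit `date(x2)` literal, the question literals are already ground (variables replaced by passage constants), and the weights are hand-picked rather than learned. The function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Literal = Tuple[str, Tuple[str, ...]]


def apply_born_in_date_rule(passage: Set[Literal]) -> Set[Literal]:
    """Add temp(E,Y) whenever born(E), in(E,Y) and date(Y) all hold."""
    expanded = set(passage)
    born_events = {args[0] for pred, args in passage if pred == "born"}
    dates = {args[0] for pred, args in passage if pred == "date"}
    for pred, args in passage:
        if pred == "in" and args[0] in born_events and args[1] in dates:
            expanded.add(("temp", args))
    return expanded


def weighted_score(question: List[Literal], passage: Set[Literal],
                   weights: Dict[str, float]) -> float:
    """Weighted share of ground question literals found in the passage."""
    total = sum(weights.get(pred, 1.0) for pred, _ in question)
    hit = sum(weights.get(pred, 1.0)
              for pred, args in question if (pred, args) in passage)
    return hit / total


# Ground question literals for "When was Franz Kafka born?"
GROUND_QUESTION = [("franz", ("x1",)), ("kafka", ("x1",)), ("born", ("x3",)),
                   ("patient", ("x3", "x1")), ("temp", ("x3", "x2"))]
# Passage 2, with an assumed date(x2) literal added for the rule to fire.
PASSAGE_2 = {("kafka", ("x1",)), ("born", ("x3",)), ("patient", ("x3", "x1")),
             ("in", ("x3", "x2")), ("1883", ("x2",)), ("date", ("x2",))}
# Hand-picked weights: named entities count double (an assumption).
WEIGHTS = {"franz": 2.0, "kafka": 2.0}

before = weighted_score(GROUND_QUESTION, PASSAGE_2, WEIGHTS)
after = weighted_score(GROUND_QUESTION, apply_born_in_date_rule(PASSAGE_2), WEIGHTS)
```

Here the rule lets `temp(x3,x2)` match, so the weighted score rises from 4/7 to 5/7.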
![Page 76: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/76.jpg)
Question Answering
Lecture 3
• Query Generation
• Document Analysis
• Semantic Indexing
• Answer Extraction
• Selection and Ranking
![Page 77: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/77.jpg)
[Figure: Architecture of PRONTO. Diagram labels: question, parsing, ccg, boxing, drs, knowledge (WordNet, NomLex), answer typing, query, Indri, Indexed Documents, answer extraction, answer selection, answer reranking, answer]
![Page 78: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/78.jpg)
Answer selection
• Rank answers
  – Group duplicates
  – Syntactically or semantically equivalent
  – Sort on frequency
• How specific should an answer be?
  – Semantic relations between answers
  – Hyponyms, synonyms
  – Answer modelling [PhD thesis Dalmas 2007]
• Answer cardinality
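Grouping duplicates and sorting on frequency might look as follows in Python. This is a minimal sketch: the `normalize` function only handles surface variation (case, punctuation, whitespace) and is an assumption. Detecting semantically equivalent answers would need more than string normalisation.

```python
from collections import Counter
from typing import List, Tuple


def normalize(answer: str) -> str:
    """Collapse case, commas, full stops and extra whitespace."""
    cleaned = answer.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def rank_by_frequency(candidates: List[str]) -> List[Tuple[str, int]]:
    """Group equivalent candidate strings and sort the groups by frequency."""
    counts = Counter(normalize(c) for c in candidates)
    return counts.most_common()


ranked = rank_by_frequency(["Oklahoma", "oklahoma", "Okemah, Okla.", "Oklahoma."])
```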
![Page 79: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/79.jpg)
Answer selection example 1
• Where did Franz Kafka die?
  – In his bed
  – In a sanatorium
  – In Kierling
  – Near Vienna
  – In Austria
  – In Berlin
  – In Germany
![Page 80: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/80.jpg)
Answer selection example 2
• Where is 3M based?
  – In Maplewood
  – In Maplewood, Minn.
  – In Minnesota
  – In the U.S.
  – In Maplewood, Minn., USA
  – In San Francisco
  – In the Netherlands
![Page 81: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/81.jpg)
[Figure: Architecture of PRONTO. Diagram labels: question, parsing, ccg, boxing, drs, knowledge (WordNet, NomLex), answer typing, query, Indri, Indexed Documents, answer extraction, answer selection, answer reranking, answer]
![Page 82: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/82.jpg)
Reranking
• Most QA systems first produce a list of possible answers…
• This is usually followed by a process called reranking
• Reranking promotes correct answers to a higher rank
![Page 83: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/83.jpg)
Factors in reranking
• Matching score
  – The better the match with the question, the more likely the answer is to be correct
• Frequency
  – If the same answer occurs many times, it is likely to be correct
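One possible way to combine the two factors is a weighted sum, sketched below in Python. The slides do not say how the factors are combined, so the linear formula, the `alpha` parameter and the sample frequencies are all assumptions for illustration.

```python
from typing import List, Tuple

Candidate = Tuple[str, float, int]  # (answer, matching score, frequency)


def rerank(candidates: List[Candidate], alpha: float = 0.5) -> List[Candidate]:
    """Sort candidates by alpha * match + (1 - alpha) * relative frequency."""
    if not candidates:
        return []
    max_freq = max(freq for _, _, freq in candidates)

    def combined(candidate: Candidate) -> float:
        _, match, freq = candidate
        return alpha * match + (1 - alpha) * freq / max_freq

    return sorted(candidates, key=combined, reverse=True)


# Matching scores from the Kafka example; the frequencies are made up.
reranked = rerank([("1924", 0.50, 1), ("1883", 0.67, 3)])
```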
![Page 84: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/84.jpg)
Answer Validation
• Answer Validation
  – Check whether an answer is likely to be correct using an expensive method
• Tie breaking
  – Deciding between two answers with similar probability
• Methods:
  – Inference check
  – Sanity checking
  – Googling
![Page 85: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/85.jpg)
Inference check
• Use first-order logic [FOL] to check whether a potential answer entails the question
• This can be done with the use of a theorem prover
  – Translate Q into FOL
  – Translate A into FOL
  – Translate background knowledge into FOL
  – If ((BKfol & Afol) → Qfol) is a theorem, we have a likely answer
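The shape of this check can be shown with a toy Python example. It is a deliberate simplification: it uses propositional logic and a truth-table test in place of first-order logic and a real theorem prover, and the atoms `born_in_1883` and `born_at_time_1883` are invented stand-ins for the translated answer and question.

```python
from itertools import product
from typing import Callable, Dict, List

Valuation = Dict[str, bool]


def is_theorem(formula: Callable[[Valuation], bool], atoms: List[str]) -> bool:
    """True if the formula holds under every truth assignment to the atoms."""
    return all(
        formula(dict(zip(atoms, values)))
        for values in product([False, True], repeat=len(atoms))
    )


def implies(a: bool, b: bool) -> bool:
    return (not a) or b


ATOMS = ["born_in_1883", "born_at_time_1883"]


def bk(v: Valuation) -> bool:
    # Background knowledge: being born "in 1883" means being born at that time.
    return implies(v["born_in_1883"], v["born_at_time_1883"])


def answer(v: Valuation) -> bool:
    return v["born_in_1883"]


def question(v: Valuation) -> bool:
    return v["born_at_time_1883"]


# (BK & A) -> Q is a theorem; A -> Q alone is not.
with_bk = is_theorem(lambda v: implies(bk(v) and answer(v), question(v)), ATOMS)
without_bk = is_theorem(lambda v: implies(answer(v), question(v)), ATOMS)
```

The second check fails without the background knowledge, which is why that knowledge is translated into FOL along with the question and the answer.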
![Page 86: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/86.jpg)
Sanity Checking
Answer should be informative, that is, not part of the question
Q: Who is Tom Cruise married to?
A: Tom Cruise
Q: Where was Florence Nightingale born?
A: Florence
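A simple token-overlap test captures both examples. The sketch below is an assumption about how such a check could be written: it rejects an answer when every one of its words already occurs in the question.

```python
import re
from typing import List


def tokens(text: str) -> List[str]:
    return re.findall(r"\w+", text.lower())


def is_informative(question: str, answer: str) -> bool:
    """False if the answer is empty or consists only of question words."""
    question_words = set(tokens(question))
    answer_words = tokens(answer)
    return bool(answer_words) and not all(w in question_words for w in answer_words)


check_1 = is_informative("Who is Tom Cruise married to?", "Tom Cruise")
check_2 = is_informative("Where was Florence Nightingale born?", "Florence")
check_3 = is_informative("Where was Guthrie born?", "Oklahoma")
```

The second example shows the cost of the heuristic: "Florence" is rejected on word overlap alone, whether or not it is the right place.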
![Page 87: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/87.jpg)
Googling
• Given a ranked list of answers, some of these might not make sense at all
• Promote answers that make sense
• How?
• Use an even larger corpus!
  – “Sloppy” approach
  – “Strict” approach
![Page 88: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/88.jpg)
The World Wide Web
![Page 89: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/89.jpg)
Answer validation (sloppy)
• Given a question Q and a set of answers A1…An
• For each i, generate query Q Ai
• Count the number of hits for each i
• Choose the Ai with the highest number of hits
• Use existing search engines
  – Google, AltaVista
  – Magnini et al. 2002 (CCP)
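The sloppy procedure can be sketched as below. No real search engine API is called: `hit_count` is a hypothetical function supplied by the caller that maps a query string to a number of hits, and the counts in the usage example are invented.

```python
from typing import Callable, List


def validate_sloppy(question_terms: List[str], answers: List[str],
                    hit_count: Callable[[str], int]) -> str:
    """Return the answer Ai whose query "Q Ai" receives the most hits."""
    def hits_for(answer: str) -> int:
        query = " ".join(question_terms + [answer])
        return hit_count(query)

    return max(answers, key=hits_for)


# Stand-in for a search engine, with made-up hit counts.
FAKE_HITS = {"Guthrie born Oklahoma": 7, "Guthrie born Britain": 3}


def fake_hit_count(query: str) -> int:
    return FAKE_HITS.get(query, 0)


best = validate_sloppy(["Guthrie", "born"],
                       ["Britain", "Oklahoma", "Newport"], fake_hit_count)
```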
![Page 90: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/90.jpg)
Corrected Conditional Probability
• Treat Q and A as a bag of words
  – Q = content words of the question
  – A = answer
• CCP(Qsp, Asp) = hits(A NEAR Q) / (hits(A) × hits(Q))
• Accept answers above a certain CCP threshold
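In code, the ratio as written on this slide and the threshold test could look like this. The sketch takes the three hit counts as plain numbers (obtaining them from a search engine is left out), and the zero-count guard, the threshold value and the sample counts are assumptions.

```python
from typing import List, Tuple


def ccp(hits_near: int, hits_answer: int, hits_question: int) -> float:
    """hits(A NEAR Q) / (hits(A) x hits(Q)), as given on the slide."""
    if hits_answer == 0 or hits_question == 0:
        return 0.0  # assumption: treat missing evidence as a score of zero
    return hits_near / (hits_answer * hits_question)


def accept(candidates: List[Tuple[str, int, int, int]],
           threshold: float) -> List[str]:
    """Keep the answers whose CCP value lies above the threshold."""
    return [answer for answer, near, hits_a, hits_q in candidates
            if ccp(near, hits_a, hits_q) > threshold]


# (answer, hits(A NEAR Q), hits(A), hits(Q)) with invented counts.
accepted = accept([("A1", 50, 100, 200), ("A2", 1, 1000, 200)], threshold=0.001)
```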
![Page 91: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/91.jpg)
Answer validation (strict)
• Given a question Q and a set of answers A1…An
• Create a declarative sentence with the focus of the question replaced by Ai
• Use the strict search option in Google
  – High precision
  – Low recall
• Any terms of the target not in the sentence are added to the query
![Page 92: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/92.jpg)
Example
• TREC 99.3
  Target: Woody Guthrie.
  Question: Where was Guthrie born?
• Top-5 Answers:
    1) Britain
  * 2) Okemah, Okla.
    3) Newport
  * 4) Oklahoma
    5) New York
![Page 93: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/93.jpg)
Example: generate queries
• TREC 99.3
  Target: Woody Guthrie.
  Question: Where was Guthrie born?
• Generated queries:
  1) “Guthrie was born in Britain”
  2) “Guthrie was born in Okemah, Okla.”
  3) “Guthrie was born in Newport”
  4) “Guthrie was born in Oklahoma”
  5) “Guthrie was born in New York”
![Page 94: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/94.jpg)
Example: add target words
• TREC 99.3
  Target: Woody Guthrie.
  Question: Where was Guthrie born?
• Generated queries:
  1) “Guthrie was born in Britain” Woody
  2) “Guthrie was born in Okemah, Okla.” Woody
  3) “Guthrie was born in Newport” Woody
  4) “Guthrie was born in Oklahoma” Woody
  5) “Guthrie was born in New York” Woody
![Page 95: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/95.jpg)
Example: morphological variants
TREC 99.3
Target: Woody Guthrie.
Question: Where was Guthrie born?
Generated queries:
“Guthrie is OR was OR are OR were born in Britain” Woody
“Guthrie is OR was OR are OR were born in Okemah, Okla.” Woody
“Guthrie is OR was OR are OR were born in Newport” Woody
“Guthrie is OR was OR are OR were born in Oklahoma” Woody
“Guthrie is OR was OR are OR were born in New York” Woody
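The three query-building steps (declarative sentence, target words, morphological variants) can be reproduced with a short Python sketch. It assumes the declarative template `Guthrie was born in {answer}` is already available, since deriving it from the question is a separate problem. The function name and the fixed list of verb forms are illustrative, and straight quotes are used in place of the typographic quotes on the slides.

```python
from typing import List

BE_FORMS = "is OR was OR are OR were"


def strict_queries(template: str, answers: List[str], target: str) -> List[str]:
    """Build one phrase query per candidate answer."""
    queries = []
    for answer in answers:
        sentence = template.format(answer=answer)
        # Expand the verb form into its morphological variants.
        phrase = sentence.replace(" was ", f" {BE_FORMS} ", 1)
        # Add target words that do not already occur in the sentence.
        sentence_words = set(sentence.replace(",", " ").replace(".", " ").split())
        extra = [word for word in target.split() if word not in sentence_words]
        queries.append(" ".join([f'"{phrase}"'] + extra))
    return queries


queries = strict_queries("Guthrie was born in {answer}",
                         ["Britain", "Okemah, Okla."], "Woody Guthrie")
```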
![Page 96: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/96.jpg)
Example: google hits
TREC 99.3
Target: Woody Guthrie.
Question: Where was Guthrie born?
Generated queries, with number of hits:
“Guthrie is OR was OR are OR were born in Britain” Woody (0 hits)
“Guthrie is OR was OR are OR were born in Okemah, Okla.” Woody (10 hits)
“Guthrie is OR was OR are OR were born in Newport” Woody (0 hits)
“Guthrie is OR was OR are OR were born in Oklahoma” Woody (42 hits)
“Guthrie is OR was OR are OR were born in New York” Woody (2 hits)
![Page 97: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/97.jpg)
Example: reranked answers
TREC 99.3
Target: Woody Guthrie.
Question: Where was Guthrie born?
Original answers:
    1) Britain
  * 2) Okemah, Okla.
    3) Newport
  * 4) Oklahoma
    5) New York
Reranked answers:
  * 4) Oklahoma
  * 2) Okemah, Okla.
    5) New York
    1) Britain
    3) Newport
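The reranking step is a stable sort on the hit counts, sketched here in Python with the counts from the Google hits slide. Using a stable sort is an assumption about tie handling: answers with equal counts keep their original relative order.

```python
from typing import Dict, List


def rerank_by_hits(answers: List[str], hits: Dict[str, int]) -> List[str]:
    """Sort answers by descending hit count, keeping ties in original order."""
    return sorted(answers, key=lambda answer: hits[answer], reverse=True)


ORIGINAL = ["Britain", "Okemah, Okla.", "Newport", "Oklahoma", "New York"]
HITS = {"Britain": 0, "Okemah, Okla.": 10, "Newport": 0,
        "Oklahoma": 42, "New York": 2}

reranked = rerank_by_hits(ORIGINAL, HITS)
```

Britain and Newport both have zero hits, so Britain stays ahead of Newport as in the reranked list on the slide.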
![Page 98: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/98.jpg)
Question Answering (QA)
Lecture 1
• What is QA?
• Query Log Analysis
• Challenges in QA
• History of QA
• System Architecture
• Methods
• System Evaluation
• State-of-the-art
Lecture 2
• Question Analysis
• Background Knowledge
• Answer Typing
Lecture 3
• Query Generation
• Document Analysis
• Semantic Indexing
• Answer Extraction
• Selection and Ranking
![Page 99: © Johan Bos April 2008 Question Answering (QA) Lecture 1 What is QA? Query Log Analysis Challenges in QA History of QA System Architecture Methods System](https://reader035.vdocuments.net/reader035/viewer/2022081514/55190b7255034638428b477a/html5/thumbnails/99.jpg)
Where to go from here
• Producing answers in real-time
• Improve accuracy
• Answer explanation
• User modelling
• Speech interfaces
• Dialogue (interactive QA)
• Multi-lingual QA
• Non-sequential architectures