
A Knowledge-based Question Answering System

Department of Computer Science, University of Hong Kong
Final Report
17 April 2016

Supervisor: Professor Benjamin C.M. Kao
Group Members: BAI Zongling, JIANG Ling


Abstract

Question Answering (QA) is an area of natural language processing research that aims at providing human users with a convenient and intuitive way of accessing information. As a fundamental component, the semantic parser processes natural language questions and generates corresponding logical forms for querying a knowledge base. Many problems are involved in this process, for example, how to produce an effective lexicon and grammar efficiently for a large corpus. In this paper we present our understanding of knowledge-based question answering systems and the procedure of implementing a prototype using open-source tools such as SEMPRE and Freebase. Future plans concerning schema matching and lexicon extension are briefly introduced at the end.

Acknowledgement

We would like to express our appreciation to our FYP supervisor, Professor Benjamin C.M. Kao, for providing us with helpful guidance and suggestions. We also thank the Stanford NLP team for their excellent work on the parser training toolkit SEMPRE, released under the GNU General Public License. The opinions, findings and results in this paper could not exist without the work of other researchers in the question answering area.


Table of Contents

1. INTRODUCTION
2. PROJECT BACKGROUND
3. LITERATURE REVIEW
   3.1 Semantic Parsing
   3.2 Learning Methodologies
   3.3 Supervision
4. PROJECT SCOPE AND MILESTONES
   4.1 Scope and Metrics
   4.2 Project Objectives
   4.3 Project Deliverables and Schedule
5. PROJECT METHODOLOGY
   5.1 System Architecture Design
       5.1.1 General Structure
       5.1.2 Freebase and Virtuoso
       5.1.3 SEMPRE
   5.2 Implementation Details
6. TESTING AND RESULTS
   6.1 Testing Setup
   6.2 Test Result Example
7. CONCLUSION AND FUTURE WORKS
   7.1 Conclusion
   7.2 Difficulties
   7.3 Future Works
8. ABBREVIATIONS
REFERENCES


1. INTRODUCTION

Figure 1: Two different information retrieval systems and their responses to the same question. (Left) START, a question answering system developed by MIT, returns a short answer directly. (Right) The Google search engine returns a list of web pages with keywords highlighted.

In an age of booming data on the web, extracting accurate and useful information from numerous sources becomes increasingly important. A well-known type of information retrieval (IR) technique is the search engine, such as Google and Yahoo. Although the search engine is the most widely used IR system, it has limitations: it is difficult for a search engine to understand a natural language question and provide an explicit and concise answer. On the right-hand side of Figure 1, when the user inputs the question sentence "Who was the first president of America", the engine retrieves a list of relevant indexed web pages or documents that contain keywords from the question. The user needs to go through those documents to find the correct information and locate the answer manually.

In contrast, a question answering (QA) system can understand human language and generate an intuitive answer. As the left-hand side of Figure 1 presents, given the same question "Who was the first president of America", the QA system answers directly by returning "George Washington". Users benefit from the QA system because it eliminates their overhead of filtering, processing and integrating information.


Improvements in Natural Language Processing (NLP) and IR techniques contribute greatly to the development of QA systems. The rise of various knowledge bases (KBs) such as Freebase, DBpedia and YAGO also provides rich sources of structured information in all knowledge fields for building a QA system.

In this paper, we introduce the basic components of a KBQA system and then show some details of implementing a KBQA prototype that can answer factoid questions. The remainder of the paper is organized as follows. Section 2 introduces the background of this project. Section 3 reviews the literature in this field. Section 4 introduces the scope and milestones. Section 5 presents the project methodology, including the system architecture design and details of the prototype implementation. Section 6 presents the testing results. Finally, the paper concludes and points out future directions in Section 7.

2. PROJECT BACKGROUND

A QA system combines IR and NLP techniques to give a brief answer to a human's natural language question. A KBQA system answers questions based on facts well structured in a knowledge base. Some well-known QA systems are MIT's START and IBM's Watson. Although some of them have achieved incredible performance, such as Watson beating all human players in the Jeopardy! competition, there is still large room for improvement in this field, considering that the precision and efficiency of even advanced QA systems are not fully satisfactory as answer search tools for human users.

Figure 2 lists some research teams and the performance of their state-of-the-art systems: PARASEMPRE (Berant and Liang, 2014) [6], CY13 (Cai and Yates, 2013) [1], BCFL with SEMPRE (Berant et al., 2013) [7], KCAZ13 (Kwiatkowski et al., 2013) [18], etc. This project mainly focuses on systems developed with the SEMPRE toolkit, which is developed by Percy Liang's team.


Figure 2: Test results of some state-of-the-art question answering systems on the Free917 and WebQuestions datasets. Sources of statistics: [6][16][17].

The most essential component in a QA system framework is the semantic parser. The parser converts natural language questions into query forms that can be executed against KBs (e.g., SPARQL, MQL, SQL). In the case of Freebase, the query language is SPARQL. Semantic parsing in an open-domain QA system needs a large structured text corpus, and significant research on mapping algorithms has been done to scale semantic parsers up to large KBs like Freebase [6].

The following items are the main challenges in a KBQA system:

● Complexity and variety in human language expression:

○ Polysemy. A word or phrase often has more than one meaning, and under different contexts a word can be interpreted differently; e.g., "star" could be mapped to "fb:film.film.starring", "fb:astronomy.star", etc.


○ Complexity. Performance is greatly affected by the complexity of the question asked [19]. For instance, many systems still cannot answer temporally restricted questions, such as "Who played the role of Superman before Christopher Reeve was paralyzed?" [19] and "Where did Harriet Tubman live after the civil war?", which cannot be answered by PARASEMPRE [6].

● The large scale of knowledge bases:

○ The extensive coverage of a KB adds overwhelming burdens to lexicons and matching in semantic parsing.

○ It is costly in time and effort to use purely supervised training approaches on a large-scale KB unless the annotation of training data can be processed automatically.

○ Even though the scale of a KB is already extremely large, the scale of human language is even larger; there are always expressions the QA system has never seen before [1]. It is therefore important to train the semantic parser on a larger corpus so that it can learn to map questions to logical forms independently of the KB.

● The semantic parser generates possible logical-form candidates recursively, so it needs to choose among an exponentially large number of candidates for a given question [7]. Berant et al. used beam search to prune away derivations with lower probability.

The challenges listed above are shortcomings of many QA systems and semantic parsers. To achieve high precision and F1 score, more research effort is required in these directions.

3. LITERATURE REVIEW

3.1 Semantic Parsing

Semantic parsing is the core part of a knowledge-based question answering system. Given a natural language utterance, the semantic parser maps the utterance to a formal representation language and then converts the representation into a query.


There are two main challenges in the semantic parsing process. The first is how to find the best representation of the question, convert it into a query, and execute it on a knowledge base like Freebase. The SEMPRE toolkit supports logical forms such as lambda calculus and lambda DCS.

The second challenge lies in the mismatch between the variety of natural language expressions and the limited expressions in the knowledge base. For instance, for the question "who is Obama's daughter?", the relation daughter is not directly present in Freebase. Instead, the semantic parser needs to construct a Freebase-specific logical form such as parenthood(Obama) ∧ gender(female); otherwise the wanted answer cannot be retrieved by the query.
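To make this concrete, the following is a minimal sketch of such a logical form in the lisp-tree notation SEMPRE uses (the same notation as the Everest example in Section 3.2.1). The identifiers fb:people.person.children, fb:people.person.gender and fb:en.female exist in the Freebase schema, but this exact form is our own illustration, not output produced by our parser:

    ; Illustrative lisp-tree form for "who is Obama's daughter?":
    ; intersect Obama's children (! reverses the relation) with
    ; the set of entities whose gender is female.
    (and (!fb:people.person.children fb:en.barack_obama)
         (fb:people.person.gender fb:en.female))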

Related work includes Cai and Yates (2013), who apply pattern matching and relation intersection between Freebase relations and predicate-argument triples from the ReVerb system [1], and Kwiatkowski et al. (2013), who apply a two-stage parsing strategy that first translates the utterance into a linguistically motivated, domain-independent meaning representation and then uses a learned ontology matching model to transform this representation for the target domain [18].

The performance of the semantic parser is therefore critical to overall system performance, considering both the accuracy of understanding the meaning of human language and the correctness of constructing a logical form that falls within the Freebase ontology.

3.2 Learning Methodologies

It is concluded by Percy Liang and Christopher Potts that current research has developed two branches of learning: one is learning from logical forms; the other is learning from denotations [10]. Table 1 exemplifies the difference between them. More details about each approach are elaborated in the following two subsections.


Methodology                   Utterance                        Training instance (simplified)
Learn from Logical Forms      u: how tall is mount Everest?    <u, (!fb:geography.mountain.elevation fb:en.mount_everest)>
Learn from Denotations        u: how tall is mount Everest?    <u, (8,848.00 m)>

Table 1: An example of a training pair in the two learning approaches. Here we simplify the format of the learning pairs to highlight the important parts. The question is "how tall is mount Everest?" In the logical-form approach, the learning pair is <utterance, logical form>, while in the denotation approach, the pair is <utterance, answer>.

3.2.1 Learning from Logical Forms: question–logical form pairs

The methodology of learning from logical forms focuses on translating natural language sentences into semantic representations, often in logical form. Most semantic parsers in mainstream QA systems have adopted this approach for twenty years [10]. Given a question, the semantic parser builds a parse tree in order to construct a logical form. In this process, lexicon files define the fundamental mappings between utterances and derivations, which are logical-form candidates [10, 11]. For example, "Everest" is mapped to "fb:en.mount_everest" by applying the lexicon rule "N → everest: fb:en.mount_everest" (N is a preterminal symbol).

It is possible that a word in an utterance matches multiple lexicon rules, such as "N → everest: fb:en.film_everest" and "N → everest: fb:en.music_musicalgroup_everest", and therefore multiple logical forms are generated. Such overgeneration is allowed, on the grounds that polysemy is important in linguistic usage and human languages are not rigidly fixed; a word can have other meanings in different contexts. To figure out which lexicon rule produced the correct (or closest to correct) derivation, we also associate a feature-based score with each derivation, and the highest-scoring one is the best the QA system can get. Training rewards derivations with more good features and penalizes those with more bad features.
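For concreteness, this scoring is usually realized as a log-linear model over derivations, as in Berant et al. [7] and Liang and Potts [10]: given an utterance x with candidate derivations D(x), each derivation d has a feature vector φ(x, d), and the learned parameter vector θ defines

    p_\theta(d \mid x) = \frac{\exp\big(\theta^\top \phi(x, d)\big)}{\sum_{d' \in D(x)} \exp\big(\theta^\top \phi(x, d')\big)}

Training adjusts θ so that derivations leading to correct logical forms receive higher probability, which is exactly the "reward good features, penalize bad features" behavior described above.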


The lexicon is one of the most vital parts of precise mapping between natural language utterances and logical forms. However, it is impossible to write all the rules manually, so most researchers shift the burden from writing a perfect lexicon file to feature construction and training the system on a well-developed training dataset [10].
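As an illustration of what such rules look like, here is a minimal sketch in the style of SEMPRE's grammar files (the syntax follows the SEMPRE tutorial; these particular rules are hypothetical and are not taken from our project's lexicon):

    # Map the word "everest" to a Freebase entity (an entity lexicon entry).
    (rule $Entity (everest) (ConstantFn fb:en.mount_everest))
    # A polysemous word may have several competing entries.
    (rule $Entity (everest) (ConstantFn fb:en.film_everest))
    # Promote an entity to a complete parse.
    (rule $ROOT ($Entity) (IdentityFn))

With competing entries like these, the feature-based scoring described above is what decides which derivation wins for a given context.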

3.2.2 Learning from Denotations: question–answer pairs

A newer methodology is to learn from question–answer pairs directly instead of using the intermediate product, logical forms. This approach reduces the extent of supervision, because denotations theoretically provide less information than logical forms. Lexicons are still required in this methodology, with even more word–predicate pairs to bridge the gap between utterances and denotations [10].

Due to the information loss in denotations, it is harder to judge among derivation candidates; in particular, when two different derivations share the same denotation, the system cannot tell them apart. To mitigate the computational pressure, it is suggested to consider the order of constructing logical forms and to use type information [10].

Despite the fact that learning from denotations adds difficulties compared with learning from logical forms, the less intensive human involvement when learning new domains makes this approach preferable in current research, as it is much easier for humans to provide denotations than intermediate logical forms [10].

3.3 Supervision

Machine learning techniques are increasingly applied in semantic parsing to reduce the burden of lexicon mapping. Previous systems used supervised learning to train on pairs of questions and annotated logical forms or answers. It takes enormous human labor to annotate logical forms or answers when creating a training dataset, and this also limits the learning scale of semantic parsers, even when researchers make their datasets cover as many topics and grammatical constructions as possible. Some recent studies try to resolve this limitation by reducing the amount of supervision. In weak supervision or semi-supervision, less expertise is required in the labeling process; learning under weak supervision requires executing logical forms and assessing the results [14]. There is also unsupervised learning, which relies on techniques such as clustering. Although unsupervised learning usually introduces noisy data and degrades model quality, some argue that performance can be improved by integrating such noise in practice, because richer contexts may also introduce the phenomenon of semantic drift [15].

In short, although supervised learning can produce high-quality semantic parsers, the cost of labeling pushes researchers to adopt learning approaches with less supervision, which achieve good performance in large-scale domains.

4. PROJECT SCOPE AND MILESTONES

4.1 Scope and Metrics

The project will focus only on answering factoid questions and is based on a curated knowledge base with a relatively large number of domains and entities (e.g., Freebase). The evaluation metrics for experiments are precision, recall and F1 score.
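Writing A for the set of answers the system returns and G for the set of gold answers, these metrics take their standard definitions:

    \text{precision} = \frac{|A \cap G|}{|A|}, \qquad
    \text{recall} = \frac{|A \cap G|}{|G|}, \qquad
    F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}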

4.2 Project Objectives

The goal of the project is first to implement a demonstrable knowledge-based question answering system - mainly an integration job - and then to make some improvements on one element or area of KBQA.


4.3 Project Deliverables and Schedule

This project consists of three stages:

● Stage 1: Understand the different components of KBQA; make use of some open-source tools, integrate them and build a prototype.

● Stage 2: Narrow down to one specific area or element of the KBQA system, define the problem and scope, and do further research on it.

● Stage 3: Conduct systematic experiments and draw meaningful conclusions in the chosen area.

Each phase relies on the previous phase's result. For example, we can apply machine learning techniques to generate grammar templates automatically in phase 2, and in phase 3 we will evaluate the effectiveness of these automatically generated templates in terms of accuracy, compared with manually written templates.

The deliverables expected from this project are listed as follows:

● An interim report on current status and future plans
● A KBQA prototype that accepts factoid questions and returns answers
● A final report including an introduction to classic KBQA systems, tool selection, implementation details, further exploration, etc.
● A website describing our final year project, including the project plan, interim report, final report and all auxiliary supporting materials.

Date              Event
4 October 2015    Deliverables of Phase 1 (Inception): detailed project plan; project web page
22 January 2016   First presentation
24 January 2016   Deliverables of Phase 2 (Elaboration): preliminary implementation; detailed interim report
17 April 2016     Deliverables of Phase 3 (Construction): finalized tested implementation; final report
22 April 2016     Final presentation
3 May 2016        Project exhibition
6 June 2016       Project competition (for selected projects only)

Table 2: Final year project schedule.



5. PROJECT METHODOLOGY

5.1 System Architecture Design

5.1.1 General Structure

Figure 3: The architecture of the project prototype. It has two major processes and several components. SEMPRE does the work of the components surrounded by the dashed line.

Figure 3 shows the general structure of a KBQA system prototype. It includes two main parts: the question analysis process and the answer retrieval process. In the question analysis process, given a natural language question, the system transforms the utterance into a machine-understandable language; often the utterance is converted to an intermediate logical form and eventually into a query. In the answer retrieval process, the system uses the query to retrieve the desired answer from the knowledge base. In this section, three main components - Question Classifier, Semantic Parser and Query Generator/Executor - are elaborated to give readers a more comprehensive understanding of the system architecture of a general KBQA prototype.

● Question Classifier

Given a specific factoid question, the question classifier returns the answer type. By knowing the answer type, the system can narrow down the search scope and verify the validity of the retrieved answer. For example, if the system gets the question "Where was Max Planck born?", the answer type should be "Location", and the system should seek a location-type answer. Knowing the answer type reduces processing time and effort, provides a feasible way to select correct answers from among the candidates, and increases the precision rate.

● Semantic Parser

The semantic parser translates the natural language question into an intermediate logical form. It is a hard task to map a user question to the right meaning representation. Two popular approaches are observed for this process: one is CCG parsing, which relies on combinatory logic; the other is dependency-based compositional semantics, which relies on lambda DCS [2].

● Query Generator/Executor

Compared to the previous two components, the query generator/executor is rather simple. It formulates the logical form generated by the parser into a KB-compatible query. Once the query is executed successfully, a candidate answer is retrieved from the KB. Nevertheless, due to missing information in the KB or inaccuracies of the classifier and parser, there may be no retrieval result.

5.1.2 Freebase and Virtuoso

We implement our KBQA prototype based on at least one curated KB. Table 3 compares three popular curated KBs used in major research studies. Freebase covers more facts than YAGO and DBpedia, and the SEMPRE system is coded to read from Freebase. Although the Freebase API has been shut down since June 2015, a full copy of Freebase is open for download for the use of SEMPRE. Therefore, we chose Freebase as the core KB in our prototype.

Curated KB   Entities   Facts       Download version   API                                   Query language
Freebase     58.12 m    3179.26 m   22 GB gzip         API service stopped since June 2015   MQL
DBpedia      4.58 m     1800 m      Collection         OpenLink SPARQL endpoint              SPARQL
YAGO         10 m       120 m       10 GB tsv          OpenLink SPARQL interface             SPARQL

Table 3: Curated KB candidates for implementing the primary prototype.

However, the RDF dump¹ of Freebase can only be queried via the MQL query language. Freebase does not support SPARQL - a more expressive and powerful query language - and has no SPARQL endpoint. In order to query Freebase in SPARQL, we load the Freebase dump into the Virtuoso SPARQL engine to obtain indexed and standardized data. Virtuoso provides two main services here: a database server (isql) and a SPARQL endpoint.
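The load itself follows Virtuoso's standard RDF bulk-loader procedure; a minimal sketch (the directory /data/freebase and the graph IRI are illustrative, 1111 is Virtuoso's default isql port and dba/dba its default credentials):

    # Register the dump files with the bulk loader; the directory must be
    # listed under DirsAllowed in virtuoso.ini.
    isql 1111 dba dba exec="ld_dir('/data/freebase', '*.gz', 'http://rdf.freebase.com');"
    # Load all registered files, then persist the state.
    isql 1111 dba dba exec="rdf_loader_run();"
    isql 1111 dba dba exec="checkpoint;"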

5.1.3 SEMPRE

Once SEMPRE is chosen and the parser has been well trained, it takes over the roles of the semantic parser, query generator, query executor, etc. Next, we take a closer look at SEMPRE to understand how it processes questions.

¹ Freebase RDF dump: https://developers.google.com/freebase/data

Basically, SEMPRE has five components [8]:

● Lexicon and Grammar: mapping/construction rules between utterances and derivations (logical-form candidates)
● Denotation: the final answer
● Executor: given logical forms, gets the corresponding denotations from the KB
● Parser: given utterances, gets the corresponding logical forms
● Learner: runs over a dataset multiple times, during which it calls the parser and updates the parameters

Figure 4: The procedure of parsing a natural utterance into an intermediate logical form and eventually getting answers from the KB.


Figure 5: Parsing "what cities are contained by California". Notice: the derivations shown here are pseudo-derivations; please refer to Figure 4 for the full formula.

Figure 4 demonstrates the semantic parsing and answer retrieval process. The utterance is aligned with a set of derivations using the lexicon and grammar files. A derivation has features and scores, and can be constructed recursively according to the corresponding grammar. Once the system gets a logical form, it calls the SparqlExecutor to generate a SPARQL query and execute it. Figure 5 exemplifies the parse tree of the question "what cities are contained by California?" produced by the semantic parser.
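As a rough illustration of the executor's output, a query of this kind can be posted to the Virtuoso SPARQL endpoint with curl. The query below is our own sketch, not SEMPRE's actual output; it assumes the Freebase predicate fb:location.location.containedby, with fb:en.california standing for California's entity id (fb:m.01n7q in the raw dump), and the endpoint port depends on the Virtuoso configuration (our command lines used localhost:3093):

    # Sketch: ask Virtuoso for cities contained by California.
    curl 'http://localhost:3093/sparql' --data-urlencode 'query=
      PREFIX fb: <http://rdf.freebase.com/ns/>
      SELECT DISTINCT ?name WHERE {
        ?city fb:location.location.containedby fb:en.california .
        ?city fb:type.object.name ?name .
        FILTER (lang(?name) = "en")
      } LIMIT 10'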

5.2 Implementation Details

During the implementation, we encountered many problems, including incompatibilities among different components and hardware constraints. We first tried the latest version, SEMPRE 2.0, but it had many defects, so we decided to use version 1.0 instead.

5.2.1 SEMPRE v2.0

Several problems were encountered during SEMPRE installation and training.


● Compilation failure in Virtuoso installation:

During the installation of Virtuoso v7.0.0, a compilation failure occurs because Ubuntu 14.04 ships the bison 3.0.2 package, which generates different code for a reentrant parser [9]. The virtuoso-opensource team created a patch (commit 042f142 on the develop/7 branch) that fixes the code generation issues with bison 3.0.x by updating two files (make.am and getdate.y). Therefore, instead of "git checkout tags/v7.0.0", an Ubuntu 14.04 user should perform "git checkout 042f142" when installing Virtuoso, as sketched below. We struggled for quite a long time to resolve this problem, and we suggest that SEMPRE's README.md be updated, considering that similar questions are raised in the forum frequently.
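Reconstructed as a sketch, the working build sequence on Ubuntu 14.04 was roughly:

    # Check out the fix commit instead of the v7.0.0 tag, then build as usual.
    git clone https://github.com/openlink/virtuoso-opensource.git
    cd virtuoso-opensource
    git checkout 042f142          # not: git checkout tags/v7.0.0
    ./autogen.sh
    ./configure
    make && sudo make install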

● Freebase connection error:

We hit a Freebase connection error (java.lang.RuntimeException: java.net.UnknownHostException: freebase.cloudapp.net). We tried to ping and access the host address, but there was no response; this is probably because the Azure server freebase.cloudapp.net has been shut down. We chose to download a full copy of the Freebase graph and loaded the dump into Virtuoso so that we can access it on localhost.

● Insufficient hard disk space on the FYP server:

Another problem occurred when we decided to download the Freebase dump: our FYP server ran out of disk space. The uncompressed Freebase dump is 81 GB and the FYP server only has 80 GB in total. Currently we are running SEMPRE on a personal laptop, and we will move it to the server after buying more space, in order to allow everyone to access the system anywhere at any time.

● Bugs in SEMPRE 2.0:

Although SEMPRE 2.0 is more functional than SEMPRE 1.0, it has more bugs. Some could be fixed by modifying files; others were beyond our abilities. For example, a Lucene file is missing under /sempre/lib/, which is very important for constructing entity searchers in SEMPRE. We tried to fix this bug for around two days and finally gave up, switching to SEMPRE 1.0.

● Insufficient RAM on the FYP server:

Originally, our FYP server was allocated 2 GB of RAM, which was sufficient to run normal programs, such as using SEMPRE to answer questions without training. When we started to train the semantic parser, we experienced countless Java runtime exceptions for lack of memory: to train the model we needed to keep Virtuoso running, which consumed nearly all 2 GB of RAM. We applied for an additional 2 GB of RAM and found that Virtuoso alone accounted for more than 3 GB at that time. We tried other methods such as squeezing Virtuoso's RAM usage and allocating more swap space to substitute for RAM (see the sketch below). Eventually, an additional 4 GB of RAM was installed in the server and now we can run programs normally.
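The swap workaround mentioned above amounts to the standard Linux procedure; a sketch (the 4 GB size and the /swapfile path are illustrative):

    # Create and enable a swap file as a stopgap for scarce RAM.
    sudo fallocate -l 4G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile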

5.2.2 SEMPRE 1.0 -- PARASEMPRE

We used SEMPRE version 1.0 to train our KBQA system. Considering the limited project period, we trained the system on the Free917 dataset using the PARASEMPRE model, which theoretically takes the shortest time. The corresponding command line is

    ./parasempre @mode=train @domain=free917 @sparqlserver=localhost:3093

● The PARASEMPRE framework

Notice that PARASEMPRE uses a different approach to semantic parsing, based on paraphrasing, which can exploit large amounts of text not covered by Freebase [6]. Figure 6 presents the framework for semantic parsing via paraphrasing.


Figure 6: Unlike the SEMPRE model, PARASEMPRE adopts the opposite approach. Noting that it is generally hard to map natural language utterances to Freebase-ontology-compatible logical forms, PARASEMPRE overgenerates logical-form candidates and paraphrases them into more canonical utterances. It then uses two paraphrase models to compare the canonical utterances with the original utterance: the more similar they are, the higher the score given to the corresponding pair of logical-form candidate and intermediate canonical utterance.

● Slow execution on the FYP server

The SEMPRE 1.0 GitHub page says that training on the WebQuestions dataset takes more than 3 days with the EMNLP 2013 system (./sempre) and around 1 day with the ACL 2014 system (./parasempre), and it suggests that training on Free917 takes only about one hour. Considering our FYP server's capability, we adopted the PARASEMPRE model and trained with the Free917 dataset. To our surprise, the training took an extremely long time; we think the speed is limited by the number of processing units on the FYP server (it only has 1 CPU).


Figure 7: The CPU information of our FYP server.

6. TESTING AND RESULTS

6.1 Testing Setup

After using the PARASEMPRE model to train the system on the Free917 dataset for about 120 hours, we started to test the KBQA system prototype using the following command line:

    ./parasempre @mode=train @domain=free917 @sparqlserver=localhost:3093 \
      @cacheserver=local -Learner.maxTrainIters 0 -Dataset.inPaths test:testinput \
      -Builder.inParamsPath lib/models/44.exec/params \
      -Grammar.inPaths lib/models/15.exec/grammar -Dataset.readLispTreeFormat true

Here -Learner.maxTrainIters 0 together with the supplied parameter and grammar paths loads the trained model without further training, so the run only evaluates on the test set.

6.2 Test Result Example

The following is an example of a test question and its result.


Figure 8: An example of the test result for the question "what is the capital of France?"


7. CONCLUSION AND FUTURE WORKS

7.1 Conclusion

The initial phase of our project was to understand the components of a KBQA system and integrate a prototype accordingly. We integrated and trained the system using SEMPRE v1.0 and built a KBQA prototype with the SEMPRE toolkit developed by Percy Liang's research team [5]. We found that the mapping between utterances and logical forms in the Freebase ontology graph is one of the hardest problems in a question answering system, and that it also takes an extremely long time to process a single question. More advanced algorithms are needed to improve efficiency in this direction.

7.2 Difficulties

The largest difficulty came from the mismatch between the capacity of the FYP server and the demanding requirements of conducting training and testing in SEMPRE. The defects in SEMPRE also added difficulties during the integration stage. We suggest that later researchers use SEMPRE v1.0 at the initial stage before conducting deeper research in this field.

7.3 Future Works

Due to the time limit of the final year project, we only finished building an open-domain KBQA system based on others' work and did not contribute new research in this field. In the near future, we plan to use more test datasets to identify questions that the system cannot answer, improve its performance, and eventually put our KBQA prototype online.


8. ABBREVIATIONS

Abbreviation   Meaning
KB             Knowledge Base
QA             Question Answering
KBQA           Knowledge-Based Question Answering
IR             Information Retrieval
NLP            Natural Language Processing


REFERENCES

[1] Cai, Q. and Yates, A., 2013. Large-scale Semantic Parsing via Schema Matching and Lexicon Extension. In ACL (1), pp. 423-433.

[2] Yao, X., Berant, J. and Van Durme, B., 2014. Freebase QA: Information Extraction or Semantic Parsing? ACL 2014, p. 82.

[3] Artzi, Y. and Zettlemoyer, L., 2011. Bootstrapping semantic parsers from conversations. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 421-432. Association for Computational Linguistics.

[4] Artzi, Y. and Zettlemoyer, L., 2013. Weakly supervised learning of semantic parsers for mapping instructions to actions. Transactions of the Association for Computational Linguistics, 1, pp. 49-62.

[5] Berant, J., Chou, A., Frostig, R. and Liang, P. SEMPRE. http://nlp.stanford.edu/software/sempre/

[6] Berant, J. and Liang, P., 2014. Semantic Parsing via Paraphrasing. In ACL (1), pp. 1415-1425.

[7] Berant, J., Chou, A., Frostig, R. and Liang, P., 2013. Semantic Parsing on Freebase from Question-Answer Pairs. In EMNLP.

[8] Liang, P., 2015. SEMPRE: Semantic Parsing with Execution [PDF file].

[9] virtuoso-opensource issue 160 on GitHub. https://github.com/openlink/virtuoso-opensource/issues/160

[10] Liang, P. and Potts, C., 2015. Bringing machine learning and compositional semantics together. Annual Review of Linguistics, 1(1), pp. 355-376.


[11] Zettlemoyer, L.S. and Collins, M., 2012. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. arXiv preprint arXiv:1207.1420.

[12] Steedman, M., 2014. Semantics for Semantic Parsing.

[13] Reddy, S., Lapata, M. and Steedman, M., 2014. Large-scale semantic parsing without question-answer pairs. Transactions of the Association for Computational Linguistics, 2, pp. 377-392.

[14] Artzi, Y., FitzGerald, N. and Zettlemoyer, L.S., 2013. Semantic Parsing with Combinatory Categorial Grammars. ACL (Tutorial Abstracts), 3.

[15] Lita, L.V. and Carbonell, J., 2004. Unsupervised question answering data acquisition from local corpora. In Proceedings of the Thirteenth ACM International Conference on Information and Knowledge Management, pp. 607-614. ACM.

[16] Bordes, A., Chopra, S. and Weston, J., 2014. Question answering with subgraph embeddings. arXiv preprint arXiv:1406.3676.

[17] Wang, Z., Yan, S., Wang, H. and Huang, X., 2014. An overview of Microsoft deep QA system on Stanford WebQuestions benchmark. Technical report, Microsoft Research.

[18] Kwiatkowski, T., Choi, E., Artzi, Y. and Zettlemoyer, L., 2013. Scaling semantic parsers with on-the-fly ontology matching. In Proceedings of EMNLP.

[19] Moldovan, D., Paşca, M., Harabagiu, S. and Surdeanu, M., 2003. Performance issues and error analysis in an open-domain question answering system. ACM Transactions on Information Systems (TOIS), 21(2), pp. 133-154.

[20] Ahn, D., Schockaert, S., De Cock, M. and Kerre, E., 2006. Supporting temporal question answering: Strategies for offline data collection. In Proceedings of the 5th International Workshop on Inference in Computational Semantics, pp. 127-132.