Identifying Opinion Holders for Question Answering in Opinion Texts Soo-Min Kim and Eduard Hovy Information Sciences Institute University of Southern California 4676 Admiralty Way Marina del Rey, CA 90292-6695 {skim, hovy}@isi.edu Advisor: Hsin-Hsi Chen Speaker: Yong-Sheng Lo Date: 2007/08/16 AAAI - 2005


Page 1:

Identifying Opinion Holders for Question Answering in Opinion Texts

Soo-Min Kim and Eduard Hovy
Information Sciences Institute
University of Southern California
4676 Admiralty Way
Marina del Rey, CA 90292-6695
{skim, hovy}@isi.edu

Advisor: Hsin-Hsi Chen
Speaker: Yong-Sheng Lo

Date: 2007/08/16

AAAI - 2005

Page 2:

Introduction 1/2

Question answering in opinion texts
  “Who strongly believes in Y?”
  Requires a system that recognizes the holder of opinion Y

Application
  Stock market predictors

Earlier work (Kim and Hovy, 2004)
  Focused on identifying opinion expressions within text
  This work goes a step further and identifies the opinion holder

Example
  Doraemon thinks that dorayaki is delicious
    Opinion holder: Doraemon
    Opinion expression: thinks
    Opinion: dorayaki is delicious

Page 3:

Introduction 2/2

Define the opinion holder as an entity who expresses, explicitly or implicitly, the opinion contained in a sentence
  Entity = (person, country, organization, or special group of people)
  Each opinion expression corresponds to one holder

“A thinks B’s criticism of T is wrong”
  B is the holder of “the criticism of T”
  A is the person who holds the opinion that B’s criticism is wrong

Page 4:

Difficulties in identifying opinion holders

1. The opinion sentence contains more than one likely holder entity
  “Russia’s defense minister said Sunday that his country disagrees with the U.S. view of Iraq, Iran and North Korea as an ‘axis of evil’”.
  The candidate holders: “Russia”, “Russia’s defense minister”, “U.S.”, “Iraq”, “Iran”, “North Korea”

2. There is more than one opinion in a sentence
  “In relation to Bush’s axis of evil remarks, the German Foreign Minister also said, Allies are not satellites, and the French Foreign Minister caustically criticized that the United States’ unilateral, simplistic worldview poses a new threat to the world”.

Page 5:

Proposed approach

Automatic method for identifying opinion holders (OH)

1. Identify all possible opinion holder entities in a sentence
  Use existing tools to find the named entities and noun phrases in the sentence

2. Apply the Maximum Entropy (ME) ranking algorithm to select the most probable entity

Page 6:

System architecture

Page 7:

Holder candidate set

Named entities (NE)
  Using BBN’s named entity tagger IdentiFinder

Noun phrases (NP)
  Using Charniak’s parser

For example

Page 8:

Maximum Entropy ranking algorithm

A machine learning approach
  Maximum Entropy modeling

Classification
  Selects every candidate marked as true, and selects no candidate if every one is marked as false
  Poor performance

Ranking
  Selects the single most probable candidate as the answer
  Chooses the candidate that maximizes a given conditional probability distribution
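
The difference between the two selection modes can be illustrated with a minimal sketch. It assumes some trained model has already assigned each candidate a probability of being the holder; the scores below, the 0.5 cut-off used for "marked as true", and the function names are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional


def select_by_classification(scores: Dict[str, float],
                             threshold: float = 0.5) -> List[str]:
    """Return every candidate marked as true (may be several, may be none).

    The 0.5 threshold is an assumption made for this illustration.
    """
    return [cand for cand, prob in scores.items() if prob >= threshold]


def select_by_ranking(scores: Dict[str, float]) -> Optional[str]:
    """Return the single most probable candidate (None only if there are no candidates)."""
    if not scores:
        return None
    return max(scores, key=scores.get)


# Hypothetical scores for candidates from the Page 4 example sentence.
scores = {
    "Russia": 0.21,
    "Russia's defense minister": 0.43,
    "U.S.": 0.17,
    "Iraq": 0.08,
}

classified = select_by_classification(scores)  # no score reaches 0.5
ranked = select_by_ranking(scores)             # highest score wins
```

With these made-up scores, classification returns no holder at all because no candidate crosses the cut-off, while ranking still returns one answer. That is one plausible way to see why the slide contrasts the two modes.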

Page 9:

Training data

MPQA corpus (Wiebe et al., 2003)
  535 documents (10657 sentences)
  Only sentences whose opinion strength is high or extreme are selected

An example of the annotators' markup:
  [Figure: annotated example marking the Opinion and the Holder]

Page 10:

Training procedure

Page 11:

Feature selection for ME

1. Full parsing features (f2, f3, f4, f6)
2. Partial parsing features (f7, f8, f9)
3. Others (f1, f5)

Page 12:

Full parsing features 1/5

Using Charniak’s parser

For example:
  China’s official Xinhua news agency <H>
    From MPQA
  accusing <E>
    From earlier work (Kim and Hovy, 2004)

Page 13:

Full parsing features 2/5

Page 14:

Full parsing features 3/5

Expressing the tree structure for ME training
  “<H> NP S VP S S VP VBG <E>”

Data sparseness problem

Page 15:

Full parsing features 4/5

Solution: split the path into three paths (f2, f3, f4)

For example
  “<H> NP_H S_HE VP_E S_E S_E VP_E VBG_E <E>”
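
A rough sketch of the split, assuming the path from <H> to <E> is given as a list of node labels and that the position of the shared top node (the node carrying the HE subscript) is already known. The function name `split_path` and the index argument are hypothetical; the slides do not say which of f2, f3, f4 corresponds to which part, so the sketch does not assign them.

```python
from typing import List, Tuple


def split_path(labels: List[str], top_index: int) -> Tuple[List[str], str, List[str]]:
    """Split an <H>-to-<E> label path at the shared top node.

    Returns (labels on the holder side, the shared top label,
    labels on the expression side).
    """
    holder_side = labels[:top_index]
    top = labels[top_index]
    expression_side = labels[top_index + 1:]
    return holder_side, top, expression_side


# The path from the slide, without the <H> and <E> end markers.
path = "NP S VP S S VP VBG".split()

# Assumption: the second label (index 1) is the shared top node,
# matching the HE subscript in the slide's example.
holder_side, top, expression_side = split_path(path, 1)
```

Shorter pieces are likely to recur across training sentences more often than one long path string, which is presumably how the split eases the data sparseness problem.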

Page 16:

Full parsing features 5/5

f6: The top two levels below a child node of HEhead on the path toward Hhead

For example
  P1 = “<H> NP_H PP_H NP_H”
  P2 = “<H> NP_H NP_H PP_H VP_H NP_H PP_H NP_H”
  P1 and P2 are treated as the same because they share “PP_H NP_H” at the top
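
The comparison in this example can be sketched as follows, writing each H subscript as a `_H` suffix. The sketch assumes a path is written starting at <H> and moving upward, so that its "top" is the end of the string; the helper name `top_two` is hypothetical.

```python
from typing import Tuple


def top_two(path: str) -> Tuple[str, ...]:
    """Return the last two node labels of a path, ignoring the <H> marker."""
    labels = [tok for tok in path.split() if tok != "<H>"]
    return tuple(labels[-2:])


p1 = "<H> NP_H PP_H NP_H"
p2 = "<H> NP_H NP_H PP_H VP_H NP_H PP_H NP_H"

# Both paths end in PP_H NP_H, so this feature gives them the same value.
same = top_two(p1) == top_two(p2)
```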

Page 17:

Partial parsing features

Using CASS parser
  f7 : (vgp…)
  f8 : (c …)
  f9 : Yes or No

Page 18:

Other features

Non-structural features

f1 : Type of <H>
  The type of the candidate, with values NP, PERSON, ORGANIZATION, and LOCATION
  This feature enables ME to determine the most probable one among them automatically

f5 : The distance between <H> and <E>, counted in parse tree words

Page 19:

Answer selection for evaluation 1/2

Strict selection
  For example
    Gold answer: Doraemon
    System: Doraemon

Lenient selection
  For example
    Gold answer: “Michel Sidibe, Director of the Country and Regional Support Department of UNAIDS”
    System: “Michel Sidibe”
  Accept candidates with priority 1 & 2 & 3

Page 20:

Answer selection for evaluation 2/2

Threshold 1 = 0.5
  Accept a candidate as an answer if half of the words in the holder also appear in the candidate

Threshold 2 = 4
  The average number of words in human-annotated holders is 3.71
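
Threshold 1 can be read as a word-overlap test. The sketch below is only an illustration under several assumptions: words are split on whitespace and lowercased, "half" is treated as "at least half" (>= 0.5), and the function names are hypothetical. Threshold 2 is not modeled because the slide does not state how it is applied.

```python
def word_overlap(holder: str, candidate: str) -> float:
    """Fraction of the annotated holder's words that also occur in the candidate."""
    holder_words = holder.lower().split()
    candidate_words = set(candidate.lower().split())
    if not holder_words:
        return 0.0
    matched = sum(1 for word in holder_words if word in candidate_words)
    return matched / len(holder_words)


def accept_lenient(holder: str, candidate: str, threshold: float = 0.5) -> bool:
    """Accept the candidate when the overlap reaches Threshold 1 (assumed inclusive)."""
    return word_overlap(holder, candidate) >= threshold


# Hypothetical gold holder and system candidates.
gold = "Russia's defense minister"
accepted = accept_lenient(gold, "defense minister")  # 2 of 3 words match
rejected = accept_lenient(gold, "Russia")            # "Russia" != "Russia's"
```

The second case shows how sensitive such a test is to tokenization: with plain whitespace splitting, the possessive form does not match the bare name.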

Page 21:

Experiments 1/3

961 pairs of (<H>, <E>)
  863 for training
  98 for testing

Baseline
  The system chooses the candidate closest to the expression as the holder, without the ME decision
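
A minimal sketch of such a closest-candidate baseline, assuming closeness is measured as the absolute difference between token offsets. The slides do not say how distance is measured for the baseline, so that choice, the offsets below, and the function name are all hypothetical.

```python
from typing import List, Tuple


def closest_candidate(candidates: List[Tuple[str, int]], expression_pos: int) -> str:
    """Pick the candidate whose start offset is nearest to the opinion expression.

    Ties go to the candidate listed first, because min() keeps the first minimum.
    """
    return min(candidates, key=lambda cand: abs(cand[1] - expression_pos))[0]


# Hypothetical (candidate, start token offset) pairs for one sentence.
candidates = [
    ("Russia's defense minister", 0),
    ("U.S.", 11),
    ("Iraq", 14),
]

# Hypothetical offset of the opinion expression in the same sentence.
closest = closest_candidate(candidates, expression_pos=8)
```

The baseline looks only at position and ignores sentence structure, which is what the ME ranker with parsing features is meant to improve on.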

Page 22:

Experiments 2/3

Page 23:

Experiments 3/3

Page 24:

Conclusions

The importance of opinion holder identification has been noticed, yet it has not been much studied to date, partly because of the lack of annotated data.

Maximum Entropy ranking is used to select the most probable holder among multiple candidates.

Adopting parsing features significantly improved system performance.