
Page 1: 1 Opinion Integration and Summarization ChengXiang (“Cheng”) Zhai Department of Computer Science Graduate School of Library & Information Science Institute


Opinion Integration and Summarization

ChengXiang (“Cheng”) Zhai

Department of Computer Science

Graduate School of Library & Information Science

Institute for Genomic Biology

Department of Statistics

University of Illinois, Urbana-Champaign

Joint work with Qiaozhu Mei (U. Mich), Yue Lu (UIUC), Hyun Duk Kim (UIUC)

Page 2:

Effective Information Sharing

[Diagram labels: Information Need; Information Sharing Network; Information Nuggets; Summary; Decision = ?]

Page 3:

Opinion Integration & Summarization

Information Nuggets

• Opinion Integration [Lu & Zhai 08]

• Sentiment Analysis [Mei et al. 07]

• Contradictory Opinion Analysis [Kim & Zhai 09]

Page 4:

1. Opinion Integration [Lu & Zhai WWW 08]

Topic: iPod

Input:

• Expert review with aspects: Design, Battery, Price, ..

• Scattered opinions: “cute… tiny…”, “..thicker..”, “last many hrs”, “die out soon”, “could afford it”, “still expensive”

Output:

• Integrated summary: for each review aspect (Design, Battery, Price), the similar opinions and the supplementary opinions

• Extra aspects: “iTunes … easy to use…”, “warranty …better to extend..”

Page 5:

Sample Integration Result: iPhone

Review article: You can make emergency calls, but you can't use any other functions…
Similar opinions: N/A
Supplementary opinions: [10] … methods for unlocking the iPhone have emerged on the Internet in the past few weeks, although they involve tinkering with the iPhone hardware…

Review article: rated battery life of 8 hours talk time, 24 hours of music playback, 7 hours of video playback, and 6 hours on Internet use.
Similar opinions: [19] iPhone will Feature Up to 8 Hours of Talk Time, 6 Hours of Internet Use, 7 Hours of Video Playback or 24 Hours of Audio Playback
Supplementary opinions: [7] Playing relatively high bitrate VGA H.264 videos, our iPhone lasted almost exactly 9 freaking hours of continuous playback with cell and WiFi on (but Bluetooth off).

Page 6:

Extra Aspects from Blogs

Supplementary opinions on extra aspects, with their support:

Support 15: You may have heard of iASign … an iPhone Dev Wiki tool that allows you to activate your phone without going through the iTunes rigamarole.

Support 13: Cisco has owned the trademark on the name "iPhone" since 2000, when it acquired InfoGear Technology Corp., which originally registered the name.

Support 13: With the imminent availability of Apple's uber cool iPhone, a look at 10 things current smartphones like the Nokia N95 have been able to do for a while and that the iPhone can't currently match...

Page 7:

Blog Coverage of Review Aspects

Page 8:

2. Multi-Faceted Sentiment Summary [Mei et al. WWW 07]

query = “Da Vinci Code”

Facet 1: Movie

Neutral:
• ... Ron Howards selection of Tom Hanks to play Robert Langdon.
• Directed by: Ron Howard Writing credits: Akiva Goldsman ...
• After watching the movie I went online and some research on ...

Positive:
• Tom Hanks stars in the movie, who can be mad at that?
• Tom Hanks, who is my favorite movie star act the leading role.
• Anybody is interested in it?

Negative:
• But the movie might get delayed, and even killed off if he loses.
• protesting ... will lose your faith by ... watching the movie.
• ... so sick of people making such a big deal about a FICTION book and movie.

Facet 2: Book

Neutral:
• I remembered when i first read the book, I finished the book in two days.
• I’m reading “Da Vinci Code” now.

Positive:
• Awesome book.
• So still a good book to past time.

Negative:
• ... so sick of people making such a big deal about a FICTION book and movie.
• This controversy book cause lots conflict in west society.

Page 9:

Separate Theme Sentiment Dynamics

[Plots of sentiment dynamics for the themes “book” and “religious belief”]

Page 10:

Probabilistic Topic-Sentiment Model

Topic models: Facet 1, Facet 2, ..., Facet k, and Background B.

Example facet word distributions:

• battery 0.3, life 0.2, ..

• nano 0.1, release 0.05, screen 0.02, ..

• apple 0.2, microsoft 0.1, compete 0.05, ..

Background B: is 0.05, the 0.04, a 0.03, ..

Sentiment models:

• Positive (P): love 0.2, awesome 0.05, good 0.01, ..

• Negative (N): suck 0.07, hate 0.06, stupid 0.02, ..

Generating a word: choose a facet (subtopic) i, then choose among the facet's own word distribution (F), the positive model (P), and the negative model (N). The background B is the alternative to choosing any facet.

Example generated words: “battery” (F), “love” (P), “hate” (N), “the” (B).
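The mixture in this diagram can be written out as a word probability. The following is a minimal sketch, not the authors' implementation: the background weight `lam_b`, the facet weights, and the F/P/N weights `delta` are illustrative assumptions, a single `delta` is shared by all facets for simplicity, and the `"<other>"` entry is a hypothetical device that absorbs the probability mass the slide elides with "..".

```python
def complete(dist):
    """Add an "<other>" entry so a truncated toy distribution sums to 1."""
    out = dict(dist)
    out["<other>"] = 1.0 - sum(dist.values())
    return out

# Toy distributions copied from the slide; the remaining mass goes to "<other>".
FACETS = [
    complete({"battery": 0.3, "life": 0.2}),
    complete({"nano": 0.1, "release": 0.05, "screen": 0.02}),
    complete({"apple": 0.2, "microsoft": 0.1, "compete": 0.05}),
]
BACKGROUND = complete({"is": 0.05, "the": 0.04, "a": 0.03})
POSITIVE = complete({"love": 0.2, "awesome": 0.05, "good": 0.01})
NEGATIVE = complete({"suck": 0.07, "hate": 0.06, "stupid": 0.02})

def word_prob(w, facet_weights, lam_b=0.2, delta=(0.6, 0.2, 0.2)):
    """p(w) = lam_b * p(w|B) + (1 - lam_b) * sum_j pi_j * (F/P/N mixture).

    lam_b, facet_weights (pi_j) and delta are assumed values, not from the slide.
    """
    d_f, d_p, d_n = delta
    p_facets = 0.0
    for pi_j, facet in zip(facet_weights, FACETS):
        p_facets += pi_j * (
            d_f * facet.get(w, 0.0)
            + d_p * POSITIVE.get(w, 0.0)
            + d_n * NEGATIVE.get(w, 0.0)
        )
    return lam_b * BACKGROUND.get(w, 0.0) + (1.0 - lam_b) * p_facets

pi = [0.5, 0.3, 0.2]  # assumed facet coverage for one document
for word in ["battery", "love", "hate", "the"]:
    print(word, round(word_prob(word, pi), 4))
```

With these assumed weights, a facet word such as "battery" gets its mass only through the F branch of the first facet, while "love" and "hate" get theirs through the shared sentiment models regardless of facet.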

Page 11:

3. Summarization of Contradictory Opinions [Kim & Zhai CIKM 09]


How can we help analysts digest and interpret contradictory opinions?

Page 12:

Contrastive Opinion Summarization

Input: two sets of opinion sentences, X = {x1, x2, ..., xn} and Y = {y1, y2, ..., ym}.

Page 13:

Contrastive Opinion Summarization

Input: two sets of opinion sentences, X = {x1, x2, ..., xn} and Y = {y1, y2, ..., ym}.

Contrastive Opinion Summary: k sentence pairs (u1, v1), (u2, v2), ..., (uk, vk), with U = {u1, ..., uk} ⊆ X and V = {v1, ..., vk} ⊆ Y.

Page 14:

Problem Formulation

Given X = {x1, ..., xn} and Y = {y1, ..., ym}, choose U = {u1, ..., uk} ⊆ X and V = {v1, ..., vk} ⊆ Y.

• Representativeness: U should represent X, and V should represent Y.

• Contrastiveness: each ui should contrast with its paired vi.

Page 15:

Problem Formulation

For a summary S = {(u_i, v_i)}, i ∈ [1, k], with a content similarity function φ and a contrastive similarity function ψ:

Representativeness:

$$r(S) = \frac{1}{|X|} \sum_{x \in X} \max_{i \in [1,k]} \phi(x, u_i) + \frac{1}{|Y|} \sum_{y \in Y} \max_{i \in [1,k]} \phi(y, v_i)$$

Contrastiveness:

$$c(S) = \frac{1}{k} \sum_{i=1}^{k} \psi(u_i, v_i)$$

Page 16:

Summarization as Optimization

$$S^* = \arg\max_{S} \big( \lambda\, r(S) + (1 - \lambda)\, c(S) \big)$$

$$= \arg\max_{S} \Big( \lambda \Big( \frac{1}{|X|} \sum_{x \in X} \max_{i \in [1,k]} \phi(x, u_i) + \frac{1}{|Y|} \sum_{y \in Y} \max_{i \in [1,k]} \phi(y, v_i) \Big) + (1 - \lambda) \frac{1}{k} \sum_{i=1}^{k} \psi(u_i, v_i) \Big)$$

1. Define an appropriate content similarity function φ

2. Define an appropriate contrastive similarity function ψ

3. Solve the optimization problem efficiently.
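To make the objective concrete, here is a minimal sketch that evaluates a weighted sum of representativeness and contrastiveness for a candidate summary and finds the best summary by exhaustive search. The weight name `lam`, the helper names, and the toy `jaccard` word-overlap similarity (used for both φ and ψ) are assumptions for illustration only; the slide leaves both similarity functions to be defined.

```python
from itertools import combinations, permutations

def jaccard(a, b):
    """Toy similarity in [0, 1]: word overlap of two sentences (an assumption)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def representativeness(X, Y, S, phi):
    """How well the chosen u's cover X plus how well the chosen v's cover Y."""
    U = [u for u, _ in S]
    V = [v for _, v in S]
    rx = sum(max(phi(x, u) for u in U) for x in X) / len(X)
    ry = sum(max(phi(y, v) for v in V) for y in Y) / len(Y)
    return rx + ry

def contrastiveness(S, psi):
    """Average contrastive similarity over the k aligned pairs."""
    return sum(psi(u, v) for u, v in S) / len(S)

def objective(X, Y, S, phi, psi, lam=0.5):
    return lam * representativeness(X, Y, S, phi) + (1 - lam) * contrastiveness(S, psi)

def best_summary(X, Y, k, phi, psi, lam=0.5):
    """Exhaustive search over all k-pair summaries; feasible only for tiny inputs."""
    best, best_score = None, float("-inf")
    for U in combinations(X, k):
        for V in permutations(Y, k):
            S = list(zip(U, V))
            score = objective(X, Y, S, phi, psi, lam)
            if score > best_score:
                best, best_score = S, score
    return best, best_score

X = ["battery life is great", "battery lasts long", "screen is bright"]
Y = ["battery life is short", "screen is dim", "battery dies fast"]
summary, score = best_summary(X, Y, 2, jaccard, jaccard)
for u, v in summary:
    print(u, "|", v)
```

The number of candidate summaries grows combinatorially with the sizes of X and Y, which is why step 3 asks for an efficient solution rather than this brute-force search.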

Page 17:

Sample Results

No. 1
Positive: oh ... and file transfers are fast & easy .
Negative: you need the software to actually transfer files

No. 2
Positive: i noticed that the micro adjustment knob and collet are well made and work well too.
Negative: the adjustment knob seemed ok, but when lowering the router, i have to practically pull it down while turning the knob.

No. 3
Positive: the navigation is nice enough , but scrolling and searching through thousands of tracks , hundreds of albums or artists , or even dozens of genres is not conducive to save driving
Negative: difficult navigation - i wo n’t necessarily say " difficult ,“ but i do n’t enjoy the scrollwheel to navigate .

No. 4
Positive: i imagine if i left my player untouched (no backlight) it could play for considerably more than 12 hours at a low volume level.
Negative: there are 2 things that need fixing first is the battery life.it will run for 6 hrs without problems with medium usage of the buttons.

Page 18:

Sample Result

Observation on the sample pairs: different polarities of opinions made from different perspectives.

Page 19:

Sample Result

Observation on the sample pairs: positive vs. negative; not much disagreement.

Page 20:

Sample Result

Observation on the sample pairs: judgments revealing detailed conditions.

Page 21:

Summary

Information Nuggets

• Opinion Integration [Lu & Zhai 08]

• Sentiment Analysis [Mei et al. 07]

• Contradictory Opinion Analysis [Kim & Zhai 09]

Page 22:

Future Plan

• Incorporate trustworthiness of sources in opinion integration

• Analyze opinions in context and discover topic communities

• Suggest opportunities for information sharing

• “Soft” policy of information sharing


Page 23:

References

Opinion integration:

[WWW 08] Y. Lu and C. Zhai. Opinion integration through semi-supervised topic modeling. In WWW ’08: Proceedings of the 17th International Conference on World Wide Web, pages 121–130, New York, NY, USA, 2008.

Sentiment analysis:

[WWW 07] Q. Mei, X. Ling, M. Wondra, H. Su, and C. Zhai. Topic sentiment mixture: modeling facets and opinions in weblogs. In WWW ’07: Proceedings of the 16th International Conference on World Wide Web, pages 171–180, New York, NY, USA, 2007.

Contradictory opinion summarization:

[CIKM 09] H. Kim and C. Zhai. Generating comparative summaries of contradictory opinions in text. In Proceedings of CIKM 2009, to appear.

Page 24:

Questions/Comments/Suggestions?

http://apps.facebook.com/news_letters/

Page 25:

1. Content Similarity Function

$$\phi(s_1, s_2) = \frac{\sum_{u \in s_1} \max_{v \in s_2} \omega(u, v) + \sum_{v \in s_2} \max_{u \in s_1} \omega(u, v)}{|s_1| + |s_2|}$$

ω(u, v) ∈ [0, 1]: term similarity function

Example:

Sentence 1: It has short battery time

Sentence 2: The battery life isn’t great
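A minimal sketch of this content similarity: each term is matched to its most similar term in the other sentence, in both directions, and the total is normalized by the combined sentence length. The exact-match term similarity `omega` used here is an assumption (the slide only requires values in [0, 1]), and the two sentences are the slide's example as read from its word fragments.

```python
def omega(u, v):
    """Term similarity in [0, 1]; exact match is an assumed simplest choice."""
    return 1.0 if u == v else 0.0

def phi(s1, s2, term_sim=omega):
    """Content similarity of two tokenized sentences.

    Sums the best match of every term in s1 against s2 and of every term in
    s2 against s1, then divides by |s1| + |s2|.
    """
    if not s1 or not s2:
        return 0.0
    forward = sum(max(term_sim(u, v) for v in s2) for u in s1)
    backward = sum(max(term_sim(u, v) for u in s1) for v in s2)
    return (forward + backward) / (len(s1) + len(s2))

s1 = "it has short battery time".split()
s2 = "the battery life isn't great".split()
print(phi(s1, s2))  # 0.2: only "battery" matches, so (1 + 1) / (5 + 5)
```

With exact matching, "time" and "life" contribute nothing; a softer `omega` (for example one based on synonyms) could be passed as `term_sim` to give such pairs partial credit.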


Page 28:

2. Contrastive Similarity Function

The contrastive similarity ψ(s_1, s_2) takes the same form as the content similarity:

$$\psi(s_1, s_2) = \frac{\sum_{u \in s_1} \max_{v \in s_2} \omega(u, v) + \sum_{v \in s_2} \max_{u \in s_1} \omega(u, v)}{|s_1| + |s_2|}$$

ω(u, v) ∈ [0, 1]: term similarity function

Example:

Sentence 1: It has short battery time

Sentence 2: The battery life isn’t great

Page 29:

3. Approximation Algorithms

Input: X = {x1, x2, ..., xn} and Y = {y1, y2, ..., ym}.

Number of candidate combinations: $\binom{|X|}{|S|} \binom{|Y|}{|S|}$

Page 30:

Strategy 1: Representativeness-First

1. Find k clusters in X and in Y.

Page 31:

Strategy 1: Representativeness-First

1. Find k clusters in X and in Y.

2. Find contrastive pairs (ui, vi), giving U = {u1, ..., uk} and V = {v1, ..., vk}.
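The two steps of the representativeness-first strategy can be sketched as follows. This is a simplified illustration, not the authors' algorithm: a greedy choice of k representative sentences stands in for the clustering step, the pairing step is a plain greedy match, and the `jaccard` word-overlap similarity (used for both φ and ψ) and all function names are assumptions.

```python
def jaccard(a, b):
    """Toy similarity in [0, 1]: word overlap of two sentences (an assumption)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def pick_representatives(sentences, k, phi):
    """Greedy stand-in for clustering: repeatedly add the sentence that most
    improves how well the chosen set covers all sentences."""
    chosen = []
    while len(chosen) < min(k, len(sentences)):
        def coverage(candidate):
            reps = chosen + [candidate]
            return sum(max(phi(s, r) for r in reps) for s in sentences)
        best = max((s for s in sentences if s not in chosen), key=coverage)
        chosen.append(best)
    return chosen

def rep_first(X, Y, k, phi, psi):
    # Step 1: representativeness first, independently on each side.
    U = pick_representatives(X, k, phi)
    V = pick_representatives(Y, k, phi)
    # Step 2: align each u with the most contrastive v that is still unused.
    pairs, remaining = [], list(V)
    for u in U:
        if not remaining:
            break
        v = max(remaining, key=lambda cand: psi(u, cand))
        pairs.append((u, v))
        remaining.remove(v)
    return pairs

X = ["battery life is great", "battery lasts long",
     "screen is bright", "screen looks sharp"]
Y = ["battery life is short", "battery dies fast", "screen is dim"]
for u, v in rep_first(X, Y, 2, jaccard, jaccard):
    print(u, "|", v)
```

Because the representatives are fixed before any pairing happens, this strategy can end up with well-covered sides whose sentences do not contrast well with each other.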

Page 32:

Strategy 2: Contrastiveness-First

Input: X = {x1, x2, ..., xn} and Y = {y1, y2, ..., ym}.

Page 33:

Strategy 2: Contrastiveness-First

1. Find contrastive pairs (x, y) with x ∈ X and y ∈ Y.

Page 34:

Strategy 2: Contrastiveness-First

1. Find contrastive pairs (x, y) with x ∈ X and y ∈ Y.

2. Select k representative pairs.
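The contrastiveness-first strategy reverses the order of the two steps. The sketch below is a simplified illustration under stated assumptions, not the authors' algorithm: it keeps a pool of the most contrastive cross pairs (the hypothetical `pool_size` parameter), then greedily picks k pairs from that pool that best cover X and Y. The `jaccard` similarity and all function names are assumptions.

```python
def jaccard(a, b):
    """Toy similarity in [0, 1]: word overlap of two sentences (an assumption)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def coverage(items, reps, phi):
    """Average best-match similarity of items to the chosen representatives."""
    if not reps:
        return 0.0
    return sum(max(phi(s, r) for r in reps) for s in items) / len(items)

def contrast_first(X, Y, k, phi, psi, pool_size=10):
    # Step 1: rank every cross pair by contrastive similarity, keep the top ones.
    ranked = sorted(((x, y) for x in X for y in Y),
                    key=lambda p: psi(p[0], p[1]), reverse=True)
    pool = ranked[:pool_size]
    # Step 2: from that pool, greedily select pairs that best represent X and Y.
    chosen = []
    while len(chosen) < k:
        used_x = {u for u, _ in chosen}
        used_y = {v for _, v in chosen}
        free = [(x, y) for x, y in pool if x not in used_x and y not in used_y]
        if not free:
            break
        def gain(pair):
            U = [u for u, _ in chosen] + [pair[0]]
            V = [v for _, v in chosen] + [pair[1]]
            return coverage(X, U, phi) + coverage(Y, V, phi)
        chosen.append(max(free, key=gain))
    return chosen

X = ["battery life is great", "battery lasts long",
     "screen is bright", "screen looks sharp"]
Y = ["battery life is short", "battery dies fast", "screen is dim"]
for u, v in contrast_first(X, Y, 2, jaccard, jaccard):
    print(u, "|", v)
```

A smaller `pool_size` puts more weight on contrastiveness, while a larger one leaves more room for representativeness in the second step.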

Page 35:

Rep-First Outperforms Contrast-First

Page 36:

Contrastive Similarity Heuristic Works

Method           Precision        Aspect Coverage
(Opt.)           RF      CF       RF      CF
WO               0.503   0.537    0.737   0.804
WO+all words     0.484   0.531    0.737   0.798
SEM              0.500   0.540    0.763   0.763
SEM+all words    0.470   0.507    0.718   0.686

Removing sentiment words is beneficial