Sentiment Analysis of Peer Review Texts for Scholarly Papers Ke Wang & Xiaojun Wan {wangke17,wanxiaojun}@pku.edu.cn July 9, 2018 Institute of Computer Science and Technology, Peking University Beijing , China

Sentiment Analysis of Peer Review Texts for Scholarly Papers

Ke Wang & Xiaojun Wan
{wangke17,wanxiaojun}@pku.edu.cn

July 9, 2018

Institute of Computer Science and Technology, Peking University
Beijing, China

Outline

1. Introduction

2. Related Work

3. Framework

4. Experiments

5. Conclusion and Future Work

Introduction

• The boom of scholarly papers
• Motivations
  • Help the review submission system detect the consistency of review texts and scores.
  • Help the chair write a comprehensive meta-review.
  • Help authors further improve their papers.

Figure 1: An example of peer review text and the analysis results.


Introduction

• Challenges
  • Long length.
  • Mixture of non-opinionated and opinionated texts.
  • Mixture of pros and cons.

• Contributions
  • We built two evaluation datasets (ICLR-2017 and ICLR-2018).
  • We propose a multiple instance learning network with a novel abstract-based memory mechanism (MILAM).
  • Evaluation results demonstrate the efficacy of the proposed model and show that using the abstract as memory is highly helpful.

Related Work

• Sentiment Classification
  Sentiment analysis has been widely explored in many text domains, but few studies have tried to perform it in the domain of peer reviews for scholarly papers.

• Multiple Instance Learning
  MIL can extract instance labels (sentence-level polarities) from bags (reviews in our case), but no previous work has applied it to this challenging task.

• Memory Network
  A memory network utilizes external information for greater capacity and efficiency.

• Study on Peer Reviews
  These tasks are related to, but different from, the sentiment analysis task addressed in this study.

Framework

• Architecture
  1 Input Representation
  2 Sentence Classification
  3 Review Classification

[Figure 2: The architecture of MILAM. Diagram labels: abstract text T_abstract (sentences S^a_1, ..., S^a_m) and review text T_review (sentences S^r_1, ..., S^r_n); sentence embedding, convolution and max pooling in the Input Representation Layer, giving I_1, ..., I_n and M_1, ..., M_m; matched attention E^(i), response content R^(i), MLP (V_1, ..., V_n) and softmax (P_1, ..., P_n) in the Sentence Classification Layer with the Abstract-based Memory Mechanism; hidden states h_1, ..., h_n, document attention a_1, ..., a_n and a Sum producing P_review in the Review Classification Layer.]


Framework

1 Input Representation Layer:
  I A sentence S of length L (padded where necessary) is represented as:
    $S = w_1 \oplus w_2 \oplus \cdots \oplus w_L, \quad S \in \mathbb{R}^{L \times d}$  (1)
  II The convolutional layer:
    $f_k = \tanh(W_c \cdot W_{k-l+1:k} + b_c)$  (2)
    $f^{(q)} = [f^{(q)}_1, f^{(q)}_2, \cdots, f^{(q)}_{L-l+1}]$  (3)
  III A max-pooling layer:
    $u_q = \max\{f^{(q)}\}$  (4)

Finally, the representations of the review text $\{S^r_i\}_{i=1}^{n}$ and the abstract text $\{S^a_j\}_{j=1}^{m}$ are denoted as $[I_i]_{i=1}^{n}$ and $[M_j]_{j=1}^{m}$ respectively, where $I_i, M_j \in \mathbb{R}^z$.
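The convolution and max-pooling steps of Eqs. (2)-(4) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: the function name `conv_max_pool`, the flat per-filter weight layout, and the argument names are assumptions made for the example, and the sentence is assumed to be already padded so that it has at least `window` words.

```python
import math

def conv_max_pool(sentence, filters, biases, window):
    """Encode a sentence (a list of L word vectors of size d) as a z-dim vector.

    filters[q] is a flat weight list of length window * d for feature map q,
    so z == len(filters). All names here are illustrative assumptions.
    """
    length = len(sentence)
    if length < window:
        raise ValueError("pad the sentence to at least `window` words")
    features = []
    for w_c, b_c in zip(filters, biases):
        f_q = []
        # Eqs. (2)-(3): slide a window of `window` words over the sentence,
        # producing L - window + 1 activations for this filter.
        for k in range(window, length + 1):
            flat = [x for word in sentence[k - window:k] for x in word]
            f_q.append(math.tanh(sum(w * x for w, x in zip(w_c, flat)) + b_c))
        # Eq. (4): max pooling keeps one value u_q per filter.
        features.append(max(f_q))
    return features
```

Applying the same function to every review sentence and every abstract sentence would give the vectors written above as I_i and M_j.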


Framework

2 Sentence Classification Layer:
  I Obtain a matched attention vector $E^{(i)} = [e^{(i)}_t]_{t=1}^{m}$, which indicates the weight of the memories.
  II Calculate the response content $R^{(i)} \in \mathbb{R}^z$ using this matched attention vector.
  III Use an MLP to obtain the final representation vector of each sentence in the review text:
    $V_i = f_{mlp}(I_i \| R^{(i)}; \theta_{mlp})$  (5)
  IV Use the softmax classifier to get the sentence-level distribution over sentiment labels:
    $P_i = \mathrm{softmax}(W_p \cdot V_i + b_p)$  (6)

Finally, we obtain new high-level representations of the sentences in the review text by leveraging relevant abstract information.
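Steps III and IV (Eqs. 5 and 6) can be sketched as below. The slides do not specify the form of f_mlp, so the single tanh layer used here is an assumption, as are the helper names `affine`, `softmax`, and `classify_sentence`.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def affine(weights, bias, vec):
    """Return weights @ vec + bias for a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def classify_sentence(I_i, R_i, W_mlp, b_mlp, W_p, b_p):
    """Return (V_i, P_i) for one review sentence."""
    # Eq. (5): V_i = f_mlp(I_i || R_i). One tanh layer is an assumption;
    # list concatenation plays the role of the || operator.
    V_i = [math.tanh(v) for v in affine(W_mlp, b_mlp, I_i + R_i)]
    # Eq. (6): P_i = softmax(W_p . V_i + b_p).
    P_i = softmax(affine(W_p, b_p, V_i))
    return V_i, P_i
```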


Framework

3 Review Classification Layer:
  I Use separate LSTM modules to produce forward and backward hidden vectors:
    $\overrightarrow{h_i} = \overrightarrow{\mathrm{LSTM}}(V_i), \quad \overleftarrow{h_i} = \overleftarrow{\mathrm{LSTM}}(V_i), \quad h_i = \overrightarrow{h_i} \| \overleftarrow{h_i}$  (7)
  II The importance ($a_i$) of each sentence is measured as follows:
    $h'_i = \tanh(W_a \cdot h_i + b_a), \quad a_i = \frac{\exp(h'_i)}{\sum_j \exp(h'_j)}$  (8)
  III Finally, we obtain a document-level distribution over sentiment labels as the weighted sum of sentence-level distributions:
    $P^{(c)}_{review} = \sum_i a_i P^{(c)}_i, \quad c \in [1, C]$  (9)
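The document attention of Eq. (8) and the weighted sum of Eq. (9) can be sketched as follows. The bidirectional LSTM of Eq. (7) is not implemented here: the hidden vectors h_i are taken as given inputs, and the function and argument names are assumptions made for this example.

```python
import math

def review_distribution(hidden, sentence_probs, w_a, b_a):
    """Combine sentence-level distributions into one review-level distribution.

    hidden[i] stands for h_i (assumed to come from a BiLSTM computed elsewhere);
    sentence_probs[i] stands for P_i, a distribution over C classes.
    """
    # Eq. (8): one scalar score per sentence, then a softmax over sentences.
    scores = [math.tanh(sum(w * x for w, x in zip(w_a, h)) + b_a) for h in hidden]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Eq. (9): P_review^(c) = sum_i a_i * P_i^(c).
    n_classes = len(sentence_probs[0])
    p_review = [sum(a * p[c] for a, p in zip(weights, sentence_probs))
                for c in range(n_classes)]
    return weights, p_review
```

Because the weights a_i sum to one and each P_i is a distribution, the result is itself a distribution over the C classes.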


Framework

• Abstract-based Memory Mechanism
  1 Get the matched attention vector $E^{(i)}$ of memories:
    $e'_t = \mathrm{LSTM}(h_{t-1}, M_t), \quad (h_0 = I_i,\ t = 1, \ldots, m)$  (10)
    $e^{(i)}_t = \frac{\exp(e'_t)}{\sum_j \exp(e'_j)}$  (11)
    $E^{(i)} = [e^{(i)}_t]_{t=1}^{m}$  (12)
  2 Calculate the response content $R^{(i)}$:
    $R^{(i)} = \sum_{t=1}^{m} e^{(i)}_t M_t$  (13)
  3 Use $R^{(i)}$ and $I_i$ to compute the new sentence representation vector $V_i$:
    $V_i = f_{mlp}(I_i \| R^{(i)}; \theta_{mlp})$  (14)
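The normalization and weighted sum in Eqs. (11)-(13) can be sketched as below. The LSTM matching step of Eq. (10) is deliberately left out: the raw scores e'_t are passed in as a plain list, however they were produced. The name `memory_response` and its arguments are assumptions made for this example.

```python
import math

def memory_response(raw_scores, memories):
    """Return (E_i, R_i) for one review sentence.

    raw_scores[t] stands in for e'_t (Eq. 10, not implemented here);
    memories[t] stands for M_t, the z-dim vector of abstract sentence t.
    """
    # Eqs. (11)-(12): softmax over the m abstract sentences gives E^(i).
    top = max(raw_scores)
    exps = [math.exp(s - top) for s in raw_scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    # Eq. (13): R^(i) = sum_t e_t^(i) * M_t, computed per dimension.
    z = len(memories[0])
    response = [sum(a * m[d] for a, m in zip(attention, memories))
                for d in range(z)]
    return attention, response
```

The index of the largest entry of the returned attention list, for example `attention.index(max(attention))`, identifies the abstract sentence with the largest weight for the given review sentence.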


Framework

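• Objective Function
  • Our model only needs the review's sentiment label; each sentence's sentiment label is unobserved.
  • The categorical cross-entropy loss:
    $L(\theta) = \sum_{T_{review}} \sum_{c=1}^{C} -\bar{P}^{(c)}_{review} \log(P^{(c)}_{review})$  (15)
    where $\bar{P}_{review}$ is the gold (one-hot) label distribution of the review and $P_{review}$ is the predicted distribution from Eq. (9).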
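A minimal sketch of the loss in Eq. (15), assuming the first factor is the gold one-hot label distribution of the review, so that only the gold class contributes to the inner sum. The function name, the `(prediction, gold index)` batch format, and the small `eps` guard against log(0) are assumptions made for this example.

```python
import math

def review_cross_entropy(batch, eps=1e-12):
    """Sum the categorical cross-entropy over a batch of reviews.

    batch is a list of (p_review, gold) pairs: p_review is the predicted
    review-level distribution, gold is the index of the true class.
    No sentence-level labels are needed, matching the MIL setting.
    """
    loss = 0.0
    for p_review, gold in batch:
        # With a one-hot gold label, only the gold-class term is non-zero.
        loss += -math.log(p_review[gold] + eps)
    return loss
```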

Experiments

• Evaluation Datasets
  • Statistics for the ICLR-2017 and ICLR-2018 datasets:

Data Set    #Papers  #Reviews  #Sentences  #Words
ICLR-2017   490      1517      24497       9868
ICLR-2018   954      2875      58329       13503

• The score distributions:


Experiments

• Comparison of review sentiment classification accuracy on the 2-class task {reject (score ∈ [1, 5]), accept (score ∈ [6, 10])}


Experiments

• Comparison of review sentiment classification accuracy on the 3-class task {reject (score ∈ [1, 4]), borderline (score ∈ [5, 6]), accept (score ∈ [7, 10])}
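The score-to-label binning behind the 2-class and 3-class tasks can be sketched as follows. This assumes the usual ICLR convention that higher scores are more favourable, so that low scores map to reject and high scores to accept; the function names are illustrative.

```python
def two_class(score):
    """Map a 1-10 review score to a 2-class label (assumed: low = reject)."""
    if not 1 <= score <= 10:
        raise ValueError("score must be in [1, 10]")
    return "reject" if score <= 5 else "accept"

def three_class(score):
    """Map a 1-10 review score to reject / borderline / accept."""
    if not 1 <= score <= 10:
        raise ValueError("score must be in [1, 10]")
    if score <= 4:
        return "reject"
    if score <= 6:
        return "borderline"
    return "accept"
```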


Experiments

• Sentence-Level Classification Results.
  We randomly selected 20 reviews, a total of 213 sentences, and manually labeled the sentiment polarity of each sentence.

Figure 3: Example opinionated sentences with predicted polarity scores extracted from a review text.


Experiments

• Influence of Abstract Text.

Figure 4: Example sentences in a review text and their most relevant sentences in the paper abstract text. The sentence with the largest weight in the matched attention vector E(i) is considered most relevant. The red text indicates similarities between the review text and the abstract text.


Experiments

• Influence of Abstract Text.
  • A simple method of using abstract texts, as a contrast experiment:
    Remove the sentences that are similar to the paper abstract's sentences from the review text and use the remaining text for classification. (The threshold is set to 0.7.)

Figure 5: The comparison of using and not using the paper abstract via a simple method.
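The simple filtering baseline described above can be sketched as below. The slides give only the 0.7 threshold and do not name the similarity measure, so the bag-of-words cosine similarity, the whitespace tokenization, and the choice to drop sentences whose similarity is at or above the threshold are all assumptions of this example.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def drop_abstract_like(review_sentences, abstract_sentences, threshold=0.7):
    """Keep only review sentences that are not similar to any abstract sentence."""
    abstract_bags = [Counter(s.lower().split()) for s in abstract_sentences]
    kept = []
    for sentence in review_sentences:
        bag = Counter(sentence.lower().split())
        # Highest similarity to any abstract sentence (0.0 if the abstract is empty).
        best = max((cosine(bag, other) for other in abstract_bags), default=0.0)
        if best < threshold:
            kept.append(sentence)
    return kept
```

The remaining sentences would then be passed to the classifier in place of the full review text.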


Experiments

• Influence of Borderline Reviews.

Figure 6: Experimental results on different datasets with, without, and with only borderline reviews.


Experiments

• Cross-Year Experiments.

Figure 7: Results of cross-year experiments. Model@ICLR-* means the model is trained on the ICLR-* dataset.


Experiments

• Cross-Domain Experiments.
  We further collected 87 peer reviews for submissions to NLP conferences (CoNLL, ACL, EMNLP, etc.), including 57 positive reviews (accept) and 30 negative reviews (reject).

Figure 8: Results of cross-domain experiments. * means the performance improvement over the first three methods is statistically significant with p-value < 0.05 for the sign test. Model@ICLR-* means the model is trained on the ICLR-* dataset.


Experiments

• Final Decision Prediction for Scholarly Papers.
  • Methods to predict the final decision of a paper based on several review scores.

  • Voting:
    $\mathrm{Decision} = \begin{cases} \mathrm{Accept} & \text{if } \#accept > \#reject \\ \mathrm{Reject} & \text{otherwise} \end{cases}$  (16)

  • Simple Average:
    Simply average the scores of all reviews. If the average score is larger than or equal to 0.6, the paper is predicted as final accept, and otherwise as final reject.

  • Confidence-based Average:
    $overall\_score = \frac{1}{|S|} \sum_{i=1}^{|S|} S_i \cdot \frac{1}{6 - ReviewerConfidence_i}$  (17)
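The three aggregation rules above can be sketched as follows. The function names are illustrative. The slides give no decision threshold for the confidence-based average, so that function only returns the overall score of Eq. (17); reviewer confidences are assumed to lie in 1..5 so that the denominator stays positive.

```python
def vote(labels):
    """Eq. (16): accept only if accept votes strictly outnumber reject votes."""
    accepts = sum(1 for label in labels if label == "accept")
    rejects = sum(1 for label in labels if label == "reject")
    return "accept" if accepts > rejects else "reject"

def simple_average(scores, threshold=0.6):
    """Accept if the mean review score reaches the 0.6 threshold."""
    mean = sum(scores) / len(scores)
    return "accept" if mean >= threshold else "reject"

def confidence_based_average(scores, confidences):
    """Eq. (17): mean of S_i / (6 - ReviewerConfidence_i)."""
    if len(scores) != len(confidences):
        raise ValueError("one confidence value is needed per score")
    if any(not 1 <= c <= 5 for c in confidences):
        raise ValueError("confidence assumed to lie in 1..5")
    return sum(s / (6 - c) for s, c in zip(scores, confidences)) / len(scores)
```

Under Eq. (17) a review with confidence 5 keeps its full score, while a review with confidence 1 contributes only one fifth of its score.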


Experiments

• Final Decision Prediction for Scholarly Papers.
  • Results of final decision prediction for scholarly papers.

Figure 9: Results of final decision prediction for scholarly papers.

Conclusion and Future Work

• Contributions
  • We built two evaluation datasets (ICLR-2017 and ICLR-2018).
  • We propose a multiple instance learning network with a novel abstract-based memory mechanism (MILAM).
  • Evaluation results demonstrate the efficacy of the proposed model and show that using the abstract as memory is highly helpful.

• Future Work
  • Collect more peer reviews.
  • Try more sophisticated deep learning techniques.
  • Several other sentiment analysis tasks: prediction of the fine-granularity scores of reviews, automatic writing of meta-reviews, prediction of the best papers, ...


Acknowledgments

• National Natural Science Foundation of China.

• Anonymous reviewers for their helpful comments.

• SIGIR Student Travel Grant.
