
PaperRobot: Incremental Draft Generation of Scientific Ideas

Qingyun Wang 1, Lifu Huang 1, Zhiying Jiang 1, Kevin Knight 2, Heng Ji 1,3, Mohit Bansal 4, Yi Luan 5

1 Rensselaer Polytechnic Institute  2 DiDi Labs  3 University of Illinois at Urbana-Champaign  4 University of North Carolina at Chapel Hill  5 University of Washington
[email protected], [email protected]

Abstract

We present a PaperRobot who performs as an automatic research assistant by (1) conducting deep understanding of a large collection of human-written papers in a target domain and constructing comprehensive background knowledge graphs (KGs); (2) creating new ideas by predicting links from the background KGs, by combining graph attention and contextual text attention; (3) incrementally writing some key elements of a new paper based on memory-attention networks: from the input title along with predicted related entities to generate a paper abstract, from the abstract to generate conclusion and future work, and finally from the future work to generate a title for a follow-on paper. Turing tests, where a biomedical domain expert is asked to compare a system output and a human-authored string, show that PaperRobot-generated abstracts, conclusion and future work sections, and new titles are chosen over human-written ones up to 30%, 24% and 12% of the time, respectively.¹

1 Introduction

Our ambitious goal is to speed up scientific discovery and production by building a PaperRobot, who addresses three main tasks as follows.

Read Existing Papers. Scientists now find it difficult to keep up with the overwhelming number of papers. For example, in the biomedical domain, on average more than 500K papers are published every year², and more than 1.2 million new papers were published in 2016 alone, bringing the total number of papers to over 26 million (Van Noorden, 2014). However, human reading ability has stayed almost the same across years.

¹ The programs, data and resources are publicly available for research purposes at: https://github.com/EagleW/PaperRobot

² http://dan.corlan.net/medline-trend/language/absolute.html

In 2012, US scientists estimated that they read, on average, only 264 papers per year (1 out of 5000 available papers), which is, statistically, not different from what they reported in an identical survey last conducted in 2005. PaperRobot automatically reads existing papers to build background knowledge graphs (KGs), in which nodes are entities/concepts and edges are the relations between these entities (Section 2.2).

[Figure 1: PaperRobot Incremental Writing. Starting from a human-written title for the 1st paper, PaperRobot generates an abstract, then conclusion and future work, and finally a new title for a 2nd, follow-on paper, drawing on enriched knowledge graphs built from old human-written papers.]

Create New Ideas. Scientific discovery can be considered as creating new nodes or links in the knowledge graphs. Creating new nodes usually means discovering new entities (e.g., new proteins) through a series of real laboratory experiments, which is probably too difficult for PaperRobot. In contrast, creating new edges is easier to automate using the background knowledge graph as the starting point. Foster et al. (2015) show that more than 60% of 6.4 million papers in biomedicine and chemistry are about incremental work. This inspires us to automate the incremental creation of new ideas and hypotheses by predicting new links in background KGs. In fact, when more data is available, we can construct larger and richer background KGs for more reliable link prediction. Recent work (Ji et al., 2015b) successfully mines strong relevance between drugs and diseases from biomedical papers based on KGs constructed from weighted co-occurrence.


[Figure 2: PaperRobot Architecture Overview. Existing paper reading (knowledge extraction over old papers to build background knowledge graphs) feeds link prediction (graph attention combined with text attention over contextual sentences to enrich the graphs), and the enriched knowledge graphs feed new paper writing (a bidirectional GRU encodes the reference title while a multi-hop memory network attends over related entity embeddings such as nasopharyngeal carcinoma, Maspin, Snail, and diallyl disulfide; the reference distribution, memory distribution, and language distribution are combined into the final distribution).]

We propose a new entity representation that combines KG structure and unstructured contextual text for link prediction (Section 2.3).

Write a New Paper about New Ideas. The final step is to communicate the new ideas to the reader clearly, which is a very difficult thing to do; many scientists are, in fact, bad writers (Pinker, 2014). Using a novel memory-attention network architecture, PaperRobot automatically writes a new paper abstract about an input title along with predicted related entities, then further writes conclusion and future work based on the abstract, and finally predicts a new title for a future follow-on paper, as shown in Figure 1 (Section 2.4).

We choose biomedical science as our target domain due to the sheer volume of available papers. Turing tests show that PaperRobot-generated output strings are sometimes chosen over human-written ones, and most paper abstracts only require minimal edits from domain experts to become highly informative and coherent.

2 Approach

2.1 Overview

The overall framework of PaperRobot is illustrated in Figure 2. A walk-through example produced by this whole process is shown in Table 1. In the following subsections, we elaborate on the algorithms for each step.

2.2 Background Knowledge Extraction

From a massive collection of existing biomedical papers, we extract entities and their relations to construct background knowledge graphs (KGs). We apply an entity mention extraction and linking system (Wei et al., 2013) to extract mentions of three entity types (Disease, Chemical and Gene), which are the core data categories in the Comparative Toxicogenomics Database (CTD) (Davis et al., 2016), and obtain a Medical Subject Headings (MeSH) Unique ID for each mention. Based on the MeSH Unique IDs, we further link all entities to the CTD and extract 133 subtypes of relations such as Marker/Mechanism, Therapeutic, and Increase Expression. Figure 3 shows an example.

2.3 Link Prediction

After constructing the initial KGs from existing papers, we perform link prediction to enrich them. Both contextual text information and graph structure are important to represent an entity, thus we combine them to generate a rich representation for each entity. Based on the entity representations, we determine whether any two entities are semantically similar, and if so, we propagate the neighbors of one entity to the other. For example, in Figure 3, because Calcium and Zinc are similar in terms of contextual text information and graph structure, we predict two new neighbors for Calcium: CD14 molecule and neuropilin 2, which are neighbors of Zinc in the initial KGs.

We formulate the initial KGs as a list of tuples numbered from 0 to $\kappa$. Each tuple $(e_i^h, r_i, e_i^t)$ is composed of a head entity $e_i^h$, a tail entity $e_i^t$, and their relation $r_i$. Each entity $e_i$ may be involved in multiple tuples, and its one-hop connected neighbors are denoted as $N_{e_i} = [n_{i1}, n_{i2}, \ldots]$. $e_i$ is also associated with a context description $s_i$ which is randomly selected from the sentences where $e_i$ occurs. We randomly initialize vector representations $\mathbf{e}_i$ and $\mathbf{r}_i$ for $e_i$ and $r_i$, respectively.


[Figure 3: Biomedical Knowledge Extraction and Link Prediction Example (dashed lines are predicted links). Calcium and Zinc (Chemicals) share Gene neighbors such as caspase 3, cyclin D1, and AKT serine/threonine kinase 1 via relations like increase cleavage, affect cotreatment, affect reaction, and increase/decrease phosphorylation; two new neighbors, CD14 molecule and neuropilin 2, are predicted for Calcium. Contextual sentence: "So, Ca2+ possibly promoted caspases activation upstream of cytochrome c release, but inactivated caspase activity by calpain and/or fast depletion of ATP; whereas Zn2+ blocked the activation of procaspase-3 with no visible change in the level of cytochrome c, and the block possibly resulted from its direct inhibition on caspase-3 enzyme."]

Graph Structure Encoder: To capture the importance of each neighbor's features to $e_i$, we perform self-attention (Velickovic et al., 2018) and compute a weight distribution over $N_{e_i}$:

$e_i' = W_e \mathbf{e}_i, \quad n_{ij}' = W_e \mathbf{n}_{ij}$
$c_{ij} = \mathrm{LeakyReLU}(W_f (e_i' \oplus n_{ij}'))$
$c_i' = \mathrm{Softmax}(c_i)$

where $W_e$ is a linear transformation matrix applied to each entity, $W_f$ is the parameter of a single-layer feedforward network, and $\oplus$ denotes the concatenation operation between two matrices. Then we use $c_i'$ and $N_{e_i}$ to compute a structure-based context representation $\varepsilon_i = \sigma(\sum_j c_{ij}' n_{ij}')$, where $n_{ij} \in N_{e_i}$ and $\sigma$ is the Sigmoid function.

In order to capture various types of relations between $e_i$ and its neighbors, we further perform multi-head attention on each entity, based on multiple linear transformation matrices. Finally, we get a structure-based context representation $\tilde{\mathbf{e}}_i = [\varepsilon_i^0 \oplus \ldots \oplus \varepsilon_i^M]$, where $\varepsilon_i^m$ refers to the context representation obtained with the $m$-th head, and $\tilde{\mathbf{e}}_i$ is the concatenated representation based on the attention of all $M$ heads.

Contextual Text Encoder: Each entity $e$ is also associated with a context sentence $[w_1, \ldots, w_l]$. To incorporate the local context information, we first apply a bi-directional long short-term memory (LSTM) network (Graves and Schmidhuber, 2005) to get the encoder hidden states $H^s = [\mathbf{h}_1, \ldots, \mathbf{h}_l]$, where $\mathbf{h}_i$ represents the hidden state of $w_i$. Then we compute a bilinear attention weight for each word $w_i$: $\mu_i = \mathbf{e}^\top W_s \mathbf{h}_i$, $\mu' = \mathrm{Softmax}(\mu)$, where $W_s$ is a bilinear term. We finally get the context representation $\hat{\mathbf{e}} = \mu'^\top H^s$.

Gated Combination: To combine the graph-based representation $\tilde{\mathbf{e}}$ and the local context based representation $\hat{\mathbf{e}}$, we design a gate function to balance the two types of information:

$g_e = \sigma(\tilde{g}_e), \quad \mathbf{e} = g_e \odot \tilde{\mathbf{e}} + (1 - g_e) \odot \hat{\mathbf{e}}$

where $g_e$ is an entity-dependent gate function of which each element is in $[0, 1]$, $\tilde{g}_e$ is a learnable parameter for each entity $e$, $\sigma$ is the Sigmoid function, and $\odot$ is element-wise multiplication.
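The following sketch illustrates the contextual text encoder and the gated combination under the same assumptions; the dimensions and names are ours, and the bilinear pooling follows the equation $\hat{\mathbf{e}} = \mu'^\top H^s$ above.

```python
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    """Bi-LSTM context encoder, bilinear attention pooling, and the
    per-entity gate mixing the graph-based and text-based vectors."""
    def __init__(self, dim, num_entities):
        super().__init__()
        self.lstm = nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)
        self.w_s = nn.Parameter(torch.randn(dim, dim))         # bilinear W_s
        self.g = nn.Parameter(torch.zeros(num_entities, dim))  # \tilde{g}_e

    def forward(self, entity_id, e_vec, e_struct, context_words):
        # context_words: (1, sent_len, dim) embeddings of the context sentence
        h, _ = self.lstm(context_words)
        h = h.squeeze(0)                                   # H^s = [h_1, ..., h_l]
        mu = torch.softmax(e_vec @ self.w_s @ h.T, dim=0)  # mu' over words
        e_text = mu @ h                                    # \hat{e} = mu'^T H^s
        gate = torch.sigmoid(self.g[entity_id])            # g_e in [0, 1]^dim
        return gate * e_struct + (1 - gate) * e_text       # gated combination

m = ContextGate(dim=64, num_entities=100)
e = m(0, torch.randn(64), torch.randn(64), torch.randn(1, 12, 64))
```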

Training and Prediction: To optimize both entity and relation representations, following TransE (Bordes et al., 2013), we assume the relation between two entities can be interpreted as a translation operated on the entity representations, namely $\mathbf{h} + \mathbf{r} \approx \mathbf{t}$ if $(h, r, t)$ holds. Therefore, for each tuple $(e_i^h, r_i, e_i^t)$, we compute the distance score $F(e_i^h, r_i, e_i^t) = \| \mathbf{e}_i^h + \mathbf{r}_i - \mathbf{e}_i^t \|_2^2$ and use a margin loss to train the model:

$\mathrm{Loss} = \sum_{(e_i^h, r_i, e_i^t) \in K} \sum_{(e_i^{h'}, r_i, e_i^{t'}) \in K'} \max\bigl(0,\; \gamma + F(e_i^h, r_i, e_i^t) - F(e_i^{h'}, r_i, e_i^{t'})\bigr)$

where $(e_i^h, r_i, e_i^t)$ is a positive tuple, $(e_i^{h'}, r_i, e_i^{t'})$ is a negative tuple, and $\gamma$ is a margin. Negative tuples are generated by replacing either the head or the tail entity of a positive tuple with a randomly chosen different entity.
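A compact sketch of the TransE-style distance and margin loss follows, with $\gamma = 1$ and entity embedding size 64 as in Table 10; the negative tuple is passed in explicitly, and all names are illustrative.

```python
import torch

def transe_loss(e_h, r, e_t, e_h_neg, e_t_neg, gamma=1.0):
    """Margin loss with F(h, r, t) = ||h + r - t||_2^2 (gamma = 1, Table 10)."""
    def dist(h, t):
        return ((h + r - t) ** 2).sum(dim=-1)  # squared L2 distance
    return torch.clamp(gamma + dist(e_h, e_t) - dist(e_h_neg, e_t_neg), min=0).sum()

# Toy usage with random embeddings.
emb = torch.randn(3, 64, requires_grad=True)
h, r, t = emb                        # head, relation, tail embeddings
h_neg, t_neg = h, torch.randn(64)    # negative tuple: corrupt the tail entity
loss = transe_loss(h, r, t, h_neg, t_neg)
loss.backward()                      # gradients flow into emb
```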


Title: Snail transcription factor negatively regulates maspin tumor suppressor in human prostate cancer cells
Related Entities: nasopharyngeal carcinoma; diallyl disulfide

Abstract
Human (Neal et al., 2012): Background: Maspin, a putative tumor suppressor that is down-regulated in breast and prostate cancer, has been associated with decreased cell motility. Snail transcription factor is a zinc finger protein that is increased in breast cancer and is associated with increased tumor motility and invasion by induction of epithelial-mesenchymal transition (EMT). We investigated the molecular mechanisms by which Snail increases tumor motility and invasion utilizing prostate cancer cells. Methods: Expression levels were analyzed by RT-PCR and western blot analyses. Cell motility and invasion assays were performed, while Snail regulation and binding to maspin promoter was analyzed by luciferase reporter and chromatin immunoprecipitation (ChIP) assays. Results: Snail protein expression was higher in different prostate cancer cells lines as compared to normal prostate epithelial cells.
System: Background: Snail is a multi-functional protein that plays an important role in the pathogenesis of prostate cancer. However, it has been shown to be associated with poor prognosis. The purpose of this study was to investigate the effect of negatively on the expression of maspin in human nasopharyngeal carcinoma cell lines. Methods: Quantitative real-time PCR and western blot analysis were used to determine whether the demethylating agent was investigated by quantitative RT-PCR (qRT-PCR) and Western blotting. Results showed that the binding protein plays a significant role in the regulation of tumor growth and progression.
Post-edited by Human: Background: Snail is a multifunctional protein that plays an important role in the pathogenesis of prostate cancer. It has been shown associated with poor prognosis. The purpose of this study is to investigate the negative effect of on the expression of Maspin in human nasopharyngeal carcinoma cell lines. Methods: Quantitative RT-PCR (qRT-PCR) and western blot analyses were used to determine correlation of the two proteins expressions. Results showed that the binding protein plays a significant role in the regulation of tumor growth and progression.

Conclusion and Future work
Human: Collectively, our results indicate for the first time that Snail can negatively regulate maspin through direct promoter repression resulting in increased migration and invasion in prostate cancer cells. This study reveals a novel mechanism of how Snail may function and show the importance of therapeutic targeting of Snail signaling in future.
System: In summary, our study demonstrates that Snail negatively inhibited the expression of Maspin in human nasopharyngeal carcinoma cell lines and in vitro. Our results indicate that the combination of the demethylating agent might be a potential therapeutic target for the treatment of prostate cancer.
Post-edited by Human: In summary, our study in vitro demonstrates that Snail negatively inhibits the expression of Maspin in human nasopharyngeal carcinoma cell lines. Our results further indicate that Maspin might be a potential therapeutic target for the treatment of prostate cancer.

New Title
Human: Role of maspin in cancer (Berardi et al., 2013)
System: The role of nasopharyngeal carcinoma in the rat model of prostate cancer cells
Post-edited by Human: The role of Maspin in the rat model of nasopharyngeal carcinoma cells

Table 1: Comparison of Human and System Written Paper Elements (bold words are topically related entities; italic words show human edits)

After training, for each pair of indirectly connected entities $e_i, e_j$ and a relation type $r$, we compute a score $y$ to indicate the probability that $(e_i, r, e_j)$ holds, and obtain an enriched knowledge graph $K' = [(e_{\kappa+1}^h, r_{\kappa+1}, e_{\kappa+1}^t, y_{\kappa+1}), \ldots]$.
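A hypothetical sketch of this enrichment step is shown below: candidate links between indirectly connected entities are scored with the trained distance function. Converting the distance $F$ into a probability-like score $y$ via $\exp(-F)$ and thresholding it are our assumptions, not details given in the paper.

```python
import torch

def enrich_kg(entity_emb, rel_emb, candidates, threshold=0.5):
    """entity_emb: (num_entities, dim) tensor; rel_emb: (num_relations, dim);
    candidates: (i, r, j) index triples for indirectly connected pairs."""
    enriched = []
    for i, r, j in candidates:
        f = ((entity_emb[i] + rel_emb[r] - entity_emb[j]) ** 2).sum()
        y = torch.exp(-f).item()   # assumed mapping from distance to score y
        if y > threshold:          # assumed acceptance rule
            enriched.append((i, r, j, y))
    return enriched
```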

2.4 New Paper Writing

In this section, we use title-to-abstract generation as a case study to describe the details of our paper writing approach. The other tasks (abstract-to-conclusion and future work, and conclusion and future work-to-title) follow the same architecture.

Given a reference title $\tau = [w_1, \ldots, w_l]$, we apply the knowledge extractor (Section 2.2) to extract entities from $\tau$. For each entity, we retrieve a set of related entities from the enriched knowledge graph $K'$ after link prediction. We rank all the related entities by confidence scores and select up to 10 most related entities $E_\tau = [e_{\tau_1}, \ldots, e_{\tau_v}]$. Then we feed $\tau$ and $E_\tau$ together into the paper generation framework shown in Figure 2. The framework is based on a hybrid of a Mem2seq model (Madotto et al., 2018) and a pointer generator (Gu et al., 2016; See et al., 2017). It allows us to balance three sources at each time step during decoding: the probability of generating a token from the entire word vocabulary based on the language model, the probability of copying a word from the reference title, such as regulates in Table 1, and the probability of incorporating a related entity, such as Snail in Table 1. The output is a paragraph $Y = [y_1, \ldots, y_o]$.³

³ During training, we truncate both the input and the output to around 120 tokens to expedite training. We label words with frequency < 5 as out-of-vocabulary.

Reference Encoder: For each word in the reference title, we randomly embed it into a vector and obtain $\tau = [\mathbf{w}_1, \ldots, \mathbf{w}_l]$. Then, we apply a bi-directional Gated Recurrent Unit (GRU) encoder (Cho et al., 2014) on $\tau$ to produce the encoder hidden states $H = [\mathbf{h}_1, \ldots, \mathbf{h}_l]$.

Decoder Hidden State Initialization: Not all predicted entities are equally relevant to the title. For example, for the title in Table 1, we predict multiple related entities including nasopharyngeal carcinoma and diallyl disulfide, but nasopharyngeal carcinoma is more related because it is also a cancer related to snail transcription factor, while diallyl disulfide is less related because its anti-cancer mechanism is not closely related to maspin tumor suppressor. We propose to apply memory-attention networks to further filter the irrelevant ones. Recent approaches (Sukhbaatar et al., 2015; Madotto et al., 2018) show that, compared with soft attention, memory-based multi-hop attention is able to refine the attention weight of each memory cell to the query multiple times, drawing better correlations. Therefore, we apply a multi-hop attention mechanism to generate the initial decoder hidden state.

Given the set of related entities $E = [e_1, \ldots, e_v]$, we randomly initialize their vector representations $\mathbf{E} = [\mathbf{e}_1, \ldots, \mathbf{e}_v]$ and store them in memories. Then we use the last hidden state of the reference encoder $\mathbf{h}_l$ as the first query vector $q_0$, and iteratively compute the attention distribution over all memories and update the query vector:

$p_i^k = \nu_k^\top \tanh(W_q^k q_{k-1} + U_e^k \mathbf{e}_i + b_k), \quad q_k = p_k^\top \mathbf{E} + q_{k-1}$

where $k$ denotes the $k$-th hop among $\varphi$ hops in total.⁴ After $\varphi$ hops, we obtain $q_\varphi$ and take it as the initial hidden state of the GRU decoder.

Memory Network: To better capture the contribution of each entity $e_j$ to each decoding output, at each decoding step $i$ we compute an attention weight for each entity and apply a memory network to refine the weights multiple times. We take the hidden state $\mathbf{h}_i$ as the initial query $q_0 = \mathbf{h}_i$ and iteratively update it:

$p_j^k = \nu_k^\top \tanh(W_q^k q_{k-1} + U_e^k \mathbf{e}_j + W_c c_{ij} + b_k), \quad u_k^i = p_k'^\top \mathbf{E}, \quad q_k = u_k^i + q_{k-1}$

where $c_{ij} = \sum_{m=0}^{i-1} \beta_{mj}$ is an entity coverage vector, $\beta_i$ is the attention distribution of the last hop, $\beta_i = p_\psi'$, and $\psi$ is the total number of hops. We then obtain a final memory-based context vector for the set of related entities, $\chi_i = u_\psi^i$.

⁴ We set $\varphi = 3$ since it performs the best on the development set.
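A minimal PyTorch sketch of the multi-hop attention used to initialize the decoder state is given below ($\varphi = 3$ hops, decoder hidden size 256 per Table 10). The coverage term of the step-wise memory network is omitted for brevity, and all module names are ours.

```python
import torch
import torch.nn as nn

class MultiHopAttention(nn.Module):
    """Multi-hop memory attention: p^k = v_k^T tanh(W_q^k q_{k-1} + U_e^k e_i + b_k),
    q_k = p_k^T E + q_{k-1}, repeated for phi hops."""
    def __init__(self, dim, hops=3):
        super().__init__()
        self.hops = hops
        self.w_q = nn.ModuleList([nn.Linear(dim, dim) for _ in range(hops)])
        self.u_e = nn.ModuleList([nn.Linear(dim, dim, bias=False) for _ in range(hops)])
        self.v = nn.ParameterList([nn.Parameter(torch.randn(dim)) for _ in range(hops)])

    def forward(self, q0, memory):
        # q0: (dim,) last encoder hidden state; memory: (v, dim) entity vectors E
        q = q0
        for k in range(self.hops):
            scores = torch.tanh(self.w_q[k](q) + self.u_e[k](memory)) @ self.v[k]
            p = torch.softmax(scores, dim=0)  # attention over the v memories
            q = p @ memory + q                # weighted sum + residual query update
        return q  # q_phi: initial hidden state of the GRU decoder

att = MultiHopAttention(dim=256)
q_init = att(torch.randn(256), torch.randn(10, 256))
```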

Reference Attention: Our reference attention is similar to (Bahdanau et al., 2015; See et al., 2017) and aims to capture the contribution of each word in the reference title to the decoding output. At each time step $i$, the decoder receives the previous word embedding and generates the decoder state $\mathbf{h}_i$; the attention weight of each reference token is computed as:

$\alpha_{ij} = \varsigma^\top \tanh(W_h \mathbf{h}_i + W_\tau \mathbf{h}_j + W_c c_{ij} + b_\tau)$
$\alpha_i' = \mathrm{Softmax}(\alpha_i); \quad \phi_i = \alpha_i'^\top \mathbf{h}_j$

where $c_{ij} = \sum_{m=0}^{i-1} \alpha_{mj}$ is a reference coverage vector, which is the sum of attention distributions over all previous decoder time steps to reduce repetition (See et al., 2017), and $\phi_i$ is the reference context vector.

Generator: A particular word $w$ may occur multiple times in the reference title or in multiple related entities. Therefore, at each decoding step $i$, for each word $w$, we aggregate its attention weights from the reference attention and memory attention distributions: $P_\tau^i = \sum_{m | w_m = w} \alpha_{im}'$ and $P_e^i = \sum_{m | w \in e_m} \beta_{im}$, respectively. In addition, at each decoding step $i$, each word in the vocabulary may also be generated with a probability according to the language model. This probability is computed from the decoder state $\mathbf{h}_i$, the reference context vector $\phi_i$, and the memory context vector $\chi_i$: $P_{gen} = \mathrm{Softmax}(W_{gen}[\mathbf{h}_i; \phi_i; \chi_i] + b_{gen})$, where $W_{gen}$ and $b_{gen}$ are learnable parameters.

To combine $P_\tau$, $P_e$ and $P_{gen}$, we compute a gate $g_p$ as a soft switch between generating a word from the vocabulary and copying words from the reference title $\tau$ or the related entities $E$: $g_p = \sigma(W_p^\top \mathbf{h}_i + W_z^\top \mathbf{z}_{i-1} + b_p)$, where $\mathbf{z}_{i-1}$ is the embedding of the previously generated token at step $i-1$, and $W_p$, $W_z$, and $b_p$ are learnable parameters. We also compute a gate $\hat{g}_p$ as a soft switch between copying words from the reference text and from the related entities: $\hat{g}_p = \sigma(W_\phi^\top \phi_i + W_\chi^\top \chi_i + \hat{b}_p)$, where $W_\phi$, $W_\chi$, and $\hat{b}_p$ are learnable parameters.

The final probability of generating a token $z_i$ at decoding step $i$ is computed by:

$P(z_i) = g_p P_{gen} + (1 - g_p)\bigl(\hat{g}_p P_\tau + (1 - \hat{g}_p) P_e\bigr)$
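The final mixture is easy to state in code. The sketch below assumes the three distributions have already been projected onto a shared (extended) vocabulary, which is how pointer-generator implementations typically handle copying; the toy indices for "regulates" and "Snail" are illustrative.

```python
import torch

def final_distribution(p_gen, p_tau, p_e, g_p, g_hat):
    """P(z_i) = g_p * P_gen + (1 - g_p) * (g_hat * P_tau + (1 - g_hat) * P_e),
    where all three distributions are over the same vocabulary."""
    return g_p * p_gen + (1 - g_p) * (g_hat * p_tau + (1 - g_hat) * p_e)

# Toy usage: vocabulary of 10 tokens; copy distributions already scattered
# from title positions / related entities onto vocabulary indices.
vocab = 10
p_gen = torch.softmax(torch.randn(vocab), dim=0)
p_tau = torch.zeros(vocab); p_tau[3] = 1.0  # e.g., copy "regulates" from the title
p_e = torch.zeros(vocab); p_e[7] = 1.0      # e.g., copy the entity "Snail"
p = final_distribution(p_gen, p_tau, p_e,
                       g_p=torch.tensor(0.6), g_hat=torch.tensor(0.5))
assert torch.isclose(p.sum(), torch.tensor(1.0))  # still a valid distribution
```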


             # papers                                                   # avg entities   # avg predicted
             Title-to-   Abstract-to-Conclusion   Conclusion and        in title         related entities
             Abstract    and Future work          Future work-to-Title  per paper        per paper
Training      22,811        22,811                  15,902                 4.8                -
Development    2,095         2,095                   2,095                 5.6               6.1
Test           2,095         2,095                   2,095                 5.7               8.5

Table 2: Paper Writing Statistics

Model                                  Title-to-Abstract     Abstract-to-Conclusion   Conclusion and Future
                                                             and Future Work          Work-to-Title
                                       Perplexity  METEOR    Perplexity  METEOR       Perplexity  METEOR
Seq2seq (Bahdanau et al., 2015)           19.6       9.1        44.4       8.6           49.7       6.0
Editing Network (Wang et al., 2018b)      18.8       9.2        30.5       8.7           55.7       5.5
Pointer Network (See et al., 2017)       146.7       8.5        74.0       8.1           47.1       6.6
Our Approach (-Repetition Removal)        13.4      12.4        24.9      12.3           31.8       7.4
Our Approach                              11.5      13.0        18.3      11.2           14.8       8.9

Table 3: Automatic Evaluation on Paper Writing for Diagnostic Tasks (%). The Pointer Network can be viewed as our approach with the memory network removed, without repetition removal.

The loss function, combined with the coverage loss (See et al., 2017) for both the reference attention and the memory distribution, is:

$\mathrm{Loss} = \sum_i -\log P(z_i) + \lambda \sum_i \bigl( \min(\alpha_{ij}, c_{ij}) + \min(\beta_{ij}, c_{ij}) \bigr)$

where $P(z_i)$ is the prediction probability of the ground-truth token $z_i$, and $\lambda$ is a hyperparameter.

Repetition Removal: As in many other long text generation tasks (Suzuki and Nagata, 2017), repetition remains a major challenge (Foster and White, 2007; Xie, 2017). In fact, 11% of sentences in human-written abstracts include repeated entities, which may mislead the language model. Following the coverage mechanism proposed by (Tu et al., 2016; See et al., 2017), we use a coverage loss to prevent any entity in the reference input text or any related entity from receiving attention multiple times. We further design a new and simple masking method to remove repetition at test time: we apply beam search with beam size 4 to generate each output, and if a word is not a stop word or punctuation and has already been generated in the previous context, we do not choose it again in the same output.
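A minimal sketch of this test-time repetition mask follows; the stop-word list is a stand-in, since the paper does not specify one.

```python
# Test-time repetition mask: a candidate token is blocked if it is not a
# stop word or punctuation and has already appeared in the partial output.
import string

STOP_WORDS = {"the", "of", "a", "an", "and", "in", "to", "is", "was", "were"}

def allowed(token: str, generated: list) -> bool:
    if token in STOP_WORDS or all(c in string.punctuation for c in token):
        return True
    return token not in generated

print(allowed("maspin", ["snail", "regulates", "maspin"]))  # False: repeated entity
print(allowed("the", ["the", "expression"]))                # True: stop word
```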

3 Experiment

3.1 Data

We collect biomedical papers from the PMC Open Access Subset.⁵ To construct ground truth for new title prediction, if a human-written paper A cites a paper B, we assume the title of A is generated from B's conclusion and future work section. We construct background knowledge graphs from 1,687,060 papers, which include 30,483 entities and 875,698 relations. Table 2 shows the detailed data statistics. The hyperparameters of our model are presented in the Appendix.

⁵ ftp://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_package/
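A hypothetical sketch of how these ground-truth pairs could be assembled from citation links is shown below; the field names are illustrative, not taken from the paper.

```python
# Hypothetical assembly of (conclusion/future work, follow-on title) pairs:
# if paper A cites paper B, B's conclusion-and-future-work section is paired
# with A's title as a training example for new-title prediction.
def build_title_pairs(papers, citations):
    """papers: id -> {'title': str, 'conclusion': str};
    citations: list of (citing_id, cited_id) pairs."""
    pairs = []
    for a, b in citations:
        if a in papers and b in papers and papers[b].get("conclusion"):
            pairs.append((papers[b]["conclusion"], papers[a]["title"]))
    return pairs  # (input conclusion and future work, target title)
```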

3.2 Automatic Evaluation

Previous work (Liu et al., 2016; Li et al., 2016; Lowe et al., 2015) has shown that automatically evaluating long text generation is a major challenge. Following the story generation work (Fan et al., 2018), we use METEOR (Denkowski and Lavie, 2014) to measure topic relevance towards given titles and use perplexity to further evaluate the quality of the language model. The perplexity scores of our model are based on a language model⁶ learned on other PubMed papers (500,000 titles, 50,000 abstracts, 50,000 conclusion and future work sections) which are not used for training or testing in our experiment.⁷ The results are shown in Table 3. We can see that our framework outperforms all previous approaches.

3.3 Turing Test

Similar to (Wang et al., 2018b), we conduct Turing tests with a biomedical expert (non-native speaker) and a non-expert (native speaker). Each human judge is asked to compare a system output and a human-authored string, and select the better one.

⁶ https://github.com/pytorch/examples/tree/master/word_language_model

⁷ The perplexity scores of the language model are in the Appendix.


Task        Input                                          Output                       Domain Expert   Non-expert
End-to-End  Human Title (Different)                        Abstract (1st)                    10             30
End-to-End  Human Title (Same)                             Abstract (1st)                    30             16
End-to-End  System Abstract (Different)                    Conclusion and Future work        12              0
End-to-End  System Abstract (Same)                         Conclusion and Future work         8              8
End-to-End  System Conclusion and Future work (Different)  Title                             12              2
End-to-End  System Conclusion and Future work (Same)       Title                             12             25
End-to-End  System Title (Different)                       Abstract (2nd)                    14              4
Diagnostic  Human Abstract (Different)                     Conclusion and Future work        12             14
Diagnostic  Human Abstract (Same)                          Conclusion and Future work        24             20
Diagnostic  Human Conclusion and Future work (Different)   Title                              8             12
Diagnostic  Human Conclusion and Future work (Same)        Title                              2             10

Table 4: Turing Test Human Subject Passing Rates (%). Percentages show how often a human judge chooses our system's output over a human's when it is mixed with a human-authored string. If the output strings (e.g., abstracts) are based on the same input string (e.g., title), the Input condition is marked "Same", otherwise "Different".

BLEU1   BLEU2   BLEU3   BLEU4   ROUGE   TER
 59.6    58.1    56.7    55.4    73.3   35.2

Table 5: Evaluation on Human Post-Editing (%)

Table 4 shows the results on 50 pairs in each setting. We can see that PaperRobot-generated abstracts are chosen over human-written ones by the expert up to 30% of the time, conclusion and future work up to 24% of the time, and new titles up to 12% of the time. We do not observe that the domain expert performs significantly better than the non-expert, because they tend to focus on different aspects: the expert focuses on content (entities, topics, etc.) while the non-expert focuses on the language.

3.4 Human Post-Editing

In order to measure the effectiveness of PaperRobot as a writing assistant, we randomly select 50 paper abstracts generated by the system during the first iteration and ask the domain expert to edit them until he considers them informative and coherent. The BLEU (Papineni et al., 2002), ROUGE (Lin, 2004) and TER (Snover et al., 2006) scores comparing the abstracts before and after human editing are presented in Table 5. It took about 40 minutes for the expert to finish editing 50 abstracts. Table 1 includes a post-edited example. We can see that most edits are stylistic changes.

3.5 Analysis and Discussions

To better justify the function of each component, we conduct ablation studies by removing memory networks, link prediction, and repetition removal, respectively. The results are shown in Table 6. We can see that the approach without memory networks tends to diverge from the main topic, especially for generating long texts such as abstracts (detailed length statistics are shown in Table 8). From Table 6 we can see that the later parts of the abstract (Methods and Results) include topically irrelevant entities such as "imipramine", which is used to treat depression rather than human prostate cancer.

Link prediction successfully introduces new and topically related ideas, such as "RT-PCR" and "western blot", two methods for analyzing the expression level of the Snail protein, as also mentioned in the human-written abstract in Table 1. Table 7 shows more examples of entities which are related to the entities in input titles based on link prediction. We can see that the predicted entities are often genes or proteins which cause the disease mentioned in a given title, or other diseases from the same family.

Our simple beam-search-based masking method successfully removes some repeated words and phrases and thus produces more informative output. The plagiarism check in Table 9 shows that our model is creative: it is not simply copying from the human input.

3.6 Remaining Challenges

Our generation model still largely depends on the language model and extracted facts, and thus lacks knowledge reasoning. It generates a few incorrect abbreviations such as "Organophosphates (BA)", "chronic kidney disease (UC)" and "Fibrosis (DC)" because they appear rarely in the training data and thus their contextual representations are not reliable. It also generates some incorrect numbers (e.g., "The patients were divided into four groups: Group 1, Group B...") and pronouns (e.g., "A 63-year-old man was referred to our hospital ... she was treated with the use of the descending coronary artery").


Abstract
Without Memory Networks: Background: Snail has been reported to exhibit a variety of biological functions. In this study, we investigated the effect of negatively on maspin demethylation in human prostate cancer cells. Methods: Quantitative real-time PCR and western blot analysis were used to investigate the effects of the demethylating agent on the expression of the protein kinase (TF) gene promoter. Results: The results showed that the presence of a single dose of 50 µM in a dose-dependent manner, whereas the level of the BMP imipramine was significantly higher than that of the control group.
Without Link Prediction: Background: Snail has been shown to be associated with poor prognosis. In this study, we investigated the effect of negatively on the expression of maspin in human prostate cancer cells. Methods: Cells were treated with a single dose of radiotherapy for 24 h, and was used to investigate the significance of a quantitative factor for the treatment of the disease. Results: The remaining controls showed a significant increase in the G2/M phase of the tumor suppressor protein (p<0.05).
Without Repetition Removal: Background: Snail is a major health problem in human malignancies. However, the role of Snail on the expression of maspin in human prostate cancer cells is not well understood. The aim of this study was to investigate the effect of Snail on the expression of maspin in human prostate cancer cells. Methods: The expression of the expression of Snail and maspin was investigated using quantitative RT-PCR and western blot analysis. Results: The remaining overall survival (OS) and overall survival (OS) were analyzed.

Conclusion and Future work
Without Memory Networks: In summary, our study demonstrated that negatively inhibited the expression of the BMP imipramine in human prostate cancer cells. Our findings suggest that the inhibition of maspin may be a promising therapeutic strategy for the treatment.
Without Link Prediction: In summary, our results demonstrate that negatively inhibited the expression of maspin in human prostate cancer cells. Our findings suggest that the combination of radiotherapy may be a potential therapeutic target for the treatment of disease.
Without Repetition Removal: In summary, our results demonstrate that snail inhibited the expression of maspin in human prostatic cells. The expression of snail in PC-3 cells by snail, and the expression of maspin was observed in the presence of the expression of maspin.

New Title
Without Memory Networks: Protective effects of homolog on human breast cancer cells by inhibiting the Endoplasmic Reticulum Stress
Without Link Prediction: The role of prostate cancer in human breast cancer cells
Without Repetition Removal: The role of maspin and maspin in human breast cancer cells

Table 6: Ablation Test Results on the Same Title as in Table 1

Title: Pseudoachondroplasia/COMP translating from the bench to the bedside
Predicted Related Entities: osteoarthritis; skeletal dysplasia; thrombospondin-5

Title: Role of ceramide in diabetes mellitus: evidence and mechanisms
Predicted Related Entities: diabetes insulin ceramide; metabolic disease

Title: Exuberant clinical picture of Buschke-Fischer-Brauer palmoplantar keratoderma in bedridden patient
Predicted Related Entities: neoplasms; retinoids; autosomal dominant disease

Title: Relationship between serum adipokine levels and radiographic progression in patients with ankylosing spondylitis
Predicted Related Entities: leptin; rheumatic diseases; adiponectin; necrosis; DKK-1; IL-6-RFP

Table 7: More Link Prediction Examples (bold words are entities detected from titles)

         Abstract   Conclusion and Future Work   Title
System     112.4              88.1               16.5
Human      106.5             105.5               13.0

Table 8: The Average Number of Words of System and Human Output

Output        1-gram   2-gram   3-gram   4-gram   5-gram
Abstracts      58.3     20.1     8.03     3.60     1.46
Conclusions    43.8     12.5     5.52     2.58     1.28
Titles         20.1     1.31     0.23     0.06     0.00

Table 9: Plagiarism Check: Percentage (%) of n-grams in human input which appear in system-generated output for test data.

All of the system-generated titles are declarative sentences, while human-generated titles are often more engaging (e.g., "Does HPV play any role in the initiation or prognosis of endometrial adenocarcinomas?"). Human-generated titles often include more concrete and detailed ideas, such as "etumorType, An Algorithm of Discriminating Cancer Types for Circulating Tumor Cells or Cell-free DNAs in Blood", and even create new entity abbreviations such as etumorType in this example.

3.7 Requirements to Make PaperRobot Work: Case Study on NLP Domain

When a cool Natural Language Processing (NLP) system like PaperRobot is built, it is natural to ask whether she can benefit the NLP community itself. We re-build the system based on 23,594 NLP papers from the new ACL Anthology Network (Radev et al., 2013). For knowledge extraction we apply our previous system trained for the NLP domain (Luan et al., 2018). But the results are much less satisfactory compared to the biomedical domain.


Due to the small size of the data, the language model is not able to effectively copy out-of-vocabulary words and thus the output is often too generic. For example, given a title "Statistics based hybrid approach to Chinese base phrase identification", PaperRobot generates a fluent but uninformative abstract: "This paper describes a novel approach to the task of Chinese-base-phrase identification. We first utilize the solid foundation for the Chinese parser, and we show that our tool can be easily extended to meet the needs of the sentence structure.".

Moreover, compared to the biomedical domain, the types of entities and relations in the NLP domain are rather coarse-grained, which often leads to inaccurate prediction of related entities. For example, for an NLP paper title "Extracting molecular binding relationships from biomedical text", PaperRobot mistakenly extracts "prolog" as a related entity and generates an abstract "In this paper, we present a novel approach to the problem of extracting relationships among the prolog program. We present a system that uses a macromolecular binding relationships to extract the relationships between the abstracts of the entry. The results show that the system is able to extract the most important concepts in the prolog program.".

4 Related Work

Link Prediction. Translation-based approaches (Nickel et al., 2011; Bordes et al., 2013; Wang et al., 2014; Lin et al., 2015; Ji et al., 2015a) have been widely exploited for link prediction. Compared with these studies, we are the first to incorporate multi-head graph attention (Sukhbaatar et al., 2015; Madotto et al., 2018; Velickovic et al., 2018) to encourage the model to capture multi-aspect relevance among nodes. Similar to (Wang and Li, 2016; Xu et al., 2017), we enrich entity representations by combining the contextual sentences that include the target entity and its neighbors from the graph structure. This is the first work to incorporate new idea creation via link prediction into automatic paper writing.

Knowledge-driven Generation. Deep neural networks have been applied to generate natural language to describe structured knowledge bases (Duma and Klein, 2013; Konstas and Lapata, 2013; Flanigan et al., 2016; Hardy and Vlachos, 2018; Pourdamghani et al., 2016; Trisedya et al., 2018; Xu et al., 2018; Madotto et al., 2018; Nie et al., 2018), biographies based on attributes (Lebret et al., 2016; Chisholm et al., 2017; Liu et al., 2018; Sha et al., 2018; Kaffee et al., 2018; Wang et al., 2018a; Wiseman et al., 2018), and image/video captions based on background entities and events (Krishnamoorthy et al., 2013; Wu et al., 2018; Whitehead et al., 2018; Lu et al., 2018). To handle unknown words, we design an architecture similar to pointer-generator networks (See et al., 2017) and the copy mechanism (Gu et al., 2016). Some interesting applications include generating abstracts based on titles for the natural language processing domain (Wang et al., 2018b), and generating a poster (Qiang et al., 2016) or a science news blog title (Vadapalli et al., 2018) about a published paper. This is the first work on automatic writing of key paper elements for the biomedical domain, especially conclusion and future work, and follow-on paper titles.

5 Conclusions and Future Work

We build a PaperRobot who can predict related entities for an input title, write some key elements of a new paper (abstract, conclusion and future work), and predict a new title. Automatic evaluations and human Turing tests both demonstrate her promising performance. PaperRobot is merely an assistant to help scientists speed up scientific discovery and production. Conducting experiments is beyond her scope, and each of her current components still requires human intervention: constructed knowledge graphs cannot cover all technical details, predicted new links need to be verified, and paper drafts need further editing. In the future, we plan to develop techniques for extracting entities of more fine-grained types, and to extend PaperRobot to write related work and to predict authors, their affiliations, and publication venues.

Acknowledgments

The knowledge extraction and prediction components were supported by the U.S. NSF No. 1741634 and the Tencent AI Lab Rhino-Bird Gift Fund. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation hereon.

A Hyperparameters

Table 10 shows the hyperparameters of our model.

Models            Parameter           Value
Link Prediction   # Multi-head        8
                  Multi-head hidden   8
                  Entity embedding    64
                  LeakyReLU α         0.2
                  Margin loss γ       1
Paper Writing     Decoder hidden      256
                  Coverage loss λ     1
Both              Vocabulary size     32,553
                  Text embedding      128
                  Optimization        Adam (Hu et al., 2009)
                  Learning rate       0.001

Table 10: Hyperparameters

B Language model details

The perplexity scores of the language model are 96.24, 59.69, and 120.31 on titles, abstracts, and conclusions, respectively.


C Good Examples that Passed Turing Test

System Output: Efficacy and Safety of Artesunate in the Treatment of Uncomplicated Malaria: a Systematic Review and Meta-analysis.
Human Output: Low RBM3 Protein Expression Correlates with Clinical Stage, Prognostic Classification and Increased Risk of Treatment Failure in Testicular Non-Seminomatous Germ Cell Cancer.

System Output: Background The aim of the present study was to investigate the effect of Cnidium Lactone on the expression of Mutant and histone deacetylase (HDAC) inhibitors in human prostate cancer (PC). Material/Methods We evaluated the effects of Prostate Cancer on cell proliferation and invasion in vitro and in vivo. Cells were incubated with a single dose of 25 (50 mg/kg), and 10 (100 mg/kg/day), respectively. The primary endpoint was the ability of the mRNA and protein levels of transcription factor (VEGF).
Human Output: Background Cnidium Lactone is a natural coumarin compound that can inhibit a variety of cancer cell proliferation and induce cancer cell apoptosis. This experiment investigated the effect of cnidium Lactone on molecular marker expression in prostate cancer nude mice to study its effect in inducing apoptosis. Material/Methods We randomly and equally divided 30 male BALB/C nude mice inoculated with human prostate cancer cells PC-3 into a negative control group, a cyclophosphamide group (500 mg/Kg), and cnidium Lactone groups at 3 doses (280 mg/Kg, 140 mg/Kg, and 70 mg/Kg). The mice were weighed at 2 weeks after administration.

System Output: Abstract Rationale: Aliskiren is a rare disease characterized by a variety of hypertensive disorders. The aim of the present study was to evaluate the effectiveness of aliskiren, pharmacodynamics, and clinical outcomes in patients with hypertension. Methods We reviewed the medical records of ambulatory blood pressure (BP), kinetics, and high-sensitivity C-reactive protein (CRP) levels in the treatment of corneal tissue. We performed a retrospective review of the English literature search of PubMed, EMBASE, and Cochrane Library databases. The primary outcome was established by using a scoring system.
Human Output: The use of nanoparticles in medicine is an attractive proposition. In the present study, Zinc oxide and silver nanoparticles were evaluated for their antidiabetic activity. Fifty male albino rats with weight 120 ± 20 and age 6 months were used. Animals were grouped as follows: control; did not receive any type of treatment, diabetic; received a single intraperitoneal dose of streptozotocin (100 mg/kg), diabetic + Zinc oxide nanoparticles (ZnONPs), received single daily oral dose of 10 mg/kg ZnONPs in suspension, diabetic + silver nanoparticles (SNPs); received a single daily oral dose of SNP of 10 mg/kg in suspension and diabetic + insulin; received a single subcutaneous dose of 0.6 units/50 g body.

Table 11: Good Examples that Passed Turing Test


System Output: In conclusion, our study demonstrated that HOTAIR transcript expression in NSCLC cells. These results suggest that the overexpression of metastasis may play a role in regulating tumor progression and invasion. Further studies are needed to elucidate the molecular mechanisms involved in the development of cancer.
Human Output: VWF is an autocrine/paracrine effector of signal transduction and gene expression in ECs that regulates EC adhesiveness for MSCs via activation of p38 MAPK in ECs.

System Output: In summary, the present study demonstrated that BBR could suppress tubulointerstitial fibrosis in NRK 52E cells. In addition, the effects of action on the EMT and HG of DN in the liver cell lines, and the inhibition of renal function may be a potential therapeutic agent for the treatment of diabetic mice. Further studies are needed to elucidate the mechanisms underlying the mechanism of these drugs in the future.
Human Output: We characterised KGN cells as a malignant tumour model of GCTs. Continuously cultivated KGN cells acquire an aggressive phenotype, confirmed by the analysis of cellular activities and the expression of biomarkers. More strikingly, KGN cells injected under the skin were metastatic with nodule formation occurring mostly in the bowel. Thus, this cell line is a good model for analysing GCT progression and the mechanisms of metastasis.

System Output: In summary, the present study demonstrated that Hydrogen alleviates neuronal apoptosis in SAH rats. These results suggest that the Akt/GSK3 signaling pathway may be a novel therapeutic target for the treatment of EBI.
Human Output: In reproductive-age women with ovarian endometriosis, the transcriptional factor SOX2 and NANOG are over expression. Future studies is need to determine their role in pathogenesis of ovarian endometriosis.

System Output: In conclusion, the present study demonstrated that DNA methylation and BMP-2 expression was associated with a higher risk of developing Wnt/-catenin pathway in OA chondrocytes. These results suggest that the SOST of Wnt signaling pathways may be a potential target for the treatment of disease.
Human Output: Our novel data strongly suggest that BMP-2 signaling modulates SOST transcription in OA through changes in Smad 1/5/8 binding affinity to the CpG region located upstream of the TSS in the SOST gene, pointing towards the involvement of DNA methylation in SOST expression in OA.

System Output: The role of cancer stem cells to trastuzumab-based and breast cancer cell proliferation, migration, and invasion.
Human Output: Long-term supplementation of decaffeinated green tea extract does not modify body weight or abdominal obesity in a randomized trial of men at high risk for Prostate cancer.

Table 12: Good Examples that Passed Turing Test


References

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural machine translation by jointly learning to align and translate. In Proceedings of the 3rd International Conference on Learning Representations.

Rossana Berardi, Francesca Morgese, Azzurra Onofri, Paola Mazzanti, Mirco Pistelli, Zelmira Ballatore, Agnese Savini, Mariagrazia De Lisa, Miriam Caramanti, Silvia Rinaldi, et al. 2013. Role of maspin in cancer. Clinical and Translational Medicine.

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. Translating embeddings for modeling multi-relational data. In Advances in Neural Information Processing Systems.

Andrew Chisholm, Will Radford, and Ben Hachey. 2017. Learning to generate one-sentence biographies from Wikidata. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.

Kyunghyun Cho, Bart Van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing.

Allan Peter Davis, Cynthia J Grondin, Robin J Johnson, Daniela Sciaky, Benjamin L King, Roy McMorran, Jolene Wiegers, Thomas C Wiegers, and Carolyn J Mattingly. 2016. The Comparative Toxicogenomics Database: update 2017. Nucleic Acids Research.

Michael Denkowski and Alon Lavie. 2014. Meteor Universal: Language specific translation evaluation for any target language. In Proceedings of the 9th Workshop on Statistical Machine Translation.

Daniel Duma and Ewan Klein. 2013. Generating natural language from linked data: Unsupervised template extraction. In Proceedings of the 10th International Conference on Computational Semantics.

Angela Fan, Mike Lewis, and Yann Dauphin. 2018. Hierarchical neural story generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.

Jeffrey Flanigan, Chris Dyer, Noah A. Smith, and Jaime Carbonell. 2016. Generation from Abstract Meaning Representation using tree transducers. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Jacob G. Foster, Andrey Rzhetsky, and James A. Evans. 2015. Tradition and innovation in scientists' research strategies. American Sociological Review.

Mary Ellen Foster and Michael White. 2007. Avoiding repetition in generated text. In Proceedings of the 11th European Workshop on Natural Language Generation.

Alex Graves and Jürgen Schmidhuber. 2005. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. In Proceedings of the 2005 IEEE International Joint Conference on Neural Networks.

Jiatao Gu, Zhengdong Lu, Hang Li, and Victor O.K. Li. 2016. Incorporating copying mechanism in sequence-to-sequence learning. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.

Hardy Hardy and Andreas Vlachos. 2018. Guided neural language generation for abstractive summarization using Abstract Meaning Representation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

Chonghai Hu, Weike Pan, and James T. Kwok. 2009. Accelerated gradient methods for stochastic optimization and online learning. In Advances in Neural Information Processing Systems.

Guoliang Ji, Shizhu He, Liheng Xu, Kang Liu, and Jun Zhao. 2015a. Knowledge graph embedding via dynamic mapping matrix. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing.

Ming Ji, Qi He, Jiawei Han, and Scott Spangler. 2015b. Mining strong relevance between heterogeneous entities from unstructured biomedical data. Data Mining and Knowledge Discovery, 29:976–998.

Lucie-Aimée Kaffee, Hady Elsahar, Pavlos Vougiouklis, Christophe Gravier, Frédérique Laforest, Jonathon Hare, and Elena Simperl. 2018. Learning to generate Wikipedia summaries for underserved languages from Wikidata. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies.

Ioannis Konstas and Mirella Lapata. 2013. A global model for concept-to-text generation. Journal of Artificial Intelligence Research.

Niveda Krishnamoorthy, Girish Malkarnenkar, Raymond J Mooney, Kate Saenko, and Sergio Guadarrama. 2013. Generating natural-language video descriptions using text-mined knowledge. In Proceedings of the 27th AAAI Conference on Artificial Intelligence.

Rémi Lebret, David Grangier, and Michael Auli. 2016. Neural text generation from structured data with application to the biography domain. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.

Jiwei Li, Michel Galley, Chris Brockett, Georgios Spithourakis, Jianfeng Gao, and Bill Dolan. 2016. A persona-based neural conversation model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.

Chin-Yew Lin. 2004. ROUGE: A package for automatic evaluation of summaries. In Proceedings of Text Summarization Branches Out.

Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. 2015. Learning entity and relation embeddings for knowledge graph completion. In Proceedings of the 29th AAAI Conference on Artificial Intelligence.

Chia-Wei Liu, Ryan Lowe, Iulian Serban, Mike Noseworthy, Laurent Charlin, and Joelle Pineau. 2016. How NOT to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.

Tianyu Liu, Kexiang Wang, Lei Sha, Baobao Chang, and Zhifang Sui. 2018. Table-to-text generation by structure-aware seq2seq learning. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence.

Ryan Lowe, Nissan Pow, Iulian Serban, and Joelle Pineau. 2015. The Ubuntu Dialogue Corpus: A large dataset for research in unstructured multi-turn dialogue systems. In Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue.

Di Lu, Spencer Whitehead, Lifu Huang, Heng Ji, and Shih-Fu Chang. 2018. Entity-aware image caption generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

Yi Luan, Luheng He, Mari Ostendorf, and Hannaneh Hajishirzi. 2018. Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

Andrea Madotto, Chien-Sheng Wu, and Pascale Fung. 2018. Mem2Seq: Effectively incorporating knowledge bases into end-to-end task-oriented dialog systems. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.

Corey L. Neal, Veronica Henderson, Bethany N. Smith, Danielle McKeithen, Tisheeka Graham, Baohan T. Vo, and Valerie A. Odero-Marah. 2012. Snail transcription factor negatively regulates maspin tumor suppressor in human prostate cancer cells. BMC Cancer.

Maximilian Nickel, Volker Tresp, and Hans-Peter Kriegel. 2011. A three-way model for collective learning on multi-relational data. In Proceedings of the 28th International Conference on Machine Learning.

Feng Nie, Jinpeng Wang, Jin-Ge Yao, Rong Pan, and Chin-Yew Lin. 2018. Operation-guided neural networks for high fidelity data-to-text generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics.

Steven Pinker. 2014. Why academics stink at writing. The Chronicle of Higher Education.

Nima Pourdamghani, Kevin Knight, and Ulf Hermjakob. 2016. Generating English from Abstract Meaning Representations. In Proceedings of the 9th International Natural Language Generation Conference.

Yuting Qiang, Yanwei Fu, Yanwen Guo, Zhi-Hua Zhou, and Leonid Sigal. 2016. Learning to generate posters of scientific papers. In Proceedings of the 30th AAAI Conference on Artificial Intelligence.

Dragomir R. Radev, Pradeep Muthukrishnan, Vahed Qazvinian, and Amjad Abu-Jbara. 2013. The ACL Anthology Network corpus. Language Resources and Evaluation, pages 1–26.

Abigail See, Peter J. Liu, and Christopher D. Manning. 2017. Get to the point: Summarization with pointer-generator networks. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.

Lei Sha, Lili Mou, Tianyu Liu, Pascal Poupart, Sujian Li, Baobao Chang, and Zhifang Sui. 2018. Order-planning neural text generation from structured data. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence.

Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. A study of translation edit rate with targeted human annotation. In Proceedings of the Association for Machine Translation in the Americas.

Sainbayar Sukhbaatar, Jason Weston, Rob Fergus, et al. 2015. End-to-end memory networks. In Advances in Neural Information Processing Systems.

Jun Suzuki and Masaaki Nagata. 2017. Cutting-off redundant repeating generations for neural abstractive summarization. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics.

Bayu Distiawan Trisedya, Jianzhong Qi, Rui Zhang, and Wei Wang. 2018. GTR-LSTM: A triple encoder for sentence generation from RDF data. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.

Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, and Hang Li. 2016. Modeling coverage for neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.

Raghuram Vadapalli, Bakhtiyar Syed, Nishant Prabhu, Balaji Vasan Srinivasan, and Vasudeva Varma. 2018. When science journalism meets artificial intelligence: An interactive demonstration. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

Richard Van Noorden. 2014. Scientists may be reaching a peak in reading habits. Nature.

Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2018. Graph attention networks. In Proceedings of the 6th International Conference on Learning Representations.

Qingyun Wang, Xiaoman Pan, Lifu Huang, Boliang Zhang, Zhiying Jiang, Heng Ji, and Kevin Knight. 2018a. Describing a knowledge base. In Proceedings of the 11th International Conference on Natural Language Generation.

Qingyun Wang, Zhihao Zhou, Lifu Huang, Spencer Whitehead, Boliang Zhang, Heng Ji, and Kevin Knight. 2018b. Paper abstract writing through editing mechanism. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics.

Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge graph embedding by translating on hyperplanes. In Proceedings of the 28th AAAI Conference on Artificial Intelligence.

Zhigang Wang and Juan-Zi Li. 2016. Text-enhanced representation learning for knowledge graph. In Proceedings of the 25th International Joint Conference on Artificial Intelligence.

Chih-Hsuan Wei, Hung-Yu Kao, and Zhiyong Lu. 2013. PubTator: a web-based text mining tool for assisting biocuration. Nucleic Acids Research.

Spencer Whitehead, Heng Ji, Mohit Bansal, Shih-Fu Chang, and Clare Voss. 2018. Incorporating background knowledge into video description generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

Sam Wiseman, Stuart Shieber, and Alexander Rush. 2018. Learning neural templates for text generation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.

Qi Wu, Chunhua Shen, Peng Wang, Anthony Dick, and Anton van den Hengel. 2018. Image captioning and visual question answering based on attributes and external knowledge. IEEE Transactions on Pattern Analysis and Machine Intelligence.

Ziang Xie. 2017. Neural text generation: A practical guide. arXiv preprint arXiv:1711.09534.

Jiacheng Xu, Kan Chen, Xipeng Qiu, and Xuanjing Huang. 2017. Knowledge graph representation with jointly structural and textual encoding. In Proceedings of the 26th International Joint Conference on Artificial Intelligence.

Kun Xu, Lingfei Wu, Zhiguo Wang, Yansong Feng, and Vadim Sheinin. 2018. SQL-to-text generation with graph-to-sequence model. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.