Search PubMed with R, Part 3
Query PubMed titles for systemic lupus erythematosus with the R package RISmed [1]
# Type the following in the R console:
library(RISmed)
lupus <- EUtilsSummary('lupus[Ti] erythematosus[ti] systemic[Ti]', retmax=200)
# retmax refers to the maximum number of records to retrieve; the default is 1000.
fetch.lupus <- EUtilsGet(lupus)
fetch.lupus
# Results: PubMed query: lupus[Ti] AND erythematosus[ti] AND systemic[Ti] Records: 200
lupus.tit <- ArticleTitle(fetch.lupus)
lupus.tit[1:10] # view the first 10 titles
# export results to a text file
write(lupus.tit, file="lupusRISmedTi.txt")
References
1- RISmed package: Stephanie Kovalchik (2013). RISmed: Download content from NCBI databases. R package version 2.1.0. http://CRAN.R-project.org/package=RISmed
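The "Results" line above shows that PubMed joins the three tagged terms with AND. As a minimal sketch, the same query string can be assembled from a vector of terms. The helper name build_title_query is hypothetical (it is not part of RISmed), and the assumption is that passing the explicit AND form to EUtilsSummary behaves like the space-separated form used on the slide.

```r
# Hypothetical helper (not part of RISmed): tag each term with a PubMed
# field tag and join the tagged terms with AND.
build_title_query <- function(terms, field = "Ti") {
  paste(paste0(terms, "[", field, "]"), collapse = " AND ")
}

query <- build_title_query(c("lupus", "erythematosus", "systemic"))
print(query)

# The string could then be passed on (needs RISmed and a network connection):
# lupus <- EUtilsSummary(query, retmax = 200)
```

Building the string separately makes it easy to swap the terms for another disease without retyping the field tags.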
Query PubMed titles for systemic lupus erythematosus using RISmed
View results of the exported text file
Export results to a text file with the R command line:
write(lupus.tit, file="lupusRISmedTi.txt") # export title results as a text file and open the file in Excel or any other valid text editor
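Before passing the exported file to another program, it can help to confirm that it holds one title per line. The sketch below uses two made-up placeholder titles and a temporary folder, so it runs without RISmed; with real data, lupus.tit would take the place of titles.

```r
# Placeholder titles (made up for illustration), standing in for lupus.tit.
titles <- c("First made-up title about lupus",
            "Second made-up title about lupus")

# Write to a temporary folder so that no existing file is overwritten.
out.file <- file.path(tempdir(), "lupusRISmedTi.txt")
write(titles, file = out.file)

# Read the file back: each title should come back as one line.
titles.back <- readLines(out.file)
print(length(titles.back))
print(identical(titles.back, titles))
```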
Find the Title Verb Relation with Reverb
ReVerb [1] is an open information extraction program, distributed as an executable jar and developed by the University of Washington's Turing Center.
• It is important to note that Reverb depends on Java, so it is not an R program.
• Reverb is powerful and provides useful information about the relation structure of a text. It is relatively easy to use and runs very fast.
• In our case we will apply Reverb to our title results text file.
Reference:
1- @inproceedings{ReVerb2011, author = {Anthony Fader and Stephen Soderland and Oren Etzioni}, title = {Identifying Relations for Open Information Extraction}, booktitle = {Proceedings of the Conference of Empirical Methods in Natural Language Processing ({EMNLP} '11)}, year = {2011}, month = {July 27-31}, address = {Edinburgh, Scotland, UK} }
Install Reverb
You can download the latest ReVerb jar from http://reverb.cs.washington.edu/reverb-latest.jar
This executable jar file is easy to run from the MS-DOS command prompt.
At https://github.com/knowitall/reverb/ you can find out how to use Reverb. It provides the following example, which illustrates what it does:
“ReVerb takes raw text as input, and outputs (argument1, relation phrase, argument2) triples. For example, given the sentence "Bananas are an excellent source of potassium," ReVerb will extract the triple (bananas, be source of, potassium).”
In order to run Reverb you need to have Java installed on your computer. You can install Java from https://www.java.com/en/download/
Use of Reverb
Place the reverb-latest.jar file and the result file “lupusRISmedTi.txt” in the same folder.
Figure: example of the 2 files in the same folder (which we named Reverb-Java)
1- Open the MS-DOS command prompt (cmd) and change to the folder (Reverb-Java in our example) containing both files: reverb-latest.jar and lupusRISmedTi.txt
2- Type the following command line to view the results on the console:
java -Xmx512m -jar reverb-latest.jar lupusRISmedTi.txt
Results are displayed in the MS-DOS window
Use of Reverb - export the results to an xls file
3- Type the following command line to export the results to a file:
java -Xmx512m -jar reverb-latest.jar lupusRISmedTi.txt > ReverbLupusRISmedTi.txt
(The name given to the file was ReverbLupusRISmedTi.txt. You can use another name, or even give the file an xls extension by typing ReverbLupusRISmedTi.xls; the content is still tab-separated text.)
Open the Reverb result file ReverbLupusRISmedTi.txt with MS Excel
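The same command can also be launched from the R console instead of the MS-DOS window. This is only a sketch: the helper reverb_args is hypothetical, and the call is skipped unless Java, the jar, and the title file are all found in the working directory.

```r
# Hypothetical helper: build the argument vector for the java command
# shown above (heap size, jar file, input text file).
reverb_args <- function(input, jar = "reverb-latest.jar", heap = "512m") {
  c(paste0("-Xmx", heap), "-jar", jar, input)
}

args <- reverb_args("lupusRISmedTi.txt")
print(paste("java", paste(args, collapse = " ")))

# Run Reverb only if Java and both files are available; stdout is
# redirected to the result file, like ">" on the command line.
if (nzchar(Sys.which("java")) &&
    file.exists("reverb-latest.jar") &&
    file.exists("lupusRISmedTi.txt")) {
  system2("java", args, stdout = "ReverbLupusRISmedTi.txt")
}
```

Keeping the whole workflow in one R script avoids switching between the R console and the command prompt.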
Reverb output
The Reverb output has 18 columns (see results in the Excel file).
The most interesting are:
Col 3 (Col C): Argument1
Col 4 (Col D): Verb relation phrase
Col 5 (Col E): Argument2
(Col 12 refers to the confidence that this extraction is correct, and col 2 refers to the sentence number where the extraction came from)
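The columns listed above can also be selected in R rather than in Excel. The sketch below builds three made-up rows of 18 tab-separated fields (placeholder data, not real Reverb output), keeps columns 2, 3, 4, 5 and 12, and filters on confidence. The 0.5 cut-off and the column names are assumptions chosen for illustration.

```r
# Hypothetical helper: one made-up row with 18 tab-separated fields; only
# the columns named on the slide carry meaningful values, the rest hold "x".
make_row <- function(sentence, arg1, rel, arg2, conf) {
  fields <- rep("x", 18)
  fields[c(2, 3, 4, 5, 12)] <- c(sentence, arg1, rel, arg2, conf)
  paste(fields, collapse = "\t")
}

raw.lines <- c(
  make_row(1, "drug A", "is associated with", "outcome B", 0.91),
  make_row(2, "marker C", "predicts", "flare D", 0.42),
  make_row(3, "therapy E", "reduces", "damage G", 0.77)
)

# Parse the tab-separated text, with quoting and comment characters disabled.
d <- read.table(text = raw.lines, sep = "\t", quote = "", comment.char = "",
                header = FALSE, stringsAsFactors = FALSE)

# Keep sentence number, the triple, and the confidence score.
triples <- d[, c(2, 3, 4, 5, 12)]
names(triples) <- c("sentence", "arg1", "relation", "arg2", "confidence")

# Assumed cut-off: keep extractions with confidence of at least 0.5.
confident <- triples[triples$confidence >= 0.5, ]
print(confident)
```

With the real file, read.table would be pointed at ReverbLupusRISmedTi.txt instead of the text argument.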
Reverb Results
Results of the first 5 rows (Excel) from columns 3-5:
1- childhood-onset systemic lupus erythematosus | is associated with | ethnicity
2- renal involvement | are lower in | ACE inhibitor-treated patients
3- Prednisone | induced | two-way myocardial development
4- Acetylated histones | contribute to | the immunostimulatory potential of Neutrophil Extracellular Traps
5- clinical practice | monitor the impact of | systemic lupus erythematosus
Note: on the slide, blue refers to argument 1, white to the verb relation, and orange to argument 2; here the three parts are separated by "|".
Prepare Reverb Results data for R Wordcloud
# use the read.table script (from reference 1) as follows:
d <- read.table('ReverbLupusRISmedTi.txt', quote='', comment.char='', allowEscapes=FALSE, sep='\t', header=FALSE, as.is=TRUE, stringsAsFactors=FALSE)
# make sure the data is a data frame (read.table already returns one)
e <- as.data.frame(d)
# merge columns (3-5) into a single text sentence
f = paste(e$V3, e$V4, e$V5)
f[1:3] # view the first 3 lines
# [1] "childhood-onset systemic lupus erythematosus is associated with ethnicity"
# [2] "renal involvement are lower in ACE inhibitor-treated patients"
# [3] "Prednisone induced two-way myocardial development"
Reference:
1- John Mount. Please stop using Excel-like formats to exchange data. December 7th, 2012.
Represent Reverb Results in R Wordcloud
library(tm)
my.corpus <- Corpus(VectorSource(f))
summary(my.corpus)
inspect(my.corpus[1:3])
my.corpus <- tm_map(my.corpus, removeWords, stopwords("english"))
# my.corpus <- tm_map(my.corpus, stemDocument)
myTdm <- TermDocumentMatrix(my.corpus, control = list(wordLengths=c(1,Inf)))
myTdm
# A term-document matrix (140 terms, 26 documents)
# Non-/sparse entries: 163/3477
# Sparsity : 96%
# Maximal term length: 22
# Weighting : term frequency (tf)
findFreqTerms(myTdm, lowfreq=2)
# [1] "associated" "damage" "distinct" "erythematosus"
# [5] "increased" "independently" "lupus" "systemic"
termFrequency <- rowSums(as.matrix(myTdm))
termFrequency <- subset(termFrequency, termFrequency>=10)
m <- as.matrix(myTdm)
wordFreq <- sort(rowSums(m), decreasing=TRUE) # This yields the word frequency
library(wordcloud)
library(RColorBrewer) # provides brewer.pal
set.seed(375)
pal1 <- brewer.pal(6,"Dark2")
wordcloud(words=names(wordFreq), freq=wordFreq, scale=c(2,.9), min.freq=1, random.order=FALSE, colors=pal1)
R Wordcloud of Reverb Results
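To see what the word frequencies behind the wordcloud represent, the counting can be reproduced in base R on a few sentences. This is a sketch on made-up sentences, not the tm implementation: it only lowercases and splits on non-letters, and it does not remove stop words.

```r
# Made-up sentences standing in for the merged Reverb triples (vector f).
docs <- c("lupus is associated with ethnicity",
          "prednisone induced damage",
          "damage is associated with lupus")

# Lowercase and split each sentence into words on any non-letter run.
tokens <- strsplit(tolower(docs), "[^a-z]+")

# One document id per token, so that table() can cross terms with documents.
doc.id <- rep(seq_along(tokens), lengths(tokens))
tdm <- table(term = unlist(tokens), doc = doc.id)

# Row sums give the total count of each term across all documents.
word.freq <- sort(rowSums(tdm), decreasing = TRUE)
print(tdm)
print(word.freq)
```

Each row of the table is a term and each column a document, which is the same layout as the term-document matrix used above; the sorted row sums play the role of wordFreq.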