Text mining
Explosion
exponential increase
some things are constant
“graph calculus”
=
~45 seconds per paper
Information retrieval
find the relevant papers
user-specified query
“yeast AND cell cycle”
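A Boolean query like the one above can be answered from an inverted index. The following is a minimal sketch, not the tool shown in the talk: the three abstracts are made up, and `build_index` and `search_and` are hypothetical helper names.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token.strip(".,;()")].add(doc_id)
    return index

def search_and(index, docs, terms):
    """Return ids of documents matching every term (multi-word terms as phrases)."""
    hits = set(docs)
    for term in terms:
        term = term.lower()
        if " " in term:
            # Phrase terms are checked by substring search on the raw text.
            hits &= {d for d in hits if term in docs[d].lower()}
        else:
            hits &= index.get(term, set())
    return sorted(hits)

docs = {  # made-up example abstracts
    1: "Regulation of the cell cycle in yeast.",
    2: "Yeast metabolism under stress.",
    3: "The cell cycle in human fibroblasts.",
}
index = build_index(docs)
result = search_and(index, docs, ["yeast", "cell cycle"])
print(result)
```

Only the document that mentions both "yeast" and "cell cycle" survives the intersection.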
stemming
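Stemming maps inflected forms of a word to a shared stem so that a query for one form also matches the others. The sketch below is a deliberately crude suffix stripper for illustration only. It is not the Porter algorithm, and the suffix list is an assumption chosen for this example.

```python
# Longest suffixes first, so "ations" is tried before "ation" and "s".
SUFFIXES = ("ations", "ation", "ating", "ated", "ates", "ate",
            "ing", "ed", "es", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

forms = ["phosphorylated", "phosphorylation", "phosphorylates", "phosphorylating"]
stems = {toy_stem(w) for w in forms}
print(stems)
```

All four forms collapse to one stem here. A real stemmer handles far more cases, and a toy like this one will over- or under-stem many words.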
dynamic query expansion
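One way to picture query expansion is to replace each query term, at query time, with an OR group of the term and its known synonyms. This is a sketch under stated assumptions: the synonym table is illustrative, and `expand_query` and `matches` are hypothetical names, not part of any tool from the talk.

```python
# Illustrative synonym table (assumption for this example).
SYNONYMS = {
    "yeast": ["saccharomyces cerevisiae", "s. cerevisiae"],
    "cell cycle": ["cell-cycle", "cell division cycle"],
}

def expand_query(terms, synonyms=SYNONYMS):
    """Turn each AND-ed term into an OR group: the term plus its synonyms."""
    return [[term] + synonyms.get(term.lower(), []) for term in terms]

def matches(text, expanded):
    """True if every group has at least one alternative present in the text."""
    text = text.lower()
    return all(any(alt.lower() in text for alt in group) for group in expanded)

expanded = expand_query(["yeast", "cell cycle"])
hit = matches("Cell division cycle genes of Saccharomyces cerevisiae", expanded)
miss = matches("Yeast metabolism under stress", expanded)
print(hit, miss)
```

The first text matches although it contains neither literal query term. The second contains "yeast" but nothing from the "cell cycle" group.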
ranking
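Ranking orders the retrieved papers by how well they fit the query. The slide does not say which scoring scheme is meant. The sketch below uses plain TF-IDF as a stand-in, with made-up documents and a hypothetical `rank` function.

```python
import math
from collections import Counter

def rank(docs, query_terms):
    """Order document ids by the summed TF-IDF weight of the query terms."""
    n = len(docs)
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    df = Counter()  # number of documents containing each token
    for tokens in tokenized.values():
        df.update(set(tokens))
    scores = {}
    for d, tokens in tokenized.items():
        tf = Counter(tokens)
        scores[d] = sum(tf[t] * math.log(n / df[t]) for t in query_terms if df[t])
    return sorted(scores, key=lambda d: scores[d], reverse=True)

docs = {  # made-up example texts
    "a": "yeast cell cycle yeast cyclin",
    "b": "yeast metabolism",
    "c": "human cell cycle",
}
order = rank(docs, ["yeast", "cyclin"])
print(order)
```

Document "a" ranks first because it mentions "yeast" twice and also contains the rarer term "cyclin".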
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
no tool will find that
Entity recognition
identify the substance(s)
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009
comprehensive lexicon
orthographic variation
“black list”
manual correction
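The ingredients named on these slides (a lexicon, tolerance for orthographic variation, and a black list) can be combined into a simple dictionary-based tagger. This is a minimal sketch, not the Reflect implementation: the lexicon holds only the names from the example sentence plus one illustrative entry, "STEP", that collides with an ordinary English word, and `name_pattern` and `tag_entities` are hypothetical helpers.

```python
import re

LEXICON = ["Clb2", "Cdc28", "Cdk1", "Swe1", "Cdc5", "STEP"]  # tiny illustrative lexicon
BLACK_LIST = {"step"}  # names too ambiguous to tag (assumption for this example)

def name_pattern(name):
    """Regex tolerating case changes and an optional hyphen/space before trailing digits."""
    letters, digits = re.fullmatch(r"([A-Za-z]+)(\d*)", name).groups()
    core = re.escape(letters) + (r"[-\s]?" + digits if digits else "")
    # Lookarounds keep the match from starting or ending inside a longer token.
    return re.compile(r"(?<![A-Za-z0-9])" + core + r"(?![A-Za-z0-9])", re.IGNORECASE)

def tag_entities(text):
    """Return (offset, matched text, lexicon name) tuples in order of appearance."""
    found = []
    for name in LEXICON:
        if name.lower() in BLACK_LIST:
            continue  # black-listed names are never tagged
        for m in name_pattern(name).finditer(text):
            found.append((m.start(), m.group(0), name))
    return sorted(found)

sentence = ("Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated "
            "Swe1 and this modification served as a priming step to promote subsequent "
            "Cdc5-dependent Swe1 hyperphosphorylation and degradation")
names = [name for _, _, name in tag_entities(sentence)]
variants = [name for _, _, name in tag_entities("cdc-28 and CLB 2")]
print(names, variants)
```

The word "step" in the sentence is not tagged because of the black list, while spelling variants such as "cdc-28" and "CLB 2" still resolve to their lexicon names.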
still too much to read
Information extraction
formalize the facts
co-occurrence
global statistical analysis
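Co-occurrence extraction counts how often two entities are mentioned in the same document and then asks, across the whole corpus, whether that is more often than chance. The slides do not name a specific statistic. The sketch below uses pointwise mutual information as one possible choice, on made-up per-document entity lists.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_scores(doc_entities):
    """Score each entity pair by log2 of observed over expected co-mention frequency."""
    n = len(doc_entities)
    single = Counter()  # documents mentioning each entity
    pair = Counter()    # documents mentioning both entities of a pair
    for entities in doc_entities:
        unique = sorted(set(entities))
        single.update(unique)
        pair.update(combinations(unique, 2))
    return {
        (a, b): math.log2(count * n / (single[a] * single[b]))
        for (a, b), count in pair.items()
    }

corpus = [  # made-up entity mentions, one list per document
    ["Cdc28", "Swe1"],
    ["Cdc28", "Swe1", "Clb2"],
    ["Cdc28", "Cdc5"],
    ["Clb2"],
]
scores = cooccurrence_scores(corpus)
for pair_key, score in sorted(scores.items()):
    print(pair_key, round(score, 3))
```

A positive score means the pair is mentioned together more often than independent mentions would predict. A score of zero means exactly as often as expected.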
NLP: Natural Language Processing
parsing individual sentences
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
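The bracketed sentence above is nested chunk markup: each `[label ...]` wraps a phrase, and the verb phrase connecting the top-level chunks carries the relation. As a rough sketch, such markup can be read into a tree with a small recursive parser. The labels are copied verbatim from the slide and treated as opaque strings, and `parse_chunks` is a hypothetical helper rather than part of the original pipeline.

```python
def parse_chunks(s):
    """Parse '[label text [label text]]' markup into (label, children) tuples.

    Plain text between chunks is kept as stripped strings.
    """
    pos = 0

    def parse_seq(inside_chunk):
        nonlocal pos
        items, buf = [], []

        def flush():
            text = "".join(buf).strip()
            if text:
                items.append(text)
            buf.clear()

        while pos < len(s):
            ch = s[pos]
            if ch == "[":
                flush()
                pos += 1
                start = pos
                # The label runs up to the first whitespace or bracket.
                while pos < len(s) and not s[pos].isspace() and s[pos] not in "[]":
                    pos += 1
                label = s[start:pos]
                items.append((label, parse_seq(True)))
            elif ch == "]" and inside_chunk:
                pos += 1
                flush()
                return items
            else:
                buf.append(ch)
                pos += 1
        flush()
        return items

    return parse_seq(False)

markup = ("[nxexpr The expression of [nxgene the cytochrome genes "
          "[nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]")
tree = parse_chunks(markup)
print(tree)
```

At the top level the tree has three items: the expression chunk, the connecting text "is controlled by", and the chunk for HAP1. That is the subject, relation and object shape a relation extractor would look for.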
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation
store in a database
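Once facts are formalized they can be stored as rows and queried. The sketch below uses an in-memory SQLite database with an assumed three-column relation schema. The table layout is hypothetical, and the rows restate the two example sentences from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per extracted (source, relation, target) fact.
conn.execute("CREATE TABLE relation (source TEXT, type TEXT, target TEXT)")
facts = [
    ("Cdc28", "phosphorylates", "Swe1"),
    ("HAP1", "controls expression of", "CYC1"),
    ("HAP1", "controls expression of", "CYC7"),
]
conn.executemany("INSERT INTO relation VALUES (?, ?, ?)", facts)
conn.commit()

# Which genes does HAP1 control?
rows = conn.execute(
    "SELECT target FROM relation WHERE source = ? ORDER BY target", ("HAP1",)
).fetchall()
targets = [row[0] for row in rows]
print(targets)
```

With the facts in a table, questions that would otherwise mean rereading papers become single queries.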
then the fun begins :-)
Acknowledgments
NLP pipeline: Jasmin Saric, Rossitza Ouzounova, Isabel Rojas, Peer Bork
Reflect: Heiko Horn, Sune Frankild, Evangelos Pafilis, Sven Haag, Michael Kuhn, Peer Bork, Reinhardt Schneider, Sean O’Donoghue