bibliological data science and drug discovery
![Page 1: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/1.jpg)
Bibliological data science and drug discovery
Knowing the knowns*
Effectively Harnessing the World’s Literature To Inform Rational Compound Design - ACS National Meeting, Philadelphia, Aug 21-24, 2016
Jeremy J Yang
Translational Informatics Division, School of Medicine, University of New Mexico
Integrative Data Science Lab, School of Informatics & Computing, Indiana University
*phrase borrowed from Edgar Jacoby, Janssen.
![Page 2: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/2.jpg)
In science, luck favors the prepared. - Louis Pasteur
The main thing was not to . . . "foul up." - The Right Stuff, by Tom Wolfe, about John Glenn.
![Page 3: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/3.jpg)
Overview of talk
● Formulation of problem
● Resources and examples:
○ TIN-X, Target Importance and Novelty Explorer (& IDG)
○ Chem2Bio2RDF
○ OPDDR, Open Phenotypic Drug Discovery Resource
○ DrugCentral
![Page 4: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/4.jpg)
Formulation of problem
● "World's Literature" redefined by online revolution
● Rational Compound Design = improving our odds
● For a given research question, what are the known knowns?
● Connect the dots and weigh the evidence from the global knowledge graph.
![Page 5: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/5.jpg)
TIN-X
![Page 6: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/6.jpg)
TIN-X Target Importance & Novelty Explorer
● Bibliometric application developed for Illuminating the Druggable Genome (IDG) project
● Text mining from Novo Nordisk Center for Protein Research (U. Copenhagen) lab of Lars Juhl Jensen.
● Algorithm and client developed at UNM (Cristian Bologa, Daniel Cannon)
● Disease Ontology (DO) classification
● Drug Target Ontology (DTO) protein classification
![Page 7: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/7.jpg)
Illuminating the Druggable Genome (IDG)
Knowledge Management Center PI: Tudor Oprea, MD, PhD
pharos.nih.gov
![Page 8: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/8.jpg)
TIN-X
http://newdrugtargets.org
![Page 9: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/9.jpg)
TIN-X
![Page 10: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/10.jpg)
TIN-X
http://newdrugtargets.org
![Page 11: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/11.jpg)
Target Novelty:
$F_k = 1 / T_k$
● $T_k$ = # targets in paper $k$
● $F_k$ = fractional score of paper $k$
● for papers where $T_k > 0$
$N_i = 1 / \sum_k F_k$
● $N_i$ = novelty of target $i$
● sum over papers $k$ where target $i$ is mentioned
Target-Disease Importance:
$F_k = 1 / (T_k \cdot D_k)$
● $T_k$ = # targets in paper $k$
● $D_k$ = # diseases in paper $k$
● $F_k$ = fractional score of paper $k$
$I_{ij} = \sum_k F_k$
● $I_{ij}$ = importance of target $i$ for disease $j$
● sum over papers $k$ where both are mentioned
Target Importance and Novelty Explorer (TIN-X), Daniel Cannon, Jeremy Yang, Stephen Mathias, Oleg Ursu, Subramani Mani, Anna Waller, Stephan Schürer, Lars Juhl Jensen, Larry Sklar, Cristian Bologa, and Tudor Oprea (manuscript in preparation).
TIN-X
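The novelty and importance scores defined on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the TIN-X implementation: the function names, the representation of a paper as a pair of target and disease sets, and the toy `PAPERS` data are all assumptions made for the example.

```python
from collections import defaultdict

def novelty(papers):
    """N_i = 1 / sum_k F_k, with F_k = 1 / T_k, over papers mentioning target i."""
    totals = defaultdict(float)
    for targets, _diseases in papers:
        targets = set(targets)
        if not targets:  # the score is defined only for papers where T_k > 0
            continue
        f_k = 1.0 / len(targets)
        for target in targets:
            totals[target] += f_k
    return {target: 1.0 / total for target, total in totals.items()}

def importance(papers):
    """I_ij = sum_k F_k, with F_k = 1 / (T_k * D_k), over papers mentioning both."""
    totals = defaultdict(float)
    for targets, diseases in papers:
        targets, diseases = set(targets), set(diseases)
        if not targets or not diseases:  # paper must mention both kinds of entity
            continue
        f_k = 1.0 / (len(targets) * len(diseases))
        for target in targets:
            for disease in diseases:
                totals[(target, disease)] += f_k
    return dict(totals)

# Hypothetical toy corpus: each paper is (targets mentioned, diseases mentioned).
PAPERS = [
    ({"T1"}, {"D1"}),
    ({"T1", "T2"}, {"D1"}),
    ({"T2"}, {"D1", "D2"}),
    ({"T1"}, set()),
]

if __name__ == "__main__":
    print(novelty(PAPERS))
    print(importance(PAPERS))
```

Under these definitions a target mentioned in fewer papers, or mostly in papers that list many targets, accumulates a smaller sum and therefore a higher novelty score.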
![Page 12: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/12.jpg)
TIN-X Target Importance & Novelty Explorer
● Text mining is a valuable tool for monitoring literature, filtering and ranking, and detecting trends.
● Automation can infer patterns regarding community trends and consensus.
● Interactive visualization tools help navigate big data.
● Good big data text miners care about small data too!
![Page 13: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/13.jpg)
TIN-X Key contributors
Cristian Bologa, Daniel Cannon, Lars Juhl Jensen
![Page 14: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/14.jpg)
Chem2Bio2RDF
![Page 15: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/15.jpg)
● 24 sources, 52 datasets, 78M triples
● Semantically linked
● Chen, B, et al, BMC Bioinformatics (2010).
● Chen, B, et al, PLoS Comp Bio (2012).
● Fu, G, et al, BMC Bioinfo (2016).
● Related projects: Bio2RDF, LOD
http://chem2bio2rdf.org
![Page 16: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/16.jpg)
Classes: biological, chemical, chemogenomics, literature, phenotype, systems, disease, pathway, polypharmacology, PPI, side effect
Sources: BindingDB, BindingMOAD, IU, ChEBI, ChEMBL, CTD, DCDB, DIP, DrugBank, HGNC, HPRD, KEGG, MATADOR, OMIM, PDBe, PDSP, PharmGKB, PubChem, PubMed, Reactome, SIDER, TTD, UniProt
![Page 17: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/17.jpg)
![Page 18: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/18.jpg)
Linked Open Data (LOD)
http://linkeddata.org/
![Page 19: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/19.jpg)
Chem2Bio2RDF apps: (1) SLAP (2012), (2) Metapaths (2016)
![Page 20: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/20.jpg)
● Data semantics essential for integration of heterogeneous sources
● Strong evidence requires strong semantics
● Semantic Web Technologies common framework enabling -- but not assuring -- community progress
● Chem2Bio2RDF v2.0 to leverage major community advances (esp. Open PHACTS)
● Data ecosystems, coop-tition & prisoner's dilemma
![Page 21: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/21.jpg)
Key contributors
Bin Chen, Ying Ding, David Wild
![Page 22: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/22.jpg)
OPDDR
![Page 23: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/23.jpg)
OPDDR
Open Phenotypic Drug Discovery Resource
![Page 24: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/24.jpg)
https://ncats.nih.gov/expertise/preclinical/pd2
![Page 25: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/25.jpg)
OPDDR collaboration
![Page 26: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/26.jpg)
Example: OIDD HeLa cell based assay
Integrated RDF
bioassay:AID1117350 skos:exactMatch oidd_assay:17 .
bioassay:AID1117350 dcterms:source source:ID846 ; dcterms:title "Increased chromatin condensation in HeLa cells-IC50"@en .
bioassay:AID1117350 rdf:type bao:BAO_0002786 .
bioassay:AID1117350 rdf:type bao:BAO_0000010 .
bioassay:AID1117350 rdf:type bao:BAO_0000219 .
endpoint:SID170464897_AID1117349 vocabulary:PubChemAssayOutcome vocabulary:active ; sio:has-value "0.0656"^^xsd:float ; a bao:BAO_0000190 ; rdfs:label "IC50"@en .
substance:SID170464897 skos:exactMatch chembl_molecule:CHEMBL1483 .
chembl_assay:OIDD00017 cco:hasCellLine chembl_cell_line:CHEMBL3308376 .
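To show how such cross-source links can be followed, here is a minimal Python sketch that holds a subset of this slide's triples as plain tuples and looks up objects by subject and predicate. It deliberately avoids any RDF library. The helper names `build_index` and `objects` are hypothetical, and a real deployment would presumably query a triple store instead.

```python
from collections import defaultdict

# A subset of the slide's triples, transcribed as (subject, predicate, object).
TRIPLES = [
    ("bioassay:AID1117350", "skos:exactMatch", "oidd_assay:17"),
    ("bioassay:AID1117350", "dcterms:source", "source:ID846"),
    ("bioassay:AID1117350", "rdf:type", "bao:BAO_0002786"),
    ("bioassay:AID1117350", "rdf:type", "bao:BAO_0000010"),
    ("bioassay:AID1117350", "rdf:type", "bao:BAO_0000219"),
    ("substance:SID170464897", "skos:exactMatch", "chembl_molecule:CHEMBL1483"),
    ("chembl_assay:OIDD00017", "cco:hasCellLine", "chembl_cell_line:CHEMBL3308376"),
]

def build_index(triples):
    """Index objects by (subject, predicate) for constant-time lookups."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].append(obj)
    return index

def objects(index, subject, predicate):
    """Return all objects for a subject/predicate pair (empty list if none)."""
    return list(index.get((subject, predicate), []))

INDEX = build_index(TRIPLES)

if __name__ == "__main__":
    # Follow the equivalence link from the PubChem assay to the OIDD assay.
    print(objects(INDEX, "bioassay:AID1117350", "skos:exactMatch"))
    # Follow the equivalence link from the PubChem substance to ChEMBL.
    print(objects(INDEX, "substance:SID170464897", "skos:exactMatch"))
```

The `skos:exactMatch` statements are what connect records from different sources, so a lookup on that predicate is the basic step in linking an assay or substance to its counterpart elsewhere.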
![Page 27: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/27.jpg)
D2D builds apps, tools and solutions for knowledge discovery powered by fast, scalable network analytics and rigorous semantics.
d2discovery.com
Predictive Phenotypic Profiler (P3) prototype
![Page 28: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/28.jpg)
openphacts.org
![Page 29: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/29.jpg)
OPDDR
● OPDDR phenotypic assays have been linked and integrated via community semantics to both phenotypic (cell lines) and molecular (genomic/protein targets) entities
● New phenotypic knowledge domain offers additional value in drug discovery and pharmacological informatics
● Open PHACTS excellent, well suited platform
![Page 30: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/30.jpg)
DrugCentral
![Page 31: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/31.jpg)
DrugCentral
● DrugCentral is a free, open, curated resource about approved drugs, designed for research
● Compounds, products, labels, targets, IDs, names
● DrugCentral developed over several years at UNM
● DrugCentral recently released with new interface
● License: CC-BY-SA
http://drugcentral.org
![Page 32: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/32.jpg)
http://drugcentral.org
![Page 33: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/33.jpg)
http://drugcentral.org
![Page 34: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/34.jpg)
DrugCentral
● Free, open, accurate, comprehensive drug reference for biomolecular and biomedical informatics research
Compounds 4444
Products 84787
Synonyms 20522
Structures 4231
Targets 3651
Bioactivities 15620
MoA 3484
SNOMED 45349
![Page 35: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/35.jpg)
"DrugCentral: online drug compendium", Oleg Ursu, Jayme Holmes, Jeffrey Knockel, Cristian Bologa, Jeremy Yang, Stephen Mathias, Stuart Nelson, Tudor Oprea (manuscript submitted).
![Page 36: Bibliological data science and drug discovery](https://reader034.vdocuments.net/reader034/viewer/2022050719/58ed67491a28ab4e428b456d/html5/thumbnails/36.jpg)
In Conclusion
● New resources continue to emerge and evolve, providing opportunities for knowledge driven drug discovery
● Community standards → more intelligent web
● Adapt to new data environment for success
● Private + public data must be integrated to
○ Be prepared (like Pasteur)
○ Not "foul up" (like Glenn)