Hypothesis Fusion to Improve the Odds of Successful Drug
Repurposing
Alexander Tropsha, Charles Schmitt, Eugene Muratov, UNC-Chapel Hill
Weifan Zheng, NCCU
Nabarun Dasgupta, Epidemico
Information resources for bioactive chemicals are abundant and growing
FDA approved labels for marketed drugs
Over 18 million citations from MEDLINE and other life science
journals for biomedical articles back to the 1950s
European Public Assessment Reports (EPAR): reflect the scientific conclusion reached by the Committee for
Medicinal Products for Human Use (CHMP) at the end of the centralised
evaluation process.
The DrugBank database combines detailed drug data (8200+ drug entries) with comprehensive drug target
information
The FDA Safety Information and Adverse Event Reporting
Program
FDA data for five liver enzyme endpoints for
Drug interactions with cytochrome P450 isoforms
Cytochrome P450 Drug Interaction Table
FDA Orange Book of Approved Drug Products
“Potential Safety Issue” data
“Drug Interactions” table
FDA New Drug Application documents
eMC provides electronic Summaries of Product
Characteristics
Compound Assay data for proteins and cytotoxicity
Integrated Chemical-Bioactivity Data
Modified from a slide provided by Julie Barnes, Biowisdom
Data Science and Data Cycle
Predictive data models & tools
Experimental Design
Data Analysis and Modeling
Structured Data Repository
Data collection, curation, integration, and structuring (ontology)
Literature data
Electronic Databases
Text Mining
Lab collections
Unstructured text: Facebook, Twitter, Other Social Media
Disease
Effect
Experimental Validation
Decision support
Data reproducibility and data curation are critical, otherwise:
BD2K = Bogus Data to Knonsense
Data set curation workflows: Trust but Verify!
Fourches D. et al. J. Chem. Inf. Model., 2010, 50, 1189
Fourches D. et al. Nat. Chem. Biol. 2015, 11, 535
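One check that a "trust but verify" curation workflow typically includes is duplicate analysis. The following is a minimal sketch only, assuming every record already carries a standardized structure key; the record layout, the `key` values, and the function name `merge_duplicates` are hypothetical and are not taken from the cited papers.

```python
from collections import defaultdict

def merge_duplicates(records):
    """Group records by structure key; keep concordant ones, flag conflicts.

    records: iterable of (key, activity_label) pairs, where key is an
    assumed, already-standardized structure identifier (hypothetical).
    """
    labels_by_key = defaultdict(set)
    for key, label in records:
        labels_by_key[key].add(label)

    curated, conflicts = {}, []
    for key, labels in labels_by_key.items():
        if len(labels) == 1:
            # All duplicates agree: keep a single record.
            curated[key] = next(iter(labels))
        else:
            # Duplicates disagree: set aside for manual verification.
            conflicts.append(key)
    return curated, sorted(conflicts)

# Toy records invented for illustration.
records = [
    ("cpd-A", "active"), ("cpd-A", "active"),
    ("cpd-B", "inactive"),
    ("cpd-C", "active"), ("cpd-C", "inactive"),
]
curated, conflicts = merge_duplicates(records)
```

Concordant duplicates collapse to one record, while conflicting ones are held back rather than silently averaged or dropped.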
Disease gene signatures
Disease related genes or proteins
Text/database mining
Network mining
PubMed/ChemoText
CTD
HMDB
Disease related proteins
cmap
ChemoText
New hypothesis about connectivity between chemicals and diseases
Binding data
Target related ligands
Functional data
QSAR
Predictive models
Database mining
Structural hypothesis: “putative drug candidates”
Hypotheses fusion
New testable hypotheses with higher confidence
Disease-Target Association
Hajjo et al. Chemocentric Informatics Approach to Drug Discovery. J Med Chem. 2012, 55(12):5704-19
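The fusion step can be pictured as counting how many independent lines of evidence support each candidate. This is a minimal sketch under stated assumptions: the source names, compound identifiers, the `min_support` threshold, and the function `fuse_hypotheses` are all hypothetical, and simple vote counting stands in for whatever weighting the authors actually used.

```python
from collections import Counter

def fuse_hypotheses(sources, min_support=2):
    """Rank candidates by the number of independent sources proposing them.

    sources: dict mapping a source name to a collection of candidate IDs.
    Returns (candidate, support) pairs, highest support first.
    """
    support = Counter()
    for candidates in sources.values():
        for candidate in set(candidates):  # count each source at most once
            support[candidate] += 1
    kept = [(c, n) for c, n in support.items() if n >= min_support]
    return sorted(kept, key=lambda item: (-item[1], item[0]))

# Toy hit lists invented for illustration.
sources = {
    "qsar_hits": {"cpd-1", "cpd-2", "cpd-3"},
    "cmap_hits": {"cpd-2", "cpd-3", "cpd-4"},
    "text_mining": {"cpd-3", "cpd-5"},
}
fused = fuse_hypotheses(sources)
```

Candidates proposed by only one source are dropped; those backed by several sources come out on top as the higher-confidence hypotheses.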
QSAR modeling and Virtual Screening: Hit identification in external libraries
~10^6 – 10^9 molecules
VIRTUAL SCREENING
CHEMICAL STRUCTURES
CHEMICAL DESCRIPTORS
PROPERTY/ACTIVITY
PREDICTIVE QSAR MODELS
INACTIVES (confirmed inactives)
QSAR MAGIC
HITS (confirmed actives)
CHEMICAL DATABASE
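The flow from structures to descriptors to a predictive model to screening hits can be sketched with a toy k-nearest-neighbour classifier. This is an illustration only: sets of "on" fingerprint bits with Tanimoto similarity are an assumed stand-in for the descriptors used in the talk, and the data and the function names `tanimoto`, `knn_predict`, and `virtual_screen` are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def knn_predict(query, training, k=3):
    """Majority vote (1 = active, 0 = inactive) over the k nearest neighbours."""
    neighbours = sorted(training, key=lambda t: tanimoto(query, t[0]),
                        reverse=True)[:k]
    votes = sum(label for _, label in neighbours)
    return 1 if votes * 2 > len(neighbours) else 0

def virtual_screen(library, training, k=3):
    """Return the names of library compounds predicted active."""
    return [name for name, fp in library.items()
            if knn_predict(fp, training, k) == 1]

# Toy training set: (fingerprint bits, activity label), invented for illustration.
training = [
    ({1, 2, 3, 4}, 1), ({1, 2, 3, 5}, 1), ({2, 3, 4, 6}, 1),
    ({7, 8, 9}, 0), ({7, 8, 10}, 0), ({8, 9, 11}, 0),
]
library = {"lib-1": {1, 2, 3}, "lib-2": {7, 8}, "lib-3": {2, 3, 4, 5}}
hits = virtual_screen(library, training)
```

In practice the predicted hits would still need experimental confirmation, which is what separates "HITS (confirmed actives)" from mere predictions.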
5-HT6 receptor QSAR models & QSAR-based VS
Source: PDSP Ki-DB
Dataset: 196 cps. (102 Actives, Ki < 10 µM; 94 Inactives, Ki ≥ 10 µM)
Virtual screening: 5-HT6 predictor applied to 59 K cps.; 300 VS Hits (“Actives”)
Model statistics
[Chart: CCR_evs (0.0 to 1.0) by model; series: kNN-Dragon Model, kNN-Dragon Random, CBA-SG Model, CBA-SG Random]
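Classification QSAR models of this kind are commonly scored with the correct classification rate (CCR), the average of sensitivity and specificity. Assuming that is the statistic plotted here, a minimal sketch follows; the function name `ccr` and the toy labels are hypothetical.

```python
def ccr(y_true, y_pred):
    """Correct classification rate: mean of sensitivity and specificity.

    y_true, y_pred: equal-length sequences of 1 (active) / 0 (inactive).
    """
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("CCR needs at least one active and one inactive")
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (true_pos / positives + true_neg / negatives)

# Toy external-set labels and predictions, invented for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
model_ccr = ccr(y_true, y_pred)
```

Because each class contributes equally, a model that labels everything "active" scores 0.5, which is why models built on randomized labels are expected to cluster near that value.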
The connectivity map
Input: Signature (Biological state 1 vs. Control)
Step 1: upload signature
Step 2: query the cmap Database
Step 3: list of correlated compounds
Output: High correlation / Null / Low correlation
Lamb, J. et al. Science, 313, 1929-1935 (2006)
Lamb, J. Nat. Rev. Cancer 7, 54-60 (2007)
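The three steps can be mimicked with a deliberately simplified score. The published connectivity map uses a Kolmogorov-Smirnov-based statistic; the mean-rank score below is an assumed stand-in chosen for brevity, and the gene names, profiles, and the functions `connectivity_score` and `query` are hypothetical.

```python
def connectivity_score(ranked_genes, up, down):
    """Simplified score in [-1, 1] for one compound profile.

    ranked_genes: genes ordered from most up- to most down-regulated
    by the compound (needs at least two genes). up/down: query signature.
    """
    n = len(ranked_genes)
    # Map rank to a position from +1 (top of list) to -1 (bottom).
    position = {g: 1.0 - 2.0 * i / (n - 1) for i, g in enumerate(ranked_genes)}

    def mean_position(genes):
        values = [position[g] for g in genes if g in position]
        return sum(values) / len(values) if values else 0.0

    return (mean_position(up) - mean_position(down)) / 2.0

def query(profiles, up, down):
    """Score every profile and list compounds from high to low score."""
    scores = {name: connectivity_score(ranked, up, down)
              for name, ranked in profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy profiles invented for illustration.
profiles = {
    "cpd-mimic": ["g1", "g2", "g3", "g4", "g5", "g6"],
    "cpd-reverse": ["g6", "g5", "g4", "g3", "g2", "g1"],
}
results = query(profiles, up={"g1", "g2"}, down={"g5", "g6"})
```

A compound whose profile mirrors the signature scores near +1, while one that reverses it scores near -1; for repurposing, the reversing compounds are usually the interesting ones.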
Querying the cmap
Alzheimer’s disease gene signatures: (S1), (S2)
Upload signature; Query the cmap; List of compounds
cmap SCORE scale: 1.00 to -1.00
97 COMMON HITS with S1
106 COMMON HITS with S2
S1: Hata, R. et al., Biochem. Biophys. Res. Commun. 284, 310 (2001).
S2: Ricciarelli, R. et al., IUBMB Life 56, 349 (2004).
Chemocentric Informatics
QSAR FILTER: WDI DATABASE, 59 K compounds; 300 5-HT6 Active HITS
cmap FILTER: cmap DATABASE, 6.1 K individual instances; 881 instances with S1; 861 instances with S2
73 COMMON HITS with S1 & S2
CONSENSUS HYPOTHESES: 34 Higher Confidence Hits
Further selection: Antipsychotics, Antidepressants, Calcium Channel Blockers, Selective Estrogen Receptor Modulators (SERMs)
Exploring PubMed as one of the largest Chemical Biology Databases: the ChemoText Project
• 2008 Medline baseline: 16,880,015 records
• 6,635,344 records had subject chemicals
Subject Chemical: 134,184 distinct
Diseases: 4,865 distinct
Proteins: 61,329 distinct
Drug Effects: 7,761 distinct
9,360,330 relationships
5,395,144 relationships
20,466,335 relationships
9,088,747 relationships
13,157,701 relationships
http://chemotext.mml.unc.edu/
Baker, N., Hemminger, B. J Biomed Inform. 2010 Aug;43(4):510-9
Swanson’s ABC approach to drug discovery via text mining*
A (Chemicals): Magnesium
B (Intermediate Terms): Vasodilation, Spreading cortical depression, Platelet aggregation
C (Disease): Migraine
A-B and B-C: Relationships established through co-occurrence of terms
A-C: Deduced relationship
*Swanson DR. Medical literature as a potential source of new knowledge. Bull Med Libr Assoc 1990;78(1):29–37
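The ABC idea reduces to finding B terms that co-occur with A in some documents and with C in others, while A and C never appear together. A minimal sketch, assuming each document has already been reduced to a set of indexed terms; the toy documents and the functions `cooccurrence` and `abc_candidates` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence(documents):
    """Map each term to the set of terms it shares a document with."""
    links = defaultdict(set)
    for terms in documents:
        for a, b in combinations(set(terms), 2):
            links[a].add(b)
            links[b].add(a)
    return links

def abc_candidates(documents, a_term, c_terms):
    """Return {C: sorted shared B terms} for C terms not directly linked to A."""
    links = cooccurrence(documents)
    found = {}
    for c in c_terms:
        if c in links[a_term]:
            continue  # direct A-C link already known, nothing to deduce
        shared = (links[a_term] & links[c]) - {a_term, c}
        if shared:
            found[c] = sorted(shared)
    return found

# Toy "documents" invented for illustration; each is a set of indexed terms.
documents = [
    {"magnesium", "vasodilation"},
    {"magnesium", "platelet aggregation"},
    {"migraine", "vasodilation"},
    {"migraine", "platelet aggregation"},
    {"migraine", "spreading cortical depression"},
]
result = abc_candidates(documents, "magnesium", ["migraine"])
```

Here magnesium and migraine never share a document, yet two intermediate terms connect them, so the A-C relationship is deduced rather than observed.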
ABC Method as applied to discern chemical-target-disease associations (using Chemotext)
A: Chemical
B: Protein
C: Disease
http://chemotext.mml.unc.edu/
Raloxifene identified as a 5-HT6 receptor ligand and potential treatment for Alzheimer’s disease
Raloxifene binds to the 5-HT6 receptor with a Ki = 750 nM.*
Raloxifene given at a dose of 120 mg/day led to reduced risk of cognitive impairment in post-menopausal women. Yaffe, K. et al., Am J Psychiatry, 2005, 162, 683–690.
Adjunctive raloxifene treatment improves attention and memory in men and women with schizophrenia. Weickert TW, et al. Mol Psychiatry. 2015, 20, 685-94.
Competition binding at 5-HT6 receptors for raloxifene (yellow triangle) and chlorpromazine (square) versus [3H]LSD. Tested by our collaborators at PDSP.
*Hajjo et al. Chemocentric Informatics Approach to Drug Discovery. J Med Chem. 2012, 55(12):5704-19
Social Media, etc.
Analysis
Cancer-Related Assertions
Aim 1: From Man
Curated Database of Assertions
Hypothesis generation
Aim 2: To Molecule
Hypothesis confirmation
Curated Cancer-Related Bioassay Database
Primary Hits
Virtual screening platform
Aim 3: To Man
Hypothesis enrichment
Disease Effect
Drug-Target-Disease Database
Candidates for Repurposing
Electronic Medical Records, On-line Databases, etc.
Experimental validation in-vitro and in-vivo
Chemotext
NIH 1U01CA207160-01. Drug repurposing: From Man to Molecules to Man