CDD: Vault, CDD: Vision and CDD: Models Software for Biologists and Chemists Doing Drug Discovery
SEAN EKINS AND BARRY A. BUNIN
COLLABORATIVE DRUG DISCOVERY, 1633 BAYSHORE
HIGHWAY, SUITE 342, BURLINGAME, CA 94010, USA
• 2016: CDD Vault = Activity & Registration + Inventory (New) + Visualization
• 2015: CDD Vault adds advanced calculations, modeling, and visualization
• 2014: CDD celebrates 10-Year Users Meeting together with Leading Scientists
• 2014: CDD Vault hosts >23 Million Compounds
• 2013: CDD Vault securely hosted for >9 years, with 99.98% up-time
• 2013: CDD surpassed 100,000 customer logins
• 2012: CDD FISMA compliant and accredited
• 2012: NIH picked CDD Vault for Neuroscience Blueprint Network
• 2012: CDD securely hosted >160,000,000 datapoints
• 2011: CDD won Bio-IT World Editors’ Choice Best Practices Award
• 2011: MM4TB 5-year EU-funded project with AstraZeneca, Sanofi-Aventis
• 2010: GSK, Novartis, Pfizer, and NIH Collaborations announced
• 2009: CDD Vault surpasses 1 Million Compounds
• 2008: Gates Foundation 2-year grant (extended to 8 years)
• 2005: Eli Lilly co-invested in a syndicate with Omidyar Network and Founders Fund
• 2004: CDD spun out of Lilly, UCSF signs up as first customer
Overview: Balancing grants, contracts and sales
Barry Bunin
CEO and Board Director
CDD Vault – Modern Drug Discovery Informatics
• Activity & Registration
• CDD Visualization: Calculations & Models
• Inventory: Reagents & Samples
• ELN: Electronic Laboratory Notebook (Coming Soon)
• Full Suite: Commercial Capabilities
• CDD Vault: Public + Collaborative Value
– Efforts to Advance our Field for the Public Good Beyond What the Market Naturally Supports
Ideas focused around neglected diseases
• Work on neglected diseases is less sensitive
• Provides more exposure for diseases
• Less competitive than major diseases like cancer
• Opportunities for grants and funding
• Can build a following by publishing and networking
• Leverages what we do best – we use the product in collaborations
• Find new angles to combine collaboration with software
• Create new knowledge, mine data, model data
• Help create IP for collaborators
• Combine in silico and in vitro
• = ‘Scientific marketing’ adds credibility
NIH GRANTS
• Non-dilutive
• STTR
– Find an academic collaborator lab with data and willing to make/test compounds
– E.g. Tuberculosis, Chagas, Biocomputation grants
• SBIR
– Develop software with collaborators
– E.g. Ontology development
CDD: Over a decade of drug discovery collaborations
• SaaS
• Easy to use
• Used by Academia, Industry, Biotech
• Private
• Selective collaboration
• 100s of published datasets
CDD Vault Features
• Enterprise Capabilities: Web Interface, Management Tools, Integration, Customizable
• Drug Discovery Data Mining: Search, Visualization, Presentation
• Chemical Intelligence: Chemical Drawing, Registration, Property Calculators, Structure Search, SAR Tools
• Collaborative Environment: Controlled Access, Data Privacy, Security, Community
• Free Public Data Access: Screening Data, Compound Data
• Online Zendesk
• CDD Models
• CDD Vision
• Integration of CDD Public, ChemSpider, ZINC, and PubChem
Benefits of CDD Vault
CDD Vault® - Differentiators
• Registration, SAR, Tracking, Visualization for budget sensitive labs
• Easy for Your Whole Project Team to Embrace (Key Differentiation)
• Secure & Proven (>250,000 logins)
• 99.98% uptime for global access over 12 years
• No hardware or software to install, log in online, fast adoption
CDD Vault® - “Value Proposition”
Budget Sensitive Startup or Academic Scientists
• Won’t lose data
• Get better results
• Easy to trial, set up, configure, be trained and GO!
ex-Big Pharma Scientist Familiar with Registration/SAR Software
• Nimble
• Save $$$ with modern cloud solution
• Relax – data migration is a snap!
Big Collaborations funded by Pharma, NIH, Foundations (PPP)
• Control exactly which data you share with others
• Relax – security is built in
• Foster interactions between biologists and chemists
• Passed Big Pharma & NIH FISMA audits (CDD does not own IP)
Chart: logins per month
TB Project overview
Phase I STTR – Proof of concept of mimic strategy
Phase II STTR – Expand mimic strategy and validation of phase I hits
Ekins et al., PLoS One. 2015, 10:e0141076
Getting the word out
• We have published 27 papers on tuberculosis with collaborators since 2010
• ~50 papers, book chapters, etc. published on CDD since 2009
Copyright © 2013 All Rights Reserved Collaborative Drug Discovery
MM4TB: 25 organizations
New
Old
Neuroscience
Kinetoplastid Drug Development Consortium
CDD Vault: Visualization, Calculations, & Models
• Data Visualization Analytics
– Scatterplots, Histograms, Interactive
• Excel Type Calculations
– On Experimental and/or Calculated Values
– Automatically During Data Capture
• Machine Learning Algorithms
– Simplifies Model Building with Public or Private Data
CDD VISION
• Data taken from CDD Vault and utilized in CDD Vision
• Backend built with Immutable.js and Crossfilter.js; binding layer uses d3.js and jQuery; rendering uses d3.js and Pixi.js
• Launching CDD Vision
• Filters
• Compound details
• Multiple plots, different sizes
MODELS RESIDE IN PAPERS
NOT ACCESSIBLE… THIS IS UNDESIRABLE
How do we share them?
How do we use them?
Open Extended Connectivity Fingerprints
ECFP_6, FCFP_6
• Collected, deduplicated, hashed
• Sparse integers
• Invented for Pipeline Pilot: public method, proprietary details
• Often used with Bayesian models: many published papers
• Built a new implementation: open source, Java, CDK
– stable: fingerprints don't change with each new toolkit release
– well defined: easy to document precise steps
– easy to port: already migrated to iOS (Objective-C) for TB Mobile app
• Provides core basis feature for CDD open source model service
Clark et al., J Cheminform 6:38 2014
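The "collected, deduplicated, hashed" and "sparse integers" bullets can be illustrated with a toy sketch. This is not the CDK or Pipeline Pilot algorithm: the atom properties, the CRC32-based `hash32` helper, and the function name `toy_circular_fingerprint` are assumptions made for illustration, and real ECFP implementations also remove structurally duplicate environments, handle chirality, and so on. It only shows the general circular-fingerprint idea: hash each atom, repeatedly re-hash it together with its neighbours out to the chosen diameter, then keep the unique integers.

```python
import zlib


def hash32(values):
    """Hash a sequence of integers to an unsigned 32-bit integer (illustrative choice)."""
    text = ",".join(str(v) for v in values)
    return zlib.crc32(text.encode("ascii")) & 0xFFFFFFFF


def toy_circular_fingerprint(atoms, bonds, diameter=6):
    """Toy circular fingerprint.

    atoms: list of integer property tuples, one per heavy atom (hypothetical
           choice here: atomic number, heavy-atom degree, formal charge).
    bonds: list of (atom_index, atom_index, bond_order) tuples.
    Returns a sorted list of unique 32-bit integers (a sparse fingerprint).
    """
    neighbours = [[] for _ in atoms]
    for i, j, order in bonds:
        neighbours[i].append((order, j))
        neighbours[j].append((order, i))

    # Iteration 0: one identifier per atom, from its own properties only.
    ids = [hash32(props) for props in atoms]
    collected = set(ids)

    # Each iteration grows the environment by one bond; diameter 6 = radius 3.
    for layer in range(1, diameter // 2 + 1):
        updated = []
        for i in range(len(atoms)):
            # Sort neighbour contributions so the result ignores atom numbering.
            env = sorted((order, ids[j]) for order, j in neighbours[i])
            flat = [layer, ids[i]] + [v for pair in env for v in pair]
            updated.append(hash32(flat))
        ids = updated
        collected.update(ids)  # the set deduplicates identical identifiers

    return sorted(collected)


# Ethanol heavy atoms (C-C-O), described with the hypothetical property tuples.
ethanol_atoms = [(6, 1, 0), (6, 2, 0), (8, 1, 0)]
ethanol_bonds = [(0, 1, 1), (1, 2, 1)]
fp = toy_circular_fingerprint(ethanol_atoms, ethanol_bonds)
print(fp)
```

Using a fixed hash such as CRC32 rather than Python's built-in `hash` keeps the integers identical across runs and platforms, which is the kind of stability the bullets above ask of a fingerprint.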
Building Bayesian models for each target in TB Mobile
Predictions for the InhA target: (a) the ROC curve with ECFP_6 and FCFP_6 fingerprints; (b) modified Bayesian estimators for active and inactive compounds; (c) structures of selected binders.
For each listed target with at least two binders, it is first assumed that all of the molecules in the collection that do not indicate this as one of their targets are inactive.
In the app we used ECFP_6 fingerprints
Clark et al., J Cheminform 6:38 2014
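The per-target labelling rule can be sketched as follows. This is a minimal illustration, not the TB Mobile code: the function names, the tiny data set, and the placeholder target "TargetX" are all made up, and fingerprints are shown as small integers for readability. The scoring uses the commonly described Laplacian-corrected Bayesian estimator, where each fingerprint feature contributes log((A + 1) / (T·R + 1)), with A the number of actives containing the feature, T the number of molecules containing it, and R the overall fraction of actives. Treat that formula as an assumption about the general method rather than a statement of the exact published implementation.

```python
import math
from collections import Counter


def label_for_target(molecules, target):
    """Label molecules for one target.

    molecules: list of (fingerprint_set, set_of_target_names).
    Molecules that do not list the target are assumed inactive.
    Returns None when the target has fewer than two binders.
    """
    labelled = [(set(fp), target in targets) for fp, targets in molecules]
    n_active = sum(1 for _, is_active in labelled if is_active)
    return labelled if n_active >= 2 else None


def train_bayesian(labelled):
    """Per-feature Laplacian-corrected contributions: log((A + 1) / (T * R + 1))."""
    total = Counter()
    active = Counter()
    for fp, is_active in labelled:
        total.update(fp)
        if is_active:
            active.update(fp)
    ratio = sum(1 for _, a in labelled if a) / len(labelled)
    return {f: math.log((active[f] + 1) / (total[f] * ratio + 1)) for f in total}


def score(model, fp):
    """Sum the contributions of the features present; unseen features add 0."""
    return sum(model.get(f, 0.0) for f in set(fp))


# Made-up data: (fingerprint features, annotated targets).
molecules = [
    ({1, 2, 3}, {"InhA"}),
    ({1, 2, 4}, {"InhA"}),
    ({5, 6}, {"TargetX"}),
    ({5, 7}, set()),
    ({3, 6, 7}, set()),
]

labelled = label_for_target(molecules, "InhA")
model = train_bayesian(labelled)
print(score(model, {1, 2}), score(model, {5, 7}))
```

In this toy set, a molecule sharing features with the two annotated InhA binders scores higher than one sharing features only with the assumed inactives, and a target with a single binder ("TargetX") gets no model at all.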
TB Mobile
Ekins et al., J Cheminform 5:13, 2013
Clark et al., J Cheminform 6:38 2014
Predict targets
Cluster molecules
http://goo.gl/vPOKS
http://goo.gl/iDJFR
Single point data > 300K molecules
Uses Bayesian algorithm and FCFP_6 fingerprints
Clark et al., J Cheminform 6:38 2014
Using AZ-ChEMBL data for CDD Models
• Human microsomal intrinsic clearance
• Rat hepatocyte intrinsic clearance
Clark et al., JCIM 55: 1231-1245 (2015)
Exporting models from CDD
Clark et al., JCIM 55: 1231-1245 (2015)
9R44TR000942-02
Open Models in MMDS
9R44TR000942-02
Composite Models – Binned Bayesians
Clark et al., J Chem Inf Model. 2016, 56(2):275-85
Summary
• Accessible software
• Used widely in academia and industry
• Leader in collaboration and security
• Grown steadily through sales and grants
• Dedicated sales in Europe, Asia
• Coming soon: ELN
• CDD provides integrated software for drug discovery
Acknowledgments
Anna Coulon-Spektor, Kellan Gregory, Charlie Weatherall, Krishna Dole, Andrew McNutt, Peter Nyberg, Tom Gilligan, Xiao Ba, Barbara Holtz, Sylvia Ernst, Frank Cole, Marc Navre, Alex M. Clark
Joel Freundlich
Peter Madrid
Carolyn Talcott
Malabika Sarker
Jair Lage de Siqueira-Neto
EU FP7 funding MM4TB
NIH NIAID
NIH NLM
NIH NCATS
Bill and Melinda Gates Foundation (Grant#49852)