Anshu Bhardwaj, Scientist & Community Builder, OSDD, CSIR, India. Open Source Drug Discovery (OSDD): Connecting Minds & Machines. A CSIR-led Team India consortium with global partnership for affordable healthcare for all. National Knowledge Network “First Annual Workshop”, “The e-Infrastructure of India”, 31st Oct – 1st Nov 2012

Page 1: Open Source Drug Discovery (OSDD) Connecting Minds & Machines (slides: workshop.nkn.in/2012/Document/slides/OSDD-NKN.pdf)

Anshu Bhardwaj

Scientist & Community Builder

OSDD, CSIR

India

Open Source Drug Discovery (OSDD) Connecting Minds & Machines

A CSIR-led Team India consortium with global partnership for affordable healthcare for all

National Knowledge Network “First Annual Workshop” “The e-Infrastructure of India” 31st Oct – 1st Nov 2012

Page 2:

First Disease Target: Tuberculosis; now extended to Malaria

Tuberculosis (TB) is one of the leading causes of death, ranking second only to HIV as the killer infectious disease of adults worldwide.

Source: http://www.globalhealthfacts.org/data/topic/map.aspx?ind=12

OSDD Focus: Tropical Neglected Diseases

At least one person in the world is newly infected with TB bacilli every second

Over 1000 deaths a day or 3 deaths every 2 mins

New TB cases 2010

No new TB drugs in the past 50 years

Page 3:

Research Spending Per New Drug

Company | Number of drugs approved | R&D Spending Per Drug ($Mil) | Total R&D Spending 1997-2011 ($Mil)
AstraZeneca | 5 | 11,790.93 | 58,955
GlaxoSmithKline | 10 | 8,170.81 | 81,708
Sanofi | 8 | 7,909.26 | 63,274
Roche Holding AG | 11 | 7,803.77 | 85,841
Pfizer Inc. | 14 | 7,727.03 | 108,178
Johnson & Johnson | 15 | 5,885.65 | 88,285
Eli Lilly & Co. | 11 | 4,577.04 | 50,347
Abbott Laboratories | 8 | 4,496.21 | 35,970
Merck & Co Inc | 16 | 4,209.99 | 67,360
Bristol-Myers Squibb Co. | 11 | 4,152.26 | 45,675
Novartis AG | 21 | 3,983.13 | 83,646
Amgen Inc. | 9 | 3,692.14 | 33,229

Slate’s Bad Math: $55 million on each new drug

Source: http://www.forbes.com/sites/matthewherper/2012/02/10/the-truly-staggering-cost-of-inventing-new-drugs/
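The per-drug figures in the table are each company's total R&D spending divided by its number of approved drugs. A minimal Python sketch of that arithmetic, using the values listed above (the variable and function names are illustrative and not from the source):

```python
# (company, drugs approved, total R&D spending 1997-2011 in $ million), as listed in the table.
RD_SPENDING = [
    ("AstraZeneca", 5, 58_955),
    ("GlaxoSmithKline", 10, 81_708),
    ("Sanofi", 8, 63_274),
    ("Roche Holding AG", 11, 85_841),
    ("Pfizer Inc.", 14, 108_178),
    ("Johnson & Johnson", 15, 88_285),
    ("Eli Lilly & Co.", 11, 50_347),
    ("Abbott Laboratories", 8, 35_970),
    ("Merck & Co Inc", 16, 67_360),
    ("Bristol-Myers Squibb Co.", 11, 45_675),
    ("Novartis AG", 21, 83_646),
    ("Amgen Inc.", 9, 33_229),
]


def spending_per_drug(total_musd, drugs_approved):
    """Return R&D spending per approved drug, in $ million."""
    return total_musd / drugs_approved


per_drug = {name: spending_per_drug(total, n) for name, n, total in RD_SPENDING}

# Print companies from highest to lowest spending per approved drug.
for name, value in sorted(per_drug.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<26}{value:>12,.2f}")
```

Small differences from the table's second decimal place are expected, since the totals in the table are rounded to the nearest $ million.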

Page 4:

Drug Discovery is a Long Risky process with Low Probability of Success

http://www.bayerpharma.com/en/research-and-development/processes/index.php

Page 5:

Prediction of non-toxic targets & inhibitors

Efficacy: Inhibitor should target the right protein in the pathogen (Mycobacterium tuberculosis)

Toxicity: Inhibitor should not target any crucial protein in the host (Human)

Page 6:

From a mathematical point of view, creating an accurate model of a single mammalian cell may require generating and then solving somewhere between 100,000 and one million equations

Biology is complex!!

http://news.vanderbilt.edu/2011/10/robot-biologist/

The human brain can only process seven pieces of data at a time!!! Need automation & new technology to address the complexity

Page 7:

Predictive Science in the Drug Discovery (DD) Process

Predicting toxicity and metabolism of drugs

Prediction tools and models to prioritize candidate molecules

HPC for OSDD Community by Garuda/CMMACS

Systems Level Models for DD: Target Identification, Pharmacomodeling, Off-target binding predictions

Virtual Screening for selected targets & Models for predicting anti-TB and mutagenic properties

Systems Biology for predicting Drug-targets MOA

Page 8:

Why Open Source Drug Discovery?

Many eyeballs make the bug shallow!

Lack of market incentive for TB

Successful Open Source Models

Human Genome Sequencing Initiative

Open Source Software Initiative (e.g. Linux OS)

Android

The WWW

Page 9:

Real Innovation lies in “Innovating how we innovate”…

“We cannot solve our problems with the same thinking we used when we created them.”

Albert Einstein

Page 10:

Open TB Drug Discovery Platform: Informatics to Experimental Validation to Clinical Trials

Target Validation of in silico targets

Systems Biology

Cheminformatics

Mtb Strain and Clone Repository

Screening Facility

Assay Development

OSDD Chem and Directed Synthesis

Lead Identification

Lead Optimization

Target Identification for Leads

DMPK

In vivo efficacy

Safety Pharmacology

Pre-Clinical Candidate

Phase I-III

Pharmacogenomics

Page 11:

OSDD portal Virtual Lab

Computer Scientists

Mathematical modeling

Data upload

Disease experts Gene/Protein Expression Analysis

Pharmacogenomics expert

Administrator Manages server

Virtual Screening

Unconventional Collaborative Network

Page 12:

Shaping Science 2.0 OSDD Semantic Web Architecture

Page 13:

OSDD Platform

System Architecture

“Collaborative tools to accelerate neglected diseases research” in the book “Collaborative Computational Technologies for Biomedical Research”. Wiley and Sons. May 2011

Released: April 2010

Page 14:

Scientific Workflow Management Systems

http://www.tavaxy.org/
http://www.taverna.org.uk/
https://kepler-project.org/
http://galaxyproject.org/

Experimental data from biology and chemistry need to be managed and analyzed systematically

Large datasets and compute-intensive analyses need compute infrastructure

Page 15:

Weka Workflow

a. Convert the CSV to test and train files

b. Convert both CSVs to ARFF files: output_file1 is always the train file and output_file2 is the test file.

c. Select the two input files for the Classifier. Change the parameters in the right-side panel for each tool.

d. Evaluate the model file: the Classifier will be Misc -> SerializedClassifier
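Steps a and b above can be sketched outside Galaxy with the Python standard library. This is only an illustration of the data preparation, not the OSDD Galaxy tools themselves. The column names, the toy data, the 80/20 split, and the assumption that every feature is numeric with a nominal class in the last column are all hypothetical.

```python
import csv
import io
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train_rows, test_rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def to_arff(relation, header, rows, class_values):
    """Render rows as ARFF text: numeric features plus a nominal class in the last column."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in header[:-1]]
    lines.append(f"@attribute {header[-1]} {{{','.join(class_values)}}}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in rows]
    return "\n".join(lines) + "\n"


# Hypothetical descriptor table; real inputs would be molecule descriptors from an assay.
CSV_TEXT = """mol_weight,logp,activity
180.16,1.19,active
151.16,0.46,inactive
206.28,3.97,active
194.19,-0.07,inactive
137.14,-0.70,active
"""

reader = csv.reader(io.StringIO(CSV_TEXT))
header = next(reader)
rows = [row for row in reader if row]
class_values = sorted({row[-1] for row in rows})

train_rows, test_rows = split_rows(rows, test_fraction=0.2, seed=42)
output_file1 = to_arff("train", header, train_rows, class_values)  # train file
output_file2 = to_arff("test", header, test_rows, class_values)    # test file
print(output_file1)
```

Both ARFF files declare the same class values, so a model trained on the first file can be evaluated on the second without a header mismatch.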

Page 16:

http://sysborg2.osdd.net

Electronic lab note books

APIs to submit workflow method to lab note book

APIs to submit results to lab note book

APIs to extract files from lab note books

More than 250 applications integrated

Customized workflow with grid infrastructure & applications

Jobs are invoked from Customized Galaxy and submitted to Gridway

[Figure labels: Input file + parameters; Gridway metascheduler; LRM Torque; Clusters; Programs; Gridway runner; Job template; PBS; Customized]

Job Status may be checked using DRMAA API
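The hand-off described on this slide (input file plus parameters, turned into a job template, submitted, then polled for status) can be sketched as follows. This is a self-contained illustration under stated assumptions. The `KEY = value` field names are modelled on GridWay-style job templates and should be checked against the GridWay documentation. The state names mirror DRMAA job states, and the status function is a stand-in for a real DRMAA call. The script name `weka_classifier.sh` is hypothetical.

```python
# DRMAA-style job state names, used here as plain strings (assumption).
TERMINAL_STATES = {"DONE", "FAILED"}


def render_job_template(executable, input_file, parameters):
    """Build a KEY = value job description from an input file plus tool parameters."""
    arguments = " ".join([input_file] + [f"--{k} {v}" for k, v in parameters.items()])
    fields = {
        "EXECUTABLE": executable,
        "ARGUMENTS": arguments,
        "INPUT_FILES": input_file,
        "STDOUT_FILE": "stdout.txt",
        "STDERR_FILE": "stderr.txt",
    }
    return "\n".join(f"{key} = {value}" for key, value in fields.items()) + "\n"


def wait_for_job(job_id, get_status, max_polls=100):
    """Poll get_status(job_id) until a terminal state is reached; return the state history."""
    history = []
    for _ in range(max_polls):
        state = get_status(job_id)
        history.append(state)
        if state in TERMINAL_STATES:
            return history
    raise TimeoutError(f"job {job_id} still {history[-1]} after {max_polls} polls")


# Stand-in for the scheduler: replays a fixed sequence of states.
_fake_states = iter(["QUEUED_ACTIVE", "RUNNING", "RUNNING", "DONE"])

template = render_job_template("weka_classifier.sh", "train.arff", {"folds": 10})
history = wait_for_job("job-1", lambda job_id: next(_fake_states))
print(template)
print(history)
```

A real runner would sleep between polls and obtain the state from the scheduler rather than from a canned sequence.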

Page 17:

Get data customized for extracting files from open lab note book

Custom APIs for importing input files from OSDD’s open lab note book into Galaxy

Page 18:

Workflows and the results of the workflows are stored as separate lab note books

• Lab note book has details of the experiments performed

• Results of one experiment may be invoked for analysis in another experiment

• All versions of the workflow and the results are stored

• Flexibility to execute nested workflows

Custom APIs for exporting results to OSDD’s Open lab note book
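The storage behaviour described on this slide (every version kept, workflows and results held separately, results of one experiment reused in another) can be illustrated with a small in-memory model. The class, method names, and entry names below are hypothetical and are not OSDD's lab note book API.

```python
from collections import defaultdict


class LabNotebookStore:
    """Hypothetical in-memory store that keeps every version of each entry."""

    def __init__(self):
        self._versions = defaultdict(list)

    def submit(self, name, content):
        """Append content as a new version of the entry and return its 1-based version number."""
        self._versions[name].append(content)
        return len(self._versions[name])

    def fetch(self, name, version=None):
        """Return the requested version of an entry, or the latest one if version is None."""
        versions = self._versions.get(name)
        if not versions:
            raise KeyError(name)
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]


store = LabNotebookStore()

# Workflows and results live in separate notebooks, modelled here as separate entry names.
store.submit("workflow/screening", "steps: split, train")
store.submit("workflow/screening", "steps: split, train, evaluate")
store.submit("results/screening", "accuracy table, run 1")

# A second experiment starts from the stored results of the first one.
previous = store.fetch("results/screening")
store.submit("results/follow_up", f"re-analysis of: {previous}")
print(store.fetch("workflow/screening", version=1))
```

Because older versions are never overwritten, an earlier workflow can still be retrieved after it has been revised.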

Page 19:

List of >250 modules integrated as web services by OSDD Community

S. No | Resources | Clients
1 | KEGG: Kyoto Encyclopedia of Genes and Genomes | 60
2 | GetEntry: DDBJ sequence search by accession ID | 43
3 | GPSR: tools | 33
4 | PDB: Protein Data Bank | 30
5 | BioModel: mathematical models of biological DB | 25
6 | Gtps: Gene Trek in Prokaryote Space | 8
7 | WSDbfetch: retrieve entries from biological dbs using entry identifiers or accession no. | 7
8 | Gibv: Genome Information Broker for Viruses | 7
9 | DDBJ: DNA Data Bank of Japan | 7
10 | Mafft: a multiple sequence alignment program | 4
11 | Fasta: DDBJ database | 4
12 | Ensembl: maintains automatic annotation | 4
13 | VecScreen: vector contamination | 4
14 | OMIM: Online Mendelian Inheritance in Man | 4
15 | Gtop: Gene-product Informatics | 3
16 | GO: Gene Ontology | 3
17 | SPS: Splicing Profile based Score | 2
18 | GIBIS: Genome Information Broker for Insertion Sequence | 1
19 | RefSeq: database of sequence | 1
20 | GIB: Genome Information Broker | 1
21 | GIBEnv: DDBJ database | 1
22 | TxSearch: Database indexing & searching | 1
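Summing the Clients column gives a quick consistency check against the ">250 modules" figure. The sketch below assumes, which the source does not state, that each client counts as one integrated module. The short resource labels are abbreviations of the table entries.

```python
# (resource, number of clients), as listed in the table above.
CLIENTS = [
    ("KEGG", 60), ("GetEntry", 43), ("GPSR", 33), ("PDB", 30), ("BioModel", 25),
    ("Gtps", 8), ("WSDbfetch", 7), ("Gibv", 7), ("DDBJ", 7), ("Mafft", 4),
    ("Fasta", 4), ("Ensembl", 4), ("VecScreen", 4), ("OMIM", 4), ("Gtop", 3),
    ("GO", 3), ("SPS", 2), ("GIBIS", 1), ("RefSeq", 1), ("GIB", 1),
    ("GIBEnv", 1), ("TxSearch", 1),
]

total_clients = sum(count for _, count in CLIENTS)
top_five = sum(count for _, count in CLIENTS[:5])

print(total_clients)
print(f"{top_five / total_clients:.0%} of clients belong to the five largest resources")
```

The total comes to 253 across the 22 resources, in line with the "more than 250" stated on the slides.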

Page 20:

Ongoing: Cheminformatics

Curated molecule datasets

Cheminformatics Models

Data Mining and Analysis

HT Virtual screening

PubChem

ChEMBL

DrugBank

Experimental Assays

Community of About 400

Other Active Communities:

• OSDD Women Scientists Forum

• OSDD Junior Scientists Forum

Page 21:

Background and Premise

Page 22:

Why are we doing this?

Page 23:

Crowd-Sourcing Large-Scale Data-Driven Cheminformatics Analysis

Machine Learning based Computational Models

Bioassay Datasets

Computational Tools and Resources

People

Standard re-usable models / Publications

Page 24:

PubChem Bioassay data (approx. 1 lakh molecules/dataset, 6000 descriptors/molecule)

Successful Models

Screen PubChem (30 million)

Data amplification in Cheminformatics

Potential Hits

o Downsizing and random validation require multiple calculations for validation of results

o Cross validation up to 50+ times for each experiment
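A back-of-the-envelope calculation shows why these numbers call for grid resources. The molecule, descriptor, and repeat counts come from the slide (1 lakh = 100,000). The 8 bytes per descriptor value is an assumption made only to turn counts into storage sizes.

```python
MOLECULES_PER_DATASET = 100_000   # approx. 1 lakh molecules per bioassay dataset
DESCRIPTORS_PER_MOLECULE = 6_000
PUBCHEM_MOLECULES = 30_000_000    # molecules screened with a successful model
CV_REPEATS = 50                   # "up to 50+" validation runs per experiment
BYTES_PER_VALUE = 8               # assumption: each descriptor stored as a 64-bit float


def gigabytes(n_values):
    """Convert a count of descriptor values into gigabytes (10**9 bytes)."""
    return n_values * BYTES_PER_VALUE / 1e9


values_per_dataset = MOLECULES_PER_DATASET * DESCRIPTORS_PER_MOLECULE
values_per_screen = PUBCHEM_MOLECULES * DESCRIPTORS_PER_MOLECULE
values_read_in_validation = values_per_dataset * CV_REPEATS

print(f"one dataset:        {values_per_dataset:.1e} values, {gigabytes(values_per_dataset):,.1f} GB")
print(f"50 validation runs: {values_read_in_validation:.1e} values read")
print(f"PubChem screen:     {values_per_screen:.1e} values, {gigabytes(values_per_screen):,.1f} GB")
```

Under these assumptions a single dataset is a few gigabytes of descriptors, while screening all of PubChem is on the order of a terabyte, before any repeated validation is counted.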

Page 25:

The Problem

Page 26:

C-DAC’s Garuda Grid – Indian Grid Computing Initiative

• C-DAC is an R&D organization under the Ministry of Communication & Information Technology, India

• C-DAC’s Garuda Grid is targeted at providing a facility for the scientific community, which would enable them to seamlessly access the distributed resources

• Compute power of GARUDA: ~70 TFLOPS (6000 CPUs)

• Currently there are 55 Garuda Partners

• Has NKN (National Knowledge Network) connectivity at 10Gbps

Page 27:

[Figure: OSDD-Garuda Interface. Labels: Internet/NKN, NKN, Results]

Page 28:

Weka in Galaxy

Page 29:

OSDD – Garuda Activities

• Created OSDD Virtual Organization (VO), with 70 users registered under this VO

• Garuda Portal customized to support OSDD requirements

• Galaxy, a biology workbench, has been customized as per OSDD requirements

• JNU head node was set up for hosting Galaxy

• Common data has been uploaded to Data Location for accessibility through Galaxy and Portal by all OSDD users

• Three cluster resources have been provided for OSDD activities

– Hyderabad Cluster with 320 CPUs

– Chennai Cluster with 304 CPUs

– Param Yuva at Pune with 4368 CPUs

• Hand-holding users from the community & resolving their queries
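The combined capacity of the three clusters listed above follows from a simple sum. The dictionary and variable names in this sketch are illustrative.

```python
# CPU counts for the three clusters provided for OSDD activities.
OSDD_CLUSTERS = {
    "Hyderabad": 320,
    "Chennai": 304,
    "Param Yuva (Pune)": 4368,
}

total_cpus = sum(OSDD_CLUSTERS.values())
largest = max(OSDD_CLUSTERS, key=OSDD_CLUSTERS.get)
share = OSDD_CLUSTERS[largest] / total_cpus

print(f"{total_cpus} CPUs in total; {largest} contributes {share:.1%}")
```

The three clusters add up to 4992 CPUs, most of them at Param Yuva.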

Page 30:

Page 31:

OSDD Cheminformatics Programme Present Status

Models for anti-tubercular activity

Periwal et al (2012) BMC Pharmacology

Periwal et al (2011) BMC Res Notes

Models for anti-malarial activity

Periwal et al (2012) under review

Models for drug toxicity

Seal et al (2012) Journal of Cheminformatics

Models for specific drug targets

(GlmU, Kinases, DAP)

Singla et al (2011) BMC Pharmacology

Garg et al (2010) BMC Bioinformatics

Garg et al (2010) BMC Bioinformatics

Models for drug metabolism

Mishra et al (2010) BMC Pharmacology

Databases and Datasets for Cheminformatics

Singh et al (2012) Nucleic Acids Research

Singla et al (2010) BMC Pharmacology

TRAINING

Collaboration on cheminformatics training and research

Trained ~50 students in advanced cheminformatics data analysis methods

Training for students on parallel data analysis environments

Page 32:

OSDD Cheminformatics Programme Overview

Models for anti-tubercular activity

Models for anti-malarial activity

Models for drug toxicity

Models for drug metabolism

Computational Resources for Drug Discovery (CRDD)

Models for specific drug targets

NKN + C-DAC Garuda

Public repositories of Chemical Data (PubChem/ChEMBL/DrugBank)

OSDD Chemical Repository (OSDDChem)

OSDD Chemistry Outreach Programme

ANALYTICS | DATA | RESOURCES

Prioritization of biologically active molecules for assays

Predictive modeling of drug metabolism and toxicity (predictive in silico pre-clinical trial)

OUTCOMES

Page 33:

Anshu Bhardwaj, Council of Scientific & Industrial Research (CSIR), India

Chintalapati Janaki, Center for Development of Advanced Computing (C-DAC), India

www.osdd.net

25-26 May 2011

Customized Galaxy with applications as Web Services and on the Grid for Open Source Drug Discovery (OSDD)

A CSIR-led Team India consortium with global partnership for affordable healthcare

Page 34:

Literature

Annotation Tools

Genomic Databases

Curated Annotations

Raw Annotations

OSDD C2D Community

800+ Student Researchers

Collaborative Curation

Pathway/Interactome | Gene Ontology | Protein Structure/Fold | Glycomics| Immunome

The “Connect to Decode” Programme

Page 35:

Community Curation!!

Wrong (mark in red)

Right (mark in green)

Online discussion

Working on the cloud..

Page 36:

OSDD Community Effort to Understand Mtb Biology

Page 37:

The largest Mtb Interactome

54 Authors, 29 Institutions

More than 2500 views and 350 downloads to date

Published: July 11, 2012

Page 38:

Knowledge Discovery Systems

S. no. | Resource | Description | URL
1 | SysBorg* | Community interaction portal | http://sysborg2.osdd.net
2 | OSDDChem* | Portal for submission/proposal of synthetic compounds for screening | http://crdd.osdd.net/osddchem
3 | OSDDChemDesign | Portal for submission/proposal for compounds for screening | http://180.149.49.37/servers/osddchemdesign
4 | Tbrowse* | Genome browser for Mtb | http://tbrowse.osdd.net
5 | IPW* | Interacting partners database | http://crdd.osdd.net/servers/ipw
6 | curateTB | Curated data on TB from literature | http://180.149.49.37/servers/ctb
7 | Structural Annotation* | Structural proteome of Mtb | http://proline.physics.iisc.ernet.in/Tbstructuralannotation
8 | ccPDB* | Compilation and creation of data sets from Protein Data Bank | http://crdd.osdd.net/raghava/ccpdb
9 | GDoQ* | Predicting novel/potent inhibitors against GlmU | http://crdd.osdd.net/raghava/gdoq
10 | KiDoQ* | Predicting novel/potent inhibitors against DHDPS | http://crdd.osdd.net/raghava/kidoq
11 | MbtA* | QSAR and combinatorial library for MbtA | -
12 | MetaPred* | Prediction of cytochrome P450 isoform responsible for metabolizing a drug molecule | http://crdd.osdd.net/raghava/metapred
13 | Anti-tubercular models* | Predictive models for anti-tubercular molecules using machine learning on high throughput biological screening data sets | -
14 | Mutagenicity models* | In-silico Predictive Mutagenicity Model Generation Using Supervised Learning Approaches | -
15 | Natural product database - β version | Database of biologically active phytomolecules and plant extracts with anti-mycobacterial activity | http://crdd.osdd.net/osddchem/biophytmol
16 | Pharmacomodeling predictions* | Modeling metabolic adjustment in Mtb upon treatment with isoniazid | -
17 | Galaxy workflow engine | Workflow engine to plug in applications for generating computational pipelines | http://sysborg2.osdd.net

* Published Available

Page 39:

Within weeks, 830 volunteered to re-annotate the entire M. tuberculosis genome. The work started in December 2009 and was completed by April 2010, packing nearly 300 man-years into 4 months!

Source: Munos B. Can Open-Source Drug R&D Repower Pharmaceutical Innovation? Clin Pharmacol Ther 2010;87:534–536

Source: Hiroaki Kitano. Social engineering for virtual 'big science' in systems biology. Nature Chemical Biology 7, 323–326 (2011)
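The "nearly 300 man-years in 4 months" figure quoted on this slide can be sanity-checked with simple arithmetic. The sketch assumes, as a simplification not stated in the source, that all 830 volunteers contributed for the full four months.

```python
VOLUNTEERS = 830
DURATION_MONTHS = 4  # December 2009 to April 2010, as stated on the slide

person_months = VOLUNTEERS * DURATION_MONTHS
person_years = person_months / 12

print(f"{person_months} person-months, about {person_years:.0f} person-years")
```

The result lands a little under 300 person-years, consistent with the "nearly 300" in the quoted source.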

Page 40:

OSDD : A Global Community - More than 6500 members from over 130 countries

Statistics as of October 2012

Page 41:

Together we can …

.. and we should !

http://www.osdd.net http://c2d.osdd.net


anshu.bhardwaj

Report of the CEWG of WHO: Recognised OSDD as an Open Innovation Model, 5 April 2012 | Geneva

How Open Source Drug Discovery Is Helping India Develop New Drugs, Apr 9, 2012

DNDi POLICY BRIEF recognised OSDD as part of Global Landscape for Neglected Diseases R&D, April 2012

Crowd Sourcing Innovation: CSIR portal for OSDD, 2011

Crowd-Sourcing Drug Discovery, 24 February 2012, Vol. 335 no. 6071 p. 909