drug design / drug discovery jerome baudry assistant professor bcmb ut/ornl center for molecular...

Drug Design / drug discovery

Jerome Baudry, Assistant Professor, BCMB

UT/ORNL Center for Molecular Biophysics

2 previous incarnations:

Research faculty at UIUC
Research scientist at Transtech Pharma, Inc.

Drug Design / drug discovery

What’s a drug? A substance that treats or cures a disease. A small molecule that interacts with a target (often a protein involved in the disease process) as an activator or inhibitor.

Drug discovery: the process of finding such a small molecule, using a combination of approaches

Drug discovery or drug design? In principle: “Design” is more rational and targeted, and “discovery” is more serendipitous. But design and discovery share a lot and are ~ synonymous in a pharmaceutical context.

Hopkins, Groom, Nat Rev Drug Discov. 2002 1(9):727-30. 5% of human genome is “druggable”

Gigantic economic importance: 10 years & $200 to $1,900 million to develop a drug

25 new molecules / year

Intense scientific activity: very interdisciplinary approach

Drug discovery market: > $340 billion

(in millions US$)

Company                Revenue      R&D   Income
Johnson & Johnson       53,324    7,125   11,053
Pfizer                  48,371    7,599   19,337
GlaxoSmithKline         42,813    6,373   10,135
Novartis                37,020    5,349    7,202
Sanofi-Aventis          35,645    5,565    5,033
Hoffmann–La Roche       33,547    5,258    7,318
AstraZeneca             26,475    3,902    6,063
Merck & Co.             22,636    4,783    4,434
Abbott Laboratories     22,476    2,255    1,717
Wyeth                   20,351    3,109    4,197

http://en.wikipedia.org/wiki/List_of_pharmaceutical_companies

http://en.wikipedia.org/wiki/Johnson_%26_Johnson

http://en.wikipedia.org/wiki/Pfizer

http://en.wikipedia.org/wiki/GlaxoSmithKline

http://en.wikipedia.org/wiki/Novartis

http://en.wikipedia.org/wiki/Sanofi-Aventis

http://en.wikipedia.org/wiki/Hoffmann%E2%80%93La_Roche

Chemistry: synthesis

Discovery and design (hit/lead/optimisation)

Biology: assay (binding/activity; in vitro / in vivo)

Target identification

The drug discovery and design workflow:

Drug development: pharmacology / testing

The long and winding road to drug discovery

Computational chemistry / molecular modeling: useful across the pipeline, but very different techniques.

Aim for success, but if not: fail early, fail cheap.

Structure-based: know receptor, don’t know ligands

Two pathways to drug discovery / drug design

What will be happy in there?

Ligand-based: don’t know receptor, known ligands

Protein/ligand interactions: structure / biophysics / docking

Statistical analysis of what group(s) are important for biological activity

Structure modeling (homology / experimental X-ray / NMR / neutron): get a structure

High-throughput docking/screening: get a “hit” (anything at all)

Structure-based approaches: use knowledge of structure to find something that 1) binds, and 2) has the desired biological activity

Focused library docking, fragment-based growth

Simulations of ‘individual’ molecules

Structure-based library screening

What do we need:

1) Compound libraries
2) Protein target
3) Binding site in the protein
4) Docking: generate different (many) possible conformations of the compounds in the binding site
5) Scoring: evaluate the strength of the protein/ligand interactions (score)
6) Select preferred ligands to propose a list of prioritized compounds for experimental screening

Best case scenario, a high-quality experimental structure exists:
PDB: http://www.rcsb.org/pdb/
- experimental collection of (49,295) structures, ~18,000 non-redundant sequences
- X-ray & NMR
- nucleic acids, proteins, carbohydrates

Structure-based approaches: Structure modeling

~50,000 protein structures in the PDB (~18,000 non-redundant sequences): is that a lot?

That’s ~1% of the 5.5 million protein sequences in swissprot (http://www.ebi.ac.uk/swissprot/sptr_stats/index.html)

and < ~0.00007% of Earth’s proteins (5E6 organisms, 5K genes/genome, low-end estimate).
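The two percentages quoted here can be checked with a few lines of arithmetic. The sketch below only uses the figures given on these slides (49,295 PDB entries, ~18,000 non-redundant sequences, 5.5 million sequences, 5E6 organisms with 5K genes each); the variable names are illustrative and the inputs are rough estimates.

```python
# Figures quoted on the slides (all approximate).
pdb_structures = 49_295          # experimental structures in the PDB
pdb_nonredundant = 18_000        # non-redundant sequences among them
sequence_db_entries = 5_500_000  # protein sequences in the sequence database
organisms = 5e6                  # low-end estimate of organisms on Earth
genes_per_genome = 5e3           # low-end estimate of genes per genome


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


coverage_of_sequences = percent(pdb_structures, sequence_db_entries)
earth_proteins = organisms * genes_per_genome
coverage_of_earth = percent(pdb_nonredundant, earth_proteins)

print(f"known sequences covered:  {coverage_of_sequences:.2f}%")
print(f"Earth's proteins covered: {coverage_of_earth:.5f}%")
```

Under these inputs the first figure comes out just under 1% and the second near 0.00007%, which matches the orders of magnitude quoted on the slide.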

Structure-based drug discovery = “Post genomics challenge”: structural biology, functional genomics, chemical biology…

Protein sequence (…AQRTEVYTYRRS…) → protein structure (homology, ab-initio folding…)

Must do for a new pharmaceutical target



If no experimental structure is available – work on that, and in the meantime: Homology modeling: use the structure of close (sequence-wise) proteins to build, by analogy, a model of the new protein.

[Figure: example chemical scaffold with variable groups R1 to R4]

Databases of compounds:
- vendors
- literature
- corporate/laboratory
- virtual compounds
A priori anything, but we can be smarter than that.

Millions of compound structures are available from public databases: http://blaster.docking.org/zinc/

Major NIH effort to fund & develop libraries: http://nihroadmap.nih.gov/molecularlibraries/

From more exploratory to more focused: a library designed against the protein target, based on hits from previous database screening.

Structure-based approaches: Compound selection

When the site is not known: eraser/flooding techniques (grid points marked outside, inside, or deleted) define the binding site (3D).

Or… make your life easier and build the site around a co-crystallized ligand, if available…

Locate cavities in a protein
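The flooding idea can be illustrated on a toy grid: flood the empty space from the outside, erase whatever the flood reaches, and keep the empty cells that remain. The sketch below is a deliberately simplified 2D version and rests on several assumptions: the grid encoding (`#` for protein, `.` for empty), the function name `enclosed_cells` and the toy shape are hypothetical. Real eraser algorithms work in 3D, typically with a probe of finite size so that solvent-exposed pockets are kept as well, whereas this version only finds fully enclosed cavities.

```python
from collections import deque


def enclosed_cells(grid):
    """Return empty cells that cannot be reached from the grid border.

    `grid` is a list of equal-length strings: '#' is protein, '.' is empty.
    """
    rows, cols = len(grid), len(grid[0])
    outside = set()
    queue = deque()
    # Seed the flood with every empty cell on the border.
    for r in range(rows):
        for c in range(cols):
            on_border = r in (0, rows - 1) or c in (0, cols - 1)
            if on_border and grid[r][c] == ".":
                outside.add((r, c))
                queue.append((r, c))
    # Breadth-first flood through empty neighbours.
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == "." and (nr, nc) not in outside):
                outside.add((nr, nc))
                queue.append((nr, nc))
    # Whatever is empty and was not erased by the flood is "inside".
    return sorted((r, c) for r in range(rows) for c in range(cols)
                  if grid[r][c] == "." and (r, c) not in outside)


toy_protein = [
    ".......",
    ".#####.",
    ".#..##.",
    ".#####.",
    ".......",
]
print(enclosed_cells(toy_protein))
```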

Structure-based approaches: Binding site

HIGH-THROUGHPUT OR LOW-THROUGHPUT? High-throughput: fast (initial screen). Low-throughput: accurate (on the best compounds from the initial screen).

Choices based on the desired throughput: from 10 seconds to 10 minutes / compound

650,000-compound library, on 10 processors: from 3 days to 6 months

Most time-consuming part (by far)
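A back-of-the-envelope estimate shows why the per-compound time dominates the planning. The sketch below assumes perfect parallel scaling and a uniform time per compound, neither of which holds exactly in practice, so it is an order-of-magnitude guide (days at fast settings, many months at accurate settings) and will not reproduce the quoted range exactly. The function name `screening_days` is hypothetical.

```python
def screening_days(n_compounds, seconds_per_compound, n_processors):
    """Wall-clock days to dock a library, assuming perfect parallel scaling."""
    total_seconds = n_compounds * seconds_per_compound / n_processors
    return total_seconds / 86_400  # seconds per day


library_size = 650_000
processors = 10

fast = screening_days(library_size, 10, processors)            # 10 s / compound
accurate = screening_days(library_size, 10 * 60, processors)   # 10 min / compound

print(f"fast docking:     {fast:.1f} days")
print(f"accurate docking: {accurate:.0f} days")
```

The 60-fold gap between the two settings is the reason for docking the whole library quickly first and redocking only the best compounds accurately.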


Structure-based approaches: Docking

Scoring functions. Quantify the energy of protein/ligand interactions such as:
- hydrogen bonds
- electrostatics
- van der Waals
- hydrophobic
- etc…

Several scoring functions exist, more/less specialized, fast etc…

[Figure: PROTEIN / LIGAND interactions]

Structure-based approaches: Scoring

scoring functions:

Force-field based (CHARMM, AMBER, etc.). MMFF is a very popular one because of its “modular parametrisation”: parameters are easy to derive from functional groups, so it is well adapted to organic molecules.

Physically ‘accurate’ but slow, parametrisation issues.

Empirical – count the number of interactions and assign a score based on the # of occurrences. E.g.:
- H-bonds, ionic interactions (easy because very directional and well quantified)
- Hydrophobic interactions (more difficult to assess and quantify)
- Number of rotatable bonds frozen (linked to the entropic cost of binding, quite difficult to estimate)
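An empirical score of this kind can be sketched as a weighted sum of interaction counts. Everything specific below is an assumption made for illustration: the `PoseFeatures` layout, the weight values and the sign convention (lower is better) are hypothetical, whereas real empirical functions fit their weights to measured binding data.

```python
from dataclasses import dataclass


@dataclass
class PoseFeatures:
    """Interaction counts for one docked pose (hypothetical layout)."""
    h_bonds: int
    ionic: int
    hydrophobic_contacts: int
    frozen_rotatable_bonds: int


# Illustrative weights only; real functions fit these to binding data.
WEIGHTS = {
    "h_bonds": -1.0,
    "ionic": -1.5,
    "hydrophobic_contacts": -0.2,
    "frozen_rotatable_bonds": 0.3,  # entropic penalty
}


def empirical_score(pose):
    """Weighted sum of interaction counts; lower means better here."""
    return sum(weight * getattr(pose, name) for name, weight in WEIGHTS.items())


pose_a = PoseFeatures(h_bonds=3, ionic=1, hydrophobic_contacts=10,
                      frozen_rotatable_bonds=4)
pose_b = PoseFeatures(h_bonds=1, ionic=0, hydrophobic_contacts=4,
                      frozen_rotatable_bonds=2)

print(empirical_score(pose_a), empirical_score(pose_b))
```

The frozen rotatable bonds enter with a positive weight, so a flexible ligand pays a penalty even when it makes many favorable contacts.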

Knowledge-based – observe known protein/ligand structures, and favor interactions and geometries that are seen often. Idea: directly link to free energy because “real life” distribution (potential of mean force).

But: based on small # of entries.
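The potential-of-mean-force idea is commonly written as an inverse Boltzmann relation: a contact that is seen more often than expected gets a favorable (negative) energy. The sketch below is one simple way to express that, under stated assumptions: the histogram counts are invented, the uniform reference distribution is a simplification, and bins without statistics are returned as `None`, which is where the small number of entries becomes a practical problem.

```python
import math

KT = 0.593  # kcal/mol at roughly 298 K


def pmf_from_counts(observed, reference):
    """Inverse-Boltzmann energies per distance bin: -kT * ln(p_obs / p_ref).

    `observed` and `reference` are lists of counts over the same bins.
    Bins with a zero count are returned as None (no statistics).
    """
    total_obs, total_ref = sum(observed), sum(reference)
    energies = []
    for n_obs, n_ref in zip(observed, reference):
        if n_obs == 0 or n_ref == 0:
            energies.append(None)
            continue
        p_obs = n_obs / total_obs
        p_ref = n_ref / total_ref
        energies.append(-KT * math.log(p_obs / p_ref))
    return energies


# Toy histogram for one atom-pair type; counts are invented.
observed_counts = [0, 40, 30, 30]
reference_counts = [25, 25, 25, 25]

energies = pmf_from_counts(observed_counts, reference_counts)
print(energies)
```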

Intense competition “my scoring function is better than yours”

Future: force-field based / even QM-based. Different approaches depending on size.


Often: consensus scoring: choose the few molecules that are ranked consistently well by many scoring functions

1,000,000 molecules, 30 actives. 1000 selected, 5 actives
Enrichment factor = (5/1000) / (30/1,000,000) ≈ 167: HUGE SUCCESS

1,000,000 molecules, 30 actives. 1000 selected, 3 actives
Enrichment factor = (3/1000) / (30/1,000,000) = 100: HUGE SUCCESS
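The enrichment factor is the hit rate among the selected compounds divided by the hit rate in the whole library. A minimal sketch, with a hypothetical function name and the two cases from this slide as inputs:

```python
def enrichment_factor(actives_selected, n_selected, actives_total, n_total):
    """Hit rate in the selected subset divided by hit rate in the library."""
    hit_rate_selected = actives_selected / n_selected
    hit_rate_library = actives_total / n_total
    return hit_rate_selected / hit_rate_library


ef_five = enrichment_factor(5, 1000, 30, 1_000_000)
ef_three = enrichment_factor(3, 1000, 30, 1_000_000)

print(round(ef_five), round(ef_three))
```

A value of 1 would mean the selection does no better than picking compounds at random.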



COMPUTATIONAL DOCKING: GENERATE TESTABLE IDEAS

Chemistry: synthesis

Discovery and design (hit/lead/optimisation)

Biology: assay (binding/activity; in vitro / in vivo)

Possible to start next round of iteration (or do ‘traditional’ modeling). Redock with improved accuracy (e.g. QM/MM)

Reproduce known crystal structure: HIV protease and inhibitor

Examples (low-throughput): works great … in most publications

[Figure panels: crystal structure; first round of docking (shape only); final result (after rigid-body minimizations: energetics taken into account). Ligand-based site vs. flood-based site.]

Venkatachalam, et al.; J. Mol. Graph. Model. 2003, 289-307

But also… fails miserably (rarely in publications!)

[Figure panels: crystal structure; final results (rigid-body minimizations). Illustrates issues with the binding site’s shape (there are workarounds).]

Examples (low-throughput)

Venkatachalam, et al.; J. Mol. Graph. Model. 2003, 289-307

Ke et al, Archives of Biochemistry and Biophysics 436 (2005) 110–120

Example II: discovery of ligand/function for a new P450

Development of a database of bio and agrochemical compounds of relevance for P450 (currently ~ 14,000 structures). In-house compounds, KEGG database: (http://www.genome.jp/kegg/ligand.html), Compendium of Pesticide Common Names: (http://www.alanwood.net/pesticides/index.html).

Development of CYP120A1 model from CYP107A template (23.6% identity)
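The identity figure quoted for a template is computed from a sequence alignment. The sketch below shows one common definition (identical residues over aligned, gap-free columns); other definitions divide by the alignment length or by the shorter sequence, so reported values can differ. The function name and the toy alignment are hypothetical and are not the CYP120A1/CYP107A alignment.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical positions over aligned columns without gaps.

    Both inputs are pre-aligned strings of equal length, with '-' for gaps.
    Other definitions divide by alignment or sequence length instead.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment, invented for illustration.
identity = percent_identity("AQRTEV-YTYRRS", "AQKTEVWYSY-RS")
print(f"{identity:.1f}% identity")
```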

HT-docking (LigandFit): identified 99 compounds consistently predicted to be good binders. Confirmed: retinoic acid

~14,000 structures

Ke et al., Arch. Biochem. Biophys. 2005

High-throughput docking: get a “hit” (anything at all)

CONCLUSIONS

In-silico combinatorial library design & structure-based screening: a fast, efficient and inexpensive tool to:
- discover new possible ligands against a macromolecular target
- test library design ideas
- identify most promising scaffolds and R groups prior to synthesis

Baudry, J.; Hergenrother, P. J. "Structure-based Design and In-Silico Virtual Screening of Combinatorial Libraries. A Combined Chemical/Computational Laboratory Assignment" J. Chem. Ed. 2005, 82, 890-894. http://www.scs.uiuc.edu/~phgroup/pdfs/2005PJHchemed.pdf

HT-DOCKING SUCCESS IF:

i) FIND A FEW MOLECULES OF INTEREST
ii) MUCH QUICKER AND CHEAPER THAN “real” screening

Comparison model / crystal structure

residues within 4 Å of heme

Green/blue: model, red/orange: crystal

Residues around the ligand’s β-ionone ring are very close in both structures

(Phe182 & Trp76: same pharmacophore)

Green/blue: model, red/orange: crystal

Comparison model / crystal structure

De novo design: fragment-based “inside-out” approach

Put functional groups in binding site (docking or manually, or combination)

Link these groups (docking or manual, or combination): *must* be able to synthesize it – no molecular monsters

Caflisch, Miranker, Karplus, J. Med. Chem. 36, 2142-2167 (1993). Eisen, Wiley, Karplus, Hubbard, Proteins: Structure, Function, and Genetics 19, 199-221 (1994).

i) dock functional groups
ii) keep low-energy groups; link with scaffolds

iii) correct binding site, but ≠ too; “lead hopping”
