molecular modelling for in silico drug discovery
TRANSCRIPT
![Page 1: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/1.jpg)
Molecular modelling for in silico drug discovery: modelling small molecules and proteins
Dr Lee [email protected]
![Page 2: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/2.jpg)
Lecture Aim
This lecture aims to provide a basic understanding of in silico protein and small-molecule engineering/design as part of the drug development process:
Introducing theory and approaches, drivers, databases and software, with a focus on safety and efficacy.
![Page 3: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/3.jpg)
This Lecture Covers
• Drivers for use of computational approaches
• Getting protein structures
• Simulation of molecular interactions
• Considering safety during design
• We will also highlight key software or data sources along the way
![Page 4: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/4.jpg)
Key Drivers for in silico
![Page 5: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/5.jpg)
Business
Target identification
Lead selection
Lead refinement
Pre-Clinical phases
Genomics, Proteomics/Metabolomics, Interaction Networks
Molecular modelling, Protein modelling, Chemoinformatics
Molecular modelling, Data modelling, Interaction Networks
Systems Biology, In vitro, In vivo
££
£
£
££
£
![Page 6: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/6.jpg)
Ethics Drivers
• Use of animals in research
• 3Rs – Refine, Reduce, Replace
• Relevance of animal data for human use
• Extrapolation across species
• Improvement of safety for subsequent trials
• Regulatory requirements and change
![Page 7: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/7.jpg)
Extrapolation of data across species
How relevant is animal physiology to human physiology?
Models not available for all diseases
Choice of species can be important
• 30% attrition due to no efficacy in man
• 10% attrition due to toxicity
For biologics, even more difficult to predict
![Page 8: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/8.jpg)
Part 1: Small Molecule Drugs
![Page 9: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/9.jpg)
Safety and Efficacy of Small Molecule Drugs
• Safety: safety issues primarily focus on the potential of the small molecule to have off-target effects, metabolite/breakdown product toxicity, or buildup/non clearance
• Efficacy: efficacy issues focus on bioavailability and good binding kinetics to the right target protein – including variations of that protein (SNPs/mutants)
![Page 10: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/10.jpg)
1st we need a source of molecules: Chemical Repositories
• Databases with safety information (GRS, CAS)
• Databases with structure and vendor/price – individual chemical supply companies - Zinc
• Databases with multiple information types – ChEMBLdb, PubChem, Kegg
![Page 11: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/11.jpg)
ChEMBLdb
“The ChEMBL database (ChEMBLdb) contains medicinal chemistry bioassay data, integrated from a wide variety of sources (the literature, deposited data sets, other bioassay databases). Subsets of ChEMBLdb, relating to particular target classes, or disease areas, are exported to smaller databases. These separate data sets, and the entire ChEMBLdb, are available either via ftp downloads, or via bespoke query interfaces, tailored to the requirements of the scientific communities with a specific interest in these research areas”
• Targets: 10,579
• Compound records: 1,638,394
• Distinct compounds: 1,411,786
• Activities: 12,843,338
• Publications: 57,156
(release 19)
![Page 12: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/12.jpg)
ChEMBL
www.ebi.ac.uk/chembl/
![Page 13: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/13.jpg)
Basic Requirements for modelling
1. Representation of atomic coordinates
2. Scoring
3. Searching
![Page 14: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/14.jpg)
Structure Representation
• How much information do you want to include?
• atoms present
• connections between atoms
• bond types
• stereochemical configuration
• charges
• isotopes
• 3D-coordinates for atoms
C8H9NO3
![Page 15: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/15.jpg)
Structure Representation
• connections between atoms
• bond types
• 3D-coordinates for atoms
[Figure: structural diagram of a molecule, with group labels OH, CH2, CHNH2, OH and O]
![Page 16: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/16.jpg)
Structure Representation - InChI
http://en.wikipedia.org/wiki/International_Chemical_Identifier
Morphine
InChI=1S/C17H19NO3/c1-18-7-6-17-10-3-5-13(20)16(17)21-15-12(19)4-2-9(14(15)17)8-11(10)18/h2-5,10-11,13,16,19-20H,6-8H2,1H3/t10-,11+,13-,16-,17-/m0/s1
The condensed, 27 character standard InChIKey is a hashed version of the full standard InChI (using the SHA-256 algorithm), designed to allow for easy web searches of chemical compounds.
BQJCRHHNABKAKU-KBQPJGBKSA-N
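As a small illustration of how much information an InChI string carries, the sketch below pulls the formula layer out of the morphine InChI shown above and sums atomic weights to get a molecular weight. The function names and the short atomic-weight table are assumptions made for this example; they are not part of any InChI toolkit, and the parser only handles simple formulae.

```python
import re

# Rounded standard atomic weights; only the elements needed for this example.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

MORPHINE_INCHI = (
    "InChI=1S/C17H19NO3/c1-18-7-6-17-10-3-5-13(20)16(17)21-15-12(19)"
    "4-2-9(14(15)17)8-11(10)18/h2-5,10-11,13,16,19-20H,6-8H2,1H3"
    "/t10-,11+,13-,16-,17-/m0/s1"
)


def formula_layer(inchi: str) -> str:
    """Return the chemical formula layer (the text after the version prefix)."""
    return inchi.split("/")[1]


def molecular_weight(formula: str) -> float:
    """Sum atomic weights for a simple formula such as C17H19NO3."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
    return total


formula = formula_layer(MORPHINE_INCHI)
weight = molecular_weight(formula)
print(formula, round(weight, 2))
```

The remaining layers (connections, hydrogens, stereochemistry) are left untouched here; a cheminformatics toolkit would be needed to interpret them or to generate the InChIKey.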
![Page 17: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/17.jpg)
Scoring (Energy Functions & Force Fields)
Energy can be broken into a sum of potential energy terms
• Estr – stretch
• Ebend – bend
• Etors – torsion
• EvdW – van der Waals
• Eel – electrostatic
• Epol – polarization
E = Ebonds + Eangles + Etorsions + Evdw + Eelectrostatic
[Figure: bonded terms with bond angle θ and torsion angle φ; charge diagrams labelled Repulsion and Attraction]
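The sum of energy terms can be sketched in a few lines. This is a minimal illustration that assumes the functional forms most commonly used in classical force fields (harmonic bonds and angles, a cosine torsion, Lennard-Jones for van der Waals, Coulomb for electrostatics). The slides do not specify the forms, so every function name, parameter and the Coulomb constant below is an assumption for illustration only.

```python
import math


def bond_energy(r, r0, k):
    """Harmonic stretch: zero at the equilibrium length r0."""
    return 0.5 * k * (r - r0) ** 2


def angle_energy(theta, theta0, k):
    """Harmonic bend around the equilibrium angle theta0 (radians)."""
    return 0.5 * k * (theta - theta0) ** 2


def torsion_energy(phi, barrier, periodicity, phase=0.0):
    """Periodic torsion term for dihedral angle phi (radians)."""
    return 0.5 * barrier * (1.0 + math.cos(periodicity * phi - phase))


def lennard_jones(r, epsilon, sigma):
    """van der Waals term: repulsive at short range, attractive further out."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q1, q2, ke=332.06):
    """Electrostatic term; ke is an approximate constant in kcal*A/(mol*e^2)."""
    return ke * q1 * q2 / r


def total_energy(bonds, angles, torsions, vdw_pairs, charge_pairs):
    """E = Ebonds + Eangles + Etorsions + Evdw + Eelectrostatic."""
    return (
        sum(bond_energy(*b) for b in bonds)
        + sum(angle_energy(*a) for a in angles)
        + sum(torsion_energy(*t) for t in torsions)
        + sum(lennard_jones(*p) for p in vdw_pairs)
        + sum(coulomb(*c) for c in charge_pairs)
    )
```

Each argument to `total_energy` is a list of parameter tuples for one term type. Real force fields differ mainly in how those parameters are derived and in extra terms such as polarization.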
![Page 18: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/18.jpg)
Searching
Molecular Mechanics (static) – minimisation
Molecular Dynamics (dynamic) – laws of motion
MD is a bit more complicated … need to know about:
• Classical mechanics
• Classical equations of motion (EOM)
• e.g. Newton’s equations of motion
If we know these equations we *could* try to search for ALL possible structures of proteins and how they fold, e.g. protein folding
![Page 19: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/19.jpg)
Energy Minimisation Theory
• Treat molecule as a set of balls (with mass) connected by rigid rods and springs
• Rods and springs have empirically determined force constants
• Allows one to treat atomic-scale motions in proteins as classical physics problems (OK approximation)
![Page 20: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/20.jpg)
Energy Minimisation
• Efficient way of “polishing and shining” your model
• Removes atomic overlaps and unnatural strains in the structure
• Stabilizes or reinforces strong hydrogen bonds, breaks weak ones
• Brings molecule to lowest energy
Local minimum vs global minimum
Many local minima; only ONE global minimum
Methods: Steepest descent, Conjugate gradient, others…
![Page 21: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/21.jpg)
Structures with Low Energy
[Figure: energy plotted against coordinates, showing a local minimum and the global minimum]
![Page 22: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/22.jpg)
Steepest Descent Minimisation
Makes small locally steep moves down gradient
Sufficient if starting point already close to optimal solution (e.g. refinement of experimental structure)
[Figure: path descending from High Energy to Low Energy]
Molecular Mechanics vs Dynamics
MM calculates just the minimum energy state.
MM ignores kinetic energy and considers only potential energy.
Molecules, especially proteins, are not static.
• Dynamics can be important to function
• Trajectories, not just the minimum energy state.
MD takes the same force model, but applies F=ma to calculate the velocities of all atoms (as well as their positions)
I have no idea where this image came from, but it is a very nice illustration of the comparison. If anyone knows where it is from please let me know!
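To show what "applying F=ma" means in practice, here is a minimal sketch of one common integration scheme, velocity Verlet, applied to a single particle on a spring. The slides do not name an integrator, so the choice of scheme, the harmonic force and all names and parameter values are assumptions for illustration.

```python
def force(x, k=1.0):
    """Restoring force of a harmonic spring with force constant k."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the particle."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


def velocity_verlet(x, v, mass=1.0, dt=0.01, n_steps=628, k=1.0):
    """Integrate Newton's equation of motion; returns the (x, v) trajectory."""
    trajectory = [(x, v)]
    a = force(x, k) / mass
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # update position
        a_new = force(x, k) / mass           # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # update velocity
        a = a_new
        trajectory.append((x, v))
    return trajectory


trajectory = velocity_verlet(x=1.0, v=0.0)
print(trajectory[-1])
```

Unlike minimisation, the output is a trajectory of positions and velocities over time, and the total energy stays close to its starting value rather than being driven to a minimum.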
![Page 24: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/24.jpg)
Why simulate the dynamics of (molecular) systems?
• Molecular systems are not static
• Molecules are in dynamic equilibrium
• Properties are averages over dynamic behaviour of molecules
• Molecular processes are not instantaneous
• Time course (kinetics) of events is important
• Time dependence essential to understand development and regulation
Why?
![Page 25: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/25.jpg)
What can we do with chemical models?
We can investigate structure and similarities of structure between molecules
We can map structural characteristics to properties (SARs)
We can study molecular interactions – particularly with proteins
![Page 26: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/26.jpg)
Interactions – Docking & Screening
• Computation to assess binding affinity
• Looks for conformational and electrostatic "fit" between proteins and other molecules
• Optimization: Does position and orientation of the two molecules minimise the total energy? (Computationally intensive)
• Docking small ligands to proteins is a way to find potential drugs. Industrially important!
![Page 27: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/27.jpg)
Virtual Screening
• Docking small ligands to proteins is a way to find potential drugs. Industrially important
• A small region of interest (pharmacophore) can be identified, reducing computation
• Empirical scoring functions are not universal
• Various search methods:
• Rigid - provides score for whole ligand (accurate)
• Flexible - breaks ligands into pieces and docks them individually
![Page 28: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/28.jpg)
So – we need protein (target) structures
http://www.rcsb.org/
![Page 29: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/29.jpg)
The PDB
The PDB was established in 1971 at Brookhaven National Laboratory and originally contained 7 structures. In 1998, the Research Collaboratory for Structural Bioinformatics (RCSB) became responsible for the management of the PDB.
Last year (2013), 9597 structures were deposited from scientists all over the world – this year (2014) so far, 8391
Now totals 105,839 (yesterday) structures
![Page 30: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/30.jpg)
Entries in database - cumulative and by year
Red = total
Blue = yearly
![Page 31: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/31.jpg)
What if there is no structure available?
Can we predict structures?
Tertiary structure is dependent on ‘folding’ of the protein.
Recognition, characterisation, and assignment of domains and folds is a major area of structural bioinformatics.
Predicting structure from sequence is one of the biggest challenges...
![Page 32: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/32.jpg)
Historical perspective? Basic secondary structure prediction
Basic methods of secondary structure prediction rely on statistical applications of ‘propensity’
The propensity/inclination/tendency of an amino acid to be in a particular structure based on observation of known datasets
![Page 33: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/33.jpg)
Propensity
$$P = \frac{n[I][s] / n[I]}{n[s] / n}$$
P = propensity
I = residue of interest
n[I] = number of residues [I] in the database
n = total number of residues in the database
n[I][s] = number of residues [I] in state of interest i.e. helices
n[s] = number of all residues in the database in the state of interest.
![Page 34: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/34.jpg)
Example
$$P[A] = \frac{124 / 1640}{1246 / 10136} \approx 0.61$$
So, the helical propensity for Alanine where:
• the number of alanines in the database is 1640,
• and the total number of residues in the database is 10136,
• and where the number of alanines found in helices is 124,
• and the total number of residues found in helices is 1246,
would be 0.61
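The propensity calculation is easy to check in code. The sketch below uses the counts from the alanine example; the function and argument names are assumptions for illustration.

```python
def propensity(n_res_in_state, n_res, n_state, n_total):
    """Frequency of a residue within a state, relative to its overall frequency.

    n_res_in_state: residues of interest found in the state (n[I][s])
    n_res:          residues of interest in the database (n[I])
    n_state:        all residues found in the state (n[s])
    n_total:        all residues in the database (n)
    """
    return (n_res_in_state / n_res) / (n_state / n_total)


p_ala_helix = propensity(124, 1640, 1246, 10136)
print(round(p_ala_helix, 3))
```

With these counts the ratio comes out at roughly 0.615, in line with the 0.61 quoted on the slide. A value below 1 means the residue appears in the state less often than residues do on average in that dataset; a value above 1 means it is over-represented.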
![Page 35: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/35.jpg)
Sliding windows
Propensity values are often assigned using sliding window methods
Sequence: A G T W Y K M C Q N P V
window 1: A G T W Y K M average applied to W
window 2: G T W Y K M C average applied to Y
window 3: T W Y K M C Q average applied to K
Theory that neighboring residues affect local structure
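The windows listed above can be generated with a short helper. This is a minimal sketch assuming a window of seven residues whose result is assigned to the central residue, as in the example; the function name is hypothetical.

```python
def sliding_windows(sequence, width=7):
    """Yield (window, central residue) pairs along the sequence."""
    half = width // 2
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        yield window, window[half]


for window, centre in sliding_windows("AGTWYKMCQNPV"):
    print(" ".join(window), "average applied to", centre)
```

Note that the first and last three residues never sit at the centre of a full window, so they receive no value unless the ends are treated specially.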
![Page 36: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/36.jpg)
GOR
Method by Garnier, Osguthorpe & Robson (1978).
Uses propensity values for Helix, Sheet, Coil, Turn for each residue from experimentally-determined structures
Analysis done for each state, most probable state is assigned
Sequence EVSAEEIKKHEEKWNKYYGVNAFNLPKELFSKVDEKDRQKYPYNTIGNVFVKGQTSATGV
GOR Sheet ---------------------------------------------SSSSSSSS---SSSS
GOR Helix --------HHHHHHHHH----HHHHHHHHHHHHHHHH-----------------------
GOR Coil --CCCC---------------------------------CCCC-----------------
GOR Turn ------------------TT----------------------------------------
![Page 37: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/37.jpg)
Hydrophobicity
Method by Kyte & Doolittle (1982)
Uses values representing hydrophobicity of residues rather than structural propensity
Applied with a sliding window method
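A hydropathy profile of this kind can be sketched as follows, using the published Kyte-Doolittle hydropathy values and a window average assigned to the central residue. The window width of 7 and the function name are assumptions for illustration; other widths are also used in practice.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, width=7):
    """Return (index of central residue, mean hydropathy) for each window."""
    half = width // 2
    profile = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mean = sum(KYTE_DOOLITTLE[res] for res in window) / width
        profile.append((start + half, mean))
    return profile


for index, score in hydropathy_profile("AGTWYKMCQNPV"):
    print(index, round(score, 2))
```

Plotting the score against the residue index gives the familiar hydropathy plot, where sustained positive stretches suggest buried or membrane-spanning regions.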
![Page 38: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/38.jpg)
Hydrophobicity
Often helices tend to be more hydrophobic
Internalised regions of a protein are more hydrophobic
Transmembrane domains are hydrophobic
![Page 39: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/39.jpg)
Example Kyte-Doolittle plot
![Page 40: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/40.jpg)
One more - Hydrophilicity
A method by Hopp & Woods (1980)
Experimentally derived values representing residue hydrophilicity
Attempts to determine surface/solvent accessibility - Antigenicity?
![Page 41: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/41.jpg)
Problems
Many of these tools are old - and rely on statistical values from small datasets
They generally cannot achieve better than 60% accuracy (depends on how you measure it!)
60% right is still 40% wrong!!!
!! However – they are still in common use !!
(eg. Emboss tools: garnier & antigenic)
![Page 42: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/42.jpg)
Many scales exist
http://web.expasy.org/protscale/
![Page 43: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/43.jpg)
Application example: Stability
There can be some benefit to using these scales in combination (similar to antigenic) – here using scales for order/disorder, aggregation potential and hydrophobicity to look at protein stability in the absence of structural information
![Page 44: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/44.jpg)
Levinthal’s paradox (1969)
100 residues = 99 peptide bonds
therefore 198 different phi and psi bond angles
3 stable conformations per bond angle = 3^198 possible conformations
At a nano/picosecond sample rate proteins would not find the correct structure for a long time (longer than the age of the Universe!)
Folding is Complex: Is a truly random approach possible?
Proteins fold on a milli/microsecond timescale – this is the paradox...
[Figure: peptide backbone with the phi and psi angles labelled]
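The size of the numbers in Levinthal's argument can be checked directly. The sketch below assumes one conformation sampled per picosecond and an age of the Universe of about 13.8 billion years; both figures, and the variable names, are assumptions for illustration.

```python
n_residues = 100
n_angles = 2 * (n_residues - 1)          # one phi and one psi per peptide bond
states_per_angle = 3
n_conformations = states_per_angle ** n_angles   # 3^198

samples_per_second = 1e12                # assumed: one conformation per picosecond
seconds_per_year = 3600 * 24 * 365.25
years_to_sample_all = n_conformations / samples_per_second / seconds_per_year

age_of_universe_years = 1.38e10          # assumed approximate value

print(f"conformations: {float(n_conformations):.2e}")
print(f"years to sample all: {years_to_sample_all:.2e}")
print(f"ratio to age of Universe: {years_to_sample_all / age_of_universe_years:.2e}")
```

Even at a picosecond per conformation, an exhaustive random search would take many orders of magnitude longer than the age of the Universe.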
![Page 45: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/45.jpg)
How does it work at all?
1. proteins do NOT fold from random conformations, which was an assumption of Levinthal's calculation
2. instead, they fold from denatured states that retain substantial secondary (2°), and possibly tertiary (3°), structure
Why are folding simulations so difficult?
• Simulations are computationally expensive
• Gross approximations in simulations
• Nature uses tricks such as
• Posttranslational processing
• Chaperones
• Environment change
![Page 46: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/46.jpg)
Complexity & Diversity – potential vs reality
If the average protein contains about 300 amino acids, then there could be a possible 20^300 different proteins
(Apparently) this is more than the atoms in the universe!
Yet a human (complex) has only 30,000 proteins
All proteins so far appear to be represented by between 1000 - 5000 fold types
![Page 47: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/47.jpg)
Two reasons for limited fold space
Convergent evolution
Certain folds are biophysically favourable and may have arisen in multiple cases
Divergent evolution
The number of folds seen is limited because they have evolved from a limited number of common ancestor proteins
Despite the evolutionary limitation of the number of existing folds (fold space) it is still complex enough to make classification and comprehension difficult
![Page 48: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/48.jpg)
Why is Folding Difficult to do?
It's amazing that not only do proteins self-assemble -- fold -- but they do so amazingly quickly: some as fast as a millionth of a second. While this time is very fast on a person's timescale, it's remarkably long for computers to simulate.
In fact, it takes about a day to simulate a nanosecond (1/1,000,000,000 of a second) of dynamics for a reasonable sized protein (e.g. on an Intel Core i7 at 2.66 GHz).
Unfortunately, proteins fold on the tens of microsecond timescale (10,000 nanoseconds). Thus, it would take 10,000 CPU days to simulate folding -- i.e. it would take 30 CPU years! That's a long time to wait for one result!
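The arithmetic behind the CPU-time estimate is shown below as a small sketch. It takes the one-day-per-nanosecond figure from the text as given; the function name is hypothetical.

```python
def cpu_years_to_simulate(folding_time_ns, cpu_days_per_ns=1.0):
    """Convert a simulated time span into CPU years at a given simulation rate."""
    cpu_days = folding_time_ns * cpu_days_per_ns
    return cpu_days / 365.25


# Tens of microseconds, i.e. 10,000 nanoseconds, at about one CPU day each.
print(round(cpu_years_to_simulate(10_000), 1))
```

This gives a little over 27 CPU years, which the text rounds to 30.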
![Page 49: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/49.jpg)
Folding @ Home folding.stanford.edu
![Page 50: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/50.jpg)
Similar Project:
ab initio protein tertiary structure prediction based on the approach that sequence-dependent local interactions limit or bias segments of the chain to form only distinct sets of local structures
and that non-local interactions select the lowest free-energy tertiary conformations compatible with the local biases.
different models are used to treat the local and non-local interactions.
Rather than attempting a physical model for local sequence-structure relationships, the approach turns to the protein database to look at the distribution of local structures adopted by short sequence segments (fewer than 10 residues in length) in known three-dimensional structures
http://boinc.bakerlab.org/rosetta/
Berkeley Open Infrastructure for Network Computing
![Page 51: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/51.jpg)
Infrastructure: BOINC
http://boinc.berkeley.edu/
![Page 52: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/52.jpg)
A: Left, crystal structure of the MarA transcription factor bound to DNA; right, our best submitted model in CASP3. Despite many incorrect details, the overall fold is predicted with sufficient accuracy to allow insights into the mode of DNA binding.
B: Left, the crystal structure of bacteriocin AS-48; middle, our best submitted model in CASP4; right, a structurally and functionally related protein (NK-lysin) identified using this model in a structure-based search of the Protein Data Bank (PDB). The structural and functional similarity is not recognizable using sequence comparison methods (the identity between the two sequences is only 5 percent).
C: Left, crystal structure of the second domain of MutS; middle, our best submitted model for this domain in CASP4; right, a structurally related protein (RuvC) with a related function recognized using the model in a structure-based search of the PDB. The similarity was not recognized using sequence comparison or fold recognition methods.
Some Rosetta@home results
![Page 53: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/53.jpg)
Robetta server http://robetta.bakerlab.org/
![Page 54: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/54.jpg)
Robetta results
This took about 3 weeks to complete
![Page 55: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/55.jpg)
Fold It http://fold-it/portal
![Page 56: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/56.jpg)
A compromise: Homology modelling
If there is no structure for your protein - perhaps there is one for a similar protein.
Sequence alignment tools can be used to compare this to your sequence with unknown structure
Homology searching and sequence alignment is now the first step to protein structure prediction
If homologous proteins are found with structures, the unknown sequence can be ‘overlaid’ and its structure inferred
![Page 57: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/57.jpg)
Homology Modeling
Based on two assumptions:
1. The structure of a protein is determined by its amino acid sequence alone
2. With evolution, the structure changes more slowly than the sequence - similar sequences may adopt the same structure
![Page 58: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/58.jpg)
Sequence alignment
TEX19 – human protein without a structure.
PDB 2AAM: Crystal structure of a putative glycosidase (tm1410) from Thermotoga maritima
![Page 59: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/59.jpg)
Structure inference/alignment
![Page 60: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/60.jpg)
ExPASy - SwissModel
SwissModel (swissmodel.expasy.org/)
![Page 61: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/61.jpg)
Phyre2
http://www.sbg.bio.ic.ac.uk/phyre2
![Page 62: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/62.jpg)
More annotation http://genome3d.eu/
![Page 63: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/63.jpg)
Using the Models – Docking/Screening
• Choose and prepare target protein
• Identify binding pocket
• Fit ligand to pocket
• Score
• (for screening – repeat!)
![Page 64: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/64.jpg)
Identify the Binding Pocket
• Could identify this by the location of an existing co-crystallised ligand
• Or use surface sphere clusters
• Or identify it by clustering of solvent molecules (normally water)
• Perhaps identify it by clustering of fragments (SurFlex dock protomol)
![Page 65: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/65.jpg)
Binding site based on existing ligand
• Most methods allow you to specify where the site is – perhaps by identifying key residues or based on an existing ligand
• Could use the ‘hole’ left by the ligand as a pocket, or use the ‘surface’ of the ligand as a protomol
![Page 66: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/66.jpg)
Surface Sphere generation
• Generate the surface of the target
– Connolly surface
• ‘Rolls’ a sphere the radius of water across the van der Waals surface of the target
• Each atom’s centre of van der Waals radius acts as a sitepoint for the generation of a sphere on the surface, whose centre lies along the surface normal at the sitepoint.
• Spheres are then clustered – each cluster is a potential pocket
![Page 67: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/67.jpg)
Identified pocket
![Page 68: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/68.jpg)
Prepare the ligand
• The ligand needs to be prepared too
• Drawn & minimised
• From a database - & minimised
• Extracted from another/the same binding site
• Hydrogens added etc
• Minimised/optimised – ready to dock
![Page 69: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/69.jpg)
Docking
• Rigid docking -> ligand is fixed conformationally
• Flexible docking -> ligand is conformationally flexible
• Posable -> ligand is rigid, but moved spatially
![Page 70: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/70.jpg)
Rigid Ligand docking
• Centres of spheres representing the binding pocket act as ‘Site Points’
• The atoms of the ligand are matched to the site points
• Once orientation made, possibly interaction minimised: receptor kept rigid and ligand flexible
![Page 71: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/71.jpg)
Alternatives
Flexible Docking Posable Docking
Rings treated as flexible
Other bonds treated as flexible/rotamers
Rings treated as rigid – ligand fragmented
Rigid docking, but ligands posed conformationally
• Rotated
• Twisted
• Flipped etc
And repetitively docked to find best fit
![Page 72: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/72.jpg)
Example Interaction – Avidin / Biotin
![Page 73: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/73.jpg)
Virtual Screening
• Docking – but repeated with many potential ligands
• Libraries can come from resources such as PubChem/ChEMBLdb – vendors – or other in-house sources
• From specialised databases holding structures suitable for docking
• It is important to have a diversified library especially for rigid docking !
![Page 74: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/74.jpg)
Considering safety & efficacy – “Drug-like”
Lipinski rule of 5 (or Pfizer rule)
‘Compounds which violate at least two of the following conditions have a very low chance of being orally bioavailable’
• MW ≤500 Da
• log P (lipophilicity) ≤5
• number of H bond donors ≤5
• number of H bond acceptors ≤10
Works well once you have descriptions of small molecules – can be search criteria in databases...
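Once the descriptors are available, the rule is simple to apply as a filter. The sketch below counts violations using the thresholds in their commonly quoted form (molecular weight above 500 Da, log P above 5, more than 5 donors, more than 10 acceptors). The class and function names are assumptions for illustration, the morphine log P is an approximate literature value, and the second compound is made up.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float   # Da
    log_p: float        # lipophilicity
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four conditions are violated."""
    return sum([
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors) -> bool:
    """Flag compounds with fewer than two violations as plausible oral candidates."""
    return lipinski_violations(d) < 2


# Morphine: weight from its formula; donor/acceptor counts from its two hydroxyls
# and its N and O atoms; log P is an approximate value.
morphine = Descriptors(mol_weight=285.34, log_p=0.9, h_donors=2, h_acceptors=4)
# Hypothetical large, greasy compound used only to show a failing case.
big_greasy = Descriptors(mol_weight=812.0, log_p=6.3, h_donors=3, h_acceptors=12)

print(passes_rule_of_five(morphine), passes_rule_of_five(big_greasy))
```

In a screening setting, the same check would be run over every compound in a library, typically with descriptors taken from the database record or computed by a cheminformatics toolkit.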
![Page 75: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/75.jpg)
ADME / ADME-Tox
• Lipinski rule is really the 1st step in ADME (absorption, distribution, metabolism, excretion) modelling
• Structure Activity Relationships (SARs) – similar molecules will behave in similar ways, i.e. have similar effects.
• Allows for knowledge-based comparative analysis – Tox databases
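The idea that "similar molecules behave in similar ways" needs a measure of similarity. One widely used choice is the Tanimoto coefficient computed on structural fingerprints; the slides do not name a specific measure, so this is an assumed example. The fingerprints below are made-up sets of feature identifiers, not output from a real toolkit.

```python
def tanimoto(features_a: set, features_b: set) -> float:
    """Shared features divided by all features present in either molecule."""
    union = features_a | features_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(features_a & features_b) / len(union)


# Hypothetical fingerprints: each integer stands for a structural feature.
query = {1, 4, 7, 9, 12}
known_toxic = {1, 4, 7, 9, 15}
unrelated = {2, 3, 20}

print(round(tanimoto(query, known_toxic), 2))
print(round(tanimoto(query, unrelated), 2))
```

A knowledge-based approach would rank compounds in a toxicity database by similarity to the query and use the annotations of the closest matches as a first indication of likely behaviour.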
![Page 76: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/76.jpg)
ChEMBL SARfari(s)
![Page 77: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/77.jpg)
Knowledge-based tox in silico
www.dixa-fp7.eu
![Page 78: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/78.jpg)
Toxicogenomics – Open TG-Gates
![Page 79: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/79.jpg)
HeCaToS http://www.hecatos.eu/
![Page 80: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/80.jpg)
Final Comments
![Page 81: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/81.jpg)
Remember the Key Drivers for in silico approaches
![Page 82: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/82.jpg)
Explore the following Software Tools
As well as resources mentioned in the slides!
Homology Modelling: Modeller, Phyre, SwissModel
Model Viewers: Pymol, Jmol, Rasmol
Molecular Simulation etc: Gromacs, Tinker, Amber, NAMD, Charmm
Docking/Screening: Surflex Dock, Dock, AutoDock, Vina
Graphical Tools/builders/interfaces: Chimera, Maestro, Ghemical, VMD, DeepView
Suites (companies): Tripos, Accelrys, OpenEye, ChemAxon, Schrödinger, MoE, Yasara
Some are free for academic use, but cost for commercial use
Take note and beware!
![Page 83: Molecular modelling for in silico drug discovery](https://reader033.vdocuments.net/reader033/viewer/2022052700/55a13dd31a28ab66188b45df/html5/thumbnails/83.jpg)
Workflow example – free vs paid
ChEMBL
PDB
Discovery Studio
Marvin Sketch
Chimera
Gromacs
Dock
Chimera
ligand
target
get structures
minimisation
dynamics
docking
evaluation
preparation
Commercial suite vs free tools
£££ $$$