A review of protein-small molecule docking methods[1]
TRANSCRIPT
Searching and Scoring
• An exhaustive search is not feasible: it would take about 2,000,000 years to search the entire space.
• The search covers 3 degrees of rotation and 3 degrees of translation, plus protein and ligand flexibilities.
• Either search the entire space in a systematic manner, or search part of the space in a partially random, partially criteria-guided manner.
Search Methods
• Molecular Dynamics
• Monte Carlo
• Genetic Algorithms
• Fragment-based
• Point Complementarity
• Distance Geometry
• Tabu Searches
• Systematic Searches
• Multiple Method Algorithms
AMBER
• Energy between covalently bonded atoms
• Energy due to the geometry of electron orbitals involved in covalent bonding
• Energy for twisting a bond due to bond order (e.g. double bonds) and neighboring bonds or lone pairs of electrons
• Non-bonded energy between all atom pairs, which can be decomposed into van der Waals and electrostatic energies
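The non-bonded term in the last bullet can be sketched as a sum over atom pairs of a 12-6 Lennard-Jones (van der Waals) part and a Coulomb (electrostatic) part. This is a minimal illustration, not AMBER's implementation: the function names, the single shared `epsilon` and `r_min`, and the default dielectric of 1.0 are assumptions made here, and a real force field uses per-atom-type parameters and skips pairs that are bonded or nearly bonded.

```python
import math

# Converts q_i * q_j / r (charges in e, distance in angstroms) to kcal/mol.
COULOMB_K = 332.0637


def pair_energy(r, epsilon, r_min, q_i, q_j, dielectric=1.0):
    """Return (van der Waals, electrostatic) energy for one atom pair at distance r."""
    ratio6 = (r_min / r) ** 6
    # 12-6 Lennard-Jones form: the minimum is -epsilon at r == r_min.
    vdw = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    elec = COULOMB_K * q_i * q_j / (dielectric * r)
    return vdw, elec


def nonbonded_energy(coords, charges, epsilon=0.1, r_min=3.5):
    """Sum the pair energies over all distinct atom pairs (hypothetical shared parameters)."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r = math.dist(coords[i], coords[j])
            vdw, elec = pair_energy(r, epsilon, r_min, charges[i], charges[j])
            total += vdw + elec
    return total


if __name__ == "__main__":
    # Two oppositely charged atoms placed at the Lennard-Jones minimum distance.
    print(nonbonded_energy([(0.0, 0.0, 0.0), (3.5, 0.0, 0.0)], [0.4, -0.4]))
```

Splitting the pair energy into two returned values mirrors the decomposition named in the bullet, so either part can be inspected separately.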
Molecular Dynamics
• Implemented in AMBER and CHARMM, using force fields.
• It has been used to dock flexible ligands to flexible receptors.
• On its own, it does not get out of a local minimum well.
• It can be coupled with a well-jumping technique.
Monte Carlo
• A simpler energy function can be used.
• Energy barriers are easier to step over.
• Random moves are made and then accepted or rejected based on Boltzmann probability.
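The accept/reject rule in the last bullet is usually written as the Metropolis criterion: a move that lowers the energy is always accepted, and a move that raises it by ΔE is accepted with probability exp(-ΔE / kT). The sketch below applies it to a one-dimensional toy energy; the function names, step size, temperature and the toy energy itself are assumptions for illustration, not taken from any docking program.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng=random):
    """Accept downhill moves always; uphill moves with probability exp(-delta_e / kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)


def monte_carlo(energy, x0, steps=5000, step_size=0.5, kT=1.0, seed=0):
    """Random-walk search over one coordinate, remembering the lowest energy seen."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        trial = x + rng.uniform(-step_size, step_size)  # random move
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e, kT, rng):
            x, e = trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


if __name__ == "__main__":
    # Toy double-well energy with a barrier between two minima (hypothetical).
    def double_well(x):
        return (x * x - 1.0) ** 2 + 0.3 * x

    print(monte_carlo(double_well, x0=1.0))
```

Because uphill moves are sometimes accepted, the walk can cross the barrier between the two wells, which is the behaviour the second bullet refers to.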
(Comp. Chem. 2 WS 04/05)
From "A comparison of heuristic search algorithms for molecular docking" by David R. Westhead, David E. Clark and Christopher W. Murray
MC AutoDock
• Docks flexible ligands to rigid receptors.
• Utilizes the AMBER force field.
• Reproduced all 6 tested complexes.
• Lowest-energy structures did not always correspond to the crystallographic conformation.
MC Prodock
• Docks flexible ligands to flexible receptors.
• After each random move, a local gradient-based minimization is performed.
• Utilizes the AMBER force field.
• MC Caflisch uses CHARMM and is otherwise similar to Prodock.
Genetic Algorithms
• These require an initial population, instead of a single initial state as in MD and MC.
• Degrees of freedom are encoded into genes, and the fitness function is based on a scoring function.
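The two bullets above can be sketched as follows: each individual is a list of genes (here six real numbers standing in for three translations and three rotations), and the population is ranked by a scoring function. The selection scheme, one-point crossover, Gaussian mutation and all parameter values are assumptions chosen for brevity; they are not the operators of GOLD, AutoDock or any other program named in this review.

```python
import random


def genetic_search(score, n_genes, bounds, pop_size=30, generations=60,
                   mutation_rate=0.1, seed=0):
    """Minimize `score` over gene vectors; requires n_genes >= 2."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Initial population: random gene vectors instead of a single start state.
    pop = [[rng.uniform(lo, hi) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score)                   # lower score means fitter
        parents = pop[: pop_size // 2]        # keep the better half as parents
        children = [pop[0][:]]                # elitism: carry the best over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)   # one-point crossover
            child = a[:cut] + b[cut:]
            for g in range(n_genes):
                if rng.random() < mutation_rate:
                    mutated = child[g] + rng.gauss(0.0, 0.3)
                    child[g] = min(hi, max(lo, mutated))  # stay inside bounds
            children.append(child)
        pop = children
    return min(pop, key=score)


if __name__ == "__main__":
    # Hypothetical scoring function: squared distance to a known "pose".
    target = [1.0, -2.0, 0.5, 0.0, 1.5, -1.0]

    def toy_score(genes):
        return sum((g - t) ** 2 for g, t in zip(genes, target))

    print(genetic_search(toy_score, n_genes=6, bounds=(-3.0, 3.0)))
```

In a real docking program the scoring function would be an interaction energy, and ligand torsion angles would be added as further genes.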
GOLD
• Full ligand flexibility and partial receptor flexibility.
• 71% success rate (72% on their web site), out of 100 complexes:
– 66 had RMSD of 2.0 angstroms or less
– 71 had RMSD of 3.0 angstroms or less
• http://gold.ccdc.cam.ac.uk/
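The RMSD figures quoted here and in the following slides compare a docked ligand pose with the crystallographic one. A minimal sketch of that calculation is given below, assuming both poses list the same atoms in the same order and are already in the receptor's coordinate frame, so no superposition is performed. The function names and the example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of 3-D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def count_within(rmsd_values, cutoff):
    """Number of complexes whose RMSD is at or below the cutoff (in angstroms)."""
    return sum(1 for value in rmsd_values if value <= cutoff)


if __name__ == "__main__":
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    docked = [(0.2, 0.1, 0.0), (1.6, 0.3, 0.1), (1.2, 1.4, 0.2)]
    print(rmsd(crystal, docked))
    print(count_within([0.8, 1.9, 2.4, 3.6], cutoff=2.0))
```

Counting the same set of results at two cutoffs is how one set of 100 complexes yields both a 2.0 angstrom figure and a 3.0 angstrom figure.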
AutoDock 3.0
• Genetic algorithm as a global optimizer, with energy minimization as a local search.
• For each new population created, a user-defined fraction undergoes a local search.
• http://www.scripps.edu/mb/olson/people/gmm/movies.html
• Tested on 7 complexes. All lowest energy structures were within 1.14 angstroms RMSD of the crystal structure.
Other GA
• DIVALI
– 3 out of 4 with max RMSD of 1.7 angstroms
• DARWIN
• Study on HIV by Gehlhaar
– 34 (out of 100 simulations) with max RMSD of 1.5 angstroms
Fragment-based Methods
• Dividing the ligand into fragments
• Docking the fragments
• Linking the fragments
FlexX
• Pick the base fragment (originally manual, but it has now been automated).
• Dock the base in multiple ways.
• Incrementally add to the base fragment until the whole ligand is docked.
• For 10 out of 19 complexes, the best-scoring solution reproduced the experimental result with a max RMSD of 1.04 angstroms.
• A recent extension of FlexX includes the placement of explicit waters into the binding site during docking. (Improvement was observed in some cases, such as HIV protease.)
Other FB
• DOCK 4.0
– 7 out of 10 reproduced the crystal structure with maximum RMSD of 1.3 angstroms
• LUDI
• ADAM
– 2 out of 2 with RMSD of less than 1 angstrom
• Hammerhead
– 4 out of 4 with RMSD of less than 1.7 angstroms
Point Complementarity
• Most fragment-based algorithms could also be in this category.
• The major distinction is that the ligand is not broken up here.
• The molecular surface is represented by a series of surface cubes. There is also another set of inner cubes representing the inside of the molecule.
• The docking is performed by maximizing matches between surface cubes minus the inner cube overlaps.
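The cube-matching score described above can be sketched with sets of integer grid cells: count the surface cubes that coincide, then subtract a penalty for inner cubes that overlap. The sketch searches translations only and assumes a unit overlap penalty; rotations, the cube size and every name here are simplifications introduced for illustration, not the scheme of any program listed in this review.

```python
from itertools import product


def complementarity_score(receptor_surface, receptor_inner,
                          ligand_surface, ligand_inner, shift,
                          overlap_penalty=1):
    """Surface-cube matches minus penalized inner-cube overlaps for one translation."""
    dx, dy, dz = shift

    def move(cells):
        return {(x + dx, y + dy, z + dz) for x, y, z in cells}

    matches = len(receptor_surface & move(ligand_surface))
    clashes = len(receptor_inner & move(ligand_inner))
    return matches - overlap_penalty * clashes


def best_shift(receptor_surface, receptor_inner, ligand_surface, ligand_inner,
               search_range=2):
    """Try every integer translation in a small cube and keep the best-scoring one."""
    offsets = range(-search_range, search_range + 1)
    return max(
        ((shift, complementarity_score(receptor_surface, receptor_inner,
                                       ligand_surface, ligand_inner, shift))
         for shift in product(offsets, repeat=3)),
        key=lambda item: item[1],
    )


if __name__ == "__main__":
    # Tiny hypothetical grids: two surface cubes each, inner cubes on opposite sides.
    rec_surface = {(0, 0, 0), (1, 0, 0)}
    rec_inner = {(0, -1, 0), (1, -1, 0)}
    lig_surface = {(0, 0, 0), (1, 0, 0)}
    lig_inner = {(0, 1, 0), (1, 1, 0)}
    print(best_shift(rec_surface, rec_inner, lig_surface, lig_inner))
```

Treating both molecules as fixed sets of cubes is what makes this a rigid-body method.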
Programs
• Soft Docking
• FTDOCK
• LIGIN
• SANDOCK
• FLOG
• Most of these methods were originally intended for protein-protein docking. They also only consider rigid bodies, since proteins are much larger.
• As a result, they did not perform very well.
Other Methods
• Tabu Searches
– PRO_LEADS
– Start with a random conformation.
– Generate about 100 new ones from the current one.
– Pick the best as the new current.
– Generate 100 again and repeat.
– If the new 100 do not contain a better one than the current, pick from old currents.
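The loop in the list above can be sketched as follows. This follows the slide's wording literally, including the fallback to an earlier "current" when no neighbor improves; it is a reading of the slide, not the PRO_LEADS algorithm itself, and it omits the tabu list that a full tabu search uses to forbid recently visited solutions. The function name, neighbor generation and parameters are assumptions.

```python
import random


def tabu_like_search(score, start, n_neighbors=100, iterations=50,
                     step=0.5, seed=0):
    """Minimize `score` by repeatedly sampling neighbors of the current solution."""
    rng = random.Random(seed)
    current = list(start)
    history = [current]          # the "old currents" from the slide
    best = current
    for _ in range(iterations):
        # Generate about 100 new candidates around the current solution.
        neighbors = [[x + rng.uniform(-step, step) for x in current]
                     for _ in range(n_neighbors)]
        candidate = min(neighbors, key=score)
        if score(candidate) < score(current):
            current = candidate              # best neighbor becomes the new current
            history.append(current)
        else:
            current = rng.choice(history)    # no improvement: restart from an old current
        if score(current) < score(best):
            best = current
    return best


if __name__ == "__main__":
    # Hypothetical two-variable scoring function with its minimum at the origin.
    def toy_score(v):
        return sum(x * x for x in v)

    print(tabu_like_search(toy_score, start=[2.0, -2.0]))
```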
Other Methods
• Tabu Searches
– Flexible ligand docking was originally performed with PRO_LEADS on 50 complexes.
– 86% had RMSD within 1.5 angstroms of the crystal structures.
Other Methods
• Systematic searches
– EUDOC: rigid/rigid (ligand/receptor)
– Uses AMBER
• Multiple method algorithms
– Tabu/MC
– FlexX/CHARMM
– GA/Tabu
Comparisons
Method  Algorithm     # of complexes  # of matched complexes  % of matched complexes  max RMSD in angstroms
TS      PRO_LEADS     50              43                      86.00%                  1.50
FB      Hammerhead    4               4                       100.00%                 1.70
FB      ADAM          2               2                       100.00%                 1.00
FB      DOCK 4.0      10              7                       70.00%                  1.30
FB      FlexX         19              10                      52.63%                  1.04
GA      Gehlhaar      100             34                      34.00%                  1.50
GA      DIVALI        4               3                       75.00%                  1.70
GA      AutoDock 3.0  7               7                       100.00%                 1.14
GA      GOLD          100             71                      71.00%                  3.00
GA      GOLD          100             66                      66.00%                  2.00
MC      AutoDock      6               6                       100.00%                 NA
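The percentage column of the comparison table is the number of matched complexes divided by the number of complexes tested. The short sketch below recomputes it from the counts quoted in this review; the variable and function names are chosen here for illustration.

```python
# (algorithm, complexes tested, complexes matched), counts as quoted in this review.
RESULTS = [
    ("PRO_LEADS", 50, 43),
    ("Hammerhead", 4, 4),
    ("ADAM", 2, 2),
    ("DOCK 4.0", 10, 7),
    ("FlexX", 19, 10),
    ("Gehlhaar", 100, 34),
    ("DIVALI", 4, 3),
    ("AutoDock 3.0", 7, 7),
    ("GOLD (3.0 A)", 100, 71),
    ("GOLD (2.0 A)", 100, 66),
    ("AutoDock (MC)", 6, 6),
]


def percent_matched(matched, tested):
    """Share of tested complexes that were reproduced, as a percentage."""
    return round(100.0 * matched / tested, 2)


if __name__ == "__main__":
    for name, tested, matched in RESULTS:
        rate = percent_matched(matched, tested)
        print(f"{name:14s} {matched:3d}/{tested:<3d} {rate:6.2f}%")
```

The test sets range from 2 to 100 complexes and use different RMSD cutoffs, so the percentages are not directly comparable across programs.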
Comparisons
• MD (Molecular Dynamics)
• MC (Monte Carlo)
• GA (Genetic Algorithm)
• GA best on small space searches
• MD best on large space searches
• These results are contradictory
• MC did reasonably well on both
More Comparisons
• MC, GA, EP, Tabu
– GA had the lowest median energies, but Tabu located the global minimum more reliably.
• DOCK, FlexX, GOLD
– GOLD had the best RMSD solutions and the best rankings of ligands.