A review of protein-small molecule docking methods Taylor, Jewsbury and Essex By Pedro Alves


Post on 16-Mar-2022



Overview

• Searching Methods
• Scoring Methods
• Comparisons
• Conclusion

Searching and Scoring

• An exhaustive search is not feasible: it would take about 2,000,000 years to search the entire space.
• The search covers three rotational and three translational degrees of freedom, plus protein and ligand flexibility.
• Either try to search the entire space in a systematic manner, or search part of the space in a partially random, partially criteria-guided manner.
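As a rough illustration of the six rigid-body degrees of freedom mentioned above, the sketch below places a ligand with three translations and three rotation angles, and counts the poses a systematic grid would have to visit. The function names (`rotation_matrix`, `apply_pose`, `count_rigid_poses`) and the x-then-y-then-z rotation convention are assumptions made for this example, not taken from any docking program, and flexibility is ignored.

```python
import math

def rotation_matrix(rx, ry, rz):
    """Rotation about x, then y, then z (angles in radians): R = Rz * Ry * Rx."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def apply_pose(coords, pose):
    """Rotate, then translate, a list of (x, y, z) ligand atom positions.

    pose = (tx, ty, tz, rx, ry, rz): the six rigid-body degrees of freedom.
    """
    tx, ty, tz, rx, ry, rz = pose
    rot = rotation_matrix(rx, ry, rz)
    placed = []
    for x, y, z in coords:
        placed.append((
            rot[0][0] * x + rot[0][1] * y + rot[0][2] * z + tx,
            rot[1][0] * x + rot[1][1] * y + rot[1][2] * z + ty,
            rot[2][0] * x + rot[2][1] * y + rot[2][2] * z + tz,
        ))
    return placed

def count_rigid_poses(translation_steps, rotation_steps):
    """Number of grid poses when each of the 3 + 3 axes is sampled uniformly."""
    return translation_steps ** 3 * rotation_steps ** 3
```

Even a coarse grid of 10 steps per axis already gives `count_rigid_poses(10, 10)` = 1,000,000 rigid poses before any torsional flexibility is added, which is why the partially random methods below are attractive.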

Search Methods

• Molecular Dynamics
• Monte Carlo
• Genetic Algorithms
• Fragment-based
• Point Complementarity
• Distance Geometry
• Tabu Searches
• Systematic Searches
• Multiple Method Algorithms

AMBER

• energy between covalently bonded atoms
• energy due to the geometry of electron orbitals involved in covalent bonding
• energy for twisting a bond due to bond order (e.g. double bonds) and neighboring bonds or lone pairs of electrons
• non-bonded energy between all atom pairs, which can be decomposed into van der Waals and electrostatic energies
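The four terms above correspond, in order, to the bond, angle, dihedral and non-bonded sums of the commonly cited AMBER-style functional form. The symbols below are the conventional ones and do not appear in the slides; exact constants and scaling factors depend on the parameter set.

```latex
E_{\text{total}} =
    \sum_{\text{bonds}} k_b \,(r - r_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} \frac{V_n}{2}\left[ 1 + \cos(n\phi - \gamma) \right]
  + \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}
  + \frac{q_i q_j}{\varepsilon \, r_{ij}} \right]
```

Here the $A_{ij}/r_{ij}^{12} - B_{ij}/r_{ij}^{6}$ part is the van der Waals (Lennard-Jones) energy and the $q_i q_j / (\varepsilon r_{ij})$ part is the electrostatic energy.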

Molecular Dynamics

• MD programs such as AMBER and CHARMM use force fields.
• It has been used to dock flexible ligands to flexible receptors.
• It tends to get trapped in local minima.
• It can be coupled with a well-jumping technique to escape them.

Monte Carlo

• A simpler energy function can be used.
• Energy barriers are stepped over more easily.
• Random moves are made and then accepted or rejected based on a Boltzmann probability.
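The accept/reject rule in the last bullet is usually implemented with the Metropolis criterion. The following is a minimal sketch on a generic vector of coordinates, assuming energies in kcal/mol; `metropolis_accept`, `monte_carlo_minimize` and all parameters are hypothetical names for this example, not the API of any docking package.

```python
import math
import random

BOLTZMANN_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def metropolis_accept(delta_e, temperature, rng=random):
    """Always accept downhill moves; accept uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / (BOLTZMANN_KCAL * temperature))

def monte_carlo_minimize(energy, start, step_size, n_steps, temperature, seed=0):
    """Random-walk search that remembers the lowest-energy state it visited."""
    rng = random.Random(seed)
    current = list(start)
    current_e = energy(current)
    best, best_e = list(current), current_e
    for _ in range(n_steps):
        # Random move: perturb every degree of freedom by a small amount.
        trial = [x + rng.uniform(-step_size, step_size) for x in current]
        trial_e = energy(trial)
        if metropolis_accept(trial_e - current_e, temperature, rng):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e
```

Because uphill moves are sometimes accepted, the walk can climb out of a local minimum, which is the advantage over plain MD noted above; a higher `temperature` makes such escapes more likely.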

(Comp. Chem. 2 WS 04/05)

From A comparison of heuristic search algorithms for molecular docking By David R. Westhead, David E. Clark and Christopher W. Murray

MC AutoDock

• Flexible ligands to rigid receptors.
• Utilizes the AMBER force field.
• Reproduced all 6 tested complexes.
• Lowest energy structures did not always correspond to the crystallographic conformation.

MC Prodock

• Flexible ligands to flexible receptors.
• After each random move, a local gradient-based minimization is performed.
• Utilizes the AMBER force field.
• The MC method of Caflisch is similar to Prodock but uses CHARMM.

Other MC’s

• ICM
• MCDOCK
• DockVision
• QXP
• Affinity
• Glide

Genetic Algorithms

• These require an initial population, instead of the single initial state used by MD and MC.

• Degrees of freedom are encoded into genes, and the fitness function is based on a scoring function.
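The encoding described above can be sketched as follows: each individual is a list of real-valued genes (for docking these would be the translations, rotations and torsion angles), and the fitness is a score where lower is better. This is a generic, minimal genetic algorithm under those assumptions; `genetic_search` and its parameters are hypothetical and do not reproduce the operators of GOLD, AutoDock or any other program.

```python
import random

def genetic_search(score, n_genes, bounds, pop_size=30, generations=50,
                   mutation_rate=0.1, seed=0):
    """Minimize `score` over gene vectors; returns the best individual found."""
    rng = random.Random(seed)
    low, high = bounds
    # Initial population: random values for every encoded degree of freedom.
    population = [[rng.uniform(low, high) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score)   # best (lowest score) first
        parents = ranked[:pop_size // 2]         # truncation selection
        children = [list(ranked[0])]             # elitism: keep the current best
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # One-point crossover between the two parents.
            cut = rng.randint(1, n_genes - 1) if n_genes > 1 else 0
            child = mother[:cut] + father[cut:]
            # Gaussian mutation, clamped to the allowed range.
            child = [
                min(high, max(low, g + rng.gauss(0.0, 0.1 * (high - low))))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        population = children
    return min(population, key=score)
```

Keeping the best individual unchanged in each generation (elitism) means the best score never gets worse from one generation to the next; it is one common design choice among many.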

GOLD

• Full ligand flexibility and partial receptor flexibility.
• Tested on 100 complexes, with a 71% success rate (72% on their web site).
• 66 had an RMSD of 2.0 angstroms or less.
• 71 had an RMSD of 3.0 angstroms or less.
• http://gold.ccdc.cam.ac.uk/
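The success rates quoted here and on the following slides all rest on the root-mean-square deviation (RMSD) between the docked ligand and its crystallographic position. A minimal sketch of that measure and of a cutoff-based success rate is given below; it assumes both coordinate lists are in the same receptor frame with atoms in the same order, and the names `rmsd` and `success_rate` are chosen for this example only.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) atom positions."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))

def success_rate(rmsd_values, cutoff):
    """Fraction of complexes whose docked pose is within `cutoff` of the crystal pose."""
    hits = sum(1 for value in rmsd_values if value <= cutoff)
    return hits / len(rmsd_values)
```

With 66 of 100 poses at or below a 2.0 angstrom cutoff, `success_rate` returns 0.66, and loosening the cutoff to 3.0 angstroms can only raise the rate, which is why the cutoff has to be stated alongside any success figure.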

AutoDock 3.0

• Genetic algorithm as a global optimizer, with an energy minimization as a local search.

• In each new generation, a user-defined fraction of the population undergoes a local search.

• http://www.scripps.edu/mb/olson/people/gmm/movies.html

• Tested on 7 complexes. All lowest energy structures were within 1.14 angstroms RMSD of the crystal structure.

Other GA

• DIVALI
– 3 out of 4 with a maximum RMSD of 1.7 angstroms
• DARWIN
• Study on HIV by Gehlhaar
– 34 (out of 100 simulations) with a maximum RMSD of 1.5 angstroms

Fragment-based Methods

• Dividing the ligand into fragments
• Docking the fragments
• Linking the fragments

FlexX

• Pick a base fragment (originally manual, but now it has been automated).
• Dock the base fragment in multiple ways.
• Incrementally add to the base fragment until the whole ligand is docked.
• For 10 out of 19 complexes, the best-scoring solution reproduced the experimental result with a maximum RMSD of 1.04 angstroms.
• A recent extension of FlexX includes the placement of explicit waters into the binding site during docking. (Improvement was observed in some cases, such as HIV protease.)

Other FB

• DOCK 4.0
– 7 out of 10 reproduced the crystal structure with a maximum RMSD of 1.3 angstroms
• LUDI
• ADAM
– 2 out of 2 with RMSD of less than 1 angstrom
• Hammerhead
– 4 out of 4 with RMSD of less than 1.7 angstroms

Point Complementarity

• Most fragment-based algorithms could also be in this category.
• The major distinction is that the ligand is not broken up here.
• The molecular surface is represented by a series of surface cubes. There is also another set of inner cubes representing the inside of the molecule.
• Docking is performed by maximizing the matches between surface cubes minus the inner cube overlaps.

Programs

• Soft Docking
• FTDOCK
• LIGIN
• SANDOCK
• FLOG
• Most of these methods were originally intended for protein-protein docking. They also only consider rigid bodies, since proteins are much larger.
• As a result, they did not perform very well.

Other Methods

• Distance Geometry
– Dockit
– Only considers hydrogen bonding.

Other Methods

• Tabu Searches
– PRO_LEADS
– Start with a random conformation.
– Generate about 100 new conformations from the current one.
– Pick the best as the new current conformation.
– Generate another 100 and repeat.
– If the new 100 do not contain a better one than the current, pick from the old currents.
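The loop above can be sketched as a generic tabu search. This is a textbook-style variant, not PRO_LEADS itself: it keeps a short list of recently visited solutions (the tabu list), forbids moves that land too close to them, and overrides the ban only when a candidate beats the best score seen so far. The function name and every default value except the 100 candidates per step are assumptions made for illustration.

```python
import math
import random
from collections import deque

def tabu_search(score, start, n_iterations=100, n_candidates=100,
                step_size=0.5, tabu_size=25, tabu_radius=0.25, seed=0):
    """Minimize `score`; returns (best_solution, best_score)."""
    rng = random.Random(seed)
    current = list(start)
    best, best_score = list(current), score(current)
    tabu = deque([list(current)], maxlen=tabu_size)  # recently visited solutions

    def is_tabu(candidate):
        # A candidate is tabu if it lies too close to any remembered solution.
        return any(math.dist(candidate, old) < tabu_radius for old in tabu)

    for _ in range(n_iterations):
        # Generate a batch of random moves from the current solution.
        candidates = [[x + rng.uniform(-step_size, step_size) for x in current]
                      for _ in range(n_candidates)]
        candidates.sort(key=score)
        if score(candidates[0]) < best_score:
            chosen = candidates[0]                   # aspiration: new overall best
        else:
            allowed = [c for c in candidates if not is_tabu(c)]
            chosen = allowed[0] if allowed else candidates[0]
        current = chosen                             # may be worse than before
        tabu.append(list(current))
        current_score = score(current)
        if current_score < best_score:
            best, best_score = list(current), current_score
    return best, best_score
```

Accepting the best non-tabu candidate even when it is worse than the current solution is what lets the search leave a local minimum, while the tabu list keeps it from immediately walking back.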

Other Methods

• Tabu Searches
– Flexible ligand docking was originally performed with PRO_LEADS on 50 complexes.
– 86% were within 1.5 angstroms RMSD of the crystal structures.

Other Methods

• Systematic searches
– EUDOC: rigid/rigid (ligand/receptor)
– Uses AMBER
• Multiple method algorithms
– Tabu/MC
– FlexX/CHARMM
– GA/Tabu

Comparisons

Method  Algorithm     # of complexes  # of matched complexes  % of matched complexes  max RMSD in angstroms
MC      AutoDock      6               6                       100.00%                 NA
GA      GOLD          100             66                      66.00%                  2.00
GA      GOLD          100             71                      71.00%                  3.00
GA      AutoDock 3.0  7               7                       100.00%                 1.14
GA      DIVALI        4               3                       75.00%                  1.70
GA      Gehlhaar      100             34                      34.00%                  1.50
FB      FlexX         19              10                      52.63%                  1.04
FB      DOCK 4.0      10              7                       70.00%                  1.30
FB      ADAM          2               2                       100.00%                 1.00
FB      Hammerhead    4               4                       100.00%                 1.70
TS      PRO_LEADS     50              43                      86.00%                  1.50

Comparisons

• MD (Molecular Dynamics)
• MC (Monte Carlo)
• GA (Genetic Algorithm)
• GA best on small space searches
• MD best on large space searches
• These results are contradictory
• MC did reasonably well on both

More Comparisons

• MC, GA, EP, Tabu
– GA had the lowest median energies, but Tabu located the global minimum more reliably.
• DOCK, FlexX, GOLD
– GOLD had the best RMSD solutions and the best rankings of ligands.

Conclusion

• The best solution is a hybrid.
• Rigid receptor/flexible ligand algorithms are well established, with a success rate of 70-80%.
• The solvent model is still somewhat problematic.