CD and MD
What’s my problem with MD?
1. Its development has been manifestly unscientific
2. Its answers (numbers, trajectories, minima) are as unreliable as (or more unreliable than) those of simpler methods
3. Yet its manifest societal advantages- “physics”, movies, CPU time, complexity, jargon- lead to cognitive dissonance (hopeful thinking) concerning its actual value to drug discovery
CD: Cognitive Dissonance
Cognitive dissonance theory explains human behavior by positing that people have a bias to seek consonance between their expectations
and reality. According to Festinger, people engage in a process he termed "dissonance reduction," which can be achieved in one of three ways:
lowering the importance of one of the discordant factors, adding consonant elements, or changing one of the dissonant factors. This bias sheds light
on otherwise puzzling, irrational, and even destructive behavior.
Lowering importance- Actually agreeing (numerically) with experiment
Adding consonance- “It’s an idea generator”
Changing the dissonance- Reparameterizing
Wikipedia:-
(+ Effort Justification Paradigm)
AM I CD?
• Came from Barry’s Lab (the Great PB MD Wars)
• Don’t sell MD (perhaps I’m jealous)
Why should you believe me?
- Don’t write/need grants
- Don’t need tenure
- PB is not a significant OE income stream
- Been observing MD for > 25 years
- I hired an MD guy (who I sent to China!)
- I manifestly want this to be a better industry
Also..
• The fastest PB- DelPhi, ZAP
• The fastest surfacing algorithms- GRASP, ZAP
• The fastest 3D shape alignment- ROCS, FastROCS
• The fastest conformer generator- OMEGA
• The fastest, non-stochastic docker- FRED
• The fastest (accurate) Surface Area, RMSD, AM1, protein pKa, proton placement..
• If I wanted to do MD, mine would rock
• I believe the effort/reward ratio is (way) too low
How Galileo Transformed Science
1. Resolution- Think something up
2. Demonstration- See if it matches available evidence
3. Experiment- Think of a new experiment to test it (to differentiate from old theories)
A Galilean Value Scale for Experiments
• Retrospective Data that shapes the theory
– MD, Most of molecular modeling, economics
• Prospective without Controls
– Rich Friesner, Xavier Barril
• Unanticipated Retrospective Data
– SAMPL solvation energies
• Prospective designed with NULL model Controls
– Bertrand Garcia Moreno, protein pKa Collective
– Lyall Isaacs, SAMPL host-guest
• Prospective to distinguish from Best-of-Class Controls
– Nobody
(Better toward the bottom of the list)
Vast Majority
Prospective Without Controls
• Surgeons coming up with new procedures
– Osteoarthritis & Arthroscopic knee surgery
• US Foreign policy
– Just do something, claim success when it works, bury it when it doesn’t
• Anecdotal stories
– The “hot hand” phenomenon
– I did “X”, it worked.
I did “X”, it worked
Two chief fallacies
(i) Fallacy of Composition
- What else did you actually do?
(ii) Fallacy of Selection
- File Drawer effect (False Positives)
- Parameterization (implicit or explicit) to the result (False Negatives)
Fallacy of Composition
• Method X, e.g. MD, is but one part of a multipart process (filtering, chemists’ inspection, database bias), yet success is claimed for X alone
• The same procedure with X replaced by a different method is never done or presented
Example of Composition Error
• We predicted affinity with MM/QM and “It Worked”
• Was QM getting you anything?
• Did you do MM with QM-level charges, multipoles? MM alone? A scoring function?
Example of Composition Error
• We used a polarizable force field and got these results for the (SAMPL4) host-guest systems. “It Worked”, so polarization worked.
• Did you also try it without polarization? With better quality charges? With equivalent CPU time but without polarization (more sampling)?
Example of Composition Error
• We ran MD for a bit, looked at how the ligands wiggled and designed six drugs (Christopher Bayly & others at Merck Frosst)
• Did you compare to MM? To other simple heuristics? Without any chemists’ input?
• It’s not “Science” until someone else does it
Fallacy of Selection: The Tanimoto of Truth™
[Figure: two rows of events, labeled Reality and Predictions; each cell marks “An Event Happened” or “An Event Didn’t”]
0 1 1 0 1 0 0 1 0 1
1 1 1 0 0 1 0 1 0 0
ToT = (Events that happened and were predicted) / (Events predicted or happened)
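The ratio above can be worked through on the two bit strings shown on the slide. This is only a sketch: it assumes a 1 marks an event that happened (or was predicted) and a 0 one that did not, and the function name `tanimoto_of_truth` is made up for illustration.

```python
def tanimoto_of_truth(reality, predictions):
    """Events that happened AND were predicted, over events that happened OR were predicted."""
    if len(reality) != len(predictions):
        raise ValueError("both vectors must have the same length")
    both = sum(1 for r, p in zip(reality, predictions) if r and p)
    either = sum(1 for r, p in zip(reality, predictions) if r or p)
    return both / either if either else 0.0

# The two rows from the slide. Which row is "Reality" and which is
# "Predictions" is an assumption; the ratio is symmetric, so it does not matter.
row_a = [0, 1, 1, 0, 1, 0, 0, 1, 0, 1]
row_b = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]

tot = tanimoto_of_truth(row_a, row_b)
print(f"ToT = {tot:.3f}")  # 3 shared events out of 7 in the union
```

Note that positions where nothing happened and nothing was predicted never enter the ratio, so true negatives earn no credit.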
The Tanimoto of Truth: what happens to each kind of outcome
• True Positives- Published. Especially by Academia
• False Positives- “File Drawer”. Especially by Industry
• False Negatives- Parameterize till publishable. Especially by Academia
• True Negatives- Not sexy, “Hempel’s Ravens”. Largely ignored by Academia & Industry
The Tanimoto of Truth
• “Similarity” methods, Docking, Machine Learning
• All are judged by some kind of ToT
• Quantification for MD ‘events’? Never.
• MD is mostly uncontrolled, anecdotal & unscientific
Psychology, Philosophy, Social Dynamics
Underlying Physics, Examination of Successes
Molecular Dynamics: Types of Applications
1) Global sampling- thermodynamic averages
- FEP etc. Absolute or Relative Energies
2) Simulate time evolution (movies)
- D.E. Shaw, Vijay Pande. Mechanism
3) Local sampling (thermally accessible barriers)
- Bayly & co., WaterMap, MM/PBSA. Qualitative Assessment
Thermodynamic energies and Fables of Physics
“We all know that if we had the perfect force field and simulated for an infinite time, we’d get the right answer”
- Woody Sherman, ACS San Francisco, March 24th, 2010
1) pKa, Tautomers
2) Finite temperature, MD & Stat Mech
3) Ergodicity?
4) The illusion of a “perfect” Force Field (that ≠ QM)
Typical FF Thinking: Polarization
• Polarization is tricky
• But it makes dipoles bigger, e.g. water
– 1.85 D (vacuum) to 2.5~2.6 D (condensed phase)
• So therefore increase charges by ~15%
– E.g. use HF/6-31G*
• Now molecules are roughly correct
Polarization of Dipoles
[Figure: rows of dipoles (-|+) in two arrangements, Favorable and Unfavorable, each drawn with surrounding charges and labeled with the fields E0 and Epol and the dipole D]
Scaling vs Polarization
Alignment     Scaling Charges   Polarization
Favorable     Lowers Energy     Lowers Energy
Unfavorable   Raises Energy     Lowers Energy
Scaling dipoles can only be accurate on average (with parameterization), not locally!
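A toy calculation makes the table concrete. It is a sketch under stated assumptions, not the slide's own model: one fixed dipole in a uniform field, linear isotropic polarizability, arbitrary units. The function names and the values of `scale` and `alpha` are made up for illustration.

```python
def permanent_energy(mu, field, aligned):
    """Energy of a fixed dipole mu in a field: -mu*E when aligned, +mu*E when opposed."""
    return (-1.0 if aligned else 1.0) * mu * field

def scaled_energy(mu, field, aligned, scale=1.15):
    """Mimic polarization by scaling the charges (and so the dipole) by a constant."""
    return permanent_energy(scale * mu, field, aligned)

def polarized_energy(mu, field, aligned, alpha=0.5):
    """Add the induced-dipole term -0.5*alpha*E^2, which is negative for any orientation."""
    return permanent_energy(mu, field, aligned) - 0.5 * alpha * field ** 2

MU, FIELD = 1.0, 1.0  # hypothetical values, arbitrary units
for aligned in (True, False):
    base = permanent_energy(MU, FIELD, aligned)
    d_scale = scaled_energy(MU, FIELD, aligned) - base
    d_pol = polarized_energy(MU, FIELD, aligned) - base
    label = "Favorable" if aligned else "Unfavorable"
    print(f"{label:12s} scaling: {d_scale:+.2f}  polarization: {d_pol:+.2f}")
```

Scaling multiplies whatever sign the interaction already has, while the induced term lowers the energy in both rows, which is the mismatch the table points at.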
[Slide labels: JF, PID, AMOEBA, EPIC, Quantum mechanics]
Kim Sharp:
Ah, but then there’s AMOEBA
(“PB”!)
(Jean-Francois Truchon)
Applications: cation-π
Acetylcholinesterase
Hydrogen Bonds: Formamide dimer
“Close agreement between the orientation dependence of hydrogen bonds observed in protein structures and quantum mechanical calculations”
A. V. Morozov, T. Kortemme, K. Tsemekhman and D. Baker, PNAS, Volume 101, page 6946, 2004.
Method      δ(H-A)   ψ        ϴ        X
DFT         1.94     112.34   159.43   -177.51
MP2         1.97     110.49   155.33   -179.49
HF          2.10     138.16   170.94   -179.54
CHARMM27    1.82     170.25   170.83   -106.83
OPLS-AA     1.75     165.04   175.61   145.12
MM3-2000    1.98     121.16   161.07   149.63
PDB         1.93     115.00   175.00   175.00
Geometry optimizations starting from the Baker MP2 minimum
Geometry optimizations starting from the second MP2 minimum
[Plot: electrostatic energy E_ele (kcal/mol, 0 to -9) vs. R(O..H) (Å, 1.9 to 3.7) for QM*, Pt. Charge and Pt. Octupole. *CCSD/aug-cc-pVTZ]
Ah, but then there’s AMOEBA
Model                    Electrostatic Energy (kcal/mol)
Point Monopole           -4.33
Point Dipole             -5.81
Point Quadrupole         -6.36
Point Octupole           -6.31
Exponential Monopole     -7.68
Exponential Dipole       -8.32
Exponential Quadrupole   -8.52
Exponential Octupole     -8.18
CCSD/aug-cc-pVTZ         -8.23
Fitting to the electron density: Denny Elking, Tom Darden
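To read the table at a glance, the deviation of each model from the CCSD/aug-cc-pVTZ value can be tabulated. The energies below are copied from the table; the variable names are illustrative only.

```python
REFERENCE = -8.23  # CCSD/aug-cc-pVTZ, kcal/mol, from the table

MODELS = {
    "Point Monopole": -4.33,
    "Point Dipole": -5.81,
    "Point Quadrupole": -6.36,
    "Point Octupole": -6.31,
    "Exponential Monopole": -7.68,
    "Exponential Dipole": -8.32,
    "Exponential Quadrupole": -8.52,
    "Exponential Octupole": -8.18,
}

# Signed deviation from the reference: positive means too weakly bound.
deviations = {name: energy - REFERENCE for name, energy in MODELS.items()}
closest = min(deviations, key=lambda name: abs(deviations[name]))

for name, dev in deviations.items():
    print(f"{name:24s} {dev:+.2f} kcal/mol")
print("Closest to reference:", closest)
```

Every point-multipole model stays at least 1.87 kcal/mol from the reference, while the exponential models are all within 0.55 kcal/mol.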
Or……
Increase Dipole from 1.85 D to 2.56 D
Details, Details..
1) Just incorporate Volume Terms (PB)
2) And all those other terms:
- Exchange interactions
- VdW anisotropy
- pKa & Tautomers
- Cross-terms between valence and non-bonded
- Three (N) body terms….
Eventually it’ll be right! Woody’ll be right.
Inconceivable it can’t ever be right. (Wolynes)
Concrete MD Examples
• Binding Energies- Shirts
- Also Solvation (Simpler system)
• Protein Trajectories- Shaw
- Also Peptides (Simpler systems)
• “Minimization” – Shoichet
- Is a simple system
FKBP-12: Unanticipated Retrospective Data?
FKBP-12 Again
FKBP-12 Yet Again: Retrospective Data that shapes the theory
Contributions to Affinity
VdW
Coulombic
Buried Area
Desolvation
Entropy
Discrete Waters
Polarization
Correlations to Affinity
VdW
Desolvation
Entropy
Discrete Waters
Polarization
Buried Area
Shape
Electrostatics
Coulombic
E.g. VdW
Train on 17 HIV-1 Protease Inhibitors
1) Minimization (MM2X)
2) pIC50=-0.15*Einter-8.1
Prospectively used on 16 more
E.g. Coulombic
• Urokinase
Coulombic Interaction
Brown & Muchmore, JCIM, 2007, (47) 4
[Plots: two scatter plots against experimental binding energy (kcal/mol, -16 to -8); panel titles “MM-PBSA” and “Buried Area”]
- Buried Area Energy (kcal/mol) vs Expt. Binding: f(x) = 0.3181 x − 0.7159, R² = 0.7474
- Predicted Binding (kcal/mol) vs Experimental Binding: f(x) = 2.3390 x − 11.3078, R² = 0.8494
“Fast and Accurate Predictions of Binding Free Energies using MM-PBSA and MM-GBSA”
Rastelli, G., Del Rio, A., Degliesposti, G., Sgobba, M., J. Comp. Chem. Vol 31, #4, pg 797-810
DHFR
E.g. Buried Area
My observation over 20 years
• For congeneric series, something basic often correlates, sometimes well (VdW, Coulombic)
• For non-congeneric series, usually nothing works
• If something works for non-congenerics, it’s usually something basic (mass, buried area)
Simpler System: Solvation
          # Compounds   MD RMSE (kcal/mol)   PB RMSE (kcal/mol)
SAMPL0    17            1.35 (Vijay)         1.76 (Me)
SAMPL1    56            3.6 (Mobley)         2.2 (Me)
SAMPL2    40            2.4 (Jay)            2.1 (Ben)
SAMPL4: 50 Solvation Energies
[Figure labels: My PB Method; Best MD; QM + Specific Group-wise Parameterization]
Structural basis for modulation of a G-protein-coupled receptor by allosteric drugs- D. E. Shaw
1) Where they bind
- Confirmed by mutagenesis
2) A surprise in how they bind
- pi-charge interactions
- not charge-charge
3) Cause of allostery:
(i) Charge
(ii) Binding pocket width
- Confirmed by synthesis
IMHO
1) Where they bind- Confirmed by mutagenesis
- Docking with Glide did almost as well. Confirmation is WEAK.
2) How they bind- pi-charge interactions, not charge-charge
- THIS IS NOT A SURPRISE!
3) Cause of allostery: (i) Charge (ii) Binding pocket width- Confirmed by synthesis
- (i) Already known & follows charge multiplicity exactly.
- (ii) ONE CMPD (better than most!)
Also..
• Local ionizable residues never (de)protonate
– Binding +3 ligands
• NMS was modeled, not simulated
• Experimental errors claimed are <0.1 kcal in vivo
Simpler Story- Peptides
• Poly-Ala propensities (2010)
– Have to modify FF to get helicity right
• Side-chain conformation preferences (2012)
– Little agreement between force-fields
– Poor agreement with crystals (2013)
• H-bond geometries (2005)
– Flawed Baker study
• Beta-hairpin simulations (2012)
– Little agreement between force-fields
Simple System: Shoichet- Relative binding energies in a cavity
A signal!
Maybe not!
RMSE from Phenol = 2.5 kcal/mol
RMSE from Catechol = 1.1 kcal/mol
RMSE of the “NULL” hypothesis = 1.2 kcal/mol
From “closest” Phenol|Catechol = 0.8 kcal/mol
Poses selected, not found, so is this dynamics or minimization?
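A null-model comparison of this kind is simple to set up. The sketch below uses made-up numbers purely to show the mechanics; none of them come from the Shoichet data. It also assumes the null model predicts the mean experimental value for every ligand, which is only one possible definition of “NULL”.

```python
import math

def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))

# Hypothetical binding energies (kcal/mol), for illustration only.
observed = [-5.0, -4.0, -6.0, -5.0]
method = [-4.0, -5.0, -5.0, -6.0]  # predictions from some expensive method

# Assumed null model: every ligand gets the mean observed value.
mean_observed = sum(observed) / len(observed)
null = [mean_observed] * len(observed)

rmse_method = rmse(method, observed)
rmse_null = rmse(null, observed)
print(f"method RMSE = {rmse_method:.2f}, null RMSE = {rmse_null:.2f}")
```

With these toy numbers the null model wins, which is the situation the slide warns about: an RMSE means little until it is set beside the RMSE of doing almost nothing.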
NULL MODELS
One, Inescapable, Conclusion
• We cannot calculate the energies of protein microstates with any accuracy
• It is unclear even how bad we are
• Even ranking must be suspect
Of Dubious Value
• Ranking Ligands, Absolute or Relative
• Flexible Docking
• Protein folding to atomic resolution
• Evaluating unfolded states
• Excursions from the crystal structure
So how can we fold (small) proteins?
• Luck- are small proteins self-selectingly robust?
• Some parameterization (Shaw)
• Stability of kinetic pathways might be more robust than energetics suggest (Pande)
?
But what’s the alternative?
• To Local Minimization
– Sample (MC, Low Mode etc) and minimize
• To Energy evaluation
– Exhaustively sample and minimize
• To time evolution
– Elastic network? Low mode dynamics?
– Run MD!
Experiments I Wish Were Done
• Protein Crystallography
– Predict the room temperature density
• Small molecule NMR
– Predict the dominant low energy conformer
• Protein Electrostatics
– Predict potentials in the active site
• Host-guest systems
– Binding energies, salt effects
And how I wish they were done: Maximal Disinformation Testing
1. FIRST calculate for two or more methods, e.g. polarization vs static, PB vs MD, MD vs MM
2. Prospectively measure those systems that most distinguish methods- mutual disinformation
3. Adapt theories- no one’s perfect!
4. Repeat steps 1, 2 & 3
5. Does a prediction ‘gap’ persist?
E.g. Kepler vs Epicycles.
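Steps 1 and 2 can be sketched in a few lines. This takes one plausible reading of “most distinguish methods”, namely the largest absolute disagreement between two sets of predictions; the slides do not define the measure. The system names and numbers are hypothetical, as is the function name.

```python
def most_discriminating(pred_a, pred_b, k=2):
    """Return the k systems on which two methods disagree most (largest gap first)."""
    shared = [name for name in pred_a if name in pred_b]
    gaps = {name: abs(pred_a[name] - pred_b[name]) for name in shared}
    return sorted(gaps, key=gaps.get, reverse=True)[:k]

# Step 1: predictions from two methods, made BEFORE any measurement.
# Hypothetical values in kcal/mol.
method_a = {"sys1": -5.0, "sys2": -7.5, "sys3": -3.0, "sys4": -6.0}
method_b = {"sys1": -5.2, "sys2": -4.0, "sys3": -3.1, "sys4": -8.0}

# Step 2: spend the experimental budget where the methods disagree most.
to_measure = most_discriminating(method_a, method_b, k=2)
print(to_measure)  # ['sys2', 'sys4']
```

Systems where both methods agree (sys1, sys3 here) cannot tell them apart whatever the experiment returns, so measuring them first says little about which theory to adapt.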
Final Thoughts
• I’d love MD to work! Make my job easier
• It doesn’t. At least not as advertised/believed
• Its nature (“physics”, big calculations, movies) leads to overconfidence
• Until a more scientific approach is adopted it’s unlikely to get better. GPUs won’t save MD
• What’s needed is Maximal Disinformation Testing & Model systems