cd and md. what’s my problem with md? 1.its development has been manifestly unscientific 2.its...

CD and MD

What’s my problem with MD?

1. Its development has been manifestly unscientific
2. Its answers (numbers, trajectories, minima) are as unreliable as (or more unreliable than) those of simpler methods
3. Yet its manifest societal advantages (“physics”, movies, CPU time, complexity, jargon) lead to cognitive dissonance (hopeful thinking) concerning its actual value to drug discovery

CD: Cognitive Dissonance

Cognitive dissonance theory explains human behavior by positing that people have a bias to seek consonance between their expectations and reality. According to Festinger, people engage in a process he termed "dissonance reduction," which can be achieved in one of three ways: lowering the importance of one of the discordant factors, adding consonant elements, or changing one of the dissonant factors. This bias sheds light on otherwise puzzling, irrational, and even destructive behavior.

Lowering importance: Actually agreeing (numerically) with experiment
Adding consonance: “It’s an idea generator”
Changing the dissonance: Reparameterizing

Wikipedia:-

(+ Effort Justification Paradigm)

AM I CD?

• Came from Barry’s Lab (the Great PB MD Wars)
• Don’t sell MD (perhaps I’m jealous)

Why should you believe me?
- Don’t write/need grants
- Don’t need tenure
- PB is not a significant OE income stream
- Been observing MD for > 25 years
- I hired an MD guy (who I sent to China!)
- I manifestly want this to be a better industry

Also..

• The fastest PB: DelPhi, ZAP
• The fastest surfacing algorithms: GRASP, ZAP
• The fastest 3D shape alignment: ROCS, FastROCS
• The fastest conformer generator: OMEGA
• The fastest non-stochastic docker: FRED
• The fastest (accurate) Surface Area, RMSD, AM1, protein pKa, proton placement..

• If I wanted to do MD, mine would rock
• I believe the reward/effort ratio is (way) too low

How Galileo Transformed Science

1. Resolution: Think something up

2. Demonstration: See if it matches available evidence

3. Experiment: Think of a new experiment to test it (to differentiate from old theories)

A Galilean Value Scale for Experiments

• Retrospective Data that shapes the theory
– MD, most of molecular modeling, economics
• Prospective without Controls
– Rich Friesner, Xavier Barril
• Unanticipated Retrospective Data
– SAMPL solvation energies
• Prospective designed with NULL model Controls
– Lyall Isaacs, SAMPL host-guest
– Bertrand Garcia Moreno, protein pKa Collective
• Prospective to distinguish against Best-of-Class Controls
– Nobody

Better

Vast Majority

Prospective Without Controls

• Surgeons coming up with new procedures
– Osteoarthritis & arthroscopic knee surgery
• US foreign policy
– Just do something, claim success when it works, bury it when it doesn’t
• Anecdotal stories
– The “hot hand” phenomenon
– I did “X”, it worked.

I did “X”, it worked

Two chief fallacies
(i) Fallacy of Composition
- What else did you actually do?
(ii) Fallacy of Selection
- File Drawer effect (False Positives)
- Parameterization (implicit or explicit) to the result (False Negatives)

Fallacy of Composition

• Method X, e.g. MD, is but one part of a multipart process (filtering, chemists’ inspection, database bias), yet success is claimed for X alone

• The same procedure with X replaced by a different method is never done or presented

Example of Composition Error

• We predicted affinity with MM/QM and “It Worked”

• Was QM getting you anything?

• Did you do MM with QM-level charges, multipoles? MM alone? A scoring function?


• We used a polarizable force field and got these results for the (SAMPL4) host-guest systems. “It Worked”, so polarization worked.

• Did you also try it without polarization? With better quality charges? With equivalent CPU time but without polarization (more sampling)?


• We ran MD for a bit, looked at how the ligands wiggled and designed six drugs (Christopher Bayly & others at Merck Frosst)

• Did you compare to MM? To other simple heuristics? Without any chemists’ input?

• It’s not “Science” until someone else does it

Fallacy of Selection: The Tanimoto of Truth™

Reality:     0 1 1 0 1 0 0 1 0 1
Predictions: 1 1 1 0 0 1 0 1 0 0

Legend: 1 = An Event Happened, 0 = An Event Didn’t

ToT = (Events that happened and were predicted) / (Events predicted or happened)
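The ToT ratio can be checked by hand on the two bit strings from the slide. The following is a minimal sketch, assuming the first row is Reality and the second is Predictions (the ratio itself is symmetric, so swapping them only exchanges false positives and false negatives). The function names are illustrative, not from the talk.

```python
def confusion_counts(reality, predictions):
    """Count the four outcome classes for two equal-length 0/1 sequences."""
    if len(reality) != len(predictions):
        raise ValueError("bit vectors must have the same length")
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for happened, predicted in zip(reality, predictions):
        if happened and predicted:
            counts["TP"] += 1      # happened and was predicted
        elif predicted:
            counts["FP"] += 1      # predicted, did not happen
        elif happened:
            counts["FN"] += 1      # happened, was not predicted
        else:
            counts["TN"] += 1      # neither predicted nor happened
    return counts


def tanimoto_of_truth(reality, predictions):
    """ToT = (happened AND predicted) / (happened OR predicted)."""
    c = confusion_counts(reality, predictions)
    union = c["TP"] + c["FP"] + c["FN"]
    return c["TP"] / union if union else float("nan")


# Bit strings as they appear on the slide (row assignment is an assumption).
reality = [0, 1, 1, 0, 1, 0, 0, 1, 0, 1]
predictions = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]

counts = confusion_counts(reality, predictions)
tot = tanimoto_of_truth(reality, predictions)
print(counts, round(tot, 3))
```

Note that true negatives never enter the ratio, which is why ignoring them does not change a ToT but ignoring false positives or false negatives inflates it.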

The Tanimoto of Truth

Reality:     0 1 1 0 1 0 0 1 0 1
Predictions: 1 1 1 0 0 1 0 1 0 0

• Published: especially by Academia
• “File Drawer” False Positives: especially by Industry
• False Negatives (parameterize till publishable): especially by Academia
• True Negatives (not sexy, “Hempel’s Ravens”): largely ignored by Academia & Industry

The Tanimoto of Truth

• “Similarity” methods, Docking, Machine Learning
• All are judged by some kind of ToT

• Quantification for MD ‘events’? Never.

• MD is mostly uncontrolled, anecdotal & unscientific

Psychology, Philosophy, Social Dynamics

Underlying Physics, Examination of Successes

Molecular Dynamics: Types of Applications

1) Global sampling (thermodynamic averages): FEP etc. Absolute or Relative Energies

2) Simulate time evolution (movies): D.E. Shaw, Vijay Pande. Mechanism

3) Local sampling (thermally accessible barriers): Bayly & co., WaterMap, MM/PBSA. Qualitative Assessment

Thermodynamic energies and Fables of Physics

“We all know that if we had the perfect force field and simulated for an infinite time, we’d get the right answer”- Woody Sherman, ACS San Francisco, March 24th, 2010

1) pKa, Tautomers
2) Finite temperature, MD & Stat Mech
3) Ergodicity?
4) The illusion of a “perfect” Force Field (that ≠ QM)

Typical FF Thinking: Polarization

• Polarization is tricky
• But it makes dipoles bigger, e.g. water
– 1.85 D (vacuum) vs. 2.5~2.6 D (condensed phase)
• So therefore increase charges by ~15%
– E.g. use HF/6-31G*
• Now molecules are roughly correct

Polarization of Dipoles

[Figure: two arrangements of dipoles (D), labeled Favorable and Unfavorable, each showing the fields E0 and Epol]

Scaling vs Polarization

Alignment     Scaling Charges   Polarization
Favorable     Lowers Energy     Lowers Energy
Unfavorable   Raises Energy     Lowers Energy

Scaling dipoles can only be accurate on average (with parameterization), not locally!
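The sign pattern in the table can be reproduced with a toy model. This is a minimal sketch, assuming two collinear point dipoles in reduced units and a one-shot (not self-consistent) induction term; the magnitudes, the separation, and the polarizability are made-up illustrative values, not numbers from the talk.

```python
# Reduced units: 1/(4*pi*eps0) = 1. All numbers below are illustrative
# assumptions, not values from the slides.
MU = 1.0      # permanent dipole magnitude
R = 3.0       # separation along the common axis
SCALE = 1.15  # "increase charges by ~15%"
ALPHA = 1.5   # hypothetical isotropic polarizability


def pair_energy(mu1, mu2, r):
    """Energy of two point dipoles lying along the axis that joins them.

    General form: U = [mu1.mu2 - 3 (mu1.rhat)(mu2.rhat)] / r^3, which for
    signed axial components reduces to -2 * mu1 * mu2 / r^3.
    """
    return -2.0 * mu1 * mu2 / r**3


def induction_energy(mu_source, alpha, r):
    """First-order induction energy, -alpha * E^2 / 2, of a polarizable
    site sitting in the on-axis field E = 2 * mu / r^3 of a point dipole."""
    field = 2.0 * mu_source / r**3
    return -0.5 * alpha * field**2


results = {}
for label, mu2 in (("favorable", MU), ("unfavorable", -MU)):
    base = pair_energy(MU, mu2, R)
    # Scaling multiplies both dipoles, so the energy is multiplied by SCALE**2.
    scaled = pair_energy(SCALE * MU, SCALE * mu2, R)
    # Each site is polarized by the other's permanent dipole.
    polarized = base + induction_energy(MU, ALPHA, R) + induction_energy(mu2, ALPHA, R)
    results[label] = (base, scaled, polarized)
    print(f"{label:12s} base={base:+.4f} scaled={scaled:+.4f} polarized={polarized:+.4f}")
```

Scaling multiplies whatever energy is already there, so it deepens favorable contacts and worsens unfavorable ones. The induction term depends on the square of the field and is negative in both cases.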

JF

PID

AMOEBA

EPIC

Quantum mechanics

Kim Sharp:

Ah, but then there’s AMOEBA

(“PB”!)

(Jean-Francois Truchon)

JF

Applications: cation-π

Acetylcholinesterase

Hydrogen Bonds: Formamide dimer

“Close agreement between the orientation dependence of hydrogen bonds observed in protein structures and quantum mechanical calculations”
A. V. Morozov, T. Kortemme, K. Tsemekhman and D. Baker, PNAS, Volume 101, page 6946, 2004.

Method      δ(H-A)   ψ        ϴ        X
DFT         1.94     112.34   159.43   -177.51
MP2         1.97     110.49   155.33   -179.49
HF          2.10     138.16   170.94   -179.54
CHARMM27    1.82     170.25   170.83   -106.83
OPLS-AA     1.75     165.04   175.61    145.12
MM3-2000    1.98     121.16   161.07    149.63
PDB         1.93     115.00   175.00    175.00

Geometry optimizations starting from the Baker MP2 minimum

Geometry optimizations starting from the second MP2 minimum

[Plot: E_ele (kcal/mol) versus R(O..H) (Å), from 1.9 to 3.7 Å, with curves for QM*, Pt. Charge, and Pt. Octupole]

*CCSD/aug-cc-pVTZ

Ah, but then there’s AMOEBA

Model                     Electrostatic Energy (kcal/mol)
Point Monopole            -4.33
Point Dipole              -5.81
Point Quadrupole          -6.36
Point Octupole            -6.31
Exponential Monopole      -7.68
Exponential Dipole        -8.32
Exponential Quadrupole    -8.52
Exponential Octupole      -8.18
CCSD/aug-cc-pVTZ          -8.23

Fitting to the electron density
Denny Elking, Tom Darden
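To make the comparison in the table explicit, the deviation of each model from the CCSD/aug-cc-pVTZ value can be tabulated. A small sketch using only the numbers listed above; the variable names are illustrative.

```python
# Electrostatic energies in kcal/mol, copied from the table.
REFERENCE = -8.23  # CCSD/aug-cc-pVTZ

MODELS = {
    "Point Monopole": -4.33,
    "Point Dipole": -5.81,
    "Point Quadrupole": -6.36,
    "Point Octupole": -6.31,
    "Exponential Monopole": -7.68,
    "Exponential Dipole": -8.32,
    "Exponential Quadrupole": -8.52,
    "Exponential Octupole": -8.18,
}

# Signed error: positive means the model is less attractive than the reference.
errors = {name: round(energy - REFERENCE, 2) for name, energy in MODELS.items()}

for name, err in sorted(errors.items(), key=lambda item: abs(item[1])):
    print(f"{name:24s} {err:+.2f}")

closest = min(errors, key=lambda name: abs(errors[name]))
```

On these numbers every point-multipole model is off by more than 1.8 kcal/mol even at octupole order, while every exponential model is within 0.6 kcal/mol.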

Or……

Increase Dipole from 1.85 D to 2.56 D

Details, Details..

1) Just incorporate Volume Terms (PB)
2) And all those other terms:
- Exchange interactions
- VdW anisotropy
- pKa & Tautomers
- Cross-terms between valence and non-bonded
- Three (N) body terms….

Eventually it’ll be right! Woody’ll be right.
Inconceivable it can’t ever be right. (Wolynes)

Concrete MD Examples

• Binding Energies: Shirts
– Also Solvation (simpler system)

• Protein Trajectories: Shaw
– Also Peptides (simpler systems)

• “Minimization”: Shoichet
– Is a simple system

FKBP-12
Unanticipated Retrospective Data?

FKBP-12 Again

FKBP-12 Yet Again
Retrospective Data that shapes the theory

Contributions to Affinity

VdW

Coulombic

Buried Area

Desolvation

Entropy

Discrete Waters

Polarization

Correlations to Affinity

VdW

Desolvation

Entropy

Discrete Waters

Polarization

Buried Area

Shape

Electrostatics

Coulombic

E.g. VdW

Train on 17 HIV-1 Protease Inhibitors

1) Minimization (MM2X)

2) pIC50=-0.15*Einter-8.1

Prospectively used on 16 more

E.g. Coulombic

• Urokinase

Coulombic Interaction
Brown & Muchmore, JCIM, 2007, (47) 4

[Plot, Buried Area: Buried Area Energy (kcal/mol) versus Expt. Binding (kcal/mol); fit f(x) = 0.3181 x − 0.7159, R² = 0.747]

[Plot, MM-PBSA: Predicted Binding (kcal/mol) versus Experimental Binding (kcal/mol); fit f(x) = 2.3390 x − 11.3078, R² = 0.849]

“Fast and Accurate Predictions of Binding Free Energies using MM-PBSA and MM-GBSA”
Rastelli, G., Del Rio, A., Degliesposti, G., Sgobba, M., J. Comp. Chem., Vol. 31, #4, pp. 797-810

DHFR

E.g. Buried Area

My observation over 20 years

• For congeneric series, something basic often correlates, sometimes well (VdW, Coulombic)

• For non-congeneric usually nothing works

• If something works for non-congenerics, it’s usually something basic (mass, buried area)

Simpler System: Solvation

          # Compounds   MD RMSE (kcal/mol)   PB RMSE (kcal/mol)
SAMPL0    17            1.35 (Vijay)         1.76 (Me)
SAMPL1    56            3.6 (Mobley)         2.2 (Me)
SAMPL2    40            2.4 (Jay)            2.1 (Ben)

SAMPL4: 50 Solvation Energies

[Figure labels: My PB Method; Best MD; QM + Specific Group-wise Parameterization]

Structural basis for modulation of a G-protein-coupled receptor by allosteric drugs (D. E. Shaw)

1) Where they bind
- Confirmed by mutagenesis

2) A surprise in how they bind
- pi-charge interactions
- not charge-charge

3) Cause of allostery:
(i) Charge
(ii) Binding pocket width
- Confirmed by synthesis

IMHO

1) Docking with Glide did almost as well. Confirmation is WEAK.

2) THIS IS NOT A SURPRISE!

3) (i) Already known & follows charge multiplicity exactly.
(ii) ONE CMPD (better than most!)

Also..
• Local ionizable residues never (de)protonate
– Binding +3 ligands
• NMS was modeled, not simulated
• Experimental errors claimed are <0.1 kcal in vivo

Simpler Story: Peptides
• Poly-Ala propensities (2010)
– Have to modify FF to get helicity right
• Side-chain conformation preferences (2012)
– Little agreement between force-fields
– Poor agreement with crystals (2013)
• H-bond geometries (2005)
– Flawed Baker study
• Beta-hairpin simulations (2012)
– Little agreement between force-fields

Simple System: Shoichet- Relative binding energies in a cavity

A signal!

Maybe not!

RMSE from Phenol = 2.5 kcal/mol
RMSE from Catechol = 1.1 kcal/mol
RMSE of the “NULL” hypothesis = 1.2 kcal/mol
From “closest” Phenol|Catechol = 0.8 kcal/mol

Poses selected, not found, so is this dynamics or minimization?

NULL MODELS
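A null-model comparison of the kind quoted above can be sketched in a few lines. This is only an illustration: the slides do not say how the “NULL” hypothesis was defined, so the constant-value null below (every system gets the same prediction, by default the mean of the observations) is an assumption, and the energies are made-up placeholders rather than the Shoichet data.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def constant_null(observed, value=None):
    """Hypothetical null model: predict one value for every system.

    With value=None the mean of the observations is used, which is a
    retrospective choice; pass a fixed reference value for a prospective null.
    """
    if value is None:
        value = sum(observed) / len(observed)
    return [value] * len(observed)


# Made-up binding energies (kcal/mol), for illustration only.
observed = [-5.0, -4.0, -6.0, -3.0]
method = [-4.0, -5.0, -5.0, -4.0]

method_rmse = rmse(method, observed)
null_rmse = rmse(constant_null(observed), observed)
beats_null = method_rmse < null_rmse
print(f"method RMSE = {method_rmse:.2f}, null RMSE = {null_rmse:.2f}, beats null: {beats_null}")
```

A method RMSE only carries information once it is reported next to a number like `null_rmse`; a small margin over the null is weak evidence of a signal.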

One, Inescapable, Conclusion

• We cannot calculate the energies of protein microstates with any accuracy
• It is unclear even how bad we are
• Even ranking must be suspect

Of Dubious Value

• Ranking Ligands, Absolute or Relative
• Flexible Docking
• Protein folding to atomic resolution
• Evaluating unfolded states
• Excursions from the crystal structure

So how can we fold (small) proteins?

• Luck- are small proteins self-selectingly robust?

• Some parameterization (Shaw)

• Stability of kinetic pathways might be more robust than energetics suggest (Pande)

?

But what’s the alternative?

• To Local Minimization
– Sample (MC, Low Mode etc.) and minimize

• To Energy evaluation
– Exhaustively sample and minimize

• To time evolution
– Elastic network? Low mode dynamics?
– Run MD!

Experiments I Wish Were Done

• Protein Crystallography
– Predict the room temperature density

• Small molecule NMR
– Predict the dominant low energy conformer

• Protein Electrostatics
– Predict potentials in the active site

• Host-guest systems
– Binding energies, salt effects

And how I wish they were done: Maximal Disinformation Testing

1. FIRST calculate for two or more methods, e.g. polarization vs static, PB vs MD, MD vs MM

2. Prospectively measure those systems that most distinguish methods (mutual disinformation)

3. Adapt theories: no one’s perfect!

4. Repeat steps 1, 2 & 3

5. Does a prediction ‘gap’ persist?

E.g. Kepler vs Epicycles.
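Step 2 of the procedure above, choosing which systems to measure, can be sketched as a simple ranking by disagreement. This is a minimal illustration, assuming each method's predictions are held in a dictionary keyed by system name; the function name, the system names, and the energies are all hypothetical.

```python
def most_distinguishing(pred_a, pred_b, top_n=3):
    """Return the systems on which two methods disagree the most.

    pred_a and pred_b map system name -> predicted value. Only systems
    predicted by both methods are considered; ties are broken by name.
    """
    shared = pred_a.keys() & pred_b.keys()
    gaps = {name: abs(pred_a[name] - pred_b[name]) for name in shared}
    ranked = sorted(gaps, key=lambda name: (-gaps[name], name))
    return ranked[:top_n]


# Made-up predicted binding energies (kcal/mol) for hypothetical guests.
pb_predictions = {"guest1": -5.0, "guest2": -7.5, "guest3": -3.0, "guest4": -6.0}
md_predictions = {"guest1": -5.2, "guest2": -4.0, "guest3": -3.1, "guest4": -8.0}

to_measure = most_distinguishing(pb_predictions, md_predictions, top_n=2)
print(to_measure)
```

Measuring systems where the methods already agree cannot separate them, whatever the experimental outcome, so the effort goes to the largest gaps first.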

Final Thoughts

• I’d love MD to work! Make my job easier
• It doesn’t. At least not as advertised/believed
• Its nature (“physics”, big calculations, movies) leads to overconfidence
• Until a more scientific approach is adopted it’s unlikely to get better. GPUs won’t save MD
• What’s needed is Maximal Disinformation Testing & Model systems
