Understanding Protein Electrostatics Using Boundary-Integral Equations
Jaydeep P. Bardhan
Dept. of Physiology and Molecular Biophysics
Rush University Medical Center, Chicago IL
Joint work with
• M. Knepley (Computation Institute, U. Chicago)
• P. Brune (Math and Computer Science Division, Argonne)
• A. Hildebrandt (J. Gutenberg U., Mainz, Germany)
Outline:
• Preliminaries:
  Biomolecule electrostatics
  Continuum theory and boundary-integral methods
  Numerical simulation
1. Fast Poisson approximation
2. Nonlocal continuum model
[Diagram: Applied Math, Computer science (HPC), Biophysics; My research]
Emphasizing the interdisciplinary nature of computational biophysics
Fact: Water Makes Life Possible
[Images: L. Freberg; Vander; Fox; Kass ‘05]
A Crucial Consequence of Solvation
• Molecular binding involves sacrificing solute--solvent interactions for solute--solute interactions:
Basic Continuum Electrostatic Theory
• 100-1000 times faster than molecular dynamics (MD)
• Protein model:
  Shape: “union of spheres” (atoms)
  Point charges at atom centers
  Not very polarizable: dielectric constant ε = 2-4
• Water model:
  No fixed charges
  Single water: sphere of radius 1.4 Angstrom
  Highly polarizable: ε = 80
• In total: mixed-dielectric Poisson problem
Modeling ions in solution is critical! But today’s focus is on the simpler math of “pure” water.
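The mixed-dielectric Poisson problem described on this slide can be written out as follows. This is a sketch in standard notation, not copied from the slides: the subscripts I (protein) and II (water), the surface S, the normal derivative ∂/∂n, and the choice of units (constants such as 4π absorbed into the charges) are all assumptions made here for illustration.

```latex
\begin{aligned}
\nabla^2 \varphi_{I}(\mathbf{r}) &= -\frac{1}{\epsilon_{I}} \sum_{i} q_i\, \delta(\mathbf{r}-\mathbf{r}_i) && \text{inside the protein},\\
\nabla^2 \varphi_{II}(\mathbf{r}) &= 0 && \text{in the water},\\
\varphi_{I}(\mathbf{r}) &= \varphi_{II}(\mathbf{r}) && \text{on } S,\\
\epsilon_{I}\,\frac{\partial \varphi_{I}}{\partial n}(\mathbf{r}) &= \epsilon_{II}\,\frac{\partial \varphi_{II}}{\partial n}(\mathbf{r}) && \text{on } S,\\
\varphi_{II}(\mathbf{r}) &\to 0 && \text{as } |\mathbf{r}| \to \infty .
\end{aligned}
```

With the values quoted on the slide, ε_I would be 2-4 and ε_II would be 80.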
Solving the PDE Directly is Possible, But…
The idea: Just throw down a finite-difference grid or a finite-element mesh and go to town!
PDE complications:
1. Boundary conditions are at infinity
2. Point charges must be spread onto the grid
3. The dielectric interface is approximated
Green’s Representation Formula
• Well known: boundary values of a harmonic function determine it uniquely
  Dirichlet: potential given on S
  Neumann: normal derivative given on S (for exterior domain!)
• The challenge is determining the function in D given the boundary values (BV).
  Separation of variables
  Numerical: finite elements, finite differences...
  In 3 dimensions, solve for a 3-dimensional unknown
• Alternatively: if you knew BOTH boundary conditions, Green’s representation formula gives the potential anywhere in D (3 dimensions) using surface integrals ONLY
Thus: finding the other boundary condition gives you the answer directly
[Figure: example domain D with boundary S]
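For reference, the representation formula this slide relies on can be written out. This is the standard textbook form rather than a transcription of the slide; it assumes the free-space Green's function G, a bounded domain D with boundary S, and a unit normal n pointing out of D.

```latex
\varphi(\mathbf{r}) = \int_S \left[ G(\mathbf{r},\mathbf{r}')\,\frac{\partial \varphi}{\partial n'}(\mathbf{r}') - \varphi(\mathbf{r}')\,\frac{\partial G}{\partial n'}(\mathbf{r},\mathbf{r}') \right] dA',
\qquad
G(\mathbf{r},\mathbf{r}') = \frac{1}{4\pi\,|\mathbf{r}-\mathbf{r}'|},
\qquad \mathbf{r} \in D .
```

For an exterior domain, with the potential decaying at infinity and n pointing into D, both terms on the right change sign. Either way, the volume unknown is replaced by two surface quantities, one of which is the given boundary data.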
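Deriving a Boundary Integral Equation: Exterior Neumann Problem
• Given the normal derivative on S, need to find the potential on S
• Start from Green’s representation: r is in the domain D, and the given data live at points r’ on the boundary S
• Let r approach the surface
[Figure: exterior domain D and boundary S]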
Addressing the Singularities
• Single-layer potential: continuous as r approaches the surface
• Dipole-layer potential: limit depends on WHICH SIDE of the surface your point is approaching!!
[Figure: a layer of charges (+) and a layer of dipoles (-+) on a patch of the surface S, with the evaluation point r approaching the patch from the domain D]
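Deriving a Boundary Integral Equation: Exterior Neumann Problem
• Given the normal derivative on S, need to find the potential on S
• Taking the limit of Green’s representation as r approaches S gives an integral equation posed on the surface alone
[Figure: exterior domain D and boundary S]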
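The limiting process for the exterior Neumann problem can be sketched in three steps. The notation is assumed rather than taken from the slide: G is the free-space Green's function 1/(4π|r - r'|), the normal n points from the body into the exterior domain D, S is taken to be smooth, and "p.v." marks a principal-value integral.

```latex
\begin{aligned}
&\text{Representation for } \mathbf{r} \in D: &
\varphi(\mathbf{r}) &= \int_S \left[ \varphi(\mathbf{r}')\,\frac{\partial G}{\partial n'}(\mathbf{r},\mathbf{r}') - G(\mathbf{r},\mathbf{r}')\,\frac{\partial \varphi}{\partial n'}(\mathbf{r}') \right] dA' \\
&\text{Dipole-layer limit from } D: &
\lim_{\mathbf{r} \to S} \int_S \varphi(\mathbf{r}')\,\frac{\partial G}{\partial n'}\, dA' &= \tfrac{1}{2}\,\varphi(\mathbf{r}) + \mathrm{p.v.}\!\int_S \varphi(\mathbf{r}')\,\frac{\partial G}{\partial n'}\, dA' \\
&\text{Integral equation on } S: &
\tfrac{1}{2}\,\varphi(\mathbf{r}) - \mathrm{p.v.}\!\int_S \varphi(\mathbf{r}')\,\frac{\partial G}{\partial n'}\, dA' &= -\int_S G(\mathbf{r},\mathbf{r}')\,\frac{\partial \varphi}{\partial n'}(\mathbf{r}')\, dA'
\end{aligned}
```

The right-hand side of the last line contains only the given Neumann data, so the surface potential is the single unknown. A quick sanity check is a point charge at the center of a sphere of radius a: the surface potential 1/(4πa) and normal derivative -1/(4πa²) satisfy the equation term by term.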
Why Bother With Integral Equations?
• Exterior problems: the domain extends to infinity
• Problems with mostly empty, uninteresting space
[Figure panels: “Easy problem”, “Medium problem”]
Practical Advantages of PDE Approaches
1. Much more general (nonlinearity, etc.)
2. Easier to parallelize (that’s different from “easy”)
3. Often easier to explore model space (see point 1)
4. PDE solvers give sparse systems; BIE, dense systems!
PDE:
• Accessible (many codes)
• Reliable, durable
• Versatile, does OK job
BIE:
• Less accessible (few codes)
• Hard to convince it to run
• Does one thing really well
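Similarity Between FEM and BEM
• Both are weighted residual methods
• FEM: enforce that the residual is orthogonal to the test functions (Galerkin method)
• BEM: piecewise-constant basis functions (1 on panel i, 0 elsewhere); enforce the same kind of weighted-residual condition (Galerkin)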
Differences Between BEM and FEM
1. Extra freedom in choosing test functions
   Collocation: test functions = delta functions at the centroids of elements
2. Matrix elements are harder to compute
   Galerkin FEM: smooth integrand, easily computed with quadrature!
   Galerkin BEM: double integral of a singular function!!
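To make the contrast concrete, here is how the matrix entries differ for a generic integral operator with kernel K discretized with piecewise-constant panel basis functions. The symbols (K, panel_i, centroid r_i) are illustrative assumptions, not notation from the slides.

```latex
\begin{aligned}
\text{Collocation:} \quad & A_{ij} = \int_{\text{panel}_j} K(\mathbf{r}_i, \mathbf{r}')\, dA', \qquad \mathbf{r}_i = \text{centroid of panel } i, \\
\text{Galerkin:} \quad & A_{ij} = \int_{\text{panel}_i} \int_{\text{panel}_j} K(\mathbf{r}, \mathbf{r}')\, dA'\, dA .
\end{aligned}
```

For the Laplace kernel, K grows like 1/|r - r'| as the points coincide, so the diagonal and near-diagonal entries need analytical or specialized quadrature in both cases, and the Galerkin entries need it for a double integral.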
Fast Solvers for Integral Equations
1. Solve Ax=b approximately using Krylov-subspace iterative methods such as GMRES
   Storing matrix: O(N^2) time and memory
   Each multiply: O(N^2) time
2. Compute dense matrix-vector product using O(N) method (fast multipole; tree code; precorrected FFT; FFTSVD)
   Storing compressed matrix: O(N) time and memory
   Each multiply: O(N) time
3. Improve iterative convergence with preconditioning
   P “looks like” A^-1
   Iteration converges faster if matrix eigenvalues are “well clustered”
4. For many problems, use diagonal entries!
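The ingredients listed above fit together as in the following minimal sketch: an unrestarted, left-preconditioned GMRES that touches the matrix only through a `matvec` callback (which is where a fast multipole or tree-code product would plug in) and accepts a preconditioner that "looks like" the inverse. Everything here is illustrative: the function names, the small dense test matrix, and the diagonal (Jacobi) preconditioner are assumptions for the demo, not the solver used in the talk.

```python
import math

def gmres(matvec, b, precond=lambda v: v, tol=1e-10, max_iter=50):
    """Unrestarted, left-preconditioned GMRES for P A x = P b, starting from x = 0."""
    n = len(b)
    r0 = precond(b)
    beta = math.sqrt(sum(v * v for v in r0))
    if beta == 0.0:
        return [0.0] * n, 0
    Q = [[v / beta for v in r0]]   # orthonormal Krylov basis vectors
    H = []                         # columns of the rotated Hessenberg matrix
    cs, sn = [], []                # Givens rotation coefficients
    g = [beta]                     # rotated right-hand side of the least-squares problem
    k = 0
    for k in range(max_iter):
        w = precond(matvec(Q[k]))
        h = []
        for q in Q:                # modified Gram-Schmidt orthogonalization
            hij = sum(a * c for a, c in zip(w, q))
            h.append(hij)
            w = [a - hij * c for a, c in zip(w, q)]
        hnext = math.sqrt(sum(a * a for a in w))
        for i in range(k):         # apply earlier rotations to the new column
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        denom = math.hypot(h[k], hnext)
        c, s = h[k] / denom, hnext / denom
        cs.append(c)
        sn.append(s)
        h[k] = c * h[k] + s * hnext
        H.append(h)
        g.append(-s * g[k])
        g[k] = c * g[k]
        if abs(g[k + 1]) < tol * beta or hnext == 0.0:
            break
        Q.append([a / hnext for a in w])
    m = k + 1
    y = [0.0] * m
    for i in reversed(range(m)):   # back substitution on the triangular system
        y[i] = (g[i] - sum(H[j][i] * y[j] for j in range(i + 1, m))) / H[i][i]
    x = [sum(y[j] * Q[j][i] for j in range(m)) for i in range(n)]
    return x, m

# Hypothetical test problem: nonsymmetric, diagonally dominant, widely varying diagonal.
n = 30
A = [[(2.0 + i) if i == j else (1.0 if j > i else 0.5) / (1 + abs(i - j)) ** 2
      for j in range(n)] for i in range(n)]
b = [1.0] * n
matvec = lambda v: [sum(a * x for a, x in zip(row, v)) for row in A]
jacobi = lambda v: [x / A[i][i] for i, x in enumerate(v)]   # P = inverse of the diagonal

x_plain, its_plain = gmres(matvec, b)
x_prec, its_prec = gmres(matvec, b, precond=jacobi)
print("iterations without / with diagonal preconditioner:", its_plain, its_prec)
```

Because the solver never indexes into A directly, replacing `matvec` with an O(N) approximate product changes the cost per iteration without changing the iteration itself.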
A Boundary Integral Method For Biomolecule Electrostatics
1. Boundary conditions handled exactly
2. Point charges are treated exactly
3. Meshing emphasis can be placed directly on the interface
Conservation law
Constitutive relation
BIBEE: A New, Rigorous Model of Continuum Electrostatics for Proteins
“Boundary Integral Based Electrostatics Estimation”
• Idea: Use preconditioner to approximate inverse
  No need to compute sparsified operator (saves time and memory)
  No need for Krylov solve
• Test of elementary charges in a 20-Angstrom sphere:
  [Figure panels: single +1 charge; +1, -1 charges 3 Angstroms apart]
BIBEE: Introducing Different Variants
• The preconditioning approximation takes into account the singular character of the electric-field kernel
• The Coulomb-field approximation (CFA) ignores the operator entirely
[Figure: two test cases. CFA seems better in one… and worse in the other.]
BIBEE Approximates the Eigenvalues of the Boundary Integral Operator
• The integral operator has to be split into two terms
• BIBEE approximates E’s eigenvalues
  P uses 0 (limit for sphere, prolate spheroid)
  CFA uses -1/2 (known extremal)
• Eigenvalues are real in [-1/2,+1/2)
• -1/2 is always an eigenvalue
• Left, right eigenvectors of -1/2 are constants
A hundred years of analysis!
[Figure: analytical eigenvalues for the sphere, indexed by i: -1/2, -1/6, -1/10, …]
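The sphere makes the two approximations easy to compare mode by mode. The sketch below assumes the standard apparent-surface-charge form in which a mode with operator eigenvalue lam responds as 1/(lam + c), with c = (eps_in + eps_out) / (2 (eps_in - eps_out)); that formula, the function names, and the dielectric values 4 and 80 are assumptions made for illustration, while the eigenvalues -1/2, -1/6, -1/10 match the ones listed on the slide.

```python
def sphere_eigenvalue(l):
    """Eigenvalue of the operator on a sphere for spherical-harmonic degree l."""
    return -1.0 / (2.0 * (2 * l + 1))

def mode_response(lam, eps_in, eps_out):
    """Response of one mode, 1 / (lam + c), up to a factor common to all variants."""
    c = (eps_in + eps_out) / (2.0 * (eps_in - eps_out))
    return 1.0 / (lam + c)

def bibee_ratios(l, eps_in=4.0, eps_out=80.0):
    """Return (CFA / exact, P / exact) response ratios for degree l."""
    exact = mode_response(sphere_eigenvalue(l), eps_in, eps_out)
    cfa = mode_response(-0.5, eps_in, eps_out)   # CFA: every eigenvalue set to -1/2
    p = mode_response(0.0, eps_in, eps_out)      # P: every eigenvalue set to 0
    return cfa / exact, p / exact

for l in range(4):
    cfa_ratio, p_ratio = bibee_ratios(l)
    print(l, sphere_eigenvalue(l), round(cfa_ratio, 3), round(p_ratio, 3))
```

Under these assumptions the CFA ratio is exactly 1 for the monopole and below 1 for higher modes, while the P ratio is above 1 and approaches 1 as l grows. Since solvation energies are negative, that is the pattern one would expect from an upper bound and a lower bound respectively.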
BIBEE Clarifies an Empirical, Heuristic Model
• Still equation: the basis of totally nonphysical Generalized Born (GB) models
• “Effective Born radius”: the radius of a sphere with the same solvation energy
• Coulomb-field approximation: corresponds exactly to ignoring the integral operator
• BIBEE/CFA is the extension of CFA to multiple charges! The BIBEE approximate charge includes all contributions
• No ad hoc parameters, no heuristic interpolation
• Same approach taken by Borgis et al. in variational CFA
[Figure: charges with effective Born radii R1, R2, R3]
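For readers who have not met the Generalized Born model, the following sketch shows the Born energy of a single ion, the inverse relation that defines an effective Born radius, and the widely quoted Still interpolation formula. The function names, the default dielectric constants, and the unit convention (charges in e, lengths in Angstroms, energies in kcal/mol via the constant 332.06) are choices made for this example, not taken from the slides.

```python
import math

COULOMB_KCAL = 332.06  # Coulomb constant in kcal*Angstrom/(mol*e^2)

def born_energy(q, radius, eps_in=1.0, eps_out=80.0):
    """Solvation energy of a charge q at the center of a sphere of the given radius."""
    return -0.5 * COULOMB_KCAL * (1.0 / eps_in - 1.0 / eps_out) * q * q / radius

def effective_born_radius(q, energy, eps_in=1.0, eps_out=80.0):
    """Radius of the sphere whose Born energy equals the given solvation energy."""
    return -0.5 * COULOMB_KCAL * (1.0 / eps_in - 1.0 / eps_out) * q * q / energy

def f_gb(r, radius_i, radius_j):
    """Still's interpolation between the Born (r = 0) and Coulomb (large r) limits."""
    rr = radius_i * radius_j
    return math.sqrt(r * r + rr * math.exp(-r * r / (4.0 * rr)))

def gb_energy(charges, positions, radii, eps_in=1.0, eps_out=80.0):
    """Generalized Born energy: a double sum over all charge pairs, including i = j."""
    prefactor = -0.5 * COULOMB_KCAL * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    for qi, pi, ri in zip(charges, positions, radii):
        for qj, pj, rj in zip(charges, positions, radii):
            total += qi * qj / f_gb(math.dist(pi, pj), ri, rj)
    return prefactor * total

# A single charge reduces to the Born ion.
print(gb_energy([1.0], [(0.0, 0.0, 0.0)], [2.0]), born_energy(1.0, 2.0))
```

The interpolation function `f_gb` is the heuristic ingredient: it is chosen to hit the two known limits rather than derived from the Poisson problem, which is the contrast the slide draws with BIBEE/CFA.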
BIBEE/CFA Energy Is a Provable Upper Bound
• BIBEE/P is an effective lower bound, provable in some cases but not all
• Another variant (BIBEE/LB) is a provable lower bound but too loose to be useful
Bardhan, Knepley, Anitescu (2009)
Feig et al. test set, > 600 proteins
BIBEE: Improve by Analyzing the Sphere
• Get first mode (monopole) analytically correct; other modes are bounded from below: tighter lower bound!
• Impact on sphere is better than impact on proteins (Feig et al. test set)
Bardhan+Knepley, J. Chem. Phys. (in press)
[Figure: sphere eigenvalues, indexed by i: -1/2, -1/6, -1/10, …]
BIBEE: Accurate One-parameter Model
• This effective parameter is expected to be rigorously determined by approximating protein as ellipsoid (Onufriev+Sigalov, ‘06)
Bardhan+Knepley, J. Chem. Phys. (in press)
[Figure: sphere eigenvalues, indexed by i: -1/2, -1/6, -1/10, …]
Dominant energies come from dominant modes: try to capture dipole/quadrupole modes approximately!
BIBEE: A New, Rigorous Model
• BIBEE is 3-5 times faster than full solve (including large setup time for both)
• Unoptimized implementation (will save big on setup time)
• Modern FMM implementation (Yokota, Knepley, Barba, et al.) gives 10-20X speedup
# boundary elements          3968     7564     15,212    32,022    49,708
Total BEM time               18.368   24.493   87.647    515.256   735.092
(Matrix compression time)    (3.271)  (6.665)  (18.274)  (62.217)  (109.040)
SGB/CFA (heuristic) time     1.198    2.623    7.070     14.611    28.861
BIBEE time                   2.974    6.540    18.125    39.066    77.205
Test systems range from a tripeptide (fewest elements) to a protein-drug complex (most elements).
Reaction-Potential Operator Eigenvectors Have Physical Meaning
• Eigenvectors from distinct eigenvalues are orthogonal
• Thus: the eigenvectors correspond to charge distributions that do not interact via solvent polarization (weird, huh?)
• If an approximate method generates a solvation matrix, its eigenvectors should “line up” well with the actual eigenvectors, i.e. the projection of approximate eigenvector i onto actual eigenvector j should be large only for i = j
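The statements on this slide can be put in matrix form. The symbols below (L for the reaction-potential matrix, q for the vector of charges, hats for the approximate model) are assumed notation for illustration and do not come from the slides; L is taken to be symmetric, which is what makes eigenvectors from distinct eigenvalues orthogonal.

```latex
\begin{aligned}
L &= V \Lambda V^{T}, \qquad V^{T} V = I, \\
\Delta G &= \tfrac{1}{2}\, q^{T} L\, q = \tfrac{1}{2} \sum_{i} \lambda_i \left( v_i^{T} q \right)^{2}, \\
\hat{L} &= \hat{V} \hat{\Lambda} \hat{V}^{T}, \qquad \left| \hat{v}_i^{T} v_j \right| \approx \delta_{ij} \quad \text{if the approximate eigenvectors line up.}
\end{aligned}
```

The second line shows why the modes do not interact: the energy is a sum of independent terms, one per eigenvector, with no cross terms between different modes.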
“Getting the Modes Right” Is Important
• Modes from small eigenvalues still contribute significantly to the total energy
• Here, 25% of the total energy comes from modes with eigenvalues smaller than 1% of the maximum eigenvalue
[Figure: eigenvalue magnitude (log scale), projection of charge distribution onto eigenvector, and cumulative electrostatic free energy (kcal/mol), each plotted against eigenvalue index]
BIBEE Is An Accurate, Parameter-Free Model
• Peptide example: Met-enkephalin, snapshots from MD
[Figure panels: SGB/CFA, GBMV, BIBEE/CFA]
All models look essentially the same here.
BIBEE’s stronger “diagonal” appearance indicates superior reproduction of the eigenvectors of the operator.
BIBEE: A New, Rigorous Model of Continuum Electrostatics for Proteins
[Diagram: Applied Math, Computer science (HPC), Biophysics; BIBEE]
• Design systematic approximation
• Have proved that the model gives upper and lower bounds and preserves important physics
• Relate empirical models to strong math
• Leverages existing algorithms (e.g. fast multipole methods, parallel codes)
Next: Apply to other physics problems
Nonlocal Continuum Electrostatics: Adding molecular realism “the right way”
KNOWN weaknesses of Poisson model:
1. Linear response assumption
   Test with all-atom molecular dynamics: y=x denotes exactly linear response
   Relatively small deviation!
   Caveat: Nonlinearity IS important for more highly charged species!
   Nina, Beglov, Roux ‘97
2. Violates continuum-length-scale assumption
   Water molecules have finite size
   Water molecules form semi-structured networks
[Figure labels: oxygen, hydrogens, lone pair electrons, hydrogen bonds]
First look for ways to extend existing models--don’t just give up and reinvent everything!
Nonlocal Continuum Electrostatics: Demonstrating the Failure Mode
• Run all-atom molecular dynamics: ion surrounded by water
  Significant structuring of charge density!
• Consequence: ion energies are wrong
[Figure: ion energy vs. ion radius in nanometers. Data points: radii from molecular simulation (Aqvist 1990) and energies from experimental data]
Nonlocal Continuum Modeling: A Classical Multiscale Theory
• Studied since the 1970s in numerous domains
  de Abajo ‘08; Schatz et al. ‘01; Park ‘06; Scott et al. ‘04; Duan et al. ‘07; Gao et al. ‘09
• Problems whose length scales are NOT well separated from those of the constituent molecules!
• Expect nonlocal theory to play major roles in nanoscale science and engineering modeling…
Nonlocal Continuum Electrostatics: Nonlocal Dielectric Response
• Polarization charge as a function of distance from the ion: not simple
  Short-range: electronic response
  Long-range: bulk behavior
• Local: bulk everywhere
• Nonlocal: simple function that captures asymptotes
  Smoothly interpolates between known limits
  Supported by experiments and detailed simulations
[Figure: dielectric response vs. wave number (inverse distance), with the local response shown for comparison]
Nonlocal Continuum Electrostatics: Lorentzian Model and Promising Tests
• Nonlocal response: the Lorentzian kernel is a Green’s function
• Now: integrodifferential Poisson equation
• Single parameter fit for the model’s length scale gives much better agreement with experiment!!
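A small sketch of the Lorentzian dielectric function in wave-number space shows how a single length scale interpolates between the bulk and short-range limits. The functional form used here, eps_inf + (eps_bulk - eps_inf) / (1 + (lam k)^2), is the commonly cited Lorentzian model; the parameter names and the numerical values (80 for bulk water, 1.8 for the short-range limit, 15 Angstroms for the length scale) are illustrative assumptions, not values read from the slides.

```python
def lorentzian_epsilon(k, eps_bulk=80.0, eps_inf=1.8, lam=15.0):
    """Lorentzian dielectric function at wave number k (in 1/Angstrom)."""
    return eps_inf + (eps_bulk - eps_inf) / (1.0 + (lam * k) ** 2)

# Small k (long distances) recovers the bulk value; large k (short distances)
# approaches the short-range limit. lam sets where the crossover happens.
for k in (0.0, 0.01, 0.1, 1.0, 10.0):
    print(k, round(lorentzian_epsilon(k), 3))
```

A local model corresponds to a constant function of k, which is why it cannot capture both limits at once.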
Nonlocal Continuum Electrostatics: Reformulation for Fast Simulations
• Integrodifferential equations in complex geometries?
• Result: No progress on nonlocal model for DECADES
  Spherical ions, charges near planar half-spaces… nothing else.
• Breakthrough in 2004 (Hildebrandt et al.):
  1. Define an auxiliary field: the displacement potential
  2. Approximate the nonlocal boundary condition
  3. Double reciprocity leads to a boundary-integral method
[Figure panels: “Licorice”, “Cartoon”, molecular surface]
Nonlocal Continuum Electrostatics: 1. Introduce an Auxiliary Potential
• Use Helmholtz decomposition
• Electrostatic potential now satisfies a Yukawa (linearized Poisson-Boltzmann) equation
• Displacement potential acts as a volume source
Nonlocal Continuum Electrostatics: 2. Approximate Nonlocal B.C.
• Original boundary conditions
• The exact normal derivatives of the solvent potentials satisfy a nonlocal boundary condition: choose to drop the nonlocal term
• The actual PDEs complete the local formulation
Nonlocal Continuum Electrostatics: 3. Green’s Theorem + Double Reciprocity
• Electric potential: Green’s theorem gives a volume integral
• The displacement potential is harmonic
• Defining single- and double-layer operators
Nonlocal Continuum Electrostatics: Purely BIE Formulation
• Three surface variables, two types of Green’s functions, and a mixed first-second kind problem
• Fasel et al. have recently derived a purely second-kind method
Hildebrandt et al. 2005, 2007
Nonlocal Continuum Electrostatics: Analytical Solution for Sphere
• For sphere, these operators share a common eigenbasis: spherical harmonics. All of these are diagonal
• Solve each mode independently and presto!
• Note: This is not about matching interior and exterior expansions--unlike the Kirkwood solution for local model
• This decomposition may provide further analytical insights (e.g., eigenvectors of reaction-potential operator)
Bardhan and Brune, to be submitted
Nonlocal Continuum Electrostatics: Charge Burial and the pKa Problem
• Understanding charge burial energetics is important!
  For protein folding, misfolding (Alzheimer’s), etc.
  For two molecules binding (drug-protein, protein-protein, etc.)
  For change in environment (pH, temperature, concentration, etc.)
• Compare: ion or charged chemical group, alone in water, vs. buried in protein
• Local theory needs unrealistically large dielectric constants to match experiment!
• Measured protein dielectric constants suggest ε = 2-5
[Figure: error in pKa value (RMSD, 0-3) vs. dielectric constant (5-80); Demchuk+Wade, 1996]
Nonlocal Continuum Electrostatics: Charge Burial and the pKa Problem
• Nonlocal theory with a realistic dielectric constant predicts energies similar to those of (widely successful) local theories with unrealistic dielectric constants!
Bardhan, J. Chem. Phys. (in press)
Nonlocal Continuum Electrostatics: Fast Solver is a Must for Accurate Studies
• O(N^2) memory limitation: big discretization errors
• O(N) fast solver: only way to get accurate energies
[Figure panels: Dense BEM, Fast BEM. Illustration of surface representations for memory-constrained dense and fast BEM]
# of boundary elements    1000    10,000    100,000    1,000,000
Memory needed (GB)        0.07    7         700        70,000
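The memory figures in the table can be reproduced with a back-of-the-envelope calculation. The sketch below assumes three unknowns per boundary element (the nonlocal formulation has three surface variables) and 8-byte double-precision entries; that reading is an inference that happens to match the table to one significant figure, not something stated on the slide.

```python
def dense_memory_gb(n_elements, unknowns_per_element=3, bytes_per_entry=8):
    """Memory in GB to store the full dense matrix for a given mesh size."""
    n = n_elements * unknowns_per_element   # total number of unknowns
    return n * n * bytes_per_entry / 1e9

for n_elements in (1000, 10_000, 100_000, 1_000_000):
    print(n_elements, dense_memory_gb(n_elements))
```

Each tenfold refinement of the mesh costs a hundredfold more memory for the dense matrix, which is why the O(N) compressed representation is the only practical route to fine discretizations.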
Nonlocal Continuum Electrostatics: Fast BIE Solver Performance
• Time and memory scale linearly in the number of unknowns
• Unoptimized code still allows a laptop to solve 10X larger problems than is possible on a cluster
• Preconditioning is vital (use diagonal entries of blocks)
Dense methods used previously could not achieve useful accuracy!
Required accuracy
Bardhan and Hildebrandt, DAC ‘11
Nonlocal Continuum Electrostatics: Fast BIE Solver Enables Tests on Proteins
[Figure panels: Local Model, Nonlocal Model]
• Observe reduced “electrostatic focusing”
• Next step: compare to molecular dynamics
Bardhan and Hildebrandt, DAC ‘11
Nonlocal Continuum Electrostatics: Adding molecular realism “the right way”
[Diagram: Applied Math, Computer science (HPC), Biophysics; Nonlocal model]
• Extend the space of models that are supported by good theory
• Derive fast analytical methods for testing the new theories
• Test on important open questions
• Build high-performance solvers for realistic, accurate simulations
Summary:
[Diagram: Applied Math, Computer science (HPC), Biophysics; My research]
• Improve understanding of existing models
• Develop new models on strong foundations
• Stringent tests of new models
• Identify critical model weaknesses
• Explain previously unresolved phenomena
• Leverage HPC expertise by re-using computational primitives
• “Think computationally” to gain new insights into model development
Acknowledgments
• Support:
  Wilkinson Fellowship at Argonne National Lab
  Partial support from a Rush University New Investigator award
• Colleagues:
  Ridgway Scott (U. Chicago)
  Bob Eisenberg, Dirk Gillespie (Rush)
  Mala Radhakrishnan (Wellesley)
  Nathan Baker (Pacific Northwest Nat’l Lab)