


Extrapolation and interpolation strategies for efficiently estimating structural observables as a function of temperature and density

Jacob I. Monroe,1, a)

Harold W. Hatch,1

Nathan A. Mahynski,1

M. Scott Shell,2

and Vincent K. Shen1, b)

1)National Institute of Standards and Technology, Gaithersburg, MD

2)University of California - Santa Barbara, Santa Barbara, CA

(Dated: 20 October 2020)

Thermodynamic extrapolation has previously been used to predict arbitrary structural observables in molecular simulations at temperatures (or relative chemical potentials in open-system mixtures) different from those at which the simulation was performed. This greatly reduces the computational cost in mapping out phase and structural transitions. In this work we explore the limitations and accuracy of thermodynamic extrapolation applied to water, where qualitative shifts from anomalous to simple-fluid-like behavior are manifested through shifts in the liquid structure that occur as a function of both temperature and density. We present formulae for extrapolating in volume for canonical ensembles and demonstrate that linear extrapolations of water’s structural properties are only accurate over a limited density range. On the other hand, linear extrapolation in temperature can be accurate across the entire liquid state. We contrast these extrapolations with classical perturbation theory techniques, which are more conservative and slowly converging. Indeed, we show that such behavior is expected by demonstrating exact relationships between extrapolation of free energies and well-known techniques to predict free energy differences. An ideal gas in an external field is also studied to more clearly explain these results for a toy system with fully analytical solutions. We also present a recursive interpolation strategy for predicting arbitrary structural properties of molecular fluids over a predefined range of state conditions, demonstrating its success in mapping qualitative shifts in water structure with density.

I. INTRODUCTION

With modern advances in computational power and algorithms, data generated via molecular simulation has become ubiquitous across numerous fields.1,2 Despite the availability of past data and the ease with which new results may be generated, it is crucial that data be generated and utilized efficiently. Rather than improve computational efficiency directly through enhanced sampling algorithms3,4 or refining hardware usage,5–11 we focus here on general developments of statistical mechanical theory. It is often the case that simulation data is desired to demonstrate a trend with a specific state condition or other variable. For instance, it might be of interest to know the unfolding behavior of a protein as a function of temperature, or how the adsorption of fouling species at an interface changes over a wide range of temperatures, pressures, or even some adjustable parameter of the solute or interface. Many such quantities may be considered averages over configurations generated through Monte Carlo, molecular dynamics, or ab initio simulations. Recent work has provided “mapped averaging” formulas for calculating structural properties, greatly reducing the simulation time needed to achieve highly precise estimates of physical observables.12–14 However, this technique does not address the need to run multiple simulations at different state points, which can consume time.

Complementary to mapped averaging, thermodynamic extrapolation15–18 provides an efficient route to estimate the variation of observables with changes in simulation conditions, such as temperature or pressure. Thermodynamic extrapolation is based on statistical mechanical relationships

a)Electronic mail: [email protected]
b)Electronic mail: [email protected]

that, via a single simulation, directly provide estimates of the derivatives of an observable with respect to a specific variable. This technique has met with great success in predicting grand canonical number density distributions across temperatures17,18 and relative chemical potentials.15 It has also proven effective in predicting changes in virial coefficients upon alchemical variation of simulation parameters.19

A number of structural properties as a function of temperature have also been successfully predicted for relatively simple model systems.20 Use of extrapolation expressions to provide temperature derivatives of dynamical quantities has been developed independently21,22 and used to more efficiently determine activation energies21 and fit kinetic expressions associated with structural reorganization of water.23

An open question remains, however, as to the general accuracy and precision of these techniques and their relevance to more realistic fluids. For instance, liquid water exhibits a number of anomalies as a function of state conditions.24,25

With changes to temperature and density, water’s fundamental structure may flip between what is expected for tetrahedral and simple fluids.26–28 Such changes are expected to be difficult for thermodynamic extrapolation to capture — indeed, we clearly demonstrate that this technique is subject to the same pitfalls as any extrapolation strategy. Interestingly, a recent study utilizing first-order extrapolation techniques successfully captured the temperature dependence of oxygen-oxygen radial distribution functions of liquid water.29 We find, however, that this is not necessarily general behavior. Namely, the success of the method relies on accurately capturing the local curvature in the observable of interest over a desired range of the extrapolation variable (e.g., temperature or density).

To better understand the errors inherent in extrapolation, we examine a toy ideal gas system for which analytic solutions exist. We develop intuition for the scaling of uncertainties in this toy system and propose practical guidelines for the


use of extrapolation techniques. We also connect these results to rigorous theories underlying well-known free energy calculation techniques, which allows us to establish strategies and heuristics for consistently obtaining high accuracy from derivative estimation methods. Comparison of our methods to standard re-weighting of observables reveals significant gains in accuracy and efficiency, which we show is expected from previous literature concerning free energy calculations. Using these insights, we propose a polynomial interpolation approach based on thermodynamic extrapolation for efficiently using simulation data at multiple conditions, with popular free energy calculation techniques appearing as special cases of its implementation. Our presented algorithms are packaged into an easily extensible Python library that we use here to analyze both the ideal gas model and simulations of water across a wide range of liquid temperatures and densities. While we find linear extrapolation is capable of capturing the variation of many of water’s structural properties with temperature, more advanced strategies are necessary to describe its behavior with density.

We organize the paper as follows: Section II presents thermodynamic extrapolation and formalizes connections between estimation of general observables and free energies. Section II A covers various classes of observables and how extrapolation must be adapted to each, while II B discusses the differences between extrapolation techniques and re-weighting via standard perturbation theories.30 Section III thoroughly investigates extrapolation, including extensive uncertainty analysis, for an analytic ideal gas system. With the lessons learned from this toy model, we propose a recursive interpolation algorithm in Section IV, providing fundamental connections to free energy calculation techniques, as well as a simple visual check for self-consistency. Section V applies all of these results to simulations of liquid water, testing their applicability in a more realistic system with non-trivial shifts in qualitative structural behavior.

II. THERMODYNAMIC EXTRAPOLATION AND ITS RELATIONSHIP TO RE-WEIGHTING

A. Classifications of thermodynamic extrapolation

Thermodynamic extrapolation is related to re-weighting techniques. To illustrate this, we first introduce common use cases that differ in the functional form of the quantity being extrapolated and its dependence on the extrapolation variable, with free energies appearing as our last example. This will be important later when we demonstrate that use of derivative information to perform polynomial interpolation with multiple data points is equivalent to both thermodynamic integration31 and optimal combinations of cumulant expansions.32 Though not completely exhaustive, the following classifications cover most cases of interest in molecular simulation. We define the function to be extrapolated as Y, the extrapolation variable as a, and some arbitrary simulation observable as X. For instance, if Y is the average end-to-end distance of a polymer chain and we want to know how this varies with temperature (a), then X would be the end-to-end distance at each simulation snapshot. In the simplest case, we want to average an observable of interest that does not explicitly depend on a

$$Y = \langle X \rangle \quad (1)$$

This applies to most structural observables, such as radial distribution functions (RDFs), the radius of gyration of a polymer, or particle density, when extrapolating over temperature. The second case involves a structural observable that explicitly depends on the extrapolation variable.

$$Y = \langle X(a) \rangle \quad (2)$$

Though the above dependence is not common for most average observables extrapolated in temperature (a = T), it is highly relevant to extrapolation over volume or simulation parameters (such as those appearing in a potential energy function). An example is the variation of a canonical ensemble RDF (Y) with respect to volume (a = V), where X is then the number of bin counts at a specific separation distance for a single system configuration. Though the first case is treated exhaustively in previous literature,16,17 we reproduce a simplified version of its derivation in the next section. Thorough investigation of the second case has also been performed17 and will not be repeated here, though we note that the released library of code is capable of handling such scenarios.

The next two cases are simple extensions of the first two. For quantities like excess chemical potentials, we are not interested in the average observable itself, but instead the negative of its natural logarithm.

$$Y = -\ln \langle X \rangle \quad (3)$$

$$Y = -\ln \langle X(a) \rangle \quad (4)$$

Again, X may implicitly or explicitly depend on the extrapolation variable. The latter is true, for example, in the case of excess chemical potentials in the canonical ensemble given by βμ_ex = −ln⟨e^{−βΔU}⟩, with β as the inverse temperature and ΔU as the potential energy difference with and without the added molecule. For the case of hard-sphere excess chemical potentials, however, only Eq. 3 is necessary as the explicit temperature derivatives go to zero. The latter two cases in Eqs. 3 and 4 may be simply handled in terms of the first two by noting that the derivative of the negative natural logarithm of a function can be expressed in closed form at arbitrary order in terms of the function itself and its derivatives up to the same order.

$$\frac{\partial^n Y}{\partial a^n} = \sum_{k=1}^{n} (k-1)!\left(\frac{-1}{\langle X \rangle}\right)^{k} B_{n,k}\!\left(\frac{\partial \langle X \rangle}{\partial a}, \frac{\partial^2 \langle X \rangle}{\partial a^2}, \ldots, \frac{\partial^{n-k+1} \langle X \rangle}{\partial a^{n-k+1}}\right) \quad (5)$$

The above expression may be derived from Faà di Bruno’s formula,33 where B_{n,k} represents the partial Bell polynomials.34
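Eq. 5 can be checked numerically with a short, self-contained sketch (our own pure-Python helper, not part of the authors' released library). It builds the partial Bell polynomials by their standard recursion and assembles the derivatives of Y = −ln⟨X⟩; as a test case we use ⟨X⟩(a) = e^{2a}, for which Y = −2a, so every derivative beyond the first should vanish:

```python
from math import comb, factorial

def bell_partial(n, k, x):
    """Partial (incomplete) Bell polynomial B_{n,k}; x[j-1] holds the j-th
    derivative of <X> with respect to the extrapolation variable a."""
    if n == 0 and k == 0:
        return 1.0
    if n == 0 or k == 0:
        return 0.0
    return sum(comb(n - 1, i - 1) * x[i - 1] * bell_partial(n - i, k - 1, x)
               for i in range(1, n - k + 2))

def neg_log_derivs(x_derivs, mean_x, n_max):
    """Derivatives of Y = -ln<X> up to order n_max via Eq. 5."""
    return [sum(factorial(k - 1) * (-1.0 / mean_x) ** k * bell_partial(n, k, x_derivs)
                for k in range(1, n + 1))
            for n in range(1, n_max + 1)]

# <X>(a) = exp(2a) at a = 0: <X> = 1 and its derivatives are 2, 4, 8, 16;
# Y = -2a, so Y' = -2 and all higher derivatives are zero.
derivs = neg_log_derivs([2.0, 4.0, 8.0, 16.0], 1.0, 4)
```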

Finally, the function of interest may be a free energy, in which case we wish to extrapolate the natural logarithm of a


partition function that will in most cases of interest have an explicit dependence on the extrapolation variable.

$$Y = -\ln X(a) \quad (6)$$

Extrapolation of this quantity in terms of temperature for the canonical ensemble is almost identical to the cumulant expansions of Zwanzig in his original perturbation theory.30 In contrast to our work, Zwanzig focused on the free energy difference with respect to a specific change in the potential energy function. More importantly, Zwanzig always uses the infinite temperature state (inverse temperature β = 1/k_BT equal to zero) as a reference when developing his expansions (hence the “high temperature equation of state”). If we wish to directly represent the free energy at a particular β in terms of a free energy and its derivatives at a reference β₀, we may expand to obtain the following.

$$\beta A(\beta) = \beta_0 A(\beta_0) - \left.\frac{\partial \ln Q}{\partial \beta}\right|_{\beta_0}(\beta - \beta_0) - \frac{1}{2!}\left.\frac{\partial^2 \ln Q}{\partial \beta^2}\right|_{\beta_0}(\beta - \beta_0)^2 + \ldots$$

$$\beta A(\beta) = \beta_0 A(\beta_0) + \langle H \rangle_{\beta_0}(\beta - \beta_0) - \frac{1}{2!}\left(\langle H^2 \rangle_{\beta_0} - \langle H \rangle_{\beta_0}^2\right)(\beta - \beta_0)^2 + \ldots \quad (7)$$

The Helmholtz free energy A is related to the canonical ensemble partition function $Q = \int e^{-\beta H(\mathbf{r}^N, \mathbf{p}^N)}\, d\mathbf{r}^N d\mathbf{p}^N$ as βA = −ln Q, with r^N and p^N representing the positions and momenta of all particles in the system. Averages and derivatives evaluated at specific values of the extrapolation variable are indicated with subscripts. Eq. 7 is a cumulant expansion in terms of the difference in β rather than β itself, with H being the Hamiltonian of the system, including both the potential and kinetic energies. A particularly useful form for many types of free energy calculations results from expanding in terms of an arbitrary variable λ in the Hamiltonian

$$\beta A(\lambda) = \beta A(\lambda_0) + \left\langle \beta \frac{\partial H}{\partial \lambda} \right\rangle_{\lambda_0}(\lambda - \lambda_0) - \frac{1}{2!}\left(\left\langle \left(\beta \frac{\partial H}{\partial \lambda}\right)^2 \right\rangle_{\lambda_0} - \left\langle \beta \frac{\partial H}{\partial \lambda} \right\rangle_{\lambda_0}^2\right)(\lambda - \lambda_0)^2 + \ldots \quad (8)$$
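As a concrete illustration of the cumulant expansion in Eq. 7, the sketch below (a minimal helper of our own; the function name is not from the released library) estimates βA(β) − β₀A(β₀) at second order from Hamiltonian samples collected at β₀:

```python
import numpy as np

def cumulant_free_energy_diff(h_samples, beta0, beta):
    """Second-order cumulant estimate of beta*A(beta) - beta0*A(beta0) (Eq. 7),
    using Hamiltonian samples drawn at inverse temperature beta0."""
    h = np.asarray(h_samples, dtype=float)
    db = beta - beta0
    # First cumulant <H> and second cumulant <H^2> - <H>^2, both evaluated at beta0
    return h.mean() * db - 0.5 * h.var() * db ** 2
```

Truncating at second order introduces exactly the kind of truncation error discussed for extrapolation below; going to higher orders requires converged estimates of higher cumulants of H.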

B. Extrapolation versus perturbation in the canonical ensemble

Before contrasting extrapolation with perturbation theory, we derive temperature extrapolation in the canonical ensemble, which corresponds to the simplest case described above (i.e., Y = ⟨X⟩). More thorough treatments, including extension to other variables, may be found elsewhere,15,17 while more involved derivations for extrapolation of volume in the canonical ensemble are provided in the Appendix. For any observable X(r^N) that only depends on the coordinates r^N of the N atoms in the system and does not depend on momenta or explicitly on temperature, we can write the extrapolation from β₀ by an amount δβ for the ensemble average of X as

$$\langle X \rangle_\beta = \langle X \rangle_{\beta_0} + \left.\frac{\partial \langle X \rangle}{\partial \beta}\right|_{\beta_0}(\delta\beta) + \left.\frac{\partial^2 \langle X \rangle}{\partial \beta^2}\right|_{\beta_0}\frac{(\delta\beta)^2}{2!} + \ldots \quad (9)$$

For structural averages, the relevant normalization factor is the configurational part Z of the canonical ensemble partition function Q, and the appropriate configuration weight is then e^{−βU(r^N)}/Z, with U being the potential energy. The derivatives in Eq. 9 (up to second order in β) become

$$\left.\frac{\partial \langle X \rangle}{\partial \beta}\right|_{\beta_0} = \langle X \rangle_{\beta_0} \langle U \rangle_{\beta_0} - \langle XU \rangle_{\beta_0} \quad (10)$$

$$\left.\frac{\partial^2 \langle X \rangle}{\partial \beta^2}\right|_{\beta_0} = 2\langle X \rangle_{\beta_0} \langle U \rangle_{\beta_0}^2 - \langle X \rangle_{\beta_0} \langle U^2 \rangle_{\beta_0} - 2\langle XU \rangle_{\beta_0} \langle U \rangle_{\beta_0} + \langle XU^2 \rangle_{\beta_0} \quad (11)$$

As noted by Mahynski et al.,20 configurational quantities of interest can almost always be expressed as ensemble averages. This includes each point in any structural distribution function, such as an RDF.
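Eqs. 9–11 translate directly into sample averages over configurations. The following sketch (our own minimal helper, not the authors' released library) performs up-to-second-order temperature extrapolation of an ensemble average, given hypothetical arrays `x` and `u` of per-configuration observable and potential energy values at β₀:

```python
import numpy as np

def extrapolate_average(x, u, beta0, beta, order=2):
    """Extrapolate <X> from beta0 to beta via Eqs. 9-11, using per-configuration
    samples x of the observable and u of the potential energy at beta0."""
    x, u = np.asarray(x, dtype=float), np.asarray(u, dtype=float)
    db = beta - beta0
    X, U = x.mean(), u.mean()
    d1 = X * U - (x * u).mean()                              # Eq. 10
    estimate = X + d1 * db
    if order >= 2:
        d2 = (2.0 * X * U ** 2 - X * (u ** 2).mean()
              - 2.0 * (x * u).mean() * U + (x * u ** 2).mean())  # Eq. 11
        estimate += 0.5 * d2 * db ** 2
    return estimate
```

A quick sanity check: a constant observable is returned unchanged at any δβ, since both derivative terms vanish identically.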

To relate thermodynamic extrapolation to more well-known re-weighting methods, we briefly revisit perturbation theory. In the original theory developed by Zwanzig,30 we may re-weight averages in one ensemble in order to obtain the same average in another ensemble. For two canonical ensembles at different temperatures, with the observable having no explicit dependence on temperature or momenta, this becomes

$$\langle X \rangle_\beta = \frac{\left\langle X e^{-(\beta - \beta_0)U} \right\rangle_{\beta_0}}{\left\langle e^{-(\beta - \beta_0)U} \right\rangle_{\beta_0}} \quad (12)$$

Or, more explicitly, in terms of the configuration weights in a given ensemble, w = e^{−βU}/Z,

$$\langle X \rangle_\beta = \frac{\left\langle X \frac{w}{w_0} \right\rangle_{\beta_0}}{\left\langle \frac{w}{w_0} \right\rangle_{\beta_0}} \quad (13)$$
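Eq. 12 (equivalently Eq. 13) can be applied directly to sampled data. A minimal sketch of the re-weighting estimator (our own helper; the energy shift is a standard trick for numerical stability and cancels in the ratio):

```python
import numpy as np

def reweight_average(x, u, beta0, beta):
    """Perturbation (re-weighting) estimate of <X> at beta from samples at beta0 (Eq. 12)."""
    x, u = np.asarray(x, dtype=float), np.asarray(u, dtype=float)
    logw = -(beta - beta0) * u
    w = np.exp(logw - logw.max())  # shift exponents to avoid overflow; cancels in the ratio
    return np.sum(x * w) / np.sum(w)
```

When β = β₀ every weight is one and the plain sample mean is recovered; poor phase-space overlap shows up as a handful of configurations dominating the weights.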

As each is mathematically rigorous, perturbation and extrapolation utilizing an infinite number of terms should yield identical results. In practice, the results will differ due to finite sampling and series truncation, despite both techniques utilizing the exact same data, namely the potential energies and observable values for each sampled configuration. In each case, we manipulate the data by taking various averages, then combine them to produce the result. To illustrate differences between the two approaches, we Taylor expand the numerator and denominator in Eq. 12 around an inverse temperature difference δβ of zero

$$\langle X \rangle_\beta = \frac{\langle X \rangle_{\beta_0} - \langle XU \rangle_{\beta_0}\,\delta\beta + \langle XU^2 \rangle_{\beta_0}\,\frac{\delta\beta^2}{2} - \ldots}{1 - \langle U \rangle_{\beta_0}\,\delta\beta + \langle U^2 \rangle_{\beta_0}\,\frac{\delta\beta^2}{2} - \ldots} \quad (14)$$


FIG. 1. Schematic representations of scenarios in which extrapolation/interpolation or perturbation (re-weighting) are expected to be efficient. When the behavior of X with respect to a is well-approximated by a low-order polynomial (a and b), only derivatives of low order (here second for extrapolation, first for polynomial interpolation) are needed to accurately capture a wide range of a values, and extrapolation/interpolation is appropriate. Perturbation is suitable when fluctuations in X are large compared to its variations with a (a and c). Neither method is efficient for systems with many extrema and relatively small fluctuations (d). The adjustable parameter a may be a thermodynamic variable (e.g., temperature or density) or a model parameter (e.g., appearing in a potential energy function).

We emphasize that there is no practical reason for expanding Eq. 12 in this way, as it can be directly applied to simulation data. However, Eq. 14 demonstrates that we are effectively taking the ratio of two series expansions, with both extending to infinite order. Actually taking this ratio should yield Eq. 9 substituted with Eqs. 10 and 11. Indeed, we may obtain the numerator of Eq. 14 by multiplying the denominator by the combined Eqs. 9–11 and collecting terms of the same order of δβ. In practice, error is introduced into extrapolation predominantly through truncation of Eq. 9, while perturbation via Eq. 12 evaluates the series in Eq. 14 at infinite order but suffers from finite sampling errors stemming from a lack of sufficient phase-space overlap.

Finite sampling limits both perturbation theory and extrapolation, with the latter also limited by the order at which we truncate the expansion. As demonstrated later, however, extrapolation may significantly outperform perturbation for the special cases of observables that vary approximately linearly or quadratically as a function of the extrapolation variable (i.e., first- or second-order extrapolation, respectively) over large ranges of temperatures or densities. This is because perturbation does not utilize information regarding the local curvature in terms of the dependence of the observable as a function of the extrapolation variable (Fig. 1). Instead, perturbation relies on selecting configurations that have high weights in the new ensemble but were sampled in the reference ensemble, as is the case for large observable fluctuations shown in Fig. 1. If no such configurations (or phase-space overlaps) exist, perturbation theory cannot work, as can be seen from the ratio of weights appearing in the numerator and denominator of Eq. 13. Moving further from the reference ensemble tends to greatly reduce overlap, rendering perturbation estimates unreliable. Lack of overlap becomes even more pronounced for large systems due to reductions in relative fluctuations with system size. On the other hand, extrapolation at a given order is only accurate while the observable of interest varies in a similar fashion to the highest-order term of the polynomial expansion (Fig. 1). Similar to perturbation theory, the highest extrapolation order we can use, and hence the distance we can perturb, is limited by the amount of sampling that we have performed. This is because extrapolation requires converged estimates of the moments of an observable.

III. ANALYTICAL EXAMPLE: IDEAL GAS IN AN EXTERNAL FIELD

To concretely compare perturbation and extrapolation and understand why each fails or succeeds, we examine the simple case of a one-dimensional ideal gas in an external potential depending linearly on the particle positions. This model is analytically tractable, but still exhibits non-trivial behavior in the temperature dependence of its structural properties. We treat a canonical ensemble with temperature T, number of particles N, and length L in the single dimension, with the coordinate of the i-th particle being x_i. The particles do not interact with each other, but do interact with an external field linear in the particle positions, F(x) = ax. The total N-particle partition function Q is given by the product of single-particle partition functions q as Q = q^N/N!, with

$$q = \frac{z}{\Lambda} \quad (15)$$

where Λ accounts for kinetic degrees of freedom and will not be of interest here, and z is the configurational partition function given by

$$z = \int_0^L e^{-\beta a x}\,dx = \frac{1 - e^{-\beta a L}}{\beta a} \quad (16)$$

The total potential energy for a system with N particles is simply $U = a\sum_i x_i$. For a single particle, the probability of a given x is

$$P(x) = \frac{e^{-\beta a x}}{z} \quad (17)$$

and the total probability for N particles with different positions is the product of the probability in Eq. 17 due to lack of


particle-particle interactions. For a single particle, the potential energy distribution is also given by P(x), because a particle coordinate uniquely maps to a potential energy. For a sufficiently large number of particles, however, the distribution of potential energies becomes Gaussian and can be completely characterized by only its mean and variance. Throughout, we use 1000 particles, which satisfies this condition and is comparable in order of magnitude to the number of degrees of freedom that might be present in a molecular simulation. The average x position (for a single particle or over all particles) is given by

$$\langle x \rangle = \frac{1}{\beta a} - \frac{L}{e^{\beta a L} - 1} \quad (18)$$

The average potential energy is then given by ⟨U⟩ = Na⟨x⟩ and, similarly, the variance of the potential energy is σ_U² = Na²σ_x², where

$$\sigma_x^2 = \frac{1}{\beta^2 a^2} - \frac{L^2 e^{\beta a L}}{\left(e^{\beta a L} - 1\right)^2} \quad (19)$$

We will treat the average position ⟨x⟩ as our structural property of interest that we perturb and extrapolate in temperature space. Importantly, ⟨x⟩ is not a simple function of β and in fact exhibits significant non-linearity (Fig. 2). This makes the extrapolation non-trivial, even though analytical results for extrapolation at each order (i.e., the exact n-th derivative with respect to β) are trivial to compute. Without loss of generality, we henceforth set a and L equal to 1 for simplicity.
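The analytic results above are easily reproduced numerically. The sketch below (our own code, not the authors' released library, using inverse-CDF sampling of Eq. 17) evaluates Eq. 18 and draws finite samples for comparison:

```python
import numpy as np

def avg_x(beta, a=1.0, L=1.0):
    """Analytic average position <x> for the 1D ideal gas in a linear field (Eq. 18)."""
    return 1.0 / (beta * a) - L / np.expm1(beta * a * L)

def sample_x(beta, n_samples, a=1.0, L=1.0, seed=0):
    """Draw positions on [0, L] from P(x) = exp(-beta*a*x)/z by inverting the CDF
    F(x) = (1 - exp(-beta*a*x)) / (1 - exp(-beta*a*L))."""
    rng = np.random.default_rng(seed)
    u = rng.random(n_samples)
    f_L = -np.expm1(-beta * a * L)  # 1 - exp(-beta*a*L)
    return -np.log1p(-u * f_L) / (beta * a)
```

At the reference condition β = 5.6 used in Fig. 2, the mean of a large draw from `sample_x` converges to `avg_x(5.6)` ≈ 0.175.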

Fig. 2 shows finite sampling results for perturbation and extrapolation along with analytical results in the limit of infinite sampling. Finite sampling consists of generating N_s configurations by sampling from P(x) for each particle and calculating ⟨x⟩ and U for each configuration. It is clear that finite sampling limits both perturbation and extrapolation, but in different ways. Perturbation works well close to the reference temperature but reaches a saturation value further away. The accuracy range is extended slightly with more samples, but the increases in accuracy (i.e., distance from the infinite-sampling, infinite-order result) are marginal. The reason for this can be inferred from P(x) at various temperatures (Fig. S1). To obtain the correct average at one temperature, samples are required from all relevant configurations with high P(x) at that temperature. In other words, the distributions of potential energies at the two temperatures must overlap. Thus, sampling only a subset of these configurations amounts to implicit truncation of the average.35 The result is a highly conservative (i.e., predicting no variation with temperature) estimate due to overly concentrated sampling of phase space relevant only to the reference temperature. This behavior is well-known in perturbation theory — most modern techniques involving free energy calculations and enhanced sampling seek to overcome exactly the difficulties described above.3,4

For the ideal gas system investigated in Fig. 2, both linear and quadratic extrapolation are superior to perturbation theory, even though extrapolation suffers from not only sampling error, but also truncation error that is not present in perturbation theory. Depending on the system, however, both errors may be much lower than those obtained from re-weighting

FIG. 2. Perturbation theory and extrapolation at various orders are compared for predicting the average position ⟨x⟩ of an ideal gas particle in a linear external field. Analytical behavior at infinite order and sampling is shown as a black dashed line, the analytical result with infinite sampling for each order of extrapolation is shown as a red solid line, and perturbation or extrapolation results with varying numbers of randomly sampled configurations from the reference state of β = 5.6 are represented by circles.

techniques. Fig. 2 also provides some sense of the behavior of both errors as a function of extrapolation order and sample size. While first-order extrapolation quickly converges to the infinite-sampling limit, second order takes many more samples to attain similar accuracy. At higher orders, finite sampling errors result in inaccurate estimates of local derivatives, which in turn lead to highly inaccurate estimates far from the reference temperature. This is in contrast to the very low truncation error for sixth-order extrapolation with infinite sampling, which nearly exactly tracks the analytical result of Eq. 18.

There is clearly a trade-off between sampling and truncation errors when using extrapolation. As shown in Figs. 2 and 3a, error and uncertainty (or imprecision) also grow with extrapolation distance δβ. This is expected, since increasing extrapolation distance magnifies errors in derivative estimates according to Eq. 9. The dependence of the uncertainty on


FIG. 3. Uncertainty (standard deviations over 1000 independent data draws at the reference condition) in the average position ⟨x⟩ relative to the infinite sampling extrapolation result ⟨x⟩analytic (top row) and in the derivative estimates (bottom) are shown as functions of extrapolation distance δβ (a), extrapolation order (b), and number of samples Ns (c). In (b) and (c) the extrapolation distance is fixed at δβ = 1.0. Uncertainty increases with greater distance from the reference point, higher order, and fewer samples. In the bottom panel of (a), uncertainties in derivatives do not grow with δβ as these are locally estimated at the reference state.

sample size and order is summarized in Fig. 3b-c for a fixed temperature difference. Estimation of higher-order derivatives involves higher-order moments of the potential energy, leading to exponentially increasing uncertainty (Fig. 3b). At any order, the uncertainties in extrapolation estimates or derivative estimates vary as 1/√Ns or 1/Ns, respectively, with Ns being the number of samples (Fig. 3c). Relative unsigned errors from infinite-sampling results also decrease significantly with sampling and increase quickly as the order is increased (Fig. S2), which is also obvious from Fig. 2. In most practical settings, it is in fact the sampling error that will dominate, limiting the order and determining the truncation error. Even for the simple system investigated here, more than 10000 uncorrelated samples are required to achieve less than 10% uncertainty at third order. In molecular systems involving biomolecules, obtaining on the order of 100000 uncorrelated samples can require hundreds of nanoseconds or more of simulation time. As we demonstrate below, it is likely more efficient to combine the results of short simulations at multiple conditions, using only lower-order information.
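The quoted scaling of statistical uncertainty with sample size can be checked directly. Below is a minimal sketch (an assumed simple estimator, not the paper's code): the empirical spread of the first-derivative estimator, computed as minus the covariance of x and U over repeated independent draws, should fall roughly as 1/√Ns.

```python
import numpy as np

def sample_x(beta, n, rng):
    # Inverse-CDF draw from P(x) proportional to exp(-beta*x) on [0, 1]
    u = rng.random(n)
    return -np.log(1.0 - u * (1.0 - np.exp(-beta))) / beta

def derivative_std(beta, n_samples, n_repeats=200, seed=1):
    # Spread of the first-derivative estimator -Cov(x, U), with U = x,
    # across independent data sets of size n_samples
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_repeats):
        x = sample_x(beta, n_samples, rng)
        estimates.append(-(np.mean(x * x) - np.mean(x) ** 2))
    return float(np.std(estimates))

# Quadrupling the number of samples should roughly halve the spread
s_small = derivative_std(5.6, 1000)
s_large = derivative_std(5.6, 4000)
```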

IV. AN INTERPOLATION STRATEGY FOR OBSERVABLE ESTIMATION

Instead of extrapolating from a single point, we can simultaneously extrapolate from multiple points to improve our estimate. Simulations at additional state points are already necessary to check the accuracy of the extrapolation, even if the uncertainty may be estimated through bootstrapping. Previous work has suggested weighting each extrapolation estimate by the distance to the new state point,15 which we refer to as "weighted extrapolation", where subscripts 1 and 2 below indicate the two reference points

$$\langle X \rangle_{\beta} = \frac{\langle X \rangle_{\beta_1}\,|\beta - \beta_2|^{m} + \langle X \rangle_{\beta_2}\,|\beta - \beta_1|^{m}}{|\beta - \beta_1|^{m} + |\beta - \beta_2|^{m}} \tag{20}$$

This formulation affords a great deal of flexibility in that m may be optimized as a free parameter in order to satisfy criteria such as the Gibbs-Duhem relation.15 For simplicity, however, we set m = 20 throughout this work. This approach is most useful in the case of interpolation, when we desire the behavior of the observable over a range between two points at which data is available. From the perspective of free energy calculations, this is similar to techniques that utilize information from two states, such as Bennett's Acceptance Ratio (BAR).36 Determining the behavior of an observable over an exactly or approximately known region between two states is in fact a common task. We may then use derivative information up to the maximum order possible to interpolate within the interval of interest, greatly reducing our uncertainty and increasing accuracy.
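Eq. 20 can be sketched as follows, with the interpretation (following the text) that the two weighted quantities are the first-order extrapolation estimates from each reference point; the function name and argument layout are our own.

```python
def weighted_extrapolation(beta, beta1, x1, dx1, beta2, x2, dx2, m=20):
    # First-order extrapolation estimates from each reference point
    est1 = x1 + (beta - beta1) * dx1
    est2 = x2 + (beta - beta2) * dx2
    # Distance-based weights of Eq. 20: estimate 1 is weighted by the
    # distance of beta from the *other* reference point
    w1 = abs(beta - beta2) ** m
    w2 = abs(beta - beta1) ** m
    return (est1 * w1 + est2 * w2) / (w1 + w2)
```

At either reference point the estimate reduces to that point's own extrapolation; in between, the large exponent m = 20 makes the switch between the two estimates sharp.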

As an alternative to weighted extrapolation, we suggest directly fitting observable values and their derivatives, which are related to fluctuations (e.g., Eqs. 10 and 11), with an interpolating polynomial. Given two sets of observable and derivative values up to order k, it is possible to exactly fit a polynomial of order 2k+1. For example, with derivative information up to first order we obtain the third-order polynomial that best approximates the observable over the interval of interest. By design, this polynomial will exactly match the provided observables and their derivatives at the values of the interpolation variable (e.g., temperature or density) for which data was collected. Using two points up to arbitrary order of derivatives, this is obtained by solving for the polynomial coefficients c_i of order i in the set of equations

$$
\begin{bmatrix}
1 & \beta_1 & \beta_1^2 & \cdots & \beta_1^{2k+1} \\
0 & 1 & 2\beta_1 & \cdots & (2k+1)\beta_1^{2k} \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & \frac{(2k+1)!}{(k+1)!}\beta_1^{k+1} \\
1 & \beta_2 & \beta_2^2 & \cdots & \beta_2^{2k+1} \\
0 & 1 & 2\beta_2 & \cdots & (2k+1)\beta_2^{2k} \\
\vdots & & & & \vdots \\
0 & 0 & 0 & \cdots & \frac{(2k+1)!}{(k+1)!}\beta_2^{k+1}
\end{bmatrix}
\cdot
\begin{bmatrix}
c_0 \\ c_1 \\ c_2 \\ \vdots \\ c_{2k+1}
\end{bmatrix}
=
\begin{bmatrix}
\langle X \rangle_{\beta_1} \\
\left.\frac{\partial \langle X \rangle}{\partial \beta}\right|_{\beta_1} \\
\vdots \\
\left.\frac{\partial^k \langle X \rangle}{\partial \beta^k}\right|_{\beta_1} \\
\langle X \rangle_{\beta_2} \\
\left.\frac{\partial \langle X \rangle}{\partial \beta}\right|_{\beta_2} \\
\vdots \\
\left.\frac{\partial^k \langle X \rangle}{\partial \beta^k}\right|_{\beta_2}
\end{bmatrix}
\tag{21}
$$

It is also possible to incorporate more than two points, further increasing the polynomial order. However, this can lead to overfitting with large numbers of data points. As described below, we found a piecewise interpolation procedure to be more reliable.
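A direct way to realize Eq. 21 numerically is to assemble the derivative-augmented Vandermonde matrix and solve the resulting linear system; the helper below is an illustrative sketch, not the authors' implementation.

```python
import numpy as np
from math import factorial

def two_point_poly_coeffs(betas, values):
    """Solve the Eq. 21 linear system: a polynomial of order 2k+1 matching
    observable values and derivatives up to order k at two reference points.
    betas: [beta1, beta2]; values[r] = [X, dX/dbeta, ..., d^k X/dbeta^k]."""
    k = len(values[0]) - 1
    n = 2 * k + 2  # coefficients c_0 ... c_{2k+1}
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    row = 0
    for beta, vals in zip(betas, values):
        for d in range(k + 1):  # d-th derivative row at this reference point
            for i in range(d, n):
                A[row, i] = factorial(i) // factorial(i - d) * beta ** (i - d)
            rhs[row] = vals[d]
            row += 1
    return np.linalg.solve(A, rhs)

def poly_eval(c, beta):
    return sum(ci * beta ** i for i, ci in enumerate(c))
```

With values and first derivatives at two points (k = 1) this reproduces the unique cubic through that data, exactly as described in the text.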


We directly compare weighted extrapolation, polynomial interpolation, and re-weighting using BAR in Fig. 4a for the average x position in the ideal gas model. In the last technique, we use BAR to estimate the free energy difference, and hence weight of each state, for the lowest and highest temperatures. To re-weight observations from each state to the state of interest, weights from BAR are combined with observed Boltzmann weights. This is very similar to using standard perturbation theory to re-weight, the only difference being that we are also incorporating relative weights between the two states from which samples are drawn. Polynomial interpolation provides the smoothest curve, which is expected in comparison to the weighted extrapolation procedure of stitching together two first-order extrapolations rather than cleanly fitting a higher-order polynomial. Notably, polynomial interpolation also has the highest uncertainty predicted by bootstrap resampling. This indicates that the results for weighted extrapolation and BAR re-weighting exhibit less variance, but in this case also exhibit greater bias. Lower accuracy with BAR re-weighting is due to the same deficiencies as for standard perturbation theory extrapolations. Namely, the lack of phase-space overlap results in a highly conservative estimate heavily weighted towards the state with the lower free energy.

This exact malady has been known since the inception of BAR.36 In his original paper, Bennett recommends against using BAR in cases where there is very little overlap of the distributions of potential energy differences. In such cases, Bennett instead suggests fitting the logarithm of each distribution to a polynomial and graphically determining the shift necessary to make the polynomials align, which is the free energy difference. Recent work has also used this same polynomial interpolation of potential energies to efficiently predict free energy differences between crystal polymorphs.38 This is closely related to work that optimally combines cumulant expansions built from two states.32 We may recover identical expressions in the current context by performing polynomial interpolation to determine the free energy as a function of the variable of interest. In this case, we assume that the first state has a free energy of zero and lose an order of the polynomial as we must also estimate the unknown free energy difference.

$$
\begin{bmatrix}
1 & \lambda_1 & \cdots & \lambda_1^{2k} & 0 \\
0 & 1 & \cdots & (2k)\lambda_1^{2k-1} & 0 \\
\vdots & & & & \vdots \\
0 & 0 & \cdots & \frac{(2k)!}{k!}\lambda_1^{k} & 0 \\
1 & \lambda_2 & \cdots & \lambda_2^{2k} & -1 \\
0 & 1 & \cdots & (2k)\lambda_2^{2k-1} & 0 \\
\vdots & & & & \vdots \\
0 & 0 & \cdots & \frac{(2k)!}{k!}\lambda_2^{k} & 0
\end{bmatrix}
\cdot
\begin{bmatrix}
c_0 \\ c_1 \\ \vdots \\ c_{2k} \\ \beta\Delta A
\end{bmatrix}
=
\begin{bmatrix}
0 \\
\beta\left\langle \frac{\partial H}{\partial \lambda} \right\rangle_{\lambda_1} \\
\vdots \\
\beta\left\langle \frac{\partial^k H}{\partial \lambda^k} \right\rangle_{\lambda_1} \\
0 \\
\beta\left\langle \frac{\partial H}{\partial \lambda} \right\rangle_{\lambda_2} \\
\vdots \\
\beta\left\langle \frac{\partial^k H}{\partial \lambda^k} \right\rangle_{\lambda_2}
\end{bmatrix}
\tag{22}
$$

FIG. 4. Interpolation procedures are shown for weighted extrapolation and interpolating polynomials (both using only first-order derivatives), as well as BAR or MBAR depending on the number of data points (using the pymbar package37). Interpolation from two points is shown in (a), while interpolation using all of the points selected during a recursive interpolation run using polynomial interpolation and an error tolerance of 2% are shown in (b). Eq. 18 is shown as a black dashed line. The same 10000 samples at each temperature are used for each method, with error bars representing one standard deviation in 100 bootstrap resamples. (c) shows polynomial fits on sliding sets of three points selected during recursive interpolation, which are indicated by black squares and vertical lines. For each set, a polynomial in the same color is plotted based on each pair of points (solid lines indicate the outermost points of the set, dotted the left two points, and dash-dot the right two points). In many cases, polynomials diverge significantly outside the interval over which they were fit. In the shared interval, however, good overlap indicates that the local curvature of the polynomial fits is consistent, meaning additional points are not necessary.

With first-order derivatives, solving the above system yields

$$\beta\Delta A = \frac{1}{2}\left(\lambda_2 - \lambda_1\right)\left(\left\langle \beta\frac{\partial H}{\partial \lambda}\right\rangle_{\lambda_2} + \left\langle \beta\frac{\partial H}{\partial \lambda}\right\rangle_{\lambda_1}\right) \tag{23}$$

If we let λ1 = 0 and λ2 = 1 and H(λ) = H0 + λΔH, as is common in the free energy calculation literature and equivalent to the choices in Ref. 32, we find that

$$\beta\Delta A = \frac{\beta}{2}\left(\langle \Delta H\rangle_{\lambda_2} + \langle \Delta H\rangle_{\lambda_1}\right) \tag{24}$$


This is identical to the result found by Hummer32 if we note that non-equilibrium work in one direction is the negative of the work in the opposite direction. Going to fourth order in the derivatives (yielding an exact eighth-order polynomial and free energy difference) results in

$$\beta\Delta A = \frac{\beta}{2}\left(C^{(1)}_{\lambda_2} + C^{(1)}_{\lambda_1}\right) - \frac{3\beta}{28}\left(C^{(2)}_{\lambda_2} - C^{(2)}_{\lambda_1}\right) + \frac{\beta}{84}\left(C^{(3)}_{\lambda_2} + C^{(3)}_{\lambda_1}\right) - \frac{\beta}{1680}\left(C^{(4)}_{\lambda_2} - C^{(4)}_{\lambda_1}\right) \tag{25}$$

In the above, C^(i) represents the i-th cumulant of the free energy evaluated at a specific state indicated by λ. This is again identical to the cumulant expansion derived by Hummer, but with the explicit interpretation as an interpolating polynomial. If we also consider thermodynamic integration via the trapezoid rule, we are assuming that the derivative of the free energy as a function of the variable λ is linear. This is equivalent to assuming that the free energy itself is a second-order polynomial and using the estimated derivatives at the end points to simultaneously fit this polynomial and determine the free energy difference, as in Eq. 22. In the context of estimating free energy differences, we have thus shown that our polynomial interpolation strategy using up to first derivatives is equivalent to thermodynamic integration or optimal combination of cumulant expansions. We also obtain a closed-form polynomial expression of the free energy estimate over the entire interpolation range, whereas other techniques solely provide the free energy difference.
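The stated equivalence to trapezoid-rule thermodynamic integration can be verified numerically for the first-derivative (k = 1) case of Eq. 22: fitting a quadratic free energy to the endpoint slopes and solving for the unknown difference reproduces the trapezoid rule. The sketch below uses our own variable names and is not the authors' code.

```python
import numpy as np

def quadratic_free_energy(l1, g1, l2, g2):
    """k = 1 case of Eq. 22: fit betaA(lam) = c0 + c1*lam + c2*lam**2 with
    betaA(l1) = 0 and endpoint slopes g1, g2 (each playing the role of
    beta<dH/dlam>); the extra unknown dA is betaA(l2), the free energy
    difference in units of kT."""
    A = np.array([
        [1.0, l1, l1 ** 2, 0.0],   # betaA(l1) = 0
        [0.0, 1.0, 2 * l1, 0.0],   # betaA'(l1) = g1
        [1.0, l2, l2 ** 2, -1.0],  # betaA(l2) - dA = 0
        [0.0, 1.0, 2 * l2, 0.0],   # betaA'(l2) = g2
    ])
    rhs = np.array([0.0, g1, 0.0, g2])
    c0, c1, c2, dA = np.linalg.solve(A, rhs)
    return (c0, c1, c2), dA
```

For any endpoints the solved difference equals (λ2 − λ1)(g1 + g2)/2, i.e., trapezoid-rule thermodynamic integration.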

From the above considerations, it is not surprising that an interpolating polynomial for predicting structural observables outperforms re-weighting with BAR (Fig. 4a). Being equivalent to thermodynamic integration in the context of free energy calculations, polynomial interpolation only relies on accurately capturing the local curvature of the observable as a function of the interpolation variable, not on overlapping phase-space probability distributions as in BAR.39 While statistically optimal re-weighting methods such as BAR or the Multistate Bennett Acceptance Ratio (MBAR)40 can reliably provide highly precise free energy estimates, they are not always optimal compared to thermodynamic integration.39

When curvature along the chosen thermodynamic path is high and there is significant phase-space overlap, as is typical in computing solvation free energies of small molecules, re-weighting techniques significantly outperform thermodynamic integration.41,42 Such behavior is not observed in our present work, but is represented by panel c in Fig. 1. Our purpose here is not, then, to advocate for the general use of polynomial interpolation (or equivalently thermodynamic integration) to compute free energy differences or potentials of mean force. We make the connection only to explain differences in performance observed in Fig. 4 and Section V, where there is very little configurational overlap over large changes in temperature or density. This is analogous to the situation shown in Fig. 1(b), where extrapolation or interpolation is efficient while perturbation is not.

If we are not interested in computing the free energy but only an observable over a specific range of conditions, methods that rely on free energy calculations to perform re-weighting will consistently provide estimates heavily biased towards the reference state ensemble. Re-weighting will only be preferred to interpolation if the fluctuations in the observable at a fixed state point rival its variations with changing simulation conditions (Fig. 1(a,c)). For extensive observables, this criterion will be met with vanishing frequency as system size increases. In any scenario, it is expected that if free energies are computed accurately, observables will be as well. However, a reduced computational cost is expected for computing observables alone, as performing the extensive sampling necessary for accurate free energy calculations is typically a computationally expensive endeavor. In such a case, direct polynomial interpolation of the observables further reduces the simulation cost by providing a prediction of the observable over the specified range. Interpolation relies on the local curvature of the observable (as a function of interpolation variable) being of equal or lower order to the estimated polynomial over the interval of interest. In the worst-case scenario shown schematically in Fig. 1(c,d), this is not true and the observable function is rugged. For small fluctuations in the observable (Fig. 1(d)), this will require collecting a similar number of data points as if a re-weighting procedure were implemented.

To address this issue, we require an algorithm to guide selection of additional data points, as well as a simple method to gauge convergence of the interpolation over the range of interest. Heuristics along these lines already exist in the free energy calculation literature, namely that efficiency and accuracy of re-weighting methods may be enhanced by selecting intermediate states that maximize phase-space overlap and enforce constant entropy differences at each step.35,43,44 As mentioned previously, we have found that it is advantageous to fit interpolating polynomials in a piecewise fashion rather than use all data points to estimate coefficients for a higher-order polynomial. This is accomplished through a recursive algorithm. We start with the two most extreme points over the range of interest and collect data at each. Estimates of the observable of interest along with bootstrapped uncertainty estimates are obtained over the entire interval on a fine grid. The relative uncertainties given by |σY/Y| are computed and compared to a specified error tolerance. If the tolerance is not met for all points on the interval, the point with maximum uncertainty is selected for new data collection. This algorithm recurses, dividing all high-uncertainty intervals into two parts, until the error tolerance is satisfied on all subintervals. Fig. 4b shows the result of executing this algorithm with a tolerance of 2% on the ideal gas model. A total of four points are necessary to achieve the desired accuracy.
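The recursive algorithm just described can be sketched as follows, using piecewise cubic Hermite interpolation (the k = 1 case) and a stand-in `draw` function in place of running a new simulation; this is an illustrative reconstruction, not the authors' code, and assumes a strictly nonzero observable so that relative uncertainties are well defined.

```python
import numpy as np

def hermite_eval(b1, y1, d1, b2, y2, d2, x):
    # Cubic matching values and first derivatives at b1 and b2
    t = (x - b1) / (b2 - b1)
    h = b2 - b1
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * y1 + h10 * h * d1 + h01 * y2 + h11 * h * d2

def estimate(x, u):
    # Observable mean and its beta-derivative from fluctuations: -Cov(x, U)
    return np.mean(x), -(np.mean(x * u) - np.mean(x) * np.mean(u))

def recursive_interpolate(draw, b_lo, b_hi, tol=0.02, n_boot=100, grid=50, seed=2):
    """draw(beta) -> (x_samples, u_samples): stand-in for a new simulation.
    Returns the sorted reference points at which data was collected."""
    rng = np.random.default_rng(seed)
    data = {b_lo: draw(b_lo), b_hi: draw(b_hi)}

    def boot_curve(b1, b2):
        # Bootstrap the interpolating cubic on a fine grid over [b1, b2]
        xs = np.linspace(b1, b2, grid)
        curves = []
        for _ in range(n_boot):
            ests = []
            for b in (b1, b2):
                x, u = data[b]
                idx = rng.integers(0, len(x), len(x))
                ests.append(estimate(x[idx], u[idx]))
            (y1, d1), (y2, d2) = ests
            curves.append(hermite_eval(b1, y1, d1, b2, y2, d2, xs))
        curves = np.array(curves)
        # Relative uncertainty |sigma_Y / Y| over the grid
        return xs, np.std(curves, axis=0) / np.abs(np.mean(curves, axis=0))

    def refine(b1, b2):
        xs, rel = boot_curve(b1, b2)
        if rel.max() <= tol:
            return
        b_new = float(xs[np.argmax(rel)])
        if b_new in data:  # maximum falls on an existing point; stop refining
            return
        data[b_new] = draw(b_new)
        refine(b1, b_new)
        refine(b_new, b2)

    refine(b_lo, b_hi)
    return sorted(data)
```

The piecewise cubics on adjacent intervals share values and first derivatives at the collected points, so the assembled curve is continuous in both, as noted in the text above.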

This interpolation method crucially requires accurate estimates of local curvature on each interval. We can check this self-consistently with a simple visualization, similar to checking for histogram overlap when performing free energy calculations.45 This visual check is shown in Fig. 4c. If we consider all sets of three adjacent points used to calculate polynomials, we can compare the local curvature of interpolations using the outer points of the interval to those using the points on the two sub-intervals. If the polynomials on the sub-intervals match closely the polynomial on the larger interval


then the curvature of that interval is well-approximated to the given polynomial order and no additional points are needed. In the ideal gas case, overlap occurs for only a small number of points when using third-order polynomials (Fig. 4c). More generally, local curvature of a third-order polynomial will capture scenarios where the sign of the derivative changes, covering a wide variety of scenarios. However, a third-order polynomial only requires observable values and first derivative information, which is exponentially easier to obtain to a given uncertainty compared to higher-order derivative information (Fig. 3). Additionally, the interpolations on different intervals agree in their value and first derivative at the selected data points. Thus the resulting piecewise function is continuous in both its value and first derivative by construction.

V. PREDICTING STRUCTURAL PROPERTIES OF LIQUID WATER

To demonstrate our above theory and algorithms on a more realistic system, we examine the TIP4P/2005 model of water46 in its liquid state. Simulations of 1400 water molecules spanning a wide range of temperatures (250-350 K) and densities (0.87-1.40 g/cm3) were performed with GROMACS 2016.1,47 with additional details published previously.28 The range of temperatures and densities was selected to span the structural anomaly envelope first described by Errington and Debenedetti,27 which encompasses much of the liquid state including metastable regions (for instance, the state point discussed later at 0.87 g/cm3 and 300 K is very near the liquid-vapor spinodal). Each simulation consists of snapshots saved every 1 ps over 5 ns trajectories, with simulations at 300 K and 1.00 g/cm3 extended to 50 ns to examine the effect of increasing sampling by an order of magnitude. We consider both translational and orientational properties that classify water structure in addition to hard-sphere insertions (0.33 nm radius), which characterize local density fluctuations.48,49 In all results below, error bars are the standard deviation as determined by bootstrap resampling (the calculation is repeated 100 times with random samples of the same size as the original data set drawn with replacement).

A. Variation in temperature

Fig. 5a demonstrates that oxygen-oxygen RDFs at 1.00 g/cm3 may be extrapolated from ambient conditions over nearly the entire liquid-state temperature range using only first-order derivative information. We accomplish this by individually extrapolating the count in each RDF bin before normalization.20 In other words, the count in each bin of the RDF corresponds to the observable X in Eq. 9. The accuracy in Fig. 5 (and Fig. S10) is remarkable considering that a structural anomaly boundary is crossed over this temperature range (see Fig. 4 in Ref. 28). These results hold true even at the highest and lowest densities (Figs. S4-S7). Extrapolations remain quantitatively accurate even when RDFs undergo qualitative changes, such as developing shoulders at lower temperatures. This implies that each set of histogram bin counts varies approximately linearly in temperature according to its own temperature derivative. Since we allow these derivatives to vary for each bin, we capture qualitative shifts in RDF behavior. Using second-order derivative information greatly increases uncertainty in the extrapolation, which remains the case even with an order of magnitude more simulation time (Fig. S3). Given the success of first-order extrapolation, any improvements associated with adding second derivative information are likely not worth the increased cost of sampling to the point of low uncertainty (on the order of 100 ns). Extrapolating RDFs in temperature also allows us to capture variation of the translational order parameter of Errington and Debenedetti27 (Fig. 6a). Based on the integration of RDFs, this metric characterizes radial translational order and is defined as

$$t = \frac{1}{\xi_c} \int_0^{\xi_c} \left| g(\xi) - 1 \right| \, d\xi$$

where ξ = rρ^{1/3} and ξc = 2.843.27 In contrast, re-weighting via perturbation theory only performs well for temperatures close to the reference simulation.
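Given a sampled RDF, the translational order parameter above can be computed by simple quadrature; the sketch below (grid and argument names are our own) uses the trapezoid rule on the rescaled distance grid.

```python
import numpy as np

def translational_order(r, g, rho, xi_c=2.843):
    """t = (1/xi_c) * integral_0^xi_c |g(xi) - 1| d(xi), xi = r * rho**(1/3).
    r, g: sampled RDF abscissa and values; rho: number density."""
    xi = r * rho ** (1.0 / 3.0)
    mask = xi <= xi_c
    f = np.abs(g[mask] - 1.0)
    x = xi[mask]
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(x))  # trapezoid rule
    return integral / xi_c
```

For an ideal gas (g identically 1) the integrand vanishes and t = 0; any structure raises t above zero.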

Similar success is found in temperature extrapolations of orientational properties, while perturbation encounters the same difficulties that were previously discussed. Three-body angles are defined as the angle between two water oxygens within a predefined cutoff (here 0.34 nm) of a central water oxygen. This structural feature has proven critical to developing monatomic models of water50 and has also been shown to qualitatively track water's response to changes in state conditions or the presence of solutes.28 Again, only 5 ns are necessary to successfully extrapolate entire three-body angle distributions over the entire temperature range (Fig. 5b). Qualitative shifts in the distribution are even more pronounced compared to RDFs, but are nonetheless predicted accurately with only first derivative information. Second derivative information based on these short trajectories only serves to significantly increase uncertainty, as was the case with translational measures of structure. Fig. 6b also demonstrates that extrapolation is successful in predicting the tetrahedral order parameter, which is defined as27,51

$$q = \left\langle 1 - \frac{3}{8} \sum_{i,j} \left( \cos\theta_{i,j} + \frac{1}{3} \right)^2 \right\rangle$$

The summation runs over all unique pairs of the four nearest neighbors of a central water oxygen, with θ_{i,j} representing the three-body angle of a pair. Averaging is performed over all water oxygens and system configurations. q is related to but can be calculated independently of three-body angle distributions.28,52 In this case, it is more obvious why first order is appropriate: the order parameter only varies linearly in temperature. This is not only the case at ambient conditions, but also at lower and higher densities (Figs. S4-S7).

Water’s density fluctuations as a function of temperature are more difficult to capture with extrapolation. Such fluctuations are critical to understanding solvation phenomena,53 hydrophobic association,48,54 and characterizing the nature of interfaces in contact with water.49,55 Density fluctuations are commonly characterized by the excess chemical potential of a hard-sphere solute, which is given by the probability of having zero water oxygens within the hard-sphere volume

$$\beta \mu_{\mathrm{ex}}^{\mathrm{HS}} = -\ln P(N = 0) \tag{26}$$
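Given occupancy counts from randomly inserted probe spheres, Eq. 26 reduces to a single histogram lookup; the sketch below assumes the per-insertion counts N have already been tallied.

```python
import numpy as np

def hs_excess_mu(counts):
    """beta * mu_ex^HS = -ln P(N = 0), Eq. 26, from occupancy counts N
    tallied over randomly placed probe spheres."""
    counts = np.asarray(counts)
    p0 = np.mean(counts == 0)
    if p0 == 0.0:
        raise ValueError("no empty probe volumes observed; P(N=0) unresolved")
    return -np.log(p0)
```

The guard reflects the sampling difficulty discussed below: at high density, empty probe volumes become vanishingly rare and P(N = 0) cannot be resolved from direct counting.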


FIG. 5. In all panels, lines represent direct simulation results at each temperature for RDFs (a) or three-body angle distributions (b) while points are the estimates via extrapolation from 300 K (constant density of 1.00 g/cm3) with 5000 snapshots obtained at regular intervals over a 5 ns simulation. Error bars represent one standard deviation and are determined through bootstrap resampling. The neighbor cutoff was set to 0.34 nm in computing three-body angles.

As Eq. 26 depends on histogram counts of a quantity that only depends on sampled configurations, we categorize the hard-sphere chemical potential as a structural observable akin to others discussed in this paper. Fig. 6c-d shows extrapolations in temperature from 300 K for hard-sphere excess chemical potentials, also revealing non-monotonic behavior with temperature (extrapolations of P(N = 0) are also shown in Fig. S8). With non-monotonicity in P(N = 0) and βμ_ex^HS, it is no surprise that first-order extrapolation fails for some portions of the temperature range. It is interesting to note that, at this density, the temperature dependence of βμ_ex^HS is weak, with the changes in μ_ex^HS dominated by its linear dependence on β (Fig. 6c-d). This weak temperature dependence means that extremely precise estimates of P(N = 0) are needed to correctly capture the variation. As such, we utilize 50 ns at the reference temperature to perform extrapolation in Fig. 6c-d, as it is clear in Fig. S9 that the first derivative estimate with 5 ns of simulation time is highly inaccurate. In this case, the inherent noise in the method used to calculate the observable is similar to its variations with temperature. To reduce the overall simulation time in this scenario, faster converging methods of observable estimation should be employed. Other methods to estimate βμ_ex^HS could then be extrapolated in the same way but at higher certainty and accuracy.

Polynomial interpolations over temperature are presented in Fig. S11, converging to very low uncertainty (and high accuracy) using simulations at only the two most extreme temperatures. Since the translational and tetrahedral order parameters vary approximately linearly over the studied temperature range, extrapolation already performs well and interpolation displays only marginally better accuracy. In the case of hard-sphere chemical potentials, which vary non-monotonically with temperature, interpolation leads to starker improvement over extrapolation (Fig. S11c). Variations with density are non-monotonic for all observables and are explored in detail in the next section.

B. Variation in volume

First-order derivatives in the canonical ensemble of arbitrary observables with respect to volume may be computed


FIG. 6. Direct simulation results (solid lines), perturbation theory estimates (stars), and first- (circles) and second-order (diamonds) extrapolations are shown versus temperature for (a) translational order parameters, (b) tetrahedral order parameters, (c)/(d) excess chemical potentials of 0.33 nm radii hard-spheres. Translational order parameters are computed from RDFs extrapolated in temperature from 5 ns simulations at 300 K and 1.00 g/cm3, while all other quantities are extrapolated directly from the same reference state. Due to the inherently high ratio of noise to temperature variation for excess chemical potential estimates at this density, 50 ns of simulation are used for extrapolations in (c) and (d). For clarity, error bars are not shown for perturbation or second-order extrapolation in (c) or (d), as they are on the same scale as the entire ordinate axis.

according to

$$\frac{d\langle X\rangle}{dV} = \frac{1}{3V}\left[-\beta \langle X\rangle\langle W\rangle + \beta \langle XW\rangle + \left\langle \sum_i \frac{\partial X}{\partial x_i} x_i \right\rangle\right] \tag{27}$$

where X is the observable of interest, W is the virial, and x_i represents the i-th degree of freedom in the system. The last term appearing on the right-hand side involves the response of the observable itself to changes in volume, and as such will be unique to each observable. A derivation of Eq. 27 along with expressions necessary to compute the last term for a variety of observables are provided in the Appendix. We do not pursue more than first-order derivative information in what follows, as higher-order derivatives in volume require derivatives of forces, which are not readily available in most MD simulation packages. Given the difference in observed uncertainties of

first and second-order extrapolations with respect to temperature, it is likely that high uncertainty would prevent the use of such extrapolations. In contrast, perturbation is not feasible at all due to negligible phase-space overlap between simulations at different densities.
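Eq. 27 can be evaluated from per-snapshot time series of the observable, the virial, and the configurational response term; below is a sketch with assumed array inputs (names are our own).

```python
import numpy as np

def dX_dV(X, W, XdotR, beta, V):
    """First-order volume derivative of <X> per Eq. 27.
    X: per-snapshot observable; W: per-snapshot virial;
    XdotR: per-snapshot sum_i (dX/dx_i) * x_i; beta: 1/kT; V: volume."""
    X, W, XdotR = map(np.asarray, (X, W, XdotR))
    fluct = beta * (np.mean(X * W) - np.mean(X) * np.mean(W))  # -beta<X><W> + beta<XW>
    return (fluct + np.mean(XdotR)) / (3.0 * V)
```

The first two terms of Eq. 27 combine into a covariance between the observable and the virial; the last term, supplied here as `XdotR`, must be worked out per observable as described in the Appendix.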

On the whole, extrapolation in density is much less successful than in temperature, as shown in Figs. 7-8 and Fig. S12. Contributing to this is the fact that canonical ensemble simulations at different densities are spaced further apart in terms of their phase-space overlap, resulting in higher uncertainties in extrapolation estimates. Liquid structure is typically highly sensitive to density, with pressure changing by multiple orders of magnitude over the range studied here. Additionally, many properties of water naturally exhibit non-monotonic behavior as a function of density. With changing density, water structure changes significantly, crossing the structural anomaly boundary twice27,28 and leading the translational and tetrahedral order parameters to behave non-monotonically. As such, first-order extrapolations should not in fact be expected to produce accurate results over large density differences. On the other hand, perturbation is highly impractical (and as such is not presented) as there is effectively no configurational overlap between even neighboring densities.

It is clear from Figs. 7-8 that successful extrapolation in density would require higher-order derivative information. However, relative uncertainties will grow exponentially with derivative order, as established in Section III. As a result, orders of magnitude more simulation time will be required to estimate even second-order derivatives. More modest increases in simulation time will further lower uncertainty in first-order extrapolations, but will not significantly improve accuracy. We turn instead to our recursive algorithm for polynomial interpolation, using the tetrahedral order parameter as a test case. This metric exhibits non-monotonic behavior in density, which is easy to visualize, as it is a scalar, and can be directly extrapolated or interpolated (unlike the translational order parameter). Fig. 8b shows that interpolation using two state points correctly captures the basic qualitative behavior of the tetrahedral order parameter. However, the local curvature is not accurate across the range of interest. This is largely due to the steep slope estimated at the lowest density point, which is in very close proximity to the liquid-vapor spinodal.28 As a result, the slope of q changes rapidly across low-density, metastable state points (see Fig. S14), representing a mild case of what is shown schematically in Fig. 1d. Though our statistical analysis indicates low uncertainty in presented q values and derivatives, we caution that, as stability limits are approached, all properties become more difficult to determine with high statistical certainty due to increasing wavelengths and magnitudes of density fluctuations.56

Complicating matters further, the bootstrapped uncertainty is already within 1% over the entire region using just the two most extreme densities. In this case, it is necessary to check the consistency of the local curvature in a piecewise fashion by adding an additional point within the interval (specifically where the bootstrapped uncertainty is greatest). The resulting visual consistency check is shown in Fig. 9a. It is clear that the interpolation over the full interval (solid line) does


FIG. 7. Lines represent direct simulation results at each density for RDFs (a) or three-body angle distributions (b) while points are the estimates via extrapolation from a density of 1.00 g/cm3 (top) or interpolation from the two most extreme densities of 0.87 and 1.40 g/cm3 (bottom). Temperature is constant at 300 K. Error bars represent one standard deviation and are determined through bootstrap resampling.

not overlap very well with either of the interpolations over subintervals (dotted and dashed lines). This indicates a lack of consistency in the local curvature and suggests that at least one additional point should be added. The result of adding a single point on the high-density interval is shown in Fig. 9b, revealing that the local curvature is consistent on that interval, but not on the low-density interval where the curvature is in fact the largest. We can reduce the error tolerance for the recursive algorithm until the visual consistency check is passed, or (likely at lower computational cost) manually inspect each interval and add points where the local curvature is highest until self-consistent results are obtained. A complete implementation of this procedure is illustrated in Fig. S13, revealing that 50% or less of the original simulation data is required to achieve the same accuracy. We note, however, that this procedure is only necessary for quantitatively accurate results in the present case. With two 5 ns simulations providing 5000 snapshots each, we obtain correct qualitative behavior for the tetrahedral order parameter over the density range from 0.87 to 1.40 g/cm3. Such information would be sufficient to determine the presence of an anomaly boundary and roughly approximate its location.

VI. CONCLUSIONS

We have demonstrated, both for analytic and more realistic systems, that the use of thermodynamic extrapolation or interpolation performs no worse, and in many cases better, than re-weighting observables to other state conditions using standard perturbation theory. This result is not in fact surprising once connections between thermodynamic extrapolation and the extensive free energy calculation literature are made explicit. We find that the use of interpolating polynomials to calculate free energy differences and profiles along predefined coordinates is a special case of applying interpolating polynomials to general observables. This then illuminates why the same techniques outperform re-weighting for estimating structural observables in the case of very little overlap in phase space.36 Successful interpolations of water structural properties in volume make this point particularly clear, as configurations satisfying other densities are exceptionally rare, rendering re-weighting impractical.

The connection with free energy calculations (or just rudimentary knowledge of numerical methods) also makes it clear that interpolation should be preferred over extrapolation. A particular range of interest for parameters or state variables is not always known, but can usually be determined at little computational cost. Based on extensive uncertainty analysis of our analytic ideal gas system, we have also provided guidelines for the allocation of computational resources: as higher-order derivatives are exponentially more difficult to converge to low uncertainty, it is recommended to use only low (first- or second-) order derivatives and run more short simulations at different conditions. This is corroborated by our studies of water, finding that only a few nanoseconds are required to converge first-derivative information while hundreds of nanoseconds may be required for even second-order derivatives. Our subsequent results indicate that the behavior of a system can be determined straightforwardly over a continuous range of conditions by collecting low-order derivative information, which requires minimal additional computational cost. To facilitate the use of such methods, we have presented a recursive interpolation algorithm as well as a simple self-consistency check. These procedures are available as a Python library (available at https://github.com/usnistgov/thermo-extrap) that was used for both the ideal gas and water systems presented here. Such techniques may not only be used to efficiently generate new data, but also to extract additional information from previously generated trajectories, as we have done.

FIG. 8. Direct simulation results (solid lines), first-order extrapolations from 1.0 g/cm3 (circles), and interpolations from the most extreme densities (squares) are shown versus density (at constant temperature of 300 K) for (a) translational order parameters, (b) tetrahedral order parameters, and (c) excess chemical potentials of 0.33 nm radii hard spheres. Translational order parameters are computed from RDFs extrapolated in density from 5 ns simulations at 300 K and 1.00 g/cm3, while all other quantities are extrapolated directly from the same reference state. Interpolations utilize two 5 ns simulations, one from each of the most extreme densities. Hard-sphere chemical potentials could not be computed at densities above 1.20 g/cm3.

FIG. 9. Polynomial fits are shown on sliding sets of three points, which are indicated by solid black vertical lines. For each set, a polynomial in the same color is plotted based on each pair of points (solid lines indicate the outermost points of the set, dotted the two higher-density points, and dash-dot the two lower-density points). Piecewise interpolations using three (a) and four (b) data points are compared to demonstrate improvement of local curvature consistency with the number of points. Results from direct simulation at each density are shown as a black dashed line.

We expect that the presented theory and techniques will improve the efficiency of calculations based on molecular simulation. Derivative information for a very general set of observables is readily available, though typically underused. Our results suggest that the use of this information can greatly reduce the amount of required simulation, even when compared to other techniques for re-using simulation data such as standard perturbation theories. It has recently been pointed out that this derivative information also provides easy access to thermodynamic properties.29 Indeed, entropies are immediately obtained by applying expressions presented here and elsewhere15–17 to determine first temperature derivatives of potentials of mean force. This is possible because potentials of mean force represent differences in natural logarithms of probabilities, and we have demonstrated in two cases how to extrapolate the bin counts that form the basis of determining relative probabilities. It is not immediately clear that such information may be usefully interpreted in the case of RDFs or three-body angle distributions, but it could, for example, quickly determine entropies associated with solvation free energies. Remarkably, this information can be directly extracted from the simulations used to determine solvation free energies without additional computational expense. To further enhance efficiency, it may be fruitful to consider classes of functions beyond polynomials when interpolating or extrapolating. For instance, Gaussian process regression57 only assumes that functions are smooth on a specified length scale, positing the most likely functions satisfying both this criterion and the provided observable values and derivative information. Such improvements are currently under investigation.

SUPPLEMENTARY MATERIAL

Additional supporting figures are provided in the Supplementary Material.

ACKNOWLEDGMENTS

We thank Vikram Khanna for fruitful conversations and careful reading of the manuscript. This research was performed while J.I.M. held an NRC Research Associateship award at the National Institute of Standards and Technology.

DATA AVAILABILITY

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Appendix A: Volume extrapolation in the canonical ensemble

To compute the derivative \( \frac{d\langle X\rangle}{dV} \) at constant temperature and number of particles, we follow the derivations for \( \frac{dZ}{dV} \) found in McQuarrie58 or Shell.59 Both derivations rely on the scaling of the coordinates by the characteristic length of the system volume. We will here treat cubic volumes where \( V = L^3 \) so that we can compute \( \frac{d\langle X\rangle}{dL} \) and in turn the desired derivative with respect to volume through the chain rule. From the above references and use of the chain rule, we have that

\[
\frac{dZ}{dL} = \frac{3NZ}{L} - \frac{\beta Z}{L}\left\langle \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle \tag{A1}
\]

The summation runs over all degrees of freedom in a specific cartesian dimension x (i.e., atomic coordinates in this dimension). If the potential energy is pairwise, this may simplify the expression to only include pairwise distances in this dimension. As written, however, the above equation includes potentials where the system experiences an external field dependent on the atomic coordinates, or includes angle, dihedral, or other terms to describe intramolecular interactions.

The derivative we want is

\[
\frac{d\langle X\rangle}{dL} = \frac{d}{dL}\int_0^L \frac{X e^{-\beta U}}{Z}\, dx_1\, dy_1\, dz_1 \cdots dx_N\, dy_N\, dz_N \tag{A2}
\]

We have only written the integration over translational coordinates in the simulation box. This is sufficient to fully describe monatomic particles, but for full molecules we should also integrate over rotations and intramolecular motions, not just center-of-mass translations. If we assume that such degrees of freedom do not depend on the size of the box, these terms do not affect the above derivative and simply contribute to the definition of the average. This treatment is exact for rigid molecules or for molecules that are small relative to the typical box size. For large flexible molecules, however, even intramolecular interactions may depend on the box size (following best practices in determining simulation box sizes for periodic systems, this should in practice occur only rarely). In such cases we may instead simply assume that the integration in Eq. A2 does indeed run over all atomic positions and that the potential energy function includes bonds, angles, etc. in terms of these coordinates (in practice this is in fact the case). In real simulations, the potential energy will also depend on the box size through the application of minimum imaging in systems with periodic boundaries. This is not explicitly treated in the following derivation, but is assumed to be accurately captured if the virial (defined below) is properly computed.

First define a scaled coordinate as \( x'_i = x_i / L \) and transform the integrals in Eq. A2 so that we can take derivatives directly inside the integral.

\[
\begin{aligned}
\frac{d\langle X\rangle}{dL} &= \frac{d}{dL}\int_0^1 \frac{X e^{-\beta U} L^{3N}}{Z}\, dx'_1 \cdots dx'_N \\
&= \frac{1}{Z^2}\int_0^1 \left\{ Z e^{-\beta U}\left[ 3N L^{3N-1} X + L^{3N}\left( \frac{dX}{dL} - \beta X \frac{dU}{dL} \right) \right] - L^{3N} X e^{-\beta U} \frac{dZ}{dL} \right\} dx'_1 \cdots dx'_N \\
&= \int_0^1 \frac{e^{-\beta U}}{Z}\left[ 3N L^{3N-1} X + L^{3N}\left( \frac{dX}{dL} - \beta X \frac{dU}{dL} \right) - \frac{1}{Z} L^{3N} X \frac{dZ}{dL} \right] dx'_1 \cdots dx'_N
\end{aligned} \tag{A3}
\]

We can substitute in the derivative with respect to Z from Eq. A1, but must determine the derivatives with respect to U and X. If we have not transformed the coordinates inside these functions, we can write

\[
\frac{dU}{dL} = \sum_i \frac{\partial U}{\partial x_i}\frac{\partial x_i}{\partial L} = \sum_i \frac{\partial U}{\partial x_i} x'_i = \frac{1}{L}\sum_i \frac{\partial U}{\partial x_i} x_i \tag{A4}
\]

The same goes for X. Substituting in all of the derivatives,

\[
\frac{d\langle X\rangle}{dL} = \int_0^1 \frac{e^{-\beta U}}{Z}\left[ 3N L^{3N-1} X + L^{3N}\left( \frac{1}{L}\sum_i \frac{\partial X}{\partial x_i} x_i - \beta X \frac{1}{L}\sum_i \frac{\partial U}{\partial x_i} x_i \right) - \frac{1}{Z} L^{3N} X \left( \frac{3NZ}{L} - \frac{\beta Z}{L}\left\langle \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle \right) \right] dx'_1 \cdots dx'_N \tag{A5}
\]

We now transform the integral back to the original coordinates and clean everything up.

\[
\begin{aligned}
\frac{d\langle X\rangle}{dL} &= \int_0^L \frac{e^{-\beta U}}{Z L^{3N}}\left[ 3N L^{3N-1} X + L^{3N}\left( \frac{1}{L}\sum_i \frac{\partial X}{\partial x_i} x_i - \beta X \frac{1}{L}\sum_i \frac{\partial U}{\partial x_i} x_i \right) - \frac{1}{Z} L^{3N} X \left( \frac{3NZ}{L} - \frac{\beta Z}{L}\left\langle \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle \right) \right] dx_1 \cdots dx_N \\
&= \int_0^L \frac{e^{-\beta U}}{Z L}\left[ 3N X + \sum_i \frac{\partial X}{\partial x_i} x_i - \beta X \sum_i \frac{\partial U}{\partial x_i} x_i - \frac{1}{Z} X \left( 3N Z - \beta Z \left\langle \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle \right) \right] dx_1 \cdots dx_N \\
&= \int_0^L \frac{e^{-\beta U}}{Z L}\left[ \sum_i \frac{\partial X}{\partial x_i} x_i - \beta X \sum_i \frac{\partial U}{\partial x_i} x_i + \beta X \left\langle \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle \right] dx_1 \cdots dx_N \\
&= \frac{1}{L}\left[ \left\langle \sum_i \frac{\partial X}{\partial x_i} x_i \right\rangle - \beta \left\langle X \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle + \beta \langle X\rangle \left\langle \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle \right] \\
&= \frac{1}{L}\left[ \beta \langle X\rangle \left\langle \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle - \beta \left\langle X \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle + \left\langle \sum_i \frac{\partial X}{\partial x_i} x_i \right\rangle \right]
\end{aligned} \tag{A6}
\]

Using the chain rule we can write

\[
\frac{d\langle X\rangle}{dV} = \frac{d\langle X\rangle}{dL}\left( \frac{dV}{dL} \right)^{-1} = \frac{1}{3L^2}\frac{d\langle X\rangle}{dL} = \frac{1}{3V}\left[ \beta \langle X\rangle \left\langle \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle - \beta \left\langle X \sum_i \frac{\partial U}{\partial x_i} x_i \right\rangle + \left\langle \sum_i \frac{\partial X}{\partial x_i} x_i \right\rangle \right] \tag{A7}
\]

It is typical to define the virial as \( W = -\sum_i \frac{\partial U}{\partial x_i} x_i \). Using this definition we obtain Eq. 27 in the main text

\[
\frac{d\langle X\rangle}{dV} = \frac{1}{3V}\left[ -\beta \langle X\rangle\langle W\rangle + \beta \langle X W\rangle + \left\langle \sum_i \frac{\partial X}{\partial x_i} x_i \right\rangle \right] \tag{A8}
\]

We can obtain the virial W from most simulation software packages, with rigid-body constraints and periodic boundary conditions with minimum imaging properly accounted for. Note that the above derivation only allows extrapolation up to first order. Continuing to second order will result in second derivatives of the potential energy with respect to the box length, which in turn results in second derivatives of the potential energy with respect to the coordinates. This means that we would need to keep track of the derivatives of the forces, which is not typically trivial in most simulation packages. With the value of the observable and the virial for each simulation configurational snapshot, we can easily compute the first two terms in Eq. 27. A remaining difficulty, however, is the last term involving derivatives of the observable with respect to coordinates. There is no general expression for this quantity as it will depend on the definition of the observable of interest. Derivations for approximations to this term are provided below for radial distribution functions, three-body angle distributions, and hard-sphere insertions. Though this term is typically orders of magnitude smaller than the other terms in Eq. 27, it may still become significant when the difference between the other two terms is small. While we have found the last term in Eq. 27 to be small for RDFs and three-body angle distributions, it makes a non-negligible contribution for hard-sphere insertions. For the case of the tetrahedral order parameter, we do not present a derivation as we expect no contribution: interparticle distances do not affect the calculation of the three-body angle and are only used to determine the four nearest neighbors. The selected neighbors will not change with box scaling as all particle separations will be scaled by the same amount.
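With per-snapshot values of the observable and the virial in hand, Eq. A8 reduces to simple averages. A sketch of this estimator follows; the function names are illustrative, and the observable-specific last term is assumed to have been precomputed and passed in as a number (`dx_term`).

```python
import numpy as np

def dX_dV(x_snap, w_snap, beta, volume, dx_term=0.0):
    """First-order volume derivative of <X>, Eq. A8.

    x_snap  : per-snapshot values of the observable X
    w_snap  : per-snapshot virial W = -sum_i (dU/dx_i) x_i
    dx_term : <sum_i (dX/dx_i) x_i>, which is observable-specific
              (zero for, e.g., the tetrahedral order parameter)
    """
    x = np.asarray(x_snap, dtype=float)
    w = np.asarray(w_snap, dtype=float)
    # beta * (<XW> - <X><W>) is the fluctuation part of Eq. A8
    cov_xw = np.mean(x * w) - x.mean() * w.mean()
    return (beta * cov_xw + dx_term) / (3.0 * volume)

def extrapolate_in_volume(x_snap, w_snap, beta, v0, v_new, dx_term=0.0):
    """Linear (first-order) extrapolation of <X> from volume v0 to v_new."""
    slope = dX_dV(x_snap, w_snap, beta, v0, dx_term)
    return float(np.mean(x_snap)) + slope * (v_new - v0)
```

Note that if X is uncorrelated with the virial and `dx_term` vanishes, the estimated slope is zero, as expected from Eq. A8.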

1. Radial distribution functions

Extrapolating an RDF entails extrapolating each set of bin counts of a certain interparticle radial distance r. The average fraction of particle pairs with a separation in bin k of fixed bin width δ is the average number of bin counts divided by the number of particle pairs and is given by

\[
\begin{aligned}
c_k &= \frac{2}{N(N-1)} \int \frac{e^{-\beta U}}{Z} \sum_i \Theta\!\left( r_i - \left( r_k - \tfrac{\delta}{2} \right) \right)\left( 1 - \Theta\!\left( r_i - \left( r_k + \tfrac{\delta}{2} \right) \right) \right) d\mathbf{r}^N \\
&= \frac{2}{N(N-1)} \sum_i \int \frac{e^{-\beta U}}{Z}\, \Theta\!\left( r_i - \left( r_k - \tfrac{\delta}{2} \right) \right)\left( 1 - \Theta\!\left( r_i - \left( r_k + \tfrac{\delta}{2} \right) \right) \right) d\mathbf{r}^N \\
&= \int \frac{e^{-\beta U}}{Z}\, \Theta\!\left( r_1 - \left( r_k - \tfrac{\delta}{2} \right) \right)\left( 1 - \Theta\!\left( r_1 - \left( r_k + \tfrac{\delta}{2} \right) \right) \right) d\mathbf{r}^N
\end{aligned} \tag{A9}
\]

The summation runs over all pairs of particles, with the ith pairwise particle distance given by \( r_i \). The product of Heaviside functions \( \Theta(x) \) ensures that the configuration is only counted if the interparticle separation is within bin k. In the last line, we have utilized the fact that all particle pairs are identical. It will be convenient to rewrite Eq. A9 explicitly in terms of cartesian coordinates and transform these coordinates such that \( \mathbf{r}'_i = \mathbf{r}_i - \mathbf{r}_1 \). For a homogeneous system, setting \( \mathbf{r}_1 \) as the origin does not change the average since the potential energy only depends on relative particle positions. With this new coordinate system and selecting the first pair as particles 1 and 2, we may write

\[
c_k = \int \frac{e^{-\beta U}}{Z}\, \Theta\!\left( |\mathbf{r}'_2| - \left( r_k - \tfrac{\delta}{2} \right) \right)\left( 1 - \Theta\!\left( |\mathbf{r}'_2| - \left( r_k + \tfrac{\delta}{2} \right) \right) \right) d\mathbf{r}'^N \tag{A10}
\]

where the distance between particles 1 and 2 is, in cartesian coordinates, \( |\mathbf{r}'_2| = (x'^2_2 + y'^2_2 + z'^2_2)^{1/2} \). It is then clear that the simulation observable we are averaging is \( X = \Theta\!\left( |\mathbf{r}'_2| - \left( r_k - \tfrac{\delta}{2} \right) \right)\left( 1 - \Theta\!\left( |\mathbf{r}'_2| - \left( r_k + \tfrac{\delta}{2} \right) \right) \right) \). All of the derivatives in the last term of Eq. 27 will be zero except those with respect to the transformed coordinates of particle 2. The derivative with respect to a single cartesian coordinate of particle 2 is

\[
\begin{aligned}
\frac{\partial X}{\partial x'_2} &= \frac{\partial X}{\partial |\mathbf{r}'_2|}\frac{\partial |\mathbf{r}'_2|}{\partial x'_2} = \frac{\partial X}{\partial |\mathbf{r}'_2|}\frac{x'_2}{|\mathbf{r}'_2|} \\
&= \left[ \delta\!\left( |\mathbf{r}'_2| - \left( r_k - \tfrac{\delta}{2} \right) \right)\left( 1 - \Theta\!\left( |\mathbf{r}'_2| - \left( r_k + \tfrac{\delta}{2} \right) \right) \right) - \Theta\!\left( |\mathbf{r}'_2| - \left( r_k - \tfrac{\delta}{2} \right) \right)\delta\!\left( |\mathbf{r}'_2| - \left( r_k + \tfrac{\delta}{2} \right) \right) \right] \frac{x'_2}{|\mathbf{r}'_2|} \\
&= \left[ \delta\!\left( |\mathbf{r}'_2| - \left( r_k - \tfrac{\delta}{2} \right) \right) - \delta\!\left( |\mathbf{r}'_2| - \left( r_k + \tfrac{\delta}{2} \right) \right) \right] \frac{x'_2}{|\mathbf{r}'_2|}
\end{aligned} \tag{A11}
\]

Due to most derivatives being zero, the sum inside the expectation in the last term of Eq. 27 becomes

\[
\sum_i \frac{\partial X}{\partial x_i} x_i = \frac{\partial X}{\partial x'_2} x'_2 + \frac{\partial X}{\partial y'_2} y'_2 + \frac{\partial X}{\partial z'_2} z'_2 = \left[ \delta\!\left( |\mathbf{r}'_2| - \left( r_k - \tfrac{\delta}{2} \right) \right) - \delta\!\left( |\mathbf{r}'_2| - \left( r_k + \tfrac{\delta}{2} \right) \right) \right] |\mathbf{r}'_2| \tag{A12}
\]

Taking the expectation over all system configurations yields

\[
\begin{aligned}
\left\langle \sum_i \frac{\partial X}{\partial x_i} x_i \right\rangle &= \int \frac{e^{-\beta U}}{Z}\left[ \delta\!\left( |\mathbf{r}'_2| - \left( r_k - \tfrac{\delta}{2} \right) \right) - \delta\!\left( |\mathbf{r}'_2| - \left( r_k + \tfrac{\delta}{2} \right) \right) \right] |\mathbf{r}'_2|\, d\mathbf{r}'^N \\
&= \int P\!\left( \mathbf{r}'_2 \right)\left[ \delta\!\left( |\mathbf{r}'_2| - \left( r_k - \tfrac{\delta}{2} \right) \right) - \delta\!\left( |\mathbf{r}'_2| - \left( r_k + \tfrac{\delta}{2} \right) \right) \right] |\mathbf{r}'_2|\, d\mathbf{r}'_2
\end{aligned} \tag{A13}
\]

By performing the integration over all coordinates except \( \mathbf{r}'_2 \) we obtain a marginal probability \( P(\mathbf{r}'_2) \). In order to integrate the above, we first transform from cartesian to spherical coordinates.

\[
\begin{aligned}
\left\langle \sum_i \frac{\partial X}{\partial x_i} x_i \right\rangle &= \int \frac{P(\mathbf{r}'_2)}{\sin(\theta_2)\, r_2^2}\left[ \delta\!\left( r_2 - \left( r_k - \tfrac{\delta}{2} \right) \right) - \delta\!\left( r_2 - \left( r_k + \tfrac{\delta}{2} \right) \right) \right] r_2 \sin(\theta_2)\, r_2^2\, dr_2\, d\theta_2\, d\phi_2 \\
&= \int P(r_2)\left[ \delta\!\left( r_2 - \left( r_k - \tfrac{\delta}{2} \right) \right) - \delta\!\left( r_2 - \left( r_k + \tfrac{\delta}{2} \right) \right) \right] r_2\, dr_2 \\
&= P\!\left( r_k - \tfrac{\delta}{2} \right)\left( r_k - \tfrac{\delta}{2} \right) - P\!\left( r_k + \tfrac{\delta}{2} \right)\left( r_k + \tfrac{\delta}{2} \right)
\end{aligned} \tag{A14}
\]

By transforming to spherical coordinates, the probability density is divided by a term that cancels the Jacobian determinant. To arrive at the second line we have integrated over the angular coordinates to obtain the marginal probability of only the radial coordinate. P(r) represents the marginal probability density of observing any particle pair with an interparticle separation of r. If the bin width δ is sufficiently small, then to a good approximation \( P\!\left( r = r_k - \tfrac{\delta}{2} \right) \approx P\!\left( r = r_k + \tfrac{\delta}{2} \right) \). Furthermore, these are also approximately equal to \( c_k \) divided by the width of the bin

\[
\left\langle \sum_i \frac{\partial X}{\partial x_i} x_i \right\rangle \approx \frac{c_k}{\delta}\left[ \left( r_k - \tfrac{\delta}{2} \right) - \left( r_k + \tfrac{\delta}{2} \right) \right] = -c_k \tag{A15}
\]


In practice, we obtain \( c_k \) for each frame by counting the number of pairs of particles with a separation in bin k and dividing by the total number of unique pairs.

Strictly speaking, the definition of the radial distribution function involves normalizing by a term depending on the volume in order to account for the fractional volume of the bins.

\[
g(r_k) = \frac{c_k}{(v_k / V)} \tag{A16}
\]

The volume of bin k is given by \( v_k = \frac{4}{3}\pi \left( r_k^3 - r_{k-1}^3 \right) \) and V is the total volume. Through the chain rule, the appropriate full derivative for the radial distribution function becomes

\[
\frac{dg(r_k)}{dV} = \frac{c_k}{v_k} + \frac{V}{v_k}\frac{dc_k}{dV} \tag{A17}
\]

This is only the first-order derivative, but will suffice for the present work.
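Numerically, Eqs. A8 and A15–A17 combine into a per-bin estimator for g(r) and its volume derivative. A sketch under the approximations above (illustrative names; `ck_snap` holds the per-frame fraction of unique pairs in each bin) might read:

```python
import numpy as np

def rdf_volume_derivative(ck_snap, w_snap, beta, volume, edges):
    """g(r_k) and its first-order volume derivative, Eqs. A8 and A15-A17.

    ck_snap : (n_frames, n_bins) fraction of unique pairs in each bin per frame
    w_snap  : (n_frames,) virial per frame
    edges   : (n_bins + 1,) radial bin edges
    """
    ck = np.asarray(ck_snap, dtype=float)
    w = np.asarray(w_snap, dtype=float)
    ck_mean = ck.mean(axis=0)
    # Eq. A8 per bin, with <sum_i (dX/dx_i) x_i> ~= -c_k from Eq. A15
    cov = (ck * w[:, None]).mean(axis=0) - ck_mean * w.mean()
    dck_dV = (beta * cov - ck_mean) / (3.0 * volume)
    # Shell volumes v_k = (4/3) pi (r_k^3 - r_{k-1}^3)
    vk = 4.0 / 3.0 * np.pi * np.diff(np.asarray(edges, dtype=float) ** 3)
    g = ck_mean * volume / vk                      # Eq. A16
    dg_dV = ck_mean / vk + (volume / vk) * dck_dV  # Eq. A17
    return g, dg_dV
```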

2. Three-body angle distributions

While an RDF classifies the translational order in water, the angular order may be explored by using the distribution of three-body angles.28 Given a central water oxygen, the three-body angle is that between vectors connecting the central water to two neighbors within its first hydration shell. The hydration shell is simply defined through a cutoff distance \( r_c \), with the resulting probability distribution being largely insensitive to this cutoff as long as it does not extend significantly beyond the first minimum in the RDF.28 As with RDFs, we may consider all triples of particles indistinguishable and average over only one of them. As such, the observable that we will average is

\[
X = \left( 1 - \Theta(r_{1,2} - r_c) \right)\left( 1 - \Theta(r_{1,3} - r_c) \right) \delta\!\left( \cos^{-1}\!\left[ \frac{\mathbf{r}_2 - \mathbf{r}_1}{r_{1,2}} \cdot \frac{\mathbf{r}_3 - \mathbf{r}_1}{r_{1,3}} \right] - \theta \right) \tag{A18}
\]

Particle 1 has been chosen as the central particle, with distances to the two other particles defined by \( r_{1,j} \). When averaged over all configurations, this counts the number of times that a triple is observed with two particles within the cutoff and forming an angle θ. Technically, the angle θ is binned, and so the delta function should be the product of two Heaviside functions. However, the term within the inverse cosine does not depend on the box size due to the normalization of the vectors between particles, and so treatment of this term will not impact the analysis below. This can be seen by imagining that we reference all particles to the coordinates of the central particle and transform to spherical coordinates. The term in the delta function will then no longer depend on the radial coordinates but only the angles. Angular coordinates will have no dependence on the system size and thus not enter into the sum over degrees of freedom in the last term of Eq. 27. By performing the coordinate transform mentioned above, we arrive at the derivative with respect to the radial coordinate of a non-central particle j

\[
\frac{\partial X}{\partial r_j} = -\delta(r_j - r_c)\left( 1 - \Theta(r_k - r_c) \right) \delta\!\left( \cos^{-1}\!\left[ \frac{\mathbf{r}'_j}{r_j} \cdot \frac{\mathbf{r}'_k}{r_k} \right] - \theta \right) \tag{A19}
\]

where k represents the other non-central particle forming the angle. Multiplying by the interparticle separation followed by summing and averaging yields

\[
\begin{aligned}
\left\langle \sum_i \frac{\partial X}{\partial r_i} r_i \right\rangle &= -2 \int \frac{e^{-\beta U}}{Z}\, r_2\, \delta(r_2 - r_c)\left( 1 - \Theta(r_3 - r_c) \right) \delta\!\left( \cos^{-1}\!\left[ \frac{\mathbf{r}'_2}{r_2} \cdot \frac{\mathbf{r}'_3}{r_3} \right] - \theta \right) d\mathbf{r}'^N \\
&= -2 \int P(\mathbf{r}'_2, \mathbf{r}'_3)\, r_2\, \delta(r_2 - r_c)\left( 1 - \Theta(r_3 - r_c) \right) \delta\!\left( \cos^{-1}\!\left[ \frac{\mathbf{r}'_2}{r_2} \cdot \frac{\mathbf{r}'_3}{r_3} \right] - \theta \right) d\mathbf{r}'_2\, d\mathbf{r}'_3 \\
&= -2\, P(\theta, r_2 = r_c, r_3 \le r_c)\, r_c
\end{aligned} \tag{A20}
\]

In the first line, we have asserted that terms from particles 2 and 3 will be identical, resulting in a factor of 2. Integrating over all degrees of freedom except those of particles 2 and 3, we arrive at the second line, where \( P(\mathbf{r}'_2, \mathbf{r}'_3) \) is a marginal probability density. The final line indicates that further integration results in the probability that the angle θ is formed with particle 3 within the cutoff and particle 2 exactly at a radial distance of \( r_c \) from the central particle. Configurations contributing to this probability are a subset of those contributing to the full probability of an angle with both non-central particles at or below the cutoff. As such, we may write

\[
P(\theta, r_2 = r_c, r_3 \le r_c) = P(\theta, r_2 \le r_c, r_3 \le r_c)\, P(r_2 = r_c \mid \theta, r_2 \le r_c, r_3 \le r_c) \tag{A21}
\]

The first probability on the right-hand side is equal to ⟨X⟩, which is proportional to the number of counts. The second is the probability of \( r_2 \) being at a distance of \( r_c \) given that all other conditions are met. In other words, this represents the fraction of the number of counts of an angle θ where one non-central particle is exactly at the cutoff. If we approximate \( P(r_2 = r_c \mid \theta, r_2 \le r_c, r_3 \le r_c) \) to be uniform along \( r_2 \), then

\[
P(r_2 = r_c \mid \theta, r_2 \le r_c, r_3 \le r_c) \approx \frac{4\pi r_c^2}{\frac{4}{3}\pi r_c^3} = \frac{3}{r_c} \tag{A22}
\]

Combining this with Eqs. A20 and A21, we obtain

\[
\left\langle \sum_i \frac{\partial X}{\partial r_i} r_i \right\rangle = -6\,\langle X\rangle \tag{A23}
\]

In practice, ⟨X⟩ is the average number of counts over all configurations where a triple of particles exhibits an angle θ and the non-central molecules are within the cutoff. Presented three-body angle distributions represent the probability of an angle given that the distance cutoff is satisfied, which involves a normalization by the total number of bin counts. Note that the three-body angle distribution with discrete bins is a probability mass function, with the sum of the bins equalling one. While we directly extrapolate bin counts, we always normalize the result so that multiple such distributions may be compared on the same scale.
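The same per-bin estimator as for RDFs applies here, with the last term of Eq. 27 supplied by Eq. A23 as −6⟨X⟩ for every angle bin. A hypothetical sketch (illustrative names), including the final normalization of extrapolated counts to a probability mass function, is:

```python
import numpy as np

def angle_counts_volume_derivative(counts_snap, w_snap, beta, volume):
    """First-order volume derivative of three-body angle bin counts,
    Eq. A8 with <sum_i (dX/dr_i) r_i> = -6 <X> per bin (Eq. A23).

    counts_snap : (n_frames, n_bins) per-frame counts in each angle bin
    w_snap      : (n_frames,) virial per frame
    """
    h = np.asarray(counts_snap, dtype=float)
    w = np.asarray(w_snap, dtype=float)
    h_mean = h.mean(axis=0)
    cov = (h * w[:, None]).mean(axis=0) - h_mean * w.mean()
    return (beta * cov - 6.0 * h_mean) / (3.0 * volume)

def normalize_counts(counts):
    """Rescale extrapolated bin counts to a probability mass function;
    small negative counts from linear extrapolation are clipped to zero
    (an assumption of this sketch, not part of the derivation)."""
    counts = np.clip(np.asarray(counts, dtype=float), 0.0, None)
    return counts / counts.sum()
```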

3. Hard-sphere insertions

For hard spheres, the potential energy change upon adding the particle is infinite if any molecules overlap and zero otherwise, leading to a simplification in the expression for the excess chemical potential

\[
e^{-\beta \mu^{\mathrm{ex}}_{\mathrm{HS}}} = \left\langle e^{-\beta \Delta U} \right\rangle = P(N = 0) \tag{A24}
\]

Hence the average quantity of interest is the probability that no atoms lie within the defined volume of a hard-sphere particle placed randomly within the system. In the case of most water models, we only check water oxygens, as the hydrogens do not participate in LJ or other repulsive core interactions. We can write P(N = 0) for a hard sphere centered at the origin with a distance of closest approach to water oxygens of R as

\[
P(N = 0) = \int \frac{e^{-\beta U}}{Z} \prod_i \Theta(r_i - R)\, d\mathbf{r}^N \tag{A25}
\]

Here \( r_i \) is the radial coordinate (or distance from the origin) of particle i, with the product running over all particles in the system (oxygens only for water). The product takes the value of 1 only if no particles are within R of the hard-sphere particle centered at the origin. In this way, the total probability of having no particles in the hard-sphere volume is obtained by integrating over all configurations weighted by their configurational weights. From this it is clear that, in our usual notation, ⟨X⟩ = P(N = 0) and \( X = \prod_i \Theta(r_i - R) \). We have that

\[
\frac{\partial X}{\partial r_i} = \delta(r_i - R) \prod_{j \ne i} \Theta(r_j - R) \tag{A26}
\]

Under an integral, the above ensures that no particles \( j \ne i \) are within the hard-sphere volume and that particle i is exactly at R.

\[
\begin{aligned}
\left\langle \sum_i \frac{\partial X}{\partial r_i} r_i \right\rangle &= \int \frac{e^{-\beta U}}{Z} \sum_i \delta(r_i - R)\, r_i \prod_{j \ne i} \Theta(r_j - R)\, d\mathbf{r}^N \\
&= N P(r_1 = R,\; r_j \ge R\; \forall\, j > 1)\, R \\
&= N P(N = 0, r_1 = R)\, R \\
&= N P(r_1 = R \mid N = 0)\, P(N = 0)\, R \\
&\approx N P(r_1 = R)\, P(N = 0)\, R \\
&= \frac{4\pi R^3}{V} N P(N = 0)
\end{aligned} \tag{A27}
\]

The approximation on the fifth line assumes that the probability of a particle residing at R is negligibly affected by the fact that no other particles are inside the hard-sphere volume, which for a homogeneous system with uniform particle density leads to the expression on the last line. This is certainly not true if no particles are inside R, in which case the probability of finding a particle at R is increased and is termed the "contact value."53 More accurate expressions for this quantity may be found in the context of scaled particle theories,53 but are not implemented here.

1. J. C. Palmer and P. G. Debenedetti, "Recent advances in molecular simulation: A chemical engineering perspective," AIChE Journal 61, 370–383 (2015).

2. S. A. Hollingsworth and R. O. Dror, "Molecular Dynamics Simulation for All," Neuron 99, 1129–1143 (2018).

3. V. Spiwok, Z. Sucur, and P. Hosek, "Enhanced sampling techniques in biomolecular simulations," Biotechnology Advances 33, 1130–1140 (2015).

4. Y. I. Yang, Q. Shao, J. Zhang, L. Yang, and Y. Q. Gao, "Enhanced sampling in molecular dynamics," The Journal of Chemical Physics 151, 070902 (2019).

5. W. M. Brown, P. Wang, S. J. Plimpton, and A. N. Tharrington, "Implementing molecular dynamics on hybrid high performance computers – short range forces," Computer Physics Communications 182, 898–911 (2011).

6. W. M. Brown, A. Kohlmeyer, S. J. Plimpton, and A. N. Tharrington, "Implementing molecular dynamics on hybrid high performance computers – Particle–particle particle-mesh," Computer Physics Communications 183, 449–459 (2012).

7. P. Eastman, J. Swails, J. D. Chodera, R. T. McGibbon, Y. Zhao, K. A. Beauchamp, L.-P. Wang, A. C. Simmonett, M. P. Harrigan, C. D. Stern, R. P. Wiewiora, B. R. Brooks, and V. S. Pande, "OpenMM 7: Rapid development of high performance algorithms for molecular dynamics," PLOS Computational Biology 13, e1005659 (2017).

8. A. W. Götz, M. J. Williamson, D. Xu, D. Poole, S. Le Grand, and R. C. Walker, "Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 1. Generalized Born," Journal of Chemical Theory and Computation 8, 1542–1555 (2012).

9. C. Kutzner, S. Páll, M. Fechner, A. Esztermann, B. L. Groot, and H. Grubmüller, "More bang for your buck: Improved use of GPU nodes for GROMACS 2018," Journal of Computational Chemistry 40, 2418–2431 (2019).

10. R. Salomon-Ferrer, A. W. Götz, D. Poole, S. Le Grand, and R. C. Walker, "Routine Microsecond Molecular Dynamics Simulations with AMBER on GPUs. 2. Explicit Solvent Particle Mesh Ewald," Journal of Chemical Theory and Computation 9, 3878–3888 (2013).

11. D. E. Shaw, J. Grossman, J. A. Bank, B. Batson, J. A. Butts, J. C. Chao, M. M. Deneroff, R. O. Dror, A. Even, C. H. Fenton, A. Forte, J. Gagliardo, G. Gill, B. Greskamp, C. R. Ho, D. J. Ierardi, L. Iserovich, J. S. Kuskin, R. H. Larson, T. Layman, L.-S. Lee, A. K. Lerer, C. Li, D. Killebrew, K. M. Mackenzie, S. Y.-H. Mok, M. A. Moraes, R. Mueller, L. J. Nociolo, J. L. Peticolas, T. Quan, D. Ramot, J. K. Salmon, D. P. Scarpazza, U. B. Schafer, N. Siddique, C. W. Snyder, J. Spengler, P. T. P. Tang, M. Theobald, H. Toma, B. Towles, B. Vitale, S. C. Wang, and C. Young, "Anton 2: Raising the Bar for Performance and Programmability in a Special-Purpose Molecular Dynamics Supercomputer," in SC14: International Conference for High Performance Computing, Networking, Storage and Analysis (IEEE, New Orleans, LA, USA, 2014) pp. 41–53.

12. A. Purohit, A. J. Schultz, and D. A. Kofke, "Force-sampling methods for density distributions as instances of mapped averaging," Molecular Physics 117, 2822–2829 (2019).

13. A. J. Schultz and D. A. Kofke, "Alternatives to conventional ensemble averages for thermodynamic properties," Current Opinion in Chemical Engineering 23, 70–76 (2019).

14. A. J. Schultz, S. G. Moustafa, W. Lin, S. J. Weinstein, and D. A. Kofke, "Reformulation of Ensemble Averages via Coordinate Mapping," Journal of Chemical Theory and Computation 12, 1491–1498 (2016).

15. N. A. Mahynski, J. R. Errington, and V. K. Shen, "Multivariable extrapolation of grand canonical free energy landscapes," The Journal of Chemical Physics 147, 234111 (2017).


16. N. A. Mahynski, M. A. Blanco, J. R. Errington, and V. K. Shen, "Predicting low-temperature free energy landscapes with flat-histogram Monte Carlo methods," The Journal of Chemical Physics 146, 074101 (2017).

17. N. A. Mahynski, J. R. Errington, and V. K. Shen, "Temperature extrapolation of multicomponent grand canonical free energy landscapes," The Journal of Chemical Physics 147, 054105 (2017).

18. M. Witman, N. A. Mahynski, and B. Smit, "Flat-Histogram Monte Carlo as an Efficient Tool To Evaluate Adsorption Processes Involving Rigid and Deformable Molecules," Journal of Chemical Theory and Computation 14, 6149–6158 (2018).

19. H. W. Hatch, S. Jiao, N. A. Mahynski, M. A. Blanco, and V. K. Shen, "Communication: Predicting virial coefficients and alchemical transformations by extrapolating Mayer-sampling Monte Carlo simulations," The Journal of Chemical Physics 147, 231102 (2017).

20. N. A. Mahynski, S. Jiao, H. W. Hatch, M. A. Blanco, and V. K. Shen, "Predicting structural properties of fluids by thermodynamic extrapolation," The Journal of Chemical Physics 148, 194105 (2018).

21. Z. A. Piskulich, O. O. Mesele, and W. H. Thompson, "Removing the barrier to the calculation of activation energies: Diffusion coefficients and reorientation times in liquid water," The Journal of Chemical Physics 147, 134103 (2017).

22. Z. A. Piskulich, O. O. Mesele, and W. H. Thompson, "Activation Energies and Beyond," The Journal of Physical Chemistry A 123, 7185–7194 (2019).

23. Z. A. Piskulich and W. H. Thompson, "The dynamics of supercooled water can be predicted from room temperature simulations," The Journal of Chemical Physics 152, 074505 (2020).

24. E. Brini, C. J. Fennell, M. Fernandez-Serra, B. Hribar-Lee, M. Lukšic, and K. A. Dill, "How Water's Properties Are Encoded in Its Molecular Structure and Energies," Chemical Reviews 117, 12385–12414 (2017).

25. P. Gallo, K. Amann-Winkel, C. A. Angell, M. A. Anisimov, F. Caupin, C. Chakravarty, E. Lascaris, T. Loerting, A. Z. Panagiotopoulos, J. Russo, J. A. Sellberg, H. E. Stanley, H. Tanaka, C. Vega, L. Xu, and L. G. M. Pettersson, "Water: A Tale of Two Liquids," Chemical Reviews 116, 7463–7500 (2016).

26. M. Agarwal, M. P. Alam, and C. Chakravarty, "Thermodynamic, Diffusional, and Structural Anomalies in Rigid-Body Water Models," The Journal of Physical Chemistry B 115, 6935–6945 (2011).

27. J. R. Errington and P. G. Debenedetti, "Relationship between structural order and the anomalies of liquid water," Nature 409, 318–321 (2001).

28. J. I. Monroe and M. S. Shell, "Decoding signatures of structure, bulk thermodynamics, and solvation in three-body angle distributions of rigid water models," The Journal of Chemical Physics 151, 094501 (2019).

29. Z. A. Piskulich and W. H. Thompson, "On the temperature dependence of liquid structure," The Journal of Chemical Physics 152, 011102 (2020).

30. R. W. Zwanzig, "High-Temperature Equation of State by a Perturbation Method. I. Nonpolar Gases," The Journal of Chemical Physics 22, 1420–1426 (1954).

31. J. G. Kirkwood, "Statistical Mechanics of Fluid Mixtures," The Journal of Chemical Physics 3, 300–313 (1935).

32. G. Hummer, "Fast-growth thermodynamic integration: Error and efficiency analysis," The Journal of Chemical Physics 114, 7330–7337 (2001).

33. A. D. D. Craik, "Prehistory of Faà di Bruno's Formula," The American Mathematical Monthly 112, 119–130 (2005).

34. L. Comtet, Advanced Combinatorics (D. Reidel Publishing Company, Dordrecht, Holland, 1974).

35. N. Lu and D. A. Kofke, "Understanding and Improving Free Energy Calculations in Molecular Simulations: Error Analysis and Reduction Methods," in Free Energy Calculations, edited by C. Chipot and A. Pohorille (Springer, 2007) pp. 199–247.

36. C. H. Bennett, "Efficient estimation of free energy differences from Monte Carlo data," Journal of Computational Physics 22, 245–268 (1976).

37. K. A. Beauchamp, J. D. Chodera, L. N. Naden, and M. R. Shirts, "Pymbar."

38. K. Kamat and B. Peters, "Diabat Interpolation for Polymorph Free-Energy Differences," The Journal of Physical Chemistry Letters 8, 655–660 (2017).

39. P. V. Klimovich, M. R. Shirts, and D. L. Mobley, "Guidelines for the analysis of free energy calculations," Journal of Computer-Aided Molecular Design 29, 397–411 (2015).

40. M. R. Shirts and J. D. Chodera, "Statistically optimal analysis of samples from multiple equilibrium states," The Journal of Chemical Physics 129, 124105 (2008).

41. M. R. Shirts and V. S. Pande, "Comparison of efficiency and bias of free energies computed by exponential averaging, the Bennett acceptance ratio, and thermodynamic integration," The Journal of Chemical Physics 122, 144107 (2005).

42. S. Bruckner and S. Boresch, "Efficiency of alchemical free energy simulations. II. Improvements for thermodynamic integration," Journal of Computational Chemistry 32, 1320–1333 (2011).

43. D. A. Kofke, "Free energy methods in molecular simulation," Fluid Phase Equilibria 228-229, 41–48 (2005).

44. N. Lu and D. A. Kofke, "Optimal intermediates in staged free energy calculations," The Journal of Chemical Physics 111, 4414–4423 (1999).

45. A. Pohorille, C. Jarzynski, and C. Chipot, "Good Practices in Free-Energy Calculations," The Journal of Physical Chemistry B 114, 10235–10253 (2010).

46. J. L. F. Abascal and C. Vega, "A general purpose model for the condensed phases of water: TIP4P/2005," The Journal of Chemical Physics 123, 234505 (2005).

47. M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess, and E. Lindahl, "GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers," SoftwareX 1-2, 19–25 (2015).

48. G. Hummer, S. Garde, A. E. Garcia, A. Pohorille, and L. R. Pratt, "An information theory model of hydrophobic interactions," Proceedings of the National Academy of Sciences 93, 8951–8955 (1996).

49. S. N. Jamadagni, R. Godawat, and S. Garde, "Hydrophobicity of Proteins and Interfaces: Insights from Density Fluctuations," Annual Review of Chemical and Biomolecular Engineering 2, 147–171 (2011).

50. V. Molinero and E. B. Moore, "Water Modeled As an Intermediate Element between Carbon and Silicon," The Journal of Physical Chemistry B 113, 4008–4016 (2009).

51. P.-L. Chau and A. J. Hardwick, "A new order parameter for tetrahedral configurations," Molecular Physics 93, 511–518 (1998).

52A. Chaimovich and M. S. Shell, “Tetrahedrality and structural order for hy-drophobic interactions in a coarse-grained water model,” Physical ReviewE 89, 022140 (2014).

53H. S. Ashbaugh and L. R. Pratt, “Colloquium : Scaled particle theory andthe length scales of hydrophobicity,” Reviews of Modern Physics 78, 159–178 (2006).

54A. P. Willard and D. Chandler, “Instantaneous Liquid Interfaces,” The Jour-nal of Physical Chemistry B 114, 1954–1958 (2010).

55A. J. Patel, P. Varilly, S. N. Jamadagni, M. F. Hagan, D. Chandler, andS. Garde, “Sitting at the Edge: How Biomolecules use Hydrophobicity toTune Their Interactions and Function,” The Journal of Physical ChemistryB 116, 2498–2503 (2012).

56P. G. Debenedetti, Metastable Liquids: Concepts and Principles, PhysicalChemistry: Science and Engineering (Princeton University Press, Prince-ton, NJ, 1997).

57C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine

Learning, Adaptive Computation and Machine Learning (MIT Press, Cam-bridge, MA, 2006).

58D. A. McQuarrie, Statistical Mechanics (University Science Books, Sausal-ito, CA, 2000).

59M. S. Shell, Thermodynamics and Statistical Mechanics (Cambridge Uni-versity Press, Cambridge, United Kingdom, 2015).


Supplementary Material: Extrapolation and interpolation strategies for efficiently estimating structural observables as a function of temperature and density

Jacob I. Monroe Harold W. Hatch Nathan A. Mahynski

M. Scott Shell Vincent K. Shen

September 3, 2020


1 Supporting Figures

Figure S1: For the 1D ideal gas in an external field described in the text, a simple structural property of interest is the average x value, which is plotted in the top panel as a function of β. Changes in P(x) and P(U) with temperature are shown as well, with coloring by temperature (blue is the lowest T, highest β). Clearly the configurational distributions will not overlap in their important regions, and neither will the potential energy distributions.
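The average x in this toy model is easy to reproduce numerically. As a sketch only (the specific field form is an assumption here, a linear potential U(x) = a·x on [0, L], which may differ from the form used in the main text), the closed-form ⟨x⟩ can be checked against direct quadrature:

```python
import numpy as np

def _trapz(y, x):
    """Trapezoidal rule on sampled data (kept local to avoid NumPy version differences)."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) * 0.5)

def avg_x_analytic(beta, a=1.0, L=1.0):
    """Closed-form <x> for one ideal-gas particle in the field U(x) = a*x on [0, L]."""
    c = beta * a
    return 1.0 / c - L / np.expm1(c * L)

def avg_x_numeric(beta, a=1.0, L=1.0, n=100001):
    """<x> = int x exp(-beta*U) dx / int exp(-beta*U) dx by quadrature."""
    x = np.linspace(0.0, L, n)
    w = np.exp(-beta * a * x)
    return _trapz(x * w, x) / _trapz(w, x)

# The two routes agree, and <x> shifts toward 0 as beta grows (lower T),
# consistent with the trend plotted in the top panel.
for beta in (0.5, 1.0, 5.0):
    assert abs(avg_x_analytic(beta) - avg_x_numeric(beta)) < 1e-8
```

Because the Boltzmann weight concentrates near x = 0 at large β, configurations sampled at low T carry essentially no weight at high T, which is the overlap failure the caption describes.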


Figure S2: Relative unsigned errors from extrapolation values at infinite sampling (⟨X⟩_analytic) for both observables (top) and derivatives (bottom) are shown as functions of the number of samples (left) and order of extrapolation or derivative (right). At low order, extrapolation results quickly converge to the associated analytic values with few samples.


Figure S3: In all panels, lines represent the RDFs from direct simulation at each temperature, while points are the estimates via extrapolation from 300 K (constant density of 1.00 g/cm3) with 50 ns of simulation (50000 snapshots saved every 1 ps). Error bars represent one standard deviation and are determined through 100 bootstrap resamples of the original data.
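The bootstrap procedure used for these error bars can be sketched generically; this minimal version (a placeholder, not the authors' analysis code) resamples snapshots with replacement and takes the standard deviation of the re-averaged observable:

```python
import numpy as np

def bootstrap_error(per_snapshot, n_boot=100, seed=0):
    """One-sigma error bars on a trajectory-averaged observable.

    per_snapshot has shape (n_snapshots, ...), e.g. an RDF histogram computed
    for each saved configuration. Snapshots are resampled with replacement,
    and the spread of the resampled means estimates the statistical error.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(per_snapshot)
    n = data.shape[0]
    means = np.empty((n_boot,) + data.shape[1:])
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample snapshot indices
        means[b] = data[idx].mean(axis=0)  # re-average, e.g. per RDF bin
    return means.std(axis=0, ddof=1)
```

For time-correlated trajectory data, resampling contiguous blocks of snapshots rather than individual frames is the safer variant of the same idea.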


Figure S4: In all panels, lines represent the RDFs or three-body angle distributions from direct simulation at each temperature, while points are the estimates via extrapolation from 300 K (constant density of 0.87 g/cm3) with 5 ns of simulation (5000 snapshots saved every 1 ps). Error bars represent one standard deviation and are determined through bootstrap resampling. The neighbor cutoff was set to 0.34 nm in computing three-body angles. When extrapolating three-body angle distributions to lower temperatures, the derivative is underestimated, indicating that at this density a linear extrapolation is not fully sufficient. Figure S5 also reveals that the slope of the tetrahedral order parameter with temperature increases at lower temperatures for this density.


Figure S5: In all panels, lines represent the direct simulation translational or tetrahedral order parameter values, while points are the estimates via perturbation or extrapolation from 300 K (constant density of 0.87 g/cm3) with 5 ns of simulation (5000 snapshots saved every 1 ps). Error bars represent one standard deviation and are determined through bootstrap resampling. For translational order parameters, this involves resampling to obtain uncertainties in RDF bins and propagating the uncertainty associated with application of Simpson’s 1/3 rule.
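The Simpson's-rule step mentioned here can be sketched under one common definition of the translational order parameter, t = (1/r_c) ∫ |g(r) − 1| dr (the Errington–Debenedetti form; the paper's exact normalization is an assumption, not confirmed):

```python
import numpy as np

def simpson_13(y, x):
    """Composite Simpson's 1/3 rule; assumes an odd number of evenly spaced points."""
    h = x[1] - x[0]
    return h / 3.0 * (y[0] + y[-1] + 4.0 * y[1:-1:2].sum() + 2.0 * y[2:-2:2].sum())

def translational_order(r, g):
    """t = (1 / (r_max - r_min)) * integral of |g(r) - 1| over the tabulated range."""
    return simpson_13(np.abs(g - 1.0), r) / (r[-1] - r[0])

# sanity check: an ideal gas (g = 1 everywhere) has zero translational order
r = np.linspace(0.2, 1.0, 101)
assert translational_order(r, np.ones_like(r)) == 0.0
```

Propagating bootstrap uncertainties through this quadrature amounts to applying the same weighted sum to resampled RDFs and taking the spread of the resulting t values.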


Figure S6: In all panels, lines represent the RDFs or three-body angle distributions from direct simulation at each temperature, while points are the estimates via extrapolation from 300 K (constant density of 1.40 g/cm3) with 5 ns of simulation (5000 snapshots saved every 1 ps). Error bars represent one standard deviation and are determined through bootstrap resampling. The neighbor cutoff was set to 0.34 nm in computing three-body angles.


Figure S7: In all panels, lines represent the direct simulation translational or tetrahedral order parameter values, while points are the estimates via perturbation or extrapolation from 300 K (constant density of 1.40 g/cm3) with 5 ns of simulation (5000 snapshots saved every 1 ps). Error bars represent one standard deviation and are determined through bootstrap resampling. Non-monotonic behavior in the translational order parameter at this density is captured because this quantity is not directly extrapolated. Successful first-order extrapolation of the RDFs in Figure S6 leads to accurate translational order parameters.


Figure S8: Extrapolation and perturbation results in temperature from reference simulations of 50 ns at 300 K are shown for the probability of having zero water oxygens within the hard-sphere probe volume at a constant density of 1.00 g/cm3.


Figure S9: Extrapolation and perturbation of βμ_ex^HS and μ_ex^HS are shown as functions of temperature based on reference simulations of 5 ns at 300 K and 1.00 g/cm3. Triangles are used for perturbation results, while circles represent direct extrapolation of βμ_ex^HS. Diamonds are determined by extrapolating P(N = 0) and taking the negative of the natural logarithm, and generally agree closely with direct extrapolations.
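The route from occupancy statistics to hard-sphere excess chemical potentials can be sketched with the information-theory relation βμ_ex^HS = −ln P(N = 0) for a hard probe (function and argument names below are illustrative placeholders):

```python
import numpy as np

def beta_mu_ex_hs(occupancy_counts):
    """beta * mu_ex for a hard-sphere probe from solvent occupancy statistics.

    occupancy_counts: number of water oxygens found inside the probe volume in
    each snapshot. For a hard particle, beta*mu_ex = -ln P(N = 0), since an
    insertion succeeds only when the probe volume happens to be empty.
    """
    counts = np.asarray(occupancy_counts)
    p_empty = np.mean(counts == 0)
    if p_empty == 0.0:
        raise ValueError("probe volume never empty; P(N = 0) was not sampled")
    return -np.log(p_empty)
```

Note that extrapolating P(N = 0) in temperature and then taking −ln, as done for the diamonds, is not identical to extrapolating βμ_ex^HS directly, since the logarithm does not commute with a finite-order extrapolation.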

Figure S10: Relative error of RDF (left) and three-body angle (right) extrapolations from a reference temperature of 300 K is shown. The same data is shown in Figure 5, but here the RDFs calculated from simulations at each temperature are subtracted from the extrapolations and then used to normalize this difference.


Figure S11: Direct simulation results (solid lines), first-order extrapolations from 300 K (circles), and interpolations from the most extreme temperatures (squares) are shown versus temperature (at a constant density of 1.00 g/cm3) for (a) translational order parameters, (b) tetrahedral order parameters, and (c)/(d) excess chemical potentials of 0.33 nm radius hard spheres. Translational order parameters are computed from RDFs extrapolated in temperature from 5 ns simulations at 300 K and 1.00 g/cm3, while all other quantities are extrapolated directly from the same reference state. Interpolations utilize two simulations, one from each of the most extreme temperatures, and hence twice as much data as the presented extrapolations. Due to the inherently high ratio of noise to temperature variation for excess chemical potential estimates at this density, 50 ns of simulation are used for extrapolations and interpolations in (c) and (d).


Figure S12: Relative error of RDF (left) and three-body angle (right) extrapolations from a density of 1.00 g/cm3, or interpolations from the most extreme densities, is shown. The same data is shown in Figure 7, but here the RDFs calculated from simulations at each density are subtracted from the extrapolations and then used to normalize this difference.


Figure S13: An example implementation of the recursive algorithm and manual visual consistency check is presented for the tetrahedral order parameter over water densities at a fixed temperature of 300 K. As in Fig. 9, polynomial fits are shown over sliding sets of three points. For each set, a polynomial in the same color is plotted based on each pair of points (solid lines for the outermost points, dotted for the higher-density interval, and dash-dot for the lower-density interval). In step (a), we choose the point with the highest bootstrapped error over the initial interval defined by the two most extreme densities. The relative error from the results based on brute-force simulations goes from a maximum of 6% to 3%. Since the consistency check is most poorly satisfied on the lower-density interval, we select the point of highest bootstrap error on this interval and add its data, lowering the relative error to less than 1% (b). Note that both the low- and high-density intervals now more fully satisfy the visual consistency check, validating our choice of selecting a new point in the low-density region. Finally, in (c) we select another point on the lowest-density interval, further improving visual consistency. The end result is within 0.01% error of all reference points (thick black dashed curve), but utilizes only half of the data. If, through intuition or reasoning, we had initially selected the point at the maximum of the polynomial predicted by the two most extreme density points (rather than the point with the maximum bootstrapped error), the relative error would drop to less than 1% with only 3 points, using only 30% of the original data.
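For reference, the per-molecule tetrahedral order parameter being interpolated here can be sketched in the standard Errington–Debenedetti normalization of the Chau–Hardwick measure (assumed, not confirmed, to match the paper's convention):

```python
import numpy as np
from itertools import combinations

def tetrahedral_q(center, four_neighbors):
    """q = 1 - (3/8) * sum over the 6 neighbor-pair angles of (cos(theta) + 1/3)^2,
    computed from a molecule's 4 nearest neighbors. q = 1 for a perfect
    tetrahedral arrangement; the mean of q decreases as structure is lost.
    """
    vecs = np.asarray(four_neighbors, float) - np.asarray(center, float)
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)  # unit bond vectors
    q = 1.0
    for j, k in combinations(range(4), 2):  # the 6 distinct neighbor pairs
        q -= 3.0 / 8.0 * (np.dot(vecs[j], vecs[k]) + 1.0 / 3.0) ** 2
    return q

# perfect tetrahedron: every pair angle has cosine -1/3, so q = 1 exactly
tet = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
assert abs(tetrahedral_q((0, 0, 0), tet) - 1.0) < 1e-12
```

The density-dependent quantity in the figure is the ensemble average of q over molecules and snapshots.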


Figure S14: Each panel displays the derivative of the piecewise interpolating polynomial described in the corresponding panel of Fig. S13 (light blue curves). By design, these curves pass exactly through the sets of points (vertical lines in Fig. S13) at which the derivative of the tetrahedral order parameter with respect to density is estimated using Eq. 27 (blue squares). As a comparison, finite-difference estimates of the derivatives are shown as black circles, using central differences accounting for unequally spaced data for internal points and forward/backward differences for the leftmost/rightmost data points. It is clear that the curvature increases sharply at low density, with derivatives changing rapidly below 0.95 g/cm3. Due to these rapid variations, estimates from Eq. 27 and finite differences begin to disagree. In particular, the lowest-density finite-difference value is determined based on information at higher densities where the derivative is not so steep, leading to an underestimate. Note that the derivatives of the interpolations in Fig. S13 are not the best polynomial fits to solely the derivative data, but also include information concerning the observable values themselves.
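The finite-difference comparison described above can be sketched generically; this is the standard three-point stencil for nonuniform grids (not the authors' code), with one-sided differences at the ends:

```python
import numpy as np

def gradient_unequal(y, x):
    """First derivative on an unequally spaced grid: second-order three-point
    central differences for interior points, first-order forward/backward
    differences at the two endpoints.
    """
    y = np.asarray(y, float)
    x = np.asarray(x, float)
    d = np.empty_like(y)
    h1 = x[1:-1] - x[:-2]   # spacing to the left neighbor
    h2 = x[2:] - x[1:-1]    # spacing to the right neighbor
    d[1:-1] = (-h2 / (h1 * (h1 + h2)) * y[:-2]
               + (h2 - h1) / (h1 * h2) * y[1:-1]
               + h1 / (h2 * (h1 + h2)) * y[2:])
    d[0] = (y[1] - y[0]) / (x[1] - x[0])       # forward difference
    d[-1] = (y[-1] - y[-2]) / (x[-1] - x[-2])  # backward difference
    return d

# the interior stencil is exact for quadratics, even on an uneven grid
x = np.array([0.0, 0.1, 0.35, 0.5, 1.0])
assert np.allclose(gradient_unequal(x**2, x)[1:-1], 2.0 * x[1:-1])
```

NumPy's `np.gradient` applies a comparable second-order nonuniform stencil when passed the coordinate array; the endpoint bias discussed in the caption comes from the one-sided differences, which lean on data from one direction only.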
