molecular interpretation of preferential interactions in...

15
Molecular Interpretation of Preferential Interactions in Protein Solvation: A Solvent-Shell Perspective by Means of Minimum- Distance Distribution Functions Leandro Martínez* ,and Seishi Shimizu Institute of Chemistry and Center for Computational Engineering & Science, University of Campinas, Campinas, Sã o Paulo 13083-970, Brazil York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, York YO10 5DD, U.K. * S Supporting Information ABSTRACT: Preferential solvation is a fundamental param- eter for the interpretation of solubility and solute structural stability. The molecular basis for solutesolvent interactions can be obtained through distribution functions, and the thermodynamic connection to experimental data depends on the computation of distribution integrals, specically Kirk- wood-Buintegrals for the determination of preferential interactions. Standard radial distribution functions, however, are not convenient for the study of the solvation of complex, nonspherical solutes, as proteins. Here we show that minimum-distance distribution functions can be used to compute KB integrals while at the same time providing an insightful view of solutesolvent interactions at the molecular level. We compute preferential solvation parameters for Ribonuclease T1 in aqueous solutions of urea and trimethylamine N-oxide (TMAO) and show that, while macroscopic solvation shows that urea is preferentially bound to the protein surface and TMAO is preferentially excluded, both display specic density augmentations at the protein surface in dilute solutions. Therefore, direct protein-osmolyte interactions can play a role in the stability and activity of the protein even for preferentially hydrated systems. The generality of the distribution function and its natural connection to thermodynamic data suggest that it will be useful in general for the study of solvation in mixtures of structurally complex solutes and solvents. 1. INTRODUCTION Biomolecules perform their function in an environment comprised mainly of water molecules. In addition to water, there are many other small- and macromolecules that inuence the equilibrium and kinetics of binding, folding, and conforma- tional changes. 1 Especially important are the osmolytes, which are small, organic metabolites such as sugars and polyols, used by organisms to protect biomolecules under denaturation stress (due to heat, cold, pressure, or chemical denaturants). 2,3 For an elucidation of biomolecular stability and function on a molecular scale, and toward the rational design and exploitation of enzymes, it is indispensable to understand how biomolecules in water interact with the third component. 4,5 (Throughout this paper, the third component is referred to as the cosolvent, which encompasses both protein stabilizers - osmolytes - and denaturants, both organic and inorganic, and both small- and macromolecules.) The modulation of protein stability by cosolvents can be quantied via preferential solvation, i.e., the competition between proteinwater and proteincosolvent interactions for native and denatured states, which is accessible experimentally by membrane dialysis, scattering, and analytical ultracentrifugation. 6 Such experiments, when complemented by volumetric measurements, 7 can determine both the proteinwater and proteincosolvent interactions dened rigorously via statistical thermodynamics. 811 Such interactionsare now dened rigorously in terms of the net excess or decit of water and cosolvent concentrations from the bulk, referred to as the Kirkwood-Buintegrals (KBI). 811 KBIs serve not only as powerful tool for rationalizing and explaining experimental data on a variety of cosolvent-controlled eects (from biophysics, formulation science, pharmacy to food science) 5,1215 but also as a benchmark for simulation and force eld determination. 4,16 KBIs are usually dened as the integrals of the solutesolvent distribution functions, most commonly the radial distribution functions (RDFs). Radial distribution functions are not convenient for the interpretation of the solvation of structurally complex solutes. This is because they are computed from the distances between the centers of masses, or specic atoms, of the solute and solvent molecules, 17 being highly dependent on the solvent and solute shapes. Thus, the necessity of more general distribution functions for the analysis of solutes of complex shapes has been recognized fre- Received: June 8, 2017 Published: November 9, 2017 Article pubs.acs.org/JCTC © 2017 American Chemical Society 6358 DOI: 10.1021/acs.jctc.7b00599 J. Chem. Theory Comput. 2017, 13, 63586372

Upload: others

Post on 18-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

Molecular Interpretation of Preferential Interactions in ProteinSolvation: A Solvent-Shell Perspective by Means of Minimum-Distance Distribution FunctionsLeandro Martínez*,† and Seishi Shimizu‡

†Institute of Chemistry and Center for Computational Engineering & Science, University of Campinas, Campinas, Sao Paulo13083-970, Brazil‡York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington, York YO10 5DD, U.K.

*S Supporting Information

ABSTRACT: Preferential solvation is a fundamental param-eter for the interpretation of solubility and solute structuralstability. The molecular basis for solute−solvent interactionscan be obtained through distribution functions, and thethermodynamic connection to experimental data depends onthe computation of distribution integrals, specifically Kirk-wood-Buff integrals for the determination of preferentialinteractions. Standard radial distribution functions, however,are not convenient for the study of the solvation of complex,nonspherical solutes, as proteins. Here we show thatminimum-distance distribution functions can be used to compute KB integrals while at the same time providing an insightfulview of solute−solvent interactions at the molecular level. We compute preferential solvation parameters for Ribonuclease T1 inaqueous solutions of urea and trimethylamine N-oxide (TMAO) and show that, while macroscopic solvation shows that urea ispreferentially bound to the protein surface and TMAO is preferentially excluded, both display specific density augmentations atthe protein surface in dilute solutions. Therefore, direct protein-osmolyte interactions can play a role in the stability and activityof the protein even for preferentially hydrated systems. The generality of the distribution function and its natural connection tothermodynamic data suggest that it will be useful in general for the study of solvation in mixtures of structurally complex solutesand solvents.

1. INTRODUCTION

Biomolecules perform their function in an environmentcomprised mainly of water molecules. In addition to water,there are many other small- and macromolecules that influencethe equilibrium and kinetics of binding, folding, and conforma-tional changes.1 Especially important are the osmolytes, whichare small, organic metabolites such as sugars and polyols, usedby organisms to protect biomolecules under denaturation stress(due to heat, cold, pressure, or chemical denaturants).2,3 For anelucidation of biomolecular stability and function on amolecular scale, and toward the rational design and exploitationof enzymes, it is indispensable to understand how biomoleculesin water interact with the third component.4,5 (Throughout thispaper, the third component is referred to as the cosolvent,which encompasses both protein stabilizers - osmolytes - anddenaturants, both organic and inorganic, and both small- andmacromolecules.)The modulation of protein stability by cosolvents can be

quantified via preferential solvation, i.e., the competitionbetween protein−water and protein−cosolvent interactionsfor native and denatured states, which is accessibleexperimentally by membrane dialysis, scattering, and analyticalultracentrifugation.6 Such experiments, when complemented by

volumetric measurements,7 can determine both the protein−water and protein−cosolvent interactions defined rigorously viastatistical thermodynamics.8−11 Such “interactions” are nowdefined rigorously in terms of the net excess or deficit of waterand cosolvent concentrations from the bulk, referred to as theKirkwood-Buff integrals (KBI).8−11 KBIs serve not only aspowerful tool for rationalizing and explaining experimental dataon a variety of cosolvent-controlled effects (from biophysics,formulation science, pharmacy to food science)5,12−15 but alsoas a benchmark for simulation and force field determination.4,16

KBIs are usually defined as the integrals of the solute−solvent distribution functions, most commonly the radialdistribution functions (RDFs). Radial distribution functionsare not convenient for the interpretation of the solvation ofstructurally complex solutes. This is because they are computedfrom the distances between the centers of masses, or specificatoms, of the solute and solvent molecules,17 being highlydependent on the solvent and solute shapes. Thus, thenecessity of more general distribution functions for the analysisof solutes of complex shapes has been recognized fre-

Received: June 8, 2017Published: November 9, 2017

Article

pubs.acs.org/JCTC

© 2017 American Chemical Society 6358 DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

Page 2: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

quently.18−21 In particular, the use of the distance of onesolvent site (an atom or the center of mass) of the solventmolecule to the surface of the solute, or to the nearest soluteatom, was proposed independently by different authors as analternative to overcome the complexity of the soluteshape.18−20,22−27 This choice defines what has been called the“solvation-shell” distribution functions, gss(r),

23,24 or proximaldistribution functions, g⊥(r),

18,22,26,27 which appeal directly tothe concept of Voronoi tesselation.21,28 In all cases, thecounting of nearest distances is straightforward from asimulation, but the normalization procedure leading to thedistribution functions can be cumbersome.20 When usingVoronoi tesselation, the normalization might depend on theestimation of the volumes in space associated with eachreference atom or site.21,26 The normalization of the solvation-shell distribution function, on the other hand, has employed arandom distribution of solvent molecules.23,25,29

Independent computation of the KBIs for each solventcomponent and its comparison with a distribution function thatreflects the solvent-shell structure are important for under-standing the molecular basis of solvation of complex solutes,biomolecules in particular. We demonstrate that these two aimscan be fulfilled simultaneously by adopting an alternativedistribution function, i.e., minimum-distance distributionfunction, which is better suited for characterizing solutionstructure according to the distance between the solute andsolvent surfaces, which is in line with our classical view of thehydration shell of proteins. The resulting distribution functionsare clear to interpret from the point of view of molecularinteractions and can be naturally decomposed into thecontributions of each solvent atom.Despite the powerfulness of KBIs as a bridge between the

microscopic solution structure and macroscopic thermody-namics, KBI determination from simulation usually requires avery large simulation box to account for the long-rangeddeviation of the radial distribution function (RDF) from 1 (i.e.,the long-ranged deviation of the local concentration from thebulk). To put it precisely, small nonzero g(r) − 1 at large r ismultiplied by the volume element, 4πr2, and contributes greatlyto KBIs. Therefore, the use of RDFs as a route for calculationand interpretation of KBIs have been questioned,21 and anumber of proposals have been made to overcome this problemcomputationally, in order to facilitate the calculation of KBIs viaRDFs. Because RDFs rely usually on the distance between thecenters of mass, nonspherical molecules require longerdistances for the RDF to converge to 1 due simply to theirshape.21 For ternary systems (solute, solvent, and cosolvent) itis possible from KBIs to compute preferential interactionparameters, which are a measure of the excess number ofcosolvent molecules in the domain of the solute and can bedetermined experimentally.4,19,30,31 Interestingly, it is easier tocompute the preferential interaction parameter than KBIs.4,32,33

This is because the preferential interaction parameter can becomputed from the difference in the number of solvent andcosolvent molecules in the “solute domain”, this being thevolume in space for which the solute affects effectively thestructure of the solvent.4,19,34 The computation of the KBIs foreach solvent component depends on the number of moleculesof that component on the protein domain relative to areference state, the pure solvent with the same density. Thecalculation of the number of molecules of the solvent in thesolute domain depends, therefore, on the volume of the proteindomain, which must be independently determined.

Using the minimum-distance framework, we study thepreferential interactions of a model protein Ribonuclease T1(RNaseT1) by urea and TMAO (trimethylamine N-oxide),which are known experimentally to be preferentially attractedand excluded from the vicinities of the protein.31 We show thatsimulations reproduce the experimental preferential solvationparameters but that the molecular interpretation for thepreferential exclusion is counterintuitive at first sight. Indeed,both TMAO and urea exhibit local density augmentation at theprotein surface, particularly at low concentrations. The twocosolvents, however, differ in their relative accumulation on theprotein surface relative to that of water, explaining the overallpreferential hydration or dehydration observed experimentally.The detailed interpretation of the density augmentation of thecosolvent molecules on the protein surface is beyond the reachof conventional RDFs. The generality of the approachpresented here suggests that it can be useful for the study ofsolvation in mixtures of solutes and solvents of complex shapes,for which the definition of distribution functions to representsolvation interactions can be cumbersome.

2. THEORY2.1. General Formalism. In this work we will use

thermodynamic functions for the analysis of the solvation ofcomplex solutes using minimum-distance computations.Throughout the paper, we will use the following notation:subscript u to refer to the solute (protein), s to refer to any ofthe solvent components (water or cosolvent), w to refer towater, and c to refer to the cosolvent, either urea or TMAO.Let nus(r) be ensemble average number density (which can

be converted to molarity by multiplying by Avogadro’snumber) of solvent atoms which are minimum-distanceatoms to the solute at r. We define nus*(r) as the number ofminimum distances that would be observed at r if there were nosolute−solvent interactions. Note that nus*(r) represents theminimum-distance distribution in the presence of the solute butin the absence of solute−solvent interactions. Such a definitionrequires the presence of a “phantom” solute, which has noattractive nor repulsive interaction with the solvent molecules,but is present to define the position in the solution in the sameway as when the solute−solvent interactions are present.The ratio between nus(r) and nus*(r) is the variation in density

associated with the insertion of the solute, and we will call itgusmd(r)

≡ *g rn rn r

( )( )( )us

md us

us (1)

gusmd(r) has a simple physical interpretation: − RT ln gus

md(r)represents the change in the potential of mean force at theposition r when the solute−solvent interactions are switchedon. If the solvent has a single site, or if any single reference(center of mass or any atom) of the solvent is consideredinstead of the minimum distance, gus

md(r) reduces to a standarddistribution function, because nus(r) becomes the density ofsolvent molecules at r, and nus*(r) turns out to be constant andequal to the bulk density of the solvent.The minimum-distance densities nus(r) and nus*(r) differ from

the densities of the solvent, because they are associated with thecounting of any atom of the solvent molecule at each distance.For example, in the absence of solute−solvent interactions, thedensity of atoms of the solvent at short distances is associatedwith the probability of finding any atom of the solvent at that

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6359

Page 3: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

distance, which corresponds to the atomic density, not themolecular density of the solvent. The difference between thetwo densities will be discussed in further detail in Section 2.2.The Kirkwood-Buff integral can be obtained from nus(r) and

nus*(r). To this end, let us define an integral up to a finitemaximum distance R from the solute, as

∫ρ= − *G R n r n r S r r( )

1[ ( ) ( )] ( ) d

R

uss 0

us us

where S(r) is the surface defined by the minimum-distance r toany solute atom. By integration, we obtain

ρ= − *G R N R N R( )

1[ ( ) ( )]us

sus us

(2)

where Nus(R) and Nus*(R) are, in this case, the number ofsolvent molecules with at least an atom within R of the solute,in the presence or not of solute−solvent interactions, and ρs isthe bulk density of this solvent component. The volumedefined by distance R to the solute is, of course, dependent onthe shape of the solute. These equations are generalizations ofthe standard equation to compute KB integrals from thestandard radial distribution function, for which we have S(r) =4πr2 and nus*(r) = ρs for all r, and are needed here because of thedependence of the minimum-distance densities on the shape ofthe solute and solvent molecules.A brief justification concerning the convergence of Gus(R)

defined in eq 2 to the actual KBI at the limit of R → ∞ can bemade in the following way. The equivalence of this definition ofGus(R) at large R to the standard definition of KBIs requiresthat4,35

⎯ →⎯⎯⎯⎯⟨ ⟩

⟨ ⟩⟨ ⟩−

→∞G R V

N NN N

V( )Rus

s u

u s (3)

This can be achieved most simply by noting that for eachmolecule the minimum-distance atom (i. e., the one whichtakes the shortest distance from the solute molecule) isuniquely defined (the configurational space in which there aretwo atoms at exactly the same distance is infinitesimal andhence can be neglected). Hence the Nus*(R) of eq 2 yields thenumber of solvent molecules in the absence of solute−solventinteractions, which is ⟨Ns⟩; hence →

ρ

*VN R( )us

s. That the

ρN R( )us

s

term converges to the first term of eq 3 can be justified by theuse of the inhomogeneous solvation theory, according towhich36

ρ⟨ ⟩

⟨ ⟩⟨ ⟩=

⟨ ⟩⟨ ⟩

=⟨ ⟩

VN N

N NV

NN

Ns u

u s

s u

s

s u

s

where ⟨Ns⟩u is the number of solvent molecules in theinhomogeneous system, i.e., in the presence of the solute at thefixed position in the origin. That the number of solventmolecules can be evaluated by enumerating the minimum-distance atoms again guarantees that Nus(R) converges to⟨Ns⟩u, thereby justifying that the KB integrals can be calculatedusing eq 2.For small R, the use of the minimum-distance counts for

Nus(R) and Nus*(R) associates the distance dependence ofGus(R) with the gus

md(r) function of eq 1, providing theconnection between preferential solvation parameters and themolecular structure of the solvent in the solute solvation shell.

2.2. Interpretation of the gmd(r) Distribution in Termsof Atomic Contributions. Let us consider the case of a singlesolute molecule and a single component of the solution, forsimplicity of notation (the subscript us will be omitted here).Here we define r as the minimum distance between any pointin space and an atom of the solute. The minimum-distancedistribution function gmd(r) is defined in eq 1, as a function ofthe density of minimum distances at distance r, n(r), and thedensity of minimum distances at r in the absence of solute−solvent interactions, n*(r).Given a solvent atom of type i (i = 1,...N, where N is the

number of atoms of the solvent molecule), we define ni(r) asthe number density of atoms i at distance r from the solute,such that ni(r)dV(r) is the number of atoms of type i within rand r + dr. Then, we define the probability of, given that theatom i is in volume element dV(r), this atom being the closestatom to the solute. This probability is dependent nontriviallyon the shape of the surface associated with the volume elementat r and on the shape of the solvent molecule, and we will call itwi(r). The revolutions of a solvent molecule that are associatedwith an atom being the closest atom to the solute are illustratedin Figure 1.

It follows that the contribution of atoms of type i to theminimum-distance density at r will be ni(r)wi(r). The totalminimum-distance density at r is, therefore

∑==

n r n r w r( ) ( ) ( )i

N

i i1 (4)

The normalization of the gmd(r) function of eq 1 depends onthe computation of the density of minimum distances in theabsence of solute−solvent interactions. However, for each r, thevolume element associated with r is dependent on the structureof the solute. The intuitive picture we develop here is that of a“phantom” solute molecule, immersed in a solvent with bulkproperties, which will be useful to define the shape of theminimum-distance surfaces for every r.If the density of the solvent was that of bulk and there were

no solute−solvent interactions, the density of atoms of type i(or any other type) at every distance is simply ni*(r) = ρs, thesolvent molecular density. However, the probability wi*(r) thatthe atom is the closest atom to the solute is dependent on theshape of the surface defined by r and on the structure of thesolvent molecule.In general, the density of minimum distances at r in bulk is

then

∑ρ* = *=

n r w r( ) ( )i

N

is1 (5)

Note that the minimum-distance density is in general largerthan the bulk solvent density. In particular, if r is short enough,

Figure 1. Fraction of revolutions of the solvent molecule (dashed blueline), with rotations centered at atom i that preserves the atom asbeing the closest to the noninteracting solute.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6360

Page 4: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

n*(r) = Nρs, because wi*(r) = 1 for every i. In other words, theminimum-distance density at very short distances is the atomicdensity, not the molecular density of the solvent.This result shows that the minimum-distance density differs

from the bulk solvent density even in the absence of solute−solvent interactions. Each solvent atom contributes differentlyto the minimum-distance summation, according to the wi(r)parameter, which is the probability of it being the closest atomto the solute if it is at distance r. wi(r) is associated not onlywith the orientation of the solvent molecule relative to thesolute but also with the accessibility of each atom to thesolvent, which might change if the solvent intramolecularstructure is perturbed by the solute.From eqs 4 and 5, it follows that the minimum-distance

distribution function can be written as

∑ρ

=∑ *

= =

⎛⎝⎜⎜

⎞⎠⎟⎟g r

n r w r

w r( )

( ) ( )

( )i

Ni i

iN

i

md

1 bulk 1 (6)

which, as expected, is constant and equal to 1 in the absence ofsolute−solvent interactions, where ni(r) = ρbulk for every i (andwi(r) = wi*(r), by definition).This definition provides a useful decomposition of gmd(r) in

terms of atomic contributions. gmd(r) can be considered as thecontributions of each atom of the solvent, with two weights:first, the density of each atom relative to bulk density atdistance r, and second (in parentheses in eq 6), an orientationalparameter associated with atom at r being the closest atom tothe solute. An analysis of the distance dependence of the atomiccontributions is available in Appendix A.1.2.3. Computation of Minimum-Distance Correlation

Functions. The process of obtaining the gusmd(r) (from eq 1)

and the corresponding KB integral (as defined in eq 2) isillustrated in Figure 2. It relies on the construction of twohistograms: 1) the histogram of nus(r), which contains thedensity of minimum-distance sites at a volume element in thevicinity of r in the actual solute−solvent simulation, and 2) thehistogram of the normalization, nus*(r), which must contain thedensity of minimum-distance sites in bulk solvent in the samevolume. The computation of nus(r) is a straightforward sitecounting following directly from the simulation of the system ofinterest. The choice of the method to compute nus*(r), however,deserves further justification.In the absence of solute−solvent interactions (that is, in the

presence of the “phantom” solute molecule), the density of anyspecific atom at r is the bulk density of the solvent, and thus thedensity of minimum distances at r is given by eq 5. It isdependent on the conditional probabilities wi(r), of eachspecific atom being the closest one to the solute if it is found atdistance r. These probabilities are dependent on the solventstructure. For example, if the solvent contains a single atom w1= 1 and hence nus*(r) = ρs, which turns out to be equivalent tothe normalization of standard radial distribution functions.If the solvent has more than one atom, nus*(r) must be

obtained explicitly. The simplest conceptual alternative is toperform a simulation of the pure solvent with bulkconcentration and compute the distribution by the insertionof a “phantom” solute. Nevertheless, we will argue that nus*(r)can be obtained by a numerical procedure which avoids havingto simulate the pure solvent.First, we note that the probability, in bulk, of an atom at a

distance r of the solute being the closest atom depends only onthe geometry of the solvent molecule and on the shape of the

surface defined by r. It consists of the fraction of the revolutionsof the molecule, with atom i as the rotation center, thatpreserves atom i as the closest atom to the solute, as illustratedin Figure 1.If the solvent molecule is rigid, the probabilities wi*(r) of eq 5

can be computed by the numerical integration of the rotationsof a solvent molecule at every distance. Equivalently, asdiscussed in Appendix A.2, they can be computed by simulatinga set of noninteracting solvent molecules - with proper bulkdensity - in the presence of a “phantom solute” or simply bygenerating random positions and rotations for those molecules.These noninteracting solvent configurations could be used tocompute the probabilities wi*(r) to use eq 5, or, moreconveniently, to compute directly the minimum-distancecount required for obtaining nus*(r).If the solvent molecules are not rigid, their conformational

flexibility must be taken into account. The conformationalflexibility in this case is that of the solvent molecule in bulksolvent. Therefore, it can be obtained by simulating the puresolvent or by sampling solvent molecules from the bulk (large-distance) phase of the solute−solvent simulation. Given aproperly sampled set of solvent molecule conformations, therandom solvent molecule distribution can be used to computethe histograms for nus*(r).Therefore, the algorithm we use here for computing nus*(r)

consists of the following:1. Estimation of the bulk density of the solvent:

1.1. Determine by numerical integration the volume of the“solute domain”, Vsolute−domain, defined by a distance toany atom of the solute considered large enough so thatthe solvent displays bulk properties outside this domain.

Figure 2. Illustration of the calculation of the minimum-distancedistributions. (A) A “solute domain” is defined as a region around thesolute outside of which the effect of the solute on the solvent structurecan be neglected for practical purposes. The volume of this region isdetermined by numerical integration. The bulk density of the solvent isthen estimated from the number of solvent molecules outside the“solute domain” and the complementary volume of the box. (B) Arandom distribution of the solvent, with a density equal to the bulkdensity of the simulation, but occupying the whole box, is used tocompute the minimum-distance density without solute−solventinteractions. (C) Histograms are computed for the minimum-distancescounts obtained from the simulation and from the random solventdistribution, to compute the gmd(r) distribution according to eq 1.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6361

Page 5: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

1.2. Compute the volume of the bulk domain by Vbox −Vsolute domain, where Vbox is the volume of the simulationbox.

1.3. Count the number of molecules of the solvent within thesolute domain, to obtain by subtraction from the totalnumber of molecules the number of molecules in thebulk domain.

1.4. Using 1.2 and 1.3, estimate the bulk density of thesolvent component.

2. Generation of a noninteracting solvent box:

2.1. Choose a random molecule from the bulk region of thesimulation.

2.2. Generate a random position and orientation to themolecule within the simulation box and compute theminimum distance to the solute, to add to the histogramof nus*(r).

2.3. Repeat steps 2.1 and 2.2 until Nrand molecules withrandom positions were generated. Nrand must be largeenough to allow a proper sampling, thus it is reasonablethat it is at least of the order of the number of solventmolecules in the simulation.

2.4. Scale the histogram of nus*(r) by ρs Vbox/Nrand, where ρs is

the bulk density of the solvent and Vbox is the totalvolume of the simulation box. That is, adjust thehistogram to the proper bulk density.

With approximations of nus(r) and nus*(r) determined withthe procedures above, the correlation function gus

md(r) can becomputed from eq 1.2.4. KB Integrals and Preferential Interaction Param-

eters. From the estimates of nus(r) and nus*(r) obtained with thestrategy summarized in the previous section, it is possible tocompute the distance-dependent KBI for each of the solventcomponents independently using eq 2. These KBIs can becompared with the corresponding gus

md(r) distributions for themolecular understanding of the interactions that build up themain characteristics of solvent accumulation or depletion fromthe surface of the solute. Additionally, the preferentialinteraction parameters in a ternary solution can be obtainedfrom the KBIs using, for example, the following expression forthe cosolvent4,11

ρΓ ≈ −R G R G R( ) [ ( ) ( )]uc c uc uw (7)

where ρc is the bulk density of the cosolvent component ofinterest, and Guc(R) and Guw(R) are the KBIs for the cosolventand water, respectively. Alternatively, the preferential inter-action parameter can be computed directly from the solventcounts n(r) for each component as described by Baynes andTrout19

ρρ

Γ = −N R N R( ) ( )uc cc

ww

(8)

where Nw(R) and Nc(R) are the number of molecules of waterand the cosolvent within R from the solute, and ρw and ρc arethe corresponding bulk concentrations.The distance R within which the number of molecules is

integrated must be large enough such that the properties of thesolution are those of bulk solvent, and Gus(R) is converged, butminimizing the fluctuations of the Ns(R) counts. Baynes andTrout estimated that the optimal R is the minimum distance forwhich the distributions are converged to one within thenumerical errors.19 The difference between our procedure and

that of Baynes and Trout is that they use R as the distancebetween the center of mass of the solute and the van der Waalssurface of the protein, while we define it as the minimumdistance between any atom of the solute and any atom of thesolvent. They find that 6 Å is adequate to obtain a qualitativeestimate of Γ.19 We use here R = 8 Å, according to the observedconvergence of the gus

md(r) distributions in our systems and theintention to compare the preferential solvation parametersquantitatively with the experimental data. The use of theminimum distance between the solute and the solvent providesa more precise definition of the distribution, and the molecularpicture of the solvation shell arising from this distributionprovides an intuitive representation of solute−solvent inter-actions.

3. MATERIALS AND METHODSSimulations of Ribonuclease T1 (RNaseT1) in solutions ofTMAO and urea were performed as follows. The solutionNMR structure of RNaseT1 (PDB id. 1YGW) was used.37 Itconsists of 34 models, which were used as startingconformations for independent simulations. The models weresolvated using the software Packmol38,39 in pure water, inmixtures of water and urea, and in water and trimethylamine N-oxide (TMAO), with added sodium and chloride ions forsystem neutrality. The TIP3P model was used for water;40

CHARMM3641 parameters were used to simulate the protein.Urea was simulated either with the CHARMM General

Force Field (CGENFF)42,43 or with the same force field butsubstituting partial charges by those developed by Weerasingheand Smith44 to reproduce experimental KB integrals ofsolutions of urea in water (urea KBFF model). In this casewe chose to preserve other nonbonded parameters of theCGENFF model, particularly because the combination rules foratomic radii of the KBFF model are different. The resultsobtained supported this choice, and we will refer to this forcefield as simply KBFF from now on. The details of theparameters used are reported in Supporting Information TableS2.Three different force fields were used to simulate TMAO

molecules: the CGENFF model42,43 and two models developedto reproduce experimental properties of aqueous TMAOsolutions, which are improvements over the popular Kastmodel45 of TMAO. The second model was developed toreproduce the osmotic pressures of TMAO solutions and willbe referred to as the Osmotic model.46 The third model wasmore recently developed by Scheck et al.47 to reproduceexperimental transfer free energies of peptides by varying thenonbonded parameters of the Kast model and will be referredto as the “Optimized Kast” model or simply the OptKastmodel.Simulations were performed with the NAMD software,48

with a 2 fs time step, and were run at 298.15 K and 1 atm, with12 Å van der Waals interaction cutoff and Particle Mesh Ewaldfor evaluation of long-range electrostatic potential. Constanttemperature and pressure were maintained by a Langevinthermostat with a damping coefficient of 5 ps−1 and Langevinbarostat with a piston period of 200 fs and a damping time scaleof 100 fs. The systems’ energies were minimized by 1000 stepsof conjugated gradient method (CG) and equilibrated by 100ps of constant-temperature and constant-pressure (NPT) MDwith the protein fixed, followed by 100 CG steps and 100 psNPT MD with the Cα atoms fixed and finally 500 psunrestrained NPT MD. Production simulations were then run

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6362

Page 6: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

for 10 ns. For each solvent concentration, 34 independentsetups and simulations were performed starting with each of theNMR models, for a total of 340 ns of production MD.Simulations were performed for RNaseT1 in pure water, in ureaconcentrations of 0.50, 1.00, and 2.00 mol × L−1, and forTMAO concentrations of 0.25, 0.50, and 1.00 mol × L−1 toreproduce the experimental conditions reported by Lin andTimasheff.31 Initial volumes were obtained from experimentaldensities of the solvent mixtures.49,50 A total of 5.1 μs ofproductive simulations were performed, using the high-performance computing environment of the Center ofComputational Engineering & Science of the University ofCampinas. The details of the systems simulated are reported in

Supporting Information Table S1. Additional 400 ns simu-lations of systems composed only of one urea or TMAOmolecule in 2000 water molecules were used for the examplesdiscussed in Section 4. These simulations were performed withthe same production protocol as described above.The computation of the minimum-distance distribution

function was implemented as the gmd module of ourMDAnalysis software suite and is available at http://leandro.iqm.unicamp.br/mdanalysis/gmd.51 A solvent molecule wasconsidered to belong to the bulk domain if all of its atoms werefarther from the protein than 8 Å. The distribution functionsand KBIs were observed to effectively converge at thesedistance. The volume associated with a minimum distance of 8

Table 1. Kirkwood-Buff Integrals Computed from the Simulations in L × mol−1 Using the KBFF Urea Model44 and the OsmoticTMAO Model46a

urea trimethylamine N-oxide

C Guw Guc Guw Guc

0 −7.756 ± 0.003 −7.756 ± 0.0030.25 −7.731 ± 0.008 −10.46 ± 0.530.50 −7.919 ± 0.017 −0.55 ± 0.68 −7.683 ± 0.010 −10.81 ± 0.451.00 −8.081 ± 0.016 −0.83 ± 0.38 −7.618 ± 0.013 −10.61 ± 0.282.00 −8.268 ± 0.029 −2.55 ± 0.31

aThe deviations reported are the standard error of the means of the 34 simulations performed for each system. The experimental apparent molarvolume of RnaseT1 in water31 is ϕu

0 = 7.924 L × mol−1 and differs by 2.2% from the predicted value from the simulations (7.756 L × mol−1). Similarresults using the other force fields are provided in Supporting Information Tables S3 and S4.

Figure 3. Comparison of the standard radial distribution function and the minimum-distance distribution function for the study of the solvation of asmall irregular solute, as urea in water (using the KBFF urea model44). (A) Water-urea minimum-distance distribution function compared to theradial distribution function (computed from water oxygen and urea carbon atoms). The distance of the peak of the g(r) distribution, at ∼3.8 Å, doesnot correspond to direct interactions, whereas the peak of the gmd(r) distribution at ∼1.9 Å clearly indicates the presence of hydrogen bonds. (B)Distance-dependence of the KB-integrals computed by integrating the g(r) (eq 9) or using the minimum-distance count (eq 2). The integrals areequivalent for large r, but the use of the minimum-distance count shows that the first solvation shell essentially determines the final Guw(r) andavoids fluctuations associated with the urea molecular volume. The urea model fits exactly the experimental apparent molar volume of urea (ϕu

0).50

(C) The decomposition of the gmd(r) distribution into atomic contributions shows that three types of interactions are important for the hydrationshell of urea: with water donating or accepting hydrogen bonds and secondary interactions with hydrogen atoms at ∼3.0 Å. (D) Illustration of theinteractions determinant for the distribution functions: the interactions of water hydrogen atoms with urea nitrogen atoms (at 3.03 Å in the figure)can be associated with the shoulder of the gmd(r) distribution. (E) and (F) Comparison of two urea force fields. The KBFF model displays lesshydrogen-bond capacity than the CGENFF model, and the difference can be associated with the reduced bonding of urea to water oxygen atoms(panel F), which results from the reduced partial charges of urea hydrogen atoms (Supporting Information Table S2).

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6363

Page 7: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

Å from the RNaseT1 surface contains a number of moleculesequivalent to that of a sphere of about 25 Å of pure water, suchthat the fluctuations expected are smaller than those of aconventional radial distribution function at much smallerdistances. Independent distribution functions were computedfor the simulations performed with each one of the 34 NMRmodels, from frames extracted from the simulations at every 10ps. The fluctuations of the distribution functions within thesimulations were negligible, and the fluctuations of the KBIs arereported as standard errors in Tables 1, S4, and S5 and inFigures 6 and S5. Finally, preferential interaction parameterswere computed using eq 7, with the standard error of the KBintegrals being propagated to Γ.

4. SOLVENT-SHELL STRUCTURE ANDKIRKWOOD-BUFF INTEGRALS OF IRREGULARSOLUTES: PROOF-OF-CONCEPT EXAMPLE

To illustrate the effectiveness of minimum-distance distributionfunctions in representing the solvent-shell structure aroundirregularly shaped solutes, we have performed simulations ofsingle urea and TMAO molecules in pure water. We discusshere the results obtained for urea in water, with similar plots forTMAO available as Supporting Information (see Table S3 andFigure S2).In Figure 3A, we illustrate the difference between the gmd(r)

and the standard radial distribution function, g(r), inrepresenting the water-urea interactions, for the KBFF ureamodel44 solvated by water. The gmd(r) displays a peak at ∼1.9Å, indicative of the formation of the expected hydrogen bonds.In this example, the g(r) was computed between the wateroxygen atom and the urea carbon atom and displays a peak at∼3.8 Å. Obviously, this peak also results from water-ureahydrogen bonds, but the associated distance does not indicatethat because of the references sites for the calculation of theg(r). Naturally, this analysis could be, for a small molecule as

urea, extended for the computation of the g(r) between otherpairs of sites, and a more clear picture of the solvation structurecould be obtained. However, the advantage of the gmd(r)distribution is that it is properly defined for the representationof the solvation structure for solutes of any size and structuralcomplexity.In Figure 3B, we illustrate distance dependence of the KB-

integrals computed from the integration of the radialdistribution function, i.e.

∫π= −G R g r r r( ) 4 [ ( ) 1] dR

us0

2(9)

or from the minimum-distance count, using eq 2. At large r, asexpected, both integrals converge to the same value, validatingthe methods proposed here. The urea and water models in thiscase reproduce precisely the experimental apparent molarvolume of urea.52 The KB integral computed from theminimum-distance count (blue curve) is determined by theexcluded van der Waals volume of urea (which results to be ofthe order of −72 cm3/mol (first dip)) and the accumulation ofwater at the first solvation shell. The same integral computedfrom the g(r) displays oscillations at larger distances, associatedwith the molecular volumes of urea and water, which turn outto perturb the integral up to a distance of 8 to 10 Å. The factthat the integration from the minimum distances accounts forthe molecular volumes at short distances is fundamental for thecomputation of KB integrals for larger solutes. Therefore, notonly the convergence of the KB integral is favored by the use ofthe minimum-distance approach but also the molecularinterpretation of the integral for variable r is clearer than thatobtained from the standard radial distribution functions.The gmd(r) distribution can, additionally, be decomposed

into atomic contributions, as we discussed in Section 2.2. Thisdecomposition is shown in Figure 3C. The peak at ∼1.9 Åresults from the contributions of hydrogen bonds to wateroxygen and hydrogen atoms. Additionally, there is a second

Figure 4.Minimum-distance distribution functions of (A) water in urea solutions, (B) urea, (C) water in TMAO solutions, and (D) TMAO, relativeto RNaseT1, for different cosolvent concentrations. The insets are augmented representations of the highest peak. These plots correspond to the useof the KBFF44 model for urea and the Osmotic46 model for TMAO. Similar plots for other solute models are provided as Supporting Information.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6364

Page 8: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

peak on the hydrogen distribution which explains the shoulderof the complete gmd(r) distribution. All these peaks, beingassociated with minimum distances between water and urea,can clearly be linked to intermolecular configurations, as shownin Figure 3D. The hydrogen bonding is associated with urea-oxygen and urea-hydrogen bonding to water and is expected.Additionally, the second peak of the hydrogen distribution canbe clearly traced down to the interaction of the water hydrogenatoms with the urea nitrogen atoms, which is perpendicular tothe urea molecular plane. The specificity of these interactions ishardly obtainable from the standard radial distributionfunctions, for which the peak at ∼3.8 Å incorporates all typesof contributions.In Figures 3E and F we show that this precise molecular

picture of solvation is useful for the interpretation, for example,of the differences between molecular models used. The gmd(r)and atomic contributions computed from simulations using theKBFF model or the CGENFF model show that the KBFF leadsto a lesser propensity of urea to form hydrogen bonds withwater and that this is particularly associated with hydrogenbonds with water oxygen atoms. This is probably a direct resultof the smallest partial charges of hydrogen atoms in the KBFFmodel relative to the CGENFF model. These smaller chargesalso favor the interactions of the water hydrogen atoms with theurea nitrogen atoms, increasing the second peak of thehydrogen distribution.Finally, at long distances, the atomic contributions of water

converge to ∼0.21 for the oxygen atom and to ∼0.79 for thesum of the hydrogen contributions (Figures 3C and 3F). Thisis expected and is characteristic of the shape of the watermolecule and of the minimum-distance count at distanceswhere the interactions with the solute are negligible. A

theoretical analysis of these contributions is provided inAppendix A.1.

5. RECONCILING PREFERENTIAL INTERACTIONPARAMETERS AND THE SOLVENT-SHELL PICTUREOF PROTEIN SOLVATION

Our discussion will focus in this section on the simulationsperformed with the KBFF44 for urea and with the Osmotic46

model for TMAO, because the experimental preferentialinteraction parameters were better reproduced with thesemodels, as it will be shown. Similar data for the other modelssimulated are provided as Supporting Information (Figures S3,S4, and S5 and Tables S4 and S5).Figure 4 displays the minimum-distance distribution

functions [gusmd(r)] computed for all systems simulated. As

expected, gusmd(r) distributions provide a very clear picture of

direct solute−solvent interactions. For example, Figure 4Adisplays the distribution of water around the protein. A distinctpeak at ∼1.8 Å indicates the formation of hydrogen bonds. Asecond peak at ∼2.7 Å is associated with the second hydrationshell. Water molecules therefore clearly form specific directinteractions with the protein and display increased densitiesrelative to bulk at short distances. The addition of ureadecreases the relative water density at the surface of the protein,indicating that urea competes with water with apparent greateraffinity (although there is a slight increase in the hydrogen-bonding peak of water, as shown in the inset of Figure 4A).Figure 4B shows the guc

md(r) distribution of urea relative to theprotein. The density of urea is 4 to 4.5 times greater athydrogen bonding distances than in bulk. Therefore, urea formsa strongly favorable interaction with the surface of protein.With increasing urea concentration, its local density augmenta-

Figure 5. Decomposition of the gusmd(r) into atomic contributions. Plots (B) and (D) are insets of (A) and (C) focused on the first and second

solvation layers. (A) and (B) Urea: the peak at ∼1.8 Å is determined by hydrogen bonds of the protein with urea hydrogen or oxygen atoms. Ureaonly rarely interacts directly with the protein through its nitrogen or carbon atoms. The most important protein-urea interactions are associated withhydrogen bonds through the hydrogen atoms. (C) and (D) TMAO: hydrogen bonds of the protein with a TMAO oxygen atom are associated withthe shoulder on the gmd(r) distribution at ∼1.7 Å. The most prominent peak of the distribution is, however, determined by hydrophobic interactionsof the protein with TMAO hydrogen atoms, which peak at ∼2.1 Å. The stability of hydrophobic interactions through the hydrogen atoms increaseswith TMAO concentration. The contributions of TMAO nitrogen and carbon atoms for the gmd(r) distributions are negligible.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6365

Page 9: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

tion decreases, indicating that the most urea-affine sites areoccupied.Interestingly, as shown in Figures 4C and D, the distributions

of water and TMAO at the vicinity of the protein displaysimilarities to those of urea: both water and TMAO exhibitaugmented relative densities at short distances. The guw

md(r) ofwater is perturbed by TMAO at hydrogen-bonding distances asit was by urea. The density of water at larger distances appearsto be mostly unaffected by TMAO. TMAO also accumulates onthe protein surface, but the most important interactions occurat ∼2.3 Å. A small shoulder at hydrogen-bonding distances isalso visible in Figure 4D. Contrary to what is observed for urea,however, the relative density of TMAO increases at the highestpeak with increasing TMAO concentration. This means thatthere might be some cooperative effect associated with TMAObinding to the protein surface.Therefore, gus

md(r) correlation functions indicate that allmolecules form specific and favorable interaction with theRNaseT1, leading to the augmentation of their densities in theproximity of the protein surface. This picture may, at firstglance, seem at odds with the common interpretation thatTMAO molecules are excluded from the protein surface. As wewill see, in spite of this molecular picture, there is nocontradiction when the preferential interaction parameters arecomputed, at least for dilute solutions.As discussed in section 2.2, the gus

md(r) can be naturallydecomposed into atomic contributions, allowing for a moredetailed structural interpretation of the solvent interactionswith the protein surface. Figures 5A and B show thecontribution of each type of urea atom to the urea gus

md(r)distribution. Urea hydrogen and oxygen atoms account for theinteractions at short distances, as expected. Minimum-distanceinteractions through the carbon or nitrogen atoms areinfrequent and occur only eventually at the second solvationshell.

The decomposition of the gucmd(r) of TMAO into atomic

contributions, shown in Figure 5C and D, is more interesting,because TMAO is an amphiphilic molecule. The small shoulderat hydrogen bonding distances is almost completely determinedby the oxygen atom. At the same time, the significant densityaugmentation at ∼2.3 Å results solely from methyl-hydrogencontributions, thus being hydrophobic in nature. Thesehydrophobic interactions are slightly strengthened with theincrease of TMAO concentration. The stabilization of hydro-phobic interactions is possibly the underlying cause of theapparent cooperative binding observed in the full guc

md(r)distributions. At the same time, no particular trend can bediscerned on the oxygen minimum-distance distribution withvarying TMAO concentration.Figure 6 and Table 1 display the KB integrals computed for

water, urea, and TMAO, in all systems. The KB integrals forwater are around −8 L × mol−1 in both cases (Figures 6A andC). Therefore, water is overall excluded from the proteindomain. This exclusion results directly from the excludedprotein volume, and it is only slightly compensated by theaugmented density of water at the surface of the protein. Theaddition of urea decreases the water KBIs, thus inducing furtherwater exclusion, while the addition of TMAO slightly increasesthe water KBIs, indicating water accumulation on the proteindomain. Both effects are small relative to the overall KBIs(Table 1). The value of the water−protein KB integral in purewater, −7,756 L × mol−1, agrees within 2.2% with theexperimental apparent molar volume of RnaseT131 of 7.924 L× mol−1. This indicates that that the water−protein interactionsare nicely modeled by the current force fields and provideadditional support for the methods proposed here.Urea KBIs are numerically small, as shown in Figure 6B. This

indicates that urea accumulation on the protein surface almostcompletely compensates the excluded protein volume (whichcontributes negatively to the KBI). The urea KBIs decrease

Figure 6. Kirkwood-Buff integrals of (A) water in urea solutions, (B) urea, (C) water in TMAO solutions, and (D) TMAO, relative to RNaseT1, fordifferent cosolvent concentrations. The simulations using the KBFF44 and Osmotic46 force fields for urea and TMAO were used for these plots. Theexperimental apparent molar volume (ϕu

0) of RNaseT131 in water is indicated by dashed lines in (A) and (C) and agrees within 2.2% with the KBintegral in pure water, indicating that the water and protein force fields are adequate to study water−protein interactions. Similar plots using otherforce fields and numerical values of the integrals are available as Supporting Information.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6366

Page 10: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

with increasing urea concentration, indicating, as thedistribution functions suggested, saturation of the ureainteraction sites on the protein surface.The TMAO KBIs are around −11 L × mol−1 thus smaller

than those of water (Figure 6D). Therefore, TMAO is overallexcluded from the protein domain. The TMAO KB integralsappear to become slightly more negative with increasingTMAO concentration (Table 1 - last column) but withinnumerical errors. The increase in the local density athydrophobic interaction distances (∼2.1 Å - Figure 4D) isreflected in the Guc(r) integrals for up to about 4−5 Å. The firstsolvation shell of TMAO roughly determines the final KBIs butnot to the precision required to distinguish a clearconcentration dependent trend, as with urea.The profiles of the KBIs shown in Figure 6 reveal how the

accumulation or exclusion of the cosolvents is distancedependent. The negative character of most KB integrals isdue to the rapid drop of Gus(r) at short distances and resultsdirectly from the excluded molecular volumes. In all cases, atroughly hydrogen bonding distances (∼1.8 Å) there is reversalof the Gus(r) drop, due to the accumulation of the solvents atthe protein surface, as indicated in the gus

md(r) distributions.Only for urea is this accumulation enough to completelycounteract the effect of the protein excluded volume. At about∼6 Å, all KB integrals are essentially converged. Therefore, thefinal KB integrals are determined mostly by the excludedvolume of the protein and the accumulation of the solvents inthe first solvation shell, characterized by protein−solventminimum distances between ∼1.8 Å and ∼3 Å.Finally, from the KB integrals, it is possible to compute the

preferential interaction parameters to be compared withexperimental data. Figure 7 shows the preferential interactionparameters computed from the simulations as a function ofeach cosolvent concentration, compared with the experimental

parameters obtained by Timasheff.31 The results obtained forall solvent models are shown.The simulations reproduce the qualitative dependence of the

preferential interaction parameters for the osmolytes reasonablywell. Preferential interaction parameters, computed using eq 7,consist of the increase or decrease in the number of solventmolecules (water or osmolyte) on changing the concentrationof the protein.4 Therefore, for example, a negative value of Γindicates that the increase in the concentration of the proteinexcludes solvent molecules.Preferential interaction parameters for water, in urea or

TMAO solutions, are shown in Figures 7A and B. Waterpreferential interaction parameters are deeply negative (Figure7A), indicating there is preferential dehydration induced byurea. By contrast, water preferential interactions are positive inTMAO solutions. Therefore, the protein is preferentiallydehydrated in urea solutions and preferentially hydrated inTMAO solutions. However, experimentally, the preferentialinteractions with water increase with the increase in urea andTMAO concentrations, and the trends are not properlyreproduced by the models, except perhaps for the KBFF ureaforce field.Figure 7C displays the preferential interaction parameters for

urea. It is positive at all concentrations, indicating that urea isaccumulated on the protein domain. The simulation resultsagree almost quantitatively with the experimental data in thiscase, specially when using the KBFF urea model.44 Withincreasing urea concentration, the accumulation of thedenaturant is increased. That is, a greater number of ureamolecules accumulate on the protein domain. This occurs inspite of the fact that the free-energy of urea binding to thesurface of the protein decreases, as indicated by the decrease ofthe guc

md(r) peaks in Figure 4B at short distances. Therefore, thesimulations and the experiments predict urea accumulation onthe protein domain, associated with water exclusion.

Figure 7. Preferential interaction parameters for (A) water in urea solutions, (B) water in TMAO solutions, (C) urea, and (D) TMAO, relative toRNaseT1, as a function of cosolvent concentrations. Experimental values were obtained from Lin and Timasheff,31 with experimental errors notbeing significant for the purposes of this comparison. The error of the simulated data was computed by extrapolation of the standard error of theKBIs.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6367

Page 11: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

As expected, the preferential interaction of TMAO displaysthe opposite trend, being negative at all concentrations (Figure7D) and decreasing with the increase in TMAO concentration.This means that TMAO molecules are excluded from theprotein domain, with qualitative agreement with the exper-imental result.31 Therefore, more TMAO molecules areexcluded from the protein domain with increasing TMAOconcentration. Again, this occurs in spite of the fact that thefree-energy of TMAO to the protein surface now increases withTMAO concentration, as indicated by the augmentation of thegucmd(r) function at short distances (Figure 4D). The agreementof computed preferential interaction parameters with with theexperimental values is quantitatively reasonable for the moredilute solutions. For the 1.0 mol × L−1 solutions the exclusionof TMAO from the protein domain is underestimated by allmodels. Thus, the distribution functions of TMAO can betrusted for the two most dilute solutions, but additionalimprovements of the force fields appear to be necessary tostudy protein-TMAO interactions at higher concentrations.It is important to remark that the preferential accumulation

(for urea) or preferential exclusion (for TMAO) are bothassociated with augmented local densities of the osmolytes atthe surface of the protein, as indicated by minimum-distancedistribution functions (Figure 4). The difference between ureaand TMAO is not qualitative in this sense but associated withthe greater urea accumulation, which is enough to completelycounteract the effect of the excluded protein volume. TheTMAO molecules also accumulate on the protein surface,mostly through hydrophobic interactions.This picture of the interactions of each cosolvent, either an

excluder or a crowder, with the protein surface, provides anadditional level of understanding of protein-osmolyte inter-actions. Generally, protein stabilization is associated withpreferential hydration, and this indicates the exclusion of theosmolyte from the protein surface, as has been identifiedpreviously.31 This picture, although being correct from thepoint of view of the preferential solvation parameters, mightlead to a misleading interpretation of the microscopicinteractions. Actually, both TMAO and urea (and it is likelyto be the case for other osmolytes) do accumulate on theprotein surface and interact therewith, possibly playingadditional roles in the protein stability and activity. Thismicroscopic picture is not contradictory with the observedpreferential interactions at low concentrations, which aredetermined by the relative weights of the excluded solutevolume and solvent accumulation on the overall protein−solvent interactions. The use of minimum-distance distributionfunctions to visualize the solvent densities around the solutesprovides intuitive pictures of these interactions, with easyatomic decomposition for structural interpretation.

6. CONCLUSIONSThe study of solvation of complex solutes, biomolecules inparticular, is fundamental for the understanding of their stabilityin biologically or industrially important media. Distributionfunctions provide the means to connect the microscopic pictureof solvation with experimental data, particularly preferentialsolvation, which is associated with protein stability. However,radial distribution functions are not appropriate for therepresentation of the solvation structure around complexsolutes and neither for the computation of the KBIs, whichconnect the molecular structure to thermodynamic data. Theexperimental observation of the preferential exclusion or

accumulation of cosolvents often lead to the proposition ofthermodynamic models of protein stabilization based on thehypothesis that some molecules avoid the interaction with theprotein surface.53−55

In this work we have shown that with minimum-distancedistribution functions both a molecular picture and theexperimentally determined interaction parameters can beobtained from molecular dynamics simulations. The mini-mum-distance distribution functions are clearly associated withthe solvation shell perspective which is often used intuitively todescribe cosolvent interactions in terms of accumulation orexclusion.55 However, while we confirm that urea, a denaturingosmolyte, accumulates preferentially on the protein surface andTMAO, a stabilizer, is preferentially excluded, both interactwith the protein surface displaying density augmentation atshort distances from the protein. The most determinant anduniversal factor associated with the exclusion of the solventmolecules from the protein domain is the protein molecularvolume. The augmented densities of the solvent molecules atshort distances may or may not be enough to lead to overallsolvent accumulation on the protein domain, as measured bythe Kirkwood-Buff integrals.We conclude, therefore, that the observation of augmented

densities of solvent components around proteins at a molecularlevel can occur even if the species is net-preferentially excludedfrom the protein domain, thereby acting as a stabilizer of theprotein structure. This picture, which can be obtained by theuse of minimum-distance distribution functions, provides anadditional level of understanding of protein−solvent inter-actions and solvent induced stabilization. Finally, thedistribution functions proposed here can be used for thestudy of solvation of solutes of any structural complexitywithout modification.

■ APPENDIX A

A.1. Distance Dependence of Atomic Contributions togmd(r)It is interesting to analyze the atomic contributions to theminimum-distance distribution function at very short and verylong distances. Here we follow the notation used in Section 2.2.Let us examine specific examples. For very short distances,

wi(r) = 1 for every i, thus

∑ρ

==

g rN

n r( short)

1 ( )

i

Nimd

1 bulk

In the presence of solute-solvent interactions ni(r) = 0 for everyi at very short distances because of atomic volumes, thus gmd(r)= 0.Far from the solute, ni(r) = ρs for all i, thus the gmd(r)

reduces to the sum of the orientational contribution of eachatom, which sum up to one

∑ ∑=∑ *

=*

∑ *=

= = = =

g rw r

w r

w r

w r( large)

( )

( )

( )

( )1

i

Ni

iN

i i

Ni

iN

i

md

1 1 1 1

The orientational contribution of each atom to this sum isdependent on the shape of the minimum-distance surface andon the solvent molecule structure. Let us analyze the propertiesof the orientational/conformational parameter w in more detail,at long distances. If the distance between the solute and thesolvent molecule is large enough, the curvature of the

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6368

Page 12: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

minimum-distance surface will be small relative to the size ofsolvent molecule. Let us assume that this surface is flat.The simplest case with some interest is that of a diatomic

molecule, illustrated in Figure 8A. The parameter wi(r) will be,for each atom, the fraction of revolutions for which the otheratom is farther from the solute. According to angles defined inFigure 8A, this fraction is

∫ ∫

∫ ∫

θ ϕ

θ ϕ= =

π

π

π

π π−w r( )

d d

d d

12

0 /2

/2

0 0

2

A more interesting example is that of a triatomic molecule, aswater. It is illustrated in Figure 8B. In this case, thereorientational parameter is dependent on the angle betweenthe two covalent bonds. For the central (oxygen) atom, it is

∫ ∫

∫ ∫

θ ϕ

θ ϕ= ≈π

π

π π

− °

−w r( )d d

d d0.210O

0

(180 104.5)

/2

/2

0 0

2

and for the hydrogen atoms, illustrated in the second panel ofFigure 8B, it is

∫ ∫

∫ ∫

θ ϕ

θ ϕ= ≈π

π

π π

°

−w r( )d d

d d0.395H

0

142.25

/2

/2

0 0

2

These predictions can be numerically verified in the atomicdecomposition water minimum-distance distribution functionsaround urea and TMAO, which are shown in Figures 3, S1, andS2.At large r, ∑i

Nwi = 1, because the molecule is for thiscomputation a convex polyhedron, and the component of eachatom consists of a fraction of the rotation of the polyhedronassociated with each vertex. Many atoms might have zerocontributions for being in the interior of the polyhedron, asillustrated in Figure 8D. The convergence of the gmd(r)distribution to one at long distances corresponds to theconvergence of this sum, as the atomic densities all converge tothe bulk density. For the same reason, at long distances thew(r) converge to the atomic contributions to gmd(r).A.2. Numerical Integration Schemes for the Normalizationof the Minimum-Distance Distribution FunctionHere we will demonstrate that the normalization of thedistribution functions by a random distribution of solventmolecules, considering the appropriate density and intra-molecular flexibility, is equivalent to the normalization bysolvent simulation in the presence of a noninteracting solute atinfinite dilution.First, let us consider distribution functions computed

considering a single reference site at the solvent molecules(as conventional RDFs), usually the center of mass of the

solvent molecule or one of its atoms. Let ρs be the bulk densityof the solvent. Without the solute, the ensemble average densityof the solvent is also ρs everywhere. This is the standardnormalization for distribution functions

ρρ

=g rr

( )( )

s

where ρ(r) is the ensemble average density of solventmolecules at position r. The normalization by ρs can beconsidered equivalently as 1) a constant density equal to that ofbulk; 2) the average density of an ensemble of pure-solventconfigurations around a noninteracting solute; and 3) theaverage density of random solvent configurations with averagebulk density (which is equivalent to an ideal-gas distribution ofthe solvent).Now, we consider the distribution function computed from

the minimum distance between any atom of the solvent andany atom of the solute. Initially, we will consider the case of arigid solvent molecule. Let us suppose that a simulation of thepure solvent was performed. The average density of any specificatom at any position in space is ρs. A noninteracting,“phantom”, solute is added to the solution, with no effect onthe structure of the solvent. Therefore, the atomic densitiesremain unchanged. However, to compute the minimum-distance distribution, we need to know which is the probabilitythat each atom, if found at a given position in space, is theclosest atom to the “phantom” solute, in order to compute thenormalization of the minimum-distance distribution using eq 5.At this point, we note that all configurations of the solvent

which differ only by rotation around the atom of interest arethermodynamically equivalent, because there are no solute-solvent interactions. We illustrate these configurations whichdiffer by rotation in Figure 9. Therefore, the probability that thereference atom is the closest atom to the solute is the same asthe fraction of these rotations that maintain the reference atomas the closest atom. This is dependent, exclusively, on the shape

Figure 8. Orientational parameter at large distances: A) diatomic molecule, B) water molecule with oxygen as the closest atom, C) water moleculewith hydrogen as the closest atom, and D) general molecule as a convex polyhedron.

Figure 9. In the presence of a noninteracting solute, all configurationsof the solvent associated with rotations around a specific atom i areequivalent. The probability of a solvent atom being the closest atom tothe solute is then solely dependent on the shape of the solute and onthe geometry of the solvent molecule.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6369

Page 13: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

of the solute and on the geometry of the individual solventmolecule. In particular, it is not dependent on the structure ofthe solvent.It follows, then, that the minimum-distance density at each

distance from the noninteracting solute can be computed bypurely geometrical parameters of the solute and an individualsolvent molecule, as described for long distances in the previoussection. Explicitly, it can be computed by generating at eachposition r a solvent molecule centered at one of its atoms andcomputing by numerical integration the fraction of config-urations that preserve the reference atom as the closet atom tothe solute (w*(r) in eq 5). The procedure is repeated for eachatom of the solvent molecule at the same position, and theatomic contributions w*(r) are summed up to compute then*(r) minimum-distance density of eq 5.The numerical integration described above is equivalent to a

Monte Carlo integration by generation of many randompositions and rotations for solvent molecules around anoninteracting solute. If the number of random molecules islarge enough, this procedure can be used to approximate thefraction of configurations at each position in space that displaya specific atom closer to the solute. However, the explicitcomputation of the probabilities w*(r) is not necessary in thiscase, since the n*(r) minimum-distance counting can beexplicitly computed, with the only condition that the finaldensity of randomly generated molecules is adjusted to thedesired solvent density, as we describe in the procedure ofSection 2.3.The simulation of the pure solvent and the generation of a

random solvent configuration for the computation of n*(r) candiffer, however, if the solvent molecules are not rigid. In thiscase, the interactions within solvent molecules might alter theirgeometry, thus directly affecting the probability of each atombeing the closest atom at each r. In this case, the ensemble ofthe solvent molecules must reflect the conformationalvariability of the solvent molecules in bulk solvent. It is notpossible to obtain that ensemble without simulating bulksolvent. On the other side, if a bulk solvent simulation isavailable, this problem is solved by randomly picking solventmolecules from this simulation to numerically compute n*(r)by the numerical procedure described. Here, the bulk solventconfigurations are obtained directly from the solvent moleculesat large distances from the solute.Finally, let us remark that the solute conformations that are

considered here are those of the solute in the presence of thesolvent. The normalization is thought to evaluate how theseconformations affect the solvent structure and local density.Thus, the minimum-distance distribution around the solute inthe presence of solute-solvent interactions, n(r), or withoutthem, n*(r), is computed from each solute conformationobtained in the solute-solvent simulations independently, theensemble averages being the final reported result.

■ ASSOCIATED CONTENT

*S Supporting InformationThe Supporting Information is available free of charge on theACS Publications website at DOI: 10.1021/acs.jctc.7b00599.

Compositions of the solvents simulated (Table S1);details of the force fields used (Tables S2 and S3);comparison of radial distribution and minimum dis-tribution functions for urea and TMAO with differentforce fields (Figures S1 and S2); protein−solvent

minimum-distance distribution functions for differentforce fields and their atomic decompositions (Figures S3and S4); KB integrals computed from minimum-distancedistribution functions for all systems (Tables S4 and S5and Figure S5) (PDF)

■ AUTHOR INFORMATIONCorresponding Author*E-mail: [email protected].

ORCIDLeandro Martínez: 0000-0002-6857-1884Seishi Shimizu: 0000-0002-7853-1683FundingThe authors thank FAPESP (Grants: 2010/16947-9, 2013/05475-7, 2013/08293-7) and CNPq (Grant: 470374/2013-6)for financial support.

NotesThe authors declare no competing financial interest.

■ REFERENCES(1) Zhou, H.-X.; Rivas, G.; Minton, A. P. Macromolecular Crowdingand Confinement: Biochemical, Biophysical, and Potential Physio-logical Consequences. Annu. Rev. Biophys. 2008, 37, 375−397.(2) Holthauzen, L. M. F.; Auton, M.; Sinev, M.; Rosgen, J. ProteinStability in the Presence of Cosolutes. Methods Enzymol. 2011, 492,61−125.(3) Canchi, D. R.; García, A. E. Cosolvent Effects on ProteinStability. Annu. Rev. Phys. Chem. 2013, 64, 273−293.(4) Pierce, V.; Kang, M.; Aburi, M.; Weerasinghe, S.; Smith, P. E.Recent Applications of Kirkwood-Buff Theory to Biological Systems.Cell Biochem. Biophys. 2008, 50 (1), 1−22.(5) Shimizu, S.; Stenner, R.; Matubayasi, N. Gastrophysics: StatisticalThermodynamics of Biomolecular Denaturation and Gelation fromthe Kirkwood-Buff Theory towards the Understanding of Tofu. FoodHydrocolloids 2017, 62, 128−139.(6) Timasheff, S. N. Control of Protein Stability and Reactions byWeakly Interacting Cosolvents: The Simplicity of the Complicated.Adv. Protein Chem. 1998, 51, 355−432.(7) Chalikian, T. V. Volumetric Properties of Proteins. Annu. Rev.Biophys. Biomol. Struct. 2003, 32, 207−235.(8) Shimizu, S. Estimating Hydration Changes upon BiomolecularReactions from Osmotic Stress, High Pressure, and PreferentialHydration Experiments. Proc. Natl. Acad. Sci. U. S. A. 2004, 101 (5),1195−1199.(9) Shimizu, S.; Boon, C. L. The Kirkwood-Buff Theory and theEffect of Cosolvents on Biochemical Reactions. J. Chem. Phys. 2004,121 (18), 9147−9155.(10) Smith, P. E. Cosolvent Interactions with Biomolecules: RelatingComputer Simulation Data to Experimental Thermodynamic Data. J.Phys. Chem. B 2004, 108 (48), 18716−18724.(11) Shulgin, I. L.; Ruckenstein, E. A Protein Molecule in anAqueous Mixed Solvent: Fluctuation Theory Outlook. J. Chem. Phys.2005, 123 (5), 054909.(12) Abbott, S.; Booth, J. J.; Shimizu, S. Practical MolecularThermodynamics for Greener Solution Chemistry. Green Chem. 2017,19 (1), 68−75.(13) Booth, J. J.; Omar, M.; Abbott, S.; Shimizu, S. HydrotropeAccumulation around the Drug: The Driving Force for Solubilizationand Minimum Hydrotrope Concentration for Nicotinamide and Urea.Phys. Chem. Chem. Phys. 2015, 17 (12), 8028−8037.(14) Reid, J. E. S. J.; Walker, A. J.; Shimizu, S. Residual Water in IonicLiquids: Clustered or Dissociated? Phys. Chem. Chem. Phys. 2015, 17(22), 14710−14718.(15) Nicol, T. W. J.; Matubayasi, N.; Shimizu, S. Origin of Non-Linearity in Phase Solubility: Solubilisation by Cyclodextrin beyond

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6370

Page 14: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

Stoichiometric Complexation. Phys. Chem. Chem. Phys. 2016, 18 (22),15205−15217.(16) Ploetz, E. A.; Rustenburg, A. S.; Geerke, D. P.; Smith, P. E. ToPolarize or Not to Polarize? Charge-on-Spring versus KBFF Modelsfor Water and Methanol Bulk and Vapor-Liquid Interfacial Mixtures. J.Chem. Theory Comput. 2016, 12 (5), 2373−2387.(17) Allen, M. P.; Tildesley, D. J. Section 2.6: Structural Quantities.In Computer Simulation of Liquids; Oxford University Press: Oxford,UK, 1989.(18) Mehrotra, P. K.; Beveridge, D. L. Structural Analysis ofMolecular Solutions Based on Quasi-Component DistributionFunctions. Application to [H2CO]aq at 25.degree.C. J. Am. Chem.Soc. 1980, 102 (13), 4287−4294.(19) Baynes, B. M.; Trout, B. L. Proteins in Mixed Solvents: AMolecular-Level Perspective. J. Phys. Chem. B 2003, 107 (50), 14058−14067.(20) Coutinho, K.; Rivelino, R.; Georg, H. C.; Canuto, S. TheSequential QM/MMMethod and Its Applications to Solvent Effects inElectronic and Structural Properties of Solutes. In Solvation Effects onMolecules and Biomolecules; Canuto, S., Ed.; Springer: Netherlands,2008; pp 159−189, DOI: 10.1007/978-1-4020-8270-2_7.(21) Zeindlhofer, V.; Khlan, D.; Bica, K.; Schroder, C. ComputationalAnalysis of the Solvation of Coffee Ingredients in Aqueous IonicLiquid Mixtures. RSC Adv. 2017, 7 (6), 3495−3504.(22) Mezei, M. Modified Proximity Criteria for the Analysis of theSolvation of a Polyfunctional Solute. Mol. Simul. 1988, 1 (5), 327−332.(23) Song, W.; Biswas, R.; Maroncelli, M. Intermolecular Interactionsand Local Density Augmentation in Supercritical Solvation: A Surveyof Simulation and Experimental Results. J. Phys. Chem. A 2000, 104(30), 6924−6939.(24) Furlan, A. C.; Favero, F. W.; Rodriguez, J.; Laria, D.; Skaf, M. S.Solvation in Supercritical Fluids. In Solvation Effects on Molecules andBiomolecules; Canuto, S., Ed.; Springer: Netherlands, 2008; pp 433−453, DOI: 10.1007/978-1-4020-8270-2_16.(25) Oliveira, I. P.; Martínez, L. Molecular Basis for CompetitiveSolvation of the Burkholderia Cepacia Lipase by Sorbitol and Urea.Phys. Chem. Chem. Phys. 2016, 18 (31), 21797−21808.(26) Ou, S.-C.; Pettitt, B. M. Solute-Solvent Energetics Based onProximal Distribution Functions. J. Phys. Chem. B 2016, 120 (33),8230−8237.(27) Ou, S.-C.; Drake, J. A.; Pettitt, B. M. Nonpolar Solvation FreeEnergy from Proximal Distribution Functions. J. Phys. Chem. B 2017,121 (15), 3555−3564.(28) Haberler, M.; Schroder, C.; Steinhauser, O. Hydrated IonicLiquids with and without Solute: The Influence of Water Content andProtein Solutes. J. Chem. Theory Comput. 2012, 8 (10), 3911−3928.(29) Patel, N.; Biswas, R.; Maroncelli, M. Solvation and Friction inSupercritical Fluids: Simulation-Experiment Comparisons in DiphenylPolyene/CO2 Systems. J. Phys. Chem. B 2002, 106 (28), 7096−7114.(30) Gekko, K.; Timasheff, S. N. Mechanism of Protein Stabilizationby Glycerol: Preferential Hydration in Glycerol-Water Mixtures.Biochemistry 1981, 20 (16), 4667−4676.(31) Lin, T. Y.; Timasheff, S. N. Why Do Some Organisms Use aUrea-Methylamine Mixture as Osmolyte? Thermodynamic Compen-sation of Urea and Trimethylamine N-Oxide Interactions with Protein.Biochemistry 1994, 33 (42), 12695−12701.(32) Schroer, M. A.; Michalowsky, J.; Fischer, B.; Smiatek, J.; Grubel,G. Stabilizing Effect of TMAO on Globular PNIPAM States:Preferential Attraction Induces Preferential Hydration. Phys. Chem.Chem. Phys. 2016, 18, 31459−31470.(33) Diddens, D.; Lesch, V.; Heuer, A.; Smiatek, J. Aqueous IonicLiquids and Their Influence on Peptide Conformations: Denaturationand Dehydration Mechanisms. Phys. Chem. Chem. Phys. 2017, 19,20430−20440.(34) Smiatek, J. Aqueous Ionic Liquids and Their Effects on ProteinStructures: An Overview on Recent Theoretical and ExperimentalResults. J. Phys.: Condens. Matter 2017, 29, 233001.

(35) Kirkwood, J. G.; Buff, F. P. The Statistical Mechanical Theory ofSolutions. I. J. Chem. Phys. 1951, 19 (6), 774−777.(36) Shimizu, S.; Matubayasi, N. Hydrotropy: Monomer-MicelleEquilibrium and Minimum Hydrotrope Concentration. J. Phys. Chem.B 2014, 118 (35), 10515−10524.(37) Pfeiffer, S.; Karimi-Nejad, Y.; Ruterjans, H. Limits of NMRStructure Determination Using Variable Target Function Calculations:Ribonuclease T1, a Case Study. J. Mol. Biol. 1997, 266 (2), 400−423.(38) Martínez, J. M.; Martínez, L. Packing Optimization forAutomated Generation of Complex System’s Initial Configurationsfor Molecular Dynamics and Docking. J. Comput. Chem. 2003, 24 (7),819−825.(39) Martínez, L.; Andrade, R.; Birgin, E. G.; Martínez, J. M.PACKMOL: A Package for Building Initial Configurations forMolecular Dynamics Simulations. J. Comput. Chem. 2009, 30 (13),2157−2164.(40) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R.W.; Klein, M. L. Comparison of Simple Potential Functions forSimulating Liquid Water. J. Chem. Phys. 1983, 79 (2), 926−935.(41) MacKerell, A. D.; Bashford, D.; Bellott, M.; Dunbrack, R. L.;Evanseck, J. D.; Field, M. J.; Fischer, S.; Gao, J.; Guo, H.; Ha, S.;Joseph-McCarthy, D.; Kuchnir, L.; Kuczera, K.; Lau, F. T.; Mattos, C.;Michnick, S.; Ngo, T.; Nguyen, D. T.; Prodhom, B.; Reiher, W. E.;Roux, B.; Schlenkrich, M.; Smith, J. C.; Stote, R.; Straub, J.; Watanabe,M.; Wiorkiewicz-Kuczera, J.; Yin, D.; Karplus, M. All-Atom EmpiricalPotential for Molecular Modeling and Dynamics Studies of Proteins. J.Phys. Chem. B 1998, 102 (18), 3586−3616.(42) Vanommeslaeghe, K.; Hatcher, E.; Acharya, C.; Kundu, S.;Zhong, S.; Shim, J.; Darian, E.; Guvench, O.; Lopes, P.; Vorobyov, I.;Mackerell, A. D. CHARMM General Force Field: A Force Field forDrug-like Molecules Compatible with the CHARMM All-AtomAdditive Biological Force Fields. J. Comput. Chem. 2009, NA−NA.(43) Yu, W.; He, X.; Vanommeslaeghe, K.; MacKerell, A. D.Extension of the CHARMM General Force Field to Sulfonyl-Containing Compounds and Its Utility in Biomolecular Simulations.J. Comput. Chem. 2012, 33 (31), 2451−2468.(44) Weerasinghe, S.; Smith, P. E. A Kirkwood-Buff Derived ForceField for Mixtures of Urea and Water. J. Phys. Chem. B 2003, 107 (16),3891−3898.(45) Kast, K. M.; Brickmann, J.; Kast, S. M.; Berry, S. Binary Phasesof Aliphatic N-Oxides and Water: Force Field Development andMolecular Dynamics Simulation. J. Phys. Chem. A 2003, 107 (27),5342−5351.(46) Canchi, D. R.; Jayasimha, P.; Rau, D. C.; Makhatadze, G. I.;Garcia, A. E. Molecular Mechanism for the Preferential Exclusion ofTMAO from Protein Surfaces. J. Phys. Chem. B 2012, 116 (40),12095−12104.(47) Schneck, E.; Horinek, D.; Netz, R. R. Insight into the MolecularMechanisms of Protein Stabilizing Osmolytes from Global Force-FieldVariations. J. Phys. Chem. B 2013, 117 (28), 8310−8321.(48) Phillips, J. C.; Braun, R.; Wang, W.; Gumbart, J.; Tajkhorshid,E.; Villa, E.; Chipot, C.; Skeel, R. D.; Kale, L.; Schulten, K. ScalableMolecular Dynamics with NAMD. J. Comput. Chem. 2005, 26 (16),1781−1802.(49) Makarov, D. M.; Egorov, G. I.; Kolker, A. M. Density andVolumetric Properties of Aqueous Solutions of Trimethylamine N-Oxide in the Temperature Range from (278.15 to 323.15) K and atPressures up to 100 MPa. J. Chem. Eng. Data 2015, 60 (5), 1291−1299.(50) Egan, E. P.; Luff, B. B. Heat of Solution, Heat Capacity, andDensity of Aqueous Urea Solutions at 25°C. J. Chem. Eng. Data 1966,11 (2), 192−194.(51) Martínez, L. MDAnalysis. Version 17.224; Institute of Chemistry- University of Campinas: 2017. http://leandro.iqm.unicamp.br/mdanalysis (accessed Nov 14, 2017).(52) Hamilton, D.; Stokes, R. H. Apparent Molar Volumes of Urea inSeveral Solvents as Functions of Temperature and Concentration. J.Solution Chem. 1972, 1, 213−221.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6371

Page 15: Molecular Interpretation of Preferential Interactions in ...leandro.iqm.unicamp.br/leandro/shtml/publications/2017_jctc.pdf · solvation in mixtures of solutes and solvents of complex

(53) Kumar, A.; Attri, P.; Venkatesu, P. Effect of Polyols on theNative Structure of α-Chymotrypsin: A Comparable Study.Thermochim. Acta 2012, 536, 55−62.(54) Arakawa, T.; Ejima, D.; Kita, Y.; Tsumoto, K. Small MoleculePharmacological Chaperones: From Thermodynamic Stabilization toPharmaceutical Drugs. Biochim. Biophys. Acta, Proteins Proteomics 2006,1764 (11), 1677−1687.(55) Yancey, P. H.; Siebenaller, J. F. Co-Evolution of Proteins andSolutions: Protein Adaptation versus Cytoprotective Micromoleculesand Their Roles in Marine Organisms. J. Exp. Biol. 2015, 218 (12),1880−1896.

Journal of Chemical Theory and Computation Article

DOI: 10.1021/acs.jctc.7b00599J. Chem. Theory Comput. 2017, 13, 6358−6372

6372