Comparison of a QM/MM Force Field and Molecular Mechanics Force Fields in Simulations of Alanine and Glycine “Dipeptides” (Ace-Ala-Nme and Ace-Gly-Nme) in Water in Relation to the Problem of Modeling the Unfolded Peptide Backbone in Solution

Hao Hu,1 Marcus Elstner,2 and Jan Hermans1*

1Department of Biochemistry and Biophysics, School of Medicine, University of North Carolina, Chapel Hill, North Carolina
2Department of Theoretical Physics, University of Paderborn, Paderborn, Germany

ABSTRACT We compare the conformational distributions of Ace-Ala-Nme and Ace-Gly-Nme sampled in long simulations with several molecular mechanics (MM) force fields and with a fast combined quantum mechanics/molecular mechanics (QM/MM) force field, in which the solute’s intramolecular energy and forces are calculated with the self-consistent charge density functional tight binding method (SCCDFTB), and the solvent is represented by either one of the well-known SPC and TIP3P models. All MM force fields give two main states for Ace-Ala-Nme, α and β, separated by free energy barriers, but the ratio in which these are sampled varies by a factor of 30, from a high in favor of β of 6 to a low of 1/5. The frequency of transitions between states is particularly low with the amber and charmm force fields, for which the distributions are noticeably narrower, and the energy barriers between states higher. The lower of the two barriers lies between α and β at values of ψ near 0 for all MM simulations except for charmm22. The results of the QM/MM simulations vary less with the choice of MM force field; the ratio β/α varies between 1.5 and 2.2, the easy pass lies at ψ near 0, and transitions between states are more frequent than for amber and charmm, but less frequent than for cedar. For Ace-Gly-Nme, all force fields locate a diffuse stable region around φ = π and ψ = π, whereas the amber force field gives two additional densely sampled states near φ = ±100° and ψ = 0, which are also found with the QM/MM force field. For both solutes, the distribution from the QM/MM simulation shows greater similarity with the distribution in high-resolution protein structures than is the case for any of the MM simulations. Proteins 2003;50:451–463. © 2003 Wiley-Liss, Inc.

Key words: dynamics simulation; alanine dipeptide; glycine dipeptide; QM/MM model; peptide backbone; solution conformation; Ramachandran plot

INTRODUCTION

At present, simulations with molecular mechanics (MM) force fields, such as amber,1 charmm,2 gromos,3 and opls,4 offer a comprehensive approach to modeling biological macromolecules in atomic detail over time spans that are commensurate with the relaxation times of these molecules in their native state. Even then, because of limits on available computing power, simulations cannot be performed for sufficiently long times to be able to follow many important processes, such as protein folding and conformation changes of allosteric molecules.

The energetics computed according to a molecular mechanics force field are meant to replace the underlying quantum mechanical (QM) energetics; accordingly, the design (which includes both the form and the values of the associated parameters) of such a force field is often wholly or partly based on a comparison with accurate quantum mechanical calculations, which, perforce, can only be conducted for systems with relatively small numbers of atoms.2,5,6 An alternative (and the oldest) route to determine the best values of force field parameters is to impose agreement between measured physical properties and the results of simulations (e.g., Refs. 7–9). In fact, the theoretical and empirical approaches to force-field development can be effectively combined, with the former able to give more accurate parameters for geometric deformation (bond stretching, angle bending), atomic partial charges, and coefficients for repulsive forces due to atomic overlap, and the latter giving the more accurate estimates of weak long-range attractions (dispersion terms).

Force fields should be used with an understanding of their accuracy, which can be assessed by comparison of simulated and experimental properties of a few well-characterized model systems. Accuracy is then found to vary, depending on the composition of the simulated system as well as on the physical properties of interest.

This article contains Supplementary Materials that can be found at http://www.interscience.wiley.com/jpages/0887-3585/suppmat/2003/50/v50.451.html.

Grant sponsor: National Center for Research Resources, U.S. National Institutes of Health; Grant number: RR08012.

*Correspondence to: Jan Hermans, Department of Biochemistry, University of North Carolina, Chapel Hill, NC 27599-7260. E-mail: [email protected]

Received 3 June 2002; Accepted 19 August 2002

PROTEINS: Structure, Function, and Genetics 50:451–463 (2003)

© 2003 WILEY-LISS, INC.

In the present study, we set out to enquire into the quality of commonly used force fields with respect to the properties of unfolded polypeptide chains in solution, in terms of preferred conformation and conformational statistics. These are of particular interest in connection with the physical properties of unfolded proteins and with the kinetics and equilibria of protein folding. Debated questions about the “structure” remaining in denatured proteins, even in the presence of strong denaturants, can presumably be illuminated by accurate long simulations of peptides in solution (e.g., Ref. 10), whereas a simulation of protein folding (e.g., Ref. 11) can only be successful given an accurate free energy balance between the ensemble of denatured conformations and the folded state.

As a model system to investigate the performance of molecular mechanics force fields in simulating the unstructured polypeptide chain, we have chosen the terminally blocked amino acid, or “dipeptide” model. Advantages of using this model include the ease with which a reasonably precise conformational distribution (in the two backbone dihedral angles φ and ψ) can be obtained simply by extended simulations, the ability to compare the resulting distribution with the distribution of backbone conformation in proteins whose structure is accurately known, and (not least) its familiarity. A disadvantage of using the dipeptide model as a basis is a shortage of experimental data that can report on conformational distributions of flexible molecules in aqueous solution. Although there has long been experimental evidence that the most prevalent conformation of the alanine residue in unfolded polypeptide chains is a “polyproline II” conformation (PII), with φ near −70° and ψ near 140°,12 this has only recently been confirmed with different approaches.13–16 This finding agrees with results of an early simulation study of Ace-Ala-Nme in water17 but less well with a more recent simulation based on a different force field.18 Disagreement between (in the first instance equivalent) models tends to render these irrelevant to the experimentalist.15

Use of accurate quantum-mechanical energies to obtain MM force-field parameters that adequately describe the polypeptide backbone is complicated by the problem that these calculations can be performed only for a limited number of configurations of systems consisting of small numbers of atoms. This would suggest as the model system a molecule such as Ace-Ala-Nme in vacuo. However, as was shown some time ago, the free energy surface of this molecule in vacuo is very different from that in aqueous solution.19 This finding suggests why it is difficult to design an MM energy function that fits the QM energy surface of Ace-Ala-Nme in vacuo and that is at the same time accurate in the highly polar environment of an aqueous solution or a folded protein,1,2,5,6 and especially so when the atomic charges in this MM force field are held fixed.

It appears much preferable to use as a QM model system one in which the highly polar solvent also is represented. Obstacles to working with such a model system are (i) the much larger number of atoms in the model, (ii) the fact that for a given conformation of the solute, an ensemble of conformations of the solvent must be considered, and (iii) the need to obtain adequate statistics to assess the relative importance of different conformation states separated by free energy barriers. For the alanine model, Ace-Ala-Nme, this requires on the order of 6 ns of simulation with molecular dynamics, which perhaps can be reduced to 1 ns with optimal use of computation of potentials of mean force along paths connecting free energy minima. Because this appears not to be feasible with current QM methods, we have resorted to simulation of two of these systems (Ace-Ala-Nme and Ace-Gly-Nme, both in water) with a QM/MM method, in which the solute is treated with quantum mechanics, but the solvent and the solute-solvent interactions are treated with molecular mechanics. For the solvent, we have chosen the (very similar) SPC and TIP3P models,20,21 which are known to adequately represent many properties of liquid water and which, in conjunction with molecular mechanical descriptions of small molecules, have proven able to represent the solvation of these.

The solute is represented with the self-consistent charge density functional tight binding (SCCDFTB) method, a fast approximate quantum-mechanical method.22,23 This method was recently used with good results in a 300-ps simulation of crambin in solution, in which the crambin molecule was treated with the SCCDFTB method, the solute-solvent interactions with the amber force field, and the solvent with the TIP3P water model.24 Another recent study has used the same approach, except that the solute-solvent interactions were computed with the charmm22 force field.25 That article includes simulations of peptide helices in solution that show a realistic tendency toward formation of α-, rather than 3₁₀-helical structure as the molecules become longer.

To better evaluate the contents of this article, it is useful to briefly review different methods of calculating the energy of macromolecular systems. MM methods use an artificial decomposition of the energy into local terms (e.g., energy terms for bond stretching and bond angle bending, local torsional energy terms, and Lennard–Jones and Coulomb pair energies). Use of a QM representation for the solute unifies the energy expression in a single Hamiltonian, requires no a priori information about molecular geometry, and inherently represents complex effects, such as nonadditivity of terms, changes of polarization coupled to changes of geometry, and polarization caused by the local electrostatic field due to intra- and intermolecular interactions. The design and development of a fast approximate QM method, based on judicious approximations that retain these advantages as well as high accuracy, is complex. Approximations introduced in SCCDFTB include explicit treatment of only the valence electrons and use of a minimal basis set of pseudoatomic orbitals. Use of precomputed Hamiltonian matrix elements for pairs of atom types leads to a significant speedup. The charge density is written as a sum of the charge density of neutral atoms (the number of valence electrons) and atomic charge fluctuations (atomic polarization). The Hamiltonian contains a sum of pairwise terms representing the interaction between charge fluctuations, according to an expression that produces the Coulomb energy at large interatomic distance but takes account of exchange-correlation contributions at shorter separation. A double sum of pairwise atom–atom potentials is included in the Hamiltonian, fit to represent the difference between the energy from a high-level DFT calculation and the SCCDFTB electronic energy. The two cited application articles contain more detailed summaries of the method followed in the SCCDFTB calculation.24,25 For a detailed description of the SCCDFTB methodology, see Refs. 22 and 23.
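The three ingredients named in this summary (a reference Hamiltonian for neutral atoms, a pairwise interaction between atomic charge fluctuations, and a fitted double sum of atom–atom potentials) are commonly collected in a single energy expression. The form below is a sketch in generic notation and is not copied from this article; the symbols are assumptions of notation: H^0 is the reference Hamiltonian, Δq_a the charge fluctuation on atom a, γ_ab the pair function that tends to 1/R_ab at large separation, and U_ab the fitted pair potential. Refs. 22 and 23 remain the authoritative description.

```latex
E_{\mathrm{SCCDFTB}}
  = \sum_{i}^{\mathrm{occ}} \langle \psi_i | \hat{H}^{0} | \psi_i \rangle
  + \frac{1}{2} \sum_{a,b} \gamma_{ab}\, \Delta q_a\, \Delta q_b
  + \sum_{a<b} U_{ab}(R_{ab}),
\qquad
\gamma_{ab} \to \frac{1}{R_{ab}} \quad \text{as } R_{ab} \to \infty .
```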

It is not easy to predict the effect of such approximations, and there are few examples of simulations of peptides or proteins in water with QM/MM methods. Therefore, the SCCDFTB method has been tested for various biological model systems, including H-bonded complexes, peptides, and DNA bases.23 Special emphasis was put on the investigation of structures and relative energies of peptides with up to eight amino acid residues in the gas phase. A comparison with results from DFT and MP2 calculations has shown that the SCCDFTB method can reproduce structures and energetics of polypeptides reliably, with an accuracy comparable to that of the higher-level methods.26,27

In addition, vibrational frequencies of Ace-Ala-Nme have been investigated for SCCDFTB in comparison with DFT and MP2, and only slightly larger deviations from experimental values have been reported for SCCDFTB (6.7%) than for DFT (4.4%) and MP2 (3.0%).28 These studies indicate that the SCCDFTB method describes the potential energy surface around the local minima in the gas phase with good accuracy. However, one lacks the insight and experience needed to know whether system properties are well represented far from the local minima or in solution, especially when the solvent is represented by the SPC or TIP3P models with their well-known shortcomings. Therefore, the present study is in the nature of an exploration, in which we ask how this particular QM/MM method describes the system, and then compare the results with those of MM simulations and with available experimental data.

In what follows, we present and compare the distributions for alanine and glycine models obtained with simulations with the SCCDFTB/MM method and with simulations with several MM force fields. Because direct experimental information on the conformation of these molecules in solution is so sparse, we also compare the results of the simulations with distributions of conformations of alanine and glycine residues in the database of high-resolution protein structures as a second, if less direct, source of experimental information.

MATERIALS AND METHODS

Dynamics Simulations

With the exceptions noted below, molecular dynamics simulations were performed with the Sigma program.29 One molecule of Ace-Ala-Nme or Ace-Gly-Nme and 362 water molecules were simulated in a cubic periodic box (with edge slightly >22 Å). Simulations were performed with a multiple timestep scheme,30 with a basic timestep of 2 fs, doubled to 4 fs for nonbonded interactions at separation between 6 and 11 Å, and again doubled to 8 fs for long-range electrostatic forces calculated via particle-mesh Ewald summation.31,32 No significant difference was noticed when the Ewald summation was omitted, and thus Ewald summation was not used in all calculations. In addition, no significant difference was noted when the number of water molecules was increased. The nonbonded pairlist was updated every 32 fs. Pressure was maintained at 1 bar and temperature at 300 K with Berendsen manostat and thermostat (with separate thermostats for solute and solvent) using relaxation times of 0.1 ps.33 Bond lengths were held fixed with the Shake algorithm.34 The dihedral angles φ and ψ were monitored at regular intervals. In each simulation, the choice of water model (TIP3P or SPC) is dictated by the choice of force field.

Simulations were started from extensively equilibrated coordinate sets.

Force Fields

Amber

Simulations with the parm98 amber force field1 were performed starting with amber topology files prepared for us by Drs. Yong Duan (Ace-Ala-Nme) and Lee Bartolotti (Ace-Gly-Nme). The 1–4 interactions were scaled by division by factors of 2.0 (Lennard–Jones energy) and 1.2 (Coulomb energy). The water model was TIP3P.21 The accuracy of Sigma’s implementation of the amber force field was checked by us by comparison with results of a simulation of Ace-Ala-Nme done with the amber program.35
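The 1–4 scaling described here can be illustrated with a minimal Python sketch. Only the two divisors (2.0 and 1.2) come from the text; the 12-6 functional form written in terms of a well depth and a minimum-energy separation, the value of the Coulomb conversion constant, and all function and parameter names are assumptions chosen for illustration and do not reproduce the Sigma or amber code.

```python
# Divisors stated in the text for 1-4 (third-neighbor) pairs.
LJ_14_DIVISOR = 2.0
COULOMB_14_DIVISOR = 1.2

# Assumed conversion constant, kcal*Angstrom/(mol*e^2); a commonly used value.
COULOMB_CONSTANT = 332.06


def lennard_jones(r, epsilon, r_min):
    """12-6 energy (kcal/mol) with well depth epsilon at separation r_min (Angstrom)."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


def coulomb(r, q_i, q_j):
    """Coulomb energy (kcal/mol) of two point charges (in units of e) at distance r."""
    return COULOMB_CONSTANT * q_i * q_j / r


def pair_energy(r, epsilon, r_min, q_i, q_j, is_14=False):
    """Nonbonded pair energy; 1-4 pairs have each term divided by its own factor."""
    e_lj = lennard_jones(r, epsilon, r_min)
    e_coul = coulomb(r, q_i, q_j)
    if is_14:
        e_lj /= LJ_14_DIVISOR
        e_coul /= COULOMB_14_DIVISOR
    return e_lj + e_coul


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    full = pair_energy(3.0, 0.1, 3.5, 0.3, -0.3)
    scaled = pair_energy(3.0, 0.1, 3.5, 0.3, -0.3, is_14=True)
    print(f"full pair: {full:.3f} kcal/mol, 1-4 pair: {scaled:.3f} kcal/mol")
```

Keeping the two divisors as separate constants mirrors the text, in which the Lennard–Jones and Coulomb parts of a 1–4 interaction are reduced by different factors.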

Charmm22

Simulations with the charmm22 force field2 were performed with use of a protein structure file prepared with the X-plor program36 using the charmm22 protein dictionary and the charmm22 all-atom parameter files distributed with that program. The water model was TIP3P. The distribution obtained for Ace-Ala-Nme with the Sigma program agrees with that obtained by Smith.18

Cedar

Simulations with the cedar all-atom force field37,38 were performed by using a protein structure file prepared with the X-plor program and the cedar all-atom topology and parameter files. The water model was SPC.20

Gromos

Simulations with the gromos96 force field were performed with the gromos96 program.3 In this one case, the CH and CH3 groups of the solute were represented as single centers for nonbonded force calculations. The simulations with the gromos force field were performed also in a cubic periodic system but with 2795 (Ace-Ala-Nme) or 1497 water molecules (Ace-Gly-Nme). The water model was SPC. (Larger numbers of water molecules were used in the simulations with gromos and with QM/MM as a result of inadequate communication between the authors.)

OPLS

A simulation of Ace-Ala-Nme with the all-atom opls-aa force field4,6 was performed by starting with a topology file in amber format prepared for us by Dr. Julian Tirado-Rives. The 1–4 interactions were scaled by division by factors of 2.0 (Lennard–Jones energy) and 1.2 (Coulomb energy). The water model was TIP3P.

QM/MM

The QM/MM simulations were performed with the SCCDFTB method,23 with code and data files incorporated into the Sigma program. As was done in an earlier study of crambin from this laboratory,24 the intramolecular forces of the solute were computed with the SCCDFTB method, the solvent-solvent interactions were calculated as appropriate for the (SPC or TIP3P) water model, electrostatic interactions between solvent and solute were evaluated as part of the SCCDFTB calculation, and Lennard–Jones interactions between solvent and solute were computed with molecular mechanics by using nonbonded parameters for water-solute interactions from one of these molecular mechanics force fields: charmm22, amber, or cedar, with the water model (SPC or TIP3P) belonging to that force field. The QM/MM simulations were performed in a cubic periodic system, with 2795 (Ace-Ala-Nme) or 1497 water molecules (Ace-Gly-Nme). The near and far cutoffs were 8 and 12 Å, with timesteps of 1 and 3 fs. For technical reasons, no Ewald summation was applied, and the simulations were run at constant volume of 85,184 Å³ (Ace-Ala-Nme) or 46,656 Å³ (Ace-Gly-Nme), rather than at constant pressure.
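Because the QM/MM systems are cubic, the two constant volumes quoted above fix the box edges, and together with the stated numbers of water molecules they give a rough volume per water molecule. The short Python sketch below does this arithmetic; the function names are illustrative, and dividing the whole box volume by the number of waters ignores the space taken up by the solute, so the second quantity is only an approximate check.

```python
def cube_edge(volume):
    """Edge length of a cube of the given volume (same length unit, cubed)."""
    return volume ** (1.0 / 3.0)


def volume_per_water(volume, n_waters):
    """Box volume divided by the number of water molecules (solute volume ignored)."""
    return volume / n_waters


if __name__ == "__main__":
    # Volumes (cubic Angstrom) and water counts as stated in the text.
    systems = {
        "Ace-Ala-Nme": (85184.0, 2795),
        "Ace-Gly-Nme": (46656.0, 1497),
    }
    for name, (volume, n_waters) in systems.items():
        edge = cube_edge(volume)
        per_water = volume_per_water(volume, n_waters)
        print(f"{name}: edge {edge:.2f} A, {per_water:.1f} A^3 per water")
```

The volumes are exact cubes, corresponding to box edges of 44 Å and 36 Å.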

The study of crambin showed that the application of explicit dispersion forces (1/r⁶ energy terms) within the crambin molecule (which was in its entirety represented with the QM force field) was necessary to retain a native-like crambin structure during the simulation.24 In that study, the dispersion terms were damped at short distance with a switching function.39 Results of simulations of Ace-Ala-Nme in which such damped dispersion terms were applied by using the attractive Lennard–Jones parameters of the same force field that was used for the water-solute interactions (data not shown) depended much more strongly on the choice of MM force field than was the case when these terms were omitted. (We believe that this is due to the use of a switching function, and this is something we are investigating.) Because we find that in the MM simulations the mean dispersion energy of Ace-Ala-Nme is not significantly different for conformations with ψ > 0 and for conformations with ψ < 0, we report here only the results of simulations in which intramolecular dispersion forces have not been applied.

Obtaining Adequate Sampling

Ace-Ala-Nme

With several of the force fields the simulations explore conformation space (as represented by the dihedral angles φ and ψ) in a simulation time of 6 ns. With all force fields, the exploration of conformations with −π < φ < 0 is effective; in that region, the distribution consists of two main clusters, one having −π < ψ < 0, and the other having 0 < ψ < π, and the extent to which the two regions are fairly represented in the sample depends on how many transitions occur between these two regions of conformation space. The frequency of such transitions was analyzed for each of the simulations.

With the charmm22 and amber force fields, no sampling occurred in the region of positive φ (not counting sampling at values of φ close to π, which are part of the two main clusters). Separate simulations were run with these force fields, with the value of φ restrained to lie between 0 and π, and this served to locate a third cluster for both. The relative importance of this cluster was in both cases determined by a potential of mean force calculation in which the value of φ was forced to change by 2π.17

Ace-Gly-Nme

Because this solute is achiral, the probability density is the same for a given (φ, ψ) as it is for the inverted geometry, that is, (−φ, −ψ). The results have been reported without consideration of the symmetry of the distribution, and the symmetry of the figures thus gives a qualitative sense of the extent of convergence of the distribution or lack thereof. As indicated by the approximate symmetry of the sample distribution, all simulations of Ace-Gly-Nme achieved reasonable sampling of the accessible conformation space.
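The symmetry argument for the achiral glycine model suggests a simple numerical check of convergence: histogram the sampled dihedral pairs, histogram the same pairs after inversion through the origin, and measure how much the two histograms differ. The Python sketch below is one way to do this and is not the procedure used in the article; the bin count, the asymmetry measure, and the function names are all assumptions made for illustration.

```python
def histogram(samples, n_bins=12):
    """Count (phi, psi) pairs, in degrees, on a periodic n_bins x n_bins grid."""
    width = 360.0 / n_bins
    counts = [[0] * n_bins for _ in range(n_bins)]
    for phi, psi in samples:
        i = int((phi + 180.0) // width) % n_bins
        j = int((psi + 180.0) // width) % n_bins
        counts[i][j] += 1
    return counts


def asymmetry(samples, n_bins=12):
    """Return 0.0 for an inversion-symmetric sample and 1.0 for no overlap at all."""
    original = histogram(samples, n_bins)
    inverted = histogram([(-phi, -psi) for phi, psi in samples], n_bins)
    difference = sum(
        abs(original[i][j] - inverted[i][j])
        for i in range(n_bins)
        for j in range(n_bins)
    )
    return difference / (2.0 * len(samples))


if __name__ == "__main__":
    # Hypothetical samples: the first two are inversion partners, the third is not.
    samples = [(-80.0, 10.0), (80.0, -10.0), (-170.0, 170.0)]
    print(f"asymmetry = {asymmetry(samples):.2f}")
```

A value near zero is necessary but not sufficient for convergence: a trajectory can sample both halves of the map equally and still undersample a region that is rarely visited in either half.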

Reproducibility

At the suggestion of a reviewer, we make available the bonded and nonbonded parameters as supplementary material. All elements needed to compute energy and forces, with the exception of atomic coordinates, for the simulations of Ace-Ala-Nme and Ace-Gly-Nme with the amber, cedar, charmm, and opls force fields have been collected in an identical, self-documenting format, preceded by an annotated example, and have in this form been made available as part of the supplementary material of this article. (These data can be used as input for the Sigma program.) For the gromos force field we have deposited the gromos topology file for Ace-Ala-Nme, plus a list of differences between the gromos topology file for Ace-Gly-Nme and the former file, with which the latter file can be regenerated, and refer the reader to the extensive documentation of the gromos force field.3

The SCCDFTB tables occupy ≈2 megabytes of storage on a unix system and contain no indications as to the meaning of the contents. The SCCDFTB code and the integral and spline tables for O, N, C, and H are available on request from the second author, Marcus Elstner (E-mail: [email protected]). The tables used in this study are identified as “integral tables for O N C H, version 1, created 5/1999.” The Sigma code with built-in SCCDFTB code but without these tables is available from either one of the other two authors or via the web site //femto.med.unc.edu/SIGMA.

RESULTS

Ace-Ala-Nme

Figures 1–5 show the distributions from simulations of Ace-Ala-Nme in water with five different MM force fields, each for periods exceeding 6 ns. As can be seen, in all cases, conformation states with φ < 0 predominate (we include the “overflow” at values slightly below π in these states), and within this category two states (one with ψ > 0 and the other with ψ < 0) are clearly separated. The simulations with the amber and charmm22 force fields did not at any time sample states with φ > 0; these states occur sparsely in the samples obtained with the other three force fields: cedar, gromos, and opls. Additional simulations with the amber and charmm22 force fields, in which the conformation was prevented from assuming either of the two dominant states, showed a third locally stable conformation state for both amber and charmm22 (indicated with triangles in Figs. 1 and 3). The relative importance of this state was assessed by potential of mean force calculations in which the value of φ was forced through a range of 2π. These indicated that this state is at least 3 kcal/mol higher in free energy and that a barrier of at least 6 kcal/mol must be overcome in a transition from states centered at φ < 0. For the other force fields, states centered at φ > 0 also have higher free energy (by ≈1.5 kcal/mol), but passage to these states from conformation states centered at φ < 0 is easier. The distribution obtained with the charmm22 force field is particularly tight and also is distinct from the others in having the highest population density between the two principal states at values of ψ near −150°, whereas this “easy pass” lies near 30° for the other four force fields. Finally, one notices that with the amber force field, the global maximum of the distribution occurs at ψ < 0, whereas this lies at a value of ψ > 0 for all others. The backbone conformation found in α-helices is close to the location of this maximum, and this leads one to suspect that simulation of oligopeptides with this force field may produce an exaggerated predominance of the α-helix conformation, something that has indeed been shown recently in a number of instances.40 Table I shows the relative sampling of the conformation states separated by the lines drawn (somewhat subjectively) in each of the figures.

Fig. 1. Sampled conformational distribution of Ace-Ala-Nme with the amber force field. The circles and triangles represent results from two independent simulations that have been scaled together with a third simulation in which a potential of mean force was calculated.

Fig. 2. Sampled conformational distribution of Ace-Ala-Nme with the cedar force field.

Fig. 3. Sampled conformational distribution of Ace-Ala-Nme with the charmm22 force field. See legend for Figure 1 for explanation of symbols.

Fig. 4. Sampled conformational distribution of Ace-Ala-Nme with the gromos force field.

The QM/MM simulations were performed with each of three force fields determining the interaction of the MM water model with the solute. Results of these simulations are very similar; a single distribution is shown in Figure 6. With one exception, these samplings represent four different conformation states, with relatively easy passage between states. Table II shows the relative sampling of conformation states in these simulations; as indicated by the lines drawn in Figure 6, we have defined a fifth state, labeled “pass,” close to the internally hydrogen-bonded C7eq conformation (the global minimum in vacuo). The distributions found with the QM/MM simulations more strongly resemble those obtained with the cedar, gromos, and opls force fields. For these and for the QM/MM simulations, the most stable state is centered at φ < 0 and ψ > 0, in distinction to what is obtained with the amber force field, and the easy pass lies at positive ψ, in distinction to what is obtained with the charmm22 force field.

Fig. 5. Sampled conformational distribution of Ace-Ala-Nme with the opls-aa force field.

Fig. 6. Sampled conformational distribution of Ace-Ala-Nme with QM/MM (SCCDFTB/amber). Successive contours enclose 99.8% (purple), 99.5%, 98%, 95%, and 90% (pink) of the data points for alanine residues in the new database.41

Fig. 12. Sampled conformational distribution of Ace-Gly-Nme with QM/MM (SCCDFTB/cedar; 7-ns simulation). Successive contours enclose 99.8% (purple), 99.5%, 98%, 95%, and 90% (pink) of the data points for glycine residues in the new database.41

TABLE I. Sampling of Different Conformation States (Local Minima) of Ace-Ala-Nme in Simulations With MM Force Fields

Force field   Amber   Charmm22   Cedar   Gromos   Opls
beta          0.16    0.50       0.71    0.82     0.86
alpha R       0.84    0.50       0.22    0.13     0.135
alpha L       -       -          0.05    0.04     0.004
state 4       -       -          0.02    0.0005   0.0006

More than any of the MM distributions, the QM/MM distribution resembles that of alanine residues in a database of well-ordered residues in high-resolution X-ray structures, as updated in the accompanying paper by Lovell et al.41 Contours enclosing successively more “favored” regions, encompassing, respectively, 99.8%, 99.5%, 98%, 95%, and 90% of the data points for alanine residues in the database, are included in Figure 6. (Alanine residues in repetitive secondary structure and preproline alanines had been excluded from the contoured data set. These contours were computed as described by Lovell et al.41) One notes in particular that the distributions near (φ = −60°, ψ = 0°) and (φ = 60°, ψ = 0°) scatter around axes that make angles of about 45° with the coordinate axes, as is the case for the distribution in the database, but not for any of the MM simulations.
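The idea of a contour that encloses a stated fraction of the data points can be sketched as follows. This is a simplified stand-in, not the procedure of Lovell et al. (which smooths the density before contouring): the points are binned on a periodic (φ, ψ) grid, the bins are ranked by count, and the contour level for a given fraction is the count of the last bin needed to reach that fraction. The function names and the 10° bin width are assumptions made for illustration.

```python
from collections import Counter


def bin_index(angle_deg, bin_width):
    """Map an angle in degrees onto a periodic bin index covering [-180, 180)."""
    wrapped = (angle_deg + 180.0) % 360.0
    return int(wrapped // bin_width)


def enclosing_levels(points, fractions, bin_width=10.0):
    """For each fraction, return the lowest bin count found inside the
    smallest set of bins that holds at least that fraction of the points."""
    counts = Counter(
        (bin_index(phi, bin_width), bin_index(psi, bin_width))
        for phi, psi in points
    )
    ordered = sorted(counts.values(), reverse=True)
    total = sum(ordered)
    levels = {}
    for frac in fractions:
        running = 0
        for count in ordered:
            running += count
            if running >= frac * total - 1e-9:  # tolerance for rounding
                levels[frac] = count
                break
    return levels


# Toy data: a dense cluster, a smaller cluster, and a few outliers.
toy = [(-70.0, 140.0)] * 90 + [(-60.0, -45.0)] * 8 + [(60.0, 40.0)] * 2
print(enclosing_levels(toy, [0.90, 0.98, 0.998]))
```

A bin lies inside the contour for a given fraction when its count is at least the returned level, so lower levels correspond to the outer contours.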

To assess the extent of convergence of these distributions, cumulative averages were calculated of the fraction of (recorded) instances in which the conformation fell within the limits of the various conformations designated as “beta,” “alpha-R,” and so forth. Results for the SCCDFTB simulation of Figure 6 have been plotted in Figure 7. For each conformation, periods of increasing and decreasing fraction alternate. The number of transitions to conformations with positive values of φ (0 < φ < 130°) is only 7, and the estimated prevalence of 8% for this set of conformations (Table II) obviously has a large relative uncertainty.
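A cumulative average of this kind can be computed in a single pass over the recorded frames. The sketch below assumes that each frame has already been assigned a state label (for instance by testing φ and ψ against the region boundaries drawn in the figures, which are not reproduced here); the function name and the labels are illustrative.

```python
def cumulative_fractions(labels, states):
    """Cumulative average, frame by frame, of the fraction of frames
    spent in each of the given states."""
    counts = {state: 0 for state in states}
    history = {state: [] for state in states}
    for n_frames, label in enumerate(labels, start=1):
        if label in counts:
            counts[label] += 1
        for state in states:
            history[state].append(counts[state] / n_frames)
    return history


# Toy trajectory of state labels (hypothetical data)
trajectory = ["beta", "beta", "alpha R", "beta"]
curves = cumulative_fractions(trajectory, ["beta", "alpha R"])
print(curves["beta"])     # running fraction of frames in the beta state
print(curves["alpha R"])  # running fraction of frames in the alpha R state
```

A curve that still drifts at the end of the run, or that changes in a few large steps, indicates that the corresponding population rests on a small number of visits.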

For the MM force fields, the number of transitions between beta and alpha R regions was established in a similar manner. This number was lowest for the simulation with the amber force field (seven transitions); it was twice as large for charmm22 and higher still for the other three force fields. A second, 40-ns-long simulation with the amber force field showed close to 50 transitions, and the fraction of times at which the alpha-R conformation was sampled was 0.87, versus 0.84 in the first simulation (cf. Table I). In conclusion, the precision of the distributions achieved in these simulations appears adequate for purposes of comparison of the kind made in this article.
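One simple way to count such transitions from a labeled trajectory is sketched below. It is an assumption of this sketch, not necessarily the criterion used for the numbers quoted above, that frames outside the two regions of interest are ignored, so that a transition is counted only when the most recently visited of the two regions changes.

```python
def count_transitions(labels, state_a="beta", state_b="alpha R"):
    """Count switches between two states, ignoring frames with other labels."""
    last_visited = None
    transitions = 0
    for label in labels:
        if label not in (state_a, state_b):
            continue  # intermediate or unrelated frame
        if last_visited is not None and label != last_visited:
            transitions += 1
        last_visited = label
    return transitions


# Toy labeled trajectory (hypothetical data)
frames = ["beta", "beta", "pass", "alpha R", "alpha R", "pass", "alpha R", "beta"]
print(count_transitions(frames))
```

Ignoring the intermediate frames avoids counting an excursion into the pass region that returns to the starting state as two transitions.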

Ace-Gly-Nme

Figures 8–11 show the distributions from simulations of Ace-Gly-Nme in water with four different MM force fields, each for a period exceeding 1 ns. As can be seen, all simulations produce a broadly distributed set of conformations having both φ and ψ near π. In addition, the simulation with the amber force field produces two additional states, with ψ near 0.

The QM/MM sample distribution is given in Figure 12, which also includes contours of the database distribution of Lovell et al.41 for glycine residues, as described above for alanine residues. Although the QM/MM force field and the MM force fields all sample a state centered at φ = ψ = π, the QM distribution also samples a second and third state centered at φ near ±120° and ψ near 0°, including conformations that are only sparsely sampled in simulations with all MM force fields except amber. This is a qualitative difference; remarkably, the distribution of glycine residues in the database of well-ordered residues in high-resolution globular proteins41 shows dense sampling in all three areas. The correspondence between database and simulation results is less marked than for alanine.

DISCUSSION

A Failure of MM Force Fields

The results presented here show that molecular mechanics force fields represent the conformation of unfolded polypeptide chains in aqueous solution with insufficient accuracy. This conclusion follows from the wide spread of results with different force fields, rather than from a direct comparison with experimental data. In fact, for several of the force fields, the most common conformation of the alanine residue in solution is a more or less spread-out region with a relatively high density near the PII conformation (φ near −70° and ψ near 140°), the conformation detected with experimental studies,12–16 but for one force field (amber parm98) the αR conformation, with φ near −60° and ψ near −50°, dominates, and for another (charmm22) the αR state is equally populated. In this context, it is useful to mention that the single-residue model, Ace-Ala-Nme, to a first approximation provides an adequate representation of the backbone of an Ala residue in a modeled unfolded chain composed of several alanine residues; this can be seen by comparing the results of simulations with the same force field (gromos96) for Ace-Ala-Nme (Fig. 4 of this article), for Ala3 (Fig. 2 of Ref. 42), and for Ala8 (Fig. 1 of Ref. 10). Although that may progressively become less true of longer chains and for residues other than alanine, it seems safe to assume that, quite generally, an unfolded polypeptide cannot be modeled correctly with an MM force field that fails the simplest members of the family.

The MM force fields used in this article have been developed by following very similar principles, as described in the Introduction; consequently, unless all agree, all fail to provide guidance in interpretation of experimental studies.15 Of course, the models have been found inadequate only under one specific set of physical conditions (the unfolded backbone in solution), under which the properties obviously are sensitive to small changes in the energy function, and we do not for a moment suggest that the entire body of extant simulations of biological macromolecules with MM force fields has somehow become invalidated. A recent study indicates that different MM force fields behave comparably in tests with a different set of systems.43 Nevertheless, the use of many different force fields having basically the same form, but with different parameters, each by a different set of laboratories, should be considered an embarrassment.

Origin of the Discrepancies

On the energy surface of Ace-Ala-Nme in vacuo, the two conformations of lowest energy are those for which the two highly polar peptide groups form close contacts, with formation of a (distorted) intramolecular hydrogen bond, namely, the C7eq and C7ax conformations, with (φ,ψ), respectively, near (−80°,80°) and (80°,−50°) (e.g., Ref. 6); these are reasonably well reproduced by an MM model in vacuo.19 These conformations are unstable relative to other conformations when these are stabilized by hydrogen bonds, either with other polar groups of the polypeptide (often, other peptide groups), or with water, and they are uncommon in folded proteins. Conversely, conformations that are common in proteins have considerably higher energies in the in vacuo model. The most stable conformations of dipeptides in aqueous solution found by simulation with an MM energy function obtained as a fit to a QM energy surface of Ace-Ala-Nme in vacuo correspond to high-energy conformations of the isolated molecule. It is obviously difficult to design an MM energy function, based on a fit to the QM energy surface of Ace-Ala-Nme in vacuo, that is accurately transferable to the highly polar environment of an aqueous solution or a folded protein. Although a model that is accurate under a wide variety of conditions would appear advantageous, a greater advantage is achieved in practice by having a model that is highly accurate under the conditions of interest.

TABLE II. Sampling of Different Regions in Conformation Space of Ace-Ala-Nme in QM/MM Simulations

Force field   SCCDFTB/cedar   SCCDFTB/amber   SCCDFTB/charmm22
beta          0.61            0.48            0.48
pass          0.12            0.16            0.14
alpha R       0.27            0.27            0.33
alpha L       -               0.07            0.03
state 4       -               0.01            0.01

Fig. 7. Cumulative average of the fraction of time each of four conformations occurs in the simulation of Ace-Ala-Nme with QM/MM of Figure 6.

Fig. 8. Sampled conformational distribution of Ace-Gly-Nme with the amber force field (3.1-ns simulation).

Fig. 9. Sampled conformational distribution of Ace-Gly-Nme with the cedar force field (1.2-ns simulation).

Fig. 10. Sampled conformational distribution of Ace-Gly-Nme with the charmm22 force field (1.1-ns simulation).

Importance of Explicit Torsional Energy Terms

The five MM energy functions have the same form but different parameters. Differences that may be at the root of the observed differences between the free energy surfaces in water are: atomic partial charges of the peptide group, repulsive parameters of the Lennard–Jones potential, and torsion parameters. Equilibrium values and force constants of bond lengths and bond angles and attractive parameters of the Lennard–Jones potential do not differ greatly between force fields, and, also, the two water models (SPC and TIP3P) are very similar. With one exception, we have not attempted to correlate the differences in conformational distributions with differences in parameters. The MM force fields include energy terms, U_θ, that depend explicitly on the value of the dihedral angle, θ (φ or ψ), in the form of one or several Fourier terms,

$$U_\theta = \sum_i \frac{U_{\theta,i}}{2}\left[1 + \cos n_i\left(\theta - \theta_{0,i}\right)\right] \tag{1}$$

and the force fields disagree on the details of these terms. Thus, for ψ, cedar has a single threefold term (n_i = 3) with a small barrier (U_θ,i = 0.2 kcal/mol), whereas opls has a set of three terms with n_i = 1, 2, and 3, and amber a set of three terms with n_i = 1, 2, and 4. The torsional potential for amber favors values of ψ near −60° over values near 180° by several kcal/mol and appears responsible for the high probability of conformations with ψ near −60° for both Ace-Ala-Nme and Ace-Gly-Nme (Figs. 1 and 8). (It would be easy to adjust the constants in U_θ for amber and thereby achieve a different balance between population of conformations. Backbone torsional parameters in amber94 (Ref. 44) and amber99 (Ref. 45) are different from those in amber98.) The gromos potential contains a sixfold energy term for ψ, with U_θ,i = 0.48 kcal/mol, the effect of which is discernible in the distribution for Ace-Gly-Nme (Fig. 11).
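A torsional energy of this Fourier type is easily evaluated numerically. The sketch below assumes the form (U_i/2)[1 + cos n_i(θ − θ_0,i)] for each term; the sign convention and the phase of 0° are assumptions of the sketch. Only the multiplicity of 3 and the barrier of 0.2 kcal/mol are taken from the text (the single cedar term), and the name `torsion_energy` is illustrative.

```python
import math


def torsion_energy(theta_deg, terms):
    """Sum of Fourier terms (U_i / 2) * [1 + cos(n_i * (theta - theta0_i))].

    terms: iterable of (U_i in kcal/mol, n_i, theta0_i in degrees).
    """
    theta = math.radians(theta_deg)
    return sum(
        0.5 * barrier * (1.0 + math.cos(n * (theta - math.radians(theta0))))
        for barrier, n, theta0 in terms
    )


# Single threefold term with a 0.2 kcal/mol barrier, as quoted for cedar.
# The phase of 0 degrees is an assumption made for this illustration.
cedar_like = [(0.2, 3, 0.0)]

scan = [torsion_energy(angle, cedar_like) for angle in range(-180, 180, 5)]
print(f"barrier height over the scan: {max(scan) - min(scan):.3f} kcal/mol")
```

With a single term, the difference between the maximum and minimum of the scan equals U_i; with several terms of different multiplicity, as for opls and amber, the relative depths of the minima depend on all constants together, which is why small parameter changes can shift the balance between conformations.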

Quality of the QM/MM Method

The choice of a QM/MM method is a compromise between accuracy and speed. The SCCDFTB method is among the fastest approximate QM methods; nevertheless, SCCDFTB closely reproduces minimum-energy molecular geometries found with high-level methods. The energies of several conformations of Ace-Ala-Nme in vacuo have been found to be properly ranked, which is not the case for several other fast QM methods.23 We have found the energy surface to be qualitatively very similar to that reported for a high-level QM method, LMP2 (Ref. 6); the largest deviation is found for the C7ax conformation; the energy of the C7ax conformation lies above that of the C7eq conformation by 1.1 kcal/mol, compared with 2.71 kcal/mol for LMP2. Application of SCCDFTB to helices of short alanine oligomers in vacuum, to crambin and to alanine oligomers in aqueous solution has amply confirmed these findings.24,25 On the negative side, we mention that SCCDFTB provides a poor representation of liquid water in a dynamics simulation of several hundred water molecules with periodic boundary conditions (results not shown). Furthermore, the conformational distribution of Ace-Ala-Nme obtained here with this QM/MM force field does not show the preference for the PII conformation that has been indicated by experimental studies.12–16 A previous study investigating structures and relative energies of solvated Ace-Ala-Nme finds that the PII conformation is the global minimum on the potential energy surface when the whole system (solute and solvent) is treated with QM, whereas in a QM/MM description, with solute treated with QM and solvent with MM, the αR conformation is favored over the PII conformation.46

Fig. 11. Sampled conformational distribution of Ace-Gly-Nme with the gromos force field (7-ns simulation).

We have seen here that the results obtained with QM/MM are sensitive to the choice of MM parameters, mainly the repulsive Lennard–Jones parameters, since the water models and the attractive Lennard–Jones parameters are similar. It will be appropriate to optimize a set of Lennard–Jones parameters specifically for describing interactions between the SPC or TIP3P water model and solutes represented with SCCDFTB. (These may then be used in combination with a recently developed set of intramolecular long-range potentials.47) Development of such an intermolecular parameter set on the basis of free energies of transfer of selected solutes is in progress. Preliminary results indicate that the properties of a single QM H2O molecule represented with SCCDFTB are very sensitive to the strength of the Lennard–Jones repulsive interaction between it and surrounding SPC water molecules. This finding suggests that optimization of these parameters should be a first step, only after which one may want to consider use of more sophisticated MM water models, such as models with additional charge centers and polarizable models.
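The role of the repulsive parameter can be made concrete with the 12-6 Lennard–Jones potential written with separate repulsive and attractive coefficients, U(r) = A/r^12 − B/r^6. The minimum lies at r_min = (2A/B)^(1/6) with depth B²/(4A), so that scaling A alone by a factor s moves the contact distance by s^(1/6) and divides the well depth by s. The sketch below uses arbitrary reduced units; the coefficient values and function names are hypothetical and are not parameters of any of the force fields discussed.

```python
def lj_energy(r, a_rep, b_att):
    """12-6 Lennard-Jones energy with separate repulsive (A) and attractive (B) terms."""
    return a_rep / r**12 - b_att / r**6


def lj_minimum(a_rep, b_att):
    """Position and depth of the minimum of A/r^12 - B/r^6."""
    r_min = (2.0 * a_rep / b_att) ** (1.0 / 6.0)
    depth = b_att * b_att / (4.0 * a_rep)
    return r_min, depth


# Hypothetical coefficients in reduced units: double the repulsion only.
for a_rep in (1.0, 2.0):
    r_min, depth = lj_minimum(a_rep, 1.0)
    print(f"A = {a_rep:.1f}: r_min = {r_min:.3f}, well depth = {depth:.3f}")
```

Because the attractive coefficient is left untouched, the example isolates the effect that the text attributes to the repulsive parameters.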

Relevance of Database Statistics

We have used the extent of agreement between distributions of backbone conformations from simulations of Ace-Ala-Nme and Ace-Gly-Nme in water and these same distributions of alanine and glycine residues in proteins as evidence indicating the accuracy of the QM/MM model with SCCDFTB. A direct relation between the two had been proposed by two groups, who suggested also that these distributions (corrected for the prevalence of regular secondary structures) could be used in a statistical description of the conformation of unfolded peptides in solution.48,49 Later, discrepancies between the database distributions and distributions from simulations with an MM force field, in particular for the glycine residue, were adduced as an argument that the different nature of the two environments (aqueous solution and folded protein) would tend to limit the proposed agreement.50 The present good agreement suggests that because virtually all polar groups participate in hydrogen bonds in the protein interior, this may be considered a polar environment not very dissimilar to water for polar groups, while at the same time presumably providing a much less polar environment for apolar groups. Although it is not surprising that conformations with high energy due to atomic overlap are absent in folded proteins as well as in the simulated sample, it remains debatable if, and if so, why, the statistics of, for example, the alanine backbone conformation in proteins follow the equilibrium distribution of the alanine residue of peptides such as Ace-Ala-Nme in water.51,52 Ultimately, the accuracy of simulated distributions of backbone geometry of unfolded peptides in water will presumably be superior to that provided by the database, but at present such is not the case.

Protein Stability

An accurate representation of the unfolded state is required for simulations of conformation change from the unfolded to an ordered folded state, such as helix formation of polypeptides and folding of globular proteins. Accurate free energy differences between folded and unfolded states are known from experiment for many such systems, and it is important to establish the accuracy with which these are reproduced in the simulated system. A problem is that, given the actual rates, transitions from unfolded to folded states are unlikely within the times over which such systems can be simulated in practice. Formation of helices by β-peptides (chains formed by β-amino acids linked by amide bonds) in aqueous solution is exceptionally fast, and this is one system for which simulations (with the gromos force field) have proven to reproduce the observed equilibria quite well.53–55

Formation of α-helices by polypeptides is slower because of the cooperative nature of the helix initiation step, although the accretion or loss of helical structure at the ends of α-helical regions occurs on a timescale where the process can presumably be studied with free simulations. Whether the force field accurately represents the equilibrium depends, in the first instance, on the free energy difference between the unfolded and folded states. Special simulation techniques exist, which allow one to calculate free energy differences; in one example, the free energy of formation of α-helix by short alanine oligomers in solution has been computed for the cedar force field and shown to be in reasonable agreement with experiment.56 Other simulations directed at assessing stability of elements of secondary structure of proteins include studies of reverse turns, β-sheet and α-helix57–60 and of designed β-sheet proteins.61–63 These simulations have been able to describe believable free-energy landscapes for the folding of these molecules, including the marginal stability of the folded state. It has been theorized64,65 that marginal stability is required for rapid folding of proteins, in which case MM force fields used in simulation of the folding process would of necessity have to represent only marginal stability. Inaccuracies in the force field could then easily render the folded conformation unstable relative to the ensemble of all other solution conformations. In this light, the expectation of successful, accurate protein folding in long simulations of an unfolded protein molecule11 appears naive unless it can first be established that the MM force field used produces a relatively small balance of the free energy in favor of the folded state.
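The sensitivity of a marginally stable folded state to errors in the force field can be illustrated with a two-state model, in which the folded fraction is 1/(1 + exp(ΔG/RT)) with ΔG = G(folded) − G(unfolded). The numbers below are hypothetical: a stability of −1 kcal/mol, a force-field error of +2 kcal/mol, and a temperature of 300 K are assumptions chosen only to show the size of the effect.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)


def folded_fraction(dg_fold, temperature=300.0):
    """Two-state folded population for dg_fold = G(folded) - G(unfolded), kcal/mol."""
    return 1.0 / (1.0 + math.exp(dg_fold / (R_KCAL * temperature)))


true_dg = -1.0   # hypothetical marginal stability, kcal/mol
ff_error = 2.0   # hypothetical force-field error, kcal/mol

print(f"folded fraction at {true_dg:+.1f} kcal/mol: {folded_fraction(true_dg):.2f}")
print(f"folded fraction at {true_dg + ff_error:+.1f} kcal/mol: "
      f"{folded_fraction(true_dg + ff_error):.2f}")
```

In this example, an error of 2 kcal/mol, comparable to the spread between force fields seen for a single residue, turns a predominantly folded ensemble into a predominantly unfolded one.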

ACKNOWLEDGMENT

Supercomputing time was provided by the North Carolina Supercomputing Center. We thank Drs. Jane and David Richardson and Ian Davis for helpful discussions and providing results of recent database analysis studies in their laboratory. We thank Drs. Julian Tirado-Rives, Yong Duan, and Lee Bartolotti for providing us topology files in amber format and Drs. Weitao Yang and Haiyan Liu for advice on use of SCCDFTB.

REFERENCES

1. Cornell WD, Cieplak P, Bayly C, Gould IR, Merz KMJ, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, Kollman PA. A second generation force field for the simulation of proteins and nucleic acids. J Am Chem Soc 1995;117:5179–5197.

2. MacKerell AD Jr, Bashford D, Bellott M, Dunbrack RL Jr, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, Ha S, Joseph-McCarthy D, Kuchnir L, Kuczera K, Lau FTK, Mattos C, Michnick S, Ngo T, Nguyen DT, Prodhom B, Reiher WE III, Roux B, Schlenkrich M, Smith JC, Stote R, Straub J, Watanabe M, Wiorkiewicz-Kuczera J, Yin D, Karplus M. All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 1998;102:3586–3616.

3. van Gunsteren WF, Billeter SR, Eising AA, Hunenberger PH, Kruger P, Mark AE, Scott WRP, Tironi IG. Biomolecular simulation: the GROMOS96 manual and user guide. Zurich: Vdf Hochschulverlag AG an der ETH Zurich; 1996.

4. Jorgensen WL. OPLS force fields. In: Schleyer PvR, editor. Encyclopedia of computational chemistry. Vol. 3. New York: Wiley; 1998.

5. Maple JR, Hwang M-J, Jalkanen KJ, Stockfisch TP, Hagler AT. Derivation of class II force fields. V. Quantum force field for amides, peptides, and related compounds. J Comput Chem 1998;19:430–458.

6. Kaminski G, Friesner RA, Tirado-Rives J, Jorgensen WL. Evaluation and reparametrization of the OPLS-AA forcefield for proteins via comparison with accurate quantum chemical calculations on peptides. J Phys Chem B 2001;105:6474–6487.

7. Williams DE. Nonbonded potential parameters derived from crystalline hydrocarbons. J Chem Phys 1967;47:4680–4684.

8. Scheraga HA. Calculations of conformations of polypeptides. Adv Phys Org Chem 1968;6:103–184.

9. Ferro D, Hermans J. Semiempirical energy calculations on model compounds of polypeptides. Crystal structures of DL-acetylleucine N-methylamide and DL-acetyl-n-butyric acid N-methylamide. Biopolymers 1972;11:105–117.

10. Sreerama N, Woody RW. Molecular dynamics simulations of polypeptide conformations in water: a comparison of α, β, and poly(Pro)II conformations. Proteins 1999;36:400–406.

11. Duan Y, Kollman PA. Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science 1998;282:740–744.

12. Sreerama N, Woody RW. Poly(Pro) II type structure in globular proteins: identification and CD analysis. Biochemistry 1994;33:10022–10025.

13. Poon CD, Samulski ET, Weise CF, Weisshaar JC. Do bridging water molecules dictate the structure of a model dipeptide in aqueous solution? J Am Chem Soc 2000;122:5642–5643.

14. Schweitzer-Stenner R, Eker F, Huang Q, Griebenow K. Dihedral angles of trialanine in D2O determined by combining FTIR and polarized visible Raman spectroscopy. J Am Chem Soc 2001;123:9628–9633.

15. Woutersen S, Hamm P. Structure determination of trialanine in water using polarization sensitive two-dimensional vibrational spectroscopy. J Phys Chem B 2001;104:11316–11320.

16. Shi Z, Olson CA, Rose GD, Baldwin RL, Kallenbach NR. Polyproline II structure in a 7-residue alanine peptide. Proc Natl Acad Sci USA 2002;accepted for publication.

17. Anderson AG, Hermans J. Microfolding: conformational probability map for the alanine dipeptide in water from molecular dynamics simulation. Proteins 1988;3:262–265.

18. Smith PE. The alanine dipeptide free energy surface in solution. J Chem Phys 1999;111:5568–5579.

19. Tobias DJ, Brooks CL. Conformational equilibrium in the alanine dipeptide in the gas phase and aqueous solution: a comparison of theoretical results. J Phys Chem 1992;96:3864–3870.

20. Berendsen HJC, Postma JPM, van Gunsteren WF, Hermans J. Interaction models for water in relation to protein hydration. In: Pullman B, editor. Intermolecular forces. Dordrecht, Holland: Reidel; 1981. p 331–342.

21. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML. Comparison of simple potential functions for simulating liquid water. J Chem Phys 1983;79:926–935.

22. Elstner M, Porezag D, Jungnickel G, Elsner J, Haugk M, Frauenheim T, Suhai S, Seifert G. Self-consistent charge density functional tight-binding method for simulation of complex material properties. Phys Rev B 1998;58:7260–7268.

23. Elstner M, Frauenheim T, Kaxiras E, Seifert G, Suhai S. A self-consistent charge density-functional based tight-binding scheme for large biomolecules. Phys Status Solidi B 2000;217:357–376.

24. Liu H, Elstner M, Kaxiras E, Frauenheim T, Hermans J, Yang W. Quantum mechanics simulation of protein dynamics on long time scale. Proteins 2001;44:484–489.

25. Cui Q, Elstner M, Kaxiras E, Frauenheim T, Karplus M. A QM/MM implementation of the self-consistent-charge density functional tight binding (SCC-DFTB) method. J Phys Chem B 2001;105:569–585.

26. Elstner M, Jalkanen K, Knapp-Mohammady M, Frauenheim T, Suhai S. DFT studies on helix formation in N-acetyl-(L-alanyl)n-N′-methylamide for n = 1–20. Chem Phys 2000;256:15–27.

27. Elstner M, Jalkanen KJ, Knapp-Mohammady M, Frauenheim T, Suhai S. Energetics and structure of glycine and alanine based model peptides: approximate SCC-DFTB, AM1 and PM3 methods in comparison with DFT, HF and MP2 calculations. Chem Phys 2001;263:203–219.

28. Bohr HG, Jalkanen KJ, Elstner M, Frimand K, Suhai S. A comparative study of MP2, B3LYP, RHF and SCC-DFTB force fields in predicting the vibrational spectra of N-acetyl-(L-alanine)-N′-methyl amide: VA and VCD spectra. Chem Phys 1999;246:13–36.

29. Mann G, Yun RH, Nyland L, Prins J, Board J, Hermans J. The Sigma MD program and a generic interface applicable to multifunctional programs with complex, hierarchical command structure. In: Schlick T, Gan HH, editors. Computational methods for macromolecules: challenges and applications. Proceedings of the 3rd International Workshop on Algorithms for Macromolecular Modelling, New York, October 12–14, 2000. Berlin and New York: Springer-Verlag; 2002. Forthcoming.

30. Tuckerman ME, Berne BJ, Martyna GJ. Reversible multiple time scale molecular dynamics. J Chem Phys 1992;97:1990–2001.

31. Darden TA, York DM, Pedersen LG. Particle mesh Ewald: an N·log(N) method for Ewald sums in large systems. J Chem Phys 1993;98:10089–10092.

32. Schlick T, Skeel RD, Brunger AT, Kale LV, Board JA, Hermans J, Schulten K. Algorithmic challenges in computational molecular biophysics. J Comput Phys 1999;151:9–48.

33. Berendsen HJC, Postma JPM, van Gunsteren WF, DiNola A, Haak JR. Molecular dynamics with coupling to an external bath. J Chem Phys 1984;81:3684–3690.

34. Ryckaert JP, Ciccotti G, Berendsen HJC. Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J Comput Phys 1977;23:327–341.

35. Weiner PK, Kollman PA. AMBER: assisted model building with energy refinement. A general program for modeling molecules and their interactions. J Comput Chem 1981;2:287–303.

36. Brunger AT. X-PLOR, a system for X-ray crystallography and NMR. New Haven, CT: Yale University Press; 1992.

37. Ferro DR, McQueen JE, McCown JT, Hermans J. Energy minimization of rubredoxin. J Mol Biol 1980;136:1–18.

38. Hermans J, Berendsen HJC, van Gunsteren WF, Postma JPM. A consistent empirical potential for water-protein interactions. Biopolymers 1984;23:1513–1518.

39. Mooij WTM, van Duijneveldt FB, van Duijneveldt-van de Rijdt JGCM, van Eijck BP. Transferable ab initio molecular potentials. 1. Derivation from methanol dimer and trimer calculations. J Phys Chem A 1999;103:9872–9882.

40. Okur A, Strockbine B, Hornak V, Simmerling C. Using PC clusters to evaluate the transferability of molecular mechanics force fields for proteins. J Comput Chem 2002; in press.

41. Lovell SC, Davis IW, Arendall WB, de Bakker PIW, Word JM, Prisant MG, Richardson JS, Richardson DC. Structure validation by Cα geometry: φ,ψ and Cβ deviation. Proteins 2003;50:437–450.

42. Mu Y, Stock G. Conformational dynamics of trialanine in water: a molecular dynamics study. J Phys Chem B 2002;106:5294–5301.

43. Price DJ, Brooks CL. Modern protein force fields behave comparably in molecular dynamics simulations. J Comput Chem 2002;23:1045–1056.

44. Weiner SJ, Kollman PA, Case DA, Singh UC, Ghio C, Alagona G, Profeta S Jr, Weiner P. A new force field for molecular mechanical simulation of nucleic acids and proteins. J Am Chem Soc 1984;106:765–784.

45. Wang J, Cieplak P, Kollman PA. How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? J Comput Chem 2000;21:1049–1074.

46. Han W, Elstner M, Jalkanen KJ, Frauenheim T, Suhai S. Hybrid SCC-DFTB/molecular mechanical studies of H-bonded systems and of N-acetyl-(L-Ala)n-N′-methylamide helices in water solution. Int J Quant Chem 2000;78:459–479.

47. Elstner M, Hobza P, Frauenheim T, Suhai S, Kaxiras E. Hydrogen bonding and stacking interactions of nucleic acid base pairs: a density-functional-theory based treatment. J Chem Phys 2001;114:5149–5155.

48. Munoz V, Serrano L. Intrinsic secondary structure propensities of the amino acids, using statistical φ-ψ matrices: comparison with experimental scales. Proteins 1994;20:301–311.

49. Swindells MB, MacArthur MW, Thornton JM. Intrinsic φ,ψ propensities of amino acids, derived from the coil regions of known structures. Nat Struct Biol 1995;2:596–603.

50. O’Connell T, Wang L, Tropsha A, Hermans J. Comparison of models from database statistics and molecular simulations. Proteins 1999;36:407–418.

51. Sippl MJ. Boltzmann’s principle, knowledge-based mean fields, and protein folding. An approach to the computational determination of protein structures. J Comput Aided Mol Design 1993;7:473–501.

52. Thomas PD, Dill KA. Statistical potentials extracted from protein structures: how accurate are they? J Mol Biol 1996;257:457–469.

53. Daura X, van Gunsteren WF, Rigo D, Jaun B, Seebach D. Studying the stability of a helical β-heptapeptide by molecular dynamics simulation. Chem Eur J 1997;3:1410–1417.

54. Daura X, Jaun B, Seebach D, van Gunsteren WF, Mark AE. Reversible peptide folding in solution by molecular dynamics simulation. J Mol Biol 1998;280:925–932.

55. Daura X, van Gunsteren WF, Mark AE. Folding-unfolding thermodynamics of a β-heptapeptide from equilibrium simulations. Proteins 1999;34:269–280.

56. Wang L, O’Connell T, Tropsha A, Hermans J. Thermodynamic parameters for the helix-coil transition of oligopeptides: molecular dynamics simulation with the peptide growth method. Proc Natl Acad Sci USA 1995;92:10924–10928.

57. Tobias DJ, Sneddon SF, Brooks CL. Reverse turns in blocked dipeptides are intrinsically unstable in water. J Mol Biol 1990;216:783–796.

58. Tobias DJ, Mertz JE, Brooks CL. Nanosecond timescale folding dynamics of a pentapeptide in water. Biochemistry 1991;30:6054–6058.

59. Tobias DJ, Brooks CL. Thermodynamics and mechanism of α-helix initiation in alanine and valine peptides. Biochemistry 1991;30:6059–6070.

60. Tobias DJ, Sneddon SF, Brooks CL. Stability of a model beta-sheet in water. J Mol Biol 1992;227:1244–1252.

61. Bursulaya BD, Brooks CL. Folding free energy surface of a three-stranded β-sheet protein. J Am Chem Soc 1999;121:9947–9951.

62. Ferrara P, Caflisch A. Folding simulations of a three-stranded antiparallel β-sheet peptide. Proc Natl Acad Sci USA 2000;97:10780–10785.

63. Cavalli A, Ferrara P, Caflisch A. Weak temperature dependence of the free energy surface and folding pathways of structured peptides. Proteins 2002;47:305–314.

64. Klimov DK, Thirumalai D. Factors governing the foldability of proteins. Proteins 1996;26:411–441.

65. Dinner AR, Abkevich V, Shakhnovich EI, Karplus M. Factors that affect folding ability of proteins. Proteins 1999;35:34–40.
