

Large-Scale Computations in Chemistry: A Bird’s Eye View of a Vibrant Field

Alexey V. Akimov and Oleg V. Prezhdo*

Department of Chemistry, University of Southern California, Los Angeles, California 90089, United States

CONTENTS

1. Introduction
  1.1. The Meaning of “Large Scale”
    1.1.1. Large Size
    1.1.2. Long Time Scale
    1.1.3. Combinatorial Complexity
    1.1.4. Use of Many Compute Nodes
    1.1.5. Combined Strategies
  1.2. When Large-Scale Computations Are Needed
  1.3. What Is Large? A Survey of the Records
  1.4. Scope and Philosophy of the Review
2. Basic Grounds
  2.1. Wave Function Theory (WFT)
    2.1.1. Wave Functions and Transformations
    2.1.2. Hamiltonian and Energy
    2.1.3. Variational Principle and the Eigenvalue Problem
    2.1.4. Representation of Operators in Different Bases
    2.1.5. Useful Properties
  2.2. Density Functional Theory (DFT)
  2.3. Limitations of WFT and DFT, and Approaches To Overcome Them
    2.3.1. Scaling Law
    2.3.2. Classification of Approximations
    2.3.3. Sources of Performance Bottlenecks
3. Physically Motivated Approximations
  3.1. Semiempirical MO Methods
    3.1.1. CNDO and CNDO/2 Methods
    3.1.2. INDO and NDDO Methods
    3.1.3. MINDO, MNDO, AM1, and PMn Methods
    3.1.4. SINDO, SINDO1, and MSINDO Methods
    3.1.5. ZINDO Method
    3.1.6. Sparkle Model for f-Elements
    3.1.7. DFTB and Derived Methods
    3.1.8. Extended Hückel Theory
    3.1.9. Timeline of Semiempirical Methods
  3.2. Density-Based Methods
    3.2.1. Empirical Density Embedding Schemes: EAM and MEAM
    3.2.2. Orbital-Free DFT (OF-DFT)
    3.2.3. Timeline of Density-Based Methods
  3.3. Bond Order Methods
    3.3.1. Bond Order Concept
    3.3.2. Bond Order Conservation
    3.3.3. Bond Order Potentials and Reactive Force Fields
    3.3.4. Construction of Reactive Bond-Order Potentials as a Phenomenological Variational Principle
    3.3.5. Timeline of Bond Order Based Methods
4. Computationally Motivated Approximations
  4.1. Classification of Computationally Motivated Approaches
  4.2. Fragmentation-Based Approaches
    4.2.1. Density-Based Fragmentation
    4.2.2. Energy-Based Fragmentation
    4.2.3. Dynamical Growth with Localization
    4.2.4. Diabatic Approaches
  4.3. Direct Optimization Methods
    4.3.1. Density Matrix
    4.3.2. Wave Function or Density Matrix as Dynamical Variables
    4.3.3. Orbital Minimization (OM) Methods
    4.3.4. Density Matrix Minimization (DMM) Methods
    4.3.5. Fermi Operator Expansion (FOE) Methods
  4.4. Quantal Force Fields
    4.4.1. Basics of QM/MM
    4.4.2. Polarizable Force Fields
    4.4.3. Effective Fragment Potential (EFP) Method
  4.5. Embedding Schemes
5. Conclusions and Outlook
  5.1. Methods Diversity: New and Old
  5.2. Force Field Development: New Dimensions: Energy → Excited States → Spin?
  5.3. Are Semiempirical/Force Field Methods Needed?
  5.4. Linear-Scaling Methods from a Different Perspective, Dynamic Programming
  5.5. Operator Representations, Projectors, Huzinaga−Cantu Equation, and Separability Theory
  5.6. Importance of Diabatic Approaches
  5.7. Importance of Software
Author Information
  Corresponding Author
  Notes
  Biographies
Acknowledgments
Abbreviations
References

Special Issue: Calculations on Large Systems
Received: September 18, 2014
Published: April 8, 2015

Review, pubs.acs.org/CR
© 2015 American Chemical Society. DOI: 10.1021/cr500524c. Chem. Rev. 2015, 115, 5797−5890

1. INTRODUCTION

Progress in computational chemistry has primarily been motivated by research and development in technological or biomedical applications, as work in these areas can lead to solutions for practical problems that have a direct impact on society. This work is driven by fundamental questions, which the researcher inevitably has to address when explaining the mechanisms of the processes observed experimentally. Although the particular questions and types of systems studied change over time, with currently “fashionable” topics supplanted by newer problems, the general trend in the field of theoretical and computational chemistry remains constant. Namely, there is great interest in performing increasingly larger simulations, including treatment of more complex systems, simulation of longer dynamics, resolution of a larger number of microscopic details, and increase in the diversity of studied systems.

The growing capabilities of computational techniques create interesting opportunities for studying a greater range of smaller-scale systems, which can be investigated in a brute-force fashion through the use of large-scale computational facilities and resources. The possibility of performing such types of simulations is also motivated by steady progress in computer technologies, encompassing new techniques for mathematical operations in processors, and the use of parallel and high-throughput computing architectures. On the other hand, there is great interest in studying complex systems as a whole, processes that occur in large spatial regions, and processes that occur very slowly. These systems and processes are not amenable to the existing standard techniques, and the need for truly large-scale simulation methodologies cannot be overemphasized. As a result, the computational chemistry community continually faces the stimulating challenge of attempting to reach beyond its current limits.

1.1. The Meaning of “Large Scale”

In general, the term “large scale” can have one of the following five meanings in computational chemistry:

(1) large size: power-law and exponential scaling (depending on level of theory)

(2) long time scale: linear scaling (yet, in practice, many orders of magnitude would have to be overcome)

(3) combinatorial complexity: many variations (distributed computations)

(4) use of many compute nodes

(5) any combination of the above

1.1.1. Large Size. As the size of a system increases, the number of interactions increases as well, leading to a proportional growth of the CPU time required for calculations. In an ideal case, the relationship between computational resources and the size of a system is size-independent, or constant, although this scaling is never achieved if the atomistic details are accounted for. Thus, the best practical target, toward which most developments are directed, is linear scaling of computational expense with system size. In practice, though, even this is not always possible: the scaling may follow a power law, or the linear regime may become evident only for very large systems because the prefactor of the relationship is itself large. For more accurate approaches, the scaling quickly reaches high-power and exponential complexity. We will outline the methodology developments on different levels of theory that aim at a linear-scaling relationship in the discussion of the large-size simulations.
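The interplay between the scaling exponent and the prefactor can be illustrated with a toy cost model. The sketch below assumes a cost of the form t(N) = c·N^p; the function names, prefactors, and exponents are hypothetical values chosen for illustration, not measurements of any real method.

```python
def cost(n_atoms, prefactor, power):
    """Toy cost model t(N) = prefactor * N**power (arbitrary time units)."""
    return prefactor * n_atoms ** power


def crossover_size(c_lin, c_cubic):
    """System size above which a linear-scaling method becomes cheaper.

    Solves c_lin * N = c_cubic * N**3 for N > 0, i.e. N = sqrt(c_lin / c_cubic).
    """
    return (c_lin / c_cubic) ** 0.5


# Hypothetical prefactors: the linear-scaling method pays a large overhead.
C_LIN, C_CUBIC = 1.0e4, 1.0e-2

n_star = crossover_size(C_LIN, C_CUBIC)  # about 1000 atoms for these numbers
for n in (100, 1_000, 10_000):
    # Columns: system size, linear-scaling cost, cubic-scaling cost
    print(n, cost(n, C_LIN, 1), cost(n, C_CUBIC, 3))
```

With these assumed numbers the cubic method is cheaper below roughly a thousand atoms, which is why a formally linear method may show its advantage only for very large systems.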

1.1.2. Long Time Scale. The group of methods in the long-time scale branch of the large-scale simulation methods mainly addresses the time scale of the processes that can be modeled within a reasonable time frame. This is in contrast to the previous group of methods, which concerns the size of the system that can be handled efficiently. The intrinsic scaling relationship in this branch of methods is linear: the computation of an N-times-longer process will require an N-times-longer simulation time. Thus, unlike the size-scaling methods, the problem may be perceived as easily overcome. However, the main challenge here lies in the extremely large range of time scales of the processes, varying over many orders of magnitude. While simulations on the femtosecond (10⁻¹⁵ s) or picosecond (10⁻¹² s) time scales are easily achievable nowadays, simulations on the microsecond (10⁻⁶ s) to second (10⁰ s) time scales often become prohibitively expensive if the level of theory remains the same. Of course, with alternative techniques to describe dynamics one can easily overcome this limit. The challenge here then is to encode as much atomistic (or subatomic) detail as possible into such methodologies.

We want to emphasize that the linear scaling described above does not address the complications arising from statistical sampling and ergodicity issues. For instance, if the thermodynamic or kinetic properties are needed for complex systems, such as proteins, that could exist in a variety of conformations and can be trapped in many local minima, a large number of trajectories are required to attain statistical relevance. A similar situation is encountered in semiclassical approaches to quantum dynamics, which represent nuclear distributions in terms of swarms of trajectories. A large number of such trajectories are typically needed for accurate results, which slows down calculations by several orders of magnitude. An even more problematic situation is observed in the coupled/entangled trajectories methods or in exact quantum mechanical propagation schemes. The computational resources may scale as a power of the number of coupled trajectories, basis states, or grid points, restricting the time scale limits to tens of picoseconds at most.
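The gap between accessible and desired time scales can be made concrete by counting integration steps. The following is a minimal sketch that assumes a fixed integration time step of 1 fs, a typical order of magnitude for atomistic dynamics but our assumption rather than a number taken from the text; the names are hypothetical.

```python
# Time scales expressed in femtoseconds so that the arithmetic stays exact.
TIME_SCALES_FS = {"1 ps": 10**3, "1 ns": 10**6, "1 us": 10**9, "1 s": 10**15}


def steps_required(duration_fs, time_step_fs=1):
    """Number of integration steps needed to cover duration_fs (ceiling division)."""
    return -(-duration_fs // time_step_fs)


for label, duration in TIME_SCALES_FS.items():
    print(f"{label:>5}: {steps_required(duration):.0e} steps")
```

The step count grows linearly with the simulated time, as stated above, yet going from a picosecond to a second multiplies the work by twelve orders of magnitude at a fixed level of theory.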

1.1.3. Combinatorial Complexity. Yet another aspect of large-scale simulations is a situation in which both the size of the system and the time scale of the process (if present) are suitable for existing standard methodologies, but the number of such systems or processes is very large, or multiple variations need to be considered. This can occur with a typical “screening” process, in which the number of possible candidates (e.g., drug molecules, photovoltaic or photocatalytic materials, etc.) is combinatorially large [1−3]. Calculations of this type can be organized as high-throughput calculations on a variety of distributed computing systems that need not be homogeneous. Performing the screening of a large database of candidates for the desired target property or application can be prohibitively expensive. This can also be considered a variant of large-size computations.
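A screening workflow of this kind can be sketched as an independent evaluation of every candidate. The example below is a toy illustration under stated assumptions: the substituent list, the `toy_score` function, and the use of a local thread pool as a stand-in for a distributed system are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical building blocks: three substitution sites, four options each.
SITES = [("H", "F", "Cl", "CH3")] * 3


def toy_score(candidate):
    """Placeholder for an expensive property calculation (hypothetical)."""
    return sum(len(group) for group in candidate)


def screen(sites, score, max_workers=4):
    """Evaluate every combination independently and return the best one."""
    candidates = list(product(*sites))  # 4**3 = 64 candidates here
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(score, candidates))
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores[best], len(candidates)


best_candidate, best_score, n_candidates = screen(SITES, toy_score)
```

Because the candidates do not communicate, the work can be split across heterogeneous machines; the difficulty is the count itself, which grows as (options per site) raised to the number of sites.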


1.1.4. Use of Many Compute Nodes. By this we mean the straightforward utilization of high-performance supercomputers and similar distributed systems, as well as new technologies such as graphical processing units (GPUs). The approach might be thought of as “brute force”, not necessarily employing advanced modeling techniques, but rather attacking the problem’s complexity by efficient parallelization strategies. Of course, advanced methodologies can be and are often augmented with these types of large-scale calculations to make them even more productive and insightful; however, we recognize this type of simulation in our classification to highlight the computational/parallelization strategies rather than sophisticated physically motivated methodologies.

1.1.5. Combined Strategies. It is often difficult to classify an approach as belonging solely to one of the categories above. Highly productive and insightful simulations do, in fact, combine all or several of the above listed strategies, and are often nonseparable from each other. In the following discussion, we will adopt the term “large-scale” as a generic term for describing one of the five approaches listed above. The first two groups of large-scale computations imply mostly algorithmic and methodological developments, while groups 3 and 4 rely mostly on utilization of large-scale computational facilities, although for group 3 some algorithm/methodology aspects may be important.

1.2. When Large-Scale Computations Are Needed

The physics and chemistry of many processes in large-scale systems can be understood in terms of a hierarchy of approximations. This is the typical philosophy of multiscale models, in which effects stemming from lower levels of organization can be incorporated into higher levels via effective mapping. For example, when it is possible to introduce electronic and quantum effects at the atomic level with effective charges [4−6] and other interatomic parameters, the atomistic resolution and high-frequency modes can be hidden in a coarse-grain description by constructing effective potentials derived from statistical considerations (e.g., potentials of mean force) [7−15].

Although multiscale modeling is a well-defined strategy for handling large systems, it is not always a straightforward approach because of the theoretical, computational, and technical difficulties of defining the mapping between different representative levels. In addition, it may not always be suitable in cases when a detailed description must be retained for the entire system of interest at all times. In these situations, the utilization of large systems with explicit representation of all details is necessary by the very nature of the object or the process. For example, biological macromolecules have to be considered explicitly or with a minimal degree of coarse-graining when one is interested in nonlocal effects, such as electron−nuclear correlation [16] or reactivity. In other cases smaller atomistic models can be adopted, but at the cost of deteriorated accuracy. For instance, representation of quantum dots (QDs) by small clusters can significantly affect their electronic structure, as well as introduce quantum confinement effects not present in a realistic system. Although this rough representation may be the only tractable model, using it can lead to qualitative misrepresentation of the system and its properties. This approach may require careful interpretation of the obtained results or introduction of empirical corrections.

The scale of the problem at hand may drive the need for large simulation setups. Some properties, such as heat capacity, mass density, or bulk radial distribution function, can be well described using relatively small systems. Other properties that rely on long-distance correlation and complexity of possible reactive channels require large-scale atomistic or coarse-grained models. Examples include reactive dynamics at interfaces [17−23], chemical transformations in energy materials [24−27], combustion [28] and catalysis [29,30], and mechanical properties of materials [31−36].

Large atomistic models are often needed for studies of transport properties, such as directed motion of biological [37] and artificial molecular machines [38−40], surface diffusion via a long-range hopping mechanism [41], and mass and charge transfer in nanoscale structures [42−47]. At the quantum level, when quantum dynamics or static ab initio electronic structures are concerned, modeling of reactive processes or large adsorbates requires extended atomistic representation of the substrate, which may be difficult to handle with standard tools. Long-range electrostatic effects may play an important role in charge carrier transport and separation. Similar to avoidance of self-interaction effects in interaction potentials in systems subjected to periodic boundary conditions, artificial quantum confinement should also be minimized, unless the realistic system is very small. Representation of a bulk material should be larger than the exciton size. Calculation of charge carrier transport properties would require inclusion of at least several localized exciton sites, which would further increase the size of the minimal setup [48−50].

1.3. What Is Large? A Survey of the Records

The meaning of the word “large” differs across subfields of computational chemistry and is constantly redefined. At one extreme, an example of a large system is the simulation of large biomolecular complexes. Steady progress has been made over the past few decades in this area. The size and time scales of simulations have increased from approximately 500 atoms simulated over t = 10 ps in the 1970s to greater than 2.6 × 10⁶ atoms simulated over t > 1.5 ns in the 2000s. The latter comprise simulations of the following macromolecules: complete satellite tobacco mosaic virus [51] (N = 1.07 × 10⁶, t ∼ 10 ns), bacterial flagellar filament [52] (N = 2.38 × 10⁶, t ∼ 1.6 ns), and 70S ribosome with mRNA [53] (N = 2.64 × 10⁶, t ∼ 4.0 ns). Recently, an all-atomic simulation of the mature HIV-1 capsid (N = 60 × 10⁶, t ∼ 100 ns) [54] was reported. It constitutes the largest all-atomic system modeled to date, to the best of our knowledge.

Type-specific records of systems include the modeling of lipid
bilayers and related structures. In 2003 a micelle simulation was reported [55], which included 23 775 particles simulated over a ∼5 ns time scale. A study on ion dynamics and binding in membranes was conducted in 2004, which included about 20 000 atoms simulated over a 200 ns time interval. More recently, in 2011, an even larger all-atomic emulsion simulation was performed [56]. This simulation included 63 816 atoms confined in a 60 × 60 × 160 Å box.

The use of coarse-grain (CG) approaches allows for the
modeling of interesting processes of even larger sizes and time scales, which are comparable to realistic values achievable through experiments. For example, in 2006 the rotation of a bacterial flagellum was modeled [57]. The simulation included greater than 3400 CG particles, which are equivalent to greater than 15 × 10⁶ atoms in an all-atomic model, with a time scale spanning over t > 30 μs. More recently, the liposome has been studied as a potential drug-delivery shuttle for biomedical applications [58]. The model included approximately 2500 lipid CG particles and 160 000 water CG particles and the simulation spanned 10 μs. Finally, advances in memory utilization and parallelization strategies allowed simulation of the dynamics in huge molecular systems containing 125 nm bilayer vesicles with a 2 μm × 2 μm planar bilayer [59]. The system contained 58.5 × 10⁶ CG particles, which is likely the largest simulated system with such a level of theory to date. It should be noted that such a simulation was only possible with the utilization of large supercomputer resources.

Extensive reviews of the advances in simulations of
biomolecular systems can be found elsewhere [53,60], including a review of coarse-grain simulations [61]. The most notable breakthroughs that significantly contribute to this steady progress are the efficient linear-scaling techniques for electrostatic interactions, as well as efficient parallelization methodologies. Generally, computational methods for molecular-mechanics-based calculations are well established, with most of the central algorithms for all-atomic interactions already available. The methodological challenges reside mostly in development of accurate potentials, techniques for long-time simulations and efficient sampling, parallelization strategies with favorable scaling, and in flexible and transferable implementation in the community-supported software. Although the challenge of dimensionality is largely resolved, new interesting developments will undoubtedly still emerge.

On the other side of the spectrum are the ab initio and
quantum dynamics calculations. Here, a system of a few tens of atoms can very quickly become really “large”. This becomes the case when correlated and many-body methods or fully quantal propagation of wave functions is used. On the other hand, in the more popular wave function theory (WFT) and density functional theory (DFT) methods, the “large” size limit has shifted through the decades from about 10−20 atoms in the 1960s to hundreds of atoms nowadays. This transition was led mainly by advances in hardware and computing technologies. Further extensions of this limit will require the utilization of more sophisticated approaches and approximations.

One of the first developments in this area was the utilization of
an approximation within the WFT or DFT frameworks, which leads to efficient semiempirical and tight-binding approaches [62−64]. These methods allow the modeling of systems as large as a few thousand atoms. The use of particle-based quantum dynamics allowed for efficient modeling of dynamics of excited states in large nanoscale structures [5,6]. The orbital-free DFT formulation allowed a record simulation of systems with more than 10⁶ atoms [65]. Using the effective Hamiltonian approach combined with the classical molecular dynamics (MD) and semiempirical quantum-chemistry methods, the quantum dynamics in soft matter systems have been studied in the Schulten group. The exemplary works include studies of the role of the environment on coherence and charge transfer dynamics in the Fenna−Matthews−Olson complex in the glycerol−water mixture [66], and the record-breaking study of energy transfer in the lamellar chromatophores in Rhodospirillum photometricum [67]. The former featured more than 1.4 × 10⁶ QM/MM computations for each solvent and temperature, and extensive ensemble averaging in the large biomolecular complex. The latter study is remarkable for its size: it comprised 20 × 10⁶ atoms, and the quantum dynamics simulations were performed over 50 ps.

The advent and further elaboration of linear-scaling
methods [68−76] has made it feasible to apply the “standard” ab initio and DFT methods, including some correlated approaches, to systems having over 10 000 atoms. Using one of these methods, a record-breaking simulation of a fullerite system containing over 10⁶ atoms has become possible [77]. A single point calculation takes just slightly more than an hour of wall time on a 128-core Xeon cluster. A number of groups are actively developing these methodologies, although the broader scientific community does not yet routinely use them. Likewise, some of these approaches are implemented in a number of available programs, but are not yet common parts of most packages. The development of linear-scaling techniques and their further applications are expected to significantly contribute to advances in computational chemistry in the near future.

Finally, another important group of methods comprises reactive [28,78] and quantal force fields [79−84]. These approaches have the combined advantage of being extremely fast through the use of molecular mechanics (classical force field), and being able to describe chemical transformations and excited electronic states (only for quantal FF) via quantum methods. Because of these combined effects, long reactive dynamics of thousands of atoms can be efficiently simulated. Some of the applications for the use of reactive force fields include studies of oxidation and catalytic processes, fracture of nanoscale objects, and interfacial processes. Quantal force fields are more commonly used for biophysical applications, for instance, studies of enzyme reactivity, charge transfer and excitation relaxation in photosynthetic complexes, and reactivity in condensed phases.

1.4. Scope and Philosophy of the Review

It becomes a difficult task to track novel developments and approaches as the number of methods of theoretical and computational chemistry rapidly grows. As such, many groups throughout the world may be focusing on specific methodologies of colleagues in similar disciplines, while overlooking the developments in areas outside their expertise. This narrow approach is further reinforced through the fundamental grounds that the methodologies are based upon, as well as the restricted types of systems and processes under investigation. For example, membrane transport, protein folding, and biomolecule aggregation are often based on coarse-grained techniques with occasional overlap with all-atomic force-field models. Studies of protein−ligand interactions and docking are most commonly based on all-atomic classical molecular mechanics, with an occasional utilization of combined QM/MM schemes. Studies of reactive dynamics in the catalytic centers of enzymes often combine QM/MM and high-level ab initio techniques, while the reactive dynamics in condensed matter systems is typically found through reactive force field approaches. Photoinduced processes require wave function based techniques or perturbative DFT methods. The utilization of the methodologies outside the above spectrum may not always be considered. We believe that an interdisciplinary approach through the use of hybrid techniques analogous to those used in other subfields can be beneficial for scientists having different areas of expertise.

There are many excellent reviews and feature papers [61,62,76,85−102] on specific techniques targeted to large-scale simulations coming from different perspectives. The recent issue of the Accounts of Chemical Research on fragmentation methods for linear-scaling computations [103−111] warrants particular attention. The papers cited above provide essential details, examples of applications, and critical evaluation of the corresponding methodologies. The goal of this review will be to summarize the existing approaches to large-scale calculations from a rather general perspective. We aim to introduce the basic ideas in these approaches and to provide a minimal yet sufficient review on the latest developments across various subdisciplines. We also present a critical evaluation of existing methods and applications, and outline possible future directions in these fields.


In this paper, we aim to avoid repetition of material already presented in specialized reviews. Our strategy is to give an overview of work from seemingly different areas, and present a comparative analysis to discover their common features and possible interplay, as well as their differences. Our goal is to elucidate the hierarchy of the approximations that connects the various techniques. Often, this hierarchy is rather straightforward and follows the natural historic evolution of the methods. Also frequently, different methods may appear at different times and are derived from varying perspectives. Still, they share a common ground and can be related to each other. Sometimes, application of a certain technique developed for specific problems to questions outside its intended scope may yield fruitful results. We outline several such ideas, which we suggest as possible directions for future studies.

Although our intent is to present a wide range of methodologies that have applications for systems of differing sizes, we restrict our discussion to methods used for calculation of reactive interactions and excited-state properties in large-scale systems. Other related techniques are mentioned, but are not the main focus. Particularly, we do not discuss any of the following areas: (a) acceleration of computations with correlated electronic structure methods; (b) improved mathematical algorithms for molecular integral calculations and advanced convergence/summation methods; (c) acceleration of computations with classical force fields; (d) methods for faster time evolution and sampling; (e) parallelization and hardware-related strategies. For completeness, we provide below relevant literature leads for the interested reader.

Significant computational advances can be achieved within
groups (a) and (b) by utilizing the resolution of identity (RI) technique, also known as density fitting [112−115]. The method expands the two-center products entering the four-center molecular integrals either using explicit auxiliary functions, or implicitly. The singular value [116] or Cholesky [117] decompositions are the most frequent choices in the latter case. An extensive review of the Cholesky method and its utility in the molecular integrals calculations and reduction of the molecular system dimensionality is given by Aquilante et al. [118] Beebe et al. [117] present a fine description of the decomposition procedure and its application to accelerated computation of the integrals. One of the important aspects of such computations is the fact that numerically insignificant information can be discarded by neglecting the tensor elements smaller than a specified threshold. This simplification can drastically reduce the rank of the problem and lead to improved scalability of the computational costs. The numerically insignificant integrals can be discarded based on the Schwarz inequality [119] or more sophisticated criteria, such as multipole-based integral estimates (MBIE) [120].
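The rank reduction obtained by discarding numerically insignificant information can be illustrated with a generic pivoted (incomplete) Cholesky factorization of a small symmetric positive semidefinite matrix. This is a minimal pure-Python sketch on a hypothetical test matrix; it is not the integral code of any package and is not claimed to reproduce the specific procedure of the cited works.

```python
def pivoted_cholesky(matrix, threshold=1e-8):
    """Incomplete Cholesky factors of a symmetric positive semidefinite matrix.

    Returns a list of vectors L_k with matrix[i][j] ~= sum_k L_k[i] * L_k[j].
    The loop stops once the largest remaining diagonal element drops below
    `threshold`, so numerically insignificant information is discarded.
    """
    n = len(matrix)
    diag = [matrix[i][i] for i in range(n)]  # residual diagonal
    vectors = []
    while len(vectors) < n:
        pivot = max(range(n), key=diag.__getitem__)
        if diag[pivot] < threshold:
            break
        scale = diag[pivot] ** 0.5
        # Residual column at the pivot, normalized by the pivot element.
        new = [
            (matrix[i][pivot] - sum(v[i] * v[pivot] for v in vectors)) / scale
            for i in range(n)
        ]
        vectors.append(new)
        diag = [d - x * x for d, x in zip(diag, new)]
    return vectors


# Hypothetical test matrix: a rank-2 Gram matrix built from six 2-component rows.
rows = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0), (0.5, 0.5), (-1.0, 3.0)]
gram = [[a[0] * b[0] + a[1] * b[1] for b in rows] for a in rows]

factors = pivoted_cholesky(gram)
rebuilt = [[sum(v[i] * v[j] for v in factors) for j in range(6)] for i in range(6)]
max_error = max(abs(rebuilt[i][j] - gram[i][j]) for i in range(6) for j in range(6))
```

For this example two vectors reproduce the full 6 × 6 matrix to within rounding error, which is the sense in which a threshold reduces the rank of the problem.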

The mathematical grounds of the RI can be related to the inner projection procedure, discussed in the early works of Löwdin.121 A thorough discussion of the RI method is given by Whitten122 and Dunlap et al.123 A number of improvements and extensions of the RI method have been reported, including extension to correlated methods124 and combinations with the divide-and-conquer scheme to achieve linear scaling.125

The advanced algorithmic approaches within groups (b) and (c) primarily aim to overcome the quadratic scaling of the electrostatic potential computations. The most popular methodology is derived from the fast multipole method (FMM).126−128 It includes various methods for classical charged particles,126,129−136 as well as extensions and generalizations for quantum distributions.127,137,138 We refer the reader to the recent review of Ochsenfeld et al.139 for a more detailed discussion of the FMM-based methods. The details of the FMM implementation in the GAMESS program can be found in ref 140. We also point the reader to the work on matrix reformulation of the FMM.141

The approaches for accelerated time evolution (group d) are often developed in conjunction with the classical MD methods for systems in the ground electronic state. The techniques include temperature-accelerated dynamics (TAD),142−145 metadynamics,146 hyperdynamics147 and the related bond-boost method,148,149 replica-exchange molecular dynamics (REMD),150 and its combination with the temperature-enhanced sampling.151 The discrete MD method developed in a number of groups152−156 obtains notable acceleration of the standard MD calculations. We refer the reader to the existing reviews94,157 for a broader outlook on the enhanced sampling methods.

The above techniques achieve long time scales through thoroughly constructed sampling schemes that close the gap between realistic rare events and the atomistic dynamics achievable in computer simulation. It is important to emphasize that the acceleration in this group of methods is not a result of reduction or simplification of the interaction potential. The methods are limited by system size, especially if quantum calculations are to be done. Progress toward joining the two worlds (enhanced sampling techniques and linear-scaling methods for quantal calculations) has started only recently,158 and developments in this direction are expected to come out in the near future.

Finally, group (e) is represented by the methods for handling larger sizes and time scales via utilization of distributed computations and efficient parallelization strategies. Notable progress has been achieved within the last subcategory by the rise and widespread utilization of graphics processing units (GPUs). The leading groups developing computational codes and methods with GPUs in mind include the Martinez group (quantum chemistry and dynamics),159−162 the Travesset163 and Schulten164 groups (classical MD), and the Aspuru-Guzik165 and Yasuda groups166 (accelerated computation of molecular integrals in quantum calculations), to name a few.

We aim to overview the efficient techniques and strategies of such calculations for researchers studying dynamics, particularly those who are interested in quantum (charge, energy, spin, coherence) and reactive dynamics. Therefore, we pay special attention to presenting methods that have applications to the above dynamics problems. The subjects of extensive time-domain (“long time”) simulations are somewhat beyond the scope of the present review, although they are mentioned where appropriate.

While compiling this review, we found the historic method for analysis of various methodologies very fruitful. To understand the diversity of the currently available methodologies, it is often helpful to go back to their origins and study the evolution of the original ideas. The common roots of seemingly different methodologies can be found this way, and other related techniques can be identified. It is common that some of the original ideas are developed quite extensively, while others attract somewhat less interest. However, the less popular ideas still possess potential for further developments, especially in light of the changed paradigm to focus on large-scale systems. We will outline some of those ideas, which might be useful to revitalize.

When possible, we provide sufficient details of formulas and derivations. Although in some cases such derivations are rather straightforward and can be found in numerous sources, including textbooks and other reviews or original papers, we believe it is crucial to present a consistent and self-contained formulation of the discussed methods. Ideally, such details should be sufficient for the implementation of the methods in computer programs. Due to size restrictions, we will focus on the most central and important details, provide references to more elaborate discussions, and omit some intermediate derivations. One of the motivations behind this approach is to help the interested reader to bridge the gap between theory and practical implementation.

We believe that a rigorous and consistent mathematical formulation and intuitively transparent use of notation and illustrations are crucial for proper description of various theories. Finally, we minimize discussion of specific applications, and focus primarily on the theoretical formulations of the methods. Our work puts emphasis on comparative and historic analysis of different methods in one place, providing the reader with a broader outlook and new ideas for future developments of the large-scale computation methodologies.

2. BASIC GROUNDS

In this section, we present the mathematical grounds of the basic theory of molecular interactions. Although the theory is well-known to most theoretical/computational chemists, it is worth repeating here for the following reasons:

(a) Universal notation: We present a common notation for the rest of the paper and facilitate the correlation of quantities presented with those known from other resources. This is motivated by the fact that there is a wide diversity of notations used by different authors, which may be confusing.

(b) Clarity: We wish to put all readers, possibly with different levels of familiarity with the subject, on common ground. This may especially be helpful to students and junior scientists, and it might serve to refresh the knowledge of more experienced readers.

(c) Foundational knowledge: We present the basic foundations of the molecular interactions methods in order to foresee the possible bottlenecks that could limit the applicability of the methods. This section will help to guide the reader along the hierarchies of further developments in that area of research.

(d) Terminology, concepts, and mathematics: Presenting a convenient and systematic notation and basic concepts of quantum chemistry in one place facilitates discussion of the methods to follow, without the need to reintroduce variables, notation, and formulas multiple times. Because the present review covers a variety of advanced linear-scaling methods, the theory recalled in the present section has relevance to the following sections of the review.

We adopt the following labeling convention in the present section and the following chapters. The lower case Latin letters such as i, j, k, etc. correspond to general running indices, and also to molecular orbitals. The lower case Latin letters such as a, b, c, etc. correspond to localized (atomic) orbitals. The Greek indices are associated with spin components or Cartesian projections. The upper case Latin letters denote atomic species. The above reservations vary slightly depending on the context of the discussion. In the sections discussing fragment-based methods we use the lower case Latin letters a, b, c, etc. to refer to atoms, and the upper case letters to refer to fragments. In the sections describing atomic orbital (AO) and molecular orbital (MO) transformations and semiempirical methods, the lower case letters are used to denote AOs and the upper case letters are used to denote atoms.

2.1. Wave Function Theory (WFT)

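2.1.1. Wave Functions and Transformations. The simplest ab initio wave function theory (WFT) is the Hartree−Fock method. According to this approach, the wave function of a system with N electrons, Ψ(r1, ..., rN, σ1, σ2, ..., σN), is approximated by a single Slater determinant (SD), Φ0(r1, ..., rN, σ1, σ2, ..., σN):

$$\Psi(\mathbf{r}_1, \ldots, \mathbf{r}_N, \sigma_1, \sigma_2, \ldots, \sigma_N) = \Phi_0(\mathbf{r}_1, \ldots, \mathbf{r}_N, \sigma_1, \sigma_2, \ldots, \sigma_N) \tag{2.1.1}$$

$$
\begin{aligned}
\Phi_0(\mathbf{r}_1, \ldots, \mathbf{r}_N, \sigma_1, \sigma_2, \ldots, \sigma_N)
&= \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\psi_1(\mathbf{r}_1,\sigma_1) & \psi_2(\mathbf{r}_1,\sigma_1) & \ldots & \psi_N(\mathbf{r}_1,\sigma_1) \\
\psi_1(\mathbf{r}_2,\sigma_2) & \psi_2(\mathbf{r}_2,\sigma_2) & \ldots & \psi_N(\mathbf{r}_2,\sigma_2) \\
\ldots & \ldots & \ldots & \ldots \\
\psi_1(\mathbf{r}_N,\sigma_N) & \psi_2(\mathbf{r}_N,\sigma_N) & \ldots & \psi_N(\mathbf{r}_N,\sigma_N)
\end{vmatrix} \\
&= \frac{1}{\sqrt{N!}}\,\hat{A}\,[\psi_1(\mathbf{r}_1,\sigma_1)\,\psi_2(\mathbf{r}_2,\sigma_2) \ldots \psi_N(\mathbf{r}_N,\sigma_N)]
\end{aligned}
\tag{2.1.2}
$$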

where

$$\hat{A} = \sum_{P} (-1)^{[P]}\,\hat{P} \tag{2.1.3}$$

is the antisymmetrizer, P is the permutation operator, and [P] denotes the parity of the permutation P. The lower-case vector symbol ri is reserved for the position of the ith electron, and the Greek letters σi ∈ {α, β} denote the spin variable of the ith electron. The one-particle functions ψi(r,σ) are known as molecular spin−orbitals and are commonly represented as a product of spatial and spin functions:

$$\psi_i(\mathbf{r},\sigma) = \psi_i(\mathbf{r})\,\sigma \tag{2.1.4}$$
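As a small numerical illustration of eqs 2.1.2 and 2.1.3, the sketch below evaluates a Slater determinant in two ways: as a determinant and as an explicit sum over permutations with the parity factor. The one-dimensional model orbitals, the coordinate values, and all function names are assumptions chosen for the illustration; spin is not treated separately here (each coordinate is taken to stand for a combined space-spin variable).

```python
import itertools
import math

import numpy as np


def permutation_parity(perm):
    """Return [P]: 0 for an even permutation, 1 for an odd one (inversion count)."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return inversions % 2


def sd_from_antisymmetrizer(orbitals, coords):
    """(1/sqrt(N!)) * sum_P (-1)^[P] P applied to the product of orbitals."""
    n = len(orbitals)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        term = (-1.0) ** permutation_parity(perm)
        # Electron k is placed into orbital perm[k].
        for electron, orbital_index in enumerate(perm):
            term *= orbitals[orbital_index](coords[electron])
        total += term
    return total / math.sqrt(math.factorial(n))


def sd_from_determinant(orbitals, coords):
    """Same quantity from the determinant: rows are electrons, columns are orbitals."""
    n = len(orbitals)
    matrix = np.array([[orb(x) for orb in orbitals] for x in coords])
    return np.linalg.det(matrix) / math.sqrt(math.factorial(n))


# Hypothetical model orbitals and electron coordinates.
orbitals = [
    lambda x: math.exp(-x * x),
    lambda x: x * math.exp(-x * x),
    lambda x: (2.0 * x * x - 1.0) * math.exp(-x * x),
]
coords = [0.1, -0.4, 0.7]
swapped = [-0.4, 0.1, 0.7]   # electrons 1 and 2 exchanged
```

Exchanging two electrons changes the sign of the result, and placing two electrons at the same coordinate gives zero, as expected for an antisymmetric function.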

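The spin function, σ, typically takes one of the two values:

$$\sigma = \begin{cases} |\alpha\rangle \\ |\beta\rangle \end{cases} \tag{2.1.5}$$

The vectors $|\alpha\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $|\beta\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ are orthonormal:

$$\langle\alpha|\beta\rangle = (1 \;\; 0)\begin{pmatrix} 0 \\ 1 \end{pmatrix} = 0 \tag{2.1.6a}$$

$$\langle\alpha|\alpha\rangle = \langle\beta|\beta\rangle = 1 \tag{2.1.6b}$$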

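and span a two-dimensional space of the spin functions. In a more general treatment, the spin−orbitals transform to spinor functions that can be represented using the above basis as

$$\psi_i(\mathbf{r},\sigma) = \begin{pmatrix} \psi_{\alpha,i}(\mathbf{r}) \\ \psi_{\beta,i}(\mathbf{r}) \end{pmatrix} = \psi_{\alpha,i}(\mathbf{r})\,|\alpha\rangle + \psi_{\beta,i}(\mathbf{r})\,|\beta\rangle \tag{2.1.7}$$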

In the following discussion we will assume the most common representation of the spin−orbitals; ψi′(r,σ) will imply either ψi(r)α or ψi(r)β. To distinguish between the indexing of spin−orbitals and their spatial components (orbitals), we will utilize primed indices for the former and unprimed indices for the latter. The primed indices will run over a wider range and can be mapped to the unprimed indices using the pair convention: i′ = (i, σi′). For example, for a system with Nα spin-up and Nβ spin-down electrons, the index i will belong to the range [1, Nα] or to the range [1, Nβ], depending on which quantity is indexed, while the index i′ will belong to the range [1, Nα + Nβ]. The indexed quantity will be additionally indexed with the spin component. Note that when not dealing with orbitals/spin−orbitals (e.g., when enumerating electrons, atoms, excitations, etc.), the primed and unprimed indices do not take effect, unless otherwise noted.

The set of 2NM orthonormal spin−orbitals forms a Hilbert space, Ω′ = {|ψi′⟩ | i′ = 1, ..., 2NM}, with the scalar product of two vectors in this space, fi′, fj′ ∈ Ω′, defined by

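$$\langle f_{i'}|f_{j'}\rangle = \int d\mathbf{r} \int d\sigma\, f^{*}_{i'}(\mathbf{r},\sigma)\, f_{j'}(\mathbf{r},\sigma) \tag{2.1.8}$$

and with the basis vectors being orthonormal:

$$\langle\psi_{i'}|\psi_{j'}\rangle = \delta_{ij}\,\delta_{\sigma_{i'}\sigma_{j'}} \tag{2.1.9}$$

The above space, Ω′, can be factorized into two subspaces, one for each spin component:

$$\Omega' = \Omega_\alpha \otimes \Omega_\beta \tag{2.1.10a}$$

$$\Omega_\alpha = \{\,|i'\rangle = |\psi_{i'}\rangle = |\psi_{(i,\alpha)}\rangle = |\psi_{\alpha,i}\,\alpha\rangle = |\psi_{\alpha,i}\rangle|\alpha\rangle \;\; | \;\; i = 1, \ldots, N_M \} \tag{2.1.10b}$$

$$\Omega_\beta = \{\,|i'\rangle = |\psi_{i'}\rangle = |\psi_{(i,\beta)}\rangle = |\psi_{\beta,i}\,\beta\rangle = |\psi_{\beta,i}\rangle|\beta\rangle \;\; | \;\; i = 1, \ldots, N_M \} \tag{2.1.10c}$$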

To clarify the notation, the pair indexing in |ψ(i,α)⟩ denotes the i′ = (i,σi′)th (i′ ∈ [1, 2NM]) spin−orbital from the entire set |ψi′⟩ ∈ Ω′, while the double index in |ψσ,i⟩ denotes the ith orbital from the subset for a given spin component σ: |ψσ,i⟩ = |(ψσ)i⟩ = |ψσ⟩i ∈ Ωσ. We will keep various indices in the subscript, reserving the superscript for mathematical operations (inverse, transpose, etc.).

The spatial components of the spin−orbitals (the orbitals |ψα,i⟩ and |ψβ,i⟩) need not be the same for any given MO index i. All components should belong to the space of square-integrable functions (according to the basic requirements of quantum mechanics). In addition, we assume that each subset {|ψσ,i⟩ | i = 1, ..., NM} forms an orthonormal basis:

$$\langle\psi_{\sigma,i}|\psi_{\sigma,j}\rangle = \int d\mathbf{r}\, \psi^{*}_{\sigma,i}(\mathbf{r})\, \psi_{\sigma,j}(\mathbf{r}) = \delta_{ij} \tag{2.1.11}$$

which is implicitly assumed in eq 2.1.9. Any other set of linearly independent but nonorthonormal functions, {|χa⟩ | a = 1, ..., NM}, can be transformed to the orthonormal MO basis by forming appropriate linear combinations:

$$|\psi_{\sigma,i}\rangle = \sum_{a} C_{a(i,\sigma)}\,|\chi_a\rangle = \sum_{a} C_{\sigma,ai}\,|\chi_a\rangle, \qquad \sigma = \alpha, \beta \tag{2.1.12}$$

Similar to the notation defined for the vectors in eqs 2.1.10, the notation for matrices is as follows: Ca(i,σ) refers to an element from the ath row and i′ = (i, σ)th column of a single combined matrix C with dimensions NA × 2NM, while Cσ,ai refers to an element from the ath row and ith column of a distinct submatrix Cσ with dimensions NA × NM. In the following discussion we will make an association between the basis {|χa⟩} and the AOs. Equation 2.1.12 represents the central assumption of the molecular orbital as a linear combination of atomic orbitals (MO-LCAO) approach.

For practical purposes and mathematical clarity, the above transformation, as well as many others to follow, can conveniently be written in matrix notation. Defining the MO basis for a spin component σ as

$$|\psi_\sigma\rangle = (\psi_{\sigma,1} \;\; \psi_{\sigma,2} \;\; \ldots \;\; \psi_{\sigma,N_M}) \tag{2.1.13}$$

and the AO basis (spin-independent functions) as

$$|\chi\rangle = (\chi_1 \;\; \chi_2 \;\; \ldots \;\; \chi_{N_A}) \tag{2.1.14}$$

the MO-LCAO equation can be written as

$$
|\psi_\sigma\rangle = (\psi_{\sigma,1} \;\; \psi_{\sigma,2} \;\; \ldots \;\; \psi_{\sigma,N_M}) = |\chi\rangle C_\sigma
= (\chi_1 \;\; \chi_2 \;\; \ldots \;\; \chi_{N_A})
\begin{pmatrix}
C_{\sigma,11} & \ldots & C_{\sigma,1N_M} \\
C_{\sigma,21} & \ldots & C_{\sigma,2N_M} \\
\ldots & \ldots & \ldots \\
C_{\sigma,N_A1} & \ldots & C_{\sigma,N_AN_M}
\end{pmatrix}
\tag{2.1.15a}
$$

$$
(\psi^{*}_{\sigma,1} \;\; \ldots \;\; \psi^{*}_{\sigma,N_M}) = |\chi^{*}\rangle C^{*}_\sigma
= (\chi^{*}_1 \;\; \chi^{*}_2 \;\; \ldots \;\; \chi^{*}_{N_A})
\begin{pmatrix}
C^{*}_{\sigma,11} & \ldots & C^{*}_{\sigma,1N_M} \\
C^{*}_{\sigma,21} & \ldots & C^{*}_{\sigma,2N_M} \\
\ldots & \ldots & \ldots \\
C^{*}_{\sigma,N_A1} & \ldots & C^{*}_{\sigma,N_AN_M}
\end{pmatrix}
\tag{2.1.15b}
$$

where NA is the number of atomic orbitals. In general, the number of AOs is not less than that of the MOs, while both numbers are larger than the number of electrons, N. In the following discussion we will refer to the number of occupied alpha orbitals, Nocc,α, and the number of occupied beta orbitals, Nocc,β. In a general unrestricted case, when all one-electron functions are understood as spin−orbitals, the number of occupied orbitals with each spin is equal to the number of electrons with that spin, Nocc,σ = Nσ.
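The matrix form of the MO-LCAO transformation, eq 2.1.15a, can be tried out numerically. In the sketch below the AOs are three one-dimensional Gaussians sampled on a grid, and the coefficient matrix is chosen as the inverse square root of the overlap matrix (symmetric orthogonalization). This particular choice of C, the grid, the Gaussian centers, and the variable names are assumptions for illustration only; any C that satisfies the orthonormality requirement would serve.

```python
import numpy as np

# Hypothetical 1D "atomic orbitals": Gaussians centered at different points.
x = np.linspace(-10.0, 10.0, 2001)
dx = x[1] - x[0]
centers = [-1.0, 0.0, 1.5]
chi = np.stack([np.exp(-0.5 * (x - c) ** 2) for c in centers], axis=1)  # (grid, N_A)

# Overlap matrix of the nonorthonormal functions by simple quadrature.
S = chi.T @ chi * dx

# One possible coefficient matrix: C = S^(-1/2) (symmetric orthogonalization).
w, U = np.linalg.eigh(S)
C = U @ np.diag(w ** -0.5) @ U.T

# |psi> = |chi> C: a row of functions multiplied by the coefficient matrix.
psi = chi @ C
overlap_mo = psi.T @ psi * dx   # should be the unit matrix
```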

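2.1.2. Hamiltonian and Energy. The nonrelativistic field-free electronic Hamiltonian of the N-electron system is written:

$$\hat{H}_{el}(1, \ldots, N) = \sum_{i=1}^{N} \hat{h}(i) + \frac{1}{2} \sum_{\substack{i,j=1 \\ i \neq j}}^{N} \frac{1}{r_{ij}} \tag{2.1.16a}$$

$$\hat{h}(i) = -\frac{1}{2}\nabla_i^2 + V(i) \tag{2.1.16b}$$

$$V(i) = -\sum_{n=1}^{N_{nucl}} \frac{Z_n}{|\mathbf{r}_i - \mathbf{R}_n|} \tag{2.1.16c}$$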
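The two potential-energy terms of eqs 2.1.16 are simple enough to write down directly. The following sketch, in atomic units, evaluates the nuclear attraction potential of eq 2.1.16c for one electron and the pairwise electron repulsion term of eq 2.1.16a for a set of point positions; the function names and the sample geometry are hypothetical.

```python
import numpy as np


def nuclear_attraction(r, charges, positions):
    """V(i) = -sum_n Z_n / |r_i - R_n| for one electron at position r (atomic units)."""
    r = np.asarray(r, dtype=float)
    positions = np.asarray(positions, dtype=float)
    distances = np.linalg.norm(positions - r, axis=1)
    return -np.sum(np.asarray(charges, dtype=float) / distances)


def electron_repulsion(electron_positions):
    """(1/2) * sum over i != j of 1 / r_ij for a set of electron positions."""
    pos = np.asarray(electron_positions, dtype=float)
    total = 0.0
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i != j:
                total += 1.0 / np.linalg.norm(pos[i] - pos[j])
    return 0.5 * total


# Hypothetical example: two unit charges 1.4 bohr apart, electron at the midpoint.
v_mid = nuclear_attraction([0.0, 0.0, 0.7], [1.0, 1.0],
                           [[0.0, 0.0, 0.0], [0.0, 0.0, 1.4]])
# Two electrons separated by 2 bohr.
v_ee = electron_repulsion([[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]])
```

The factor of 1/2 compensates for the double counting of each pair in the unrestricted double sum.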

In these equations we utilize shorthand notation for electronic degrees of freedom: i ≡ (ri,σi) denotes both spin and coordinate variables of the ith electron. Because the nonrelativistic field-free Hamiltonian, eqs 2.1.16, does not depend on spin variables, the above notation will most often refer to spatial variables only: i ≡ (ri). In eqs 2.1.16, rij = |ri − rj| is the distance between electrons i and j,

$$-\frac{1}{2}\nabla_i^2 = -\frac{1}{2}\left(\frac{\partial^2}{\partial x_i^2} + \frac{\partial^2}{\partial y_i^2} + \frac{\partial^2}{\partial z_i^2}\right)$$

is the kinetic energy operator of the ith electron, V(i) is the external potential of nuclei acting on the ith electron, Zn and Rn are the charge and position of the nucleus n, respectively, and Nnucl is the number of nuclei in the system. In this paper we use atomic units: ℏ = 1, e = 1, 1/(4πε0) = 1, and me = 1.

The operators h(i) are the one-particle operators, describing the energy of the ith electron interacting with an external potential, but not with other electrons. The operators 1/rij are two-particle operators describing the interaction between the pair of electrons i and j.

According to the rules of quantum mechanics, the electronic energy of the system is given by the expectation value of the electronic Hamiltonian, eq 2.1.16a. In the Hartree−Fock method, the expectation value is computed using the SD wave function, eq 2.1.2. Using the fact that the SD is normalized to unity, the final expression for energy is

$$
\begin{aligned}
E_{el} &= \sum_{i'=1}^{N} \langle i'|\hat{h}|i'\rangle + \frac{1}{2}\sum_{i',j'=1}^{N} [\langle i'j'|i'j'\rangle - \langle i'j'|j'i'\rangle] \\
&= \sum_{i'=1}^{N} (i'|\hat{h}|i') + \frac{1}{2}\sum_{i',j'=1}^{N} [(i'i'|j'j') - (i'j'|j'i')] \\
&= \sum_{i'=1}^{N} h_{i'i'} + \frac{1}{2}\sum_{i',j'=1}^{N} [J_{i'j'} - K_{i'j'}]
\end{aligned}
\tag{2.1.17}
$$

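The angle brackets in eq 2.1.17 represent physicists’ notation of molecular integrals:

$$
\begin{aligned}
J_{i'j'} &\equiv \langle i'j'|i'j'\rangle \equiv \left\langle \psi_{i'}\psi_{j'} \left| \frac{1}{r_{12}} \right| \psi_{i'}\psi_{j'} \right\rangle \\
&\equiv \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\, d\sigma_1\, d\sigma_2\; \psi^{*}_{i'}(1)\,\psi^{*}_{j'}(2)\, \frac{1}{r_{12}}\, \psi_{i'}(1)\,\psi_{j'}(2)
\end{aligned}
\tag{2.1.18a}
$$

$$
\begin{aligned}
K_{i'j'} &\equiv \langle i'j'|j'i'\rangle \equiv \left\langle \psi_{i'}\psi_{j'} \left| \frac{1}{r_{12}} \right| \psi_{j'}\psi_{i'} \right\rangle \\
&\equiv \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\, d\sigma_1\, d\sigma_2\; \psi^{*}_{i'}(1)\,\psi^{*}_{j'}(2)\, \frac{1}{r_{12}}\, \psi_{j'}(1)\,\psi_{i'}(2)
\end{aligned}
\tag{2.1.18b}
$$

while parentheses are used in chemists’ notation:

$$
\begin{aligned}
J_{i'j'} &\equiv (i'i'|j'j') \equiv \left( \psi_{i'}\psi_{i'} \left| \frac{1}{r_{12}} \right| \psi_{j'}\psi_{j'} \right) \\
&\equiv \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\, d\sigma_1\, d\sigma_2\; \psi^{*}_{i'}(1)\,\psi_{i'}(1)\, \frac{1}{r_{12}}\, \psi^{*}_{j'}(2)\,\psi_{j'}(2)
\end{aligned}
\tag{2.1.19a}
$$

$$
\begin{aligned}
K_{i'j'} &\equiv (i'j'|j'i') \equiv \left( \psi_{i'}\psi_{j'} \left| \frac{1}{r_{12}} \right| \psi_{j'}\psi_{i'} \right) \\
&\equiv \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\, d\sigma_1\, d\sigma_2\; \psi^{*}_{i'}(1)\,\psi_{j'}(1)\, \frac{1}{r_{12}}\, \psi^{*}_{j'}(2)\,\psi_{i'}(2)
\end{aligned}
\tag{2.1.19b}
$$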

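The molecular integrals Ji′j′ and Ki′j′ are known as the Coulomb and exchange integrals, respectively. The Coulomb integral Ji′j′ describes the interaction between the charge densities created by the one-particle orbitals ψi′ and ψj′, as is particularly apparent in the chemists’ notation, eq 2.1.19a:

$$
\begin{aligned}
J_{i'j'} &= \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\, d\sigma_1\, d\sigma_2\; \psi^{*}_{i'}(1)\,\psi_{i'}(1)\, \frac{1}{r_{12}}\, \psi^{*}_{j'}(2)\,\psi_{j'}(2) \\
&= \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\, \frac{|\psi_{\sigma_{i'},i}(\mathbf{r}_1)|^2\, |\psi_{\sigma_{j'},j}(\mathbf{r}_2)|^2}{r_{12}} \\
&= \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\, \frac{\rho_{\sigma_{i'},i}(\mathbf{r}_1)\, \rho_{\sigma_{j'},j}(\mathbf{r}_2)}{r_{12}} \\
&= J_{ij,\sigma_{i'},\sigma_{j'}}
\end{aligned}
\tag{2.1.20}
$$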

The exchange integral, Ki′j′, does not have a straightforward classical analogue, but can be interpreted in terms of the interaction of charge density fluxes between a pair of molecular spin−orbitals.

The above definitions of the Coulomb and exchange integrals retain factors that depend on the spin components of the corresponding spin−orbitals. When integrated over spin variables, these factors yield the more common Coulomb and exchange integrals defined in spaces of orbitals, Jij and Kij:

$$J_{i'j'} = J_{ij,\sigma_{i'},\sigma_{j'}} \tag{2.1.21a}$$

$$J_{ij,\sigma_{i'},\sigma_{j'}} \equiv \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\; \psi^{*}_{\sigma_{i'},i}(1)\,\psi_{\sigma_{i'},i}(1)\, \frac{1}{r_{12}}\, \psi^{*}_{\sigma_{j'},j}(2)\,\psi_{\sigma_{j'},j}(2), \qquad \sigma_{i'}, \sigma_{j'} \in \{\alpha, \beta\} \tag{2.1.21b}$$

$$K_{i'j'} = K_{ij,\sigma_{i'},\sigma_{j'}}\, \delta_{\sigma_{i'},\sigma_{j'}} \tag{2.1.21c}$$

$$K_{ij,\sigma_{i'},\sigma_{j'}} \equiv \int\!\!\int d\mathbf{r}_1\, d\mathbf{r}_2\; \psi^{*}_{\sigma_{i'},i}(1)\,\psi_{\sigma_{j'},j}(1)\, \frac{1}{r_{12}}\, \psi^{*}_{\sigma_{j'},j}(2)\,\psi_{\sigma_{i'},i}(2) \tag{2.1.21d}$$

The one-particle molecular integrals appearing in eq 2.1.17

$$h_{i'i'} = \langle i'|\hat{h}|i'\rangle \equiv h_{ii,\sigma_{i'}} \tag{2.1.22}$$

are called core integrals.

The energy, eq 2.1.17, can be written in the basis of molecular orbitals (not spin−orbitals) in a more common form:

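$$
\begin{aligned}
E_{el} ={}& \sum_{i=1}^{N_\alpha} h_{ii,\alpha} + \sum_{i=1}^{N_\beta} h_{ii,\beta}
+ \frac{1}{2}\sum_{i=1}^{N_\alpha}\sum_{j=1}^{N_\alpha} (J_{ij,\alpha,\alpha} - K_{ij,\alpha,\alpha}) \\
&+ \frac{1}{2}\sum_{i=1}^{N_\beta}\sum_{j=1}^{N_\beta} (J_{ij,\beta,\beta} - K_{ij,\beta,\beta})
+ \frac{1}{2}\sum_{i=1}^{N_\alpha}\sum_{j=1}^{N_\beta} J_{ij,\alpha,\beta}
+ \frac{1}{2}\sum_{i=1}^{N_\beta}\sum_{j=1}^{N_\alpha} J_{ij,\beta,\alpha}
\end{aligned}
\tag{2.1.23}
$$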

In the restricted formulation, the spatial orbitals for a given pair of alpha and beta spins are the same:

$$|\psi_{\alpha,i}\rangle = |\psi_{\beta,i}\rangle = |\psi_i\rangle \tag{2.1.24}$$

and hence

$$J_{ij,\alpha,\alpha} = J_{ij,\beta,\beta} = J_{ij,\alpha,\beta} = J_{ij,\beta,\alpha} = J_{ij} \tag{2.1.25}$$

$$K_{ij,\alpha,\alpha} = K_{ij,\beta,\beta} = K_{ij} \tag{2.1.26}$$

If we assume Nα = Nβ = N/2, the expression eq 2.1.23 simplifies to the more commonly used form:

$$E_{el} = 2\sum_{i=1}^{N/2} h_{ii} + \sum_{i=1}^{N/2}\sum_{j=1}^{N/2} (2J_{ij} - K_{ij}) \tag{2.1.27}$$
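The reduction of eq 2.1.23 to eq 2.1.27 can be checked with a few lines of code. In the sketch below the core, Coulomb, and exchange integrals over occupied orbitals are assumed to be already available as small arrays; the function names and argument layout are hypothetical, and the β-α Coulomb block is taken as the transpose of the α-β block.

```python
import numpy as np


def energy_unrestricted(h_a, h_b, J_aa, K_aa, J_bb, K_bb, J_ab):
    """Eq 2.1.23. h_a, h_b: core integrals over occupied alpha/beta orbitals;
    J_ab[i, j]: Coulomb integral between alpha orbital i and beta orbital j."""
    return (np.trace(h_a) + np.trace(h_b)
            + 0.5 * np.sum(J_aa - K_aa)
            + 0.5 * np.sum(J_bb - K_bb)
            + 0.5 * np.sum(J_ab)
            + 0.5 * np.sum(J_ab.T))


def energy_restricted(h, J, K):
    """Eq 2.1.27 for N/2 doubly occupied orbitals."""
    return 2.0 * np.trace(h) + np.sum(2.0 * J - K)
```

When the same h, J, and K arrays are supplied for both spins, the two functions return the same number.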

Using eqs 2.1.19a and 2.1.19b, the energy expression, eq 2.1.17, can be transformed to a more convenient and commonly used orbital (unprimed) notation:

$$
\begin{aligned}
E_{el} ={}& \sum_{i'=1}^{N} (i'|\hat{h}|i') + \frac{1}{2}\sum_{i',j'=1}^{N} [(i'i'|j'j') - (i'j'|j'i')] \\
={}& \sum_{i'=1}^{N}\sum_{a,b=1}^{N_A} C^{*}_{ai'}C_{bi'}\,(a|\hat{h}|b)
+ \frac{1}{2}\sum_{i',j'=1}^{N}\sum_{a,b,c,d=1}^{N_A} [C^{*}_{ai'}C_{bi'}C^{*}_{cj'}C_{dj'}\,(ab|cd) - C^{*}_{ai'}C_{bj'}C^{*}_{cj'}C_{di'}\,(ab|cd)] \\
={}& \sum_{i=1}^{N_\alpha}\sum_{a,b=1}^{N_A} C^{*}_{a(i,\alpha)}C_{b(i,\alpha)}\,(a|\hat{h}|b)
+ \sum_{i=1}^{N_\beta}\sum_{a,b=1}^{N_A} C^{*}_{a(i,\beta)}C_{b(i,\beta)}\,(a|\hat{h}|b) \\
&+ \frac{1}{2}\sum_{i,j=1}^{N_\alpha}\sum_{a,b,c,d=1}^{N_A} [C^{*}_{a(i,\alpha)}C_{b(i,\alpha)}C^{*}_{c(j,\alpha)}C_{d(j,\alpha)}\,(\alpha\alpha|\alpha\alpha)(ab|cd) - C^{*}_{a(i,\alpha)}C_{b(j,\alpha)}C^{*}_{c(j,\alpha)}C_{d(i,\alpha)}\,(\alpha\alpha|\alpha\alpha)(ab|cd)] \\
&+ \frac{1}{2}\sum_{i=1}^{N_\alpha}\sum_{j=1}^{N_\beta}\sum_{a,b,c,d=1}^{N_A} [C^{*}_{a(i,\alpha)}C_{b(i,\alpha)}C^{*}_{c(j,\beta)}C_{d(j,\beta)}\,(\alpha\alpha|\beta\beta)(ab|cd) - C^{*}_{a(i,\alpha)}C_{b(j,\beta)}C^{*}_{c(j,\beta)}C_{d(i,\alpha)}\,(\alpha\beta|\beta\alpha)(ab|cd)] \\
&+ \frac{1}{2}\sum_{i=1}^{N_\beta}\sum_{j=1}^{N_\alpha}\sum_{a,b,c,d=1}^{N_A} [C^{*}_{a(i,\beta)}C_{b(i,\beta)}C^{*}_{c(j,\alpha)}C_{d(j,\alpha)}\,(\beta\beta|\alpha\alpha)(ab|cd) - C^{*}_{a(i,\beta)}C_{b(j,\alpha)}C^{*}_{c(j,\alpha)}C_{d(i,\beta)}\,(\beta\alpha|\alpha\beta)(ab|cd)] \\
&+ \frac{1}{2}\sum_{i,j=1}^{N_\beta}\sum_{a,b,c,d=1}^{N_A} [C^{*}_{a(i,\beta)}C_{b(i,\beta)}C^{*}_{c(j,\beta)}C_{d(j,\beta)}\,(\beta\beta|\beta\beta)(ab|cd) - C^{*}_{a(i,\beta)}C_{b(j,\beta)}C^{*}_{c(j,\beta)}C_{d(i,\beta)}\,(\beta\beta|\beta\beta)(ab|cd)] \\
={}& \sum_{a,b=1}^{N_A} P_{\alpha,ab}H_{ab} + \sum_{a,b=1}^{N_A} P_{\beta,ab}H_{ab} \\
&+ \frac{1}{2}\sum_{a,b,c,d=1}^{N_A} P_{\alpha,ab}\,[(P_{\alpha,cd} + P_{\beta,cd})(ab|cd) - P_{\alpha,cd}(ad|cb)] \\
&+ \frac{1}{2}\sum_{a,b,c,d=1}^{N_A} P_{\beta,ab}\,[(P_{\alpha,cd} + P_{\beta,cd})(ab|cd) - P_{\beta,cd}(ad|cb)] \\
={}& \frac{1}{2}\sum_{a,b=1}^{N_A} P_{\alpha,ab}\,[2H_{ab} + G_{\alpha,ab}] + \frac{1}{2}\sum_{a,b=1}^{N_A} P_{\beta,ab}\,[2H_{ab} + G_{\beta,ab}] \\
={}& \frac{1}{2}\sum_{a,b=1}^{N_A} P_{\alpha,ab}\,[H_{ab} + F_{\alpha,ab}] + \frac{1}{2}\sum_{a,b=1}^{N_A} P_{\beta,ab}\,[H_{ab} + F_{\beta,ab}] \\
={}& \frac{1}{2}\,\mathrm{tr}(P_\alpha(H + F_\alpha)) + \frac{1}{2}\,\mathrm{tr}(P_\beta(H + F_\beta))
\end{aligned}
\tag{2.1.28}
$$

The letters a, b, c, and d denote atomic orbitals, while indices i and j represent MOs. Here we introduce the core Hamiltonian in the AO basis, {|a⟩ ≡ a ≡ χa ≡ |χa⟩}:

$$H_{ab} = (a|\hat{h}|b) \tag{2.1.29}$$

the two-particle integral (Coulomb and exchange interactions) matrix for spin component σ, in the AO basis:

$$G_{\sigma,ab} = \sum_{c,d}^{N_A} [P_{cd}\,(ab|cd) - P_{\sigma,cd}\,(ad|cb)] \tag{2.1.30}$$

the Fock matrix for spin component σ, in the AO basis:

$$F_{\sigma,ab} = H_{ab} + G_{\sigma,ab} \;\Leftrightarrow\; F_\sigma = H + G_\sigma \tag{2.1.31}$$

and the density matrix for spin component σ, in the AO basis:

$$
\begin{aligned}
P_{\sigma,ab} &= \sum_{i=1}^{N_\sigma} C^{*}_{\sigma,ai}C_{\sigma,bi} = \sum_{i=1}^{N_A} C_{\sigma,bi}\,O_{\sigma,ii}\,C^{+}_{\sigma,ia} = (C_\sigma O_\sigma C^{+}_\sigma)_{ba} \\
&= (C_\sigma O_\sigma C^{+}_\sigma)_{ab} \;\Leftrightarrow\; P_\sigma = C_\sigma O_\sigma C^{+}_\sigma
\end{aligned}
\tag{2.1.32}
$$

and the total density matrix:

$$P = P_\alpha + P_\beta \tag{2.1.33}$$

The matrix Oσ in eq 2.1.32 is the density matrix for spin component σ (dimension Nσ × Nσ, σ ∈ {α,β}) in the MO basis, also known as the occupation number matrix. The first Nσ diagonal elements, which correspond to occupied orbitals, are set to 1:

$$
O_\sigma =
\begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & \ddots & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \ddots
\end{pmatrix}
\tag{2.1.34}
$$
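The AO-basis quantities of eqs 2.1.29−2.1.34 translate almost literally into array operations. The sketch below is an illustration under stated assumptions rather than a production implementation: real-valued coefficients are assumed, the two-electron integrals are assumed to be stored as a dense four-index array `eri[a, b, c, d]` = (ab|cd) (feasible only for very small bases), and all function names are hypothetical.

```python
import numpy as np


def density_matrix(C, n_occ):
    """P = C O C^+ with the first n_occ MOs occupied (eqs 2.1.32 and 2.1.34)."""
    n_mo = C.shape[1]
    O = np.diag((np.arange(n_mo) < n_occ).astype(float))  # occupation number matrix
    return C @ O @ C.T


def fock_matrices(H, eri, P_a, P_b):
    """F_sigma = H + G_sigma, with G_sigma from eq 2.1.30.

    eri[a, b, c, d] holds the chemists' notation integral (ab|cd).
    """
    P = P_a + P_b
    coulomb = np.einsum('cd,abcd->ab', P, eri)             # sum_cd P_cd (ab|cd)
    F_a = H + coulomb - np.einsum('cd,adcb->ab', P_a, eri)  # minus sum_cd P_a,cd (ad|cb)
    F_b = H + coulomb - np.einsum('cd,adcb->ab', P_b, eri)
    return F_a, F_b


def electronic_energy(H, P_a, P_b, F_a, F_b):
    """Last line of eq 2.1.28: E = 1/2 tr(P_a (H + F_a)) + 1/2 tr(P_b (H + F_b))."""
    return 0.5 * np.trace(P_a @ (H + F_a)) + 0.5 * np.trace(P_b @ (H + F_b))
```

For random but properly symmetric model integrals, the energy obtained this way agrees with the MO-integral expression, eq 2.1.23, which is a convenient consistency check when implementing either form.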

2.1.3. Variational Principle and the Eigenvalue Problem. To find the wave function that describes the system with the Hamiltonian given by eq 2.1.16a, the energy, eq 2.1.28, can be minimized with respect to either the expansion coefficients, Cσ, or the corresponding density matrices, Pσ. The variation must be performed with an additional constraint: the orthonormality of MOs must be preserved:

$$
\begin{aligned}
\langle\psi_{\sigma,i}|\psi_{\sigma,j}\rangle &= \sum_{a,b} C^{*}_{\sigma,ai}\,\langle\chi_a|\chi_b\rangle\, C_{\sigma,bj} \\
&= \sum_{a,b} C^{*}_{\sigma,ai}\, S_{ab}\, C_{\sigma,bj} \\
&= \sum_{a,b} C^{+}_{\sigma,ia}\, S_{ab}\, C_{\sigma,bj} \\
&= (C^{+}_\sigma S C_\sigma)_{ij} \\
&= \delta_{ij}
\end{aligned}
\tag{2.1.35a}
$$

or in matrix form:

$$C^{+}_\sigma S C_\sigma = I \tag{2.1.35b}$$

The matrix S contains the overlaps of AOs:

$$S_{ab} = \langle\chi_a|\chi_b\rangle \tag{2.1.36}$$

To find the orbitals that minimize the energy subject to the orthonormalization condition, eqs 2.1.35, one needs to minimize the extended Lagrangian:

$$L = E_{el} - \mathrm{tr}(E_\alpha(C^{+}_\alpha S C_\alpha - I)) - \mathrm{tr}(E_\beta(C^{+}_\beta S C_\beta - I)) \tag{2.1.37}$$

with respect to each of the sets of coefficients Cα and Cβ independently. The matrices Eσ contain undetermined Lagrange multipliers and, in general, may have an arbitrary structure, but eventually one often chooses a diagonal form (canonical MOs).

Omitting the details of the derivation, the result is the well-known secular (generalized eigenvalue, Hartree−Fock−Roothaan, Roothaan−Hall) equation:

$$F_\sigma C_\sigma = S C_\sigma E_\sigma \tag{2.1.38}$$

The matrix Eσ contains eigenvalues of the Fock operator for the spin channel σ, Fσ (with Fσ being the matrix representation of the Fock operator in the AO basis). These eigenvalues have a commonly accepted interpretation of the energies of the one-particle states (MOs) |ψσ,i⟩. The Fock operator plays the role of an effective one-particle Hamiltonian. The corresponding one-particle Schrödinger equation (SE) is then

$$\hat{F}_\sigma|\psi_{\sigma,i}\rangle = E_{\sigma,i}|\psi_{\sigma,i}\rangle \tag{2.1.39}$$

One can then obtain eq 2.1.38 by projecting both sides of eq 2.1.39 on the AO basis functions:

$$
\langle\chi_a|\hat{F}_\sigma|\psi_{\sigma,i}\rangle = \langle\chi_a|E_{\sigma,i}|\psi_{\sigma,i}\rangle
\;\Leftrightarrow\;
\sum_{b} \langle\chi_a|\hat{F}_\sigma|\chi_b\rangle\, C_{\sigma,bi} = \sum_{b} \langle\chi_a|\chi_b\rangle\, C_{\sigma,bi}\, E_{\sigma,i}, \qquad \forall\, i \in [1, N_M]
\tag{2.1.40}
$$
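For a fixed Fock matrix, the secular equation, eq 2.1.38, is a generalized symmetric eigenvalue problem. One common way to solve it, sketched below, is to remove the overlap matrix with a Cholesky factorization and solve an ordinary eigenvalue problem. The function name is hypothetical, real symmetric F and S are assumed, and the self-consistency loop (F depends on the orbitals through the density matrix) is deliberately left out.

```python
import numpy as np


def solve_secular_equation(F, S):
    """Solve F C = S C E for real symmetric F and positive-definite S.

    Returns the orbital energies (diagonal of E) and coefficients C that
    satisfy the orthonormality condition C^T S C = I.
    """
    L = np.linalg.cholesky(S)            # S = L L^T
    L_inv = np.linalg.inv(L)
    F_ortho = L_inv @ F @ L_inv.T        # Fock matrix in an orthonormalized basis
    energies, C_ortho = np.linalg.eigh(F_ortho)
    C = L_inv.T @ C_ortho                # back-transform to the nonorthogonal AO basis
    return energies, C
```

The returned coefficients satisfy both eq 2.1.38 and the constraint of eq 2.1.35b, which can be verified directly for any test matrices.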

2.1.4. Representation of Operators in Different Bases. In some derivations, mathematical operations may be significantly simplified by utilizing an explicit representation of an arbitrary operator, X, in an arbitrary basis {|χi⟩} in terms of the matrix elements of the operator in the basis, Xij = ⟨χi|X|χj⟩, where the ket, |χi⟩, and the bra, ⟨χj|, vectors are introduced. In this subsection we summarize some useful results related to such a representation and show some properties of the constructed operators. This math will become useful in the discussion of projectors in block-localized MO and DFT approaches, and for Huzinaga−Cantu equations.

We assume that the basis {|χi⟩} is not orthonormal, in general, so

$$\langle\chi_i|\chi_j\rangle = S_{ij} \tag{2.1.41}$$

Then one can show that the representation of the arbitrary operator X in this basis is

$$\hat{X} = \sum_{a,b} |\chi_a\rangle \Big[ \sum_{i,j} (S^{-1})_{ai}\, X_{ij}\, (S^{-1})_{jb} \Big] \langle\chi_b| \tag{2.1.42}$$

with

$$X_{ij} = \langle\chi_i|\hat{X}|\chi_j\rangle \tag{2.1.43}$$

being the matrix elements of the operator in the given basis. Indeed, computing matrix elements in a given basis, we obtain

$$
\begin{aligned}
\langle\chi_A|\hat{X}|\chi_B\rangle &= \langle\chi_A| \Big( \sum_{a,b} |\chi_a\rangle \Big[ \sum_{i,j} (S^{-1})_{ai}\, X_{ij}\, (S^{-1})_{jb} \Big] \langle\chi_b| \Big) |\chi_B\rangle \\
&= \sum_{a,b} S_{Aa}\, (S^{-1} X S^{-1})_{ab}\, S_{bB} \\
&= (S \cdot S^{-1} \cdot X \cdot S^{-1} \cdot S)_{AB} \\
&= X_{AB}
\end{aligned}
\tag{2.1.44}
$$

In particular, the identity operator is given by

$$\hat{I} = \sum_{a,b} |\chi_a\rangle \Big[ \sum_{i,j} (S^{-1})_{ai}\, S_{ij}\, (S^{-1})_{jb} \Big] \langle\chi_b| = \sum_{a,b} |\chi_a\rangle\, (S^{-1})_{ab}\, \langle\chi_b| \tag{2.1.45}$$

with the matrix elements ⟨χi|I|χj⟩ = Sij, eq 2.1.41, which reduce to the Kronecker delta symbols, δij, in an orthonormal basis.

If the basis {|χj⟩} is orthonormal (e.g., MO basis), the expression 2.1.42 simplifies:

$$\hat{X} = \sum_{a,b} |\chi_a\rangle\, X_{ab}\, \langle\chi_b| \tag{2.1.46}$$

The operator representation, eq 2.1.42, is convenient because it explicitly shows in which basis the operator is represented. A change of basis can be done using mathematical properties of ket and bra vectors (e.g., transformation eqs 2.1.15) and standard matrix transformations of the matrices representing operator elements in the given basis, eq 2.1.43.
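The representation of eq 2.1.42 is easy to verify in a finite-dimensional model. In the sketch below the nonorthogonal basis vectors are the columns of an invertible matrix `B` (so that the basis is complete in the model space), and the operator is an ordinary matrix. The names `B`, `X_op`, and `operator_from_matrix_elements` are assumptions made for the illustration.

```python
import numpy as np


def operator_from_matrix_elements(B, X):
    """Rebuild an operator from its matrix elements X_ij = <chi_i|X|chi_j>.

    The columns of B are the (generally nonorthogonal) basis vectors |chi_a>.
    Implements sum_ab |chi_a> (S^-1 X S^-1)_ab <chi_b| with S = B^T B.
    """
    S = B.T @ B
    S_inv = np.linalg.inv(S)
    return B @ S_inv @ X @ S_inv @ B.T


rng = np.random.default_rng(4)
n = 5
B = np.eye(n) + 0.3 * rng.standard_normal((n, n))   # nonorthogonal, complete basis
X_op = rng.standard_normal((n, n))
X_op = X_op + X_op.T                                 # some symmetric operator

X_elements = B.T @ X_op @ B                          # matrix elements in the basis
X_rebuilt = operator_from_matrix_elements(B, X_elements)

# The matrix elements of the identity operator are the overlaps S_ij.
identity_rebuilt = operator_from_matrix_elements(B, B.T @ B)
```

Both the original operator and the unit matrix are recovered, which shows that the two factors of the inverse overlap matrix are needed exactly because the matrix elements themselves carry one factor of S.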

2.1.5. Useful Properties. Discussions in the context of MO theory often involve a number of properties that can be computed from MOs as target quantities or as useful auxiliary properties. Below we present a brief summary of the definitions of these quantities and describe some of their properties.

(a) Charge density matrix in a general basis set {|χi⟩}:

$$D_\sigma = P_\sigma S \tag{2.1.47a}$$

$$D = D_\alpha + D_\beta \tag{2.1.47b}$$

Pσ and P are the density matrices (for each spin component and the total), eqs 2.1.32 and 2.1.33; S is the overlap matrix, eq 2.1.36 or 2.1.41. The charge density matrix represents the density of electronic charge rather than the density of electronic states, given by the density matrices Pσ and P. In an orthonormal basis, the density and charge density matrices are equivalent.

(b) Mulliken populations can appear in different projections. For instance, the orbital-resolved Mulliken populations are

$$n_i = D_{ii} \tag{2.1.48a}$$

which is the electron density on the orbital |χi⟩ (often an AO). The group-resolved Mulliken populations are the sums of the orbital-resolved populations:

$$n_A = \sum_{i \in A} n_i \tag{2.1.48b}$$

where A represents a set of orbitals of interest (group). Often, such groups consist of the atomic orbitals centered on a given atom, in which case eq 2.1.48b gives the atomic Mulliken populations.

(c) The Fermi energy, or chemical potential, EF, is one of the central quantities characterizing the energetics of electrons and their mobility in a given system. For a system of Nσ electrons of a given spin σ, the Fermi energy (thus, separate for each spin) is given by

$$E_{F,\sigma}: \quad \sum_{i=1}^{N_\sigma} g_{i,\sigma}\, f(E_{i,\sigma}; E_{F,\sigma}) = N_\sigma \tag{2.1.49a}$$

$$f(E_{i,\sigma}; E_{F,\sigma}) = \left( 1 + \exp\left( \frac{E_{i,\sigma} - E_{F,\sigma}}{\Delta E} \right) \right)^{-1} \tag{2.1.49b}$$

where f is the Fermi distribution function having width ΔE, and gi,σ f(Ei,σ; EF,σ) gives the population (number of electrons) of the gi,σ-fold degenerate energy level Ei,σ for the value of the Fermi energy EF,σ.

2.2. Density Functional Theory (DFT)

DFT was founded in 1964 by Hohenberg and Kohn (HK)167 and consists of four main postulates/assumptions:

(1) Charge density, ρ(r), is the main variable.

(2) Uniqueness theorem (mapping statement): There exists a 1-to-1 correspondence between the ground state charge density, ρ(r), and the external potential acting on it, v(r).

(3) Universal functional: There exists a universal functional, F[ρ(r)], of charge density, ρ(r), that is independent of the external potential, v(r), such that the electronic energy of the system for a given charge density can be computed by

$$E_v[\rho(\mathbf{r})] = \int v(\mathbf{r})\,\rho(\mathbf{r})\, d\mathbf{r} + F[\rho(\mathbf{r})] \tag{2.2.1}$$

(4) Variational principle: The ground state charge density is obtained by minimization of the energy functional, eq 2.2.1, with respect to the charge density, subject to conservation of the total number of electrons:

$$\int \rho(\mathbf{r})\, d\mathbf{r} = N \tag{2.2.2}$$

where N is the total number of electrons in the system.

The energy functional, eq 2.2.1, can further be elaborated by separating the classical Coulomb interaction of the electrons, J[ρ], from the universal functional:

$$F[\rho] = G[\rho] + J[\rho] \tag{2.2.3}$$

$$J[\rho] = \frac{1}{2} \int\!\!\int \frac{\rho(\mathbf{r})\,\rho(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d\mathbf{r}\, d\mathbf{r}' \tag{2.2.4}$$
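The Coulomb functional of eq 2.2.4 can be evaluated numerically for a simple model density. The sketch below assumes a spherically symmetric density on a radial grid, for which the electrostatic potential follows from the charge enclosed within radius r plus the contribution of the outer shells. The function name, the grid, and the Gaussian test density are assumptions for illustration; for a normalized Gaussian density with exponent a, the analytic value J = (1/2)(2a/π)^(1/2) is available for comparison.

```python
import numpy as np


def hartree_energy_spherical(rho, r):
    """J[rho] = 1/2 * double integral of rho(r) rho(r') / |r - r'| for a
    spherically symmetric density sampled on the radial grid r (r > 0)."""
    dr = np.diff(r)
    shell = 4.0 * np.pi * r ** 2 * rho                 # charge per unit radius
    # Charge enclosed within radius r (cumulative trapezoid rule).
    q_inside = np.concatenate(([0.0], np.cumsum(0.5 * (shell[1:] + shell[:-1]) * dr)))
    # Potential from the shells outside r: integral of 4*pi*s*rho(s) ds from r to infinity.
    outer = 4.0 * np.pi * r * rho
    segments = 0.5 * (outer[1:] + outer[:-1]) * dr
    v_outside = np.concatenate((np.cumsum(segments[::-1])[::-1], [0.0]))
    v_hartree = q_inside / r + v_outside
    integrand = shell * v_hartree
    return 0.5 * np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dr)


# Hypothetical test density: one electron in a normalized Gaussian cloud.
a = 1.0
r = np.linspace(1e-6, 12.0, 24001)
rho = (a / np.pi) ** 1.5 * np.exp(-a * r ** 2)
J_numeric = hartree_energy_spherical(rho, r)
J_exact = 0.5 * np.sqrt(2.0 * a / np.pi)
```

Note that for this one-electron density J is not zero: the spurious self-interaction contained in J[ρ] has to be cancelled by the remaining part of the universal functional.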

For simplicity here and in further discussions we will omit the dependence of the charge density on the coordinates of all electrons, keeping in mind that the variable ρ represents a function. This is emphasized by the square brackets that denote a functional dependence on the argument.

The functional G[ρ] is unique and, in general, unknown. Finding this functional is the main challenge of DFT. As a result, although the theory described above is formally exact, it has relatively little practical power on its own and critically depends on practical approximations of the universal functional, F[ρ], especially its non-Coulombic part, G[ρ]. This part contains all correlation and exchange effects.

Elaboration of the G[ρ] functional was completed in 1965 by Kohn and Sham (KS).168 According to the KS formulation, the G[ρ] functional can be split into the functional that describes the kinetic energy of noninteracting electrons, Tnint[ρ], and the remainder that describes the exchange and correlation of interacting electrons, Exc[ρ] (which, in addition, includes effects arising from the kinetic energy of the interacting electrons):

$$G[\rho] = T_{nint}[\rho] + E_{xc}[\rho] \tag{2.2.5}$$

Variation of eq 2.2.1 subject to the condition of eq 2.2.2 is equivalent to the unconstrained variation of the functional:

$$E_v[\rho] = \int v(r)\,\rho(r)\,dr + F[\rho(r)] - \mu\left[\int \rho(r)\,dr - N\right] \tag{2.2.6}$$

where μ is a Lagrange multiplier.

Using eqs 2.2.3−2.2.5, one obtains the Euler−Lagrange equation:

$$\delta E_v[\rho] = \int \delta\rho(r)\left[v(r) + \int \frac{\rho(r')}{|r - r'|}\,dr' + \frac{\delta T_{nint}}{\delta\rho} + \frac{\delta E_{xc}}{\delta\rho} - \mu\right]dr = 0 \tag{2.2.7}$$

Equation 2.2.7 can be represented in a more convenient form that corresponds to a system of noninteracting electrons moving in the field of an effective potential, $v_{eff}(r)$:

$$\delta E_v[\rho] = \int \delta\rho(r)\left[\frac{\delta T_{nint}}{\delta\rho} + v_{eff}(r) - \mu\right]dr = 0 \tag{2.2.8a}$$

or

$$\frac{\delta T_{nint}}{\delta\rho} + v_{eff}([\rho]; r) = \mu \tag{2.2.8b}$$

with

$$v_{eff}([\rho]; r) = v(r) + \int \frac{\rho(r')}{|r - r'|}\,dr' + \frac{\delta E_{xc}[\rho]}{\delta\rho} \tag{2.2.9}$$

The Schrödinger equations for the noninteracting single particles moving in the field of the effective potential, eq 2.2.9, are known as the KS equations:

$$H^{KS}\psi_i^{KS} = E_i^{KS}\psi_i^{KS} \tag{2.2.10}$$

$$H^{KS} = -\frac{1}{2}\nabla^2 + v_{eff}(r) \tag{2.2.11}$$

The effective one-electron Hamiltonian, $H^{KS}$, is known as the KS Hamiltonian. It serves the same role in DFT as the Fock operator does in WFT. The meaning of the KS orbitals, $\psi_i^{KS}$, is subject to multiple interpretations. Regardless of the interpretation, they can always be thought of as auxiliary functions that describe the density of the system of interacting particles:

$$\rho(r) = \sum_{i=1}^{N} |\psi_i^{KS}|^2 \tag{2.2.12}$$

in which the summation runs over the N lowest energy orbitals.

2.3. Limitations of WFT and DFT, and Approaches To Overcome Them

2.3.1. Scaling Law. In this section we present the sources of the computational limitations of the WF and DF theories that prevent their application to large-scale simulations, or make such simulations difficult. We then outline the general approaches to solving these problems and describe the classification of such approaches, which is adopted in further discussions in this paper.

The scaling law that characterizes the performance of most computational chemistry methods,

$$R = A \cdot N^{B} \tag{2.3.1}$$

reflects the fact that the computational resources, R (time and memory), increase as the Bth power of some measure of the system's size, N, with the proportionality prefactor A. The scaling law, eq 2.3.1, is often written in the form $O(N^B)$ to emphasize the power-law dependence, which is critically important for large-scale applications. The dependence on the prefactor A is often weak and can be reduced by various approximations, and beyond some point (the so-called break-even point), the resources required will be dominated by the size of the system, N.

The choice of the size measure, N, can vary depending on the particular method used. In molecular mechanics it is typically set to the number of atoms, $N = N_{at}$, while in WFT and DFT it is often set to the number of basis (molecular) orbitals, $N = N_{MO}$. If the MOs are represented in a basis of localized AOs, the proportionalities $N_{MO} \sim N_{AO} \sim N_{at}$ hold. If the MOs are represented in a basis of plane waves (PW), the number of PW orbitals, $N_{PW}$, is determined by the maximal kinetic energy,


$E_{cut}$, of the plane waves and by the volume of the system, V: $N_{MO} \sim N_{PW} \sim V E_{cut}^{2}$, and is formally independent of the number of atoms in the system. In many cases, the volume of the simulation cell is proportional to the number of atoms, making the base linearly dependent on the number of atoms in the system as well: $N_{MO} \sim N_{PW} \sim N_{at} E_{cut}^{2}$. Often, not all MOs are required in computations, but only the occupied and, perhaps, some small fraction of the nearby unoccupied MOs. In this case the base may be reduced, $N = N_{MO(occ)}$, which can be particularly important as the size of the system increases.

2.3.2. Classification of Approximations. Depending on the specific choice of N, other proportionality factors can be absorbed into the prefactor A. The efforts to minimize the base parameter, N, or the prefactor, A, can be classified as intensive, because they typically do not take advantage of the opportunities given by the size of the system. These approaches can speed up calculations, but only up to a certain extent, after which the calculations become prohibitive with the current state of computational technologies.

The main challenge of most computational chemistry methods applied to large-scale systems is in reducing the exponent, B, to produce linear (B = 1) or sublinear (B < 1) scaling methods. The desired techniques would take advantage of the opportunities provided by the system size. Therefore, these methods can be classified into a group of extensive approximations. The simplest hierarchy of extensive approximations is the following: correlated WFT methods (B > 4), Hartree−Fock (B = 4), DFT (with pure functionals) and semiempirical methods (B = 3), force fields with full electrostatics (FF, B = 2), and linear-scaling methods (B = 1).

Methods that minimize A, N, and B can be classified according to the nature of the approximations and techniques used. In many cases the approximations are based on the use of computationally efficient alternatives for some of the quantities involved in calculations. For example, certain integrals may be approximated by computationally inexpensive functions that behave similarly to the integrals of interest. The methods can use various distance cutoffs or value thresholds to neglect some of the terms. Purely mathematical or programming tricks can be used to accelerate some calculations. For example, efficient schemes can be used to accelerate the convergence of self-consistent field iterations. Numerous partitioning methods (such as divide-and-conquer, subsystem DFT, or fragment MO) can also be used. Unlike the previous methods, which minimize the parameters A and N, the latter are used to minimize the parameter B. Despite the differences in the ways in which these methods affect the scaling law, eq 2.3.1, none of them alters the physics or the level of description of the interactions. In other words, they introduce minimal approximations regarding the physics of the interactions. We classify these techniques as computationally motivated.

The complementary group of approximations can be classified as physically motivated. The approximations in this group are mostly driven by more fundamental assumptions, which are intended to maintain an accurate account of the physics of the process, but describe it in a different way. Often, this alternative way of representing the physics uses more computationally efficient formulations. The best example of physically motivated approximations is DFT, which appeared as an inexpensive alternative to correlated WFTs. Another example of physically motivated models is the transformation from the ab initio Hartree−Fock theory to semiempirical methods. Finally, one can think of the model Hamiltonian approaches as yet another level of physically motivated models that attempt to mimic more sophisticated all-atomic orbital-based models.

The two groups of approximations, computationally and physically motivated, can both lead to a substantial acceleration of computations, but through different means. Often, the two types of approximations are present in an interconnected way, making it difficult to assign a method to one particular group. For example, the formulations of semiempirical methods make use of both computationally motivated approximations, such as the Nishimoto−Mataga or Ohno formulas to represent two-electron integrals or the neglect of orbital overlaps, and physically motivated approximations, such as the introduction of suitable empirical functions to describe nuclear−nuclear or electron−nuclear interactions, or the use of ionization potentials and electron affinities to approximate certain core matrix elements.
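The role of the exponent and the prefactor in the scaling law, eq 2.3.1, can be made concrete with a small sketch. The function names and all numerical prefactors below are hypothetical and chosen only for illustration; they are not timings of any real code. The sketch estimates B from two measurements and locates the break-even size at which a low-exponent method with a large prefactor becomes cheaper than a high-exponent method with a small one.

```python
import math


def cost(prefactor, size, exponent):
    """Resource estimate R = A * N**B (eq 2.3.1)."""
    return prefactor * size ** exponent


def fit_exponent(n1, r1, n2, r2):
    """Estimate B from two (size, cost) measurements, assuming R = A * N**B."""
    return math.log(r2 / r1) / math.log(n2 / n1)


def break_even_size(a_low, b_low, a_high, b_high):
    """Size N at which a_low * N**b_low equals a_high * N**b_high.

    Intended for b_low < b_high and a_low > a_high, i.e. a better-scaling
    method that carries a larger prefactor.
    """
    return (a_low / a_high) ** (1.0 / (b_high - b_low))


if __name__ == "__main__":
    # Hypothetical timings: doubling the size raises the cost eightfold.
    print("fitted exponent:", fit_exponent(100, 2.0, 200, 16.0))
    # Hypothetical prefactors for a linear-scaling and a cubic method.
    n_star = break_even_size(1e-1, 1, 1e-6, 3)
    print("break-even size:", n_star)
    print("costs at break-even:", cost(1e-1, n_star, 1), cost(1e-6, n_star, 3))
```

Below the break-even size the cubic method of this toy example is cheaper despite its worse exponent, which is why reducing B pays off only for sufficiently large systems.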

2.3.3. Sources of Performance Bottlenecks. In the simplest quantum mechanical treatment of the electronic structure, the Hartree−Fock technique, the scaling is quartic in the number of orbitals, $O(N_{orb}^4)$. This scaling arises at the Fock matrix formation step because it is necessary to calculate all of the four-center, two-electron (Coulomb and exchange) integrals, eqs 2.1.18. The scaling of the higher order correlated methods is even more unfavorable, restricting the size of the systems to a few tens of atoms at most; examples include the Møller−Plesset perturbation theory (MP2/MP4), configuration interaction (CI), coupled cluster (CC), and complete active space SCF (CASSCF). In the present review we do not discuss approaches to improving the scalability of those methods. Rather, we will only discuss the methodologies that scale better than the Hartree−Fock $O(N_{orb}^4)$ method.

The next level of scaling, $O(N_{orb}^3)$, is achieved in the semiempirical WFT and DFT (with pure functionals). Both techniques avoid the computation of all or part of the exchange and Coulomb integrals that enter the Fock matrix. In the semiempirical branch this is achieved by neglecting the terms involving the products of the atomic orbitals localized on different atoms, as in the complete neglect of differential overlap (CNDO) technique. Alternatively, in the neglect of diatomic differential overlap (NDDO) approximation, only those exchange integrals that involve products of the orbitals centered on two atoms (or one atom) are retained. In KS-DFT, the exchange integrals are effectively represented by an exchange functional that depends only on the electronic density. In addition, the correlation absent at the HF level can also be introduced in a similar manner, via the correlation functional. The cubic scaling arises for two reasons: (a) the necessity of computing the density matrix, eqs 2.1.32 and 2.1.33, which involves dense matrix multiplications; (b) the diagonalization step. Although finding only the eigenvalues of the N × N matrix may be achieved in linear time, the eigenvectors (MOs) are typically required, which makes the algorithm scale as $O(N^3)$.

The next level of scaling, $O(N^2)$, can typically be achieved in classical force field/molecular mechanics methods. This scaling arises from pairwise long-range (Coulombic) interactions, while the so-called bonded (bonds, angles, torsions, dihedrals) and short-range nonbonded (vdW) interactions can be computed in linear time, O(N). The quadratic scaling can be reduced to linear, O(N), or quasi-linear, O(N log N), by utilizing efficient summation techniques (Ewald, particle-mesh Ewald, etc.) for long-range electrostatic interactions, especially in periodic systems. Therefore, the standard (nonpolarizable) force field methods show the promising quasi-linear scaling O(N) (often


O(N log N)), making it possible to study huge atomistic models comprising several million atoms with only modest requirements for memory and CPU time.

The MM methodologies become more complicated when reactive many-body potentials (such as Sutton−Chen, MEAM, Tersoff−Brenner, ReaxFF, etc.) are used or polarization is included. For example, potentials such as MEAM involve nonseparable nonpairwise terms and are to some extent similar to density functional methods, although they do not require diagonalization/electron density optimization. One of the simplest ways of treating the polarization of classical atoms is provided by charge equilibration (QEq) schemes. Such schemes require the solution of a system of linear equations, which introduces cubic scaling, $O(N^3)$. Although N in this case is notably smaller than $N_{orb}$, the complexity is still cubic, comparable to that obtained from DFT and semiempirical methods.

Finally, we arrive at a group of methods that scale sublinearly with the number of atoms. A variety of coarse-grained (CG) force fields (CG-FF) have been developed, primarily in the biophysical community. These methods treat groups of atoms as effective particles, parametrized to thermodynamic and structural data that are obtained either from all-atomic methods or experimentally. CG-FFs relate to all-atomic FFs (AA-FF) in the same way that the fragment MO (FMO) and subsystem DFT methods relate to the standard MO and KS-DFT methods, respectively.

Alternatively, in the mapping approaches, smaller subsets of effective collective coordinates can be found that describe processes of interest in large-scale systems. In the field of classical MM simulations this is known, for example, as the diffusion map MD. This approach effectively reduces the dimensionality of the problem at hand, leading again to sublinear scaling. In the field of quantum dynamics this approach has been widely used in the framework of model Hamiltonians.

Many of the topics mentioned above have been extensively reviewed and discussed in detail previously. As we stated in section 1.4, this review will discuss only some of the promising techniques for computing electronic structure and reactive interactions in large-scale systems. Our review is motivated by the desire to understand the potential of these methods for dynamical problems, including the reactivity on the ground state PES and the nonadiabatic quantum dynamics on the manifold of excited state PESs.
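The contrast between quadratic and linear force field terms discussed in section 2.3.3 can be sketched by counting pair evaluations. The example below is a toy model under stated assumptions: atoms sit on a simple cubic lattice with unit spacing and open boundaries, and the neighbor search enumerates lattice offsets, which here stands in for a cell-list search. The function names are hypothetical.

```python
import itertools


def all_pairs(n_atoms):
    """Number of pair terms in a direct long-range (Coulomb) sum: O(N^2)."""
    return n_atoms * (n_atoms - 1) // 2


def cutoff_pairs(side, cutoff):
    """Count pairs closer than `cutoff` on a side**3 cubic lattice (spacing 1).

    Each site inspects only a fixed set of neighbor offsets, so the work
    grows linearly with the number of atoms, as for short-range terms.
    """
    reach = int(cutoff)
    # Keep one offset from every (v, -v) couple so each pair is counted once.
    offsets = [
        (dx, dy, dz)
        for dx in range(-reach, reach + 1)
        for dy in range(-reach, reach + 1)
        for dz in range(-reach, reach + 1)
        if (dx, dy, dz) > (0, 0, 0)
        and dx * dx + dy * dy + dz * dz <= cutoff * cutoff
    ]
    count = 0
    for x, y, z in itertools.product(range(side), repeat=3):
        for dx, dy, dz in offsets:
            if (0 <= x + dx < side and 0 <= y + dy < side
                    and 0 <= z + dz < side):
                count += 1
    return count


if __name__ == "__main__":
    for side in (4, 8, 12):
        n = side ** 3
        print(n, all_pairs(n), cutoff_pairs(side, 1.5))
```

In this model the number of short-range pairs per atom stays bounded (below 9 for a cutoff of 1.5 lattice spacings), whereas the number of direct long-range pairs per atom grows with the system size.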

3. PHYSICALLY MOTIVATED APPROXIMATIONS

This group appeared relatively early in the development of quantum chemical methods. Some methods were invented very early, as the most natural and logical ways of reducing the computational costs of higher-level approaches. Other methods were developed relatively recently, although the basis for their rise had been forged in earlier developments. Some methods in this group were developed as nonquantum methods, relatively independently of the evolution of the mainstream quantum chemistry methods. We list them in this section as well, because they can be derived from the high-level quantum approaches by a sequence of appropriate approximations, and because they show great promise for large-scale computations. In contrast to the computationally motivated approaches, which utilize mathematical advantages of method reformulation (although often guided by chemical and physical concepts), the group of methods discussed in this section starts from the physical interpretation first and often relies on less rigorous mathematical or physical grounds, choosing one of them depending on a desired degree of acceleration.

3.1. Semiempirical MO Methods

This group of methods appeared first, in order to circumvent the quartic scaling of the Hartree−Fock method. The main assumption that makes the semiempirical methods efficient is the neglect of a large portion of the molecular integrals from eq 2.1.30 or their approximation with more computationally efficient formulas. Typically, the calculation of the Fock matrix scales quadratically, due to the necessity to account for the pairwise electrostatic interactions of the Coulomb type. In the tight-binding methods without electrostatic effects, such as the extended Hückel theory (EHT) or the density functional tight-binding (DFTB), formation of the Fock (or effective one-electron Hamiltonian) matrix can be linear, due to the short-range nature of interactions.

The purpose of this section is to summarize some of the central steps in the development of the semiempirical methods, minimizing discussion of the details. Such discussion can be found in the dedicated reviews on the semiempirical methods.62 Rather, we demonstrate the evolution of the semiempirical methods and discuss their current status. In certain cases, when reviews are lacking (such as on the EHT), a slightly more extended discussion will be presented.

3.1.1. CNDO and CNDO/2 Methods. 3.1.1.1. Basic Formulation of the CNDO Method. The complete neglect of differential overlap (CNDO) method, developed by Pople and co-workers,169,170 was one of the first approaches derived directly from the HF theory. The method aimed to reduce the number of two-electron integrals in the Fock matrix, and to preserve the rotational invariance of the resulting approximate Fock operator. The main assumptions of the method can be summarized as follows:

(1) Neglect of differential overlap is the general (main) approximation:

$$S_{ab} = \delta_{ab} \;\Leftrightarrow\; C^{+}C = I \tag{3.1.1}$$

Under this assumption, some Hamiltonian matrix elements become zero (see next), and the generalized eigenvalue problem, eq 2.1.38, becomes the standard eigenvalue problem:

$$FC = CE \tag{3.1.2}$$

(2) Approximations regarding the two-electron integrals are the following. Because of eq 3.1.1, all two-electron integrals (ab|cd) are zero, unless a = b and c = d:

$$(ab|cd) = \delta_{ab}\delta_{cd}\,(aa|cc) = \begin{cases} (aa|cc) \neq 0, & a = b,\ c = d \\ 0, & a \neq b \ \text{or}\ c \neq d \end{cases} \tag{3.1.3}$$

The nonzero two-electron integrals do not depend on orbital type, to preserve rotational invariance of the Hamiltonian:

$$(aa|bb) = \gamma_{AB}, \quad a \in A,\ b \in B \tag{3.1.4}$$

where A can be the same as B. Here, and in the following discussion, capital letters denote atom labels, and the notation a ∈ A implies that the AO a is centered on the atom A. For practical reasons, the integral is evaluated in the STO (or STO-nGTO) basis using s-orbitals of the valence shell.

(3) Approximations regarding the one-electron integrals involve diagonal elements of the core Hamiltonian. Because of


eq 3.1.1, the one-electron integrals $(a|V_B|c)$ will be nonzero only if a = c:

$$(a|V_B|c) = \begin{cases} (a|V_B|a) \equiv V_{AB}, & a \in A,\ A \neq B \\ 0, & a \neq c \end{cases} \tag{3.1.5}$$

Note that the element $(a|V_A|a)$, a ∈ A, is implicitly included in the parametrization of the core Hamiltonian.

(4) Off-diagonal matrix elements of the core Hamiltonian are proportional to the AO overlaps, similar to the EHT prescription:

$$H_{ab} = \beta^{0}_{AB} S_{ab}, \quad a \neq b,\ a \in A,\ b \in B \tag{3.1.6}$$

Note that the off-diagonal elements of the overlap matrix are computed only for this purpose; elsewhere they are assumed to be zero to simplify the eigenvalue problem.

3.1.1.2. Explicit Expression of the Fock Matrix Elements. Using the approximations above, the Fock matrix elements can be written as

$$\begin{aligned}
F_{ab,\sigma} &= H_{ab} + G_{ab,\sigma} \\
&= H_{ab} + \sum_{c,d}\left[P_{cd}\,(ab|cd) - P_{cd,\sigma}\,(ad|bc)\right] \\
&= H_{ab} + \sum_{c,d}\left[P_{cd}\,\delta_{ab}\delta_{cd}\,(ab|cd) - P_{cd,\sigma}\,\delta_{ad}\delta_{bc}\,(ad|bc)\right] \\
&= H_{ab} + \delta_{ab}\Big(\sum_{c} P_{cc}\,(aa|cc)\Big) - P_{ba,\sigma}\,(aa|bb) \\
&= H_{ab} + \delta_{ab}\Big(\sum_{C}\sum_{c \in C} P_{cc}\,(aa|cc)\Big) - P_{ab,\sigma}\,(aa|bb) \\
&= H_{ab} + \delta_{ab}\sum_{C} (aa|cc) \sum_{c \in C} P_{cc} - P_{ab,\sigma}\,(aa|bb) \\
&= H_{ab} + \delta_{ab}\sum_{C} P_{C}\,\gamma_{AC} - P_{ab,\sigma}\,\gamma_{AB}
\end{aligned} \tag{3.1.7}$$

or, with a change of indexing:

$$F_{ab,\sigma} = H_{ab} + \delta_{ab}\sum_{B} P_{B}\,\gamma_{AB} - P_{ab,\sigma}\,\gamma_{AB} \tag{3.1.8}$$

where

$$P_{B} = \sum_{b \in B} P_{bb} \tag{3.1.9}$$

is the total electron density on atom B.

Equation 3.1.8 can be elaborated further:

$$F_{ab,\sigma} = \begin{cases} H_{aa} + (P_{A} - P_{aa,\sigma})\,\gamma_{AA} + \sum_{B \neq A} P_{B}\,\gamma_{AB}, & a = b,\ a \in A \\ H_{ab} - P_{ab,\sigma}\,\gamma_{AB}, & a \neq b,\ a \in A,\ b \in B \end{cases} \tag{3.1.10}$$
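A minimal sketch of how eqs 3.1.9 and 3.1.10 translate into code is given below. It assumes that the core Hamiltonian, the total and spin-resolved density matrices, and the table of γ integrals are already available as nested lists; the function name, the argument layout, and the two-orbital test values are hypothetical and serve only to exercise the formulas.

```python
def cndo_fock(h_core, p_total, p_sigma, gamma, atom_of):
    """Spin-resolved CNDO Fock matrix following eqs 3.1.9 and 3.1.10.

    h_core[a][b]  : core Hamiltonian H_ab
    p_total[a][b] : total density matrix P_ab
    p_sigma[a][b] : density matrix of spin sigma, P_ab,sigma
    gamma[A][B]   : two-electron integrals gamma_AB between atoms A and B
    atom_of[a]    : index of the atom on which AO a is centered
    """
    n_orb = len(h_core)
    n_atoms = len(gamma)

    # Eq 3.1.9: total electron density on each atom.
    p_atom = [0.0] * n_atoms
    for a in range(n_orb):
        p_atom[atom_of[a]] += p_total[a][a]

    fock = [[0.0] * n_orb for _ in range(n_orb)]
    for a in range(n_orb):
        atom_a = atom_of[a]
        for b in range(n_orb):
            atom_b = atom_of[b]
            if a == b:
                # Sum over all atoms, including A itself; this equals
                # (P_A - P_aa,sigma) gamma_AA + sum_{B != A} P_B gamma_AB.
                coulomb = sum(p_atom[c] * gamma[atom_a][c]
                              for c in range(n_atoms))
                fock[a][a] = (h_core[a][a] + coulomb
                              - p_sigma[a][a] * gamma[atom_a][atom_a])
            else:
                fock[a][b] = h_core[a][b] - p_sigma[a][b] * gamma[atom_a][atom_b]
    return fock


if __name__ == "__main__":
    # Hypothetical two-atom, two-orbital model with a doubly occupied
    # bonding orbital; all numbers are illustrative, in arbitrary units.
    h = [[-1.0, -0.5], [-0.5, -1.0]]
    p = [[1.0, 1.0], [1.0, 1.0]]
    p_alpha = [[0.5, 0.5], [0.5, 0.5]]
    g = [[0.8, 0.5], [0.5, 0.8]]
    print(cndo_fock(h, p, p_alpha, g, atom_of=[0, 1]))
```

The diagonal branch uses the form of eq 3.1.8 with a = b, which is algebraically identical to the first case of eq 3.1.10.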

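3.1.1.3. Explicit Expression for the Core Hamiltonian Matrix Elements. The explicit expression for the core Hamiltonian matrix elements is

$$H_{ab} = \Big(a\Big| -\frac{1}{2}\nabla^2 - \sum_{C} V_{C}(r,R) \Big|b\Big) = \begin{cases} U_{ab} - \sum_{B \neq A} V_{AB}, & a \in A,\ b \in A \\ \beta^{0}_{AB} S_{ab}, & a \in A,\ b \in B,\ A \neq B \end{cases} \tag{3.1.11}$$

where

$$U_{ab} \equiv \Big(a\Big| -\frac{1}{2}\nabla^2 - V_{A}(r,R) \Big|b\Big) = \begin{cases} U_{aa}, & a = b \ \text{(parametrized)} \\ 0, & a \neq b \end{cases} \tag{3.1.12}$$

$$V_{AB} \equiv (a|V_{B}(r,R)|b) = \begin{cases} (a|V_{B}(r,R)|a), & a = b,\ a \in A \\ 0, & a \neq b \end{cases} \tag{3.1.13}$$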

3.1.1.4. Some Technicalities. Several technical details complete the formulation.

(1) The electron repulsion integrals (ERI) of type $\gamma_{AB}$ are computed explicitly:

$$\gamma_{AB} = (ns_A\, ns_A | ns_B\, ns_B) = \iint ns_A^2(1)\,\frac{1}{|r_1 - r_2|}\,ns_B^2(2)\,dr_1\,dr_2 \tag{3.1.14}$$

To preserve rotational invariance, the integrals are computed using s-type orbitals of the valence shell given by STOs with parametrized exponents:

$$ns(r) \sim \exp(-\xi r) \tag{3.1.15}$$

$$\xi = \frac{Z'}{n} \tag{3.1.16}$$

where n is the principal quantum number of the valence shell and Z′ is a parameter (with the physical meaning of the screened core charge, but numerically not necessarily equal to it).

(2) The nuclear attraction integrals (NAI) of type $(a|V_B(r,R)|a)$, a ∈ A, are evaluated explicitly, using valence s-orbitals, as in eq 3.1.15:

$$(a|V_B(r,R)|a) = \int ns_A^2(1)\,\frac{Z_B}{|r_1 - R_B|}\,dr_1 \tag{3.1.17}$$

(3) The core integrals $U_{aa}$ are parametrized from the IP and EA energies and the ERIs, eq 3.1.14. For example, one obtains for the $2s^m2p^n$ configuration:

$$U_{2s,2s} = -I_s(X, 2s^m2p^n) - (Z_{eff,X} - 1)\,\gamma_{XX} \tag{3.1.18a}$$

$$U_{2p,2p} = -I_p(X, 2s^m2p^n) - (Z_{eff,X} - 1)\,\gamma_{XX} \tag{3.1.18b}$$

The parameters $I_s(X, 2s^m2p^n)$ and $I_p(X, 2s^m2p^n)$ can be taken from atomic data (experimental or computational) or can be adjustable.

(4) The quantities $\beta^0_{AB}$ are averages of the single-atom parameters $\beta^0_X$:

$$\beta^0_{AB} = \frac{1}{2}\left(\beta^0_A + \beta^0_B\right) \tag{3.1.19}$$

(5) Note that so far we have introduced three types of Z:

(a) Z is the nuclear charge. It is used in computation of the NAI, eq 3.1.17, and the nuclear−nuclear interaction energy.


(b) $Z_{eff} = m + n$ is the number of valence electrons in the $2s^m2p^n$ configuration. This quantity plays the role of the effective core charge (but in a crude way, for example, not obtained from the Slater rules). It is used in the core parameter evaluation, eqs 3.1.18.

(c) Z′ is the effective core charge, more closely related to the Slater rules and nuclear charge screening. Solving the hydrogen-like problem with this effective charge gives the exponent of the resulting STOs. It is used in the NAI and ERI evaluations as the parameter entering the AO exponents.

3.1.1.5. CNDO/2 Method. The original CNDO approach was valuable for the calculation of energies and electronic structure, but it failed notably in predicting good nuclear geometries. The method has been significantly improved to yield its second version, abbreviated CNDO/2.171 In addition to the approximations made by CNDO, CNDO/2 relies on the following two approximations:

(1) Neglect of the so-called "penetration" term: To derive this correction, one can start with the CNDO Fock matrix and regroup the terms in the following way:

$$\begin{aligned}
F_{aa,\sigma} &= H_{aa} + (P_{A} - P_{aa,\sigma})\,\gamma_{AA} + \sum_{B \neq A} P_{B}\,\gamma_{AB} \\
&= U_{aa} + (P_{A} - P_{aa,\sigma})\,\gamma_{AA} + \sum_{B \neq A} P_{B}\,\gamma_{AB} - \sum_{B \neq A} V_{AB} \\
&= U_{aa} + (P_{A} - P_{aa,\sigma})\,\gamma_{AA} + \sum_{B \neq A} (P_{B} - Z_{eff,B})\,\gamma_{AB} \\
&\quad - \sum_{B \neq A} (V_{AB} - Z_{eff,B}\,\gamma_{AB})
\end{aligned} \tag{3.1.20}$$

The last sum, $\sum_{B \neq A}(V_{AB} - Z_{eff,B}\,\gamma_{AB})$, has the meaning of "penetration" energy, although there is no strict justification. According to the approximation, this term is neglected. Effectively, the approximation is equivalent to the following:

$$V_{AB} = Z_{eff,B}\,\gamma_{AB} \tag{3.1.21}$$

(2) Utilization of a different strategy for parametrization of the core Hamiltonian, using both IP and EA:

$$-I_a = U_{aa} + (Z_{eff,A} - 1)\,\gamma_{AA} \tag{3.1.22a}$$

$$-A_a = U_{aa} + Z_{eff,A}\,\gamma_{AA} \tag{3.1.22b}$$

so that

$$U_{aa} = -\frac{1}{2}(I_a + A_a) - \left(Z_{eff,A} - \frac{1}{2}\right)\gamma_{AA} \tag{3.1.23}$$
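The CNDO/2 parametrization of the core integral, eq 3.1.23, is the average of eqs 3.1.22a and 3.1.22b, and a short sketch makes this explicit. The function names are hypothetical, and the numerical inputs are illustrative values in atomic units rather than parameters of any real element.

```python
def cndo2_core_integral(ip, ea, z_eff, gamma_aa):
    """Core integral U_aa from eq 3.1.23."""
    return -0.5 * (ip + ea) - (z_eff - 0.5) * gamma_aa


def parametrization_residuals(ip, ea, z_eff, gamma_aa):
    """Residuals of eqs 3.1.22a and 3.1.22b for U_aa taken from eq 3.1.23.

    Because eq 3.1.23 averages the two conditions, the residuals are equal
    in magnitude and opposite in sign; both vanish when ip - ea = gamma_aa.
    """
    u_aa = cndo2_core_integral(ip, ea, z_eff, gamma_aa)
    res_ip = -ip - (u_aa + (z_eff - 1.0) * gamma_aa)  # eq 3.1.22a
    res_ea = -ea - (u_aa + z_eff * gamma_aa)          # eq 3.1.22b
    return res_ip, res_ea


if __name__ == "__main__":
    # Illustrative (not element-specific) values in atomic units.
    print(cndo2_core_integral(ip=0.5, ea=0.1, z_eff=4, gamma_aa=0.4))
    print(parametrization_residuals(ip=0.5, ea=0.1, z_eff=4, gamma_aa=0.4))
    print(parametrization_residuals(ip=0.5, ea=0.1, z_eff=4, gamma_aa=0.3))
```

The residuals show that the two conditions can be satisfied simultaneously only when the difference between IP and EA equals the one-center integral; otherwise eq 3.1.23 splits the mismatch evenly between them.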

3.1.2. INDO and NDDO Methods. 3.1.2.1. Basic Formulation. Limited applicability to spin-polarized calculations has been the main difficulty of the CNDO and CNDO/2 models. Because the exchange integrals are completely neglected by the ZDO approximation, the energies of the singlet and triplet configurations are the same. The difficulty can be avoided in the improved approximation schemes: intermediate neglect of differential overlap (INDO)172 and neglect of diatomic differential overlap (NDDO).169 The approaches utilize most of the approximations made in the CNDO/2 method, but they retain a certain fraction of exchange-type integrals.

In the NDDO approximation, the four-center integrals (ab|cd) are neglected except when the orbitals a and b are centered on the same atom A, and the orbitals c and d are centered on the atom B (which can be A):

$$(ab|cd) = \delta_{[a],[b]}\,\delta_{[c],[d]}\,(ab|cd) \tag{3.1.24}$$

where [a] denotes the index of the atom on which the AO $\chi_a$ is centered.

The Fock matrix can be written with this approximation as

$$\begin{aligned}
F_{ab,\sigma} &= H_{ab} + G_{ab,\sigma} \\
&= H_{ab} + \sum_{c,d}\left[P_{cd}\,(ab|cd) - P_{cd,\sigma}\,(ad|bc)\right] \\
&= H_{ab} + \sum_{c,d}\left[P_{cd}\,\delta_{[a],[b]}\delta_{[c],[d]}\,(ab|cd) - P_{cd,\sigma}\,\delta_{[a],[d]}\delta_{[b],[c]}\,(ad|bc)\right] \\
&= H_{ab} + \delta_{[a],[b]}\sum_{B}\sum_{c,d \in B} P_{cd}\,(ab|cd) - \sum_{\substack{c \in [b] \\ d \in [a]}} P_{cd,\sigma}\,(ad|bc)
\end{aligned} \tag{3.1.25}$$

or in the more explicit form:

$$F_{ab,\sigma} = H_{ab} + \sum_{B}\sum_{c,d \in B} P_{cd}\,(ab|cd) - \sum_{c,d \in A} P_{cd,\sigma}\,(ad|bc), \quad a, b \in A \tag{3.1.26a}$$

$$F_{ab,\sigma} = H_{ab} - \sum_{\substack{c \in B \\ d \in A}} P_{cd,\sigma}\,(ad|bc), \quad a \in A,\ b \in B,\ A \neq B \tag{3.1.26b}$$

with the core Hamiltonian, $H_{ab}$, given by eq 3.1.11. In contrast to CNDO and CNDO/2, not only the diagonal elements, but all elements $U_{ab}$ and $V_{AB}$ in eqs 3.1.12 and 3.1.13 are computed explicitly (or parametrized). This is because the overlap $\chi_a^{*}\chi_b$ is not neglected.

The INDO approximation is essentially the same as CNDO/2, with the exception that all one-center exchange-type integrals are retained. The approximation of eq 3.1.3 is modified:

$$(ab|cd) = \delta_{[a],[c]}\,\delta_{[a],[b]}\,\delta_{[c],[d]}\,(ab|cd) \tag{3.1.27}$$

The Fock matrix can be written as

$$F_{aa,\sigma} = U_{aa} + \sum_{B \neq A} (P_{BB} - Z_{B})\,\gamma_{AB} + \sum_{c,d \in A}\left[P_{cd}\,(aa|cd) - P_{cd,\sigma}\,(ac|ad)\right], \quad a \in A \tag{3.1.28a}$$

$$F_{ab,\sigma} = U_{ab} + \sum_{c,d \in A}\left[P_{cd}\,(ab|cd) - P_{cd,\sigma}\,(ac|bd)\right], \quad a, b \in A,\ a \neq b \tag{3.1.28b}$$

$$F_{ab,\sigma} = \beta^{0}_{AB} S_{ab} - P_{ab,\sigma}\,\gamma_{AB}, \quad a \in A,\ b \in B,\ A \neq B \tag{3.1.28c}$$
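The savings offered by the three neglect schemes can be compared by counting how many two-electron integrals (ab|cd) each one retains. The sketch below is a rough illustration under stated assumptions: it counts index quadruples without exploiting permutational symmetry, and its INDO rule (the CNDO integrals plus every one-center integral) is an upper-bound reading of the statement that one-center exchange-type integrals are retained. The function name and the two-atom test system are hypothetical.

```python
from itertools import product


def retained_integrals(atom_of, scheme):
    """Count index quadruples (a, b, c, d) whose integral (ab|cd) is kept.

    atom_of[a] is the atom on which AO a is centered; scheme is one of
    "full", "CNDO", "INDO", "NDDO".
    """
    n_orb = len(atom_of)
    kept = 0
    for a, b, c, d in product(range(n_orb), repeat=4):
        same_ab = atom_of[a] == atom_of[b]
        same_cd = atom_of[c] == atom_of[d]
        one_center = same_ab and same_cd and atom_of[a] == atom_of[c]
        if scheme == "full":
            keep = True
        elif scheme == "CNDO":   # eq 3.1.3: a = b and c = d
            keep = a == b and c == d
        elif scheme == "INDO":   # assumption: CNDO set plus one-center set
            keep = (a == b and c == d) or one_center
        elif scheme == "NDDO":   # eq 3.1.24: [a] = [b] and [c] = [d]
            keep = same_ab and same_cd
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        kept += int(keep)
    return kept


if __name__ == "__main__":
    # Two atoms, each carrying four valence AOs (s, px, py, pz).
    atoms = [0, 0, 0, 0, 1, 1, 1, 1]
    for name in ("full", "CNDO", "INDO", "NDDO"):
        print(name, retained_integrals(atoms, name))
```

For two atoms with four valence orbitals each, the counts are 4096 (full), 64 (CNDO), 544 (INDO as defined here), and 1024 (NDDO), which reflects the ordering of the schemes by the amount of retained information.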

Because of the altered form of the Fock matrix, a slightly different parametrization of the core matrix elements is utilized. It takes into account the exchange integrals in the determination of the IPs and EAs. For instance, the INDO parametrization utilizes the Slater−Condon factors.

In the NDDO approximation, the changed parametrization is related to an increased number of two-electron integrals. The integrals can be either computed directly, or treated as adjustable parameters, or both. The increased number of degrees of freedom in the semiempirical NDDO parametrization allows


more flexibility and, hence, accuracy of the resulting models. At the same time, the transferability of the obtained parameters (especially if a small training set is used) is relatively low, and the parametrization procedure is more difficult. Historically, the initial version of the NDDO approximation showed somewhat poor performance. In later developments, both the NDDO and INDO Hamiltonians were used as the starting grounds for more accurate and transferable methodologies that will be outlined below.

3.1.3. MINDO, MNDO, AM1, and PMn Methods. The INDO approximation has been extended in a series of modifications, known as the modified INDO (MINDO) method.173−175 The following modifications were used in the first version:173 (a) The parameters were fit to experimental data, rather than to exact Hartree−Fock calculations. (b) The core matrix elements were fit to experimental atomic spectra, using transitions among high-spin electronic configurations. (c) An approximate functional form was utilized for estimation of the two-center integrals. The total energy is expressed as the sum of the HF-like (electronic) energy, $E_{el}$, eq 2.1.28, and terms that represent the core−core interaction, $E_{core,AB}$:

$$E_{tot} = E_{el} + \sum_{A < B} E_{core,AB} \tag{3.1.29}$$

The MINDO expression for the core−core repulsion utilized an approximate functional form:

$$E_{core,AB} = Z_A Z_B\,\gamma_{AB} \tag{3.1.30}$$

where $Z_X$, X = A, B, is the core charge, and $\gamma_{AB}$ represents the screened Coulomb potential. The latter was approximated via the Dewar−Sabelli176−178−Klopman−Ohno179,180 expression:

$$\gamma_{AB} = J_{AB}(R_{AB}) = \frac{1}{\sqrt{R_{AB}^2 + (\rho_{A,l_1} + \rho_{B,l_2})^2}} \tag{3.1.31a}$$

where $\rho_{A,l_1} = 1/(2J_{AA})$ and $\rho_{B,l_2} = 1/(2J_{BB})$ are adjustable in general.

An expression similar to eq 3.1.30 was also used to describe the electron−core attraction, $V_{AB}$. The choice of the functional form, eq 3.1.31a, is motivated by the proper asymptotic behavior at R → 0 and R → +∞, as well as by computational efficiency. Several other approximations of the kind have been discussed in the literature,181 namely the following four formulas.

(1) The Nishimoto−Mataga182 formula:

$$J_{AB}(R_{AB}) = \frac{1}{R_{AB} + \dfrac{2}{J_{A,l_1} + J_{B,l_2}}} \tag{3.1.31b}$$

(2) The Nishimoto−Mataga−Weiss183 formula:

$$J_{AB}(R_{AB}) = \frac{f}{R_{AB} + \dfrac{2f}{J_{A,l_1} + J_{B,l_2}}} \tag{3.1.31c}$$

with an adjustable parameter f = 1.2.

(3) The Ohno184 formula:

$$J_{AB}(R_{AB}) = \frac{1}{\sqrt{R_{AB}^2 + \left(\dfrac{2}{J_{AA} + J_{BB}}\right)^2}} \tag{3.1.31d}$$

(4) The DasGupta−Huzinaga185 formula:

$$J_{AB}(R_{AB}) = \frac{1}{R_{AB} + \dfrac{1}{\frac{1}{2}J_{AA}\,e^{k_A R_{AB}} + \frac{1}{2}J_{BB}\,e^{k_B R_{AB}}}} \tag{3.1.31e}$$

with the adjustable parameter 0.4 ≤ k ≤ 0.8.

The parameters $J_{AA}$ describe the repulsion of two electrons on the same atom (one-center Coulomb integrals). They are called idempotential or self-Coulomb integrals.186 Having the meaning of the second derivative of energy with respect to atomic charge, they can be evaluated using the atomic (or valence state, for orbital-resolved quantities) IP and EA:

$$J_{AA} = IP_A - EA_A = \frac{\partial^2 E}{\partial q_A^2} \tag{3.1.32}$$
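The interpolation formulas, eqs 3.1.31a−3.1.31e, share the same limiting behavior, which the following sketch checks numerically. The function names are hypothetical, atomic units are assumed, and the default exponents of the DasGupta−Huzinaga form are arbitrary values inside the quoted range 0.4 ≤ k ≤ 0.8. The one-center parameters are obtained from eq 3.1.32, with illustrative IP and EA values that do not describe any real atom.

```python
import math


def self_coulomb(ip, ea):
    """One-center integral J_AA = IP_A - EA_A (eq 3.1.32)."""
    return ip - ea


def klopman_ohno(r, j_aa, j_bb):
    """Eq 3.1.31a with rho_X = 1 / (2 J_XX)."""
    rho_sum = 0.5 / j_aa + 0.5 / j_bb
    return 1.0 / math.sqrt(r * r + rho_sum ** 2)


def mataga_nishimoto(r, j_aa, j_bb, f=1.0):
    """Eq 3.1.31b for f = 1; eq 3.1.31c (Weiss variant) for f = 1.2."""
    return f / (r + 2.0 * f / (j_aa + j_bb))


def ohno(r, j_aa, j_bb):
    """Eq 3.1.31d."""
    return 1.0 / math.sqrt(r * r + (2.0 / (j_aa + j_bb)) ** 2)


def dasgupta_huzinaga(r, j_aa, j_bb, k_a=0.6, k_b=0.6):
    """Eq 3.1.31e; k_a and k_b are assumed values within 0.4..0.8."""
    damped = 0.5 * j_aa * math.exp(k_a * r) + 0.5 * j_bb * math.exp(k_b * r)
    return 1.0 / (r + 1.0 / damped)


if __name__ == "__main__":
    j = self_coulomb(ip=0.6, ea=0.1)  # illustrative values, atomic units
    for formula in (klopman_ohno, mataga_nishimoto, ohno, dasgupta_huzinaga):
        values = [round(formula(r, j, j), 4) for r in (0.0, 2.0, 100.0)]
        print(formula.__name__, values)
```

For identical atoms every formula returns the one-center value $J_{AA}$ at R = 0, and the variants with a unit numerator approach the bare Coulomb law 1/R at large separation; the Weiss variant instead tends to f/R.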

The methodology was applied primarily to hydrocarbonmolecules and radicals, and was successful in describing theirheats of formation. However, the method could not describebond lengths, and it failed to predict correctly the relativeenergies of rotational isomers. These shortcomings wereaddressed in the second version of the MINDO approximation(MINDO/2)174 by developing parametric functions for the coreresonance integrals,Hij, and the core−core repulsion terms. Bothtypes of functions depend on the interatomic distance and arechosen to satisfy suitable physical requirements. In particular, theresonance integrals are taken proportional to the AO overlapintegrals, Sij, and to the valence-state ionization potentials oforbitals, Ii, similarly to the EHT and it extensions:

= +H B I I S f R( ) ( )ij i j ij ij1 (3.1.33a)

with the distance-dependent function f1(Rij). Later, the unitfunction was found to be the optimal choice:

=f R( ) 1ij1 (3.1.33b)

Balancing of the core−core repulsion terms was the secondmajor modification developed in the MINDO/2. In particular,the core−core repulsion of two atoms, Ecore,AB, should be equal atdissociation to the electron−electron repulsion of the neutralatoms, Eel,AB. There should be no long-range electrostaticinteractions between the neutral atoms. To obey this asymptoticproperty, the following expression has been adapted:

$$
E_{\mathrm{core},AB} = E_{\mathrm{el},AB} + \left( \frac{Z_A Z_B}{R_{AB}} - E_{\mathrm{el},AB} \right) f_2(R_{AB}) \tag{3.1.34a}
$$

with the optimal choice for the function $f_2(R_{AB})$ given by

$$
f_2(R_{AB}) = \exp(-\alpha R_{AB}) \tag{3.1.34b}
$$
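The limiting behavior built into eqs 3.1.34a and 3.1.34b can be sketched numerically. The function name and all numerical inputs below are hypothetical and chosen only to expose the two limits: at large separation the core−core term approaches the electron−electron repulsion, and with the switching function set to one (α = 0) it reduces to the point-charge value.

```python
import math


def mindo2_core_repulsion(z_a, z_b, r_ab, e_el, alpha):
    """Core-core repulsion of eq 3.1.34a with f2(R) = exp(-alpha * R), eq 3.1.34b.

    z_a, z_b : core charges; r_ab : interatomic distance;
    e_el     : electron-electron repulsion of the neutral atoms at r_ab;
    alpha    : adjustable decay parameter (illustrative value here).
    """
    f2 = math.exp(-alpha * r_ab)
    return e_el + (z_a * z_b / r_ab - e_el) * f2


# Arbitrary illustrative inputs (atomic units assumed).
far = mindo2_core_repulsion(4, 4, 50.0, 0.3, 1.5)      # tends to e_el
unscreened = mindo2_core_repulsion(4, 4, 50.0, 0.3, 0.0)  # tends to Z_A Z_B / R
print(far, unscreened)
```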

The resulting MINDO/2 approach was found to be much more accurate and consistent than its predecessor, MINDO (also known as MINDO/1). It gives good estimates for bond lengths, heats of formation, and force constants simultaneously and for a larger set of hydrocarbons. Although the estimated heats of formation improved significantly, the remaining error was judged to be “a bit high for chemical purposes”. The method underestimates the strain energy in small rings and overestimates attraction of nonbonded hydrogens, leading to low rotational barriers. Improper description of lone pairs constituted another deficiency. As a consequence, the method had limited applicability to compounds containing the N and O atoms.

Finally, the MINDO/3 method was developed.175 The basic methodology was similar to the previous strategies, with an additional adjustment of the parametric formulas and new parameters. It showed encouraging performance as applied to a wide variety of conditions, including neutral molecules, ions, radicals, carbenes, and triplet states. The average accuracy of the computed heats of atomization was ca. 6 kcal/mol. The average accuracy for the estimated activation barriers of a number of reactions was in the range of ±5 kcal/mol. Good performance was found in estimation of different ground state properties, such as first ionization potentials, molecular polarizabilities, chemical shifts, electronic band structures, and gas phase proton affinities. The possibility of extending the MINDO/3 to study excited state properties was discussed as a realistic avenue. It was noted by the authors that the MINDO/3 reached the limits of the approximations based on the INDO framework. Any further improvements required more rigorous NDDO approximations. The following deficiencies of the MINDO/3 method were noted: (a) overestimation of stability of compounds containing triple bonds; (b) slight underestimation of strain energy of small rings; (c) underestimation of resonance energies and stabilities of compact globular molecules.

It is worth recalling at this point that, although development of the approximations, eqs 3.1.31−3.1.34, has been guided by different considerations, all of them are motivated by physical expectations for the limiting cases. These examples constitute a perfect illustration of the physically motivated approximations, which are especially common in the field of semiempirical methodologies and reactive force fields.

The next level of elaboration in the development of semiempirical approaches was reached with the modified neglect of diatomic overlap (MNDO) methodology developed by Dewar and Thiel.187,188 The MNDO and derivative methods start with the NDDO-type Fock operator, eqs 3.1.26, and utilize a number of approximations previously encountered in the MINDO family of methods. These corrections mostly concern the form of the core−core repulsion terms and the approximation of the two-electron integrals.

The core−core interaction terms are approximated by the function with the desired asymptotic behavior and are parametrized to experimental or high-accuracy ab initio calculation data. Specifically, the following form is used:

$$
E_{\mathrm{core},AB} = Z_A Z_B\,(s_A s_A | s_B s_B)\left[ 1 + e^{-\alpha_A R_{AB}} + e^{-\alpha_B R_{AB}} \right] \tag{3.1.35}
$$
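A minimal sketch of eq 3.1.35 follows. The two-center integral $(s_A s_A | s_B s_B)$ is passed in as a plain number rather than computed, and the function name and parameter values are hypothetical; the point is only that the bracket decays from 3 at contact to 1 at large separation, so the term approaches the bare integral times the core charges.

```python
import math


def mndo_core_repulsion(z_a, z_b, ss_integral, r_ab, alpha_a, alpha_b):
    """Core-core repulsion of eq 3.1.35.

    ss_integral stands for (s_A s_A | s_B s_B) evaluated at r_ab and is
    supplied by the caller (an assumption of this sketch).
    """
    bracket = 1.0 + math.exp(-alpha_a * r_ab) + math.exp(-alpha_b * r_ab)
    return z_a * z_b * ss_integral * bracket


# Illustrative numbers only (atomic units assumed).
contact = mndo_core_repulsion(1, 1, 1.0, 0.0, 2.0, 3.0)    # bracket equals 3
distant = mndo_core_repulsion(1, 1, 0.02, 50.0, 2.0, 3.0)  # bracket tends to 1
print(contact, distant)
```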

Because the electronic system is described by a single-determinant wave function (due to neglect of some integrals), and because of the approximations utilized to compute the surviving integrals, the accuracy of the resulting approach is limited and strongly relies on a proper parametrization. A judicious choice of the functions $E_{AB}$ is expected to compensate for the basic deficiencies of the chosen wave function and the Hamiltonian approximations. Because of the high importance of parametrization, there is no reason for computing molecular integrals directly. Instead, computationally efficient approximations can and should be used. The two-center two-electron integrals are computed in the MNDO by the multipole expansion:

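$$
(ab|cd) = \sum_{l_1, l_2, m} \left[ M^{A}_{l_1 m}, M^{B}_{l_2 m} \right], \qquad a, b \in A,\quad c, d \in B
$$

$$
\left[ M^{A}_{l_1 m}, M^{B}_{l_2 m} \right] = \frac{1}{2^{\,l_1 + l_2}} \sum_{i=1}^{2^{l_1}} \sum_{j=1}^{2^{l_2}} f_1(R_{ij}) \tag{3.1.36}
$$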

where $M^{A}_{lm}$ represents a point multipole: a collection of $2^l$ point charges of magnitude $1/2^l$, offset by the distance $D_l$ from the center of the multipole and ordered according to the orbital numbers l and m. The summation in the second formula runs over all point charges, with $R_{ij}$ being the distance between them. $f_1(R_{ij})$ is a properly behaving function. The distance parameters $D_l$ are computed such that the multipole moments of the point multipoles are equal to the values of the corresponding charge distributions $\chi_a^{*}\chi_b$. Examples of configurations representing point multipoles are shown in Figure 1.

Utilization of most of the two-electron integrals in the MNDO method (from the NDDO approximation) is particularly important to reproducing bonding directionality. The above multipole approximation clearly suggests greater capability of the MNDO in such cases. The INDO approximation is essentially equivalent to only monopole terms. As a result, the average errors in computation of the ground state properties are reduced by 50% relative to those obtained with the MINDO/3 method. Treatment of unsaturated hydrocarbons was problematic for the MINDO/3, which predicted the heats of formation to be consistently too positive for compounds containing double bonds and too negative for compounds containing triple bonds. These deficiencies are overcome in the MNDO. The MNDO also corrects the overly negative heats of formation for molecules containing adjacent atoms with lone pairs (e.g., NH2−NH2) and increases the corresponding bond lengths. Better treatment of directional bonding helps improve the accuracy of predicted angles and restores proper ordering of MO energies. For example, the MNDO method does not produce spurious high-lying sigma orbitals in unsaturated systems.

The original MNDO methodology has been extended to d-elements (MNDO/d) in the later works of Thiel and Voityuk.189,190 Various MNDO parametrizations have been developed, including third-row elements,191 aluminum, and boron.192 Later parametrizations combined with slight changes of the parametric functions have branched as independent methods. Among them are the Austin model 1 (AM1)193−196 of Dewar and co-workers, the parametric model number 3 (PM3),197−199 and the parametric model number 6 (PM6)200 of Stewart. The latter contains a wide parametrization for most of the elements in the periodic system. The parametric model number 7 (PM7) has been reported recently.201 It reconsiders the parametrization strategy used in the earlier versions and introduces slight corrections to the underlying NDDO approximation. As a result, the description of long-range interactions is improved, leading to increased applicability of the method to crystal structures and heats of formation of solids. Most importantly, the PM7 corrects the deficiencies of the earlier NDDO-derived methods that lead to infinite errors when solids are concerned.

The MNDO model was improved even further by incorporating the orthogonalization correction, penetration integrals, and effective core potentials into the one-center part of the core Hamiltonian, resulting in a series of orthogonalization models (OMx, x = 1, 2, 3).202−204 The improvements over the MNDO calculations are relatively small, but are systematic, pointing to methodological advantages of the new approach. The corrections were found especially important for describing excited states, transition states, and strong hydrogen bonds. Recently, the performance of the OMx models was extensively benchmarked against other semiempirical and DFT methods for a range of properties,205 including excited states.206 It was concluded that the OMx methods are comparable to the DFT-GGA in accuracy. Since their computational costs are notably smaller than those for the DFT calculations, they can be efficient alternatives for calculations on large systems.

Another approach to account for overlaps was employed by Sattelmeyer et al. in their nonorthogonal version of MNDO (NO-MNDO).207 Unlike Thiel’s OMx models, the new approach simply reintroduces the overlap matrix into the secular equation, leading to a generalized eigenvalue problem. The approach does not change the overall scaling law for the computational costs. It doubles the effort at most, because an additional matrix diagonalization is needed. The overlap matrix is always available, because it is used for approximation of the Fock matrix. No extra calculations are needed here. Reintroduction of the overlap matrix in the secular equation is essential to account properly for the Pauli repulsion effects. This factor provides the main physical motivation for the method. The authors argue that Pauli repulsion arises mainly because of the orbital overlap, rather than molecular integrals. Internal consistency of the resulting semiempirical MO model is another important argument. This requirement was not satisfied in many other models, which utilize overlaps to compute the Fock matrix, but disregard them in the secular equations. The performance of the NO-MNDO for prediction of heats of formation is comparable to or slightly better than that of the AM1 and MNDO methods. The performance with respect to conformational equilibria and torsional barriers improved significantly.

Most semiempirical models rely on local approximations, which tend to be more refined for description of short-range interactions. The reference data are taken from relatively small systems in most cases. Parametrizations based on data sets composed of only small molecules are likely to produce less transferable parameters that may not capture the proper physics of long-range interactions. Therefore, it is important, in general, to utilize large molecules or condensed-matter systems for parametrization, in order to obtain more physically meaningful parameters that describe long-range effects. The PM7 can be considered a step in this direction, although not necessarily originating from the considerations just discussed.

3.1.4. SINDO, SINDO1, and MSINDO Methods. Neglect of the AO overlaps in the eigenvalue problems is one of the central assumptions of the CNDO, and the following INDO and NDDO methods. The overlap matrix, S, is approximated in the INDO method by the identity matrix, I, although the overlaps are non-negligible and are used in calculations of molecular integrals. The so-called symmetrically orthogonalized INDO (SINDO)208 method has been invented to provide a consistent treatment of the overlaps.

A prototype of the SINDO method was proposed by Coffey and Jug,209 who modified the INDO expression for the off-diagonal matrix elements of the core Hamiltonian, eq 3.1.28c, by the following prescription derived from simple two-atomic symmetric orthogonalization considerations:

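$$
\tilde{H}_{aa} = H_{aa} \tag{3.1.37a}
$$

[Equation (3.1.37b): expression for the off-diagonal element $\tilde{H}_{ab}$ in terms of $S_{ab}$, $K_A$, $K_B$, $H_{aa}$, $H_{bb}$, $\rho_A$, $\rho_B$, $S_{AB}$, and $\mathrm{d}S/\mathrm{d}R$.]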

where the quantities with tildes denote elements of the core Hamiltonian in the orthogonalized basis, $K_X$ are constants, $\rho_a = |\langle \chi_a | z | \chi_a \rangle|$, and $S_{AB}$ is the s−s overlap.

The INDO/NDDO approximation that neglects the AO overlaps is based on the localized nature of AOs, suggesting fast decay of overlaps with separation of orbital centers. The actual AOs are not orthogonal. However, despite the fact that this condition is not satisfied in most cases, it is always possible to transform the original nonorthogonal AO basis, $\{\chi_i\}$, to an orthogonal basis, $\{\tilde{\chi}_i\}$. This can be achieved via symmetric (Löwdin) orthogonalization, which is known to minimize the distortion of orbital shapes from their canonical values, in the least-squares sense:

$$
|\tilde{\chi}_i\rangle = \sum_j \left( S^{-1/2} \right)_{ij} |\chi_j\rangle \;\Leftrightarrow\; |\tilde{\chi}\rangle = S^{-1/2} |\chi\rangle \tag{3.1.38}
$$

The orthogonality of the basis orbitals, a central assumption of the INDO/NDDO family of semiempirical methods, is satisfied exactly under this transformation. Accordingly, the Hamiltonian matrix must be transformed as

$$
\tilde{H} = S^{-1/2} H S^{-1/2} \tag{3.1.39}
$$
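The transformation of eqs 3.1.38 and 3.1.39 can be sketched with NumPy for a small model. The two-orbital overlap and core Hamiltonian matrices below are hypothetical numbers in arbitrary units, and `inverse_sqrt` is an illustrative helper, not a routine from any semiempirical package.

```python
import numpy as np


def inverse_sqrt(s):
    """S^(-1/2) of a symmetric positive-definite matrix via eigendecomposition."""
    evals, evecs = np.linalg.eigh(s)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T


# Hypothetical two-orbital model (arbitrary units).
S = np.array([[1.0, 0.4],
              [0.4, 1.0]])
H = np.array([[-1.0, -0.5],
              [-0.5, -0.8]])

X = inverse_sqrt(S)
H_tilde = X @ H @ X   # eq 3.1.39
S_tilde = X @ S @ X   # overlap of the orthogonalized orbitals

# The orthogonalized problem has the same spectrum as the generalized one, H c = e S c.
eps_orth = np.linalg.eigvalsh(H_tilde)
eps_gen = np.sort(np.real(np.linalg.eigvals(np.linalg.inv(S) @ H)))
print(S_tilde)
print(eps_orth, eps_gen)
```

In the orthogonalized basis the overlap matrix is the identity, so an ordinary eigenvalue solver can be used without changing the orbital energies.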

The $S^{-1/2}$ matrix is expanded into a Taylor series for computational efficiency using

$$
S = I + s \tag{3.1.40}
$$

$$
S^{-1/2} = I - \frac{1}{2} s + \frac{3}{8} s^2 - \frac{5}{16} s^3 + \dots \tag{3.1.41}
$$
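How quickly the series of eq 3.1.41 converges depends on the size of the off-diagonal overlaps. The sketch below compares truncated expansions with the exact $S^{-1/2}$ for a hypothetical two-orbital overlap matrix; the helper names and the overlap value 0.2 are illustrative assumptions.

```python
import numpy as np

# Coefficients of 1, s, s^2, s^3 in eq 3.1.41.
SERIES_COEFFS = (1.0, -1.0 / 2.0, 3.0 / 8.0, -5.0 / 16.0)


def inverse_sqrt_exact(S):
    """Reference S^(-1/2) from the eigendecomposition of S."""
    evals, evecs = np.linalg.eigh(S)
    return evecs @ np.diag(evals ** -0.5) @ evecs.T


def inverse_sqrt_series(S, order):
    """Truncate eq 3.1.41 after the term of the given order in s = S - I."""
    n = S.shape[0]
    s = S - np.eye(n)
    result = np.zeros_like(S)
    power = np.eye(n)
    for coeff in SERIES_COEFFS[: order + 1]:
        result = result + coeff * power
        power = power @ s
    return result


S = np.array([[1.0, 0.2],
              [0.2, 1.0]])
exact = inverse_sqrt_exact(S)
errors = [float(np.max(np.abs(inverse_sqrt_series(S, n) - exact))) for n in (1, 2, 3)]
print(errors)  # largest element-wise error for orders 1, 2, 3
```

For larger overlaps the truncation error grows, which is consistent with the later remark that quadratic terms can become unrealistically large in some systems.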

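The expansion is truncated to retain in the transformation of eq 3.1.39 only the terms up to the second order in s. The SINDO1 Hamiltonian is written as

$$
\tilde{H}_{aa} = H_{aa} - \sum_b S_{ab} L_{ab} + \frac{1}{4} \sum_b S_{ab}^2 \left( H_{aa} - H_{bb} \right) \tag{3.1.42a}
$$

$$
\tilde{H}_{ab} = H_{ab} - \frac{1}{2} \sum_c \left( S_{ac} L_{bc} + S_{bc} L_{ac} \right) + \frac{1}{4} \sum_c S_{ac} S_{bc}\, \frac{1}{2} \left( H_{aa} + H_{bb} - 2 H_{cc} \right) \tag{3.1.42b}
$$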

Figure 1. Point charge configurations corresponding to different multipoles. a and b label the coordinate axes, e is the electron charge, and $D_1$ and $D_2$ are the distances between the expansion centers; see text for the description. q, μ, and Q represent monopole, dipole, and quadrupole moments of the expansion of the electrostatic potential, respectively.

where $L_{ab}$ is the kinetic energy correction term, and $H_{aa}$ is the diagonal element of the core Hamiltonian in the nonorthogonal basis, averaged over all orbitals of the same angular quantum number.

The model was improved further to yield the SINDO1 methodology of Jug and Nanda.210−212 Among other distinctive features, this method accounts for the core electrons by utilizing pseudopotentials, as described previously by Zerner.213 The combination of the SINDO1 method with the CI approach was used extensively to study various photochemical processes, including photoinduced isomerization and photofragmentation,214,215 and photochemical reactions.216,217

Later, Ahlswede and Jug proposed a consistent modification of the SINDO1 method (MSINDO).218,219 The method was parametrized for a wide range of elements.220,221 The main distinction from the original SINDO1 is neglect of the terms in eqs 3.1.42 that are quadratic in overlaps. This was necessary because such terms become unrealistically large for systems containing several transition metal atoms. Only linear terms are considered in the MSINDO. In addition, an empirical screening of core matrix elements $U_a$ was introduced. The atomic basis was extended to account for hypervalent coordination. Two sets of orbital exponents are used for calculations of one- and two-center integrals, respectively.

The MSINDO method was utilized extensively by Bredow and Jug. They performed molecular dynamics studies employing this semiempirical Hamiltonian for various systems and processes, such as Si cluster structure,222 adsorbate−surface interactions,223,224 vacancy diffusion in solid-state materials,225 and chemical transformations.226,227 The applicability of the method to complex and relatively large systems of varying chemical nature and the ability to perform long-time molecular dynamics simulations highlight the advantages of MSINDO. To be fair, we should note that similar computational performance and comparable accuracy can be expected for other semiempirical methods.

Very recently, the semiempirical CIS method (sCIS) was formulated on the basis of the MSINDO Hamiltonian, leading to the MSINDO-sCIS228 method for excited states of organic molecules. Because most semiempirical Hamiltonians, including the MSINDO Hamiltonian, are parametrized to reproduce ground state properties, a correction of the CIS Hamiltonian was necessary. The CIS wave function is represented in the standard way, as a superposition of singlet (triplet) configuration state functions, $|{}^{1,3}\Phi_i^a\rangle$:

$$
|{}^{1,3}\Psi_I\rangle = \sum_{\substack{i \in \mathrm{occ} \\ a \in \mathrm{virt}}} t_i^a\, |{}^{1,3}\Phi_i^a\rangle \tag{3.1.43}
$$

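where $t_i^a$ is the amplitude of configuration $|{}^{1,3}\Phi_i^a\rangle$. The CIS Hamiltonian matrix elements, ${}^{M}H_{ia,jb} \equiv \langle {}^{M}\Phi_i^a | H | {}^{M}\Phi_j^b \rangle$, M = 1, 3, are given by

$$
{}^{1}H_{ia,jb} = \delta_{ij}\delta_{ab} \left[ E_a - E_i - d_{\mathrm{corr},ia} \right] + 2 c_1 (ia|jb) - c_2 (ij|ab) \tag{3.1.44}
$$

$$
{}^{3}H_{ia,jb} = \delta_{ij}\delta_{ab} \left[ E_a - E_i \right] - c_T (ij|ab) \tag{3.1.45}
$$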

The Hamiltonian matrix elements are different from those of the nonempirical CIS by the scaling coefficients $c_1$, $c_2$, and $c_T$, which are all set to unity in the ab initio formulation. In addition, the one-electron orbital energy correction, $d_{\mathrm{corr},ia}$, is introduced for totally symmetric singly excited states that may have strong mixing with double excitations ii → aa:

$$
d_{\mathrm{corr},ia} = |(aa|ia)| + |(ii|ia)| \tag{3.1.46}
$$

The combination of the MSINDO with the nonempirical CIS approach and the cyclic cluster method (CCM) for extended periodic systems, MSINDO-CCM-CIS, was reported recently.229 It allows one to study large solid-state systems, including crystalline materials, interfaces, and even biological systems. The analytical gradients for the MSINDO-sCIS were derived,230 opening opportunities for studies of excited state dynamics (electronic and nuclear) in large organic molecules.

3.1.5. ZINDO Method. ZINDO is the spectroscopic parametrization of the INDO method by Zerner and co-workers.183 Typical calculations are split into two steps: INDO/1 (ref 231) and INDO/S, where “S” stands for “spectroscopic”. The first step is parametrized to reproduce ground-state properties and molecular geometries. The second parametrization is used within the CIS formalism for calculations of excited state properties for systems with d-elements.

The one-center two-electron integrals are specially parametrized using the Slater−Condon factors, F and G, and the experimental ionization potentials.232 Derivation of the corresponding relations considers various possible electronic configurations in d-elements and different excitation schemes. For example, two ionization schemes are possible for transition metals with s, p, and d electrons.233

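Scheme I is represented by eqs 3.1.47a−3.1.47c:

$$
I_s:\quad 3d^{\,n-1}4s \rightarrow 3d^{\,n-1} \tag{3.1.47a}
$$

$$
I_p:\quad 3d^{\,n-1}4p \rightarrow 3d^{\,n-1} \tag{3.1.47b}
$$

$$
I_d:\quad 3d^{\,n-1}4s \rightarrow 3d^{\,n-2}4s \tag{3.1.47c}
$$

Scheme II is represented by eqs 3.1.48a−3.1.48c:

$$
I_s:\quad 3d^{\,n-1}4s^{2} \rightarrow 3d^{\,n-1}4s^{1} \tag{3.1.48a}
$$

$$
I_p:\quad 3d^{\,n-1}4s4p \rightarrow 3d^{\,n-1}4s \tag{3.1.48b}
$$

$$
I_d:\quad 3d^{\,n-1}4s^{2} \rightarrow 3d^{\,n-2}4s^{2} \tag{3.1.48c}
$$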

It was found that ionization according to each of these schemes could contribute to the ionization potentials. Hence, both schemes must be included in parametrization of the core integrals. Zerner considered a simple two-configuration interaction scheme involving the competing configurations, with the coupling between the configurations taken as an adjustable parameter. The approach is used to determine the weights of the contributions coming from each excitation scheme. The average value is then used to determine the core integrals.

Unlike many other INDO-derived methods, ZINDO utilizes distance-dependent orbital exponents, ξ:

$$
\xi = \min\left( a + b/R,\ \xi_0 \right) \tag{3.1.49}
$$

This approach may be considered an alternative to the use of multiple-ζ basis functions. The charge dependence of the orbital exponents was also considered.234 The charge-dependent basis was found to be especially important for anionic systems (molecules and transition states).

The two-electron integrals are approximated by an empirical function, similarly to other semiempirical methods. The Nishimoto−Mataga−Weiss formula, eq 3.1.31c, is used for this purpose. All one-center hybrids of the (ab|cd) type are included for the atoms containing d-orbitals, in order to preserve rotational invariance of the method. This was not necessary in the original INDO formulation, because many integrals involving s- and p-orbitals vanished due to symmetry, which is not true when d-orbitals are added. All such integrals that transform into one another are included in the ZINDO. Overall, the method proved to be quite accurate for prediction of spectroscopic properties of compounds containing d-elements.

3.1.6. Sparkle Model for f-Elements. Extension of the semiempirical models to the d- and f-elements presents additional difficulties. One has to deal with a larger number of electrons and increased complexity of the corresponding AOs and integrals (or multipole approximations). These factors may significantly decrease the computational efficiency of such approaches, especially for systems with a large number of d- and f-elements. Other difficulties are present as well: core electron and relativistic effects are of greater importance; high-spin and nearly degenerate states are encountered more often; the multiconfigurational nature of wave functions becomes much more pronounced; the relationships between the parameters and the observable properties become more complex, making parametrization more difficult.

Inclusion of orbitals with high angular momentum may be important. For example, Zerner and co-workers235 found in the pioneering work on modeling lanthanide halides MXn that f-orbitals are required for predicting correct geometries, including pyramidal geometries of trihalides and bent geometries of the dihalides. de Andrade et al.236 indicate that interactions of the lanthanide atoms with their ligands are mostly of electrostatic nature. This observation formed the basis of the sparkle model for lanthanide complexes (SMLC),236,237 or simply Sparkle. The model concerns only lanthanide atoms; the rest of the system can be treated by a semiempirical method of choice, although the parametrization will be different for each method. Initially, the SMLC was parametrized for use with the AM1 method and was known as the SMLC/AM1.236,237 It was improved further by adding Gaussian functions to the core−core interaction energy, in order to make the model compatible with the AM1.238 The Gaussian functions proved essential for accurate description of lanthanide complex geometry. The AM1 parametrization was later extended to other rare earth elements within the AM1 model (Sparkle/AM1).239,240 Parametrizations for the PM3 (Sparkle/PM3),241−244 PM6 (Sparkle/PM6),245 and, very recently, for PM7 (Sparkle/PM7)246 methods were developed.

A sparkle is an integer charge surrounded by a spherical repulsive potential of the exp(−αr) form. The adjustable parameter α controls the hardness of the sparkle sphere. The construction is not a simple point charge; rather, it is designed to represent a pair of ions forming an ionic bond. The central charge is typically taken as an integer value (e.g., +3 for lanthanide cations), and the ionic radius is on the order of 0.7 Å. A sparkle does not introduce orbitals explicitly. Therefore, the calculations on complexes containing large numbers of lanthanide atoms can be accelerated significantly with this approach, in contrast to explicit consideration of all f-orbitals. A sparkle has zero heat of atomization and no ionization potential.

The development and successful application of the Sparkle model demonstrates an important methodological concept. Namely, an increasing complexity of description may be abruptly simplified at some point, without sacrificing any accuracy. It is not always necessary to push the established approach with the conventional and systematic improvements; rather, a qualitatively different assumption may prove much more efficient. A large number of lanthanide f-orbitals and their relative spherical orientation provide good grounds for approximating f-electrons by an effective spherical charge distribution. The disadvantage of elements with f-orbitals becomes their advantage. The transition between quantum and classical mechanics provides a well-known analogy. At a certain point, for sufficiently large systems, quantum effects become less relevant, and efficient and accurate studies can be based on a classical description. One should always look for opportunities to perform such reductions when developing physically motivated approximations. The reductions can allow scientists to study large-scale systems with realistic resources.

3.1.7. DFTB and Derived Methods. In this section we overview yet another group of semiempirical approaches, the density functional tight-binding (DFTB) method. We start by emphasizing its qualitative difference from the other methods described above. This difference is so notable that this subsection could be classified into a larger category of DFT-derived semiempirical methods as opposed to the HF-derived semiempirical methods discussed in the previous subsections. To structure this review and to simplify the hierarchy of classifications, we consider the DFTB on the same footing as the CNDO, INDO, and other similar methods, that is, as a semiempirical MO-based method, keeping in mind the alternative classification just introduced.

The method originates from the DFT method and, therefore, accounts for correlation effects via DFT functionals (exchange and correlation), rather than via extensive parametrization or CI schemes. To derive the DFTB method, the Taylor series expansion of the DFT energy with respect to the fluctuation of the charge density, ρ, around a given reference value, $\rho_0$, is considered:

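$$
\begin{aligned}
E[\rho] ={}& \sum_i n_i \left\langle \psi_i \left| -\frac{1}{2}\nabla^2 + V_{ne} + \int \frac{\rho_0(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,\mathrm{d}\mathbf{r}' + V_{xc}[\rho_0] \right| \psi_i \right\rangle \\
& - \frac{1}{2} \iint \frac{\rho_0(\mathbf{r})\,\rho_0(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}' - \int V_{xc}[\rho_0]\,\rho_0(\mathbf{r})\,\mathrm{d}\mathbf{r} + E_{xc}[\rho_0] + E_{nn} \\
& + \frac{1}{2} \iint \left( \frac{1}{|\mathbf{r} - \mathbf{r}'|} + \left. \frac{\delta^2 E_{xc}}{\delta\rho\,\delta\rho'} \right|_{\rho_0, \rho_0'} \right) \Delta\rho\,\Delta\rho'\,\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}' \\
& + \frac{1}{6} \iiint \left. \frac{\delta^3 E_{xc}}{\delta\rho\,\delta\rho'\,\delta\rho''} \right|_{\rho_0, \rho_0', \rho_0''} \Delta\rho\,\Delta\rho'\,\Delta\rho''\,\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}'\,\mathrm{d}\mathbf{r}'' + \dots \\
={}& E^{0}[\rho_0] + E^{1}[\rho_0, \Delta\rho] + E^{2}[\rho_0, \Delta\rho, \Delta\rho'] + E^{3}[\rho_0, \Delta\rho, \Delta\rho', \Delta\rho''] + \dots
\end{aligned}
\tag{3.1.50}
$$

where

$$
\Delta\rho = \rho - \rho_0 \tag{3.1.51}
$$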

The reference density is typically given by a superposition of atomic densities:

$$
\rho_0 = \sum_A \rho_{0,A} \tag{3.1.52}
$$

which corresponds to the following direct sum for the charge density matrix:

$$
P_0 = P_{0,A} \oplus P_{0,B} \oplus \dots \tag{3.1.53}
$$

Depending on the terms included in the Taylor expansion, one arrives at a hierarchy of systematic approximations, giving rise to the family of DFTB methods: DFTB (or DFTB1),247,248 SCC-DFTB (or DFTB2),249 DFTB3 (only diagonal terms), and DFTB3 (with off-diagonal terms).

The first term on the right-hand side of eq 3.1.50 describes the basic contributions to covalent bonding, the tight-binding term. It can be represented via

$$
E^{0} = \sum_i n_i \left\langle \psi_i \left| -\frac{1}{2}\nabla^2 + V_{ne} + \int \frac{\rho_0(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,\mathrm{d}\mathbf{r}' + V_{xc}[\rho_0] \right| \psi_i \right\rangle = \sum_{i,j} P_{ij} H^{0}_{ij} \tag{3.1.54}
$$

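where $H^{0}_{ij}$ is the effective tight-binding Hamiltonian in the orbital basis. It can either be computed directly or parametrized (as in the EHT method).

The second, repulsion term

$$
E_{\mathrm{rep}} = -\frac{1}{2} \iint \frac{\rho_0(\mathbf{r})\,\rho_0(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\,\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}' - \int V_{xc}[\rho_0]\,\rho_0(\mathbf{r})\,\mathrm{d}\mathbf{r} + E_{xc}[\rho_0] + E_{nn} \tag{3.1.55}
$$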

describes the electron−electron (double-counting), nuclear−nuclear, and exchange−correlation interactions at the non-self-consistent density. It depends only on the fixed (reference) charge density, $\rho_0$, and does not depend on its variation. Therefore, for computational efficiency the repulsion term can be approximated via the sum of transferable one- and two-body functions:

$$
E_{\mathrm{rep}} \approx \sum_A V^{A}_{\mathrm{rep}}[\rho_{0,A}] + \frac{1}{2} \sum_{A,B} V^{AB}_{\mathrm{rep}}[\rho_{0,A}, \rho_{0,B}, R_{AB}] \tag{3.1.56}
$$

Since the one-body terms affect only the absolute values of energies and cancel out when relative energies are considered, it is convenient to choose the energy scale such that these terms give zero contribution. This consideration leads to a further approximation:

$$
E_{\mathrm{rep}} \approx \frac{1}{2} \sum_{A,B} V^{AB}_{\mathrm{rep}}[\rho_{0,A}, \rho_{0,B}, R_{AB}] \tag{3.1.57}
$$

which can be regarded as a coordinate system change.

The total energy is finally given by the sum of the electronic and repulsion energies:

$$
E^{\mathrm{DFTB}} = E^{0} + E_{\mathrm{rep}} \tag{3.1.58}
$$

Variational optimization of the total energy leads to the generalized eigenvalue problem with the effective one-electron Hamiltonian (Fock matrix) given by the charge-independent core tight-binding Hamiltonian:

$$
F^{\mathrm{DFTB}}_{ij} = H^{0}_{ij} \tag{3.1.59}
$$
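The single-diagonalization character of the non-self-consistent scheme can be illustrated with a two-orbital toy model. The matrices `H0` and `S` below are hypothetical numbers, not DFTB parameters, and the generalized eigenvalue problem is solved through symmetric orthogonalization so that only NumPy is needed. The band energy is then evaluated as the trace form of eq 3.1.54.

```python
import numpy as np


def solve_generalized(h0, s):
    """Solve H0 C = S C eps by symmetric (Lowdin) orthogonalization."""
    evals, evecs = np.linalg.eigh(s)
    x = evecs @ np.diag(evals ** -0.5) @ evecs.T   # S^(-1/2)
    eps, c_orth = np.linalg.eigh(x @ h0 @ x)
    return eps, x @ c_orth                          # back-transform coefficients


# Hypothetical homonuclear two-orbital model (atomic units assumed).
H0 = np.array([[-0.5, -0.3],
               [-0.3, -0.5]])
S = np.array([[1.0, 0.25],
              [0.25, 1.0]])

eps, C = solve_generalized(H0, S)       # one diagonalization, no SCF loop
occupations = np.array([2.0, 0.0])      # two electrons in the lowest orbital
P = (C * occupations) @ C.T             # density matrix P_ij
E0 = float(np.sum(P * H0))              # E0 = sum_ij P_ij H0_ij
print(eps, E0)
```

For this symmetric model the lowest orbital energy is (α + β)/(1 + s) with α = −0.5, β = −0.3, and s = 0.25, and the band energy is twice that value.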

The approximations above constitute the basis of the original DFTB method.247,248 The latter is non-self-consistent and requires only one diagonalization. The core Hamiltonian can be either computed directly or parametrized. The formulation is essentially identical to that of the EHT method250,251 and its extensions that introduce repulsion terms for a better description of equilibrium interatomic bond lengths.252−254 The details of computation of the core Hamiltonian and its parametrization are different.

The non-self-consistent formulation is expected to work well for systems in which charge transfer does not play an important role, that is, when the atomic electronegativities are similar, e.g., in hydrocarbons or single-element materials. This approach can also be applied when the charge transfer is very large, as in ionic crystals. However, the constant core Hamiltonian matrix elements are expected to fail when a delicate balance of charge transferred between different species is needed. At the very least, the transferability of parameters becomes limited, owing to the lack of their variation with the environment.

The simplest way to overcome the shortcomings of the tight-binding model is to allow the Hamiltonian matrix elements to vary in response to changes in the charge density of the orbital surroundings. In other words, the Hamiltonian is made charge-dependent. This is equivalent to considering effects of higher order terms in the Taylor expansion, eq 3.1.50. Inclusion of the second-order terms into computational schemes requires a self-consistent evaluation of atomic charges and their fluctuations with respect to reference values. The energy optimization is achieved via an iterative procedure. The resulting method is called self-consistent charge DFTB (SCC-DFTB).249 For computational efficiency, the second-order terms

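$$
E^{(2)} = \frac{1}{2} \iint \left( \frac{1}{|\mathbf{r} - \mathbf{r}'|} + \left. \frac{\delta^2 E_{xc}}{\delta\rho\,\delta\rho'} \right|_{\rho_0, \rho_0'} \right) \Delta\rho\,\Delta\rho'\,\mathrm{d}\mathbf{r}\,\mathrm{d}\mathbf{r}' \tag{3.1.60}
$$

are approximated as

$$
E^{(2)} \approx E^{\gamma} \equiv \frac{1}{2} \sum_{A,B} \gamma_{AB}\,\Delta q_A\,\Delta q_B \tag{3.1.61}
$$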

where $\gamma_{AB}$ is the screened Coulomb interaction function, which approaches $1/R_{AB}$ for large interatomic distances, $R_{AB}$. Possible functional forms for approximating $\gamma_{AB}$ were presented earlier, eqs 3.1.31. For A = B the onsite self-repulsion values $\gamma_{AA}(0) = U_A$ are known as the Hubbard parameters, also called idempotential or self-Coulomb terms.186 They are defined as in eq 3.1.32.

The energy in the DFTB2 method is given by

$$
E^{\mathrm{SCC\text{-}DFTB}} = E^{\mathrm{DFTB}} + E^{(2)} \tag{3.1.62}
$$
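The charge-fluctuation sum of eq 3.1.61 is simple to evaluate once a form of $\gamma_{AB}$ is chosen. The sketch below assumes a Klopman−Ohno-like interpolation, which has the two limits stated in the text ($1/R_{AB}$ at long range and $U_A$ for A = B at zero distance); it is not the specific γ function of the SCC-DFTB papers, and all names and numbers are illustrative.

```python
import math


def gamma_interpolated(r_ab, u_a, u_b):
    """Assumed Klopman-Ohno-like gamma: ~1/R at long range, U_A when A = B and R = 0."""
    a = 0.5 * (1.0 / u_a + 1.0 / u_b)
    return 1.0 / math.sqrt(r_ab ** 2 + a ** 2)


def second_order_energy(delta_q, hubbard, coords):
    """E(2) of eq 3.1.61: one half of sum over A, B of gamma_AB * dq_A * dq_B."""
    energy = 0.0
    for a, (qa, ua, ra) in enumerate(zip(delta_q, hubbard, coords)):
        for b, (qb, ub, rb) in enumerate(zip(delta_q, hubbard, coords)):
            r_ab = math.dist(ra, rb)   # zero for the onsite term A = B
            energy += 0.5 * gamma_interpolated(r_ab, ua, ub) * qa * qb
    return energy


# Hypothetical diatomic with charge transfer of 0.2 e (atomic units assumed).
delta_q = [0.2, -0.2]
hubbard = [0.4, 0.3]
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
print(second_order_energy(delta_q, hubbard, coords))
```

The onsite terms penalize charge accumulation on each atom, while the off-diagonal term lowers the energy for oppositely charged neighbors.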

The effective Hamiltonian (Fock matrix) within this method is given by

$$
F^{\mathrm{SCC\text{-}DFTB}}_{ij} = H^{0}_{ij} + \frac{1}{2} S_{ij} \sum_A \left( \gamma_{IA} + \gamma_{JA} \right) \Delta q_A \tag{3.1.63}
$$

where ΔqA is the difference of the actual Mulliken charge onatom A and the charge on atom A in the reference system (e.g.,isolated atom).All third-order terms were included recently, resulting in the

DFTB3 approach.63,255−257 The energy is given by

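$$E^{\text{DFTB3}} = E^{\text{SCC-DFTB}} + E^{(3)} \tag{3.1.64}$$

where the third-order terms are approximated by

$$E^{(3)} \approx E_{\Gamma} \equiv \frac{1}{3}\sum_{A,B} \Delta q_A^{2}\,\Delta q_B\,\Gamma_{AB} \tag{3.1.65}$$

with

$$\Gamma_{AB} = \left.\frac{\partial \gamma_{AB}}{\partial q_A}\right|_{q_A^{0}} \tag{3.1.66}$$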

The Fock matrix is then

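$$F_{ij}^{\text{DFTB3}} = F_{ij}^{\text{SCC-DFTB}} + \frac{1}{3} S_{ij} \sum_{A} \Delta q_A \left[ (\Delta q_I \Gamma_{IA} + \Delta q_J \Gamma_{JA}) + \frac{1}{2}\,\Delta q_A (\Gamma_{AI} + \Gamma_{AJ}) \right] \tag{3.1.67}$$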
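The charge-dependent corrections of eqs 3.1.61−3.1.67 can be illustrated with a minimal numerical sketch. It assumes that the Fock-matrix shift is $\frac{1}{2}S_{ij}(u_I + u_J)$, where $u_I$ is the derivative of $E^{(2)} + E^{(3)}$ with respect to the charge fluctuation on atom I; this is the usual way such terms arise and matches eq 3.1.63 for a symmetric $\gamma$. All function names and numerical values below are invented for illustration and do not come from any DFTB parameter set.

```python
import numpy as np

def energy_second_order(gamma, dq):
    """E(2) of eq 3.1.61: 1/2 * sum_AB gamma_AB dq_A dq_B."""
    return 0.5 * dq @ gamma @ dq

def energy_third_order(Gamma, dq):
    """E(3) of eq 3.1.65: 1/3 * sum_AB dq_A^2 dq_B Gamma_AB."""
    return np.sum(np.outer(dq**2, dq) * Gamma) / 3.0

def atomic_potential(gamma, Gamma, dq):
    """u_I = d(E2 + E3)/d(dq_I); gamma is assumed symmetric, Gamma need not be."""
    u2 = gamma @ dq
    u3 = (2.0 / 3.0) * dq * (Gamma @ dq) + (1.0 / 3.0) * (Gamma.T @ dq**2)
    return u2 + u3

def fock_shift(S, u, atom_of_orbital):
    """Charge-dependent Fock correction 1/2 * S_ij * (u_I + u_J)."""
    u_orb = u[atom_of_orbital]  # potential of the atom hosting each orbital
    return 0.5 * S * (u_orb[:, None] + u_orb[None, :])

# Invented toy data: two atoms, one orbital per atom.
gamma = np.array([[0.40, 0.25], [0.25, 0.50]])
Gamma = np.array([[0.10, 0.03], [0.05, 0.20]])
dq = np.array([0.2, -0.2])
S = np.array([[1.0, 0.3], [0.3, 1.0]])
atom_of_orbital = np.array([0, 1])

E2 = energy_second_order(gamma, dq)
E3 = energy_third_order(Gamma, dq)
u = atomic_potential(gamma, Gamma, dq)
dF = fock_shift(S, u, atom_of_orbital)
print(E2, E3)
print(dF)
```

Writing the shift in terms of a per-atom potential keeps the second- and third-order contributions on the same footing and makes it easy to check the implementation against a finite-difference derivative of the energy.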

An improved Coulombic interaction between partial charges was incorporated in the third-order expansion of the DFTB method (DFTB3).256 The full third-order expansion of the DFT energy was utilized. The resulting method was especially successful for charged systems containing the main biological elements.

Unlike many other semiempirical approaches, the DFTB family utilizes atomic pseudopotentials to take the effects of the core electrons into account. In contrast to simple models based on an effective nuclear (core) charge screened by core electrons, a projector technique for the representation of the effective core potential accounts for the proper nodal structure of wave functions. The requirement that valence orbitals, included in the SCF optimization process, be orthogonal to atomic core orbitals obtained from atomistic calculations can be satisfied with the projection operator technique (essentially Schmidt orthogonalization):

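$$|\chi_a\rangle \rightarrow |\chi_a\rangle - \sum_{b\in\text{core}} |\chi_b\rangle\langle\chi_b|\chi_a\rangle \tag{3.1.68}$$

The orthogonalization correction, eq 3.1.68, modifies the matrix elements of the effective KS potential according to

$$\langle\chi_a|V_{\text{eff}}|\chi_b\rangle \rightarrow \langle\chi_a|\Big(V_{\text{eff}} - \sum_{c\in\text{core}} |\chi_c\rangle E_c \langle\chi_c|\Big)|\chi_b\rangle \tag{3.1.69}$$

The effective Hamiltonian matrix elements will then change:

$$F_{ab} \rightarrow F_{ab} - \sum_{c\in\text{core}} \langle\chi_a|\chi_c\rangle E_c \langle\chi_c|\chi_b\rangle = F_{ab} - \sum_{c\in\text{core}} S_{ac} E_c S_{cb} \tag{3.1.70}$$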
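As a small illustration of eq 3.1.70, the core-projection correction is a single matrix product once the valence−core overlaps and core orbital energies are known. The function name and all numbers below are hypothetical; the sketch only shows the algebra.

```python
import numpy as np

def project_out_core(F, S_vc, E_core):
    """Apply eq 3.1.70: F_ab -> F_ab - sum_c S_ac E_c S_cb.

    F      : (n_val, n_val) valence Fock matrix
    S_vc   : (n_val, n_core) valence-core overlap matrix
    E_core : (n_core,) core orbital energies
    """
    return F - S_vc @ np.diag(E_core) @ S_vc.T

# Invented example: two valence orbitals, one core orbital.
F = np.array([[-0.5, -0.2], [-0.2, -0.4]])
S_vc = np.array([[0.1], [0.2]])
E_core = np.array([-10.0])
F_new = project_out_core(F, S_vc, E_core)
print(F_new)
```

Because core energies are strongly negative, the correction is repulsive: it raises the matrix elements of valence orbitals in proportion to their overlap with the core.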

The DFTB method and its derivatives have been elaborated further in a number of ways. The method has been combined with PCM to account for solvent effects. The time-dependent formulation has also been reported, allowing one to study excited states of large systems in solvents.258,259 Analytical gradients and Hessians are available for ground and excited states, making it possible to study dynamical properties. The DFTB has also been extended to the GW formalism, allowing accurate treatment of excited states of large systems.260 The method has been augmented with the chemical potential equalization scheme, to improve the description of electronic polarizabilities of large systems.261 Although the original DFTB method was parametrized only for the main biological elements (C, N, O, H), recent work has extended the parametrization to a much wider scope of elements, spanning a large fraction of the periodic table.262 A spin-polarized version of the DFTB has been reported for studies of magnetic properties.263 The method has been successfully combined with the fragment orbital method and the coarse-grained model Hamiltonian, to study charge transfer in DNA-based molecular wires,264 as well as hole transfer in DNA.265,266 The DFTB has been incorporated into various QM/MM schemes to allow investigation of large biomolecules.267−269

The DFTB method is being extensively developed by the Elstner group, who implemented it in the DFTB+ package.270 A detailed overview of the method, its derivation, and a discussion of its applications can be found elsewhere.63,64,89,271−274 The method has been applied to many large-scale systems, including systems of biological interest,275−277 metal−organic frameworks,278,279 nanotubes,280 and other nanoscale systems.281

As a simplified approximation to the DFT method, the DFTB approach reduces the empiricism common in the HF-derived methods to a minimum. It accounts for correlation effects through a proper DFT correlation functional, rather than by relying on the quality of the parametrization. In this sense, the method is more rigorous than the HF-derived methods. With all its strengths, the DFTB method inherits the DFT problems as well. The problems can be accounted for by proper parametrization of additional empirical terms. One should also keep in mind that many modern DFT functionals are also parametrized in some way. Therefore, both DFT- and HF-derived semiempirical methods are at the same theoretical level, occasionally outperforming one another with respect to some properties and systems. Although it is often illuminating to relate a semiempirical method to a higher-level theory, this may not be so relevant when one considers results. Rather, it is particularly important to take into account all asymptotic properties, qualitative relationships, and other physically motivated restrictions. The exact functional form or the origin of an approximation may be secondary if it requires an extensive parametrization.

3.1.8. Extended Hückel Theory.

3.1.8.1. General Overview. The extended Hückel theory250,282,283 (EHT) was among the first successful MO-based theories of chemical bonding. Its simplicity allowed a transparent interpretation of many chemical phenomena, including reactivity and electronic effects. Despite its simplicity, the method can produce accurate quantitative results for many observable properties, such as heats of formation, geometric structures, ionization potentials, etc. Originally formulated mostly for organic compounds, the method was later extended to inorganic materials as well, including periodic systems and nanoscale structures.284−288

The EHT method is a semiempirical approach. However, it was not derived as an approximation to the HF method, like most of the other methods discussed in sections 3.1.1−3.1.6. The simplest form of EHT has a tight-binding structure, similar to that of the noniterative DFTB method. Moreover, many other developments of the EHT method come from physically motivated but empirical grounds. In this sense, the method can be considered a model Hamiltonian with atomic resolution. Some of the basic assumptions of the EHT method and its extensions can go beyond those in the semiempirical and DFTB formulations considered previously. This allows the method to

achieve high accuracy, retaining self-consistent formalism and high computational efficiency. For these reasons, we classify the method into a separate group.

The interest in the EHT method has gradually declined in the era of high-performance computers, as the ab initio wave function and DFT approaches gained more popularity, especially for small systems. Nonetheless, the EHT method has great potential, mostly because of its high computational efficiency, allowing application to large-scale systems. The approach is especially fruitful in today’s world for studies of the dynamics of electronic states,16,289−295 electron transport (conductivity),296−301 electronic structure and magnetic properties of inorganic materials,300,302−305 as well as enthalpies of formation and interaction energies of organic molecules.306−309 The EHT Hamiltonian was utilized as the framework for the time-dependent tight-binding calculations of molecular excited states310 and electronic resonances.311 Recent papers report generalizations to periodic systems, with multiple k-points considered (crystal momentum quantization),287,296,312 and to unrestricted formulations, in which the dependence on electronic spin polarization is included.312 These developments make the EHT method applicable to large systems and to processes that involve spin polarization and spin relaxation dynamics.

3.1.8.2. Basic Formulation. The original EHT formulation250 was proposed as a noniterative approach based on diagonalization of a simple tight-binding Hamiltonian, to account for covalent chemical bonding. The matrix elements of the EHT Hamiltonian, H, are charge-independent and are computed in the AO basis as

$$H_{ij} = \frac{1}{2} K_{ij} S_{ij} (h_i + h_j) \tag{3.1.71}$$

where $S_{ij}$ is the overlap integral in the AO basis, eq 2.1.36, and $K_{ij}$ is the proportionality constant, typically assumed to be in the range between 1 and 2. In general, the constant can depend on the type of orbitals i and j. This constant is set to 1 for diagonal elements, $K_{ii} = 1$, while it is treated as an adjustable parameter for all other pairs of orbital types, $K_{ij}$, $i \neq j$. The default value proposed by Hoffmann is 1.75 for all types of orbital pairs. The parameter $h_i$ is the energy of the ith orbital of an isolated atom, equal to the negative of the valence state ionization potential (VSIP). These parameters are typically available from X-ray spectroscopy as the orbital binding energies, or they can be treated as adjustable parameters. The relation is

$$h_i = h_i^{0} = -\mathrm{VSIP}_i \tag{3.1.72}$$
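The noniterative EHT procedure of eqs 3.1.71 and 3.1.72 amounts to building H from the overlap matrix and the VSIPs and then solving the generalized eigenvalue problem HC = SCE. The following is a minimal sketch under the assumptions that the AOs are normalized (so that $S_{ii} = 1$) and that a single constant K = 1.75 is used for all off-diagonal pairs; the overlap value and VSIPs are illustrative numbers only, and the function names are hypothetical.

```python
import numpy as np

def eht_hamiltonian(S, vsip, K=1.75):
    """Build the EHT Hamiltonian of eq 3.1.71 with h_i = -VSIP_i (eq 3.1.72)."""
    h = -np.asarray(vsip, dtype=float)
    H = 0.5 * K * S * (h[:, None] + h[None, :])  # off-diagonal elements
    np.fill_diagonal(H, h)                       # K_ii = 1 and S_ii = 1
    return H

def solve_generalized(H, S):
    """Solve H C = S C E by symmetric (Loewdin) orthogonalization."""
    s, U = np.linalg.eigh(S)
    X = U @ np.diag(s ** -0.5) @ U.T   # X = S^(-1/2)
    e, C_prime = np.linalg.eigh(X @ H @ X)
    return e, X @ C_prime              # columns of C satisfy C^T S C = 1

# Illustrative homonuclear two-orbital model (energies in eV).
S = np.array([[1.0, 0.6], [0.6, 1.0]])
vsip = [13.6, 13.6]
H = eht_hamiltonian(S, vsip)
energies, C = solve_generalized(H, S)
print(energies)  # bonding and antibonding levels
```

For this symmetric model the levels are $(h \pm H_{12})/(1 \pm S_{12})$, which shows how retaining the overlap in the secular equation makes the antibonding level more destabilized than the bonding level is stabilized.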

The AOs, $|\chi_a\rangle$, used to compute orbital overlaps in eq 3.1.71 are typically taken in the EHT method as single-exponent Slater-type orbitals (STOs):

$$|\chi_a\rangle = N \exp(-\xi r) \tag{3.1.73a}$$

or as double-ζ STOs:

$$|\chi_a\rangle = c_1 N_1 \exp(-\xi_1 r) + c_2 N_2 \exp(-\xi_2 r) \tag{3.1.73b}$$

where $N$, $N_1$, and $N_2$ are the normalization coefficients, $c_1$ and $c_2$ are the linear combination coefficients, and $\xi$, $\xi_1$, and $\xi_2$ are the orbital exponents. The latter are typically available from ab initio calculations of the electronic structure of isolated atoms313−315 or can be treated as adjustable parameters.287,296 To compute the overlap integrals of STOs, one first switches to a coordinate system with a specific orientation of the AOs, in which the overlaps can be computed according to the known formulas.316,317 The computed quantities are then rotated back to the molecular coordinate system.

3.1.8.3. Improved Electronic Hamiltonian Formulations. The original EHT Hamiltonian, eq 3.1.71, was refined several times to account for missing effects. The corrections are derived mostly from physical insights and improved accuracy or corrected artifacts. One of the most notable deficiencies of the simple Hoffmann Hamiltonian, eq 3.1.71, was its inability to describe molecular geometries correctly. The bond lengths were underestimated, and the dissociation curves were typically inaccurate. A number of corrections were developed to improve the accuracy of the computed geometric properties. Here we discuss only the corrections to the electronic Hamiltonian. Some corrections involved nuclear repulsion terms. Even in these cases, the electronic energy was modified as well.

Cusachs developed one of the earliest corrections to the simple Hoffmann Hamiltonian, aimed at a better description of the geometries of water molecules.318 The Hamiltonian matrix elements were nonlinear functions of the AO overlaps:

$$H_{ij} = \frac{1}{2} (2 - |S_{ij}|)\, S_{ij} (h_i + h_j) \tag{3.1.74}$$

The term $S_{ij}|S_{ij}|$ represented the variation of the kinetic energy as a function of internuclear distances. Alternatively, this quadratic dependence can be interpreted as the electronic repulsion due to the Pauli exclusion principle. We met a similar nonlinearity in the SINDO and related methods (see section 3.1.4). A related formulation for the kinetic energy (Pauli exclusion/correlation) terms will be encountered in the eFF method of Su and Goddard.5,6 The Pauli exclusion is enforced in that formulation by a nonlinear function of the atomic overlaps.

The Cusachs method predicts an incorrect position of the energy minimum of the PES of the simplest H2 molecule. As a result, bond lengths and dipole moments are overestimated. The model was generalized by Kalman,319 who developed a many-parameter exsin formula:

$$H_{ij} = \operatorname{sgn}(S_{ij})\,\frac{1}{2} \left\{ (1 + |S_{ij}|)\left[1 + c \sin(\pi |S_{ij}|) \exp(b |S_{ij}|)\right] - 1 \right\} (h_i + h_j) \tag{3.1.75a}$$

$$b = -\pi \cot(\pi |S_m|) \tag{3.1.75b}$$

with $c$ and $S_m$ being parameters.

The weighted Wolfsberg−Helmholz formula282

$$H_{ij} = \frac{1}{2} \left[ K_{ij} + \Delta^2 + \Delta^4 (1 - K_{ij}) \right] S_{ij} (h_i + h_j) \tag{3.1.76a}$$

$$\Delta = \frac{h_i - h_j}{h_i + h_j} \tag{3.1.76b}$$

is among the most popular later corrections. It was introduced to fix the wrong order of the orbital energies obtained with the Hamiltonian given by eq 3.1.71.

In order to improve the accuracy of geometrical properties, Anderson253 developed electron−nuclear and nuclear−nuclear corrections, and modified the Hamiltonian in eq 3.1.71 by using a distance-dependent proportionality constant:

$$K_{ij} = k \exp(-\delta R) \tag{3.1.75}$$

with $k = 2.25$ and $\delta = 0.13$ Å$^{-1}$. The formula was further generalized by Calzaferri,254,320 in order to make the constant $K_{ij}$ larger than 1 for intermediate and large interatomic separations:

$$H_{ij} = \frac{1}{2} \left[ 1 + \kappa \exp\left[-\delta (R_{ij} - R_{ij,0})\right] \right] S_{ij} (h_i + h_j) \tag{3.1.76}$$

The approach was also successful in describing the electronic excited states of organometallic complexes.320
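The corrections discussed in this subsection differ mainly in the effective proportionality factor that multiplies $\frac{1}{2}S_{ij}(h_i + h_j)$. The sketch below collects them side by side so that their limiting behavior can be compared. The function names are hypothetical, and the default values of k and δ in the Anderson form are the ones quoted above; everything else is an illustrative assumption.

```python
import math

def h_offdiag(K, S, hi, hj):
    """Generic off-diagonal element, eq 3.1.71: 1/2 * K * S * (h_i + h_j)."""
    return 0.5 * K * S * (hi + hj)

def k_cusachs(S):
    """Effective K of the Cusachs formula, eq 3.1.74."""
    return 2.0 - abs(S)

def k_weighted(K, hi, hj):
    """Effective K of the weighted Wolfsberg-Helmholz formula, eqs 3.1.76a,b."""
    delta = (hi - hj) / (hi + hj)
    return K + delta**2 + delta**4 * (1.0 - K)

def k_anderson(R, k=2.25, delta=0.13):
    """Distance-dependent K = k * exp(-delta * R), with R in Angstrom."""
    return k * math.exp(-delta * R)

def k_calzaferri(R, R0, kappa, delta):
    """Distance-dependent K = 1 + kappa * exp(-delta * (R - R0))."""
    return 1.0 + kappa * math.exp(-delta * (R - R0))

def h_kalman(S, hi, hj, c, S_m):
    """Kalman exsin formula, eqs 3.1.75a,b."""
    b = -math.pi / math.tan(math.pi * abs(S_m))
    a = abs(S)
    bracket = (1.0 + a) * (1.0 + c * math.sin(math.pi * a) * math.exp(b * a)) - 1.0
    return math.copysign(1.0, S) * 0.5 * bracket * (hi + hj)

# Illustrative comparison for one pair of orbitals (energies in eV).
hi, hj, S = -13.6, -11.4, 0.4
print(h_offdiag(1.75, S, hi, hj))
print(h_offdiag(k_cusachs(S), S, hi, hj))
print(h_offdiag(k_weighted(1.75, hi, hj), S, hi, hj))
print(h_kalman(S, hi, hj, c=0.5, S_m=0.3))
```

Two limits are worth noting: the weighted formula reduces to the plain constant K when $h_i = h_j$, and the exsin formula with $c = 0$ reduces to eq 3.1.71 with $K = 1$.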

3.1.8.4. External Potential in EHT. The simple Hoffmann Hamiltonian, eq 3.1.71, predicted incorrect equilibrium geometries252,254 and could not properly describe charge-transfer processes,321,322 especially in ionic crystals.323 The solution was based on modification of the electronic Hamiltonian, as discussed above, and on the use of properly parametrized and tuned nuclear repulsion terms and electron−nuclear attraction energy terms.252−254,306,324,325

Starting from general considerations, Anderson and Hoffmann252 derived the expression for the PES of diatomic molecules of the form:

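$$E_{AB}(R) = E_{\text{nucl},AB}(R) + E_{\text{NPF},AB}(R) \tag{3.1.77}$$

where $E_{\text{nucl},AB}(R)$ is the repulsive electrostatic interaction energy of the nucleus A with the neutral atom B:

$$E_{\text{nucl},AB}(R) = Z_A \left[ \frac{Z_B}{R} - \int \frac{\rho_B(\mathbf{r})}{|\mathbf{R} - \mathbf{r}|}\, d\mathbf{r} \right] \tag{3.1.78}$$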

where $Z_A$ and $Z_B$ are the core charges of the centers A and B, respectively, and $E_{\text{NPF},AB}(R)$ is derived from the Hellmann−Feynman theorem:

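$$E_{\text{NPF},AB}(R) = -Z_A \int_{-\infty}^{R} dR' \int d\mathbf{r}\; \rho_{\text{NPF}}(R', \mathbf{r})\, \frac{d}{dR'} \frac{1}{|\mathbf{R}' - \mathbf{r}|} \propto -\frac{Z_A}{R} \tag{3.1.79}$$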

The first, purely electrostatic term, eq 3.1.78, can be computed explicitly, using the tabulated integrals between STOs. The latter term, eq 3.1.79, was associated with the electronic energy of the EHT method, which is given by the sum of the energies of occupied orbitals.

Starting from the Hartree energy, Carbo et al.324 introduced an electrostatic correction to the EHT method. The derivation proceeds via a CNDO-type approximation to the electrostatic energy terms. The final result gives the total PES as

$$E(R) = E_{\text{EHT}}(R) + E_{\text{corr}}(R) \tag{3.1.80}$$

The correction energy, $E_{\text{corr}}(R)$, is given by

$$E_{\text{corr}}(R) = \sum_{A<B} Z_A Z_B \left\{ (s_A s_A | s_B s_B) - (s_A s_A | B) - (s_B s_B | A) + \frac{1}{R_{AB}} \right\} \tag{3.1.81}$$

where $s_I$ is the s-type orbital of the atom I with the corresponding principal quantum number. With further approximation of the two-center integral by

$$(s_A s_A | s_B s_B) \approx \frac{1}{2} \left[ (s_A s_A | B) + (s_B s_B | A) \right] \tag{3.1.82}$$

the expression eq 3.1.81 is transformed into a form reminiscent of that of Anderson and Hoffmann.252 The approach provides a systematic way of improving the EHT formulation for polyatomic molecules.

Following the expression of Carbo et al.,324 Calzaferri expressed the electrostatic energy in the form:

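$$E_{\text{corr}}(R) = \sum_{A<B} \left\{ \frac{Z_A Z_B}{R_{AB}} - \frac{1}{2} \left[ Z_A \int \frac{\rho_B(\mathbf{r})}{|\mathbf{R}_A - \mathbf{r}|}\, d\mathbf{r} + Z_B \int \frac{\rho_A(\mathbf{r})}{|\mathbf{R}_B - \mathbf{r}|}\, d\mathbf{r} \right] \right\} \tag{3.1.83}$$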

where the integrals are computed as

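$$\int \frac{\rho_B(\mathbf{r})}{|\mathbf{R} - \mathbf{r}|}\, d\mathbf{r} = \sum_{n,l} b_{n,l} \left[ \frac{1}{R} - \frac{\exp(-2\xi_{n,l} R)}{R} \sum_{p=1}^{2n} \frac{p\,(2\xi_{n,l} R)^{2n-p}}{2n\,(2n-p)!} \right] \tag{3.1.84}$$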

The coefficients $b_{n,l}$ are the AO occupation numbers. Because the electrostatic correction term, eq 3.1.83, is added on top of the EHT electronic energy, the populations are determined by the EHT Hamiltonian and are not self-consistent. This helps accelerate calculations, while retaining the required repulsion terms in the EHT PES.
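The closed-form integral of eq 3.1.84 is easy to evaluate directly. The sketch below assumes that each occupied shell is a normalized, spherically averaged STO density with principal quantum number n and exponent ξ, so that the bracket is the electrostatic potential of one electron in that shell at a distance R from the nucleus. The function names and test values are hypothetical.

```python
import math

def sto_potential(R, n, xi):
    """Potential at distance R from one electron in a spherical STO shell (n, xi).

    Evaluates 1/R - exp(-2 xi R)/R * sum_{p=1}^{2n} p (2 xi R)^(2n-p) / (2n (2n-p)!).
    """
    x = 2.0 * xi * R
    s = sum(p * x ** (2 * n - p) / (2 * n * math.factorial(2 * n - p))
            for p in range(1, 2 * n + 1))
    return (1.0 - math.exp(-x) * s) / R

def density_integral(R, shells):
    """Sum over shells given as (occupation b_nl, n, xi) tuples."""
    return sum(b * sto_potential(R, n, xi) for b, n, xi in shells)

# Illustrative check against the textbook 1s result: 1/R - exp(-2 xi R) (xi + 1/R).
R, xi = 1.5, 1.2
closed_form_1s = 1.0 / R - math.exp(-2.0 * xi * R) * (xi + 1.0 / R)
print(sto_potential(R, 1, xi), closed_form_1s)
```

At large R the exponential term vanishes and each electron contributes 1/R, so for a neutral atom the integral cancels the point-charge term $Z_B/R$ and the correction becomes short-ranged.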

3.1.8.5. Self-Consistent EHT. The tight-binding EHT schemes were successfully applied to many-atom systems, mostly hydrocarbons. The elements in these compounds have comparable electronegativities, allowing one to neglect charge transfer. Charge transfer is more important in polar systems and excited states. The original Hamiltonian was modified to account for charge transfer by including a dependence of the Hamiltonian matrix elements on atomic charges via charge-dependent ionization potentials.288,312,319,321−323 The idea can be traced back to Harris.326 The resulting equations must be solved iteratively, until self-consistency is achieved, because the Hamiltonian depends on the charge distribution and the charge distribution depends on the Hamiltonian. The resulting method is known as the iterative EHT (IEHT), or the self-consistent EHT (SC-EHT).

In the simplest SC-EHT approach, the parameters $h_i$ are modified according to

$$h_i = h_i^{0} - a_i q_I \tag{3.1.85}$$

where $a_i$ is an adjustable parameter, and $q_I$ is the partial charge of the atom I on which the ith AO is localized. The atomic or orbital-resolved Mulliken charges327 are typically used, eqs 2.1.48. The off-diagonal matrix elements of the EHT Hamiltonian are computed according to the standard formulas, eq 3.1.71 or 3.1.74−3.1.76, but using the charge-corrected parameters $h_i$. Equation 3.1.85 implies that an excess of electron charge density (negative charge) on a given atom pushes the energy levels of its atomic states toward more positive values and makes the ionization potential smaller. Vice versa, an excess of positive charge makes ionization more difficult.
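The self-consistency cycle implied by eq 3.1.85 can be sketched in a few lines: build H from the current charges, diagonalize, recompute Mulliken charges, and repeat. The example below is a toy two-orbital heteronuclear model; the linear charge mixing, the convergence threshold, the value of the parameter a, and all function names are assumptions made for illustration rather than features of any published SC-EHT implementation.

```python
import numpy as np

def solve_generalized(H, S):
    """Solve H C = S C E by symmetric (Loewdin) orthogonalization."""
    s, U = np.linalg.eigh(S)
    X = U @ np.diag(s ** -0.5) @ U.T
    e, C_prime = np.linalg.eigh(X @ H @ X)
    return e, X @ C_prime

def mulliken_orbital_populations(C_occ, S, occupation=2.0):
    """Gross Mulliken population of each AO from doubly occupied MOs."""
    return occupation * np.sum(C_occ * (S @ C_occ), axis=1)

def sc_eht(h0, a, Z, S, atom_of_orbital, n_occ, K=1.75,
           mix=0.5, tol=1e-10, max_iter=500):
    """Iterate h_i = h_i^0 - a_i q_I (eq 3.1.85) until the charges converge."""
    h0, a, Z = (np.asarray(v, dtype=float) for v in (h0, a, Z))
    q = np.zeros(len(Z))
    for _ in range(max_iter):
        h = h0 - a * q[atom_of_orbital]
        H = 0.5 * K * S * (h[:, None] + h[None, :])   # eq 3.1.71
        np.fill_diagonal(H, h)                        # assumes S_ii = 1
        e, C = solve_generalized(H, S)
        n_orb = mulliken_orbital_populations(C[:, :n_occ], S)
        n_atom = np.bincount(atom_of_orbital, weights=n_orb, minlength=len(Z))
        q_new = Z - n_atom
        if np.max(np.abs(q_new - q)) < tol:
            return q_new, e
        q = (1.0 - mix) * q + mix * q_new             # simple linear mixing
    raise RuntimeError("SC-EHT charges did not converge")

# Toy model: two atoms, one orbital and one valence electron each (eV).
S = np.array([[1.0, 0.5], [0.5, 1.0]])
atoms = np.array([0, 1])
h0, Z = [-13.6, -11.0], [1.0, 1.0]
q_sc, e_sc = sc_eht(h0, [2.0, 2.0], Z, S, atoms, n_occ=1)
q_nsc, e_nsc = sc_eht(h0, [0.0, 0.0], Z, S, atoms, n_occ=1)  # a = 0: plain EHT
print(q_nsc, q_sc)
```

In this model the atom with the deeper level acquires a negative charge, and the charge correction raises its level in turn, so the self-consistent charge transfer is smaller than the one predicted by the noniterative Hamiltonian.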

Some formulations of the SC-EHT method involve a dependence of the atomic orbital energies for the ith orbital, $h_i$, on the fluctuation of the orbital population:

$$h_i = h_i^{0} - a_i\,\delta n_i \tag{3.1.86a}$$

where

$$\delta n_i = n_i - \bar{n}_i \tag{3.1.86b}$$

is the fluctuation of the population of the orbital i, relative to the orbital population in the isolated atom or another reference value $\bar{n}_i$. The early works also considered higher order polynomials in $\delta n_i$:288

$$h_i = h_i^{0} - a_i\,\delta n_i + b_i\,\delta n_i^{2} \tag{3.1.87}$$

It is important to emphasize that the approximations given by eqs 3.1.86−3.1.87 destroy the rotational invariance of the Hamiltonian. Therefore, they should not be used, despite their higher flexibility for parametrization. Rotational invariance of the charge-corrected Hamiltonian is preserved if the charges are chosen to be rotationally invariant atomic partial charges, for example Mulliken charges. Nonetheless, a number of researchers utilized the orbital-resolved populations for construction of charge-corrected EHT Hamiltonians.

The recent work of Truhlar et al. presents a noniterative equivalent of the SC-EHT designed for the description of charge transfer effects.328 The method extends the conventional EHT Hamiltonian to include multiple single-determinant excitations. The approach follows the CI philosophy, but it avoids the self-consistent determination of the CI coefficients of different excitations/charge-separated states. The CI coefficients are computed via a predefined formula rather than from diagonalization of the CI matrix. The method accelerates the calculations and avoids problems associated with iterative schemes (divergences, instabilities, etc.). At the same time, it accounts explicitly for charge-separated states and the multiconfigurational nature of some processes. Because the method is not self-consistent, a judicious choice of parameters is required and the parameters may not always be transferable.

We conclude this section by comparing some of the developments within the EHT framework to those encountered in the other methods, including the semiempirical and DFTB schemes. First, a comparison of eqs 3.1.71 and 3.1.59 shows that the EHT and the non-self-consistent DFTB are essentially the same methods, differing mostly in parametrization and in the treatment of core electrons and nuclear repulsion terms. Furthermore, the charge-corrected Hamiltonian, eq 3.1.85 or eq 3.1.86a, of the SC-EHT is essentially the same as the SCC-DFTB Hamiltonian, eq 3.1.63. Finally, the expansion eq 3.1.87 may be compared to the DFTB3 Hamiltonian with only diagonal terms in the charge fluctuation, eq 3.1.67. Thus, the EHT method can be considered a derivative of the DFT method.

Carbo showed how the electron−nuclear electrostatic interactions can be derived from the Hartree energy, skipping the CNDO-type formulation in the intermediate derivation. Our recent work329 presents an extensive discussion of the relation between the EHT and HF theories, showing that the EHT can be regarded as a specific approximation to the HF method and pointing out the deficiencies of the approximation. It is remarkable that the evolution of the HF-derived semiempirical methods arrived at explicit consideration of overlap integrals in the secular (generalized eigenvalue) equation only recently. The semiempirical methods neglected the overlaps for most of their evolution. In contrast, the AO overlaps were utilized from the beginning of the EHT method, and not only because they were used for computation of the Hamiltonian matrix elements. The overlaps play a critical role in the definition of electronic exchange and correlation, especially for local effects such as those in covalent bonding. Retention of the overlaps in the secular equation of the EHT approach may have been an important factor determining the EHT successes.

3.1.9. Timeline of Semiempirical Methods. Table 1 includes the timeline, from 1963 to 2013, of the development of semiempirical methods.

3.2. Density-Based Methods

3.2.1. Empirical Density Embedding Schemes: EAM and MEAM. Shortly after the introduction of the DFT167 and its KS168 formulation, theories of interatomic interaction potentials based on charge density started emerging. The embedded-atom method (EAM) of the Baskes group330 is a notable example. Originally developed for the description of hydrogen impurities in solid-state Ni, it was later generalized and extensively parametrized for other elements.331,332

The EAM formulation is based on the observation that the atomic embedding energy depends on the charge density of the system before a new atom (impurity) is embedded into the host, as was shown by Stott and Zaremba.333 Each atom can be considered an impurity embedded in the host. The energy of the host is a functional of the unperturbed host electron density and a function of the impurity type and position:331

$$E = F_{Z,R}[\rho_h(\mathbf{r})] \tag{3.2.1}$$

Here, $\rho_h(\mathbf{r})$ is the unperturbed electron density at the position of the host atom, while Z and R denote the type and position of the impurity, respectively. One recognizes the close similarity of this approach with the original noniterative DFTB formulation.247,248 The approximation is well justified for metals and alloys, on which the EAM focused originally. Charge transfer effects are negligible in these materials, due to the similarity of electronegativities of most metals and due to strong charge screening by the free electron cloud.

In summary, the main founding principles of the EAM are

• utilization of the charge density as the main variable that describes the ground state of the system, similarly to the DFT

• no self-consistent iterations, unlike the DFT; instead, the total density is represented by a sum of atomic charge densities

The unknown functional, $F[\rho(\mathbf{r})]$, eq 2.2.3, is approximated by an empirically found functional form together with the Coulombic interaction (which actually includes the $\int v(\mathbf{r})\rho(\mathbf{r})\,d\mathbf{r}$ term of the DFT). The parameters are derived empirically.

Specifically, the original form of the total energy of the system is

$$E_{\text{tot}} = \sum_i F_i(\rho_i(R_i)) + \frac{1}{2} \sum_{i,j\,(i \neq j)} \varphi_{ij}(R_{ij}) \tag{3.2.2}$$

where $R_{ij} = |\mathbf{R}_i - \mathbf{R}_j|$ is the distance between the sites i and j. The first summation runs over all atomic sites. Its argument depends on the total charge density generated by all other sites (atoms). Thus, it is a many-body potential, especially appropriate for solid systems with a delocalized charge distribution, such as metals. The second term represents the conventional Coulombic pairwise

contributions. The functions $F_i(\rho_i(R_i))$ typically possess a minimum, and can be represented by the following expression:

ρ ρρ

= + ++

F b bb b

( )1

1 23 4 (3.2.3)

with $\{b_i\}$ being the fitting parameters. In particular, the original work331 used the expression eq 3.2.3 to fit the quantum mechanical data by Puska et al.334 The later work of Nørskov335 discussed the calculation of the embedding functionals from the DFT perspective. The functions $\varphi_{ij}(R_{ij})$ are purely repulsive (core repulsion) and depend on the type of atoms (ions), as indicated by the two indices. As suggested in ref 331, the functions can be represented in terms of the effective (screened) charges of the ionic center, $Z_i(|R|)$:

$$\varphi_{ij}(R_{ij}) = \frac{Z_i(R_{ij})\, Z_j(R_{ij})}{R_{ij}} \tag{3.2.4}$$

with each Z chosen to satisfy the asymptotic properties $Z_i(0) = Z_i^{0}$ and $Z_i(+\infty) = 0$. An appropriate choice for the function Z is, for example:

$$Z(R) = Z_0 (1 + \beta R^{\nu}) \exp(-\alpha R) \tag{3.2.5}$$

where $Z_0$ is the number of outer electrons (e.g., 10 for Ni, Pd, and Pt and 11 for Cu, Ag, and Au), and α, β, and ν are the fitting parameters.

The electron density at a given site, $\rho_i(R_i)$, is computed in the original formulation331 as the sum of the atomic densities from all other atoms:

$$\rho_i(R_i) = \sum_{j \neq i} \rho_{a,j}(R_{ij}) \tag{3.2.6}$$

where $\rho_{a,j}(R_{ij})$ is the atomic density created by the atom j at the position of the atom i. The atomic densities are assumed to be spherically symmetric in the EAM. They are computed using the atomic orbital parameters from the tables of Clementi and Roetti,314 and McLean and McLean:315

$$\rho_a(R_{ij}) = \frac{1}{4\pi} \left| \sum_k C_k R_k(R_{ij}) \right|^2 \tag{3.2.7}$$

$$R_k(R_{ij}) = \frac{(2\xi_k)^{n_k + 1/2}}{[(2 n_k)!]^{1/2}}\, R_{ij}^{\,n_k - 1} \exp(-\xi_k R_{ij}) \tag{3.2.8}$$
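The structure of the EAM energy, eqs 3.2.2 and 3.2.4−3.2.6, can be sketched for a single-species cluster as follows. The embedding function and the atomic density used in the example are simple stand-ins chosen only to make the code runnable (a square-root embedding and an exponential density); they are assumptions and not the fitted forms of the original EAM. The function names are hypothetical and the units are arbitrary.

```python
import numpy as np

def screened_charge(R, Z0, alpha, beta, nu):
    """Effective charge of eq 3.2.5: Z(R) = Z0 (1 + beta R^nu) exp(-alpha R)."""
    return Z0 * (1.0 + beta * R**nu) * np.exp(-alpha * R)

def eam_energy(positions, embed, atomic_density, Z0, alpha, beta, nu):
    """Total energy of eq 3.2.2 for a single-species cluster."""
    pos = np.asarray(positions, dtype=float)
    n = len(pos)
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)
    off = ~np.eye(n, dtype=bool)                    # mask selecting i != j
    # Host density at each site: sum of atomic densities of the others, eq 3.2.6.
    rho = np.array([atomic_density(d[i, off[i]]).sum() for i in range(n)])
    # Pairwise core repulsion, eq 3.2.4, with the 1/2 double-counting factor.
    Zs = screened_charge(d[off], Z0, alpha, beta, nu)
    pair = 0.5 * np.sum(Zs * Zs / d[off])
    return np.sum(embed(rho)) + pair

# Stand-in functional forms (assumptions, not the published parametrization).
embed = lambda rho: -np.sqrt(rho)
atomic_density = lambda r: np.exp(-2.0 * r)

dimer = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
E_dimer = eam_energy(dimer, embed, atomic_density, Z0=1.0, alpha=1.0, beta=0.0, nu=1.0)
print(E_dimer)
```

Because the embedding term is a nonlinear function of a sum over neighbors, the energy is not a sum of pair terms even though every ingredient depends only on interatomic distances.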

The EAM formulation was extended to a broader range of elements (Cu, Ag, Au, Ni, Pd, Pt) by Foiles et al.332 Proper description of the embedding function was the main challenge for such a parametrization. The authors used important results by Rose et al.,336 who suggested a universal scaling law for the description of the sublimation energy of metals as a function of the lattice constant:

$$E(a) = -E_{\text{sub}} (1 + a^{*})\, e^{-a^{*}} \tag{3.2.9}$$

$E_{\text{sub}}$ is the absolute value of the sublimation energy at zero temperature and pressure, and the quantity $a^{*}$ describes the deviation from equilibrium:

$$a^{*} = \left( \frac{a}{a_0} - 1 \right) \left( \frac{9 B \Omega}{E_{\text{sub}}} \right)^{1/2} = \beta (r - r_0) \tag{3.2.10}$$

$$\beta = \left( \frac{9 B \Omega}{r_0^{2}\, E_{\text{sub}}} \right)^{1/2} \tag{3.2.11}$$

B is the bulk modulus of the material, $r_0$ is the equilibrium first-neighbor distance, $a_0$ is the equilibrium lattice constant, a is the lattice constant at a given condition, and Ω is the equilibrium volume per atom.

Using the result of Rose, eq 3.2.9, Baskes337 showed that the embedding functional can be chosen as

$$F(\rho) = E_0 \left( \frac{\rho}{\bar{\rho}} \right) \ln \left( \frac{\rho}{\bar{\rho}} \right) \tag{3.2.12}$$

where $\bar{\rho}$ is the charge density in the equilibrium (reference) system, and $E_0 = E_{\text{sub}}$ is the sublimation energy. The justification of eq 3.2.12 is based on consideration of the system energy in different environments and the related variation of the number of bonds (and hence charge density), n, with respect to the first-neighbor distance, r:

Table 1. Timeline of Semiempirical Methods Development

| year | event | authors |
|---|---|---|
| 1963 | EHT | Hoffmann (ref 250) |
| 1965 | CNDO | Pople et al. (refs 169, 170) |
| | Cusachs correction to EHT | Cusachs (ref 318) |
| 1966 | CNDO/2 | Pople et al. (ref 171) |
| 1967 | INDO | Pople et al. (ref 172) |
| 1969 | MINDO | Dewar et al. (ref 173) |
| 1970 | MINDO/2 | Dewar et al. (ref 174) |
| 1973 | SINDO prototype | Coffey and Jug (ref 209) |
| | ZINDO | Zerner et al. (ref 183) |
| | SC-EHT with exsin formula | Kalman (ref 319) |
| 1975 | MINDO/3 | Dewar et al. (ref 175) |
| | nonperfectly following correction to PES derived from EHT | Anderson (ref 253) |
| 1977 | MNDO | Dewar and Thiel (refs 187, 188) |
| 1978 | SINDO | Jug (ref 208) |
| | weighted Wolfsberg−Helmholz formula for EHT | Ammeter et al. (ref 282) |
| 1980 | SINDO1 | Jug et al. (refs 210−212) |
| 1985 | AM1 | Dewar et al. (refs 193−196) |
| 1989 | PM3 | Stewart (refs 197−199) |
| | improved R-dependent formula for K in EHT | Calzaferri et al. (refs 254, 320) |
| 1992 | MNDO/d | Thiel and Voityuk (refs 189, 190) |
| 1993 | OM1 | Thiel and Kolb (ref 202) |
| 1994 | SMLC/AM1 | de Andrade et al. (refs 236, 237) |
| 1995 | DFTB1 | Porezag et al. (ref 247) |
| 1998 | DFTB2 | Elstner et al. (ref 249) |
| 1999 | MSINDO | Ahlswede and Jug (refs 218, 219) |
| 2000 | OM2 | Thiel and Weber (ref 203) |
| 2003 | OM3 | Scholten (ref 204) |
| 2004 | Sparkle/AM1 | Rocha et al. (ref 238) |
| 2006 | Sparkle/PM3 | Freire et al. (ref 241) |
| | NO-MNDO | Sattelmeyer et al. (ref 207) |
| | DFTB3 | Elstner et al. (ref 63) |
| 2007 | PM6 | Stewart (ref 200) |
| 2008 | non-self-consistent CI-EHT instead of SC-EHT | Iron et al. (ref 328) |
| 2010 | Sparkle/PM6 | Freire et al. (ref 245) |
| 2011 | MSINDO-sCIS | Gadaczek et al. (ref 228) |
| 2013 | PM7 | Stewart (ref 201) |
| | Sparkle/PM7 | Dutra et al. (ref 246) |

$$r - r_0 \sim \ln \left( \frac{n}{n_0} \right) \tag{3.2.13}$$
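The universal scaling law of eq 3.2.9 and the logarithmic embedding function of eq 3.2.12 can be checked numerically with a short sketch. The material constants below are rough, Ni-like values used purely for illustration (they are assumptions, not data from the cited works), and the function names are hypothetical.

```python
import math

def rose_energy(a, a0, E_sub, B, Omega):
    """Universal binding curve, eqs 3.2.9 and 3.2.10."""
    a_star = (a / a0 - 1.0) * math.sqrt(9.0 * B * Omega / E_sub)
    return -E_sub * (1.0 + a_star) * math.exp(-a_star)

def embedding(rho, rho_ref, E0):
    """Logarithmic embedding function, eq 3.2.12."""
    x = rho / rho_ref
    return E0 * x * math.log(x)

# Rough, Ni-like illustrative constants: eV, Angstrom, eV/Angstrom^3.
a0, E_sub, B = 3.52, 4.45, 1.12
Omega = a0**3 / 4.0   # volume per atom in an fcc lattice

E_min = rose_energy(a0, a0, E_sub, B, Omega)
h = 1e-3
curvature = (rose_energy(a0 + h, a0, E_sub, B, Omega)
             - 2.0 * E_min
             + rose_energy(a0 - h, a0, E_sub, B, Omega)) / h**2
print(E_min, curvature, 9.0 * B * Omega / a0**2)
print(embedding(1.0 / math.e, 1.0, E_sub))
```

The curve has its minimum, $-E_{\text{sub}}$, at the equilibrium lattice constant, and its curvature there equals $9B\Omega/a_0^2$, which is how the bulk modulus enters the scaling. The embedding function of eq 3.2.12 vanishes at the reference density and reaches its minimum, $-E_0/e$, at $\rho = \bar{\rho}/e$.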

A relationship similar to eq 3.2.13 was known to Pauling,338 who proposed the bond order concept and related bond orders to the strength (and hence the length) of chemical bonds. The functional dependence of the atom embedding energy into a homogeneous host with a given electron density, similar to eq 3.2.12, was derived by Nørskov directly from DFT considerations.335

The early forms of the EAM were very successful in describing various ground state properties of metals and their alloys. The success comes both from the appropriate functional form, which obeys the asymptotic properties, and from a thorough parametrization. Such an approach can be viewed today as a relatively inexpensive method for treating large systems at affordable computational costs. It is appropriate for describing reactive processes, but does not provide orbital information. The main limitation of the method is hidden in its formulation: the assumption of spherical symmetry of atomic charge densities. As a consequence, only systems without directional bonds, such as metals and their alloys, could be represented well. Materials with directional bonds, typically semiconductors, and molecular complexes and crystals, could not be accurately described.

Unlike metals with almost spherically symmetric d-electron distributions, covalent materials, such as Si, do show strong bonding anisotropy, because of the pronounced anisotropy of p-orbitals. As a result, many properties that depend on bonding directionality, such as shear behavior, cannot be modeled accurately with the simple EAM approach. The difficulties can be resolved by adding an angular dependence of the charge density, as first proposed by Baskes in his prototype of the modified EAM (MEAM) approach for silicon.337 The angular dependence of the atomic charge densities is introduced by modification of eq 3.2.6:

$$\rho_i(R_i) = \sum_{j \neq i} \rho_a(R_{ij}) - \frac{1}{3} \sum_{j \neq i,\, k \neq i} (1 - 3\cos^2\theta_{ijk})\, \rho_a(R_{ij})\, \rho_a(R_{ik}) \tag{3.2.14}$$

and a slight modification of the expression for the core repulsion terms, $\varphi_{ij}(R_{ij})$, eq 3.2.4. The modification is chosen to reproduce the energy minima in a number of Si structures, as well as the universal equation of state, eq 3.2.9. Further extension to Si and Ge was developed by Baskes, Nelson, and Wright a few years later. The resulting method was named MEAM.339 The approach was extended to a larger set of elements and was capable of describing different types of chemical bonding, ranging from diatomic molecules to semiconductors and metals. The method can be labeled as MEAM92 according to the year of its development.340

The central idea of the MEAM is to utilize anisotropic partial background charge densities. The spherically symmetric EAM expression, eq 3.2.6, becomes merely a zero-order term:

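$$\rho_i^{(0)}(R_i) = \sum_{j \neq i} \rho_{a,j}^{(0)}(R_{ij}) \tag{3.2.15a}$$

Higher-order terms are added:

$$\left(\rho_i^{(1)}(R_i)\right)^2 = \sum_{\alpha = x,y,z} \Big( \sum_{j \neq i} u_{\alpha,ij}\, \rho_{a,j}^{(1)}(R_{ij}) \Big)^2 \tag{3.2.15b}$$

$$\left(\rho_i^{(2)}(R_i)\right)^2 = \sum_{\alpha,\beta = x,y,z} \Big( \sum_{j \neq i} u_{\alpha,ij} u_{\beta,ij}\, \rho_{a,j}^{(2)}(R_{ij}) \Big)^2 - \frac{1}{3} \Big( \sum_{j \neq i} \rho_{a,j}^{(2)}(R_{ij}) \Big)^2 \tag{3.2.15c}$$

$$\left(\rho_i^{(3)}(R_i)\right)^2 = \sum_{\alpha,\beta,\gamma = x,y,z} \Big( \sum_{j \neq i} u_{\alpha,ij} u_{\beta,ij} u_{\gamma,ij}\, \rho_{a,j}^{(3)}(R_{ij}) \Big)^2 \tag{3.2.15d}$$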

where uα,ij = Rα,ij/Rij is the α projection of the normalized vector connecting the sites i and j, and $\rho_j^{a,(l)}(R_{ij})$ is the lth order component of the atomic density of the type j atom as a function of the distance Rij from the site i. The third-order partial density, eq 3.2.15d, was refined in later works341,342 to exclude contributions from the lower-order terms:

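\[
\left(\rho_i^{(3)}(R_i)\right)^2 = \sum_{\alpha,\beta,\gamma = x,y,z} \left( \sum_{j \neq i} u_{\alpha,ij}\, u_{\beta,ij}\, u_{\gamma,ij}\, \rho_j^{a,(3)}(R_{ij}) \right)^2 - \frac{3}{5} \sum_{\alpha = x,y,z} \left( \sum_{j \neq i} u_{\alpha,ij}\, \rho_j^{a,(3)}(R_{ij}) \right)^2 \tag{3.2.15e}
\]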

The lth order component of the atomic density is given by the expression of the bond-order type:

\[
\rho_i^{a,(l)}(R) = \exp\left\{ -\beta_i^{(l)} \left( \frac{R}{R_i^{0}} - 1 \right) \right\} \tag{3.2.16}
\]

where $\beta_i^{(l)}$ and $R_i^{0}$ are parameters. This expression simplifies computations notably, in comparison to eqs 3.2.7 and 3.2.8, yet it retains similar qualitative behavior, showing the expected exponential long-range asymptotics.

The choice of the functional form in eqs 3.2.15b and 3.2.15c is motivated by the requirement of invariance of the corresponding charge densities to lattice translation and rotation. One can show339 that the expressions 3.2.15b and 3.2.15c can be rewritten in the form containing three-body terms with dependence on $\cos^n\theta_{ijk}$, with n varying up to 1, 2, and 3 for eqs 3.2.15b, 3.2.15c, and 3.2.15d, respectively. The mathematical form of the partial background densities, eqs 3.2.15, suggests that they are related to specific angular momentum components (s, p, d, etc.) of the quantum-mechanical electronic charge density, although the exact relation between the two is not known. One can also relate the contributions in eqs 3.2.15 to different orders in the Taylor expansion of the DFT energy, as it is done in the DFTB family of methods.

The total (average) density is constructed by summing the contributions from different orders. To avoid mathematical difficulties (singularities of the density and energy derivatives), the summation is performed directly in terms of the squares of densities:

\[
\left(\rho_i(R_i)\right)^2 = \sum_{l=0}^{3} t_i^{(l)} \left(\rho_i^{(l)}(R_i)\right)^2 \tag{3.2.17a}
\]

where $t_i^{(l)}$ are the superposition coefficients, defined as parameters. Equation 3.2.17a can be rewritten as a perturbative expansion:

\[
\rho_i(R_i) = \rho_i^{(0)}(R_i) \left[ 1 + \frac{1}{2} \sum_{l=1}^{3} t_i^{(l)} \left( \frac{\rho_i^{(l)}(R_i)}{\rho_i^{(0)}(R_i)} \right)^2 + \ldots \right] \tag{3.2.17b}
\]


The result, eq 3.2.17b, provides another way to relate the MEAM theory to the DFTB series of methods.

The original MEAM formalism relied on consideration of only the first-nearest neighbors (1 NN): the interactions were cut off for distances beyond the 1 NN threshold. This restriction helped to accelerate calculations and was valid in many cases, but it led to a number of shortcomings. For example, the bcc structures were found to be metastable and could transform easily to different geometries during finite-temperature MD simulations. At the same time, the distance to the second-nearest neighbors (2 NN) in the bcc lattice is only slightly larger (by about 15%) than that to the 1 NN. To account for these missing interactions, the MEAM (to be more precise, 1NN-MEAM) was extended to include the effects of 2 NN (2NN-MEAM).341,342
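The partial background densities of eqs 3.2.15a−3.2.15e and their superposition, eq 3.2.17a, can be sketched in a few lines of array code. The block below is only an illustration under stated assumptions: the function names, the default values of β, t, and R0, and the explicit neighbor list (which plays the role of the 1 NN cutoff) are hypothetical and do not correspond to any published MEAM parametrization.

```python
import numpy as np


def atomic_density(r, beta, r0=1.0):
    """Atomic density of eq 3.2.16: exp(-beta * (r / r0 - 1))."""
    return np.exp(-beta * (np.asarray(r, dtype=float) / r0 - 1.0))


def meam_background_density(neighbors, betas=(1.0, 1.0, 1.0, 1.0),
                            t=(1.0, 1.0, 1.0, 1.0), r0=1.0):
    """Return squared partial densities (l = 0..3) and the total density at site i.

    neighbors: (N, 3) array of vectors R_ij pointing from site i to neighbor j.
    betas, t, r0: illustrative parameters (assumptions, not fitted values).
    """
    vec = np.atleast_2d(np.asarray(neighbors, dtype=float))
    dist = np.linalg.norm(vec, axis=1)
    u = vec / dist[:, None]                       # unit vectors u_{alpha,ij}
    rho = [atomic_density(dist, b, r0) for b in betas]

    # eq 3.2.15a (squared), 3.2.15b, 3.2.15c, and 3.2.15e
    rho0_sq = rho[0].sum() ** 2
    rho1_sq = np.sum(np.einsum("ja,j->a", u, rho[1]) ** 2)
    rho2_sq = (np.sum(np.einsum("ja,jb,j->ab", u, u, rho[2]) ** 2)
               - rho[2].sum() ** 2 / 3.0)
    rho3_sq = (np.sum(np.einsum("ja,jb,jc,j->abc", u, u, u, rho[3]) ** 2)
               - 0.6 * np.sum(np.einsum("ja,j->a", u, rho[3]) ** 2))

    partial_sq = np.array([rho0_sq, rho1_sq, rho2_sq, rho3_sq])
    total = np.sqrt(np.dot(t, partial_sq))        # eq 3.2.17a
    return partial_sq, total
```

For six neighbors placed octahedrally at the distance R0, all angular terms (l = 1, 2, 3) vanish by symmetry and only the spherical EAM-like term survives, which is the sense in which eq 3.2.15a is the zero-order term. For a low-symmetry environment the l = 1 term equals the double sum of ρj ρk cos θjik over neighbor pairs, which makes the three-body character of eq 3.2.15b explicit.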

The initial MEAM formulation339 contained parametrization for 26 elements, including C, Si, Ge, H, N, O, and many metals, thus covering a large number of distinct phases and materials. The MEAM92 parametrization was extended or refined in later works for a larger number of elements.343−345

The MEAM was employed in simulations of large-scale systems, including studies on the mechanical properties of Cu nanowires and shape memory effect,346 melting of metal nanoclusters on substrates,347 nucleation, formation, and growth of oxides at metal surfaces,348 and ductile and brittle fracture of nanowires.349 Multiscale simulation of nanoscale materials,350 dislocation dynamics,351,352 and mechanisms of materials fracture353,354 were investigated. These simulations included several hundred to several thousand metal atoms and considered long reactive dynamic time scales. Description of such processes at the MO representation level would be extremely expensive even with simple semiempirical methods. Additional complications would arise from an excessive number of d-orbitals and degeneracies, which can significantly affect computational efficiency via dimensionality increase and convergence difficulties, respectively.

3.2.2. Orbital-Free DFT (OF-DFT). A natural generalization of the above-mentioned reactive potentials would be the formulation based on charge density, i.e., DFT. To overcome the computational limitations of the standard KS-DFT, the methodology should avoid auxiliary orbitals. Orbital-free DFT (OF-DFT) is one of the techniques pursuing this philosophy. The approach takes full advantage of the Hohenberg−Kohn formulation of DFT. The philosophy is similar to the previously described EAM and MEAM schemes, which originate from the rigorous DFT theory and introduce a number of computationally convenient and physically motivated noniterative approximations to the total energy. No auxiliary wave functions are used. The electron density remains the central and the only variable. As a result, the computation scaling is linear with the system size, and the method can be applied to very large systems.

Construction of a sufficiently accurate kinetic energy density functional (KEDF) is the main challenge of the OF-DFT method. The requirement of finding a highly accurate KEDF is motivated by the virial theorem, according to which the kinetic energy is comparable in magnitude to the total energy of a system. In contrast, the exchange term is only a correction to the total energy. As a result, the KEDF accuracy is much more critical to the success of the resulting scheme than the accuracy of the exchange terms. The methodologies to derive these two groups of functionals are, therefore, very different.

We remind the reader that the difficulties with the early KEDFs, which were not suitable for chemical applications, motivated introduction of the auxiliary one-electron wave functions in the KS formalism. The early KEDFs were appropriate only for the uniform electron gas in solids (Thomas−Fermi, etc.)355,356 or for the single-orbital limit (von Weizsäcker).357 The accuracy of such functionals is far from the accuracy needed for realistic applications of chemical interest. The success of the KS formulation of the DFT came mainly from its ability to approximate accurately the kinetic energy functional using the auxiliary KS orbitals. However, this same KS approach makes the KS-DFT computationally far less efficient, since it requires self-consistent solution of equations of increased dimensionality (defined by the basis set size) and needs expensive matrix diagonalization and multiplication operations.

Early efforts on orbital-free functionals (although the term did not exist) were limited to the solid-state community, including the mentioned Thomas−Fermi and von Weizsäcker models. More recent developments of the OF-DFT were performed by Cortona, who formulated the OF-DFT method.358 KEDFs of chemical accuracy were developed by Wang and Teter,359 and Smargiassi and Madden,360 mostly for solid-state chemistry. Significant progress was achieved by the Carter group.34,65,361−366

With the invention of reliable nonlocal KEDFs, derived using the linear response theory, it became possible to achieve both reasonable accuracy and astonishing efficiency. Used extensively by the Carter group, the Wang−Govind−Carter 1999 (WGC99) functional361,362 is given by

\[
T[\rho] = T_{TF}[\rho] + T_{vW}[\rho] + T_{Z}[\rho] \tag{3.2.18}
\]

The first two terms account for the two limiting cases. The Thomas−Fermi term

\[
T_{TF}[\rho] = 0.3\,(3\pi^2)^{2/3} \int \rho^{5/3}(r)\, dr \tag{3.2.19}
\]

is exact for the uniform electron gas.355,356 The von Weizsäcker term

\[
T_{vW}[\rho] = -\frac{1}{2} \int \rho^{1/2}(r)\, \nabla^2 \rho^{1/2}(r)\, dr \tag{3.2.20}
\]

is exact for single orbital systems.357 The third term is obtained from the linear response theory,167 which implies:

\[
\left. \frac{\delta^2 T}{\delta \rho^2} \right|_{\rho = \rho_0} = -\frac{1}{\chi_0} \tag{3.2.21}
\]

where ρ0 is the average electron density, and χ0 is the linear response function of the noninteracting homogeneous electron gas. For example, taking the linear response function to be of the Lindhard form, Carter and co-workers arrived at the following expression for the nonlocal term:

\[
T_{Z}[\rho] = \int\!\!\int \rho^{\alpha}(r)\, w\big(\xi_{\gamma}(r, r'), |r - r'|\big)\, \rho^{\beta}(r')\, dr\, dr' \tag{3.2.22}
\]

The kernel w(ξγ(r, r′), |r − r′|) is chosen to satisfy the Lindhard linear response, where

\[
\xi_{\gamma}(r, r') = \left( \frac{[3\pi^2 \rho(r)]^{\gamma/3} + [3\pi^2 \rho(r')]^{\gamma/3}}{2} \right)^{1/\gamma} \tag{3.2.23}
\]

and γ is an adjustable parameter.

The OF-DFT pseudopotentials used to describe the core potential must be spherically symmetric, while the KS-DFT has no such restriction. The range of materials to which the OF-DFT method can be applied narrows because of this limitation. Typically, these are the nearly free-electron metals of the main group.65,360−362 The OF-DFT shows limited accuracy when applied to systems with localized states, such as semiconductors or transition metals with localized d-electrons.367,368 Huang and Carter (HC) developed the KEDF for simulation of semiconductors in 2010. This capability came at a notably increased computational cost. The applicability of the HC KEDF to molecular systems was tested.364,365 The geometric properties, vibrational frequencies, and some charge distributions were similar to those obtained using the conventional KS-DFT. Nevertheless, a number of deficiencies remain. For example, the electron densities in bonding regions, magnetic properties, and ordering of singlet and triplet states often show large errors. By controlling the extent of electron density localization, Shin and Carter366 extended the WGC99 functional to semiconductors. It showed higher accuracy than the previous HC functional and was computationally more efficient than the WGC99 functional.

The limitation to utilize only spherically symmetric core potentials was lifted in 2014, leading to the angular-momentum-dependent OF-DFT.369 The new OF-DFT functional generation shows increased accuracy over the previous formulations. It is potentially applicable to nonmetal systems, offering extensions to most elements in the periodic table. It is interesting to note an analogy with the embedding schemes. Generalization of the initial spherical formulation of the OF-DFT by introduction of angular dependence parallels the transformation of the initially spherically symmetric EAM formulation into the angular-dependent MEAM. Both original OF-DFT and EAM started from metallic systems (nearly free-electron, in the case of the OF-DFT), in which the spherical approximation of the potential was well justified. The extensions were inevitable, since semiconductors and covalently bonded materials required angular-dependent potentials.

Provided that the OF-DFT method is sufficiently accurate, it can be applied to systems far beyond the reach of other quantum mechanical approaches. Notable examples include simulation of mesoscale metal systems with about 1 million atoms using a modest number of processors,65 studies of the tensile strength and elastic properties of Al nanowires of nanoscale thickness,34 and studies of micrometer-scale nanoindentation of Al thin films involving millions of atoms.33 The method was also applied to molecular dynamics of moderate size systems, for which the standard DFT procedures are not accessible, for example, to investigate dislocation mobility in metals.35
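The two limiting terms of eq 3.2.18 are easy to evaluate numerically, which makes their complementary character concrete. The following sketch, in atomic units, applies eqs 3.2.19 and 3.2.20 to the hydrogen 1s density on a radial grid. The grid, the helper names, and the use of the equivalent gradient form of the von Weizsäcker term (obtained by integration by parts, assuming the density vanishes at infinity) are choices made here for illustration only; the nonlocal term TZ is not included.

```python
import numpy as np

# Thomas-Fermi prefactor of eq 3.2.19, 0.3 * (3 pi^2)^(2/3)
C_TF = 0.3 * (3.0 * np.pi ** 2) ** (2.0 / 3.0)


def integrate(y, x):
    """Trapezoidal rule, written out explicitly."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def hydrogen_1s_density(r):
    """Exact ground-state density of the H atom, rho(r) = exp(-2 r) / pi."""
    return np.exp(-2.0 * r) / np.pi


def thomas_fermi(rho, r):
    """Eq 3.2.19 for a spherically symmetric density on a radial grid."""
    return C_TF * integrate(4.0 * np.pi * r ** 2 * rho ** (5.0 / 3.0), r)


def von_weizsacker(rho, r):
    """Eq 3.2.20 in the equivalent form (1/2) * integral of |grad sqrt(rho)|^2."""
    grad = np.gradient(np.sqrt(rho), r)
    return 0.5 * integrate(4.0 * np.pi * r ** 2 * grad ** 2, r)


if __name__ == "__main__":
    r = np.linspace(1e-6, 30.0, 200001)
    rho = hydrogen_1s_density(r)
    print("T_TF =", thomas_fermi(rho, r))
    print("T_vW =", von_weizsacker(rho, r))
```

For this one-orbital density the von Weizsäcker term reproduces the exact kinetic energy of 0.5 hartree, whereas the Thomas−Fermi term gives only about 0.289 hartree, in line with the statement that each term is exact in its own limit only.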

3.2.3. Timeline of Density-Based Methods. Table 2 includes the timeline, from 1964 to 2014, of the development of early density embedding methods.

3.3. Bond Order Methods

3.3.1. Bond Order Concept. The concept of bond order potentials leads to approaches that are reminiscent of the density embedding methods, such as the EAM and MEAM. The bond order theory dates back to Pauling,338,370 who proposed an empirical relationship between the strength of a chemical bond and the corresponding interatomic distance, eq 3.2.13. The equation can be rewritten in the form

\[
E \sim x \tag{3.3.1a}
\]

\[
x \equiv \exp(-\alpha r) \tag{3.3.1b}
\]

where x is the bond order, E is the bond dissociation energy or force constant, and r is the interatomic distance. Similar empirical relationships between the bond length and observable molecular properties were found by Badger,371,372 Herschbach and Laurie,373 and many others.374−377 However, they mostly reflected the experimental data, and did not appeal to the theory of chemical bonding. Only after Pauling did the concept of bond order become the central idea of constructing atomic interaction potentials.

The Pauling definition of bond order, eq 3.3.1b, implies that the maximal value of the bond order is 1. Therefore, it should not be confused with the multiplicity of chemical bonds (single, double, triple), commonly used in organic chemistry. The multiplicity is given by the Mulliken bond orders. We refer the reader to the works of Jules and Lombardi378,379 for recent reviews on different types of bond orders. The authors presented an extensive analysis of different bond order formulations and their relation to the known empirical results and the experimentally measurable bond orders.

3.3.2. Bond Order Conservation. The so-called bond order conservation Morse potential (BOC-MP) method380,381 is one of the important developments associated with the bond order concept. It was renamed later to the unity bond index quadratic exponential potential (UBI-QEP), to reflect the theoretically sound foundations and the latest theory refinements.382 The method was extensively developed by Shustorovich, and later by Sellers. It was abandoned to some extent in modern developments. However, we anticipate that it can be utilized to reinforce the existing reactive bond order potential methods, because of its attractive physical interpretation, mathematical flexibility, and consistent, encouraging accuracy.

The BOC-MP method was initially formulated for theoretical explanation of the experimentally observed relations between the energy barrier for diffusion of an adsorbate X on a metal surface, ΔE*, and the heat of chemisorption, Q, for transition metal/gas systems.380,381 Shustorovich considered migration of the adsorbate X from one metal surface site Mn to another, characterized by a different value of n. The coordination values n = 1, 2, 3 correspond to the on-top, bridge, and 3-fold symmetry fcc(111) hollow coordinations of the Mn−X complexes.

The main assumptions of the BOC-MP method are as follows:

(1) The interaction energy of each M−X pair is given by the Morse potential, which is a second-order polynomial of the bond order, xi:

\[
E(x_i) = Q_{0,i} \left[ x_i^2 - 2 x_i \right] \tag{3.3.2a}
\]

\[
x_i \equiv \exp\left( -\alpha_i (r_i - r_{0,i}) \right) \tag{3.3.2b}
\]

where r0,i is the equilibrium interatomic distance for the dimer M−X, αi is an adjustable parameter, and Q0,i is the bond energy of the M−X dimer in the gas phase. By definition, the BO xi is unity at equilibrium:

\[
x_i(r_i = r_{0,i}) = 1 \tag{3.3.3}
\]

The energy is minimum at this value of xi.

(2) The interactions in the Mn−X complex are limited to n nearest-neighbor metal atoms.

(3) The total energy of the Mn−X is the sum of n contributions, eq 3.3.2a, from each M−X pair:

\[
E = \sum_{i=1}^{n} E(x_i) = \sum_{i=1}^{n} Q_0 \left[ x_i^2 - 2 x_i \right] \tag{3.3.4}
\]

(4) The total sum of bond orders is unity and is conserved:


\[
\sum_{i=1}^{n} x_i = 1 \tag{3.3.5}
\]

The BO conservation principle, eq 3.3.5, is motivated by the experimental observations for three-center A−B−C interactions, as discussed by Murdoch.383 The validity of this principle for processes taking place along the minimal energy paths was demonstrated by the computational studies of Lendvay,384 although the definition of bond orders was somewhat different from eq 3.3.2b. The BOC principle, eq 3.3.5, is easy to justify qualitatively from the chemical point of view. Consider a three-center system A−B−C. The bond distances A−B, B−C, and A−C in the three-center system are different from the corresponding diatomic equilibrium values. The electronic density is shared among the three bonds, leading to weaker bonding, in general. As a result, it is typically expected that rij will be larger than the diatomic equilibrium value, $r_{ij}^{0}$, and the BO for each of the three pairs will be smaller than unity. Consider that the bond B−C dissociates (hence, A−C dissociates too). As the interatomic distances rBC and rAC approach infinity, the BOs x1 = xBC and x2 = xAC approach zero. The charge density is now shared only among the centers A and B, leading to stronger bonding and shorter distance. The AB system relaxes to its equilibrium gas-phase geometry: rAB = $r_{AB}^{0}$ and x0 = xAB approaches unity. The schematic showing the evolution of the bond orders, xi, along the reaction pathway is presented in Figure 2.

Applying the method to adatom adsorption on an Mn site, one obtains the maximal heat of adsorption

\[
Q_n = Q_0 \left( 2 - \frac{1}{n} \right) \tag{3.3.6}
\]

and the following relation between the activation energy and the heat of adsorption:

\[
\Delta E^{*} = \frac{n - 2}{4n - 2}\, Q_n \tag{3.3.7}
\]

Equation 3.3.7 is in good agreement with experimental observations. Similar analysis was applied381 to understand the activation barrier for dissociation reactions AB → A + B on metal surfaces. The activation barrier is given in this case by

\[
\Delta E^{*}_{AB} = D_{AB} - (Q_A + Q_B) + \frac{Q_A Q_B}{Q_A + Q_B} \tag{3.3.8}
\]

where DAB is the gas-phase dissociation energy of AB, and QX, X = A, B, is the heat of atomic chemisorption. The BOC principle was applied to calculate coverage-dependent chemisorption energies and to understand coadsorption effects.385−387
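The step from assumptions (1)−(4) to eq 3.3.6 is a constrained minimization: with a Lagrange multiplier for eq 3.3.5, the energy of eq 3.3.4 is stationary when all bond orders are equal, xi = 1/n, which gives E = Q0(1/n − 2) = −Qn. The sketch below checks this numerically and evaluates eqs 3.3.7 and 3.3.8. The function names and the sample numbers in the usage comments are hypothetical and serve illustration only.

```python
import numpy as np


def boc_energy(x, q0):
    """Total M_n-X energy of eq 3.3.4 for bond orders x_i (all pairs share Q_0)."""
    x = np.asarray(x, dtype=float)
    return q0 * np.sum(x ** 2 - 2.0 * x)


def heat_of_adsorption(q0, n):
    """Eq 3.3.6: Q_n = Q_0 * (2 - 1/n)."""
    return q0 * (2.0 - 1.0 / n)


def diffusion_barrier(q_n, n):
    """Eq 3.3.7: barrier for surface diffusion from an n-fold site."""
    return (n - 2.0) / (4.0 * n - 2.0) * q_n


def dissociation_barrier(d_ab, q_a, q_b):
    """Eq 3.3.8: barrier for AB -> A + B on the surface."""
    return d_ab - (q_a + q_b) + q_a * q_b / (q_a + q_b)


if __name__ == "__main__":
    q0 = 30.0                                  # hypothetical M-X bond energy
    n = 3
    x_opt = np.full(n, 1.0 / n)                # equal sharing satisfies eq 3.3.5
    print(-boc_energy(x_opt, q0), heat_of_adsorption(q0, n))
    print(diffusion_barrier(heat_of_adsorption(q0, n), n))
```

Within this model, eq 3.3.7 coincides algebraically with the difference Qn − Q2, that is, with the energy cost of moving the adsorbate from the n-fold site to a bridge site (n = 2).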

The great success of the BOC-MP method suggested that the theoretical foundations of the method were general and fundamental. The formulation was reconsidered, and it was found that the variable x(r) need not be interpreted as bond order, eq 3.3.2b. Rather, it can be a more complex function. For example, it can be a sum of many exponentials. The variable x(r) was renamed to bond index (BI). Initially considered an approximation, eq 3.3.5 was elevated to a postulate. The resulting changes are reflected in the change of the name BOC-MP into the unity bond index quadratic exponential potential (UBI-QEP).382,388,389

Table 2. Timeline of Early Density Embedding Method Development

year   event                                        authors
1964   DFT                                          Hohenberg and Kohn167
1965   KS-DFT                                       Kohn and Sham168
1980   embedding potential theorem                  Stott and Zaremba333
1981   embedding functional from ab initio calcn    Puska et al.334
1982   embedding functional, relation to DFT        Nørskov335
1984   universal equation of state for metals       Rose et al.336
       EAM                                          Daw and Baskes331
1987   MEAM prototype, for Si                       Baskes337
1989   MEAM                                         Baskes et al.339
1991   OF-DFT originally formulated                 Cortona358
1992   Wang−Teter OF-DF                             Wang and Teter359
1994   MEAM92                                       Baskes et al.340
1999   Wang−Govind−Carter KEDF                      Wang, Govind, and Carter361,362
2000   2NN-MEAM                                     Lee and Baskes341
2010   Huang−Carter KEDF for semiconductors         Huang and Carter363
2014   angular-momentum-dependent OF-DFT            Carter et al.369

Shustorovich and Zeigarnik390 generalized the method in later works, to avoid bond-energy partitioning in polyatomic molecules.

It is remarkable that such a simple model often predicts correct qualitative relationships and achieves high quantitative accuracy for the energetics of surface reactions, on the order of 1−3 kcal/mol.382 It is comparable to the accuracy of DFT or high-level ab initio computations, at a negligible cost! In contrast to the bond order potentials and reactive force fields, discussed in section 3.3.3, the method is very simple both conceptually and in implementation, and uses a minimal number of approximations. Of course, in contrast to the reactive potentials, the method is applicable to a narrower class of systems, although further elaborations can be anticipated.

In their later works, Sellers and co-workers388,391−393 adapted the UBI-QEP approach as the reactive potential to be used in molecular dynamics simulations. The resulting method was called the normalized bond index molecular dynamics (NBI-MD) method.391,392 The main difference of the new approach with respect to the UBI-QEP was its extension to the whole reaction coordinate space, not only to the high-symmetry points. The explicit dependence of the bond order functions, xi(r), on atomic coordinates was utilized, and their values were allowed to be greater than unity or even negative. To distinguish such variables from the BO indices of the UBI-QEP, they were called interaction indices and abbreviated Zi.

The second major correction introduced in the NBI-MD concerned the conservation postulate, eq 3.3.5. It was recognized that constraining the sum of interaction indices to unity was not consistent with the reaction process and that the sum should be renormalized. The dissociation of a diatomic AB molecule at a metal surface is the simplest example that illustrates the shortcoming (Figure 3). In the reactant state, the molecule AB is desorbed and is in the gas phase. The total bond index is determined by the bond index of the A−B pair and is equal to 1 by definition (the molecule is in its equilibrium gas-phase geometry). In the product state, the chemisorbed AB molecule dissociates into adatoms A and B that diffuse far from each other. Each of the A−Mn and B−Mn bond indices arrives at the equilibrium value, 1. The total bond index is then 2. As a result, in order to describe the above process, the total interaction index should be normalized to N, which varies from 1 to 2:

\[
Z_A + Z_B + Z_{AB} = N \tag{3.3.9}
\]

\[
N = 1 + \exp(-\gamma \Delta) \tag{3.3.10}
\]

where

\[
\Delta = \frac{Z_{AB}}{1 - Z_{AB}} \tag{3.3.11}
\]

and γ is an adjustable parameter.
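The behavior of the normalization in eqs 3.3.10 and 3.3.11 at the two limits can be checked directly. The short sketch below is an illustration only: the function name, the default γ = 1, and the restriction of ZAB to the interval from 0 to 1 are assumptions made here, not part of the NBI-MD formulation.

```python
import math


def total_interaction_index(z_ab, gamma=1.0):
    """N of eq 3.3.10 as a function of the A-B interaction index Z_AB.

    Assumes 0 <= Z_AB <= 1; gamma is an adjustable parameter (value illustrative).
    """
    if not 0.0 <= z_ab <= 1.0:
        raise ValueError("this sketch only covers 0 <= Z_AB <= 1")
    if z_ab == 1.0:
        return 1.0                        # Delta -> infinity, exp(-gamma * Delta) -> 0
    delta = z_ab / (1.0 - z_ab)           # eq 3.3.11
    return 1.0 + math.exp(-gamma * delta)


if __name__ == "__main__":
    for z in (1.0, 0.75, 0.5, 0.25, 0.0):
        print(z, total_interaction_index(z))
```

The intact gas-phase molecule (ZAB = 1) gives N = 1, and the fully dissociated adatoms (ZAB = 0) give N = 2, so N interpolates smoothly between the reactant and product limits described above.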

Started as a phenomenological model, the BOC-MP and then UBI-QEP can be regarded as methods derived from the DFT. This can be seen from the correspondence between the DFT and UBI-QEP constructs: (1) The total energy is given in the DFT by a universal density functional, which is unknown and usually obtained from physically motivated models. One is free to choose an arbitrary functional form of the bond index in the UBI-QEP method. This is equivalent to formulating a universal energy functional. (2) The total DFT charge density is normalized to the number of electrons. There is a closely related constraint on the bond index conservation in the BOC-derived methods. We suggest that the BOC-derived methods retain similar approximation quality as the DFT, with the main difference being the change of effective variables from charge density to bond orders. More generally, one can envision the existence of infinitely many alternative reformulations of the DFT. It is unclear, however, how the DFT and BOC-derived formulations transform into each other.

The works of Shustorovich and Sellers indicate that the accuracy of the BOC-derived methods is very high. One may relate this success to the many-particle nature of the central variables used in these formulations: The bond index variable maps the positions of two points in space into a single variable. In contrast, the charge density variable used in the DFT maps only one such point, and hence, it encodes a smaller amount of information. The idea of utilization of variables that are beyond the simple density function appeared in the generalized gradient approximations of the DFT, and more recently, in the current-dependent DFT functionals. One may then compare the utilization of bond indices in the BOC-derived methods with the use of the density gradients and the current densities.

Finally, we want to compare the BOC-derived methods with the density embedding methods (EAM and MEAM). First and foremost, the difference is in the absence of variational constraints in the embedding methods. In principle, one can reformulate an unconstrained functional into a different functional with several properly chosen constraints. There is no advantage in using the constrained version from the mathematical point of view. Quite the opposite: the necessity to obey constraints may require additional iterative computations. However, utilization of a simple energy functional and a proper constraint can find a clear physical or chemical justification. In particular, the total number of electrons, and the angular and spin momenta, must be conserved. The total number of electrons is given in the MO theory by the sum of Mulliken atomic populations and Mulliken bond orders (off-diagonal elements of the density matrix), eqs 2.1.47−2.1.48. Starting from the WFT, one can relate the bond indices of the UBI-QEP theory to the Mulliken bond orders. Lack of proper constraint terms in the EAM and MEAM methods necessitates a more complex functional form of the total energy. One therefore has two ways of further improving the existing theories of interatomic interactions: either impose physically and chemically motivated constraints on top of the EAM, MEAM, or reactive potentials (to be discussed later), or improve the energy terms in the BOC-derived methods, for example, by adding explicit angular dependence.

Continuing our analysis, it is unfortunate to observe that the bond order conservation principle gained significantly less popularity than the conservation of charge. Starting from the MO theory, one can observe that bond orders play the same role for covalent interactions as partial atomic charges play for electrostatic interactions. The effects of covalent bonding are rigidly parametrized in the standard (nonreactive) force fields. Hence, only the effects associated with Coulombic interactions need to be accounted for. The latter are often treated with numerous charge equalization schemes. On the contrary, we anticipate that reactive force fields need to account for both covalent bonding and Coulombic interactions on the same footing. After all, bond orders and partial charges are closely related to each other, since they are derived from the same density matrix. Both bond order and charge conservation constraints should be imposed simultaneously. Modern reactive force fields do utilize different charge equilibration schemes, which are used to compute geometry-dependent charges. However, no work has been done so far to impose additional bond order conservation constraints. Similarly to the EAM and MEAM methods, reactive force field approaches lead to complicated phenomenological energy functions for the description of bond breaking and formation. Utilization of the bond order conservation principle may lead to simpler and more general functional forms and to better transferable parameters.

Figure 2. Evolution of bond orders along different dissociation pathways. A, B, and C label the three atoms; xXY is the bond order between atoms X and Y.

Figure 3. Sum of bond indices for different dissociation limits of the AB−Mn system. A and B label atoms; ZAB, ZA, and ZB are the interaction indices between atoms A and B, between atom A and the substrate, and between atom B and the substrate, respectively.

3.3.3. Bond Order Potentials and Reactive Force Fields.

The origins of the reactive force field developments may be traced back to the seminal work of Rose et al.336 on the universal scaling of cohesive energies. As discussed in section 3.2.1, the work was also central to the development of the EAM and later MEAM theories. The similarity of the mathematical formulations of the EAM and reactive force fields is not, therefore, surprising. The physics of bonding is the same in the two approaches. It is interesting to compare the historical development of the two methods. While the EAM was mainly developed for metals and alloys, the reactive empirical bond order (REBO) methods, the prototypes of reactive force fields, started with covalently bonded materials, such as Si and Ge. As a result, the angular dependence appeared only at the later stages in the MEAM theory, while it was included right from the beginning in the REBO, namely, in the Stillinger−Weber potential. The DFT is the main theoretical ground of the EAM approach, while the REBO is associated with the “effective” wave function theory (e.g., chemical pseudopotential), as discussed by Abell.394 Both branches heavily utilized the important finding of Rose et al.336 regarding the universal scaling law.

3.3.3.1. REBO in Reactive Scattering. REBO potentials were initially proposed in the field of reactive scattering,395−398 for accurate representation of the reactive potential energy surfaces that had correct asymptotic limits for different dissociation/scattering channels. For example, the potential can be expanded for a three-atomic molecule ABC into the two- and three-body terms:

\[
E(r_A, r_B, r_C) = \sum_{i=1}^{3} V_i^{(2)}(r_i) + V_{123}^{(3)}(r_1, r_2, r_3) \tag{3.3.12}
\]

where the diatomic pairs AB, AC, and BC are numbered 1, 2, and 3, respectively. The two-body terms $V_i^{(2)}$ are expressed as polynomials of the corresponding bond-order variable, xi, defined by eq 3.3.2b:395

\[
V_i^{(2)}(r_i) = \sum_{k=1}^{K} a_{ik}\, x_i^{k} \tag{3.3.13}
\]

where the coefficients aik and the bond-order parameters αi are determined from solving the system of linear equations generated by the requirement to reproduce the spectroscopic force constants of the diatomic fragments. This procedure maps the PES from the Cartesian coordinates to the generalized bond-order coordinates. The PES is linearizable in the space of bond-order variables, {xi}, and the computations can be efficiently parallelized. The choice of the functional form, eq 3.3.13, ensures that the proper diatomic dissociation limits are achieved naturally. The bond-order representation, similar to eq 3.3.13, was also applied to three-body terms,396 although some difficulties were observed for short distances. The problem was addressed by utilization of a new type of variable (XBO/RBO) that mixed the variables of the two spaces:398,399

\[
\tilde{x}_i \equiv r_i \cdot x_i \tag{3.3.14}
\]

In general, the three-atomic PES (including all many-body terms) can be represented in the BO variables as399

\[
E(r_A, r_B, r_C) = \sum_{i=0}^{K} \sum_{j=0}^{K} \sum_{k=0}^{K} A_{ijk}\, x_1^{i}\, x_2^{j}\, x_3^{k} \tag{3.3.15}
\]

The expression can be efficiently vectorized, leading to up to an order-of-magnitude boost in performance. The approach is very expensive for many-body systems and can hardly be of practical use in large-scale computations. Yet, the method represents an important methodological consideration. Representation of the PES in terms of reactive coordinates is utilized, one way or another, in the later developments of the REBO potentials.
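The remark on vectorization can be made concrete with a small sketch of eq 3.3.15, in which the triple sum is written as a single tensor contraction. The coefficient array, its cubic (K+1)×(K+1)×(K+1) shape, and the function names are hypothetical; real coefficients Aijk would come from a fit to ab initio or spectroscopic data.

```python
import numpy as np


def bond_orders(r, alpha, r0):
    """Bond-order variables of eq 3.3.2b for the three pair distances r_1, r_2, r_3."""
    r = np.asarray(r, dtype=float)
    return np.exp(-np.asarray(alpha, dtype=float) * (r - np.asarray(r0, dtype=float)))


def pes_bo_polynomial(x, coeff):
    """Eq 3.3.15: E = sum_{ijk} A_ijk * x1^i * x2^j * x3^k as one contraction.

    x: the three bond orders; coeff: array of shape (K+1, K+1, K+1).
    """
    powers = np.arange(coeff.shape[0])
    p1, p2, p3 = (float(xi) ** powers for xi in x)   # [1, x, x^2, ..., x^K]
    return float(np.einsum("ijk,i,j,k->", coeff, p1, p2, p3))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    coeff = rng.normal(size=(4, 4, 4))               # hypothetical A_ijk, K = 3
    x = bond_orders([1.1, 1.6, 2.4], alpha=[2.0, 2.0, 2.0], r0=[1.0, 1.0, 1.0])
    print(pes_bo_polynomial(x, coeff))
```

When two of the bond orders go to zero, only the coefficients Ai00 contribute and the expression reduces to a one-variable polynomial of the same type as eq 3.3.13, which illustrates how the diatomic dissociation limit emerges from the bond-order representation.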

3.3.3.2. Stillinger−Weber Potential. Stillinger and Weber400

developed one of the first REBO potentials for modeling of Si. Anessential feature, the potential included the three-body termsneeded for description of the directed bonding and tetrahedralcoordination in Si:

$$f_3(r_i, r_j, r_k) = h(r_{ij}, r_{ik}, \theta_{jik}) + h(r_{ji}, r_{jk}, \theta_{ijk}) + h(r_{ki}, r_{kj}, \theta_{ikj}) \tag{3.3.16a}$$

$$h(r_{ij}, r_{ik}, \theta_{jik}) = \lambda \exp\left[\gamma (r_{ij} - a)^{-1} + \gamma (r_{ik} - a)^{-1}\right] \left(\cos\theta_{jik} + \frac{1}{3}\right)^{2} \tag{3.3.16b}$$

The potential decays as the separation of the three atoms (or any two of them) increases. This property is critical for description of bond dissociation and reactive processes. An alternative three-body potential for Si was proposed by Biswas and Hamann.401 It followed the same principal design requirements.
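A minimal sketch of the three-body term, eqs 3.3.16a and 3.3.16b, is given below. It assumes the convention that $h$ vanishes once either distance reaches the cutoff $a$, and the parameter values in the usage lines are illustrative reduced-unit numbers rather than a recommended parametrization. The helper names are hypothetical.

```python
import math


def sw_h(r_ij, r_ik, cos_theta, lam, gamma, a):
    """Three-body term h of eq 3.3.16b for the angle centered on atom i."""
    if r_ij >= a or r_ik >= a:
        return 0.0  # assumed convention: no contribution beyond the cutoff a
    radial = math.exp(gamma / (r_ij - a) + gamma / (r_ik - a))
    return lam * radial * (cos_theta + 1.0 / 3.0) ** 2


def cos_angle(center, p, q):
    """Cosine of the angle p-center-q."""
    u = [p[n] - center[n] for n in range(3)]
    v = [q[n] - center[n] for n in range(3)]
    dot = sum(un * vn for un, vn in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def sw_f3(ri, rj, rk, lam, gamma, a):
    """Eq 3.3.16a: sum of h over the three possible central atoms."""
    total = 0.0
    for c, p, q in ((ri, rj, rk), (rj, ri, rk), (rk, ri, rj)):
        total += sw_h(math.dist(c, p), math.dist(c, q),
                      cos_angle(c, p, q), lam, gamma, a)
    return total


LAM, GAMMA, A_CUT = 21.0, 1.2, 1.8  # illustrative reduced-unit values
# The angular factor vanishes at the tetrahedral angle, cos(theta) = -1/3.
print(sw_h(1.0, 1.0, -1.0 / 3.0, LAM, GAMMA, A_CUT))
print(sw_f3((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.1, 0.0), LAM, GAMMA, A_CUT))
```

The squared angular factor penalizes any deviation from the tetrahedral angle, while the exponential radial factor removes the penalty smoothly as a bond stretches toward the cutoff.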

3.3.3.3. Abell−Tersoff Potential. Discovery of the universal scaling law by Rose et al.336 opened a new era of REBO potential development. Soon after, Abell394 showed that the law could be explained by assuming the Morse-type form for the pair potentials, with the parameters sensitive to the environment. In particular, he concluded on the basis of the chemical pseudopotential theory that the atomic binding energies (cohesive energies) can be represented by a sum of pairwise repulsion terms and many-body contributions that describe the effect of the environment on each atom. In this regard, the theory is similar to the EAM/MEAM, which relies heavily on many-body terms. The conclusions of Abell are very similar to those drawn by Shustorovich in justification of the UBI-QEP theory.382 Namely, Shustorovich showed that a Morse-type potential is optimal for the description of reactive processes. The effects of the environment are accounted for in the UBI-QEP theory by the bond index constraint. Another important conclusion of Abell is that no transferable two-body potential can be obtained, although the many-body effects are negligible in some cases. For comparison, the EAM/MEAM solves this problem by utilization of many-body potentials, while the formally two-body potential (a sum of two-body Morse terms, in the simplest case) is effectively many-body in the UBI-QEP, due to the bond index constraint and a more complex definition of the bond indices.

Tersoff developed a conceptually new form of the potential based on Abell’s insights. The main aim was to guarantee the universal behavior of the cohesive energies. The potential appears formally as a pairwise Morse-type potential:

$$E = \frac{1}{2} \sum_{i,j\,(i \neq j)} V_{ij} \tag{3.3.17a}$$

$$V_{ij} = f_c(|r_{ij}|)\left[A \exp(-\lambda_1 |r_{ij}|) - B_{ij} \exp(-\lambda_2 |r_{ij}|)\right], \quad \lambda_1 > \lambda_2 \tag{3.3.17b}$$

However, the bonding strength, $B_{ij}$, of the pair of atoms $i$ and $j$ is assumed to be environment-dependent. Specifically, it should satisfy the proper limiting properties: it should be a monotonically decreasing function of the number of competing bonds, the strength of these bonds, and the cosines of the angles with the competing bonds. These requirements can be satisfied with the following choice of $B_{ij}$:

$$B_{ij} = B_0 \exp(-z_{ij}/b) \tag{3.3.18a}$$

$$z_{ij} = \sum_{k \neq i,j} \left[\frac{w(r_{ik})}{w(r_{ij})}\right]^{n} \frac{1}{c + \exp(-d \cos\theta_{ijk})} \tag{3.3.18b}$$

$$w(r) = f_c(r) \exp(-\lambda_2 r) \tag{3.3.18c}$$

where $B_0$, $b$, $c$, and $d$ are parameters. Equation 3.3.18b gives a weighted measure of the number of bonds competing with the bond $ij$. The function $f_c(r)$ in eqs 3.3.18c and 3.3.17b is the cutoff function used to improve computational efficiency by restricting the calculations to interactions between atoms that fall within the cutoff radius from the central atom. The computation of energy and forces scales only quadratically, $O(N^2)$, with the number of atoms, $N$, in contrast to cubic scaling for a general three-body potential, $O(N^3)$. Utilization of the cutoff and other standard MM techniques such as interaction lists may reduce the scaling of the computations to linear or quasi-linear.

Transferability is one of the main strengths of the Tersoff potential. The parametrization performed on a relatively small number of high-symmetry systems was successful for accurate description of a large number of low-symmetry structures. Following Tersoff’s original work on Si, a number of similar potentials and parametrizations were developed for different elements by Tersoff,402−404 Khor and Das Sarma,405 and Ito, Khor, and Das Sarma.406,407
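The environment dependence of eqs 3.3.17 and 3.3.18 can be sketched in a few lines of Python. The functional form of the cutoff $f_c$ is not specified above, so the cosine taper in `f_cut` is an assumption, and the parameter dictionary `P` holds hypothetical numbers that do not correspond to any published parametrization.

```python
import math

# Hypothetical illustrative parameters (not a published parametrization).
P = {"A": 1000.0, "B0": 100.0, "lam1": 3.0, "lam2": 1.5,
     "b": 1.0, "c": 1.0, "d": 1.0, "n": 1.0, "r_in": 2.5, "r_out": 3.0}


def f_cut(r, p):
    """Assumed cutoff: 1 below r_in, 0 above r_out, cosine taper in between."""
    if r <= p["r_in"]:
        return 1.0
    if r >= p["r_out"]:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * (r - p["r_in"]) / (p["r_out"] - p["r_in"])))


def w(r, p):
    """Eq 3.3.18c: bond weight."""
    return f_cut(r, p) * math.exp(-p["lam2"] * r)


def cos_angle(center, a, b):
    """Cosine of the angle a-center-b."""
    u = [a[m] - center[m] for m in range(3)]
    v = [b[m] - center[m] for m in range(3)]
    return sum(x * y for x, y in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))


def bond_strength(i, j, pos, p):
    """Eqs 3.3.18a and 3.3.18b: environment-dependent B_ij."""
    w_ij = w(math.dist(pos[i], pos[j]), p)
    if w_ij == 0.0:
        return 0.0  # pair beyond the cutoff; V_ij vanishes anyway
    z = 0.0
    for k in range(len(pos)):
        if k == i or k == j:
            continue
        w_ik = w(math.dist(pos[i], pos[k]), p)
        if w_ik == 0.0:
            continue  # neighbor k lies outside the cutoff of atom i
        cos_t = cos_angle(pos[i], pos[j], pos[k])
        z += (w_ik / w_ij) ** p["n"] / (p["c"] + math.exp(-p["d"] * cos_t))
    return p["B0"] * math.exp(-z / p["b"])


def total_energy(pos, p):
    """Eqs 3.3.17a and 3.3.17b: half the sum of V_ij over ordered pairs."""
    energy = 0.0
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i == j:
                continue
            r = math.dist(pos[i], pos[j])
            fc = f_cut(r, p)
            if fc == 0.0:
                continue
            b_ij = bond_strength(i, j, pos, p)
            energy += 0.5 * fc * (p["A"] * math.exp(-p["lam1"] * r)
                                  - b_ij * math.exp(-p["lam2"] * r))
    return energy


dimer = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
trimer = dimer + [(0.0, 1.5, 0.0)]
# A competing bond on atom 0 weakens the 0-1 attraction.
print(bond_strength(0, 1, dimer, P), bond_strength(0, 1, trimer, P))
```

The brute-force loops are written for clarity only. A production implementation would build neighbor lists from the cutoff radius so that each atom visits only nearby atoms.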

3.3.3.4. Brenner Potential (REBO). Despite its early success, Tersoff’s formulation had a number of limitations. For example, it was unable to describe some properties of carbon-based systems, including properties of radicals and conjugated systems. A new wave of reactive potential development is associated with Brenner,408 who extended Tersoff’s scheme to include additional effects. The new potential aimed at simultaneous description of energetics and geometry in carbon-based systems of different types: diamond, graphite, and hydrocarbons. The approach was abbreviated REBO explicitly for the first time. We emphasize that in our discussion the term “REBO” is used in a broader sense, although it can also be understood in a narrower sense when the family of Brenner potentials is discussed.

Brenner’s potential is similar in form to that of Tersoff, but it is reformulated in a more flexible and general framework:

$$E = \sum_{i} \sum_{j > i} \left[V_{\mathrm{R}}(r_{ij}) - \bar{B}_{ij} V_{\mathrm{A}}(r_{ij})\right] \tag{3.3.19a}$$

with the repulsive and attractive pairwise terms defined by

$$V_{\mathrm{R}}(r_{ij}) = f_{ij}(r_{ij}) \frac{D_{ij}^{(e)}}{S_{ij} - 1} \exp\left[-\sqrt{2 S_{ij}}\, \beta_{ij} \left(r_{ij} - r_{ij}^{(e)}\right)\right] \tag{3.3.19b}$$

$$V_{\mathrm{A}}(r_{ij}) = f_{ij}(r_{ij}) \frac{D_{ij}^{(e)} S_{ij}}{S_{ij} - 1} \exp\left[-\sqrt{2 / S_{ij}}\, \beta_{ij} \left(r_{ij} - r_{ij}^{(e)}\right)\right] \tag{3.3.19c}$$

The Morse potential form is recovered if $S_{ij} = 2$.

Introduction of corrections in the empirical bond-order function

$$\bar{B}_{ij} = \frac{1}{2}\left(B_{ij} + B_{ji}\right) + F_{ij}\left(N_i^{(t)}, N_j^{(t)}, N_{ij}^{\mathrm{conj}}\right) \tag{3.3.20}$$

was the second essential component of Brenner’s definition. The bare pairwise bond-order function is given by

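$$B_{ij} = \left\{1 + \sum_{k \neq i,j} G_i(\theta_{ijk})\, f_{ik}(r_{ik}) \exp\left[\alpha_{ijk}\left[\left(r_{ij} - r_{ij}^{(e)}\right) - \left(r_{ik} - r_{ik}^{(e)}\right)\right]\right] + H_{ij}\left(N_i^{(\mathrm{H})}, N_j^{(\mathrm{C})}\right)\right\}^{-\delta_i} \tag{3.3.21}$$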

The details of the definition of the correction terms, $H_{ij}(N_i^{(\mathrm{H})}, N_j^{(\mathrm{C})})$ and $F_{ij}(N_i^{(t)}, N_j^{(t)}, N_{ij}^{\mathrm{conj}})$, and the functions $G_i(\theta_{ijk})$ and $f_{ik}(r_{ik})$ can be found in the original paper.408 The functional form of some terms was improved, and the parametrization was revised and extended in the improved version, REBO2.409 The resulting method led to a significantly more accurate description of bond energies, lengths, and force constants of hydrocarbon molecules as well as elastic properties, interstitial defect energies, and surface energies for diamond. Examples of applications of the modified Brenner REBO and REBO2 include studies of metal-assisted carbon material transformations (e.g., from graphene to fullerene),30 defects in the rare earth elements,410 and dynamics of ultrananocrystalline diamond.411
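As a quick numerical check of the pairwise terms, eqs 3.3.19b and 3.3.19c, the sketch below evaluates them for an isolated pair, taking the cutoff factor and the bond order equal to 1. It assumes the square-root factors $\sqrt{2S}$ and $\sqrt{2/S}$ in the exponents, and the numerical values of $D$, $\beta$, and the equilibrium distance are hypothetical.

```python
import math


def v_rep(r, D, S, beta, r_e, f=1.0):
    """Repulsive pair term, eq 3.3.19b (f is the cutoff factor)."""
    return f * D / (S - 1.0) * math.exp(-math.sqrt(2.0 * S) * beta * (r - r_e))


def v_att(r, D, S, beta, r_e, f=1.0):
    """Attractive pair term, eq 3.3.19c (f is the cutoff factor)."""
    return f * D * S / (S - 1.0) * math.exp(-math.sqrt(2.0 / S) * beta * (r - r_e))


def morse(r, D, beta, r_e):
    """Standard Morse curve with its zero of energy at dissociation."""
    e = math.exp(-beta * (r - r_e))
    return D * (e * e - 2.0 * e)


D, BETA, R_E = 6.0, 1.5, 1.0  # hypothetical values
for r in (0.8, 1.0, 1.3, 2.5):
    pair = v_rep(r, D, 2.0, BETA, R_E) - v_att(r, D, 2.0, BETA, R_E)
    print(r, pair, morse(r, D, BETA, R_E))  # the two columns agree for S = 2
```

For any $S > 1$ and unit bond order, the pair energy has its minimum at the equilibrium distance with depth $-D$, so $S$ changes the shape of the well without moving its minimum.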

3.3.3.5. AIREBO Potential. The initial Brenner potential was successful in systems for which the interactions were dominated by covalent bonding, such as small hydrocarbon molecules, preferably in the gas phase. In larger systems, especially in condensed phases (liquids and solids), weak dispersion interactions may add up to a notable value and may eventually become dominant. These terms were missing in the original Brenner potentials. The dispersive interactions of the Lennard-Jones (LJ) type were added by Stuart and co-workers in their adaptive intermolecular REBO (AIREBO) method.412 The torsion interactions, absent in the REBO, were added as well. The AIREBO energy is expressed as

$$E_{\mathrm{AIREBO}} = E_{\mathrm{REBO}} + E_{\mathrm{LJ}} + E_{\mathrm{tors}} \tag{3.3.22}$$

The LJ energy is designed to vary smoothly from zero for small interatomic separations, in which case the energy is dominated by covalent contributions from the original REBO, to the unconditional LJ for distant atoms, which are not bound covalently:

$$E_{\mathrm{LJ}} = \frac{1}{2} \sum_{i,j\,(i \neq j)} E_{ij}^{\mathrm{LJ}} \tag{3.3.23a}$$

$$E_{ij}^{\mathrm{LJ}} = \left\{1 + S\left(t_r(r_{ij})\right)\left[1 + S\left(t_b(b_{ij}^{*})\right)\right]\right\} C_{ij} V_{ij}^{\mathrm{LJ}}(r_{ij}) \tag{3.3.23b}$$

Here, $S$ is a switching function, continuous up to the second derivative, that turns the LJ terms on and off depending on the interatomic distance. The bond order index, $C_{ij}$, is the connectivity switching function that disables the LJ interactions for any atom pair that is connected by a series of three or fewer bonds and partially disables the LJ interactions if the connection is via a series of partially dissociated bonds. $V_{ij}^{\mathrm{LJ}}$ is the pristine LJ potential. The scaling transformations $t_r$ and $t_b$ convert the physical variables into a range suitable for input to the switching functions.

The torsional potential is designed such that its symmetry arises naturally from local coordination of the atoms and such that it decays as some of the bonds dissociate. The AIREBO chooses to sum all pristine torsional potentials, $V_{\mathrm{tors}}(\omega)$, weighted by the weights of the bonds that form the torsional angle:

$$E_{\mathrm{tors}} = \frac{1}{2} \sum_{i} \sum_{j \neq i} \sum_{k \neq i,j} \sum_{l \neq i,j,k} w_{ij}(r_{ij})\, w_{jk}(r_{jk})\, w_{kl}(r_{kl})\, V_{\mathrm{tors}}(\omega_{ijkl}) \tag{3.3.24a}$$

$$V_{\mathrm{tors}}(\omega) = \varepsilon \left[\frac{256}{405} \cos^{10}\left(\frac{\omega}{2}\right) - \frac{1}{10}\right] \tag{3.3.24b}$$

A detailed discussion of the AIREBO method is available elsewhere.78,412,413
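The torsional term of eqs 3.3.24a and 3.3.24b can be sketched as follows. The bond weights $w_{ij}$ are not defined in this section, so they enter `torsion_term` as plain numbers supplied by the caller. The dihedral-angle helper and all numerical values are assumptions made only for illustration.

```python
import math


def v_tors(omega, eps):
    """Pristine torsional potential, eq 3.3.24b."""
    return eps * (256.0 / 405.0 * math.cos(omega / 2.0) ** 10 - 0.1)


def _sub(a, b):
    return [a[n] - b[n] for n in range(3)]


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral(p_i, p_j, p_k, p_l):
    """Unsigned dihedral angle i-j-k-l in radians (0 for cis, pi for trans)."""
    b1, b2, b3 = _sub(p_j, p_i), _sub(p_k, p_j), _sub(p_l, p_k)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    cos_w = sum(x * y for x, y in zip(n1, n2)) / (math.hypot(*n1) * math.hypot(*n2))
    return math.acos(max(-1.0, min(1.0, cos_w)))  # clamp against round-off


def torsion_term(w_ij, w_jk, w_kl, omega, eps):
    """One summand of eq 3.3.24a: bond-weighted torsional energy."""
    return w_ij * w_jk * w_kl * v_tors(omega, eps)


cis = dihedral((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
trans = dihedral((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
print(v_tors(cis, 1.0), v_tors(trans, 1.0))
```

Since the potential depends on $\omega$ only through an even power of $\cos(\omega/2)$, the sign convention of the dihedral angle is irrelevant here, and the product of bond weights makes the whole term vanish as soon as any of the three bonds dissociates.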

3.3.3.6. Electrostatic Effects for Reactive Potentials. Addition of dynamical electrostatic effects was the next logical step in the evolution of the REBO potentials. All previous formulations concerned mostly covalent bonding or added short-ranged dispersion interactions. The possibility of charge transfer upon bond formation and dissociation was not described explicitly. These effects could be partially included at the total energy level, but properties such as dipole moments were not accessible. In addition, response to external electromagnetic fields was absent even at the phenomenological level.

Various dynamical charge transfer schemes were developed and utilized with classical, nonreactive force fields.4,414−417 They are mostly based on the charge-dependent electronegativity equalization method (EEM) of Mortier et al.,414,418,419 on the charge equilibration scheme (QEq) of Rappé and Goddard,186 on the classical fluctuating charge model of Rick et al.,4 or on their numerous modifications and improvements.181,420,421 The main idea of the EEM or QEq method is to write the electrostatic portion of the interaction energy in the form:

$$E(q_1, \ldots, q_N) = \sum_{i=1}^{N} \chi_{0,i}\, q_i + \frac{1}{2} \sum_{i,j=1}^{N} J_{ij}\, q_i q_j \tag{3.3.25}$$

where $\chi_{0,i}$ and $q_i$ are the charge-independent atomic electronegativity and charge of the $i$th atom. $J_{ij}$ is the screened Coulomb potential for interaction between the atomic centers $i$ and $j$. The latter can be computed according to one of the empirical formulas used in semiempirical methods, eqs 3.1.31−3.1.32. The atomic chemical potential (electronegativity) of each atom is defined as

$$\chi_i \equiv \frac{\partial E(q_1, \ldots, q_N)}{\partial q_i} = \chi_{0,i} + \sum_{j=1}^{N} J_{ij}\, q_j \tag{3.3.26}$$

By definition, the chemical potentials of all sites are equal to each other in equilibrium:

$$\chi_1 = \chi_2 = \cdots = \chi_N \tag{3.3.27}$$

The set of the resulting $N - 1$ equations, eq 3.3.27, together with the conservation of the total charge:

$$\sum_{i=1}^{N} q_i = Q \tag{3.3.28}$$

can be solved to obtain $N$ atomic charges, $\{q_i \mid i = 1, \ldots, N\}$.

The method is analogous to the Fermi energy equalization between different subsystems combined with the total charge density conservation, which is used to formulate the divide-and-conquer method by Yang (see section 4). One can regard the electronegativity equalization as an approximate noniterative version of this method. There is no need for self-consistent iterations of the charges, because the covalent contribution (given by the FF energy or the BO potentials) is decoupled from the electrostatic energy. It is important to note the similarity of the two approaches for our purpose, and to appreciate the importance of the physical principles of equilibrium and conservation in the development of the interaction potentials. We have already seen how the principle of bond order conservation leads to a very efficient, simple, and accurate model of covalent bonding. The QEq or EEM principles can be considered analogues of the bond order conservation, but in the space of one-atomic objects (charges) rather than two-atomic objects (bond orders). The two principles are naturally interrelated in the wave function based methods, in which bond orders and atomic charges are linear combinations of the density matrix elements.

The dependence of the charges on the instantaneous geometry of the system is an important feature of the method. It underlies the ability to describe charge transfer processes, bond breaking, and bond formation. The need for solving a system of linear equations presents a difficulty. The effort scales cubically, $O(N^3)$. Although the argument $N$ in this scaling law refers to the number of atoms rather than to the number of basis functions as in the MO-based methods, the computations may become expensive as the system becomes large. Utilization of the extended Lagrangian for the description of fluctuating charges suggested by Rick et al.4 provides means to mitigate the scaling problem and accelerate the calculations. The method can be regarded as an atomic analogue of the Car−Parrinello dynamics.

An incorrect dissociation limit is a notable problem of the EEM or QEq methods, and it is especially important for their use within reactive force fields. A diatomic molecule would dissociate into an ionic pair rather than into electrically neutral atomic species. One of the consequences is that a pair of distant atoms would typically acquire an excessive charge, leading to a notable overestimation of dipole moments.422−425 The incorrect dissociation limit of the original QEq method was addressed by many authors. Particularly notable among them are the maximal entropy valence bond (MEVB) method of Morales and Martinez,426 and the charge transfer with polarization current equalization (QTPIE) of Chen and Martinez.427 Both methods recovered the correct asymptotic behavior of the redistributed charges upon bond dissociation. The MEVB and QTPIE concepts generalize the QEq principle and promise further elaboration and combination with the reactive bond-order potentials.
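The electronegativity equalization conditions, eqs 3.3.26−3.3.28, amount to one linear system for the $N$ charges and the common electronegativity $\mu$. The sketch below solves it with a small hand-written Gaussian elimination so that it needs only the standard library. The function names and the numerical values of $\chi_{0,i}$ and $J_{ij}$ are hypothetical, and $J$ is assumed to be symmetric.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(a[r]) + [b[r]] for r in range(n)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-14:
            raise ValueError("singular matrix")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def equilibrate_charges(chi0, J, Q=0.0):
    """Return (charges, mu) such that chi0[i] + sum_j J[i][j] q[j] = mu
    for every atom i (eqs 3.3.26 and 3.3.27) and sum(q) = Q (eq 3.3.28)."""
    n = len(chi0)
    a = [list(J[i]) + [-1.0] for i in range(n)]  # rows: J q - mu = -chi0
    a.append([1.0] * n + [0.0])                  # last row: charge conservation
    rhs = [-c for c in chi0] + [Q]
    sol = solve_linear(a, rhs)
    return sol[:n], sol[n]


# Hypothetical diatomic: atom 1 is more electronegative than atom 0.
q, mu = equilibrate_charges([5.0, 8.0], [[10.0, 3.0], [3.0, 12.0]], Q=0.0)
print(q, mu)  # electron density shifts toward the more electronegative atom
```

The dense elimination costs a number of operations proportional to the cube of the number of unknowns, which illustrates the cubic scaling discussed in the text. Because the geometry enters only through $J_{ij}$, the system has to be solved again whenever the atoms move.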

3.3.3.7. ReaxFF. One of the first reactive force fields that included charge transfer terms was the ReaxFF, extensively developed and applied to various systems by van Duin.28 Only the QEq and EEM-like schemes for the charge equilibration were popular at the time the force field was proposed (2001). The ReaxFF formalism was adopted for the description of dynamic electrostatic effects (polarization, charge transfer) in reactive processes. Although the model extensively utilizes the bond order concept, the electrostatic contributions are decoupled from the covalent contributions, similarly to many classical force fields.

The ReaxFF philosophy was different from the series of gradual improvements of the earlier reactive bond-order potentials, such as those leading from the REBO to the AIREBO and later to the qAIREBO developed in the Harrison group.78,412,428,429 In the ReaxFF, one starts right away with a variety of physically and chemically motivated energy contributions of different types: electrostatic, dispersion, electron conjugation, atomic under/overcoordination, as well as the standard covalent bonding terms (bonds, valence angles, torsions). All interactions are treated on the same footing for conceptual simplicity, but with specially designed functional forms. For example, the dispersive interactions are included even for bonded (1−2 pair) atoms, but an appropriately chosen damping function is used to avoid the repulsion catastrophe. In comparison, these terms are typically excluded in the classical force fields.

The functional form of the reactive potential utilized by the ReaxFF is very complex, because the central idea is to use a very general form that includes the most important types of interactions. It contains a large number of general model parameters, apart from the atom-specific parameters. For example, the potential describing systems with only C and H atoms includes 28 general parameters and at least 23 atom-specific parameters. The initial parametrization for the C and H elements was extended further to other elements, including transition metals.29 Another important factor is that the ReaxFF parametrization utilizes not only information on the static structures (minima), but also information on the energy profiles for different reactions involving small molecules. These data are typically available from high-level ab initio calculations. As a result, the method is very accurate for the description of chemical reactions in different environments.

Despite the complex functional form of the potential, it is much more efficient than any semiempirical method, and can be applied to study reactive processes in large systems. Long time simulations can also be performed. The ReaxFF was utilized to study truly large-scale systems in the last decade. Examples include studies of elastic constants, diffusion and phase segregation of alloys,430 materials oxidation,431−433 oriented attachment of nanocrystals,434 mechanical properties of nanowires,36 fracture of nanotubes,435 decomposition of explosives,27 and reactive adsorption on metal−organic frameworks,436 to name a few.

3.3.3.8. Combination of Charge Equalization Methods with Bond Order Potentials. The charge equilibration schemes were typically formulated by decoupling the electrostatic energy from the covalent bonding energy terms. This approach was also used in the ReaxFF. Coupling between the covalent bonding and the electrostatic interactions was introduced by Mikulski et al.,428 who combined the split-charge equilibration (SQE) formalism437 for dynamical electrostatics with the bond order potentials. The SQE method is very similar to the QEq method, but it is formulated in an extended set of variables, the split charges.

The split charge, $q_{ij}$, relates to the charge of a given atom, $q_i$, as

$$q_i = \sum_{j} q_{ij} \tag{3.3.29}$$

The split charge $q_{ij}$ can be interpreted as a steady-state flux (charge current) between the pair of atoms $i$ and $j$. Similar to bond orders, it contains information about a pair of atoms rather than about a single atom. It can be considered a natural variable for description of charge transfer in reactive force fields.

To relate the split charges to the bond orders, the fractional split charges, $f_{ij}$, and the maximal split charges, $q_{\max,ij}(x_{ij})$, were introduced:

$$q_{ij} = f_{ij}\, q_{\max,ij}(x_{ij}) \tag{3.3.30}$$

The maximal value $q_{\max,ij}(x_{ij})$ depends on the bond order via a simple linear relation:

$$q_{\max,ij}(x_{ij}) = x_{ij} / x_{ij}^{\mathrm{ref}} \tag{3.3.31}$$

where $x_{ij}^{\mathrm{ref}}$ is the bond order for a reference system. The definitions, eqs 3.3.30 and 3.3.31, transform the task of split-charge optimization into a set of equations for the fractions $f_{ij}$. More details of the combination of the SQE and BO potentials can be found in the original work.428 Unlike the QEq, the method contains a penalty term that achieves the correct dissociation limit. The method was shown to yield improved dipole moments and atomic charges. The incorporation of the SQE into the AIREBO potential resulted in a new improved qAIREBO429 force field that is capable of describing charge transfer effects in reactive dynamics.

An approach that relates the atomic charges to the bond orders was developed recently by one of us for studies of the dynamics and adsorption of large fullerene-containing molecules on a gold surface.39,438 The method can be related to the EAM/MEAM formulations because it utilizes the auxiliary density-like variable at each adatom position:

$$\rho_i(r) = \sum_{k \in \mathrm{Au}} e^{-\alpha r_{ik}}\, S_1(r_{ik}) \tag{3.3.32}$$

where $\alpha$ is an empirical parameter. The definition is similar to the EAM, eq 3.2.6, but it uses the switching function $S_1(r_{ik})$, continuous up to the second derivative, to gradually turn off the contributions of more distant atoms. At the same time, the density, $\rho_i$, is a sum of factors similar to eq 3.3.18c used in the Tersoff potential. Each term is a damped bond-order variable for each C−Au bond.

Unlike the EAM/MEAM and most REBO potentials, the resulting auxiliary density, eq 3.3.32, is designed to model dynamic electrostatic effects and to avoid the need for solving large systems of linear equations that arise in the QEq or EEM schemes. Specifically, the partial charges of the carbon atoms of the adsorbate molecule are parametrized by

$$q_i = \beta \rho_i - \gamma \sum_{\substack{j \neq i \\ j \in \mathrm{C}}} \rho_j\, S_2(r_{ij}), \quad \beta > 0,\ \gamma > 0 \tag{3.3.33}$$

where $\beta$ and $\gamma$ are empirical parameters, and $S_2(r_{ij})$ is yet another switching function, typically with a shorter range than $S_1(r_{ik})$. $S_2(r_{ij})$ is designed to describe charge back-donation from nearby C atoms, to model a charge alternation pattern typical for aromatic systems.

Because of the specifics of the interactions in the C60/Au(surface) system, the explicit covalent bonding potentials similar to those used in reactive force fields were not constructed. Instead, the nearly covalent bonding in this system was explained on the basis of a combination of the dispersion and dynamic electrostatic contributions:

$$E = E_{\mathrm{vdW}} + E_{\mathrm{elec}} \tag{3.3.34a}$$

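$$E_{\mathrm{elec}} = \sum_{i \in \mathrm{real}} \left(\chi_i q_i + J_{ii} q_i^{2}\right) + \frac{1}{2} \sum_{\substack{i,j \in \mathrm{real} \\ j \neq i}} \frac{q_i q_j}{r_{ij}} + \sum_{\substack{i \in \mathrm{real} \\ j \in \mathrm{image}}} \frac{q_i \bar{q}_j}{\bar{r}_{ij}} \tag{3.3.34b}$$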

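$$E_{\mathrm{vdW}} = \sum_{\substack{i \in \mathrm{C} \\ j \in \mathrm{Au}}} D_{\mathrm{C-Au}} \left[\left(\frac{\sigma_{\mathrm{C-Au}}}{r_{ij}}\right)^{12} - 2 \left(\frac{\sigma_{\mathrm{C-Au}}}{r_{ij}}\right)^{6}\right] \tag{3.3.34c}$$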

where the meanings of the $\chi_i$ and $J_{ii}$ parameters are similar to those in the QEq theory. $D_{\mathrm{C-Au}}$ and $\sigma_{\mathrm{C-Au}}$ are the standard vdW parameters, $q_i$ is the partial charge of the atom $i$ computed according to eq 3.3.33, $\bar{q}_i = -q_i$ is the image charge induced by the charge of the atom $i$, $r_{ij}$ is the distance between two adsorbate atoms, and $\bar{r}_{ij}$ is the distance between the adsorbate atom $i$ and the image of the atom $j$. The different types of interactions are illustrated in Figure 4.

The parameters are designed to reproduce charge transfer between C60 and the Au surface, and the chemisorption energies at different C60/Au geometries, simultaneously. This is done by fitting to extensive semiempirical calculations and to known experimental data. Because the model uses simple parametrized energy/charge functionals, it is efficient computationally, allowing for nanosecond MD simulations with more than 1000 atoms. The ability to reproduce coadsorption effects is an important qualitative advantage of the above model.

The construction given by eqs 3.3.32−3.3.34 can be regarded from the mathematical point of view as a reactive bond-order potential, since the total energy is ultimately expressed in terms of bond-order variables. However, the physical interpretation is different: the model intrinsically couples electrostatic and covalent bonding contributions. The charges are computed via bond orders and can be used in calculations employing an external electric field.39 The response to an external electric field is absent in the early REBO, BOC-MP/UBI-QEP, and EAM/MEAM potentials. Note that a simple addition of the QEq/EEM scheme in the ReaxFF to the electrostatic energy does not directly couple charge transfer and bond orders. This coupling is present in the qAIREBO force field, although the strategy is different.

3.3.3.9. COMBx (x = 1, 2A, 2B, 3). A series of charge optimized many-body (COMB) potentials was developed recently by the Sinnott group.20−23,439,440 The method originates from the extended Tersoff potential developed by Yasukawa.441 The latter generalized the Tersoff potential to make the repulsion and attraction terms charge dependent. The charges are determined with a charge equilibration principle, although the functional form of the charge-dependent energy is different from that of Goddard. The dynamic-charge scheme was essential for proper description of the charge redistribution in the interfacial Si/SiO2 system.

The initial version of the COMB potential (COMB1)20 was mostly a reparametrization of the Yasukawa potential. It also included additional repulsion and bond-bending terms. The extended Lagrangian approach was adapted to solve for coordinate-dependent charges during molecular dynamics evolution. The atomic charges are treated as dynamical variables in this method. The approach increases the number of equations of motion, but avoids solving systems of linear equations, and leads to linear scaling of the required CPU time.

The second generation of the COMB potentials included different parametrization sets and focused on new types of systems. The COMB2A439,440 was optimized for bulk and interfacial systems. It introduced a number of corrections, including new repulsion and bond-bending terms, and overcoordination corrections. It also replaced the point-charge models by the Coulomb integrals over 1s Slater functions. The electrostatic cutoff functions were abandoned in favor of the charge-neutralized real-space direct summation method.136 The second version (COMB2B)21,22 was parametrized against the training set that contained both experimental and high-level ab initio data on small molecules, solid-state systems, and interfacial systems. The data included cohesive energies, formation enthalpies, elastic properties, surface energies, and bond dissociation energies of molecular and anionic species. In addition to the previous formulations, the model included the point dipole terms to improve description of the atomic polarization effects.

The third generation of the COMB potential (COMB3) was developed very recently.23 The potential was improved by modifying the expressions for bond order and self-energy. Flexibility of the functional form and a judicious parametrization strategy allowed application of the model to carbon-based materials, hydrocarbons, organometallic compounds, and combinations of these systems. A large-scale (∼5000 atoms) molecular dynamics simulation was reported.

3.3.4. Construction of Reactive Bond-Order Potentials as a Phenomenological Variational Principle. Construction of a suitable functional form that obeys the basic physical considerations and limiting properties, and has good transferability, is certainly an art of computational chemistry. Some aspects of the analytic potential design were extensively discussed by Brenner.86 Although a specific choice is rather arbitrary, it is always motivated by the problems at hand and by the basic chemical concepts, such as directional bonding and bond order.

Over the course of development, the functional form of the reactive potentials gradually became more and more complicated, and involved a large number of adjustable parameters. Some methods evolved as a hierarchy of improvements built up step-by-step to add new physical effects or to include more general bonding patterns. For example, one can observe clear evolutionary lines, such as Tersoff → REBO → AIREBO → qAIREBO or Tersoff → Yasukawa → COMB → COMB2(A,B) → COMB3. These lines can be thought of as minimization paths in the functional space spanned by all functions approximating the true universal energy functional of an arbitrary system. In other words, the process of developing reactive potentials is essentially an execution of the variational principle at the meta level. Therefore, the potential development can be called the empirical (phenomenological) variational principle.

Figure 4. Interactions in the coupled charge-bond-order scheme. (a) Auxiliary charge density at the central atom C is given by summation of the contributions from all Au atoms (red arrows). The charge on this atom is determined by linear combination of the auxiliary densities of all adsorbate atoms. (Blue arrows indicate possible contributions to/from the nearby partial charges.) (b) Adsorbate point charges (purple circles) induce image charges (blue circles). The energy is determined by summing electrostatic interactions between adsorbate−adsorbate and adsorbate−image atom pairs. q1 and q2 are partial charges of the adsorbate atoms 1 and 2; d1 and d2 are distances from the atoms 1 and 2 to the surface plane.

The central challenge of the DFT is to determine the universal density functional, which can be used to compute the exact energy of an arbitrary many-body system as a function of its electronic density. This problem can be reduced to finding a suitable exchange−correlation functional, and perhaps a kinetic-energy functional (as in the OF-DFT). In all cases, the functional is constructed to represent the fundamental electronic interaction principles. The energy can then be found by optimizing the electron density with the proper functional.

The construction of the BO potentials starts from the total energy expression and includes functional variation. The electronic density is not directly involved and can be arbitrary and fixed; this can be rationalized by an appropriate functional transformation. The variation of the functional is aimed at arriving at a proper model representing the reality. The relationship between the DFT and the phenomenological variational principle is illustrated in Figure 5.

The efforts in the development of reactive force fields are often directed toward description of specific elements and binding environments. They are designed to model particular processes or systems. A generalization to a large database of properties and to a larger scope of elements and systems is often very involved and complex, and is frequently absent. The empirical reactive force field approaches are somewhat less attractive from this point of view than the semiempirical methods, which are based on less drastic approximations and retain more information, including the electronic structure properties. This deficiency of the reactive potentials is compensated by a much higher computational efficiency, which makes simulation of large-scale systems possible when the semiempirical methods become too time-consuming.

The steadily growing complexity of the REBO potentials makes them appear somewhat “black box”. Although all the terms are physically motivated, their particular choice is flexible and rather arbitrary. The large number of these terms and their inhomogeneity make further systematic improvements and extensions to more compounds and elements rather difficult. In this regard, it may be rewarding to reformulate the reactive potentials, or at least some of their parts, using artificial neural networks (ANN).

A properly constructed and trained multilayer ANN is capable of approximating an arbitrary function of desired complexity. An ANN can also “memorize” large amounts of information of different types and use it for further predictions. Both input and output of ANNs can include multiple channels, which may be used systematically for training on inhomogeneous data sources, e.g., elastic constants, heats of formation, optimized geometrical parameters, spectroscopic data, etc. Although an ANN carries little physical interpretation, it can be used in conjunction with physically motivated terms to hide complex correlation effects. For example, the ANN outputs may return the ratio between the repulsion and attraction terms with a predefined asymptotic, the exponential parameters entering the bond-order calculations, the values of the damping and switching functions, the factors encoding coordination (e.g., hybridization, over/undercoordination), etc.

In the past, ANNs were utilized in relatively few, but very interesting, chemically relevant applications: to approximate density functionals,442 potential energy surfaces and molecular dynamics,443−449 transition moment functions,450 dynamical polarizabilities,451 and molecular electrostatic potentials.452 The use and development of the approach is inhibited somewhat by a lack of clear interpretation of the parameters and functional forms (the so-called excitation function of a neuron). We envision that the growing complexity of the functional form of the reactive force fields can facilitate a conceptually simple, systematically improvable, and computationally efficient ANN formulation. In a similar fashion, the ANN formalism may be very useful in the semiempirical methods, especially for incorporating the correlation and complex many-body effects, yet leaving the computational advantages of the basic methods unchanged.

3.3.5. Timeline of Bond Order Based Methods. Table 3 includes the timeline, from 1984 to 2012, of the development of bond order based methods.

4. COMPUTATIONALLY MOTIVATED APPROXIMATIONS

Development of this group of methods started when the physically motivated approaches reached practical limits. Although some physically motivated approximations are very efficient, they often lack the quantum-mechanical level of description. When quantum calculations must be performed on large-scale systems, an alternative approach is needed. The computationally motivated approximations make use of the advantages of alternative mathematical reformulations of the existing problem. Fragmentation is a typical strategy. It involves an initial subdivision of the whole system into smaller subsystems, solving the smaller problems, and subsequent reconstruction of the properties and variables of the combined system. These methods are currently known as the linear-scaling methods of quantum chemistry. Another related group includes iterative (direct optimization) methods, which also show linear scaling but avoid diagonalization of large matrices by using iterative procedures, optimization algorithms, and sparse-matrix techniques to solve the Schrödinger equation.

Figure 5. Relationship between the DFT and the phenomenological variational principle. f[q,x] is an effective functional for the bond-order potential, mapping atomic charges, q, and coordinates, x, onto the total energy. This function is a BOP counterpart of the universal functional, F[ρ], that maps the charge density ρ onto the total energy in DFT.

Chemical Reviews Review

DOI: 10.1021/cr500524cChem. Rev. 2015, 115, 5797−5890


The computationally motivated approaches follow the ideas of parallelism: all smaller subproblems can be treated simultaneously, in parallel. In contrast, the philosophy of the physically motivated approximations is essentially serial: the entire problem is treated as a whole, only using more efficient formulas. (This does not preclude parallelization of realistic calculations based on these methods, of course.) The difference in the philosophies of the two groups of methods also reflects the state of computational technologies available at the times of their development. Parallel computations were practically not available in the 1960s–1980s, when most of the basic physically motivated approaches were developed. Parallel computations emerged starting in the 1990s, when the computationally motivated (linear-scaling) methods reached their active development phase. Parallelization facilitated orientation of novel computational methods of quantum chemistry toward a fragmentation-based, divide-and-conquer philosophy.

4.1. Classification of Computationally Motivated Approaches

In this section, we discuss the linear-scaling techniques of quantum methods. In this context, we use the term “computationally motivated approaches” synonymously with “linear-scaling methods”. To start our review of this topic, it is helpful to take a broader look at the existing approaches and to classify the methods. A detailed description of the methods and their acronyms will be presented after that.

The wide variety of existing linear-scaling methods can be classified according to different principles. The following are the most popular among existing classifications:

(1) Li et al.454 classified the methods into two main groups, depending on the principle that guides fragmentation:

(a) density matrix based methods (D&C, ELG, MFCC)

(b) energy-based approaches (MFCC, MTA, SFM)

In the first group of methods, the main goal is to approximate the density matrix of the whole system by the density matrices of the fragments. The former is not explicitly constructed, but approximations are used to compute energy and other molecular properties from the densities (density matrices) of the subsystems. The methods from the second group access the energy of the whole system directly, by representing it with a sum of energies of the fragments and additional nonadditivity terms.

(2) Fedorov and Kitaura455 suggested three main categories:

(a) divide-and-conquer methods (D&C)

(b) many-body molecular interaction methods (FMO, MFCC, etc.)

(c) transferable approaches (SFM)

The principle of classification is quite similar to that of Li et al.454 Additional complexity arises from the differentiation of the categories b and c. The difference lies in the way the interactions between fragments are added.

(3) Suarez et al.456 classified the many-body expansion (MBE) methodologies into two groups, according to the topology of fragmentation:

(a) overlapping fragments (MTA, MFCC, FEM, SFM)

(b) disjoint fragments (FMO, KEM)

(4) Gordon et al.76 recognized the difficulties of the former classification schemes and the need for a more complex one. They generalized the scheme of Li et al.454 by a two-level classification scheme. At the first level, a method can be classified as

• one step

• two step

The one-step group essentially substitutes the energy-based-methods group. It corresponds to the methods in which a property of the composed system (e.g., energy) is directly computed from the properties of the fragments. The two-step group substitutes the density-matrix-based methods group of the Li et al. classification. In these methods, an intermediate property of the whole system (e.g., density matrix) is first computed from the corresponding pieces of the fragments. The intermediate property is then used to compute the required properties (e.g., energy) of the entire system.

At the second level of the classification hierarchy, each of the first-level groups is subdivided into three groups:

• one body

• many body

• conglomerate

As the number of new methods for computations on large-scale systems grows, the distinctions become fuzzier, and a straight linear classification becomes problematic. This problem is already seen in the classification of Li et al., in

Table 3. Timeline of Bond Order Based Methods Development

year | event | authors
1984 | universal scaling of cohesive energies | Rose et al.336
     | BOC-MP | Shustorovich380
1985 | proposed simple many-body reactive potential (Stillinger–Weber potential) | Stillinger and Weber400
     | three-body potential for Si | Biswas and Hamann401
     | general reactive potential based on the many-body bond-ordered terms | Abell394
     | reactive potentials for atomic scattering in BO space | Garcia and Lagana396
     | EEM | Mortier et al.418
1986 | bond-order term for covalently bonded materials (Tersoff or Abell–Tersoff potential) | Tersoff453
1988 | UBI-QEP | Shustorovich382
1990 | BO potential for hydrocarbons (Brenner, Tersoff–Brenner, or REBO potential) | Brenner408
1991 | Qeq | Rappe and Goddard186
1994 | NBI-MD | Sellers391
     | classical fluctuating charge model | Rick et al.4
2000 | adaptive intermolecular REBO (AIREBO), accounting for LJ interactions | Stuart et al.412
2001 | ReaxFF for hydrocarbons | van Duin et al.28
     | maximal entropy valence bond (MEVB) method | Morales and Martinez426
2002 | REBO2, improved accuracy for hydrocarbons | Brenner et al.409
2005 | ReaxFF extended to transition metals | Nielson et al.29
2006 | split charge equilibration (SQE) | Nistor et al.437
2007 | charge transfer with polarization current equalization (QTPIE) | Chen and Martinez427
     | COMB1 | Yu et al.20
2009 | coupling of the split-charge equilibration scheme with BOP, a prototype of qAIREBO | Mikulski et al.428
2010 | COMB2A | Shan et al.439,440
2011 | COMB2B | Devine et al.21,22
2012 | qAIREBO | Knippenberg et al.429
     | COMB3 | Liang et al.23


which a single method (e.g., MFCC) can be classified into more than one group. The classification scheme of Gordon et al. partially solves this ambiguity; however, different versions of the same family of methods (e.g., FMO) can eventually be classified into different groups, which may become distinct at higher levels of hierarchy.

In our opinion, a modern successful classification scheme cannot be linear. Instead, it is best represented by an interconnected tree (Figure 6). The tree mimics the history of the methods' development, and their branching and combination. At the beginning of the era of large-scale calculations the distinction was very clear; as progress continues, new methods appear by combination of ideas from different groups of the earlier methods. The historical process acts as a variational, genetic optimization procedure, pushing researchers to combine the most successful ideas and approaches found in different aspects of the large-scale computations. Less successful techniques eventually lose popularity. Therefore, it can be expected that in the near future the broad variety of existing approaches may merge into a few optimal methods, which will combine the traits of many developed techniques.

For the purpose of presentation and organization of the material in this section, we will adopt the following classification scheme. All methods for quantum-mechanical calculations of large systems can be grouped into three large categories:

• fragmentation-based methods

• direct optimization methods

• quantal force fields

Techniques that are central to one group of methods can also be used in another group. For example, direct optimization can be applied within the fragmentation-based schemes, such as D&C.

Quantal force fields are derivatives of the energy-based or other fragmentation methods and, in principle, can be classified into that category. However, due to their distinct features and potential diversity in the future, we prefer to separate them into an independent group, forgetting for a moment their true origin. This separation will also help to structure the presentation and provide the capability to start a new classification within the group.

Since seemingly distinct methods may have large similarities in their foundation, or formulation, or both, it is often difficult to classify them into a single group. Instead, we choose to tag the fragmentation-based methods according to the following three types of properties:

Property 1:

• density partitioning and derived schemes (approximation of density matrix)

• Hamiltonian partitioning and derived schemes (approximation of Hamiltonian)

Here we essentially follow the original classification of Li et al., but generalize energy-based partitioning to Hamiltonian-based partitioning.

Property 2:

• static partitioning (working with the whole pool of fragments at once)

• dynamic partitioning (working only with a subset of fragments at a given time)

The methods of the first type utilize a predefined set of fragments, all at once. Most methods can be classified into this category. The dynamic partitioning schemes utilize only a subset of all available fragments at each iteration. Some property is dynamically adjusted and truncated to maintain constant complexity. The ELG and Stewart’s LMO methods are the clearest examples of the methods in this category.

Property 3:

• adiabatic partitioning (fragments are close to what they are in the combined system)

• diabatic partitioning (fragments are diabatic states and can be rather arbitrary)

Most of the discussed methods are classified as adiabatic partitioning. As an example, conjugate caps are chosen in the MFCC method to represent the environment of a given fragment in a realistic system as accurately as possible. In the D&C scheme, one chooses a buffer or double-buffer region to minimize spurious effects at the boundary. In this sense, a judicious choice of fragmentation is very important for obtaining desirable accuracy. The methods of the second group are less popular, but are actively developed too. In the diabatic partitioning methods, spurious boundary effects are relatively unimportant. The same applies to an unphysical charge distribution over the fragments into which the whole system is partitioned. One seeks to construct a complete basis of diabatic configurations and to represent the whole system as their linear combination.

4.2. Fragmentation-Based Approaches

We start by presenting yet another practical classification. It is specific to fragment-based approaches and is designed for structuring the current presentation. We adopt the following classification:

(1) density-based fragmentation (D&C, MEDLA, etc.)

(2) energy-based fragmentation (FMO, MFCC, SMF, MTA, E-EDC)

(3) dynamical growth with localization (LMO, ELG)

(4) diabatic approaches (FMO-MS-RMD, SFS)

Note that the three types of properties discussed in section 4.1 may be attributed to any group of methods in this classification.

Figure 6. Tree structure of the existing linear-scaling methods. Offspring methods may be derivatives of some parent method. Additionally, an offspring method may be a derivative of more than one parent method. The direction of the arrow indicates gradual complication and specialization of the methods. Separation of the arrows along the x-scale characterizes the similarity of the approaches. Initially distant methods spread out by acquiring new traits and can eventually overlap, because the new traits may be common across the newly developed methods. See text for abbreviations.


4.2.1. Density-Based Fragmentation.

4.2.1.1. Divide-and-Conquer (D&C). One of the first linear-scaling approaches is the divide-and-conquer (D&C) technique. It was proposed originally by Yang68,69 within the DFT framework. Later it was extended to semiempirical70,72 and ab initio Hartree−Fock75 molecular orbital methods by Dixon, He, Merz, and many other authors. The MO-based versions of the D&C algorithm were extended to MP2 and CCSD levels of theory. The development of the D&C algorithm was preceded by a range of related methodologies, not necessarily based on self-consistent formulation. Below we summarize the history of the method development and describe the basics of the major techniques.

4.2.1.1.1. Charge Density Partitioning. The D&C method68,69 represents one of the first approaches to achieve linear scaling in quantum mechanical calculations. The central D&C idea is similar to that of any other density partition scheme: to “abuse” the main assumption of the DFT, which utilizes the charge density as the main variable describing the state of the system, and to make good use of its additivity:

$$\rho_{A\cup B}(r) = \rho_A(r) + \rho_B(r) \tag{4.2.1}$$

where $A \cup B$ represents the union of the sets (fragments) A and B. The assumption of eq 4.2.1 makes the partitioning of a complex system into fragments easy, appealing, and natural, in contrast to the wave function based composition/partitioning schemes. In contrast to charge densities, wave functions are nonadditive, in general. Thus, one needs to account for renormalization and boundary effects in the MO-based approaches, when reconstructing the wave function of the entire system from the fragment MOs. For these reasons, the MO-based D&C methodologies were developed later and utilize an object that is similar to the charge density in the DFT, namely, the density matrix, eqs 2.1.32 and 2.1.33, to perform partitioning. The density matrix can be considered the WFT counterpart of the charge density in the DFT. Therefore, it can be used naturally for constructing WFT-based D&C partitioning schemes. These methods will be described in the following sections.

The main purpose of Yang’s D&C formulation of the standard DFT was to avoid using KS orbitals, and to work only in terms of electron density. The starting point is to reformulate the standard KS expression for the total density, eq 2.2.11, for all electrons with the spin component, σ, using the Heaviside function:

$$\rho_\sigma(r) = \langle r|\,\eta(E_{F,\sigma} - H)\,|r\rangle \tag{4.2.2}$$

where

$$\eta(x) = \begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases} \tag{4.2.3}$$

The definition in eqs 4.2.2 and 4.2.3 can be interpreted as the sum of squares of the amplitudes of all eigenvectors of the Hamiltonian H with eigenvalues lower than the Fermi energy $E_{F,\sigma}$, eqs 2.1.49 (Figure 7a).

The electron density partitioning scheme is defined by a set of smooth switching (weighting, partitioning) functions:

$$\Big\{ p_{A,\sigma}(r) : \sum_A p_{A,\sigma}(r) = 1 \Big\} \tag{4.2.4}$$

Each function $p_{A,\sigma}(r) = p_{A,\sigma}(r;\{R\})$ defines the region that encloses the fragment A in the space of Cartesian coordinates of all atoms, {R}. For example, the function can be 1 within the interior of the fragment, 0 outside of the fragment, and an intermediate value within some region around the fragment boundary (Figure 7b). The only requirement is that the sum of contributions from all fragments should be 1 at each coordinate vector r. Apart from this requirement, the partitioning functions, $p_{A,\sigma}(r)$, can be chosen quite arbitrarily. A variant suggested by Yang69 is based on spherical atomic densities $\rho_{a,\sigma}(|r - R_a|)$:

$$p_{A,\sigma}(r;\{R\}) = \frac{g_{A,\sigma}(r;\{R\})}{\sum_{A'} g_{A',\sigma}(r;\{R\})} \tag{4.2.5}$$

with

$$g_{A,\sigma}(r;\{R\}) = \sum_{a\in A} \rho_{a,\sigma}(|r - R_a|) \tag{4.2.6}$$

Here and in the following discussion of fragmentation/partition/subsystem methods, we will use the capital letters A, B, C, etc. to denote fragments/sets/subsystems and small letters a, b, c, etc. to denote atomic species.
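The construction of eqs 4.2.4−4.2.6 can be illustrated with a minimal one-dimensional sketch. The Gaussian form of the atomic density, the `width` parameter, the atomic positions, and the function names are assumptions made only for illustration; the spin index is dropped for brevity.

```python
import math

def atomic_density(distance, width=1.0):
    """Hypothetical spherical atomic density: an unnormalized Gaussian in |r - R_a|."""
    return math.exp(-(distance / width) ** 2)

def fragment_weight_functions(r, fragments):
    """Return p_A(r) for every fragment A (eq 4.2.5).

    fragments: dict mapping a fragment label to a list of 1D atomic positions R_a.
    """
    # g_A(r) = sum over atoms a in A of rho_a(|r - R_a|)   (eq 4.2.6)
    g = {label: sum(atomic_density(abs(r - R)) for R in atoms)
         for label, atoms in fragments.items()}
    total = sum(g.values())
    # Dividing by the sum over fragments enforces sum_A p_A(r) = 1   (eq 4.2.4)
    return {label: value / total for label, value in g.items()}

# Two fragments with two atoms each (made-up geometry).
fragments = {"A": [0.0, 1.0], "B": [3.0, 4.0]}
for r in (0.0, 2.0, 4.0):
    print(r, fragment_weight_functions(r, fragments))
```

In this sketch the weights switch smoothly from fragment A to fragment B and are equal at the midpoint between the fragments. Far from all atoms the denominator can underflow to zero, so a practical implementation would need a cutoff or a floor on the atomic densities.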

Figure 7. Illustration of the charge density definition that avoids using KS orbitals (a) and the partitioning of the total charge density into two fragments A and B, using the smooth switching functions defined by eq 4.2.4 (b). The switching functions in (b) are shifted along the Y coordinate for clarity. The positioning along the X coordinate corresponds to that in the combined system, outlined in the middle drawing. $\{p_{\alpha,\sigma} | \alpha = A, B\}$ is the density partitioning scheme for the spin channel σ, H is the Hamiltonian, $E_{F,\sigma}$ is the Fermi energy for the spin channel σ, and $N_\sigma$ is the number of electrons with a particular spin z-projection σ.


Using the switching functions of eq 4.2.4, one can divide the total density, eq 4.2.2, into a sum of the fragment densities:

$$\rho_\sigma(r) = \langle r|\,\eta(E_{F,\sigma} - H)\,|r\rangle = \sum_A p_{A,\sigma}(r)\,\langle r|\,\eta(E_{F,\sigma} - H)\,|r\rangle = \sum_A \rho_{A,\sigma}(r) \tag{4.2.7}$$

with the charge density on the fragment A defined by

$$\rho_{A,\sigma}(r) = p_{A,\sigma}(r)\,\langle r|\,\eta(E_{F,\sigma} - H)\,|r\rangle \tag{4.2.8}$$

4.2.1.1.2. Hamiltonian Localization and Fragment MOs (FMOs). The evaluation of $\langle r|\eta(E_{F,\sigma} - H)|r\rangle$ requires knowledge of the Hamiltonian of the whole system, which depends on the densities of both spin-up and spin-down electrons. In order for the D&C approach to be useful, the second main assumption is made. Namely, the global Hamiltonian, H, is approximated by a set of local Hamiltonians for subsystems, $H_A$. In addition, for practical purposes, a smooth equivalent of the Heaviside function, eq 4.2.3, is introduced. With these approximations, eq 4.2.8 turns into a practically more appealing expression:

$$\rho_{A,\sigma}(r) = 2\,p_{A,\sigma}(r)\,\langle r|\,f_{\Delta E}(H_A; E_{F,\sigma})\,|r\rangle \tag{4.2.9}$$

where $f_{\Delta E}(H_A; E_{F,\sigma})$ is the Fermi function, eq 2.1.49b, of the operator argument.

A definition of the subsystem Hamiltonians, $H_A$, is the last ingredient needed to complete the D&C scheme. The subsystem Hamiltonian $H_A$ is defined as the projection of the full KS Hamiltonian on the subspace spanned by the basis orbitals (AOs) that are localized on the corresponding subsystem A, $\{|\chi_{A,i}\rangle\}$. Using eq 2.1.42, this can be written formally as

$$H_A = \sum_{a,b\in A} |\chi_{A,a}\rangle \Big[ \sum_{i,j} (S^{-1})_{ai}\, H_{ij}\, (S^{-1})_{jb} \Big] \langle\chi_{A,b}| \approx \sum_{a,b\in A} |\chi_{A,a}\rangle \Big[ \sum_{i,j} (S_A^{-1})_{ai}\, H_{A,ij}\, (S_A^{-1})_{jb} \Big] \langle\chi_{A,b}| \tag{4.2.10a}$$

and

$$H_{A,ij} \equiv \langle\chi_{A,i}|\,H\,|\chi_{A,j}\rangle \tag{4.2.10b}$$

The first double summation in eq 4.2.10a runs over all AOs localized on the fragment A. It is important to emphasize that eq 4.2.10b defines matrix elements of the full Hamiltonian. It depends on properties of all fragments/subsystems, therefore leading to a set of coupled SCF (KS) equations of the eq 2.1.38 type, but with additional indexing due to fragments:

$$H_{A,\sigma} C_{A,\sigma} = S_A C_{A,\sigma} E_{A,\sigma} \tag{4.2.11}$$

where $H_{A,\sigma}$ is the KS Hamiltonian, eq 2.2.11, of the fragment A, and $S_A$ is the overlap matrix of the AOs localized on the fragment A:

$$\langle\chi_{A,i}|\chi_{A,j}\rangle = S_{A,ij} \tag{4.2.12}$$

The matrix $C_{A,\sigma}$ contains coefficients that transform the set of nonorthogonal fragment AOs, $\{|\chi_{A,i}\rangle\}$, to the set of orthogonal fragment-localized MOs (also known as fragment MOs, FMOs), $\{|\psi_{A,\sigma,i}\rangle\}$:

$$|\psi_{A,\sigma,i}\rangle = \sum_a C_{A,\sigma,ai}\,|\chi_{A,a}\rangle, \quad \sigma = \alpha, \beta \tag{4.2.13}$$

The projected Hamiltonian, eqs 4.2.10, simplifies significantly in the FMO basis:

$$H_{A,\sigma} = \sum_i |\psi_{A,\sigma,i}\rangle\, E_{A,\sigma,i}\, \langle\psi_{A,\sigma,i}| \tag{4.2.14}$$

The FMOs, $\{|\psi_{A,\sigma,i}\rangle\}$, are determined by solving the set of coupled equations, eq 4.2.11. Because this procedure accounts for electronic density deformations of one fragment induced by all other fragments, we classify these FMOs as adiabatic, in analogy with the terminology utilized in the field of quantum dynamics. The opposite limit is to utilize the basis of FMOs obtained from calculation of noninteracting (isolated) fragments. Such approaches are utilized by a number of authors and will be discussed later. The FMOs can be classified as diabatic in this situation. Finally, many intermediate types of coupling between fragments can be included into the procedure used to determine FMOs. Common to the QM/MM methods, they will be discussed later.
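For a single fragment and a fixed Hamiltonian, eq 4.2.11 is a generalized eigenvalue problem. The following minimal sketch solves it in closed form for a hypothetical fragment with only two nonorthogonal AOs; the matrix elements are made-up numbers, the function name is illustrative, and nondegenerate eigenvalues are assumed. A realistic calculation would use a linear algebra library and repeat the step inside the SCF loop.

```python
import math

def solve_fragment_2x2(h11, h12, h22, s12):
    """Solve H c = E S c for symmetric H and S = [[1, s12], [s12, 1]], |s12| < 1."""
    # det(H - E S) = 0 expands to a*E^2 + b*E + c = 0
    a = 1.0 - s12 ** 2
    b = -(h11 + h22 - 2.0 * h12 * s12)
    c = h11 * h22 - h12 ** 2
    root = math.sqrt(b * b - 4.0 * a * c)
    energies = sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])
    orbitals = []
    for e in energies:
        # The first row of (H - E S) c = 0 fixes c up to a scale factor.
        c1, c2 = h12 - e * s12, -(h11 - e)
        if abs(c1) + abs(c2) < 1e-12:
            # First row vanishes identically: use the second row instead.
            c1, c2 = h22 - e, -(h12 - e * s12)
        # Normalize so that c^T S c = 1 (orthonormal fragment MOs).
        norm = math.sqrt(c1 * c1 + 2.0 * s12 * c1 * c2 + c2 * c2)
        orbitals.append((c1 / norm, c2 / norm))
    return energies, orbitals

energies, orbitals = solve_fragment_2x2(h11=-1.0, h12=-0.3, h22=-0.5, s12=0.2)
print(energies)
print(orbitals)
```

The two coefficient vectors play the role of the columns of $C_{A,\sigma}$ in eq 4.2.13, and the two energies are the diagonal elements of $E_{A,\sigma}$.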

4.2.1.1.3. Subsystem Equilibrium and D&C Algorithm. Combining eqs 4.2.7−4.2.9 and keeping in mind the localization approximations, eqs 4.2.10−4.2.11, the total density of the whole system can be computed by

$$\rho_\sigma(r) = \sum_A p_{A,\sigma}(r) \sum_i f_{\Delta E}(E_{F,\sigma} - E_{A,\sigma,i})\,|\psi_{A,\sigma,i}(r)|^2 \tag{4.2.15}$$

The Fermi energy, $E_{F,\sigma}$, is assumed to be the same for all subsystems (fragments), which is consistent with its physical meaning as the chemical potential. The self-consistent solution corresponds to the equilibrium state, namely to the energy minimum, which follows from the variational principle. Therefore, all subsystems must be in equilibrium with each other. In other words, the self-consistent solution is obtained when the Fermi energies of all subsystems are the same. The numerical value is obtained from the normalization:

$$N_\sigma = \int \rho_\sigma(r)\,dr = \sum_A \sum_i f_{\Delta E}(E_{F,\sigma} - E_{A,\sigma,i}) \int p_{A,\sigma}(r)\,|\psi_{A,\sigma,i}(r)|^2\,dr \tag{4.2.16}$$

The overall D&C scheme is iterative, since the total density is computed by summing up densities of all subsystems, and the density of each subsystem depends on the Hamiltonian of the whole system. The principal scheme of the D&C algorithm is illustrated in Figure 8.
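The normalization condition of eq 4.2.16 determines the common Fermi energy once the fragment eigenvalues and the integrals $w_{A,i} = \int p_{A,\sigma}(r)|\psi_{A,\sigma,i}(r)|^2 dr$ are known. A minimal sketch of this single step is given below. The logistic form of the smooth step function, the smearing width `delta_e`, the bisection solver, and all numerical values are assumptions made for illustration; eq 2.1.49b defines the actual Fermi function used in the text.

```python
import math

def fermi(x, delta_e):
    """Smooth step: close to 1 for x >> 0 and to 0 for x << 0 (assumed logistic form)."""
    # Clamp the exponent to avoid overflow in math.exp.
    t = max(-700.0, min(700.0, -x / delta_e))
    return 1.0 / (1.0 + math.exp(t))

def electron_count(e_fermi, levels, delta_e):
    """Right-hand side of eq 4.2.16 for one spin channel.

    levels: list of (E_Ai, w_Ai) pairs pooled over all fragments A.
    """
    return sum(w * fermi(e_fermi - e, delta_e) for e, w in levels)

def find_fermi_energy(levels, n_electrons, delta_e=0.1, tol=1e-10):
    """Bisection on E_F; the electron count grows monotonically with E_F."""
    lo = min(e for e, _ in levels) - 10.0
    hi = max(e for e, _ in levels) + 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if electron_count(mid, levels, delta_e) < n_electrons:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Made-up eigenvalues and weights: two levels from fragment A, two from fragment B.
levels = [(-1.0, 0.9), (0.5, 0.6), (-0.8, 0.8), (0.7, 0.7)]
e_f = find_fermi_energy(levels, n_electrons=1.7)
print(e_f, electron_count(e_f, levels, delta_e=0.1))
```

In a full D&C cycle this step would be followed by rebuilding the density with eq 4.2.15 and updating the fragment Hamiltonians until self-consistency is reached.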

4.2.1.1.4. Source of Computational Savings. Unlike the conventional DFT, in which the KS Hamiltonian is represented by a single $N_M \times N_M$ matrix, the dimensionality of the fragment KS Hamiltonian is reduced to $N_{M_A} \times N_{M_A}$, where $N_{M_A}$ is the number of basis states (and, equivalently, MOs) associated with the fragment A. The number of the SCF equations one needs to solve increases and is equal to the number of subsystems, $N_S$. However, this increase is only linear with $N_S$. The reduced dimensionality of the KS Hamiltonian that must be diagonalized has a much more drastic effect on performance, because the CPU time of a diagonalization scales cubically with the matrix dimension. In total, the scaling of computational resources reduces to $O(\sum_{A=1}^{N_S} N_{M_A}^3)$. Assuming that sizes of all subsystems are comparable to each other (even partitioning), $N_{M_A} = N_M/N_S$, $\forall A = 1, \ldots, N_S$, one obtains $O(N_S (N_M/N_S)^3)$. Choosing $N_S$ proportional to the system size, $N_S = N_M/\alpha$, one obtains a linear-scaling performance: $O(\alpha^2 N_M) = O(N_M)$.
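The cost estimate can be checked with a few lines of arithmetic. The sketch below uses a bare cubic cost model with unit prefactor and ignores buffer regions, which enlarge each fragment matrix in practice; the fragment size of 100 basis functions is an arbitrary assumption.

```python
def conventional_cost(n_basis):
    """Cubic cost model for diagonalizing the full N_M x N_M KS matrix."""
    return n_basis ** 3

def dc_cost(n_basis, fragment_size):
    """Sum of cubic costs over N_S = N_M / alpha equally sized fragments."""
    n_fragments = n_basis // fragment_size      # N_S
    return n_fragments * fragment_size ** 3     # N_S * (N_M / N_S)^3 = alpha^2 * N_M

for n in (1000, 2000, 4000):
    print(n, conventional_cost(n), dc_cost(n, fragment_size=100))
```

Doubling the number of basis functions multiplies the conventional estimate by eight but only doubles the D&C estimate, as long as the fragment size stays fixed.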

4.2.1.1.5. Anatomy of Fragmentation and Terminology. In his original DFT-based D&C scheme, Yang68,69 introduced the


concept of buffer atoms. According to Yang’s definition, these atoms are not part of the subsystem (in our further terminology called “core region”), but they are in its neighborhood and contribute to the subsystem orbitals. The anatomy of the partitioning of a molecular system into fragments is illustrated in Figure 9.

The molecular system is composed of generally overlapping, localized regions, also called fragments or subsystems, $F_A$:

$$M = \bigcup_{A=1}^{N_F} F_A \tag{4.2.17a}$$

$$F_A \cap F_B \ne \varnothing \tag{4.2.17b}$$

Here, $N_F$ is the number of fragments into which the system is subdivided, M indicates the entire molecular system, and $\varnothing$ denotes the empty (null) set. Each localized region, $F_A$, is composed of the nonoverlapping core, $C_A$, and buffer, $B_A$, regions:

$$F_A = C_A \cup B_A \tag{4.2.18a}$$

$$C_A \cap B_A = \varnothing \tag{4.2.18b}$$

Division into the core and buffer sets must satisfy the condition that the core regions of different fragments do not overlap, for any pair of distinct fragments:

$$C_A \cap C_B = \varnothing \tag{4.2.19}$$

The overlap may be nonzero in some definitions.70 This deviation from the definition of the core region, eqs 4.2.18−4.2.19, can be resolved by introducing additional partitioning to separate nonoverlapping core sets that obey eq 4.2.19.72

The sets $F_A$, $C_A$, and $B_A$ in the partitions above can be understood as the sets of atoms, in the spirit of the original Yang68,69 terminology. It is more general and convenient for further discussion to think about them as sets of localized

Figure 8. Algorithm of the D&C method based on charge density partitioning in DFT. The arrow that closes the loop represents the step in which verification for self-consistency among all subsystems is performed, as judged on the basis of computed Fermi levels of all subsystems. See text for detailed discussion of the terms in the scheme on the right.

Figure 9. Illustration of partitioning in the D&C scheme. (a) Subdivision of the system into fragments (subsystems) and their components: core and buffer regions. (b) Weights of different types of matrix elements in the density matrix of a given fragment. {φi} are basis functions localized in different spatial regions.


orbitals. Most often, the bijection between the two sets can be established:

$$S_{\mathrm{atomic},A} = \{ I : i \in S_{\mathrm{orbital},A},\ i \in I \} \tag{4.2.20a}$$

and inversely

$$S_{\mathrm{orbital},A} = \{ i : I \in S_{\mathrm{atomic},A},\ i \in I \} \tag{4.2.20b}$$

where $S_{\mathrm{atomic},A}$ and $S_{\mathrm{orbital},A}$ are the sets of atomic and orbital indices of the fragment A, respectively, I is the atom index, i is the orbital index, and the notation $i \in I$ implies that the orbital i is localized on the atom I. A unique mapping from the atomic set to the orbital set may not be possible in a more general case. This is why it is more general to think of the sets in the above partitioning in terms of orbital indices (essentially Hilbert subspaces), not just the sets of atomic fragments. It is because we think about the partitions in terms of sets of orbitals rather than atoms that we call the core what Yang originally called a subsystem. Our definition goes along the lines used later by other authors.72 We hope this definition, together with the above partitioning anatomy, can help to reduce confusion and possible ambiguity and lead to more accurate terminology.

The main purpose of the buffer atoms is to supplement AOs at the subsystem boundary. This requirement stems from the approximation in eq 4.2.10a. Namely, the global Hamiltonian is defined on the Hilbert space of all AOs of the entire system. However, because of the localization approximation, eq 4.2.10a, which essentially disregards overlaps of AOs belonging to different fragments, $\langle\chi_{A,i}|\chi_{B,j}\rangle = \delta_{AB} S_{ij}$, the fragment-localized Hamiltonian defined on the space of only those AOs that belong to a given fragment is not coupled to Hamiltonians of all other fragments. In other words, it is defined on one of the mutually orthogonal Hilbert subspaces of the original Hilbert space of the entire system. In order for the subsystems to be coupled, their orbital spaces should overlap, which is represented by eq 4.2.17b.

Apart from the assumptions of nonoverlapping cores, eq 4.2.19, and the absence of overlaps between core and buffer regions of a given fragment, eq 4.2.18b, there is no additional restriction on overlaps of the partitions. In particular, the core of one fragment can overlap with the buffer of another fragment:

$$C_A \cap B_{A'} \ne \varnothing \tag{4.2.21a}$$

Moreover, the core of one fragment can be the buffer of another, as illustrated in Figure 9a:

$$C_A = B_{A'} \tag{4.2.21b}$$
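The set relations of eqs 4.2.17a, 4.2.18, and 4.2.19 translate directly into operations on index sets. The sketch below checks a proposed fragmentation against them; the function name, the fragment labels, and the index sets are hypothetical and serve only as an illustration.

```python
def fragmentation_violations(system, cores, buffers):
    """List which of eqs 4.2.17a, 4.2.18b, and 4.2.19 a fragmentation violates.

    cores, buffers: dicts mapping a fragment label to a set of orbital (or atom) indices.
    """
    problems = []
    fragments = {a: cores[a] | buffers[a] for a in cores}       # eq 4.2.18a
    if set().union(*fragments.values()) != system:             # eq 4.2.17a
        problems.append("fragments do not cover the system")
    for a in cores:
        if cores[a] & buffers[a]:                               # eq 4.2.18b
            problems.append(f"core and buffer of {a} overlap")
        for b in cores:
            if a < b and cores[a] & cores[b]:                   # eq 4.2.19
                problems.append(f"cores of {a} and {b} overlap")
    return problems

# The core of each fragment is the buffer of the other one (eq 4.2.21b).
print(fragmentation_violations(
    system={1, 2, 3, 4},
    cores={"A": {1, 2}, "B": {3, 4}},
    buffers={"A": {3, 4}, "B": {1, 2}},
))  # prints []

# Overlapping cores, as in definitions that deviate from eq 4.2.19.
print(fragmentation_violations(
    system={1, 2, 3},
    cores={"A": {1, 2}, "B": {2, 3}},
    buffers={"A": {3}, "B": {1}},
))  # prints ['cores of A and B overlap']
```

The same function works for atom indices or orbital indices, in line with the view that the partitions are best thought of as sets of orbitals.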

4.2.1.1.6. Density Matrix Partitioning. The original formulation of the D&C was based on division of the system in the Euclidean space. The method operated directly on the charge density. In other words, the charge density was the partitioned quantity. The need for computing three-dimensional integrals of the type $\int p_{A,\sigma}(r)|\psi_{A,\sigma,i}(r)|^2\,dr$ in eq 4.2.16 was one of the drawbacks of such an approach. In addition, the formulation was restricted to the DFT. In the refined formulation, Yang and Lee457 considered partition in the space of basis orbitals, making the density matrix, eqs 2.1.32 and 2.1.33, the main partitioned object. The partition scheme, eq 4.2.4, becomes independent of the coordinate system and is substituted by a discrete analogue:

$$\Big\{ p_{A,\sigma,ij} : \sum_A p_{A,\sigma,ij} = 1,\ \forall\, i, j;\ \sigma \in \{\alpha, \beta\} \Big\} \tag{4.2.22}$$

The set of the piecewise functions depending on atomic orbital indices provides the simplest choice of partition functions:

$$p_{A,\sigma,ij} = \begin{cases} 1, & i \in C_A,\ j \in C_A \\ 1/2, & i \in C_A,\ j \in B_A \ \text{or}\ j \in C_A,\ i \in B_A \\ 0, & i \notin C_A,\ j \notin C_A \end{cases} \tag{4.2.23a}$$

where $C_A$ denotes the core region of the fragment A, and $i \in C_A$ indicates that the orbital i belongs to the set $C_A$. The orbital-based partitioning given by eq 4.2.23a is schematically illustrated in Figure 10a.
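The weights of eq 4.2.23a can be tabulated for a small example. The sketch below assumes that the total density matrix is assembled as the weighted sum $P_{ij} = \sum_A p_{A,ij} P^A_{ij}$ of the fragment matrices, which is how we read eq 4.2.22 and Figure 10. It uses a hypothetical three-orbital system (0-based indices) in which the buffer of each fragment is the complement of its core, and made-up fragment matrices; pairs not listed explicitly in eq 4.2.23a receive zero weight here, which is a further assumption.

```python
def weight(i, j, core, buffer):
    """Partition weight p_{A,ij} of eq 4.2.23a for one fragment A."""
    if i in core and j in core:
        return 1.0
    if (i in core and j in buffer) or (j in core and i in buffer):
        return 0.5
    return 0.0  # neither index in the core, or a case not listed in eq 4.2.23a

def assemble_density_matrix(n_orbitals, fragments):
    """P_ij = sum_A p_{A,ij} * P^A_ij; each P^A is a dict {(i, j): value}."""
    total = [[0.0] * n_orbitals for _ in range(n_orbitals)]
    for core, buffer, p_frag in fragments:
        for (i, j), value in p_frag.items():
            total[i][j] += weight(i, j, core, buffer) * value
    return total

pairs = [(i, j) for i in range(3) for j in range(3)]
fragment_1 = ({0}, {1, 2}, {pair: 1.0 for pair in pairs})   # made-up matrix elements
fragment_2 = ({1, 2}, {0}, {pair: 3.0 for pair in pairs})   # made-up matrix elements

for row in assemble_density_matrix(3, [fragment_1, fragment_2]):
    print(row)
```

In this example the core–core blocks are taken entirely from the fragment that owns the core, and the core–buffer elements are averaged over the two fragments, so the weights add up to one for every pair of orbitals.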

Figure 10. Illustration of the partitioning scheme in the space of orbital indices with a three-orbital example: (a) original Yang scheme; (b) improved scheme by Merz. The construction of the total density matrix from the density matrices of the fragments is also illustrated. Panel a shows partitioning of the orbital space M = {1, 2, 3} into two fragments, F1 = {1, 2} and F2 = {1, 2, 3}. (This example only illustrates the principle, and does not provide much computational saving.) The first fragment is composed of the core, C1 = {1}, and buffer, B1 = {2}, regions. The second fragment is composed of the core, C2 = {2, 3}, and buffer, B2 = {1}, regions. The core regions of the two fragments do not overlap. The partitioning in (b) is F1 = {1, 2, 3} and F2 = {1, 2, 3} with C1 = {1, 2}, B1 = {3}, and C2 = {2, 3}, B2 = {1}. The overlap of the cores of the two subsystems in this case is not empty: C1 ∩ C2 = {1, 2} ∩ {2, 3} = {2} ≠ ϕ. Pij, Pij′, and Pij″ are the elements of the density matrix. Prime and double prime indicate fragment matrices; unprimed quantities correspond to the reconstructed density of the entire system.


In a later work, Dixon and Merz70 proposed an improved D&C density matrix partitioning scheme:

$$p_{A,ij,\sigma} = \begin{cases} 0, & i \in B_A,\ j \in B_A \\ 1/n_{ij}, & \text{otherwise} \end{cases} \tag{4.2.23b}$$

where $B_A$ is the buffer region of fragment A, and $n_{ij}$ is the number of subsystems in which both basis functions i and j appear as nonbuffer functions.

The scheme eq 4.2.23b was improved further72 by splitting the overlapping “core” parts into the true nonoverlapping core set (see our definition according to eqs 4.2.18−4.2.19) and the inner buffer (first buffer layer, buffer 1). The original buffer became the outer buffer (second buffer layer, buffer 2), Figure 10a:

$$p_{A,ij,\sigma} = \begin{cases} 1/n_{ij}, & i, j \in C_A,\ \text{or}\ i \in C_A,\ j \in B_{1,A},\ \text{or}\ j \in C_A,\ i \in B_{1,A} \\ 0, & \text{otherwise} \end{cases} \tag{4.2.23c}$$

Treatment of the interface between the subsystems is the main distinguishing feature of the new formulation. The partitioning eq 4.2.23b requires an overlap of the subsystems in their nonbuffer atoms (Figure 10b). The partitioning eq 4.2.23c is more consistent with the previous definition of the core regions, eqs 4.2.18−4.2.19. The core region contains only a high-quality density matrix. The inner buffer serves the purpose of coupling different subsystems, and in general, contains a less accurate density matrix than the core region. The outer region serves only the purpose of insulating the fragment from all other subsystems. The orbitals of that region are used in the SCF equations, but are not used for the density matrix refinement. The outer buffer in the partitioning scheme eq 4.2.23c corresponds to the standard buffer of the scheme eq 4.2.23b. The union $B_{1,A} \cup C_A$ of eq 4.2.23c corresponds to the core region of eq 4.2.23b. Finally, the union of the two buffer regions of eq 4.2.23c, $B_{1,A} \cup B_{2,A}$, corresponds to the buffer of the scheme eq 4.2.23a.

Utilization of the orbital-dependent normalization factor, $n_{ij}$, instead of the fixed factor 1/2 used in the original Yang scheme, eq 4.2.23a, presents another important modification. This approach is important if the system is split into fragments of notably different size, say fragments A and B, such that the core of the fragment A overlaps with the buffer region of the fragment B differently from how the core of the fragment B overlaps with the buffer of the fragment A (Figure 11a). According to the partitioning eq 4.2.23a, the partition factor for the density matrix element corresponding to the pair of orbitals $i \in C_A$ and $j \in C_B$ will be the same for each of the two fragments, $p_{A,ij} = p_{B,ij} = 1/2$, no matter how the orbitals relate to the sets $B_A$ and $B_B$. In reality, equal weights of 1/2 are obtained only if the orbitals are symmetrically distributed (Figure 11a, blue double arrow), that is, $i \in C_A$, $i \in B_B$, and $j \in C_B$, $j \in B_A$. Both i and j belong to the two subsystems in this case. The partitioning factor is given by the normalization coefficient $1/n_{ij} = 1/2$, which is the same as in the original formula, eq 4.2.23a. In a more complex situation, the orbital $i \in C_A$ may be outside the buffer of the fragment B, in which case the partition of the small fragment gets zero weight, $p_{B,ij} = 0$. From the point of view of the subsystem A, both orbitals still belong to a single fragment, and eq 4.2.23c yields $p_{A,ij} = 1$. Thus, although the sum of the partitioning factors remains constant, the distribution of the weight across the subsystems is changed.
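The symmetric and asymmetric cases discussed above can be reproduced with a small sketch of eq 4.2.23c. The helper names and the four-orbital layout are hypothetical. The sketch also assumes that $n_{ij}$ counts the fragments for which the pair (i, j) satisfies the nonzero-weight condition; this reading reproduces the weights quoted in the text, but it is an assumption rather than a definition taken from the original works.

```python
def qualifies(i, j, core, inner):
    """Nonzero-weight condition of eq 4.2.23c for one fragment."""
    return ((i in core and j in core)
            or (i in core and j in inner)
            or (j in core and i in inner))

def pair_weights(i, j, fragments):
    """Weights p_{A,ij} for every fragment A.

    Assumption: n_ij is the number of fragments in which the pair qualifies.
    """
    hits = [qualifies(i, j, f["core"], f["inner"]) for f in fragments]
    n_ij = sum(hits)
    return [1.0 / n_ij if hit else 0.0 for hit in hits]

# Hypothetical layout with orbitals 0..3; the outer buffers never carry weight.
frag_a = {"core": {0, 1}, "inner": {2}, "outer": {3}}
frag_b = {"core": {2, 3}, "inner": {1}, "outer": {0}}
fragments = [frag_a, frag_b]

symmetric = pair_weights(1, 2, fragments)   # each orbital sits in the other's inner buffer
asymmetric = pair_weights(0, 2, fragments)  # orbital 0 is only in the outer buffer of B
print(symmetric, asymmetric)
```

The symmetric pair shares its weight equally between the two fragments, while the asymmetric pair is assigned entirely to fragment A; in both cases the weights sum to one.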

$$P_{A,ij,\sigma} = p_{A,ij,\sigma} \sum_{k=1}^{N} f_\Delta(E_{F,\sigma} - E_{A,k,\sigma})\, C_{A,ik,\sigma} C^{*}_{A,jk,\sigma} \tag{4.2.24}$$

where $\{(E_{A,\sigma}, C_{A,\sigma})\}$ are the eigenvalues and corresponding eigenvectors of the interacting fragment, determined from the solution of eq 4.2.11. The total density matrix is given by the sum of the fragmental density matrices, as illustrated in Figure 10:

$$P_\sigma = \sum_A P_{A,\sigma} \tag{4.2.25}$$

The Fermi energy is obtained in a manner similar to the charge density partitioning scheme, using the following normalization condition:

$$N_\sigma = \sum_{i,j} P_{ij,\sigma} S_{ij} = \sum_A \sum_{i,j} P_{A,ij,\sigma} S_{ij} \tag{4.2.26}$$

Note that eq 4.2.26 does not involve any numerical three-dimensional integration, leading to efficiency improvement.

The density matrix based partition scheme can be applied to both the DFT and WFT methods, ab initio or semiempirical. The orbital-based partitioning leads to the charge density less localized in the physical space than that derived from the direct space based partitioning. This drawback is emphasized as the orbital size increases (e.g., for diffuse orbitals). The difference between the two partitioning schemes is easy to understand in terms of the atomic limit. In the case of atomic subdivision, the partitioning schemes eqs 4.2.4 and 4.2.23 become the Hirshfeld458 and Mulliken327,459 population analyses, respectively.
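The interplay of eqs 4.2.24−4.2.26 can be illustrated on a toy model. The sketch below is an illustration under stated assumptions, not the published algorithm: it uses a four-site Hückel-type chain with an orthonormal basis (S is the identity), the Yang weights of eq 4.2.23a, a Fermi-type smearing function whose exact form is assumed, and plain bisection to locate the Fermi energy. All function names are hypothetical.

```python
import numpy as np

def fermi(x, delta):
    """Assumed smooth step function; the tanh form avoids overflow."""
    return 0.5 * (1.0 + np.tanh(x / (2.0 * delta)))

def yang_weights(core, buffer, n_orb):
    """Weights of eq 4.2.23a built from core/buffer membership masks."""
    c = np.zeros(n_orb)
    b = np.zeros(n_orb)
    c[list(core)] = 1.0
    b[list(buffer)] = 1.0
    return np.outer(c, c) + 0.5 * (np.outer(c, b) + np.outer(b, c))

def total_density(h, fragments, e_f, delta):
    """Sum of weighted fragment density matrices, eqs 4.2.24 and 4.2.25."""
    n = h.shape[0]
    p_tot = np.zeros((n, n))
    for frag in fragments:
        idx = sorted(frag["core"] | frag["buffer"])
        e, c = np.linalg.eigh(h[np.ix_(idx, idx)])   # fragment eigenproblem
        p_loc = (c * fermi(e_f - e, delta)) @ c.T    # sum_k f_k C_ik C_jk
        p_full = np.zeros((n, n))
        p_full[np.ix_(idx, idx)] = p_loc
        p_tot += yang_weights(frag["core"], frag["buffer"], n) * p_full
    return p_tot

def find_fermi(h, s, fragments, n_target, delta, lo=-10.0, hi=10.0):
    """Bisection on the electron count of eq 4.2.26."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        n_el = np.sum(total_density(h, fragments, mid, delta) * s)
        if n_el < n_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h = -(np.eye(4, k=1) + np.eye(4, k=-1))   # nearest-neighbor hopping, made-up units
s = np.eye(4)                             # orthonormal basis (assumption)
fragments = [{"core": {0, 1}, "buffer": {2}}, {"core": {2, 3}, "buffer": {1}}]
e_f = find_fermi(h, s, fragments, n_target=2.0, delta=0.05)
n_el = np.sum(total_density(h, fragments, e_f, 0.05) * s)
print(e_f, n_el)
```

Only matrix products and traces enter the electron count, which reflects the remark above that no three-dimensional numerical integration is required in the density matrix formulation.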

4.2.1.1.7. Elaborations and Extensions of the D&C Method. The basic D&C methodology described in detail above was extended to a variety of other electronic structure methods. In particular, the density matrix formulation of the D&C method was applied to accelerate calculations based on semiempirical methods, to study biomolecules (the groups of W. Yang, D. York, K. Merz, G. Scuseria, etc.)70,72,79,80,460 and nanoscale materials, such as fullerene clusters.461 Other variations include the implementations within DFT (the St-Amant group),462−464 HF and HF/DFT hybrids (the groups of H. Nakai, K. Merz, etc.),75,465,466 MP2 (the Nakai group),467−469 and CC (the Nakai group).470,471 The approach was adapted to unrestricted formulations, including HF/DFT472 and MP2.473 Error sources of the method were analyzed on a model polypeptide system,

Figure 11. (a) Double buffer partitioning scheme by Merz.72 (b) Illustration of the success (blue double arrow) and failure (red double arrow) of the scheme eq 4.2.23a to describe partitioning of the weights across the subsystems of different sizes. See text for discussion of system partitioning into different regions.


indicating that a large buffer space cutoff leads to errors due to the loss of variational flexibility of MOs, while a small buffer cutoff restricts the capability of the subsystem MOs to be orthogonal to the true whole-system MOs.462 The D&C scheme was also implemented in parallel computing environments.464 As an example of the remarkable capabilities of the method, a semiempirical calculation on a truly large-scale protein system containing over 9000 atoms was reported.460 The calculations could be performed on a typical workstation, indicating that the method has a large potential for application to even larger systems.

4.2.1.1.8. Implementation. The D&C SCF and correlation methods have been implemented into the GAMESS package474 by the Nakai group. An alternative implementation is available in the DivCon 99 program and its subsequent versions, distributed by the Merz group.475

4.2.1.2. Additive Fuzzy Density Fragmentation (AFDF: MEDLA, ADMA). An alternative approach to partitioning of the charge density (or density matrix), similar in spirit to the D&C method described above, was developed independently by Walker and Mezey, to constitute their molecular electron density Lego approach (MEDLA).476−478 Unlike the self-consistent schemes of Yang et al., the method utilized a significantly simplified, but computationally more efficient scheme. The method was originally targeted at the reconstruction of the electron density for the whole system. Later extensions of the original MEDLA methodology allowed calculation of electrostatic potentials,479,480 energies,480 and dipole moments480 of large systems. The MEDLA abbreviation is often used interchangeably with the abbreviation ADMA, which stands for a more general term: adjustable density matrix approach. Both methods are classified into the group of the additive fuzzy density fragmentation (AFDF) techniques, first introduced by Mezey and co-workers. Several related approaches developed by other authors can also be classified into this group of methods.

The MEDLA technique consists of several steps, outlined in Figure 12. The central idea of the method is to compute a set of charge densities of the rigid molecular fragments. This is done by calculations involving small molecules that contain the fragment of interest (so-called parent molecules). Because the parent molecules are typically small, the computations can be done at the ab initio level. It is also worth mentioning that the concept of the “parent” molecules is essentially equivalent to that of the buffer regions in the D&C scheme. Moreover, the “parent” molecules can be constructed automatically for each given system of interest, as outlined in the works of Exner and Mezey.479−481 Once the charge density (or density matrix) for the parent molecule is obtained, the fragment densities (density matrices) can be computed, using the inversion of the D&C density matrix partitioning scheme of Yang et al., eq 4.2.23a. Each fragment is equipped with an internal coordinate system for further transformation of charge densities. By performing computations on a set of parent molecules, a fragment density database can be constructed. Finally, the precomputed fragment charge densities are used to reconstruct a charge density of large molecules. This is done by translations and rotations of the building blocks to match their internal coordinate system with the position and orientation of the blocks in the corresponding fragments of the target molecule. Summation of the charge densities follows, again using eq 4.2.23a, to construct the density of the whole system from the fragment densities, as in the D&C scheme.

Utilization of fuzzy charge density boundaries is one of the central principles of the MEDLA approach. This is in contrast to defining a strict atoms-in-molecule partitioning done by Bader. In fact, the MEDLA partitioning is a close analogue of the Mulliken population analysis, extended to fragments. We want to emphasize that, although the partitioning is fuzzy in terms of the charge density, it is discrete in terms of the density matrix partitioning. We have already encountered a similar discussion of the charge density versus density matrix partitioning and their similarities to the Bader versus Mulliken population analyses in the context of the D&C approach. The same opportunities are present in the MEDLA as well.

Although the approach is conceptually appealing and is useful for quick reconstruction of the charge densities of large systems, and even for express estimation of the energy of the total system using the D&C DFT formulations, there are some notable limitations. First of all, the method lacks a self-consistent scheme to adjust fragment-in-a-molecule densities. This approach is justified in many cases by the short range of quantum interactions and the localized nature of chemical bonds. In this regard, the methodology is somewhat reminiscent of tight-binding methods, which often disregard long-range interactions. Therefore, one

Figure 12. Illustration of the MEDLA. P is the density matrix. The charge density distributions are reprinted from ref 476. Copyright 1993 American Chemical Society.


can expect that systems exhibiting charge transfer and delocalized electronic densities may lead to deteriorated accuracy. Charge transfer and delocalization effects are of special importance in excited states, in particular. Thus, the MEDLA may be difficult to apply to excited states, and is mostly targeted to the ground electronic state. The difficulties may be partially accounted for by choosing sufficiently large fragments, which, however, may lead to loss of computation speed.

Second, the construction of a fragment library assumes the utilization of fixed geometries for constituent fragments. Thus, the calculation of molecular dynamics or reactive processes is limited or may require a very large database of the charge densities for different geometries of considered fragments. On the contrary, when the intrafragment dynamics is expected to be negligible, the MEDLA may be very advantageous in terms of computational costs and accuracy. The method can be successful with macromolecular nonreactive systems, such as biopolymers and soft-matter systems, and with a range of materials science and biomedical applications.

Finally, one more limitation is associated with the information storage difficulties. As the authors indicate, the molecular charge density is stored in cubic grid files. The memory requirement rises very quickly with the system size, rendering the method inapplicable for sufficiently large systems. The latter difficulty can be mitigated, if one stores density matrix information for fragments only. Then, one can utilize that information to compute properties of the large molecules on-the-fly, similar to the procedure used in the D&C schemes.

Below we summarize the set of formulas for computation of some molecular properties within the ADMA.480 These are obtained by representing the total density matrix as the sum of the fragment densities, as in eq 4.2.25. We present these formulas explicitly, for completeness of the present account.

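electron density:

$$\rho_\sigma(\mathbf{r}) = \sum_{i,j=1}^{N_{AO}} \chi_i(\mathbf{r}) \chi_j(\mathbf{r}) \sum_A P_{A,ij,\sigma} \tag{4.2.27}$$

Hartree−Fock energy:

$$E_{el} = \frac{1}{2} \sum_A \left[ \mathrm{tr}\big(P_{A,\alpha}(H + F_\alpha)\big) + \mathrm{tr}\big(P_{A,\beta}(H + F_\beta)\big) \right] \tag{4.2.28a}$$

with

$$F_{ab,\sigma} = H_{ab} + \sum_A \sum_{c,d}^{N_{AO}} \left[ P_{A,cd}\,(ab|cd) - P_{A,cd,\sigma}\,(ad|cb) \right] \tag{4.2.28b}$$

electrostatic potential:

$$\phi(\mathbf{r}) = \phi_{nucl}(\mathbf{r}) - \phi_{el}(\mathbf{r}) \tag{4.2.29a}$$

$$\phi_{nucl}(\mathbf{r}) = \sum_{a=1}^{N_{nucl}} \frac{Z_a}{|\mathbf{r} - \mathbf{R}_a|} \tag{4.2.29b}$$

$$\phi_{el}(\mathbf{r}) = \sum_{i,j=1}^{N_{AO}} \sum_A P_{A,ij} \left\langle \chi_i(\mathbf{r}') \left| \frac{1}{|\mathbf{r} - \mathbf{r}'|} \right| \chi_j(\mathbf{r}') \right\rangle \tag{4.2.29c}$$

dipole moment:

$$\boldsymbol{\mu} = \sum_{a=1}^{N_{nucl}} Z_a \mathbf{R}_a - \sum_{i,j=1}^{N_{AO}} \sum_A P_{A,ij} \langle \chi_i | \mathbf{r} | \chi_j \rangle \tag{4.2.30}$$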

The lack of long-range interactions is one of the main shortcomings of the ADMA. It has been addressed recently. As argued in most of the early works on the MEDLA and ADMA, the electrostatic interactions can be accounted for by increasing the size of “parent” molecules used to calculate the fragment densities from first principles. It has been found that high accuracy calculations require surroundings of each fragment in a radius of more than 10 Å.480 This size slows computations significantly, and more efficient approaches were necessary. To address this efficiency issue, Exner and Mezey developed the

Figure 13. Illustration of the FA-ADMA principle. Molecular structures are adapted from ref 481. Copyright 2004 American Chemical Society.


field-adapted ADMA (FA-ADMA).481 It essentially follows the QM/MM ideas of electrostatic embedding.

The schematic procedure of the FA-ADMA is presented in Figure 13. First, the standard ADMA procedure is performed, to initialize the process. Once the charge density of the entire molecule is obtained, the Mulliken charges on all atoms can be easily computed. In the following steps the calculations on all parent molecules are performed, to obtain the density matrices for the corresponding fragments. Unlike the standard procedure, the Hamiltonian of the parent molecule includes the electrostatic field created by the point charges of the environment that is not included into the parent molecule itself. The computed fragment densities reflect the existing long-range electrostatic potential created by the environment. The resulting fragment densities are utilized to construct the charge density in the entire molecule and to update all point charges. The cycle is repeated until self-consistency of charges and fragment densities is achieved.

The FA-ADMA has become much more similar to the original D&C approach than to the original ADMA. The ADMA is essentially the tight-binding approximation to the D&C procedure. The FA-ADMA can be considered yet another approximation of the D&C scheme, one that utilizes only an aggregate quantity derived from the density matrix and approximating its effects (Mulliken charges). One can envision further approximations that mimic the effect of the entire density matrix to a higher accuracy. One such approach uses multipole expansions of the charge density, extensively utilized in the quantal force fields. Thus, we have outlined direct connections between the FA-ADMA, the quantal force fields, and the QM/MM approaches. We have also drawn the hierarchy of approximations: D&C → FA-ADMA → ADMA/MEDLA. Finally, one can find many similarities with the FMO method to be described in section 4.2.2.1. In particular, the embedding part for the incorporation of long-range effects of the environment is the same between the two formulations, as well as the QM/MM approaches.

4.2.2. Energy-Based Fragmentation.

4.2.2.1. Fragment Molecular Orbital (FMO) Method. One of the natural and chemically intuitive ways of accelerating quantum-chemical calculations, in particular for large systems, is to divide the entire molecule into smaller fragments, perform calculations on the smaller fragments, and then obtain the properties (typically energy) of the combined system using the properties of the smaller building blocks. An approach of this type, known as the fragment molecular orbital (FMO) method, was first proposed by Kitaura,482 and then extensively developed by Kitaura and Fedorov.90,483−485 To be fair, this philosophy arose a long time ago in the diatomics-in-molecules (DIM)486 and the pair interaction molecular orbital (PIMO) methods,487 but only the FMO has become a fully functional technique for sufficiently large systems.

Further development of the FMO method included its generalization to the ONIOM-type multilayer FMO method (MLFMO),488 which led to acceleration of the molecular integral calculations.483 The multilayer approach allows utilization of alternative levels of theory and different basis sets, to describe complementary parts of the system with varying accuracy and, possibly, to accelerate calculations. The FMO was extended to include correlation at the MP2,489 CIS,490 CC,491 and MCSCF492 levels. The correlation methods were included into the multilayer schemes, allowing further acceleration. A typical MCSCF calculation involved about 1300 basis functions and could be completed in less than 2 h on a standard PC with 1 GB of memory. A combination of the FMO method with the TD-DFT was developed, allowing one to access excited state properties of large systems, including excited state energy gradients.493,494 The method was extended to open-shell systems.494,495 An efficient parallelized implementation of the FMO based on the MP2 level of theory was tested. The calculation involved over 3000 atoms and 44 000 basis functions and was completed in ca. 7 min on ca. 132 000 cores.496 Very recently, an unprecedented record-breaking simulation of the fullerite system containing over $10^6$ atoms has been reported.77 The simulation relied on both FMO for linear scaling of computations and on the DFTB Hamiltonian for a smaller prefactor in the computing time. A single point calculation of such a system takes only slightly more than 1 h of the wall time on a 128-core Xeon cluster.

Extensive reviews on applications of the FMO method to large-scale systems and computation of different properties can be found elsewhere.76,90

4.2.2.1.1. Basic Formulation. To illustrate the basic idea of the FMO method, we start with the Hamiltonian of the entire system under consideration, eq 2.1.16a, which can be summarized as

$$H = \sum_{i=1}^{N} \left\{ -\frac{1}{2}\nabla_i^2 - \sum_{a=1}^{N_{nucl}} \frac{Z_a}{|\mathbf{r}_i - \mathbf{R}_a|} \right\} + \sum_{i,j;\ i>j}^{N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{4.2.31}$$

One can partition the Hamiltonian of the full system into the Hamiltonians of the subsystems, $H_I$, and the Hamiltonians describing interactions between the subsystems. In particular, we consider the fragment I, immersed (embedded) into the environment of all other fragments (Figure 14a).

The sum $\sum_{i,j;\ i>j}^{N} 1/|\mathbf{r}_i - \mathbf{r}_j|$ can be split into electronic repulsion within the fragment (yellow double arrows in Figure 14a) and interaction with the electrons of all other fragments (green arrows):

$$\sum_{i,j;\ i>j}^{N} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} = \sum_{i,j \in I;\ i>j}^{N_I} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{J \neq I} \sum_{i,j \in J;\ i>j}^{N_J} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{J \neq I} \sum_{i \in I}^{N_I} \sum_{j \in J}^{N_J} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{4.2.32}$$

where $N_I$ is the number of electrons in the group I. The last two terms in eq 4.2.32 can be grouped to produce

$$\sum_{J \neq I} \int d\mathbf{r}' \frac{\rho_J(\mathbf{r}')}{|\mathbf{r}_i - \mathbf{r}'|}$$

where $\rho_J(\mathbf{r}')$ represents the electronic density due to the fragment J. Finally, the Hamiltonian of the group I can be written as

$$H_I = \sum_{i=1}^{N_I} \left\{ -\frac{1}{2}\nabla_i^2 + V_I(\mathbf{r}_i) \right\} + \sum_{i,j;\ i>j}^{N_I} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{4.2.33}$$


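$$V_I(\mathbf{r}_i) = -\sum_{a=1}^{N_{nucl}} \frac{Z_a}{|\mathbf{r}_i - \mathbf{R}_a|} + \sum_{J \neq I} \int d\mathbf{r}' \frac{\rho_J(\mathbf{r}')}{|\mathbf{r}_i - \mathbf{r}'|} \tag{4.2.34}$$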

where $V_I(\mathbf{r}_i)$ plays the role of an effective potential that acts on the electrons of the fragment I. It is often called the embedding potential of the fragment I. Note that the motion of the electrons that “belong” to the fragment is affected by the field of all available nuclei, not only by those belonging to the subsystem. It is also affected by the electrostatic potential created by the electrons from other fragments. Thus, the coupling of a given fragment with all other fragments is mostly due to electrostatic interactions.

Solving the stationary electronic problem for all fragments {I}:

$$H_I \Psi_I = E'_I \Psi_I \tag{4.2.35}$$

subject to the constraint of the total number of electrons localized on each fragment, $N_I$, and to spatial localization of the corresponding wave functions on the fragments, one obtains the localized fragment MOs, $\Psi_I$, and the electronic energies of monomers, $E'_I$. The total energy of the monomer, $E_I$, is then obtained by adding nuclear repulsion terms to the electronic energy,

$$E_I = E'_I + \sum_{a,b \in I;\ a<b}^{N_{at}} \frac{Z_a Z_b}{|\mathbf{R}_a - \mathbf{R}_b|}$$

The electron densities of the monomers are obtained in an iterative way, similar to how the electron density matrix is iteratively obtained in a standard SCF scheme in an atomic/molecular orbital basis. In this way, changes of the electron density of one fragment induce changes of the electron density of all other fragments in a self-consistent manner. The idea of performing self-consistent calculations on sets of fragments and adjusting their embedding potentials was proposed back in 1975 as the mutually consistent field method.497

The monomer partition scheme presented above helps in reducing the computational costs drastically. Instead of solving the eigenvalue problem for N electrons that scales as $O(N^3)$, one solves only smaller problems for $N_{fr}$ subsystems with $\{N_I | I \in 1, ..., N_{fr}\}$ electrons, respectively. The computation time in the latter scheme scales as $O(\sum_{I=1}^{N_{fr}} N_I^3)$, which is significantly more favorable than that for the complete system, especially when the size of the system increases.

In addition to monomers, all possible dimers are computed, using the dimer Hamiltonian defined similarly to eq 4.2.33, but on a bigger set of atomic and orbital centers:

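$$H_{IJ} = \sum_{i=1}^{N_I + N_J} \left\{ -\frac{1}{2}\nabla_i^2 - \sum_{a=1}^{N_{nucl}} \frac{Z_a}{|\mathbf{r}_i - \mathbf{R}_a|} + \sum_{K \neq I,J} \int d\mathbf{r}' \frac{\rho_K(\mathbf{r}')}{|\mathbf{r}_i - \mathbf{r}'|} \right\} + \sum_{i,j;\ i>j}^{N_I + N_J} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{4.2.36}$$

$$H_{IJ} \Psi_{IJ} = E_{IJ} \Psi_{IJ} \tag{4.2.37}$$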

leading to electronic energies of the dimers, $E_{IJ}$, and the corresponding dimer-localized fragment MOs, $\Psi_{IJ}$. The total energy of a dimer is obtained by adding the corresponding nuclear repulsion terms, similar to how this is done for monomers. The frozen monomer densities, $\rho_K(\mathbf{r}')$, are typically used for simplicity and computational efficiency. The approximation implies that there is no need for an iterative solution of the fragment dimer stationary Schrödinger equation, eq 4.2.37.

It should be noted that although the Hamiltonians of the monomers and dimers include electrostatic interactions with the electronic densities of surrounding fragments via the

$$\sum_{K \neq I,J} \int d\mathbf{r}' \frac{\rho_K(\mathbf{r}')}{|\mathbf{r}_i - \mathbf{r}'|}$$

terms, the energy of the dimer is not equal to the sum of the monomer energies: $E_{IJ} \neq E_I + E_J$. The difference originates from the interfragment electronic interaction shown by the white arrows in Figure 14b. It can be expressed as

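$$H_{IJ} - (H_I + H_J) = \sum_{i,j \in I \cup J;\ i>j}^{N_I + N_J} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} - \left( \sum_{i,j \in I;\ i>j}^{N_I} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{i,j \in J;\ i>j}^{N_J} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \right) = \sum_{i \in I}^{N_I} \sum_{j \in J}^{N_J} \frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} \tag{4.2.38}$$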

The types of interactions included in prototypical monomer and dimer fragment Hamiltonians are represented pictorially in Figure 14.

The difference between the dimer energy and the sum of monomer energies:

$$\Delta E_{IJ} = E_{IJ} - (E_I + E_J) \tag{4.2.39}$$

is crucial for an accurate description of the energy of the combined system. It reflects the complex nonadditive effects. The total energy of the system is expressed within the FMO as

$$E = \sum_{I,J=1;\ I>J}^{N_{fr}} E_{IJ} - (N_{fr} - 2) \sum_{I=1}^{N_{fr}} E_I \tag{4.2.40}$$

This expression can be rewritten in a more systematic way:

$$E = \sum_{I=1}^{N_{fr}} E_I + \sum_{I,J=1;\ I>J}^{N_{fr}} \Delta E_{IJ} \tag{4.2.41}$$
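As a quick numerical check, the two forms of the FMO2 energy, eqs 4.2.40 and 4.2.41, can be evaluated side by side. The sketch below uses made-up monomer and dimer energies for three fragments; the function names and numbers are purely illustrative and do not come from any calculation.

```python
from itertools import combinations

def fmo2_energy_pairwise(e_mono, e_dimer):
    """Eq 4.2.41: sum of monomer energies plus pair corrections of eq 4.2.39."""
    energy = sum(e_mono.values())
    for i, j in combinations(sorted(e_mono), 2):
        energy += e_dimer[(i, j)] - (e_mono[i] + e_mono[j])
    return energy

def fmo2_energy_direct(e_mono, e_dimer):
    """Eq 4.2.40: sum of dimer energies minus (N_fr - 2) times the monomer sum."""
    n_fr = len(e_mono)
    return sum(e_dimer.values()) - (n_fr - 2) * sum(e_mono.values())

# Made-up energies (arbitrary units) for three fragments
e_mono = {"A": -76.00, "B": -76.10, "C": -75.90}
e_dimer = {("A", "B"): -152.12, ("A", "C"): -151.91, ("B", "C"): -152.03}

e_pairwise = fmo2_energy_pairwise(e_mono, e_dimer)
e_direct = fmo2_energy_direct(e_mono, e_dimer)
print(e_pairwise, e_direct)
```

Both functions return the same number up to rounding, because every monomer takes part in $N_{fr} - 1$ dimers and is therefore subtracted $N_{fr} - 2$ times in total.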

Equation 4.2.41 is a special case of the generalized many-body expansion (GMBE) up to m terms:

Figure 14. Definition of partitioning of the system into monomers (a) and dimers (b) embedded into the environment of all other fragments. Small blue circles represent atoms. Regions represent fragments.


$$E_m = \sum_{I=1}^{N_{fr}} E_I + \sum_{I=1}^{N_{fr}} \sum_{J=1;\ J>I}^{N_{fr}} \Delta E_{IJ}^{(2)} + \sum_{I=1}^{N_{fr}} \sum_{J=1;\ J>I}^{N_{fr}} \sum_{K=1;\ K>J}^{N_{fr}} \Delta E_{IJK}^{(3)} + \ldots \tag{4.2.42}$$

where the summation is continued up to m-body corrections, $\Delta E^{(m)}_{I_1 I_2 \ldots I_m}$. The terms $\Delta E_{IJ} = \Delta E_{IJ}^{(2)}$ are the second-order (pairwise) perturbation corrections. They describe those interactions within the dimers that are absent in the monomers (see Figure 14b). Similarly to the $\Delta E_{IJ}$ terms, the three-body corrections $\Delta E_{IJK} = \Delta E_{IJK}^{(3)}$ capture the interactions that are absent at the monomer and dimer levels, but are present at the trimer level:

$$\Delta E_{IJK} = E_{IJK} - \Delta E_{IJ} - \Delta E_{IK} - \Delta E_{JK} - (E_I + E_J + E_K) \tag{4.2.43}$$
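The recursive structure of the many-body expansion can be sketched in a few lines. In the illustrative code below, the n-body correction of a cluster is defined as its energy minus all lower-order corrections of its subclusters, with the one-body term being the monomer energy itself. The function name and the energies are made up for demonstration.

```python
from itertools import combinations

def mbe_energy(energy, fragments, order):
    """Many-body expansion truncated at the given order.

    energy maps sorted tuples of fragment labels to cluster energies.
    """
    delta = {}   # n-body corrections keyed by cluster
    total = 0.0
    for m in range(1, order + 1):
        for cluster in combinations(fragments, m):
            lower = sum(delta[sub]
                        for k in range(1, m)
                        for sub in combinations(cluster, k))
            delta[cluster] = energy[cluster] - lower
            total += delta[cluster]
    return total

# Made-up cluster energies (arbitrary units)
fragments = ("A", "B", "C")
energy = {
    ("A",): -1.0, ("B",): -2.0, ("C",): -3.0,
    ("A", "B"): -3.1, ("A", "C"): -4.2, ("B", "C"): -5.3,
    ("A", "B", "C"): -6.7,
}
e2 = mbe_energy(energy, fragments, order=2)
e3 = mbe_energy(energy, fragments, order=3)
print(e2, e3)
```

For a system of three fragments, the expansion truncated at third order returns the trimer energy itself, which serves as a consistency check on the signs of the three-body correction.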

The expansion eq 4.2.42 has been generalized in the recent work of Fedorov et al.498 The authors point out that the many-body expansion is not limited to energy: an arbitrary property A can be expressed this way. Further, they suggest that the higher-order terms could be incorporated via the formally two-body corrections:

$$A_M = \sum_{I=1}^{N_{fr}} A_I + \sum_{I=1}^{N_{fr}} \sum_{J=1;\ J>I}^{N_{fr}} \Delta A'_{IJ} \tag{4.2.44}$$

where the effective two-body correction, $\Delta A'_{IJ}$, is given by

$$\Delta A'_{IJ} = \Delta A_{IJ} + \sum_{m=3}^{M} \sum_{I_1 > I_2 > \ldots > I_{m-2}}^{N_{fr}} a_{IJ, I_1 \ldots I_{m-2}}\, \Delta A_{IJ, I_1 \ldots I_{m-2}} \tag{4.2.45}$$

and the weighting coefficient $a_{IJ, I_1 \ldots I_{m-2}}$ is computed according to the formula

$$a_{IJ, I_1 \ldots I_{m-2}} = \frac{|\Delta A_{IJ}|}{\sum_{K>L;\ K,L \in \{I, J, I_1, \ldots, I_{m-2}\}} |\Delta A_{KL}|} \tag{4.2.46}$$

The adaptive weights in eq 4.2.46 are needed to account for varying contributions of different fragment pairs, as discussed in the original work.498 The work also contains simpler but less general approximations.

The FMO formulas presented above introduce many-body effects via a formally two-body expansion. According to the FMO construction, the energies of the monomers and dimers (in the FMO2) are computed taking into account interaction with all other surrounding fragments (embedding potential). Computation of the molecular integrals included in the embedding potential of each fragment is one of the time-critical points of the FMO method. In order to develop efficient approximations to the electrostatic part of the embedding potential, it is important to separate the embedding potential from the intrinsic potential of each fragment. Such an energy decomposition was developed by Nakano et al. in 2002.483 In the context of the FMO2 method, eq 4.2.41 can be rewritten as

$$E = \sum_{I=1}^{N_{fr}} E'_I + \sum_{I,J=1;\ I>J}^{N_{fr}} \left[ \Delta E'_{IJ} + \mathrm{Tr}(\Delta P_{IJ} V_{IJ}) \right] \tag{4.2.47a}$$

$$E'_X = E_X - \mathrm{Tr}(P_X V_X), \quad X = I, IJ \tag{4.2.47b}$$

$$\Delta P_{IJ} = P_{IJ} - (P_I \oplus P_J) \tag{4.2.47c}$$

Here, $E'_X$ is the intrinsic energy of the fragment (X = I for monomer, X = IJ for dimer) without the embedding potential, $V_X$ is the embedding (environment) potential, and $P_X$ is the density matrix of the fragment X. Equations 4.2.47 are equivalent to eq 4.2.41, but are much more convenient for introducing approximations to reduce computational costs associated with the embedding potential. For sufficiently separated fragments, the energy of the dimer can be approximated by the sum of the monomer energies plus electrostatic interactions between them. In addition, Mulliken charges can be used instead of the full density matrix, if the interfragmental distance is large enough. For details and discussion of these approximations, we refer the reader to the original work483 and to the recent review.498

We would like to emphasize two points. First, the energy decomposition in the FMO method in eqs 4.2.47 has a useful interpretation. The term

$$E_{expl}^{FMO2} = \sum_{I,J=1;\ I>J}^{N_{fr}} \mathrm{Tr}(\Delta P_{IJ} V_{IJ})$$

accounts for explicit many-body effects due to charge transfer between fragments I and J under the influence of the electrostatic potential of all other fragments. The term

$$E_{impl}^{FMO2} = \sum_{I=1}^{N_{fr}} E'_I + \sum_{I,J=1;\ I>J}^{N_{fr}} \Delta E'_{IJ}$$

accounts for implicit many-body effects. This is because the energies of the monomers and dimers are affected by the embedding potential that incorporates the many-body effects. Note that the charge transfer energy is largely composed of the explicit contribution, but also contains a fraction of the implicit one.

The second point concerns the utility of semiempirical schemes. Computation of molecular integrals constitutes a major bottleneck of the FMO method. Therefore, approximate but significantly faster schemes,483 similar to those utilized in the semiempirical methods, are favored over the more accurate, but significantly slower HF-type descriptions. Combination of the FMO method with another semiempirical approach, the DFTB method, allowed reaching the unprecedented scale of quantum calculations employing over $10^6$ atoms.77

By systematically including the higher-order corrections, one can achieve the desired accuracy for large systems. Depending on the perturbation (correction) level, one can identify FMO2, FMO3, etc. methods. The three-body terms may be required sometimes to achieve acceptable accuracy.484,499 In particular, explicit water solvation of biopolymers or accurate calculation of dipole moments may require the use of three-body terms. The role of the four-body correction was assessed by Nakano et al.500 They found that such a correction is important, if a nonconventional partitioning of peptides is used (segmentation of residues of side and main chain). The fourth-order FMO was also found essential for accurate modeling of systems having three-dimensional structures, such as adamantane-shaped clusters. The higher-order terms are negligible most of the time.

As the size of the system increases, the number of fragments into which the system should be partitioned increases as well. This naturally leads to an increased number of possible n-mers that need to be computed. Specifically, the number of n-mers that can be formed from $N_{fr}$ monomers is given by the combinatorial number $C_{N_{fr}}^{n} = N_{fr}!/[n!(N_{fr} - n)!]$. Because the number grows very


rapidly, it is common to include only dimers in the calculations on large systems. The approximation provides sufficient accuracy for most applications and, therefore, is well justified. A nearly linear scaling can be achieved using efficient approximations for computation of the energies of far separated dimers.483
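To give a feeling for how quickly the number of n-mers grows with the expansion order, the binomial count can be evaluated directly. The following is only an illustrative sketch; the function name `count_nmers` and the choice of 100 fragments are assumptions made for this example and are not taken from any FMO implementation.

```python
from math import comb


def count_nmers(n_fragments, max_order):
    """Return {n: number of distinct n-mers} for n = 1..max_order.

    The count is the binomial coefficient N_fr! / [n! (N_fr - n)!].
    """
    return {n: comb(n_fragments, n) for n in range(1, max_order + 1)}


if __name__ == "__main__":
    # Hypothetical system partitioned into 100 fragments.
    for order, count in count_nmers(100, 4).items():
        print(f"{order}-mers: {count}")
```

Already at the trimer level the number of subsystem calculations exceeds the number of monomers by more than 3 orders of magnitude, which is why distance-based screening of far separated n-mers matters in practice.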

Additional savings may be obtained if many dimers are similar to each other or are composed of similar monomers.

4.2.2.1.2. Elaborations and Derivatives. In the original FMO method, as well as in the MBE approach, the fragments into which the system is partitioned are assumed nonoverlapping. Formulations of the fragment-based methods for overlapping fragments were also proposed. For example, Fedorov proposed an extension of the FMO method suitable to partitions in solids.501 The improved scheme was applied to zeolites.

4.2.2.1.3. Implementations. The FMO and its various improvements are implemented in GAMESS and AbInit-MP.490 The Web interface, input preparation, and analysis tools were developed by Fedorov and co-workers.

4.2.2.2. Molecular Fractionation with Conjugate Caps (MFCC). The MFCC method was developed by Zhang and co-workers,502 as an approach to the efficient calculation of the interaction of a protein, P, with an arbitrary molecule, M (e.g., solvent). The protein molecule, P, can be represented schematically as a sequence of connected amino acid fragments, Ai:

$$P = nA_1 - A_2 - A_3 - \cdots - A_N c \tag{4.2.48}$$

where nA1 and ANc denote N- and C-terminals of the protein:

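$$nA_1 = \mathrm{NH_3^+}{-}\mathrm{R}_1\mathrm{C}_\alpha\mathrm{H}{-}\mathrm{CONH} \tag{4.2.49a}$$

$$A_N c = {-}\mathrm{R}_N\mathrm{C}_\alpha\mathrm{H}{-}\mathrm{COO}^- \tag{4.2.49b}$$

and Ai is the amino acid residue:

$$A_i = {-}\mathrm{R}_i\mathrm{C}_\alpha\mathrm{H}{-}\mathrm{CONH} \tag{4.2.50}$$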

The fractionation scheme for proteins is presented in Figure 15.

To compute the protein−molecule interaction energy, E(P ∪ M), the protein is subdivided into fragments, chosen in the MFCC to be the amino acid residues. Each fragment is capped with a pair of conjugate cap groups: Ci* and Ci. The interaction energy of the capped fragment with the molecule M, E(Ci−1* Ai Ci ∪ M), is computed for each fragment i, along with the energy of interaction of the pair of conjugate caps with the molecule, E(Ci* Ci ∪ M). The protein−molecule interaction energy is then computed as

$$E(P \cup M) = \sum_{i=1}^{N} E(C^{*}_{i-1} A_i C_i \cup M) - \sum_{i=1}^{N-1} E(C^{*}_{i} C_i \cup M) \tag{4.2.51}$$
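The bookkeeping behind eq 4.2.51 can be sketched in a few lines. This is a minimal illustration that assumes the interaction energies of the N capped fragments and of the N − 1 conjugate-cap pairs have already been computed by some electronic structure code; the function name and the numerical values are hypothetical.

```python
def mfcc_interaction_energy(capped_fragment_energies, cap_pair_energies):
    """Assemble E(P u M) from precomputed pieces, following eq 4.2.51.

    capped_fragment_energies: E(C*_{i-1} A_i C_i u M) for i = 1..N
    cap_pair_energies:        E(C*_i C_i u M)         for i = 1..N-1
    """
    n_fragments = len(capped_fragment_energies)
    if len(cap_pair_energies) != n_fragments - 1:
        raise ValueError("expected N - 1 cap-pair energies for N fragments")
    # Each cut bond contributes one cap pair that was counted twice
    # (once in each neighboring capped fragment), so it is subtracted.
    return sum(capped_fragment_energies) - sum(cap_pair_energies)


if __name__ == "__main__":
    # Hypothetical energies (arbitrary units) for a three-residue peptide.
    print(mfcc_interaction_energy([-1.0, -2.0, -3.0], [-0.5, -0.25]))
```

Because every term is an independent calculation on a small capped fragment, the individual energies can be evaluated in parallel before this final summation.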

The role of the conjugate caps is 2-fold:

• to better represent the electronic structure of the system around the covalent bond being cut; in this regard, the conjugate caps play a role very similar to that of the buffer regions in the D&C methods

• to better approximate the environment of the corresponding amino acid residue

To satisfy the second requirement, the most natural choice for the capping groups is based on the nearby residue:

* ==

= −

=

− = −

= − −= − − =

α

α

α

+

+

⎪⎧⎨⎩

⎧⎨⎪

⎩⎪

Ci

i N

C

i N

i N

NH (NH ), 0

NH , 1, ..., 1

R C H , 0, ..., 1

A C R C H COO (A CR C H COOH) ,

i

i

i

i i N i i

N

2 3

2

1 2 (4.2.52)

Schematically, the choice of the capping fragment for the system shown in Figure 15 is shown in Figure 16.

The method was combined with the conductor-like polarizable continuum model (CPCM) to describe solvation of proteins.503

4.2.2.3. Systematic Molecular Fragmentation (SMF) Approach. The systematic molecular fragmentation approach504 was developed by Deev and Collins on the basis of Chen’s MFCC method.505 Further elaborations were performed by the Collins group and included extensions to treat rings,506 long-range506,507 and nonbonded508 interactions, and to compute molecular potential energy surfaces507,509 and electrostatic potentials.510 The method was extended to periodic systems.507,511 It allows utilization of high-level correlated ab initio approaches, such as MP2 and CCSD(T). The method was applied to model lattice dynamics and to compute phonon spectra and neutron scattering intensities.507 Very recently, an improved version of the algorithm (the so-called SMF by annihilation, SMFA) has been applied to compute the relative energies of a series of proteins.512

The idea of the SMF method is very simple: the molecule is hierarchically divided into a set of overlapping fragments:

$$M_{0,n} \rightarrow F_1 + F_2 \tag{4.2.53}$$

$$F_1 \cap F_2 \neq \varnothing \tag{4.2.54}$$

Each fragment is capped with hydrogens:

$$F_i \rightarrow F_i \cup H \tag{4.2.55}$$

and the resulting smaller molecules are used to repeat the process recursively:

$$M_{i,n+1} = F_i \cup H \tag{4.2.56}$$

In the above equations, the index n indicates the level of the fragmentation hierarchy. The energy of the molecule is approximated by a simple expression:

$$E(M_{0,n}) \approx E(M_{1,n+1}) + E(M_{2,n+1}) - E((F_1 \cap F_2) \cup H) \tag{4.2.57}$$

which can be applied up to a specified level of fragmentation.

The two main restrictions are (a) molecules are broken only through single bonds, not multiple bonds, and (b) no explicit charge separation across the system occurs. The derivation of the SMF implies that the broken bonds are sufficiently distant from each other. Therefore, the larger the resulting fragments, the more accurate the computed energies, although at the cost of reduced computation speed.

Figure 15. Schematic representation of the fractionation scheme of a protein, used in the MFCC method.

The choice of the fragmentation scheme is critical. The fragmentation can be performed systematically, at different levels. It is illustrative to consider a chain of five functional groups (fragments), F1F2F3F4F5. The possible division schemes are

$$F_1F_2F_3F_4F_5 \rightarrow F_1F_2 + F_2F_3 + F_3F_4 + F_4F_5 - F_2 - F_3 - F_4 \quad (\text{level 1}) \tag{4.2.58a}$$

$$F_1F_2F_3F_4F_5 \rightarrow F_1F_2F_3 + F_2F_3F_4 + F_3F_4F_5 - F_2F_3 - F_3F_4 \quad (\text{level 2}) \tag{4.2.58b}$$

$$F_1F_2F_3F_4F_5 \rightarrow F_1F_2F_3F_4 + F_2F_3F_4F_5 - F_2F_3F_4 \quad (\text{level 3}) \tag{4.2.58c}$$
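The pattern of eqs 4.2.58a−c generalizes to a linear chain of any length: at level n, the positive terms are all windows of n + 1 consecutive groups, and the negative terms are the n-group overlaps of neighboring windows. The sketch below enumerates these terms. It is an illustration for unbranched chains only; the function name is an assumption, and the treatment of rings, branching, and hydrogen caps used in actual SMF codes is not covered.

```python
def smf_chain_fragments(n_groups, level):
    """Enumerate signed SMF terms for a linear chain of groups 1..n_groups.

    Returns a list of (sign, groups) tuples, where groups is a tuple of
    consecutive group indices, e.g. (+1, (1, 2)) stands for +E(F1F2).
    """
    if not 1 <= level < n_groups:
        raise ValueError("level must satisfy 1 <= level < n_groups")
    size = level + 1
    # Positive terms: every window of `size` consecutive groups.
    plus = [(+1, tuple(range(s, s + size)))
            for s in range(1, n_groups - size + 2)]
    # Negative terms: the overlap (of length `level`) of neighboring windows.
    minus = [(-1, tuple(range(s, s + level)))
             for s in range(2, n_groups - level + 1)]
    return plus + minus


if __name__ == "__main__":
    for lvl in (1, 2, 3):
        terms = smf_chain_fragments(5, lvl)
        text = " ".join(
            ("+" if sign > 0 else "-") + "".join(f"F{g}" for g in groups)
            for sign, groups in terms
        )
        print(f"level {lvl}: {text}")
```

A useful sanity check is that every group enters with a net coefficient of 1 once the signs are summed, so that an energy that is strictly additive over groups is reproduced exactly at any level.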

The higher the level, the better the accuracy.

Because of the recursive nature, the size of each fragment in the final level can be kept sufficiently small, to make calculations of each subsystem as fast as possible and to achieve linear scaling of CPU time with the size of the whole system. The energy gradients and higher-order derivatives can be easily obtained, making the method applicable to geometry optimization and molecular dynamics problems. On the other hand, the method is nonvariational and is based solely on energies of the system. Therefore, the electronic structure of the original system cannot be obtained, restricting the method to the ground electronic state. In principle, a time-dependent formulation may be envisioned, to access properties of excited states, similarly to the extensions of the FMO method.493 However, excitations are often associated with notable charge delocalization over the entire system, which may violate the basic assumptions of the method.

4.2.2.4. Molecular Tailoring Approach (MTA). Initially, the MTA was extensively developed by the Gadre group.513−516 They developed a method for theoretical synthesis of molecular electrostatic potentials of large molecules using the information obtained by computing the electronic structure of a set of overlapping fragments.513 Therefore, the method follows the general scheme of the D&C algorithm. However, unlike the D&C procedure, the electronic densities of the fragments are not affected by those of all other fragments, and no iterative procedure is required. Instead of the partitioning schemes, such as eqs 4.2.23a and 4.2.23b, coupled with the iterative SCF procedure for all fragments, the method simply composes the densities of individual fragments in a proper way, to approximate the density matrix and energy of the whole system. The density matrix composed of the subsystem density matrices stitched together may not produce the population equal to the total number of electrons. An appropriate scaling factor is computed in this case, and the density matrices of the subsystems are rescaled.

Because the MTA does not adjust the density matrices of the subsystems self-consistently, the choice of the partitioning and the overlap scheme is especially critical for this approach. In particular, double and triple bonds, as well as aromatic and small rings, should not be broken into different fragments. The fragments should contain a connected graph and should be sufficiently large to simulate accurately the properties of the composed molecule. To facilitate fragmentation, an automatic partitioning algorithm was developed by the Gadre group.514

The non-self-consistent nature of the MTA makes it very similar to other methods, notably the density builder approach (MEDLA or ADMA) of Mezey and co-workers, and the FMO of Fedorov and Kitaura. The similarity with the first method comes from the composition procedure for the density matrix (or charge density). The procedure may or may not be self-consistent in the density builder approaches. The similarity with the FMO method comes from the fact that both methods are noniterative at the total energy level and can be formulated merely in terms of the total energies.

The method was extended to the MP2 level514 and to periodic systems.517 The automatic fragmentation procedure was developed.514,518 Following the initial attempts to perform MTA-based optimization,519 a recent modification of the original MTA, the cardinality-guided MTA (CG-MTA), incorporates energy gradients and Hessians for structural optimization.518

The CG-MTA is based on set inclusion and exclusion principles. The molecular system is initially partitioned into fragments on the basis of a specified maximal fragment size and the so-called R-goodness of each atom in a given fragment. The latter parameter is used to assign each atom to a specific fragment. The initially prepared fragments are capped by hydrogen atoms, so as to saturate dangling bonds. Once the set of fragments, {Fi}, is generated, one can utilize Venn diagrams to compute the prefactor specifying the contribution of the energy of each fragment, {Ei}, to the total energy of the system. The prefactor is computed as (−1)^(K−1), where K is the number of fragments involved in the intersection. The basic scheme of the CG-MTA is presented in Figure 17.

The total energy is computed on the basis of the energies of all fragments and their intersections. In the case of two fragments, the formula has a very simple form:

$$E = E(F_1) + E(F_2) - E(F_1 \cap F_2) \tag{4.2.59}$$

For the system shown in Figure 17b the expression also includes a three-fragment term:

$$E = E(F_1) + E(F_2) + E(F_3) - E(F_1 \cap F_2) - E(F_1 \cap F_3) - E(F_2 \cap F_3) + E(F_1 \cap F_2 \cap F_3) \tag{4.2.60}$$

Figure 16. Schematics showing the fragmentation of a polypeptide in the MFCC method. The conjugate caps are marked by blue (C-terminus) and red (N-terminus) boxes. The fragments are marked by green boxes. The molecules to the right of the capped fragments 0 and 1 represent the total addition to the bare fragment. The energy of this additional molecule must be subtracted from the energy of the capped fragment to get the energy contribution of a fragment.


Finally, the expression for the general case is

$$E = \sum_{i} E(F_i) - \sum_{i<j} E(F_i \cap F_j) + \sum_{i<j<k} E(F_i \cap F_j \cap F_k) + \cdots \tag{4.2.61}$$
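The inclusion−exclusion sum of eq 4.2.61 can be written compactly once the fragments are represented as sets of atom indices. The sketch below is a simplified illustration: the function names are assumptions, the energy callback `energy_of` stands for whatever electronic structure calculation is used, and the hydrogen caps that saturate dangling bonds are not modeled. The loop over all combinations scales exponentially with the number of fragments, so it is meant only for small examples.

```python
from itertools import combinations


def inclusion_exclusion_energy(fragments, energy_of):
    """Evaluate eq 4.2.61 for overlapping fragments.

    fragments: list of frozensets of atom indices
    energy_of: callable returning the energy of a set of atoms
    """
    total = 0.0
    for k in range(1, len(fragments) + 1):
        sign = (-1) ** (k - 1)  # prefactor for a K-fold intersection
        for combo in combinations(fragments, k):
            overlap = frozenset.intersection(*combo)
            if overlap:  # empty intersections contribute nothing
                total += sign * energy_of(overlap)
    return total


if __name__ == "__main__":
    frags = [frozenset({1, 2, 3}), frozenset({3, 4, 5}), frozenset({5, 6, 1})]
    # Toy additive "energy": one unit per atom, so the exact answer is
    # simply the number of atoms in the union of the fragments.
    print(inclusion_exclusion_energy(frags, lambda atoms: float(len(atoms))))
```

With a strictly additive toy energy the expansion is exact, which makes it a convenient check that every atom is counted once; for real energies the result is only as good as the locality assumption behind the fragmentation.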

The expressions for the energy gradients and Hessians are straightforward. They are obtained by differentiation of the energy expression, eq 4.2.61, with respect to the coordinates of interest:

$$\frac{\partial E}{\partial X_a} = \sum_{i} \frac{\partial E(F_i)}{\partial X_a(F_i)} - \sum_{i<j} \frac{\partial E(F_i \cap F_j)}{\partial X_a(F_i \cap F_j)} + \sum_{i<j<k} \frac{\partial E(F_i \cap F_j \cap F_k)}{\partial X_a(F_i \cap F_j \cap F_k)} + \cdots \tag{4.2.62}$$

where the coordinate Xa(S) of the variable a belongs to the set S. The energy formula eq 4.2.61 generalizes similar expressions obtained by Deev and Collins in their SMF approach,504 and by Chen et al. in their molecular fractionation with conjugate caps (MFCC).505

4.2.2.5. Energy-Based D&C with Charge Conservation (E-EDC). Energy-based divide-and-conquer (EDC) with charge conservation (E-EDC) was developed by Song, Li, and Fan.520 They pointed out that the major errors in the energy-based fragmentation schemes originate from artificial charge transfer between fragments, which ultimately violates conservation of the total charge of the whole system. The E-EDC method extrapolates the energy value computed with a nonconserved charge to the more accurate value that corresponds to the conserved charge. The mathematical and physical foundation of the method can be summarized as follows.

Similar to the SCC-DFTB family of methods, one expands the DFT energy in a Taylor series with respect to the electron density fluctuation, Δρ:

$$\Delta E_i = E(\rho_i + \Delta\rho) - E(\rho_i) = a_i \Delta\rho + b_i (\Delta\rho)^2 + \cdots \tag{4.2.63}$$

where index i refers to the spatial point at which the charge density, ρi, is changed the most. Apparently, the approximation is derived with the intent to relate this point charge density to the classical point charges. In a more rigorous approach, the electron density fluctuation may be spread out over a notable spatial region. The local approximation may not always be accurate. Nonetheless, the point charges provide the most natural interpretation of the quantities ρi in the E-EDC context.

The charge transfer, Δρ, is assumed to be relatively small, such that the Taylor expansion eq 4.2.63 can be truncated after the quadratic terms. One considers charge transfer between the sites i and j, and the associated change of the total energy:

$$\Delta E_i = a_i \Delta\rho + b_i (\Delta\rho)^2 \tag{4.2.64a}$$

$$\Delta E_j = a_j (-\Delta\rho) + b_j (-\Delta\rho)^2 \tag{4.2.64b}$$

$$\Delta E = \Delta E_i + \Delta E_j = (a_i - a_j)\Delta\rho + (b_i + b_j)(\Delta\rho)^2 \tag{4.2.64c}$$

Using the variational principle and the DFT, stating that any charge density fluctuation with respect to the ground state density will raise the energy, eq 4.2.64c, the authors show that, under the above approximations, the gradient of the energy with respect to density is constant at any point:

$$a_i = a_j = \cdots \equiv a = \frac{\partial E}{\partial \rho} \tag{4.2.65}$$

Assuming that the second-order terms in Δρ can be neglected, and generalizing eqs 4.2.64 to many fragments, a linear relationship between the total energy change, ΔE = ∑i ΔEi, and the total charge change, ΔQ = ∑i Δρi, can be derived:

$$\Delta E \approx a\,\Delta Q \tag{4.2.66}$$

The subsequent computations are organized to compute the points necessary for the extrapolation based on eq 4.2.66. First, the system is fragmented in different ways (monomers, dimers, trimers, etc.) in a manner similar to the FMO or GMBE formulations. All subsystems are capped with H atoms if chemical bonds are broken. Second, the total energy of the entire system at a given order of the many-body expansion, Em, can be computed according to the general formula (see section 4.2.2.1). In the reported scheme, the authors neglect the effect of point charges of the environment on the energy of the embedded fragment. Each of the fragments (monomers, dimers, etc.) is computed independently, in a vacuum.

The total charge of the system at a given level of MBE is computed as

$$Q_m = \sum_{A} Q_A^{m}(A) \tag{4.2.67}$$

Figure 17. (a) Fragmentation in the CG-MTA. (b) Venn diagram showing set inclusion and exclusion principle, as applied to overlapping fragments. The regions indicated by the black letters produce positive prefactors in the energy expression, while the regions indicated by the blue letters produce negative energy prefactors. Fi denotes fragment i.


The quantity QAm(A) is computed using the MBE, similar to the energies:

$$Q_A^{m}(A) = Q_A(A) + \sum_{I=1}^{N_{\mathrm{fr}}} \Delta Q_{AI}^{(2)}(A) + \sum_{I=1}^{N_{\mathrm{fr}}} \sum_{J=1}^{N_{\mathrm{fr}}} \Delta Q_{AIJ}^{(3)}(A) + \cdots \tag{4.2.68}$$

$$\Delta Q_{AI}^{(2)}(A) = Q_{AI}(A) - Q_A(A) \tag{4.2.69}$$

$$\Delta Q_{AIJ}^{(3)}(A) = Q_{AIJ}(A) - Q_{AI}(A) - Q_{AJ}(A) + Q_A(A) \tag{4.2.70}$$

QA(A) is the total charge of all atoms in the subsystem A, excluding the cap atoms. QAI(A) and QAIJ(A) are the total charges belonging to the fragment A in the two-body AI and three-body AIJ subsystems, respectively.

With all quantities needed for the linear extrapolation of the energies, eq 4.2.66, defined, the extrapolation can be written:

$$E \approx E_M = E_m - \frac{E_m - E_{m-1}}{Q_m - Q_{m-1}}\,(Q_m - Q_M) \tag{4.2.71}$$
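Equation 4.2.71 is a two-point linear extrapolation: the slope a of eq 4.2.66 is estimated from two consecutive orders of the many-body expansion, and the energy is then shifted to the point where the total charge equals its exact value. A minimal sketch follows; the function and argument names, as well as the numbers in the example, are assumptions made for illustration only.

```python
def eedc_extrapolate(e_m, e_m1, q_m, q_m1, q_exact):
    """Charge-conserving energy estimate following eq 4.2.71.

    e_m, q_m:   energy and total charge at MBE order m
    e_m1, q_m1: energy and total charge at MBE order m - 1
    q_exact:    known total charge of the whole system (Q_M = Q)
    """
    if q_m == q_m1:
        raise ValueError("charges at the two orders coincide; slope undefined")
    slope = (e_m - e_m1) / (q_m - q_m1)  # estimate of a = dE/dQ, eq 4.2.66
    return e_m - slope * (q_m - q_exact)


if __name__ == "__main__":
    # Hypothetical values: a neutral system whose fragment charges sum to
    # 0.3 at the two-body level and to 0.1 at the three-body level.
    print(eedc_extrapolate(e_m=-10.0, e_m1=-9.0, q_m=0.1, q_m1=0.3, q_exact=0.0))
```

The guard against equal charges reflects a practical limitation of any two-point extrapolation: when the charge error does not change between the two orders, the slope cannot be estimated from them.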

where M is the highest order of MBE included in the scheme. In other words, the total energy is approximated by the M-body expansion, but with charge conservation. The total charge is approximated at the M-body level, QM, by the total charge of the system, Q, which is known as an input to the calculations: QM = Q.

4.2.2.6. Other Related Methods. There are many more fragmentation-based linear-scaling methods. They follow the ideas of the methods presented above, but adopt a slightly different formulation and introduce corrections for missing effects. Therefore, rather than discussing these methods in detail, we list them here for completeness of the review:

(1) molecules-in-molecules (MIM),521 utilizing the ideas of the ONIOM and energy-based fragmentation schemes like the FMO

(2) many-overlapping body (MOB) expansion,522 a two-body expansion using overlapping fragments; the method was utilized for computations of large polypeptide and alkane chains

(3) dimer of dimers (DOD) method by Saha and Raghavachari523

(4) generalized energy-based fragmentation (GEBF) by Li et al.454,524,525

(5) kernel energy method (KEM) by Karle et al.526

(6) hybrid many-body interaction (HMBI) by Beran and Nanda;527,528 this method is close to quantum force fields and hybrid QM/MM approaches

(7) multilevel fragment-based approach (MFBA) by Rezac and Salahub529

4.2.3. Dynamical Growth with Localization. 4.2.3.1. Elongation Method (ELG). The elongation method was originally proposed by Imamura and co-workers as a computational approach to electronic structure calculations that mimic a theoretical synthesis of polymers.530 Hence, the first applications were to one-dimensional systems, but later extensions to two- and three-dimensional systems531,532 were also developed. The method was also actively developed by the Aoki lab. Further elaborations included an improved localization scheme,533 calculation of the local density of states,534 implementation of an efficient cutoff scheme for additional acceleration of ab initio calculations,535−539 and extensive analysis of performance and efficiency.537 The method was applied to study periodic540 and aperiodic polymers at the semiempirical, ab initio HF, and DFT levels of theory.541−544 Recently, the method has been extended to include correlation at the MP2 level,545 and to describe excited states at the CIS546 and linear-response time-dependent Hartree−Fock547 levels. Geometry optimization548 and molecular dynamics549 schemes based on the ELG method were developed. The improved version for delocalized systems was also described.550

Before providing the mathematical foundations of the method, we outline its main idea. The short-range nature of chemical bonding constitutes the major assumption of the method. (Strictly) localized molecular orbitals ((S)LMOs) are the central tool of the ELG. The system of interest can be subdivided into the initial fragment F0 and a set of monomers of relatively small size, Mn, n = 1, ..., Nmonomers. The initial fragment F0 is further subdivided into two components, A0 and B0, such that the fragment A0 is sufficiently distant from the site to which the monomers are attached, while the fragment B0 contains that site (Figure 18). It is conventional to assume that the subfragment B0 is larger than the subfragment A0.

Having defined the initial fragment F0 = A0 ∪ B0, one obtains its molecular orbitals by solving the conventional SCF equation, eq 2.1.38. The resulting canonical MOs are typically delocalized over the entire subsystem F0. That is, the MO expansions in the basis of standard Cartesian AOs, {|χa⟩}, involve many AOs often localized in different regions across the system and having comparable weights. In order to make use of the short-range chemical bonding principle, the MOs are localized into the regions A0 and B0. This is achieved by using the uniform localization method,551,552 which is similar to the one proposed by Edmiston and Ruedenberg.553 To start the fragment localization procedure, the nonorthogonal AOs, {|χa⟩}, are preorthogonalized to form a set of hybrid orthogonal AOs, {|χ̃a⟩}. There exist two major orthogonalization schemes: the one initially proposed by Imamura530 and the improved scheme based on the Löwdin orthogonalization.533 It was found that the orthogonalization scheme critically affects the accuracy of the calculations with the elongation method.537 Orthogonalization is achieved via the unitary transformation:

$$|\tilde{\chi}_a\rangle = \sum_{b=1}^{N_M} U_{ba}\,|\chi_b\rangle \tag{4.2.72}$$

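where U is the corresponding unitary transformation matrix (UU+ = U+U = I). The canonical MOs of each fragment, which are typically expressed in the nonorthogonal AO basis, eq 2.1.12, can then be represented in the basis of the orthogonal hybridized AOs:

$$|\psi_{\sigma,i}\rangle = \sum_{a=1}^{N_M} C_{ai,\sigma}\,|\chi_a\rangle = \sum_{a,b=1}^{N_M} C_{ai,\sigma}\,U^{+}_{ba}\,|\tilde{\chi}_b\rangle = \sum_{b=1}^{N_M} \tilde{C}_{bi,\sigma}\,|\tilde{\chi}_b\rangle, \quad \sigma = \alpha, \beta \tag{4.2.73}$$

$$\tilde{C}_{bi,\sigma} = \sum_{a=1}^{N_M} C_{ai,\sigma}\,U^{+}_{ba} \;\Leftrightarrow\; \tilde{C}_{\sigma} = U^{+} C_{\sigma} \tag{4.2.74}$$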

The SCF equation, eq 2.1.38, becomes in the new basis

$$\tilde{F}_{\sigma}\tilde{C}_{\sigma} = \tilde{S}\tilde{C}_{\sigma}E_{\sigma} \tag{4.2.75}$$

with

$$\tilde{F}_{\sigma} = U^{+} F_{\sigma} U \tag{4.2.76}$$

$$\tilde{S} = U^{+} S U \tag{4.2.77}$$

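The fragment localization procedure consists of the following steps. The pair of LMOs ψ′σ,i and ψ′σ,j is obtained by a rotation of the pair of the canonical MOs, ψσ,i and ψσ,j, in the hyperplane i−j:

$$\begin{pmatrix} \psi'_{\sigma,i} \\ \psi'_{\sigma,j} \end{pmatrix} = \begin{pmatrix} \sin\theta & \cos\theta \\ -\cos\theta & \sin\theta \end{pmatrix} \begin{pmatrix} \psi_{\sigma,i} \\ \psi_{\sigma,j} \end{pmatrix} \tag{4.2.78}$$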

The rotation angle θ is chosen to deliver the maximum of the sum of populations localized on each of the regions A and B (localization measure):

$$L_{ij} = \langle \psi'_{A,\sigma,i} | \psi'_{A,\sigma,i} \rangle + \langle \psi'_{B,\sigma,j} | \psi'_{B,\sigma,j} \rangle \tag{4.2.79}$$

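where, for a region X = A or B,

$$\psi'_{X,\sigma,i} = \sum_{a \in X} \left[\sin(\theta)\,C_{ai,\sigma} + \cos(\theta)\,C_{aj,\sigma}\right]\chi_a, \qquad \psi'_{X,\sigma,j} = \sum_{a \in X} \left[-\cos(\theta)\,C_{ai,\sigma} + \sin(\theta)\,C_{aj,\sigma}\right]\chi_a \tag{4.2.80}$$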

Using eq 4.2.78, the localization measure can be expressed as

$$L_{ij} = a_{ij}\sin^2(\theta) + 2c_{ij}\sin(\theta)\cos(\theta) + b_{ij}\cos^2(\theta) \tag{4.2.81}$$

with

$$a_{ij} = \sum_{a,b \in A} C_{ai,\sigma} C_{bi,\sigma} S_{ab} + \sum_{a,b \in B} C_{aj,\sigma} C_{bj,\sigma} S_{ab} \tag{4.2.82a}$$

$$b_{ij} = \sum_{a,b \in A} C_{aj,\sigma} C_{bj,\sigma} S_{ab} + \sum_{a,b \in B} C_{ai,\sigma} C_{bi,\sigma} S_{ab} \tag{4.2.82b}$$

$$c_{ij} = \sum_{a,b \in A} C_{ai,\sigma} C_{bj,\sigma} S_{ab} - \sum_{a,b \in B} C_{ai,\sigma} C_{bj,\sigma} S_{ab} \tag{4.2.82c}$$

Extremum of the localization measure is achieved at

$$\theta_{\mathrm{ext}} = \frac{\pi}{4} - \frac{\omega}{2} \tag{4.2.83a}$$

$$\omega = \arctan\!\left(\frac{b_{ij} - a_{ij}}{2c_{ij}}\right) \tag{4.2.83b}$$
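The extremum condition can be checked numerically. Writing the localization measure as L = (a + b)/2 − [(a − b)/2] cos 2θ + c sin 2θ shows that its extrema are spaced by π/2, so it suffices to compare θext with θext + π/2 and keep the one that gives the larger value. The sketch below does exactly that. The function names are assumptions, and `atan2` is used instead of a plain arctangent of the ratio (a deliberate substitution) so that the case c = 0 needs no special treatment; the two choices can differ only by a shift of π/2 in θ, which the comparison step absorbs.

```python
import math


def localization_measure(theta, a, b, c):
    """L(theta) = a sin^2 + 2 c sin cos + b cos^2 (form of eq 4.2.81)."""
    s, co = math.sin(theta), math.cos(theta)
    return a * s * s + 2.0 * c * s * co + b * co * co


def best_rotation_angle(a, b, c):
    """Return the rotation angle that maximizes the localization measure."""
    omega = math.atan2(b - a, 2.0 * c)   # tan(omega) = (b - a) / (2 c)
    theta_ext = math.pi / 4.0 - omega / 2.0
    # Extrema alternate between maximum and minimum every pi/2.
    candidates = (theta_ext, theta_ext + math.pi / 2.0)
    return max(candidates, key=lambda t: localization_measure(t, a, b, c))


if __name__ == "__main__":
    a, b, c = 0.3, 0.9, 0.4              # hypothetical coefficients
    theta = best_rotation_angle(a, b, c)
    print(theta, localization_measure(theta, a, b, c))
```

For these coefficients the maximum equals (a + b)/2 + sqrt([(a − b)/2]^2 + c^2), which provides an independent check of the selected angle.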

The value that maximizes eq 4.2.81 is chosen from the set of values that deliver the extremum.

Initially, a set of canonical MOs is divided into two groups: occupied, {ψocc,σ,i}, and virtual, {ψvirt,σ,i}. The sequence of 2 × 2 rotations, eq 4.2.78, is performed to achieve localization. The rotations are carried out separately for pairs belonging to each of the two groups, until no pair of the resulting localized orbitals ψocc,σ,i, ψocc,σ,j ∈ {ψocc,σ,i} or ψvirt,σ,i, ψvirt,σ,j ∈ {ψvirt,σ,i} gives further localization improvement. Ideal localization typically cannot be achieved, and the iterations are stopped after the interfragment coupling,536 d = ∑i∈B ∑a,b∈A |Cai* Sab Cbi|, drops below a predefined threshold value. Once the localization within each of the two sets of orbitals is completed, the four sets of LMOs are obtained: {ψA,occ,σ,i}, {ψA,virt,σ,i}, {ψB,occ,σ,i}, and {ψB,virt,σ,i}. The resulting Hamiltonian has the following block structure:

At this point one can utilize localized MOs (LMOs) of the lead fragment B0 and similarly obtained LMOs (or canonical MOs) of the first monomer fragment, M1, to describe coupling of the fragment F0 with the first monomer. The resulting combined Hamiltonian takes the form

Because of the spatial separation of the localization regions A0 and M1, the LMOs of the fragment A0 are unlikely to be affected by introduction of the monomer M1, provided the size of the region B0 is sufficiently large. Therefore, the Hamiltonian elements that couple fragments A0 and M1 are negligible, which is indicated by ∼0 in eq 4.2.85. Further, it is assumed that the introduction of the monomer M1 does not affect the coupling between the fragments A0 and B0, and that coupling between these fragments will have no effect on optimization of the wave function for the extended system (with monomer M1). Therefore, the elements Focc,A,B, Fvirt,A,B, Focc,B,A, and Fvirt,B,A are set to zero. This approximation turns the Hamiltonian of the extended system into the block-diagonal Hamiltonian:

$$F_{F+M} = \begin{pmatrix} F_{A,A} & 0 \\ 0 & F_{B+M} \end{pmatrix} \tag{4.2.86}$$

Figure 18. Basic scheme of the elongation method. The scheme shows how the original molecule is gradually reconstructed by sequential addition of monomers. Mn is the monomer attached at time step n. Fn is the n-mer: the structure being computed after n steps. An and Bn are parts of the n-mer computed. The part A is frozen and effectively excluded from computations. The part B is treated fully at the nth step.


with

$$F_{A,A} = \begin{pmatrix} F_{\mathrm{occ},A,A} - E \cdot I & 0 \\ 0 & F_{\mathrm{virt},A,A} - E \cdot I \end{pmatrix} \tag{4.2.87}$$

The canonical orbitals for the new fragment B1 = B0 ∪ M1 are obtained by diagonalizing the corresponding Hamiltonian. The LMOs of the fragment A0 are kept frozen and do not enter the new set of SCF equations, but the density is used to compute the total energy. The new iteration only considers the mixing of the regions B0 and M1 (stage 2).

Once the canonical MOs for the combined system B1 are obtained, the procedure described above can be repeated: the fragment B1 is partitioned into the frozen and active parts, the canonical MOs are localized into the fragments, an additional monomer is added, and the frozen part is dropped (stage 3). The latter feature of the algorithm allows one to keep the size of the active part of the system (the one that determines the dimensionality of the eigenvalue problem) constant at all times, independently of the size of the entire system. This leads to linear scaling of the algorithm in the CPU time.

4.2.3.1.1. Implementation. The elongation scheme with cutoff (ELG/C) was implemented in the GAMESS package.

4.2.3.2. Localized Molecular Orbitals (LMO) Method. In order to reduce the scaling to linear or quasilinear order, Stewart utilized the idea very familiar to most practicing computational chemists: the localized nature of molecular orbitals. Namely, Stewart proposed the localized molecular orbitals (LMO) method.554 The method has many common points with the ELG method just discussed. Most importantly, both methods utilize dynamical MO growth algorithms and make good use of the localized nature of the MOs.

For computations of the MOs, one starts with guess orbitals derived from the standard Lewis diagrams of molecular structure. Each atom is assumed to have one s orbital and three p orbitals. These four orbitals are hybridized and oriented along the directions of possible (sigma) bonding with the nearest neighbors. The hybrids are forced to remain orthogonal to each other. If the number of neighbors is small and some of the hybrids remain unused, their orientation is rather arbitrary, and the orthogonality is preserved.

Unlike the conventional MO descriptions, in which MOs are extended to the entire system, the LMO method starts with the localized orbitals (atomic and diatomic) that exist in local regions of space. However, in order to converge the SCF calculations, the LMOs are allowed to grow dynamically.

As discussed in the early works of Stewart et al.,555 it is not necessary to compute all eigenvalues and eigenvectors by diagonalizing the entire Fock matrix. It is sufficient to annihilate only those elements of the Fock matrix that couple occupied and virtual MOs. The approach is based on the assumption that the Fock matrix has an approximately diagonal form in the specified basis. Also, it is assumed that the entire matrix of occupied and virtual orbitals is known from the previous calculations: C = (Co Cv), where Co (nbas × nocc) and Cv (nbas × nvirt) are matrices of occupied and virtual orbitals written in a given basis, respectively. In the LMO approach these orbitals are known by construction. They are hybrids of the corresponding AOs of single atoms and diatomic fragments. It should be noted that the technique does not help to get rid of the unfavorable cubic scaling on its own. In the most straightforward implementation, it only reduces the prefactor by up to 4 times.

Using the principle of occupied−virtual block annihilation,554 one is only interested in considering the pairs composed of one occupied, ψi, and one unoccupied, ψa, orbital. Moreover, it is always possible to exclude distant pairs because of their small overlap and negligible magnitude of the matrix element. For the pair of orbitals whose Fock matrix element

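$$F^{\mathrm{MO}}_{ia} = \langle \psi_i | F | \psi_a \rangle = \sum_{A}\sum_{B}\sum_{\lambda \in A,\, \mu \in B} C^{*}_{\lambda i}\,\langle \chi_\lambda | F | \chi_\mu \rangle\, C_{\mu a} = \sum_{A}\sum_{B}\sum_{\lambda \in A,\, \mu \in B} C^{*}_{\lambda i}\, F^{\mathrm{AO}}_{\lambda\mu}\, C_{\mu a} \tag{4.2.89}$$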

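is not negligible in the MO basis, the annihilation is performed by mixing the two orbitals by

$$\begin{pmatrix} \psi'_i \\ \psi'_a \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ -\beta & \alpha \end{pmatrix} \begin{pmatrix} \psi_i \\ \psi_a \end{pmatrix} \tag{4.2.90}$$

with

$$\alpha = \sqrt{\frac{1}{2}\left(1 + \frac{D}{\sqrt{4F_{ia}^2 + D^2}}\right)} \tag{4.2.91a}$$

$$\beta = \phi\sqrt{1 - \alpha^2} \tag{4.2.91b}$$

$$D = F_{ii} - F_{aa} \tag{4.2.91c}$$

$$\phi = \begin{cases} +1, & F_{ia} < 0 \\ -1, & F_{ia} \geq 0 \end{cases} \tag{4.2.91d}$$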
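The annihilation of a single occupied−virtual element is a 2 × 2 Jacobi rotation. The sketch below illustrates the idea for real-valued Fock matrix elements. Note that it parametrizes the rotation by an angle t, with α = cos t and β = sin t determined from tan 2t = 2Fia/(Fii − Faa), rather than evaluating the closed-form expressions for α, β, and φ; this is a deliberate substitution made so that the sign conventions are explicit and easy to verify. The function names and numerical values are assumptions.

```python
import math


def annihilation_rotation(f_ii, f_aa, f_ia):
    """Return (alpha, beta) such that the rotated orbitals
    psi_i' = alpha*psi_i + beta*psi_a and psi_a' = -beta*psi_i + alpha*psi_a
    have a vanishing Fock matrix element between them.
    """
    d = f_ii - f_aa
    if d == 0.0:
        # Degenerate diagonal: a 45-degree mixing removes the coupling.
        t = math.copysign(math.pi / 4.0, f_ia) if f_ia != 0.0 else 0.0
    else:
        # Principal branch gives |t| <= pi/4, i.e., the smallest rotation,
        # so each orbital keeps its occupied or virtual character.
        t = 0.5 * math.atan(2.0 * f_ia / d)
    return math.cos(t), math.sin(t)


def rotate_fock(f_ii, f_aa, f_ia, alpha, beta):
    """Fock matrix elements (ii, aa, ia) in the rotated pair of orbitals."""
    new_ii = alpha * alpha * f_ii + 2.0 * alpha * beta * f_ia + beta * beta * f_aa
    new_aa = beta * beta * f_ii - 2.0 * alpha * beta * f_ia + alpha * alpha * f_aa
    new_ia = alpha * beta * (f_aa - f_ii) + (alpha * alpha - beta * beta) * f_ia
    return new_ii, new_aa, new_ia


if __name__ == "__main__":
    f_ii, f_aa, f_ia = -0.5, 0.3, 0.1   # hypothetical matrix elements
    alpha, beta = annihilation_rotation(f_ii, f_aa, f_ia)
    print(rotate_fock(f_ii, f_aa, f_ia, alpha, beta))
```

In an actual LMO calculation the same (α, β) pair would be applied to the coefficient vectors of the two orbitals, and the sweep would be repeated over all non-negligible occupied−virtual pairs, since annihilating one element generally reintroduces small couplings elsewhere.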

The LMO growth procedure is shown in Figure 19. Although the MO localization region grows with each annihilation level, spanning new atoms and bonds, the contributions of AOs differentiate. For example, the middle atoms give large total wave function amplitudes, while the peripheral atoms give only negligible contributions. As the contributions become smaller than a specified threshold, the AOs are pruned from the LMO. The LMO size becomes relatively constant in this way, typically ranging from 100 to 130 atoms for large systems. Most importantly, the LMO size becomes independent of the size of the entire system, ensuring linear scaling of the computational efforts. The effect is not observed for relatively small systems, because the fully grown LMOs become comparable with the fully delocalized MOs obtained directly.

The large-scale simulation of systems with up to 1 000 000 atoms on parallel computers is among the most recent applications of the LMO method.556 Although the LMO algorithm is the key to success, efficient parallelization and interprocessor communication strategies, and modern computer architectures involving symmetric multiprocessing (SMP), are essential.

4.2.3.3. Iterative Stochastic Subspace SCF (IS3CF). The annihilation of the occupied−virtual block elements, similar to the one discussed by Stewart, was utilized by Loos, Rivail, and Assfeld.557 All available N orbitals are partitioned in their iterative stochastic subspace SCF method (IS3CF) into K sets of size L = N/K (Figure 20). The annihilation procedure between the orbitals belonging to different sets is then performed in a manner similar to that of the LMO. The choice of a particular

Chemical Reviews Review

DOI: 10.1021/cr500524cChem. Rev. 2015, 115, 5797−5890

5851

Page 56: Large-Scale Computations in Chemistry: A Bird s Eye View of a …people.virginia.edu/~lz2n/mse627/notes/Akimov-Prezhdo... · 2016-01-31 · Density Functional Theory (DFT) 5806 2.3

orbital pair is stochastic and is made based on the norm of thelocalization measure, similar to one used in the ELG method.The annihilations are organized in a hierarchy depending on

the level of subdivision of the entire set of orbitals. Occupied−virtual orbital pairs are formed at the first level of approximation.The occupied orbital, i, is chosen randomly, while the virtual one,a, is chosen as the orbital that maximizes the Frobenius norm ofthe product:

$$\left\| C_i^{+} F_{AO} C_a \right\|_F \tag{4.2.92}$$

with

$$\| X \|_F \equiv \sqrt{\mathrm{Tr}(X^{+} X)} \tag{4.2.93}$$

At the first level of partitioning, the matrices $C_i$ and $C_a$ are column vectors containing expansion coefficients of the occupied and virtual orbitals in the AO basis, respectively. $F_{AO}$ is the Fock matrix in the AO basis. Also at the first level, the quantities in eq 4.2.92 are simply the absolute values of the elements of the Fock matrix in the MO basis.

The occupied−virtual pairs (and other tuples) are combined in a similar way at the second and higher levels. The first pair (tuple) is chosen randomly, and the second one is chosen to maximize the norm, similarly to eq 4.2.92. Each of the matrices C now contains several column vectors corresponding to the orbitals included in a given group. The stochastic partitioning procedure is continued until the desired number of subsets is formed. At the highest possible level of partitioning, all N MOs are included in a single subset, yielding the conventional SCF method with an N × N Fock matrix. The dimension of the Fock matrices is L × L at the lower levels. Although the number of subsets is larger, the total computational effort scales more favorably: $O((N/L) \times L^3) = O(N \times L^2)$. This is linear in N for a sufficiently large N/L ratio.

Once K sets of orbital pairs are constructed stochastically, eigenvalues and eigenvectors for each set are computed by the following set of transformations:

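$$B_k^{N \times L_k} \leftarrow C_k^{N \times L_k} \quad \text{(definition of the orthogonalization matrix for a given set)} \tag{4.2.94a}$$

$$F_k^{L_k \times L_k} = \left(B_k^{N \times L_k}\right)^{+} F_{AO}\, B_k^{N \times L_k} \quad \text{(compute the transformed Fock matrix)} \tag{4.2.94b}$$

$$F_k^{L_k \times L_k} C_k^{L_k \times L_k} = C_k^{L_k \times L_k} E_k^{L_k \times L_k}, \quad \forall k = 1, \ldots, K \quad \text{(solve the eigenvalue problem)} \tag{4.2.94c}$$

$$C_k^{N \times L_k} = B_k^{N \times L_k} C_k^{L_k \times L_k} \quad \text{(transform the eigenvectors to the original basis)} \tag{4.2.94d}$$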

The superscripts indicate matrix dimensions. Once the orbitals and their energies are computed for all sets of a given subdivision level, the total matrix of coefficients and the matrix of eigenvalues are reconstructed by concatenation of the matrices for each subset:

$$C^{N \times N} = C_1^{N \times L_1} \cup C_2^{N \times L_2} \cup \ldots \tag{4.2.95a}$$

$$E^{N \times N} = E_1^{N \times L_1} \cup E_2^{N \times L_2} \cup \ldots \tag{4.2.95b}$$

The resulting set of orbitals is utilized to start the stochastic partitioning scheme again, until the desired convergence is achieved. The methodology requires an initial guess of N orthonormal MOs, which can be chosen similarly to the LMO method.
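The subspace transformations of eqs 4.2.94 and the concatenation of eqs 4.2.95 can be sketched as follows. This is a simplified illustration, not the published algorithm: it assumes an orthonormal AO basis, real matrices, and a purely random partition of the orbitals (the norm-based pair selection of eq 4.2.92 is omitted). All function names are hypothetical.

```python
import numpy as np

def random_partition(n_orb, n_sets, rng):
    """Split orbital indices 0..n_orb-1 into n_sets random groups."""
    return np.array_split(rng.permutation(n_orb), n_sets)

def subspace_sweep(f_ao, c, sets):
    """One sweep: diagonalize the Fock matrix inside each orbital subset."""
    c_new = np.empty_like(c)
    e_new = np.empty(c.shape[1])
    for idx in sets:
        b = c[:, idx]                    # eq 4.2.94a: basis of the subset
        f_k = b.T @ f_ao @ b             # eq 4.2.94b: L x L Fock block
        e_k, u_k = np.linalg.eigh(f_k)   # eq 4.2.94c: small eigenproblem
        c_new[:, idx] = b @ u_k          # eq 4.2.94d: back to the AO basis
        e_new[idx] = e_k                 # eqs 4.2.95: concatenation
    return c_new, e_new

def offdiag_norm(m):
    """Frobenius norm of the off-diagonal part of a square matrix."""
    return np.linalg.norm(m - np.diag(np.diag(m)))
```

Each sweep keeps the orbitals orthonormal and cannot increase the off-diagonal norm of the Fock matrix in the MO basis, so repeated sweeps with fresh partitions move the orbitals toward the fully diagonalizing set at the cost of K small L × L eigenproblems.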

4.2.4. Diabatic Approaches. Identification of an efficient system fragmentation is the key challenge of the methods discussed above. Many ways of partitioning a system into specific fragments have been discussed under different frameworks, each based on its own criterion of accuracy and efficiency. The role of partitioning is less important in the D&C methods, in which the number of electrons in each fragment is determined uniquely from the Fermi energy equilibration principle. At the same time, the partitioning scheme in the D&C method is also a matter of judicious preparation (e.g., how to choose buffer regions, overlaps, etc.). The ambiguity associated with the choice of the partitioning scheme is minimized in the group of methods that we classify as “diabatic”. These methods define a partitioning basis by considering a subspace of various possible charge localizations and molecular partitionings. Essentially, any type of basis configurations can be considered, each corresponding to a smaller subsystem. This is the beauty of the diabatic approach. The configurations may be rather artificial in terms of their physical and chemical interpretation or may have a clear meaning. The efficiency comes from the fact that diabatic states can be computed using relatively small noninteracting fragments of a system.

Figure 19. Schematic representation of the MO growth procedure in the LMO method of Stewart. ψ and ψ′ are the unmixed and mixed MOs, used in the growth procedure. See eq 4.2.90 and the corresponding discussion in the text.

Figure 20. Schematic representation of the IS3CF procedure. K is the number of sets into which the original pool of N orbitals is partitioned; L = N/K is the number of orbitals in a given set. k enumerates all sets of L orbitals.

4.2.4.1. Block-Localized WFT and DFT Methods. The block-localized WFT (BLWFT) was developed in 1998 by Mo and Peyerimhoff for studies of electronic delocalization in organic molecules.558 The method was applied to study hyperconjugation,559 and to perform energy decomposition and basis set superposition error analysis.560

The system is partitioned into K subsystems $\{S_k, k = 1, \ldots, K\}$ with $M_k$ AO basis states and $N_k$ electrons in the kth group, such that the total numbers of electrons and AO basis functions are constant:

$$\sum_{k=1}^{K} N_k = N \tag{4.2.96a}$$

$$\sum_{k=1}^{K} M_k = M \tag{4.2.96b}$$

One constructs many-electron auxiliary states of each group as the product of the MOs in this group:

$$\Phi_{I,k}(\mathbf{r}_1, \ldots, \mathbf{r}_{N_k}, \sigma_1, \sigma_2, \ldots, \sigma_{N_k}) = \psi_{k,1}(\mathbf{r}_1, \sigma_1)\, \psi_{k,2}(\mathbf{r}_2, \sigma_2) \ldots \psi_{k,N_k}(\mathbf{r}_{N_k}, \sigma_{N_k}) \tag{4.2.97}$$

where I is the multi-index defining a particular choice of one-electron spin−orbitals within the kth group. The one-electron spin−orbitals of the kth group are represented in the basis of the AOs belonging to this group:

$$|\psi_{k,\sigma,i}\rangle = \sum_{a \in S_k} C_{k,\sigma,a,i}\, |\chi_a\rangle = \sum_{a \in S_k} C_{(k,\sigma),a,i}\, |\chi_a\rangle, \quad \sigma = \alpha, \beta \tag{4.2.98}$$

The requirement that the MOs of the kth group be expanded only in terms of the AOs belonging to that group is the key assumption of the block-localized WFT, providing the main computational savings of the method.

We follow the definitions introduced in section 2.1. Unlike eq 2.1.12, each one-particle MO has an additional index corresponding to the subset. The states, eq 4.2.97, are many-particle basis functions localized on the kth subsystem. They need not be antisymmetric. These states are used to construct physically meaningful many-particle diabatic states, obtained as antisymmetric products of the many-particle auxiliary functions of all groups, eq 4.2.97:

$$\Psi_s = \hat{A}\left[\Phi_{I_1,1}\, \Phi_{I_2,2} \ldots \Phi_{I_{N_K},N_K}\right] \tag{4.2.99}$$

The construction, eq 4.2.99, is a coarse-grained and generalized version of the conventional Slater determinant, eq 2.1.2. The generalization comes via the antisymmetrizer $\hat{A}$, which does not have the simple form of a determinant. Similar to the conventional Slater determinant, there are many possible “excitations” of a reference function, $\Psi_0$. Thus, an additional index s enumerates all possible configurations. These configurations come by choosing one of many possible combinations of the multicomponent indexes $I_1, \ldots, I_{N_K}$.

The MOs are orthonormal within each group, but they are not required to be orthogonal across different groups:

$$\langle \psi_{A,\sigma,i} | \psi_{B,\sigma',j} \rangle = \begin{cases} \delta_{\sigma\sigma'}\, \delta_{ij}, & A = B \\ S_{ij}, & A \neq B \end{cases} \tag{4.2.100}$$

The expansion of the sth diabatic state, $\Psi_s$, in the overall AO basis is

$$C = \begin{pmatrix} C_1 & 0 & \ldots & 0 \\ 0 & C_2 & \ldots & 0 \\ \ldots & \ldots & \ldots & \ldots \\ 0 & 0 & \ldots & C_K \end{pmatrix} \tag{4.2.101}$$

where each $C_k$ is the matrix of coefficients for the MOs of the kth group in the basis of the kth group AOs. Each matrix has dimensions $M_k \times N_k$. We omit the state index s in these matrices for simplicity. The total energy of the determinant is given by an expression similar to the Hartree−Fock one, eq 2.1.28:

$$E_s = \langle \Psi_s | H | \Psi_s \rangle = \frac{1}{2} \mathrm{Tr}\left(P \cdot (H + F)\right) \tag{4.2.102}$$

where H and F are the standard one-electron core Hamiltonian and the Fock matrices in the AO basis, and P is the generalized density matrix:

$$P = C (C^{+} S C)^{-1} C^{+} \tag{4.2.103}$$
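To make the block structure of eq 4.2.101 and the generalized density matrix of eq 4.2.103 concrete, the following minimal sketch assembles a block-diagonal coefficient matrix from per-fragment blocks and builds P for real-valued orbitals. The function name and the toy inputs used to exercise it are hypothetical and serve only as an illustration.

```python
import numpy as np

def block_localized_density(c_blocks, s_ao):
    """Assemble C = diag(C_1, ..., C_K) and P = C (C^T S C)^{-1} C^T."""
    n_ao = sum(ck.shape[0] for ck in c_blocks)
    n_mo = sum(ck.shape[1] for ck in c_blocks)
    c = np.zeros((n_ao, n_mo))
    row = col = 0
    for ck in c_blocks:
        m_k, n_k = ck.shape
        c[row:row + m_k, col:col + n_k] = ck   # MOs of group k use only its AOs
        row += m_k
        col += n_k
    sigma = c.T @ s_ao @ c                     # overlap of the nonorthogonal MOs
    p = c @ np.linalg.solve(sigma, c.T)        # avoids forming the explicit inverse
    return c, p
```

Because the inverse MO overlap is included, P remains idempotent in the AO metric (P S P = P) and its trace with S equals the number of occupied orbitals, even though orbitals on different fragments overlap.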

The one-electron MOs of each group k, $\psi_{k,\sigma,i}$, are obtained by solving a set of K coupled Schrödinger equations, one for each group. The coupling between the equations is realized via the density-matrix-dependent Fock operator, but the dimensionality of each electronic problem is smaller than that of the entire system, leading to linear scaling. The organization of computations to find each diabatic state, $\Psi_s$, is similar to that of the D&C method.

The main focus of the initial works on the BLWFT was on an efficient construction of diabatic states. The authors could then consider several meaningful configurations (partitioning schemes) to analyze reactive processes. A more systematic and general approach to reactive systems, especially to dynamics, was developed by Gao et al., who combined the BLWFT approach for the MO description with the valence bond (VB) theory, leading to the MOVB method.561−563 We refer the reader to the review on the VB theory for more details and discussion of the modern developments of the subject.564

In general, the “true” wave function is given in the VB theory by a linear combination of different resonant configurations. The configurations are chosen to be the diabatic states $\Psi_s$ in the MOVB theory:

$$\Psi_{MOVB} = \sum_s a_s \Psi_s \tag{4.2.104}$$

The linear combination of configurations, eq 4.2.104, is similar to that used in the post-HF correlated methods such as the CI, CC, and CASSCF. In a manner similar to those methods, the expansion, eq 4.2.104, provides a way of incorporating nondynamic correlation. Together with a reasonable choice of the number of configurations and with the BLW approach to compute them, the method is applicable to large systems and reactive processes in a broad sense.

The expansion coefficients $a_s$ are determined by diagonalization of the Hamiltonian matrix written in terms of the many-electron configurations:

$$H a = S a E \tag{4.2.105}$$


where

$$H_{IJ} = \langle \Psi_I | H | \Psi_J \rangle \tag{4.2.106}$$

$$S_{IJ} = \langle \Psi_I | \Psi_J \rangle \tag{4.2.107}$$
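A generalized eigenvalue problem of the form of eq 4.2.105 in a small nonorthogonal basis can be solved, for example, by symmetric (Löwdin) orthogonalization. The sketch below assumes real symmetric H and a positive-definite S that have already been evaluated; computing the matrix elements between nonorthogonal determinants is the involved part and is not shown. The function name is hypothetical.

```python
import numpy as np

def solve_nonorthogonal_ci(h, s):
    """Solve H a = S a E for real symmetric H and positive-definite S."""
    w, u = np.linalg.eigh(s)
    x = u @ np.diag(w ** -0.5) @ u.T       # X = S^{-1/2}
    e, a_orth = np.linalg.eigh(x @ h @ x)  # ordinary symmetric eigenproblem
    a = x @ a_orth                         # coefficients in the original basis
    return e, a
```

The returned coefficient vectors are normalized with respect to the overlap metric, i.e., aᵀSa is the identity matrix, and the lowest eigenvalue corresponds to the ground adiabatic state.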

The Hamiltonian and overlap matrix elements, eqs 4.2.106 and 4.2.107, respectively, are evaluated using the Slater determinants constructed from the nonorthogonal orbitals. As a result, the final expressions are more involved than in the standard (orthogonal MO) case. The details of the derivations can be found elsewhere.561−563 Thus, the energy evaluation does not involve computation of the adiabatic PESs. Only diabatic energies and couplings are needed, together with diagonalization of a relatively small-dimensional Hamiltonian matrix, eq 4.2.105.

Recently, Cembran et al. extended the MOVB theory to the DFT in the block-localized DFT (BLDFT) method.565 The method combines the best of the two worlds: dynamic correlation from the DFT (via the correlation functional) and nondynamic correlation from the VB theory (via multiple configurations). Together with the block-localized scheme, the method has a high potential to become a very accurate and efficient tool for studies of reactive and electronic (e.g., excited state) dynamics in large-scale systems.

In passing, we would like to comment on the use of diabatic states. Diabatic states have clear physical meaning, and are convenient for interpretation and construction. They may be defined quite arbitrarily and, hence, may be used as a convenient basis for quantum dynamics, and, as a specific case, for stationary problems. For example, most of the nonadiabatic electronic dynamics methods, such as trajectory surface hopping (TSH)566,567 or Ehrenfest568−571 algorithms, can be formulated in a diabatic basis. They often show results comparable to simulations in the adiabatic basis. The fact that the basis is arbitrary means that each state may be chosen as a MO of an independent fragment. However, one may need many different types of fragments. In particular, it is crucial to include charged configurations A, A+, A−, etc. for the description of charge transfer and polarization phenomena.

4.2.4.2. Fragment Orbital DFT (FO-DFT) Method. A philosophy similar to that of the BLWFT and BLDFT methods was adopted by Oberhofer and Blumberger, who developed the fragment orbital DFT (FO-DFT) method.49 Calculation of charge carrier mobilities in organic materials using surface hopping schemes is the main target of their method. The electronic coupling between the diabatic (e.g., noninteracting donor and acceptor) states

$$H_{I,II} = \langle \psi_I | H | \psi_{II} \rangle \tag{4.2.108}$$

is one of the most important parameters determining the hopping rates.

In the FO-DFT, a system of interest is composed of the donor D and acceptor A species, each containing N electrons. One extra electron (the one that is transferred) is added, to describe the electron transfer reaction:

$$D^{-} + A \rightarrow D + A^{-} \tag{4.2.109}$$

The coupling between the initial and final diabatic states is to be computed. A straightforward approach is to construct wave functions for the diabatic states as single Slater determinants composed of the molecular orbitals of the DA complex:

$$\psi_I = \frac{1}{\sqrt{(2N+1)!}} \left| \varphi_1^{DA}(1) \ldots \varphi_{2N+1}^{DA}(2N+1) \right| \tag{4.2.110a}$$

$$\psi_{II} = \frac{1}{\sqrt{(2N+1)!}} \left| \varphi_1^{DA}(1) \ldots \varphi_{2N+1}^{DA}(2N+1) \right| \tag{4.2.110b}$$

This choice is a special case of the determinants made in the MOVB (or BLDFT) method discussed above. In order to avoid explicit calculations of all molecular orbitals of the combined DA system $\{\varphi_i^{DA}, i = 1, \ldots, 2N+1\}$, the diabatic states are approximated using the orbitals of the noninteracting fragments, $\{\varphi_i^{D}, i = 1, \ldots, N+1\}$ and $\{\varphi_i^{A}, i = 1, \ldots, N\}$:

$$\psi_I \approx \frac{1}{\sqrt{(2N+1)!}} \left| \varphi_1^{D}(1) \ldots \varphi_{N+1}^{D}(N+1)\, \varphi_1^{A}(1) \ldots \varphi_N^{A}(N) \right| \tag{4.2.111a}$$

and vice versa:

$$\psi_{II} \approx \frac{1}{\sqrt{(2N+1)!}} \left| \varphi_1^{D}(1) \ldots \varphi_N^{D}(N)\, \varphi_1^{A}(1) \ldots \varphi_{N+1}^{A}(N+1) \right| \tag{4.2.111b}$$

The KS orbitals of the noninteracting fragments are, in general, nonorthogonal across the subsets belonging to different fragments. Therefore, they are orthogonalized using the Löwdin technique, which is known to preserve orbital localization character. The coupling matrix element is given by

$$H_{I,II} = \langle \psi_I | H | \psi_{II} \rangle \approx \langle \psi_I | H_{II}^{KS} | \psi_{II} \rangle = \langle \varphi_{N+1}^{D} | h_{II}^{KS} | \varphi_{N+1}^{A} \rangle \tag{4.2.112}$$

in the basis of the approximate donor and acceptor states. The last equality shows that the coupling matrix elements are determined by the singly occupied molecular orbitals (SOMOs) of the charged donor and acceptor species. A similar technique works for hole diffusion with the only exception that one has to consider positively charged species. The accuracy of the approximations was studied by comparison with the more rigorous constrained DFT calculations, validating the computational methodology.48 The methodology was applied to a variety of organic semiconducting materials.50 Although the FO-DFT is very simplistic, it can be applied to large finite and periodic systems. Many nuclear configurations can be sampled, allowing MD and systematic screening.
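The role of the Löwdin orthogonalization in eq 4.2.112 can be illustrated with a two-orbital sketch. It assumes that only the donor and acceptor SOMOs are orthogonalized against each other (a simplification of orthogonalizing all fragment orbitals) and that the one-electron KS Hamiltonian and the overlap are available as real AO-basis matrices. The function names and inputs are hypothetical.

```python
import numpy as np

def somo_matrices(c_d, c_a, h_ao, s_ao):
    """2 x 2 Hamiltonian and overlap in the basis of the two SOMOs."""
    c = np.column_stack([c_d, c_a])
    return c.T @ h_ao @ c, c.T @ s_ao @ c

def lowdin_coupling(h2, s2):
    """Off-diagonal element of S^{-1/2} H S^{-1/2} for the SOMO pair."""
    w, u = np.linalg.eigh(s2)
    x = u @ np.diag(w ** -0.5) @ u.T
    return (x @ h2 @ x)[0, 1]
```

For normalized orbitals with overlap s, this reduces to the familiar two-state expression [H12 − s(H11 + H22)/2]/(1 − s²), so the raw matrix element H12 is recovered only when the fragment orbitals are already orthogonal.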

4.2.4.3. FMO-MS-RMD. The common idea of the diabatic approaches is well illustrated by the recently developed FMO-based multistate reactive molecular dynamics (FMO-MS-RMD) of Lange and Voth.572 The method was motivated by the need to address the uniqueness of the partitioning scheme. Instead of developing the “optimal” technique of partitioning a system into fragments, the authors utilize an idea analogous to that of the multistate valence bond (MS-VB) and multiconfigurational reactive MD573 theories used in reactive force fields. The idea is also analogous to the configuration interaction approach used in the wave function theory. A set of different fragments (obtained by distinct partitioning schemes) is constructed to serve as the basis set for performing the FMO calculations. The wave function of the total molecular system, $|\Psi_{tot}\rangle$, is approximated by a linear combination of different fragmentation “states”, $|\Psi_S\rangle$:

$$|\Psi_{tot}\rangle = \sum_S c_S |\Psi_S\rangle \tag{4.2.113}$$

where $c_S$ is the expansion coefficient. The authors consider proton transfer dynamics in water as an example of a reactive system that needs a multistate description (Figure 21).


The method can be considered a model Hamiltonian approach. Unlike the BLDFT and BLWFT methods, the matrix elements of the Hamiltonian are computed using the FMO description. The diagonal Hamiltonian matrix elements, $H_{S,S}$, are defined as the FMO energies for the corresponding schemes with additional empirical corrections. The off-diagonal elements, $H_{S,S'}$, correspond to a particular chemical reaction that transforms one set into the other. Once the Hamiltonian supermatrix is formed, the coefficients of the expansion, eq 4.2.113, are obtained by the Hamiltonian diagonalization. The energy of the system is taken to be the lowest eigenvalue. Finally, the gradients can be computed using the Hellmann−Feynman theorem, to perform MD simulation or geometry optimization.

4.2.4.4. Superposition of Fragment States (SFS). The superposition of fragment states (SFS) method of Valeev et al.574 utilizes a set of fragment ionized states to represent the adiabatic state of a large-scale system. It was proposed initially as a tight-binding, model Hamiltonian approach for studies of charge transfer in organic solids. Later, it was combined with correlated methods by Zhang and Valeev, to describe ionized states of large clusters.575 The basic idea can be illustrated with the ionized state of an AB dimer. The total wave function of the ionized state, $|(AB)^{+}\rangle$, can be represented by a linear combination of the charge-localized diabatic states:

$$|(AB)^{+}\rangle = C_A |\Phi_1\rangle + C_B |\Phi_2\rangle \tag{4.2.114}$$

$$|\Phi_1\rangle = |A^{+}B\rangle = |A^{+}\rangle |B\rangle = a_i^{A} |A\rangle |B\rangle \tag{4.2.115a}$$

$$|\Phi_2\rangle = |AB^{+}\rangle = |A\rangle |B^{+}\rangle = a_i^{B} |A\rangle |B\rangle \tag{4.2.115b}$$

where $a_i^{X}$ is the annihilation operator that removes an electron from the ith orbital of the fragment X. The states $|A^{+}B\rangle$ and $|AB^{+}\rangle$ describe localization of the positive charge on the fragments A and B, respectively. Within the simplest approach, the diabatic states are approximated by the product of the Slater determinants for the individual fragments in the specified charge state. The Slater determinants can be obtained by different methods outlined below.

The diabatic states are normalized, but nonorthogonal, initially:

$$\langle \Phi_1 | \Phi_2 \rangle = S_{12} \tag{4.2.116}$$

The matrix elements of the resulting coarse-grained (fragment basis) Hamiltonian are determined from Koopmans' theorem:

$$\langle \Phi_i | H | \Phi_j \rangle - E_{AB} \approx -\langle \psi_i^{X} | F_{AB} | \psi_j^{X} \rangle \equiv \begin{cases} e_i, & i = j \\ J_{ij}, & i \neq j \end{cases} \tag{4.2.117}$$

It is important to emphasize that the Fock operator of the entire system, $F_{AB}$, is utilized to compute the above integrals. The integrals are, in general, different from those obtained for the isolated fragments, $\langle \psi_i^{X} | F_X | \psi_j^{X} \rangle$, X = A, B. The difference may be notable and, in fact, may play a central role.575 At the same time, the computational cost for such an approach will be comparable to that for the entire system, unless an efficient and accurate separability approximation for such matrix elements is developed. This approximation can be facilitated by the preliminary transformation of the nonorthogonal diabatic basis to the orthogonal one. The Löwdin transformation is the standard choice, minimizing distortion of the original wave function.

Utilization of the diabatic charge-constrained states, as given by the constrained DFT (CDFT), was combined by Van Voorhis and co-workers576 with the linear-scaling code CONQUEST.87,577,578 The method requires prior knowledge of the charge localization states. These states can be generated in a number of ways. The CDFT by Van Voorhis,576,579−583 dimer splitting,584,585 and the fragment orbital586−588 methods are among the most popular. In principle, one can consider a sufficiently large basis of different charge states and obtain an accurate description of the system. This approach may be particularly useful for studying charge transfer dynamics.

4.3. Direct Optimization Methods

Most of the “dynamical” methods were extensively discussed by Goedecker.589 According to his classification, the methods fall into one of the following six basic classes:

(1) Fermi operator expansion (FOE):590 In this approach the density matrix is constructed directly from the Hamiltonian of the system.

(2) Fermi operator projection (FOP): Unlike the FOE method, the entire density matrix is not constructed; only the subspace spanned by the occupied states is searched for.

(3) Divide-and-conquer method (D&C): The density matrix for the whole system is computed by combining the density matrices for the subsystems.

(4) Density matrix minimization (DMM): The density matrix is determined by optimization of a specially developed functional with respect to all components of the density matrix, each treated as an independent variable.

(5) Orbital minimization (OM): The density matrix is computed via the standard definition in terms of localized functions (Wannier functions, in general; eigenfunctions, in particular). The localized functions are found by optimization of the specially derived functional with respect to the expansion coefficients in terms of the basis functions.

(6) Optimal basis density matrix minimization (OBDMM): It combines the ideas of the previous two methods. In addition to finding an optimal density matrix, as an expansion in terms of a specified basis, the basis itself is optimized by additional procedures.
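As an illustration of class (1), the sketch below builds a finite-temperature density matrix as a Chebyshev polynomial of the Hamiltonian, without any diagonalization. It is a dense-matrix toy version under stated assumptions: an orthonormal basis, a real symmetric Hamiltonian, Gershgorin spectral bounds, and a chemical potential mu and inverse temperature beta supplied by the caller. Linear scaling would additionally require sparse, truncated matrix products, which are not shown. The function name is hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb

def fermi_operator_expansion(h, mu, beta, degree=80):
    """Approximate P = f(H), f(e) = 1/(1 + exp(beta*(e - mu)))."""
    n = h.shape[0]
    # Gershgorin circles give cheap bounds on the spectrum of H
    radii = np.sum(np.abs(h), axis=1) - np.abs(np.diag(h))
    e_min = np.min(np.diag(h) - radii)
    e_max = np.max(np.diag(h) + radii)
    center, half = 0.5 * (e_max + e_min), 0.5 * (e_max - e_min)
    h_scaled = (h - center * np.eye(n)) / half   # spectrum mapped into [-1, 1]

    def fermi(x):
        return 1.0 / (1.0 + np.exp(beta * (half * x + center - mu)))

    coeffs = cheb.chebinterpolate(fermi, degree)
    # Chebyshev recurrence on matrices: T_{k+1} = 2 H T_k - T_{k-1}
    t_prev, t_curr = np.eye(n), h_scaled
    p = coeffs[0] * t_prev + coeffs[1] * t_curr
    for c_k in coeffs[2:]:
        t_prev, t_curr = t_curr, 2.0 * h_scaled @ t_curr - t_prev
        p = p + c_k * t_curr
    return p
```

The polynomial degree needed for a given accuracy grows with beta times the spectral width, which is why such expansions are most efficient at elevated electronic temperatures or for systems with a sizable gap.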

Figure 21. Basis of four fragmentation (diabatic) “states” of the protonated water tetramer. $|\psi_n\rangle$ is the fragmentation set n. Reprinted from ref 572. Copyright 2013 American Chemical Society.


The D&C group of methods has already been discussed in section 3, as one of the static fragmentation methods. It is distinct in our classification, because it relies on the solution of the eigenvalue problem. On the contrary, the dynamical methods avoid solving the eigenvalue problem. Instead, they solve for the density matrix or coefficients by an iterative approach, propagating them from the initial guess to the optimized value. This is typically done by the standard minimization algorithms such as conjugate-gradient or steepest descent, although the propagated variables may be considered dynamical variables, evolving in time.

In the following discussion, we will briefly overview the basic formulation of the DMM, OM, and FOE groups of methods, along with the most recent advances along those lines. In addition, we consider the group of methods that treat the wave functions (orbitals) or density matrices as dynamical variables. This group is not specifically classified by Goedecker, although it is reminiscent of the DMM and OM formulations.

4.3.1. Density Matrix. Before discussing the direct optimization methods, it is worth refining the details of the density matrix definitions and transformations under different conditions. This subsection extends the definitions presented in sections 2.1.2 and 2.1.4 to more general cases and puts additional emphasis on the density matrix variables. To simplify the notation, we utilize the spin−orbital indexing, such that each index denotes a unique combination of spin and spatial parts of an orbital.

The one-particle charge density matrix, ρ, is defined in the orthonormal MO basis, $\{\psi_i\}$, as

$$\rho(\mathbf{r}, \mathbf{r}') = \sum_{i=1}^{N_{\mathrm{orb}}} n_i\, \psi_i(\mathbf{r})\, \psi_i^{*}(\mathbf{r}') \tag{4.3.1}$$

$n_i$ is the occupation number of the orbital i. It is either 0 or 1 in most cases (zero-temperature limit). $\psi_i(\mathbf{r})$ is the coordinate representation of the MO $|\psi_i\rangle$:

$$\psi_i(\mathbf{r}) \equiv \langle \mathbf{r} | \psi_i \rangle \tag{4.3.2}$$

The diagonal element of the density matrix gives the well-known charge density:

$$\rho(\mathbf{r}) = \rho(\mathbf{r}, \mathbf{r}) = \rho(\mathbf{r}, \mathbf{r}')\big|_{\mathbf{r}' = \mathbf{r}} \tag{4.3.3}$$

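A proper one-particle density matrix must satisfy the following basic criteria:

hermiticity:

$$\rho(\mathbf{r}, \mathbf{r}') = \rho^{*}(\mathbf{r}', \mathbf{r}) \tag{4.3.4a}$$

normalization:

$$\int \rho(\mathbf{r}, \mathbf{r})\, d\mathbf{r} = N \tag{4.3.4b}$$

idempotency:

$$\rho^2(\mathbf{r}, \mathbf{r}') \equiv \int \rho(\mathbf{r}, \mathbf{r}'')\, \rho(\mathbf{r}'', \mathbf{r}')\, d\mathbf{r}'' = \rho(\mathbf{r}, \mathbf{r}') \tag{4.3.4c}$$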
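The three criteria can be checked numerically in a finite orthonormal basis, where the integrals become matrix products and a trace. The sketch below is only an illustration: it assumes orbitals given as columns of a unitary (here real orthogonal) matrix and integer occupations; the function name is hypothetical.

```python
import numpy as np

def density_matrix(c, occ):
    """P = sum_i n_i |psi_i><psi_i| for orthonormal orbitals in the columns of c."""
    return (c * occ) @ c.conj().T   # scales column i of c by occ[i]
```

With occupations restricted to 0 and 1, the resulting matrix is Hermitian, has a trace equal to the number of occupied orbitals, and satisfies P² = P; fractional occupations preserve the first two properties but break idempotency.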

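It is often convenient to work with the abstract operator or matrix representations of the density matrix. The expression

$$\begin{aligned} \rho(\mathbf{r}, \mathbf{r}') &= \sum_{i=1}^{N_{\mathrm{orb}}} n_i\, \psi_i(\mathbf{r})\, \psi_i^{*}(\mathbf{r}') = \sum_{i=1}^{N_{\mathrm{orb}}} n_i \langle \mathbf{r} | \psi_i \rangle \langle \psi_i | \mathbf{r}' \rangle \\ &= \langle \mathbf{r} | \sum_{i=1}^{N_{\mathrm{orb}}} \left( |\psi_i\rangle n_i \langle \psi_i| \right) | \mathbf{r}' \rangle = \langle \mathbf{r} | \hat{\rho} | \mathbf{r}' \rangle \end{aligned} \tag{4.3.5}$$

gives the density matrix in operator representation:

$$\hat{\rho} = \sum_{i=1}^{N_{\mathrm{orb}}} \left( |\psi_i\rangle n_i \langle \psi_i| \right) \tag{4.3.6}$$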

Because the MO set, $\{\psi_i\}$, is orthonormal, the matrix representation is given by eq 2.1.46, leading to

$$\hat{\rho} = \sum_{i,j=1}^{N_{\mathrm{orb}}} \left( |\psi_i\rangle P_{ij} \langle \psi_j| \right) \Leftrightarrow P_{ij} = O_{ij} = n_i \delta_{ij} \tag{4.3.7}$$

This trivial result is implied by eqs 2.1.33 and 2.1.34. In short, the density matrix in the orthogonal MO basis is diagonal with the elements equal to the populations. The populations can be generalized further to finite temperatures and fractional values, via the Fermi distribution function.

Consider the density matrix representation in the basis of nonorthogonal AOs. Using eq 4.3.7 and the expansion eq 2.1.12, we get

$$\hat{\rho} = \sum_{i,j=1}^{N_{\mathrm{orb}}} \left( |\psi_i\rangle O_{ij} \langle \psi_j| \right) = \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\chi_a\rangle (C O C^{+})_{ab} \langle \chi_b| \right) = \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\chi_a\rangle P_{ab} \langle \chi_b| \right) \tag{4.3.8a}$$

$$P = C O C^{+} \tag{4.3.8b}$$

which is the standard expression, eq 2.1.32. Having generalized the standard definition, we can now consider the nonorthonormal MO basis, $\{\tilde{\psi}_i\}$:

$$\langle \tilde{\psi}_i | \tilde{\psi}_j \rangle = S_{MO,ij} \tag{4.3.9}$$

The definition of the density matrix, eq 4.3.1, remains the same, when expressed in terms of orthogonalized orbitals, $\{\psi_i\}$. The common orthogonalization procedure is via the Löwdin transformation:

$$|\psi_i\rangle = \sum_j |\tilde{\psi}_j\rangle\, S_{MO,ji}^{-1/2} \tag{4.3.10}$$

so that

$$\langle \psi_i | \psi_j \rangle = \sum_{a,b} S_{MO,ia}^{-1/2} \langle \tilde{\psi}_a | \tilde{\psi}_b \rangle S_{MO,bj}^{-1/2} = \sum_{a,b} S_{MO,ia}^{-1/2}\, S_{MO,ab}\, S_{MO,bj}^{-1/2} = \delta_{ij} \tag{4.3.11}$$

The operator representation in terms of the nonorthonormal MOs, $|\tilde{\psi}_a\rangle$, gives

$$\begin{aligned} \hat{\rho} &= \sum_{i,j=1}^{N_{\mathrm{orb}}} \left( |\psi_i\rangle O_{ij} \langle \psi_j| \right) \\ &= \sum_{i,j=1}^{N_{\mathrm{orb}}} \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\tilde{\psi}_a\rangle S_{MO,ai}^{-1/2}\, O_{ij}\, S_{MO,jb}^{-1/2} \langle \tilde{\psi}_b| \right) \\ &= \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\tilde{\psi}_a\rangle (S_{MO}^{-1/2} O S_{MO}^{-1/2})_{ab} \langle \tilde{\psi}_b| \right) \\ &= \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\tilde{\psi}_a\rangle P_{ab} \langle \tilde{\psi}_b| \right) \end{aligned} \tag{4.3.12a}$$

$$P = S_{MO}^{-1/2}\, O\, S_{MO}^{-1/2} \tag{4.3.12b}$$

The occupation matrix, O, is diagonal, and contains 0 and 1 in the zero-temperature limit. If the basis set consists of only occupied orbitals, the expression eq 4.3.12b simplifies:

$$P = S_{MO}^{-1/2}\, I\, S_{MO}^{-1/2} = S_{MO}^{-1} \tag{4.3.12c}$$
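Equations 4.3.12b and 4.3.12c are easy to check numerically. The sketch below assumes a real, symmetric, positive-definite MO overlap matrix and a vector of occupation numbers; the function name is hypothetical and the example is meant only as an illustration of the formulas.

```python
import numpy as np

def lowdin_density(s_mo, occ):
    """P = S^{-1/2} O S^{-1/2} in a basis of nonorthonormal MOs."""
    w, u = np.linalg.eigh(s_mo)
    x = u @ np.diag(w ** -0.5) @ u.T   # symmetric inverse square root of S
    return x @ np.diag(occ) @ x
```

In the nonorthogonal basis the idempotency and normalization conditions acquire the overlap metric, P S P = P and Tr(P S) = N_occ, and P reduces to the inverse overlap matrix when all basis orbitals are occupied.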

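The density matrix operator for nonorthonormal MOs, $|\tilde{\psi}_a\rangle$, can be expressed in the basis of nonorthonormal AOs:

$$\hat{\rho} = \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\tilde{\psi}_a\rangle P_{ab} \langle \tilde{\psi}_b| \right) = \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\chi_a\rangle (C P C^{+})_{ab} \langle \chi_b| \right) = \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\chi_a\rangle P_{ab} \langle \chi_b| \right) \tag{4.3.13a}$$

$$P = C S_{MO}^{-1/2}\, O\, S_{MO}^{-1/2} C^{+} \tag{4.3.13b}$$

The expression eq 4.3.13b can be transformed further in the zero-temperature limit to

$$P = C S_{MO}^{-1/2}\, O\, S_{MO}^{-1/2} C^{+} = C S_{MO}^{-1} C^{+} = C (C^{+} S_{AO} C)^{-1} C^{+} \tag{4.3.13c}$$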

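which is the expression used earlier in the context of the BLWFT and BLDFT methods.

One can verify the three main properties of the density matrix, eqs 4.3.4, as applied to the operator representations discussed above. For example, the idempotency, $O^2 = O$, is satisfied in the nonorthogonal MO case, eq 4.3.13a, with $|\tilde{\psi}_a\rangle$ denoting the nonorthonormal MOs:

$$\begin{aligned} \hat{\rho}^2 &= \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\tilde{\psi}_a\rangle (S_{MO}^{-1/2} O S_{MO}^{-1/2})_{ab} \langle \tilde{\psi}_b| \right) \sum_{a',b'=1}^{N_{\mathrm{orb}}} \left( |\tilde{\psi}_{a'}\rangle (S_{MO}^{-1/2} O S_{MO}^{-1/2})_{a'b'} \langle \tilde{\psi}_{b'}| \right) \\ &= \sum_{a,b=1}^{N_{\mathrm{orb}}} \sum_{a',b'=1}^{N_{\mathrm{orb}}} |\tilde{\psi}_a\rangle (S_{MO}^{-1/2} O S_{MO}^{-1/2})_{ab}\, S_{MO,ba'}\, (S_{MO}^{-1/2} O S_{MO}^{-1/2})_{a'b'} \langle \tilde{\psi}_{b'}| \\ &= \sum_{a,b'=1}^{N_{\mathrm{orb}}} |\tilde{\psi}_a\rangle (S_{MO}^{-1/2} O^2 S_{MO}^{-1/2})_{ab'} \langle \tilde{\psi}_{b'}| \\ &= \hat{\rho} \end{aligned} \tag{4.3.14a}$$

The normalization is

$$\begin{aligned} \mathrm{tr}(\hat{\rho}) &= \mathrm{tr}\left( \sum_{a,b=1}^{N_{\mathrm{orb}}} \left( |\tilde{\psi}_a\rangle (S_{MO}^{-1/2} O S_{MO}^{-1/2})_{ab} \langle \tilde{\psi}_b| \right) \right) \\ &= \sum_{a,b=1}^{N_{\mathrm{orb}}} \sum_{a',b'=1}^{N_{\mathrm{orb}}} \left( \langle \tilde{\psi}_{a'} | \tilde{\psi}_a \rangle (S_{MO}^{-1/2} O S_{MO}^{-1/2})_{ab} \langle \tilde{\psi}_b | \tilde{\psi}_{b'} \rangle \right) \\ &= \sum_{a',b'=1}^{N_{\mathrm{orb}}} (S_{MO} S_{MO}^{-1/2} O S_{MO}^{-1/2} S_{MO})_{a'b'} \\ &= \sum_{a',b'=1}^{N_{\mathrm{orb}}} (S_{MO}^{1/2} O S_{MO}^{1/2})_{a'b'} \\ &= N_{\mathrm{occ}} \end{aligned} \tag{4.3.14b}$$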

The hermiticity, eq 4.3.4a, is trivial.

4.3.2. Wave Function or Density Matrix as Dynamical Variables. One of the first approaches to solving the electronic structure problem via a direct minimization of the total energy with respect to orbitals was developed by Car and Parrinello in 1985.591 Their approach is based on a simple classical mechanical idea of an extended Lagrangian and treatment of the MOs, $|\psi_i\rangle$, and their time derivatives, $|\dot{\psi}_i\rangle$, as coupled dynamical variables. The Lagrangian is

$$L = K - E + \sum_{i,j} \Lambda_{ij} \left( \langle \psi_i | \psi_j \rangle - \delta_{ij} \right) \tag{4.3.15}$$

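where the last term imposes the orthonormalization restriction on the orbitals. K and E are the kinetic and potential energies, respectively:

$$K = \frac{1}{2} \sum_n M_n \dot{R}_n^2 + \frac{1}{2} \sum_i \mu_i \langle \dot{\psi}_i | \dot{\psi}_i \rangle \tag{4.3.16}$$

$$E = -\frac{1}{2} \sum_i \langle \psi_i | \nabla^2 | \psi_i \rangle + U[\rho] \tag{4.3.17}$$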

where $\mu_i$ is the fictitious mass associated with the orbital $|\psi_i\rangle$, $M_n$ is the mass of the nth nucleus, $\dot{R}_n$ is the velocity of the nth nucleus, and U[ρ] is the electronic DFT potential energy that includes all contributions other than the kinetic energy. The Lagrangian, eq 4.3.15, generates coupled dynamics of ionic particles (nuclei) and fictitious electronic degrees of freedom, represented by orbitals:

$$M_n \ddot{R}_n = -\nabla_{R_n} E \tag{4.3.18a}$$

$$\mu_i \ddot{\psi}_i = -\frac{\delta E}{\delta \psi_i^{*}} + \sum_j \Lambda_{ij} \psi_j \tag{4.3.18b}$$

Both electrons and nuclei are treated on the same footing. Using eqs 4.3.18, one performs the simulated annealing MD calculations that bring the kinetic energy to a finite temperature. In the limit of zero temperature, the time derivative of orbital velocity approaches zero, $\ddot{\psi}_i = 0$, meaning that the stationary state is found and that the variational procedure converged to a solution given by the matrix diagonalization. In addition, the dynamics converges to the solution that corresponds to orthonormalized orbitals. The expensive matrix diagonalization procedure is substituted with the much more efficient simulated annealing dynamics, allowing modeling of large molecular or solid-state systems.
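The idea of replacing diagonalization by fictitious dynamics can be illustrated with a deliberately reduced toy model. The sketch assumes a single orbital expanded in an orthonormal basis, fixed nuclei, a fixed real symmetric Hamiltonian matrix (so that E = cᵀHc), and a simple friction term in place of a simulated annealing schedule; the normalization constraint is enforced by projection rather than by an explicit Lagrange multiplier update. None of these choices is taken from the original work, and the function name is hypothetical.

```python
import numpy as np

def fictitious_dynamics(h, c0, mu=1.0, dt=0.05, gamma=1.0, steps=2000):
    """Damped second-order dynamics of orbital coefficients c on the unit sphere."""
    c = c0 / np.linalg.norm(c0)
    v = np.zeros_like(c)
    for _ in range(steps):
        grad = 2.0 * h @ c                   # dE/dc for E = c^T H c
        lam = c @ grad                       # multiplier keeping the force tangent
        force = -(grad - lam * c)
        v = (1.0 - gamma * dt) * v + dt * force / mu   # friction plus acceleration
        v -= (v @ c) * c                     # remove the component along c
        c = c + dt * v
        c /= np.linalg.norm(c)               # re-impose normalization
    return c, c @ h @ c
```

For a starting vector with nonzero overlap with the ground state, the trajectory relaxes to the lowest eigenvector, and the final energy matches the lowest eigenvalue that a diagonalization would give. A smaller fictitious mass speeds up the electronic motion but requires a smaller time step for stability.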


The simulated annealing may require a large number of steps, which is one of the difficulties associated with the Car−Parrinello (CP) dynamics. Because the fictitious mass of the orbital, $\mu$, is rather small, the integration time step may need to be reduced considerably to obtain stable dynamics. Both factors may, in principle, become obstacles and lead to a smaller-than-expected performance gain. Yet another difficulty is the possibility of converging to one of the local minima. This becomes more probable as the size of the system increases and the number of shallow local minima grows. Sufficiently slow simulated annealing cooling protocols may help to avoid the problem, but at the cost of increased computational time.

Further improvements of the CPMD method were reported soon after the original work. For example, Marx et al.592,593 developed extensions that add nuclear quantum effects via path integrals. Doltsinis and Marx594 extended the method to nonadiabatic dynamics by combining it with the restricted open-shell KS (ROKS) method595 and Tully’s fewest switches surface hopping algorithm.566

To allow even further improvement in computational efficiency, Galli and Parrinello reformulated the method in a nonorthogonal basis.596 The use of a nonorthogonal basis is needed for localization of orbitals and separation of distant localization regions, to yield a linear-scaling approach. This generalization has several important points. First, one introduces the auxiliary function, defined similarly to eq 4.3.10, but with a different power of the overlap matrix:

$$|\tilde{\psi}_i\rangle = \sum_j |\psi_j\rangle \left(S_{\mathrm{MO}}^{-1}\right)_{ji} \tag{4.3.19}$$

With the new function, the equation of motion for electronic variables, eq 4.3.18b, becomes

$$\mu_i \ddot{\psi}_i = -\frac{\delta E}{\delta \psi_i^*} \tag{4.3.20}$$

The resulting equation does not contain the orthonormalization term $\sum_j \Lambda_{ij}\psi_j$, because the orthonormalization is implicitly incorporated in the definition, eq 4.3.19.

Second, to avoid the need for the overlap matrix inversion in eq 4.3.19, an iterative scheme that computes the auxiliary functions in linear time is used:

$$|\tilde{\psi}_i^{(n+1)}\rangle = (1 - S_{\mathrm{MO}})\,|\tilde{\psi}_i^{(n)}\rangle + |\psi_i\rangle \tag{4.3.21}$$
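The fixed-point iteration of eq 4.3.21 can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the orbitals enter only through a small, made-up overlap matrix `S`, the vector `b` plays the role of the original orbital coefficients, and the function name `neumann_apply_inverse` is hypothetical. The iteration converges only if all eigenvalues of $I - S$ have magnitude below 1, that is, if the orbitals are reasonably close to orthonormal.

```python
def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def neumann_apply_inverse(s, b, n_iter=60):
    """Approximate x = S^{-1} b via x <- (I - S) x + b (cf. eq 4.3.21)."""
    x = list(b)  # zeroth-order guess, exact when S = I
    for _ in range(n_iter):
        sx = matvec(s, x)
        x = [xi - sxi + bi for xi, sxi, bi in zip(x, sx, b)]
    return x

# Overlap matrix of three nearly orthonormal, localized orbitals (assumed values).
S = [[1.00, 0.20, 0.05],
     [0.20, 1.00, 0.20],
     [0.05, 0.20, 1.00]]
b = [1.0, 0.0, -1.0]

x = neumann_apply_inverse(S, b)
residual = max(abs(r - bi) for r, bi in zip(matvec(S, x), b))
print(residual)
```

Only matrix-vector products with $S$ are required, so with a sparse overlap matrix each sweep costs a number of operations proportional to the system size.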

Finally, to force the MO localization in different spatial regions, a nonlocal potential is added to the Hamiltonian. The potential takes a form similar to that of the core pseudopotentials common to DFT:

$$V = \sum_i |\psi_i\rangle V_{i,S} \langle\psi_i| \tag{4.3.22}$$

$$V_{i,S} = A\,[1 - \theta(\sigma - |\mathbf{r} - \mathbf{R}_S|)] \tag{4.3.23}$$

where $A$ is the strength of the potential, $\theta(\cdot)$ is the step function, and $\mathbf{R}_S$ and $\sigma$ are the center and the radius of the localization region $S$, respectively.

More recently, an ab initio density matrix propagation (ADMP) method was formulated. It is similar to the CPMD, but uses the density matrix elements as the main dynamical variables.597−599

4.3.3. Orbital Minimization (OM) Methods. The OM methods are very closely related to the CP and similar methods. In contrast to the CP method, a strict optimization is performed for each single-point calculation. The basic formulations were given by Mauri et al.600,601 and Ordejon et al.602,603 The method directly minimizes the energy by propagating the MO expansion coefficients using standard techniques, such as the conjugate-gradient optimization.

The derivation of the OM methods can be systematized using the notation introduced in section 4.3.1. The energy of the system is given in the MO basis by

$$E_{\mathrm{el}} = \mathrm{tr}(P_{\mathrm{MO}} H_{\mathrm{MO}}) \tag{4.3.24}$$

where $P_{\mathrm{MO}}$ is the density matrix, and $H_{\mathrm{MO}}$ is the Hamiltonian matrix, both given in the MO basis. In the general case of nonorthonormal MOs, in the zero-temperature limit, the density matrix is given by the inverse of the MO overlap matrix, eq 4.3.12c. Thus, the energy is given by

$$E_{\mathrm{el}} = \mathrm{tr}(S_{\mathrm{MO}}^{-1} H_{\mathrm{MO}}) \tag{4.3.25}$$

The true solution corresponds to a set of orthonormal MOs, in which case the overlap matrix is the identity matrix. The orthonormalization of the orbitals is essentially equivalent to the matrix diagonalization and should be avoided if a linear-scaling method is to be formulated. To mitigate the need for the matrix inversion, Mauri et al.600,601 approximated the inverse of the overlap matrix with its Taylor expansion terminated at an odd power:

$$S_{\mathrm{MO}}^{-1} = (I - (I - S_{\mathrm{MO}}))^{-1} \approx I + (I - S_{\mathrm{MO}}) + (I - S_{\mathrm{MO}})^2 + \ldots \tag{4.3.26}$$

With only the first-order terms, eq 4.3.25 becomes the grand potential:

$$\Omega = \mathrm{tr}(I H_{\mathrm{MO}} + (I - S_{\mathrm{MO}}) H_{\mathrm{MO}}) \tag{4.3.27}$$

Alternatively, the orthonormality constraint can be imposed via the Lagrange multipliers method, for example as proposed by Ordejon et al.602,603 In this case, the grand potential is

$$\Omega = \mathrm{tr}(H_{\mathrm{MO}} - \Lambda (S_{\mathrm{MO}} - I)) \tag{4.3.28}$$

The stationary solution

$$\frac{\partial \Omega}{\partial \Lambda} = 0 \tag{4.3.29}$$

corresponds to the orthonormality: $S_{\mathrm{MO}} = I$. It is achieved for $\Lambda = H_{\mathrm{MO}}$. Thus, the grand potential for the unconstrained optimization is

$$\Omega = \mathrm{tr}(H_{\mathrm{MO}} - H_{\mathrm{MO}} (S_{\mathrm{MO}} - I)) \tag{4.3.30}$$
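The quality of the first-order functional can be checked on a small example. The sketch below assumes a hypothetical 2 × 2 model: a fixed matrix `H_MO` and an overlap matrix with a single off-diagonal element `delta`. It compares the exact energy of eq 4.3.25 with the first-order expression of eqs 4.3.27 and 4.3.30; the helper names are illustrative only.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def exact_energy(s, h):
    """tr(S^{-1} H), eq 4.3.25."""
    return trace(matmul(inverse_2x2(s), h))

def first_order_energy(s, h):
    """tr(H - H (S - I)) = tr((2I - S) H), eqs 4.3.27 and 4.3.30."""
    n = len(s)
    two_i_minus_s = [[(2.0 if i == j else 0.0) - s[i][j] for j in range(n)]
                     for i in range(n)]
    return trace(matmul(two_i_minus_s, h))

H_MO = [[-1.0, -0.3],
        [-0.3, -0.5]]  # assumed model Hamiltonian in the MO basis

def error(delta):
    s = [[1.0, delta], [delta, 1.0]]
    return abs(exact_energy(s, H_MO) - first_order_energy(s, H_MO))

ratio = error(0.10) / error(0.05)
print(ratio)  # roughly 4
```

Halving the deviation from orthonormality reduces the energy error by about a factor of 4, as expected for an expansion truncated after the first-order term.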

The result is exactly the same as that obtained by Mauri et al.600,601 using the first-order Taylor expansion, eq 4.3.27. The constraining parameters $\Lambda$ in the grand potential of the form given in eq 4.3.28 were used explicitly in the formulation of Wang and Teter.604 They argued that a relatively small magnitude of these parameters (larger than but comparable to the magnitude of the Hamiltonian itself) can be used to formulate an efficient gradient molecular optimization strategy for linear scaling.

The formulations of Mauri et al. and Ordejon et al. utilized only occupied orbitals. The extension to a more general basis, which can include unoccupied orbitals, was done by Kim et al.605 The extension can be derived using the general constructs and transformations introduced in section 4.3.1. The grand potential can be written in the general case simply as


$$\Omega = \mathrm{tr}((H - \varepsilon_F I)\rho) + \varepsilon_F N \tag{4.3.31}$$

where the last term accounts for the normalization condition of the density matrix, eq 4.3.4b.

4.3.4. Density Matrix Minimization (DMM) Methods. The DMM was formulated in 1993 independently by Li, Nunes, and Vanderbilt606 and by Daw.607 In the early works, the scheme was applied mostly to one-element systems at the tight-binding (TB) level of description, which already provided some computational advantages for large-scale calculations of the time. The use of simple tight-binding or semiempirical schemes seems to be a natural choice for the initial stages of developing a linear-scaling algorithm. The same trend appeared in the discussion of the fragmentation-based approaches. Such choices allow the researcher to focus on the algorithm itself rather than on details of the potential.

We illustrate the idea of the DMM methods using the early examples with the tight-binding Hamiltonians. The Fock matrix in eq 4.2.28b reduces to the core Hamiltonian under the tight-binding approximation. Consequently, the whole Hamiltonian is given by the core Hamiltonian, $H$, independent of the density matrix. The total energy is then simply

$$E_{\mathrm{el}} = \sum_{a,b=1}^{N_{\mathrm{orb}}} P_{ab} H_{ab} = \mathrm{tr}(PH) \tag{4.3.32}$$

One wishes to minimize the energy functional, eq 4.3.32, with respect to the density matrix. The minimization should maintain the basic properties of the density matrix. The hermiticity is satisfied by the definition of the density matrix. To satisfy the normalization constraint, one considers the unconstrained minimization of the extended (grand) energy:

$$\Omega = E_{\mathrm{el}} - \varepsilon_F N = \mathrm{tr}(P(H - \varepsilon_F I)) \tag{4.3.33}$$

Satisfying the idempotency, eq 4.3.4c, is the most difficult challenge. The idempotency condition imposes constraints on the populations of the occupied and unoccupied levels, given by the eigenvalues of the density matrix. Because the DMM does not utilize orbitals explicitly, the occupied and unoccupied states are defined as the energy levels below and above the Fermi energy, respectively. Disregarding the idempotency may allow the occupation numbers of the levels below the Fermi energy to approach $+\infty$, and the occupation numbers of the levels above the Fermi energy to approach $-\infty$.

A solution is given by the purification procedure described by McWeeny.608,609 Provided that a reasonable initial guess is used, such that the eigenvalues of the density matrix are confined to the interval $[-1/2, 3/2]$, the following purification procedure will converge the eigenvalues to 0 or 1:

$$\tilde{P} = 3P^2 - 2P^3 \tag{4.3.34}$$

where $\tilde{P}$ is the density matrix purified from $P$ by one iteration step. A few purification iterations are typically needed after each gradient minimization step. The authors argue that the purified density matrix represents a physically meaningful and converged quantity and that it can be used to compute the expectation value of an arbitrary operator, $A$:

$$\langle A \rangle = \mathrm{tr}(\tilde{P} A) \tag{4.3.35}$$
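The effect of repeated purification can be seen on a small matrix. The sketch below is illustrative only: the starting matrix `P` is a made-up, slightly perturbed projector whose eigenvalues lie within 0.2 of 0 or 1, and the function names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def mcweeny_step(p):
    """One purification step, 3 P^2 - 2 P^3 (eq 4.3.34)."""
    p2 = matmul(p, p)
    p3 = matmul(p2, p)
    n = len(p)
    return [[3.0 * p2[i][j] - 2.0 * p3[i][j] for j in range(n)] for i in range(n)]

def idempotency_error(p):
    """Largest element of |P^2 - P|; zero for an exact projector."""
    p2 = matmul(p, p)
    n = len(p)
    return max(abs(p2[i][j] - p[i][j]) for i in range(n) for j in range(n))

# Rank-one projector onto (1, 1, 0)/sqrt(2) plus a small symmetric perturbation.
P = [[0.6, 0.5, 0.1],
     [0.5, 0.4, 0.0],
     [0.1, 0.0, 0.1]]

errors = [idempotency_error(P)]
for _ in range(10):
    P = mcweeny_step(P)
    errors.append(idempotency_error(P))

trace_P = sum(P[i][i] for i in range(3))
print(errors[0], errors[-1], trace_P)
```

Because the map $x \to 3x^2 - 2x^3$ has vanishing slope at $x = 0$ and $x = 1$, the idempotency error shrinks quadratically, and the trace settles at the number of occupied levels (one in this example).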

The unpurified density matrix is regarded as an auxiliary (trial) variational parameter. With this assumption, the grand energy of the system, eq 4.3.33, is redefined:

$$\Omega = \mathrm{tr}(\tilde{P}(H - \varepsilon_F I)) = \mathrm{tr}((3P^2 - 2P^3)(H - \varepsilon_F I)) \tag{4.3.36}$$

Finally, the energy expression, eq 4.3.36, can be optimized using any standard minimization algorithm, such as conjugate gradient or steepest descent, using the gradient of the energy with respect to the variational degrees of freedom:

$$\frac{\delta \Omega}{\delta P} = 3(PH' + H'P) - 2(P^2 H' + PH'P + H'P^2) \tag{4.3.37}$$

$$H' = H - \varepsilon_F I \tag{4.3.38}$$
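The gradient of eq 4.3.37 can be verified against finite differences of the energy in eq 4.3.36. The following sketch assumes small, symmetric, made-up matrices `P` and `H_shifted` (the latter standing for $H'$ of eq 4.3.38); all helper names are hypothetical.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def lin(*terms):
    """Linear combination: lin((c1, A1), (c2, A2), ...) = c1*A1 + c2*A2 + ..."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

def grand_energy(p, h_shifted):
    """Omega = tr((3 P^2 - 2 P^3) H'), eq 4.3.36."""
    p2 = matmul(p, p)
    purified = lin((3.0, p2), (-2.0, matmul(p2, p)))
    return trace(matmul(purified, h_shifted))

def analytic_gradient(p, h_shifted):
    """3 (P H' + H' P) - 2 (P^2 H' + P H' P + H' P^2), eq 4.3.37."""
    p2 = matmul(p, p)
    return lin((3.0, matmul(p, h_shifted)), (3.0, matmul(h_shifted, p)),
               (-2.0, matmul(p2, h_shifted)),
               (-2.0, matmul(matmul(p, h_shifted), p)),
               (-2.0, matmul(h_shifted, p2)))

def numerical_gradient(p, h_shifted, step=1e-6):
    """Central finite differences with respect to each matrix element of P."""
    n = len(p)
    grad = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            plus = [row[:] for row in p]
            minus = [row[:] for row in p]
            plus[i][j] += step
            minus[i][j] -= step
            grad[i][j] = (grand_energy(plus, h_shifted)
                          - grand_energy(minus, h_shifted)) / (2.0 * step)
    return grad

P = [[0.8, 0.1], [0.1, 0.3]]            # trial density matrix (assumed values)
H_shifted = [[-0.9, 0.2], [0.2, 0.6]]   # H' = H - eps_F * I (assumed values)

g_exact = analytic_gradient(P, H_shifted)
g_num = numerical_gradient(P, H_shifted)
max_diff = max(abs(g_exact[i][j] - g_num[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The comparison relies on both matrices being symmetric, in which case the gradient matrix is symmetric as well and the element-wise derivative coincides with eq 4.3.37.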

The method is variational and can be used to compute energy derivatives with respect to arbitrary parameters $\lambda$ using the Hellmann−Feynman theorem:

$$\frac{d\Omega}{d\lambda} = \mathrm{tr}\left(\tilde{P}\,\frac{dH}{d\lambda}\right) \tag{4.3.39}$$

The same results are obtained by Daw607 from slightly different grounds. He started by considering a smeared step function of the Fermi−Dirac type:

$$\theta_\beta \equiv \theta(\beta, H') \equiv \frac{1}{1 + \exp(\beta H')} \tag{4.3.40}$$

where $\beta$ is the reciprocal temperature parameter. The derivative of the function $\theta(\beta, H')$ with respect to this parameter is expressed in terms of the function itself:

$$\frac{\partial \theta_\beta}{\partial \beta} = -H' \theta_\beta (1 - \theta_\beta) \tag{4.3.41}$$

The above expression can be integrated to obtain $\theta(\beta, H')$ starting from $\theta(0, H')$. The zero-temperature limit of the function in eq 4.3.40 is the density matrix:

$$\rho(r, r') = \lim_{\beta \to \infty} \theta_\beta \tag{4.3.42}$$

The result given by eq 4.3.42 populates the energy levels below the chemical potential (Fermi energy). Using eqs 4.3.41 and 4.3.42, one can obtain the equation of motion for the density matrix:

$$\frac{\partial}{\partial \beta} \rho(r, r') = -H' \rho(r, r')\,(1 - \rho(r, r')) \tag{4.3.43}$$

Symmetrization of eq 4.3.43 reproduces, up to a factor of $-1/6$, the gradient of eq 4.3.37, so that propagation in $\beta$ acts as a steepest-descent step:

$$\frac{\partial \rho}{\partial \beta} = -\frac{1}{2}(\rho H' + H' \rho) + \frac{1}{3}(\rho^2 H' + \rho H' \rho + H' \rho^2) \tag{4.3.44}$$

A simplified version of the DMM was developed by Challacombe.610 The method uses several direct minimization steps, followed by the McWeeny purification procedure. The latter shows quadratic convergence and may help to accelerate computations. The purification procedure requires a smaller number of floating point operations and can be more efficient than a direct minimization. Very recently, different purification schemes were benchmarked and also compared to the gradient minimization.611 The authors show that the purification procedure is much more efficient and can lead to at least an order of magnitude acceleration of calculations. Similarly, Daniels and Scuseria612 found, using a different set of molecular systems containing up to 20 000 atoms, that purification of the density matrix gave the most efficient procedures. Unlike Rudberg and Rubensson,611 they found that the Fermi operator expansion method was somewhat slower than the conjugate gradient DMM.

Larsen et al.613 avoided the need for the purification in the DMM by using exponential parametrization of the atomic density matrix, a set of nonorthogonal projection operators, and a multilevel preconditioning during the optimization. A similar approach was used by Matsuoka et al.,614 who formulated a direct density matrix optimization procedure based on iterative unitary transformations. A proper preconditioning is especially important when the basis set is large, which makes the overlap matrices ill-conditioned. Mostofi et al.615 showed that this problem is related to the kinetic energy contribution, and they proposed a correction scheme.

Recently, Sałek et al.616 used the preconditioned conjugate gradient optimization of the density matrix to obtain linear-scaling HF and KS-DFT implementations. The method was further extended by augmenting it with the linear-response theory, to compute excited state properties.617 The frequency-dependent polarizabilities were computed for polyalanine peptides with up to 1400 atoms, using hybrid functionals.

One may observe that the method avoids expensive matrix diagonalization procedures. However, matrix multiplication is also an expensive operation that scales cubically with the matrix size. Thus, by itself the DMM method provides no advantage over the standard methods in terms of computational efficiency. To make the approach linear scaling, one must utilize a distance cutoff for the density matrix elements and sparse matrix techniques. The cutoff sets matrix elements to zero if the corresponding orbitals are separated by a distance larger than a predefined cutoff value. This makes the density and Hamiltonian matrices sparse, especially for large systems. The matrix information is then stored in a compact and convenient form to reduce the memory requirements and the CPU time needed to multiply matrices.

Another complication is associated with density matrix truncation beyond the cutoff distance. The normalization condition, eq 4.3.4b, may be violated as a result of the truncation. This error is usually small for semiconductors with a large band gap, but may be critical for metallic systems. Qiu et al.618 suggested modifying the chemical potential (Fermi energy) to account for the changed number of electrons. They performed a two-step optimization, alternating the conjugate gradient optimization of the density matrix and the Fermi energy adjustment. The method was utilized to perform molecular dynamics studies of a large carbon-containing system. The authors also suggested utilization of the trial density matrix, $P$, as a dynamical variable within the Car−Parrinello extended Lagrangian method. The DMM method was generalized to a nonorthogonal basis in later works.619 The derivations can be put in the framework introduced in section 4.3.1 of the present review.

Millam and Scuseria620 and Daniels and Scuseria71 performed the first applications of the DMM method to very large chemical systems. The approaches utilized semiempirical model Hamiltonians. Minimization was performed with the conjugate gradient technique. The approach was applied initially to a polyglycine chain of about 500 atoms and a water cluster of about 900 atoms.620 Even larger examples were demonstrated later, including a 20 000 atom linear polyglycine chain, an RNA molecule containing ca. 6300 atoms, and an 1800-water cluster.71 The authors investigated different cutoff schemes and found that a soft cutoff allows one to obtain more accurate results and to avoid unphysical oscillations of the system energy with respect to the cutoff distance. However, the soft cutoff scheme leads to quadratic scaling because pairwise electrostatic interactions must be accounted for. In further work, Li et al.621 developed a quasi-Newton alternative to the conjugate gradient version. It relied on the direct inversion in iterative space (DIIS) method and was found to be 30% faster than the conjugate gradient version when applied to polyglycine chains containing 10−100 residues. Simulations of a 300 molecule water cluster showed better convergence.

Although the DMM with density matrix truncation is a linear-scaling technique, the computational costs increase if a very large basis is used. One needs to reduce the number of basis states, or the dimensionality of the density matrix. This is achieved in the group of methods that can be referred to as the optimal basis density matrix minimization (OBDMM) group, according to Goedecker’s classification. For instance, Hernandez et al.622,623 simultaneously optimized both the expansion coefficients and the basis set for representation of the density matrix. Independently, Hierse and Stechel624 developed a similar approach. The adaptive strategy was used by Berghold et al.,625 who combined the polarized atomic orbital (PAO) method, needed to reduce a large basis set to a smaller basis, with the DMM approach to achieve linear scaling. The authors also compared the performance of different linear-scaling techniques and showed that efficiency increases in the sequence DMM < FOE < canonical purification of the density matrix.626

The DMM method was implemented in a number of codes. For example, Rudberg et al.627 implemented it in the Ergo code and showed linear-scaling performance for systems with up to 300 000 basis functions with hybrid DFT methods. Hine et al.628 used their modified version in the ONETEP code and reported simulations with over 10 000 atoms. Challacombe et al.610 presented the FreeON (formerly MondoSCF) code.

4.3.5. Fermi Operator Expansion (FOE) Methods. The FOE and FOP methods developed by Goedecker et al.590,629 constitute yet another group of methods. Similar to the DMM approaches, the FOE and FOP work directly with the density matrix. However, a projection step is used instead of the matrix minimization. This eliminates the DMM and OM problems associated with multiple local minima.

The method relies on a representation of the density matrix by a finite-temperature Fermi matrix:

$$F_{\varepsilon_F,\beta} = f(\beta(H - \varepsilon_F I)) \tag{4.3.45}$$

where $f(\cdot)$ is the Fermi−Dirac distribution function, eq 2.1.49b. The function of the matrix is defined via its series expansion. Because eq 4.3.45 is a finite-temperature approximation of the density matrix (e.g., see eq 4.3.42), the electronic energy is given by

$$E_{\mathrm{el}} = \mathrm{tr}(F_{\varepsilon_F,\beta} H) = \sum_{i,j} \langle\psi_i|F_{\varepsilon_F,\beta}|\psi_j\rangle \langle\psi_j|H|\psi_i\rangle \tag{4.3.46}$$

If $\{\psi\}$ is the basis of eigenfunctions of the system Hamiltonian, $H$, the matrix $H$ is diagonal with elements:

$$\langle\psi_i|H|\psi_j\rangle = \varepsilon_i \delta_{ij} \tag{4.3.47}$$

As a result, the matrix $F_{\varepsilon_F,\beta}$ is also diagonal in this basis:

$$\langle\psi_i|F_{\varepsilon_F,\beta}|\psi_j\rangle = f(\beta(\varepsilon_i - \varepsilon_F))\,\delta_{ij} \tag{4.3.48}$$


The Fermi matrix is approximated in an arbitrary basis by a matrix polynomial, $p(H)$, of the Hamiltonian matrix, $H$:

$$F_{\varepsilon_F,\beta} \approx p(H; \varepsilon_F, \beta) = \sum_{i=0}^{N_{\mathrm{pl}}} c_i P_i(H) \tag{4.3.49}$$

where $N_{\mathrm{pl}}$ is the degree of the polynomial expansion, $P_i(\cdot)$ is a basis polynomial, and $\{c_i\}$ are the expansion coefficients. $P_i(H) = H^i$ is the most trivial choice of the polynomial functions.590 Chebyshev polynomials were found to produce more stable results.629−631 The expansion coefficients are chosen to satisfy the expansion in any basis. From the practical point of view, the simplest choice of the coefficients is based on approximation of the Fermi matrix in the basis of eigenfunctions of the Hamiltonian, where it is simply the Fermi−Dirac distribution function, eq 4.3.48. The Fermi−Dirac function is approximated in the interval spanned by the lowest, $\varepsilon_{\min}$, and highest, $\varepsilon_{\max}$, eigenvalues of the system Hamiltonian. The Fermi−Dirac function must be approximated accurately only in those regions that show a non-negligible density of states. In particular, the Fermi−Dirac function may be rather arbitrary in the gap region of large band gap semiconductors. Therefore, simplified functions such as629

$$f(\varepsilon) = \frac{1}{2}\left(1 - \mathrm{erf}\left(\frac{\varepsilon - \varepsilon_F}{\Delta\varepsilon_{\mathrm{gap}}}\right)\right) \tag{4.3.50}$$

can be used to derive expansion coefficients.

Finally, the above calculations require knowledge of the properties $\varepsilon_{\min}$, $\varepsilon_{\max}$, and $\varepsilon_F$. These are not computed directly, because that would require matrix diagonalization, but via auxiliary procedures.630,631 To achieve linear scaling in time, all matrix multiplications must utilize sparse matrix representation and algebra. The sparsity of the matrix follows from the decay properties of the off-diagonal elements, which are set to zero beyond some distance cutoff.
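A Chebyshev version of the expansion in eq 4.3.49 can be sketched in a few lines. The example below is an illustration under stated assumptions, not a production FOE code: it uses a small open tight-binding chain whose spectral bounds (−2 and 2 in units of the hopping) and eigenpairs are known analytically, dense matrices instead of sparse ones, and hypothetical function names. The analytic eigenpairs serve only to build a reference Fermi matrix for comparison.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def fermi(x):
    return 1.0 / (1.0 + math.exp(x))

def chebyshev_coefficients(func, degree, n_nodes=64):
    """Chebyshev-Gauss quadrature for the expansion coefficients on [-1, 1]."""
    thetas = [math.pi * (j + 0.5) / n_nodes for j in range(n_nodes)]
    values = [func(math.cos(t)) for t in thetas]
    coeffs = []
    for k in range(degree + 1):
        total = sum(v * math.cos(k * t) for v, t in zip(values, thetas))
        coeffs.append((1.0 if k == 0 else 2.0) * total / n_nodes)
    return coeffs

def fermi_matrix_chebyshev(h, e_min, e_max, e_fermi, beta, degree=40):
    """Approximate f(beta (H - e_fermi I)) by a Chebyshev matrix polynomial."""
    n = len(h)
    center = 0.5 * (e_max + e_min)
    half_width = 0.5 * (e_max - e_min)
    # Map the spectrum of H onto [-1, 1], where Chebyshev polynomials are defined.
    hs = [[(h[i][j] - (center if i == j else 0.0)) / half_width for j in range(n)]
          for i in range(n)]
    coeffs = chebyshev_coefficients(
        lambda x: fermi(beta * (half_width * x + center - e_fermi)), degree)
    t_prev = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # T_0
    t_curr = hs                                                              # T_1
    result = [[coeffs[0] * t_prev[i][j] + coeffs[1] * t_curr[i][j] for j in range(n)]
              for i in range(n)]
    for c in coeffs[2:]:
        prod = matmul(hs, t_curr)
        # Recurrence T_{k+1} = 2 H_s T_k - T_{k-1}
        t_next = [[2.0 * prod[i][j] - t_prev[i][j] for j in range(n)] for i in range(n)]
        result = [[result[i][j] + c * t_next[i][j] for j in range(n)] for i in range(n)]
        t_prev, t_curr = t_curr, t_next
    return result

n_sites, beta = 8, 2.0
H = [[-1.0 if abs(i - j) == 1 else 0.0 for j in range(n_sites)] for i in range(n_sites)]
F_cheb = fermi_matrix_chebyshev(H, -2.0, 2.0, 0.0, beta)

# Reference Fermi matrix from the analytic eigenpairs of the open chain.
F_exact = [[0.0] * n_sites for _ in range(n_sites)]
for k in range(1, n_sites + 1):
    eps = -2.0 * math.cos(k * math.pi / (n_sites + 1))
    vec = [math.sqrt(2.0 / (n_sites + 1)) * math.sin((i + 1) * k * math.pi / (n_sites + 1))
           for i in range(n_sites)]
    occ = fermi(beta * eps)
    for i in range(n_sites):
        for j in range(n_sites):
            F_exact[i][j] += occ * vec[i] * vec[j]

max_error = max(abs(F_cheb[i][j] - F_exact[i][j])
                for i in range(n_sites) for j in range(n_sites))
n_electrons = sum(F_cheb[i][i] for i in range(n_sites))
print(max_error, n_electrons)
```

The expansion itself uses only matrix products with the scaled Hamiltonian. In a linear-scaling implementation these products would be carried out with sparse matrices truncated at a distance cutoff, which this dense sketch does not attempt.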

4.4. Quantal Force Fields

The methods classified into this group typically combine several strategies for faster computations, including those discussed in the previous sections. For this reason, we consider them separately, under a different level of classification. The main idea of the quantal force fields is to retain a quantum-mechanical description, at least for a part of the system, but to achieve this via a computationally efficient scheme. Often, a scheme can be formulated using classical force field terms, e.g., electrostatics and dispersion. The methods of this group can combine semiempirical, ab initio, or classical force field elements, more general physically based approximations, as well as fragmentation and other linear-scaling approaches.

In principle, any semiempirical method can already be considered a quantal force field, in the sense that it includes a handful of physically or computationally motivated terms. The language of classical force fields, such as partial atomic charges, dipole moments, and dispersion, is also used in some of the methods discussed above. For example, the MNDO utilizes the multipole expansion for more efficient calculation of two-electron integrals. Various core repulsion terms, accounting for dispersion interactions, were introduced in the EHT and PMn methods.

From the alternative viewpoint, the quantal force fields can be considered derivatives of the classical ones. The main focus in such generalizations is put on a description of electrostatic interactions, including polarization. We have already discussed some early schemes to account for such effects, including various charge and electronegativity equalization methods and fluctuating charge models. Those approaches can be considered coarse-grained versions, with no subatomic electrostatics resolved. The quantal force fields extend these methods to add subatomic details of the charge distribution via molecular orbitals.

4.4.1. Basics of QM/MM. 4.4.1.1. Energy Partitioning. The hybrid quantum mechanics/molecular mechanics (QM/MM) scheme is the earliest version of the quantal force fields. The distinction between the QM and MM regions is more definite here than in later force fields of this sort. The method was first introduced by Warshel and Levitt in 1976.632 The literature on this subject is very diverse and broad. We refer the reader to the general methodological review633 for more details on the method. Various specialized reviews discuss applications, including recent progress on the simulation of enzymatic and organic reactions with QM/MM,93,634,635 combinations of the QM/MM method with the DFTB and its variants,267,268,273 charge transfer in disordered organic semiconductors,636 and the role of electrostatics in polarizable force fields and QM/MM.133

Partitioning of a system into two regions, quantum and classical, constitutes the main idea of the QM/MM method. The total energy of the hybrid system is represented as

$$E = E_{\mathrm{QM}} + E_{\mathrm{MM}} + E_{\mathrm{QM/MM}} \tag{4.4.1}$$

where the terms on the right-hand side are the energies of the quantum mechanical and classical regions, and the energy of interaction between them, respectively. The former two energies are well-defined. The biggest challenge is to define the last term. Warshel and Levitt computed the latter by considering the diagonal elements of the fully quantum Fock matrix. It can be written explicitly in the zero differential overlap (ZDO) approximation:

$$F_{aa} = F_{aa,\mathrm{QM}} - \sum_{\substack{B \in \mathrm{MM} \\ B \neq A}} Q_B \gamma_{AB}, \quad a \in A \tag{4.4.2a}$$

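$$F_{aa,\mathrm{QM}} = U_{aa} + \frac{1}{2} P_{aa} \gamma_{aa} + \sum_{\substack{b \in A \\ b \neq a}} P_{bb} \gamma_{ab} - \sum_{\substack{B \in \mathrm{QM} \\ B \neq A}} Q_B \gamma_{AB} \tag{4.4.2b}$$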

where $F_{aa,\mathrm{QM}}$ describes only the contributions present in the isolated QM region. The second term in eq 4.4.2a describes the interaction of orbitals in the QM region with the classical point charges, $Q$, of the atoms belonging to the MM region.

Using eqs 4.4.2 as the starting ground, one develops an asymptotic expression for the Coulomb integrals, $\gamma_{AB} \approx 1/r_{AB}$, and adds the induced dipole interaction:

$$F_{aa,\mathrm{QM/MM}} = F_{aa,\mathrm{QM}} - \sum_{\substack{B \in \mathrm{MM} \\ B \neq A}} \frac{Q_B}{r_{AB}} + F_{\mathrm{ind},aa}, \quad a \in A \tag{4.4.3}$$

The inductive term, $F_{\mathrm{ind},aa}$, describes the self-consistent interaction of the induced dipoles in the MM region. The induced dipole on each atom $A$ is computed in this region as

$$\boldsymbol{\mu}_A = \alpha_A \mathbf{E}(\mathbf{R}_A) \tag{4.4.4a}$$

$$\mathbf{E}(\mathbf{R}_A) = -\sum_{B} \frac{Q_B}{R_{AB}^3} \mathbf{R}_{AB} - \sum_{\substack{B \in \mathrm{MM} \\ B \neq A}} \nabla \left( \frac{\boldsymbol{\mu}_B \cdot \mathbf{R}_{AB}}{R_{AB}^3} \right) \tag{4.4.4b}$$

where $\alpha$ is the atomic polarizability, and $\mathbf{E}(\mathbf{R}_A)$ is the local electric field at the location of atom $A$. The set of eqs 4.4.4 is solved iteratively for induced dipoles in the MM region. The contribution to the Fock matrix due to induced dipoles, $F_{\mathrm{ind},aa}$, can then be computed as

$$F_{\mathrm{ind},aa} = \sum_{\substack{B \in \mathrm{MM} \\ B \neq A}} \frac{\boldsymbol{\mu}_B \cdot \mathbf{R}_{AB}}{R_{AB}^3}, \quad a \in A \tag{4.4.5}$$
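The iterative solution for the induced dipoles can be sketched as follows. The code is a minimal illustration and not the Warshel−Levitt implementation: it assumes atomic units, the standard point-charge and point-dipole field expressions of classical electrostatics (whose sign convention for the interatomic vector may differ from that of eq 4.4.4b), a simple Jacobi-type update, and made-up charges, positions, and polarizabilities. All function names are hypothetical.

```python
import math

def field_from_charge(q, r_src, r_obs):
    """Field at r_obs from a point charge q at r_src (atomic units)."""
    d = [o - s for o, s in zip(r_obs, r_src)]
    r = math.sqrt(sum(c * c for c in d))
    return [q * c / r**3 for c in d]

def field_from_dipole(mu, r_src, r_obs):
    """Field at r_obs from a point dipole mu located at r_src."""
    d = [o - s for o, s in zip(r_obs, r_src)]
    r = math.sqrt(sum(c * c for c in d))
    mu_dot_d = sum(m * c for m, c in zip(mu, d))
    return [3.0 * mu_dot_d * c / r**5 - m / r**3 for m, c in zip(mu, d)]

def local_field(a, mus, charges, charge_pos, atom_pos):
    """Total field at polarizable atom a from all charges and the other dipoles."""
    field = [0.0, 0.0, 0.0]
    for q, rq in zip(charges, charge_pos):
        field = [f + e for f, e in zip(field, field_from_charge(q, rq, atom_pos[a]))]
    for b, mu in enumerate(mus):
        if b != a:
            contrib = field_from_dipole(mu, atom_pos[b], atom_pos[a])
            field = [f + e for f, e in zip(field, contrib)]
    return field

def induced_dipoles(charges, charge_pos, alphas, atom_pos, n_iter=50):
    """Iterate mu_A = alpha_A * E(R_A) (cf. eqs 4.4.4) to self-consistency."""
    mus = [[0.0, 0.0, 0.0] for _ in alphas]
    for _ in range(n_iter):
        mus = [[alphas[a] * f for f in local_field(a, mus, charges, charge_pos, atom_pos)]
               for a in range(len(alphas))]
    return mus

# One unit charge at the origin and two polarizable atoms on the x axis.
charges, charge_pos = [1.0], [(0.0, 0.0, 0.0)]
alphas = [1.5, 1.5]
atom_pos = [(2.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
mus = induced_dipoles(charges, charge_pos, alphas, atom_pos)

# Self-consistency check: each dipole equals alpha times the final local field.
residual = max(
    abs(m - alphas[a] * f)
    for a in range(len(alphas))
    for m, f in zip(mus[a], local_field(a, mus, charges, charge_pos, atom_pos)))
print(mus, residual)
```

The simple iteration converges when the polarizabilities are small compared with the cube of the interatomic distances, as in this example; closely spaced, strongly polarizable sites would require damping or a direct linear solve.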

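Finally, the total energy of interaction between the QM and MM regions is given by

$$E_{\mathrm{QM/MM}} = \sum_{\substack{A \in \mathrm{QM} \\ B \in \mathrm{MM}}} \frac{Q_A Q_B}{R_{AB}} + E_{\mathrm{disp}} + E_{\mathrm{ind}} \tag{4.4.6a}$$

$$E_{\mathrm{disp}} = \sum_{\substack{A \in \mathrm{QM} \\ B \in \mathrm{MM}}} \left( \frac{A_{AB}}{R_{AB}^{12}} - \frac{B_{AB}}{R_{AB}^{6}} \right) \tag{4.4.6b}$$

$$E_{\mathrm{ind}} = -\frac{1}{2} \sum_{\substack{A \in \mathrm{QM} \\ B \in \mathrm{MM}}} Q_A \frac{\boldsymbol{\mu}_B \cdot \mathbf{R}_{AB}}{R_{AB}^3} \tag{4.4.6c}$$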
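The pairwise part of the interaction energy is straightforward to evaluate. The sketch below covers only the Coulomb term of eq 4.4.6a and the dispersion term of eq 4.4.6b, in atomic units; the induction term is left out because it requires converged induced dipoles. The data layout, the coefficient tables `lj_a` and `lj_b`, and the function name are assumptions made for illustration.

```python
import math

def qmmm_pair_energies(qm_atoms, mm_atoms, lj_a, lj_b):
    """Coulomb and 12-6 dispersion energies between QM and MM atoms.

    qm_atoms, mm_atoms: lists of (charge, (x, y, z)) tuples.
    lj_a[i][j], lj_b[i][j]: coefficients A_AB and B_AB for QM atom i, MM atom j.
    """
    e_coul = 0.0
    e_disp = 0.0
    for i, (q_a, r_a) in enumerate(qm_atoms):
        for j, (q_b, r_b) in enumerate(mm_atoms):
            r = math.dist(r_a, r_b)
            e_coul += q_a * q_b / r                               # eq 4.4.6a, first term
            e_disp += lj_a[i][j] / r**12 - lj_b[i][j] / r**6      # eq 4.4.6b
    return e_coul, e_disp

# One QM atom and one MM atom separated by 2 length units (assumed values).
qm_atoms = [(1.0, (0.0, 0.0, 0.0))]
mm_atoms = [(-1.0, (2.0, 0.0, 0.0))]
e_coul, e_disp = qmmm_pair_energies(qm_atoms, mm_atoms, [[1.0]], [[1.0]])
print(e_coul, e_disp)
```

The double loop scales with the product of the numbers of QM and MM atoms, which is modest as long as the QM region is small.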

The dispersion term, eq 4.4.6b, is introduced empirically, similarly to how it is treated in a standard classical force field model.

4.4.1.2. Boundary Treatment and Strictly Localized Orbitals. The treatment of the boundary region between the QM and MM subsystems is one of the most difficult parts of the QM/MM methodology. Although the basic electrostatic coupling scheme presented above is often accurate for distant parts of the system and weak coupling, it may be insufficient for a description of covalent interactions. Particular care must be taken with the orbitals of the QM region in this case.

One of the main approaches to handling the difficulties associated with partitioning the QM and MM regions across a covalent bond is to use strictly localized MOs (SLMOs). The concept of SLMOs dates back to the early 1950s.637−639 In the early 1980s, Naray-Szabo proposed to utilize SLMOs for calculations of large systems.640 The simplest set of strictly localized MOs can be constructed on the basis of conventional chemical concepts of bond types. Namely, the lone pair, two-center σ-bond, and many-center π-orbitals can be constructed from atomic orbitals as

$$\psi_{\mathrm{LP},l} = \chi_{A,l} \tag{4.4.7a}$$

$$\psi_{\sigma,l} = c_A \chi_{A,l} + c_B \chi_{B,l} \tag{4.4.7b}$$

$$\psi_{\pi,l} = \sum_A c_A (p_z)_{A,l} \tag{4.4.7c}$$

The SLMO approach is very similar to the LMO method described in section 4.2.3.2. It is also used implicitly in the ELG method, when fragment localization is performed. We refer the reader to the corresponding sections as well as to the original literature for the details of the method. It is important to note for our purpose that the method serves as the backbone of an efficient computational strategy for calculations on large molecular systems. It was utilized successfully for computing molecular electrostatic potentials from the bond fragments in the original formulations.641−646 The idea gained the most popularity in the QM/MM context, especially for treating the boundary of the quantum and classical regions.

The fragment self-consistent field (FSCF) method is a special version of the SLMO approach.640,647−649 The FSCF is implemented in the Sybyl program by the Naray-Szabo group using the NDDO Hamiltonian for quantum systems. Applications included enzyme reactions,650,651 amorphous solid systems,652−654 and liquid phases.655

The FSCF method is formulated as follows. First, the system is divided into three parts: core (C), polarizable (P), and nonpolarizable (N) regions. The total energy is expressed as

$$E_{\mathrm{tot}} = E_{\mathrm{el}} + \frac{1}{2} \sum_{A,B} \frac{Q_A Q_B}{R_{AB}} \tag{4.4.8}$$

where the summation in the second term includes the classical electrostatic interaction of atomic cores (in the C and P regions) and point charges (in the N region). The first term represents the electronic energy of the quantum part (the C and P regions). It is decomposed as

$$E_{\mathrm{el}} = E_C + E_P + E_{C-P} \tag{4.4.9}$$

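Keeping in mind the definitions eqs 2.1.28−2.1.30, the electronic energy of the core (QM) region is given by

$$\begin{aligned}
E_C &= \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} P_{ab} H_{ab} + \frac{1}{2} \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} \left[ P_{\alpha,ab} G_{\alpha,ab} + P_{\beta,ab} G_{\beta,ab} \right] \\
&= \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} P_{ab} H_{ab} + \frac{1}{2} \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} \sum_{\substack{c,d=1 \\ c,d \in C}}^{N_{\mathrm{AO}}} \left[ P_{\alpha,ab} \left( P_{cd} (ab|cd) - P_{\alpha,cd} (ad|cb) \right) + P_{\beta,ab} \left( P_{cd} (ab|cd) - P_{\beta,cd} (ad|cb) \right) \right] \\
&= \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} P_{ab} H_{ab} + \frac{1}{2} \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} \sum_{\substack{c,d=1 \\ c,d \in C}}^{N_{\mathrm{AO}}} \left[ P_{ab} P_{cd} (ab|cd) - \left( P_{\alpha,ab} P_{\alpha,cd} + P_{\beta,ab} P_{\beta,cd} \right) (ad|cb) \right]
\end{aligned} \tag{4.4.10}$$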

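The expression simplifies for closed-shell systems, $P_\sigma = (1/2)P$, $\sigma = \alpha, \beta$:

$$E_C = \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} P_{ab} H_{ab} + \frac{1}{2} \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} \sum_{\substack{c,d=1 \\ c,d \in C}}^{N_{\mathrm{AO}}} P_{ab} P_{cd} (ab||cd) \tag{4.4.11a}$$

with

$$(ab||cd) = \left[ (ab|cd) - \frac{1}{2} (ad|cb) \right] \tag{4.4.11b}$$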

The core integrals, $H_{ab}$, $a, b \in C$, include interactions with all atomic cores and point charges in the system. The energy of the polarizable region is computed by assuming that both the core one-electron Hamiltonian and the density matrix have a block-diagonal structure, with nonzero elements only for the hybrid AOs belonging to the same SLMO:

$$E_P = \sum_{\alpha \in P} \sum_{\substack{a,b=1 \\ a,b \in \alpha}}^{N_{\mathrm{AO}}} P_{ab} H_{ab} + \frac{1}{2} \sum_{\alpha,\beta \in P} \sum_{\substack{a,b=1 \\ a,b \in \alpha}}^{N_{\mathrm{AO}}} \sum_{\substack{c,d=1 \\ c,d \in \beta}}^{N_{\mathrm{AO}}} P_{ab} P_{cd} (ab||cd) \tag{4.4.12}$$


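Finally, the interaction between the two regions is described by

$$E_{C-P} = \frac{1}{2} \sum_{\substack{a,b=1 \\ a,b \in C}}^{N_{\mathrm{AO}}} \sum_{\alpha \in P} \sum_{\substack{c,d=1 \\ c,d \in \alpha}}^{N_{\mathrm{AO}}} P_{ab} P_{cd} (ab||cd) \tag{4.4.13}$$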

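Variation of the total energy yields the Fock matrix of the form:

$$F_{ab} = H_{ab} + \frac{1}{2} \sum_{\alpha \in P} \sum_{\substack{a,b=1 \\ a,b \in \alpha}}^{N_{\mathrm{AO}}} \sum_{\substack{c,d=1 \\ c,d \in \beta}}^{N_{\mathrm{AO}}} P_{ab} P_{cd} (ab||cd) \tag{4.4.14}$$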

The method evolved into the local self-consistent field (LSCF) approach656 and was later used as a basis to construct a hybrid classical quantum force field (CQFF).657 Both works applied the novel technique to study proton transfer in biological systems. The approach was implemented in the GEOMOP package. The methodology was developed by the group of Jean-Louis Rivail and is currently being advanced by the Xavier Assfeld group. A detailed recent account of the intricacies of the methodology is available.658

In the LSCF method, the entire system is divided into two regions: the quantum subsystem, S, to be treated quantum mechanically, and its environment, E, described with the classical force field (Figure 22). The main difficulty arises at the interface of the two regions. The interface is typically chosen via one or multiple single bonds X−Y. The atom X belonging to the quantum system is called a frontier atom. The AOs localized on this atom are assumed to consist of one s and three p-type functions. Hybridization of the four AOs yields four localized hybrid MOs. One of these MOs, |l⟩, is then frozen and is rotated to align with the vector connecting the two atoms, to represent the bonding orbital along the X−Y bond. The other three orbitals, |i⟩, |j⟩, and |k⟩, are used as the basis for variational calculations of the quantum subsystem S. The quantum subsystem has to be orthogonal to the frozen orbital |l⟩. In general, there may be an arbitrary number, f, of distinct contacts between the MM and QM regions. Thus, if the total number of orbitals in the full QM region is N, the variational subspace consists of N − f orbitals.

Similarly to the standard QM/MM scheme, the electrostatic contributions due to point charges in the MM system are included in the Fock matrix formation:

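$$F_{\mu\nu} = H_{\mu\nu} + \sum_{\lambda,\eta}^{S} P_{\lambda\eta} \left[ (\mu\nu|\lambda\eta) - \frac{1}{2} (\mu\eta|\lambda\nu) \right] + \sum_{l} P_{ll} \left[ (\mu\nu|ll) - \frac{1}{2} (\mu l|l\nu) \right] + \sum_{A \in E} Q_A (\mu\nu|s_A s_A) \tag{4.4.15}$$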

The total energy is then given by the sum of the subsystem, environment, and subsystem/environment interface terms:

$$E = E_S + E_E + E_{ES} \tag{4.4.16a}$$

$$E_S = \frac{1}{2} \sum_{\mu,\nu} P_{\mu\nu} \left[ H_{\mu\nu} + F_{\mu\nu} \right] + \sum_{\substack{A < A' \\ A,A' \in S}} Z_A Z_{A'} f_{AA'} \tag{4.4.16b}$$

$$E_E = \sum_{l} P_{ll} \left\{ \frac{1}{2} \sum_{l'} P_{l'l'} (ll|l'l') + \sum_{A \in E} Q_A (ll|s_A s_A) \right\} + E^{\mathrm{MM}}(E) \tag{4.4.16c}$$

$$E_{ES} = \sum_{A\in S} Z_A\left\{P_{ll}\,(ll|s_A s_A) + \sum_{A'\in E}\left[Z_{A'} f_{AA'} - P_{A'}\,(s_A s_A|s_{A'} s_{A'})\right]\right\} + E^{\mathrm{MM}}_{\mathrm{nonelec}}(E,S) \tag{4.4.16d}$$

Here, $Z_A$ is the number of valence electrons in the atom A (effective core charge), $Q_A$ is the partial charge of the atom A, $P_A$ is the total electronic population on the atom A, and $f_{AA'}$ is the empirical core−core repulsion function between the centers A and A′. Note that the QM-like terms stemming from the frozen orbital |l⟩ are included into the environment contribution, because these orbitals belong to the environment region by definition. The terms $E^{\mathrm{MM}}(E)$ and $E^{\mathrm{MM}}_{\mathrm{nonelec}}(E,S)$ are the pure molecular mechanics terms for the environment and environment/subsystem interface, respectively. The latter excludes all electrostatic terms, because they are already included via other expressions.

4.4.1.3. Huzinaga−Cantu Equation and Separability. The solution of the electronic problem for the QM part is chosen in the FSCF method such that the MOs are orthogonal to the frozen orbitals. This is achieved in the recent works by Ferenczy et al.659,660 via the Huzinaga−Cantu equation,661 which can be summarized as

$$\left(\hat{F} - \{\hat{F}, \hat{P}^{f}\}\right)\psi_i^{a} = \varepsilon_i^{a}\,\psi_i^{a} \tag{4.4.17}$$

where $\psi_i^{a}$ is the ith active orbital, and $\varepsilon_i^{a}$ is the corresponding eigenvalue. The anticommutator is

$$\{\hat{F}, \hat{P}^{f}\} = \hat{F}\hat{P}^{f} + \hat{P}^{f}\hat{F} \tag{4.4.18}$$

where $\hat{F}$ is the Fock operator, and $\hat{P}^{f}$ is the projector to the subspace spanned by the frozen orbitals:

$$\hat{P}^{f} = \sum_{i\in f} |\psi_i\rangle\langle\psi_i| \tag{4.4.19}$$

By projecting the above equations onto some (non-orthogonal) basis, the Huzinaga−Cantu equation can be written in the matrix representation:

$$\left[F - S P^{f} F - F P^{f} S\right] C^{a} = S\,C^{a} E^{a} \tag{4.4.20}$$

where $C^{a}$ are the coefficients of active orbitals, S is the overlap matrix in the chosen basis, and $P^{f} = C^{f}(C^{f})^{+}$ is the projector matrix (density matrix in the subspace of frozen orbitals).
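The matrix form of the Huzinaga−Cantu equation can be tried out numerically. The following is a minimal sketch, assuming real symmetric F and S, frozen-orbital coefficients that are orthonormal in the S metric, and NumPy as the linear algebra library; the function names `s_orthonormalize` and `huzinaga_solve` are hypothetical and do not come from any of the packages cited here.

```python
import numpy as np


def s_orthonormalize(C, S):
    """Return columns spanning the same space as C with C^T S C = I."""
    G = C.T @ S @ C                      # Gram matrix in the S metric
    K = np.linalg.cholesky(G)            # G = K K^T
    return C @ np.linalg.inv(K).T


def huzinaga_solve(F, S, Cf):
    """Solve [F - S P F - F P S] C = S C E for real symmetric F and S.

    Cf holds the frozen-orbital coefficients as columns (Cf^T S Cf = I).
    Returns the eigenvalues E, the eigenvectors C (as columns), and the
    weight w of each eigenvector inside the frozen subspace
    (w = 0 for an active orbital, w = 1 for a frozen-space orbital).
    """
    P = Cf @ Cf.T                        # frozen-orbital projector matrix
    Fh = F - S @ P @ F - F @ P @ S       # modified Fock matrix of eq 4.4.20
    L = np.linalg.cholesky(S)            # S = L L^T
    Linv = np.linalg.inv(L)
    A = Linv @ Fh @ Linv.T               # reduce to a standard eigenproblem
    E, Y = np.linalg.eigh(0.5 * (A + A.T))
    C = Linv.T @ Y                       # back-transform so that C^T S C = I
    w = np.sum((Cf.T @ S @ C) ** 2, axis=0)
    return E, C, w
```

In an orthonormalized basis the modified matrix equals QFQ − PFP, with Q = 1 − P, so it is block diagonal in the frozen/active partition. Barring accidental degeneracies between the two blocks, every eigenvector then has w equal to 0 or 1, which is the "automatic" orthogonality the text describes.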

Figure 22. Schematic representation of the system partitioning into the QM and MM regions. The active and frozen orbitals are illustrated. |i⟩, |j⟩, and |k⟩ are the active orbitals; |l⟩ is the frozen orbital.

Chemical Reviews Review

DOI: 10.1021/cr500524cChem. Rev. 2015, 115, 5797−5890


The Huzinaga−Cantu equation661 and its generalizations and extensions662,663 appeared initially in the physics community and were used for construction of atomic pseudopotentials. The methodology is related to the earlier works of Fock, Wesselow, and Petrashen,664 McWeeny,608 and Szasz and McGinn665 on general separability in quantum mechanics. It is also related to the paper by Lykos and Parr666 on separability of sigma and pi electrons in molecular systems. Mutual orthogonality is one of the central assumptions of separability of two sets of orbitals. The two sets consisted of the core and valence electrons in the early atomic simulations. They are composed of the frozen and active orbitals in the QM/MM framework. The orthogonality of the orbitals belonging to the two subsets is one of the essential requirements for accurate partitioning of the system according to the QM/MM philosophy. The unconstrained solution of eq 4.4.17 automatically finds the active orbitals that are orthogonal to a predefined set of frozen orbitals. To achieve the orthogonality, one can include the frozen orbitals in the basis in which the active orbitals are to be constructed.

The QM/MM methodology is a particular kind of a more

general group of subsystem embedding methods.73,74,667 The latter are very similar to the D&C and FMO-like schemes, discussed in the previous sections. In the present context, we highlight the importance of separability for the generation of pseudopotentials, and for the QM/MM and subsystem DFT frameworks. In particular, an important analysis of separability of the electronic density is made by Hoffmann.668 He proved that the electron density of the whole system is representable as the sum of the densities of subsystems if and only if orbitals of different subsystems are orthogonal. The orthogonality of the subsystem orbitals can be automatically satisfied if the coupled KS equations are modified using projectors that transform the standard KS equations into the form similar to the Huzinaga−Cantu equation, eq 4.4.17.

4.4.2. Polarizable Force Fields. Polarizable force fields are

typically orbital-based force fields. They are also referred to as quantal force fields. The methods such as QM/MM can also be considered quantal force fields in a general context. To eliminate the ambiguity, we follow a more strict classification and call the group of methods discussed in this section polarizable force fields. The name “quantal” is used in the literature in the narrow sense. This narrow definition excludes numerous classical force fields that use quantum mechanical calculations only for parametrization.

The X-Pol model of Jiali Gao and collaborators669−671 is one of

the first and most popular orbital-based models used for construction of fully quantum force fields. The authors pioneered an effort to develop a fully quantal force field for simulation and modeling of materials, fluids, and biomacromolecules. The method was originally developed for the simulation of intermolecular interactions in liquid water.669 Further elaborations included addition of the torsional terms,670 variational formulation and analytic gradients,671 and generalization to include charge delocalization (GX-Pol).672 The X-Pol approach was applied to simulate large biomolecular systems. The optimization and MD simulation of the bovine pancreatic trypsin inhibitor (BPTI) in solvent included up to 30 000 basis functions and was among the largest scale simulations to date.673

The modern version of the method can be summarized as follows. A product of the wave functions of N fragments represents the total wave function of a system

$$\Psi = \prod_{A=1}^{N} \Phi_A \tag{4.4.21}$$

where $\Phi_A$ is the proper (antisymmetric) wave function of the Ath fragment, given by the usual Slater determinant of one-electron wave functions. The approximation, eq 4.4.21, violates the antisymmetry of the total wave function and, hence, ignores the exchange−correlation effects across the fragments. Exchange−correlation is included within each subsystem individually. The neglect of the exchange−correlation effects is compensated by empirical corrections and parameters introduced into the energy functional at later stages.

The Hamiltonian of the entire system is represented by the sum of the Hamiltonians of the individual isolated fragments, $H_A$, and the Hamiltonians of pairwise fragment interactions, $H_{AB}$:

$$H = \sum_{A=1}^{N} H_A + \frac{1}{2}\sum_{\substack{A,B=1 \\ A\neq B}}^{N} H_{AB} \tag{4.4.22}$$

The interfragment interaction Hamiltonian can be expressed in terms of the potential created by one fragment (e.g., B) and felt at the positions of the electrons and nuclei of the other fragment (e.g., A):

$$H_{AB} = -\sum_{i\in A} V_B(r_i) + \sum_{a\in A} Z_a V_B(R_a) + E_{\mathrm{vdW},AB} \tag{4.4.23}$$

where the first and second summations run over all electrons and all nuclei of the system A, respectively. The term $E_{\mathrm{vdW},AB}$ describes dispersive interactions between the pair of fragments A and B. The potential of the subsystem B is computed in the quantum-mechanical formulation from its wave function, $\Phi_B$:

$$V_B(X) = -\left\langle \Phi_B \left| \frac{1}{|X - r|} \right| \Phi_B \right\rangle + \sum_{b\in B} \frac{Z_b}{|X - R_b|} \tag{4.4.24}$$

The electronic structure calculations are performed for each fragment independently, significantly reducing the effort scaling. However, the Hamiltonian of each fragment depends on the potential created by all other fragments, and the potential depends on the wave functions of all fragments. Thus, a set of coupled equations is obtained. It should be solved iteratively, until self-consistency is reached. The principal scheme of the method is similar to that of the D&C algorithm. The scheme allows one to apply the method to large systems.

To accelerate the computations even further, the electrostatic potential, eq 4.4.24, is approximated by the classical multipole expansion. Only the monopole terms were used in the original version of Gao,669 to substitute eq 4.4.24 with

$$V_B(X) = -\sum_{i\in B} \frac{q_B}{|X - r_i|} + \sum_{b\in B} \frac{Z_b}{|X - R_b|} \tag{4.4.25}$$

where $q_B$ is the partial charge on the atom B. In addition, the empirical dispersion term was added to the total energy, to partially account for the exchange and correlation effects across molecules, not included in the wave function ansatz, eq 4.4.21.

Polarizable force fields are actively developed in the York group.81,82,674 To achieve a linear-scaling performance and to study large systems, the researchers modified the D&C scheme (mD&C) to treat electrostatic interactions between distant centers. Unlike the X-Pol, which is based on the NDDO-type Hamiltonians, the authors rely on the self-consistent version of the DFTB (DFTB3). The self-consistent DFTB allows them to


account for the effects of orbital overlap and exchange−correlation. The latter are introduced via the standard DFT exchange−correlation functionals. The method was used extensively in biomolecular applications. We refer the reader to the work of Giese et al.108 for a recent account on the progress in the method development.

In the mD&C method,81,82 the electrostatic coupling is represented via a multipole expansion of the density of each fragment. The lack of the direct coupling terms is accounted for by the carefully parametrized additional terms. Namely, the total system energy is given by

$$E = \sum_{A} E_A\!\left(C_A^{\sigma}, \{R\}_A\right) + \frac{1}{2}\sum_{a}\sum_{l,m} q_{l,m\in a}\, p_{l,m\in a} + E_{\mathrm{vdW}}(\{R\}) + E_{\mathrm{bond}}(\{R\}) \tag{4.4.26}$$

where $q_{l,m\in a}$ and $p_{l,m}$ are the multipole moment on the atom a and the multipole potential (tensor) at the position of the atom a, respectively. They are defined as

$$q_{l,m\in a} = Z_a\,\delta_{l0}\,\delta_{m0} - \int \mathrm{d}r\,\rho_a(r)\,C_{lm}(r - R_a) \tag{4.4.27a}$$

$$p_{lm\in a} = {\sum_{\substack{b\neq a \\ jk\in b}}}' q_{jk}\,\frac{C_{lm}(\nabla_a)}{(2l-1)!!}\,\frac{C_{jk}(\nabla_b)}{(2j-1)!!}\,\frac{1}{R_{ab}} \tag{4.4.27b}$$

where $C_{lm}(\cdot)$ is the real regular solid harmonic of the corresponding argument, and $Z_a$ is the effective charge of the nucleus a (including the charge of the core electrons).

The Fock matrix of the fragment A is defined by

$$F^{\sigma}_{A,ij} = \frac{\partial E_A}{\partial P^{\sigma}_{A,ij}} + \sum_{\substack{a\in A \\ lm\in a}} p_{lm}\,\frac{\partial q_{lm}}{\partial P^{\sigma}_{A,ij}} \tag{4.4.28}$$

with the fragment spin-resolved density matrix defined as usual:

$$P^{\sigma}_{A,ij} = \sum_{k} n^{\sigma}_{A,k}\, C^{\sigma}_{A,ki}\, C^{\sigma}_{A,kj} \tag{4.4.29}$$

The computational scheme is presented pictorially in Figure 23.
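The density matrix of eq 4.4.29 is the quantity each fragment passes back into the self-consistency loop, so it may help to see it written out. The sketch below assumes real MO coefficients stored with the orbital index first (`C[k, i]`), as the index order in eq 4.4.29 suggests; the function name `spin_density_matrix` is hypothetical.

```python
import numpy as np


def spin_density_matrix(C, n):
    """Build P_ij = sum_k n_k C_ki C_kj for one fragment and one spin channel.

    C : array of shape (n_orbitals, n_basis), row k holds orbital k
    n : occupation numbers of the orbitals, length n_orbitals
    """
    C = np.asarray(C, dtype=float)
    n = np.asarray(n, dtype=float)
    return np.einsum("k,ki,kj->ij", n, C, C)
```

With integer occupations and an orthonormal basis the result is symmetric and idempotent, and its trace equals the number of electrons in that spin channel.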

4.4.3. Effective Fragment Potential (EFP) Method. Finally, we discuss a coarse-grained version of the QM/MM and polarizable FF approaches. We illustrate coarse-graining with the effective fragment potential (EFP) method. The method was developed as a nonempirical alternative to QM/MM. The initial version (EFP1) approximated most of the energy contributions via a functional form and a parametrization derived quantum mechanically. It contained empirical terms to account for the exchange and repulsion energies. The terms were obtained to match the total energy of larger systems.675 The force-field nature of the method appears mostly in its construction: the functional form is chosen from physical considerations and may be reminiscent of those used in classical force fields. Unlike classical force fields, most of the parameters are derived from the electrostatic potential generated by a rigorous QM calculation, e.g., using multipole expansion, in a manner similar to the polarizable force fields discussed above. One can distinguish several variants depending on the level of electronic structure calculations used to generate the EFP1 potential. These include the Hartree−Fock,676 DFT,677 TD-DFT,678 and CIS.679

All energy contributions are derived nonempirically in the later generalizations of the method (EFP2).91,680,681 Further elaborations involved combination with the electronic correlation approaches, such as the EOM-CCSD, CASSCF, and MRMP2, for calculations of excited state properties, and studying solvent and polarization effects in large biomolecular systems.682−685

The EFP method is implemented in the GAMESS474 and Q-Chem681,686,687 software packages. We summarize the essentials

Figure 23. Schematic representation of the mD&C method. The X-Pol double loop SCF is represented by a similar scheme. See text for symbol definition.


of the EFP formulations. A detailed discussion and reviews can be found elsewhere.91,681,688

The energy of the system is described by the Hamiltonian of the form eq 4.4.22. The monomer energy is computed by the standard QM methods. The EFP focuses on the description of the interfragment interactions. The energy of such interaction is decomposed into the Coulomb, polarization, dispersion, and exchange−repulsion terms:

$$E_{AB} = E_{AB,\mathrm{Coul}} + E_{AB,\mathrm{pol}} + E_{AB,\mathrm{disp}} + E_{AB,\mathrm{ex-rep}} \tag{4.4.30}$$

The EFP method assumes that the interacting fragments are sufficiently distant from each other, such that perturbation theory can be applied. The second-order perturbation theory applied to the fragment dimer A−B yields

$$E_{AB} = E_{AB}^{(1)} + E_{AB}^{(2)} + \ldots \tag{4.4.31a}$$

$$E_{AB}^{(1)} = \langle \Phi_{A,0}\Phi_{B,0} | H_{AB} | \Phi_{A,0}\Phi_{B,0} \rangle \tag{4.4.31b}$$

$$E_{AB}^{(2)} = -\sum_{i,j} \frac{\langle \Phi_{A,0}\Phi_{B,0} | H_{AB} | \Phi_{A,i}\Phi_{B,j} \rangle \langle \Phi_{A,i}\Phi_{B,j} | H_{AB} | \Phi_{A,0}\Phi_{B,0} \rangle}{(E_{A,i} + E_{B,j}) - (E_{A,0} + E_{B,0})} \tag{4.4.31c}$$

where $\Phi_{X,k}$ is the kth excited determinant of the system X, and $E_{X,k}$ is the corresponding eigenvalue. The perturbation operator $H_{AB}$ describes the Coulombic interaction between the electrons and nuclei of the two subsystems, excluding those within each subsystem. The first-order term, eq 4.4.31b, describes the effects of the classical electrostatic potential present at one fragment due to all other fragments, $E_{AB,\mathrm{Coul}}$. The calculations are organized similarly to the X-Pol or mD&C methods. First, a set of point multipoles is computed using the wave function (or charge density) of the fragment, for example, with the help of Stone’s distributed multipole analysis.689 The analysis is valid for any interparticle separation, provided that the number of expansion terms is sufficient. To achieve high accuracy, the electrostatic fragment potentials should be constructed using expansions at both the atomic centers and the midbond points. The resulting expansion is similar to one in eqs 4.4.27. It modifies the elements of the one-electron core Hamiltonian.

The second-order term, eq 4.4.31c, describes polarization and dispersion, $E_{AB,\mathrm{pol}}$ and $E_{AB,\mathrm{disp}}$. Expanding the Coulomb potential from $H_{AB}$ in a series, one can arrive at an expression for the polarization energy in terms of the induced dipoles and electric fields at the distributed expansion points. The calculation of the induced dipoles utilizes the tensors of atomic polarizabilities and hyperpolarizabilities. All properties can be computed by a quantum mechanical method, either on the fly or in advance. The dispersion term is approximated by the polynomial

$$E_{AB,\mathrm{disp}} = -\sum_{n\geq 6} C_n R_{AB}^{-n} \tag{4.4.32}$$

where the coefficients $C_n$ can be derived from the frequency-dependent distributed polarizabilities.

The applicability of the perturbation theory becomes more problematic as the interacting fragments become closer and eventually overlap. The resulting discrepancy with the perturbative approach is corrected by addition of empirical exchange−repulsion terms, $E_{AB,\mathrm{ex-rep}}$. They were approximated by the parametrized exponential functions in the early formulations of the method:

$$E_{AB,\mathrm{ex-rep}} = \sum_{i} c_i \exp\!\left(-\alpha_i R_{AB}^{2}\right) \tag{4.4.33}$$
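The two closed-form pair terms, eqs 4.4.32 and 4.4.33, are simple enough to write down directly. The sketch below is only an illustration of their functional form for a single pair distance; the function names, the dictionary and list layouts, and any numerical parameter values are assumptions and are not taken from GAMESS, Q-Chem, or any published EFP parametrization.

```python
import math


def dispersion_energy(R, coeffs):
    """Eq 4.4.32: E_disp = -sum_{n>=6} C_n * R**(-n).

    R      : pair distance R_AB (must be positive)
    coeffs : dict mapping the power n (n >= 6) to the coefficient C_n
    """
    if R <= 0.0:
        raise ValueError("R must be positive")
    if any(n < 6 for n in coeffs):
        raise ValueError("only terms with n >= 6 enter the expansion")
    return -sum(c_n * R ** (-n) for n, c_n in coeffs.items())


def exchange_repulsion_energy(R, terms):
    """Eq 4.4.33: E_ex-rep = sum_i c_i * exp(-alpha_i * R**2).

    terms : iterable of (c_i, alpha_i) pairs
    """
    return sum(c * math.exp(-alpha * R * R) for c, alpha in terms)
```

For positive coefficients the dispersion term is attractive and decays as a power law, while the Gaussian repulsion dies off much faster, so the sum of the two has the familiar short-range wall and long-range tail.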

The term is derived in the EFP2 from the antisymmetrized wave functions. It involves a complex expression including atomic overlaps, kinetic energy integrals, and Fock matrix elements.681 The computation of the term is the most expensive part of the calculations. Still, it has a favorable quadratic scaling, which can be converted into linear scaling by using cutoff schemes. Description of the exchange energy contributions relying on evaluation of the overlap integrals appears in many different theories, such as the electronic force field of Goddard et al.,5,6 and in the semiempirical model Hamiltonians, such as the EHT.318 The trend illustrates an interesting and persistent approach for treating the exchange and correlation effects: The empirical overlap-dependent functionals and the nonantisymmetrized wave function are used instead of the direct evaluation of the expensive exchange and correlation integrals in the HF-like theories.

The use of effective fragment libraries is one of the attractive sides of the EFP method.76,690 Fragment libraries make the EFP a quantum-mechanical counterpart of coarse-grained classical force fields. Most popular implementations assume rigid molecular fragments, implying that all parameters for a given fragment (expansion coefficients, multipole internal coordinates, multipole moments, polarizabilities, etc.) can be computed once and for all. This information can be used for performing efficient MD simulations using the rigid body MD,690−692 in which the dynamics of the whole fragment is described by translation of its center of mass and rotation with respect to the external coordinate system.

4.5. Embedding Schemes

A common theme of many methods discussed in the present section 4 is the partitioning of an entire system into parts that are handled separately from each other. Each of the parts constitutes an active subsystem (also known as “fragment”). At the same time, each subsystem is a part of the environment that affects all other individual subsystems. In other words, the active subsystem is embedded into the environment (Figure 24). A self-consistent description of the electronic structure of each subsystem and the surrounding environment constitutes one of the major challenges of the reviewed methods. The major difference between various formulations lies in the terminology used and in the methodology to define and compute the embedding potential.

The simplest approach to define the embedding potential is via a postulated functional form with empirically determined parameters, such as in the EAM or MEAM theories, as well as in the reactive force fields. The environmental effects are described in the FMO method via the external (embedding) potential, which may be approximated in a number of ways, as discussed in section 4.2.2.1. The potential is derived in a partially self-consistent way via many-body expansion. A fully self-consistent formulation of the embedding potential is encountered in the D&C schemes, where the self-consistency is realized via equilibration of the Fermi levels of all subsystems. The embedding potentials are also determined in a self-consistent way in the OF-DFT and DFT embedding schemes. The latter will be summarized below.

Although the idea of the DFT-based embedding schemes follows very closely that of the D&C and the FMO approaches, this branch of methodology development is often classified into a separate group. It is commonly called a subsystem DFT or


embedding DFT. One of the first formulations was developed by Wesolowski and Warshel693 and is known as the frozen density embedding (FDE).

The basics of the subsystem DFT method and the details of the formulation are given in many sources,73,74,667,668,694−702 including time-dependent generalizations74,700,703 and methods for excited states.73,74,667,694 Here, we only briefly summarize the main idea of the method. Assume the entire system can be split into two subsystems, A and B, each with the fixed integer number of electrons $N_A$ and $N_B$, respectively, and the total density can be partitioned into a sum of the subsystem densities:

$$\rho(r) = \rho_A(r) + \rho_B(r) \tag{4.5.1}$$

$$\int \rho_A(r)\,\mathrm{d}r = N_A \tag{4.5.2a}$$

$$\int \rho_B(r)\,\mathrm{d}r = N_B \tag{4.5.2b}$$

The total number of electrons is $N = N_A + N_B$. A more general formulation is straightforward. We will focus on splitting into only two subsystems for presentation simplicity. The subsystems do interact with each other. It is also required that the subsystem densities are well-behaved:

$$\rho_X(r) \geq 0, \quad X = A, B \tag{4.5.3a}$$

$$\int \left|\nabla \rho_X^{1/2}(r)\right|^{2} \mathrm{d}r < \infty, \quad X = A, B \tag{4.5.3b}$$
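The conditions in eqs 4.5.2 and 4.5.3 are easy to check numerically for a trial density. The sketch below does so on a uniform one-dimensional grid with Gaussian model densities; the reduction to one dimension, the Gaussian shapes, and the helper names `gaussian_density` and `check_subsystem_density` are all illustrative assumptions, not part of any subsystem DFT code.

```python
import numpy as np


def gaussian_density(x, n_elec, center, width):
    """Normalized 1D Gaussian model density integrating to n_elec."""
    norm = n_elec / (width * np.sqrt(2.0 * np.pi))
    return norm * np.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def check_subsystem_density(rho, x, n_expected, tol=1e-6):
    """Test eqs 4.5.2 and 4.5.3 for a density sampled on a uniform grid x."""
    dx = x[1] - x[0]
    n_elec = float(np.sum(rho) * dx)                 # eq 4.5.2
    nonnegative = bool(np.all(rho >= 0.0))           # eq 4.5.3a
    sqrt_rho = np.sqrt(np.clip(rho, 0.0, None))
    grad = np.gradient(sqrt_rho, dx)
    grad_sqrt_norm = float(np.sum(grad ** 2) * dx)   # eq 4.5.3b integrand
    ok = (nonnegative
          and abs(n_elec - n_expected) < tol
          and bool(np.isfinite(grad_sqrt_norm)))
    return {"N": n_elec, "nonnegative": nonnegative,
            "grad_sqrt_norm": grad_sqrt_norm, "ok": ok}
```

For a Gaussian of width σ holding N electrons, the integral in eq 4.5.3b evaluates analytically to N/(4σ²) in this one-dimensional model, which gives a quick sanity check of the grid resolution.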

Finally, it is supposed that each subsystem is uniquely defined by the set of its nuclei and by the number of electrons it contains.

Using the density partitioning eq 4.5.1 and the definition of the energy functional eq 2.2.1, one can split the energy of the entire system into the energy of subsystems and their interaction:

$$E_{v}[\rho_A + \rho_B] = E_{v_A}[\rho_A] + E_{v_B}[\rho_B] + E_{\mathrm{int}}[\rho_A, \rho_B] \tag{4.5.4a}$$

$$\begin{aligned}
E_{\mathrm{int}}[\rho_A, \rho_B] &\equiv F[\rho_A + \rho_B] - F[\rho_A] - F[\rho_B] \\
&\quad + \int \left[(v_A(r) + v_B(r))(\rho_A(r) + \rho_B(r)) - v_A(r)\rho_A(r) - v_B(r)\rho_B(r)\right] \mathrm{d}r \\
&= F[\rho_A + \rho_B] - F[\rho_A] - F[\rho_B] + \int \left[v_A(r)\rho_B(r) + v_B(r)\rho_A(r)\right] \mathrm{d}r
\end{aligned} \tag{4.5.4b}$$
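The two parts of eq 4.5.4b, the nonadditive part of F and the cross terms with the external potentials, can be illustrated on a grid. The sketch below substitutes a simple local model functional, c∫ρ^p dx with p = 5/3, for the universal functional F; that choice, the one-dimensional grid, and the function names are assumptions made purely for illustration and are not the functional used in any of the cited methods.

```python
import numpy as np


def integrate(f, dx):
    """Riemann sum on a uniform grid."""
    return float(np.sum(f) * dx)


def model_functional(rho, dx, c=1.0, p=5.0 / 3.0):
    """Illustrative nonlinear local functional, c * integral of rho**p."""
    return c * integrate(rho ** p, dx)


def interaction_terms(rho_a, rho_b, v_a, v_b, dx, functional=model_functional):
    """Return the two contributions to E_int of eq 4.5.4b.

    nonadditive : F[rho_a + rho_b] - F[rho_a] - F[rho_b]
    cross       : integral of v_a * rho_b + v_b * rho_a
    """
    nonadditive = (functional(rho_a + rho_b, dx)
                   - functional(rho_a, dx)
                   - functional(rho_b, dx))
    cross = integrate(v_a * rho_b + v_b * rho_a, dx)
    return nonadditive, cross
```

For a local functional of this kind the nonadditive part vanishes when the two densities do not overlap and becomes positive once they do (for p > 1), which is the grid-level picture of why overlapping fragments need a nonadditive correction.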

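Repeating the steps in eqs 2.2.6−2.2.11, but keeping in mind the constraints eqs 4.5.2, one obtains the system of coupled Euler−Lagrange equations:

$$\frac{\delta E_{v_A}[\rho_A]}{\delta\rho_A(r)} + v_B(r) + \left.\frac{\delta F[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B} - \frac{\delta F[\rho_A]}{\delta\rho_A(r)} = \mu_A \tag{4.5.5a}$$

$$\frac{\delta E_{v_B}[\rho_B]}{\delta\rho_B(r)} + v_A(r) + \left.\frac{\delta F[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B} - \frac{\delta F[\rho_B]}{\delta\rho_B(r)} = \mu_B \tag{4.5.5b}$$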

The first terms of the equations describe the variation $\delta E_{v_X}[\rho_X]/\delta\rho_X(r)$ of energy of the corresponding subsystem due to its own energy functional and its own density. The second term, $v_Y(r)$, describes the energy variation due to the field of nuclei of all other subsystems (part of the embedding potential). Finally, the terms of the type

$$\left.\frac{\delta F[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B} - \frac{\delta F[\rho_X]}{\delta\rho_X(r)}$$

describe nonadditive effects due to electron−electron interactions, electronic kinetic, and exchange−correlation functionals. In other words, each equation describes the variation of energy of one subsystem in the field of all subsystems induced by the variation of the electronic density of the given subsystem, subject to the constraint on the total number of electrons in the subsystem. The magnitude of such a variation has the physical meaning of the chemical potential of the considered subsystem. To solve eqs 4.5.5, one requires the equilibrium between all subsystems:

$$\mu_A = \mu_B \tag{4.5.6}$$
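The step from eqs 4.5.5 to the simplified form that follows can be traced in a few lines. The sketch below assumes the usual decomposition $F[\rho] = T[\rho] + J[\rho] + E_{xc}[\rho]$ and takes the effective potential of eq 2.2.9 to be $v_{\mathrm{eff}} = v + \delta J/\delta\rho + \delta E_{xc}/\delta\rho$ with $v = v_A + v_B$; both are assumptions about definitions given outside this section.

```latex
\begin{align*}
\frac{\delta E_{v_A}[\rho_A]}{\delta\rho_A(r)}
  &= v_A(r) + \frac{\delta F[\rho_A]}{\delta\rho_A(r)}
  && \text{(assumed form of the subsystem functional)} \\
\mu_A
  &= v_A(r) + v_B(r)
     + \left.\frac{\delta F[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B}
  && \text{(insert into eq 4.5.5a; } \delta F[\rho_A]/\delta\rho_A \text{ cancels)} \\
  &= \left.\frac{\delta T[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B}
     + v_{\mathrm{eff}}([\rho]; r)
  && \text{(split } F \text{ and collect the potential terms)} \\
  &= \frac{\delta T[\rho_A]}{\delta\rho_A(r)} + v_{\mathrm{eff}}([\rho]; r)
     + \left[\left.\frac{\delta T[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B}
     - \frac{\delta T[\rho_A]}{\delta\rho_A(r)}\right]
  && \text{(add and subtract } \delta T[\rho_A]/\delta\rho_A \text{)}
\end{align*}
```

The bracket in the last line is the only piece that distinguishes the subsystem equation from an ordinary KS-like equation for $\rho_A$ in the effective potential of the whole system.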

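The equations can further be simplified to

$$\frac{\delta T[\rho_A]}{\delta\rho_A(r)} + v_{\mathrm{eff}}([\rho]; r) + \left.\frac{\delta T[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B} - \frac{\delta T[\rho_A]}{\delta\rho_A(r)} = \mu_A \tag{4.5.7a}$$

$$\frac{\delta T[\rho_B]}{\delta\rho_B(r)} + v_{\mathrm{eff}}([\rho]; r) + \left.\frac{\delta T[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B} - \frac{\delta T[\rho_B]}{\delta\rho_B(r)} = \mu_B \tag{4.5.7b}$$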

where $v_{\mathrm{eff}}(r)$ is the effective potential of the entire system, as defined in eq 2.2.9. Equations 4.5.5 or 4.5.7 are called Kohn−Sham equations with constrained electron density (KSCED). The term

$$\left.\frac{\delta T[\rho]}{\delta\rho(r)}\right|_{\rho=\rho_A+\rho_B} - \frac{\delta T[\rho_X]}{\delta\rho_X(r)}$$

Figure 24. Schematic representation of the embedding philosophy. Each of the subsystems can be considered an active one (blue) and a part of the environment (gray). The interaction between the active subsystem and the environment may be self-consistent or non-self-consistent.


is the nonadditive kinetic energy term for the subsystem X. This correction is a subject of multiple works and additional approximations.696−698,704,705 In particular, the nonadditive kinetic energy correction has been a central subject of the OF-DFT361−363,365,706 developments. In the form eqs 4.5.7, the coupled Euler−Lagrange equations emphasize the central role of the kinetic energy functional for efficient partitioned DFT calculations. Very recently a formally exact approach to define and compute nonadditive kinetic energy corrections has been developed in the Miller group.696,697 The approach, however, requires computation of the KS orbitals for a combined system, thus limiting the applicability of the method to large systems. The method has been simplified by using the projector operators that impose orthogonality of the KS orbitals belonging to a distinct subsystem.698 It should be noted that the nonadditivity of the kinetic energy in subsystem DFT is a consequence of the lack of mutual orthogonality of the MOs assigned to different fragments. The orthogonality condition for separability in DFT has been suggested in Mark Hoffmann’s work.668 He proved that the electron density of the whole system is representable as the sum of the densities of subsystems if and only if the orbitals of different subsystems are orthogonal.

The orthogonality of the subsystem orbitals can be automatically satisfied if the coupled KS equations are modified using projectors:668,698

$$F^{A}_{ab} \rightarrow \tilde{F}^{A}_{ab} = F^{A}_{ab} + \mu P^{B}_{ab} \tag{4.5.8}$$

$$P^{B}_{ab} = \langle a | \hat{P}^{B} | b \rangle = \sum_{i\in B} \langle a | i \rangle \langle i | b \rangle = (S P_B S)_{ab} \tag{4.5.9}$$

where $F$ is the Fock matrix of the original KS equations, $\tilde{F}$ is the modified Fock matrix, and $P_B$ is the density matrix of the subsystem B. The projector matrix $P^{B}$ ensures that MOs of the fragment A are orthogonal to MOs of the fragment B. The projection technique summarized in eqs 4.5.8 and 4.5.9 is essentially the same as the method based on the Huzinaga−Cantu equations already discussed in the context of QM/MM (see eqs 4.4.17−4.4.20). The Huzinaga−Cantu equations were developed originally to generate atomic pseudopotentials. A remarkable similarity among all formulations (Huzinaga−Cantu for pseudopotentials, eqs 4.4.17−4.4.20 for QM/MM, and eqs 4.5.8 and 4.5.9 for subsystem DFT) originates from the common goal of establishing the separability conditions. If in Huzinaga’s case the separability concerns the one-electron core and valence orbitals in Slater determinants, in the KS equations the separability concerns the electronic densities of the whole versus its part. From the mathematical point of view, the projection techniques just discussed are nothing more than the well-known Gram−Schmidt orthogonalization. In the latter, one also has a term that quantifies the degree of nonorthogonality between the two sets of variables. This term is then used to enforce the orthogonality required.

It is also important to emphasize that the orthogonality of the MOs belonging to distinct fragments (subsets) is equivalent to the self-consistency reached for the entire system. One may draw an analogy with the diagonalization procedure in the conventional SCF iteration, as we noted in the discussion of the LMO methods, especially those of Stewart. In order to diagonalize the entire Hamiltonian matrix, it is enough to annihilate the occupied−virtual blocks by sequential rotations. The same philosophy holds for partition of the entire system into subsystems. If subsystem MOs are not orthogonal, the solution is non-self-consistent, as in many diabatic methods or in the nonvariational energy-based partitioning schemes such as FMO. Orthogonalization of the MOs belonging to distinct subsystems constitutes a transformation to an adiabatic basis, spanning the entire system, and hence delivers a variational solution for the entire system. The lack of mutual orthogonality between MOs of the subsystems is also reflected in different Fermi levels of the corresponding fragments. In other words, the diabatic partitioning is intrinsically nonequilibrium. The equilibration of the Fermi levels is thus equivalent to orthogonalization of MOs of a distinct subsystem, as it is done in the D&C type of methods.

In passing, we would like to point out that the embedding potential may be needed even for diabatic approaches, although not always. For example, in the energy transfer models109,707 the state-specific interaction between chromophores is approximated using the dipole interaction model, similar to the one of Förster. The calculation of electronic structures of subsystems may be non-self-consistent, but the effect of the environment may still be included. Such computations do not increase the exponent of the scaling law for the CPU time, but they may notably contribute to the prefactor.
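As an illustration of the dipole interaction model mentioned above, the sketch below evaluates the standard point-dipole coupling between two transition dipoles, V = [μ1·μ2 − 3(μ1·e)(μ2·e)]/R³, where e is the unit vector joining the chromophores. The units (atomic units, no dielectric screening) and the function name are assumptions; the cited energy transfer models may include screening factors or other refinements not shown here.

```python
import math


def dipole_dipole_coupling(mu1, mu2, r12):
    """Point-dipole coupling between two (transition) dipoles.

    mu1, mu2 : 3-component dipole vectors
    r12      : 3-component vector from chromophore 1 to chromophore 2
    """
    R = math.sqrt(sum(c * c for c in r12))
    if R == 0.0:
        raise ValueError("chromophores must be at distinct positions")
    e = [c / R for c in r12]                       # unit separation vector

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    return (dot(mu1, mu2) - 3.0 * dot(mu1, e) * dot(mu2, e)) / R ** 3
```

Because the coupling needs only precomputed dipoles and a distance, it adds one cheap evaluation per chromophore pair, consistent with the remark that such terms affect the prefactor rather than the scaling exponent.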

5. CONCLUSIONS AND OUTLOOK

We have summarized the diverse developments in various subfields of theoretical and computational chemistry and physics that are aimed at extending our present computational capabilities beyond their current limits. Classical force field and coarse-grained approaches allow one to study multimillion atom systems for sufficiently long time periods with the use of supercomputers. Systems containing tens of thousands of atoms can be simulated routinely for nanoseconds on personal computers. Reactive force fields can be applied to complex reactive processes in systems composed of thousands of atoms, if no additional quantum details are needed. Such calculations are comparable in performance to classical force fields, but are significantly more flexible and illuminating.

At the same time, the electronic structure approaches are still on the road of active extension to large-scale calculations. Although the size scales that can be achieved at such levels of description are notably smaller than those of the classical methods, the limits have been extended considerably over the past few decades. Impressive progress in physically motivated and in computationally motivated approaches allows performing fully quantum calculations on multithousand-atom systems using only personal computers. Record simulations on systems with over a million atoms are possible with some methods (e.g., OF-DFT) and with the help of supercomputing facilities. At the same time we should note that many of the works reported are focused on the proof of principle rather than on providing detailed descriptions of the systems and physical interpretation related to experiment. For example, the linear-scaling approaches often deal with the linear or approximately linear molecules in a vacuum. The extensions to three dimensions are less common and more involved. The consideration of the solvent effects and the extensions of the methods to enhanced sampling and accelerated dynamics are still needed. We anticipate that further developments will address these shortcomings, transforming the proof-of-the-principle showcase examples into insightful and detailed studies of realistic systems.

5.1. Methods Diversity: New and Old

Considering the continuous progress in the large-scale computation methodologies, with dozens of techniques appearing every year, it becomes increasingly important to


establish interrelations between the new methods and the older theories. This helps in the classification of the methods and identification of the heritage trees. This also helps in emphasizing distinctive features of the new methods. Most importantly, establishing connections of the novel techniques to the older ideas helps to avoid “rediscovery”, when after several generations a “new” technique is reported under a distinctive brand, but it essentially repeats the ideas of the approaches formulated earlier. Classification of the growing body of linear-scaling and fragmentation methods becomes ever more complicated, with many methods showing large overlaps. It may be more effective to choose several most robust and rigorous approaches, and to present new methodologies as either special cases or generalizations thereof. This philosophy would help to identify the winning methods, and to unify all approaches in the long run.

We believe that, in the present, the D&C, FMO, and direct minimization schemes (DMM and FOE) are the most successful and convenient frameworks for constructing novel linear-scaling methods. Many other fragmentation methods can be related to or are based on the ideas of the D&C. It is perhaps the most chemically transparent, yet sufficiently general scheme. The choice of a fragmentation strategy constitutes the main challenge. Automation of the partitioning schemes, applicable to distinct types of systems, will be required in the future. On the other hand, the attractiveness of the direct minimization approaches resides in their simplicity and absence of fragmentation schemes. This group of methods can be considered a purely mathematical approach to solving large chemical problems.

Many old concepts become forgotten, abandoned or underestimated. This is a side effect of the fast and intense evolution of the computational methods. We discussed the bond-index conservation principle in the context of the reactive force field development. This transparent idea of chemical bonding has resulted in surprisingly high accuracies at the extremely simple formulation level. We also discussed the closely related idea of using bond orders as the main variables for the description of interatomic potentials. Combining such variables with self-consistent electrostatics in a correlated way, e.g., similarly to how it is done with the split-charge variables, can lead to a valuable development.

Considering the physically motivated approaches, we outlined many promising ideas developed within the simple extended Hückel theory. It is interesting to observe how the line of the HF-derived semiempirical methods has eventually arrived at the cornerstones of the simple EHT methods: inclusion of atomic orbital overlaps, incorporation of core−core interaction potentials, and approaches to charge transfer and correlation effects. Defined from rigorous grounds, the INDO, NDDO, and similar semiempirical methods have eventually introduced a large number of empirical corrections. In this sense, these methods are not far superior to the advanced versions of the EHT formulation. We also indicated how the EHT is reminiscent of the presently more popular DFTB series of methods. One should always remember that early failures of simple approaches might happen because of wrong computational methods rather than conceptual limitations. For example, this happened with the early version of the CNDO, although the method was lucky to be revised early on and to give birth to the line of semiempirical methods we know presently.

Lessons learned, we draw attention to careful review of older

works. Modern research should attempt to build bridges,inheritance lines, and comparative analyses between the oldand new methods. Paying close attention to these issues wouldhelp to unify the existing approaches and to revitalize old,abandoned, but potentially valuable ideas.

5.2. Force Field Development: New Dimensions: Energy → Excited States → Spin?

The early developments of molecular interaction potentials aimed only at the central property, energy, encoded as a function of system geometry via classical force fields (Figure 25a). This computationally efficient and very successful approach leads to the hierarchy of approximations and developments of general-purpose, biomolecular, reactive, and eventually, polarizable force fields (e.g., see the MEAM/REBO development series). Although the task is not easy, it is rather low dimensional, focusing on reproducing just energies and geometries.

The advent of quantum chemistry has led to studies of many additional properties, for instance, characterizing excited states: excitation energies, atomic and molecular polarizabilities, transition dipole moments, and nonadiabatic couplings. The wave function based methods provided significantly more information about the system, greatly expanding the dimensionality of the problem. In the most straightforward way, such properties can be computed with the DFT or WFT methods, similar to the ground state properties. However, the desire to compute larger systems and to simulate them for longer times gradually pushed scientists to develop a hierarchy of simplifications, starting with the semiempirical methods and arriving eventually at the quantal force fields (Figure 25b). The quantal force fields include two main components. The art of constructing a model Hamiltonian is motivated physically, e.g., representing the Coulomb integrals via sums of two-body terms, introducing charge-dependent atomic radii, etc. The extension to large scales is motivated and achieved computationally via application of linear-scaling techniques, together with additional approximations and simplifications.

Figure 25. Evolution of computational chemistry methods: (a) “scalar/serial phase”, classical biomolecular and reactive force fields; (b) “vector/parallel phase”, quantal force fields and linear-scaling techniques.

Development of a given type of force field is strongly affected and directed by concurrent applications. In the 1970s−2000s, people were mostly concerned about energies of different types (enthalpies of formation, adsorption, bond breaking, etc.). The energies were central to understanding the most important processes of that time. On the one hand, the worldwide economy was greatly dependent on fossil fuels. Therefore, processes involving combustion, organic synthesis, and catalysis on metal surfaces demanded detailed investigation. On the other hand, understanding biological systems was and is important to medicine and human health. These factors fueled and stimulated the development of the classical force fields.

The interest in solar energy and electronics (especially nanoelectronics) has shifted the research focus in the past decade. As a consequence, the need for an accurate description of excited-state properties (excited-state potential energy surfaces, multiconfiguration nature of some processes) stimulates development of the quantal force fields and model Hamiltonian approaches. Many photoinduced phenomena have been elucidated in great depth with the help of such techniques. Although efficient approaches have started appearing, they are not yet routinely used, nor are they fully developed at the conceptual level. Nevertheless, they have already entered the new epoch.

The energy-based classical force fields may at some point match the MO-based force fields, if the time-dependent versions of the former could yield a reasonable description of the excited state properties. This may be considered a conceptual direction of development of such techniques, apart from the pragmatic goals of accuracy improvement and extension to novel classes of systems and processes. The conceptual development may require reconsideration of the principles of construction of successful functional forms, and inclusion of additional physical effects, most obviously the interaction of matter with the electromagnetic field.

We anticipate other avenues of the force field development. Spintronics and quantum computing studies start appearing in numerous experimental groups. Quantum computing and information processing grow in popularity and potential impact. These problems require new dimensions in the quantal force field development, in order to describe accurately relativistic effects, spin−spin and spin−orbit couplings, and system response to electromagnetic fields. The frequently neglected spin variables and their coupling to other degrees of freedom may be incorporated explicitly into the next generation of force fields.

5.3. Are Semiempirical/Force Field Methods Needed?

The great progress of the WFT and DFT methods and their combination with various linear-scaling techniques suggest that one day they could substitute all empirical and semiempirical methods completely. As a result, one can question the need for classical general-purpose and biomolecular force fields, reactive potentials, and semiempirical methods. In particular, one may ask if these methods have a future at all. Our expectation is rather positive. We subscribe to the argument that more expensive methods can be avoided if an empirical or semiempirical method can provide similar accuracy at a fraction of the CPU time. The simplified methods that provide notable acceleration and reasonable accuracy will still be in great use for a long time to come. Most likely, semiempirical Hamiltonian parametrizations could be made on the fly using higher-level theories and effective mapping procedures. One may expect that semiempirical theories would constitute an intermediate computational step, rather than being utilized on their own or substituted by a complete quantum calculation. Such mapping may rely on artificial neural networks and machine learning.

The use and development of the semiempirical theories will continue. This will be largely stimulated by the practical and applied needs for understanding various processes and effects in large-scale systems: photoinduced and reactive dynamics in nanoscale materials, catalytic processes involving solvent, metal−organic, and organometallic compounds, and binding processes in biomedical applications. The methods avoiding orbital-based formalisms (OF-DFT, reactive potentials) will likely develop time-dependent formulations for studies of excited state properties. Parallel to the new developments, these methods will be utilized for a large number of materials science applications. Both semiempirical and reactive force field approaches will be of great importance for screening applications: even a small (but most likely quite substantial) increase in performance will have a pronounced impact when a huge number of calculations are to be performed.

The classical force fields and the related semiempirical methods provide sufficiently many details, stimulating continuing interest. In many applications, especially relevant to biology, one does not require knowledge of the electronic structure. An accurate and fast energy prediction presents an important direction of improvement of the classical force fields, despite the fact that ab initio methods are capable of tackling large systems. The classical force fields and the semiempirical methods are part of the hybrid QM/MM approaches. It is possible that specific parametrizations of the known methods will appear in the future for use within the QM/MM schemes. One can expect that these parametrizations may become general and adaptive to an arbitrary quantum mechanical level of theory.

We conclude that, despite the rapid increase of the capabilities of the fully quantum methods, both classical and semiempirical descriptions of interatomic potentials will remain in great demand. They will continue their development and will have a great impact on diverse areas of chemistry and materials science.

5.4. Linear-Scaling Methods from a Different Perspective, Dynamic Programming

Most electronic structure methods rely on the variational principle. Having identified MOs as the variational parameters, one can develop a plethora of approaches for efficient variations. The process of optimization can be schematically illustrated in this formalism using the Walsh diagrams of MO energy levels (Figure 26). An optimal MO-LCAO combination of two orbitals can be found by solving a simple quadratic equation. The MOs are simply

$$|\psi_1^{(2)}\rangle = c_{11}|\chi_1\rangle + c_{21}|\chi_2\rangle \tag{5.1a}$$

$$|\psi_2^{(2)}\rangle = c_{12}|\chi_1\rangle + c_{22}|\chi_2\rangle \tag{5.1b}$$

in this case. A two-level splitting is observed (Figure 26A). Adding one more orbital leads to the three-level splitting scheme, which can be considered the result of interaction of the newly added AO with the previously obtained two MOs:

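$$|\psi_1^{(3)}\rangle = C_{11}|\psi_1^{(2)}\rangle + C_{21}|\psi_2^{(2)}\rangle + C_{31}|\chi_3\rangle \tag{5.2a}$$

$$|\psi_2^{(3)}\rangle = C_{12}|\psi_1^{(2)}\rangle + C_{22}|\psi_2^{(2)}\rangle + C_{32}|\chi_3\rangle \tag{5.2b}$$

$$|\psi_3^{(3)}\rangle = C_{13}|\psi_1^{(2)}\rangle + C_{23}|\psi_2^{(2)}\rangle + C_{33}|\chi_3\rangle \tag{5.2c}$$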
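The two-step construction of eqs 5.1 and 5.2 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the AO basis is taken to be orthonormal (overlap matrix equal to the identity), the Hamiltonian matrix elements in `H` are hypothetical numbers, and the helper names `solve_2x2` and `eigvals_sym3` are introduced here and do not come from the review. It first solves the two-orbital problem through the quadratic secular equation, then adds the third AO in the mixed basis of the two MOs and the new AO, and finally compares the resulting levels with the closed-form cubic solution in the original AO basis.

```python
import math


def solve_2x2(h11, h12, h22):
    """Diagonalize a real symmetric 2x2 matrix.

    The energies are the roots of the quadratic secular equation
    (h11 - e) * (h22 - e) - h12**2 = 0. Returns ([e_low, e_high],
    [vec_low, vec_high]) with normalized eigenvectors.
    """
    mean = 0.5 * (h11 + h22)
    root = math.hypot(0.5 * (h11 - h22), h12)
    theta = 0.5 * math.atan2(2.0 * h12, h11 - h22)  # mixing angle
    c, s = math.cos(theta), math.sin(theta)
    return [mean - root, mean + root], [(-s, c), (c, s)]


def eigvals_sym3(a):
    """Ascending eigenvalues of a real symmetric 3x3 matrix, using the
    trigonometric closed-form solution of the cubic characteristic equation."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    if p2 == 0.0:
        return [q, q, q]
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, 0.5 * det_b))  # guard against round-off
    phi = math.acos(r) / 3.0
    e_max = q + 2.0 * p * math.cos(phi)
    e_min = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [e_min, 3.0 * q - e_max - e_min, e_max]


# Hypothetical Hamiltonian (arbitrary units) in an orthonormal AO basis.
H = [[-1.0, -0.4, -0.1],
     [-0.4, -0.8, -0.3],
     [-0.1, -0.3, -0.5]]

# Step 1 (eqs 5.1): two-level problem for chi_1 and chi_2.
energies2, vectors2 = solve_2x2(H[0][0], H[0][1], H[1][1])

# Step 2 (eqs 5.2): couple the frozen two-orbital MOs with the new AO chi_3.
couplings = [c1 * H[0][2] + c2 * H[1][2] for c1, c2 in vectors2]
H_mixed = [[energies2[0], 0.0, couplings[0]],
           [0.0, energies2[1], couplings[1]],
           [couplings[0], couplings[1], H[2][2]]]

levels_mixed = eigvals_sym3(H_mixed)  # mixed basis {psi_1, psi_2, chi_3}
levels_ao = eigvals_sym3(H)           # original AO basis
print("two-orbital levels:  ", energies2)
print("three-orbital levels:", levels_mixed)
```

Because the mixed basis is related to the AO basis by an orthogonal transformation, both routes give the same three levels, and the two-orbital levels interlace the three-orbital ones.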

A three-level splitting develops (Figure 26B). To find the coefficients in eqs 5.1, one needs to find the roots of the quadratic equation. Similarly, to find the coefficients in eqs 5.2, the cubic equation needs to be solved, which can be done in a closed analytic form. Equivalently, one can consider the eigenvalue problem for the 3 × 3 matrix, utilizing either the mixed basis, $\{|\psi_1^{(2)}\rangle, |\psi_2^{(2)}\rangle, |\chi_3\rangle\}$, or the original AO basis. The direct brute-force calculation in the AO basis may be even more efficient.

In order to keep the optimization problem at the 2 × 2 level, equivalent to solving the quadratic equation, one can freeze the “reagent” MOs, $\{|\psi_1^{(2)}\rangle, |\psi_2^{(2)}\rangle\}$, and optimize the energy along only one degree of freedom defining the relative contributions of the reagent orbitals, i.e., the MOs, $\{|\psi_1^{(2)}\rangle, |\psi_2^{(2)}\rangle\}$, and the new AO, $|\chi_3\rangle$. The method would resemble in this realization the orbital minimization techniques, although with an additional constraint on the basis orbitals. Alternatively, the 2 × 2 eigenvalue problem can be solved to find the orbitals $|\psi_1^{(3)}\rangle$ and $|\psi_2^{(3)}\rangle$, while the orbital $|\psi_3^{(3)}\rangle$ is determined from the normalization condition. This can be possible if one of the reagent orbitals can be neglected (e.g., due to weak interaction), as implied by the schematics in Figure 25b. In this realization, the approach would be reminiscent of the ELG method. One can further lift the constraint on the relative weight of the orbitals $|\chi_1\rangle$ and $|\chi_2\rangle$, or incorporate the effects of the more distant orbitals to optimize the reagent MOs, depending on their linear combination coefficients with the function $|\chi_3\rangle$. As a result, several rounds of the 2 × 2 optimizations (rotations) will be needed, until convergence is achieved. The method mimics many other approaches, notably the LMO of Stewart and the IS3CF, and in some sense the D&C approach, although without reference to the chemical potential.

The sequential optimization of MOs, as shown in the schematics of Figure 25a, can be viewed as a continuous version of the nonlinear dynamic programming (DP) task. In the discrete version of DP, one considers a system that evolves from its initial state, $\xi_0$, to the final state, $\xi_n$ (Figure 26D). This evolution is governed by a set of control parameters, $\{u_k\}$, and by a transition function, $F$:

$$\xi_k = F(\xi_{k-1}, u_k) \tag{5.3}$$

Varying the set of control protocols, $\{u_k\}$, one can affect the efficiency of the process, which is quantified by the goal function, $Z$, depending on the initial conditions, $\xi_0$:

$$Z = Z(\xi_0, \{u_k\}) \tag{5.4}$$

The efficiency of the $k$th step can be characterized by the efficiency function $f_k(\xi_{k-1}, u_k)$, which depends on the state of the system in the beginning of the $k$th step, $\xi_{k-1}$, and on the control at this step, $u_k$. The cumulative efficiency from the $k$th step to the final step $n$ is given by the sum of the one-step efficiencies:

$$Z_k = \sum_{i=k}^{n} f_i(\xi_{i-1}, u_i) \tag{5.5}$$

The total efficiency of the whole optimization process is hence given by the quantity $Z_1$.

Applied to electronic structure calculations, the MO-LCAO expansion coefficients are the control variables, and the reactant MOs are the state variables. During each AO addition, one needs to choose coefficients of the MO-LCAO expansion, taking into account their choices at all previous steps. The MO coefficients should be chosen at each step to lead to global optimization of the target function, the total energy of the system. The choice of the coefficient should satisfy the set of constraints imposed on the possible values of the MO coefficients.

The DP problem is solved by utilizing Bellman’s optimality principle, which generalizes the rule formulated above. The principle formulates the key property of the optimal policy: “Given the initial state and the initial decision, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.” Using the definitions above, the principle can be written mathematically using Bellman’s equation:

$$Z_k^*(\xi_{k-1}) = \max_{u_k} \left\{ f_k(\xi_{k-1}, u_k) + Z_{k+1}^*(\xi_k) \right\} \tag{5.6}$$

The generalization of eq 5.6 to continuous control and state variables is known as the Jacobi−Bellman equation. It can be used for practical calculations and to search for the optimal MOs.
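The recursion in eq 5.6 can be sketched on a small discrete toy problem. Everything specific in the block below is an assumption made for illustration and is unrelated to any electronic structure code: the integer states, the three allowed controls, the additive transition function, the quadratic efficiency function with hypothetical `TARGETS`, and all function names. The memoized function `best_from` plays the role of $Z_k^*(\xi_{k-1})$, with $Z_{n+1}^* = 0$ as the terminal condition, and a brute-force enumeration of all control sequences is included only as a cross-check.

```python
from functools import lru_cache
from itertools import product

N_STEPS = 6
CONTROLS = (-1, 0, 1)           # allowed values of u_k (hypothetical)
TARGETS = (1, 2, 2, 1, 0, -1)   # hypothetical preferred state after each step


def transition(state, control):
    """Transition function F of eq 5.3: xi_k = F(xi_{k-1}, u_k)."""
    return state + control


def efficiency(k, state, control):
    """One-step efficiency f_k(xi_{k-1}, u_k): reward for approaching the
    target of step k, minus a small cost for applying a nonzero control."""
    new_state = transition(state, control)
    return -float((new_state - TARGETS[k - 1]) ** 2) - 0.5 * abs(control)


@lru_cache(maxsize=None)
def best_from(k, state):
    """Bellman recursion of eq 5.6.

    Returns (Z*_k(xi_{k-1}), optimal controls u_k..u_n). Memoization
    ensures that each (k, state) pair is evaluated only once.
    """
    if k > N_STEPS:
        return 0.0, ()
    best_value, best_controls = float("-inf"), ()
    for u in CONTROLS:
        tail_value, tail_controls = best_from(k + 1, transition(state, u))
        value = efficiency(k, state, u) + tail_value
        if value > best_value:
            best_value, best_controls = value, (u,) + tail_controls
    return best_value, best_controls


def total_efficiency(initial_state, controls):
    """Goal function Z of eq 5.4 for an explicit control sequence."""
    state, total = initial_state, 0.0
    for k, u in enumerate(controls, start=1):
        total += efficiency(k, state, u)
        state = transition(state, u)
    return total


def brute_force(initial_state):
    """Reference optimum from enumerating all 3**N_STEPS control sequences."""
    return max(total_efficiency(initial_state, seq)
               for seq in product(CONTROLS, repeat=N_STEPS))


z_opt, u_opt = best_from(1, 0)
print("optimal total efficiency Z_1*:", z_opt)
print("optimal controls:", u_opt)
print("cached subproblems:", best_from.cache_info().currsize)
```

In this toy setting the number of cached subproblems stays far below the number of control sequences that the exhaustive search has to enumerate, which is the tabulation effect that makes the DP formulation attractive.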

Figure 26. Illustration showing the Walsh diagram of the molecular system created via sequential addition of atomic orbitals (A−C). The process can be viewed as a dynamic programming (optimal control) task. $F(\xi_{n-1}, u_n)$ is the transition function starting at state variable $\xi_{n-1}$ and using control variable $u_n$; $f_n(\xi_{n-1}, u_n)$ is the efficiency function at time step $n$.


Favorable scaling presents an important advantage of the Bellman and Jacobi−Bellman equations: each step involves a one-dimensional minimization, and recalculations of all $Z_{k+1}^*$ can be avoided by tabulating or memorizing the previous calculations. Thus, the efforts should scale linearly with the number of orbitals.

In the electronic structure theory calculations, the total efficiency is defined as the negative of the total energy, such that its maximization leads to the minimization of the total energy. Thus, the DP approach is equivalent to the variational principle. Many of the considered linear-scaling schemes are based on the variational principle and locality of chemical bonding. The outlined methodology based on Bellman’s (Jacobi−Bellman’s) equation has never been explored for its potential to produce efficient linear-scaling techniques. The ELG and related “dynamic” methods resemble this philosophy most closely. To some extent, the optimal control philosophy and the sequential one-dimensional minimization are reminiscent of the OM and DMM methods as well. The success of these methods indicates that other, more mathematically rigorous approaches based on the DP optimality principle can be formulated in the future.
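The idea of reaching the optimum through repeated 2 × 2 optimizations (rotations), each of which involves a single rotation angle, can be illustrated with the classical cyclic Jacobi eigenvalue method. This is a swapped-in textbook technique used here only as an analogy; it is not the ELG, OM, or DMM method, and the function name and test matrix are assumptions made for the example. Each step solves one 2 × 2 problem exactly, and sweeps over all orbital pairs are repeated until the off-diagonal couplings vanish.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic 2x2 Jacobi rotations.

    Returns (sorted eigenvalues, number of completed sweeps).
    """
    a = [list(row) for row in matrix]
    n = len(a)
    for sweep in range(max_sweeps + 1):
        off_norm = math.sqrt(sum(a[i][j] ** 2
                                 for i in range(n) for j in range(n) if i != j))
        if off_norm < tol:
            return sorted(a[i][i] for i in range(n)), sweep
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # One-parameter (angle) optimization of the (p, q) pair;
                # the small-angle solution of tan(2*theta) = 2*a_pq/(a_qq - a_pp).
                diff = a[q][q] - a[p][p]
                sign = 1.0 if diff >= 0.0 else -1.0
                theta = 0.5 * math.atan2(2.0 * sign * a[p][q], abs(diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):          # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):          # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                a[p][q] = a[q][p] = 0.0     # coupling removed by construction
    raise RuntimeError("Jacobi sweeps did not converge")


# Hypothetical three-orbital chain with nearest-neighbor couplings.
chain = [[2.0, 1.0, 0.0],
         [1.0, 2.0, 1.0],
         [0.0, 1.0, 2.0]]
levels, sweeps = jacobi_eigenvalues(chain)
print("levels:", levels, "after", sweeps, "sweeps")
```

A later rotation generally reintroduces small couplings that an earlier rotation removed, which is why more than one sweep is usually required before convergence.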

5.5. Operator Representations, Projectors, Huzinaga−Cantu Equation, and Separability Theory

We have met a variety of levels of intuition and rigor in the methodological formulations of the linear-scaling methods. Many approaches, such as the partitioning and fragmentation schemes, appeal to physically and chemically justified and transparent assumptions. Other methods, such as the block-diagonalization scheme and the direct optimization techniques, use a uniform mathematical framework and make sequential approximations based on simple mathematical transformations. Belonging to the latter group, the approach discussed in section 2.1.4 proved very convenient and general. Based on projectors and representation of arbitrary operators in arbitrary bases, it helped to simplify complex transformations between different bases and to develop approximations in a systematic way. We would like to encourage its use in derivation of novel efficient schemes for large-scale computations in molecular systems.

The Huzinaga−Cantu equation presents an interesting application of the projector operator techniques. Appearing originally as an atomic pseudopotential construction, it provides an elegant way to separate the core and valence electrons and to formulate the Schrödinger equation for the valence electrons in the effective field of the core. The Huzinaga−Cantu equation and the related projector operators have found recent applications in the derivations of the subsystem KS-DFT equations, the quantal force field construction, and the uniform mathematical formulation of the QM/MM schemes. In these circumstances, one is concerned with the same fundamental problem of separability, although applied to systems of different sizes and natures. Understanding the separability principles in quantum mechanics is essential for formulating efficient linear-scaling computational schemes. Studies of this fundamental question are needed and will likely become an important avenue of research in different groups. We would like to emphasize again that many important studies in this direction have been performed in the past, in a variety of physics and chemistry fields. Their reexamination and reinterpretation may prove useful for developing new approaches to the large-scale computations.
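The core−valence separation achieved with projectors can be illustrated on a small matrix model. The sketch assumes a commonly quoted form of the projected operator, $F' = F - PF - FP$, where $P$ projects onto the core orbitals; the notation used in the review's own derivation may differ. It further assumes an orthonormal basis and a toy three-level "Fock" matrix built from hypothetical orbital energies, and all helper names are introduced here for illustration. When the core orbitals are eigenfunctions of $F$, the projected operator leaves the valence solutions untouched and moves each core level from $\varepsilon_c$ to $-\varepsilon_c$, so that a deeply bound core level no longer appears below the valence levels.

```python
import math


def outer(u, v):
    """Outer product |u><v| as a nested list."""
    return [[ui * vj for vj in v] for ui in u]


def add(a, b, scale=1.0):
    """Matrix sum a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def residual(op, vec, value):
    """Largest component of op|vec> - value|vec> (zero for an eigenpair)."""
    applied = [sum(x * y for x, y in zip(row, vec)) for row in op]
    return max(abs(a - value * v) for a, v in zip(applied, vec))


# Hypothetical orthonormal orbitals and orbital energies (arbitrary units).
orbitals = [
    [1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3)],   # "core"
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],               # valence
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],  # valence
]
energies = [-10.0, -1.0, 0.5]

# Toy Fock matrix F = sum_i e_i |i><i| and the core projector P = |c><c|.
zero = [[0.0] * 3 for _ in range(3)]
F = zero
for e, orb in zip(energies, orbitals):
    F = add(F, outer(orb, orb), e)
P = outer(orbitals[0], orbitals[0])

# Projected operator F' = F - P F - F P (assumed form, see the text above).
F_shifted = add(add(F, matmul(P, F), -1.0), matmul(F, P), -1.0)

for label, orb, e in zip(("core", "valence", "valence"), orbitals, energies):
    new_e = -e if label == "core" else e
    print(label, "level", e, "->", new_e,
          "residual:", residual(F_shifted, orb, new_e))
```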

5.6. Importance of Diabatic Approaches

This review discussed the use of diabatic states for efficient linear-scaling computations on large systems. The diabatic approaches are particularly transparent, since diabatic states have clear physical and chemical meaning (e.g., electron donor or acceptor states, charge localization states, etc.). Diabatic states need not have a direct interpretation in general. A set of arbitrary states can be used as the basis in the variational calculations of reduced dimensionality. If the diabatic states are chosen as the eigenstates of noninteracting fragments or as the eigenstates of the fragments in the external field of their environment, only small dimensional ab initio calculations are needed. If the size of the diabatic basis is small, the overall variational calculation may be done efficiently. Variational calculations may not even be needed in some approaches. Instead, empirical formulas can be used for determination of the diabatic state coefficients.

Many processes may be understood in terms of diabatic states. Examples include substitution and bond-breaking reactions in organic chemistry, charge transfer and exciton delocalization in solar energy materials, etc. One can rely on diabatic states to model these processes in large-scale systems, bypassing the standard adiabatic states. It is known that adiabatic and diabatic states can give different results, and that the diabatic representation can be less accurate. For example, an electron transfer process of the Marcus type will have a significant energy barrier and, hence, a notably smaller rate when described simplistically in a diabatic basis. The superexchange mechanism of electron and energy transfer arises in the diabatic basis, requiring specialized methodologies (refs 708 and 709). An adiabatic representation treats the superexchange in a more natural way. At the same time, transition from the coherent transport to the hopping mechanism arises more naturally in the diabatic basis (ref 710). It is important to improve the capabilities of the quantum and nuclear dynamics methods formulated in diabatic bases. Accurate electronic structure calculations of many diabatic states are needed for studies of the dynamical processes in large-scale systems.
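The difference between the diabatic and adiabatic barriers mentioned above can be seen in a minimal two-state model. The sketch assumes two harmonic diabatic surfaces of equal curvature along a single reaction coordinate and a constant diabatic coupling; the parameter values and function names are hypothetical. The lower adiabatic surface follows from the 2 × 2 variational problem in the diabatic basis, $E_- = \tfrac{1}{2}(H_{11} + H_{22}) - \sqrt{\tfrac{1}{4}(H_{11} - H_{22})^2 + V^2}$, so that at the diabatic crossing it lies below the diabats by exactly $|V|$.

```python
import math

FORCE_CONSTANT = 2.0   # curvature of both diabats (hypothetical units)
DISPLACEMENT = 1.0     # shift between donor and acceptor minima
COUPLING = 0.05        # constant diabatic coupling V


def diabatic_energies(x):
    """Donor (H11) and acceptor (H22) diabatic energies at coordinate x."""
    h11 = 0.5 * FORCE_CONSTANT * x ** 2
    h22 = 0.5 * FORCE_CONSTANT * (x - DISPLACEMENT) ** 2
    return h11, h22


def lower_adiabatic_energy(x):
    """Lower eigenvalue of the 2x2 Hamiltonian in the diabatic basis."""
    h11, h22 = diabatic_energies(x)
    return 0.5 * (h11 + h22) - math.hypot(0.5 * (h11 - h22), COUPLING)


# For this symmetric model the diabats cross halfway between the minima.
x_cross = 0.5 * DISPLACEMENT

# Diabatic picture: climb the donor parabola from its minimum (energy 0 at
# x = 0) up to the crossing point.
diabatic_barrier = diabatic_energies(x_cross)[0]

# Adiabatic picture: barrier measured on the lower adiabatic surface, with
# the reactant minimum located on a coarse grid around the donor well.
grid = [i * x_cross / 1000 for i in range(-1000, 1001)]
reactant_minimum = min(lower_adiabatic_energy(x) for x in grid)
adiabatic_barrier = lower_adiabatic_energy(x_cross) - reactant_minimum

print("diabatic barrier: ", diabatic_barrier)
print("adiabatic barrier:", adiabatic_barrier)
```

The adiabatic barrier comes out lower than the diabatic one by roughly the coupling strength, which is one simple way to see why a rate estimated naively from the diabatic crossing energy is too small.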

5.7. Importance of Software

Software availability constitutes the final aspect of our discussion of large-scale computations. As we have seen in the present review, the machinery for linear-scaling quantum calculations is well-established and diverse, with a number of successful applications and record benchmarks reported. However, in order to enable interested researchers to perform large-scale simulations, it is not sufficient to know that the theoretical methodologies exist. Far greater success is achieved if the methods are implemented in efficient software, accessible to researchers, preferably via open-source codes. The code should be well documented, and the techniques should be easy for novice users to apply. Availability of a toolset of methods may become a very important practical aspect of large-scale simulations in the upcoming years and decades.

We have mentioned several computer codes that implement diverse linear-scaling techniques and are available for researchers. Many of these tools are not yet commonly used, and they remain limited to the developers and collaborators. We expect that more attention will be drawn to the creation and dissemination of publicly available codes, and maturing of the existing tools. Numerous applications of these techniques to realistic large-scale systems and nanomaterials will appear eventually, as demanded by practical interests in energy conversion, nanotechnology, quantum computing, materials science, biology, and medicine.


AUTHOR INFORMATION

Corresponding Author

*E-mail: [email protected] .

Notes

The authors declare no competing financial interest.

Biographies

Alexey V. Akimov was born in Bryansk oblast, Russia. He received his Diploma in Chemistry in 2007 from M. V. Lomonosov Moscow State University, Moscow, Russia, under joint supervision by Prof. Alexander Nemukhin (MSU, Russia) and Prof. Anatoly Kolomeisky (Rice University, USA). He obtained his Ph.D. in Chemistry under Prof. Anatoly Kolomeisky at Rice University in 2011. In 2012 he started his postdoctoral appointment in Prof. Oleg Prezhdo’s group at the University of Rochester, continued since 2014 at the University of Southern California. In 2015 he will be joining the chemistry department at the University at Buffalo as an assistant professor. Dr. Akimov’s research interests include the development of semiclassical and quantum-classical methodologies for accurate and efficient simulation of quantum dynamics in abstract models and in large-scale atomistic systems. Specific applied interests include photoinduced processes of charge and energy transfer in solar energy materials.

Oleg V. Prezhdo obtained a Diploma in Theoretical Chemistry in 1991 from Kharkiv National University, Ukraine, under Anatoly Luzanov. He completed his Ph.D. with Peter Rossky at the University of Texas, Austin. After a postdoctoral fellowship with John Tully at Yale University, he joined the chemistry department at the University of Washington in 1998, achieving associate and full professor in 2002 and 2005. In 2008, he was elected Fellow of the American Physical Society, in 2010 was offered senior professorship at the University of Rochester, and in 2014 moved to the University of Southern California. He has served as an editor for The Journal of Physical Chemistry since 2008, The Journal of Physical Chemistry Letters since 2011, and Surface Science Reports since 2012. Recipient of multiple national and international awards, he has held invited professorships in France, Germany, Japan, and Ukraine. His current research interests range from fundamental aspects of semiclassical physics to excitation dynamics in nanoscale and biological systems.

ACKNOWLEDGMENTS

Financial support of the U.S. National Science Foundation (Grant CHE-1300118) and the U.S. Department of Energy (Grant DE-SC0006527) is gratefully acknowledged. We thank Alexander Andrievsky and Dr. Julia DeBaecke for their helpful comments on the language of the manuscript.

ABBREVIATIONS

AA-FF: all-atomic force field
ADMA: adjustable density matrix approach
AFDF: additive fuzzy density fragmentation
AIREBO: adaptive intermolecular reactive empirical bond order
AMn: Austin model number n (n = 1, 3)
ANN: artificial neural network
AO: atomic orbital
BLDFT: block-localized density functional theory
BLWFT: block-localized wave function theory
BO: bond order
BOC: bond order conservation
BOC-MP: bond order conservation Morse potential
CASSCF: complete active space self-consistent field
CC: coupled cluster
CCSD: coupled cluster with single and double excitations
CDFT: constrained density functional theory
CG: coarse-grained, cardinally guided, or conjugate gradient (see context)
CG-FF: coarse-grained force field
CI: configuration interaction
CIS: configuration interaction with only single excitations
CNDO: complete neglect of differential overlap
COMB: charge optimized many-body
CPMD: Car−Parrinello molecular dynamics
CPU: central processing unit
D&C: divide and conquer
DFT: density functional theory
DFTB: density functional tight-binding
DIIS: direct inversion in the iterative subspace
DMM: density matrix minimization
DP: dynamic programming
EA: electron affinity
EAM: embedded-atom method
E-EDC: energy-based D&C with charge conservation
EEM: electronegativity equalization method
EFP: effective fragment potential
EHT: extended Hückel theory
ELG: elongation method
ERI: electron repulsion integral
EVB: empirical valence bond
FF: force field
FMO: fragment molecular orbital
FO-DFT: fragment orbital density functional theory
FOE: Fermi operator expansion
FSCF: fragment self-consistent field


GGA: generalized gradient approximation
GMBE: generalized many-body expansion
GPU: graphics processing unit
GTO: Gaussian type orbital
HF: Hartree−Fock
INDO: intermediate neglect of differential overlap
IP: ionization potential
IS3CF: iterative stochastic subspace self-consistent field
KEDF: kinetic energy density functional
KS: Kohn−Sham
KS-DFT: Kohn−Sham density functional theory
LJ: Lennard-Jones
LMO: localized molecular orbital
MBE: many-body expansion
MCSCF: multiconfigurational self-consistent field
MD: molecular dynamics
MEAM: modified embedded-atom method
MEDLA: molecular electron density Lego approach
MEVB: maximal entropy valence bond
MFCC: molecular fractionation with conjugate caps
MINDO: modified intermediate neglect of differential overlap
MM: molecular mechanics
MNDO: modified neglect of diatomic overlap
MO: molecular orbital
MO-LCAO: molecular orbital as a linear combination of atomic orbitals
MOVB: molecular orbital valence bond
MPn: Møller−Plesset theory of order n (n = 1−4)
MSINDO: modified symmetrically orthogonalized intermediate neglect of differential overlap
MTF: molecular tailoring approach
NAI: nuclear attraction integral
NBI-MD: normalized bond index molecular dynamics
NDDO: neglect of diatomic differential overlap
NN: nearest neighbor
OF-DFT: orbital-free density functional theory
OM: orbital minimization
OMx: orthogonalization model number x (x = 1−3)
PCM: polarizable continuum model
PES: potential energy surface
PMn: parametric model number n (n = 1−7)
QD: quantum dot
QEq: charge equilibration scheme
QM: quantum mechanics
QM/MM: hybrid quantum mechanics−molecular mechanics
QTPIE: charge transfer with polarization current equalization
REBO: reactive empirical bond order
ROKS: restricted open-shell Kohn−Sham
SCF: self-consistent field
SC-EHT: self-consistent extended Hückel theory
SD: Slater determinant
SFS: superposition of fragment states
SINDO: symmetrically orthogonalized intermediate neglect of differential overlap
SLMO: strictly localized molecular orbital
SMF: systematic molecular fragmentation
SQE: split-charge equilibration
STO: Slater-type orbital
TD-DFT: time-dependent density functional theory
UBI-QEP: unity bond index conservation quadratic exponential potential
VB: valence bond
vdW: van der Waals
VSIP: valence state ionization potential
WFT: wave function theory
ZDO: zero differential overlap
ZINDO: Zerner’s parametrization of intermediate neglect of differential overlap

REFERENCES

(1) Martsinovich, N.; Troisi, A. High-Throughput Computational Screening of Chromophores for Dye-Sensitized Solar Cells. J. Phys. Chem. C 2011, 115, 11781−11792.
(2) Ambrosio, F.; Martsinovich, N.; Troisi, A. What Is the Best Anchoring Group for a Dye in a Dye-Sensitized Solar Cell? J. Phys. Chem. Lett. 2012, 3, 1531−1535.
(3) Kanal, I. Y.; Owens, S. G.; Bechtel, J. S.; Hutchison, G. R. Efficient Computational Screening of Organic Polymer Photovoltaics. J. Phys. Chem. Lett. 2013, 4, 1613−1623.
(4) Rick, S. W.; Stuart, S. J.; Berne, B. J. Dynamical Fluctuating Charge Force Fields: Application to Liquid Water. J. Chem. Phys. 1994, 101, 6141−6156.
(5) Su, J.; Goddard, W. Excited Electron Dynamics Modeling of Warm Dense Matter. Phys. Rev. Lett. 2007, 99, 185003.
(6) Su, J. T.; Goddard, W. A. Mechanisms of Auger-Induced Chemistry Derived from Wave Packet Dynamics. Proc. Natl. Acad. Sci. U. S. A. 2009, 106, 1001−1005.
(7) Ayton, G. S.; Noid, W. G.; Voth, G. A. Systematic Coarse Graining of Biomolecular and Soft-Matter Systems. MRS Bull. 2007, 32, 929−934.
(8) Izvekov, S.; Voth, G. A. Multiscale Coarse-Graining of Mixed Phospholipid/Cholesterol Bilayers. J. Chem. Theory Comput. 2006, 2, 637−648.
(9) Wang, Y.; Jiang, W.; Yan, T.; Voth, G. A. Understanding Ionic Liquids through Atomistic and Coarse-Grained Molecular Dynamics Simulations. Acc. Chem. Res. 2007, 40, 1193−1199.
(10) Liu, P.; Izvekov, S.; Voth, G. A. Multiscale Coarse-Graining of Monosaccharides. J. Phys. Chem. B 2007, 111, 11566−11575.
(11) Jiang, W.; Wang, Y.; Yan, T.; Voth, G. A. A Multiscale Coarse-Graining Study of the Liquid/Vacuum Interface of Room-Temperature Ionic Liquids with Alkyl Substituents of Different Lengths. J. Phys. Chem. C 2008, 112, 1132−1139.
(12) Shehu, A.; Kavraki, L. E.; Clementi, C. Multiscale Characterization of Protein Conformational Ensembles. Proteins: Struct., Funct., Bioinf. 2009, 76, 837−851.
(13) Stamati, H.; Clementi, C.; Kavraki, L. E. Application of Nonlinear Dimensionality Reduction to Characterize the Conformational Landscape of Small Peptides. Proteins: Struct., Funct., Bioinf. 2010, 78, 223−235.
(14) Virshup, A. M.; Chen, J.; Martínez, T. J. Nonlinear Dimensionality Reduction for Nonadiabatic Dynamics: The Influence of Conical Intersection Topography on Population Transfer Rates. J. Chem. Phys. 2012, 137, 22A519.
(15) Zheng, W.; Rohrdanz, M. A.; Clementi, C. Rapid Exploration of Configuration Space with Diffusion-Map-Directed Molecular Dynamics. J. Phys. Chem. B 2013, 117, 12769−12776.
(16) Akimov, A. V.; Prezhdo, O. V. Persistent Electronic Coherence Despite Rapid Loss of Electron−Nuclear Correlation. J. Phys. Chem. Lett. 2013, 4, 3857−3864.
(17) Kurganskaya, I.; Luttge, A. A Comprehensive Stochastic Model of Phyllosilicate Dissolution: Structure and Kinematics of Etch Pits Formed on Muscovite Basal Face. Geochim. Cosmochim. Acta 2013, 120, 545−560.
(18) Kurganskaya, I.; Luttge, A. Kinetic Monte Carlo Simulations of Silicate Dissolution: Model Complexity and Parametrization. J. Phys. Chem. C 2013, 117, 24894−24906.
(19) Liao, P.; Keith, J. A.; Carter, E. A. Water Oxidation on Pure and Doped Hematite (0001) Surfaces: Prediction of Co and Ni as Effective Dopants for Electrocatalysis. J. Am. Chem. Soc. 2012, 134, 13296−13309.

(20) Yu, J.; Sinnott, S.; Phillpot, S. Charge Optimized Many-Body Potential for the Si/SiO2 System. Phys. Rev. B 2007, 75, 085311.
(21) Devine, B.; Shan, T.-R.; Cheng, Y.-T.; McGaughey, A. J. H.; Lee, M.; Phillpot, S. R.; Sinnott, S. B. Atomistic Simulations of Copper Oxidation and Cu/Cu2O Interfaces Using Charge-Optimized Many-Body Potentials. Phys. Rev. B 2011, 84, 125308.
(22) Devine, B.; Shan, T.-R.; Cheng, Y.-T.; McGaughey, A. J. H.; Lee, M.; Phillpot, S. R.; Sinnott, S. B. Erratum: Atomistic Simulations of Copper Oxidation and Cu/Cu2O Interfaces Using Charge-Optimized Many-Body Potentials [Phys. Rev. B 84, 125308 (2011)]. Phys. Rev. B 2012, 85, 199904(E).
(23) Liang, T.; Devine, B.; Phillpot, S. R.; Sinnott, S. B. Variable Charge Reactive Potential for Hydrocarbons to Simulate Organic-Copper Interactions. J. Phys. Chem. A 2012, 116, 7976−7991.
(24) Strachan, A.; Kober, E. M.; van Duin, A. C. T.; Oxgaard, J.; Goddard, W. A. Thermal Decomposition of RDX from Reactive Molecular Dynamics. J. Chem. Phys. 2005, 122, 054502.
(25) Alvarez, Y. E.; Watson, J. K.; Mathews, J. P. Novel Simplification Approach for Large-Scale Structural Models of Coal: Three-Dimensional Molecules to Two-Dimensional Lattices. Part 1: Lattice Creation. Energy Fuels 2012, 26, 4938−4945.
(26) Alvarez, Y. E.; Moreno, B. M.; Klein, M. T.; Watson, J. K.; Castro-Marcano, F.; Mathews, J. P. Novel Simplification Approach for Large-Scale Structural Models of Coal: Three-Dimensional Molecules to Two-Dimensional Lattices. Part 3: Reactive Lattice Simulations. Energy Fuels 2013, 27, 2915−2922.
(27) Wood, M. A.; van Duin, A. C. T.; Strachan, A. Coupled Thermal and Electromagnetic Induced Decomposition in the Molecular Explosive αHMX; A Reactive Molecular Dynamics Study. J. Phys. Chem. A 2014, 118, 885−895.
(28) Van Duin, A. C. T.; Dasgupta, S.; Lorant, F.; Goddard, W. A. ReaxFF: A Reactive Force Field for Hydrocarbons. J. Phys. Chem. A 2001, 105, 9396−9409.
(29) Nielson, K. D.; van Duin, A. C. T.; Oxgaard, J.; Deng, W.-Q.; Goddard, W. A. Development of the ReaxFF Reactive Force Field for Describing Transition Metal Catalyzed Reactions, with Application to the Initial Stages of the Catalytic Formation of Carbon Nanotubes. J. Phys. Chem. A 2005, 109, 493−499.
(30) Lebedeva, I. V.; Knizhnik, A. A.; Popov, A. M.; Potapkin, B. V. Ni-Assisted Transformation of Graphene Flakes to Fullerenes. J. Phys. Chem. C 2012, 116, 6572−6584.
(31) Buehler, M.; van Duin, A.; Goddard, W. Multiparadigm Modeling of Dynamical Crack Propagation in Silicon Using a Reactive Force Field. Phys. Rev. Lett. 2006, 96, 095505.
(32) Buehler, M.; Tang, H.; van Duin, A.; Goddard, W. Threshold Crack Speed Controls Dynamical Fracture of Silicon Single Crystals. Phys. Rev. Lett. 2007, 99, 165502.
(33) Peng, Q.; Zhang, X.; Hung, L.; Carter, E.; Lu, G. Quantum Simulation of Materials at Micron Scales and beyond. Phys. Rev. B 2008, 78, 054118.
(34) Hung, L.; Carter, E. A. Orbital-Free DFT Simulations of Elastic Response and Tensile Yielding of Ultrathin [111] Al Nanowires. J. Phys. Chem. C 2011, 115, 6269−6276.
(35) Shin, I.; Carter, E. A. Possible Origin of the Discrepancy in Peierls Stresses of Fcc Metals: First-Principles Simulations of Dislocation Mobility in Aluminum. Phys. Rev. B 2013, 88, 064106.
(36) Sen, F. G.; Qi, Y.; van Duin, A. C. T.; Alpas, A. T. Oxidation Induced Softening in Al Nanowires. Appl. Phys. Lett. 2013, 102, 051912.
(37) Ovchinnikov, V.; Trout, B. L.; Karplus, M. Mechanical Coupling in Myosin V: A Simulation Study. J. Mol. Biol. 2010, 395, 815−833.
(38) Akimov, A. V.; Nemukhin, A. V.; Moskovsky, A. A.; Kolomeisky, A. B.; Tour, J. M. Molecular Dynamics of Surface-Moving Thermally Driven Nanocars. J. Chem. Theory Comput. 2008, 4, 652−656.
(39) Akimov, A. V.; Kolomeisky, A. B. Unidirectional Rolling Motion of Nanocars Induced by Electric Field. J. Phys. Chem. C 2012, 116, 22595−22601.
(40) Konyukhov, S. S.; Kupchenko, I. V.; Moskovsky, A. A.; Nemukhin, A. V.; Akimov, A. V.; Kolomeisky, A. B. Rigid-Body Molecular Dynamics of Fullerene-Based Nanocars on Metallic Surfaces. J. Chem. Theory Comput. 2010, 6, 2581−2590.
(41) Schunack, M.; Linderoth, T.; Rosei, F.; Lægsgaard, E.; Stensgaard, I.; Besenbacher, F. Long Jumps in the Surface Diffusion of Large Molecules. Phys. Rev. Lett. 2002, 88, 156102.
(42) Beu, T. A. Molecular Dynamics Simulations of Ion Transport through Carbon Nanotubes. II. Structural Effects of the Nanotube Radius, Solute Concentration, and Applied Electric Fields. J. Chem. Phys. 2011, 135, 044515.
(43) Beu, T. A. Molecular Dynamics Simulations of Ion Transport through Carbon Nanotubes. III. Influence of the Nanotube Radius, Solute Concentration, and Applied Electric Fields on the Transport Properties. J. Chem. Phys. 2011, 135, 044516.
(44) Prokop, A.; Vacek, J.; Michl, J. Friction in Carborane-Based Molecular Rotors Driven by Gas Flow or Electric Field: Classical Molecular Dynamics. ACS Nano 2012, 6, 1901−1914.
(45) Wang, B.; Kral, P. Dragging of Polarizable Nanodroplets by Distantly Solvated Ions. Phys. Rev. Lett. 2008, 101, 046103.
(46) Kral, P.; Vukovic, L.; Patra, N.; Wang, B.; Sint, K.; Titov, A. Control of Rotary Motion at the Nanoscale: Motility, Actuation, Self-Assembly. J. Nanosci. Lett. 2011, 1, 128−144.
(47) Vukovic, L.; Kral, P. Coulombically Driven Rolling of Nanorods on Water. Phys. Rev. Lett. 2009, 103, 246103.
(48) Oberhofer, H.; Blumberger, J. Electronic Coupling Matrix Elements from Charge Constrained Density Functional Theory Calculations Using a Plane Wave Basis Set. J. Chem. Phys. 2010, 133, 244105.
(49) Oberhofer, H.; Blumberger, J. Revisiting Electronic Couplings and Incoherent Hopping Models for Electron Transport in Crystalline C60 at Ambient Temperatures. Phys. Chem. Chem. Phys. 2012, 14, 13846−13852.
(50) Gajdos, F.; Oberhofer, H.; Dupuis, M.; Blumberger, J. On the Inapplicability of Electron-Hopping Models for the Organic Semiconductor Phenyl-C61-Butyric Acid Methyl Ester (PCBM). J. Phys. Chem. Lett. 2013, 4, 1012−1017.
(51) Freddolino, P. L.; Arkhipov, A. S.; Larson, S. B.; McPherson, A.; Schulten, K. Molecular Dynamics Simulations of the Complete Satellite Tobacco Mosaic Virus. Structure 2006, 14, 437−449.
(52) Kitao, A.; Yonekura, K.; Maki-Yonekura, S.; Samatey, F. A.; Imada, K.; Namba, K.; Go, N. Switch Interactions Control Energy Frustration and Multiple Flagellar Filament Structures. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 4894−4899.
(53) Sanbonmatsu, K. Y.; Tung, C.-S. High Performance Computing in Biology: Multimillion Atom Simulations of Nanoscale Systems. J. Struct. Biol. 2007, 157, 470−480.
(54) Zhao, G.; Perilla, J. R.; Yufenyuy, E. L.; Meng, X.; Chen, B.; Ning, J.; Ahn, J.; Gronenborn, A. M.; Schulten, K.; Aiken, C.; et al. Mature HIV-1 Capsid Structure by Cryo-Electron Microscopy and All-Atom Molecular Dynamics. Nature 2013, 497, 643−646.
(55) Senapati, S.; Berkowitz, M. L. Molecular Dynamics Simulation Studies of Polyether and Perfluoropolyether Surfactant Based Reverse Micelles in Supercritical Carbon Dioxide. J. Phys. Chem. B 2003, 107, 12906−12916.
(56) Minakova, M.; Savelyev, A.; Papoian, G. A. Nonequilibrium Water Transport in a Nonionic Microemulsion System. J. Phys. Chem. B 2011, 115, 6503−6508.
(57) Arkhipov, A.; Freddolino, P. L.; Imada, K.; Namba, K.; Schulten, K. Coarse-Grained Molecular Dynamics Simulations of a Rotating Bacterial Flagellum. Biophys. J. 2006, 91, 4589−4597.
(58) Jambeck, J. P. M.; Eriksson, E. S. E.; Laaksonen, A.; Lyubartsev, A. P.; Eriksson, L. A. Molecular Dynamics Studies of Liposomes as Carriers for Photosensitizing Drugs: Development, Validation, and Simulations with a Coarse-Grained Model. J. Chem. Theory Comput. 2014, 10, 5−13.
(59) Grime, J. M. A.; Voth, G. A. Highly Scalable and Memory Efficient Ultra-Coarse-Grained Molecular Dynamics Simulations. J. Chem. Theory Comput. 2014, 10, 423−431.
(60) Klein, M. L.; Shinoda, W. Large-Scale Molecular Dynamics Simulations of Self-Assembling Systems. Science 2008, 321, 798−800.
(61) Potoyan, D. A.; Savelyev, A.; Papoian, G. A. Recent Successes in Coarse-Grained Modeling of DNA. Wiley Interdiscip. Rev. Comput. Mol. Sci. 2013, 3, 69−83.
(62) Bredow, T.; Jug, K. Theory and Range of Modern Semiempirical Molecular Orbital Methods. Theor. Chem. Acc. 2005, 113, 1−14.
(63) Elstner, M. The SCC-DFTB Method and Its Application to Biological Systems. Theor. Chem. Acc. 2006, 116, 316−325.
(64) Elstner, M.; Seifert, G. Density Functional Tight Binding. Philos. Trans. R. Soc., A 2014, 372, 20120483.
(65) Hung, L.; Carter, E. A. Accurate Simulations of Metals at the Mesoscale: Explicit Treatment of 1 Million Atoms with Quantum Mechanics. Chem. Phys. Lett. 2009, 475, 163−170.
(66) Aghtar, M.; Strumpfer, J.; Olbrich, C.; Schulten, K.; Kleinekathofer, U. The FMO Complex in a Glycerol−Water Mixture. J. Phys. Chem. B 2013, 117, 7157−7163.
(67) Chandler, D. E.; Strumpfer, J.; Sener, M.; Scheuring, S.; Schulten, K. Light Harvesting by Lamellar Chromatophores in Rhodospirillum Photometricum. Biophys. J. 2014, 106, 2503−2510.
(68) Yang, W. Direct Calculation of Electron Density in Density-Functional Theory. Phys. Rev. Lett. 1991, 66, 1438−1441.
(69) Yang, W. Direct Calculation of Electron Density in Density-Functional Theory: Implementation for Benzene and a Tetrapeptide. Phys. Rev. A 1991, 44, 7823−7826.
(70) Dixon, S. L.; Merz, K. M. Semiempirical Molecular Orbital Calculations with Linear System Size Scaling. J. Chem. Phys. 1996, 104, 6643−6649.
(71) Daniels, A. D.; Millam, J. M.; Scuseria, G. E. Semiempirical Methods with Conjugate Gradient Density Matrix Search to Replace Diagonalization for Molecular Systems Containing Thousands of Atoms. J. Chem. Phys. 1997, 107, 425−431.
(72) Dixon, S. L.; Merz, K. M. Fast, Accurate Semiempirical Molecular Orbital Calculations for Macromolecules. J. Chem. Phys. 1997, 107, 879−893.
(73) Neugebauer, J. Couplings between Electronic Transitions in a Subsystem Formulation of Time-Dependent Density Functional Theory. J. Chem. Phys. 2007, 126, 134116.
(74) Neugebauer, J.; Curutchet, C.; Munoz-Losa, A.; Mennucci, B. A Subsystem TDDFT Approach for Solvent Screening Effects on Excitation Energy Transfer Couplings. J. Chem. Theory Comput. 2010, 6, 1843−1851.
(75) He, X.; Merz, K. M. Divide and Conquer Hartree−Fock Calculations on Proteins. J. Chem. Theory Comput. 2010, 6, 405−411.
(76) Gordon, M. S.; Fedorov, D. G.; Pruitt, S. R.; Slipchenko, L. V. Fragmentation Methods: A Route to Accurate Calculations on Large Systems. Chem. Rev. 2012, 112, 632−672.
(77) Nishimoto, Y.; Fedorov, D. G.; Irle, S. Density-Functional Tight-Binding Combined with the Fragment Molecular Orbital Method. J. Chem. Theory Comput. 2014, 10, 4801−4812.
(78) Harrison, J. A.; Stuart, S. J.; Tutein, A. B. A New, Reactive Potential Energy Function to Study the Indentation and Friction of n-Alkane C13 Monolayers. In Interfacial Properties on the Submicrometer Scale; Frommer, J., Overney, R. M., Eds.; American Chemical Society: Washington, DC, 2000; Vol. 781, pp 216−229.
(79) York, D. M.; Lee, T.-S.; Yang, W. Quantum Mechanical Study of Aqueous Polarization Effects on Biological Macromolecules. J. Am. Chem. Soc. 1996, 118, 10940−10941.
(80) Khandogin, J.; York, D. M. Quantum Descriptors for Biological Macromolecules from Linear-Scaling Electronic Structure Methods. Proteins: Struct., Funct., Bioinf. 2004, 56, 724−737.
(81) Giese, T. J.; Chen, H.; Dissanayake, T.; Giambasu, G. M.; Heldenbrand, H.; Huang, M.; Kuechler, E. R.; Lee, T.-S.; Panteva, M. T.; Radak, B. K.; et al. A Variational Linear-Scaling Framework to Build Practical, Efficient Next-Generation Orbital-Based Quantum Force Fields. J. Chem. Theory Comput. 2013, 9, 1417−1427.
(82) Giese, T. J.; Chen, H.; Huang, M.; York, D. M. Parametrization of an Orbital-Based Linear-Scaling Quantum Force Field for Noncovalent Interactions. J. Chem. Theory Comput. 2014, 10, 1086−1098.
(83) Fiedler, L.; Gao, J.; Truhlar, D. G. Polarized Molecular Orbital Model Chemistry. 1. Ab Initio Foundations. J. Chem. Theory Comput. 2011, 7, 852−856.
(84) Zhang, P.; Fiedler, L.; Leverentz, H. R.; Truhlar, D. G.; Gao, J. Polarized Molecular Orbital Model Chemistry. 2. The PMO Method. J. Chem. Theory Comput. 2011, 7, 857−867.
(85) Beck, M. H.; Jackle, A.; Worth, G. A.; Meyer, H.-D. The Multiconfiguration Time-Dependent Hartree (MCTDH) Method: A Highly Efficient Algorithm For Propagating Wavepackets. Phys. Rep. 2000, 324, 1−105.
(86) Brenner, D. W. The Art and Science of an Analytic Potential. Phys. Status Solidi B 2000, 217, 23−40.
(87) Bowler, D. R.; Miyazaki, T.; Gillan, M. J. Recent Progress in Linear Scaling Ab Initio Electronic Structure Techniques. J. Phys.: Condens. Matter 2002, 14, 2781−2798.
(88) Bowler, D. R.; Miyazaki, T. O(N) Methods in Electronic Structure Calculations. Rep. Prog. Phys. 2012, 75, 036503.
(89) Riccardi, D.; Schaefer, P.; Yang; Yu, H.; Ghosh, N.; Prat-Resina, X.; Konig, P.; Li, G.; Xu, D.; Guo, H.; et al. Development of Effective Quantum Mechanical/Molecular Mechanical (QM/MM) Methods for Complex Biological Processes. J. Phys. Chem. B 2006, 110, 6458−6469.
(90) Fedorov, D. G.; Kitaura, K. Extending the Power of Quantum Chemistry to Large Systems with the Fragment Molecular Orbital Method. J. Phys. Chem. A 2007, 111, 6904−6914.
(91) Gordon, M. S.; Slipchenko, L.; Li, H.; Jensen, J. H. Chapter 10 The Effective Fragment Potential: A General Method for Predicting Intermolecular Interactions. Annu. Rep. Comput. Chem. 2007, 3, 177−193.
(92) Clarke, A. S.; Hamm, S. M.; Cardenas, A. E. Chapter 2 Extending Atomistic Time Scale Simulations by Optimization of the Action. Annu. Rep. Comput. Chem. 2007, 3, 15−30.
(93) Lodola, A.; Woods, C. J.; Mulholland, A. J. Chapter 9 Applications and Advances of QM/MM Methods in Computational Enzymology. Annu. Rep. Comput. Chem. 2008, 4, 155−169.
(94) Perez, D.; Uberuaga, B. P.; Shim, Y.; Amar, J. G.; Voter, A. F. Chapter 4 Accelerated Molecular Dynamics Methods: Introduction and Recent Developments. Annu. Rep. Comput. Chem. 2009, 5, 79−98.
(95) Xu, D.; Williamson, M. J.; Walker, R. C. Advancements in Molecular Dynamics Simulations of Biomolecules on Graphical Processing Units. Annu. Rep. Comput. Chem. 2010, 6, 2−19.
(96) Gotz, A. W.; Wolfle, T.; Walker, R. C. Quantum Chemistry on Graphics Processing Units. Annu. Rep. Comput. Chem. 2010, 6, 21−35.
(97) Kamerlin, S.; Vicatos, S.; Dryga, A.; Warshel, A. Coarse-Grained (multiscale) Simulations in Studies of Biophysical and Chemical Systems. Annu. Rev. Phys. Chem. 2010, 62, 41−64.
(98) Groenhof, G.; Boggio-Pasqua, M.; Shafer, L. V.; Robb, M. A. Computer Simulations of Photobiological Processes: The Effect of the Protein Environment. Adv. Quantum Chem. 2010, 59, 181−212.
(99) Merchant, B. A.; Madura, J. D. A Review of Coarse-Grained Molecular Dynamics Techniques to Access Extended Spatial and Temporal Scales in Biomolecular Simulations. Annu. Rep. Comput. Chem. 2011, 7, 67−87.
(100) Shen, H.; Xia, Z.; Li, G.; Ren, P. A Review of Physics-Based Coarse-Grained Potentials for the Simulations of Protein Structure and Dynamics. Annu. Rep. Comput. Chem. 2012, 8, 129−148.
(101) Beran, G. J. O.; Hirata, S. Fragment and Localized Orbital Methods in Electronic Structure Theory. Phys. Chem. Chem. Phys. 2012, 14, 7559−7561.
(102) Jacobson, L. D.; Richard, R. M.; Lao, K. U.; Herbert, J. M. Efficient Monomer-Based Quantum Chemistry Methods for Molecular and Ionic Clusters. Annu. Rep. Comput. Chem. 2013, 9, 25−58.
(103) Gao, J.; Zhang, J. Z.; Houk, K. N. Beyond QM/MM: Fragment Quantum Mechanical Methods. Acc. Chem. Res. 2014, 47, 2711−2711.
(104) Li, S.; Li, W.; Ma, J. Generalized Energy-Based Fragmentation Approach and Its Applications to Macromolecules and Molecular Aggregates. Acc. Chem. Res. 2014, 47, 2712−2720.
(105) Wang, B.; Yang, K. R.; Xu, X.; Isegawa, M.; Leverentz, H. R.; Truhlar, D. G. Quantum Mechanical Fragment Methods Based on
Partitioning Atoms or Partitioning Coordinates. Acc. Chem. Res. 2014, 47, 2731−2738.
(106) Sahu, N.; Gadre, S. R. Molecular Tailoring Approach: A Route for ab Initio Treatment of Large Clusters. Acc. Chem. Res. 2014, 47, 2739−2747.
(107) Pruitt, S. R.; Bertoni, C.; Brorsen, K. R.; Gordon, M. S. Efficient and Accurate Fragmentation Methods. Acc. Chem. Res. 2014, 47, 2786−2794.
(108) Giese, T. J.; Huang, M.; Chen, H.; York, D. M. Recent Advances toward a General Purpose Linear-Scaling Quantum Force Field. Acc. Chem. Res. 2014, 47, 2812−2820.
(109) Sisto, A.; Glowacki, D. R.; Martínez, T. J. Ab Initio Nonadiabatic Dynamics of Multichromophore Complexes: A Scalable Graphical-Processing-Unit-Accelerated Exciton Framework. Acc. Chem. Res. 2014, 47, 2857−2866.
(110) Merz, K. M. Using Quantum Mechanical Approaches to Study Biological Systems. Acc. Chem. Res. 2014, 47, 2804−2811.
(111) Mezey, P. G. Fuzzy Electron Density Fragments in Macromolecular Quantum Chemistry, Combinatorial Quantum Chemistry, Functional Group Analysis, and Shape−Activity Relations. Acc. Chem. Res. 2014, 47, 2821−2827.
(112) Dunlap, B. I.; Connolly, J. W. D.; Sabin, J. R. On First-Row Diatomic Molecules and Local Density Models. J. Chem. Phys. 1979, 71, 4993−4999.
(113) Dunlap, B. I. Accurate Density Functional Calculation on Large Systems. Int. J. Quantum Chem. 1996, 58, 123−132.
(114) Ren, X.; Rinke, P.; Blum, V.; Wieferink, J.; Tkatchenko, A.; Sanfilippo, A.; Reuter, K.; Scheffler, M. Resolution-of-Identity Approach to Hartree−Fock, Hybrid Density Functionals, RPA, MP2 and GW with Numeric Atom-Centered Orbital Basis Functions. New J. Phys. 2012, 14, 053020.
(115) Sodt, A.; Subotnik, J. E.; Head-Gordon, M. Linear Scaling Density Fitting. J. Chem. Phys. 2006, 125, 194109.
(116) Foerster, D. Elimination, in Electronic Structure Calculations, of Redundant Orbital Products. J. Chem. Phys. 2008, 128, 034108.
(117) Beebe, N. H.; Linderberg, J. Simplifications in the Generation and Transformation of Two-Electron Integrals in Molecular Calculations. Int. J. Quantum Chem. 1977, 12, 683−705.
(118) Aquilante, F.; Boman, L.; Bostrom, J.; Koch, H.; Lindh, R.; de Meras, A. S.; Pedersen, T. B. Cholesky Decomposition Techniques in Electronic Structure Theory. In Linear-Scaling Techniques in Computational Chemistry and Physics; Springer: Berlin, 2011; pp 301−343.
(119) Aquilante, F.; Pedersen, T. B.; Lindh, R. Low-Cost Evaluation of the Exchange Fock Matrix from Cholesky and Density Fitting Representations of the Electron Repulsion Integrals. J. Chem. Phys. 2007, 126, 194106.
(120) Doser, B.; Zienau, J.; Clin, L.; Lambrecht, D. S.; Ochsenfeld, C. A Linear-Scaling MP2 Method for Large Molecules by Rigorous Integral-Screening. Z. Phys. Chem. 2010, 224, 397−412.
(121) Lowdin, P.-O. Some Properties of Inner Projections. Int. J. Quantum Chem. 1970, 5, 231−237.
(122) Whitten, J. L. Coulombic Potential Energy Integrals and Approximations. J. Chem. Phys. 1973, 58, 4496−4501.
(123) Dunlap, B. I.; Connolly, J. W. D.; Sabin, J. R. On Some Approximations in Applications of Xα Theory. J. Chem. Phys. 1979, 71, 3396−3402.
(124) Weigend, F.; Haser, M.; Patzelt, H.; Ahlrichs, R. RI-MP2: Optimized Auxiliary Basis Sets and Demonstration of Efficiency. Chem. Phys. Lett. 1998, 294, 143−152.
(125) Goh, S. K.; St-Amant, A. Using a Fitted Electronic Density to Improve the Efficiency of a Linear Combination of Gaussian-Type Orbitals Calculations. Chem. Phys. Lett. 1997, 264, 9−16.
(126) Greengard, L.; Rokhlin, V. A Fast Algorithm for Particle Simulations. J. Comput. Phys. 1987, 73, 325−348.
(127) White, C. A.; Johnson, B. G.; Gill, P. M. W.; Head-Gordon, M. The Continuous Fast Multipole Method. Chem. Phys. Lett. 1994, 230, 8−16.
(128) White, C. A.; Johnson, B. G.; Gill, P. M. W.; Head-Gordon, M. Linear Scaling Density Functional Calculations via the Continuous Fast Multipole Method. Chem. Phys. Lett. 1996, 253, 268−278.
(129) Karasawa, N.; Goddard, W. A., III. Acceleration of Convergence for Lattice Sums. J. Phys. Chem. 1989, 93, 7320−7327.
(130) Wheeler, D. R.; Newman, J. A Less Expensive Ewald Lattice Sum. Chem. Phys. Lett. 2002, 366, 537−543.
(131) Komeiji, Y. Ewald Summation and Multiple Time Step Methods for Molecular Dynamics Simulation of Biological Molecules. J. Mol. Struct.: THEOCHEM 2000, 530, 237−243.
(132) Eichinger, M.; Grubmueller, H.; Heller, H.; Tavan, P. FAMUSAMM: An Algorithm for Rapid Evaluation of Electrostatic Interactions in Molecular Dynamics Simulations. J. Comput. Chem. 1997, 18, 1729−1749.
(133) Cisneros, G. A.; Karttunen, M.; Ren, P.; Sagui, C. Classical Electrostatics for Biomolecular Simulations. Chem. Rev. 2014, 114, 779−814.
(134) Amisaki, T. Precise and Efficient Ewald Summation for Periodic Fast Multipole Method. J. Comput. Chem. 2000, 21, 1075−1087.
(135) Aguado, A.; Madden, P. A. Ewald Summation of Electrostatic Multipole Interactions up to the Quadrupolar Level. J. Chem. Phys. 2003, 119, 7471−7483.
(136) Wolf, D.; Keblinski, P.; Phillpot, S. R.; Eggebrecht, J. Exact Method for the Simulation of Coulombic Systems by Spherically Truncated, Pairwise R−1 Summation. J. Chem. Phys. 1999, 110, 8254−8282.
(137) Kutteh, R.; Apra, E.; Nichols, J. A Generalized Fast Multipole Approach for Hartree-Fock and Density Functional Computations. Chem. Phys. Lett. 1995, 238, 173−179.
(138) Burant, J. C.; Strain, M. C.; Scuseria, G. E.; Frisch, M. J. Analytic Energy Gradients for the Gaussian Very Fast Multipole Method (GvFMM). Chem. Phys. Lett. 1996, 248, 43−49.
(139) Ochsenfeld, C.; Kussmann, J.; Lambrecht, D. S. Linear-Scaling Methods in Quantum Chemistry. Rev. Comput. Chem. 2007, 23, 1−82.
(140) Choi, C. H. Recent Development of Linear Scaling Quantum Theories in GAMESS. Bull. Korean Chem. Soc. 2003, 24, 733−738.
(141) Sun, X.; Pitsianis, N. P. A Matrix Version of the Fast Multipole Method. SIAM Rev. 2001, 43, 289−300.
(142) Voter, A. F. A Method for Accelerating the Molecular Dynamics Simulation of Infrequent Events. J. Chem. Phys. 1997, 106, 4665−4677.
(143) Sørensen, M. R.; Voter, A. F. Temperature-Accelerated Dynamics for Simulation of Infrequent Events. J. Chem. Phys. 2000, 112, 9599−9606.
(144) Montalenti, F.; Sørensen, M.; Voter, A. Closing the Gap between Experiment and Theory: Crystal Growth by Temperature Accelerated Dynamics. Phys. Rev. Lett. 2001, 87, 126101.
(145) Montalenti, F.; Voter, A. F. Exploiting Past Visits or Minimum-Barrier Knowledge to Gain Further Boost in the Temperature-Accelerated Dynamics Method. J. Chem. Phys. 2002, 116, 4819−4828.
(146) Laio, A.; Gervasio, F. L. Metadynamics: A Method to Simulate Rare Events and Reconstruct the Free Energy in Biophysics, Chemistry and Material Science. Rep. Prog. Phys. 2008, 71, 126601.
(147) Voter, A. F. Hyperdynamics: Accelerated Molecular Dynamics of Infrequent Events. Phys. Rev. Lett. 1997, 78, 3908−3911.
(148) Miron, R. A.; Fichthorn, K. A. Accelerated Molecular Dynamics with the Bond-Boost Method. J. Chem. Phys. 2003, 119, 6210−6216.
(149) Fichthorn, K. A.; Miron, R. A.; Wang, Y.; Tiwary, Y. Accelerated Molecular Dynamics Simulation of Thin-Film Growth with the Bond-Boost Method. J. Phys.: Condens. Matter 2009, 21, 084212.
(150) Moskovsky, A. A.; Vanovschi, V. V.; Konyukhov, S. S.; Nemukhin, A. V. Implementation of the Replica-Exchange Molecular Dynamics Method for Rigid Bodies. Int. J. Quantum Chem. 2006, 106, 2208−2213.
(151) Kubitzki, M. B.; de Groot, B. L. Molecular Dynamics Simulations Using Temperature-Enhanced Essential Dynamics Replica Exchange. Biophys. J. 2007, 92, 4262−4270.
(152) Dokholyan, N. V.; Buldyrev, S. V.; Stanley, H. E.; Shakhnovich, E. I. Discrete Molecular Dynamics Studies of the Folding of a Protein-like Model. Folding Des. 1998, 3, 577−587.
(153) Ding, F.; Borreguero, J. M.; Buldyrev, S.; Stanley, H. E.; Dokholyan, N. V. Mechanism for the α-Helix to β-Hairpin Transition. Proteins: Struct., Funct., Genet. 2003, 53, 220−228.
(154) Hernandez de la Pena, L.; van Zon, R.; Schofield, J.; Opps, S. B. Discontinuous Molecular Dynamics for Semiflexible and Rigid Bodies. J. Chem. Phys. 2007, 126, 074105.
(155) Hernandez de la Pena, L.; van Zon, R.; Schofield, J.; Opps, S. B. Discontinuous Molecular Dynamics for Rigid Bodies: Applications. J. Chem. Phys. 2007, 126, 074106.
(156) Van Zon, R.; Schofield, J. Event-Driven Dynamics of Rigid Bodies Interacting via Discretized Potentials. J. Chem. Phys. 2008, 128, 154119.
(157) Schlick, T. Molecular Dynamics-Based Approaches for Enhanced Sampling of Long-Time, Large-Scale Conformational Changes in Biomolecules. F1000 Biol. Rep. 2009, 1, 51.
(158) Kim, S. Y.; Perez, D.; Voter, A. F. Local Hyperdynamics. J. Chem. Phys. 2013, 139, 144110.
(159) Ufimtsev, I. S.; Martínez, T. J. Graphical Processing Units for Quantum Chemistry. Comput. Sci. Eng. 2008, 10, 26−34.
(160) Ufimtsev, I. S.; Martínez, T. J. Quantum Chemistry on Graphical Processing Units. 1. Strategies for Two-Electron Integral Evaluation. J. Chem. Theory Comput. 2008, 4, 222−231.
(161) Ufimtsev, I. S.; Martínez, T. J. Quantum Chemistry on Graphical Processing Units. 2. Direct Self-Consistent-Field Implementation. J. Chem. Theory Comput. 2009, 5, 1004−1015.
(162) Ufimtsev, I. S.; Martínez, T. J. Quantum Chemistry on Graphical Processing Units. 3. Analytical Energy Gradients, Geometry Optimization, and First Principles Molecular Dynamics. J. Chem. Theory Comput. 2009, 5, 2619−2628.
(163) Anderson, J. A.; Lorenz, C. D.; Travesset, A. General Purpose Molecular Dynamics Simulations Fully Implemented on Graphics Processing Units. J. Comput. Phys. 2008, 227, 5342−5359.
(164) Stone, J. E.; Phillips, J. C.; Freddolino, P. L.; Hardy, D. J.; Trabuco, L. G.; Schulten, K. Accelerating Molecular Modeling Applications with Graphics Processors. J. Comput. Chem. 2007, 28, 2618−2640.
(165) Vogt, L.; Olivares-Amaya, R.; Kermes, S.; Shao, Y.; Amador-Bedolla, C.; Aspuru-Guzik, A. Accelerating Resolution-of-the-Identity Second-Order Møller−Plesset Quantum Chemistry Calculations with Graphical Processing Units. J. Phys. Chem. A 2008, 112, 2049−2057.
(166) Yasuda, K. Two-Electron Integral Evaluation on the Graphics Processor Unit. J. Comput. Chem. 2008, 29, 334−342.
(167) Hohenberg, P.; Kohn, W. Inhomogeneous Electron Gas. Phys. Rev. 1964, 136, B864−B871.
(168) Kohn, W.; Sham, L. J. Self-Consistent Equations Including Exchange and Correlation Effects. Phys. Rev. 1965, 140, A1133−A1138.
(169) Pople, J. A.; Santry, D. P.; Segal, G. A. Approximate Self-Consistent Molecular Orbital Theory. I. Invariant Procedures. J. Chem. Phys. 1965, 43, S129−S135.
(170) Pople, J. A.; Segal, G. A. Approximate Self-Consistent Molecular Orbital Theory. II. Calculations with Complete Neglect of Differential Overlap. J. Chem. Phys. 1965, 43, S136−S151.
(171) Pople, J. A. Approximate Self-Consistent Molecular Orbital Theory. III. CNDO Results for AB2 and AB3 Systems. J. Chem. Phys. 1966, 44, 3289−3296.
(172) Pople, J. A. Approximate Self-Consistent Molecular-Orbital Theory. V. Intermediate Neglect of Differential Overlap. J. Chem. Phys. 1967, 47, 2026−2033.
(173) Baird, N. C.; Dewar, M. J. Ground States of σ-Bonded Molecules. IV. The MINDO Method and Its Application to Hydrocarbons. J. Chem. Phys. 1969, 50, 1262−1274.
(174) Dewar, M. J.; Haselbach, E. Ground States of σ-Bonded Molecules. IX. MINDO [modified Intermediate Neglect of Differential overlap]/2 Method. J. Am. Chem. Soc. 1970, 92, 590−598.
(175) Bingham, R. C.; Dewar, M. J.; Lo, D. H. Ground States of Molecules. XXV. MINDO/3. Improved Version of the MINDO Semiempirical SCF-MO Method. J. Am. Chem. Soc. 1975, 97, 1285−1293.
(176) Dewar, M. J. S.; Hojvat (Sabelli), N. L. The SPO (Split P-Orbital) Method and Its Application to Ethylene. J. Chem. Phys. 1961, 34, 1232−1236.
(177) Dewar, M. J. S.; Hojvat, N. L. The S.p-O. (Split-P-Orbital) Method. II. Further Definition and Application to Acetylene. Proc. R. Soc., Ser. A: Math. Phys. Eng. Sci. 1961, 264, 431−444.
(178) Dewar, M. J. S.; Sabelli, N. L. The Split P-Orbital (S.P.O.) Method. III. Relationship to Other M.O. Treatments and Application to Benzene, Butadiene, and Naphthalene. J. Phys. Chem. 1962, 66, 2310−2316.
(179) Klopman, G. A Semiempirical Treatment of Molecular Structures. II. Molecular Terms and Application to Diatomic Molecules. J. Am. Chem. Soc. 1964, 86, 4550−4557.
(180) Klopman, G. A Semiempirical Treatment of Molecular Structures. III. Equipotential Orbitals for Polyatomic Systems. J. Am. Chem. Soc. 1965, 87, 3300−3303.
(181) Oda, A.; Hirono, S. Geometry-Dependent Atomic Charge Calculations Using Charge Equilibration Method with Empirical Two-Center Coulombic Terms. J. Mol. Struct.: THEOCHEM 2003, 634, 159−170.
(182) Mataga, N.; Nishimoto, K. Z. Phys. Chem. 1957, 13, 140.
(183) Ridley, J.; Zerner, M. C. An Intermediate Neglect of Differential Overlap Technique for Spectroscopy: Pyrrole and the Azines. Theor. Chim. Acta 1973, 32, 111.
(184) Ohno, K. Theor. Chem. Acc. 1964, 2, 219.
(185) DasGupta, A.; Huzinaga, S. New Developments in CNDO Molecular Orbital Theory. Theor. Chim. Acta 1974, 35, 329−340.
(186) Rappe, A. K.; Goddard, W. A. Charge Equilibration for Molecular Dynamics Simulations. J. Phys. Chem. 1991, 95, 3358−3363.
(187) Dewar, M. J.; Thiel, W. Ground States of Molecules. 38. The MNDO Method. Approximations and Parameters. J. Am. Chem. Soc. 1977, 99, 4899−4907.
(188) Dewar, M. J.; Thiel, W. Ground States of Molecules. 39. MNDO Results for Molecules Containing Hydrogen, Carbon, Nitrogen, and Oxygen. J. Am. Chem. Soc. 1977, 99, 4907−4917.
(189) Thiel, W.; Voityuk, A. A. Extension of the MNDO Formalism to D Orbitals: Integral Approximations and Preliminary Numerical Results. Theor. Chim. Acta 1992, 81, 391−404.
(190) Thiel, W.; Voityuk, A. A. Extension of MNDO to D Orbitals: Parameters and Results for the Second-Row Elements and for the Zinc Group. J. Phys. Chem. 1996, 100, 616−626.
(191) Dewar, M. J.; McKee, M. L.; Rzepa, H. S. MNDO Parameters for Third Period Elements. J. Am. Chem. Soc. 1978, 100, 3607−3607.
(192) Davis, L. P.; Guidry, R. M.; Williams, J. R.; Dewar, M. J. S.; Rzepa, H. S. MNDO Calculations for Compounds Containing Aluminum and Boron. J. Comput. Chem. 1981, 2, 433−445.
(193) Dewar, M. J.; Zoebisch, E. G.; Healy, E. F.; Stewart, J. J. Development and Use of Quantum Mechanical Molecular Models. 76. AM1: A New General Purpose Quantum Mechanical Molecular Model. J. Am. Chem. Soc. 1985, 107, 3902−3909.
(194) Dewar, M. J.; Jie, C.; Zoebisch, E. G. AM1 Calculations for Compounds Containing Boron. Organometallics 1988, 7, 513−521.
(195) Dewar, M. J.; Merz, K. M. AM1 Parameters for Zinc. Organometallics 1988, 7, 522−524.
(196) Dewar, M. J.; Holder, A. J. AM1 Parameters for Aluminum. Organometallics 1990, 9, 508−511.
(197) Stewart, J. J. Optimization of Parameters for Semiempirical Methods I. Method. J. Comput. Chem. 1989, 10, 209−220.
(198) Stewart, J. J. Optimization of Parameters for Semiempirical Methods II. Applications. J. Comput. Chem. 1989, 10, 221−264.
(199) Stewart, J. J. Optimization of Parameters for Semiempirical Methods. III Extension of PM3 to Be, Mg, Zn, Ga, Ge, As, Se, Cd, In, Sn, Sb, Te, Hg, Tl, Pb, and Bi. J. Comput. Chem. 1991, 12, 320−341.
(200) Stewart, J. J. P. Optimization of Parameters for Semiempirical Methods V: Modification of NDDO Approximations and Application to 70 Elements. J. Mol. Model. 2007, 13, 1173−1213.
(201) Stewart, J. J. P. Optimization of Parameters for Semiempirical Methods VI: More Modifications to the NDDO Approximations and Re-Optimization of Parameters. J. Mol. Model. 2013, 19, 1−32.


(202) Kolb, M.; Thiel, W. Beyond the MNDO Model: Methodical Considerations and Numerical Results. J. Comput. Chem. 1993, 14, 775−789.
(203) Weber, W.; Thiel, W. Orthogonalization Corrections for Semiempirical Methods. Theor. Chem. Acc. 2000, 103, 495−506.
(204) Scholten, M. Ph.D. Thesis, Universitat Dusseldorf, Germany, 2003.
(205) Korth, M.; Thiel, W. Benchmarking Semiempirical Methods for Thermochemistry, Kinetics, and Noncovalent Interactions: OMx Methods Are Almost As Accurate and Robust As DFT-GGA Methods for Organic Molecules. J. Chem. Theory Comput. 2011, 7, 2929−2936.
(206) Silva-Junior, M. R.; Thiel, W. Benchmark of Electronically Excited States for Semiempirical Methods: MNDO, AM1, PM3, OM1, OM2, OM3, INDO/S, and INDO/S2. J. Chem. Theory Comput. 2010, 6, 1546−1564.
(207) Sattelmeyer, K. W.; Tubert-Brohman, I.; Jorgensen, W. L. NO-MNDO: Reintroduction of the Overlap Matrix into MNDO. J. Chem. Theory Comput. 2006, 2, 413−419.
(208) Jug, K. Mechanism of Cyclopropane-Propene Isomerization. Theor. Chim. Acta 1976, 42, 303−310.
(209) Coffey, P.; Jug, K. Semiempirical Molecular Orbital Calculations and Molecular Energies. A New Formula for the B Parameter. J. Am. Chem. Soc. 1973, 95, 7575−7580.
(210) Nanda, D. N.; Jug, K. SINDO1. A Semiempirical SCF MO Method for Molecular Binding Energy and Geometry I. Approximations and Parametrization. Theor. Chim. Acta 1980, 57, 95−106.
(211) Jug, K.; Nanda, D. N. SINDO1 II. Application to Ground States of Molecules Containing Carbon, Nitrogen and Oxygen Atoms. Theor. Chim. Acta 1980, 57, 107−130.
(212) Jug, K.; Nanda, D. N. SINDO1 III. Application to Ground States of Molecules Containing Fluorine, Boron, Beryllium and Lithium Atoms. Theor. Chim. Acta 1980, 57, 131−144.
(213) Zerner, M. C. Removal of Core Orbitals in “Valence Orbital Only” Calculations. Mol. Phys. 1972, 23, 963−978.
(214) Mueller-Remmers, P. L.; Mishra, P. C.; Jug, K. A SINDO1 Study of Photoisomerization and Photofragmentation of Cyclopentanone. J. Am. Chem. Soc. 1984, 106, 2538−2543.
(215) Buss, S.; Jug, K. SINDO1 Study of the Photoisomerization of 2-Methylfuran to 3-Methylfuran. J. Am. Chem. Soc. 1987, 109, 1044−1050.
(216) Mueller-Remmers, P. L.; Jug, K. SINDO1 Study of Photochemical Reaction Mechanisms of Diazirines. J. Am. Chem. Soc. 1985, 107, 7275−7284.
(217) Bredow, T.; Jug, K. SINDO1 Study of Photocatalytic Formation and Reactions of OH Radicals at Anatase Particles. J. Phys. Chem. 1995, 99, 285−291.
(218) Ahlswede, B.; Jug, K. Consistent Modifications of SINDO1: I. Approximations and Parameters. J. Comput. Chem. 1999, 20, 563−571.
(219) Ahlswede, B.; Jug, K. Consistent Modifications of SINDO1: II. Applications to First- and Second-Row Elements. J. Comput. Chem. 1999, 20, 572−578.
(220) Jug, K.; Geudtner, G.; Homann, T. MSINDO Parameterization for Third-Row Main Group Elements. J. Comput. Chem. 2000, 21, 974−987.
(221) Bredow, T.; Geudtner, G.; Jug, K. MSINDO Parameterization for Third-Row Transition Metals. J. Comput. Chem. 2001, 22, 861−887.
(222) Nair, N. N.; Bredow, T.; Jug, K. Molecular Dynamics Implementation in MSINDO: Study of Silicon Clusters. J. Comput. Chem. 2004, 25, 1255−1263.
(223) Jug, K.; Nair, N. N.; Bredow, T. Molecular Dynamics Investigation of Water Adsorption on Rutile Surfaces. Surf. Sci. 2005, 590, 9−20.
(224) Jug, K.; Heidberg, B.; Bredow, T. Molecular Dynamics Study of Water Adsorption Structures on the MgO(100) Surface. J. Phys. Chem. C 2007, 111, 6846−6851.
(225) Jug, K.; Nair, N. N.; Bredow, T. Molecular Dynamics Investigation of Oxygen Vacancy Diffusion in Rutile. Phys. Chem. Chem. Phys. 2005, 7, 2616−2620.

(226) Wahab, H. S.; Bredow, T.; Aliwi, S. M. Computational Investigation of the Adsorption and Photocleavage of Chlorobenzene on Anatase TiO2 Surfaces. Chem. Phys. 2008, 353, 93−103.
(227) Simpson, D. J.; Bredow, T.; Gerson, A. R. MSINDO Study of Acid Promoted Dissolution of Planar MgO and NiO Surfaces. J. Comput. Chem. 2009, 30, 581−588.
(228) Gadaczek, I.; Krause, K.; Hintze, K. J.; Bredow, T. MSINDO-sCIS: A New Method for the Calculation of Excited States of Large Molecules. J. Chem. Theory Comput. 2011, 7, 3675−3685.
(229) Gadaczek, I.; Hintze, K. J.; Bredow, T. Periodic Calculations of Excited State Properties for Solids Using a Semiempirical Approach. Phys. Chem. Chem. Phys. 2012, 14, 741−750.
(230) Gadaczek, I.; Krause, K.; Hintze, K. J.; Bredow, T. Analytical Gradients for the MSINDO-sCIS and MSINDO-UCIS Method: Theory, Implementation, Benchmarks, and Examples. J. Chem. Theory Comput. 2012, 8, 986−996.
(231) da Motta Neto, J. D.; Zerner, M. C. New Parameterization Scheme for the Resonance Integrals (Hμν) within the INDO/1 Approximation. Main Group Elements. Int. J. Quantum Chem. 2001, 81, 187−201.
(232) Bacon, A. D.; Zerner, M. C. An Intermediate Neglect of Differential Overlap Theory for Transition Metal Complexes: Fe, Co and Cu Chlorides. Theor. Chim. Acta 1979, 53, 21−54.
(233) Zerner, M. C.; Loew, G. H.; Kirchner, R. F.; Mueller-Westerhoff, U. T. An Intermediate Neglect of Differential Overlap Technique for Spectroscopy of Transition-Metal Complexes. Ferrocene. J. Am. Chem. Soc. 1980, 102, 589−599.
(234) Longo, R. L. Charge-Dependent Basis Sets. I. First Row Elements. Int. J. Quantum Chem. 1999, 75, 585−591.
(235) Culberson, J. C.; Knappe, P.; Rosch, N.; Zerner, M. C. An Intermediate Neglect of Differential Overlap (INDO) Technique for Lanthanide Complexes: Studies on Lanthanide Halides. Theor. Chim. Acta 1987, 71, 21−39.
(236) De Andrade, A. V. M.; da Costa, N. B., Jr.; Simas, A. M.; de Sa, G. F. Sparkle Model for the Quantum Chemical AM1 Calculation of Europium Complexes. Chem. Phys. Lett. 1994, 227, 349−353.
(237) De Andrade, A. V. M.; da Costa, N. B., Jr.; Simas, A. M.; de Sa, G. F. Sparkle Model for the Quantum Chemical AM1 Calculation of Europium Complexes of Coordination Number Nine. J. Alloys Compd. 1995, 225, 55−59.
(238) Rocha, G. B.; Freire, R. O.; da Costa, N. B.; de Sa, G. F.; Simas, A. M. Sparkle Model for AM1 Calculation of Lanthanide Complexes: Improved Parameters for Europium. Inorg. Chem. 2004, 43, 2346−2354.
(239) Freire, R. O.; Rocha, G. B.; Simas, A. M. Sparkle Model for the Calculation of Lanthanide Complexes: AM1 Parameters for Eu (III), Gd (III), and Tb (III). Inorg. Chem. 2005, 44, 3299−3310.
(240) Freire, R. O.; Rocha, G. B.; Simas, A. M. Modeling Rare Earth Complexes: Sparkle/AM1 Parameters for Thulium (III). Chem. Phys. Lett. 2005, 411, 61−65.
(241) Freire, R. O.; Rocha, G. B.; Simas, A. M. Modeling Rare Earth Complexes: Sparkle/PM3 Parameters for thulium(III). Chem. Phys. Lett. 2006, 425, 138−141.
(242) Freire, R. O.; Rocha, G. B.; Simas, A. M. Sparkle/PM3 Parameters for praseodymium(III) and ytterbium(III). Chem. Phys. Lett. 2007, 441, 354−357.
(243) Da Costa, N. B.; Freire, R. O.; Simas, A. M.; Rocha, G. B. Structure Modeling of Trivalent Lanthanum and Lutetium Complexes: Sparkle/PM3. J. Phys. Chem. A 2007, 111, 5015−5018.
(244) Freire, R. O.; da Costa, N. B.; Rocha, G. B.; Simas, A. M. Sparkle/PM3 Parameters for the Modeling of Neodymium(III), Promethium(III), and Samarium(III) Complexes. J. Chem. Theory Comput. 2007, 3, 1588−1596.
(245) Freire, R. O.; Simas, A. M. Sparkle/PM6 Parameters for All Lanthanide Trications from La(III) to Lu(III). J. Chem. Theory Comput. 2010, 6, 2019−2023.
(246) Dutra, J. D. L.; Filho, M. A. M.; Rocha, G. B.; Freire, R. O.; Simas, A. M.; Stewart, J. J. P. Sparkle/PM7 Lanthanide Parameters for the Modeling of Complexes and Materials. J. Chem. Theory Comput. 2013, 9, 3333−3341.



(247) Porezag, D.; Frauenheim, T.; Kohler, T.; Seifert, G.; Kaschner, R. Construction of Tight-Binding-like Potentials on the Basis of Density-Functional Theory: Application to Carbon. Phys. Rev. B 1995, 51, 12947−12957.
(248) Seifert, G.; Porezag, D.; Frauenheim, T. Calculations of Molecules, Clusters, and Solids with a Simplified LCAO-DFT-LDA Scheme. Int. J. Quantum Chem. 1996, 58, 185−192.
(249) Elstner, M.; Porezag, D.; Jungnickel, G.; Elsner, J.; Haugk, M.; Frauenheim, T.; Suhai, S.; Seifert, G. Self-Consistent-Charge Density-Functional Tight-Binding Method for Simulations of Complex Materials Properties. Phys. Rev. B 1998, 58, 7260−7268.
(250) Hoffmann, R. An Extended Huckel Theory. I. Hydrocarbons. J. Chem. Phys. 1963, 39, 1397−1412.
(251) Anderson, P. W. Localized Orbitals for Molecular Quantum Theory. I. The Huckel Theory. Phys. Rev. 1969, 181, 181.
(252) Anderson, A. B.; Hoffmann, R. Description of Diatomic Molecules Using One Electron Configuration Energies with Two-Body Interactions. J. Chem. Phys. 1974, 60, 4271−4273.
(253) Anderson, A. B. Derivation of the Extended Huckel Method with Corrections: One Electron Molecular Orbital Theory for Energy Level and Structure Determinations. J. Chem. Phys. 1975, 62, 1187−1188.
(254) Calzaferri, G.; Forss, L.; Kamber, I. Molecular Geometries by the Extended Hueckel Molecular Orbital (EHMO) Method. J. Phys. Chem. 1989, 93, 5366−5371.
(255) Yang, Y.; Yu, H.; York, D.; Cui, Q.; Elstner, M. Extension of the Self-Consistent-Charge Density-Functional Tight-Binding Method: Third-Order Expansion of the Density Functional Theory Total Energy and Introduction of a Modified Effective Coulomb Interaction. J. Phys. Chem. A 2007, 111, 10861−10873.
(256) Gaus, M.; Cui, Q.; Elstner, M. DFTB3: Extension of the Self-Consistent-Charge Density-Functional Tight-Binding Method (SCC-DFTB). J. Chem. Theory Comput. 2011, 7, 931−948.
(257) Gaus, M.; Goez, A.; Elstner, M. Parametrization and Benchmark of DFTB3 for Organic Molecules. J. Chem. Theory Comput. 2013, 9, 338−354.
(258) Niehaus, T.; Suhai, S.; Della Sala, F.; Lugli, P.; Elstner, M.; Seifert, G.; Frauenheim, T. Tight-Binding Approach to Time-Dependent Density-Functional Response Theory. Phys. Rev. B 2001, 63, 085108.
(259) Barone, V.; Carnimeo, I.; Scalmani, G. Computational Spectroscopy of Large Systems in Solution: The DFTB/PCM and TD-DFTB/PCM Approach. J. Chem. Theory Comput. 2013, 9, 2052−2071.
(260) Niehaus, T.; Rohlfing, M.; Della Sala, F.; Di Carlo, A.; Frauenheim, T. Quasiparticle Energies for Large Molecules: A Tight-Binding-Based Green’s-Function Approach. Phys. Rev. A 2005, 71, 022508.
(261) Kaminski, S.; Giese, T. J.; Gaus, M.; York, D. M.; Elstner, M. Extended Polarization in Third-Order SCC-DFTB from Chemical-Potential Equalization. J. Phys. Chem. A 2012, 116, 9131−9141.
(262) Wahiduzzaman, M.; Oliveira, A. F.; Philipsen, P.; Zhechkov, L.; van Lenthe, E.; Witek, H. A.; Heine, T. DFTB Parameters for the Periodic Table: Part 1, Electronic Structure. J. Chem. Theory Comput. 2013, 9, 4006−4017.
(263) Kohler, C.; Seifert, G.; Frauenheim, T. Density Functional Based Calculations for Fen (n < 32). Chem. Phys. 2005, 309, 23−31.
(264) Gutierrez, R.; Caetano, R.; Woiczikowski, P. B.; Kubar, T.; Elstner, M.; Cuniberti, G. Structural Fluctuations and Quantum Transport through DNA Molecular Wires: A Combined Molecular Dynamics and Model Hamiltonian Approach. New J. Phys. 2010, 12, 023022.
(265) Kubar, T.; Woiczikowski, P. B.; Cuniberti, G.; Elstner, M. Efficient Calculation of Charge-Transfer Matrix Elements for Hole Transfer in DNA. J. Phys. Chem. B 2008, 112, 7937−7947.
(266) Voityuk, A. A.; Siriwong, K.; Rosch, N. Environmental Fluctuations Facilitate Electron-Hole Transfer from Guanine to Adenine in DNA π Stacks. Angew. Chem., Int. Ed. 2004, 43, 624−627.
(267) Cui, Q.; Elstner, M.; Kaxiras, E.; Frauenheim, T.; Karplus, M. A QM/MM Implementation of the Self-Consistent Charge Density Functional Tight Binding (SCC-DFTB) Method. J. Phys. Chem. B 2001, 105, 569−585.
(268) Konig, P. H.; Hoffmann, M.; Frauenheim, T.; Cui, Q. A Critical Evaluation of Different QM/MM Frontier Treatments with SCC-DFTB as the QM Method. J. Phys. Chem. B 2005, 109, 9082−9095.
(269) Seabra, G. de M.; Walker, R. C.; Elstner, M.; Case, D. A.; Roitberg, A. E. Implementation of the SCC-DFTB Method for Hybrid QM/MM Simulations within the Amber Molecular Dynamics Package. J. Phys. Chem. A 2007, 111, 5655−5664.
(270) Aradi, B.; Hourahine, B.; Frauenheim, T. DFTB+, a Sparse Matrix-Based Implementation of the DFTB Method. J. Phys. Chem. A 2007, 111, 5678−5684.
(271) Frauenheim, T.; Seifert, G.; Elstner, M.; Hajnal, Z.; Jungnickel, G.; Porezag, D.; Suhai, S.; Scholz, R. A Self-Consistent Charge Density-Functional Based Tight-Binding Method for Predictive Materials Simulations in Physics, Chemistry and Biology. Phys. Status Solidi B 2000, 217, 41−62.
(272) Elstner, M.; Frauenheim, T.; Kaxiras, E.; Seifert, G.; Suhai, S. A Self-Consistent Charge Density-Functional Based Tight-Binding Scheme for Large Biomolecules. Phys. Status Solidi B 2000, 217, 357−376.
(273) Elstner, M.; Frauenheim, T.; Suhai, S. An Approximate DFT Method for QM/MM Simulations of Biological Structures and Processes. J. Mol. Struct.: THEOCHEM 2003, 632, 29−41.
(274) Cui, Q.; Elstner, M. Density Functional Tight Binding: Values of Semi-Empirical Methods in an Ab Initio Era. Phys. Chem. Chem. Phys. 2014, 16, 14368−14377, DOI: 10.1039/c4cp00908h.
(275) Goyal, P.; Ghosh, N.; Phatak, P.; Clemens, M.; Gaus, M.; Elstner, M.; Cui, Q. Proton Storage Site in Bacteriorhodopsin: New Insights from Quantum Mechanics/Molecular Mechanics Simulations of Microscopic pKa and Infrared Spectra. J. Am. Chem. Soc. 2011, 133, 14981−14997.
(276) Frahmcke, J. S.; Wanko, M.; Elstner, M. Building a Model of the Blue Cone Pigment Based on the Wild Type Rhodopsin Structure with QM/MM Methods. J. Phys. Chem. B 2012, 116, 3313−3321.
(277) Liang, R.; Swanson, J. M. J.; Voth, G. A. Benchmark Study of the SCC-DFTB Approach for a Biomolecular Proton Channel. J. Chem. Theory Comput. 2014, 10, 451−462.
(278) Kuc, A.; Enyashin, A.; Seifert, G. Metal−Organic Frameworks: Structural, Energetic, Electronic, and Mechanical Properties. J. Phys. Chem. B 2007, 111, 8179−8186.
(279) Lukose, B.; Supronowicz, B.; St. Petkov, P.; Frenzel, J.; Kuc, A. B.; Seifert, G.; Vayssilov, G. N.; Heine, T. Structural Properties of Metal-Organic Frameworks within the Density-Functional Based Tight-Binding Method. Phys. Status Solidi B 2012, 249, 335−342.
(280) Lourenco, M. P.; Guimaraes, L.; da Silva, M. C.; de Oliveira, C.; Heine, T.; Duarte, H. A. Nanotubes With Well-Defined Structure: Single- and Double-Walled Imogolites. J. Phys. Chem. C 2014, 118, 5945−5953.
(281) Trani, F.; Barone, V. Silicon Nanocrystal Functionalization: Analytic Fitting of DFTB Parameters. J. Chem. Theory Comput. 2011, 7, 713−719.
(282) Ammeter, J. H.; Burgi, H. B.; Thibeault, J. C.; Hoffmann, R. Counterintuitive Orbital Mixing in Semiempirical and Ab Initio Molecular Orbital Calculations. J. Am. Chem. Soc. 1978, 100, 3686−3692.
(283) Hoffmann, R. A Chemical and Theoretical Way to Look at Bonding on Surfaces. Rev. Mod. Phys. 1988, 60, 601−628.
(284) Hoffmann, R. Extended Huckel Theory. II. Σ Orbitals in the Azines. J. Chem. Phys. 1964, 40, 2745−2745.
(285) Hoffmann, R. Extended Huckel Theory. III. Compounds of Boron and Nitrogen. J. Chem. Phys. 1964, 40, 2474−2480.
(286) Hoffmann, R. Extended Huckel Theory. IV. Carbonium Ions. J. Chem. Phys. 1964, 40, 2480−2488.
(287) Cerda, J.; Soria, F. Accurate and Transferable Extended Huckel-Type Tight-Binding Parameters. Phys. Rev. B 2000, 61, 7965−7971.
(288) Larsson, S.; Pyykko, P. Relativistically Parameterized Extended Huckel Calculation. IX. An Iterative Version With Applications To



Some Xenon, Thorium And Uranium Compounds. Chem. Phys. 1986, 101, 355−369.
(289) Rego, L. G. C.; Batista, V. S. Quantum Dynamics Simulations of Interfacial Electron Transfer in Sensitized TiO2 Semiconductors. J. Am. Chem. Soc. 2003, 125, 7989−7997.
(290) Abuabara, S. G.; Rego, L. G. C.; Batista, V. S. Influence of Thermal Fluctuations on Interfacial Electron Transfer in Functionalized TiO2 Semiconductors. J. Am. Chem. Soc. 2005, 127, 18234−18242.
(291) Abuabara, S. G.; Cady, C. W.; Baxter, J. B.; Schmuttenmaer, C. A.; Crabtree, R. H.; Brudvig, G. W.; Batista, V. S. Ultrafast Photooxidation of Mn(II)-Terpyridine Complexes Covalently Attached to TiO2 Nanoparticles. J. Phys. Chem. C 2007, 111, 11982−11990.
(292) McNamara, W. R.; Snoeberger, R. C.; Li, G.; Schleicher, J. M.; Cady, C. W.; Poyatos, M.; Schmuttenmaer, C. A.; Crabtree, R. H.; Brudvig, G. W.; Batista, V. S. Acetylacetonate Anchors for Robust Functionalization of TiO2 Nanoparticles with Mn(II)−Terpyridine Complexes. J. Am. Chem. Soc. 2008, 130, 14329−14338.
(293) Jakubikova, E.; Snoeberger, R. C., III; Batista, V. S.; Martin, R. L.; Batista, E. R. Interfacial Electron Transfer in TiO2 Surfaces Sensitized with Ru(II)−Polypyridine Complexes. J. Phys. Chem. A 2009, 113, 12532−12540.
(294) Bowman, D. N.; Blew, J. H.; Tsuchiya, T.; Jakubikova, E. Elucidating Band-Selective Sensitization in Iron(II) Polypyridine-TiO2 Assemblies. Inorg. Chem. 2013, 52, 8621−8628.
(295) Rego, L. G. C.; Hames, B. C.; Mazon, K. T.; Joswig, J.-O. Intramolecular Polarization Induces Electron−Hole Charge Separation in Light-Harvesting Molecular Triads. J. Phys. Chem. C 2014, 118, 126−134.
(296) Kienle, D.; Cerda, J. I.; Ghosh, A. W. Extended Huckel Theory for Band Structure, Chemistry, and Transport. I. Carbon Nanotubes. J. Appl. Phys. 2006, 100, 043714.
(297) Zahid, F.; Paulsson, M.; Polizzi, E.; Ghosh, A. W.; Siddiqui, L.; Datta, S. A Self-Consistent Transport Model for Molecular Conduction Based on Extended Huckel Theory with Full Three-Dimensional Electrostatics. J. Chem. Phys. 2005, 123, 064707.
(298) Suhendi, E.; Syariati, R.; Noor, F. A.; Kurniasih, N.; Khairurrijal. Model of a Tunneling Current in a P-N Junction Based on Armchair Graphene Nanoribbons - an Airy Function Approach and a Transfer Matrix Method. AIP Conf. Proc. 2014, 1589, 91−94.
(299) Akdim, B.; Pachter, R.; Kim, S. S.; Naik, R. R.; Walsh, T. R.; Trohalaki, S.; Hong, G.; Kuang, Z.; Farmer, B. L. Electronic Properties of a Graphene Device with Peptide Adsorption: Insight from Simulation. ACS Appl. Mater. Interfaces 2013, 5, 7470−7477.
(300) Pop, F.; Auban-Senzier, P.; Frackowiak, A.; Ptaszyński, K.; Olejniczak, I.; Wallis, J. D.; Canadell, E.; Avarvari, N. Chirality Driven Metallic versus Semiconducting Behavior in a Complete Series of Radical Cation Salts Based on Dimethyl-Ethylenedithio-Tetrathiafulvalene (DM-EDT-TTF). J. Am. Chem. Soc. 2013, 135, 17176−17186.
(301) Zhang, X.; Dong, J.; Wang, Y.; Li, L.; Li, H. Electron Transport Properties of Si-Based Nanowires with Substitutional Impurities. J. Phys. Chem. C 2013, 117, 12958−12965.
(302) Wrobel, F.; Kemei, M. C.; Derakhshan, S. Antiferromagnetic Spin Correlations Between Corner-Shared [FeO5]7− and [FeO6]9− Units, in the Novel Iron-Based Compound: BaYFeO4. Inorg. Chem. 2013, 52, 2671−2677.
(303) Silva, R. A. L.; Neves, A. I. S.; Lopes, E. B.; Santos, I. C.; Coutinho, J. T.; Pereira, L. C. J.; Rovira, C.; Almeida, M.; Belo, D. (α-DT-TTF)2[Au(mnt)2]: A Weakly Disordered Molecular Spin-Ladder System. Inorg. Chem. 2013, 52, 5300−5306.
(304) Yannello, V. J.; Kilduff, B. J.; Fredrickson, D. C. Isolobal Analogies in Intermetallics: The Reversed Approximation MO Approach and Applications to CrGa4- and Ir3Ge7-Type Phases. Inorg. Chem. 2014, 53, 2730−2741.
(305) Gomez-Coca, S.; Cremades, E.; Aliaga-Alcalde, N.; Ruiz, E. Mononuclear Single-Molecule Magnets: Tailoring the Magnetic Anisotropy of First-Row Transition-Metal Complexes. J. Am. Chem. Soc. 2013, 135, 7010−7018.

(306) Voityuk, A. A. Accurate Treatment of Energetics and Geometry of Carbon and Hydrocarbon Compounds within Tight-Binding Model. J. Chem. Theory Comput. 2006, 2, 1038−1044.
(307) Voityuk, A. A. Thermochemistry of Hydrocarbons. Back to Extended Huckel Theory. J. Chem. Theory Comput. 2008, 4, 1877−1885.
(308) Sui, Y.; Glaser, R.; Sarkar, U.; Gates, K. Stabilities and Spin Distributions of Benzannulated Benzyl Radicals. J. Chem. Theory Comput. 2007, 3, 1091−1099.
(309) Tromer, R. M.; Freire, J. A. Extended Huckel Method Calculation of Polarization Energies: The Case of a Benzene Dimer. J. Phys. Chem. A 2013, 117, 14276−14281.
(310) Rincon, L.; Hasmy, A.; Gonzalez, C. A.; Almeida, R. Extended Huckel Tight-Binding Approach to Electronic Excitations. J. Chem. Phys. 2008, 129, 044107.
(311) Rincon, L.; Gonzalez, C. A. Extended Huckel Tight-Binding Calculations of Electronic Resonances in Linear Chains of Gold Atoms and Clusters. J. Phys. Chem. C 2010, 114, 20734−20740.
(312) Kitamura, M.; Inoue, K.; Chen, H. Improvement of the Spin-Polarized Self-Consistent-Charge Extended Huckel Tight-Binding Method. Mater. Chem. Phys. 2000, 62, 122−130.
(313) Clementi, E.; Raimondi, D. L. Atomic Screening Constants from SCF Functions. J. Chem. Phys. 1963, 38, 2686−2689.
(314) Clementi, E.; Roetti, C. Roothaan-Hartree-Fock Atomic Wavefunctions Basis Functions and Their Coefficients for Ground and Certain Excited States of Neutral and Ionized Atoms, Z ≤ 54. At. Data Nucl. Data Tables 1974, 14, 177−478.
(315) McLean, A. D.; McLean, R. S. Roothaan-Hartree-Fock Atomic Wave Functions Slater Basis-Set Expansions for Z = 55−92. At. Data Nucl. Data Tables 1981, 26, 197−381.
(316) Steinborn, E. O.; Ruedenberg, K. Molecular One-Electron Integrals over Slater-Type Atomic Orbitals and Irregular Solid Spherical Harmonics. Int. J. Quantum Chem. 1972, 6, 413−438.
(317) Jones, H. W. Evaluation of the Two Center Overlap and Coulomb Integrals Derived from Slater Type Orbitals. Int. J. Quantum Chem. 1982, 21, 1079−1089.
(318) Cusachs, L. C. Semiempirical Molecular Orbitals for General Polyatomic Molecules. II. One-Electron Model Prediction of the H-O-H Angle. J. Chem. Phys. 1965, 43, S157−S159.
(319) Kalman, B. L. Self-Consistent Extended Huckel Theory. I. J. Chem. Phys. 1973, 59, 5184−5188.
(320) Amouyal, E.; Mouallem-Bahout, M.; Calzaferri, G. Excited States of M(II,d6)-4′-Phenylterpyridine Complexes: Electron Localization. J. Phys. Chem. 1991, 95, 7641−7649.
(321) Nishida, M. Cluster Model Approach For Electronic Structure of Si and Ge(111) and GaAs(110) Surfaces. Surf. Sci. 1978, 72, 589−616.
(322) Mukhopadhyay, A. K.; Mukherjee, N. G. Self-Consistent Methods in Huckel and Extended Huckel Theories. Int. J. Quantum Chem. 1981, 19, 515−519.
(323) Wang, Y.; Nordlander, P.; Tolk, N. H. Extended Huckel Theory for Ionic Molecules and Solids: An Application to Alkali Halides. J. Chem. Phys. 1988, 89, 4163−4169.
(324) Carbo, R.; Fornos, J. M.; Hernandez, J. A.; Sanz, F. Electrostatic Corrections to Extended Huckel Theory. Int. J. Quantum Chem. 1977, 11, 271−276.
(325) Hathaway, K. B.; Krumhansl, J. A. Potential Energy and Force Constants of a Potassium Chloride Molecule from Extended Huckel Calculations. J. Chem. Phys. 1975, 63, 4313−4316.
(326) Harris, J. Simplified Method for Calculating the Energy of Weakly Interacting Fragments. Phys. Rev. B 1985, 31, 1770−1779.
(327) Mulliken, R. S. Electronic Population Analysis on LCAO-MO Molecular Wave Functions. I. J. Chem. Phys. 1955, 23, 1833−1840.
(328) Iron, M. A.; Heyden, A.; Staszewska, G.; Truhlar, D. G. Tight-Binding Configuration Interaction (TBCI): A Noniterative Approach to Incorporating Electrostatics into Tight Binding. J. Chem. Theory Comput. 2008, 4, 804−818.
(329) Akimov, A. V.; Prezhdo, O. V. Analysis of Self-Consistent Extended Huckel Theory (SC-EHT): A New Look at the Old Method. J. Math. Chem. 2015, 53, 528−550.



(330) Daw, M. S.; Baskes, M. I. Semiempirical, Quantum Mechanical Calculation of Hydrogen Embrittlement in Metals. Phys. Rev. Lett. 1983, 50, 1285−1288.
(331) Daw, M. S.; Baskes, M. I. Embedded-Atom Method: Derivation and Application to Impurities, Surfaces and Other Defects in Metals. Phys. Rev. B 1984, 29, 6443−6453.
(332) Foiles, S. M.; Baskes, M. I.; Daw, M. S. Embedded-Atom-Methods Functions for the Fcc Metals Cu, Ag, Au, Ni, Pd, Pt and Their Alloys. Phys. Rev. B 1986, 33, 7983−7991.
(333) Stott, M. J.; Zaremba, E. Quasiatoms: An Approach to Atoms in Nonuniform Electronic Systems. Phys. Rev. B 1980, 22, 1564−1583.
(334) Puska, M. J.; Nieminen, R. M.; Manninen, M. Atoms Embedded in an Electron Gas: Immersion Energies. Phys. Rev. B 1981, 24, 3037−3047.
(335) Nørskov, J. K. Covalent Effects in the Effective-Medium Theory of Chemical Binding: Hydrogen Heats of Solution in the 3d Metals. Phys. Rev. B 1982, 26, 2875−2885.
(336) Rose, J. H.; Smith, J. R.; Guinea, F.; Ferrante, J. Universal Features of the Equation of State of Metals. Phys. Rev. B 1984, 29, 2963−2969.
(337) Baskes, M. I. Application of the Embedded-Atom Method to Covalent Materials: A Semiempirical Potential for Silicon. Phys. Rev. Lett. 1987, 59, 2666−2669.
(338) Pauling, L. The Nature of the Chemical Bond; Cornell University Press: Ithaca, NY, 1960.
(339) Baskes, M. I.; Nelson, J. S.; Wright, A. F. Semiempirical Modified Embedded-Atom Potentials for Silicon and Germanium. Phys. Rev. B 1989, 40, 6085−6100.
(340) Baskes, M. I. Modified Embedded-Atom Potentials for Cubic Materials and Impurities. Phys. Rev. B 1992, 46, 2727−2742.
(341) Lee, B.-J.; Baskes, M. I. Second Nearest-Neighbor Modified Embedded-Atom-Method Potential. Phys. Rev. B 2000, 62, 8564−8567.
(342) Lee, B.-J.; Baskes, M. I.; Kim, H.; Cho, Y. K. Second Nearest-Neighbor Modified Embedded Atom Method Potentials for Bcc Transition Metals. Phys. Rev. B 2001, 64, 184102.
(343) Baskes, M. I.; Angelo, J. E.; Bisson, C. L. Atomistic Calculations of Composite Interfaces. Model. Simul. Mater. Sci. Eng. 1994, 2, 505−518.
(344) Lee, B.-J.; Shim, J.-H.; Baskes, M. Semiempirical Atomic Potentials for the Fcc Metals Cu, Ag, Au, Ni, Pd, Pt, Al, and Pb Based on First and Second Nearest-Neighbor Modified Embedded Atom Method. Phys. Rev. B 2003, 68, 144112.
(345) Jelinek, B.; Groh, S.; Horstemeyer, M. F.; Houze, J.; Kim, S. G.; Wagner, G. J.; Moitra, A.; Baskes, M. I. Modified Embedded Atom Method Potential for Al, Si, Mg, Cu, and Fe Alloys. Phys. Rev. B 2012, 85, 245102.
(346) Liang, W.; Zhou, M.; Ke, F. Shape Memory Effect in Cu Nanowires. Nano Lett. 2005, 5, 2039−2043.
(347) Kuo, C.-L.; Clancy, P. Melting and Freezing Characteristics and Structural Properties of Supported and Unsupported Gold Nanoclusters. J. Phys. Chem. B 2005, 109, 13743−13754.
(348) Diawara, B.; Beh, Y.-A.; Marcus, P. Nucleation and Growth of Oxide Layers on Stainless Steels (FeCr) Using a Virtual Oxide Layer Model. J. Phys. Chem. C 2010, 114, 19299−19307.
(349) Kang, K.; Cai, W. Brittle and Ductile Fracture of Semiconductor Nanowires - Molecular Dynamics Simulations. Philos. Mag. 2007, 87, 2169−2189.
(350) Gates, T.; Odegard, G.; Frankland, S.; Clancy, T. Computational Materials: Multi-Scale Modeling and Simulation of Nanostructured Materials. Compos. Sci. Technol. 2005, 65, 2416−2434.
(351) Noronha, S.; Farkas, D. Dislocation Pinning Effects on Fracture Behavior: Atomistic and Dislocation Dynamics Simulations. Phys. Rev. B 2002, 66, 132103.
(352) Martínez, E.; Marian, J.; Arsenlis, A.; Victoria, M.; Perlado, J. M. Atomistically Informed Dislocation Dynamics in Fcc Crystals. J. Mech. Phys. Solids 2008, 56, 869−895.
(353) Potirniche, G. P.; Horstemeyer, M. F.; Wagner, G. J.; Gullett, P. M. A Molecular Dynamics Study of Void Growth and Coalescence in Single Crystal Nickel. Int. J. Plast. 2006, 22, 257−278.

(354) Swygenhoven, H. V.; Derlet, P. M.; Frøseth, A. G. Stacking Fault Energies and Slip in Nanocrystalline Metals. Nat. Mater. 2004, 3, 399−403.
(355) Thomas, L. H. The Calculation of Atomic Fields. Math. Proc. Cambridge Philos. Soc. 1927, 23, 542−548.
(356) Fermi, E. Eine Statistische Methode Zur Bestimmung Einiger Eigenschaften Des Atoms Und Ihre Anwendung Auf Die Theorie Des Periodischen Systems Der Elemente. Z. Phys. Hadrons Nucl. 1928, 48, 73−79.
(357) Von Weizsacker, C. F. Zur Theorie Der Kernmassen. Z. Phys. Hadrons Nucl. 1935, 96, 431−458.
(358) Cortona, P. Self-Consistently Determined Properties of Solids without Band-Structure Calculations. Phys. Rev. B 1991, 44, 8454−8458.
(359) Wang, L.-W.; Teter, M. P. Kinetic-Energy Functional of the Electron Density. Phys. Rev. B 1992, 45, 13196−13220.
(360) Smargiassi, E.; Madden, P. A. Orbital-Free Kinetic-Energy Functionals for First-Principles Molecular Dynamics. Phys. Rev. B 1994, 49, 5220−5226.
(361) Wang, Y. A.; Govind, N.; Carter, E. A. Orbital-Free Kinetic-Energy Density Functionals with a Density-Dependent Kernel. Phys. Rev. B 1999, 60, 16350−16358.
(362) Wang, Y.; Govind, N.; Carter, E. Erratum: Orbital-Free Kinetic-Energy Density Functionals with a Density-Dependent Kernel [Phys. Rev. B 60, 16350 (1999)]. Phys. Rev. B 2001, 64, 089903(E).
(363) Huang, C.; Carter, E. A. Nonlocal Orbital-Free Kinetic Energy Density Functional for Semiconductors. Phys. Rev. B 2010, 81, 045206.
(364) Xia, J.; Huang, C.; Shin, I.; Carter, E. A. Can Orbital-Free Density Functional Theory Simulate Molecules? J. Chem. Phys. 2012, 136, 084102.
(365) Xia, J.; Carter, E. A. Density-Decomposed Orbital-Free Density Functional Theory for Covalently Bonded Molecules and Materials. Phys. Rev. B 2012, 86, 235109.
(366) Shin, I.; Carter, E. A. Enhanced von Weizsacker Wang-Govind-Carter Kinetic Energy Density Functional for Semiconductors. J. Chem. Phys. 2014, 140, 18A531.
(367) Zhou, B.; Ligneres, V. L.; Carter, E. A. Improving the Orbital-Free Density Functional Theory Description of Covalent Materials. J. Chem. Phys. 2005, 122, 044103.
(368) Zhou, B.; Carter, E. A. First Principles Local Pseudopotential for Silver: Towards Orbital-Free Density-Functional Theory for Transition Metals. J. Chem. Phys. 2005, 122, 184108.
(369) Ke, Y.; Libisch, F.; Xia, J.; Carter, E. A. Angular Momentum Dependent Orbital-Free Density Functional Theory: Formulation and Implementation. Phys. Rev. B 2014, 89, 155112.
(370) Pauling, L. Atomic Radii and Interatomic Distances in Metals. J. Am. Chem. Soc. 1947, 69, 542−553.
(371) Badger, R. M. A Relation Between Internuclear Distances and Bond Force Constants. J. Chem. Phys. 1934, 2, 128−131.
(372) Badger, R. M. The Relation Between the Internuclear Distances and Force Constants of Molecules and Its Application to Polyatomic Molecules. J. Chem. Phys. 1935, 3, 710−714.
(373) Herschbach, D. R.; Laurie, V. W. Anharmonic Potential Constants and Their Dependence upon Bond Length. J. Chem. Phys. 1961, 35, 458−463.
(374) Glockler, G.; Evans, G. E. Force Constants and Internuclear Distance. J. Chem. Phys. 1942, 10, 606−606.
(375) Lippincott, E. R. A New Relation between Potential Energy and Internuclear Distance. J. Chem. Phys. 1953, 21, 2070−2071.
(376) Lippincott, E. R.; Schroeder, R. General Relation between Potential Energy and Internuclear Distance for Diatomic and Polyatomic Molecules. I. J. Chem. Phys. 1955, 23, 1131−1141.
(377) Varshni, Y. P. Correlation of Molecular Constants. II. Relation between Force Constant and Equilibrium Internuclear Distance. J. Chem. Phys. 1958, 28, 1081−1089.
(378) Jules, J. L.; Lombardi, J. R. Transition Metal Dimer Internuclear Distances from Measured Force Constants. J. Phys. Chem. A 2003, 107, 1268−1273.
(379) Jules, J. L.; Lombardi, J. R. Toward an Experimental Bond Order. J. Mol. Struct.: THEOCHEM 2003, 664−665, 255−271.



(380) Shustorovich, E. Activation Barrier for Adsorbate Surface Diffusion, Heat of Chemisorption, and Adsorbate Registry: Theoretical Interrelations. J. Am. Chem. Soc. 1984, 106, 6479−6481.
(381) Shustorovich, E. M. Dissociation Activation Barrier and Heat of Chemisorption: A Morse-Type Analytical Approach. Surf. Sci. Lett. 1985, 150, L115−L121.
(382) Shustorovich, E. Chemisorption Theory: In Search of the Elephant. Acc. Chem. Res. 1988, 21, 183−189.
(383) Murdoch, J. R. Barrier Heights and the Position of Stationary Points along the Reaction Coordinate. J. Am. Chem. Soc. 1983, 105, 2667−2672.
(384) Lendvay, G. Bond Orders from Ab Initio Calculations and a Test of the Principle of Bond Order Conservation. J. Phys. Chem. 1989, 93, 4422−4429.
(385) Shustorovich, E. Coverage Effects under Atomic Chemisorption: Morse-Potential Modeling Based on Bond-Order Conservation. Surf. Sci. 1985, 163, 645−654.
(386) Shustorovich, E. Coadsorption, Promotion, and Poisoning Effects: Analytic Modelling Based on Bond-Order Conservation. Surf. Sci. 1986, 175, 561−578.
(387) Shustorovich, E. Heat of Molecular Chemisorption from Bond-Order-Conservation Viewpoint: Why Morse Potentials Are so Efficient. Surf. Sci. 1987, 181, L205−L213.
(388) Shustorovich, E.; Sellers, H. The UBI-QEP Method: A Practical Theoretical Approach to Understanding Chemistry on Transition Metal Surfaces. Surf. Sci. Rep. 1998, 31, 1−119.
(389) Sellers, H.; Shustorovich, E. Intrinsic Activation Barriers and Coadsorption Effects for Reactions on Metal Surfaces: Unified Formalism within the UBI-QEP Approach. Surf. Sci. 2002, 504, 167−182.
(390) Shustorovich, E.; Zeigarnik, A. V. The UBI-QEP Treatment of Polyatomic Molecules without Bond-Energy Partitioning. Surf. Sci. 2003, 527, 137−148.
(391) Sellers, H. On Analytic Potential Functions and Molecular Dynamics for Reactions on Metal Surfaces. Surf. Sci. 1994, 310, 281−291.
(392) Gislason, J.; Sellers, H. Molecular Dynamics Simulations of Reactions on Metal Surfaces: Rate Constants for Selected Diatomic Dissociation Reactions. Surf. Sci. 1997, 385, 77−86.
(393) Sellers, H. The Generalized UBI-QEP Method for Modeling the Energetics of Reactions on Transition Metal Surfaces. Surf. Sci. 2003, 524, 29−39.
(394) Abell, G. C. Empirical Chemical Pseudopotential Theory of Molecular and Metallic Bonding. Phys. Rev. B 1985, 31, 6184−6196.
(395) Garcia, E.; Lagana, A. Diatomic Potential Functions for Triatomic Scattering. Mol. Phys. 1985, 56, 621−627.
(396) Garcia, E.; Lagana, A. A New Bond-Order Functional Form for Triatomic Molecules. Mol. Phys. 1985, 56, 629−639.
(397) Lagana, A. A Rotating Bond Order Formulation of the Atom Diatom Potential Energy Surface. J. Chem. Phys. 1991, 95, 2216−2217.
(398) Lagana, A.; Ferraro, G.; Garcia, E.; Gervasi, O.; Ottavi, A. Potential Energy Representations in the Bond Order Space. Chem. Phys. 1992, 168, 341−348.
(399) Garcia, E.; Ciccarelli, L.; Lagana, A. A Vectorizable Potential Energy Functional for Reactive Scattering. Theor. Chim. Acta 1987, 72, 253−264.
(400) Stillinger, F. H.; Weber, T. A. Computer Simulation of Local Order in Condensed Phases of Silicon. Phys. Rev. B 1985, 31, 5262−5271.
(401) Biswas, R.; Hamann, D. R. Interatomic Potentials for Silicon Structural Energies. Phys. Rev. Lett. 1985, 55, 2001−2004.
(402) Tersoff, J. New Empirical Approach for the Structure and Energy of Covalent Systems. Phys. Rev. B 1988, 37, 6991−7000.
(403) Tersoff, J. Empirical Interatomic Potential for Carbon, with Applications to Amorphous Carbon. Phys. Rev. Lett. 1988, 61, 2879−2882.
(404) Tersoff, J. Modeling Solid-State Chemistry: Interatomic Potentials for Multicomponent Systems. Phys. Rev. B 1989, 39, 5566−5568.

(405) Khor, K. E.; Das Sarma, S. Proposed Universal InteratomicPotential for Elemental Tetrahedrally Bonded Semiconductors. Phys.Rev. B 1988, 38, 3318−3322.(406) Ito, T.; Khor, K. E.; Das Sarma, S. Empirical Potential-Based Si-Ge Interatomic Potential and Its Application to Superlattice Stability.Phys. Rev. B 1989, 40, 9715−9722.(407) Ito, T.; Khor, K. E.; Das Sarma, S. Systematic Approach toDeveloping Empirical Potentials for Compound Semiconductors. Phys.Rev. B 1990, 41, 3893−3896.(408) Brenner, D. W. Empirical Potential for Hydrocarbons for Use inSimulating the Chemical Vapor Deposition of Diamond Films. Phys.Rev. B 1990, 42, 9458−9471.(409) Brenner, D. W.; Shenderova, O. A.; Harrison, J. A.; Stuart, S. J.;Ni, B.; Sinnott, S. B. A Second-Generation Reactive Empirical BondOrder (REBO) Potential Energy Expression for Hydrocarbons. J. Phys.:Condens. Matter 2002, 14, 783−802.(410) Peng, S. M.; Yang, L.; Long, X. G.; Shen, H. H.; Sun, Q. Q.; Zu,X. T.; Gao, F. Bond-Order Potential for Erbium-Hydride System. J. Phys.Chem. C 2011, 115, 25097−25104.(411) Adiga, S. P.; Adiga, V. P.; Carpick, R. W.; Brenner, D. W.Vibrational Properties and Specific Heat of UltrananocrystallineDiamond: Molecular Dynamics Simulations. J. Phys. Chem. C 2011,115, 21691−21699.(412) Stuart, S. J.; Tutein, A. B.; Harrison, J. A. A Reactive Potential forHydrocarbons with Intermolecular Interactions. J. Chem. Phys. 2000,112, 6472−6486.(413) Liu, A.; Stuart, S. J. Empirical Bond-Order Potential forHydrocarbons: Adaptive Treatment of van Der Waals Interactions. J.Comput. Chem. 2008, 29, 601−611.(414) Cong, Y.; Yang, Z.-Z. General Atom-Bond ElectronegativityEqualization Method and Its Application in Prediction of ChargeDistribution in Polypeptide. Chem. Phys. Lett. 2000, 316, 324−329.(415) Ogawa, T.; Kurita, N.; Sekino, H.; Kitao, O.; Tanaka, S.Hydrogen Bonding of DNA Base Pairs by Consistent ChargeEquilibration Method Combined with Universal Force Field. Chem.Phys. Lett. 
2003, 374, 271−278.(416) Ogawa, T.; Kurita, N.; Sekino, H.; Kitao, O.; Tanaka, S.Consistent Charge Equilibration (CQEq) Method: Application toAmino Acids and Crambin Protein. Chem. Phys. Lett. 2004, 397, 382−387.(417) Zhang, M.; Fournier, R. Self-Consistent Charge EquilibrationMethod and Its Application to Au13Nan (n = 1,10) Clusters. J. Phys.Chem. A 2009, 113, 3162−3170.(418)Mortier, W. J.; VanGenechten, K.; Gasteiger, J. ElectronegativityEqualization Application and Parametrization. J. Am. Chem. Soc. 1985,107, 829−835.(419) Mortier, W. J.; Ghosh, S. K.; Shankar, S. Electronegativity-Equalization Method for the Calculation of Atomic Charges inMolecules. J. Am. Chem. Soc. 1986, 108, 4315−4320.(420) Sefcik, J.; Demiralp, E.; Cagin, T.; Goddard, W. A. DynamicCharge Equilibration-Morse Stretch Force Field: Application toEnergetics of Pure Silica Zeolites. J. Comput. Chem. 2002, 23, 1507−1514.(421) Kitao, O.; Ogawa, T. I: METHODOLOGY. Consistent ChargeEquilibration (CQEq). Mol. Phys. 2003, 101, 3−17.(422) Chelli, R.; Procacci, P.; Righini, R.; Califano, S. ElectricalResponse in Chemical Potential Equalization Schemes. J. Chem. Phys.1999, 111, 8569−8575.(423) Chelli, R.; Procacci, P. Comment on “Classical Polarizable ForceFields Parametrized from Ab Initio Calculations” [J. Chem. Phys. 117,1416 (2002)]. J. Chem. Phys. 2003, 118, 1571−1572.(424) Chelli, R.; Procacci, P. Comment to “Calculation of the DipoleMoment for Polypeptides Using the Generalized Born-ElectronegativityEqualization Method: Results in Vacuum and Continuum-DielectricSolvent”. J. Phys. Chem. B 2004, 108, 16995−16997.(425) Warren, G. L.; Davis, J. E.; Patel, S. Origin and Control ofSuperlinear Polarizability Scaling in Chemical Potential EqualizationMethods. J. Chem. Phys. 2008, 128, 144110.








(552) Magnasco, V. Uniform Localization of Atomic and MolecularOrbitals. II. J. Chem. Phys. 1968, 48, 800−808.(553) Edmiston, C.; Ruedenberg, K. Localized Atomic and MolecularOrbitals. Rev. Mod. Phys. 1963, 35, 457−465.(554) Stewart, J. J. P. Application of Localized Molecular Orbitals tothe Solution of Semiempiical Self-Consistent Field Equations. Int. J.Quantum Chem. 1996, 58, 133−146.(555) Stewart, J. J. P.; Csaszar, P.; Pulay, P. Fast SemiempiricalCalculations. J. Comput. Chem. 1982, 3, 227−228.(556) Scemama, A.; Renon, N.; Rapacioli, M. A Sparse Self-ConsistentField Algorithm and Its Parallel Implementation: Application toDensity-Functional-Based Tight Binding. J. Chem. Theory Comput.2014, 10, 2344−2354.(557) Loos, P.-F., Rivail, J.-L., Assfeld, X. Iterative Stochastic SubspaceSelf-Consistent Field Method. J. Comput. Chem. 2015, arXiv:1310.1146.(558) Mo, Y.; Peyerimhoff, S. D. Theoretical Analysis of ElectronicDelocalization. J. Chem. Phys. 1998, 109, 1687−1697.(559) Mo, Y.; Zhang, Y.; Gao, J. A Simple Electrostatic Model forTrisilylamine: Theoretical Examinations of the N→σ* NegativeHyperconjugation, Pπ →dπ Bonding, and Stereoelectronic Interaction.J. Am. Chem. Soc. 1999, 121, 5737−5742.(560) Mo, Y.; Gao, J.; Peyerimhoff, S. D. Energy DecompositionAnalysis of Intermolecular Interactions Using a Block-Localized WaveFunction Approach. J. Chem. Phys. 2000, 112, 5530−5538.(561) Mo, Y.; Gao, J. An Ab Initio Molecular Orbital-Valence Bond(MOVB) Method for Simulating Chemical Reactions in Solution. J.Phys. Chem. A 2000, 104, 3012−3020.(562) Mo, Y.; Gao, J. Ab Initio QM/MM Simulations with aMolecularOrbital-Valence Bond (MOVB) Method: Application to an SN2Reaction in Water. J. Comput. Chem. 2000, 21, 1458−1469.(563) Song, L.; Gao, J. On the Construction of Diabatic and AdiabaticPotential Energy Surfaces Based on Ab Initio Valence Bond Theory. J.Phys. Chem. A 2008, 112, 12925−12935.(564) Aquist, J.; Arieh, W. 
Simulation of Enzyme Reactions UsingValence Bond Force Fields and Other Hybrid Quantum/ClassicalApproaches. Chem. Rev. 1993, 93, 2523−2544.(565) Cembran, A.; Song, L.; Mo, Y.; Gao, J. Block-Localized DensityFunctional Theory (BLDFT), Diabatic Coupling, and Their Use inValence Bond Theory for Representing Reactive Potential EnergySurfaces. J. Chem. Theory Comput. 2009, 5, 2702−2716.(566) Tully, J. C. Molecular Dynamics with Electronic Transitions. J.Chem. Phys. 1990, 93, 1061−1071.(567) Hack, M. D.; Truhlar, D. G. A Natural Decay of MixingAlgorithm for Non-Born−Oppenheimer Trajectories. J. Chem. Phys.2001, 114, 9305−9314.(568) Ehrenfest, P. Adiabatische Transformationen in Der Quanten-theorie Und Ihre Behandlung Durch Niels Bohr. Naturwissenschaften1923, 11, 543−550.(569) Ehrenfest, P. Bemerkung Uber Die Angenaherte Gultigkeit DerKlassischen Mechanik Innerhalb Der Quantenmechanik. Z. Phys.Hadrons Nucl. 1927, 45, 455−457.(570) Li, X.; Tully, J. C.; Schlegel, H. B.; Frisch, M. J. Ab InitioEhrenfest Dynamics. J. Chem. Phys. 2005, 123, 084106.(571) Fischer, S. A.; Chapman, C. T.; Li, X. Surface Hopping withEhrenfest Excited Potential. J. Chem. Phys. 2011, 135, 144102.(572) Lange, A. W.; Voth, G. A. Multi-State Approach to ChemicalReactivity in Fragment Based Quantum Chemistry Calculations. J.Chem. Theory Comput. 2013, 9, 4018−4025.(573) Yamashita, T.; Peng, Y.; Knight, C.; Voth, G. A. ComputationallyEfficient Multiconfigurational Reactive Molecular Dynamics. J. Chem.Theory Comput. 2012, 8, 4863−4875.(574) Valeev, E. F.; Coropceanu, V.; da Silva Filho, D. A.; Salman, S.;Bredas, J.-L. Effect of Electronic Polarization on Charge-TransportParameters in Molecular Organic Semiconductors. J. Am. Chem. Soc.2006, 128, 9882−9886.(575) Zhang, J.; Valeev, E. F. Hybrid One-Electron/many-ElectronMethods for Ionized States of Molecular Clusters. Phys. Chem. Chem.Phys. 2012, 14, 7863−7871.

(576) Sena, A. M. P.; Miyazaki, T.; Bowler, D. R. Linear Scaling Constrained Density Functional Theory in CONQUEST. J. Chem. Theory Comput. 2011, 7, 884−889.
(577) Miyazaki, T.; Bowler, D. R.; Choudhury, R.; Gillan, M. J. Atomic Force Algorithms in Density Functional Theory Electronic-Structure Techniques Based on Local Orbitals. J. Chem. Phys. 2004, 121, 6186−6194.
(578) Bowler, D. R.; Choudhury, R.; Gillan, M. J.; Miyazaki, T. Recent Progress with Large-Scale ab Initio Calculations: The CONQUEST Code. Phys. Status Solidi B 2006, 243, 989−1000.
(579) Wu, Q.; Van Voorhis, T. Direct Optimization Method to Study Constrained Systems within Density-Functional Theory. Phys. Rev. A 2005, 72, 024502.
(580) Wu, Q.; Van Voorhis, T. Constrained Density Functional Theory and Its Application in Long-Range Electron Transfer. J. Chem. Theory Comput. 2006, 2, 765−774.
(581) Wu, Q.; Kaduk, B.; Van Voorhis, T. Constrained Density Functional Theory Based Configuration Interaction Improves the Prediction of Reaction Barrier Heights. J. Chem. Phys. 2009, 130, 034109.
(582) Van Voorhis, T.; Kowalczyk, T.; Kaduk, B.; Wang, L.-P.; Cheng, C.-L.; Wu, Q. The Diabatic Picture of Electron Transfer, Reaction Barriers, and Molecular Dynamics. Annu. Rev. Phys. Chem. 2010, 61, 149−170.
(583) Yost, S. R.; Wang, L.-P.; Van Voorhis, T. Molecular Insight Into the Energy Levels at the Organic Donor/Acceptor Interface: A Quantum Mechanics/Molecular Mechanics Study. J. Phys. Chem. C 2011, 115, 14431−14436.
(584) Lemaur, V.; da Silva Filho, D. A.; Coropceanu, V.; Lehmann, M.; Geerts, Y.; Piris, J.; Debije, M. G.; van de Craats, A. M.; Senthilkumar, K.; Siebbeles, L. D. A.; et al. Charge Transport Properties in Discotic Liquid Crystals: A Quantum-Chemical Insight into Structure−Property Relationships. J. Am. Chem. Soc. 2004, 126, 3271−3279.
(585) Hutchison, G. R.; Ratner, M. A.; Marks, T. J. Intermolecular Charge Transfer between Heterocyclic Oligomers. Effects of Heteroatom and Molecular Packing on Hopping Transport in Organic Semiconductors. J. Am. Chem. Soc. 2005, 127, 16866−16881.
(586) Orlandi, G.; Troisi, A.; Zerbetto, F. Simulation of STM Images from Commercially Available Software. J. Am. Chem. Soc. 1999, 121, 5392−5395.
(587) Troisi, A.; Orlandi, G. Band Structure of the Four Pentacene Polymorphs and Effect on the Hole Mobility at Low Temperature. J. Phys. Chem. B 2005, 109, 1849−1856.
(588) Migliore, A. Nonorthogonality Problem and Effective Electronic Coupling Calculation: Application to Charge Transfer in π-Stacks Relevant to Biochemistry and Molecular Electronics. J. Chem. Theory Comput. 2011, 7, 1712−1725.
(589) Goedecker, S. Linear Scaling Electronic Structure Methods. Rev. Mod. Phys. 1999, 71, 1085−1123.
(590) Goedecker, S.; Colombo, L. Efficient Linear Scaling Algorithm for Tight-Binding Molecular Dynamics. Phys. Rev. Lett. 1994, 73, 122−125.
(591) Car, R.; Parrinello, M. Unified Approach for Molecular Dynamics and Density-Functional Theory. Phys. Rev. Lett. 1985, 55, 2471−2474.
(592) Marx, D.; Parrinello, M. Ab Initio Path-Integral Molecular Dynamics. Z. Phys. B: Condens. Matter 1994, 95, 143−144.
(593) Marx, D.; Tuckerman, M. E.; Martyna, G. J. Quantum Dynamics via Adiabatic Ab Initio Centroid Molecular Dynamics. Comput. Phys. Commun. 1999, 118, 166−184.
(594) Doltsinis, N.; Marx, D. Nonadiabatic Car-Parrinello Molecular Dynamics. Phys. Rev. Lett. 2002, 88, 166402.
(595) Frank, I.; Hutter, J.; Marx, D.; Parrinello, M. Molecular Dynamics in Low-Spin Excited States. J. Chem. Phys. 1998, 108, 4060−4069.
(596) Galli, G.; Parrinello, M. Large Scale Electronic Structure Calculations. Phys. Rev. Lett. 1992, 69, 3547−3550.
(597) Schlegel, H. B.; Millam, J. M.; Iyengar, S. S.; Voth, G. A.; Daniels, A. D.; Scuseria, G. E.; Frisch, M. J. Ab Initio Molecular Dynamics: Propagating the Density Matrix with Gaussian Orbitals. J. Chem. Phys. 2001, 114, 9758−9763.
(598) Iyengar, S. S.; Schlegel, H. B.; Millam, J. M.; Voth, G. A.; Scuseria, G. E.; Frisch, M. J. Ab Initio Molecular Dynamics: Propagating the Density Matrix with Gaussian Orbitals. II. Generalizations Based on Mass-Weighting, Idempotency, Energy Conservation and Choice of Initial Conditions. J. Chem. Phys. 2001, 115, 10291−10302.
(599) Schlegel, H. B.; Iyengar, S. S.; Li, X.; Millam, J. M.; Voth, G. A.; Scuseria, G. E.; Frisch, M. J. Ab Initio Molecular Dynamics: Propagating the Density Matrix with Gaussian Orbitals. III. Comparison with Born−Oppenheimer Dynamics. J. Chem. Phys. 2002, 117, 8694−8704.
(600) Mauri, F.; Galli, G.; Car, R. Orbital Formulation for Electronic-Structure Calculations with Linear System-Size Scaling. Phys. Rev. B 1993, 47, 9973−9976.
(601) Mauri, F.; Galli, G. Electronic-Structure Calculations and Molecular-Dynamics Simulations with Linear System-Size Scaling. Phys. Rev. B 1994, 50, 4316−4326.
(602) Ordejon, P.; Drabold, D. A.; Grumbach, M. P.; Martin, R. M. Unconstrained Minimization Approach for Electronic Computations That Scales Linearly with System Size. Phys. Rev. B 1993, 48, 14646−14649.
(603) Ordejon, P.; Drabold, D. A.; Martin, R. M.; Grumbach, M. P. Linear System-Size Scaling Methods for Electronic-Structure Calculations. Phys. Rev. B 1995, 51, 1456−1476.
(604) Wang, L.-W.; Teter, M. P. Simple Quantum-Mechanical Model of Covalent Bonding Using a Tight-Binding Basis. Phys. Rev. B 1992, 46, 12798−12801.
(605) Kim, J.; Mauri, F.; Galli, G. Total Energy Global Optimizations Using Nonorthogonal Localized Orbitals. Phys. Rev. B 1994, 52, 1640−1648.
(606) Li, X.-P.; Nunes, R. W.; Vanderbilt, D. Density-Matrix Electronic-Structure Method with Linear System-Size Scaling. Phys. Rev. B 1993, 47, 10891−10894.
(607) Daw, M. S. Model for Energetics of Solids Based on the Density Matrix. Phys. Rev. B 1993, 47, 10895−10898.
(608) McWeeny, R. The Density Matrix in Many-Electron Quantum Mechanics. I. Generalized Product Functions. Factorization and Physical Interpretation of the Density Matrices. Proc. R. Soc., Ser. A: Math. Phys. Eng. Sci. 1959, 253, 242−259.
(609) McWeeny, R. Some Recent Advances in Density Matrix Theory. Rev. Mod. Phys. 1960, 32, 335−369.
(610) Challacombe, M. A Simplified Density Matrix Minimization for Linear Scaling Self-Consistent Field Theory. J. Chem. Phys. 1999, 110, 2332−2342.
(611) Rudberg, E.; Rubensson, E. H. Assessment of Density Matrix Methods for Linear Scaling Electronic Structure Calculations. J. Phys.: Condens. Matter 2011, 23, 075502.
(612) Daniels, A. D.; Scuseria, G. E. What Is the Best Alternative to Diagonalization of the Hamiltonian in Large Scale Semiempirical Calculations? J. Chem. Phys. 1999, 110, 1321−1328.
(613) Larsen, H.; Olsen, J.; Jørgensen, P.; Helgaker, T. Direct Optimization of the Atomic-Orbital Density Matrix Using the Conjugate-Gradient Method with a Multilevel Preconditioner. J. Chem. Phys. 2001, 115, 9685−9697.
(614) Matsuoka, O.; Matsufuji, T.; Sano, T. Direct Calculation of the One-Electron Density Matrix for Closed-Shell Systems. J. Chem. Phys. 2000, 113, 5179−5184.
(615) Mostofi, A. A.; Haynes, P. D.; Skylaris, C.-K.; Payne, M. C. Preconditioned Iterative Minimization for Linear-Scaling Electronic Structure Calculations. J. Chem. Phys. 2003, 119, 8842−8848.
(616) Sałek, P.; Høst, S.; Thøgersen, L.; Jørgensen, P.; Manninen, P.; Olsen, J.; Jansík, B.; Reine, S.; Pawłowski, F.; Tellgren, E.; et al. Linear-Scaling Implementation of Molecular Electronic Self-Consistent Field Theory. J. Chem. Phys. 2007, 126, 114110.
(617) Coriani, S.; Høst, S.; Jansík, B.; Thøgersen, L.; Olsen, J.; Jørgensen, P.; Reine, S.; Pawłowski, F.; Helgaker, T.; Sałek, P. Linear-Scaling Implementation of Molecular Response Theory in Self-Consistent Field Electronic-Structure Theory. J. Chem. Phys. 2007, 126, 154108.

(618) Qiu, S.-Y.; Wang, C. Z.; Ho, K. M.; Chan, C. T. Tight-Binding Molecular Dynamics with Linear System-Size Scaling. J. Phys.: Condens. Matter 1994, 6, 9153−9172.
(619) Nunes, R. W.; Vanderbilt, D. Generalization of the Density-Matrix Method to a Nonorthogonal Basis. Phys. Rev. B 1994, 50, 17611−17614.
(620) Millam, J. M.; Scuseria, G. E. Linear Scaling Conjugate Gradient Density Matrix Search as an Alternative to Diagonalization for First Principles Electronic Structure Calculations. J. Chem. Phys. 1997, 106, 5569−5577.
(621) Li, X.; Millam, J. M.; Scuseria, G. E.; Frisch, M. J.; Schlegel, H. B. Density Matrix Search Using Direct Inversion in the Iterative Subspace as a Linear Scaling Alternative to Diagonalization in Electronic Structure Calculations. J. Chem. Phys. 2003, 119, 7651−7658.
(622) Hernandez, E.; Gillan, M. J. Self-Consistent First-Principles Technique with Linear Scaling. Phys. Rev. B 1995, 51, 10157−10160.
(623) Hernandez, E.; Gillan, M. J.; Goringe, C. M. Linear-Scaling Density-Functional-Theory Technique: The Density-Matrix Approach. Phys. Rev. B 1996, 53, 7147−7157.
(624) Hierse, W.; Stechel, E. B. Order-N Methods in Self-Consistent Density-Functional Calculations. Phys. Rev. B 1994, 50, 17811−17819.
(625) Berghold, G.; Parrinello, M.; Hutter, J. Polarized Atomic Orbitals for Linear Scaling Methods. J. Chem. Phys. 2002, 116, 1800−1810.
(626) Palser, A. H.; Manolopoulos, D. E. Canonical Purification of the Density Matrix in Electronic-Structure Theory. Phys. Rev. B 1998, 58, 12704−12711.
(627) Rudberg, E.; Rubensson, E. H.; Sałek, P. Kohn−Sham Density Functional Theory Electronic Structure Calculations with Linearly Scaling Computational Time and Memory Usage. J. Chem. Theory Comput. 2011, 7, 340−350.
(628) Hine, N. D. M.; Haynes, P. D.; Mostofi, A. A.; Skylaris, C.-K.; Payne, M. C. Linear-Scaling Density-Functional Theory with Tens of Thousands of Atoms: Expanding the Scope and Scale of Calculations with ONETEP. Comput. Phys. Commun. 2009, 180, 1041−1053.
(629) Goedecker, S.; Teter, M. P. Tight-Binding Electronic-Structure Calculations and Tight-Binding Molecular Dynamics with Localized Orbitals. Phys. Rev. B 1995, 51, 9455−9464.
(630) Baer, R.; Head-Gordon, M. Chebyshev Expansion Methods for Electronic Structure Calculations on Large Molecular Systems. J. Chem. Phys. 1997, 107, 10003−10013.
(631) Bates, K. R.; Daniels, A. D.; Scuseria, G. E. Comparison of Conjugate Gradient Density Matrix Search and Chebyshev Expansion Methods for Avoiding Diagonalization in Large-Scale Electronic Structure Calculations. J. Chem. Phys. 1998, 109, 3308−3312.
(632) Warshel, A.; Levitt, M. Theoretical Studies of Enzymic Reactions: Dielectric, Electrostatic and Steric Stabilization of the Carbonium Ion in the Reaction of Lysozyme. J. Mol. Biol. 1976, 103, 227−249.
(633) Groenhof, G. Introduction to QM/MM Simulations. In Biomolecular Simulations; Monticelli, L., Salonen, E., Eds.; Methods in Molecular Biology 924; Humana Press: Totowa, NJ, 2013.
(634) Acevedo, O.; Jorgensen, W. L. Advances in Quantum and Molecular Mechanical (QM/MM) Simulations for Organic and Enzymatic Reactions. Acc. Chem. Res. 2010, 43, 142−151.
(635) Shaik, S.; Cohen, S.; Wang, Y.; Chen, H.; Kumar, D.; Thiel, W. P450 Enzymes: Their Structure, Reactivity, and Selectivity Modeled by QM/MM Calculations. Chem. Rev. 2010, 110, 949−1017.
(636) Difley, S.; Wang, L.-P.; Yeganeh, S.; Yost, S. R.; Van Voorhis, T. Electronic Properties of Disordered Organic Semiconductors via QM/MM Simulations. Acc. Chem. Res. 2010, 43, 995−1004.
(637) Hall, G. G. The Molecular Orbital Theory of Chemical Valency. VIII. A Method of Calculating Ionization Potentials. Proc. R. Soc., Ser. A: Math. Phys. Eng. Sci. 1951, 205, 541−552.
(638) Sandorfy, C. LCAO MO Calculations on Saturated Hydrocarbons and Their Substituted Derivatives. Can. J. Chem. 1955, 33, 1337−1351.
(639) Hoyland, J. R. Ab Initio Bond-Orbital Calculations. I. Application to Methane, Ethane, Propane, and Propylene. J. Am. Chem. Soc. 1968, 90, 2227−2232.

(640) Naray-Szabo, G.; Surjan, P. R. Bond Orbital Framework for Rapid Calculation of Environmental Effects on Molecular Potential Surfaces. Chem. Phys. Lett. 1983, 96, 499−501.
(641) Bonaccorsi, R.; Scrocco, E.; Tomasi, J. Group Contributions to the Electrostatic Molecular Potential. J. Am. Chem. Soc. 1976, 98, 4049−4054.
(642) Bonaccorsi, R.; Scrocco, E.; Tomasi, J. An Approximate Expression of the Electrostatic Molecular Potential in Terms of Completely Transferable Group Contributions. J. Am. Chem. Soc. 1977, 99, 4546−4554.
(643) Agresti, A.; Bonaccorsi, R.; Tomasi, J. An Approximate Expression of the Electrostatic Molecular Potential for Benzenic Compounds. Theor. Chim. Acta 1979, 53, 215−220.
(644) Naray-Szabo, G. Electrostatic Isopotential Maps for Large Biomolecules. Int. J. Quantum Chem. 1979, 16, 265−272.
(645) Naray-Szabo, G.; Grofcsik, A.; Kosa, K.; Kubinyi, M.; Martin, A. Simple Calculation of Electrostatic Isopotential Maps from Bond Fragments. J. Comput. Chem. 1981, 2, 58−62.
(646) Nagy, P.; Angyan, J. G.; Naray-Szabo, G.; Peinel, G. Molecular Electrostatic Fields from Bond Fragments. Int. J. Quantum Chem. 1987, 31, 927−939.
(647) Bonaccorsi, R.; Ghio, C.; Tomasi, J. On a Semiclassical Interpretation of Inter- and Intramolecular Interactions. Int. J. Quantum Chem. 1984, 26, 637−686.
(648) Ferenczy, G. G.; Rivail, J.-L.; Surjan, P. R.; Naray-Szabo, G. NDDO Fragment Self-Consistent Field Approximation for Large Electronic Systems. J. Comput. Chem. 1992, 13, 830−837.
(649) Naray-Szabo, G. Chemical Fragmentation in Quantum Mechanical Methods. Comput. Chem. 2000, 24, 287−294.
(650) Naray-Szabo, G.; Surjan, P. R.; Kiss, A. I. Quantum Chemical Conformational Analysis of the Catalytic Triad in α-Chymotrypsin. J. Mol. Struct.: THEOCHEM 1985, 123, 85−95.
(651) Naray-Szabo, G.; Ferenczy, G. G. Molecular Wavefunctions from Chemical Bonds: The Fragment Self-Consistent Field Theory. J. Mol. Struct.: THEOCHEM 1992, 261, 55−62.
(652) Kadas, K.; Ferenczy, G. G. Electronic Structure of Doped Fourfold Coordinated Amorphous Semiconductors. Midgap States in Amorphous Carbon. J. Mol. Struct.: THEOCHEM 1999, 463, 175−180.
(653) Kadas, K.; Ferenczy, G. G.; Kugler, S. Theory of Dopant Pairs in Four-Fold Coordinated Amorphous Semiconductors. J. Non-Cryst. Solids 1998, 227, 367−371.
(654) Toth, G.; Naray-Szabo, G.; Ferenczy, G. G.; Csonka, G. Monte Carlo Simulation of Amorphous Systems with the Fragment Self-Consistent Field Method. J. Mol. Struct.: THEOCHEM 1997, 398−399, 129−133.
(655) Toth, G.; Gereben, O.; Naray-Szabo, G. Semiempirical Fragment Self-Consistent Field Monte Carlo Simulations for Liquid Chlorosilanes. J. Mol. Struct.: THEOCHEM 1994, 313, 165−172.
(656) Thery, V.; Rinaldi, D.; Rivail, J.-L.; Maigret, B.; Ferenczy, G. G. Quantum Mechanical Computations on Very Large Molecular Systems: The Local Self-Consistent Field Method. J. Comput. Chem. 1994, 15, 269−282.
(657) Monard, G.; Loos, M.; Thery, V.; Baka, K.; Rivail, J.-L. Hybrid Classical Quantum Force Field for Modeling Very Large Molecules. Int. J. Quantum Chem. 1996, 58, 153−159.
(658) Monari, A.; Rivail, J.-L.; Assfeld, X. Theoretical Modeling of Large Molecular Systems. Advances in the Local Self Consistent Field Method for Mixed Quantum Mechanics/Molecular Mechanics Calculations. Acc. Chem. Res. 2013, 46, 596−603.
(659) Ferenczy, G. G. Calculation of Wave-Functions with Frozen Orbitals in Mixed Quantum Mechanics/Molecular Mechanics Methods. Part I. Application of the Huzinaga Equation. J. Comput. Chem. 2013, 34, 854−861.
(660) Ferenczy, G. G. Calculation of Wave-Functions with Frozen Orbitals in Mixed Quantum Mechanics/Molecular Mechanics Methods. II. Application of the Local Basis Equation. J. Comput. Chem. 2013, 34, 862−869.
(661) Huzinaga, S.; Cantu, A. A. Theory of Separability of Many-Electron Systems. J. Chem. Phys. 1971, 55, 5543−5549.

(662) Bonifacic, V.; Huzinaga, S. Atomic and Molecular Calculations with the Model Potential Method. I. J. Chem. Phys. 1974, 60, 2779−2786.
(663) Sakai, Y.; Huzinaga, S. The Use of Model Potentials in Molecular Calculations. I. J. Chem. Phys. 1982, 76, 2537−2551.
(664) Fock, V.; Wesselow, M.; Petrashen, M. Zh. Eksp. Theor. Fiz. 1940, 10, 723.
(665) Szasz, L.; McGinn, G. Atomic and Molecular Calculations with the Pseudopotential Method. I. The Binding Energy and Equilibrium Internuclear Distance of the Na2 Molecule. J. Chem. Phys. 1966, 45, 2898−2912.
(666) Lykos, P. G.; Parr, R. G. On the Pi-Electron Approximation and Its Possible Refinement. J. Chem. Phys. 1956, 24, 1166−1173.
(667) Khait, Y. G.; Hoffmann, M. R. Embedding Theory for Excited States. J. Chem. Phys. 2010, 133, 044107.
(668) Khait, Y. G.; Hoffmann, M. R. On the Orthogonality of Orbitals in Subsystem Kohn−Sham Density Functional Theory. Annu. Rep. Comput. Chem. 2012, 8, 53−70.
(669) Gao, J. A Molecular-Orbital Derived Polarization Potential for Liquid Water. J. Chem. Phys. 1998, 109, 2346−2354.
(670) Xie, W.; Gao, J. Design of a Next Generation Force Field: The X-POL Potential. J. Chem. Theory Comput. 2007, 3, 1890−1900.
(671) Xie, W.; Song, L.; Truhlar, D. G.; Gao, J. The Variational Explicit Polarization Potential and Analytical First Derivative of Energy: Towards a Next Generation Force Field. J. Chem. Phys. 2008, 128, 234108.
(672) Gao, J.; Cembran, A.; Mo, Y. Generalized X-Pol Theory and Charge Delocalization States. J. Chem. Theory Comput. 2010, 6, 2402−2410.
(673) Xie, W.; Orozco, M.; Truhlar, D. G.; Gao, J. X-Pol Potential: An Electronic Structure-Based Force Field for Molecular Dynamics Simulation of a Solvated Protein in Water. J. Chem. Theory Comput. 2009, 5, 459−467.
(674) Giese, T. J.; York, D. M. Charge-Dependent Model for Many-Body Polarization, Exchange, and Dispersion Interactions in Hybrid Quantum Mechanical/Molecular Mechanical Calculations. J. Chem. Phys. 2007, 127, 194101.
(675) Day, P. N.; Jensen, J. H.; Gordon, M. S.; Webb, S. P.; Stevens, W. J.; Krauss, M.; Garmer, D.; Basch, H.; Cohen, D. An Effective Fragment Method for Modeling Solvent Effects in Quantum Mechanical Calculations. J. Chem. Phys. 1996, 105, 1968−1986.
(676) Gordon, M. S.; Freitag, M. A.; Bandyopadhyay, P.; Jensen, J. H.; Kairys, V.; Stevens, W. J. The Effective Fragment Potential Method: A QM-Based MM Approach to Modeling Environmental Effects in Chemistry. J. Phys. Chem. A 2001, 105, 293−307.
(677) Adamovic, I.; Freitag, M. A.; Gordon, M. S. Density Functional Theory Based Effective Fragment Potential Method. J. Chem. Phys. 2003, 118, 6725−6732.
(678) Yoo, S.; Zahariev, F.; Sok, S.; Gordon, M. S. Solvent Effects on Optical Properties of Molecules: A Combined Time-Dependent Density Functional Theory/Effective Fragment Potential Approach. J. Chem. Phys. 2008, 129, 144112.
(679) Arora, P.; Slipchenko, L. V.; Webb, S. P.; DeFusco, A.; Gordon, M. S. Solvent-Induced Frequency Shifts: Configuration Interaction Singles Combined with the Effective Fragment Potential Method. J. Phys. Chem. A 2010, 114, 6742−6750.
(680) Slipchenko, L. V.; Gordon, M. S. Damping Functions in the Effective Fragment Potential Method. Mol. Phys. 2009, 107, 999−1016.
(681) Ghosh, D.; Kosenkov, D.; Vanovschi, V.; Williams, C. F.; Herbert, J. M.; Gordon, M. S.; Schmidt, M. W.; Slipchenko, L. V.; Krylov, A. I. Noncovalent Interactions in Extended Systems Described by the Effective Fragment Potential Method: Theory and Application to Nucleobase Oligomers. J. Phys. Chem. A 2010, 114, 12739−12754.
(682) Polyakov, I.; Epifanovsky, E.; Grigorenko, B.; Krylov, A. I.; Nemukhin, A. Quantum Chemical Benchmark Studies of the Electronic Properties of the Green Fluorescent Protein Chromophore: 2. Cis-Trans Isomerization in Water. J. Chem. Theory Comput. 2009, 5, 1907−1914.

(683) Slipchenko, L. V. Solvation of the Excited States of Chromophores in Polarizable Environment: Orbital Relaxation versus Polarization. J. Phys. Chem. A 2010, 114, 8824−8830.
(684) DeFusco, A.; Minezawa, N.; Slipchenko, L. V.; Zahariev, F.; Gordon, M. S. Modeling Solvent Effects on Electronic Excited States. J. Phys. Chem. Lett. 2011, 2, 2184−2192.
(685) Grigorenko, B. L.; Nemukhin, A. V.; Morozov, D. I.; Polyakov, I. V.; Bravaya, K. B.; Krylov, A. I. Toward Molecular-Level Characterization of Photoinduced Decarboxylation of the Green Fluorescent Protein: Accessibility of the Charge-Transfer States. J. Chem. Theory Comput. 2012, 8, 1912−1920.
(686) Shao, Y.; Molnar, L. F.; Jung, Y.; Kussmann, J.; Ochsenfeld, C.; Brown, S. T.; Gilbert, A. T. B.; Slipchenko, L. V.; Levchenko, S. V.; O’Neill, D. P.; et al. Advances in Methods and Algorithms in a Modern Quantum Chemistry Program Package. Phys. Chem. Chem. Phys. 2006, 8, 3172−3191.
(687) Ghosh, D.; Kosenkov, D.; Vanovschi, V.; Flick, J.; Kaliman, I.; Shao, Y.; Gilbert, A. T. B.; Krylov, A. I.; Slipchenko, L. V. Effective Fragment Potential Method in Q-CHEM: A Guide for Users and Developers: Software News and Updates. J. Comput. Chem. 2013, 34, 1060−1070.
(688) Gordon, M. S.; Mullin, J. M.; Pruitt, S. R.; Roskop, L. B.; Slipchenko, L. V.; Boatz, J. A. Accurate Methods for Large Molecular Systems. J. Phys. Chem. B 2009, 113, 9646−9663.
(689) Stone, A. J. Distributed Multipole Analysis, or How to Describe a Molecular Charge Distribution. Chem. Phys. Lett. 1981, 83, 233−239.
(690) Kaliman, I. A.; Slipchenko, L. V. LIBEFP: A New Parallel Implementation of the Effective Fragment Potential Method as a Portable Software Library. J. Comput. Chem. 2013, 34, 2284−2292.
(691) Moskovsky, A. A.; Kaliman, I. A.; Akimov, A. V.; Konyukhov, S. S.; Grigorenko, B. L.; Nemukhin, A. V. Implementation of a Molecular Dynamics Approach with Rigid Fragments to Simulation of Chemical Reactions in Biomolecular Systems. Moscow Univ. Chem. Bull. 2007, 62, 177−179.
(692) Akimov, A. V.; Kolomeisky, A. B. Recursive Taylor Series Expansion Method for Rigid-Body Molecular Dynamics. J. Chem. Theory Comput. 2011, 7, 3062−3071.
(693) Wesolowski, T. A.; Warshel, A. Frozen Density Functional Approach for Ab Initio Calculations of Solvated Molecules. J. Phys. Chem. 1993, 97, 8050−8053.
(694) Neugebauer, J. Subsystem-Based Theoretical Spectroscopy of Biomolecules and Biomolecular Assemblies. ChemPhysChem 2009, 10, 3148−3173.
(695) Solovyeva, A.; Pavanello, M.; Neugebauer, J. Spin Densities from Subsystem Density-Functional Theory: Assessment and Application to a Photosynthetic Reaction Center Complex Model. J. Chem. Phys. 2012, 136, 194104.
(696) Goodpaster, J. D.; Ananth, N.; Manby, F. R.; Miller, T. F. Exact Nonadditive Kinetic Potentials for Embedded Density Functional Theory. J. Chem. Phys. 2010, 133, 084103.
(697) Goodpaster, J. D.; Barnes, T. A.; Miller, T. F. Embedded Density Functional Theory for Covalently Bonded and Strongly Interacting Subsystems. J. Chem. Phys. 2011, 134, 164108.
(698) Manby, F. R.; Stella, M.; Goodpaster, J. D.; Miller, T. F. A Simple, Exact Density-Functional-Theory Embedding Scheme. J. Chem. Theory Comput. 2012, 8, 2564−2568.
(699) Jacob, C. R.; Visscher, L. Towards the Description of Covalent Bonds in Subsystem Density-Functional Theory. In Recent Advances in Orbital-Free Density Functional Theory; Wesolowski, T. A., Wang, Y. A., Eds.; World Scientific: Singapore.
(700) Severo Pereira Gomes, A.; Jacob, C. R. Quantum-Chemical Embedding Methods for Treating Local Electronic Excitations in Complex Chemical Systems. Annu. Rep. Prog. Chem., Sect. C: Phys. Chem. 2012, 108, 222−277.
(701) Elliott, P.; Cohen, M. H.; Wasserman, A.; Burke, K. Density Functional Partition Theory with Fractional Occupations. J. Chem. Theory Comput. 2009, 5, 827−833.

(702) Nafziger, J.; Wasserman, A. Density-Based Partitioning Methods for Ground-State Molecular Calculations. J. Phys. Chem. A 2014, 118, 7623−7639.
(703) Mosquera, M. A.; Jensen, D.; Wasserman, A. Fragment-Based Time-Dependent Density Functional Theory. Phys. Rev. Lett. 2013, 111, 023001.
(704) Fux, S.; Jacob, C. R.; Neugebauer, J.; Visscher, L.; Reiher, M. Accurate Frozen-Density Embedding Potentials as a First Step towards a Subsystem Description of Covalent Bonds. J. Chem. Phys. 2010, 132, 164101.
(705) Fux, S.; Jacob, C. R.; Neugebauer, J.; Visscher, L.; Reiher, M. Response to “Comment on ‘Accurate Frozen-Density Embedding Potentials as a First Step towards a Subsystem Description of Covalent Bonds’” [J. Chem. Phys. 135, 027101 (2011)]. J. Chem. Phys. 2011, 135, 027102.
(706) Jacob, C. R.; Beyhan, S. M.; Visscher, L. Exact Functional Derivative of the Nonadditive Kinetic-Energy Bifunctional in the Long-Distance Limit. J. Chem. Phys. 2007, 126, 234116.
(707) Krueger, B. P.; Scholes, G. D.; Fleming, G. R. Calculation of Couplings and Energy-Transfer Pathways between the Pigments of LH2 by the Ab Initio Transition Density Cube Method. J. Phys. Chem. B 1998, 102, 5378−5386.
(708) Wang, L.; Trivedi, D.; Prezhdo, O. V. Global Flux Surface Hopping Approach for Mixed Quantum-Classical Dynamics. J. Chem. Theory Comput. 2014, 10, 3598−3605.
(709) Akimov, A. V.; Prezhdo, O. V. Second-Quantized Surface Hopping. Phys. Rev. Lett. 2014, 113, 153003.
(710) Wang, L.; Beljonne, D. Charge Transport in Organic Semiconductors: Assessment of the Mean Field Theory in the Hopping Regime. J. Chem. Phys. 2013, 139, 064316.

DOI: 10.1021/cr500524c. Chem. Rev. 2015, 115, 5797−5890
