adapted ab initio theory: a simplified kohn-sham...

ADAPTED AB INITIO THEORY: A SIMPLIFIED KOHN-SHAM DENSITYFUNCTIONAL THEORY

By

JOSHUA J. McCLELLAN

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOLOF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

OF THE REQUIREMENTS FOR THE DEGREE OFDOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2008

1

c©2008 Joshua J. McClellan

2

To my parents.

3

ACKNOWLEDGMENTS

I thank the chair and members of my supervisory committee for their mentoring. I

would also like to give special thanks to Mark Ponton, Victor Lotrich, and Dr. Deumens

for their extensive help in the last few months. I owe a debt of gratitude to my family,

whose support has instilled a deep yearning to pursue my interests. Several people made

many other contributions to my life, both personally and academically. Larry urged me

to appreciate the finer things life has to offer and kept me sane. I thank the two real

chemists, Travis and Lani, for their unending quest to answer the basic question man asks

of life: What is fire? The people of the Quantum Theory Project offered a constant source

of inspiration, and often a source of morning headaches. I offer many thanks to Andrew,

Tom, Dan, Christina, Georgios, Seonah, Martin, Joey, Lena, Julio, and of course Merve.

I also acknowledge Prof. Merz for the use of the DIVCON program and Dr. Martin

Peters for helping to implement and test some initial ideas for cubic spline interpolation

(Chapter 4). Special thanks go to Andrew Taube and Tom Hughes for the numerous

extensive discussions that were invaluable in my research. Also, I acknowledge Tom

Hughes’ work in implementing the transfer Hamiltonian parameters for nitromethane

clusters (Chapter 2). I would like to recognize the help from Dr. Ajith Perera for help

throughout my graduate studies.

4

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

CHAPTER

1 INTRODUCTION AND BACKGROUND . . . . . . . . . . . . . . . . . . . . . 10

1.1 Who Ordered a New Semiempirical Theory? . . . . . . . . . . . . . . . . . 101.2 Quantum Chemistry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151.3 Hartree-Fock . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171.4 Coupled-Cluster Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191.5 Kohn-Sham Density Functional Theory . . . . . . . . . . . . . . . . . . . . 221.6 Huckel Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231.7 Neglect of Diatomic Differential Overlap . . . . . . . . . . . . . . . . . . . 251.8 Tight-Binding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261.9 Density-Functional Tight-Binding . . . . . . . . . . . . . . . . . . . . . . . 271.10 Self-Consistent-Charge Density-Functional Tight-Binding . . . . . . . . . . 29

2 THE TRANSFER HAMILTONIAN . . . . . . . . . . . . . . . . . . . . . . . . . 35

2.1 Theoretical Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352.2 Computational Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372.3 Nitrogen-Containing Energetic Materials . . . . . . . . . . . . . . . . . . . 37

2.3.1 Nitromethane Monomer . . . . . . . . . . . . . . . . . . . . . . . . . 402.3.2 Nitromethane Dimer . . . . . . . . . . . . . . . . . . . . . . . . . . 412.3.3 Nitromethane Trimer . . . . . . . . . . . . . . . . . . . . . . . . . . 42

2.4 Silica and Silicon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442.4.1 Silicon Clusters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452.4.2 Silica Polymorphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

3 IMPETUS FOR A NEW SEMIEMPIRICAL THEORY . . . . . . . . . . . . . . 58

3.1 Neglect of Diatomic Differential Overlap Based Methods . . . . . . . . . . 583.1.1 Does NDDO Approximate HF? . . . . . . . . . . . . . . . . . . . . 583.1.2 Artificial Repulsive Bump . . . . . . . . . . . . . . . . . . . . . . . 603.1.3 Separation of Electronic and Nuclear Energy Contributions . . . . . 613.1.4 Total Energy Expression . . . . . . . . . . . . . . . . . . . . . . . . 61

5

3.2 Self-Consistent Charge Density-Functional Tight-Binding . . . . . . . . . . 623.2.1 Self-Consistent Charge Density-Functional Tight-Binding in an AO

Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623.2.2 Mulliken Approximation Connection to SCC-DFTB . . . . . . . . . 653.2.3 SCC-DFTB Repulsive Energy . . . . . . . . . . . . . . . . . . . . . 66

3.3 Systematic Comparison of HF, NDDO, DFTB, SCC-DFTB, KS-DFT in aSingle Framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 673.3.1 General Energy Expression . . . . . . . . . . . . . . . . . . . . . . . 703.3.2 Density Independent Fock-Like Matrix Contribution . . . . . . . . . 713.3.3 Density Dependent Fock-Like Matrix Contribution . . . . . . . . . . 723.3.4 Residual Electronic Energy . . . . . . . . . . . . . . . . . . . . . . . 733.3.5 General Gradient Expression . . . . . . . . . . . . . . . . . . . . . . 743.3.6 Requirements for an Accurate Simplified Method . . . . . . . . . . . 75

4 ADAPTED AB INITIO THEORY . . . . . . . . . . . . . . . . . . . . . . . . . 80

4.1 Two-Electron Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 804.2 Kohn-Sham Exchange and Correlation . . . . . . . . . . . . . . . . . . . . 894.3 Adapted ab initio Theory Model Zero . . . . . . . . . . . . . . . . . . . . . 924.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 944.5 Verification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 974.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

APPENDIX PARALLEL IMPLEMENTATION IN THE ACES III ENVIRONMENT 117

REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

BIOGRAPHICAL SKETCH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

6

LIST OF TABLES

Table page

1-1 The NDDO one-center two-electron integrals . . . . . . . . . . . . . . . . . . . . 32

1-2 The NDDO two-center two-electron integrals . . . . . . . . . . . . . . . . . . . . 33

1-3 Multipole distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

2-1 Equilibrium geometry for NMT, in A and degrees . . . . . . . . . . . . . . . . . 49

4-1 Adapted integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

4-2 Average percentage of Fock matrix contribution by adapted approximation . . . 104

7

LIST OF FIGURES

Figure page

2-1 The NMT force curve for C-N rupture . . . . . . . . . . . . . . . . . . . . . . . 50

2-2 The NMT energy curve for C-N rupture . . . . . . . . . . . . . . . . . . . . . . 51

2-3 The NMT dimer energies relative minimum in the intermolecular hydrogen bondingdistance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

2-4 Snapshots of the geometry optimizations for NMT dimer and trimer performedwith the TH-CCSD and AM1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

2-5 Average deviation of Si-O stretch . . . . . . . . . . . . . . . . . . . . . . . . . . 54

2-6 Average deviation of Si-Si stretch . . . . . . . . . . . . . . . . . . . . . . . . . . 55

2-7 Average deviation of O-Si-O angle . . . . . . . . . . . . . . . . . . . . . . . . . . 56

2-8 Density deviation from PW91 DFT . . . . . . . . . . . . . . . . . . . . . . . . . 57

3-1 Multi-center contribution to the C2sN2s matrix element of NMT. . . . . . . . . 76

3-2 The AM1 artificial repulsive bump for NMT direct bond fission. . . . . . . . . 77

3-3 The SCC-DFTB total energy breakdown. . . . . . . . . . . . . . . . . . . . . . 78

3-4 Energies from Rudenberg and Mulliken approximations. . . . . . . . . . . . . . 79

4-1 Scatter plots of agreement between approximate and exact two-electron integrals. 105

4-2 Dissociation curve of C-N with 4-center Rudenberg approximation. . . . . . . . 106

4-3 Orbital products. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

4-4 Dissociation curve of C-N with GZERO(B3LYP) approximation. . . . . . . . . . 108

4-5 Dissociation curve of C-N with AATM0 approximation. . . . . . . . . . . . . . . 109

4-6 Timing of ab initio versus Rudenberg four-center integrals . . . . . . . . . . . . 110

4-7 Percentage of multi-center terms versus system size . . . . . . . . . . . . . . . . 111

4-8 Timing of Fock build using traditional NDDO and NDDO with cubic splines asa function of system size . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

4-9 Error introduced per atom from cubic splines as a function of system size . . . . 113

4-10 Pseudo-reaction-path splitting C20 . . . . . . . . . . . . . . . . . . . . . . . . . 114

4-11 Force RMSD from MP2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

4-12 Force RMSD from B3LYP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

8

Abstract of Dissertation Presented to the Graduate Schoolof the University of Florida in Partial Fulfillment of theRequirements for the Degree of Doctor of Philosophy

ADAPTED AB INITIO THEORY: A SIMPLIFIED KOHN-SHAM DENSITYFUNCTIONAL THEORY

By

Joshua J. McClellan

May 2008

Chair: Rodney J. BartlettMajor: Chemistry

We present work toward a one-electron Hamiltonian whose solution provides

electronic energies, forces, and properties for more than 1000 atoms fast enough to drive

large scale molecular dynamics. Ideally, such a method would be as predictive as accurate

ab initio quantum chemistry for such systems. To design the Hamiltonian requires that

we investigate rigorous one-particle theories including density functional theory and

the analogous wavefunction theory construction. These two complementary approaches

help identify the essential features required by an exact one-particle theory of electronic

structure. The intent is to incorporate these into a simple approximation that can provide

the accuracy required but at a speed orders of magnitude faster than today’s DFT.

We call the framework developed adapted ab initio theory. It retains many of the

computationally attractive features of the widely-used neglect of diatomic differential

overlap and self-consistent-charge density-functional tight-binding semiempirical methods,

but is instead, a simplified method as it allows for precise connections to high-level ab

initio methods. Working within this novel formal structure we explore computational

aspects that exploit modern computer architectures while maintaining a desired level of

accuracy.

9

CHAPTER 1INTRODUCTION AND BACKGROUND

The underlying physical laws for the mathematical theory of a large partof physics and the whole of chemistry are thus completely known, and thedifficulty is only that the exact application of these laws leads to equationsmuch too complicated to be soluble. — P.A.M. Dirac

1.1 Who Ordered a New Semiempirical Theory?

Available quantum chemical methods are incapable of achieving so-called chemical

accuracy (1 to 2 kcal/mol for bond energies) for a system composed of thousands of

atoms within a computationally efficient framework (approximately one day on a typical

desktop machine). A computational method with this capability would revolutionize the

way that matter is studied in biology, chemistry and physics. In the field of quantum

chemistry, there are many unexploited connections among wavefunction, density functional

and semiempirical theories. Use of such connections will aid in the development of

quantum chemical methods with the desired accuracy and computational efficiency. The

ultimate goal then is to provide scientists with a computational tool that makes materials

simulations predictive, chemically specific, and applicable to a wide variety of mechanical

and optical properties of realistic, complex systems.

Though high-level quantum methods approach the desired accuracy, the computational

scaling of such methods precludes their use for systems larger than a tens of atoms (e.g.,

coupled-cluster with singles and doubles (CCSD) scales O(N6), where N is the number

of basis functions). In contrast, the computational scaling of semiempirical (SE) methods

is acceptable (typically scales O(N3)), though they can be made to scale linearly with N .

For instance, if the size of a system is increased by an order of magnitude then a CCSD

calculation would incur a factor of 106 more computational time. For the same increase

in system size, the computational time of a normal SE method would only increase by

a factor of a thousand, but with linear scaling only by ten. In the regime of very large

systems (thousands of atoms) it is apparent that a calculation that could be performed in

one day using SE methods might take several years for CCSD. Unfortunately, SE methods

10

or, better, simplified methods (SM) are not yet sufficiently reliable, although they do

provide an excellent opportunity for improvement. There are two general approaches to

improve SE methods: first, existing reference data can be expanded to include additional

systems and properties or improved fitting procedures can be used to help locate new

minima in the model Hamiltonian’s parameter space; second, more rigorous model

Hamiltonians can be derived to satisfy more demanding desiderata, such as uniform

accuracy of the energy and forces across the entire potential energy surfaces (PES) and

improved electronic densities, ionization potentials, and associated first-order properties.

The first route is a mainstay of SE method development, though we will show that the

second SM strategy has the potential to be more valuable in the long run. To this end,

a new SM is discussed that incorporates rigorous formal connections to high-level ab

initio methods while retaining computational features of commonly used SE methods. An

important computational aspect of existing SE methods is that their speed is dictated by

the solution of a set of one-particle eigenvalue equations (typically matrix diagonalization)

and not by the construction of those one-particle equations or any ensuing step. This

computational restriction, in addition to a modicum of accuracy, guides the development

of the method under discussion, adapted ab initio theory (AAT). AAT not only provides

a general theoretical SM framework into which existing SE methods can fit, but also has

the advantage that there are well-defined connections between it and high-level ab initio

theories. This novel framework not only allows for improvements to existing SE methods

that enhance transferability and extendibility, but also has the flexibility needed for more

systematic improvement in future versions of SE methodology.

Before we begin the detailed description of our quantum mechanical method the

need for a fast and accurate method to describe the properties of large molecules will be

demonstrated. Phenomena that involve bond-breaking or bond-forming often necessitate

a detailed knowledge of the electronic structure and the PES. Such quantum chemical

descriptions are important in many complex chemical processes; for example:

11

1. Gas phase studies.• Plasmon absorption of organometallic nanoparticles.• Degradation pathways of chemical warfare agents.• Environmentally safe green explosives.

2. Condensed matter studies.• Defects in semiconductors.• Crack-tip propagation and hydrolytic weakening of silica.• Bulk tensile properties of silica polymorphs.

3. Biological studies.• Bioinformatics: fragmentation patterns of peptides for proteomics.• Biocatalysis: enzymatic reactions on oligosaccharides for biofuel generation.• Photoreceptors such as rhodopsin.• Drug design.

Quantum chemical methods are used to some extent in the analysis of these types

of problems. The ideal quantum chemical method could be applied to any size system

and achieve any desired accuracy, while using minimal computational resources, but, of

course, this cannot be achieved. Hence, there are often practical limitations dictated by

the desired accuracy and the available computational resources. Our goal is to create a

method with chemical accuracy that can be applied to systems with thousands of atoms.

A method with these features would be able to readily provide the quantum mechanical

contribution to the solution of the systems mentioned.

The two main approaches in quantum chemistry are wavefunction theory (WFT) and

density functional theory (DFT). Likewise, there are two commonly used SE techniques:

neglect of diatomic differential overlap (NDDO) methods, which originated in WFT, and

density-functional tight-binding (DFTB), including its self-consistent-charge extension

(SCC-DFTB), which originated in DFT. Both NDDO and SCC-DFTB can be considered

semiempirical since they both approximate the formal structure of the original methods

and are parameterized to reproduce either experimental or theoretical reference data.

Just as the formal connections between WFT and DFT have recently been uncovered

and exploited, via ab initio DFT [1], there are similar formal connections between their

12

semiempirical counterparts: NDDO and SCC-DFTB. These connections form the basis of

AAT, which is a SM rather than a SE method.

Existing SE methods are widely used and, when applied judiciously, perform quite

well. Regrettably, they are often used in situations for which the results are not reliable,

such as locating transition states. To expand the applicability of SE methods, we combine

their primary features with those from rigorous WFT or DFT, and define the new AAT

framework. Hence, AAT encompasses both NDDO and SCC-DFTB.

Despite their formal deficiencies, both NDDO and SCC-DFTB have been quite

successful in a variety of applications [2]. Nonetheless, there remain several aspects

of these SE methods that fall short of expectation that could be greatly improved. In

addition to obvious formal limitations, existing SE methods were designed to reproduce

the energetics of systems at equilibrium and often yield unphysical results near transition

states and along dissociation pathways. Uniform accuracy of the PES is critical for

molecular dynamics (MD) simulations, for determining the lowest energy (global

minimum) structure, and for calculating rate constants. Associated with these failings

are unrealistic electronic densities that, in turn, make combining these methods into

multi-scale modeling schemes difficult. Furthermore, errors in first-order properties such

as the dipole and higher-order moments would be remedied by a method that delivers

adequate electronic densities. Attempts to improve such properties have met with modest

success, typically at the expense of other aspects of the model.

By far, the most common route to incorporate such additional features is to

reparameterize the model. Unfortunately, building in accuracy for non-equilibrium

structures, excited states, and other properties, by simple reparameterization of existing

NDDO-based SE methods [3] is not readily achievable, at least not in a way that

maintains transferability (similar accuracy regardless of bonding characteristics or system

size). This difficulty highlights an important distinction between the current AAT route of

designing a new simplified model Hamiltonian based on rigorous ab initio principles versus

13

attempting to extend existing model Hamiltonians to include features that the model was

not originally intended to capture. To avoid such complications the AAT SM is primarily

designed to yield accurate electronic structure (electronic density) and, secondarily, the

total energy and forces. Excited and ionized states are of interest. Clearly, the electronic

structure and the energy of a molecule are closely connected. Consequently, the distinction

between the two is subtle, nevertheless, their separation is crucial in order to ensure that

the AAT achieves uniform accuracy for energies, forces and first-order properties over an

entire PES. The general computational approach is as follows:

• Approximate four-center terms with a sum of two-center terms using the Rudenbergapproximation [4, 5].

• Determine exchange-correlation potentials to guarantee the exact result at theseparated-atom limit, starting from a sum of atom-based densities.

• Use cubic splines to interpolate these two-center potential terms (a lookup table ofsplines need only be generated once and is very fast).

• Solve a set of one-particle equations with explicit inclusion of orbital overlap.

• Include explicit exact (HF-like/ non-local) exchange.

• At convergence of self-consistency, make a final energy correction by inserting thedensity into a DFT functional (B3LYP) and performing a single numeric integrationfor the exchange-correlation energy.

An important computational consideration is the inclusion of the overlap between

orbitals, which is a feature of AAT that distinguishes it from standard NDDO methods.

The zero-differential overlap approximation, which is the fundamental approximation of

NDDO, implies that there is no overlap between orbitals. Though this approximation

enhances the speed of the method, it neglects an important physical feature of electrons

and the basis sets used to describe them. There have been attempts to reintroduce such

overlap effects in NDDO methods, most notably Thiel’s orthogonalization correction

methods (OM1 and OM2) [6]. Still, Elstner’s development of SCC-DFTB [7] demonstrates

14

that one can include overlap explicitly and still retain a computationally efficient

framework. Thus, AAT explicitly includes the overlap between atomic orbitals.

The predominant computational aspects of the method have been highlighted; now,

consider the formal development of the AAT approach. There is an exact reference to

the exchange-correlation potential with which we can directly compare. The electronic

density can be corrected iteratively by an inverse procedure, such as that of Zhao,

Morrison, and Parr [8] or that of van Leeuwen and Baerends [9]. These procedures

take a high-quality reference density and generate the numerical exchange-correlation

potential that best reproduces it. In the limit that the reference density is exact, such

procedures produce an exact local exchange-correlation potential. This exact limit

defines a set of one-particle equations, which in turn, define the first-order electronic

energy. When added to the nuclear-nuclear repulsive energy, the electronic energy gives

the total energy correct through first-order in the trial wavefunction. In existing SE

methods, the remaining higher-order energy terms are combined with the nuclear-nuclear

repulsion to give a parameterized atom-pair based total energy correction that would be

predominantly nuclear repulsion. Instead, we choose to retain a separate nuclear-nuclear

repulsion contribution to ensure that every aspect of our model has a direct one-to-one

correspondence to ab initio theory. The connections to the exact limits for both density

and total energy components of the overall parameterization provide the necessary

structure so that every approximation of the model can be rigorously verified.

1.2 Quantum Chemistry

Electronic structure theory describes the distribution of electrons around nuclei in

atoms, molecules, and solids. Many observable features of a molecule are given by its

electronic structure, though computational efficiency requires approximations that can

limit the accuracy with which we can describe some chemical phenomena. There is a series

of standard approximations that permeate quantum chemistry that restrict its universal

applicability, but that are valid for many cases of interest:

15

• Time-independent Schrodinger equation.

• Born-Oppenheimer approximation.

• Nonrelativistic Hamiltonian.

• Linear combination of atomic orbitals.

The time-independent Schrodinger equation provides the solutions needed for the

time-dependent Schrodinger equation, and though some problems require a time-dependent

solution, the vast majority of problems in quantum chemistry can be treated adequately

with the time-independent Schrodinger equation. The Born-Oppenheimer approximation

is based on the simple observation that electrons move much faster than nuclei. This

difference in speed effectively implies nuclear motion cannot affect the electronic

structure significantly and we can solve the Schrodinger equation for a set of fixed nuclear

coordinates. For example in the hydrogen atom the electron is traveling at about two

million meters per second, which is orders of magnitude faster than the motion of the

nucleus. Also, this speed is less than 1% of the speed of light, which also implies that we

need not concern ourselves with relativstic effects that would only become important for

atoms with heavier nuclei (typically those with atomic number larger than 25). For such

heavy atoms the speed of an electron near the nucleus will approach the speed of light and

would require a relativistic Hamiltonian for an accurate description. In such cases effective

core potentials that approximate the passive (Darwin and mass-velocity) relativistic effects

can be used, which still allows one to keep the nonrelativistic Hamiltonian framework.

Finally, the linear combination of atomic orbitals (LCAO) approximation introduces basis

sets into the problem and gives a general framework for systematically expanding the

reference space of the orbitals that the electrons may occupy.

The time-independent Schrodinger equation within the Born-Oppenheimer (clamped-

nucleus) approximation is

HΨ = EΨ (1–1)

16

where E is the energy of the system, Ψ is the wavefunction of the electrons, and the

nonrelativistic Hamiltonian in Hartree atomic units is

H =− 1

2

∑i

∇2i −

∑A,i

ZA

rAi

+∑i>j

1

rij

+∑A>B

ZAZB

RAB

=∑

i

h(i) +∑i>j

1

rij

+ VNN

(1–2)

with i and j referring to electrons and A and B to nuclei. The terms in Equation 1–2

are the operators that correspond to the kinetic energy of the electrons, electron-nuclear

attraction, electron-electron repulsion, and nuclear-nuclear repulsion. The wavefunction

Ψ describes the state of our quantum system. We choose to approximate Ψ within a basis

set of atomic orbitals (AOs). Enforcing antisymmetry and satisfying the Pauli exclusion

principle can be achieved with an antisymmetric product of orthonormalized AOs within a

Slater determinant, the LCAO approximation

|ψ〉 = |χ1(r1)χ2(r2) · · ·χn(rn)〉 (1–3)

where n is the number of electrons and the spin orbital χ(r1) = φ(r1)α, β and α, β denote

spin and φ(r) is a spatial orbital.

1.3 Hartree-Fock

Focusing on the ground electronic state, the most widely known approximate solution

of the electronic Schrodinger equation is made by a mean-field approximation: HF theory.

That is, we insist that a zeroth-order approximation to the n-electron wavefunction (Φ0)

be written as an antisymmetrized product of n one-electron molecular spin-orbitals,

Φ0 = A[φ1(1)φ2(2)...φn(n)]. Here A is the antisymmetrization operator. Then using the

Raleigh-Ritz variational principle, EHF ≥ Eexact, we derive the one-particle HF equations

which yield the energetically best possible single determinant wavefunction. Each of the

orbitals in its canonical form is an eigen-function of the effective one-particle operator, f .

It is fitting to start with the most well-known approximation in electronic structure

theory. HF is an excellent touchstone since correlation energy is defined by it (Ecorrelation =

17

Eexact − EHF ) and because many quantum chemists have developed an intuitive

understanding of it. Also, HF provides a convenient way to introduce notation that will be

used throughout the current work. The one-electron operator of the full Hamiltonian is

h(1)|χi〉 = (−1

2∇2

1 −∑

A

ZA

rA1

)|χi〉 (1–4)

and the two-electron operators are Coulomb J and exchange K,

J(1)j|χi〉 =

∫ |χj(r2)|2r12

dr2|χi〉 (1–5)

K(1)j|χi〉 =∑

j 6=i

∫χ∗j(r2)χi(r2)

r12

dr2|χj〉 (1–6)

and

f(1)|χi〉 = [h(r1) + (J(1)− K(1))]|χi〉 = εi|χi〉 (1–7)

denoting the molecular spin orbital χi as a linear combination of atomic spin orbitals χi

χi(r1) =∑

µ

Cµiχµ(r1) (1–8)

Using Lagrange’s method of undetermined multipliers and enforcing orthonormality of the

orbitals gives a set of one-particle matrix equations, the Hartree-Fock-Roothaan equations:

FC = SCε (1–9)

where

Sµν =

∫χµ(r1)χν(r1)dr1 (1–10)

and

Fµν =

∫χµ(r1)f(r1)χν(r1)dr1 (1–11)

The construction of a Fock-like (or effective one-particle) matrix is a central theme of this

work. A Fock-like matrix is an orbital representation of an effective one-particle operator,

18

such as found in HF or DFT. It is also possible to define others, as in the correlated

orbital potential (COP) method [10, 11].

HF illustrates many of the necessary features of a one-particle theory. One distinct

feature, which sets HF apart from most other one-particle theories, is that HF enforces the

Pauli exclusion principle exactly via the exchange operator. Such exact exchange terms

are missing in Kohn-Sham density functional theory [12] which leads to the so-called self-

interaction error. Another benefit of HF is that there is Koopmans’ theorem that gives

meaning to the eigenvalues of the Fock matrix by relating them to ionization potentials

and electron affinities. However, the exact DFT structure can provide a formally correct

one-particle theory that includes electron correlation, which HF cannot do because by

definition it does not include electron correlation. COP is another exact alternative that

can be defined from WFT.

1.4 Coupled-Cluster Theory

The introduction of electron correlation in wavefunction theory can be achieved by

generalizing the mean-field wavefunction to allow electrons to be excited into the orbitals

that are unoccupied (virtual, i.e., do not appear in the Hartree-Fock determinant) orbitals,

a,b,c, . . . . Those additional orbitals allow the electrons to avoid each other in a more

detailed way (hence lower the energy) than in the mean-field wavefunction. That is, for

single excitations (replacements), double excitations, triple excitations, etc., we define

excitation operators Tj, j = 1, 2, 3, . . .

T1Φ0 =∑i,a

tai Φai (1–12)

T2Φ0 =∑

i<j,a<b

tabij Φab

ij (1–13)

T3Φ0 =∑

i<j<k,a<b<c

tabcijkΦ

abcijk (1–14)

...

19

with the excited determinants, Φabij , meaning the antisymmetrized product,

Φabij = A[φ1(1)φ2(2)..φa(i)..φb(j)...φn(n)], (1–15)

where unoccupied orbitals a and b replace the occupied ones, i and j. The coefficient

(amplitude) in front of the excitation introduces its proper weight into the wavefunction.

The ultimate version of such a sequence would include all possible excitations up to n-fold

for n electrons and is called the full configuration interaction or full CI. Full CI is:

• The best possible solution in a given basis set.

• Invariant to all unitary transformations (rotations) of the orbitals.

• Size-extensive [13] (meaning that it scales properly with the number of electrons.)

• An upper bound to the exact energy for a molecular electronic state.

The full CI can be written conveniently in the coupled-cluster (CC) form,

Ψ = exp(T1 + T2 + T3 + ... + Tn)Φ0. (1–16)

[13–16]. Approximations to CC theory are made by restricting the Tp operators to singles

and doubles, as in CCSD, or to other approximations that decouple higher-excitations

from lower ones, e.g., CCSD(T). Here the triple excitations are approximated from T1 and

T2, which can be obtained from CCSD, e.g., without allowing the new T3 estimate then to

contribute back to T1 and T2. The details of these methods, which can be found elsewhere

[17], are not important to the present discussion. What is important is that all truncated

CC methods are size-extensive, but do not necessarily give upper bounds to the exact

energy.

The full CC equations provide the exact electronic energy,

ECC = 〈Φ0|H|Φ0〉 = EHF +∑

i<j,a<b

〈ij||ab〉(tabij + tai t

bj − taj t

ai ) +

∑i,a

〈i|f |a〉tai (1–17)

20

where the two-electron and one-electron integrals are, respectively,

〈pq||rs〉 =

∫φ∗p(1)φ∗q(2)(1− P12)r

−112 φr(1)φs(1)d1d2 (1–18)

〈i|f |a〉 =

∫φ∗i (1)f(1)φa(1)d1 (1–19)

and P12 is the operator that permutes electrons 1 and 2. The amplitudes, {tai , tabij , · · · }

come from projection of exp(−T )H exp(T ) = H onto single, double, etc. excitations,

〈Φai |H|Φ0〉 = 0 (1–20)

〈Φabij |H|Φ0〉 = 0 (1–21)

. . .

The CC equations are a set of complicated, coupled non-linear equations. Unlike the

truncated CI equations, even when the CC equations are restricted to some subset of

excitations, like singles and doubles (CCSD), they still guarantee a size-extensive result.

They converge much more rapidly to the full CI reference solution than does the sequence

of partial sums of the CI expansion itself. For these reasons coupled-cluster has become

the method of choice in high-accuracy quantum chemical applications for small molecules

[17–19].

CC theory provides a route to exact reference data. It is well-known that the

hierarchy of CC methods are systematically improvable, and lead to the full CI solution

for a particular basis set. Hence, for small molecules we have ready access to reference

energies, forces, densities, and electronic properties to which we can compare directly.

The wealth of information that is afforded by a high-level ab initio calculation is superior

to the typical amount of information available from experiment and provides unique

insight into how well an approximation is working. For instance, besides getting energetic

information about a molecule we simultaneously have access to the electronic density.

This increased level of control, when compared to experimental reference data, allows for

systematically verifiable approximations of our effective one-particle Hamiltonian. The

21

ability to verify every approximation systematically puts the current work into sharp

relief from current semiempirical methods that depend intrinsically upon uncontrolled

approximations.

1.5 Kohn-Sham Density Functional Theory

The Kohn-Sham [12] equations may be written as

fKS|χi〉 = εi|χi〉 (1–22)

or,

[−1

2∇2 + νext(r) + νJ(r) + νxc(r)]|χi〉 = εi|χi〉 (1–23)

where

νext(r) = −∑

A

ZA

|RA − r| , (1–24)

νJ(r1) =

∫ρ(r2)

r12

dr2, (1–25)

ρ(r) =∑

i

|φi(r)|2, (1–26)

and νxc(r) is the exchange-correlation potential. Constrained-search minimization of the

noninteracting kinetic energy alone determines orbitals that yield the ground state density

[20]. The other terms do not affect the minimization because they are explicit functionals

of the density.

The Hohenberg-Kohn-Sham theorems guarantee that the density obtained from the

exact ground state wavefunction will be returned by Equation 1–26 when using the exact

exchange-correlation potential. In a pragmatic sense, these arguments give a formal proof

of existence, though they do not guide the construction of the exact one-particle operator,

or, more specifically, the exchange-correlation potential. However, in combination with

so-called inverse procedures, such as that of Zhao, Morrison and Parr (ZMP) [21–23],

which have been explored by Tozer et al. [24], or the procedure by Peirs et al. [25], the

potential may be refined iteratively such that the solutions to Equation 1–23 will generate

a reference density, presumably a high-quality ab initio reference density.

22

This is only part of the overall problem. In addition to the purely electronic

properties, we also want total energies and forces. The total energy in KS-DFT can

be written,

EKSelec =

occ∑i

εi − 1

2

∫ ∫ρ(r)ρ(r′)|r − r′| drdr′ −

∫vxc(r)ρ(r)dr + Exc[ρ] (1–27)

where the exchange-correlation potential is the functional derivative of the exchange-

correlation energy functional

vxc(r) =δExc[ρ]

δρ(r). (1–28)

Unfortunately, currently there is no procedure that allows us to take a numerically

derived exchange-correlation potential, one that is determined such that it reproduces a

high-quality reference density, and generate the exchange-correlation functional, which is

an explicit functional of the density. This inability to determine the exchange-correlation

energy functional from a reference density alone is the first indication of a fundamental

problem in semiempirical methods, which will be discussed in further detail later: how

does one design a model Hamiltonian that will simultaneously achieve the correct result

for both electronic properties and energetics? However, if we are willing to assume the

form of a particular exchange-correlation energy functional then the potential is well

defined, as is the effective one-particle Hamiltonian.

1.6 Huckel Method

The Huckel π-electron method is perhaps the simplest quantum chemical approximation.

It is intended to highlight the basic underlying effects of symmetry involved with π

electrons for alternant hydrocarbons, though it can be made more general. In relation

to HF, the same type of one-particle matrix equations are solved, though with the

overlap matrix set to the identity matrix and the Fock-like matrix is composed of only a

one-electron (density independent) contribution, F = H

HC = Cε (1–29)

23

The total energy is

E =∑

i

niεi (1–30)

where ni is the occupation number for the ith MO. Note there is no explicit consideration

of the nuclear repulsion in Equation 1–30. The Huckel approximations are

Hij =

α if i = j

β if i are nearest neighbors j

0 otherwise.

(1–31)

The extended Huckel method [26], as its name implies, is merely an extension of the

standard Huckel method with the same total energy formula. The full matrix equation

HC = SCε is solved, with overlap explicitly included. And the Hamiltonian is subject to a

slightly more sophisticated approximation:

Hij =

α if i = j

KSij(Hii + Hjj) if i and j are nearest neighbors

0 otherwise

(1–32)

where Hii is based on the ith AO of an isolated atom and K is a dimensionless adjustable

parameter for an atom-pair. Hoffmann suggested a value for K based on its effect on the

energy of ethane (C2H6). He argued that, since the energy dependence on K is nearly

linear beyond K = 1.5, the MO energies and, in turn, the total energy, will not be too

sensitive to K for values greater than that. He then considered the energy difference

between the staggered and eclipsed conformations of ethane and optimized the value of

K to minimize the error between calculated and experimental values. The final value he

arrived at was K = 1.75, a value is still in common use.

The important aspect to note about the Huckel method, as it pertains to the

development of improved semiempirical methods, is that a very simple model can

reproduce a few features qualitatively well. The spatial symmetry is correct due to the

inclusion of the overlap matrix. However, despite the ability of such basic approximations

24

to capture a large fraction of the total energy, more sophisticated descriptions are needed

to guarantee chemical accuracy.

1.7 Neglect of Diatomic Differential Overlap

Neglect of diatomic differential overlap (NDDO) is the model that most people

associate with semiempirical theory in quantum chemistry today. The NDDO approach

is meant to approximate the HF solution, not the exact solution, though its parameters

can be varied to minimize error to reference data [27–32]. It is a minimal basis set,

valence-only approach. NDDO is based on the zero-differential overlap (ZDO) approximation:

products of orbitals with respect the same electron coordinate that are on different centers

are defined to be zero,

χAµ (1)χB

ν (1) = 0. (1–33)

The orbitals are considered to be Lowdin symmetrically orthogonalized AOs, so that

the overlap matrix is the identity matrix, and the set of one-particle matrix equations

solved is FNDDOC = Cε. The Fock-like matrix in NDDO contains one-electron (H) and

two-electron (G) components. The one-electron part is approximated as

Hµν =

Uµµ −∑

B 6=A ZB(µµ|BB) if µ = ν

12Sµν(βµ + βν) if µ 6= ν

(1–34)

where µ is on atom A, ν is on atom B, and Uµµ and βµ are adjustable parameters,

(µµ|BB) models the repulsion between a product of orbitals (µ) and a point charge

at center B, (〈µ(1)| 1r1B|µ(1)〉). Note that the two-center one-electron terms include

overlap and are therefore not consistent with the ZDO approximation. The one-center

two-electron integrals are given in Table 1-1. The parameters Gss, Gpp, Hsp, Gpp, and Gp2

are adjustable.

Within a local framework there are twenty-two unique two-center NDDO-type

two-electron integrals for each atom pair (Table 1-2). The NDDO two-center two-electron

integrals are approximated by a multipole-multipole expansion [33]. Each product of

25

orbitals is assigned an approximate charge distribution (Table 1-3). Adjustable parameters

arise from the distances between the point charges used in the multipole expansion and

by enforcing the correct limit as R → 0 where the equivalent one-center term should be

reproduced.

As mentioned above, originally the NDDO Fock-like matrix was designed to model

the HF one-particle Hamiltonian. The HF solution neglects all electron correlation

effects and on the scale of chemical accuracy one should not a priori expect the NDDO

approximation to reproduce experimental results. However, because some parameters are

adjustable and fit to experimental data the typical argument is that electron correlation

is implicitly included. In fact, it is clear that electron correlation is no more implicit in

the NDDO model than are relativistic effects, time-dependence, non-Born-Oppenheimer

effects, basis set incompleteness, and experimental errors in the reference data. Despite

such formal gaps, the NDDO method has enjoyed a tremendous amount of success and

reproduces heats of formation and other properties with surprising accuracy. The difficulty

arises when better accuracy is required in situations for which the model was not originally

intended to describe, such as transition states or other nonequilibrium structures. In such

cases one quickly approaches a point of diminishing returns for parameterizing to new

reference data. The precise origin of these limitations is explored later.

1.8 Tight-Binding

The basic equation for the total energy in tight-binding (TB) theory is assumed to be

of the form,

ETB =n∑

i=1

εi +1

2

∑

A6=B

U(|RA −RB|). (1–35)

where the εi’s are the eigenvalues of

heff |i〉 = (−1

2∇2 + vext)|i〉 = εi|i〉, (1–36)

vext is the nuclear-electron Coulomb attraction term and U(|RA − RB|) is considered to

represent all the remaining terms. In total, the latter constitute a repulsive function of the

26

set {RAB} between atoms A and B. For the one-particle Hamiltonian, the usual LCAO

approximation is made and the φi’s are expanded in a set of χµ’s giving the equation,

HeffC = SCε (1–37)

where

(Heff )µν = 〈µ|heff |ν〉 (1–38)

and

Sµν = 〈µ|ν〉. (1–39)

In practice, these two-center matrix elements are evaluated using the Slater-Koster

parameterization [34]. The TB approximation has been applied successfully to many solid

state problems and works especially well when the density of the chemical system is well

approximated as a sum of spherically symmetric atomic densities. In the case of molecules,

a more advanced approach that “treats the delicate balance of charge” [7] is needed. If we

compare TB to HF, we see that U(|RA −RB|) has to account for J − K and VNN . In DFT

it would account for J , VNN , and would combine exchange with correlation by adding vxc.

1.9 Density-Functional Tight-Binding

To vest the TB energy expression with more rigor, Foulkes and Haydock [35] derived

it from first-principles DFT considerations. The energy expression in KS-DFT, Equation

1–27, may be written as

EDFT =occ∑i

〈φi| − ∇2

2+ vext +

1

2

∫ ′ ρ(r′)|r − r′| |φi〉+ Exc[ρ(r)] +

1

2

N∑A,B

ZAZB

|RA −RB| . (1–40)

Equation 1–40 is exact if ρ =∑occ

i |φ2i |, where ρ is the exact density. However, it should

be noted that φi are eigensolutions of hs = (−∇2

2+ vext +

∫ ′ ρ′0|r−r′| + vxc[ρ0]) and not

(−∇2

2+ vext + 1

2

∫ ′ ρ(r′)|r−r′|). Following Foulkes and Haydock, the density ρ is replaced by a

reference density ρ0 and a small deviation δρ, such that ρ = ρ0 + δρ. Then we can rewrite

27

the energy expression to be

EDFT =occ∑i

〈φi| − ∇2

2+ vext +

∫ ′ ρ′0|r − r′| + vxc[ρ0]|φi〉

−1

2

∫ ∫ ′ ρ′0(ρ0 + δρ)

|r − r′| −∫

Vxc[ρ0](ρ0 + δρ) (1–41)

+1

2

∫ ∫ ′ δρ′(ρ0 + δρ)

|r − r′| + Exc[ρ0 + δρ] +1

2

N∑A,B

ZAZB

|RA −RB|

to enable us to introduce a form dependent upon the one-particle Hamiltonian, hs, plus its

corrections. The first new term accounts for the double-counting of the Coulomb potential,

the next accounts for the addition of vxc in the first term, and the third new term is the

remaining part of the Coulomb potential. Next, the exchange-correlation contribution

Exc is expanded about the reference density, ρ0, with a Volterra (Taylor) expansion for

functionals,

Exc[ρ0 + δρ] = Exc[ρ0] +

∫νxc[ρ0]δρ +

1

2

∫ ∫ ′ δ2Exc

δρδρ′

∣∣∣∣ρ0

δρδρ′

+1

6

∫ ∫ ′ ∫ ′′ δ3Exc

δρδρ′δρ′′

∣∣∣∣ρ0

δρδρ′δρ′′ + · · · (1–42)

Substitution of this expression for Exc into the energy equation yields an expression for the

energy that is formally exact, yet only includes terms that are second-order and higher in

δρ.

EDFT =occ∑i

〈φi| − ∇2

2+ vext +

∫ ′ ρ′0|r − r′| + vxc[ρ0]|φi〉

− 1

2

∫ ∫ ′ ρ′0ρ0

|r − r′| + Exc[ρ0] +1

2

N∑A,B

ZAZB

|RA −RB| −∫

vxc[ρ0](ρ0)

+1

2

∫ ∫ ′ δρ′δρ|r − r′| +

1

2

∫ ∫ ′ δ2Exc

δρδρ′

∣∣∣∣ρ0

δρδρ′ (1–43)

+1

6

∫ ∫ ′ ∫ ′′ δ3Exc

δρδρ′δρ′′

∣∣∣∣ρ0

δρδρ′δρ′′ + · · ·

28

All of the terms in this expression that are independent of δρ comprise the standard

TB equation. Hence,

U(|RA −RB|) = −1

2

∫ ∫ ′ ρ′0ρ0

|r − r′| −∫

vxc[ρ0](ρ0) + Exc[ρ0] +1

2

N∑A,B

ZAZB

|RA −RB| (1–44)

which obviously is correct to second order in ρ. Note the inclusion of nuclear-nuclear

repulsion in this term.

One of the great advantages of DFT approaches is that they offer a reasonable

zeroth-order approximation to the energy simply from knowing some approximation to the

reference density, ρ0. However, since Exc[ρ] is not known, getting the exchange-correlation

potential, vxc(r) = δExc/δρ(r), and its kernel, fxc(r, r′) = δ2Exc/δρ(r)δρ(r′), from first

principles, which are required to set up a rigorous DFT, is difficult [1, 36]. However, were

we to accept one of the plethora of approximations to Exc (LDA, LYP, PBE, B3LYP, B88,

etc.) the equations can be solved for a few hundred atoms, but not on a time-scale that

can be tied to MD for thousands of atoms.

Furthermore, to provide a correct description of bond breaking the electronic charge

has to be free to relocate. A common example is LiF, in which at R = Req we have

virtually an exact Li+F− form, but at separation in a vacuum, we must have Li0F0.

Such self-consistency usually is introduced by solving the AO-based DFT equations via

diagonalization, but this approach imposes an iterative step that scales as N3 to the

procedure. Since the model is meant to be approximate anyway, the alternative charge

self-consistency approach, familiar from iterative extended Huckel theory [37, 38], can be

used more efficiently. This extension leads to the self-consistent-charge density-functional

tight-binding (SCC-DFTB) approach of Elstner et al. [7].

1.10 Self-Consistent-Charge Density-Functional Tight-Binding

To introduce charge self-consistency, the terms that are second-order in δρ are

approximated as a short-range Coulombic interaction based upon Mulliken atomic charge

29

definitions,

E2nd =1

2

∫ ∫ ′(

δρ′δρ|r − r′| +

δ2Exc

δρδρ′

∣∣∣∣ρ0

)δρδρ′ ≈ 1

2

N∑A,B

∆qA∆qBγAB. (1–45)

The ∆qA are the differences between some reference charge, q0A, and the charge on atom A,

qA =occ∑i

∑µ∈A

∑ν

cµicνiSµν (1–46)

γAB is a pair-potential, whose common functional forms are taken from traditional

semiempirical quantum chemistry schemes such as those due to Klopman [39], Ohno [40],

and Mataga-Nishimoto [41]. For example, the latter form is

γAB = 1/[2(γA + γB)−1 + RAB] (1–47)

where the single index γA is meant to be a purely atomic two-electron repulsion term.

These are typically chosen as fixed atomic parameters in the calculation. For an

assessment of how the SCC-DFTB model performs compared to standard semiempirical

quantum chemical methods, see Morokuma, et al. [42]. For example, unlike SCC-DFTB,

standard AM1 does not give the most and least stable isomers of C28 when compared

to DFT calculations with the B3LYP Exc and a 6-31G(d) basis set. In contrast, the

relative energies of the isomers of C28 with B3LYP/6-31G(d)//SCC-DFTB are in very

good agreement with B3LYP/6-31G(d), having a linear regression R2 coefficient of 0.9925.

Though the geometries generated by SCC-DFTB are excellent, the SCC-DFTB relative

energies are poor, with a linear regression R2 of 0.7571.

SCC-DFTB is a fairly rigorous semiempirical method. The framework provided by

Foulkes and Haydock exposed how higher-order terms could be included. The practical

implementation of a well-behaved approximation of higher-order was subsequently

provided by Elstner. Later, we will develop the SCC-DFTB formalism from an MO

viewpoint to facilitate the comparison between it and other methods.

30

We have discussed the basic features of HF, CC, KS-DFT, Huckel, NDDO, TB,

DFTB, and SCC-DFTB, and have alluded to the role each plays in the development

of an improved semiempirical method. We take two approaches to develop this target

method. First, we demonstrate the transfer Hamiltonian, which provides an exact

one-particle framework in WFT. Second, we develop adapted ab initio theory which

provides rigorously verifiable approximations to reproduce ab initio reference data.

31

Table 1-3. Multipole distributions

Term Charge distribution Point charges(ss| Monopole 1(sp| Dipole 2(pp| Linear Quadrapole 4(pp′| Square Quadrapole 4

34

CHAPTER 2THE TRANSFER HAMILTONIAN

2.1 Theoretical Method

Historically SE method development has been guided by experiment, the goal being

to reproduce a set of experimental quantities. While any QM method should in some

limit reproduce experimentally observed phenomena, the exact WFT imposes more

demanding and detailed requirements on a model than experiment alone. For instance,

the wavefunction is not an observable yet contains all the information about the system.

In this sense a model based in WFT aspires to reproduce the exact wavefunction and the

effective Hamiltonian that generates it. The expectation value of this exact wavefunction

for a particular Hermitian operator will yield a particular experimental observable.

High-quality ab initio reference data is readily available from CC theory and provides an

unambiguous point of reference for our model. Furthermore, by reproducing the features of

an exact WFT we are guaranteed to reproduce all experimental phenomena.

However, practical limitations restrict the functional form of the wavefunction and

the Hamiltonian. The challenge is to incorporate the many-body effects of exact WFT

into a simple (low-rank) model Hamiltonian. One aspect of the approach taken here is to

develop a formally exact one-particle theory, the transfer Hamiltonian (Th), which includes

all many-particle effects. Another aspect is to connect a simple approximate model

Hamiltonian to the exact Th. Our initial work used the well-known NDDO-type model

Hamiltonian. Because these models have limited applicability, it has become evident that

more complete models are needed that accurately incorporate the many-particle effects

of the Th. We develop the AAT simplified approach to include the features of the Th in a

systematic way, which is discussed in detail later.

Here we introduce the Th [43], a one-particle operator with a correlation contribution,

as an extension of the similarity transformed Hamiltonian from CC theory [44]. This

approach provides a rigorous formal framework for an exact one-particle WFT. Also,

unlike other methods, such as the specific reaction coordinate approach of Truhlar [45],

35

which attempt to fit a known energy surface to a SE form via reparameterization, the Th

fits to the Hamiltonian itself, to provide a form that is simple enough that it can be solved

on a time scale consistent with large scale MD.

We start from the exponential ansatz in which the exact ground state wavefunction is

given by

|Ψg〉 = eT |Φ0〉 (2–1)

where Φ0 is a reference function, which can be taken as a single determinant and T is the

excitation operator of CC theory. When defined this way, we may write the Schrodinger

equation in terms of the similarity transformed Hamiltonian, H.

H|Φ0〉 = e−T HeT |Φ0〉 = E|Φ0〉. (2–2)

The solution of this equation has the same eigenvalues as the bare Hamiltonian. The total

energy, E, is exact, thus the forces, ∂E∂RA

, are exact as well. In second-quantized form H

may be written as,

H =∑p,q

f pqp†q +

1

2!

∑p,q,r,s

〈pq||rs〉p†q†sr

+1

3!

∑p,q,r,s,t,u

〈pqr||stu〉p†q†r†uts + · · · (2–3)

where we show the inclusion of three- and higher-body interaction terms. To include

the higher-order terms we renormalize with respect to the one- and two-body terms and

obtain a generalized, correlated Fock-like operator, called the transfer Hamiltonian [10, 43]

Th =∑µ,ν

[hµν +∑

λ,σ

Pλσ˜〈pq||rs〉]µ†ν (2–4)

where Pλσ is the single particle density matrix and both hµν and ˜〈pq||rs〉 can be

approximated with a suitable functional of atom-based parameters. We have used the

NDDO form in the current work, but the formalism developed thus far is not limited by

this choice of Fock-like operator.

36

2.2 Computational Details

Parameters were optimized by a quasi-Newton-Raphson optimization algorithm in

tandem with a genetic algorithm to fit Austin Model 1 (AM1) NDDO parameters [27] to

reproduce the forces determined with CC theory. The parameterization was performed

by a combination of the PIKAIA [46] implementation of a genetic algorithm and the

linearized quasi-Newton-Raphson minimization routine of L-BFGS [47].

The reference forces used in this parameterization were obtained with the ACES

II [48] program system. We have used CCSD in a triple zeta basis, with a unrestricted

reference and Hartree-Fock stability following (to ensure that we arrive at the proper

unrestricted spin solution.) All semiempirical AM1 and OM1 calculations were performed

using a modified version of MNDO97 Version 5.0 [30–32, 49].

DFT calculations for the nitromethane (NMT) monomer were performed at the

B3LYP/6-31G(d) level with the HONDOPLUS program [50]. The initial NMT dimer

and trimer geometries were taken from a DFT study using B3LYP/6-31++G** [51]. Due

to the large number of correlated reference calculations needed in modeling our clusters,

we used the Turbomole Version 5 electronic structure package [52, 53] to do all-electron

calculations at the resolution-of-the-identity second order Møller-Plesset (RI-MP2) level

in a TZVP basis. We chose an all-electron model in a large auxiliary basis to ensure an

accurate treatment of correlation energies and geometries that were demonstrated on a

series of compounds representative of atoms in various oxidation states [54].

2.3 Nitrogen-Containing Energetic Materials

The motivation of this study is to develop a Th that can accurately predict the

quantum mechanical behavior of nitrogen-containing energetic materials. We present a

new parameterization for a semiempirical NDDO Hamiltonian which improves upon the

standard AM1 parameter set. This new parameterization reproduces the decomposition

of NMT to the CH3 and NO2 radicals as predicted with CCSD/TZP. We demonstrate

transferability to small NMT clusters. For NMT dimers and trimers this model Th predicts

37

rearrangements similar to previously calculated unimolecular mechanisms. The NMT

trimer undergoes a concerted rearrangement to the methylnitrite trimer when one of the

NMT molecules becomes geometrically strained.

NMT has been the subject of many theoretical and experimental studies due to the

significant role that nitrogen-containing compounds play in the chemistry of explosives,

propellants, and atmospheric pollution [55]. The description of the decomposition of NMT

and its products is crucial to the understanding of the kinetics and dynamics of larger

energetic materials for which NMT is a model. NMT is small enough to allow detailed

investigation of its complex decomposition [56, 57]. Specifically, even for a system of

deceiving simplicity, its decomposition to radicals via simple C-N bond rupture versus the

competing NMT-methylnitrite (MNT) rearrangement has been the subject of much debate

[56–59].

The use of classical potentials and standard semiempirical NDDO methods often

yield incorrect behavior for energies, forces, and geometries, especially in non-equilibrium

cases. Therefore, the use of such techniques to do MD will often lead to incorrect results

when bond breaking and forming are studied. Alternatively, ab initio quantum chemistry

is known to be predictive, in the sense that it can reproduce most quantities adequately

compared to the experimental values in those regions without knowing the result prior

to calculation. However, these methods are too expensive and, therefore, currently

impractical for MD because of the time and disk space requirements.

Previous NDDO parameterizations, such as AM1, can reproduce near-equilibrium

structures for a large number of molecules, yet they often fail for non-equilibrium

geometries. Thus, for the purposes of doing MD simulations, we are willing to sacrifice the

ability to accurately treat a large number of molecules near equilibrium, for the ability to

treat a small class of molecules accurately at all geometries. When modeling combustion

or detonation it is particularly important to have a method that works equally well for

equilibrium and non-equilibrium geometries if the chemistry of bond breaking and forming

38

is to be described. However, this sacrifice means that the new Th parameter set must be

applied judiciously. If used for a class of molecules for which it was not trained, then the

accuracy of calculated properties could be diminished. If the appropriate set of reference

data and a sufficiently complete model are chosen, then the Th will provide an equally

appropriate and complete description of the system.

Among the first to study NMT theoretically were Dewar et al. [56] who used

MINDO/3 and found that the NMT-MNT rearrangement occurred with an energy

barrier of 47.0 kcal/mol and that MNT dissociated to H2CO+HNO with an energy of

32.4 kcal/mol. Hence they proposed that NMT decomposition occurs via rearrangement.

McKee [57] reported the NMT-MNT rearrangement barrier to be 73.5 kcal/mol calculated

at the MP2 level of theory with a 6-31G(d) basis and that MNT dissociated to H2CO+HNO

with an energy of 44.1 kcal/mol. The high barrier height of the rearrangement suggests

that the decomposition pathway is simple bond rupture to NO2 and the CH3 radical.

Hu et al. [59] performed calculations with the G2MP2 approximation and found that

the NMT dissociation occurs via direct C-N bond rupture with an energy of 61.9

kcal/mol. The NMT dissociation is lower than both the NMT-MNT rearrangement

and the NMT-aci-nitromethane rearrangement by 2.7 and 2.1 kcal/mol, respectively.

Nguyen et al. [58] reported that the NMT-MNT rearrangement barrier is 69 kcal/mol

and NMT direct dissociation to CH3 and NO2 radicals is 63 kcal/mol calculated at the

CCSD(T)/cc-pVTZ level using CCSD(T)/cc-pVDZ geometries. Most recently Taube and

Bartlett [60] report the NMT-MNT rearrangement at 70 kcal/mol using ΛCCSD(T).

Hu et al. [59] also performed a detailed study of unimolecular NMT decomposition

and isomerization channels using DFT with G2MP2//B3LYP level of theory in a

6-311++G(2d,2p) basis. Single points were basis set extrapolated along the minimum

energy paths to correct for incompleteness using the G2MP2 method. Their article shows

that from a particular intermediate state (IS2b) one possible reaction path is the rupture

of the N-O bond to give the methoxy radical and nitrous oxide. Hu et al. indicate that on

39

this reaction path no transition state could be located. Results from photodissociation and

thermolysis experiments suggest that one of the primary steps of NMT decomposition is

the elimination of oxygen [58, 61, 62].

However, the reliability of both the B3LYP and, to a lesser extent, CCSD(T) at

non-equilibrium geometries [59] is limited and the same accuracy should not be expected

in regions other than equilibrium.

The work of Li et al. [51] suggests that the three-body interaction energy in bulk

NMT is a critical component in accurately describing the potential energy surface. We

investigated the use of our Th in predicting how an accurate treatment of three-body

effects dictates chemical reactivity.

2.3.1 Nitromethane Monomer

We have developed a set of NDDO parameters to reproduce the force curve along

the intrinsic reaction coordinate for C-N bond rupture and to reproduce the CCSD/TZP

equilibrium geometry. As seen in Table 2-1 the agreement between CCSD and TH-CCSD

is to within a degree for the angles and less than one hundredth of an A for bond lengths.

TH-CCSD is shown in Figure 2-1 to reproduce the original force curve for C-N

bond rupture to within an accuracy of 0.005 Hartree/Bohr. The AM1 parameters were

designed to reproduce equilibrium structures, and thus they have the correct force at

the equilibrium bond length. However, at 0.1 A away from equilibrium the forces are in

considerable error. A recurring feature for AM1 force curves, which is problematic for

MD, is the unphysical repulsive behavior at large internuclear distances. The transfer

Hamiltonian removes this artificial repulsion and gives the qualitatively and quantitatively

correct dissociation.

One way of determining energy differences in a single reference theory, other than

Hartree-Fock, is by integration of the force curve. For AM1 the error in the forces is

reflected in the energy as shown in Figure 2-2. Also shown in Figure 2-2 the DFT solution

40

yields a potential energy surface and force curve which are incorrect at non-equilibrium

geometries.

2.3.2 Nitromethane Dimer

In analogy with how Bukowski et al. [63] have studied the interaction energy of

dimers as a function of rigid monomer separation, we calculate the relative energies of

the NMT dimer as a function of hydrogen bonding distance. The minimum chosen in our

NMT dimer calculations from [51] is stable due to the formation of two hydrogen bonds of

equal length and antiparallel arrangement of the molecular dipoles. In Figure 2-3, we plot

the energies relative to the respective methods minima in ROH .

Our data indicate that in the case of frozen monomer geometries, AM1 overestimates

the repulsive wall at small intermolecular distances. It appears as if the H, C, N, and

O parameters in the core-core repulsion function for AM1 were unable to cancel the

effect from which MNDO has traditionally suffered, namely overestimation of energies

in crowded molecular systems [31, 64]. The TH-CCSD reproduces the interaction forces

quite well for intermolecular distances up to ∼3 A , while AM1 is shown to slightly

overbind. Thereafter the TH-CCSD underbinds slightly in disagreement with RI-MP2 but

in agreement with the OM1 Hamiltonian. The OM1 method has credence as a method

which models the dipole and thus hydrogen bonding interaction properly by correcting the

overlap matrix [32, 65, 66].

Adiabatic C-N bond dissociation for small NMT clusters was investigated starting

from published local minima [51] using the TH-CCSD and AM1. In studying the dimer

interaction, for which one of the monomers is undergoing unrestricted C-N bond rupture,

the TH-CCSD forces predict that a bimolecular process takes place when the stretched

bond is at a distance of 2.42 A (1.6Req). The reaction products are nitrosomethane,

methoxy radical, and nitrogen dioxide radical. Figure 2-4 shows snapshots of this

particular dimer decomposition channel.

41

The TH-CCSD geometry optimization shows the monomer-monomer distance

decreasing from 3.50 A to 2.90 A as the methyl group on monomer one rotates to face an

oxygen on monomer two. As this occurs, the C-N bond on monomer two increases from

1.5 A to 1.6 A and the length of the N-O bond participating in the reaction increases

from 1.3 A to 1.7 A . The AM1 Hamiltonian does not predict any chemistry as a result of

the C-N bond breaking because the forces governing proper dissociation are incorrect, as

shown in the case of the NMT monomer (Figure 2-1). The geometry optimization using

AM1 reveals that the monomers actually separate by 0.45 A and that no methyl rotation

takes place.

The bimolecular dimer reaction predicted by the TH-CCSD is supported by the

unimolecular rearrangement of intermediate IS2b [59] for which no transition state

was located. The similarity is that the reaction products both involve the formation of

methoxy radical via the cleavage of an N-O bond. Unlike the study in [59], our process

is bimolecular, where methoxy radical formed is by NMT undergoing oxygen elimination

or N-O rupture. Contrary to [59], where the methyl group picks up an oxygen from its

own monomer, we observe that the oxygen is picked up by a methyl group on a different

monomer. Our results for oxygen elimination are in agreement with previous studies

[61, 62], which suggest this is the primary process in detonation.

2.3.3 Nitromethane Trimer

The NMT trimer local minimum taken from [51] has a ring structure involving three

hydrogen bonds. Performing the same study done for the dimer on the reference structure

trimer from [51] shows that at a C-N bond distance of 1.94 A or 1.3Req a concerted

rearrangement of NMT to MNT occurs. The decomposition of NMT in trimer thus occurs

at a C-N distance that is 0.5 A less than that in the dimer. This indicates a cooperative

reactivity in clusters of NMT initiated by the vibrational excitation of a C-N bond. The

monomer rearrangement to MNT is widely supported [58, 59] and is believed to be a

key step in the detonation of NMT. The initial intermolecular N-N distances are 4.5 A ,

42

4.6 A , and 4.8 A and angles are 62◦, 58◦, and 60◦ (Figure 2-4). During the geometry

optimization, because one of the monomers has a stretched C-N bond, the system is

unable to reach its ring-like hydrogen-bonded local minimum, leading to ring strain. The

TH-CCSD predicts that, as a result, the one monomer cleaves at a C-N bond distance of

1.8 A and, subsequently, a second monomer dissociates at the same C-N bond distance.

At this point, all the intermolecular N-N bond distances have decreased to 4.3 A , 4.5

A , and 4.6 A ; the angles, however, remain unchanged. Geometry optimization with

AM1 shows that the stretched monomer shifts slightly from the remaining two monomers

allowing one of them to rotate to a lower energy structure where it forms a bifurcated

hydrogen bond each of which is 2.44 A.

We have demonstrated that the TH-CCSD parameter set trained for NMT monomer

transfers to the dimer and trimer calculations. The criteria for transferability used

here regards the extent to which these forces yield products that are in accord with

theoretically supported decomposition mechanisms of NMT [58, 59]. Note that the

training of our Hamiltonian was not tailored to favor any particular rearrangement

mechanism, evident by the fact that the dimer decomposition differs markedly from the

trimer mechanism. Along the adiabatic surface AM1 does not predict any bond breaking

to occur. Therefore, the accuracy of modeling bond breaking and forming processes in

bulk simulations using AM1 is questionable, at best.

We have shown a parameterization of the NDDO Hamiltonian that reproduces proper

dissociation of NMT into radicals, both for the energy and forces. This parameter set

reproduces the reference CCSD/TZP geometries and forces for unrestricted C-N bond

breaking. This NDDO parameterization would appear to potentially offer improved

accuracy in MD simulations for clusters of high-energy materials. We also show that

concerted reactions for both the dimer (bimolecular rearrangement) and trimer (trimolecular

rearrangement) are not generated with an AM1 parameterization, yet are required for

the proper description of NMT clusters. This new transfer Hamiltonian will allow for

43

rapid and accurate MD studies due to the inherent speed of NDDO and the fitting of

the parameters to high-level CC reference forces. In this sense we are only limited by

the reference data and the quality of the model Hamiltonian. We have demonstrated

transferability of monomer parameters to nitromethane dimer and trimer. These forces

predict chemical reactivity in clusters that is not observed in AM1 simulations and, in

the case of the trimer, we see a threefold rearrangement of NMT to MNT giving further

insight into the detonation process.

2.4 Silica and Silicon

The TH depends upon the choice of the form of the Hamiltonian. So far we have

used the NDDO form [67, 68] because of its wide use and prior successes. However, such

methods using standard parameters like AM1 [27], MNDO [30–32] and PM3 [28, 29]

are also known to have significant failings compared to ab initio methods. Our TH, on

the other hand, benefits from the fact that we only require specific parameters [45] for

the systems of interest, for example, SiO2 clusters and H2O in our hydrolytic weakening

studies [69]. These parameters permit all appropriate clustering and bond breaking for

all possible paths for all the components, to allow MD to go where the physics leads.

Furthermore, unlike traditional semiempirical methods, we do not use a minimal basis set,

preferring to use polarization functions on all atoms, and in some cases, diffuse functions.

The form of the Hamiltonian emphasizes two-center interactions (but is not limited to

nearest neighbors as in TB), which allows our parameters to saturate for relatively small

clusters. We check this by doing ab initio calculations on larger clusters involving the

same units to document that the TH is effectively unchanged with respect to system size.

At that point, we have a TH that we can expect to describe extended systems composed

of SiO2, and its solutions properly, including all multi-center effects on the energies and

forces. Only the Hamiltonian is short-range. We also check our TH by applying it to

related systems. We will show that our TH determined from forces for bond rupture in

disiloxane, pyrosilicic acid, and disilane also provide correct structures for several small

44

cluster models for silica and silicon. Moreover, multiple bonds and high coordination Si,

though not typical in silica polymorphs, are described well. Finally, we will show that our

TH performs well using periodic boundary conditions for silica polymorphs.

2.4.1 Silicon Clusters

A method based on rigorous quantum mechanics should be transferable. We define

two types of transferability: first, an appropriate method should apply equally well

to systems of similar size, but with different bonding characteristics, e.g., single- and

double-bonds; and second, a method should perform equally well for small and large

systems, and even infinite systems as modeled by periodic boundary conditions.

Now we will show that the TH-CCSD parameters for an NDDO-type Hamiltonian

trained on a small set of reference molecules are transferable in the first sense. We chose

our training set to be reasonable representatives of the local behavior of extended systems:

disiloxane (H3SiOSiH3), pyrosilicic acid ((OH)3SiOSi(OH)3) and disilane (H3SiSiH3).

The initial parameters used were from MNDO/d [70] because of availability, overall

performance on the training set, and inclusion of d-orbitals on Si.

The fit was based on forces from the CCSD approximation using DZP basis sets.

Primarily the TH-CCSD fit was to the forces along the intrinsic reaction coordinate for

Si-Si bond breaking in disilane and Si-O bond breaking in both disiloxane and pyrosilicic

acid. Further refinement was achieved by fitting to the forces for the Si-O-Si bends

of disiloxane and pyrosilicic acid. The most reliable error function to use in the type

of fitting seems to be the square of the difference between the reference CCSD and

TH-CCSD forces. The fitting procedure itself, though nonlinear, involved iteratively

performing a linear minimization of each parameter, starting with those that most reduced

the error.

To verify the transferability of the parameters one should evaluate properties for

molecules outside the reference set. Verification for bond breaking tests were performed

on Si(OH)4, Si(SiH3)(OH)3, Si(SiH3)2(OH)2, Si(SiH3)3(OH), Si(SiH3)4, H2Si=SiH2,

45

(SiH3)Si≡Si(SiH3), (SiH3)2Si=Si(SiH3)2, and (SiH3)3Si-Si(SiH3)3. Verification testing for

the O-Si-O bend involved calculations on Si(OH)4, Si(SiH3)(OH)3, and Si(SiH3)2(OH)2.

Figures 2-5 and 2-6 show the results relative to DFT (with the B3LYP approximate

exchange-correlation functional) values.

It is apparent that the TH-CCSD parameters outperform many of the commonly used

NDDO methods for a variety of different bonding environments. Furthermore, although

the parameters were designed to reproduce Si-O-Si, Si-O, and Si-Si forces, they transfer

well to equilibrium O-Si-O (Figure 2-7).

2.4.2 Silica Polymorphs

Now we demonstrate transferability of the TH-CCSD parameters to periodic systems.

We choose to do so for crystalline forms of SiO2 first. Note that we need a model that

describes amorphous and crystalline phases equally well; if a method is comparatively

more accurate (or less accurate) for crystalline or amorphous materials then non-physical

restrictions would be imposed. For example, crack tip propagation in silica is attracted to

regions of low potential. Crystalline SiO2 is a good first step because crystalline phases

have been observed at the interface in the otherwise amorphous silica gate oxide layer in

metal-oxide semiconductors (MOS) [71]. Understanding silica and silicon brings us closer

to understanding defects in MOS. In them, phenomena such as current leakage, electrical

breakdown over time, and the thinning trends of the dielectric layer in gate oxides are

strongly influenced by defects [72, 73].

It also should be noted that we focus on training models that perform well primarily

for localized units (i.e., clusters) and secondarily for bulk systems with periodic boundary

conditions. This is because it is much harder to build local phenomena into a model

used primarily for extended systems than it is to construct a model based on local

chemistry that is applicable for small molecules, clusters, and bulk. For example, periodic

calculations with plane waves require exceedingly large numbers of plane waves to describe

oxygen defects in silica [74].

46

Furthermore, there is evidence that any model that is only dependent upon

pair-potentials, (i.e., nearest-neighbor distances) such as standard orthogonal TB schemes,

lacks the ability to describe some fundamental properties of silica. It is difficult to

prove such a claim, as we are able to imagine an expansion of the energy about some

parameter, such as nearest-neighbor distances. However, Stixrude et al. [75] have

shown experimentally, based on the two-fold compression of liquid SiO2, that there is

essentially no change in nearest-neighbor distances though there are significant changes

in medium-range structure. This medium-range structure is best described by ring- and

cluster-statistics and cluster geometry [76], which would be not be incorporated in any

two-body potential.

To avoid the limitations of empirical two-body potentials, one may consider NDDO

or non-orthogonal TB methods discussed above, which include some many-body effects

due to inclusion of the overlap matrix [77]. Especially promising is the SCC-DFTB

approach [7] which includes the largest contributions to long-range Coulomb interactions

while handling charge in a self-consistent way. These methods offer a practical balance of

chemical accuracy and computational efficiency, as evidenced from their recent increase in

popularity.

Figure 2-8 shows the deviations of the TH-CCSD parameterization, AM1, and

MNDO/d NDDO calculated unit cell densities from reference DFT densities. The

reference DFT calculations for all silica polymorphs are from the VASP code using

the gradient-corrected PW91 GGA exchange-correlation functional with 3D periodicity, an

ultrasoft pseudopotential, and a plane wave basis set (395 eV cutoff). A 2x2x2 k-point

grid is employed, which converges total energies to within ±0.05 eV/cell, and the

convergence of self-consistent steps is precise to 10−4 eV/cell. The optimization of cell

parameters and atomic positions is precise to < 10−3 eV A−1.

It is clear that the TH-CCSD parameters trained on small silica clusters are

transferable to several silica polymorphs. It is also interesting to note that not only is

47

quantitative improvement achieved but the error from the PW91 DFT calculations is

systematic and parallels similar improvement found when this new parameterization is

applied to molecular systems. Further refinement of the TH-CCSD parameters must be

made to accurately describe the silicon-silica interface, since the current parameterization

fails to converge to the proper topology.

The application of the transfer Hamiltonian methodology has been presented for

nitrogen containing energetic materials and silicon containing compounds. In both cases

we have demonstrated significant improvement over the underlying semiempirical method

by reparameterizing with a training set of high-quality CC forces for representative

molecules. It is apparent that for focused MD studies that use semiempirical model

Hamiltonians there is an advantage to creating a special set of parameters that is tailored

to the chemical system being studied. However, the transferability of the model is affected

by which molecules are selected for the training set and the PES reference points used.

These models will often break down when applied to systems to which they have not been

parameterized. One route to achieve this level of transferability is to design a simplified

method that does not require parameterization, only controlled approximations, which

leads to the AAT approach to defining a superior Th.

48

Table 2-1. Equilibrium geometry for NMT, in A and degrees

Method RCN RCH RNO AHCN AONC

CCSD/TZP 1.498 1.087 1.219 107 117TH-CCSD 1.504 1.084 1.225 108 118

AM1 1.500 1.119 1.201 107 119

49

-0.1

-0.05

0

0.05

0.1

1 1.5 2 2.5 3 3.5

Forc

e (H

artr

ee/B

ohr)

C-N reaction coordinate (Angstrom)

CCSD/TZP UHFAM1 UHF

TH(CCSD) UHFB3LYP/6-31G* UHF

Figure 2-1. The NMT force curve for C-N rupture

50

0

20

40

60

80

1 1.5 2 2.5 3 3.5

Ene

rgy

(kca

l/mol

)


CCSD/TZP UHFAM1 UHF

TH(CCSD) UHFB3LYP/6-31G* UHF

Figure 2-2. The NMT energy curve for C-N rupture

51

-5

0

5

10

15

1 1.5 2 2.5 3 3.5 4 4.5E(R

OH

)-E

(RO

H-e

q) (

kcal

/mol

)

O-H reaction coordinate (Angstrom)

RIMP2/TZVP RHFAM1 RHFOM1 RHF

TH(CCSD) RHF

Figure 2-3. The NMT dimer energies relative minimum in the intermolecular hydrogenbonding distance

52

A

B

C

D

Figure 2-4. Snapshots of the geometry optimizations for NMT dimer and trimerperformed with the TH-CCSD and AM1. A) NMT dimer treated withTH-CCSD. B) NMT dimer treated with AM1. C) NMT trimer treated withTH-CCSD. D) NMT trimer treated with AM1

53

0

0.02

0.04

0.06

0.08

0.1

TH(CCSD)MNDO/dPM3AM1

Ang

stro

m

Method

Figure 2-5. Average deviation of Si-O stretch

54

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14


Ang

stro

m

Method

Figure 2-6. Average deviation of Si-Si stretch

55

0

1

2

3

4

5

6

7

8


Ang

le

Method

Figure 2-7. Average deviation of O-Si-O angle

56

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9∆

dens

ity (

g/cm

3 )

α-Cris

tobali

te

α-Qua

rtz

β-Cris

tobali

te

β-Qua

rtz

β-Trid

ymite

Coesit

e

TH(CCSD)AM1

MNDO/d

Figure 2-8. Density deviation from PW91 DFT

57

CHAPTER 3IMPETUS FOR A NEW SEMIEMPIRICAL THEORY

Semiempirical method development has a long history dating back to the 1930’s

starting with Huckel theory for describing even-alternant hydrocarbons. In the current

study we have been guided by the developments over the last seventy years in regard to

both what should and should not be approximated in our model. We focus here on the

NDDO and SCC-DFTB models. Both models have been widely-used with reasonable

success. The relative numerical performance of the most recent versions of each model has

been reported recently [2], and extensive comparisons of the formal similarities of each

model have been made [78, 79]. It is important to understand the context in which these

models were designed to fully appreciate the difference in the design philosophy of the

current AAT model. NDDO-based methods were originally constructed to reproduce the

experimental heats of formation for small organic molecules. Since the heat of formation

is defined only for molecules in their equilibrium structures, it is not surprising that

NDDO models are parameterized specifically to reproduce equilibrium properties. For

nonequilibrium structures the accuracy of NDDO methods varies greatly, as we will show.

Similarly, SCC-DFTB has been parameterized to reproduce equilibrium structures.

The goal of the AAT approach is to achieve uniform accuracy of electronic properties,

structures, and energetics over the entire PES to the extent possible for a minimal ba-

sis set effective one-particle theory. Uniform accuracy is required to achieve meaningful

results when performing MD simulations, in which transition states and other nonequilibrium

structures play a critical role.

3.1 Neglect of Diatomic Differential Overlap Based Methods

3.1.1 Does NDDO Approximate HF?

A common misconception about NDDO-based methods is that the NDDO model

Hamiltonian approximates the HF Hamiltonian. This assumption is not the case and it

can be shown that the density dependent contribution to the Fock-like matrix in NDDO

has little to do with the density dependent Fock matrix of HF theory, though they are

58

deceptively similar:

Gσλµν

HF= (µν|λσ)− 1

2(µσ|λν) (3–1)

Gσλµν

NDDO= (µν|λσ)δABδCD − 1

2(µσ|λν)δADδBC (3–2)

where µ, ν, λ, and σ are on centers A, B, C, and D respectively and the terms (µν|λσ)

are subject to the multipole-multipole expansion approximation. It is true that a typical

NDDO-type two-electron integral encountered is often larger in magnitude than a

corresponding non-NDDO-type two-center, three-center, or four-center two-electron

integral. For molecules with more that three heavy atoms in a valence-only minimal basis

set representation, there are significantly more terms involving three- and four-center

two-electron integrals. The combined effect of this large number of multi-center terms

has a significant effect on the density dependent Fock matrix contributions. We consider

a simple example. Shown in Figure 3-1 are the individual contributions to the NMT

density dependent Fock matrix element of the carbon 2s-type orbital and nitrogen 2s-type

orbital in a minimal basis set for the combined three- and four-center, two-center, ab initio

NDDO-type, and full multi-center contributions. Though only a single matrix element

is plotted over this reaction coordinate, the trend is the same for other matrix elements

and other molecules. The three- and four-center contributions are nearly equal and oppo-

site to the two-center contributions. This cancellation is a universal trend and holds at

all energy scales. And, while it could be said that the NDDO-type class of two-electron

integrals approximates the full set of two-center two-electron integrals, it is clear that

the NDDO-type integrals alone bear little resemblance to the full two-electron integral

contribution. Instead, it is the delicate balance among all multi-center two-electron

integrals that is critical to obtaining the proper Fock matrix contribution.

It has been known since studies were first performed on the NDDO approximation

that a parameterized set of NDDO integrals perform better that the ab initio NDDO

integrals. However, what has been less clear is the lack of correspondence between the

NDDO approximation and the quantity it is supposedly approximating. Instead it seems

59

that the adjustable parameters involved in the multipole-multipole expansion are being fit

to the full density dependent part of the Fock matrix, and not the integrals themselves.

Furthermore, this fit is non-linear because of the two-electron exchange integrals, which

means that improving the NDDO approximation by modifying the integrals or trying to

add three- or four-center contributions becomes increasingly convoluted. For the AAT

model, as we will show, all integral approximations can be systematically improved and

do not involve any adjustable parameters. By avoiding the need to parameterize integral

contributions we can retain uniform accuracy of the PES because no structural bias has

been introduced.

3.1.2 Artificial Repulsive Bump

There are many chemically interesting cases in which existing semiempirical methods

seem to work quite well. For example Jorgensen et al. [80] show that their PDDG/PM3

reparameterization of PM3 yields heats of formation with a mean absolute error of 3.2

kcal/mol for a test set of 622 molecules. However, there is little evidence to suggest that

the same performance should be expected for chemical structures that are significantly

distorted from equilibrium structures. Even for simple activation barriers, the error is

typically 10-15 kcal/mol [80]. For the determination of rate constants and in molecular

dynamics studies the PES should be described within 1-2 kcal/mol. For example, since at

300K an error of 1 kcal/mol in the activation barrier will result in an order of magnitude

change in the rate constant, clearly a 10-15 kcal/mol error will not yield meaningful

rate constants. The situation is especially bad in cases in which direct bond fission

is treated with NDDO-based model Hamiltonians. In such cases there is an artificial

repulsive region near 1.2Req to 1.6Req, as is shown in Figure 3-2 for NMT UHF direct

bond fission. Generally, in cases where it is known that there is barrierless dissociation,

one would recognize such features as an error in the method and subsequent results would

be questioned. For more complicated chemical processes, for example transition states or

60

concerted reaction mechanisms, such dramatic errors are much more difficult to recognize,

though it is precisely those regions that are often of primary interest.

3.1.3 Separation of Electronic and Nuclear Energy Contributions

In ab initio theory (without consideration of non-BO effects) inclusion of the

nuclear-nuclear repulsion term (VNN) is a trivial matter when evaluating total energies

and associated forces. While VNN has no part in the determination of purely electronic

properties (such as the electronic density, electronic spectra, ionization potentials,

or electron affinities) it is critical in the determination of molecular structure, the

identification of transition states, and their associated vibration spectra. In simplified

theories the role of VNN is obfuscated because the electronic and nuclear repulsion effects

are not separated.

The core-core repulsion term in NDDO Hamiltonians involves parameters that are

designed for energetics at equilibrium. For such a specific set of properties the difference

of two large numbers, the attractive (negative) electronic energy and the repulsive

(positive) nuclear-nuclear repulsion energy, is easier to model than each separately. In

our opinion, combining such unrelated terms is the origin of the long-standing contention

that electronic properties and total energy properties cannot be described within the same

set of parameters. Also, once the core-core terms are parameterized, the correspondence

between rigorous theory and semiempirical theory is lost. To solve this problem, we need

to search for a better form for our model Hamiltonian that does not require that VNN be

combined with terms that are purely electronic.

3.1.4 Total Energy Expression

Here we illustrate the origin of the combined electronic and nuclear energy contributions

of the total energy expression, the core-core term in NDDO methods, so that we may

avoid this pitfall in the development of AAT. Consider the case in which the exact

KS-DFT exchange-correlation energy functional and corresponding potential are known,

and further are represented in a complete basis set. Then our effective one-particle

61

Hamiltonian will yield a set of orbitals which yield the exact density. The exact energy is

given by the KS-DFT total energy expression, Equation 1–27.

Now, if we expect that the NDDO model Hamiltonian will generate the correct

electronic density, which is a requirement if we want any of the first-order electronic

properties of our system, then we can relate the NDDO and KS-DFT energy expressions.

Making the temporary assumption that fNDDO ≈ fKS, and consequently ρNDDO ≈ ρKS,

then the total energy difference is

ENDDO − EKS =∑

A,B>A

ENDDOAB (RAB)− Exc[ρ] +

1

2

∫vxc(r)ρ(r)dr −

∑A,B>A

ZAZB

RAB

(3–3)

Now it is clear that if the NDDO one-particle Hamiltonian were to yield the correct

electronic density then the two-center core-core term (ENDDOAB ) must account for not only

the simple nuclear-nuclear repulsion term (ZAZB

RAB), but also some pair-based contribution,

dependent only upon RAB, of the the complicated multi-center exchange-correlation

contribution (Exc[ρ] − 12

∫vxc(r)ρ(r)). The motivation for approximating this last term

is obvious: it drastically simplifies and avoids a complicated and expensive step in the

determination of the total energy. However, there is no reason to expect a priori that this

approximation would be accurate. For our improved model Hamiltonian we make a special

effort to ensure that all approximations are controllable and systematically verifiable.

3.2 Self-Consistent Charge Density-Functional Tight-Binding

3.2.1 Self-Consistent Charge Density-Functional Tight-Binding in an AOFramework

Equations 3–4 through 3–8 are the working equations for SCC-DFTB which come

directly from Elstner’s original paper [7]:

∆qA = qA − q0A (3–4)

qA =1

2

occ∑i

ni

∑µ∈A

atoms∑ν

(c∗µicνiSµν + c∗νicµiSνµ) (3–5)

62

M∑ν

cνi(HSCCµν − εiSµν) = 0 (3–6)

HSCCµν = 〈χµ|H0|χν〉+

1

2Sµν

atoms∑

ξ

(γAξ + γBξ)∆qξ

= H0µν + H1

µν (3–7)

ETB2 =

occ∑i

〈φi|H0|φi〉+1

2

atoms∑A,B

γAB∆qA∆qB +∑

A6=B

Erep(RAB) (3–8)

and the corresponding gradients are

~Fα = −occ∑i

ni

∑µν

cµicνi[∂H0

µν

∂α− (εi −

H1µν

Sµν

)∂Sµν

∂α]−∆qα

N∑

ξ

∂γαξ

∂α∆qξ −

∂∑

A 6=B Erep(RAB)

∂α

(3–9)

we have used µ ∈ A and ν ∈ B.

To facilitate comparison of the SCC-DFTB model Hamiltonian to other methods

we force it into a structure that is comprised of a density independent part H and a

density dependent part G. Using Mulliken population analysis to determine the net charge

associated with atom A yields

∆qA = −q0A +

∑µ∈A,ν

PµνSνµ (3–10)

where the occupation numbers ni = 1 and the MO coefficients are real.

F SCC−DFTBµν = H0

µν + H1µν

= H0µν −

1

2Sµν

atoms∑

ξ

(γAξ + γBξ)q0ξ +

1

2Sµν

atoms∑

ξ

∑

λ∈ξ,σ

(γAξ + γBξ)PλσSσλ

= (HSCC)µν +∑

λσ

Pλσ(GSCC)σλµν (3–11)

we have introduced the notation that

(HSCC)µν = H0µν −

1

2Sµν

atoms∑

ξ

(γAξ + γBξ)q0ξ (3–12)

63

(GSCC)σλµν =

1

2SµνSσλ(γAξ + γBξ) (3–13)

Note that unlike HF or NDDO, in SCC-DFTB it appears that it is not generally true that

Gσλµν = Gµν

σλ, as λ ∈ ξ and σ is unrestricted. If SCC-DFTB were to have this property, that

would be useful both for the calculation of forces and for comparison among all effective

one-particle methods we consider. To enforce this similarity we make the following

observation:

∑

λσ

Pλσ[1

2SµνSσλ(γAC + γBC)] =

∑

λσ

Pλσ[1

4SµνSσλ(γAC + γBC + γAD + γBD)] (3–14)

where µ, ν, λ, and σ are on A, B, C, and D respectively, and our working equation for the

density dependent contribution in SCC-DFTB is

(GSCC)σλµν =

1

4SµνSσλ(γAC + γBC + γAD + γBD) (3–15)

Equation 3–15 bears a similarity to the Mulliken approximation to Coulomb repulsion, we

will address this in further detail later.

Now we revisit the total energy expression. Using our new definition of the density

dependent and independent terms in the SCC-DFTB Fock-like matrix.

ESCC =1

2

∑µν

Pνµ[2HSCCµν +

∑

λσ

Pλσ(GSCC)σλµν ] +

∑

A6=B

(1

2γABq0

Aq0B + ESCC

rep (RAB)) (3–16)

This energy expression closely resembles other one-particle Hamiltonians, yet is equal to

Equation 3–8. Similarly, after significant manipulation the gradient expression can be

shown to be

~Fα =1

2

∑µν

Pνµ[2∂HSCC

µν

∂α+

∑

λσ

Pλσ

∂(GSCC)σλµν

∂α]

−∑µν

∂Sµν

∂α

N∑i

cµicνiεi +∂

∂α

∑

A6=B

(1

2γABq0

Aq0B + ESCC

rep (RAB)) (3–17)

which again is equal to the original expression, Equation 3–9, but allows a more

systematic comparison with other methods.

64

3.2.2 Mulliken Approximation Connection to SCC-DFTB

To facilitate understanding of the density-dependent term in SCC-DFTB in relation

to concepts in wavefunction theory we must recognize that the term ∆qA is the charge

fluctuation on atom A from some reference charge for that atom-type (q0A), with the

charge on atom A being defined by the Mulliken population analysis. In the expanded

representation in terms of the AO density matrix (P) the additional term to the density

dependent part of the Fock-like matrix is given by Equation 3–13, which has been shown

to be equivalent to Equation 3–15, though the latter has a form that some may recognize

as the Mulliken expansion for two-electron integrals,

(µν|λσ) ≈ 1

4SµνSσλ(γAC + γBC + γAD + γBD) (3–18)

The origin of the Mulliken expansion and the further extension of the basic approximation

to the Rudenberg approximation will be explored later. For now it is sufficient that

we discuss the numerical properties of this approximation and what advantages and

limitations it entails.

It is perhaps easiest to consider the nature of the Mulliken approximation by way of

a simple example. For an NDDO-type two-center two-electron integral with normalized

orbitals it follows directly from Equation 3–18

γAB ≈ (µAµA|λBλB) (3–19)

There have been many studies on the various forms for γ, most notably by Klopman [39],

Ohno [40], and Nishimoto-Mataga [41]. The particular form within SCC-DFTB model

is the Coulomb repulsion between two 1s Slater type orbitals with an added condition

that as the separation between the two centers approaches zero γAB → UA where UA is

the Hubbard parameter, which is related to chemical hardness, which in turn is related

to the difference between the first IP and EA of that atom. The particular form of the

γAB is only as significant as the underlying approximation in which it is used. In this

65

case the Mulliken approximation ought to be accurate. The Mulliken approximation in

SCC-DFTB essentially converts three- and four-center two-electron integral contributions

to the Coulomb repulsion energy term into overlap-weighted two-center repulsive terms.

If we apply the Mulliken approximation in standard HF and use the ab initio (µµ|λλ)

two-center integrals, the behavior is quite unphysical. Shown in Figure 3-4 are errors

from the full HF when the Mulliken and Rudenberg approximations are applied to the

two classes of three-center terms, (AA|BC) and (AB|AC), and the four-center terms in

NMT over the carbon-nitrogen reaction coordinate. The error introduced by the Mulliken

approximation is drastic, particularly for the (AA|BC) three-center term, for which

the error is over 1000 kcal/mol. Given that the only explicit density dependence of the

Fock-like matrix in SCC-DFTB is introduced by the Mulliken term, and yet the relative

energetics of SCC-DFTB in practice do not introduce such errors, then some density

independent part of the model must be accounting for the difference. We will avoid such

uncontrolled approximations in the AAT approach.

3.2.3 SCC-DFTB Repulsive Energy

The SCC-DFTB total energy expression relies on the original TB form for the total

energy, in which a sum of repulsive atom-pair potentials is used. An error is introduced

because of this reliance on atom-pair potentials to describe quantities that are inherently

density dependent. Under an ideal parameterization, using the same argument as

used earlier for NDDO, where we assume the effective one-particle model SCC-DFTB

Hamiltonian returns the correct density, the error in the SCC-DFTB energy to the exact

form is

ESCC − EKS =∑

A,B>A

ESCCAB (RAB)− Exc[ρ] +

1

2

∫vxc(r)ρ(r)dr −

∑A,B>A

ZAZB

RAB

(3–20)

The numerical behavior of this error is shown in Figure 3-3. Clearly, this repulsive

potential is carrying the burden of a portion of the electronic energy. This is not by

accident; the value of a function (in this case the total energy) that results from the

66

cancellation of two unrelated large functions (the electronic and nuclear repulsion energies)

is easier to parameterize against than the values of the larger component functions.

Alternatively, if the parameterization is against the larger functions there is a risk that

any error introduced in the parameterization will be magnified when they are subsequently

combined. If only the total energies or heats of formation (as in NDDO), and hence

only some combined function, are of interest then such an average approach would be

suitable. However, for our purposes we are interested in two separate and distinct sets of

information, the electronic properties and the total energy. Because of this added demand

on our semiempirical method we must generate the individual components of the total

energy expression with equal accuracy. We recognize three distinct components: first-order

electronic energy, residual electronic energy, and nuclear-nuclear repulsion. The first-order

electronic component has been the focus of most semiempirical method development, and

yields, in addition to the first-order electronic energy, the first-order electronic properties:

IPs, EAs, dipole, etc. The residual energy term presents a unique challenge, though

we will demonstrate how it can be included in a computationally efficient way. The

nuclear-nuclear repulsion is trivial to calculate and the only complication arises when

screened charges must be used if we use a valence-only Hamiltonian.

3.3 Systematic Comparison of HF, NDDO, DFTB, SCC-DFTB, KS-DFT in aSingle Framework

One of the goals of this research is to bring many seemingly disparate approximate

methods under a single common framework. For now, we will not deal explicitly with

so-called post-HF methods that include correlation based upon a reference single

determinant (because of the required computational effort). Instead, we are trying to push

the limits of what can be done within a one-particle theory. The problem we are solving is

the set of Hartree-Fock-Roothaan matrix equations, FC = SCε, for a general one-particle

operator f . We use the notation that F = H + G, where H is the density independent

contribution and G is the density dependent contribution to F. The spin-free one-particle

67

density matrix P is defined by a set of coefficients C for the real spin-unrestricted case,

Pµν =N∑i

(CαµiC

ανi + Cβ

µiCβνi) (3–21)

This dependence allows us to use a self-consistent field (SCF) procedure: determine the

Fock-like matrix from a guess density and determine a new set of coefficients and the

corresponding density, repeat until the density output is within some threshold of the

input density. Alternatively, FPS = PSF and 〈a|heff |i〉εi−εa

= 0 would be good measures of

convergence.

Upon convergence of the SCF equations we have a set of AO coefficients that define

the set of MOs

φi =∑

µ

Cµiχµ, (3–22)

a reference single determinant wavefunction is then generated as an antisymmetrized

product of the MOs,

Φ0 = A{φ1(1)φ2(2) . . . φn(n)} (3–23)

It is useful at this point to define the first-order electronic energy given a zeroth-order

wavefunction, |Φ0〉. The exact electronic Hamiltonian is

H =∑

i

h(i) +∑i,j>i

1

rij

, (3–24)

where the core Hamiltonian is

h(1) = −1

2∇2

1 −∑

A

ZA

r1A

(3–25)

and in its matrix representation the core Hamiltonian is

Hcoreµν =

∫χµ(r1)h(r1)χν(r1)dr1 (3–26)

68

The electronic energy correct through first order is

E = 〈Φ0|H|Φ0〉

= Tr[PHcore] +1

2

∑i,j

〈ij||ij〉 (3–27)

Note that the first-order electronic energy is exactly the same as the HF energy expression

and furthermore has no explicit dependence on the particular form of the one-particle

operator f .

However, the goal is not the first-order electronic energy but the exact electronic

energy. We can consider the CC route, where we know that for a single determinant the

exact electronic energy is (with any choice of orbitals)

ECCelec = E +

∑

i<j,a<b

〈ij||ab〉(tabij + tai t

bj − taj t

ai ) +

∑i,a

〈i|f |a〉tai (3–28)

The CC energy expression provides great insight, but the explicit calculation of tabij and tai

amplitudes requires much more computational effort than we are able to expend for a fast

semiempirical model. A more natural environment for a semiempirical methods is given

by KS-DFT. Assuming that we have the exact exchange-correlation energy functional, the

exact KS-DFT energy is (for KS-DFT orbitals)

EKSelec = E +

1

2

∑i,j

〈ij|ji〉+ Exc[ρ]

= Tr[PHcore] +1

2

∑i,j

〈ij|ij〉+ Exc[ρ] (3–29)

There is another form of the KS-DFT energy expression that is more useful for the

current development, though it requires that we use a slightly more specific form for

our one-particle operator. If the target is a one-particle Hamiltonian that returns the

correct electronic density, and hence the correct first-order electronic properties, then the

ideal one-particle operator is indeed the KS-DFT one-particle operator. And, if only the

total operator and not its individual components, i.e., −12∇2, νext(r), νJ(r), and νxc(r), is

69

pertinent then the first-order energy expression given by Equation 3–29 would be suitable.

By separating terms that are dependent on the density (G) we can write

EKSelec =

1

2Tr[P(2HKS + GKS)]− 1

2Tr[PGKS

xc ] + Exc[ρ] (3–30)

where

(GKSxc )µν = 〈µ|vxc(r)|ν〉 (3–31)

and F = H+G. Equation 3–30 is a central theme of our work, as it gives the unambiguous

definition of the exact electronic energy. Tied to it are the one-particle KS equations that

define the orbitals, the density and associated first-order properties. In this way, the

problem for the total energy and gradients is completely specified by

• F that generates the ground state one-particle density matrix P.

• F that can be decomposed into density independent and (H) and dependent (G)components.

• The exchange-correlation energy functional.

3.3.1 General Energy Expression

We now choose a general electronic energy expression. Including the nuclear-nuclear

repulsion we have the total energy expression,

E =1

2Tr[P(2H + G)] + Eresidual[P] +

∑A,B>A

ZAZB

RAB

(3–32)

We will bring HF, NDDO, DFTB, SCC-DFTB and KS-DFT into this framework, to make

it clear what the three distinct components are:

• Density independent H .

• Density dependent G.

• Residual electronic energy Eresidual[P].

If a method is to give the correct electronic density, and we can know the exact

KS-DFT HKS and GKS, then the model components for H and G absolutely must

70

correspond to them. There is no flexibility with this interpretation. In addition, if we

have the F that yields the correct density then to get the exact total energy, Eresidual[P]

must be EKSresidual[P]. Clearly, we are dealing with approximate models, though instead of

defining the target or reference as being a single number, such as an experimental heat

of formation, we instead have three separate and distinct functional forms, for which the

exact forms are, in principle, known.

3.3.2 Density Independent Fock-Like Matrix Contribution

This component generally accounts for nuclear-electron attraction and the electron

kinetic contribution. The following definitions are based on µ ∈ A and ν ∈ B.

• HF:

HHFµν = 〈µ| − 1

2∇2

1 −∑

A

ZA

r1A

|ν〉 (3–33)

• NDDO:

Hµν =

{Uµµ −

∑B 6=A ZB(µµ|BB) if µ = ν

12Sµν(βµ + βν) if µ 6= ν

(3–34)

• DFTB:

HDFTBµν =

1

2H0

µν (3–35)

• SCC-DFTB:

HSCC−DFTBµν =

1

2H0

µν −1

2Sµν

N∑

ξ

(γAξ + γBξ)q0ξ (3–36)

• KS-DFT:

HKSµν = 〈µ| − 1

2∇2 −

∑A

ZA

r1A

|ν〉 (3–37)

The only density independent contribution to the ideal Fock-like matrix (the Fock-like

matrix which returns the correct one-particle ground state density) is the exact core

Hamiltonian, Hcore. The biggest bottleneck for computing contributions for Hcore is the

three-center one-electron integrals, which scale as n2orbitalsNatoms. Still, they represent

a small computational effort for large systems when compared to the n3orbitals terms

71

for the three-center two-electron integrals. Though there is some cancellation, the

possibility of complete cancellation among the three-center one-electron attraction and

two-electron repulsion integrals is limited. All three-center one-electron terms appear only

in off-diagonal blocks of the Fock-like matrix and are small compared to the dominant

three-center two-electron terms which appear primarily in the diagonal blocks.

3.3.3 Density Dependent Fock-Like Matrix Contribution

This component generally accounts for electron-electron contributions.

• HF:

GHFµν =

∑

λσ

Pλσ((µν|λσ)− 1

2(µλ|νσ)) (3–38)

• NDDO:

GNDDOµν =

∑

λσ

Pλσ((µν|λσ)δABδCD − 1

2(µλ|νσ)δADδBC) (3–39)

• DFTB:GDFTB

µν = 0 (3–40)

• SCC-DFTB:

GSCC−DFTBµν =

∑

λσ

Pλσ[1

4SµνSσλ(γAC + γBC + γAD + γBD)] (3–41)

• KS-DFT:GKS

µν = 〈µ|vJ [P ](r) + vxc[P ](r)|ν〉 (3–42)

The density dependent term of the ideal Fock-like matrix is GKS. This choice

follows directly from the chosen form for the density independent component and the

requirement that we reproduce the exact electronic density. The density dependence of

the Fock-like matrix accounts for the complex interactions among electrons. This term

is dominated by Coulombic repulsion. The second most important term is generally

called exchange. The third component is correlation, and can be considered to be the

highly complex way in which electrons interact with one another, i.e., their motion is

72

correlated. For large systems, the density dependent part of a Fock-like matrix is by

far the most computationally costly. Unlike the density independent part, this term

must be recalculated for every iteration of the SCF procedure. In addition, the terms

involved here are more computationally intensive, e.g., the four-center integrals and the

numeric integration required for the full treatment of vxc. Generally, reduction in the

computational cost of the density dependent term translates into an equivalent reduction

in cost for the entire calculation. Therefore, the bulk of our attention is on systematic

approximations that can be applied that simultaneously reduce the computational cost but

maintain a desired accuracy.

3.3.4 Residual Electronic Energy

This term places no restriction on the Fock-like matrix and hence should have no

direct effect on the electronic density. However, its accurate determination is critical for

high-quality electronic energies and consequently total energies and structures.

• HF:EHF

residual = 0 (3–43)

• NDDO:

ENDDOresidual =

∑A,B>A

(Ecore−coreAB (RAB)− ZAZB

RAB

) (3–44)

• DFTB:

EDFTBresidual =

∑A,B>A

(EDFTBAB (RAB)− ZAZB

RAB

) (3–45)

• SCC-DFTB:

ESCC−DFTBresidual =

∑A,B>A

(γABq0Aq0

B + ESCCAB (RAB)− ZAZB

RAB

) (3–46)

• KS-DFT:

EKSresidual = −1

2Tr[PGKS

xc ] + Exc[ρ] (3–47)

73

Because we insist that an approximate Fock-like matrix reproduce the KS Fock-like

matrix, and thereby define the first-order electronic energy, the residual electronic energy

must be EKSresidual. The residual electronic energy and its combination with the unrelated

nuclear-nuclear repulsion term in SE methods is notorious. We have attempted to include

this term in a simple and computationally efficient way in our method. If our Fock-like

operator included exact (non-local) exchange then this term would introduce only the

residual correlation energy contribution. Although the correlation energy is a small

component (< 1%) of the total energy, its accurate inclusion is required to obtain chemical

accuracy.

3.3.5 General Gradient Expression

One of the overarching objectives of our work is to create a model Hamiltonian

to drive large-scale MD. Achieving this goal requires an efficient method for obtaining

gradients. Given the general energy expression provided by Equation 3–30, the gradient is

[81]

dE

dα= −2

∂Sµν

∂α

∑i

CµiCνiεa + Pνµ[∂Hµν

∂α+

1

2

∑

λσ

Pσλ∂Gµν

λσ

∂α] +

∂Eresidual

∂α(3–48)

which is based on∑

i Cνi(Fµν − εiSµν) = 0 and orthonormality. Note that there is no

explicit dependence on ∂P∂α

. Equation 3–48 only holds if terms in G are linear in the

density

Gµν =∑

λσ

PσλGµνλσ (3–49)

The derivatives of Eresidual, which are not present in HF, are simple for those

methods that only depend on a sum of atom-pair terms, such as NDDO, DFTB, and

SCC-DFTB. In KS-DFT additional numerical integration is required for the perturbed

density. Potential approaches for reducing the computational cost associated with the

derivatives in KS-DFT will be discussed in further detail later.

74

3.3.6 Requirements for an Accurate Simplified Method

We have discussed the features of a general simplified method aimed at large systems

that can be used to reliably obtain electronic properties as well as energetics. It is clear

that the framework is essentially that of KS-DFT, though we have structured the basic

equations in a form amenable to further approximation. The two essential elements for the

determination of electronic structure are the terms in the Fock-like matrix that are either

completely independent (H) or dependent (G) on the electronic density. The final element,

which is needed to acquire accurate energetics and structures, is the residual electronic

energy term. Furthermore, we have well-defined exact limits for each of these terms. When

we discuss the AAT approach we have unambiguous connections to DFT and, from ab

initio DFT, an analogous connection to WFT and the transfer Hamiltonian.

75

-6

-4

-2

0

2

4

6

1 1.5 2 2.5 3

Ene

rgy

(Har

tree

)


G3,4-C

GFull

G2-C

GNDDO

Figure 3-1. Multi-center contribution to the C2sN2s matrix element of NMT.

76

-10

0

10

20

30

40

50

1 1.5 2 2.5 3

Ene

rgy

(kca

l/mol

)


AM1 UHF

Figure 3-2. The AM1 artificial repulsive bump for NMT direct bond fission.

77

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

1 1.5 2 2.5 3

Ene

rgy

(Har

tree

)

C-C reaction coordinate (Angstrom)

EtotErep

Eelec

Figure 3-3. The SCC-DFTB total energy breakdown.

78

A

0

200

400

600

800

1000

1200

1 1.5 2 2.5 3 3.5

Err

or f

rom

HF

(kca

l/mol

)


RudenbergMulliken

B

10

20

30

40

50

60

70

80

90

100

1 1.5 2 2.5 3 3.5

Err

or f

rom

HF

(kca

l/mol

)


RudenbergMulliken

C

-2 0 2 4 6 8

10 12 14 16 18 20

1 1.5 2 2.5 3 3.5

Err

or f

rom

HF

(kca

l/mol

)


RudenbergMulliken

Figure 3-4. Energies from Rudenberg and Mulliken approximations. A) Three-center(AABC). B) Three-center (ABAC). C) Four-center (ABCD).

79

CHAPTER 4ADAPTED AB INITIO THEORY

We have specified the target framework of our simplified quantum mechanical

method. The task now is to demonstrate simple well-controlled approximations that

significantly reduce computational effort while maintaining a desired level of accuracy. To

this end, a primary focus is on the density dependent component of the Fock-like matrix,

as it is the most computationally intensive part of the SCF procedure. Furthermore, the

residual electronic energy contribution is examined because we have shown that its proper

treatment is essential for a SM to achieve uniform accuracy of electronic structure and

energetics.

Our strategy is to start with a reference theoretical method that is reasonably

well-behaved, in this case KS-DFT. Though there is much debate about the best DFT

energy functional, we choose a hybrid DFT functional, specifically the widely-used

B3LYP, because it includes a portion of exact exchange which adds a substantial degree

of flexibility to our simplified method. To simplify the computational procedure we

then explore various approximations and verify their accuracy. It may be too much to

demand a simplified method that is systematically improvable (in the CC wavefunction

theory sense), however, every approximation should be systematically verifiable. We

will show that by making a couple of well-controlled approximations we are able to

reproduce the topographical features of a hybrid DFT PES with a substantial reduction in

computational effort.

4.1 Two-Electron Integrals

Explicit calculation of two-electron integrals is a computational bottleneck in DFT

because the number of integrals scales as n4orbitals. For the Coulomb integrals this can

be reduced to n3orbitals by fitting the density prior to the evaluation of the integrals [82],

but this cannot be done for the exact exchange, leaving n4orbitals dependence. Though

most quantum chemistry programs employ screening techniques to reduce the number

of two-electron integrals calculated, they remain a significant cost, especially when exact

80

exchange is included. There are two ways to lower the computational cost associated with

two-electron integrals: first, the number of integrals can be lowered by applying screening

methods; second, the cost associated with the evaluation of each individual integral can

be reduced. Several techniques for reducing the number of integrals have been successfully

applied. They generally involve a procedure that recognizes a threshold for significant

contributions to the Fock matrix. These include fast matrix multipole (FMM) [83–85], the

linear exact exchange [86], and quantum chemical tree code (QCTC) [87]. These methods

still rely on the explicit calculation of integrals using standard techniques for Cartesian

Gaussian-type functions. For large systems the most numerous class of integrals are the

four-center integrals. Furthermore, four-center integrals are not only the largest class of

integrals, but also the most computational involved integrals. According to Gill et al. [88]

a four-center integral is typically an order of magnitude more computationally demanding

than a two-center integral. Our focus here is on systematic approximations that can be

made to these ab initio multi-center integrals that will reduce the computational effort

associated with calculating these terms while retaining acceptable accuracy.

The cornerstone of our approach is based on ideas originally presented by Rudenberg

in 1951 [5]. The underlying approximation is to expand an orbital on center A in terms of

a set of auxiliary orbitals located on center B:

φµA(1) =

∞∑

k=1

SkB , µAφkB

(1) (4–1)

where

SkB , µA= 〈kB|µA〉 (4–2)

If the auxiliary orbitals on center B are complete and orthogonal then the expansion is

exact. By repeated application of Equation 4–1 the number of centers that are involved in

three- and four-center two-electron integrals can be reduced. The possible combinations

of this type of expansion has been explored and enumerated by Koch et al. [89], though

without comparing the numerical accuracy of such approximations. We will show the

81

numerical precision of the Rudenberg approximation in an effort to avoid the explicit

calculation of the numerous four-center integrals. By reducing the cost of each four-center

integral we improve the most computationally expensive step of building the Fock matrix

for large systems.

First, consider the effect of the orbital expansion on a two-center product of two

orbitals with respect to the same electron coordinate,

φµA(1)φνB

(1) =1

2

∞∑

k=1

[SµAkBφkB

(1)φνB(1) + SkAνB

φµA(1)φkA

(1)] (4–3)

Since a general multi-center two-electron integral is given as

〈µAλC |νBσD〉 = (µAνB|λCσD) =

∫ ∫φµA

(1)φνB(1)φλC

(2)φσD(2)

r12

d1d2 (4–4)

substituting Equation 4–3 into Equation 4–4 for both the AB and CD orbital products

yields the following expression, which is the Rudenberg two-electron integral form,

(µAνB|λCσD) =1

4

∞∑

k=1

∞∑j=1

[SµAkBSλCjD

(kBνB|jDσD) + SµAkBSjCσD

(kBνB|λCjC)

+SkAνBSλCjD

(µAkA|jDσD) + SkAνBSjCσD

(µAkA|λCjC)]

(4–5)

In the limit that the sums over auxiliary orbitals k and j are complete this expansion

is exact. In practice, the expansion is approximated by limiting the orbitals over which

we are expanding to those included in the AO basis functions, though using a separate

auxiliary set of orbitals may also be suitable.

The Rudenberg two-electron integral expression (Equation 4–5), is more general than

the well-known Mulliken two-electron integral approximation. The Mulliken approximation

is equivalent to restricting the sum in Equation 4–1 to be only over orbitals of the same

type,

φµA(1)φνB

(1) =1

2SµAνB

[φµA(1)φµA

(1) + φνB(1)φνB

(1)] (4–6)

82

which captures a large portion of the Rudenberg expansion and when applied to

two-electron integrals yields,

(µAνB|λCσD) =1

4SµAνB

SλCσD[(µAµA|λCλC) + (µAµA|σDσD)

+(νBνB|λCλC) + (νBνB|σDσD)].

(4–7)

Note the similarity of this expression to that which appears in SCC-DFTB, Equation

3–18, in which

γAB ≈ (µAµA|νBνB) (4–8)

though in SCC-DFTB the orbitals µ and ν are restricted to be s-type functions.

We consider here all eight different approximations that result from the application

of Equation 4–1 to three- and four-center two-electron integrals. The two classes of

three-center terms are of the form (µAνA|λBσC) and (µAνB|λAσC), which will be referred

to as types A and B respectively. To facilitate our discussion we introduce the numbering

convention in Table 4-1.

The Rudenberg and Mulliken formulas can be applied to either class of three-center or

the four-center integrals (approximations I-VI) using Equations 4–5 and 4–7, respectively.

For the partial approximations to four-center terms (VII and VIII), partial means that the

corresponding approximation has been applied only once. The expansion for these partial

approximations is then in terms of three-center two-electron integrals (instead of only

two-center terms). For the Mulliken partial (VII) we have

(µAνB|λCσD) =1

4{SµAνB

[(µAµA|λCσD) + (νBνB|λCσD)]

+SλCσD[(µAνB|λCλC) + (µAνB|σDσD)]}.

(4–9)

and the Rudenberg partial (VIII) is

(µAνB|λCσD) =1

4

∞∑

k=1

[SµAkB(kBνB|λCσD) + SkAνB

(µAkA|λCσD)

+SλCkD(µAνB|kDσD) + SkCσD

(µAνB|λCkC)]

(4–10)

83

With all eight of the two-electron integral approximations now defined, the accuracy

of each will be considered. Since one requirement of our approach is that it reproduce the

PES that would arise from a full ab initio treatment, we require that our approximations

be uniformly accurate over the entire PES. This is in contrast with many commonly

used semiempirical methods, in which the focus is typically on accuracy only about an

equilibrium structure. However, uniform accuracy is critical to getting accurate forces

to drive large-scale MD and to aid in locating transition states for rate constant studies.

Unfortunately there is no simple measure for uniform accuracy, due to the complexity of

the 3N − 6 dimensional PES, so our analysis will be statistical.

Note that the total electronic energy is the sum of all the one-, two-, three-, and

four-center components. And, each of these components individually bear little similarity

to the total electronic energy, again consider each contribution to any particular matrix

element shown in Figure 3-1, instead the total energy is the result of the delicate

cancellation among these terms. In addition, since only the non-parallelity errors are

of importance in reconstructing the PES, the energetic contribution arising from each

individual class of terms may be shifted by a constant value independently of one another

without affecting the features of a PES. These subtleties, compounded by the 3N − 6

degrees of freedom of the PES, make ascribing significance to any single class of terms

difficult. Because we want to assess the accuracy of a particular approximation as it

applies to a single class of terms we will instead focus on the Fock matrix elements, which

contain all the information needed for the zeroth-order electronic energy, to overcome some

of these difficulties.

Given any set of points on the PES (for example, a dissociation curve) and the Fock

matrix for each point, each class of multi-center two-electron integrals will contribute a

particular percentage to the full density dependent Fock matrix. The best approximation

for a particular class of integrals is then given by how well it reproduces the average

percentage of the Fock matrix for the corresponding ab initio set of integrals. Consider

84

the scatter plots of the full density-dependent contribution of the Fock matrix versus the

individual contributions arising from the multi-center terms. Performing a least-squares

linear regression yields a line whose slope is a measure for the overall contribution from

each term. Summing the one-, two-, three-, and four-center slopes yields unity. An

individual slope is, in a sense, the average percentage of the density-dependent part of the

Fock matrix elements (entries). However, it should not be interpreted as the percentage

of the energetic contribution. Because we are interested in not only the energy but also

the density (which is given by the Fock matrix) then this average percentage serves

as an acceptable measure of the importance of each term. Furthermore, in assessing

approximations to individual classes of terms then we have an average way of measuring

its significance. This is useful because the contribution from each multi-center component

can vary greatly from molecule to molecule and also depending on internuclear separation.

Take as an example NMT, and the dissociation of the C-N bond. Plotted in Figure 4-1

are the scatter plots for each multi-center contribution to the Coulomb and exchange

contributions of the Fock matrix from HF theory in a minimal basis. The C-N reaction

coordinate ranges from 1.0 A to 5.0 A in steps of 0.2 A . The slopes of the corresponding

best-fit lines of the one-, two-, three- and four-center contributions are 0.0063, 0.5199,

0.4522, and 0.0216 respectively, note that these slopes represent the fraction of the full

contribution, and that their sum is one.

Table 4-2 shows the statistical averages of the density-dependent contributions to the

Fock matrix for several molecules along the specified reaction coordinate (Rxn) at the HF

level of theory in a minimal basis set. The dissociation in all cases is from 1.0 A to 5.0 A

in steps of 0.2 A . On average, the two-center two-electron integral component accounts

for roughly half of the full matrix elements and the three-center type A integrals recover

another third. Again, the corresponding energetic contributions will not exhibit the

same ratios. The purpose here is to validate individual approximations made to various

multi-center two-electron integrals. To this end we consider the average contribution: the

85

better the approximation to a particular multi-center class of integrals the closer their

average contributions will be.

The a priori best approximation to the two types of three-center contributions is that

of Rudenberg. Shown in Figure 3-4 are the errors introduced by making approximations

I-VI, for NMT direct bond fission along the carbon-nitrogen reaction coordinate. For type

A and B three-center integrals (3-C A and 3-C B) the average matrix contributions are

35.67% and 4.98% respectively. The Rudenberg approximations II and IV differ by about

half a percent at 35.12% and 5.50%. Though the percent error for both approximations

II and IV are roughly the same, the corresponding error introduced energetically is two

orders of magnitude larger for the (AA|BC) three-center terms than for the (AB|AC)

three-center terms. In contrast, the Rudenberg approximation applied to four-center

terms (VI) is 2.47% which differs from the four-center ab initio (4-C) value of 2.39%

by only 0.08%. The error introduced by the full Rudenberg four-center approximation

(VI) is surprisingly small. One would expect that the partial Rudenberg approximation

would have the best agreement with the exact value, instead, at least empirically, the

full Rudenberg approximation performs best, and has the smallest error relative to the

average contribution to the Fock-like matrix. Indeed, the effect this 0.08% difference has

on the energetics of a dissociation curve is within an acceptable range and reproduces the

energetics well for a variety of different dissociation curves. We will focus on the molecules

in Table 4-2 that involve the dissociation of a C-N bond. Plotted in Figure 4-2 are the

dissociation curves for NMT, nitroethane (NET), COHNO2, and CH3NH2 at the B3LYP

level of theory in a minimal basis set. There is a clear trend for the four-center Rudenberg

approximation to overestimate the average contribution to the Fock-like matrix. The

energetics, on the other hand, do not show this clear-cut trend since the energy is lowered

for NET, raised for CH3NH2, and essentially unchanged for NMT. However, even though

the PES are not precisely reproduced they still exhibit all the basic features of the ab

86

initio surface, surprisingly without the explicit calculation of four-center two-electron

integrals.

We have demonstrated that the Mulliken and Rudenberg approximations do not

approximate the three-center one- or two-electron integrals well. These classes of integrals

pose a challenge as they are the only remaining terms that require explicit integration

since all one-center terms can be tabulated and two-center terms can be interpolated by

cubic splines, as will be demonstrated later. However, based on the Gaussian product rule

these terms can, without approximation, be reduced to two-center terms. In principle then

cubic spline interpolation could be used to store the intermediates and completely avoid

the need to do any integrals explicitly. The Gaussian product rule is

e−αµ(r−RA)2e−αν(r−RB)2 = e− αµαν

αµ+αν(RA−RB)2

e−(αµ+αν)(r−αµRA+ανRB

αµ+αν)2

(4–11)

We can generate an auxiliary set of orbitals for each Gaussian-type orbital-pair that have

the orbital exponent αµ +αν . Also, because products of l = 1 (p-type) orbitals are involved

this auxiliary set must also now span l = 2 (d -type). The prefactor is not included in the

stored orbitals, instead it is computed on-the-fly to retain the two-center character of the

terms in which it will be used.

There is a significant difficulty with using the Gaussian product in a simple

way. Because we work in contracted Gaussian-type orbitals the number of orbital

pairs increases as the number of contracted functions squared. The centers of these

resulting product Gaussians are not necessarily the same, αµRA+ανRB

αµ+αν. To use cubic spline

interpolation an auxiliary orbital must have a constant exponent, e.g., αµ + αν in the case

of the simple product of uncontracted Gaussians. We envision an approximation to the

exact contracted Gaussian product, which is a sum of several Gaussians with different

centers and different exponents, that is a single contracted Gaussian located at a single

center. The approximate auxiliary Gaussian could be determined by fitting directly to the

87

exact product. However, this fit will generally yield different exponents and coefficients for

two contracted Gaussian functions as there separation changes.

If we consider the basis set STO-nG, a sum of contracted Gaussians approximate a

Slater-type orbital. There is no Slater product rule that reduces a product of two Slaters

to a single Slater. Instead we have

e−αµ|r−RA|e−αν |r−RB | = e−αµ|r−RA|−αν |r−RB | (4–12)

Consider a simple example of the product of two Gaussian functions versus two Slater

functions separated by one Bohr, which is represented by the blue curve in Figure 4-3.

The challenge is now clearer, the underlying basis functions we use are fit to Slater

functions, and the product of two Slater functions does not resemble a simple Gaussian,

or even a sum of contracted Gaussians. The Slater product is defined exactly by the

intersection of two functions:

e−αµ|r−RA|e−αν |r−RB | = e−αγ |r−RC | ∩ e−αδ|r−RD| (4–13)

where, for αµ < αν and RA < RB

γ = αµ + αν (4–14)

δ = αν − αµ (4–15)

RC =αµRA + ανRB

αµ + αν

(4–16)

RD =αµRA − ανRB

αµ − αν

(4–17)

And for this particular case RA < RC < RB < RD. For the Slater orbital product

given in Figure 4-3 the exponents used were αµ = αν so αγ = αµ

2, αδ = 0, RC = 0, and

RD → ∞. The purple curve is e−αγ |r−RC | which is truncated by a horizontal line that

results from e−αδ|r−RD|. For αµ 6= αν the situation is somewhat more complicated. However

we now have a more direct path to two-center intermediate terms that have a constant

exponent for each atom-pair, and, are therefore, amenable to cubic spline interpolation

88

with universally transferable parameters. The difficulty still remains on how best to

incorporate the intersection of these two functions in a practical way.

We have developed the strategy and framework for eventually including the

three-center terms. For now we are satisfied with reducing the computational cost through

only the four-center Rudenberg approximation because the four-center terms account for

most of the computational effort associated with construction of the Fock matrix for large

systems.

4.2 Kohn-Sham Exchange and Correlation

If we envision an approximation to KS-DFT then a primary focus ought to be

avoiding the numerical integration that must be performed for each SCF iteration. The

formal scaling of the numerical integration step is n3orbitals, as it is performed in real space.

In practice, the scaling is lower because the exchange-correlation potential is local and

the density decays rapidly. Still, we want to avoid any extra computational effort that

contributes little to the overall energetics. To approximate the exchange-correlation

potential contribution consider the expansion of the potential in terms of atom-centered

densities, ρ =∑

A ρA. Then we have the following telescoping series

vxc[∑

A

ρA](r) =∑

A

vxc[ρA](r) +∑A>B

{vxc[ρA + ρB](r)− vxc[ρA](r)− vxc[ρB](r)}

+∑

A>B>C

{vxc[ρA + ρB + ρC ](r)

− vxc[ρA + ρB](r)− vxc[ρA + ρC ](r)− vxc[ρB + ρC ](r)

+ vxc[ρA](r) + vxc[ρB](r) + vxc[ρC ](r)}+ · · ·

=v1−centerxc + v2−center

xc + v3−centerxc + · · · (4–18)

Equation 4–18 is exact. It only requires an arbitrary partition of the density into

atomic parts, though convergence of the series is affected by the choice of partition.

Our target is to include contributions from the exchange-correlation potential to the

KS Fock-like matrix in a way that only requires the storage of a limited set of two-center

89

terms. Such two-center terms can be interpolated using cubic splines once, then the

cubic spline coefficients can be stored in RAM to be efficiently retrieved and computed

on-the-fly. With such a target in mind only terms in Equation 4–18 that can be retained

are v1−centerxc and v2−center

xc . Furthermore, such contributions ought to be independent of

chemical environment and thus completely transferable. Therefore, we insist on densities

that are transferable. The natural choice is spherically symmetric atom-centered densities

of neutral atoms, ρA ≈ ρ0A, recognizing we can have α and β spin components. Based on

these considerations we have an approximate exchange-correlation potential,

vxc[∑

A

ρA](r) ≡∑

A

vxc[ρ0A](r) +

∑A>B

[vxc[ρ0A + ρ0

B](r)− vxc[ρ0A](r)− vxc[ρ

0B](r)] (4–19)

We do not want to map and store the real-space exchange-correlation potential,

because of the obvious high degree of complexity. Instead, all that is ever needed are

the contributions from the potential projected in the basis set. The Fock-like matrix

contribution from this approximate exchange-correlation potential is

(Gxc)µAνB[ρ] = 〈µA|vxc[ρ]|νB〉 (4–20)

Restricting this to contributions that only depend on two centers yields, for the diagonal

block (A = B)

(Gxc)µAνA=〈µA|vxc[

∑A

ρ0A]|νA〉

=〈µA|vxc[ρ0A]|νA〉

+1

2

∑

B 6=A

〈µA|vxc[ρ0A + ρ0

B]− vxc[ρ0A]|νA〉

(4–21)

and for the off-diagonal block (A 6= B)

(Gxc)µAνB=〈µA|vxc[

∑A

ρ0A]|νB〉

=〈µA|vxc[ρ0A + ρ0

B]|νB〉(4–22)

90

The off-diagonal block terms can be further improved by applying the Rudenberg

approximation. Note that the diagonal block terms do not benefit from this approximation

because the orbitals only depend upon one center. The class of neglected terms 〈µA|vxc[ρC ]|νB〉becomes

〈µA|vxc[ρ0C ]|νB〉 =

1

2

∑

C 6=A,B

∞∑

k

[SµAkB〈kB|vxc[ρ

0C ]|νB〉+ SkAνB

〈µA|vxc[ρ0C ]|kA〉] (4–23)

The leading order terms are

(G0xc)µAνB

=

〈µA|vxc[ρ

0A]|νA〉 if A = B


B]|νB〉 if A 6= B(4–24)

and a slight improvement could be achieved with the addition of

(G1xc)µAνB

=

12

∑B 6=A〈µA|vxc[ρ

0A + ρ0

B]− vxc[ρ0A]|νA〉 if A = B

12

∑C 6=A,B

∑∞k [SµAkB

〈kB|vxc[ρ0C ]|νB〉+ SkAνB

〈µA|vxc[ρ0C ]|kA〉] if A 6= B

(4–25)

Shown in Figure 4-4 are results for the GZERO(B3LYP) approximation: G0xc is used with

the B3LYP energy functional interpolated over all the atom pairs CNOH. The errors

introduced are relatively small and GZERO(B3LYP) appears to be generally transferable.

In terms of computational feasibility for GZERO we have introduced a single

one-center matrix per atom-type and a single two-center matrix per atom-pair. For C,

N, O and H atom types the number of one-center matrices is trivial and does not present

a challenge, as they are the dimension of the number of orbitals per atom squared. The

two-center matrices are stored in the local coordinate framework of the atom pair and are

roughly the dimension of the number of orbitals on atom A times the number of orbitals

on atom B. The actual dimension can be reduced because most of the matrix elements are

zero due to orthogonality. To interpolate the values of these two-center matrix elements

cubic splines are used. Cubic splines require five parameters per number interpolated. We

used adaptive grids (non-uniformly spaced) that are defined for every pair term from 0.5

91

A to infinity (effectively). To a good level of numerical precision this fit can be performed

with roughly 600 reference points, but more can be added should higher levels of precision

be required. For CNOH in a minimal basis set (there are ten unique atom pairs) splines

require roughly half a million double precision numbers. Going to a double zeta type basis

increases this storage requirement by about a factor of nine. Even for the double zeta case

that cost does not present a challenge.

4.3 Adapted ab initio Theory Model Zero

To briefly summarize, we have applied two novel approximations to speed up the

construction of the Fock-like matrix in a KS-DFT framework. We have shown that the

Rudenberg approximation applied to four-center two-electron integrals retains the features

of the ab initio integrals, thus avoiding the most costly class of Cartesian Gaussian

integrals. Furthermore, this approximation is readily applied to hybrid KS-DFT methods,

which include some fraction of exact exchange.

In addition to approximations for integrals we also demonstrated approximations that

can be applied to the exchange-correlation potential contribution of the Fock-like matrix,

and thereby avoid the need to perform numerical integration for every SCF iteration.

Extensions to include three-center contributions from the telescoping series approximation

to the potential were also presented.

The last component we will consider involves the residual total energy. Though it

is worthwhile to avoid numerical integration to generate the potential and subsequent

matrix contribution, a final one-time evaluation of Exc[ρ] upon SCF convergence should

not be a bottleneck. Furthermore, as demonstrated by Bartlett and Oliphant [90] the

exchange-correlation energy functional is relatively insensitive to the input density. Indeed,

as we have seen already in Figures 4-2 and 4-4 the B3LYP energy evaluation was used and

introduced no significant error.

92

The current implementation of AAT (AAT Model Zero) is given by

HAATµν = 〈µ| − 1

2∇2

1 +∑

A

ZA

r1A

|ν〉 (4–26)

all one-electron terms are calculated exactly as they represent a small portion of the

total computational cost for large systems. Given that we are using a hybrid KS-DFT

functional we have a fraction of exact exchange in the density dependent Fock-like matrix,

for B3LYP, α = 0.2.

(GJ−αK1,2,3−center)µν =

∑

λσ

Pλσ((µν|λσ)− α(µλ|νσ)) (4–27)

(GJ−αK4−center)µν =

∑

λσ

Pλσ( ˜(µν|λσ)− α ˜(µλ|νσ)) (4–28)

where

˜(µν|λσ) =1

4

∑

k=1

∑j=1

[SµAkBSλCjD

(kBνB|jDσD) + SµAkBSjCσD

(kBνB|λCjC)

+SkAνBSλCjD

(µAkA|jDσD) + SkAνBSjCσD

(µAkA|λCjC)]

(4–29)

(G0xc)µν =

〈µA|vxc[ρ

0A]|νA〉 if A = B


B]|νB〉 if A 6= B(4–30)

GAATµν = (GJ−αK

1,2,3−center)µν + (GJ−αK4−center)µν + (G0

xc)µν (4–31)

and the residual electronic energy is

EAATresidual = −1

2Tr[PG0

xc] + Exc[P] (4–32)

Finally, shown in Figure 4-5 are UHF dissociation curves for AAT Model Zero (AATM0),

where both the Rudenberg and GZERO(B3LYP) approximations have been applied

simultaneously. This model reproduces the underlying B3LYP dissociation curves well

especially when compared to AM1.

The corresponding gradient expression for AATM0 is given by Equation 4–34. The

only difference between standard KS-type derivatives arise from the modified four-center

93

term dependence on overlap. The evaluation of terms ∂ ˜(µν|λσ)∂α

is straight-forward and only

dependent upon derivatives of the overlap, ∂Sµν

∂α, and two-center NDDO-type integrals,

∂(µAνA|λBσB)∂α

.

∂ ˜(µν|λσ)

∂α=

1

4

∑

k=1

∑j=1

[∂SµAkB

∂α{SλCjD

(kBνB|jDσD) + SjCσD(kBνB|λCjC)}

+∂SkAνB

∂α{SλCjD

(µAkA|jDσD) + SjCσD(µAkA|λCjC)}

+∂SλCjD

∂α{SµAkB

(kBνB|jDσD) + SkAνB(µAkA|jDσD)}

+∂SjCσD

∂α{SµAkB

(kBνB|λCjC) + SkAνB(µAkA|λCjC)}

+SµAkBSλCjD

∂(kBνB|jDσD)

∂α+ SµAkB

SjCσD

∂(kBνB|λCjC)

∂α

+SkAνBSλCjD

∂(µAkA|jDσD)

∂α+ SkAνB

SjCσD

∂(µAkA|λCjC)

∂α]

(4–33)

dE

dα= −2

∂Sµν

∂α

∑i

CµiCνiεi + Pνµ[∂{HAAT

µν + (G0xc)µν}

∂α+

1

2

∑

λσ

Pσλ∂{(Gµν

λσ)1,2,3−center + (Gµνλσ)4−center]}

∂α] +

∂Eresidual

∂α

(4–34)

4.4 Implementation

The implementation of the approximations outlined above must be efficient because

the construction of the Fock-like matrix can be a rate limiting step for large systems.

The Rudenberg approximation can be implemented in a stored integral or direct SCF

procedure. For the Rudenberg approximation to be competitive requires that the

contraction of the overlap with the appropriate two-center two-electron integrals be

significantly faster than the four-center two-electron integrals. Shown in Figure 4-6 are

the relative timings for the evaluation of the ab initio four-center integrals versus the

Rudenberg approximation. These timings are preliminary and more work could be done

to optimize both methods for evaluating this four-center contributions. However, it is

clear that there is a significant advantage to using the Rudenberg approximation and

94

the possibility of improving the speed over traditional DFT methods by two-orders of

magnitude is well within range. It is sufficient to focus only on timings for the four-center

integrals as they quickly become the dominate contribution for large systems, as shown in

Figure 4-7.

Screening of small two-electron integrals can lead to linear scaling for large systems.

Such screening methods can be applied to the Rudenberg approximation very efficiently

because of the explicit dependence on overlap.

In an effort to minimize computational effort with building the Fock-like matrix we

have made extensive use of storing transferable one-center and two-center terms. In our

current implementation all integrals up through three-centers are calculated exactly. There

are two reasons for explicitly evaluating these integrals: first, they are a small contribution

when compared to the explicit calculation of the four-center terms; second, the application

of the Rudenberg or Mulliken approximations to three-center terms (decomposing them

into two-center terms) introduces significant errors. If an accurate, simple and transferable

approximation for the three-center terms were available then all terms in the model could

be two-center. It is a straight forward matter to interpolate functions of two centers using

cubic splines. We have implemented such techniques for the evaluation of the matrix

elements of the exchange-correlation: (G0xc)µAνB

(RAB).

Cubic splines are applicable for numerical 2-center functions, where the exact

functional form is not known, and for complicated analytic 2-center functions. In the

first case, the benefit of splines is to provide an interpolation scheme for a set of empirical

reference points. In the second case splines provide a computationally efficient framework

for the evaluation of otherwise prohibitively expensive analytic functions. We are primarily

interested in the latter.

The ability to store a large number of parameters in computer memory (RAM) is

a requirement for the efficient implementation of cubic splines. The memory required

is directly determined by the number of cubic spline parameters. A pertinent example

95

is the evaluation of two-center two-electron integrals terms in MNDO, specifically, the

multipole-multipole expansion of the NDDO-type two-electron integrals. There are

twenty-two unique NDDO-type integrals for every atom-pair involving two heavy atoms

in standard NDDO methods such as AM1, PM3, etc. This means that for CNOH atom

types, for which there are ten unique atom-pairs (CC, CN, CO, CH, etc.), there are fewer

than 220 analytic functions. The evaluation of a cubic spline requires fewer operations

than even the simplest of the twenty-two unique NDDO-type analytic expressions. As a

simple demonstration of the speed of cubic splines we have implemented them to replace

the explicit multipole-multipole evaluation for each of the 22 unique NDDO-type integrals.

The multipole-multipole approximation is already significantly faster than the ab initio

integral evaluation. As shown in Figure 4-8 a modest speed-up is achieved over the already

fast NDDO Fock-like matrix construction. These timings are for 16 protein systems up

to 20000 orbitals. Using cubic splines is up to 15% faster than the multipole-multipole

expansion. There is an error introduced, though it is small and controllable. For example

the average error per atom is given in Figure 4-9. Furthermore, while going to higher

multipoles in the expansion of integrals is increasingly expensive, cubic splines cost the

same regardless of the underlying function being interpolated. However, the number of

parameters involved in a typical NDDO multipole-multipole expansion is relatively small

versus the number of parameters needed for cubic splines. The premium placed on RAM

(or core) memory when NDDO methods were first developed thirty years ago would have

made cubic splines impractical. Now, the amount of memory typically available on most

desktop machines is more than sufficient to make cubic splines the method of choice when

evaluating nearly any two-center integral in quantum chemistry.

A general cubic spline has the form

f(x) = Ak + Bk(x− xk) + Ck(x− xk)2 + Dk(x− xk)

3 (4–35)

96

where xk is a reference point (node), and f(xk) = Ak. The determination of the

parameters Bk, Ck, and Dk is straight-forward and unambiguous. We have used the freely

available Duris routine for discrete cubic spline interpolation and smoothing [91]. The only

degrees of flexibility when using cubic spline interpolation are the number of nodes used

and if they are evenly spaced (discrete) or based on an adaptive grid. In our case, for both

integrals and matrix elements, the functions in all cases go to zero in the separated limit

and at smaller length scales the function requires a dense grid to completely capture the

subtle changes that occur in that region. A dense node spacing over the entire length scale

is unnecessary because the functions change very little at long length scales.

To determine a suitable node mapping function we tried several different functional

forms and concluded that a simple exponential function provides a good balance between

having a dense grid in the more critical bonding region (from around 0.5 A to 5 A) and a

sparse grid at long bond lengths (hundreds of A). The actual node mapping function used

is

node(RAB) = Int(1000 ∗ e−6/(RAB+1))− 1. (4–36)

All of the functions we have currently implemented, those involving two-center exchange-

correlation approximations, are zero beyond 10 A. Using the node mapping function then

requires approximately 600 reference points. Since there are 5 variables per reference

point we need 3000 parameters that would form a look-up table for a single function. If

double precision numbers are used to store the spline parameters the storage requirement

is about 0.023 MB per function. Further details of the current implementation of the AAT

procedure in the parallel ACES III program system can be found in Appendix A.

4.5 Verification

We have shown that AATM0 reproduces well the basic features of a PES for a small

test set of molecules involving the direct fission of a C-N bond. Now we turn our attention

to a more thorough comparison to test the behavior of AATM0 for a variety of different

properties. We use the diverse set of molecules in the so-called G2 test of molecules

97

[92]. The G2 set is made up of 148 neutral molecules with singlet, doublet, and triplet

spin multiplicities, though we focus on a subset of these that contain only C, N, O, and

H because the current implementation of AATM0 includes only the spline parameters

generated for those atom atom-pairs. There are 71 molecules in this truncated set ranging

from 2 to 14 atoms.

The separation of electronic and nuclear components in the energy expression

of our AAT approach sets it apart from commonly used SE methods. So it follows

that we expect more consistent results for purely electronic properties. To this end we

have considered the vertical ionization potentials of the relevant molecules in the G2

set. The vertical ionization potential is purely electronic as it does not depend on the

nuclear-nuclear repulsion. We have used the simple definition EIP = E(N) − E(N − 1).

This requires the total energy of the molecule and, since all molecules in the G2 set

are neutral, the corresponding cation with the properly modified multiplicity. We have

compared AATM0 against the underlying B3LYP minimal basis set results and find

a average RMSD from B3LYP for our entire test set of 0.057 Hartree (1.6 eV). The

analogous comparison to AM1 yields an RMSD from B3LYP of 0.331 Hartree (9.0 eV).

This clearly demonstrates the ability of AATM0 to reproduce the electronic structure of

the underlying B3LYP. The error of AM1 from B3LYP is substantial, though a more valid

comparison for AM1 would be against experimental IPs, the important issue here is the

ability our AATM0 to reproduce the physics of the method it is approximating.

We now consider the dipole moment. The dipole RMSD deviation of AATM0 and

AM1 from minimal basis set B3LYP is 0.219 a.u. (0.56 Debye) and 0.595 a.u. (1.51

Debye) respectively. There were no sign errors observed for the orientation of the

dipole moment for either AATM0 or AM1. Though the error introduced by AATM0

is significantly smaller than that of AM1, it is larger than would be expected given the

performance of other properties generated by the AATM0 procedure.

98

Another test of a purely electronic property is simply the electronic density. In this

case a direct comparison to existing SE methods is not possible in a straight forward

way because NDDO-based and DFTB-based methods drop the core orbitals since they

are using a valence-only approach. Instead, we have used B3LYP as the reference and

compared AATM0, minimal basis HF with the Rudenberg approximation applied to

four-center two-electron integrals, and full ab initio minimal basis HF. Considering the

one-particle density matrix (AO density matrix P) from B3LYP the average RMSDs

are 0.061, 0.136 and 0.029 for AATM0, HF (with Rudenberg), and full HF, respectively.

These results demonstrate some surprising behavior. On average the full HF has the best

agreement to B3LYP, at least in the case of a minimal basis set. The AATM0 agreement

is somewhat worse and HF with the Rudenberg approximation introduces a significant

error. Since AATM0 includes the Rudenberg approximation, we would expect its deviation

from B3LYP to be similar to the deviation of HF with the Rudenberg approximation from

the full ab initio HF results, especially since several systems in the test have only two or

three atoms and the four-center approximation does not affect those cases. We see two

probable origins of this unexpected discrepancy; first, there is a cooperative cancellation of

errors between the Rudenberg approximation and the exchange-correlation approximation

(G0xc); second, because the hybrid B3LYP functional includes only 20% nonlocal exchange

the corresponding error introduced by the Rudenberg approximation is not as significant

when compared to the HF treatment in which full nonlocal exchange is used.

In addition to the average RMSD of the density of AATM0 and full HF we consider

the standard deviation of the RMSD of the density for each molecule in the test set.

The standard deviations are 0.042 and 0.057 for AATM0 and full HF respectively. This

suggests that while the overall average RMSD is somewhat higher from AATM0 the error

is slightly more systematic.

Turning our attention to energetic properties we consider the atomization energies of

the relevant portion of the G2 set. We take as the atomic reference energies the neutral

99

atoms for each method. Since the AATM0 is designed to reproduce the exact B3LYP

result in the separated atom limit, the atomic energies are identical. So, the error between

AATM0 and B3LYP is equivalent to the total energy differences. Since we also want to

compare B3LYP to AM1 we must also have atomization energy for B3LYP. Similarly, for

AM1 we perform the neutral atom calculations and subtract the sum of atomic energies

from the total energy. The AATM0 error from B3LYP is 0.057 Hartree. For AM1 the

corresponding error is 0.331 Hartree. This indicates that the AATM0 total energies are in

better agreement with the underlying B3LYP total energies.

Since one goal of our approach is to reproduce a complex PES of large systems we

have applied the AATM0 approximation to describe a somewhat unphysical reaction

mechanism for C20 in order to subject the approximation to an extreme case of bond

breaking. The pseudo-reaction-path corresponds to splitting the C20 molecule in half,

simultaneously breaking ten C-C bonds. Shown in Figure 4-10 are the total energies of

B3LYP, AATM0, AM1, and SCC-DFTB. The reaction coordinate is the distance between

the end-caps and the equilibrium value is 3.223 A, the range is from 2.8 A through 6

A. As expected we see that AM1 exhibits unphysical behavior, and some convergence

problems beyond 4.5 A. The SCC-DFTB actually performs quite well for C20, this result

may not be too surprising since the repulsive energy term in SCC-DFTB is parameterized

from B3LYP total energies for C-C bond breaking and the energy scale is very similar to

the sum of C-C bond energies. The AATM0 overbinds this system significantly and may

in fact be converging to a different electronic state, as the eigenvalue spectrum deviates

significantly from the B3LYP result. Further work is needed to determine the precise

origin of this discrepancy and to correct it.

The final comparison we make is to the RMSD of the gradients for the G2 set. We

will consider two sets of data, one for the MP2 equilibrium structures and the other

for the B3LYP minimal basis set equilibrium structures. Shown in Figure 4-11 are the

average RMSD of the Cartesian gradients from the MP2 reference structures calculated

100

with AATM0, B3LYP, AM1, and SCC-DFTB for the G2 set. The standard deviation of

the gradients is also shown. AM1 appears to have the best agreement to the MP2 result

with an RMSD of about 0.02 Hartree/Bohr, the corresponding standard deviation is also

about 0.02 Hartree/Bohr. SCC-DFTB does not perform well for the G2 set, however

if we eliminate those molecules that contain two or three atoms from the set then the

SCC-DFTB performance improves significantly and the RMSD is 0.007 Hartree/Bohr,

which lower than the AM1 RMSD of 0.017 Hartree/Bohr for this truncated set. In both

cases the AATM0 and B3LYP RMSD forces are nearly the same at 0.05 Hartree/Bohr,

with a standard deviation of 0.01 Hartree/Bohr.

A better comparison for how well the AATM0 approximates the underlying

B3LYP structure is shown in Figure 4-12. At the B3LYP geometries the RMSD is 0.05

Hartree/Bohr and 0.09 Hartree/Bohr for AM1 and SCC-DFTB, respectively. We see the

AATM0 has an average RMSD of 0.006 Hartree/Bohr with a standard deviation of only

0.003 Hartree/Bohr. This clearly shows that AATM0 is reproducing the B3LYP gradients

with excellent accuracy.

4.6 Conclusion

Current SE methods are not uniformly accurate over an entire PES, owing to severe

approximations that are only valid near equilibrium. AAT has been shown to be able to

describe bond breaking in its spin-polarized form at least as well as B3LYP, and much

better than traditional SE methods, like AM1. The AAT SM framework enables the

systematic verification of approximations via comparisons to CC and B3LYP references

for prototypical molecules. Furthermore, all approximations can be further refined. Unlike

existing SE model Hamiltonians, there are no adjustable parameters.

Fast and accurate approximations to four-center two-electron integrals via Rudenberg

(for Coulomb or exact exchange) enable substantial savings for very large systems. Since

the Rudenberg approximation merely reduces the cost associated with calculating each

four-center two-electron integral, methods for screening two-electron integrals to achieve

101

linear scaling are equally applicable here. The nonlocal exchange that is not considered in

SCC-DFTB is allowed in AAT, so that B3LYP and other hybrid DFT methods can be a

target of the approach. We have also developed a route toward a fully two-center model

based on considerations of the three-center two-electron repulsion integrals.

The Fock build requires no numerical integration, but the AAT approach does

explicitly include exchange and correlation. In the current approach we have used

the minimal basis set B3LYP exchange-correlation potentials as a reference, further

improvement is possible by evaluating large basis set ab initio DFT exchange-correlation

potentials and subsequently projecting them in a minimal basis set representation.

Extensions of the AATM0 that include a more complete expansion of the telescoping series

for the exchange-correlation potential have also been described and their implementation

would be a further improvement on the results presented.

102

Table 4-1. Adapted integrals

(µAνA|λBσC) I Mulliken3-center Type A II Rudenberg(µAνB|λAσC) III Mulliken3-center Type B IV Rudenberg(µAνB|λCσD) V Mulliken4-center VI Rudenberg

VII Mulliken PartialVIII Rudenberg Partial

103

Tab

le4-

2.A

vera

geper

centa

geof

Fock

mat

rix

contr

ibuti

onby

adap

ted

appro

xim

atio

n

Rxn

1-C

2-C

3-C

AI

II3-

CB

III

IV4-

CV

VI

VII

VII

IC

OH

NO

2C

N0.

6155

.11

40.5

941

.85

40.7

22.

282.

522.

361.

401.

411.

401.

381.

39N

MT

CN

0.63

51.9

941

.07

42.3

040

.85

4.15

4.87

4.47

2.16

2.25

2.20

2.17

2.16

CH

2O

CO

1.49

62.5

331

.27

31.7

730

.61

4.84

6.18

5.44

-0.1

3-0

.14

-0.1

4-0

.13

-0.1

3C

H3N

H2

CN

0.52

55.5

633

.13

33.1

632

.19

8.07

10.0

48.

962.

712.

982.

832.

752.

73H

OO

HO

O2.

2176

.34

19.8

520

.15

19.2

20.

891.

191.

100.

710.

850.

790.

760.

74N

ET

CN

0.35

41.8

148

.28

48.9

947

.84

5.15

5.98

5.57

4.40

4.65

4.52

4.43

4.41

C2H

6C

C0.

1949

.40

35.4

735

.18

34.3

99.

4911

.95

10.6

05.

455.

935.

685.

555.

50A

vera

ge0.

8656

.11

35.6

736

.20

35.1

24.

986.

105.

502.

392.

562.

472.

412.

40

104

A

-0.2-0.1

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

-10 -5 0 5 101-ce

nter

2-e

- inte

gral

(a.

u.)

Full 2-e- integral (a.u.)

y = 0.0063xR2 = -0.0065

B

-4-3-2-1 0 1 2 3 4

-10 -5 0 5 102-ce

nter

2-e

- inte

gral

(a.

u.)


y = 0.5199xR2 = 0.8988

C

-4-3-2-1 0 1 2 3 4

-10 -5 0 5 103-ce

nter

2-e

- inte

gral

(a.

u.)


y = 0.4522xR2 = 0.8699

D

-0.2-0.15-0.1

-0.05 0

0.05 0.1

0.15 0.2

-10 -5 0 5 104-ce

nter

2-e

- inte

gral

(a.

u.)


y = 0.0216x

R2 = 0.6841

Figure 4-1. Scatter plots of agreement between approximate and exact two-electronintegrals. A) One-center. B) Two-center. C) Three-center. D) Four-center

105

A

-60

-40

-20

0

20

40

60

1 1.5 2 2.5 3

Ene

rgy

(kca

l/mol

)


Rudenberg (4C)B3LYP

B

-80

-60

-40

-20

0

20

40

60

1 1.2 1.4 1.6 1.8 2 2.2 2.4

Ene

rgy

(kca

l/mol

)


Rudenberg (4C)B3LYP

C

-20

-10

0

10

20

30

40

50

60

1 1.2 1.4 1.6 1.8 2 2.2 2.4

Ene

rgy

(kca

l/mol

)


Rudenberg (4C)B3LYP

D

-100

-50

0

50

1 1.5 2 2.5 3 3.5 4

Ene

rgy

(kca

l/mol

)


Rudenberg (4C)B3LYP

Figure 4-2. Dissociation curve of C-N with 4-center Rudenberg approximation. A) NMT.B) NET. C) COHNO2. D) CH3NH2

106

A

0.4

0.2

0-3 -2 -1 0 1 2 3

Separation (Bohr)

B

0.4

0.2

0-4 -2 0 2 4

Separation (Bohr)

Figure 4-3. Orbital products. A) Gaussian orbital product. B) Slater orbital product

107

A

-60

-40

-20

0

20

40

60

1 1.5 2 2.5 3

Ene

rgy

(kca

l/mol

)


GZERO(B3LYP)B3LYP

B

-80

-60

-40

-20

0

20

40

60

1 1.2 1.4 1.6 1.8 2 2.2 2.4

Ene

rgy

(kca

l/mol

)


GZERO(B3LYP)B3LYP

C

-20

-10

0

10

20

30

40

50

60

1 1.2 1.4 1.6 1.8 2 2.2 2.4

Ene

rgy

(kca

l/mol

)


GZERO(B3LYP)B3LYP

D

-100

-50

0

50

1 1.5 2 2.5 3 3.5 4

Ene

rgy

(kca

l/mol

)


GZERO(B3LYP)B3LYP

Figure 4-4. Dissociation curve of C-N with AAT approximation. A) NMT. B) NET. C)COHNO2. D) CH3NH2

108

A

-60

-40

-20

0

20

40

60

1 1.5 2 2.5 3

Ene

rgy

(kca

l/mol

)


AATM0B3LYP

AM1

B

-80

-60

-40

-20

0

20

40

60

1 1.2 1.4 1.6 1.8 2 2.2 2.4

Ene

rgy

(kca

l/mol

)


AATM0B3LYP

AM1

C

-20

-10

0

10

20

30

40

50

60

1 1.2 1.4 1.6 1.8 2 2.2 2.4

Ene

rgy

(kca

l/mol

)


AATM0B3LYP

AM1

D

-100

-50

0

50

1 1.5 2 2.5 3 3.5 4

Ene

rgy

(kca

l/mol

)


AATM0B3LYP

AM1

Figure 4-5. Dissociation curve of C-N with AATM0 approximation. A) NMT. B) NET. C)COHNO2. D) CH3NH2

109

0

5000

10000

15000

20000

25000

30000

35000

40000

400 600 800 1000 1200 1400 1600

Tim

e (S

econ

ds)

Orbitals

ab initioRudenberg

Figure 4-6. Timing of ab initio versus Rudenberg four-center integrals

110

-10 0

10 20 30 40 50 60 70 80 90

100

0 100 200 300 400 500 600 700 800 900 1000

Perc

ent o

f to

tal

Number of atoms (5 orbitals per atom)

1-center2-center3-center4-center

Figure 4-7. Percentage of multi-center terms versus system size

111

0 10 20 30 40 50 60 70 80 90

100

0 5000 10000 15000 20000 25000

Seco

nds

Orbitals

StandardSplines

Figure 4-8. Timing of Fock build using traditional NDDO and NDDO with cubic splinesas a function of system size

112

-1.92e-05-1.9e-05

-1.88e-05-1.86e-05-1.84e-05-1.82e-05-1.8e-05

-1.78e-05-1.76e-05-1.74e-05-1.72e-05

0 5000 10000 15000 20000 25000

Err

or (

kcal

/mol

)

Orbitals

Error

Figure 4-9. Error introduced per atom from cubic splines as a function of system size

113

-4-3.5

-3-2.5

-2-1.5

-1-0.5

0 0.5

2.5 3 3.5 4 4.5 5 5.5 6

Ene

rgy

(Har

tree

)

R (Angstrom)

B3LYPAATM0

AM1SCC-DFTB

Figure 4-10. Pseudo-reaction-path splitting C20

114

0

0.05

0.1

0.15

0.2

AATM0

B3LYP

AM1

SCC-DFTB

RMSDStandard Deviation

Figure 4-11. Force RMSD from MP2

115

0

0.05

0.1

0.15

0.2

AATM0

AM1

SCC-DFTB

RMSDStandard Deviation

Figure 4-12. Force RMSD from B3LYP

116

APPENDIX Parallel Implementation in the ACES III Environment

The ACES III program system environment has been designed for the efficient parallel

implementation of wavefunction theory methods in computational chemistry. The ACES

III program is suited to treating large systems, typically 500-1000 basis functions and up

to 300 electrons with post-HF ab initio methods. The super instruction processor (SIP)

manages the communication and processing of blocks of data, such blocks are intended to

be somewhat large (on the order of 500,000 floating point numbers). To provide a more

direct route for implementing new methods the SIP reads the code written in the high

level symbolic language, the super instruction assembly language (SIAL) (pronounced

“sail”). The SIP hides many of the more complicated aspects of parallel programing

allowing the SIAL programmer to focus on algorithm and method development. Within

SIAL each index of an array is divided into segments, the segments map elements of

the array into blocks. An algorithm then involves operations for which the segment is

the fundamental unit of indexing. Operations that involve the contraction of two arrays

can then be broken into several smaller contractions over two blocks. Each block-pair

contraction can then be distributed over many processors, allowing for the parallel

implementation of the contraction of two large arrays.

A general issue in parallel codes is ensuring that latency does not become a rate

limiting step. This can be done by making sure that the amount of computation done is

on a par with the latency of any operation. Such balancing is machine dependent, so it

requires some amount of testing to determine an optimal solution for a particular machine.

For the implementation of the AAT methodology this balancing is critical and requires a

different treatment than is usually used for post-HF methods.

The structure of the AAT equations are quite different than post-HF equations.

In the former the atomic indices are used as the primary variable in the loops used to

construct the Fock-like matrix. For post-HF methods the orbital indices are the important

variable. Moreover, the number of orbitals per atom in AAT is small relative to the

117

number of atoms, the reverse of many post-HF calculations, which are generally used to

treat systems with fewer atoms but with much larger basis sets.

The speed of the AAT approach is achieved by avoiding the explicit calculation of the

numerous and costly four-center integrals. Instead, these terms are introduced by a sum of

contracted two-center terms. Initially, we used small segment block sizes. The block size

ultimately determines the computational efficiency by affecting how the IO, or fetching

of data, is done. In these initial studies each block spanned only one atom. In this case,

when the two-electron integrals were evaluated it was straight forward to simply neglect

the integrals that involved four atoms (four-centers.) This procedure worked for systems

with few atoms (<20), but, as the system size increased, the number of blocks rapidly

increased, and latency became an problem.

An improved approach that was implemented, which is more consistent with the

original ACES III design philosophy, involved removing the atom-spanning block

size restriction. This was achieved by rewriting the original routines that returned all

two-electron integrals spanning four-block units to explicitly exclude integrals involving

four-centers. In addition, the Rudenberg approximation, which relies on contractions

between the overlap matrix and NDDO-type two-center two-electron integrals ((AA|BB)),

required modification to avoid over-counting some terms. Similar to the routine that

excludes four-center integrals, the modified routine to generate the integrals used by the

Rudenberg approximation excluded all integrals that were not of the type (AA|BB).

These two modified routines involve some overhead that could be reduced with further

refinement. These modified routines are specific to the AAT approach and are not

included in the standard release version of ACES III.

Since two-electron integrals possess an eight-fold spatial symmetry the overall number

of integrals that need to be calculated can be reduced by roughly a factor of eight. To

do so requires recognizing the symmetry of every integral type. In ACES III the problem

is slightly different. Instead of having this eight-fold symmetry manifest itself by AO

118

Still, there are many aspects that need to be implemented efficiently to have a viable,

stable code. We have previously discussed the use of cubic splines are an efficient way of

balancing the numerical accuracy of the two-center functions and the number of reference

points stored.

Another aspect that needs to be considered is the rotation of the two-center matrix

elements in a local atom-pair coordinate system to the global coordinate system of the

molecule. This rotation is unambiguous once the global coordinate system is specified,

and although orbitals of the molecule can be rotated simultaneously without affecting

the energetics, the corresponding matrix elements will change on rotation. This degree

of flexibility must be incorporated to ensure that the Fock-like matrix contribution that

arises from every atom-pair corresponds to the orbital orientation of the global coordinate

system of the molecule. To achieve this we use a simple procedure that generates the

matrix that rotates a pair of atomic coordinates so that they are aligned to the axis of

the the stored cubic spline, and then perform the reverse rotation on that sub-block of the

Fock-like matrix.

Finally, the third component implemented was the numerical integration that is

performed once the SCF procedure has converged. The ACES III program system

generates a job archive file (JOBARC) that can be read by ACES II executables, assuming

it has been generated on the same computer architecture. To test the efficacy of the AAT

approximations, the quickest route was to modify the ACES II xintgrt executable.

The xintgrt module performs numerical integration on a density to determine the

exchange-correlation energy that is needed for the residual electronic energy contribution

of AAT. Effectively, xintgrt pulls the AO eigenvector matrix from the JOBARC, as well

as other information about the system that would typically be generated by the xvscf ks

executable in ACES II when running a KS-DFT calculation (e.g. occupation numbers).

Since ACES III does not create all of the necessary records in the JOBARC, some small

adjustments to the xintgrt routine led to a new modified executable that performed the

120

needed numerical integration for the density generated by the AAT SCF procedure.

The largest modification in this regard was the proper inclusion of the exchange terms.

In the original xintgrt routine the Coulomb and exchange energy contributions are

recalculated from the integral file (IIII). This integral file is not created by ACES III

(because it works in an integral direct environment), removing the dependence on the IIII

file and eliminating the revaluation of the Coulomb and exchange terms. Instead those

intermediates are stored in the JOBARC by ACES III when they are evaluated in the

normal course of evaluating the Fock-like matrix. With these modifications all energy

contributions needed for the AAT approximations becomes available.

121

REFERENCES

[1] R. J. Bartlett, V. F. Lotrich, and I. V. Schweigert, J. Chem. Phys. 123, 062205(2005).

[2] K. W. Sattelmeyer, J. Tirado-Rives, and W. L. Jorgensen, J. Phys. Chem. 110, 13551(2006).

[3] J. McClellan, T. Hughes, and R. J. Bartlett, Int. J. Quantum Chem. 105, 914 (2005).

[4] R. S. Mulliken, J. Chim. Phys. 46, 497 (1949).

[5] K. Rudenberg, J. Chem. Phys. 34, 1433 (1951).

[6] W. Weber and W. Thiel, Theor. Chem. Acc. 103, 495 (2000).

[7] M. Elstner, D. Porezag, G. Jungnickel, J. Elsner, M. Haugk, T. Frauenheim, S. Suhai,and G. Seifert, Phys. Rev. B 58, 7260 (1998).

[8] Q. Zhao, R. C. Morrison, and R. G. Parr, Phys. Rev. A 50, 2138 (1994).

[9] R. van Leeuwen and E. J. Baerends, Phys. Rev. A 49, 2421 (1994).

[10] A. Beste and R. J. Bartlett, J. Chem. Phys. 120, 8395 (2004).

[11] A. Beste and R. J. Bartlett, J. Chem. Phys. 123, 154103 (2005).

[12] W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).

[13] R. J. Bartlett and G. D. Purvis III, Int. J. Quantum Chem. XIV, 561 (1978).

[14] J. Cızek and J. Paldus, J. Chem. Phys. 47, 3976 (1960).

[15] J. Paldus and J. Cızek, J. Chem. Phys. 54, 2293 (1971).

[16] R. J. Bartlett, Annu. Rev. Phys. Chem. 32, 359 (1981).

[17] R. J. Bartlett and M. MusiaÃl, Rev. Mod. Phys. 79, 291 (2007).

[18] T. Helgaker, P. Jørgensen, and J. Olsen, Molecular Electronic-Structure Theory(Wiley, New York, 2000).

[19] R. J. Bartlett, Theory and Applications of Computational Chemistry (Elsevier, 2005),p. 1191.

[20] M. Levy, Proc. of the Nat. Acad. of Sci. 76, 6062 (1979).

[21] Q. Zhao, R. C. Morrison, and R. G. Parr, Phys. Rev. A 50, 2138 (1994).

[22] Q. Zhao and R. G. Parr, Phys. Rev. A 46, 2337 (1992).

[23] Q. Zhao and R. Parr, J. Chem. Phys. 98, 543 (1993).

122

[24] D. J. Tozer, V. E. Ingamells, and N. C. Handy, J. Chem. Phys. 105, 9200 (1996).

[25] K. Peirs, D. Van Neck, and M. Waroquier, Phys. Rev. A 67, 012505 (2003).

[26] R. Hoffmann, J. Chem. Phys. 39, 1397 (1963).

[27] M. J. S. Dewar, J. Friedheim, G. Grady, E. F. Healy, and J. Stewart, Organometallics5, 375 (1986).

[28] J. J. P. Stewart, J. Computat. Chem. 10, 209 (1989).

[29] J. J. P. Stewart, J. Computat. Chem. 10, 221 (1989).

[30] M. J. S. Dewar and W. Thiel, J. Am. Chem. Soc. 99, 4899 (1977).

[31] M. J. S. Dewar, E. G. Zoebisch, E. F. Healy, and J. J. P. Stewart, J. Am. Chem. Soc.107, 3902 (1985).

[32] M. Kolb and W. Thiel, J. Computat. Chem. 14, 775 (1993).

[33] M. J. S. Dewar and W. Thiel, Theor. Chem. Acc. 46, 89 (1977).

[34] J. C. Slater and G. F. Koster, Phys. Rev. 94, 1498 (1954).

[35] W. M. C. Foulkes and R. Haydock, Phys. Rev. B 39, 12520 (1989).

[36] R. J. Bartlett, I. Grabowski, S. Hirata, and S. Ivanov, J. Chem. Phys. 122, 034104(2005).

[37] M. Wolfsberg and L. Helmholz, J. Chem. Phys. 20, 837 (1952).

[38] C. J. Ballhausen and H. B. Gray, Inorg. Chem. 1, 111 (1962).

[39] G. Klopman, J. Am. Chem. Soc. 86, 4550 (1964).

[40] K. Ohno, Theor. Chem. Acc. 2, 219 (1964).

[41] K. Nishimoto and N. Mataga, Z. Physik. Chem (Frankfurt) 12, 335 (1957).

[42] G. Zheng, S. Irle, M. Elstner, and K. Morokuma, J. Phys. Chem. A 108, 3182 (2004).

[43] C. E. Taylor, M. G. Cory, R. J. Bartlett, and W. Thiel, Comput. Mat. Sci. 27, 204(2003).

[44] R. J. Bartlett, J. Phys. Chem. 93, 1697 (1989).

[45] I. Rossi and D. Truhlar, Chem. Phys. Lett. 233, 231 (1995).

[46] P. Charbonneau, Astrophys. J. (Suppl.) 101, 309 (1995).

[47] R. H. Byrd, L-BFGS-B version 2.1, University of Colorado. (1997).

123

[48] ACES II is a program product of the Quantum Theory Project, University of Florida.J.F. Stanton, J. Gauss, J.D. Watts, M. Nooijen, N. Oliphant, S. A. Perera, P. G.Szalay, W.J. Lauderdale, S.R. Gwaltney, S. Beck, A. Balkova D. E. Bernholdt,K.-K Baeck, P. Rozyczko, H. Sekino, C. Hober, and R.J. Bartlett. Integral packagesincluded are VMOL (J. Almlof and P.R. Taylor); VPROPS (P. Taylor) ABACUS; (T.Helgaker, H.J. Aa. Jensen, P. Jørgensen, J. Olsen, and P.R. Taylor) (2006).

[49] W. Thiel, MNDO97, Organisch-chemisches Institut, Universitat Zurich,Winterhurerstrasse 190, CH-8057 Zurich, Switzerland (1997).

[50] M. Dupuis, A. Marquez, and E. Davidson, HONDO 99.6, based on HONDO 95.3,Quantum Chemistry Program Exchange (QCPE), Indiana University, Bloomington,IN 47405 (1999).

[51] J. Li, F. Zhao, and F. Jing, J. Comput. Chem. 24, 345 (2003).

[52] R.Ahlrichs, M. Bar, M. Haser, H. Horn, M. Hser, and C. Klemel, Chem. Phys. Lett.162, 165 (1989).

[53] M. Haser and R. Ahlrichs, J. Comput. Chem. 10, 104 (1989).

[54] F. Weigend, M. Haser, H. Patzelt, and R. Ahlrichs, Chem. Phys. Lett. 294, 143(1998).

[55] P. Politzer and J. S. M. Eds., Theoretical and Computational Chemistry EnergeticMaterials Part 1. Decomposition, Crystal and Molecular Properties, vol. 12 (Elsevier,Amsterdam, The Netherlands, 2003).

[56] M. J. S. Dewar, J. P. Ritchie, and J. Alster, J. Org. Chem. 50, 1031 (1985).

[57] M. L. J. McKee, J. Am. Chem. Soc. 108, 5784 (1986).

[58] M. T. Nguyen, H. T. Le, B. Hajgato, T. Veszpremi, and M. C. Lin, J. Phys. Chem. A107, 4286 (2003).

[59] W. F. Hu, T. J. He, D. M. Chen, and F. C. Liu, J. Phys. Chem. A 106, 7294 (2002).

[60] A. G. Taube and R. J. Bartlett, J. Chem. Phys. accepted (2007).

[61] W. D. Taylor, T. D. Allston, M. J. Moscato, G. B. Fazekas, R.Kozlowski, and G. A.Takacs, Int. J. Chem. Kin. 12, 231 (1980).

[62] H. A. Taylor and V. V. Vesselovsky, J. Phys. Chem. 39, 1095 (1935).

[63] R. Bukowski, K. Szalewicz, and C. F. Chabalowski, J. Phys. Chem. A 103, 7322(1999).

[64] M. J. S. Dewar and W. Thiel, J. Am. Chem. Soc. 99, 4907 (1977).

[65] W. Thiel, J. Mol. Struc. (Theochem) 1-6, 398 (1997).

124

[66] W. Weber and W. Thiel, Theor. Chem. Acc. 103, 495 (2000).

[67] J. Stewart, Reviews in Computational Chemistry (VCH Publishers, 1990), vol. 1,p. 45.

[68] W. Thiel, Adv. Chem. Phys. 93, 703 (1996).

[69] D. E. Taylor, K. Runge, and R. J. Bartlett, Mol. Phys. 103, 2019 (2005).

[70] W. Thiel and A. Voityuk, Theor. Chem. Acc. 81, 391 (1992).

[71] A. Ourmazd, D. W. Taylor, J. A. Rentschler, and J. Bevk, Phys. Rev. Lett. 59, 213(1987).

[72] D. Muller, T. Sorsch, S. Moccio, F. Baumann, K. Evans-Lutterdodt, and G. Timp,Nature 399, 758 (1999).

[73] X. Y. Liu, D. Jovanovic, and R. Stumpf, Applied Physics Letters 86, 82104 (2005).

[74] A. A. Demkov, J. Orteg, O. F. Sankey, and M. P. Grumback, Phys. Rev. B 52, 1618(1995).

[75] L. Stixrude and M. S. T. Bukowinski, Science 250, 541 (1990).

[76] L. Stixrude and M. S. T. Bukowinski, Phys. Rev. B 44, 2523 (1991).

[77] M. Menon and K. R. Subbaswamy, Phys. Rev. B 55, 9231 (1997).

[78] M. Elstner, J. Phys. Chem. A 111, 5614 (2007).

[79] N. Otte, M. Scholten, and W. Thiel, J. Phys. Chem. A 111, 5751 (2007).

[80] M. P. Repasky, J. Chandrasekhar, and W. L. Jorgensen, J. Comput. Chem. 23, 1601(2002).

[81] P. Pulay, Mol. Phys. 17, 197 (1969).

[82] B. Dunlap, J. Connolly, and J. Sabin, J. Chem. Phys. 71, 3396 (1979).

[83] C. A. White, B. G. Johnson, P. M. W. Gill, and M. Head-Gordon, Chem. Phys. Lett.230, 8 (1994).

[84] C. A. White, B. G. Johnson, P. M. W. Gill, and M. Head-Gordon, Chem. Phys. Lett.253, 268 (1996).

[85] M. C. Strain, G. E. Scuseria, and M. J. Frisch, Science 271, 51 (1996).

[86] J. C. Burant, G. E. Scuseria, and M. J. Frisch, J. Chem. Phys. 105, 8969 (1996).

[87] M. Challacombe and E. Schwegler, J. Chem. Phys. 106, 5526 (1997).

125

[88] P. M. W. Gill, A. T. B. Gilbert, and T. R. Adams, J. Computat. Chem. 21, 1505(2000).

[89] W. Koch, B. Frey, J. F. S. Ruiz, and T. Scior, Z. Naturforsch 58a, 756 (2003).

[90] N. Oliphant and R. J. Bartlett, J. Chem. Phys. 100, 6550 (1994).

[91] C. S. Duris, ACM TOMS 6, 92 (1980).

[92] L. A. Curtiss, K. Raghavachari, P. Redfern, and J. A. Pople, J. Chem. Phys. 106,1063 (1997).

126

BIOGRAPHICAL SKETCH

I was born in Michigan but have done time in Ohio, Washington, Oregon, and now

Florida. My tumultuous career in science began in sixth grade when my science teacher

called me a dreamer (I know I am not the only one) for proposing the construction of an

elevator to the moon to avoid the costs associated with space flight. In hindsight, I should

not have specified my original design as a rigid structure. That same year I was awarded

a certificate as the classroom’s strangest kid. After making the transition to high school,

I doubled-up on math and science and finished all available classes in those areas by my

Junior year, resulting in a misspent Senior year.

Somehow I managed to get into the only college to which I applied. I arrived a

Freshman filled with youthful optimism and persuaded my academic advisor to enroll me

in a chemistry course 2 years beyond my level. I continued my Sophomore year without

the youthful optimism. I participated in two Research Experience for Undergraduates

programs sponsored by the National Science Foundation at Cornell and the University

of Chicago. In Chicago I worked for Prof. Karl Freed, essentially spending the summer

building overly complicated internal coordinate input files. I must have enjoyed it because

that same summer I tattooed myself with the mark of a quantum chemist, literally.

Eventually, I took an interdisciplinary BA in chemistry and physics from Reed College in

beautiful Portland, Oregon, in 2001.

127

adapted ab initio theory: a simplified kohn-sham...

Documents