
Discovering New Materials via a priori Crystal Structure Prediction

Eva Zurek

Department of Chemistry

University at Buffalo

State University of New York

Buffalo, NY 14260-3000, USA

July 27, 2014


Introduction and Scope

The atomistic structure of a solid determines its properties, many of which can be computed with reasonable accuracy via first-principles calculations. In fact, the computational prediction of new materials for specific applications hinges on the ability to predict their crystal structures prior to their synthesis. This is just one of the reasons for the growing interest in a priori crystal structure prediction (CSP). Another reason is that CSP techniques can be used in concert with experimental data to elucidate the structure of an already synthesized solid. This type of feedback loop between theory and experiment is of critical importance when experimental studies are not able to unambiguously uncover all of the structural intricacies. For example, it may be impossible to distinguish between two elements such as boron and carbon experimentally, and the positions of light elements like hydrogen cannot be resolved using X-ray diffraction. CSP is also tremendously valuable in situations where experiments are difficult or even impossible, for instance under the extreme pressures that prevail in the interiors of giant planets.

Sir John Maddox, a former editor of the journal Nature, realized the importance of CSP. In a much-quoted 1988 editorial1 Maddox wrote that: “One of the continuing scandals in the physical sciences is that it remains, in general, impossible to predict the structure of even the simplest crystalline solids from a knowledge of their chemical composition.” He continued by stating that: “Yet one would have thought that, by now, it should be possible to equip a sufficiently large computer with a sufficiently large program, type in the formula of the chemical and obtain, as output, the atomic coordinates of the atoms in a unit cell.” Two main obstacles need to be overcome to achieve Maddox’s dream. We will not dwell much on the first challenge: the accurate computational ranking of the stabilities of different atomic configurations. For the purpose of this chapter, it suffices to list the numerous program packages that can compute the energies or enthalpies of solids via empirical potentials or first-principles calculations, and to briefly discuss the difficulties associated with computations of free energies. The second challenge is the development of a clever algorithm that can efficiently search the potential energy landscape of a crystalline solid and locate the global minimum, as well as a number of important local minima, throughout the course of its exploration. Methods developed towards this end will be our main concern.

Specifically, we focus on algorithms best suited towards predicting the crystal structures of inorganic solids. Often a quantum mechanical description is required to accurately calculate the energies, and optimize the geometries, of these systems. We note that there have been a number of excellent reviews2–7 and books8, 9 published on this topic. Some of the techniques described in this chapter have been applied to predict the structures of finite inorganic clusters.10–12

Even though a few of the methods discussed here can be adapted to molecular solids, we will not consider them in much detail. The typically large unit cell sizes, weak intermolecular forces, large degree of conformational flexibility, and closeness in energy of many crystalline polymorphs require special considerations outside the scope of this chapter. A few perspectives on CSP for molecular crystals can be found in References 13–15. The Cambridge Crystallographic Data Center (CCDC) blind tests are held every few years to test the methodologies available for predicting the structures of crystals composed of organic molecules.16–20 We are also not concerned with CSP via high-throughput means, where possible structural candidates are proposed using information extracted from mining experimental data sets, and quantum mechanical calculations are employed to obtain their stability ranking.21, 22


This chapter is organized as follows: After defining important terms and discussing the properties of potential energy landscapes, we briefly outline computational techniques employed to calculate the energies and optimize the geometries of crystalline materials. This is followed by a description of some of the methods most widely used for CSP of inorganic solids. One class of techniques, evolutionary algorithms (EAs), takes center stage in this chapter. Because our goal is to educate novices, we have included a section designed to answer questions that are typically not covered in most reviews: those dealing with the practical aspects of carrying out an effective evolutionary structure search. The penultimate section presents examples of recent original research where the XTALOPT evolutionary algorithm was employed to predict the structure of crystalline solids. Challenges awaiting solutions are outlined in the conclusion.

Crystal Lattices and Potential Energy Surfaces

A crystal lattice can be described as an array of boxes repeating infinitely in three dimensions. A single box is called a unit cell, and the shape of the cell is determined by six parameters: the lengths of the three lattice vectors (a, b, c), and the angles between them (α, β, γ). Each cell is filled with N atoms whose positions are defined by three Cartesian coordinates, and there are 3N − 3 degrees of freedom associated with the atomic positions (a rigid translation of all of the atoms leaves the crystal unchanged). So, a total of 3N + 3 degrees of freedom are required to describe the unit cell shape and its contents. A primitive cell is the smallest unit that can be employed to build the crystal via translations alone. Sometimes it may be useful to represent the structure by a cell that exhibits the symmetry of the lattice more clearly, but is larger than the primitive one. This building block is called a conventional cell. In Fig. 1(a), we illustrate both the primitive and conventional unit cell for the given two dimensional lattice.
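The bookkeeping above can be made concrete with a short sketch. The function names (`cell_matrix`, `cell_volume`, `degrees_of_freedom`) and the orientation convention (first lattice vector along x, second in the xy-plane) are assumptions chosen for illustration; they are not taken from any particular CSP code.

```python
import numpy as np

def cell_matrix(a, b, c, alpha, beta, gamma):
    """Return a 3x3 matrix whose rows are the lattice vectors (angles in degrees)."""
    al, be, ga = np.radians([alpha, beta, gamma])
    # Assumed convention: first vector along x, second vector in the xy-plane.
    v1 = [a, 0.0, 0.0]
    v2 = [b * np.cos(ga), b * np.sin(ga), 0.0]
    cx = c * np.cos(be)
    cy = c * (np.cos(al) - np.cos(be) * np.cos(ga)) / np.sin(ga)
    cz = np.sqrt(c**2 - cx**2 - cy**2)
    return np.array([v1, v2, [cx, cy, cz]])

def cell_volume(cell):
    """Volume of the parallelepiped spanned by the three lattice vectors."""
    return abs(np.linalg.det(cell))

def degrees_of_freedom(n_atoms):
    """Six cell parameters plus 3N - 3 atomic coordinates, i.e. 3N + 3."""
    return 6 + (3 * n_atoms - 3)

cubic = cell_matrix(2.0, 2.0, 2.0, 90, 90, 90)
hexagonal = cell_matrix(1.0, 1.0, 2.0, 90, 90, 120)
print(cell_volume(cubic), cell_volume(hexagonal), degrees_of_freedom(4))
```

Storing the cell as a matrix of row vectors means that fractional coordinates convert to Cartesian ones through a single matrix product, which is why this representation is convenient in structure-search codes.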

A number of symmetry operations, including inversions, reflections and rotations, may relate the positions of some of the atoms within the cell. The smallest unique part of the structure is called the asymmetric unit, and application of the symmetry operations to this entity gives rise to the full unit cell. The same crystal structure can be described in a number of ways, differing in the size of the cell, the centering of the atoms within it, and the choice of the lattice parameters. For example, Figs. 1(b) and 1(c) illustrate two different unit cells that define the same lattice. To overcome the ambiguity present in describing unit cells, it is desirable to convert them to some sort of standard reduced cell. One type of reduced cell is the Buerger cell;23 the primitive cell having the shortest possible lattice vectors. Another is a Niggli cell;24 the cell within which the lattice vectors have been transformed into their ‘most cubic’ form. The dashed lines in Figs. 1(b) and 1(c) denote a 3 × 2 supercell of the two dimensional primitive cell.

Seven different lattice types, which differ in the allowed lattice vectors and angles, are possible: triclinic, monoclinic, orthorhombic, tetragonal, rhombohedral, hexagonal and cubic. There are four different lattice centerings: primitive, body-centered, face-centered, and base-centered. Coupling the seven lattice systems with the lattice centerings gives rise to 14 unique Bravais lattices. Combining the symmetry operations present in a cell with the translational symmetry of the lattice yields a total of 230 space groups (symmetry groups) that can describe all possible three dimensional crystal structures. Hermann-Mauguin notation is often used to represent the different space groups. In the section ‘Practical Aspects of Carrying out an Evolutionary Structure Search’ we provide the reader with a list of computational tools that may be employed to determine the space group of a system given the Cartesian coordinates of the lattice vectors, and the positions of the atoms in the cell. The unit cell of a two dimensional system may be described by two lattice vector lengths plus one angle, and there are 17 distinct symmetry groups (wallpaper groups). So, adding the third dimension increases the number of symmetry groups by an order of magnitude.

Predicting the crystal structure of the most stable configuration for a particular stoichiometry boils down to finding the unit cell parameters and atomic positions that minimize the thermodynamic quantity of interest, provided the number of atoms comprising the unit cell is known. If the number of formula units (FUs) within the primitive unit cell is unknown too (as is often the case), predicting the global minimum becomes significantly harder. In principle, one needs to find the most stable configuration for unit cells containing all possible numbers of FUs and compare their energies, which is clearly an impossible feat! In practice, structure searches are typically carried out for cells containing reasonable values of N.

Figure 1: (a) The primitive unit cell for this two dimensional lattice is a parallelogram and is denoted by the vectors p1 and p2. The area of the conventional cell, denoted by the cell vectors c1 and c2, is twice as large as that of the primitive cell, but it illustrates the rectangular symmetry of the lattice more clearly. A second structure described with two different unit cells, each spanned by the vectors a and b, is shown in (b) and (c). The unit cells used to describe it differ, with the former being a square, and the latter a parallelogram. The cell illustrated in (b) corresponds to both the Buerger cell and the Niggli cell. Note that different shades of gray are used to illustrate the two types of atoms (color available in e-books).

The most stable structure at a given temperature, T, and pressure, P, is the one that has the lowest Gibbs free energy, G, given as

G = E + PV − TS ≡ H − TS, [1]

where E is the internal energy at the temperature T, V is the volume, S is the entropy and H is the enthalpy. Since the most stable configuration of atoms at a given pressure may be temperature dependent, CSP should, in principle, be carried out at finite temperatures. The zero point energy (ZPE) and terms in the enthalpy and entropy depend on the vibrational degrees of freedom of the crystal lattice. But calculations of the vibrational modes can be exceedingly time consuming, making it impractical to include them during the course of the crystal structure search. Therefore, the energy or enthalpy, evaluated at zero Kelvin and excluding the ZPE, is typically employed to infer stability instead. When solids are subject to external pressures, the PV term becomes increasingly important, so the enthalpy becomes the relevant thermodynamic quantity. Within this chapter we will use both of the terms ‘enthalpy’ and ‘energy’ when discussing crystal structure stability, keeping in mind that at zero pressure and zero temperature, and neglecting the ZPE (as is common practice in most crystal structure searches), the two become equivalent. A phase refers to a particular atomic configuration which can persist over some range of temperatures and pressures, and the domain of existence of various phases is illustrated by a phase diagram.
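To illustrate how the PV term can change the stability ranking, the following sketch evaluates H = E + PV at zero temperature for two phases. The energies and volumes of ‘phase A’ and ‘phase B’ are invented numbers, and the helper names are hypothetical; only the unit conversion (1 eV/Å³ ≈ 160.2177 GPa) is a physical constant.

```python
GPA_PER_EV_A3 = 160.21766208  # 1 eV/Angstrom^3 expressed in GPa

def enthalpy(energy_ev, volume_a3, pressure_gpa):
    """H = E + PV at T = 0 (no ZPE), in eV per cell."""
    return energy_ev + pressure_gpa * volume_a3 / GPA_PER_EV_A3

def transition_pressure(e1, v1, e2, v2):
    """Pressure (GPa) at which two phases have equal enthalpy, assuming
    that E and V of each phase do not change with pressure (a crude
    approximation used only for illustration)."""
    return (e2 - e1) / (v1 - v2) * GPA_PER_EV_A3

# Invented example data: phase A has the lower energy, phase B is denser.
phase_a = {"energy_ev": -10.00, "volume_a3": 20.0}
phase_b = {"energy_ev": -9.80, "volume_a3": 17.0}

for p in (0.0, 20.0):
    h_a = enthalpy(pressure_gpa=p, **phase_a)
    h_b = enthalpy(pressure_gpa=p, **phase_b)
    favored = "A" if h_a < h_b else "B"
    print(f"P = {p:5.1f} GPa: H_A = {h_a:.3f} eV, H_B = {h_b:.3f} eV, favored: {favored}")

print(transition_pressure(-10.00, 20.0, -9.80, 17.0))
```

In a real search the energy and volume of every candidate are re-optimized at each pressure, so the fixed-E, fixed-V estimate above should be read only as a qualitative guide to why denser phases are favored under compression.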

A potential energy surface (PES) or potential energy landscape illustrates how the energy of a collection of atoms varies as a function of some set of coordinates (such as interatomic distances, angles or dihedrals), which represent the relative position of the atoms. In a diatomic molecule, the energy depends only on the bond length, which makes it easy to graphically illustrate the PES as a plot of E versus r. When two degrees of freedom (i.e., two bond lengths) define the system, they may be plotted separately on x and y axes, and the energy values to which they correspond can be illustrated in the same way that the height of the land is shown in a topographic map. Because each crystal lattice possesses 3N + 3 degrees of freedom, 3N + 4 axes would be required to plot the dependence of the energy on all of the atomic positions and lattice vectors. For this reason, the multidimensional PESs of crystals are sometimes referred to as hypersurfaces or hyperspaces.

In Fig. 2, we illustrate a schematic diagram of a one dimensional (1D) PES, which shows how the energy depends on some abstract degree of freedom. The valleys or low points in this plot correspond to local minima. The lowest energy structure is called the global minimum, and the other minima may be metastable or kinetically stable provided that the barriers between them (transition states) are sufficiently high in energy. To verify that the lattices found in a CSP search are dynamically stable, and therefore correspond to local minima, it is necessary to calculate their vibrational normal modes and confirm they are all real. A basin of attraction contains all of the configurations that will optimize to the same minimum via small-step downhill relaxations, and a super-basin is an area in the PES containing a number of neighboring basins. If it is possible to reach the lowest energy point within a super-basin without overcoming any large barriers, then this is referred to as a funnel.
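The notion of a basin of attraction can be sketched numerically. The 1D model PES below (a tilted cosine) is an invented function, not the surface drawn in Fig. 2, and plain fixed-step gradient descent stands in for the ‘small-step downhill relaxations’; both are assumptions made purely for illustration.

```python
import numpy as np

def energy(x):
    # Hypothetical 1D PES: a shallow parabola modulated by a cosine.
    return 0.1 * (x - 0.5) ** 2 + np.cos(3.0 * x)

def gradient(x):
    return 0.2 * (x - 0.5) - 3.0 * np.sin(3.0 * x)

def relax(x0, step=0.01, n_steps=5000):
    """Small-step steepest descent; works on a whole array of starting points."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step * gradient(x)
    return x

rng = np.random.default_rng(0)
starts = rng.uniform(-4.0, 4.0, size=400)      # random trial 'structures'
relaxed = relax(starts)

# Starting points that relax to the same minimum belong to the same basin.
minima, counts = np.unique(np.round(relaxed, 4), return_counts=True)
global_min = minima[np.argmin(energy(minima))]

for x_min, n in zip(minima, counts):
    print(f"minimum at x = {x_min:+.4f}, E = {energy(x_min):+.4f}, basin hits = {n}")
print("lowest minimum found at x =", global_min)
```

Counting how many random starting points relax to each minimum gives a rough estimate of the relative basin widths. In this toy function the basins are of similar size, so it should not be read as a statement about the basin sizes of real crystalline PESs.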

The degree of difficulty associated with predicting the global minimum for a particular crystalline lattice depends on how many atoms make up the unit cell, and on their identity. In a homogeneous system composed of N atoms with ns(N) local minima, the energy remains invariant to permutations of all of the atoms, so there are a total of N! equivalent configurations. Because permutations of atoms of different types give rise to non-equivalent structures whose energies differ from each other (unless otherwise dictated by symmetry), the situation becomes much more complicated for a heterogeneous system. Consider for example a binary alloy with the formula AmBn, where m + n = N. If the coordinates of the atoms are kept fixed (independent of the type of atom that occupies them), the number of permutations yielding distinct structures, Nperm, is

$N_{\mathrm{perm}} = \dfrac{N!}{n!\,m!} = \dfrac{N!}{n!\,(N-n)!}$, [2]

neglecting point group symmetry.4 To put it simply, as the number of different types of atoms in a system increases, so does the number of possible local minima, and finding the global minimum becomes progressively more difficult.
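A few numbers make the growth described by Eq. 2 tangible. The function `n_perm` below is a hypothetical helper; its extension to more than two atom types via the multinomial coefficient is a standard combinatorial generalization and is not taken from the text. As in Eq. 2, symmetry is neglected.

```python
import math

def n_perm(counts):
    """Number of distinct ways to distribute atoms over N fixed sites,
    given the number of atoms of each type: N! / (n_1! n_2! ...)."""
    total = math.factorial(sum(counts))
    for n in counts:
        total //= math.factorial(n)
    return total

for label, counts in [("A8", [8]), ("A2B6", [2, 6]), ("A4B4", [4, 4]), ("A2B2C4", [2, 2, 4])]:
    print(f"{label:7s} -> {n_perm(counts)} distinct decorations")
```

For a binary system the result coincides with the binomial coefficient of Eq. 2, and adding a third atom type at the same cell size increases the count further.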

Figure 2: A schematic of a 1D PES illustrating how the energy of the system varies as a function of one of the degrees of freedom. It contains a large number of local minima (LM) which are separated by barriers; only one local minimum is pointed out. The global minimum (GM) is the lowest energy point. A basin contains all of the configurations that will optimize to the same local minimum. To better illustrate one of the basins we have shaded it in gray. This PES contains two funnels, one of which is explicitly denoted. The transition state (TS) between these two funnels is labeled.

Another problem in CSP stems from the fact that the number of local minima in a PES increases exponentially with the number of atoms comprising the system. To illustrate why this is so, let us follow Stillinger’s analysis,25 which begins with the supposition that a large system composed of N atoms can be subdivided into M regions of equal size, such that each subsystem contains N/M atoms. If these subsystems are big enough, then the atomic configurations within them will be independent of the structures assumed in a neighboring region. Each subsystem will possess ns(N/M) minima, and the total number of local minima for the large system, ns(N), is given as

$n_s(N) = \left[n_s(N/M)\right]^{M}$. [3]

The solution to this equation yields the proposed exponential dependence

ns(N) = exp(αN), [4]

where α is a system-dependent constant. The exponential increase does not bode well for the success of CSP, especially for large systems. In fact, it has been shown that finding the global minimum of homogeneous26 and heterogeneous27 clusters is an NP-hard (non-deterministic polynomial-time-hard) problem. This is a class of problems for which it is believed that there is no algorithm that scales as a polynomial in the number of degrees of freedom. In addition, the ‘No Free Lunch’ Theorems show that any searching and optimization algorithm that performs well on one class of problems will perform poorly on another class.28 In the context of CSP, this implies that all algorithms will give equivalent success rates when averaged over all PESs.
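The step from Eq. 3 to Eq. 4 can be spelled out as follows. This is a brief sketch that assumes Eq. 3 holds for every admissible subdivision M and treats N as a continuous variable; the auxiliary function f is introduced here only for convenience.

```latex
\begin{align*}
  \ln n_s(N) &= M \,\ln n_s(N/M)
    && \text{(logarithm of Eq.~3)} \\
  f(N) &= M\, f(N/M), \qquad f(N) \equiv \ln n_s(N)
    && \text{(valid for all admissible } M\text{)} \\
  f(N) &= \alpha N
    && \text{(a function homogeneous of degree one is linear)} \\
  n_s(N) &= \exp(\alpha N)
    && \text{(Eq.~4)} \\[4pt]
  \text{Check:}\quad \left[n_s(N/M)\right]^{M}
    &= \left[\exp(\alpha N/M)\right]^{M} = \exp(\alpha N) = n_s(N).
\end{align*}
```

The constant α therefore measures the logarithm of the number of minima contributed per atom, which is why it depends on the system.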

The combinatorial problem encountered for systems with many different atom types, an exponential increase in the number of local minima, and the inability to devise an algorithm that works well for all systems, is certainly discouraging for CSP. Despite this, one should not lose all hope. There are, in fact, a number of encouraging properties of PESs, which can be leveraged in order to find clever solutions to the CSP problem. They are as follows:


1. Only certain regions of the PES are chemically relevant. Consider for example two atoms that are pushed so close together that they repel strongly, or are pulled so far apart that they can no longer bond. The energies of these structures will be very high. In addition, the regions in the PES that correspond to chemically irrelevant configurations, such as these, possess very few minima.

2. Studies have suggested that low energy basins occupy the largest amount of ‘space’ within the multidimensional PES.29 This implies that in a 1D PES, the lowest energy basin would take up the largest area, for example. Therefore, a randomly generated structure would have the greatest probability to fall within this basin.

3. The Bell-Evans-Polanyi principle30, 31 states that the activation energy, Ea, and enthalpy of reaction, ∆H, within a series of closely related reactions are related by

Ea = E0 + α∆H, [5]

where E0 is the activation energy of a reference reaction of the same class, and 0 ≤ α ≤ 1. This implies that highly exothermic chemical reactions will have a low activation energy, and that the barriers between low-lying free energy minima in a PES are expected to be small. Moreover, low-energy basins are likely to be found close to one another, just like they are within the funnels illustrated in Fig. 2.

Because of the first point, a smart CSP algorithm need not explore the chemically unreasonable regions of the PES. Such configurations can simply be ruled out prior to performing expensive structural relaxations. If the algorithm fails to filter out some very high energy individuals, structural relaxation will lead them into a different region of the PES that contains local minima. The second item suggests that the probability is higher for a randomly generated structure to fall within a low-energy basin than within one that has a higher energy. In fact, the deepest minimum ought to occupy the most ‘space’ in the PES and, therefore, be the easiest to reach. Moreover, because of the Bell-Evans-Polanyi principle, it is likely that low-energy basins are clustered close to each other, and that the barriers separating one basin from another are small. This means that once the search has landed in a basin with a low energy, small structural changes sampling the surrounding PES should eventually find the most stable configuration within the funnel to which the basin belongs. If the PES contains many funnels, which are far apart from one another, finding the global minimum will be more difficult than if only a single funnel were present. Worse, however, would be a featureless PES that does not display any regions of attraction. Provided that the PES contains at least a single funnel, the aforementioned properties of energy landscapes suggest that a search that thoroughly samples the PES (excluding the chemically impossible regions), and performs an in-depth exploration of the regions near the lowest-energy minima encountered, is likely to find the global minimum.

During the course of a crystal structure search, a plethora of local minima will be encountered on the road towards the lowest energy structure. A number of these low energy metastable structures may also be important and warrant further analysis. Such phases could potentially be synthesized in experiment by manipulating the variables of temperature and pressure, or by utilizing techniques that yield structures far from equilibrium. A famous example of a material that is metastable at 1 atm, important because of its beauty and strength, is diamond. It is less stable than graphite at ambient conditions and forms within the earth at pressures of 4.5-6 GPa (1 GPa = 9869 atm). But, since the barriers to decomposition are so high (as a result of the strong C-C sp3-bonds), diamond does not convert to graphite when the pressure is released. Synthetic diamonds can also be made at low pressures by using chemical vapor deposition (CVD). In the context of materials prediction, it is important that a crystal structure search is able to identify both graphite and diamond at atmospheric conditions.

Calculating Energies and Optimizing Geometries

Most of the automated CSP algorithms described in this chapter must be interfaced with an external software package that performs structural relaxations. These external programs provide the CSP codes with energies and geometries needed to construct the next trial structure. They are typically much slower than the CSP routines and represent the bottleneck in the entire CSP procedure. Because of this, one typically chooses input parameters that speed up the geometry optimizations. But care must be taken in striking the right balance between accuracy and speed. After all, the relaxed geometries and energies must still be sufficiently accurate so that they can provide the CSP algorithms with meaningful guidance in the construction of plausible structural candidates. Generally speaking, two types of methods can be employed to optimize geometries: those which calculate the energy of a particular configuration and the forces acting on the atoms via interatomic potentials, and those which attempt to approximately solve the Schrödinger equation (within the Born-Oppenheimer approximation) for the given atomic configuration.

Not surprisingly, the first set of methods is significantly faster than the second. Because interatomic potentials use a simple force field, which depends upon, for example, pairwise interactions, they make it possible to tackle systems with large unit cells that would be prohibitively expensive for first-principles techniques. However, potentials parameterized for the system being investigated may not be available. And even when they are, it may not be clear how well they will do at describing the nuances of the PES. A potential might represent the system accurately around the regions of the PES where the fit was performed, but perform poorly otherwise, giving rise to incorrect ground states or spurious local minima. A plethora of potential forms can be used for inorganic systems including the Lennard-Jones and Morse potentials, as well as the embedded-atom method (EAM). Two software packages that can use interatomic potentials for structural relaxations are GULP (General Utility Lattice Program)32–34 and LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator).35
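To show how inexpensive a pairwise potential is to evaluate, the sketch below implements the textbook Lennard-Jones and Morse forms in reduced units and sums them over the pairs of a small, non-periodic cluster. The function names and default parameters are illustrative assumptions and are unrelated to the input conventions of GULP or LAMMPS.

```python
import numpy as np

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """V(r) = 4*eps*[(sigma/r)^12 - (sigma/r)^6]; minimum of -eps at r = 2^(1/6)*sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def morse(r, depth=1.0, width=1.0, r0=1.0):
    """V(r) = D*[(1 - exp(-a*(r - r0)))^2 - 1]; minimum of -D at r = r0."""
    return depth * ((1.0 - np.exp(-width * (r - r0))) ** 2 - 1.0)

def cluster_energy(positions, pair_potential):
    """Sum a pair potential over all unique atom pairs of a finite cluster."""
    positions = np.asarray(positions, dtype=float)
    i, j = np.triu_indices(len(positions), k=1)
    r = np.linalg.norm(positions[i] - positions[j], axis=1)
    return float(np.sum(pair_potential(r)))

# Equilateral triangle with all three bonds at the Lennard-Jones minimum.
r_min = 2.0 ** (1.0 / 6.0)
triangle = r_min * np.array([[0.0, 0.0, 0.0],
                             [1.0, 0.0, 0.0],
                             [0.5, np.sqrt(3.0) / 2.0, 0.0]])
print(cluster_energy(triangle, lennard_jones))  # close to -3 in units of epsilon
print(cluster_energy(triangle, morse))
```

A periodic solid additionally requires a sum over lattice images up to a cutoff radius, which production codes handle internally.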

First-principles density functional theory (DFT) methods provide a good balance between speed and accuracy, so they may be routinely employed to locally optimize the geometries of solids containing a few hundred atoms, followed by calculations of their electronic structure, reactivity, and properties. The drawback to DFT calculations is that there is no straightforward way to improve their accuracy (as there is in quantum chemical correlated wavefunction based methods), and sometimes the results of the calculation depend strongly on the functional that is used. Nonetheless, because each CSP search necessitates the relaxation of hundreds or thousands of low-symmetry structural candidates, DFT remains the method of choice when accurate potentials are not available. A few of the computational packages that are commonly employed to perform DFT calculations on extended systems include VASP,36–39 CASTEP,40 Quantum ESPRESSO (PW-SCF),41 ABINIT,42 Crystal,43 SIESTA,44 ADF-BAND45–47 and Gaussian.48

Because of the immense computational expense involved in a CSP search, such studies could only be carried out using empirical potentials until about a decade ago. The spectacular advances in computer hardware, coupled with software developments, have made it possible to routinely employ first-principles calculations for CSP of systems containing up to ∼50 atoms in the primitive cell. Nowadays, interatomic potentials and CSP algorithms are being combined in new and unique ways. For example, it has been shown that structure prediction programs may be employed to test the validity of existing potentials,49 and to design new ones.50

Methods to Predict Crystal Structures

Structure prediction is ultimately a global optimization problem, where the atomic positions and unit cell parameters are the variables, and the multidimensional PES represents the function to be minimized. In fact, many of the algorithms described below are well-known metaheuristics designed to find good solutions to optimization problems. Except for the simplest systems, it cannot be guaranteed that any of them will uncover the globally optimal solution. The only way to verify that the global minimum has been located is to carry out a comparison of all of the local minima. Such a brute force systematic approach is impossible in CSP for anything but the most trivial systems. Global optimization algorithms have been applied to diverse fields, ranging from electrical circuit design to protein folding. In this section, we briefly outline how a number of these techniques have been adapted towards the CSP problem.

We first present the simplest algorithms, and then build upon them to illustrate more complicated methods. Calculations of the phonons, or vibrational normal modes, are required to verify that the structures found via the automated search techniques are local minima. If they are not, local minima can be obtained by following soft phonon modes. Simulated annealing, metadynamics, basin hopping, and minima hopping are conceptually related in the sense that they attempt to find the global minimum by overcoming energy barriers in the potential energy landscape. They are particularly good at carrying out a fine exploration of a specific region of the PES, but typically do not roam far from their initial starting point. Because of this, they perform the best given a good starting structure, but may do poorly otherwise. Random searches, the particle swarm optimization (PSO) technique, genetic algorithms (GAs) and evolutionary algorithms (EAs) do a much better job of sampling the entire PES. One drawback to a purely random search is that it is not able to learn from its history. Since learning is incorporated into the PSO method as well as GAs and EAs, they are sometimes given as examples of artificial intelligence. These three approaches explore the whole PES while simultaneously zooming in on the most promising regions, so they should be the methods of choice when the characteristics of the structures being searched for are wholly unknown.

The reader will notice that a number of aspects of the various techniques overlap with one another, and at times it is difficult to determine where one method ends and the other begins. In addition, there is great freedom to combine two or more algorithms, or to merge computations with experimental input. A couple of hybrid approaches are also outlined in this section. It is important to note that many of these techniques are stochastic in nature, so two runs employing the same user input can, in principle, explore very different regions of the PES, and find different minima. Because of this, it is difficult to benchmark the various methods and to determine which ones work best for a given set of problems. The use of these techniques in materials prediction is steadily becoming more commonplace, and further developments which enhance their utility towards more complex systems are foreseeable in the near future.

Following Soft Vibrational Modes

Perhaps the most straightforward CSP method starts with geometry optimizations of a few simple lattices possessing high symmetry, followed by calculations of their vibrational normal modes (phonons). If all of the phonon modes are real, the structure corresponds to a local minimum on the PES. An imaginary mode indicates a structural instability. If this mode is located at the zone center, Γ, then some distortion of the structure that maintains the same unit cell will possess a lower energy. If the imaginary mode is found in a different region of the Brillouin zone, the size of the unit cell used in the original calculation was too small. In this case, a stable structure can be obtained by expanding the unit cell in at least one direction, followed by a distortion that breaks the symmetry of the original system. This method works only if the starting structure resembles the target minimum.

Consider a hypothetical 1D chain of hydrogen atoms whose geometry has been optimized under the constraint that each unit cell contains a single atom. This choice of lattice requires all of the interatomic distances along the chain to be the same, as shown in Fig. 3(a). But at conditions of ambient temperature and pressure, this structure does not correspond to a minimum on the PES. As every chemist intuitively knows, such a 1D chain must be energetically less stable than a configuration where neighboring hydrogen atoms have paired up to form H2 molecules, see Fig. 3(c). So, phonon calculations carried out on the non-molecular 1D chain would reveal a large instability off the zone center.

Figure 3: (a) A 1D chain of hydrogens where the distance between all of the atoms, a, is the same. This configuration is unstable with respect to a dimerized system (c) with one short (a′) and one long (a′′) H-H distance. (b) A schematic illustration of the phonon mode that transforms the 1D chain in (a) to the arrangement of H2 molecules in (c).

In practice, a given structure may yield many imaginary modes, and the one with the largest magnitude is often referred to as the softest mode. To find a local minimum, one may displace the atoms along the eigenvector (the displacement pattern) of this soft mode, followed by an optimization of the structure which preserves only those symmetry elements that remain after the distortion. In the case of the 1D hydrogen chain, this procedure would result in a doubling of the unit cell, along with a shortening of one of the H-H contacts and a concomitant lengthening of the other, as expected. Fig. 3(b) illustrates this alteration in the H-H distances. At times, it may not be possible to reach a minimum by a single perturbation of the lattice. In such a situation, the procedure of calculating the phonon spectrum and following the imaginary phonons is repeated in an iterative manner until a local minimum is obtained.
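The dimerization of the 1D chain can be mimicked with a toy model. The sketch below assumes nearest-neighbor Morse interactions with roughly H2-like parameters and a fixed, stretched spacing of 1.5 Å. All numbers are illustrative, and a pair potential is a crude stand-in for the quantum mechanical treatment a real hydrogen chain would require. A negative curvature of the energy along the alternating displacement plays the role of the imaginary zone-boundary phonon.

```python
import math

D, ALPHA, R0 = 4.75, 1.94, 0.74   # illustrative Morse parameters (eV, 1/A, A)
A = 1.5                           # spacing of the uniform chain (A), kept fixed

def morse_d1(r):
    """First derivative of the Morse potential."""
    e = math.exp(-ALPHA * (r - R0))
    return 2.0 * D * ALPHA * (1.0 - e) * e

def morse_d2(r):
    """Second derivative of the Morse potential."""
    e = math.exp(-ALPHA * (r - R0))
    return 2.0 * D * ALPHA ** 2 * (2.0 * e * e - e)

def energy_gradient(d):
    """dE/dd for a doubled cell with bond lengths A - d and A + d."""
    return morse_d1(A + d) - morse_d1(A - d)

def follow_soft_mode(d0=0.01, step=0.01, n_steps=5000):
    """Nudge the chain along the dimerizing distortion, then relax downhill."""
    d = d0
    for _ in range(n_steps):
        d -= step * energy_gradient(d)
    return d

curvature = 2.0 * morse_d2(A)     # d2E/dd2 at d = 0; negative means unstable
d_final = follow_soft_mode()
short_bond, long_bond = A - d_final, A + d_final
print(f"curvature at d = 0: {curvature:.3f} eV/A^2")
print(f"short bond = {short_bond:.3f} A, long bond = {long_bond:.3f} A")
```

Within this model the uniform chain is a stationary point but not a minimum. A small displacement along the alternating pattern followed by relaxation yields one short and one long contact, in qualitative agreement with Fig. 3.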

The method of following soft modes has been used extensively to predict low-symmetry distortions in cubic perovskites. For example, it has been applied towards BiAlO3, BiGaO3,51 BaTiO3, PbTiO3, PbZrO3,52 and MgSiO3,53 some of which are important ferroelectric materials. It is highly unlikely that the automated CSP methods described in this section will yield structures that have imaginary modes at the zone center, Γ, but instabilities may be present in different regions of the Brillouin zone. Because of this, the method of following soft modes can be combined with these global optimization schemes in order to identify local and, perhaps, global minima.

Random (Sensible) Structure Searches

The simplest automated CSP technique relies on the local optimization of structures whose atomic coordinates and lattice vectors have been chosen randomly, as illustrated in Fig. 4(a). But, because completely random choices are likely to yield numerous structures which lie far away from any local minimum, imposing user-defined constraints, which make good chemical sense, can dramatically speed up the time associated with each optimization and increase the likelihood that the search will succeed. Perhaps the two most important constraints which need to be specified are the unit cell volumes (often a range is given) and the minimum interatomic distances, as shown in Fig. 4(b). If experimental data is available, or if structural analogies can be made with known systems, then these constraints can be employed to further bias a search. For example, one can impose symmetry constraints on the randomly generated lattices, and specify interatomic connectivities by using clusters or molecules as building blocks for the solid. The randomly generated structures are relaxed to their nearest stationary point, which may correspond to a local minimum.
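The two constraints singled out above, a volume range and a minimum interatomic distance, can be sketched in a few lines. The example below is a minimal illustration under strong assumptions: the cell is cubic, all atoms are of one type, and the cell edge is longer than the minimum distance. The function names and parameters are hypothetical and are not taken from any particular CSP code.

```python
import random


def min_image_distance(a, b, length):
    """Distance between fractional points a and b in a cubic cell of edge `length`."""
    total = 0.0
    for ai, bi in zip(a, b):
        d = ai - bi
        d -= round(d)  # wrap the separation into [-0.5, 0.5] (nearest periodic image)
        total += (d * length) ** 2
    return total ** 0.5


def random_sensible_cell(n_atoms, vol_min, vol_max, d_min, rng, max_tries=10000):
    """Draw a cubic cell in the given volume range and fill it with atoms that
    are at least d_min apart; raise if no such arrangement is found."""
    length = rng.uniform(vol_min, vol_max) ** (1.0 / 3.0)
    atoms = []
    for _ in range(max_tries):
        if len(atoms) == n_atoms:
            return length, atoms
        trial = (rng.random(), rng.random(), rng.random())
        # Reject trial positions that come too close to an atom already placed.
        if all(min_image_distance(trial, other, length) >= d_min for other in atoms):
            atoms.append(trial)
    if len(atoms) == n_atoms:
        return length, atoms
    raise RuntimeError("could not place all atoms; relax the constraints")


length, atoms = random_sensible_cell(4, 30.0, 40.0, 1.0, random.Random(0))
print(length, atoms)
```

Each cell generated this way would still need to be relaxed to its nearest stationary point by an external code.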

Pickard and Needs have referred to the aforementioned procedure as ‘generating random sensible structures’ (RSS).6 In addition to this purely stochastic procedure, their method incorporates random atomic displacements and random unit cell deformations of particularly stable structures. This is referred to as shaking. The shake strategy can be employed to overcome barriers between basins. Because the funnel containing the lowest energy basin assumes the largest amount of ‘space’ in the PES, Pickard and Needs have argued that a randomly generated structure may have a reasonable probability of falling within this funnel. Furthermore, the shake procedure will enhance exploration of this funnel, helping to lead the search to the global minimum. When coupled with first-principles calculations, this method has been referred to as the ab initio random structure searching (AIRSS) method. AIRSS, and other random-search techniques used by various authors, have been successfully employed to predict the structures of a wide range of systems including the molecular crystalline phase II of ammonia monohydrate,54 hydrogen55, 56 and lithium under conditions of extreme pressure,57, 58 compressed alloys of Li and Be,59 as well as mixtures of hydrogen/silane (SiH4(H2)2)60 and water/hydrogen (HnO, n > 2)61 under pressure.

Figure 4: (a) A diagram illustrating the generation of random structures in a 1D PES, followed by optimization to the nearest local minimum (bold dashed arrow). (b) An illustration of a ‘sensible’ and a not so ‘sensible’ unit cell. In the latter, the volume is too big, the atomic radii overlap, and the atom-types are not distributed homogeneously throughout the structure.

In a purely random search (whether it be sensible or not), each structure that is generated is independent of those that preceded it. Because of this, the algorithm is unable to learn about where the valleys and high-points in the PES are located during the course of the run. It can be argued that the shake operation in AIRSS takes this method out of the random search category, while at the same time introducing history-dependence and learning into the algorithm. Since the number of local minima increases exponentially with the number of atoms in the system, the probability of generating the global minimum randomly decreases rapidly with increasing system size. Therefore, a random search, which neither learns from history nor makes use of constraints that lower the number of degrees of freedom of the system, is typically limited to small cell sizes.

Simulated Annealing

The simulated annealing technique of Kirkpatrick, Gelatt and Vecchi62 is a global optimization algorithm that was inspired by the metallurgical process in which a substance is first heated, and then cooled until crystallization occurs in a controlled fashion, thereby removing structural defects. This method has been applied to a variety of different problems including the optimal design of computer circuits, as well as the traveling salesman problem (what is the shortest route that a salesman can take so that he visits every city on his list exactly once prior to his return to the origin?).

Figure 5: An illustration of the simulated annealing method in a 1D PES. New structures are generated by random displacement of atoms, or modifications of the unit cell. (a) At high temperatures, configurations whose energies are larger than that of the initial one may be accepted, whereas (b) at lower temperatures the only moves that can be made are those which lower the energy of the system. A quench run (at T = 0 K) is followed by optimization to the nearest local minimum (bold dashed arrow). [Panel titles: (a) T is large; (b) T is small, followed by local optimization.]


When adapted towards CSP, the method typically begins by calculating the energy of a randomly generated arrangement of atoms occupying a unit cell whose volume slightly exceeds the one expected for the ground state. The structure is perturbed, and the energy of the new configuration is calculated. The perturbations employed include random displacement of atoms, permutations of atoms of different types, and modifications of the unit cell shape or the lattice constant. The procedure employed to accept or reject structures corresponds to the well-known Metropolis Monte-Carlo algorithm.63 The algorithm decides whether to accept or reject the new arrangement based upon a probability that is calculated via p = exp(−∆E/kBT), where ∆E is the energy difference between the two configurations, kB is the Boltzmann constant, and T is the simulation temperature. A random number, ε, between 0 and 1 is chosen by the algorithm, and if ε < p the configuration is accepted and the cycle begins anew. A move that lowers the energy (∆E ≤ 0) gives p ≥ 1 and is therefore always accepted. If the structure is rejected, the algorithm attempts to find a new arrangement that will be accepted instead. It needs to be pointed out that the ‘temperature’, T, is not the physical temperature we are used to; rather it is a variable that controls the rejection rate.

In order for most configurations to be accepted, the simulation should begin at high temperatures where the starting structure begins to ‘melt.’ The temperature is gradually decreased during the simulation so that fewer high energy configurations are accepted, thereby mimicking the physical annealing process. Once a user-defined number of structures have been considered, a quench run is performed. The temperature in this step is set to 0 K, so that the only moves that are accepted are those which result in a decrease of the system’s energy. The quench run is followed by local optimization to the nearest minimum. The basic annealing procedure is illustrated in Fig. 5. More complicated annealing runs, where the temperature is sometimes raised during the simulation, have been devised in order to prevent the algorithm from getting stuck in local minima. Another variation of simulated annealing uses molecular dynamics to move the atoms instead, and the temperature employed during the simulation is continually lowered. Doll, Schön and Jansen have extensively employed simulated annealing in conjunction with Hartree-Fock and DFT calculations to predict the crystal structures of a wide range of inorganic systems3, 64 including lead sulfide,65 lithium66 and calcium carbide.67
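The annealing procedure described above (Metropolis acceptance, gradual cooling, a quench at T = 0, and a final local optimization) can be sketched on a one-dimensional toy landscape. Everything specific in this example is an assumption made for illustration: the quartic `energy` function, the geometric cooling factor, the step sizes, and the function names. The Boltzmann constant is folded into the fictitious temperature.

```python
import math
import random


def energy(x):
    # Hypothetical 1D PES: local minimum near x = 1.35, global minimum near x = -1.47.
    return x**4 - 4.0 * x**2 + x


def gradient(x):
    return 4.0 * x**3 - 8.0 * x + 1.0


def metropolis_accept(delta_e, temperature, rng):
    """Accept with probability min(1, exp(-delta_e / T)); k_B is folded into T."""
    if delta_e <= 0.0:
        return True
    if temperature <= 0.0:
        return False  # quench: uphill moves are never accepted
    return rng.random() < math.exp(-delta_e / temperature)


def local_optimize(x, step=0.01, tol=1e-8, max_iter=100000):
    """Steepest descent to the nearest stationary point."""
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x


def simulated_annealing(rng, x=1.35, t_start=5.0, cooling=0.995,
                        n_anneal=2000, n_quench=200, max_move=0.3):
    temperature = t_start
    for i in range(n_anneal + n_quench):
        if i >= n_anneal:
            temperature = 0.0  # quench run
        trial = x + rng.uniform(-max_move, max_move)  # random perturbation
        if metropolis_accept(energy(trial) - energy(x), temperature, rng):
            x = trial
        temperature *= cooling  # geometric cooling schedule (an assumption)
    return local_optimize(x)


x_final = simulated_annealing(random.Random(1))
print(x_final, energy(x_final))
```

The run always ends in a local minimum of the toy landscape, but whether it is the global one depends on the random seed and the cooling schedule, which mirrors the caveats discussed in the text.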

One drawback of simulated annealing stems from the fact that the PES at high temperatures may not coincide with the low temperature energy landscape. Consequently, during the cooling procedure the algorithm may inadvertently get stuck in a region of the PES which is close to a structure that resembles the global minimum at high temperatures. Furthermore, one must make a judicious choice regarding the magnitude of the structural perturbations. These steps must be small enough for the algorithm to learn from its history, but step sizes that are too small may make it impossible to escape from local minima. If a good starting structure is not known, it may be necessary to carry out multiple annealing runs, all starting from different configurations, in order to ensure adequate exploration of the entire PES.

Basin Hopping and Minima Hopping

The basin hopping procedure of Wales and Doye68 is another algorithm that explores the PES via a Monte-Carlo procedure. Unlike simulated annealing, however, in basin hopping each geometry is relaxed to the nearest local minimum and the energies of the minima are employed to determine whether or not a particular move is accepted. Because every configuration falling within a single basin optimizes to the same structure, it is often said that the relaxation transforms the PES into a series of step functions that connect the various local minima. The modification of the PES is known as hypersurface deformation,69 and it is illustrated in Fig. 6(a). Because this procedure effectively reduces the barriers between minima, the same temperature may be maintained throughout the whole run. As in simulated annealing, the choice of the step size is crucial for the technique to work well. If the step size is too small to move out of a basin, the structure will simply optimize to its predecessor. Significant structural modifications may be needed to hop into a neighboring basin. But, if the step sizes are too large, the algorithm basically behaves like a random search. The temperature of the run is also an important variable, and a non-optimal choice may prevent the global minimum from being found. Another drawback of the method is that there is no penalty associated with visiting the same basin numerous times. The basin hopping procedure was originally applied towards finite Lennard-Jones clusters,68 and ever since, this method has been used extensively to predict the local and global minima of a wide range of clusters.70–73
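A minimal sketch of basin hopping on a one-dimensional toy landscape is given below. The landscape, the fixed fictitious temperature, the hop size, and the function names are assumptions chosen for illustration. The point it makes is that every trial geometry is relaxed before the Metropolis test, so that only the energies of minima are compared.

```python
import math
import random


def energy(x):
    # Hypothetical 1D PES with several basins.
    return math.cos(3.0 * x) + 0.1 * (x - 0.5) ** 2


def gradient(x):
    return -3.0 * math.sin(3.0 * x) + 0.2 * (x - 0.5)


def relax(x, step=0.05, tol=1e-9, max_iter=100000):
    """Steepest descent to the bottom of the basin that contains x."""
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x


def basin_hopping(rng, x0=5.0, temperature=0.5, hop=2.0, n_hops=200):
    start = relax(x0)
    x = best = start
    for _ in range(n_hops):
        # Perturb, then relax: the move is judged on the energy of the new minimum.
        trial = relax(x + rng.uniform(-hop, hop))
        delta = energy(trial) - energy(x)
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            x = trial
        if energy(x) < energy(best):
            best = x
    return start, best


start_min, best_min = basin_hopping(random.Random(0))
print(start_min, best_min, energy(best_min))
```

If `hop` is made much smaller than the spacing between basins, most trials relax back to their predecessor, which is the failure mode described in the text.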

Figure 6: A diagram illustrating the basin hopping and minima hopping techniques in a 1D PES. (a) Basin hopping traverses the PES via a Monte-Carlo procedure. The optimization of each structure to the nearest local minimum (bold dashed arrow) results in the transformation of the curvy PES into the stepwise PES. (b) In minima hopping molecular dynamics is employed to explore the PES. Structures are accepted or rejected based upon their energies relative to their predecessor. The difference between these two energies, Ediff, is constantly adjusted so that half of the new configurations are accepted.

To overcome some of the problems associated with basin hopping, Goedecker introduced the minima hopping method.74 Unlike basin hopping, minima hopping is not a Monte-Carlo method, so a temperature need not be chosen. Molecular dynamics (MD) simulations are employed to traverse the PES, and each structure is relaxed to its nearest local minimum, see Fig. 6(b). The algorithm compares the energy of the current configuration, Ecur, with that of a trial structure, Enew. The variable, Ediff, which determines whether or not a structure is accepted, is modified throughout the run such that the acceptance ratio of new structures remains fixed at 50%. This means that unless Enew ≤ Ecur + Ediff, the configuration will be rejected. The kinetic energy employed in the MD simulation is continuously adjusted to ensure that about half of the runs allow the system to cross a barrier and enter a new basin. In addition, revisitation of minima that have already been explored is discouraged by raising the kinetic energy employed in the MD run. With the exception of the history dependent term employed in minima hopping, the way in which both of these algorithms traverse the PES is quite similar. Because minima hopping discourages the revisitation of portions of the PES that have already been traversed, it facilitates the escape from funnels which do not contain the global minimum. Minima hopping has been employed extensively to predict the structures of a number of fascinating materials in a wide pressure range including compressed disilane75 and carbon,76, 77 as well as zinc78 and alkali-metal-zinc borohydrides.79 Because both basin and minima hopping work best given a good starting structure, it may be necessary to carry out numerous runs with different initial configurations to ensure sufficient exploration of the PES by these methods.
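The feedback on Ediff can be sketched as follows. The scaling factors of 1.05 and 1/1.05, the function name, and the stand-in distribution of energy differences are assumptions made for illustration. The MD moves, the relaxations, and the kinetic-energy feedback of the actual method are omitted, and Ecur is held at zero instead of being updated after each accepted move. Because the threshold shrinks after every acceptance and grows by the reciprocal factor after every rejection, it settles where roughly half of the trial minima pass the test.

```python
import random


def accept_and_adapt(e_cur, e_new, e_diff, shrink=1.0 / 1.05, grow=1.05):
    """Return (accepted, new_e_diff) for one minima-hopping acceptance test."""
    if e_new <= e_cur + e_diff:
        return True, e_diff * shrink  # accepted: tighten the threshold
    return False, e_diff * grow       # rejected: loosen the threshold


rng = random.Random(0)
e_diff, accepted = 1.0, 0
n_trials = 2000
for _ in range(n_trials):
    # Hypothetical stand-in for E_new - E_cur of a freshly relaxed trial minimum.
    delta = rng.uniform(0.1, 2.0)
    ok, e_diff = accept_and_adapt(0.0, delta, e_diff)
    accepted += ok

acceptance_ratio = accepted / n_trials
print(acceptance_ratio, e_diff)
```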

Metadynamics

Metadynamics is an MD-based method where the forces (which are used to generate the next structure) are modified by a history-dependent term that inhibits the revisitation of known regions of the PES, while simultaneously overcoming barriers.80 This procedure lifts the potential in areas that have already been visited by filling them with Gaussians, and is sometimes descriptively referred to as ‘basin-flooding.’ Eventually, the potential within a given basin is lifted high enough so that the algorithm is able to overcome a barrier and fall into a nearby local minimum, as illustrated in Fig. 7. The flooded space typically has a dimensionality that is smaller than the 3N + 3 degrees of freedom associated with the full crystal lattice. Instead, it is assumed that a few collective variables, such as the unit cell parameters, can be used to describe the structural transition, and the flooding is carried out in a space defined by these variables.
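Basin flooding can be illustrated with a single collective variable, x, moving in a hypothetical double-well potential. The sketch below uses overdamped Langevin dynamics in place of full MD, and the potential, Gaussian height and width, time step, and function names are all assumptions chosen for illustration. Gaussians are deposited at the walker's current position, and their gradient adds a repulsive term to the force.

```python
import math
import random


def force(x):
    # Minus the gradient of a hypothetical double well V(x) = (x^2 - 1)^2 + 0.3 x.
    return -(4.0 * x * (x * x - 1.0) + 0.3)


def bias_force(x, centres, height, width):
    # Minus the gradient of the sum of Gaussians deposited so far.
    total = 0.0
    for c in centres:
        d = x - c
        total += height * d / width**2 * math.exp(-0.5 * (d / width) ** 2)
    return total


def run_metadynamics(rng, x=0.96, n_steps=3000, stride=10, dt=0.01,
                     kT=0.05, height=0.05, width=0.2):
    """Start in the shallower well (x near 0.96) and flood it with Gaussians."""
    centres, trajectory = [], [x]
    for step in range(n_steps):
        if step % stride == 0:
            centres.append(x)  # flood the region currently being visited
        noise = math.sqrt(2.0 * kT * dt) * rng.gauss(0.0, 1.0)
        x += dt * (force(x) + bias_force(x, centres, height, width)) + noise
        trajectory.append(x)
    return trajectory, centres


trajectory, centres = run_metadynamics(random.Random(0))
print(min(trajectory), max(trajectory), len(centres))
```

With these parameters the unbiased walker would very rarely cross the barrier on its own, whereas the accumulated bias lets it reach the deeper well on the left.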

Figure 7: A diagram illustrating how the metadynamics method traverses a 1D PES. The potential is lifted in areas the algorithm has already explored, thereby enabling the method to overcome barriers between neighboring basins. The global minimum in (a) will be found more quickly than in (b), because the barriers separating it from the current location in the PES are smaller.

Unfortunately, the choice of the starting structure and the shape of the PES could prevent this method from succeeding in its search. Consider for example the two PESs in Fig. 7. In the PES on the left the global minimum will be located much more quickly than in the one on the right, because the barriers that need to be overcome are significantly smaller. In some circumstances, flooding a transition basin between two basins may even completely prevent the global minimum from being found. This means that if a good initial structure is not known, one may need to carry out a number of runs with various starting points in a CSP search. Metadynamics has been employed extensively to predict structural modifications under pressure in a number of systems including benzene across a large temperature range,81 germanium,82 CO2,83 and various polymorphs of silica.84 Metadynamics can be applied to a wide range of problems outside of CSP as well. For example, this method can be employed to study structural phase transitions, conformational changes in solution, and mechanisms of chemical reactions.

Particle Swarm Optimization

The particle-swarm optimization (PSO) method is a global optimization technique whose inspiration came from watching the collective behavior certain animals exhibit during their migration.85 We can all attest to witnessing the amazing, seemingly coordinated, behavior of a flock of birds in flight, an example of swarm intelligence. The flock appears to follow a leading bird (whose identity may change during the migration), and the position of each bird is influenced by its own individual flight path, as well as by the behavior of the other members of the flock, and in particular that of the leading bird. Fifteen years after the PSO method was first proposed, Wang and Ma86 adapted it towards the structure prediction problem within the CALYPSO (Crystal structure AnaLYsis by Particle Swarm Optimization) code.5 The original CALYPSO implementation was limited to inorganic crystals, but recently it has been extended towards finite clusters,87 and 2D layered lattices.88

Each structure generated by the PSO algorithm in a CSP run is referred to as a particle, and the set of all structures constitutes the swarm. At a particular moment in time, t, every particle occupies a certain position within the multidimensional PES. This position, x(t), refers to an unoptimized structure, and optimization moves it into the nearest local minimum, whose position is denoted as pbest(t). The particle moves within the multidimensional PES with a velocity, v(t). The position of each individual particle at the next instant, x(t + 1), is dependent upon its prior location, x(t), as well as its velocity, v(t + 1), as

x(t + 1) = x(t) + v(t + 1) [6]

where

v(t + 1) = ωv(t) + c1r1(pbest(t) − x(t)) + c2r2(gbest(t) − x(t)). [7]

In Eq. 6 and Eq. 7, v(t) is the velocity of the particle at x(t), gbest(t) is the position of the global minimum for a given population, r1 and r2 are random numbers falling within the range of [0,1], and ω is an inertia weight, ranging from 0.4 to 0.9, that is modified during the simulation. Large values of ω encourage a global search, whereas smaller choices favor exploration of the local area near the particle. The coefficients c1 and c2 are factors reflecting how much the individual trusts its own experience as opposed to that of the swarm. The generation of a new structure by PSO is illustrated schematically in Fig. 8(a).
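Eq. 6 and Eq. 7 translate almost directly into code. The sketch below treats a particle's position as a flat list of coordinates and draws one r1 and one r2 per particle, as in the equations. The function names are hypothetical. The linear decrease of ω from 0.9 to 0.4 is an assumption, since the text states only that ω is modified within this range.

```python
import random


def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    """Assumed schedule: omega decreases linearly from w_start to w_end."""
    return w_start - (w_start - w_end) * t / t_max


def pso_step(x, v, pbest, gbest, omega, c1, c2, rng):
    """Apply Eq. 7 and then Eq. 6 to one particle; returns (x(t+1), v(t+1))."""
    r1, r2 = rng.random(), rng.random()
    new_v = [omega * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
             for xi, vi, pi, gi in zip(x, v, pbest, gbest)]
    new_x = [xi + vi for xi, vi in zip(x, new_v)]
    return new_x, new_v


rng = random.Random(0)
x, v = [0.0, 0.0], [0.1, -0.1]
pbest, gbest = [0.2, 0.1], [1.0, 1.0]
for t in range(5):
    x, v = pso_step(x, v, pbest, gbest, inertia_weight(t, 5), 2.0, 2.0, rng)
print(x, v)
```

In a CSP run each new position would be locally optimized, after which pbest and gbest would be updated. That step is omitted here.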

Figure 8: (a) A diagram depicting how new structures are generated within the PSO algorithm in a 1D PES via Eq. 6 and Eq. 7. Some of the arrows correspond to velocity vectors, and the dots to the positions of particles within the PES. (b) A chart illustrating the workflow in the PSO technique as implemented in the CALYPSO code.5, 86 [Box labels in (b): Generate Random Structures With Symmetry Constraints; Local Optimization; Identification and Removal of Duplicate Structures; Converged? (YES/NO); Generation of New Structures; Stop.]

The workflow of the CALYPSO program is shown in Fig. 8(b). In the first step, a number of random structures, which are subject to interatomic distance, volume, and symmetry constraints, are optimized locally. Each relaxed geometry is analyzed and duplicate structures (which are detected based upon structural similarity measures, see the section entitled ‘Maintaining Diversity’) are removed from the search. The coordinates of subsequent structures are generated via Eqs. 6 and 7. It should be noted that only the lowest-energy 60% of the structures are employed to construct new individuals via PSO, and 40% of the structures are generated randomly in order to keep the diversity high during the search. Because the PSO algorithm has a feedback mechanism allowing it to learn from its history, it is more likely to zoom in on the regions of the PES containing the global minimum, and other low lying local minima, as the search progresses. Since its inception the CALYPSO code has been applied to predict the structures of a plethora of different 3D and 2D systems including potential topological insulators,89 superhard materials,90 and layered lattices comprised of boron and carbon.91 A number of unique phases that may be stable under pressure, such as the calcium polyhydrides92 and cesium polyfluorides,93 have been predicted with CALYPSO as well.

Genetic Algorithms and Evolutionary Algorithms

Another set of stochastic search techniques which learn about the PES while exploring it are genetic algorithms (GAs) or evolutionary algorithms (EAs). These algorithms can be tailored towards a wide variety of complex optimization problems, which necessitate the search of a highly dimensional space to locate the global minimum (or a good approximation for it).94, 95 The application of EAs towards various problems in materials science and related fields,96, 97 including CSP,98 has been reviewed. GAs and EAs attempt to find the local minima and global minimum by using evolutionary principles such as survival of the fittest, mutation, and breeding. The fittest individuals are the ones with the lowest enthalpy. A relatively small modification, or mutation, of a structure may be sufficient to surpass a barrier and enable the search to find a nearby local minimum, as illustrated in Fig. 9(a). A combination of two parents to make a single child via breeding can enable sampling of very different regions of the PES, while still preserving features that give rise to stability on a local scale. The ways in which offspring are generated from parents in GAs/EAs are called evolutionary operators or variation operators. In this section we briefly summarize the principles underlying GAs/EAs as applied to the structure prediction problem, and their historical development. In the section entitled ‘The Nitty-Gritty Aspects of Evolutionary Algorithms’ we give a detailed description of the various components of an EA, and in the section entitled ‘Practical Aspects of Carrying out an Evolutionary Structure Search’ we provide guidelines that may help a novice in setting up their first run. Examples of crystal structures predicted with the XTALOPT EA are presented in the penultimate section of this chapter.

Figure 9: (a) A diagram depicting how new structures are generated within an EA or a GA in a 1D PES. Optimization to the nearest local minimum is represented by the bold dashed line. (b) A schematic illustrating how structural data is encoded onto a string in a GA. Generic breeding and mutation operations are also shown. Note that breeding can in principle result in two different children. Examples of evolutionary operators that act on a structure in real space are provided in Fig. 11.

Even though the terms GA and EA are often used interchangeably, there is a subtle difference between the two methods. In a GA the real-space structure is first mapped onto a string, akin to a chromosome in biology, as shown in Fig. 9(b). Each gene corresponds to a structural variable which may be optimized, and breeding and mutation occur via modification of the genes themselves. The first successful example of CSP via evolutionary principles employed this type of approach. In 1995 Bush, Catlow and Battle used a GA to predict the previously unknown structure of Li3RuO4.99 Due to the large cost associated with the run, interatomic potentials were employed to calculate the energies of the structures generated during the search. A binary string stored the coordinates of the atoms within the unit cell, and new structures were made by segmenting and reassembling the strings, or randomly changing the bits that comprise them.
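The string-based operations described above can be sketched as follows. The 8-bit resolution per fractional coordinate, the single-point crossover, and the function names are assumptions made for illustration, and they are not the encoding used by Bush, Catlow and Battle. Breeding segments two strings at a cut point and reassembles them, and mutation flips individual bits at random.

```python
import random

N_BITS = 8  # hypothetical resolution: bits per fractional coordinate


def encode(fractional, n_bits=N_BITS):
    """Map a fractional coordinate in [0, 1) onto a binary gene."""
    level = min(int(fractional * 2**n_bits), 2**n_bits - 1)
    return format(level, "0{}b".format(n_bits))


def decode(gene, n_bits=N_BITS):
    return int(gene, 2) / 2**n_bits


def crossover(parent_a, parent_b, cut):
    """Single-point crossover: swap the tails of two chromosomes."""
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in chromosome)


rng = random.Random(0)
atom_a = "".join(encode(c) for c in (0.25, 0.5, 0.75))
atom_b = "".join(encode(c) for c in (0.1, 0.9, 0.3))
child_1, child_2 = crossover(atom_a, atom_b, cut=N_BITS)  # cut on a gene boundary
mutant = mutate(child_1, 0.05, rng)
print(child_1, child_2, mutant)
```

Cutting at an arbitrary bit, instead of on a gene boundary, changes a coordinate in a way that has no simple geometric meaning, which is one motivation for the real-space operators discussed next.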

An EA skips the step of mapping the structural parameters onto a gene. The structural mutations and breeding operations are carried out by manipulating the geometric parameters in real space, instead. In 1995 Zeiri introduced a method that directly acts on the Cartesian coordinates of the atoms.100 The same year, Deaven and Ho employed an EA to successfully predict the structures of various fullerenes.101 The first important contribution by Deaven and Ho was the cut and splice operator used for procreation. In this evolutionary operator, a randomly oriented plane cuts through the center of mass of two different parent clusters. A child structure is assembled by combining a single slice from each parent. Because this operation is carried out in real space, it gives a more physical meaning to the breeding operation. It was shown that this algorithm dramatically outperforms traditional GAs. The second important ingredient in Deaven and Ho’s method is local optimization of each generated cluster. The first applications of EAs employed interatomic potentials and were geared towards predicting the structures of finite clusters. About a decade after Deaven and Ho’s seminal paper, Glass, Oganov and Hansen adapted this method towards extended systems in their USPEX (Universal Structure Predictor: Evolutionary Xtallography) algorithm.98, 102–104 Breeding and mutation were performed in real space, and each structure was optimized to the nearest stationary point. USPEX has been interfaced with a number of programs that perform structural optimizations; some of these employ interatomic potentials and others carry out first-principles calculations.
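A real-space cut and splice operation for clusters can be sketched as below. This is a simplified variant and not Deaven and Ho's exact procedure. It assumes both parents contain the same number of atoms of a single element, so that the centroid coincides with the center of mass. To conserve the atom count it takes the atoms of one parent that lie highest along a random direction and the atoms of the other parent that lie lowest, instead of cutting with a plane held exactly at the center of mass. All function names are hypothetical.

```python
import math
import random


def centred(cluster):
    """Shift a cluster so that its centroid sits at the origin."""
    n = len(cluster)
    com = [sum(atom[i] for atom in cluster) / n for i in range(3)]
    return [tuple(atom[i] - com[i] for i in range(3)) for atom in cluster]


def random_unit_vector(rng):
    while True:
        vec = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in vec))
        if norm > 1e-12:
            return [c / norm for c in vec]


def cut_and_splice(parent_a, parent_b, rng):
    """Combine the upper slice of parent A with the lower slice of parent B."""
    normal = random_unit_vector(rng)  # orientation of the cutting plane

    def height(atom):
        return sum(a * n for a, n in zip(atom, normal))

    a_sorted = sorted(centred(parent_a), key=height)
    b_sorted = sorted(centred(parent_b), key=height)
    n_atoms = len(parent_a)
    n_from_a = n_atoms // 2
    n_from_b = n_atoms - n_from_a
    return a_sorted[n_atoms - n_from_a:] + b_sorted[:n_from_b]


parent_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
parent_b = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0), (1.5, 1.5, 0.0)]
child = cut_and_splice(parent_a, parent_b, random.Random(0))
print(child)
```

Atoms from the two slices may end up unphysically close to one another near the cut, which is one reason why each child is locally optimized afterwards.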


Hybrid Methods

Above, we described a number of automated search techniques that have been applied to the CSP problem. The boundaries between the different methods are not strictly defined (e.g., the ab initio random structure searching method employs a real-space mutation similar to an EA), and there is no reason preventing two or more of these techniques from being combined into a single algorithm. If experimental data is known, it can be used to guide these algorithms as well. This can be done, for example, by starting a search from a likely configuration or by imposing geometric constraints throughout the course of the run. Is it possible to forge a closer link between computations and known experimental data, or to devise a clever computational scheme that borrows ideas from two or more algorithms? Below we describe a few hybrid methods, two of which marry multiple computational strategies, and another which integrates computational and experimental results. Clearly many more combinations are possible, and it is likely that a number of new hybrid schemes will arise that are tailored towards various PESs, and to the amount and type of experimental data available.

The automated search methods described above can be classified into those that sample a large part of the chemically meaningful portion of the PES, and those that perform a more thorough exploration of a small region of the PES. Random searches, PSO and GAs or EAs fall into the former category, whereas simulated annealing, metadynamics and basin hopping or minima hopping are better suited towards situations where a good starting configuration is known. To benefit from the strengths of these two categories of algorithms, Zhu, Oganov and Lyakhov developed the evolutionary metadynamics technique.105 This algorithm starts with a single structure, whose normal modes are calculated using empirical parameters. A new generation of individuals is created by displacing the atoms along the eigenvectors of the softest vibrational modes, and by incorporating other structural mutations. The atomic coordinates are relaxed subject to the constraint that the unit cell parameters (typically used as the collective variables in a metadynamics run) remain fixed. The most stable structure is chosen, and the unit cell is updated using the same equations as in a standard metadynamics simulation. This procedure is repeated for a user defined number of generations. Within this hybrid approach, evolutionary operations are employed to drive a metadynamics run. Evolutionary metadynamics has been used to predict the structures of Al2SiO5,105 and metastable sp3 allotropes of carbon.106 Another hybrid program which has recently been released is the Multi-algorithm-collaborative Universal Structure-prediction Environment (MUSE) algorithm, which combines basin hopping, simulated annealing and evolutionary algorithms.107

Recently, Meredig and Wolverton proposed a hybrid method, the first-principles-assisted structure solution (FPASS) approach, which couples experimental data and data mining with a stochastic searching algorithm that uses DFT energies to determine fitness.108 This technique was developed for situations where the most likely space groups and lattice constants have been obtained via diffraction experiments, but the atomic positions remain unknown. This type of situation is not uncommon, especially when the crystals being considered contain light elements. In real space the breeding and mutation operations typically change the symmetry of the lattice. In order to prevent this from happening, FPASS encodes the structural information into chromosome-like strings, and the gene determining the space group (as deduced from experiments) remains fixed during mating and mutation. That is to say, FPASS carries out a GA-like search within which the symmetry of the crystal lattice is constrained. The GA search is guided by data obtained from the Inorganic Crystal Structure Database. This is done by making sure that the occupied Wyckoff sites in the structures generated by FPASS mimic the occupations statistically seen in nature. Finally, the run is biased to favor individuals whose diffraction patterns better agree with those obtained experimentally. Meredig and Wolverton showed that FPASS can predict structures that prove difficult for EAs and random searches, even those employing space-group-symmetry constraints. The main drawback of the FPASS method is that it absolutely requires experimental input, so it cannot be used to predict hitherto-unknown phases.

The Nitty-Gritty Aspects of Evolutionary Algorithms

In the last decade a number of groups have released EAs interfaced with first-principles periodic program packages, which differ subtly in various aspects including the workflow, evolutionary operators used to make new children, methods employed in duplicate detection, and numerous other details. Some of these EAs include XTALOPT,109, 110 USPEX,98, 102 MAISE,111 EVO,112 GASP,49, 113 the ‘adaptive-GA’,50 algorithms by Trimarchi, d’Avezac and Zunger,114, 115 Abraham and Probert,116 as well as Fadda.117 Within this section we describe in detail the workflow carried out in an EA, and discuss a number of technical aspects associated with each task. There are many different ways in which the various steps in an EA can be implemented, and often the user is asked to adjust various parameters controlling the search. This section is intended to help a novice choose suitable options for their particular application.

Workflow

The workflow employed in most EAs is illustrated schematically in Fig. 10. A set of structures is created randomly, subject to a set of user-defined constraints, such as unit cell volumes and interatomic distances. Most algorithms also allow the user to seed the search with structures that are deemed to be plausible. The seeds are typically chosen based upon structural analogy, experimental data, or the results of other calculations. This procedure may steer the search towards areas of the PES likely to be more favorable. In addition, a number of algorithms allow the user to impose space group symmetry constraints on the randomly generated structures. The individuals in this first generation are locally relaxed by an external code (see the section entitled ‘Calculating Energies and Optimizing Geometries’), and their energies or enthalpies are used to determine their fitness, or likelihood to be chosen as parents for the next generation. Because the most stable geometries have the highest fitness, and are therefore the most likely to be chosen as parents, this ensures that favorable traits (motifs which render the structures stable) propagate into the next generation.

Two different types of workflows have been put forward, and they differ in how the parentpool is created. The most widely used scheme employs a generation based pool of parents,and that workflow is shown in Fig. 10(a). In this scheme a subset of the fittest structures ina given generation are chosen to be parents, and some of the most stable structures from priorgenerations are also allowed to procreate. This means that some of the worst individuals froma generation never get the opportunity to produce offspring. A single parent can father multiplechildren. The user specifies the number of individuals to be created in each generation. Within

20

[Figure 10: flowchart. Box labels: Initial Population of Random Structures; Local Optimization; Assign Fitness; Converged? (YES/NO); Stop; Selection for Procreation; Generation of New Structures via Mating or Mutation; H < Hcut? (YES/NO); Pool; Discard. Panels (a) and (b).]

Figure 10: (a) A workflow for a traditional EA that uses a generation based pool. (b) Modification of the workflow illustrated in (a) for an EA that employs a population based pool. Dashed arrows indicate where the workflow merges with the traditional EA. Only the structures with the lowest enthalpies are kept in the pool used for procreation, and the rest are discarded. The pool size is specified by the user.

this scheme all of the members of the generation must be locally optimized before their fitness can be assigned, and the new generation can be created via mutations and breeding. The drawback of this workflow is that some structures may be slow to optimize, thereby creating a bottleneck in the algorithm. To alleviate this problem, Bandow and Hartke118 introduced a different workflow that uses a population-based pool. In this scheme, see Fig. 10(b), a user-defined number of structures are continuously optimized, so that as soon as one individual finishes local relaxation, a new child is created. This procedure eliminates the bottleneck. Only the fittest members of the entire population are allowed to comprise the breeding pool, and the user specifies how large this pool should be. When a structure is fully relaxed, its enthalpy is compared to those within the pool. If its enthalpy is lower than that of the least stable member of the pool, it replaces this structure. In this way the pool is continuously updated. Bandow and Hartke applied their workflow to predict the structures of water clusters, and illustrated that it makes better use of computational resources than the traditional generation-based pool.118
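The pool update used in the population-based scheme can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of Bandow and Hartke or of any particular EA: the class name `BreedingPool`, the method `offer`, and the integer structure identifiers are hypothetical, and the local relaxation is replaced by a list of made-up enthalpies.

```python
import bisect


class BreedingPool:
    """Fixed-size pool holding the lowest-enthalpy structures seen so far."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.members = []  # sorted list of (enthalpy, structure_id)

    def offer(self, enthalpy, structure_id):
        """Try to add a freshly relaxed structure; return True if it was kept."""
        if len(self.members) < self.max_size:
            bisect.insort(self.members, (enthalpy, structure_id))
            return True
        worst_enthalpy, _ = self.members[-1]
        if enthalpy < worst_enthalpy:
            self.members.pop()  # drop the least stable member
            bisect.insort(self.members, (enthalpy, structure_id))
            return True
        return False


# Illustrative enthalpies, offered in the order in which relaxations finish.
pool = BreedingPool(max_size=3)
for structure_id, enthalpy in enumerate([-1.0, -3.0, -2.0, -0.5, -2.5]):
    pool.offer(enthalpy, structure_id)
print([sid for _, sid in pool.members])
```

Keeping the pool sorted makes the comparison against the least stable member a constant-time lookup, which matters little for small pools but keeps the update rule easy to read.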

In both of these schemes, children are generated via evolutionary operators, a number of which are described below. Because there is no guarantee that an evolutionary search will find the most stable geometry, the criteria for convergence (the stopping criteria) are somewhat arbitrary. In the section entitled ‘Practical Aspects of Carrying out an Evolutionary Structure Search’ we describe a few rules of thumb that various authors have put forward to help answer the question “When should I stop the evolutionary search?” If the search is continued, the cycle starts again by determining the fitness of the newly optimized parents, selecting some of them for procreation, and creating a new generation (or a single child, in the case of a population-based search).


Selection for Procreation

Evolutionary algorithms that attempt to locate the most stable structure employ a thermodynamic variable to determine fitness. Calculating vibrational frequencies from first principles is prohibitively expensive, so enthalpies or energies (neglecting zero point energy corrections) are typically used instead of free energies. In this section we assume the enthalpy, $H$, to be the quantity assessing the fitness of the individual, keeping in mind that any other thermodynamic parameter can be utilized. In so-called roulette wheel selection the probability ($p_i$) that a structure with enthalpy $H_i$ will be chosen for breeding is calculated directly from its fitness, $f_i$, defined as

\[ f_i = 1 - \frac{H_i - H_{\mathrm{min}}}{H_{\mathrm{max}} - H_{\mathrm{min}}}, \qquad [8] \]

where $H_{\mathrm{min}}$ and $H_{\mathrm{max}}$ are the enthalpies of the best and worst structures comprising the generation or pool. Such a dynamic fitness scaling ensures that the probability of choosing the most stable individual will be very high, whereas the least stable structure will never be picked. In the simplest case $p_i = f_i$ (and the probabilities are normalized such that $\sum_i p_i = 1$), but different strategies can be chosen to control the preference of structures with lower enthalpies as compared to those with higher ones. Some of the functions which have been employed are exponential ($f_i = \exp(-\alpha p_i)$, $\alpha$ = constant), linear ($p_i = 1 - 0.7 f_i$) and hyperbolic tangent ($p_i = \frac{1}{2}[1 - \tanh(2 f_i - 1)]$).10
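For the simplest case, $p_i = f_i$ with Eq. [8] as the fitness, roulette wheel selection might look like the sketch below. The function names and the example enthalpies are hypothetical, and the handling of a pool in which all enthalpies are equal (every member gets the same fitness) is an assumption that the text does not address.

```python
import random


def scaled_fitness(enthalpies):
    """Eq. [8]: 1 for the most stable structure, 0 for the least stable one."""
    h_min, h_max = min(enthalpies), max(enthalpies)
    if h_max == h_min:  # degenerate pool: treat all members equally (assumption)
        return [1.0] * len(enthalpies)
    return [1.0 - (h - h_min) / (h_max - h_min) for h in enthalpies]


def roulette_probabilities(enthalpies):
    """Simplest case p_i = f_i, normalized so that the p_i sum to one."""
    fitness = scaled_fitness(enthalpies)
    total = sum(fitness)
    return [f / total for f in fitness]


def pick_parent(enthalpies, rng=random):
    """Draw the index of one parent with probability p_i."""
    probabilities = roulette_probabilities(enthalpies)
    return rng.choices(range(len(enthalpies)), weights=probabilities, k=1)[0]


enthalpies = [-10.0, -9.0, -8.0, -6.0]  # illustrative values
probs = roulette_probabilities(enthalpies)
parent = pick_parent(enthalpies, random.Random(0))
```

With this scaling the least stable structure receives a probability of exactly zero, which is the behavior described for the dynamic fitness scaling.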

In truncation selection, some set fraction of the best individuals in the population are given an equal probability of reproducing. In tournament selection, a set of structures is chosen randomly (the tournament pool) and those with the best fitness are allowed to procreate. Most structure prediction EAs employ some variant of the roulette wheel method; however, a more general approach has been proposed recently113 in which two parameters (the number of parents $N_{\mathrm{parents}}$, and an exponent $P$) can be varied in the formula

\[ p_i = \frac{(f_i)^P}{\sum_{j}^{N_{\mathrm{parents}}} (f_j)^P}. \qquad [9] \]

Appropriate choices of these variables give rise to selection probabilities resembling truncation and roulette wheel selection, and many other distributions can be generated.
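A short sketch shows how the two parameters of Eq. [9] interpolate between the schemes. It assumes that the sum runs over the $N_{\mathrm{parents}}$ fittest individuals and that all others receive zero probability; this reading, the function name, and the fitness values are illustrative assumptions rather than details taken from Reference 113.

```python
def selection_probabilities(fitness, n_parents, exponent):
    """Eq. [9], restricted to the n_parents fittest individuals (assumed reading)."""
    ranked = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=True)
    parents = ranked[:n_parents]
    weights = {i: fitness[i] ** exponent for i in parents}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all selected parents have zero weight")
    return [weights.get(i, 0.0) / total for i in range(len(fitness))]


fitness = [1.0, 0.75, 0.5, 0.0]  # illustrative values of f_i
# P = 0 gives every selected parent the same weight: truncation-like.
truncation_like = selection_probabilities(fitness, n_parents=2, exponent=0)
# P = 1 with all individuals as parents reproduces p_i proportional to f_i.
roulette_like = selection_probabilities(fitness, n_parents=4, exponent=1)
```

Larger exponents concentrate the probability on the fittest parents, so $P$ acts as a knob for the selection pressure.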

It should be noted that thermodynamic quantities are not the only computed variables that can be used to determine fitness. Many choices are possible, including combinations of two or more variables. It all depends on the property the user would like the search to optimize. For example, both an EA119 and a PSO-based120 technique, which employ the computed hardness to determine fitness, have been developed to predict the structures of superhard materials. A GA for single molecules, which uses the energy of the HOMO (highest occupied molecular orbital) and the electronic transition energy to determine fitness, was recently applied towards the prediction of organic copolymers that may be good candidate materials in solar cells.121 When EAs attempt to simultaneously optimize two or more variables, the total fitness can be obtained via some sort of weighted linear combination of the fitness values computed for the individual variables.


Evolutionary Operators

An evolutionary operator is any geometric modification that transforms one structure into another. Sometimes evolutionary operators are referred to as variation, mating or reproduction operators. The defining characteristic of an evolutionary algorithm, as opposed to a genetic one, is that its operators act upon the structural coordinates in real space. A plethora of operators have been proposed, and it is important that the ones chosen thoroughly explore the configurational space of the system, while at the same time generating the most chemically reasonable structures. In the case of inorganic crystals, the atomic positions, atomic ordering, and unit cell parameters are some of the most important factors dictating structural stability. Operators can be classified into: (i) those employing two parents to create a single child (analogous to the reproduction process in biology), and (ii) those that act on a single parent (mimicking mutations in individual organisms). In Fig. 11, we schematically illustrate how some of the operators commonly used for inorganic crystals modify their structural parameters.

[Figure 11: schematic with panels labeled Breeding, Permutation, Strain and Ripple.]

Figure 11: A schematic of evolutionary operators that act in real space on a 2D lattice. Breeding is a two parent operation which combines a slice of each parent into a single unit cell. In principle two different children can be made in this way, but in practice typically only one is kept. Permutation, strain and ripple are examples of mutations of a single parent.

Glass, Oganov and Hansen102 were the first to adapt Deaven and Ho’s cut-and-splice operator for finite clusters towards 3D crystal lattices. This operator is sometimes referred to as the crossover, heredity, breeding or mating operator. The exact details of how it is implemented vary, but the overarching principle is to combine two unit cell slices from two different parents. The atoms in the cell are often rotated and reflected randomly in order to prevent biasing a given orientation, then the cell is cut (in fractional coordinate space) along a direction perpendicular to one of the cell vectors. Abraham and Probert put forward another way to cut the parents: by using a function which has the same periodicity as the cell, such as a sine curve.116 In either case, a coherent piece from one cell is joined with another piece from the other cell in fractional coordinate space, and the remaining two slices are discarded. The lattice vectors of the child are a randomly weighted linear combination of those of the two parents. Our previous tests suggest that the breeding operator is the least effective of the operators employed in the XTALOPT EA.109 Lyakhov, Oganov and Valle obtained similar results.122 They postulated that the inefficiency of this operator stems from the fact that particularly unstable offspring are generated from the breeding of two parents originating from two funnels that are far apart in the PES. Furthermore, they showed that the efficiency of the heredity operator can be improved by favoring the creation of children with a higher degree of local order.
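The planar variant of the breeding operator can be illustrated with a deliberately simplified sketch. The dictionary layout of a structure, the function name, and the NaCl-like example cells are hypothetical. The random rotation and reflection of the parents is omitted, the cut position, axis and lattice weight are passed in explicitly (a real operator would draw them at random), and nothing here restores the stoichiometry of the child, which production codes must do.

```python
def cut_and_splice(parent_a, parent_b, axis, cut, weight):
    """Join the slice of parent_a below `cut` with the slice of parent_b above it.

    A structure is a dict with a 3x3 "lattice" (rows are lattice vectors) and a
    list of "atoms", each a (symbol, fractional_coordinates) pair.
    """
    # Child lattice: weighted linear combination of the parents' lattice vectors.
    lattice = [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(parent_a["lattice"], parent_b["lattice"])
    ]
    # Planar cut in fractional coordinates, perpendicular to the chosen axis.
    atoms = [atom for atom in parent_a["atoms"] if atom[1][axis] < cut]
    atoms += [atom for atom in parent_b["atoms"] if atom[1][axis] >= cut]
    return {"lattice": lattice, "atoms": atoms}


parent_a = {
    "lattice": [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]],
    "atoms": [("Na", (0.0, 0.0, 0.1)), ("Cl", (0.5, 0.5, 0.6))],
}
parent_b = {
    "lattice": [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 6.0]],
    "atoms": [("Na", (0.0, 0.0, 0.3)), ("Cl", (0.5, 0.5, 0.8))],
}
child = cut_and_splice(parent_a, parent_b, axis=2, cut=0.5, weight=0.5)
```

In this example the child happens to keep one Na and one Cl atom; in general the two slices will not add up to the target composition, and atoms must be added or removed afterwards.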

Various single-parent operations (mutations) have been proposed, and most algorithms make use of a handful of these. Perhaps the simplest variation is switching the positions of atoms of different types. This modification is often referred to as permutation or exchange. Atoms can also be displaced randomly, or along the softest vibrational mode calculated using empirical parameters (as in the soft-mutation operation described in Reference 122). We have introduced a so-called ripple operator,109 which shifts the coordinates of the atoms along a random axis in such a way as to send a wave through the cell as shown in Fig. 11. For illustration purposes, let us assume the operator displaces the atoms by some amount $\Delta z$ along the z-axis. Then the change in the atomic position is given as

\[ \Delta z = \rho \cos(2\pi\mu x + \theta_x) \cos(2\pi\eta y + \theta_y), \qquad [10] \]

where $\rho$ is the amplitude of the wave, $\mu$ and $\eta$ are non-negative integers specifying the number of cosine waves along the directions that are not displaced, and the phase angles $\theta_x$ and $\theta_y$ (which range from 0 to $2\pi$) determine which regions of the cell are strongly varied. Finally, the commonly used strain operator changes the shape of the unit cell by multiplying each lattice vector $\vec{v}$ by the symmetric matrix

\[ \vec{v}_{\mathrm{new}} = \vec{v} \begin{pmatrix} 1 + \varepsilon_{11} & \frac{\varepsilon_{12}}{2} & \frac{\varepsilon_{13}}{2} \\ \frac{\varepsilon_{12}}{2} & 1 + \varepsilon_{22} & \frac{\varepsilon_{23}}{2} \\ \frac{\varepsilon_{13}}{2} & \frac{\varepsilon_{23}}{2} & 1 + \varepsilon_{33} \end{pmatrix}, \qquad [11] \]

where the $\varepsilon_{ij}$ are random numbers taken from a zero-centered normal distribution with a specified standard deviation. In an attempt to keep the diversity high during the course of the search, and prevent the algorithm from getting stuck in a particular region of the PES, some algorithms invoke an operator that introduces randomly-generated structures into the gene pool throughout the run.
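Equations [10] and [11] translate almost directly into code. The sketch below is illustrative rather than the XTALOPT implementation: the function names are hypothetical, the atomic coordinates are assumed to be fractional (so that $\rho$ is an amplitude in fractional units), and displaced atoms are wrapped back into the cell.

```python
import math
import random


def ripple(frac_coords, rho, mu, eta, theta_x, theta_y):
    """Eq. [10]: shift z by a cosine wave in x and y, then wrap into [0, 1)."""
    rippled = []
    for x, y, z in frac_coords:
        dz = (rho * math.cos(2 * math.pi * mu * x + theta_x)
              * math.cos(2 * math.pi * eta * y + theta_y))
        rippled.append((x, y, (z + dz) % 1.0))
    return rippled


def strain_matrix(sigma, rng=random):
    """Symmetric matrix of Eq. [11]; each eps_ij is drawn from N(0, sigma)."""
    e11, e22, e33, e12, e13, e23 = (rng.gauss(0.0, sigma) for _ in range(6))
    return [[1 + e11, e12 / 2, e13 / 2],
            [e12 / 2, 1 + e22, e23 / 2],
            [e13 / 2, e23 / 2, 1 + e33]]


def apply_strain(lattice, matrix):
    """Multiply each lattice (row) vector v by the strain matrix: v_new = v M."""
    return [[sum(v[k] * matrix[k][j] for k in range(3)) for j in range(3)]
            for v in lattice]


rng = random.Random(42)  # fixed seed so the example is reproducible
atoms = [(0.0, 0.0, 0.5), (0.25, 0.0, 0.5)]
rippled = ripple(atoms, rho=0.1, mu=1, eta=1, theta_x=0.0, theta_y=0.0)
cubic = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 4.0]]
strained = apply_strain(cubic, strain_matrix(sigma=0.05, rng=rng))
```

In the example the atom at $x = 0$ sits on a crest of the wave and is shifted by the full amplitude, whereas the atom at $x = 0.25$ sits on a node and is essentially unmoved.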

We have focused above on outlining the most important operators used for inorganic three dimensional crystalline solids. Some of these may not be appropriate for organic molecular crystals. A few variations exploring relevant regions of the PES for these systems include rotations of whole molecules, and sampling of various molecular conformations via bond rotations.123 Different systems may necessitate other geometrical modifications. For example, a number of operators have been developed specifically for finite clusters, such as the application of a twist to half of the atoms within the cluster.10 Clearly, a plethora of operators are possible. Which ones are employed depends on the chemical system at hand, and on one’s imagination.

Maintaining Diversity

For a successful structure search based upon EAs or GAs, the gene pool or breeding pool needs to remain diverse. If the structures that are chosen for procreation are too similar to one another, the algorithm can become stuck within a particular region of the PES, or even on a single structure. The diversity must remain high throughout the search to ensure that the variation operations can produce new structures and that all of the chemically relevant portions of the PES are thoroughly explored. The PSO algorithm also shares this concern, but it is expressed in a somewhat different language: one says that diversity needs to be maintained within the entire swarm. Individuals deemed as being too similar to one another must be eliminated (from the breeding pool or the swarm) to prevent the search from stagnating. This is referred to as diversity checking or niching. Below, we describe a few techniques used towards this end in both classes of algorithms.

The simplest niching scheme compares the energies or enthalpies of all of the individuals to determine if they differ by a value that is less than some user-defined amount, $\delta E$. If two or more structures satisfy this criterion, it is assumed they are duplicates of each other. Only one of them is kept in the search, while the others are removed.10, 101 The problem with this procedure is that it assumes there is a one-to-one correspondence between energy and structure. In fact, various configurations originating from different funnels may have similar energies (especially if their geometries are not yet fully relaxed). Another drawback to this strategy is that the size of $\delta E$ is arbitrary, and is likely system-dependent. This method can detect many false positives, thereby removing unique structures from the search. To prevent good structures from being discarded, other criteria are needed to distinguish individuals from one another.
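The energy-based niching scheme is simple enough to state in a few lines, which also makes its weakness easy to see. The function name, the choice to keep the most stable member of each group, and the example values below are assumptions made for illustration.

```python
def remove_energy_duplicates(enthalpies, delta_e):
    """Return the indices kept when structures closer than delta_e are merged.

    Structures are visited from the most to the least stable, and a structure
    is kept only if it differs by at least delta_e from every one already kept.
    """
    kept = []
    for index in sorted(range(len(enthalpies)), key=lambda i: enthalpies[i]):
        if all(abs(enthalpies[index] - enthalpies[k]) >= delta_e for k in kept):
            kept.append(index)
    return sorted(kept)


# Illustrative enthalpies (eV/atom); pairs (0, 1) and (2, 3) lie within delta_e.
enthalpies = [-5.000, -4.999, -4.500, -4.4995, -3.000]
unique = remove_energy_duplicates(enthalpies, delta_e=0.002)
```

Nothing in this test looks at the geometry, so two distinct structures from different funnels that happen to lie within $\delta E$ of each other are merged just as readily as true duplicates.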

Two approaches have been put forward to further discriminate between structures. The first class consists of indirect approximate schemes, which do not attempt to find duplicates by carrying out an atom-by-atom comparison. Instead, these methods compare a set of variables referred to as the individual’s fingerprint. The second class compares atomic positions directly.

In the first version of XTALOPT, the fingerprint consisted of a structure’s space group (determined to within a user specified tolerance), volume, and energy or enthalpy.124 Even though this method improved on using just the energy or enthalpy alone, we still found that many good unique structures were removed from the breeding pool. Another indirect scheme to weed out similar individuals uses bond distance measures.125 For example, CALYPSO86 employs bond characterization metrics (introduced by Steinhardt126 in order to quantify both the bond angles and lengths) for duplicate matching. Histograms of these metrics have shown a clear distinction between $sp^3$ coordinated diamond and $sp^2$ graphite.127 The fingerprint functions employed in USPEX were inspired by radial distribution functions and diffraction spectra.128 In this EA, each crystal structure is transformed into a point within a so-called fingerprint space, and the similarity between two individuals is determined by the distance between them within this abstract space. In another EA, Abraham and Probert penalize the fitness of structures whose spherically averaged scattering intensities resemble those of the most stable individual.116

All of the aforementioned methods are approximate in the sense that they reduce the structural coordinates to intermediate forms which are then compared. Perhaps one of the reasons why the direct approach is seldom employed is that degenerate unit cells, numerical noise, boundary errors (a consequence of periodic boundary conditions), and the lack of a canonical origin make it impossible to identify duplicates by simply comparing the coordinates output by the programs used for the structural relaxations. To overcome these problems, and enable an exact comparison of the atomic positions, we have written the XTALCOMP algorithm.124 This program transforms the crystal lattices into a canonical form, and searches for a rotation-reflection matrix that can map the two lattices onto each other to within a user-specified tolerance. The current version of XTALOPT uses XTALCOMP to find duplicate structures. Our tests have shown that the new version of XTALOPT dramatically outperforms the old one, which employed the aforementioned fingerprinting scheme instead. This improvement is likely due to the fact that the indirect method found a number of false positives, so that unique stable individuals were removed from the breeding pool. We note that the best diversity checking scheme is likely to be system-dependent. For example, the direct XTALCOMP method will probably not perform well for organic molecular crystals where two structures may differ simply by a slight rotation of a functional group around a bond. In such a scenario, the tolerance may need to be quite loose to flag the two structures as direct duplicates, but too large a tolerance will identify many false positives.

The XTALOPT Evolutionary Algorithm

In this section we take the opportunity to describe the nuances of the XTALOPT EA.109, 110, 129 XTALOPT has been written in C++ as an extension to the AVOGADRO130, 131 molecular editor, and it makes use of the OPENBABEL132, 133 chemical toolkit, which (among many other useful features) can convert between the various file-formats used to store chemical data. The SPGLIB134 package is included to aid in space group identification, and the XTALCOMP124, 135 program is used to search for duplicate structures. We have released XTALOPT as an open-source program under the GNU General Public License (GPL)136 so as to make it freely available to the scientific community for use and collaboration. XTALOPT has been interfaced with GULP, as well as the first-principles programs VASP, CASTEP, and Quantum ESPRESSO (PW-SCF).

XTALOPT searches for the most stable crystal structure of a given composition (number of atoms in the unit cell and their types), which remains fixed throughout the course of the run. Constraints can be imposed to limit the search to the chemically meaningful portion of configuration space. Lattice lengths, angles, and volumes can be restrained to a range or be fixed at a single value, and a minimum interatomic distance must be set. A check is performed to determine if all of the seeded structures, and those which are generated (whether it be randomly or via evolutionary operators) fulfill these constraints. If the geometric criteria are met, the structures are optimized by the external program, otherwise they are removed from the queue and new individuals are created. The crystals are free to optimize outside of the specified values during the relaxation. To fully utilize computational resources, we have chosen to use a population based pool and the continuous workflow illustrated in Fig. 10(b). The number of structures constantly being optimized is input by the user, and this value can be modified during the course of the run. The latest version of XTALOPT employs the XTALCOMP algorithm to detect duplicates by directly comparing their atomic positions to within a user-defined tolerance.

Roulette wheel selection is employed to choose parents: the probability that a structure is chosen to procreate is equal to its fitness, normalized such that $\sum_i f_i = 1$. The evolutionary operators implemented in XTALOPT are illustrated in Fig. 11. The two-parent breeding operation provides communication between different structures. The strain mutation samples various lattice parameters, permutation explores atomic orderings, and ripple probes atomic positions. Initial tests suggested that one of the drawbacks of the strain and ripple operators is that at times they fail to sufficiently modify the lattice and the children optimize back to the parents. This gives rise to an increased number of duplicates found in the search, and wastes computer resources by re-exploring known regions of the PES. To overcome this problem, we combined the mutations shown in Fig. 11 into two hybrid operators. The stripple operator merges strain with ripple, and permustrain marries permutation with strain. Tests revealed that an EA employing hybrid operators results in far fewer duplicate structures than one that does not, implying that the hybrid operators are more effective than a single mutation in overcoming barriers and visiting new regions of the PES.


Figure 12: Screen shots of the ‘Plot’ tab in the XTALOPT GUI. These can be used to visualize and analyze the results of the run in real time, or after the search has completed. Clicking on an entry brings up the structure for visualization in the AVOGADRO main window. The right mouse button can be used to kill unstable structures, or to inject promising seed lattices into the run.

An attractive feature of XTALOPT is its easy-to-use graphical user interface (GUI), which can be used to set up and start a search, inspect structures generated during the run in real time, visualize and analyze the progress and results of the search, and configure parameters associated with the run on-the-fly. The GUI consists of a number of different tabs, each one associated with a particular function. The structural parameters and geometric constraints are entered in the ‘Structure Limits’ tab. Information related to the queuing system, the location of local and remote directories, as well as inputs for the external program employed for relaxation, may be entered in the ‘Optimization Settings’ tab. ‘Search settings’ controls parameters related to the evolutionary run. This includes the number of structures comprising the breeding pool, information about the seeds (if any are to be used), tolerances for duplicate matching, and variables that control the behavior of the genetic operators. The ‘Plot’ and ‘Progress’ tabs can be used to visualize the results of the search, see Fig. 12 and Fig. 13. The structure number, generation, volume, enthalpy, energy, PV contribution towards the enthalpy, and any of the lattice parameters can be plotted on either the x or y-axis, or be shown as labels on the graph. Data for duplicate structures, or those not yet finished optimizing, can be optionally displayed. In addition, each structure can be optionally labeled by its Hermann-Mauguin space group symbol (as determined by the SPGLIB program). During the course of an evolutionary search we typically plot the enthalpy versus structure number, and visually inspect the most stable lattices. Much of the aforementioned


Figure 13: Same as Fig. 12, but the ‘Progress’ tab.

information is also displayed in the ‘Progress’ tab, which additionally provides each individual’s ancestry, lists which structures are duplicates of one another, and displays the current status of the job (if it is queued, running, has finished optimizing, or has been killed). Right clicking on an entry allows the user to remove or restart the job, replace one individual with another, inject a seed, or copy the lattice coordinates to the clipboard (in VASP POSCAR format). Clicking on any of the data points will bring up the structure in the main AVOGADRO window where it can be further visualized or manipulated, and the lattice coordinates may be copied to the clipboard by clicking on ‘edit’ followed by ‘copy’. The Crystallography extension we have implemented into AVOGADRO enables further structural analysis. One can determine the space group to within a user-specified tolerance, symmetrize the crystal, and reduce the lattice to either its primitive cell or Niggli cell. A text file listing the structures, from the most stable to the least, is stored on the local computer.

The current release of XTALOPT is best suited towards inorganic systems where the atoms can be treated as individual entities. Connectivity constraints for molecular crystal determination have been incorporated in a β-version, which we hope to release soon. We are also implementing new functionality useful in searches for metastable species and layered structures. A number of binary and ternary phases that have been predicted by XTALOPT are described in the penultimate section of this chapter.


Practical Aspects of Carrying out an Evolutionary Structure Search

A plethora of parameters may be adjusted to control the behavior of an evolutionary structure search. But because of the stochastic nature of the algorithm, and the potentially large computational expense associated with every run, it may be difficult for a new user to decide which variables to change and how. For these same reasons, very few thorough benchmark studies have been carried out to determine good parameter sets, and users frequently make choices based on heuristic arguments, convenience, and prior experience. Within this section we describe how a search can be accelerated via a judicious choice of parameters and outline criteria which can be used to determine when a search may be terminated (either because it is failing, or because there is a good probability that the global minimum has been found).

Which Unit Cells Do I Consider? The first choice the user is faced with when searching for the lowest energy structure for a given chemical composition is the size of the primitive unit cell. Unless experimental data is available, it is typically not known if the irreducible cell contains one, two, three or fifty-three FUs. But, because it is too expensive to carry out evolutionary searches on really large systems, the runs to be performed will need to be restricted to a computationally feasible subset of cells. For illustration purposes, let us suppose that the global minimum we are looking for contains 3 FUs. In principle, a search on a cell containing 3n FUs, where n is any positive integer, should be able to locate this target structure. But, since the number of minima is an exponential function of the number of atoms in the cell, many more structures will likely need to be optimized to find the global minimum for large cells. In addition, the more atoms a cell contains, the longer each geometry optimization takes. So, in practice, a search containing 6, 9, or 12 FUs may not be able to locate the global minimum in a reasonable amount of time. It may therefore be a good idea to carry out runs containing 1-4, and 6 FUs, and compare the lowest energy structures identified in each search. It may also be worthwhile to seed a 6 FU search with supercells of the lowest enthalpy structures located in the 2 FU and 3 FU runs (as described in more detail below). Note that if the number of atoms used in the search is smaller than in the primitive cell of the global minimum, the evolutionary run will never be able to find the lowest energy configuration. However, following the imaginary phonons of the most stable structures found within the search (see the section entitled ‘Following Soft Vibrational Modes’) may ultimately lead to the global minimum.

Sometimes one would like to predict the structures of stable binary phases across a broad composition range, i.e., $A_xB_y$, where $x$ and $y$ are unknown positive integers. One way to do this is to find the most stable structure for each composition via an evolutionary search. A plot similar to the one shown in Fig. 14 can be employed to determine thermodynamic stability. It illustrates the enthalpy of formation ($\Delta H_F$) per atom for the reaction $xA + yB \rightarrow A_xB_y$, as a function of composition. The leftmost point on the x-axis corresponds to a phase which is 100% A and the rightmost 100% B. So-called tie-lines are drawn connecting the enthalpies of formation of each phase. The set of line segments below which no other $\Delta H_F$ points lie defines the convex hull, and the phases whose $\Delta H_F$ comprise the hull are thermodynamically stable (other systems whose $\Delta H_F$ do not fall on the hull may be metastable, provided that all calculated phonon modes are real). In principle this procedure can be employed to determine thermodynamic stability in phases containing three or more different types of atoms as well. Reference 137 shows an example of


[Figure 14: plot of ∆HF versus mol fraction B, running from 100% A to 100% B, with points labeled S (stable) and MS (metastable); an inset shows phases A, B, C and D.]

Figure 14: A plot illustrating the enthalpies of formation, $\Delta H_F$, per atom for the formation of the binary phase $A_xB_y$ according to the reaction $xA + yB \rightarrow A_xB_y$. The fraction of element B in the binary is given on the x-axis. If a tie-line is used to connect $\Delta H_F(A)$ and $\Delta H_F(B)$, and $\Delta H_F(C)$ falls below it, as shown in the inset, then the formation of C from A and B is thermodynamically preferred. Based on thermodynamics alone, phase D is expected to decompose into A and B. The dashed line in the main plot represents the convex hull, that is the set of all tie-lines below which no other phases lie. All of the phases whose $\Delta H_F$ lie on the convex hull are thermodynamically stable (S) with respect to decomposition into other phases. Structures whose $\Delta H_F$ do not fall on the hull may be metastable (MS), provided their phonon modes are real, and the barriers towards their decomposition are high.

a ternary phase diagram. Such studies will be much more difficult because all decomposition pathways must be considered, and a higher dimensional space is necessary to map out the convex hull.
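For a binary system the convex hull construction reduces to a lower hull in the plane of composition and $\Delta H_F$, which the following sketch computes with a standard monotone-chain sweep. The function names and the enthalpies of formation are made up for illustration; the two elemental end points are included with $\Delta H_F = 0$.

```python
def _cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_convex_hull(points):
    """Return the (fraction_B, dH_F) points on the lower hull, left to right."""
    hull = []
    for p in sorted(points):
        # Drop the last hull point while it lies on or above the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def enthalpy_above_hull(point, hull):
    """Vertical distance of a phase from the tie-line beneath it."""
    x, h = point
    for (x1, h1), (x2, h2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            tie_line = h1 + (h2 - h1) * (x - x1) / (x2 - x1)
            return h - tie_line
    raise ValueError("composition lies outside the range spanned by the hull")


# Illustrative data: (mol fraction B, dH_F per atom); end points are the elements.
phases = [(0.0, 0.0), (0.25, -0.10), (0.5, -0.30), (0.75, -0.20), (1.0, 0.0)]
hull = lower_convex_hull(phases)
distance = enthalpy_above_hull((0.25, -0.10), hull)
```

In this example the phase at 25% B has a negative enthalpy of formation but still lies about 0.05 above the tie-line joining pure A and the 50% phase, so it is at best metastable.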

A second way to determine which binary phases are thermodynamically stable was proposed in 2009 by Trimarchi, Freeman and Zunger.138 They developed an EA that locates all of the minimum-energy lattices throughout the whole composition range in a single search. This means that the EA does not assume a fixed composition, and the convex hull is constructed during the course of the run. In this case the fitness function of a particular structure is dependent upon how much its $\Delta H_F$ deviates from the approximate convex hull, and the chemical composition of children generated during the search is allowed to differ from that of their parents. Many EAs can employ either the fixed or the variable composition scheme.

Should I Optimize Every Structure? In Darwin’s theory of evolution, the propensity for parenthood is dependent upon an individual’s fitness (coupled with a few random events), and the traits that are inherited at birth are passed on to future generations. Lamarck postulated that the characteristics acquired during a lifetime can be passed on to children as well. Analogously, in a Darwinian EA the enthalpies of the generated structures are directly used to determine their fitness, whereas in a Lamarckian EA the enthalpies of structures that have been relaxed to the nearest local minimum are employed to measure the likelihood for procreation. Despite the computational cost associated with performing structural relaxations, it has been demonstrated that global minima are found much more readily via the Lamarckian approach.115, 139 The reason for this may be that local optimization simplifies the PES by deforming it into a series of step functions, as illustrated for the basin hopping method in Fig. 6(a). We therefore highly recommend that most structures generated during an evolutionary search be relaxed. Not all lattices need be fully optimized, however. Systems whose enthalpies are so high that they will never be members of the breeding pool can be discarded before their geometries are fully relaxed. Typically, it is sufficient to use loose convergence criteria and settings that minimize the computational expense in the geometry optimizations. In addition, our group sets up the EA search so that geometries are optimized in a step-wise fashion. Often the first step consists of approximately relaxing the atomic coordinates while keeping all other variables fixed, the second step fixes the unit cell volume, and subsequent steps allow all parameters to relax. The accuracy of the calculation is increased at each step. During the search it is sufficient to get a rough enthalpy ordering of the various configurations. Only the most stable systems will be subject to stricter relaxations (in order to obtain an accurate stability ordering), and to further analysis.

Which Constraints Should I Impose? Specifying constraints that prevent chemically unreasonable configurations, or those that do not match experimental data, from being subject to local relaxations, is one of the most effective ways to accelerate an evolutionary search. If nothing about the structure is known, the user should, at a minimum, specify a volume (or volume range) that the unit cell is likely to adopt at the pressure employed in the simulation. Volumes that are either significantly too large or too small can lead to prohibitively lengthy geometry optimizations, and sometimes even the first self-consistent-field (SCF) step in an energy evaluation may not converge. Optimizing a randomly generated cell with the correct composition is sufficient to derive a first guess of a sensible volume, and typically the volume constraint can be refined during the course of the run. Imposing limits on the minimum distances between different atom types can also be extremely useful. For example, because C-C bonds shorter than 1.2 Å are very unlikely to be found at 1 atm, rejecting structures whose carbon-carbon contacts measure less than ∼0.8 Å will help minimize the number of lengthy structural relaxations in an EA search.
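A minimum-distance filter of this kind has to account for periodic images, since two atoms near opposite faces of the cell may be close neighbors. The sketch below is one simple way to do this and is not taken from any particular code: the function names and the dictionary of per-pair limits are hypothetical, fractional coordinates are assumed to lie in [0, 1), and searching only the 27 neighboring cell images is an approximation that can miss the shortest contact in strongly skewed cells.

```python
import itertools
import math


def min_image_distance(lattice, frac_a, frac_b):
    """Shortest distance between two atoms over the 27 neighboring cell images."""
    best = math.inf
    for shift in itertools.product((-1, 0, 1), repeat=3):
        d_frac = [fb - fa + s for fa, fb, s in zip(frac_a, frac_b, shift)]
        # Convert the fractional difference to Cartesian (rows are lattice vectors).
        cart = [sum(d_frac[k] * lattice[k][j] for k in range(3)) for j in range(3)]
        best = min(best, math.sqrt(sum(c * c for c in cart)))
    return best


def violates_min_distances(lattice, atoms, min_dist):
    """True if any pair of atoms is closer than the limit set for its two types.

    atoms: list of (symbol, fractional_coordinates) pairs.
    min_dist: dict mapping frozenset({symbol_1, symbol_2}) to a distance in Å.
    """
    for (sym_a, frac_a), (sym_b, frac_b) in itertools.combinations(atoms, 2):
        limit = min_dist.get(frozenset((sym_a, sym_b)), 0.0)
        if min_image_distance(lattice, frac_a, frac_b) < limit:
            return True
    return False


lattice = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]  # cubic cell, in Å
limits = {frozenset(("C", "C")): 0.8}
too_close = [("C", (0.0, 0.0, 0.0)), ("C", (0.95, 0.0, 0.0))]
acceptable = [("C", (0.0, 0.0, 0.0)), ("C", (0.5, 0.5, 0.5))]
rejected = violates_min_distances(lattice, too_close, limits)
accepted = not violates_min_distances(lattice, acceptable, limits)
```

In the first example the two carbon atoms are 2.85 Å apart inside the cell but only 0.15 Å apart across the cell boundary, so the structure is rejected before any relaxation is attempted.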

If experimental information is available, it should be employed whenever possible to accelerate the run. For example, sometimes X-ray diffraction can be used to determine the lattice parameters and a set of likely space groups of a crystalline lattice, even though it is not able to refine all atomic positions. In such a situation, synergy between experiment and theory is required to solve the structure. Moreover, the application of space group symmetry constraints can be useful even when experimental data is not available. This is because unique symmetry distributions are found in different classes of crystals. Over 80% of organic crystals possess the space groups P21/c (36.6%), P-1 (16.92%), P212121 (11.00%), C2/c (6.95%), P21 (6.35%) and Pbca (4.24%), but for inorganic systems the three most frequent space groups are Pnma (8.25%), P21/c (8.15%) and Fm-3m (4.42%).140 It is also well known that both very low and very high energy structures tend to have high symmetry. Most EAs are able to restrict the lattice lengths and angles to a particular set of values (or a range), so the search can be limited to a certain subset of the Bravais lattices. In addition, a number of structure prediction programs allow one to constrain the structures that are randomly generated to a set of user-defined space group symmetries.6, 122, 127 Enforcing specific symmetries for lattices generated via real-space evolutionary operators is not straightforward though, because most of the variation operators break the symmetry of the structure.

Intramolecular connectivity constraints can be immensely helpful as well. Consider for instance the group 14 elements, which typically form tetrahedral units, [A4]4−, in Zintl compounds with an alkali metal atom. One easy way to accelerate a search for a structure that is suspected to contain these types of motifs is to treat the tetrahedra as indivisible entities. Another instance when connectivity constraints can be of utmost importance is in the prediction of metastable species. For example, germane, GeH4, is unstable with respect to decomposition into H2 and Ge metal below pressures of 225 GPa. Nonetheless, this molecule can be synthesized and solid phases can be made. Because an unconstrained EA seeks out the lowest energy configuration, it will evolve towards a structure composed of layers of molecular hydrogen separated by sheets of pure germanium. Increasing the number of FUs in the cell will simply lead to an increase in the thickness of the layers. One way to find the most stable packing of GeH4 tetrahedra is to impose intramolecular connectivity constraints; another way is to leverage interatomic distance constraints, which prevent the formation of Ge-Ge and H-H bonds. Interatomic distance constraints can also be effective in searches carried out on 2D lattices. If the structural relaxation program employs 3D periodic boundary conditions, an EA can still search for the most stable 2D layered system by employing constraints that enforce large distances between atoms lying along one of the unit cell vectors.88, 112 Another way to direct the search into particular regions of the PES is by penalizing the fitness of structures with undesirable properties. This approach has been used in CSP searches for metastable phases containing molecular N2, at pressures where polymeric phases correspond to the global minimum.141

What About Parameters Related to the Search Itself? We have alluded to the fact that a number of parameters are associated with each CSP program, and pointed out that adjustable settings will depend upon the details of the implementation. For example, in XTALOPT the user can modify a number of options including: the number of individuals comprising the first generation and breeding pool; the percentage of new structures created via the crossover, stripple, and permustrain operators; the number of waves per cell and the amplitude in the periodic displacement used in Eq. 10; the minimum and maximum values for the standard deviation in Eq. 11; the number of atoms permuted; the tolerances used for duplicate matching. In Fig. 15 we illustrate the ‘Search Settings’ tab, which can be employed to modify these parameters during the course of the run. Can modifying settings such as these influence how well a search performs?

Because EAs are stochastic in nature, it is unlikely that any two runs will follow exactly the same trajectory, even if the same parameter set is employed. This makes it extremely difficult to carry out benchmarks to determine the best settings for different classes of systems. Consider for example our initial testing of XTALOPT, for which we employed a 48 atom supercell of TiO2.109 To speed up the computations, the geometries were optimized using a combination of Buckingham and Lennard-Jones potentials within the GULP package. For a given choice of settings in XTALOPT, we carried out one hundred different searches. The results for one choice of parameters are illustrated in Fig. 16. This plot shows the enthalpy of the most stable structure found in both the ‘best search’ and the ‘worst search’, as well as the average of all searches. In the ‘best search’ the global minimum (rutile) was found in the first randomly generated set of structures, whereas in the ‘worst search’ rutile was not located even by the 600th structure (note that in a later study we obtained a success rate of 100% by the 280th individual when XTALCOMP was employed for duplicate matching124). This plot clearly illustrates that the performance of a particular parameter set can only be accurately gauged by analyzing the results of large sets of computations. Therefore, to determine reasonable default parameters for XTALOPT, we analyzed the behavior of 45 different parameter sets, and 100 separate searches were performed for each set. The benchmarks were carried out using rutile as a target structure, so different types of systems may require slightly different parameters.

Figure 15: The ‘Search Settings’ tab, which may be employed to adjust the parameters used in an EA run performed with XTALOPT.

Figure 16: One hundred EA searches interfaced with GULP were carried out on a 48 atom TiO2 supercell. The lowest enthalpies obtained from the ‘best search’ and the ‘worst search’ are shown as solid and dashed black lines, respectively. The gray line provides the average enthalpy of the best structure from each search. Data taken from Reference 109.

Because the variation operators, duplicate matching criteria, and other technical details of the various EA codes vary, each program will have its own set of adjustable parameters. We recommend that users become familiar with the intricacies of the EA they are employing, and tailor the parameters for their particular problem. A trivial example is the permutation operator, which should be disabled for single-component systems since it will not generate any new structures. Or it may be useful to turn off the two-parent breeding operation if one has good reason to believe that the target structure strongly resembles those in the present pool so that it can be located by mutations alone.
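When many independent searches are run with the same parameter set, as in the TiO2 benchmark described above, their trajectories can be condensed into a few summary curves. The sketch below is one possible way to do this. The function names are hypothetical, the input is assumed to be a list of equal-length enthalpy sequences (one per search, in the order the structures finished), and the lower and upper curves are pointwise envelopes, which is not necessarily how the ‘best search’ and ‘worst search’ in Fig. 16 were selected.

```python
def running_minimum(enthalpies):
    """Lowest enthalpy found so far after each structure in one search."""
    best, out = float("inf"), []
    for h in enthalpies:
        best = min(best, h)
        out.append(best)
    return out

def summarize_searches(runs):
    """Return (lower envelope, upper envelope, average) of the running minima."""
    curves = [running_minimum(r) for r in runs]
    n = len(curves[0])
    lower = [min(c[i] for c in curves) for i in range(n)]
    upper = [max(c[i] for c in curves) for i in range(n)]
    average = [sum(c[i] for c in curves) / len(curves) for i in range(n)]
    return lower, upper, average

def success_rate(runs, target, tol=1e-3):
    """Fraction of searches whose lowest enthalpy is within tol of the target."""
    hits = sum(1 for r in runs if min(r) <= target + tol)
    return hits / len(runs)
```

Comparing `success_rate` across parameter sets, each evaluated over many searches, is a more robust basis for choosing defaults than inspecting any single run.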

What if I Want to Search for a Really Large Unit Cell? The unit cells of a number of inorganic crystals may be dauntingly large. Consider for example Rb-III,142 Cs-III,143 and the commensurate host-guest structure of Ba-IVc,144 whose unit cells contain 52, 84, and 768 atoms, respectively. Would it be possible to use an EA to locate the minimum-energy configurations for these systems? Because these phases are found under pressure, it is unlikely that empirical potential parameters accurately describing their PES have been developed, so DFT calculations must be employed. However, the formal cubic scaling associated with DFT, the lack of symmetry in the EA-generated structures, and the sheer number of systems that need to be optimized, would most likely render the prediction of these structures out of reach of current algorithms. It is difficult to answer the question: “What is the largest unit cell that could be predicted via a DFT-based EA search?” This depends upon a number of factors including: the number of atoms of different types present in the structure, the stoichiometry, the intricacies of the PES, and the computational resources available. At the time of writing this article, we estimate that most EAs that are carried out in conjunction with first-principles geometry optimizations are limited to cells containing roughly up to 45 ± 15 atoms in a binary system, provided that no restraints (e.g., intermolecular connectivities, unit cell parameters, or symmetries) are employed. Below we outline a few strategies for large unit cells.
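To get a feeling for what formal cubic scaling implies for the cells mentioned above, the following back-of-the-envelope sketch compares the cost of a single calculation with that of a 45-atom cell (the reference size is taken from the estimate in the text). It assumes a pure N^3 law with an identical prefactor and ignores k-point sampling, the number of relaxation steps, and the growth in the number of local minima, so the numbers are indicative only.

```python
def relative_dft_cost(n_atoms, n_reference, exponent=3.0):
    """Cost of one calculation relative to a reference cell, assuming N**exponent scaling."""
    return (n_atoms / n_reference) ** exponent

# Cell sizes quoted in the text for three high-pressure phases.
for name, n in [("Rb-III", 52), ("Cs-III", 84), ("Ba-IVc", 768)]:
    print(f"{name}: {relative_dft_cost(n, 45):.1f}x the cost of a 45-atom cell")
```

Doubling the number of atoms already makes each energy evaluation eight times more expensive under this assumption, before accounting for the larger number of structures that must be sampled.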

The first approach does not specifically use an EA to explore the PES of the large cell. Instead, a good approximation for the most stable structure is found via an EA on a much smaller system. The lowest energy structure is then used to build up larger, more complex cells, which might be even more stable. The simplest way to do this is to create a supercell and optimize it without symmetry constraints. One can also manually perturb some of the atoms in the supercell, followed by local optimization. Or, the supercell can be used as a seed in a calculation on the large system. Coming back to the example of the 6 FU cell, an EA search on this composition may be guided towards stable areas of the PES by seeds that were found in runs on structures containing 2 FUs and 3 FUs. Phonon calculations on the smaller cell, followed by displacement of the atoms along the softest modes within a supercell of the structure, is another way to locate the global minimum, as discussed in the section entitled ‘Following Soft Vibrational Modes’.
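Building a supercell from a smaller optimized cell, for use as a seed, takes only a few lines. The sketch below assumes the same hypothetical data layout as before (lattice as three row vectors, fractional coordinates, a list of element symbols) and is not tied to the file format of any particular code.

```python
import itertools

def make_supercell(lattice, frac_coords, species, reps):
    """Repeat a cell reps = (na, nb, nc) times along its three lattice vectors."""
    na, nb, nc = reps
    # Each lattice vector is stretched by its repetition count.
    new_lattice = [[reps[i] * x for x in lattice[i]] for i in range(3)]
    new_coords, new_species = [], []
    for (i, j, k) in itertools.product(range(na), range(nb), range(nc)):
        for f, sp in zip(frac_coords, species):
            # Shift the atom into replica (i, j, k), then rescale to the new cell.
            new_coords.append([(f[0] + i) / na, (f[1] + j) / nb, (f[2] + k) / nc])
            new_species.append(sp)
    return new_lattice, new_coords, new_species
```

For example, a 2 FU structure repeated with `reps=(3, 1, 1)`, or a 3 FU structure repeated with `reps=(2, 1, 1)`, yields a 6 FU seed that can then be relaxed without symmetry constraints or perturbed before relaxation.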

One strategy to increase the cell size that can potentially be tackled by the EA is to modify the way new structures are created. In a recent implementation of USPEX, Lyakhov, Oganov and Valle introduced new methods designed specifically for systems with many degrees of freedom.122 These include generating the first set of sub-random structures by using supercells of smaller randomly generated cells (the cell-splitting technique), displacing atoms along the softest eigenvectors calculated empirically via bond hardness coefficients (the soft-mutation operator), as well as biasing variation operators to favor structures with higher local order. Tests were performed on SiO2 with 24 atoms in the unit cell using interatomic potentials along with GULP as the external optimizer. It was illustrated that the new algorithm is able to find the global minimum much more quickly, and the success rate was dramatically improved.

A final way to enable the prediction of systems with large unit cells by evolutionary means is to decrease the computational time associated with each geometry optimization. Calculations that use interatomic potentials can be performed at a fraction of the cost of those with first-principles methods; however, accurate potential parameters are often not known. In addition, the available potentials may have been developed to accurately reproduce the energies associated with a specific portion of the PES, and as a consequence might provide a poor description of other regions. Wu et al. have developed an adaptive-GA (AGA),50 based upon the Deaven and Ho scheme, which increases the efficiency of a DFT-based search while simultaneously maintaining its accuracy. This is accomplished by fitting potential parameters to DFT data generated during the GA search for a small subset of systems, and using the custom-made interatomic potentials to carry out the next iteration of the GA-loop. Because the potential parameters are further updated in the next iteration of the code, they evolve to better represent the particular region of the PES sampled during the search. The AGA was employed to successfully predict a number of structures, including a new phase of Mg2SiO4 under pressure with 56 atoms in the unit cell.

When Should I Stop the Evolutionary Search? Most evolutionary searches do not have hard and fast termination criteria, so it is up to the user to determine when to end them. In an ideal case the run would stop once the global minimum is located. However, the only way to ensure that the most stable structure has been found is to visit every single local minimum. But, because a full exploration of the PES cannot be carried out for any but the simplest systems, other criteria must be adopted. One possible choice would be to stop a run when a predefined number of structures have finished optimizing, say for example 10 times the number of atoms that comprise the unit cell. However, because the number of minima in the PES increases exponentially with the number of atoms, it is unlikely that the same criterion is appropriate for all systems. Another choice is to stop the search once the same minimum-energy structure has been found at least five times. Since an EA is intrinsically a stochastic method, the trajectories followed in searches using the same parameters can be very different. Therefore, in principle multiple EA searches with the same composition should be carried out to confirm that the global minimum has been found. But because of the large computational cost associated with EAs interfaced with first-principles techniques, often only a single run is performed.
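The two heuristic termination criteria mentioned above can be written down as a small helper. The function and parameter names are hypothetical, and combining the two criteria with a logical OR is a choice made for this sketch; the text presents them as alternatives, and neither guarantees that the global minimum has been found.

```python
def should_stop(n_optimized, n_atoms, times_best_found,
                structures_per_atom=10, min_rediscoveries=5):
    """Return True when either heuristic from the text is satisfied.

    n_optimized: structures that have finished optimizing so far.
    times_best_found: how often the current lowest-enthalpy structure was located.
    """
    budget_spent = n_optimized >= structures_per_atom * n_atoms
    best_repeated = times_best_found >= min_rediscoveries
    return budget_spent or best_repeated
```

Because the number of minima grows exponentially with the number of atoms, `structures_per_atom` would likely need to be increased for larger or chemically more complex cells.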

An EA may also be stopped prematurely if the user has reason to believe that the search will fail to find the target structure. Let’s assume that we are trying to predict stable binary phases with the stoichiometry AxBy. If it appears that the search is evolving towards layered structures, with one layer containing only atoms of type A and the other solely of type B, it may be that the binary phase is unstable with respect to decomposition into the elemental phases. The layered arrangement suggests that A-A and B-B interactions are stronger than A-B bonding. Here, the user might want to terminate the run even before the minimum-energy structure is located. However, if the goal is to find metastable AxBy phases, it could be useful to let the run progress and analyze structures whose energies are somewhat higher than that of the global minimum, or to carry out a search using appropriate constraints (e.g., the GeH4 example discussed above).

A third reason to terminate an EA is because there is good reason to believe that the search has become stuck in a local minimum from which it cannot escape. How does one distinguish this situation from the case when the global minimum has been found? In both scenarios the same low enthalpy structure will be located multiple times during the course of the search. However, if the search has stagnated, it is also reasonable to assume that the diversity in the breeding pool and the offspring will be low. This can be probed112 by analyzing the similarity of the structures generated during the search using indirect fingerprinting schemes, such as those described in the section entitled ‘Maintaining Diversity’.
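One simple way to quantify the diversity of a breeding pool, assuming each structure has already been reduced to a fixed-length numerical fingerprint, is the mean pairwise distance between fingerprints. The sketch below uses the Euclidean distance and a user-chosen threshold. Both are assumptions made for illustration and are not the specific measures used in the fingerprinting schemes referred to above.

```python
import math

def mean_pairwise_distance(fingerprints):
    """Average Euclidean distance between all pairs of fingerprint vectors."""
    n = len(fingerprints)
    if n < 2:
        return 0.0
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += math.dist(fingerprints[i], fingerprints[j])
    return total / (n * (n - 1) / 2)

def looks_stagnated(pool_fingerprints, threshold):
    """True if the pool's diversity has dropped below the chosen threshold."""
    return mean_pairwise_distance(pool_fingerprints) < threshold
```

Tracking this quantity generation by generation makes a collapse in diversity visible, which helps distinguish a stagnated search from one that has converged on the global minimum while still producing varied offspring.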

How Do I Analyze the Data Once a Search is Done? The output of an EA typically consists of a list of structures arranged from the most stable to the least, as well as files containing the coordinates for the final optimized geometries. In addition, each program comes with its own set of tools, which lets the user analyze the predicted structures, and get an overview of the progress of the run. Some of the visualization and analysis features incorporated within XTALOPT and AVOGADRO are described in the section entitled ‘The XTALOPT Evolutionary Algorithm’. In many cases a careful analysis of the data is key towards identifying the most stable individual, or set of unique, important structures. One must keep in mind that in an effort to decrease the computational cost, accurate settings are not employed in the EA calculations. Because the geometries are typically not fully converged, nor are they symmetrized, the enthalpies and structural coordinates obtained from the EA are only approximate. In addition, modification of the tolerances used to determine space groups and to identify duplicate structures can influence the results, even for fully converged geometries. So, there may be quite some uncertainty associated with analyzing the results of an evolutionary run.

Because of this, we have adopted the practice of carefully analyzing a subset of the most stable structures as identified by the EA. One reasonable choice is to painstakingly scrutinize all of the systems whose enthalpies lie within ∼15 meV/atom of the (presumed) lowest energy lattice. Once we have narrowed down the list of structures, the next task is to remove any obvious duplicates. We typically employ a combination of visual inspection, duplicate matching using XTALCOMP, and space group detection towards this end. If there is uncertainty about whether two structures are unique, it is best to keep both at this stage. Next, the unique low enthalpy systems need to be optimized using high quality computational settings. At this stage, certain structures can be removed from further analysis if it becomes clear that their enthalpies are too high. Subsequent symmetrization can be quite tricky because the results obtained depend on the tolerances and codes used. Programs that can identify symmetries include SPGLIB,134 the FINDSYM program, which is part of the ISOTROPY software suite and can be accessed via the internet,145, 146 as well as various commercial packages (e.g., Materials Studio). We recommend that a number of methods and tolerances be employed if the results appear ambiguous. Again, the symmetrized structures must be re-optimized with sufficient energy cutoffs, k-meshes, and strict convergence criteria to obtain accurate energy orderings. In some circumstances a number of unique phases may emerge from the search with similar enthalpies. In these situations including zero-point energies or finite temperature effects might change the order of stability. Phonon calculations need to be carried out to verify that all of the structures are minima, at which point free energies can be compared.
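The first post-processing step, selecting every structure within a given enthalpy window of the presumed lowest-energy lattice, might be sketched as follows. The function name and the input format (a dictionary mapping structure labels to enthalpies in eV/atom) are assumptions made for illustration.

```python
def select_candidates(enthalpies_ev_per_atom, window_mev=15.0):
    """Return labels of structures within `window_mev` of the lowest enthalpy,
    ordered from most to least stable."""
    lowest = min(enthalpies_ev_per_atom.values())
    window_ev = window_mev / 1000.0  # convert meV/atom to eV/atom
    keep = [(h, name) for name, h in enthalpies_ev_per_atom.items()
            if h - lowest <= window_ev]
    return [name for h, name in sorted(keep)]
```

Since the enthalpies coming out of the EA are only approximate, it is safer to err on the side of a slightly wider window and let the subsequent high-accuracy re-optimization decide which structures to discard.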

The main rules of thumb to be used when carrying out an EA search can be summarized as:

• Employ chemically reasonable constraints (or experimental information) whenever possible.

• Optimize the geometries generated by the algorithm in a step-wise fashion, increasing the accuracy of the calculation at each step.

• Connectivity or distance constraints can be utilized to find metastable species.

• If the unit cell of the system is large, or it contains many different types of atoms, a search can be accelerated dramatically by biasing it with plausible low-enthalpy structures.

• Post-processing of the structures found in the search to determine which are duplicates, and further optimization of the most important structures using very accurate computational settings, is of utmost importance.

Crystal Structure Prediction at Extreme Pressures

Now that we have learned about different methods commonly used in CSP for inorganic crystals, and discussed the intricacies associated with carrying out EA-based searches, we introduce you to some crystal structures that have been predicted using EAs. With the exception of the LiH6 phase (found by USPEX), all of the structures described in this section have been predicted using the open-source EA XTALOPT. A number of recent articles have reviewed the applications of USPEX,98, 103 the adaptive-GA,50 the CALYPSO technique,5 random searching via AIRSS,6 and simulated annealing.3 The reader might find it curious that in many cases these methods have been applied to systems under conditions of extreme pressure, and XTALOPT is no exception. One reason why automated structure searches are ubiquitous in the area of high pressure science is because these types of experiments can be exceedingly challenging to carry out and a synergetic feedback loop between theory and experiment is necessary to move the field forward.

Because we live on the surface of the earth, which experiences 1 atm of pressure, it is tempting to assume that the same rules we have learned in undergraduate chemistry can describe chemical behavior at different pressures. Such an assumption is very wrong, and it turns out that chemistry in intergalactic space (P = 10^−32 atm) is very different from that in the center of a neutron star (P = 10^32 atm). As the pressure increases so does the importance of the PV term towards the enthalpy, so more densely packed systems become more stable. A recent review helps develop an intuitive picture of the seemingly bizarre ways in which pressure affects chemistry.147 The pressure variable is important in dictating what transpires in the earth’s interior (P ∼350 GPa), the behavior of matter on giant planets, and in the synthesis of new materials with novel properties (such as superconductors or superhard materials). Pressure can affect the structure of solids, their electronic structure, and their chemical reactivity, to name a few examples. Static pressures of up to ∼350 GPa can be applied in experiments using diamond anvil cells (DACs), and much higher dynamic pressures can be achieved with shock waves. In addition, pre-compressed pressure cells have been placed inside of DACs, and DACs have been shocked. There are many ways experimentalists have put the squeeze on matter.148 Despite these advances, it is often difficult to interpret the results of high pressure experiments. For example, it may be impossible to determine the structure of a phase directly (i.e., using X-ray diffraction), especially if it contains light elements. Structural information may be inferred from the results of Raman or IR spectroscopy instead. The substances being analyzed can decompose, and unforeseen reactions with the experimental apparatus can occur. Light elements can diffuse into the diamonds causing them to break, resulting in a costly and premature end to the experiments. Measuring the properties (e.g., conductivity) of a sample within a DAC can also be a challenge. These are some of the reasons why CSP is so useful for high pressure research.
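The growing weight of the PV term mentioned earlier in this discussion, H = E + PV, can be made concrete with a short unit-conversion sketch. The conversion factor follows from 1 GPa·Å^3 = 10^−21 J and 1 eV = 1.602176634 × 10^−19 J; the function names are hypothetical, and the two ‘phases’ in the usage note below are invented numbers chosen purely for illustration.

```python
# 1 GPa * 1 A^3 = 1e9 Pa * 1e-30 m^3 = 1e-21 J, expressed here in eV.
EV_PER_GPA_A3 = 1.0e-21 / 1.602176634e-19

def pv_term_ev(pressure_gpa, volume_a3):
    """PV contribution to the enthalpy in eV."""
    return pressure_gpa * volume_a3 * EV_PER_GPA_A3

def enthalpy_ev(energy_ev, pressure_gpa, volume_a3):
    """H = E + PV, with E in eV, P in GPa and V in cubic angstroms."""
    return energy_ev + pv_term_ev(pressure_gpa, volume_a3)
```

For instance, a made-up open phase with E = −10.0 eV and V = 12 Å^3 has a lower enthalpy at zero pressure than a made-up dense phase with E = −9.5 eV and V = 10 Å^3, but at 100 GPa calling `enthalpy_ev` for both shows that the denser phase is favored, because its PV term is smaller.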

The CSP studies carried out in our group aim to predict synthesis targets for experimentalists in their quest to create novel materials. Our evolutionary searches have been directed towards systems with stoichiometries that appear odd from the perspective of chemistry at 1 atm. They have identified a number of stable hydrogen rich phases containing ‘a little bit’ of the alkali metals or alkaline earth metals.149–156 The main impetus for our computational studies was the theoretical prediction that metallic hydrogen,157 and hydrogen-rich systems158 may be high-temperature superconductors under conditions of extreme pressure. Our investigations have uncovered one phase, R-3m-SrH6,156 which is likely to have an unusually high superconducting critical temperature, Tc. A number of other phases likely to have a moderate Tc under pressure have been predicted as well. Perhaps more importantly, we have used CSP to unveil a rich structural diversity in alkali metal and alkaline earth polyhydrides under pressure. The propensity for superconductivity has been correlated with the different hydrogenic motifs observed in the stable phases. Focusing on the alkali polyhydrides, we have predicted that LiH6, NaH9, KH5, RbH5 and CsH3 have the lowest enthalpy of formation from the classic MH hydride and molecular hydrogen throughout a significant portion of the pressure range considered (P = 1 atm to 350 GPa). Supercells of the most stable structures found are illustrated in Fig. 17. None of these phases could have been predicted using chemical intuition or data mining techniques, because the only known alkali hydrides at 1 atm contain one hydrogen atom for every metal atom.
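The enthalpy of formation of a polyhydride from the classic hydride and molecular hydrogen, MH + (n−1)/2 H2 → MHn, can be evaluated as in the sketch below. Dividing by the n+1 atoms in MHn, so that different stoichiometries can be compared on the same footing, is an assumption of this sketch rather than a convention stated in the text, and the function name is hypothetical.

```python
def formation_enthalpy_per_atom(h_mhn, h_mh, h_h2, n):
    """Enthalpy change of MH + (n-1)/2 H2 -> MHn, divided by the n+1 atoms in MHn.

    h_mhn, h_mh and h_h2 are enthalpies per formula unit (MHn, MH and H2),
    all computed at the same pressure and in the same units.
    """
    return (h_mhn - h_mh - 0.5 * (n - 1) * h_h2) / (n + 1)
```

A negative value indicates that MHn is stable with respect to decomposition into MH and H2 at that pressure; repeating the calculation over a range of n and pressures identifies the stoichiometry with the most negative value in each pressure interval.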

[Figure 17 panel labels: LiH6, 100 GPa; NaH9, 25 GPa; KH5 and RbH5, 2-4 GPa; CsH3, 2 GPa]

Figure 17: Supercells of alkali metal polyhydrides predicted to be particularly stable over a wide pressure range. LiH6 consists of Li+ and H2^(−1/3) (white). NaH9 is comprised of H− (small dark spheres) and H2 (white). KH5 and RbH5 contain H2 (white) and H3− (small dark spheres). A number of nearly isoenthalpic CsH3 phases with H3− (white), and Cs+ (small dark spheres) were found. The pressure at which the first polyhydride phase, MHn with n > 1, becomes stable with respect to decomposition into MH and H2 is also provided. In some cases these phases were not the same as the polyhydrides that were stable throughout the largest pressure range.

Here is a concise summary of our findings. The pressures at which the polyhydrides start to become stable decrease going down the group, so whereas the first LiHn (n > 1) stoichiometry to resist decomposition into LiH and H2 was predicted to do so around 100 GPa, stabilization was achieved by 2 GPa within the cesium polyhydrides. Electronic structure calculations revealed that in all cases there is a transfer of charge from the electropositive alkali metal to the hydrogenic sublattices whose make-up depends on the nature of the metal ion. LiH6 (ref. 149) contains only H2δ− molecules with bonds that are slightly stretched relative to dihydrogen; dihydrogen and hydridic H− anions are incorporated into NaH9;150 KH5 (ref. 152) and RbH5 (ref. 151) are comprised of dihydrogen and linear symmetric H3− anions; and a number of nearly isoenthalpic CsH3 (ref. 153) phases accommodated only H3− molecules, the simplest example of a 3-center 4-electron bond. The nature of the hydrogenic sublattice proved to be intimately related to the density of states at the Fermi level, and therefore potential for superconductivity. Systems that contain H− or H3− became metallic because of the overlap of bands that were broadened by pressure; these are poor metals. Phases with H2δ− are highly metallic as a result of the filling of the anti-bonding σu* bands of dihydrogen. Calculations suggest that phases lacking H− or H3− have a higher Tc than other phases.154, 159 The best candidates for high-temperature superconductivity, however, are likely to be R-3m-SrH6 at 250 GPa156 and Im-3m-CaH6 at 150 GPa92 (this phase was found by Wang et al. via the CALYPSO method). These two systems are good metals, and their hydrogenic sublattices are atomistic (they do not contain any molecular motifs). We have also employed XTALOPT to predict the structures of the classic alkali hydrides with one-to-one stoichiometries,160 and the lithium subhydrides161 under pressure.

[Figure 18 panel labels: C2/m H2O at 2 TPa; C2/c AuO at 1 atm; I4/mmm LiB4 at 80 GPa; Ama2 LiBeB at 40 GPa]

Figure 18: Structures predicted by Hermann using the XTALOPT EA. (From left to right) A phase of H2O ice that is metallic and stable above 4800 GPa162 (oxygen atoms are large and dark). AuO, the only experimentally unknown late transition metal monoxide163 (oxygen atoms are small and dark). The LiB4 phase with the BaAl4 structure type,164 and an LiBeB phase metastable between 15-70 GPa137 (boron atoms are small, beryllium atoms in LiBeB are medium-sized, and lithium atoms are large).

Hermann, Prasad, Wen, Labet, Zaleski-Ejgierd and AlKaabi in Hoffmann’s group have used XTALOPT (sometimes in conjunction with other CSP methods such as chemical intuition, random search, USPEX and CALYPSO) to predict the structures of a number of unique materials. These include the tungsten polyhydrides (WHn, n = 1-6, 8),165, 166 lithium amide (LiNH2)167 and lithium azide (LiN3),168 benzene,169 silicon monoxide (SiO),170 and the hitherto unrealized condensed phase of astatine at 1 atm.171 In Fig. 18 we show a few of the predicted structures to illustrate the wide variety of materials that have been studied. Consider for example the C2/m phase of water-ice, found to become stable (and metallic) around 4800 GPa.162 It can be viewed as bilayers of interpenetrating sheets spanning the ab plane. Some of the hydrogen atoms are coordinated to two oxygens in a linear O-H-O arrangement, whereas other O-H-O bonds are buckled. This study, relevant for the interiors of giant gas planets, illustrated that the pressure at which water is likely to metallize is higher than previously thought. The crystal structures of the late transition metal monoxides (at 1 atm), which can all be viewed as distorted rock salt lattices, have also been investigated.163 Even though NiO, PdO, PtO, CuO, AgO, ZnO and HgO have been synthesized, AuO remains unrealized. EA searches predicted that AuO would adopt a unique, but poorly packed, structure with open channels that deviates significantly from the rock salt prototype. And finally, binary and ternary phases containing the light elements Li, Be and B (refs. 137, 164, 172–174) have been scrutinized from 1 atm up to pressures close to those at the center of the earth. A plethora of beautiful structures were uncovered; here we only briefly mention the two shown in Fig. 18. LiB4 is the only new boron-rich phase that was found to be stable under pressure. It assumes the BaAl4 structure type, commonly observed in various binaries comprised of an alkali metal and a group 13 element.164 A number of crystals with the LiBeB stoichiometry were predicted to be metastable or stable as well, including a metallic phase with Ama2 symmetry whose domain of existence spans from 15-70 GPa.137 This phase contains a hexagonal lithium sublattice with linear chains of beryllium and buckled chains of boron filling in the cavities.

We look forward to further predictions of unique crystals with unusual properties using any of the CSP methods described above (as well as new ones yet to be developed), and eagerly anticipate their eventual synthesis.

Conclusions

Predicting a crystal’s structure given only its chemical composition is a global optimization problem, where the function to be minimized is the free energy of the lattice, and the variables are the cell parameters and atomic coordinates. In the last 25 years a number of global optimization algorithms have been adapted towards crystal structure prediction. Some of these techniques are better suited towards global searches (random searching, EAs, PSO), others towards local exploration (simulated annealing, basin and minima hopping, metadynamics) and all of them perform to their fullest when chemically reasonable constraints are placed on the search space. The increase in computational power has made it possible to interface these methods with programs that use first-principles to calculate energies and relax geometries to the nearest local minimum. Algorithmic developments coupled with the dramatic advancements in computer hardware have made it possible to rise to Maddox’s challenge (see the ‘Introduction and Scope’), but many limitations still remain.

Size matters in CSP, and large unit cells with many different atom types present a combinatorial problem. The number of local minima grows as an exponential function of the number of atoms in the unit cell. This means that finding the global minimum of a crystal with a large unit cell will entail many more relaxations as compared to a small cell, and the relaxations via first-principles calculations will consume progressively more time. Smarter algorithms are required to find ways to sample the relevant regions of the PES in large systems selectively. Methods to locate important local minima are also required. In addition, an algorithm that predicts the size of the unit cell in the most stable structure is desirable. Despite the fact that structural stability is inherently related to temperature, at the moment it is computationally too expensive to take finite temperature effects into account during the course of a CSP search. Runs are carried out at 0 K, and sometimes the free energies of the most stable set of structures are calculated and compared in a post-processing step. Using the enthalpy at 0 K to infer the relative stabilities at finite temperature assumes that the global minimum is independent of temperature. In principle metadynamics or molecular dynamics could be employed to obtain the free energy, but in practice very long simulation times are necessary to yield accurate results. The immense cost associated with computing free energies is the bottleneck in predicting crystal structures at finite temperatures.

Much progress has been made within computational crystal structure prediction for inorganic systems since Maddox made his provocative comments. This is an exciting area where theoretical work has been used in conjunction with experiment to characterize a number of newly made materials, and also where predictions have inspired experimental explorations. There are many materials waiting to be discovered, and we expect computational predictions will become increasingly important in this burgeoning field. We look forward to new algorithmic developments that enable CSP searches on larger unit cells, and the incorporation of finite temperature effects into the search itself.

Acknowledgements

EZ acknowledges the NSF (DMR-1005413) for financial support, and thanks the Alfred P. Sloan Foundation for a research fellowship (2013-2015). EZ thanks current and former members of her research group for inspiration and help editing the manuscript. This includes Zack Falls, James Hooper, David Lonie, Dan Miller, Andrew Shamp, Scott Simpson and Tyson Terpstra. Andreas Hermann is acknowledged for providing graphic material.

References

[1] J. Maddox, Nature, 335, 201 (1988). Crystals From First Principles.

[2] S. M. Woodley, and R. Catlow, Nat. Mater., 7, 937 (2008). Crystal Structure Prediction from First Principles.

[3] J. C. Schon, K. Doll, and M. Jansen, Phys. Status Solidi B, 247, 23 (2010). Predicting Solid Compounds Via Global Exploration of the Energy Landscape of Solids on the Ab Initio Level Without Recourse to Experimental Information.

[4] G. Rossi, and R. Ferrando, J. Phys.: Condens. Matter, 21, 084208 (2009). Searching for Low-Energy Structures of Nanoparticles: A Comparison of Different Methods and Algorithms.

[5] Y. Wang, and Y. Ma, J. Chem. Phys., 140, 040901 (2014). Perspective: Crystal Structure Prediction at High Pressures.

[6] C. J. Pickard, and R. J. Needs, Phys. Status Solidi B, 246, 536 (2009). Structures at High Pressure from Random Searching.

[7] B. C. Revard, W. W. Tipton, and R. G. Hennig, in Topics in Current Chemistry, S. Atahan-Evrenk and A. Aspuru-Guzik, Eds., Springer-Verlag, Berlin Heidelberg, 2014, pp. 1-42. Structure and Stability Prediction of Compounds with Evolutionary Algorithms.

[8] A. R. Oganov (Editor), Modern Methods of Crystal Structure Prediction, Wiley-VCH, Berlin, 2010.

[9] D. J. Wales, Energy Landscapes: Applications to Clusters, Biomolecules and Glasses, Cambridge University Press, Cambridge, 2003.

[10] R. L. Johnston, Dalton Trans., 22, 4193 (2003). Evolving Better Nanoparticles: Genetic Algorithms for Optimising Cluster Geometries.

[11] S. Heiles, and R. L. Johnston, Int. J. Quantum Chem., 113, 2091 (2013). Global Optimization of Clusters Using Electronic Structure Methods.

[12] M. Sierka, Prog. Surf. Sci., 85, 398 (2010). Synergy Between Theory and Experiment in Structure Resolution of Low-Dimensional Oxides.

[13] P. Verwer, and F. J. J. Leusen, in K. B. Lipkowitz, and D. B. Boyd, Eds., Reviews in Computational Chemistry, Vol. 12, John Wiley & Sons, Inc., New York, 1998, pp. 327-380. Computer Simulation to Predict Possible Crystal Polymorphs.

[14] G. M. Day, Crystallography Reviews, 17, 3 (2011). Current Approaches to Predicting Molecular Organic Crystal Structures.

[15] S. L. Price, Acta Cryst., B69, 313 (2013). Why Don’t We Find More Polymorphs?

[16] J. P. M. Lommerse, W. D. S. Motherwell, H. L. Ammon, J. D. Dunitz, A. Gavezzotti, D. W. M. Hofmann, F. J. J. Leusen, W. T. M. Mooij, S. L. Price, B. Schweizer, M. U. Schmidt, B. P. van Eijck, P. Verwer, and D. E. Williams, Acta Cryst., B56, 697 (2000). A Test of Crystal Structure Prediction of Small Organic Molecules.

[17] W. D. S. Motherwell, H. L. Ammon, J. D. Dunitz, A. Dzyabchenko, P. Erk, A. Gavezzotti, D. W. M. Hofmann, F. J. J. Leusen, J. P. M. Lommerse, W. T. M. Mooij, S. L. Price, H. Scheraga, B. Schweizer, M. U. Schmidt, B. P. van Eijck, P. Verwer, and D. E. Williams, Acta Cryst., B58, 647 (2002). Crystal Structure Prediction of Small Organic Molecules: A Second Blind Test.

[18] G. M. Day, W. D. S. Motherwell, H. L. Ammon, S. X. M. Boerrigter, R. G. Della Valle, E. Venuti, A. Dzyabchenko, J. D. Dunitz, B. Schweizer, B. P. van Eijck, P. Erk, J. C. Facelli, V. E. Bazterra, M. B. Ferraro, D. W. M. Hofmann, F. J. J. Leusen, C. Liang, C. C. Pantelides, P. G. Karamertzanis, S. L. Price, T. C. Lewis, H. Nowell, A. Torrisi, H. A. Scheraga, Y. A. Arnautova, M. U. Schmidt, and P. Verwer, Acta Cryst., B61, 511 (2005). A Third Blind Test of Crystal Structure Prediction.

[19] G. M. Day, T. G. Cooper, A. J. Cruz-Cabeza, K. E. Hejczyk, H. L. Ammon, S. X. M. Boerrigter, J. S. Tan, R. G. Della Valle, E. Venuti, J. Jose, S. R. Gadre, G. R. Desiraju, T. S. Thakur, B. P. van Eijck, J. C. Facelli, V. E. Bazterra, M. B. Ferraro, D. W. M. Hofmann, M. A. Neumann, F. J. J. Leusen, J. Kendrick, S. L. Price, A. J. Misquitta, P. G. Karamertzanis, G. W. A. Welch, H. A. Scheraga, Y. A. Arnautova, M. U. Schmidt, J. van de Streek, A. K. Wolf, and B. Schweizer, Acta Cryst., B65, 107 (2009). Significant Progress in Predicting the Crystal Structures of Small Organic Molecules – A Report on the Fourth Blind Test.

[20] D. A. Bardwell, C. S. Adjiman, Y. A. Arnautova, E. Bartashevich, S. X. M. Boerrigter, D. E. Braun, A. J. Cruz-Cabeza, G. M. Day, R. G. Della Valle, G. R. Desiraju, B. P. van Eijck, J. C. Facelli, M. B. Ferraro, D. Grillo, M. Habgood, D. W. M. Hofmann, F. Hofmann, K. V. J. Jose, P. G. Karamertzanis, A. V. Kazantsev, J. Kendrick, L. N. Kuleshova, F. J. J. Leusen, A. V. Maleev, A. J. Misquitta, S. Mohamed, R. J. Needs, M. A. Neumann, D. Nikylov, A. M. Orendt, R. Pal, C. C. Pantelides, C. J. Pickard, L. S. Price, S. L. Price, H. A. Scheraga, J. van de Streek, T. S. Thakur, S. Tiwari, E. Venuti, and I. K. Zhitkov, Acta Cryst., B67, 535 (2011). Towards Crystal Structure Prediction of Complex Organic Compounds – A Report on the Fifth Blind Test.

[21] C. C. Fischer, K. J. Tibbetts, D. Morgan, and G. Ceder, Nat. Mater., 5, 641 (2006). Predicting Crystal Structure by Merging Data Mining with Quantum Mechanics.

[22] S. Curtarolo, G. L. W. Hart, M. B. Nardelli, N. Mingo, S. Sanvito, and O. Levy, Nat. Mater., 12, 191 (2013). The High-Throughput Highway to Computational Materials Design.

[23] M. J. Buerger, Z. Kristallogr., 109, 42 (1957). Reduced Cells.

[24] P. Niggli, Krystallographische und Strukturtheoretische Grundbegriffe, Akademische Verlagsgesellschaft, Leipzig, 1928.

[25] F. H. Stillinger, Phys. Rev. E, 59, 48 (1999). Exponential Multiplicity of Inherent Structures.

[26] L. T. Wille, and J. Vennik, J. Phys. A: Math. Gen., 18, L419 (1985). Computational Complexity of the Ground-State Determination of Atomic Clusters.

[27] G. W. Greenwood, Int. J. Res. Phys. Chem. Chem. Phys., 211, 105 (1999). Revisiting the Complexity of Finding Globally Minimum Energy Configurations in Atomic Clusters.

[28] D. H. Wolpert, and W. G. Macready, IEEE Trans. Evol. Comput., 1, 67 (1997). No Free Lunch Theorems for Optimization.

[29] J. P. K. Doye, and C. P. Massen, J. Chem. Phys., 122, 084105 (2005). Characterizing the Network Topology of the Energy Landscapes of Atomic Clusters.

[30] R. P. Bell, Proc. R. Soc. London. Ser. A, 154, 414 (1936). The Theory of Reactions Involving Proton Transfers.

[31] M. G. Evans, and M. Polanyi, Trans. Faraday Soc., 31, 875 (1935). Some Applications of the Transition State Method to the Calculation of Reaction Velocities, Especially in Solution.

[32] J. D. Gale, J. Chem. Soc., Faraday Trans., 93, 629 (1997). GULP - A Computer Program for the Symmetry Adapted Simulation of Solids, URL http://projects.ivec.org/gulp/

[33] J. D. Gale, Phil. Mag. B, 73, 3 (1996). Empirical Potential Derivation for Ionic Materials.

[34] J. D. Gale, and A. L. Rohl, Mol. Simul., 29, 291 (2003). The General Utility Lattice Program.

[35] S. Plimpton, J. Comput. Phys., 117, 1 (1995). Fast Parallel Algorithms for Short-Range Molecular Dynamics, URL http://lammps.sandia.gov/

[36] G. Kresse, and J. Hafner, Phys. Rev. B, 47, 558 (1993). Ab Initio Molecular Dynamics for Liquid Metals, URL http://cms.mpi.univie.ac.at/vasp/

[37] G. Kresse, and J. Hafner, Phys. Rev. B, 49, 14251 (1994). Ab Initio Molecular-Dynamics Simulation of the Liquid-Metal-Amorphous-Semiconductor Transition in Germanium.

[38] G. Kresse, and J. Furthmuller, Comput. Mat. Sci., 6, 15 (1996). Efficiency of Ab Initio Total Energy Calculations for Metals and Semiconductors Using Plane-Wave Basis Set.

[39] G. Kresse, and J. Furthmuller, Phys. Rev. B, 54, 11169 (1996). Efficient Iterative Schemes for Ab Initio Total-Energy Calculations Using Plane-Wave Basis Set.

[40] S. J. Clark, M. D. Segall, C. J. Pickard, P. J. Hasnip, M. J. Probert, K. Refson, and M. C. Payne, Z. Kristallogr., 220, 567 (2005). First Principles Methods Using CASTEP, URL http://www.castep.org/

[41] P. Giannozzi, S. Baroni, N. Bonini, M. Calandra, R. Car, C. Cavazzoni, D. Ceresoli, G. L. Chiarotti, M. Cococcioni, I. Dabo, A. D. Corso, S. de Gironcoli, S. Fabris, G. Fratesi, R. Gebauer, U. Gerstmann, C. Gougoussis, A. Kokalj, M. Lazzeri, L. Martin-Samos, N. Marzari, F. Mauri, R. Mazzarello, S. Paolini, A. Pasquarello, L. Paulatto, C. Sbraccia, S. Scandolo, G. Sclauzero, A. P. Seitsonen, A. Smogunov, P. Umari, and R. M. Wentzcovitch, J. Phys.: Condens. Matter, 21, 395502 (2009). QUANTUM ESPRESSO: A Modular and Open-Source Software Project for Quantum Simulations of Materials, URL http://www.quantum-espresso.org/

[42] X. Gonze, B. Amadon, P.-M. Anglade, J.-M. Beuken, F. Bottin, P. Boulanger, F. Bruneval, D. Caliste, R. Caracas, M. Cote, T. Deutsch, L. Genovese, P. Ghosez, M. Giantomassi, S. Goedecker, D. Hamann, P. Hermet, F. Jollet, G. Jomard, S. Leroux, M. Mancini, S. Mazevet, M. Oliveira, G. Onida, Y. Pouillon, T. Rangel, G.-M. Rignanese, D. Sangalli, R. Shaltaf, M. Torrent, M. Verstraete, G. Zerah, and J. Zwanziger, Comput. Phys. Commun., 180, 2582 (2009). ABINIT: First-Principles Approach to Material and Nanosystem Properties, URL http://www.abinit.org/

[43] R. Dovesi, R. Orlando, B. Civalleri, C. Roetti, V. R. Saunders, and C. M. Zicovich-Wilson, Z. Kristallogr., 220, 571 (2005). Crystal: A Computational Tool for the Ab Initio Study of the Electronic Properties of Crystals, URL http://www.crystal.unito.it/

[44] J. M. Soler, E. Artacho, J. D. Gale, A. Garcia, J. Junquera, P. Ordejon, and D. Sanchez-Portal, J. Phys.: Condens. Matter, 14, 2745 (2002). The SIESTA Method for Ab Initio Order-N Material Simulation, URL http://departments.icmab.es/leem/siesta/

[45] G. te Velde, and E. J. Baerends, Phys. Rev. B, 44, 7888 (1991). Precise Density-Functional Method for Periodic Structures.

[46] G. Wiesenekker, and E. J. Baerends, J. Phys.: Condens. Matter, 3, 6721 (1991). Quadratic Integration Over the Three-Dimensional Brillouin Zone.

[47] P. H. T. Philipsen, G. te Velde, E. J. Baerends, J. A. Berger, P. L. Boeij, M. Franchini, J. A. Groeneveld, E. S. Kadantsev, R. Klooster, F. Kootstra, P. Romaniello, D. G. Skachkov, and G. Snidjder, BAND2013, SCM, Theoretical Chemistry, Vrije Universiteit, Amsterdam, The Netherlands, URL http://www.scm.com/BAND PeriodicDFT/

[48] M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone, B. Mennucci, G. A. Petersson, H. Nakatsuji, M. Caricato, X. Li, H. P. Hratchian, A. F. Izmaylov, J. Bloino, G. Zheng, J. L. Sonnenberg, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven, J. A. Montgomery, Jr., J. E. Peralta, F. Ogliaro, M. Bearpark, J. J. Heyd, E. Brothers, K. N. Kudin, V. N. Staroverov, R. Kobayashi, J. Normand, K. Raghavachari, A. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi, N. Rega, J. M. Millam, M. Klene, J. E. Knox, J. B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R. E. Stratmann, O. Yazyev, A. J. Austin, R. Cammi, C. Pomelli, J. W. Ochterski, R. L. Martin, K. Morokuma, V. G. Zakrzewski, G. A. Voth, P. Salvador, J. J. Dannenberg, S. Dapprich, A. D. Daniels, O. Farkas, J. B. Foresman, J. V. Ortiz, J. Cioslowski, and D. J. Fox, Gaussian 09, Revision D.01, Gaussian Inc., Wallingford CT, 2009, URL http://www.gaussian.com/

[49] W. W. Tipton, and R. Hennig, J. Phys.: Condens. Matter, 25, 495401 (2013). A Grand Canonical Genetic Algorithm for the Prediction of Multi-Component Phase Diagrams and Testing of Empirical Potentials.

[50] S. Q. Wu, M. Ji, C. Z. Wang, M. C. Nguyen, X. Zhao, K. Umemoto, R. M. Wentzcovitch, and K. M. Ho, J. Phys.: Condens. Matter, 26, 035402 (2014). An Adaptive Genetic Algorithm for Crystal Structure Prediction.

[51] P. Baettig, C. F. Schelle, R. LeSar, U. V. Waghmare, and N. A. Spaldin, Chem. Mater., 17, 1376 (2005). Theoretical Prediction of New High-Performance Lead-Free Piezoelectrics.

[52] P. Ghosez, E. Cockayne, U. V. Waghmare, and K. M. Rabe, Phys. Rev. B, 60, 836 (1999). Lattice Dynamics of BaTiO3, PbTiO3, and PbZrO3: A Comparative First-Principles Study.

[53] K. Parlinski, and Y. Kawazoe, Eur. Phys. J. B, 16, 49 (2000). Ab Initio Study of Phonons and Structural Stabilities of the Perovskite-Type MgSiO3.

[54] A. D. Fortes, E. Suard, M. H. Lemee-Cailleau, C. J. Pickard, and R. J. Needs, J. Am. Chem. Soc., 131, 13508 (2009). Crystal Structure of Ammonia Monohydrate Phase II.

[55] C. J. Pickard, and R. J. Needs, Nat. Phys., 3, 473 (2007). Structure of Phase III of Solid Hydrogen.

[56] J. M. McMahon, and D. M. Ceperley, Phys. Rev. Lett., 106, 165302 (2011). Ground-State Structures of Atomic Metallic Hydrogen.

[57] Y. Yao, J. S. Tse, and D. D. Klug, Phys. Rev. Lett., 102, 115503 (2009). Structures of Insulating Phases of Dense Lithium.

[58] M. Marques, M. I. McMahon, E. Gregoryanz, M. Hanfland, C. L. Guillaume, C. J. Pickard, G. J. Ackland, and R. J. Nelmes, Phys. Rev. Lett., 106, 095502 (2011). Crystal Structures of Dense Lithium: A Metal–Semiconductor-Metal Transition.

[59] J. Feng, R. G. Hennig, N. W. Ashcroft, and R. Hoffmann, Nature, 451, 445 (2008). Emergent Reduction of Electronic State Dimensionality in Dense Ordered Li-Be Alloys.

[60] Y. Yao, and D. D. Klug, Proc. Natl. Acad. Sci. USA, 107, 20893 (2010). Silane Plus Molecular Hydrogen as a Possible Pathway to Metallic Hydrogen.

[61] S. Zhang, H. Wilson, K. Driver, and B. Militzer, Phys. Rev. B, 87, 024112 (2013). H4O and Other Hydrogen-Oxygen Compounds at Giant-Planet Core Pressures.

[62] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, Science, 220, 671 (1983). Optimization by Simulated Annealing.

[63] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys., 21, 1087 (1953). Equation of State Calculations by Fast Computing Machines.

[64] K. Doll, J. C. Schon, and M. Jansen, Phys. Chem. Chem. Phys., 9, 6128 (2007). Global Exploration of the Energy Landscape of Solids on the Ab Initio Level.

[65] D. Zagorac, K. Doll, J. C. Schon, and M. Jansen, Phys. Rev. B, 84, 045206 (2011). Ab Initio Structure Prediction for Lead Sulfide at Standard and Elevated Pressures.

[66] A. Kulkarni, K. Doll, D. L. V. K. Prasad, J. C. Schon, and M. Jansen, Phys. Rev. B, 84, 172101 (2011). Alternative Structure Prediction for Lithium at Ambient Pressure.

[67] A. Kulkarni, K. Doll, J. C. Schon, and M. Jansen, J. Phys. Chem. B, 114, 15573 (2010). Global Exploration of the Enthalpy Landscape of Calcium Carbide.

[68] J. P. K. Doye, and D. J. Wales, J. Phys. Chem. A, 101, 5111 (1997). Global Optimization by Basin-Hopping and the Lowest Energy Structures of Lennard-Jones Clusters Containing up to 110 Atoms.

[69] D. J. Wales, and H. A. Scheraga, Science, 285, 1368 (1999). Global Optimization of Clusters, Crystals, and Biomolecules.

[70] A. Costales, M. A. Blanco, E. Francisco, R. Pandey, and A. M. Pendas, J. Phys. Chem. B, 109, 24352 (2005). Evolution of the Properties of AlnNn Clusters with Size.

[71] D. J. Harding, C. Kerpal, G. Meijer, and A. Fielicke, J. Phys. Chem. Lett., 4, 892 (2013). Unusual Bonding in Platinum Carbido Clusters.

[72] S. M. Woodley, J. Phys. Chem. C, 117, 24003 (2013). Knowledge Led Master Code Search for Atomic and Electronic Structures of LaF3 Nanoclusters on Hybrid Rigid Ion-Shell Model-DFT Landscapes.

[73] Y. Gao, N. Shao, R. Zhou, G. Zhang, and X. C. Zeng, J. Phys. Chem. Lett., 3, 2264 (2012). [CTi7^2+]: Heptacoordinate Carbon Motif?

[74] S. Goedecker, J. Chem. Phys., 120, 9911 (2004). Minima Hopping: An Efficient Search Method for the Global Minimum of the Potential Energy Surface of Complex Molecular Systems.

[75] J. A. Flores-Livas, M. Amsler, T. J. Lenosky, L. Lehtovaara, S. Botti, M. A. L. Marques, and S. Goedecker, Phys. Rev. Lett., 108, 117004 (2012). High-Pressure Structures of Disilane and Their Superconducting Properties.

[76] M. Amsler, J. A. Flores-Livas, L. Lehtovaara, F. Balima, S. A. Ghasemi, D. Machon, S. Pailhes, A. Willand, D. Caliste, S. Botti, A. San Miguel, S. Goedecker, and M. A. L. Marques, Phys. Rev. Lett., 108, 065501 (2012). Crystal Structure of Cold Compressed Graphite.

[77] S. Botti, M. Amsler, J. A. Flores-Livas, P. Ceria, S. Goedecker, and M. A. L. Marques, Phys. Rev. B, 88, 014102 (2013). Carbon Structures and Defect Planes in Diamond at High Pressure.

[78] T. D. Huan, M. Amsler, A. Willand, and S. Goedecker, Phys. Rev. B, 86, 224110 (2012). Low-Energy Structures of Zinc Borohydride Zn(BH4)2.

[79] T. D. Huan, M. Amsler, R. Sabatini, V. N. Tuoc, N. B. Le, L. M. Woods, N. Marzari, and S. Goedecker, Phys. Rev. B, 88, 024108 (2013). Thermodynamic Stability of Alkali-Metal-Zinc Double-Cation Borohydrides at Low Temperatures.

[80] A. Laio, and M. Parrinello, Proc. Natl. Acad. Sci. USA, 99, 12562 (2002). Escaping Free-Energy Minima.

[81] P. Raiteri, R. Martonak, and M. Parrinello, Angew. Chem. Int. Ed., 44, 3769 (2005). Exploring Polymorphism: The Case of Benzene.

[82] D. Selli, I. A. Baburin, R. Martonak, and S. Leoni, Sci. Rep., 3, 1466 (2013). Novel Metastable Metallic and Semiconducting Germaniums.

[83] J. Sun, D. D. Klug, R. Martonak, J. A. Montoya, M. S. Lee, S. Scandolo, and E. Tosatti, Proc. Natl. Acad. Sci. USA, 106, 6077 (2009). High-Pressure Polymeric Phases of Carbon Dioxide.

[84] R. Martonak, D. Donadio, A. R. Oganov, and M. Parrinello, Nat. Mater., 5, 623 (2006). Crystal Structure Transformations in SiO2 from Classical and Ab Initio Metadynamics.

[85] J. Kennedy, and R. Eberhart, Proceedings of IEEE International Conference on Neural Networks IV, 1942 (1995). Particle Swarm Optimization.

[86] Y. Wang, J. Lv, L. Zhu, and Y. Ma, Phys. Rev. B, 82, 094116 (2010). Crystal Structure Prediction via Particle–Swarm Optimization.

[87] J. Lv, Y. Wang, L. Zhu, and Y. Ma, J. Chem. Phys., 137, 084104 (2012). Particle-Swarm Structure Prediction on Clusters.

[88] Y. Wang, M. Miao, J. Lv, L. Zhu, K. Yin, H. Liu, and Y. Ma, J. Chem. Phys., 137, 224108 (2012). An Effective Structure Prediction Method for Layered Materials Based on 2D Particle Swarm Optimization Algorithm.

[89] Y. Chen, X. Xi, W. L. Yim, F. Peng, Y. Wang, H. Wang, Y. Ma, G. Liu, C. Sun, C. Ma, Z. Chen, and H. Berger, J. Phys. Chem. C, 117, 25677 (2013). High-Pressure Phase Transitions and Structures of Topological Insulator BiTeI.

[90] Q. Li, D. Zhou, W. Zheng, Y. Ma, and C. Chen, Phys. Rev. Lett., 110, 136403 (2013). Global Structural Optimization of Tungsten Borides.

[91] X. Luo, J. Yang, H. Liu, X. Wu, Y. Wang, Y. Ma, S. H. Wei, X. Gong, and H. Xiang, J. Am. Chem. Soc., 133, 16285 (2011). Predicting Two-Dimensional Boron-Carbon Compounds by the Global Optimization Method.

[92] H. Wang, J. S. Tse, K. Tanaka, T. Iitaka, and Y. Ma, Proc. Natl. Acad. Sci. USA, 109, 6463 (2012). Superconductive Sodalite-Like Clathrate Calcium Hydride at High Pressures.

[93] M. S. Miao, Nat. Chem., 5, 846 (2013). Caesium in High Oxidation States and as a p-Block Element.

[94] H. Cartwright, in K. B. Lipkowitz, and T. R. Cundari, Eds., Reviews in Computational Chemistry, Vol. 25, John Wiley & Sons, Inc., New York, 2007, pp. 349-389. Development and Uses of Artificial Intelligence in Chemistry.

[95] D. Ashlock, Evolutionary Computation for Modeling and Optimization, Springer, New York, 2006.

[96] W. Paszkowicz, Mater. Manuf. Process., 24, 174 (2009). Genetic Algorithms, A Nature-Inspired Tool: Survey of Applications in Materials Science and Related Fields.

[97] W. Paszkowicz, Mater. Manuf. Process., 28, 708 (2013). Genetic Algorithms, A Nature-Inspired Tool: Survey of Applications in Materials Science and Related Fields: Part II.

[98] A. R. Oganov, A. O. Lyakhov, and M. Valle, Acc. Chem. Res., 44, 227 (2011). How Evolutionary Crystal Structure Prediction Works – and Why.

[99] T. S. Bush, C. R. A. Catlow, and P. D. Battle, J. Mater. Chem., 5, 1269 (1995). Evolutionary Programming Techniques for Predicting Inorganic Crystal Structures.

[100] Y. Zeiri, Phys. Rev. E, 51, R2769 (1995). Prediction of the Lowest Energy Structure of Clusters Using a Genetic Algorithm.

[101] D. M. Deaven, and K. M. Ho, Phys. Rev. Lett., 75, 288 (1995). Molecular Geometry Optimization with a Genetic Algorithm.

[102] C. W. Glass, A. R. Oganov, and N. Hansen, Comput. Phys. Commun., 175, 713 (2006). USPEX – Evolutionary Crystal Structure Prediction.

[103] A. R. Oganov, and C. W. Glass, J. Chem. Phys., 124, 244704 (2006). Crystal Structure Prediction Using Ab Initio Evolutionary Techniques: Principles and Applications.

[104] A. R. Oganov, C. W. Glass, and S. Ono, Earth Planet. Sci. Lett., 241, 95 (2006). High-Pressure Phases of CaCO3: Crystal Structure Prediction and Experiment.

[105] Q. Zhu, A. R. Oganov, and A. O. Lyakhov, CrystEngComm, 14, 3596 (2012). Evolutionary Metadynamics: A Novel Method to Predict Crystal Structures.

[106] Q. Zhu, Q. Zeng, and A. R. Oganov, Phys. Rev. B, 85, 201407 (2012). Systematic Search for Low-Enthalpy sp3 Carbon Allotropes Using Evolutionary Metadynamics.

[107] Z. L. Liu, Comput. Phys. Commun., 185, 1893 (2014). MUSE: Multi-Algorithm Collaborative Crystal Structure Prediction.

[108] B. Meredig, and C. Wolverton, Nat. Mater., 12, 123 (2013). A Hybrid Computational-Experimental Approach for Automated Crystal Structure Prediction.

[109] D. C. Lonie, and E. Zurek, Comput. Phys. Commun., 182, 372 (2011). XtalOpt: An Open-Source Evolutionary Algorithm for Crystal Structure Prediction.

[110] D. C. Lonie, and E. Zurek, Comput. Phys. Commun., 182, 2305 (2011). New Version Announcement: XtalOpt Version r7: An Open-Source Evolutionary Algorithm for Crystal Structure Prediction.

[111] A. N. Kolmogorov, S. Shah, E. R. Margine, A. F. Bialon, T. Hammerschmidt, and R. Drautz, Phys. Rev. Lett., 105, 217003 (2010). New Superconducting and Semiconducting Fe-B Compounds Predicted with an Ab Initio Evolutionary Search.

[112] S. Bahmann, and J. Kortus, Comput. Phys. Commun., 184, 1618 (2013). EVO – Evolutionary Algorithm for Crystal Structure Prediction.

[113] W. W. Tipton, C. R. Bealing, K. Mathew, and R. Hennig, Phys. Rev. B, 87, 184114 (2013). Structures, Phase Stabilities, and Electrical Potentials of Li-Si Battery Anode Materials.

[114] G. Trimarchi, and A. Zunger, Phys. Rev. B, 75, 104113 (2007). Global Space-Group Optimization Problem: Finding the Stablest Crystal Structure Without Constraints.

[115] M. d’Avezac, and A. Zunger, Phys. Rev. B, 78, 064102 (2008). Identifying the Minimum–Energy Atomic Configuration on a Lattice: Lamarckian Twist on Darwinian Evolution.

[116] N. L. Abraham, and M. I. J. Probert, Phys. Rev. B, 73, 224104 (2006). A Periodic Genetic Algorithm with Real-Space Representation for Crystal Structure and Polymorph Prediction.

[117] A. Fadda, and G. Fadda, Phys. Rev. B, 82, 104105 (2010). An Evolutionary Algorithm for the Prediction of Crystal Structures.

[118] B. Bandow, and B. Hartke, J. Phys. Chem. A, 110, 5809 (2006). Larger Water Clusters With Edges and Corners on Their Way to Ice: Structural Trends Elucidated With an Improved Parallel Evolutionary Algorithm.

[119] A. O. Lyakhov, and A. R. Oganov, Phys. Rev. B, 84, 092103 (2011). Evolutionary Search for Superhard Materials: Methodology and Applications to Forms of Carbon and TiO2.

[120] X. Zhang, Y. Wang, J. Lv, C. Zhu, Q. Li, M. Zhang, Q. Li, and Y. Ma, J. Chem. Phys., 138, 114101 (2013). First-principles Structural Design of Superhard Materials.

[121] N. M. O’Boyle, C. M. Campbell, and G. R. Hutchison, J. Phys. Chem. C, 115, 16200 (2011). Computational Design and Selection of Optimal Organic Photovoltaic Materials.

[122] A. O. Lyakhov, A. R. Oganov, and M. Valle, Comput. Phys. Commun., 181, 1623 (2010). How to Predict Very Large and Complex Crystal Structures.

[123] Q. Zhu, A. R. Oganov, C. W. Glass, and H. T. Stokes, Acta Cryst., B68, 215 (2012). Constrained Evolutionary Algorithm for Structure Prediction of Molecular Crystals: Methodology and Applications.

[124] D. C. Lonie, and E. Zurek, Comput. Phys. Commun., 183, 690 (2012). Identifying Duplicate Crystal Structures: XtalComp, an Open–Source Solution.

[125] Z. H. Li, A. W. Jasper, and D. G. Truhlar, J. Am. Chem. Soc., 129, 14899 (2007). Structures, Rugged Energetic Landscapes, and Nanothermodynamics of Aln (2 ≤ n ≤ 65) Particles.

[126] P. J. Steinhardt, D. R. Nelson, and M. Ronchetti, Phys. Rev. B, 28, 784 (1983). Bond-Orientational Order in Liquids and Glasses.

[127] Y. Wang, J. Lv, L. Zhu, and Y. Ma, Comput. Phys. Commun., 183, 2063 (2012). CALYPSO: A Method for Crystal Structure Prediction.

[128] M. Valle, and A. R. Oganov, Acta Cryst., A66, 507 (2010). Crystal Fingerprint Space – A Novel Paradigm for Studying Crystal-Structure Sets.

[129] XtalOpt, URL http://xtalopt.openmolecules.net/.

[130] M. D. Hanwell, D. E. Curtis, D. Lonie, T. Vandermeersch, E. Zurek, and G. R. Hutchison, J. Cheminf., 4, 1 (2012). Avogadro: An Advanced Semantic Chemical Editor, Visualization, and Analysis Platform.

[131] Avogadro, URL http://avogadro.openmolecules.net/.

[132] N. M. O’Boyle, M. Banck, C. A. James, C. Morley, T. Vandermeersch, and G. R. Hutchison, J. Cheminf., 3, 33 (2011). Open Babel: An Open Chemical Toolbox.

[133] Open Babel, URL http://openbabel.org.

[134] Spglib, URL http://spglib.sourceforge.net/.

[135] XtalComp, URL http://xtalopt.openmolecules.net/xtalcomp/xtalcomp.html.

[136] GNU Public License, URL http://www.gnu.org/licenses/gpl.html.

[137] A. Hermann, B. L. Ivanov, N. W. Ashcroft, and R. Hoffmann, Phys. Rev. B, 86, 014104 (2012). LiBeB: A Predicted Phase with Structural and Electronic Peculiarities.

[138] G. Trimarchi, A. J. Freeman, and A. Zunger, Phys. Rev. B, 80, 092101 (2009). Predicting Stable Stoichiometries of Compounds via Evolutionary Global Space-Group Optimization.

[139] S. M. Woodley, and C. R. A. Catlow, Comput. Mater. Sci., 45, 84 (2009). Structure Prediction of Titania Phases: Implementation of Darwinian versus Lamarckian Concepts in an Evolutionary Algorithm.

[140] H. B. Werner, and D. Kassner, Acta Cryst., B48, 356 (1992). The Perils of Cc: Comparing the Frequencies of Falsely Assigned Space Groups with their General Population.

[141] J. Hooper, A. G. Hu, F. Zhang, and T. K. Woo, Phys. Rev. B, 80, 104117 (2009). Genetic Algorithm and First Principles DFT Study of the High-Pressure Molecular Zeta Phase of Nitrogen.

[142] R. J. Nelmes, M. I. McMahon, J. S. Loveday, and S. Rekhi, Phys. Rev. Lett., 88, 155503 (2002). Structure of Rb-III: Novel Modulated Stacking Structures in Alkali Metals.

[143] M. I. McMahon, R. J. Nelmes, and S. Rekhi, Phys. Rev. Lett., 87, 255502 (2001). Complex Crystal Structure of Cesium-III.

[144] I. Loa, R. J. Nelmes, L. F. Lundegaard, and M. I. McMahon, Nat. Mater., 11, 627 (2012). Extraordinarily Complex Crystal Structure with Mesoscopic Patterning in Barium at High Pressure.

[145] ISOTROPY, URL http://stokes.byu.edu/isotropy.html.

[146] AFLOWLIB, URL http://aflowlib.org/.

[147] W. Grochala, R. Hoffmann, J. Feng, and N. W. Ashcroft, Angew. Chem. Int. Ed., 46, 3620 (2007). The Chemical Imagination at Work in Very Tight Places.

[148] R. J. Hemley, Physics World, 8, 26 (2006). A Pressing Matter.

[149] E. Zurek, R. Hoffmann, N. W. Ashcroft, A. R. Oganov, and A. O. Lyakhov, Proc. Natl. Acad. Sci. USA, 106, 17640 (2009). A Little Bit of Lithium Does a Lot for Hydrogen.

[150] P. Baettig, and E. Zurek, Phys. Rev. Lett., 106, 237002 (2011). Pressure-Stabilized Sodium Polyhydrides, NaHn (n > 1).

[151] J. Hooper, and E. Zurek, Chem. Eur. J., 18, 5013 (2012). Rubidium Polyhydrides Under Pressure: Emergence of the Linear H3− Species.

[152] J. Hooper, and E. Zurek, J. Phys. Chem. C, 116, 13322 (2012). High Pressure Potassium Polyhydrides: A Chemical Perspective.

[153] A. Shamp, J. Hooper, and E. Zurek, Inorg. Chem., 51, 9333 (2012). Compressed Cesium Polyhydrides: Cs+ Sublattices and H3− Three-Connected Nets.

[154] D. Lonie, J. Hooper, B. Altintas, and E. Zurek, Phys. Rev. B, 87, 054107 (2013). Metallization of Magnesium Polyhydrides Under Pressure.

[155] J. Hooper, B. Altintas, A. Shamp, and E. Zurek, J. Phys. Chem. C, 117, 2982 (2013). Polyhydrides of the Alkaline Earth Metals: A Look at the Extremes Under Pressure.

[156] J. Hooper, T. Terpstra, A. Shamp, and E. Zurek, J. Phys. Chem. C, 118, 6433 (2014). The Composition and Constitution of Compressed Strontium Polyhydrides.

[157] N. W. Ashcroft, Phys. Rev. Lett., 21, 1748 (1968). Metallic Hydrogen: A High-Temperature Superconductor?

[158] N. W. Ashcroft, Phys. Rev. Lett., 92, 187002 (2004). Hydrogen Dominant Metallic Alloys: High Temperature Superconductors?

[159] Y. Xie, Q. Li, A. R. Oganov, and H. Wang, Acta Cryst., C70, 104 (2014). Superconductivity of Lithium-Doped Hydrogen Under High Pressure.

[160] J. Hooper, P. Baettig, and E. Zurek, J. Appl. Phys., 111, 112611 (2012). Pressure Induced Structural Transitions in KH, RbH and CsH.

[161] J. Hooper, and E. Zurek, ChemPlusChem, 77, 969 (2012). Lithium Subhydrides Under Pressure and their Superatom–Like Building Blocks.

[162] A. Hermann, N. W. Ashcroft, and R. Hoffmann, Proc. Natl. Acad. Sci. USA, 109, 745 (2012). High Pressure Ices.

[163] M. Derzsi, A. Hermann, R. Hoffmann, and W. Grochala, Eur. J. Inorg. Chem., 5094 (2013). The Close Relationship Between the Crystal Structures of MO and MSO4 (M = Group 10, 11, or 12 Metal), and the Predicted Structures of AuO and PtSO4.

[164] A. Hermann, A. McSorley, N. W. Ashcroft, and R. Hoffmann, J. Am. Chem. Soc., 134, 18606 (2012). From Wade-Mingos to Zintl-Klemm at 100 GPa: Binary Compounds of Boron and Lithium.

[165] V. Labet, R. Hoffmann, and N. W. Ashcroft, New J. Chem., 35, 2349 (2011). Molecular Models for WH6 Under Pressure.

[166] P. Zaleski-Ejgierd, V. Labet, T. A. Strobel, R. Hoffmann, and N. W. Ashcroft, J. Phys.: Condens. Matter, 24, 155701 (2012). WHn Under Pressure.

[167] D. L. V. K. Prasad, N. W. Ashcroft, and R. Hoffmann, J. Phys. Chem. A, 116, 10027 (2012). Lithium Amide (LiNH2) Under Pressure.

[168] D. L. V. K. Prasad, N. W. Ashcroft, and R. Hoffmann, J. Phys. Chem. C, 117, 20838 (2013). Evolving Structural Diversity and Metallicity in Compressed Lithium Azide.

[169] X. D. Wen, R. Hoffmann, and N. W. Ashcroft, J. Am. Chem. Soc., 133, 9023 (2011). Benzene Under High Pressure: A Story of Molecular Crystals Transforming to Saturated Networks, with a Possible Intermediate Metallic Phase.

[170] K. AlKaabi, D. L. V. K. Prasad, P. Kroll, N. W. Ashcroft, and R. Hoffmann, J. Am. Chem. Soc., 136, 3410 (2014). Silicon Monoxide at 1 atm and Elevated Pressures: Crystalline or Amorphous?

[171] A. Hermann, R. Hoffmann, and N. W. Ashcroft, Phys. Rev. Lett., 111, 116404 (2013). Condensed Astatine: Monatomic and Metallic.

[172] A. Hermann, A. Suarez-Alcubilla, I. G. Gurtubay, L. M. Yang, A. Bergara, N. W. Ashcroft, and R. Hoffmann, Phys. Rev. B, 86, 144110 (2012). LiB and its Boron-Deficient Variants Under Pressure.

[173] A. Hermann, N. W. Ashcroft, and R. Hoffmann, Inorg. Chem., 51, 9066 (2012). Making Sense of Boron-Rich Binary Be-B Phases.

[174] A. Hermann, N. W. Ashcroft, and R. Hoffmann, Chem. Eur. J., 19, 4184 (2013). Binary Compounds of Boron and Beryllium: A Rich Structural Arena with Space for Predictions.

List of Figures

1 (a) The primitive unit cell for this two-dimensional lattice is a parallelogram and is denoted by the vectors p1 and p2. The area of the conventional cell, denoted by the cell vectors c1 and c2, is twice as large as that of the primitive cell, but it illustrates the rectangular symmetry of the lattice more clearly. A second structure described with two different unit cells is shown in (b) and (c). The unit cells used to describe it differ, with the former being a square, and the latter a parallelogram. The cell illustrated in (b) corresponds to both the Buerger cell and the Niggli cell. Note that different shades of gray are used to illustrate the two types of atoms (color available in e-books).

2 A schematic of a 1D PES illustrating how the energy of the system varies as a function of one of the degrees of freedom. It contains a large number of local minima (LM) which are separated by barriers; only one local minimum is pointed out. The global minimum (GM) is the lowest energy point. A basin contains all of the configurations that will optimize to the same local minimum. To better illustrate one of the basins we have shaded it in gray. This PES contains two funnels, one of which is explicitly denoted. The transition state (TS) between these two funnels is labeled.

3 (a) A 1D chain of hydrogens where the distance between all of the atoms, a, is the same. This configuration is unstable with respect to a dimerized system (c) with one short (a′) and one long (a′′) H-H distance. (b) A schematic illustration of the phonon mode that transforms the 1D chain in (a) to the arrangement of H2 molecules in (c).

4 (a) A diagram illustrating the generation of random structures in a 1D PES, followed by optimization to the nearest local minimum (bold dashed arrow). (b) An illustration of a ‘sensible’ and a not so ‘sensible’ unit cell. In the latter, the volume is too big, the atomic radii overlap, and the atom-types are not distributed homogeneously throughout the structure.

5 An illustration of the simulated annealing method in a 1D PES. New structures are generated by random displacement of atoms, or modifications of the unit cell. (a) At high temperatures, configurations whose energies are larger than that of the initial one may be accepted, whereas (b) at lower temperatures the only moves that can be made are those which lower the energy of the system. A quench run (at T = 0 K) is followed by optimization to the nearest local minimum (bold dashed arrow).

6 A diagram illustrating the basin hopping and minima hopping techniques in a 1D PES. (a) Basin hopping traverses the PES via a Monte-Carlo procedure. The optimization of each structure to the nearest local minimum (bold dashed arrow) results in the transformation of the curvy PES into the stepwise PES. (b) In minima hopping, molecular dynamics is employed to explore the PES. Structures are accepted or rejected based upon their energies relative to their predecessor. The difference between these two energies, Ediff, is constantly adjusted so that half of the new configurations are accepted.


7 A diagram illustrating how the metadynamics method traverses a 1D PES. The potential is lifted in areas the algorithm has already explored, thereby enabling the method to overcome barriers between neighboring basins. The global minimum in (a) will be found more quickly than in (b), because the barriers separating it from the current location in the PES are smaller.

8 (a) A diagram depicting how new structures are generated within the PSO algorithm in a 1D PES via Eq. 6 and Eq. 7. Some of the arrows correspond to velocity vectors, and the dots to the positions of particles within the PES. (b) A chart illustrating the workflow in the PSO technique as implemented in the CALYPSO code [5, 86].

9 (a) A diagram depicting how new structures are generated within an EA or a GA in a 1D PES. Optimization to the nearest local minimum is represented by the bold dashed line. (b) A schematic illustrating how structural data is encoded onto a string in a GA. Generic breeding and mutation operations are also shown. Note that breeding can in principle result in two different children. Examples of evolutionary operators that act on a structure in real space are provided in Fig. 11.

10 (a) A workflow for a traditional EA that uses a generation-based pool. (b) Modification of the workflow illustrated in (a) for an EA that employs a population-based pool. Dashed arrows indicate where the workflow merges with the traditional EA. Only the structures with the lowest enthalpies are kept in the pool used for procreation, and the rest are discarded. The pool size is specified by the user.

11 A schematic of evolutionary operators that act in real space on a 2D lattice. Breeding is a two parent operation which combines a slice of each parent into a single unit cell. In principle two different children can be made in this way, but in practice typically only one is kept. Permutation, strain and ripple are examples of mutations of a single parent.

12 Screen shots of the ‘Plot’ tab in the XTALOPT GUI. These can be used to visualize and analyze the results of the run in real time, or after the search has completed. Clicking on an entry brings up the structure for visualization in the AVOGADRO main window. The right mouse button can be used to kill unstable structures, or to inject promising seed lattices into the run.

13 Same as Fig. 12, but for the ‘Progress’ tab.

14 A plot illustrating the enthalpies of formation, ∆HF, per atom for the formation of the binary phase AxBy according to the reaction xA + yB → AxBy. The fraction of element B in the binary is given on the x-axis. If a tie-line is used to connect ∆HF(A) and ∆HF(B), and ∆HF(C) falls below it, as shown in the inset, then the formation of C from A and B is thermodynamically preferred. Based on thermodynamics alone, phase D is expected to decompose into A and B. The dashed line in the main plot represents the convex hull, that is, the set of all tie-lines below which no other phases lie. All of the phases whose ∆HF lie on the convex hull are thermodynamically stable (S) with respect to decomposition into other phases. Structures whose ∆HF do not fall on the hull may be metastable (MS), provided their phonon modes are real, and the barriers towards their decomposition are high.


15 The ‘Search Settings’ tab, which may be employed to adjust the parameters used in an EA run performed with XTALOPT.

16 One hundred EA searches interfaced with GULP were carried out on a 48 atom TiO2 supercell. The lowest enthalpies obtained from the ‘best search’ and the ‘worst search’ are shown as solid and dashed black lines, respectively. The gray line provides the average enthalpy of the best structure from each search. Data taken from Reference 109.

17 Supercells of alkali metal polyhydrides predicted to be particularly stable over a wide pressure range. LiH6 consists of Li+ and H2^(−1/3) (white). NaH9 is comprised of H− (small dark spheres) and H2 (white). KH5 and RbH5 contain H2 (white) and H3− (small dark spheres). A number of nearly isoenthalpic CsH3 phases with H3− (white), and Cs+ (small dark spheres) were found. The pressures at which the first polyhydride phase, MHn with n > 1, becomes stable with respect to decomposition into MH and H2 are also provided. In some cases these phases were not the same as the polyhydrides that were stable throughout the largest pressure range.

18 Structures predicted by Hermann using the XTALOPT EA. (From left to right) A phase of H2O ice that is metallic and stable above 4800 GPa [162] (oxygen atoms are large and dark). AuO, the only experimentally unknown late transition metal monoxide [163] (oxygen atoms are small and dark). The LiB4 phase with the BaAl4 structure type [164], and an LiBeB phase metastable between 15 and 70 GPa [137] (boron atoms are small, beryllium atoms in LiBeB are medium-sized, and lithium atoms are large).
