Numerical aspects of DFT - University of Oulu
Basis sets
● Expand wave functions in a set of basis functions
● Turns the (continuous) Kohn-Sham equations into discrete matrix equations
● Criteria for a good basis set
― Accuracy
― Numerical efficiency
Real-space grids
● Wave functions, electron densities, and potentials are represented on grids.
● Single parameter: grid spacing h
● Accuracy of the calculation can be improved systematically by decreasing the grid spacing
● Can work only with smooth wave functions
― Pseudopotential or PAW approximation is needed
Boundary conditions
● Real-space description allows flexible boundary conditions
● Zero boundary conditions (finite systems)
― potential, charge density and wave functions are zero at the cell boundaries
― possible to treat charged systems
● Periodic boundary conditions (bulk systems)
― potential and charge are periodic
― wave functions obey Bloch boundary conditions (k-points)
● Boundary conditions can be mixed
― periodic in one dimension (wires)
― periodic in two dimensions (surfaces)
Finite-differences
● Both the Poisson equation and the kinetic energy contain the Laplacian, which can be approximated with finite differences
● Accuracy depends on the order of the stencil N
● Sparse matrix, storage not needed
● Cost of applying the operator to a wave function is proportional to the number of grid points
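As a minimal sketch of the finite-difference idea, the snippet below applies the simplest (second-order, three-point) stencil to sin(x) on a one-dimensional grid and compares the result with the exact second derivative. The function names and the test function are illustrative assumptions, not taken from any DFT code; real-space codes use higher-order stencils in three dimensions.

```python
import math


def laplacian_1d(f, h):
    """Three-point central-difference second derivative on interior points."""
    lap = [0.0] * len(f)
    for i in range(1, len(f) - 1):
        lap[i] = (f[i - 1] - 2.0 * f[i] + f[i + 1]) / h ** 2
    return lap


def max_error(n_points):
    """Return (h, largest stencil error) for f(x) = sin(x) on [0, pi]."""
    h = math.pi / (n_points - 1)
    f = [math.sin(i * h) for i in range(n_points)]
    lap = laplacian_1d(f, h)
    # The exact second derivative of sin(x) is -sin(x), so lap + f is the error.
    return h, max(abs(lap[i] + f[i]) for i in range(1, n_points - 1))


if __name__ == "__main__":
    for n in (17, 33, 65):
        print(max_error(n))
```

Each application touches every grid point a fixed number of times, so the cost grows linearly with the number of grid points, and no matrix is ever stored. Halving h should reduce the error of this particular stencil by roughly a factor of four.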
Hamiltonian in real-space
● Denoting grid points by G, the real-space discretization of the PAW Hamiltonian is
where L_{GG'} is the finite-difference Laplacian
● Sparse matrix with large dimension, thus iterative techniques requiring only the operation Hψ are used
Multigrid method
● General framework for solving differential equations with a hierarchy of discretizations
● Poisson equation
can be solved as
● Problem: critical slowing down for long-wavelength components of the error
― solution: treat long-wavelength components on a coarser grid
― recursive restriction leads to the V-cycle
Multigrid method
● Each multigrid level reduces the error component whose wavelength is comparable to the grid spacing of that level
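The critical slowing down that motivates multigrid can be illustrated with a small experiment. The sketch below is an assumption-laden toy (one dimension, weighted Jacobi relaxation with weight 2/3, zero boundary values, hypothetical function names): it relaxes a pure sine-shaped error and reports how much of it survives a few sweeps.

```python
import math


def jacobi_sweeps(u, n_sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi relaxation for u'' = 0 with fixed end points."""
    u = list(u)
    for _ in range(n_sweeps):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = (1.0 - omega) * u[i] + omega * 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u


def remaining_error(k, n=64, n_sweeps=10):
    """Fraction of a sin(k*pi*x) error mode left after n_sweeps sweeps."""
    start = [math.sin(k * math.pi * i / n) for i in range(n + 1)]
    end = jacobi_sweeps(start, n_sweeps)
    return max(abs(v) for v in end) / max(abs(v) for v in start)


if __name__ == "__main__":
    print("long wavelength  (k=1): ", remaining_error(1))
    print("short wavelength (k=48):", remaining_error(48))
```

The short-wavelength mode is damped quickly while the long-wavelength mode barely changes. On a grid with twice the spacing the same long-wavelength mode looks relatively shorter, which is why moving it to a coarser grid helps.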
Egg-box effect
● Integrals involving localized functions centered on an atom and functions spanning the whole simulation cell such as
are evaluated as a sum over grid points
● Due to high-frequency components in the localized function, the integral depends on the position of the atom with respect to the grid
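A one-dimensional toy model shows the egg-box effect directly. In this sketch (all names and parameters are illustrative assumptions) a Gaussian stands in for the localized function, its integral is approximated by a sum over grid points, and the centre of the Gaussian is shifted across one grid spacing.

```python
import math


def grid_integral(center, h, sigma, half_width=20.0):
    """Approximate the integral of a Gaussian by a sum over a uniform grid."""
    n = int(half_width / h)
    return h * sum(
        math.exp(-((i * h - center) ** 2) / (2.0 * sigma ** 2))
        for i in range(-n, n + 1)
    )


def eggbox_spread(h, sigma, n_shifts=20):
    """Relative variation of the grid integral as the centre moves by one h."""
    exact = math.sqrt(2.0 * math.pi) * sigma
    values = [grid_integral(s * h / n_shifts, h, sigma)
              for s in range(n_shifts + 1)]
    return (max(values) - min(values)) / exact


if __name__ == "__main__":
    print("coarse grid:", eggbox_spread(h=1.0, sigma=0.3))
    print("fine grid:  ", eggbox_spread(h=0.3, sigma=0.3))
```

When the function is narrow compared with the grid spacing, the sum changes strongly with the position of the centre. Once the grid resolves the function, the dependence becomes negligible.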
Egg-box effect
● Problem: localization in real-space means delocalization in Fourier space and vice versa
● Fourier filtering is used for optimizing localization both in real and reciprocal spaces
Real-space method
● Flexible boundary conditions
● Systematic convergence
● Uniform resolution – empty space expensive
● Total energy is not variational with grid spacing
● Egg-box effect
● Efficient multigrid and iterative methods
● Good parallelization prospects
● Real-space program packages
― GPAW, Octopus, Parsec
Self-consistent cycle revisited
● In practice, the new electron density is not used directly when calculating the effective potential for the next iteration
● The new density is mixed with old density
Density mixing
● Simplest approach: direct mixing
● Typically, small mixing parameters are needed for the mixing to be stable
● Convergence can be slow
● In spin-polarized systems, charge density and magnetization density can be mixed separately
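Direct (linear) mixing is commonly written as n_in(new) = (1 - α) n_in + α n_out with a mixing parameter α. The sketch below applies it to a hypothetical scalar "SCF map" chosen so that the unmixed iteration diverges; the map and all names are assumptions made for illustration only.

```python
def toy_scf_map(n_in):
    """Hypothetical scalar stand-in for one SCF step; fixed point at n = 1."""
    return 3.0 - 2.0 * n_in


def direct_mixing(n_start, alpha, n_iter):
    """Iterate n_in <- (1 - alpha) * n_in + alpha * n_out."""
    n_in = n_start
    for _ in range(n_iter):
        n_out = toy_scf_map(n_in)
        n_in = (1.0 - alpha) * n_in + alpha * n_out
    return n_in


if __name__ == "__main__":
    print("alpha = 0.2:", direct_mixing(0.0, 0.2, 50))  # approaches 1
    print("alpha = 1.0:", direct_mixing(0.0, 1.0, 20))  # no mixing, diverges
```

For this map the error is multiplied by (1 - 3α) in each step, so only a small α gives a stable iteration, at the price of many iterations.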
Pulay mixing
● Mixing can be improved by using information from previous iterations
● Pulay method assumes that the residual
depends linearly on the input density:
● Optimum set of α_i is found by minimizing the norm of the residual
where
Pulay mixing
● Minimization of residual leads to
● Original Pulay method used only for obtaining the new density
● A weight factor can be included for covering the solution space more efficiently
● Parameters of Pulay mixing
― number of old densities to use
― the weight factor
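The following is a minimal sketch of Pulay-type mixing in its common textbook (DIIS) form, not necessarily the exact variant on the slides. Assumptions: the residual is R_i = n_out,i - n_in,i, the coefficients α_i minimize the norm of Σ α_i R_i under the constraint Σ α_i = 1, and the new input is Σ α_i (n_i + β R_i), with β playing the role of the weight factor. The linear test map `M`, `b` and all function names are hypothetical.

```python
import numpy as np

# Hypothetical linear "SCF map" n_out = M n_in + b, used only as a test problem.
M = np.array([[-1.5, 0.3, 0.0],
              [0.3, -0.8, 0.2],
              [0.0, 0.2, 0.4]])
b = np.array([1.0, 2.0, 3.0])


def toy_scf_map(n_in):
    return M @ n_in + b


def pulay_coefficients(residuals):
    """Minimize |sum_i alpha_i R_i| subject to sum_i alpha_i = 1."""
    m = len(residuals)
    system = np.zeros((m + 1, m + 1))
    for i, r_i in enumerate(residuals):
        for j, r_j in enumerate(residuals):
            system[i, j] = np.dot(r_i, r_j)
    system[:m, m] = 1.0   # Lagrange multiplier column
    system[m, :m] = 1.0   # constraint row
    rhs = np.zeros(m + 1)
    rhs[m] = 1.0
    return np.linalg.lstsq(system, rhs, rcond=None)[0][:m]


def pulay_mix(scf_map, n_start, beta=0.3, history=4, n_iter=30, tol=1e-8):
    inputs, residuals = [], []
    n = n_start.copy()
    for iteration in range(n_iter):
        r = scf_map(n) - n
        if np.linalg.norm(r) < tol:
            return n, iteration
        inputs = (inputs + [n])[-history:]        # keep only recent history
        residuals = (residuals + [r])[-history:]
        alpha = pulay_coefficients(residuals)
        n = sum(a * (n_i + beta * r_i)
                for a, n_i, r_i in zip(alpha, inputs, residuals))
    return n, n_iter


if __name__ == "__main__":
    n_final, n_steps = pulay_mix(toy_scf_map, np.zeros(3))
    print(n_final, n_steps)
```

The two tunable quantities in the sketch, `history` and `beta`, correspond to the number of old densities and the weight factor listed above.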
Charge sloshing
● Small variation in input density can cause a large change in output density
― Problem especially in large metallic systems
― Convergence of SCF cycle can be slowed down considerably
● Charge sloshing is typically caused by the long-range changes in density
● Filtering out long-range changes in mixing can reduce the effect of charge sloshing
Reducing charge sloshing
● The long-wavelength changes can be filtered out by using a special metric when calculating the residuals
● In reciprocal space one can use a diagonal metric
● In real space this metric is non-local; however, it can be approximated as
● The coefficients c_i are chosen (with a weight factor w) as:
Broyden mixing
● The SCF cycle can be interpreted as a non-linear problem
● Newton method:
● Broyden methods successively build an approximation to the inverse Jacobian J^{-1} using information from previous iterations
● Updates can be performed without explicit storage of the (approximate) inverse Jacobian
Density mixing
● Density mixing is needed for the SCF cycle to converge
● Different mixing schemes exist
● Charge sloshing can be avoided by using a suitable metric
● Optimum mixing parameters are highly system dependent
Iterative solution of eigenproblem
● Determination of the effective potential from the input charge density is computationally simple
― Poisson equation can be tricky with some basis sets
● The major task in an SCF calculation is the eigenproblem
● Well-established (dense) matrix diagonalization algorithms and libraries (LAPACK) can be used when basis size is modest
● If the basis is very large or the basis results in sparse matrix, iterative techniques may be advantageous
Iterative solution of eigenproblem
● Approximation for eigenvalue is obtained as
● All iterative methods are based on trial eigenfunction which is updated in each iteration
● Updates are normally based on the residual
● Iterative methods differ in how the residual is used
● The full Hamiltonian matrix is not needed
― Only the application of the matrix to an eigenfunction is needed
― only O(N^2) operations with sparse matrices
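The ingredients named above can be put together in a small matrix-free sketch. Assumptions: the Hamiltonian is just a one-dimensional kinetic operator -1/2 d²/dx² on a unit-spacing grid with zero boundaries, the eigenvalue estimate is the Rayleigh quotient ε = <ψ|H|ψ>/<ψ|ψ>, the residual is R = Hψ - εψ, and the update is a plain fixed-step correction ψ - τR without preconditioning. The names are hypothetical, and the update rule is the simplest possible choice rather than any of the specific methods on the following slides.

```python
import math


def apply_h(psi):
    """Apply -1/2 d^2/dx^2 (3-point stencil, h = 1) without storing a matrix."""
    n = len(psi)
    out = []
    for i in range(n):
        left = psi[i - 1] if i > 0 else 0.0
        right = psi[i + 1] if i < n - 1 else 0.0
        out.append(-0.5 * (left - 2.0 * psi[i] + right))
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lowest_state(n_points=10, step=0.5, n_iter=500):
    """Return (eigenvalue estimate, residual norm) after n_iter updates."""
    psi = [1.0] * n_points
    for _ in range(n_iter):
        norm = math.sqrt(dot(psi, psi))
        psi = [p / norm for p in psi]
        h_psi = apply_h(psi)
        eps = dot(psi, h_psi)                       # Rayleigh quotient
        residual = [hp - eps * p for hp, p in zip(h_psi, psi)]
        psi = [p - step * r for p, r in zip(psi, residual)]
    return eps, math.sqrt(dot(residual, residual))


if __name__ == "__main__":
    print(lowest_state())
```

For this model operator the lowest eigenvalue is known analytically, 1 - cos(π/(n+1)) for n grid points, which gives a convenient check of the iteration.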
Preconditioning
● Optimum update would be
● The residuals can be preconditioned by suitable approximations to the matrix inverse
● Preconditioned residuals can be obtained by approximately solving a Poisson equation
● With a plane-wave basis, the diagonal Teter preconditioner is often used
Residual minimization
● One possibility for updating the trial wave function is
● Step length is found by minimizing the new residual
● The final update is obtained as
● Wave functions are updated band by band
Davidson method
● In the simple Davidson method, one first calculates the preconditioned residuals for each state
● One then constructs a vector from the wave functions and preconditioned residuals
● The Hamiltonian is then diagonalized in this basis, and the N_el lowest eigenvectors are used as new trial wave functions
Conjugate gradient method
● Conjugate gradient algorithm can be used for minimizing the expectation value
● By constraining the updates to keep the new trial wave functions orthonormal, one can find the N_el lowest eigenstates
Subspace diagonalization
● Both residual minimization and conjugate gradient algorithms yield only arbitrary linear combinations of the N_el lowest eigenfunctions
● Hamiltonian can be diagonalized in the subspace of trial wave functions
● The N_el lowest eigenfunctions are obtained from subspace rotation:
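A compact sketch of the subspace rotation (the Rayleigh-Ritz step), under the assumptions that the trial vectors are real, orthonormal, and stored as columns of an array, and that the Hamiltonian is available only through a function that applies it. The random symmetric test matrix and the function names are hypothetical.

```python
import numpy as np


def subspace_diagonalize(apply_h, psi):
    """Rotate orthonormal trial vectors (columns of psi) within their span."""
    h_psi = apply_h(psi)
    h_sub = psi.T @ h_psi                  # small N_el x N_el matrix
    eps, rotation = np.linalg.eigh(h_sub)  # dense diagonalization is cheap here
    return eps, psi @ rotation


# Test problem: mix the three lowest eigenvectors of a random symmetric matrix.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))
H = 0.5 * (a + a.T)
exact_eps, exact_vecs = np.linalg.eigh(H)
mix, _ = np.linalg.qr(rng.standard_normal((3, 3)))
trial = exact_vecs[:, :3] @ mix            # arbitrary orthonormal combinations

eps, psi = subspace_diagonalize(lambda v: H @ v, trial)

if __name__ == "__main__":
    print(eps)
    print(exact_eps[:3])
```

Only the small subspace matrix is diagonalized with a dense solver; the large Hamiltonian enters through `apply_h` alone.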
Orthonormalization
● After residual minimization the trial wave functions are not orthonormal
● Numerically efficient scheme for orthonormalization:
― Calculate overlap matrix
and determine its Cholesky decomposition L
― Orthonormal wave functions are obtained by multiplying with inverse Cholesky factor
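The two steps above can be sketched with NumPy as follows. Assumptions: the wave functions are real and stored as columns of `psi`, the overlap is a plain dot product (a grid volume element or PAW overlap operator is omitted), and the Cholesky factorization is written as S = L Lᵀ. The function name is hypothetical.

```python
import numpy as np


def cholesky_orthonormalize(psi):
    """Orthonormalize the columns of psi using the Cholesky factor of S."""
    overlap = psi.T @ psi                    # S_ij = <psi_i|psi_j>
    lower = np.linalg.cholesky(overlap)      # S = L L^T
    # psi L^{-T}, computed by solving L X = psi^T instead of forming L^{-1}
    return np.linalg.solve(lower, psi.T).T


rng = np.random.default_rng(1)
trial = rng.standard_normal((50, 4))         # 4 non-orthogonal trial vectors
ortho = cholesky_orthonormalize(trial)

if __name__ == "__main__":
    print(np.round(ortho.T @ ortho, 12))
```

With S = L Lᵀ the new overlap is L⁻¹ S L⁻ᵀ, which is the identity matrix.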
Computational scaling of DFT calculation
● Poisson equation: O(N_b) – O(N_b log N_b)
● XC-potential
― semi-local functionals: O(N_b)
― EXX: O(N_e^4 N_b)
● Direct solution of eigenproblem
― Construction of Hamiltonian: O(N_b^2)
― Diagonalization of Hamiltonian: O(N_b^3)
or
● Iterative solution of eigenproblem
― Applying the Hamiltonian: O(N_el N_b) – O(N_el N_b log N_b)
― Subspace diagonalization: O(N_el^3)
― Orthonormalization: O(N_el^3)
Summary
● Density mixing is needed in the SCF cycle
● Iterative diagonalization schemes can be efficient with sparse matrices
● Computational scaling of the DFT algorithm is O(N^3)
― Direct eigensolvers: diagonalization dominates in large systems
― Iterative solvers: orthonormalization dominates in large systems
Parallel calculations
● Speedup in modern supercomputers is based on a large number of CPUs
● In order to exploit the available computing power, parallel computing is needed
● Parallel computing allows one to
― solve problems faster
― solve bigger problems
Parallelization prospects in DFT
● Basis functions
● k-points and spin
● Electronic states
● Additional trivial parallelizations
― e.g. different atomic configurations or unit cells
Parallelization: k-points and spin
● Spin and k-points can be treated equivalently
● Trivial parallelization
● Limited scalability
― k-points only in (small) periodic systems
― spin only in magnetic systems
Parallelization: basis
● Depends largely on the basis used
● Plane waves: parallel FFT is challenging
● Real-space grids: efficient domain decomposition
[Figure: real-space grid divided into domains for processes P1-P4, showing the finite-difference Laplacian and a PAW augmentation sphere]
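Domain decomposition for a finite-difference stencil can be mimicked in serial Python. In this sketch (periodic one-dimensional grid, three-point stencil without the 1/h² factor, hypothetical names) each "process" owns a contiguous block and needs only one ghost value from each neighbour, which stands in for the nearest-neighbour communication of a real parallel code.

```python
def laplacian_serial(f):
    """Three-point stencil on a periodic grid, computed in one piece."""
    n = len(f)
    return [f[(i - 1) % n] - 2 * f[i] + f[(i + 1) % n] for i in range(n)]


def laplacian_decomposed(f, n_domains):
    """Same stencil, computed block by block with ghost points.

    Assumes len(f) is divisible by n_domains.
    """
    n = len(f)
    size = n // n_domains
    result = []
    for d in range(n_domains):
        start, stop = d * size, (d + 1) * size
        # "Communication": fetch one boundary value from each neighbour.
        left_ghost = f[(start - 1) % n]
        right_ghost = f[stop % n]
        padded = [left_ghost] + f[start:stop] + [right_ghost]
        result.extend(padded[i - 1] - 2 * padded[i] + padded[i + 1]
                      for i in range(1, size + 1))
    return result


if __name__ == "__main__":
    data = [(i * i) % 7 for i in range(16)]
    print(laplacian_decomposed(data, 4) == laplacian_serial(data))
```

The amount of data exchanged depends on the stencil width and the surface of each domain, not on its volume, which is the reason the decomposition is efficient for real-space grids.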
Parallelization: electronic states
● Orthonormalizations are complicated
― communication of all wave functions to all processes
― can be implemented as a pipeline with only nearest-neighbour communication
Parallel scalability in practice
● The feasible number of CPUs depends heavily on the studied system (number of atoms/electrons)
● The largest ground-state DFT calculations typically use a few thousand CPU cores
● GPAW, real-space mode
● 561 Au atoms, 6200 electrons
● Blue Gene P, Argonne
Summary
● Parallelization is needed for fully exploiting modern computers
● DFT equations offer several parallelization levels
― k-points, spin, basis, electronic states
● Optimum number of CPUs depends on the studied system
GPAW
● Implementation of the projector augmented wave method on
― uniform real-space grids, atomic orbital basis, plane waves
● Density-functional theory and time-dependent DFT
● Open source software licensed under GPL
― 20-30 developers in Europe and in USA
wiki.fysik.dtu.dk/gpaw
J. J. Mortensen et al., Phys. Rev. B 71, 035109 (2005)
J. Enkovaara et al., J. Phys. Condens. Matter 22, 253202 (2010)
GPAW features
● PAW-method: accurate description over the whole periodic table
● Total energies, forces, structural optimization
― analysis of electronic structure
● Excited states, optical spectra
― Non-adiabatic electron-ion dynamics
● Wide range of XC-potentials (thanks to libxc!)
― LDAs, GGAs, meta-GGAs, hybrids, DFT+U, vdW
● Electron transport
● GW-approximation, Bethe-Salpeter equation
● ...
GPAW features
● Simple but flexible Python scripting interface via Atomic Simulation Environment
● Runs on wide variety of computer architectures
● Efficient parallelization, system sizes up to thousands of electrons
● Modular design helps implementing new features
Atomic Simulation Environment
● ASE is a Python package for
― building atomic structures
― structure optimization and molecular dynamics
― analysis and visualization
● ASE relies on external software which provides total energies, forces, etc.
― GPAW, Abinit, Siesta, Vasp, Castep, ...
● Input files are Python scripts
― calculations are run as “python input.py”
― simple format, no knowledge of Python required
― knowledge of Python enables great flexibility
● Simple graphical user interface
[Figure: ASE passes atomic positions to the Calculator; the Calculator returns energies, forces, wave functions, and densities]
wiki.fysik.dtu.dk/ase
Setting up the atoms
● Specifying atomic positions directly
from math import sqrt
from ase.all import *  # Set up the atomic simulation environment

d0 = 1.10
x = d0 / sqrt(3)
atoms = Atoms('CH4',
              positions=[(0.0, 0.0, 0.0),  # C
                         (x, x, x),        # H1
                         (-x, -x, x),      # H2
                         (-x, x, -x),      # H3
                         (x, -x, -x)])     # H4
view(atoms)
● Reading atomic positions from a file
― Several file formats supported
...
atoms = read('CH4.xyz')
view(atoms)
Setting up the unit cell
● By default, the simulation cell of an Atoms object has zero boundary conditions and an edge length of 1 Å
● Unit cell can be set when constructing Atoms or later on
atoms = ...
atoms.center(vacuum=3.5)  # finite system: 3.5 Å of empty space around the atoms

atoms = Atoms(...)  # positions in relative coordinates
atoms.set_cell((2.5, 2.5, 2.5), scale_atoms=True)
atoms.set_pbc(True)  # or atoms.set_pbc((True, True, True))

atoms = Atoms(...,  # positions must now be in absolute coordinates
              cell=(1., 2., 3.),
              pbc=True)  # or pbc=(True, True, True)

atoms = ...
atoms.set_pbc((False, True, True))  # surface slab
atoms.center(axis=0, vacuum=3.5)  # 3.5 Å of empty space in the x-direction
Units in ASE
● Length: Å
● Energy: eV
● Easy conversion between units:
― also Rydberg, kcal, nm, ...
from ase.units import Bohr, Hartree

a = a0 * Bohr  # a0 in a.u., a in Å
E = E0 * Hartree  # E0 in Hartree, E in eV
Pre-defined molecules and structures
● Database of small molecules (G2-1 and G2-2 sets)
from ase.structure import molecule
mol = molecule('C6H6')  # coordinates from MP2 calculation
mol.center(vacuum=3.5)  # molecule() returns a unit cell of 1 Å
● Bulk structures of elemental materials
from ase.lattice import bulk
atoms = bulk('Si')  # primitive (2-atom) unit cell with exp. lattice constant
atoms_conv = bulk('Si', cubic=True)  # cubic 8-atom unit cell
atoms_my_a = bulk('Si', a=5.4)  # user-specified lattice constant
Supercells and surfaces
● Existing Atoms objects can be “repeated” and individual atoms removed
from ase.lattice import bulk

atoms = bulk('Si', cubic=True)  # cubic 8-atom unit cell
supercell = atoms.repeat((4, 4, 4))  # 512 atom supercell
del supercell[0]  # remove first atom, e.g. create a vacancy
● Utilities for working with surfaces
from ase.lattice.surface import fcc111, add_adsorbate

slab = fcc111('Cu', size=(3, 3, 5))  # 5 layers of 3x3 Cu (111) surface
# add O atom 2.5 Å above the surface in the 'bridge' site
add_adsorbate(slab, 'O', 2.5, position='bridge')
Performing a calculation
● In order to do a calculation, one has to define a calculator object and attach it to Atoms
from ase.structure import molecule  # Set up the atomic simulation environment
from gpaw import GPAW  # Set up GPAW

atoms = molecule('CH4')
atoms.center(3.5)
calc = GPAW()  # Use default parameters
atoms.set_calculator(calc)
atoms.get_potential_energy()  # Calculate the total energy
● Specifying calculator parameters
...
calc = GPAW(h=0.18, nbands=6,  # grid spacing of 0.18 Å and 6 bands
            kpts=(4, 4, 4),  # 4x4x4 Monkhorst-Pack k-mesh
            xc='PBE', txt='out.txt')  # PBE and print text output to file
...
● See wiki.fysik.dtu.dk/gpaw/documentation/manual.html for all parameters
Performing a calculation
● Serial calculations and analysis can be carried out with normal Python interpreter
● Parallel calculations with gpaw-python executable
[jenkovaa@flamingo ~]$ python input.py

#!/bin/bash
#SBATCH -J gpaw_test
#SBATCH -t 00:30:00
#SBATCH -p parallel
#SBATCH -n 16

module load gpawenv/0.10.0

srun gpaw-python input.py
Structural optimization
● See wiki.fysik.dtu.dk/ase/ase/optimize.html for supported optimizers
● “Best” optimizer is case-dependent
from ase.all import *  # Set up the atomic simulation environment
from gpaw import GPAW  # Set up GPAW

atoms = ...
calc = GPAW(...)
atoms.set_calculator(calc)

opt = BFGS(atoms, trajectory='file.traj')  # define an optimizer
opt.run(fmax=0.05)  # optimize the structure until forces are smaller than 0.05 eV / Å
Simple Python scripting
atoms = ...
calc = GPAW(...)
atoms.set_calculator(calc)

# Check convergence with grid spacing
for h in [0.35, 0.30, 0.25, 0.20, 0.18]:
    txtfile = 'test_h' + str(h) + '.txt'
    calc.set(h=h, txt=txtfile)
    e = atoms.get_potential_energy()
    print(h, e)

import numpy as np
atoms = ...
calc = GPAW(...)
atoms.set_calculator(calc)

# lattice constant for different XC-functionals
for xc in ['LDA', 'PBE']:
    for a in np.linspace(3.8, 4.3, 5):
        txtfile = 'test_xc_' + xc + '_a' + str(a) + '.txt'
        atoms.set_cell((a, a, a), scale_atoms=True)
        calc.set(xc=xc, txt=txtfile)
        e = atoms.get_potential_energy()
Saving and restarting
● Saving full state of calculation: .gpw-files (or .hdf5-files)
...
calc = GPAW(...)
atoms.set_calculator(calc)
atoms.get_potential_energy()  # Calculate the total energy
calc.write('myfile.gpw')  # Atomic positions, densities, calculator parameters
...
calc.write('myfile.gpw', mode='all')  # Save also wave functions (larger files)
...
calc.write('myfile.hdf5', mode='all')  # If GPAW is built with HDF5 support
● Restarting
from ase.all import *  # Set up the atomic simulation environment
from gpaw import restart  # Set up GPAW

atoms, calc = restart('file.gpw')
e0 = atoms.get_potential_energy()  # no calculation needed
calc.set(h=0.20)
e1 = atoms.get_potential_energy()  # calculate total energy with the new grid spacing
Saving and restarting
● Trajectories: atomic positions, energies, forces
...
calc = GPAW(...)
atoms.set_calculator(calc)
traj = PickleTrajectory('file.traj', 'w', atoms)  # define a trajectory file
for a in np.linspace(3.8, 4.3, 5):
    txtfile = 'test_a' + str(a) + '.txt'
    calc.set(txt=txtfile)
    atoms.set_cell((a, a, a), scale_atoms=True)
    atoms.get_potential_energy()
    traj.write()  # write cell and energy to trajectory file
● Reading atomic positions
from ase.all import *  # Set up the atomic simulation environment
from gpaw import GPAW  # Set up GPAW

atoms = read('file.traj')  # read the last image
first = read('file.traj', 0)  # first image
calc = GPAW(...)
atoms.set_calculator(calc)  # calculator has to be attached