m a i s e - bingweb


Page 1: M A I S E - Bingweb

Module for Ab Initio Structure Evolution: Overview

MAISE: Structure Analysis & Prediction; Neural Network Modeling

Structure Analysis: symmetry (spglib), fingerprints (RDF)
slides 2-5: find space group, compare structures

Structure Simulation: local/global optimization, MD, phonons
slides 6-14: relax structure, set up MD & EVOS

Reference data generation: evolutionary sampling
slides 15-20: create & select data for NN fitting

NN model construction: stratified training
slides 20-24: train & test NNs for compounds

For neural network (NN) users

For NN developers

Page 2: M A I S E - Bingweb

Command Line Options

POSCAR

-spg find space group number with SPGLIB

-rdf find RDF and nearest neighbor distances

-cxc compute dot product for two structures

-cmp compare two structures using RDF and SPGLIB

-cif convert str.cif into conventional unit cell CONV

-sup make a supercell specified by Na x Nb x Nc

-rot rotate a nanoparticle along eigenvectors of moments of inertia

-dim find structure periodicity

-vol compute volume per atom for crystal or nano structures

-box reset the box size for nanoparticles

Cu NP

1.000000

20.00000000 0.00000000 0.00000000

0.00000000 20.00000000 0.00000000

0.00000000 0.00000000 20.00000000

Cu

36

Cartesian

8.78504090 8.78512670 10.00000000

11.75961210 9.10945610 10.00000000

9.10951520 11.75965600 10.00000000

10.16741660 10.16741750 11.98894980

[...]

Command-line applications are designed to perform structure analysis and manipulation operations given one POSCAR file or two input files, POSCAR0 and POSCAR1.
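As an illustration of the kind of operation behind a flag such as -vol, the following minimal Python sketch parses a POSCAR in the VASP 5 layout and computes the volume per atom. It is not MAISE code: the function names are hypothetical, negative scale factors and selective dynamics are not handled, and the input is the Cu-Ag cell that appears later in these slides.

```python
def parse_poscar(text):
    """Parse a minimal VASP 5 style POSCAR (comment, scale, lattice, species, counts, mode, coords)."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    scale = float(lines[1])
    lattice = [[scale * float(x) for x in lines[i].split()[:3]] for i in (2, 3, 4)]
    species = lines[5].split()
    counts = [int(n) for n in lines[6].split()]
    direct = lines[7].lower().startswith("d")
    natoms = sum(counts)
    coords = [[float(x) for x in lines[8 + i].split()[:3]] for i in range(natoms)]
    return {"lattice": lattice, "species": species, "counts": counts,
            "direct": direct, "coords": coords}

def cell_volume(lattice):
    """Volume of the cell spanned by three lattice vectors: |a . (b x c)|."""
    a, b, c = lattice
    cross = [b[1] * c[2] - b[2] * c[1],
             b[2] * c[0] - b[0] * c[2],
             b[0] * c[1] - b[1] * c[0]]
    return abs(sum(a[i] * cross[i] for i in range(3)))

POSCAR_CUAG = """Cu Ag
1.0
3.0 0.0 0.0
0.0 3.0 0.0
0.0 0.0 3.0
Cu Ag
1 1
direct
0.0 0.0 0.0
0.5 0.5 0.5
"""

cell = parse_poscar(POSCAR_CUAG)
vol_per_atom = cell_volume(cell["lattice"]) / sum(cell["counts"])
print(vol_per_atom)  # 27 A^3 cell with 2 atoms
```

The value of 13.5 Å^3/atom agrees with the vol/atom entry reported for this cell in the structure comparison output.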


maise/XX-app/

Page 3: M A I S E - Bingweb

Symmetry Analysis with SPGLIB

str.cif CONV PRIM

_symmetry_Int_Tables_number 12

_cell_length_a 4.2624856597999239

_cell_length_b 3.0000000000000000

_cell_length_c 3.0000006666665926

_cell_angle_alpha 90.0000000000000000

_cell_angle_beta 134.6956664078110180

_cell_angle_gamma 90.0000000000000000

_chemical_formula_sum

'Cu '

loop_

Cu1 Cu 2 a 0.0000000000000000 0.0000000000000000 0.0000000000000000

#End

[1] Run the SG solver with a given tolerance (default 0.01) to get the SG number, Pearson symbol, international SG symbol, tolerance used, and similarity (cxc):

$ maise -spg 0.01

139 tI2 I4/mmm 1.0E-02 1.0000

$ maise -spg 0.0005

12 mS2 C2/m 5.0E-04 1.0000

$ maise -spg -0.1

229 cI2 Im-3m 1.0E-01 0.6891

139 tI2 I4/mmm 3.0E-02 1.0000

12 mS2 C2/m 1.0E-03 1.0000

[2]

bcc-Cu

1.00000000000000

3.03000000000000 0.00000000000000 0.00000000000000

0.00200000000000 3.00000000000000 0.00000000000000

0.00000000000000 0.00000000000000 3.00000000000000

Cu

2

direct

0.00000000000000 0.00000000000000 0.00000000000000

0.50000000000000 0.50000000000000 0.50000000000000

With the [-spg] flag, maise uses the spglib solver to find the space group (SG) symmetry of a given POSCAR structure. The tolerance for the symmetry analysis can be adjusted. The output is a symmetrized structure in the cif and VASP formats (CONV for conventional and PRIM for primitive unit cells).

maise/01-spg/


[2] With a given negative tolerance, maise will scan the [ |tol|, 10^-12 ] range and output the closest tolerance values corresponding to a SG symmetry change
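The tolerance scan in note [2] can be pictured with the following Python sketch. It is only an assumption about how such a scan might be organized: the logarithmic grid, the step density, and the function names are hypothetical, and `toy_solver` is a stand-in for a real space group solver whose thresholds merely mimic the bcc-Cu output shown on this slide.

```python
import math

def scan_tolerances(find_sg, tol_max, tol_min=1e-12, steps_per_decade=20):
    """Scan tolerances from tol_max down to tol_min on a log grid and record
    the first (largest) grid tolerance at which each new space group appears."""
    n_steps = int(round(math.log10(tol_max / tol_min) * steps_per_decade))
    changes, last_sg = [], None
    for k in range(n_steps + 1):
        tol = tol_max * 10 ** (-k / steps_per_decade)
        sg = find_sg(tol)
        if sg != last_sg:
            changes.append((sg, tol))
            last_sg = sg
    return changes

def toy_solver(tol):
    """Hypothetical stand-in for a symmetry solver (not spglib, not MAISE)."""
    if tol > 3.0e-2:
        return 229   # Im-3m
    if tol > 1.0e-3:
        return 139   # I4/mmm
    return 12        # C2/m

changes = scan_tolerances(toy_solver, 0.1)
for sg, tol in changes:
    print(sg, f"{tol:.1E}")
```

With a real solver, `find_sg` would wrap a call that returns the space group number for a given tolerance; the grid resolution then limits how closely the reported tolerances bracket each symmetry change.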

POSCAR

https://atztogo.github.io/spglib/

Page 4: M A I S E - Bingweb

Structure Analysis

list.dat RDF.dat rdf.dat

# R total AA AB BB

[...]

2.996667 5.917243 2.958621 0.000000 2.958621

2.998333 5.979203 2.989601 0.000000 2.989601

3.000000 6.000000 3.000000 0.000000 3.000000

[...]

[1][2]

[1] Distance from the origin in Å

$ maise -rdf 300 4.5 5 0.02

Max number of nearest neighbors 300

Soft cutoff for finding neighbors 4.500000

Hard cutoff for finding neighbors 5.000000

Gaussian spread for smearing bonds 0.020000

Neighbor list written to list.dat

Normalized RDF written to RDF.dat

Original RDF written to rdf.dat

Cu Ag

1.00000000000000

3.00000000000000 0.00000000000000 0.00000000000000

0.00000000000000 3.00000000000000 0.00000000000000

0.00000000000000 0.00000000000000 3.00000000000000

Cu Ag

1 1

direct

0.00000000000000 0.00000000000000 0.00000000000000

0.50000000000000 0.50000000000000 0.50000000000000

With the [-rdf] flag, maise calculates the radial distribution function (RDF) for an input structure and outputs the list of nearest neighbor distances (list.dat) and the normalized (RDF.dat) and original (rdf.dat) RDF patterns.
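The idea of a neighbor list with Gaussian-smeared bonds can be sketched in Python as follows. This is a generic illustration, not the MAISE implementation: the function names are hypothetical, the soft cutoff and the normalization used in RDF.dat are not modeled, and the brute-force image scan assumes that two periodic images per direction are enough for the chosen cell and cutoff.

```python
import math
from itertools import product

def frac_to_cart(lattice, frac):
    """Convert fractional coordinates to Cartesian (lattice vectors as rows)."""
    return [sum(frac[k] * lattice[k][i] for k in range(3)) for i in range(3)]

def neighbor_distances(lattice, frac_coords, center, r_hard, n_img=2):
    """Sorted distances from atom `center` to all atoms and images within r_hard."""
    origin = frac_to_cart(lattice, frac_coords[center])
    dists = []
    for j, frac in enumerate(frac_coords):
        for shift in product(range(-n_img, n_img + 1), repeat=3):
            if j == center and shift == (0, 0, 0):
                continue  # skip the atom itself
            image = [frac[k] + shift[k] for k in range(3)]
            d = math.dist(origin, frac_to_cart(lattice, image))
            if d <= r_hard:
                dists.append(d)
    return sorted(dists)

def smeared_rdf(dists, r_grid, sigma):
    """Sum of unit-area Gaussians of width sigma centered at each bond length."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [sum(norm * math.exp(-0.5 * ((r - d) / sigma) ** 2) for d in dists)
            for r in r_grid]

# Cu-Ag cell from this slide: 3 A cube, atoms at (0,0,0) and (1/2,1/2,1/2)
lattice = [[3.0, 0.0, 0.0], [0.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
frac_coords = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5]]
dists = neighbor_distances(lattice, frac_coords, center=0, r_hard=5.0)
r_grid = [2.0 + 0.01 * i for i in range(301)]      # 2.0 to 5.0 A
rdf = smeared_rdf(dists, r_grid, sigma=0.02)
print(len(dists), round(dists[0], 3))
```

The hard cutoff of 5 Å and the spread of 0.02 Å are taken from the command shown on the slide; everything else is illustrative.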


maise/00-rdf/

A.N. Kolmogorov et al., PRL 105, 217003 (2010)

[2] Normalized RDF for each combination of species

POSCAR

Page 5: M A I S E - Bingweb

RDF1.dat

$ maise -cmp 300 6 7 0.06

STR vol/atom space group number RDF scalar product

number A^3/atom 10^-1 10^-2 10^-4 10^-8 0 1

0 13.500000 221 221 221 221 1.000000 0.979452

1 13.635000 221 123 10 10 0.979452 1.000000

$ maise -cxc 300 6 7 0.06

0.979452

[2] -cxc flag: quick calculation of RDF products.

[1]

[1] Dot product of RDFs for POSCAR0 and POSCAR1 as a measure of structural similarity: 1.0 representing similar and 0.0 dissimilar structures.
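A normalized dot product of two RDF patterns sampled on the same grid can be sketched as below. This is a generic cosine-similarity illustration with hypothetical function names; how MAISE weights or combines the per-species RDF components is not stated on the slide and is not modeled here.

```python
import math

def rdf_similarity(rdf_a, rdf_b):
    """Normalized dot product of two RDFs on the same grid (1.0 for identical patterns)."""
    if len(rdf_a) != len(rdf_b):
        raise ValueError("RDFs must be sampled on the same grid")
    dot = sum(x * y for x, y in zip(rdf_a, rdf_b))
    norm_a = math.sqrt(sum(x * x for x in rdf_a))
    norm_b = math.sqrt(sum(y * y for y in rdf_b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("RDF with zero norm")
    return dot / (norm_a * norm_b)

def gaussian_peak(center, sigma, grid):
    """Single Gaussian peak used as a toy RDF."""
    return [math.exp(-0.5 * ((r - center) / sigma) ** 2) for r in grid]

grid = [2.0 + 0.01 * i for i in range(201)]
rdf0 = gaussian_peak(3.00, 0.06, grid)
rdf1 = gaussian_peak(3.03, 0.06, grid)   # slightly shifted peak
print(rdf_similarity(rdf0, rdf0), rdf_similarity(rdf0, rdf1))
```

Because RDF values are non-negative, the result stays between 0 and 1, which matches the interpretation given in note [1].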

Zn

1.00000000000000

8.59556541588277 0.00000000000000 0.00000000000000

0.00000000000000 2.96067416794937 0.00000000000000

0.00000000000000 0.00000000000000 4.91525475045491

Zn

8

direct

0.42517529285620 0.50000000000000 0.14065768796726

0.18890882824820 0.50000000000000 0.61011420106387

0.57482470714380 0.50000000000000 0.85934231203274

0.81109117175180 0.50000000000000 0.38988579893613

0.92517529285620 0.00000000000000 0.14065768796726

0.68890882824820 0.00000000000000 0.61011420106387

0.07482470714380 0.00000000000000 0.85934231203274

0.31109117175180 -0.00000000000000 0.38988579893613

POSCAR1

RDF0.dat list1.dat list0.dat

Cu-Ag

1.00000000000000

3.00000000000000 0.00000000000000 0.00000000000000

0.00000000000000 3.00000000000000 0.00000000000000

0.00000000000000 0.00000000000000 3.00000000000000

Cu Ag

1 1

direct

0.00000000000000 0.00000000000000 0.00000000000000

0.50000000000000 0.50000000000000 0.50000000000000

The list of nearest neighbors and a normalized RDF for each input structure are written out after the structure comparison.

maise/02-cmp/

Structure Comparison

POSCAR0

[2]

A.N. Kolmogorov et al., PRL 105, 217003 (2010)

Page 6: M A I S E - Bingweb

MAISE

Structure Simulation: Built-in Options

INPUT: POSCAR (structure), setup (simulation type), model (potential type)

OUTPUT: CONTCAR, OUTCAR, ...

Potential types: neural network, Gupta, Sutton-Chen, ...

Simulation types: local relaxation, molecular dynamics, Γ-point phonons, ...

Page 7: M A I S E - Bingweb

-----------------------------------------------------------------------------

| neural network general information |

-----------------------------------------------------------------------------

| model unique ID | 0357EDA0 |

| number of species | 1 |

| species types | 29 |

| species names | Cu |

| number of layers | 4 |

| architecture | 51 10 10 1 |

| number of weights | 641 |

| reference | doi |

-----------------------------------------------------------------------------

| performance |

-----------------------------------------------------------------------------

| train energy error | 0.001731 eV/atom |

| test Energy error | 0.001833 eV/atom |

| train force error | 0.010072 eV/Ang |

| test Force error | 0.010170 eV/Ang |

-----------------------------------------------------------------------------

B2A Scaling

0.529177249 1.25

...

Rc 1 11.338

-----------------------------------------------------------------------------

Structure Simulation: Available Model Types

model

-----------------------------------------------------------------------------

| Gupta potential |

-----------------------------------------------------------------------------

| model unique ID | XXXXXXXX |

| number of species | 1 |

| species types | 29 |

| species names | Cu |

| reference | https://doi.org/10.1007/s11051-017-3907-6 |

-----------------------------------------------------------------------------

| parameters |

-----------------------------------------------------------------------------

| A rep (eV) Cu | 0.0855 |

| p rep (Ang) Cu | 10.96 |

| B att (eV) Cu | 1.2240 |

| q att (Ang) Cu | 2.278 |

| r0 (Ang) Cu | 2.556 |

| Rmin (Ang) Cu | 100.0000 |

| Rmax (Ang) Cu | 110.0000 |

-----------------------------------------------------------------------------

model

Naming convention: nn_Cu_d3_v0 = neural network for Cu, periodic structure, version 0

Architecture: 51 inputs, 2 hidden layers, 1 output; 641 = (51+1)*10 + (10+1)*10 + (10+1)*1 weights

Typical energy accuracy: 1-10 meV/atom; typical force accuracy: 10-50 meV/Å

Interaction range (Rc): ( 11.338 * B2A * 1.25 ) Å = 7.5 Å
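The two numbers quoted for the NN model, the weight count and the interaction range, can be checked with a few lines of Python. The helper name is hypothetical; the arithmetic follows the expressions on the slide, with B2A being the Bohr-to-Ångström factor listed in the model file.

```python
def count_weights(architecture):
    """Weights plus biases of a fully connected feed-forward network."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(architecture, architecture[1:]))

B2A = 0.529177249      # Bohr to Angstrom, as listed in the model file
SCALING = 1.25         # scaling factor listed next to B2A
RC_MODEL = 11.338      # Rc value stored in the model file

n_weights = count_weights([51, 10, 10, 1])
rc_angstrom = RC_MODEL * B2A * SCALING
print(n_weights, round(rc_angstrom, 2))
```

Under the assumption that a binary model holds one network per species, twice the count for an 82-10-10-1 architecture gives 1902, the number of parameters reported in the training output later in these slides.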

Naming convention: gp_Cu_d3_vp = Gupta potential for Cu clusters, version p

Parameters taken from DOI 10.1007/s11051-017-3907-6

Effective accuracy ~ 30 meV/atom (DOI 10.1021/acs.jpcc.9b08517)

Rmax is set to 110 Å to include all neighbors; for bulk structures it should be set to ~ 10 Å
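For orientation, the commonly used Gupta (second-moment tight-binding) energy expression can be evaluated with the Cu parameters from the model file as sketched below. This is an assumption-laden illustration rather than the MAISE implementation: it maps "B att" onto the attractive prefactor usually written as ξ, applies no cutoff smoothing between Rmin and Rmax, and uses a hypothetical function name.

```python
import math

def gupta_energy(positions, A, p, xi, q, r0):
    """Total Gupta energy (eV) of a non-periodic cluster, summed over atoms:
    E_i = sum_j A exp(-p (r_ij/r0 - 1)) - sqrt(sum_j xi^2 exp(-2 q (r_ij/r0 - 1)))."""
    total = 0.0
    for i, ri in enumerate(positions):
        repulsion, band = 0.0, 0.0
        for j, rj in enumerate(positions):
            if i == j:
                continue
            x = math.dist(ri, rj) / r0 - 1.0
            repulsion += A * math.exp(-p * x)
            band += xi * xi * math.exp(-2.0 * q * x)
        total += repulsion - math.sqrt(band)
    return total

# Cu parameters as listed in the model file on this slide
CU = dict(A=0.0855, p=10.96, xi=1.2240, q=2.278, r0=2.556)

# Cu dimer at the reference distance r0: each atom contributes A - xi
dimer = [(0.0, 0.0, 0.0), (CU["r0"], 0.0, 0.0)]
e_dimer = gupta_energy(dimer, **CU)
print(e_dimer)
```

The dimer at r0 is chosen only because its energy, 2(A - ξ), can be verified by hand.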

Page 8: M A I S E - Bingweb

model setup POSCAR

OUTCAR OSZICAR CONTCAR

POSITION (Angst) TOTAL-FORCE (eV/Angst) ATOM ENERGY (eV)

---------------------------------------------------------------------------------------------------

0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 -4.099140027504

0.000000 1.817187 1.817187 0.000000 0.000000 0.000000 -4.099140027504

1.817187 0.000000 1.817187 0.000000 0.000000 0.000000 -4.099140027504

1.817187 1.817187 0.000000 0.000000 0.000000 0.000000 -4.099140027504

---------------------------------------------------------------------------------------------------

Total 0.00000001 0.00000001 0.00000001 0.00000000 0.00000000 0.00000000

in kB 0.00001015 0.00001015 0.00001015 0.00000000 0.00000000 0.00000000

---------------------------------------------------------------------------------------------------

iter 8 total enthalpy= -16.39656011 energy= -16.39656011 -4.09914003 -4.09914003

---------------------------------------------------------------------------------------------------

Total CPU time used (sec): 7.22000000 wall time (sec): 1.65885300

[5]

[5] enthalpy and energy per atom

[4]

[4] energy for each atom

[1] NPAR sets the number of cores used in parallel execution. It should be a multiple of the number of atoms.

[2]

MAISE can perform local structure relaxation given model, setup, and POSCAR files. The OUTCAR output (compatible with the VASP format) includes the relaxed structure, total and atomic energies, forces, etc.

Structure Simulation: Relaxation

maise/03-rlx/

$ maise

0 -7.9926743659805801

1 -8.1223228263625700 -0.1296484603819898

[...]

INI -7.99267437

DIF -0.20441123

FIN -8.19708559

[2] NDIM sets the periodicity of the simulation cell. Cells are either periodic (3) or non-periodic (0).

[3] MITR sets the max number of cell relaxation steps. Set it to 0 for a single-point calculation.

[1]

[3]

=====================================================================================

JOBT 20 cell relaxation

NPAR 4 number of cores for parallel run

NDIM 3 (3) crystal; (0) cluster;

MITR 100 number of cell optimization steps

RLXT 3 cell optimization type: (2) force only; (3) full cell; (7) volume;

PGPA 0.0 pressure in GPa

ETOL 1e-8 error tolerance for cell optimization convergence

COUT 12 OUTCAR output options

MINT 0 minimizer type: (0) BFGS2; (1) CG-FR; (2) CG-PR; (3) steepest descent;

MMAX 500 maximum number of nearest neighbors

=====================================================================================

Page 9: M A I S E - Bingweb

model setup

=====================================================================================

JOBT 21 molecular dynamics

NPAR 16 number of cores for parallel run

NDIM 3 (3) crystal; (0) cluster;

MDTP 20 MD type (10) NVE (20) NVT Nose-Hoover (30) NPT Nose-Hoover + Berendsen

PGPA 0.0 pressure in GPa

TMIN 1300.0 min temperature in MD runs

TMAX 1500.0 max temperature in MD runs

TSTP 100.0 temperature step in MD runs

DELT 2.0 time step in femtoseconds

NSTP 100 number of steps

CPLT 25.0 thermostat coupling constant

CPLP 100.0 barostat coupling constant

ICMP 0.01 isothermal compressibility (in 1/GPa)

=====================================================================================

POSCAR

MAISE can perform NVE, NVT, and NPT molecular dynamics simulations given model, setup, and POSCAR files. The main output file ave-out.dat contains the averages for the potential and kinetic energies, lattice constants (for NPT), and Lindemann index at each temperature.
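The Lindemann index mentioned above is commonly defined as the pair-averaged relative fluctuation of interatomic distances over a trajectory. A minimal Python sketch of that textbook definition follows; it is not taken from MAISE, the function name is hypothetical, and for periodic cells the distances would need a minimum-image treatment that is omitted here.

```python
import math
from itertools import combinations

def lindemann_index(trajectory):
    """Pair-averaged sqrt(<r^2> - <r>^2) / <r> over frames.

    trajectory: list of frames, each a list of (x, y, z) positions of a
    non-periodic cluster with the same atom ordering in every frame.
    """
    n_atoms = len(trajectory[0])
    pairs = list(combinations(range(n_atoms), 2))
    total = 0.0
    for i, j in pairs:
        d = [math.dist(frame[i], frame[j]) for frame in trajectory]
        mean = sum(d) / len(d)
        mean_sq = sum(x * x for x in d) / len(d)
        variance = max(mean_sq - mean * mean, 0.0)   # guard against round-off
        total += math.sqrt(variance) / mean
    return total / len(pairs)

# Toy check: a dimer whose bond length alternates between 1 and 3
frames = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
          [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]]
delta = lindemann_index(frames)
print(delta)
```

A rigid structure gives an index of zero; larger values indicate increasingly liquid-like motion.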

Structure Simulation: Molecular Dynamics

maise/07-nvt/

Example: linear thermal expansion coefficient in fcc-Ag (NPT)

S. Hajinazar et al., arxiv:2005.12131 (2020)

Page 10: M A I S E - Bingweb

Structure Simulation: External Usage

External engine: PHON, PyChemia, MAISE, ...

Simulation type: phonon analysis, evolutionary search, ...

MAISE

INPUT: POSCAR (structure), setup (simulation type), model (potential type)

OUTPUT: CONTCAR, OUTCAR, ...

Potential types: neural network, Gupta, Sutton-Chen, ...

Simulation types: local relaxation, molecular dynamics, Γ-point phonons, ...

Page 11: M A I S E - Bingweb

Structure Simulation: Examples of External Usage

MAISE Atomic forces with NNs

new stable Mg-Ca phases at high T, P

S. Hajinazar, J. Shao, and A.N. Kolmogorov, PRB 95, 014114 (2017) W. Ibarra-Hernandez, S. Hajinazar, et al., PCCP 20, 27545 (2018)

PHON Frozen phonon method (D. Alfe, 10.1016/j.cpc.2009.03.010)

phonon dispersion, free energy corrections

MAISE Local relaxation with NNs

PyChemia Minima hopping search (A. Romero, https://pypi.org/project/pychemia/)

Page 12: M A I S E - Bingweb

[Figure: E (eV/atom) vs. generation number, 0 to 80]

Generation 0: initialize, relax

Generation 1: mate, relax, rank, select

Structure Simulation: Evolutionary Search

Page 13: M A I S E - Bingweb

Structure Simulation: Evolutionary Search

MAISE Local relaxation with NNs

MAISE Evolutionary search

=============================================================================================================

EVOLUTIONARY SEARCH GENERAL SETTINGS

=============================================================================================================

JOBT 10 evolutionary search run (10); soft exit (11); hard exit (12); analysis (13)

NMAX 48 maximum number of atoms

MMAX 500 maximum number of neighbors within cutoff radius

NSPC 2 number of species types

TSPC 12 20 species types

ASPC 8 4 atom number of each species

CODE 2 MAISE-INT (0); VASP-EXT (1); MAISE-EXT (2)

QUET 1 queue type: torque (0); slurm (1)

NDIM 3 structure type: crystal (3); film (2); cluster (0)

NPOP 32 population size

SITR 0 starting iteration

NITR 100 number of iterations

TINI 0 starting options if SITR=0

TIME 1200 max time per relaxation

PGPA 0.0 pressure in GPa

SEED 0 random number generator seed (0 for system time)

=============================================================================================================

EVOLUTIONARY OPERATION SETTINGS

=============================================================================================================

MATE 0.7 crossover with planar cut

MUTE 0.3 mutation via distortion

=============================================================================================================

MCRS 0.5 mutation rate in crossover

SCRS 0.1 swapping rate in crossover

LCRS 0.1 lattice vector distortion in crossover

ACRS 0.1 atomic position distortion in crossover

SDST 0.1 swapping rate in mutation

LDST 0.15 lattice vector distortion in mutation

ADST 0.15 atomic position distortion in mutation

MAGN 10000000 max number of tries for crossover

=============================================================================================================

setup
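The generational cycle controlled by these settings (rank, select, mate, mutate) can be illustrated with a toy Python loop. Everything here is a simplified stand-in: the "structure" is a plain vector, the energy is a toy function, no local relaxation is performed, and the single-cut crossover and Gaussian distortion only loosely mirror the planar-cut crossover and distortion mutation named in the setup. Only the MATE and MUTE fractions and NPOP are taken from the slide.

```python
import random

def toy_energy(x):
    """Toy stand-in for a relaxed enthalpy; minimum at x = (1, 1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)

def evolve(npop=32, nitr=100, mate=0.7, mute=0.3, dim=6, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5.0, 5.0) for _ in range(dim)] for _ in range(npop)]
    history = []
    for _ in range(nitr):
        pop.sort(key=toy_energy)                       # rank
        history.append(toy_energy(pop[0]))
        parents = pop[: npop // 2]                     # select the better half
        children = []
        for _ in range(int(round(mate * npop))):       # crossover with one cut
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, dim)
            children.append(a[:cut] + b[cut:])
        for _ in range(int(round(mute * npop))):       # mutation via distortion
            a = rng.choice(parents)
            children.append([xi + rng.gauss(0.0, 0.15) for xi in a])
        # keep the best npop members of the old and new generations (elitist)
        pop = sorted(pop + children, key=toy_energy)[:npop]
    history.append(toy_energy(pop[0]))
    return pop[0], history

best, history = evolve(nitr=20)
print(history[0], history[-1])
```

Because the best members always survive, the lowest energy in the population can only stay the same or decrease from one generation to the next.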

Page 14: M A I S E - Bingweb

Structure Simulation: Evolutionary Search Performance

3D crystals 0D clusters

Crossover operation critical for N > 10 atoms

No input information: 3000-10000 relaxations for 12 ≲ N ≲ 30

Lattice constant input: only 100-200 relaxations for 20 ≲ N ≲ 30

tI56-CaB6, found without any input, is among the largest confirmed predictions

Kolmogorov et al., PRL 109, 075501 (2012)

[Figure: H - H(cP7) (eV/atom) vs. V (Å^3/atom)]

Hajinazar et al., PCCP 20, 27545 (2018); Thorn et al., JPCC 21, 8729 (2019)

Convenient TETRIS-like generation of clusters

Alternative evolutionary operations for efficiency analysis

Dramatic efficiency boost with multitribe co-evolution

New putative DFT ground states for Au with 30-80 atoms

Page 15: M A I S E - Bingweb

Construction of Neural Network Interatomic Models: Overview

Generate a database of relevant structures: { Ri } with E, Fiα, sαβ (MAISE-NET: evolutionary sampling)

Convert atomic positions into NN input: { Ri } = { Rij , θijk } → { xin } via Behler-Parrinello symmetry functions (MAISE: data parsing)

Tune NN parameters to produce energy & forces (MAISE: stratified training)
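The conversion of atomic positions into NN input can be illustrated with a radial symmetry function of the Behler-Parrinello type. The sketch below uses the standard cosine cutoff and Gaussian radial form; the function names and the parameter values (eta, r_s) are illustrative assumptions, since the actual functions and parameters used by MAISE are defined in its basis file and are not shown on the slide.

```python
import math

def cutoff(r, rc):
    """Cosine cutoff: 1 at r = 0, decaying smoothly to 0 at r = rc."""
    if r >= rc:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / rc) + 1.0)

def g2_radial(distances, eta, r_s, rc):
    """Radial symmetry function: sum over neighbors of a Gaussian times the cutoff."""
    return sum(math.exp(-eta * (r - r_s) ** 2) * cutoff(r, rc) for r in distances)

# Illustrative environment: 8 neighbors at 2.598 A and 6 at 3.0 A (bcc-like shells)
neighbors = [2.598] * 8 + [3.0] * 6
rc = 7.5   # interaction range quoted for the Cu NN model
descriptor = [g2_radial(neighbors, eta, 0.0, rc) for eta in (0.01, 0.1, 1.0)]
print(descriptor)
```

Each choice of eta probes a different length scale, so a set of such functions (plus angular ones for θijk) turns a variable-size neighborhood into a fixed-length input vector.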

Page 16: M A I S E - Bingweb

1995-: system-specific NNs (constant # of atoms), reviewed in JCTC 1, 14 (2005)

1999-2004: atomistic RBF-NN (constant # of neighbors), AK PhD thesis, unpublished (2004)

2007: symmetry functions (arbitrary environments), Behler-Parrinello, PRL 98, 146401 (2007)

2013-: arbitrary environments: SOAP (Bartók, Kondor, Csányi), PRB 87, 184115 (2013); moment tensor (Shapeev), etc., MMS 14, 1153 (2016); completeness (Pozdnyakov et al.), arXiv:2001.11696 (2020)

[Figure: schematic NN input vectors for each approach: r1 r2 ... rN; Δr1 Δr2 ... ΔrN; n1 n2 ... nN]

Our focus: Develop a practical NN tool for unconstrained prediction of stable alloys

Construction of Neural Network Interatomic Models: A Perspective

Page 17: M A I S E - Bingweb

Evolutionary sampling guidelines

• Sample configurations relevant for EVOS
• Add EOS structures with short distances
• Eliminate similar structures
• Expand datasets iteratively after NN testing

Reference data and NN generation with MAISE-NET

DFT

Data generation approaches

• Randomization of given structures
• Molecular dynamics
• Evolutionary sampling (2017)

S. Hajinazar, J. Shao, and A.N. Kolmogorov, PRB 95, 014114 (2017) S. Hajinazar et al., arxiv:2005.12131 (2020)

NN

Page 18: M A I S E - Bingweb

Reference Data Organization and Selection

TEFS 1 train NN for (0) E (1) EF

FMRK 0.5 fraction of atoms used for EF training

ECUT 0.90 fraction of lowest-enthalpy structures

FMAX 100.0 do not parse data with larger force

VMIN 0.0 do not parse data with lower volume

VMAX 45.0 do not parse data with larger volume

RAND 5 optional for fixed parsing

DEPO ./DATA location of DFT data

setup

-7.99786290

in kB -78.764 40.695 -54.435 19.213 -102.114 -17.84928

POSITION TOTAL-FORCE (eV/Angst)

--------------------------------------------------------

3.14733 -0.05179 1.34028 0.225348 0.299805 -0.262492

2.39085 -1.67936 2.83105 -0.225348 -0.299805 0.262492

-------------------------------------------------------

pressure 10.00000000

dat.dat

if present, the tag file overrides setup and assigns the full subset to training

DATA/eosbcc

tag 00/ 01/ ...

Data is divided into subsets with the same composition, the same pressure, and the same dimensionality

POSCAR DATA/evo01/01

DATA/evo01

00/ 01/ ...

DATA/evo00

00/ 01/ ...

ECUT = 0.90 means that the 10% highest-enthalpy structures are discarded

The remaining structures are split into training (90%) & testing (10%) sets

A fraction FMRK of the forces is selected for training
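The ECUT selection described above amounts to sorting a subset by enthalpy and keeping the lowest fraction. A minimal Python sketch, with a hypothetical function name and rounding rule (the slides do not say how MAISE rounds the count), is:

```python
def select_by_ecut(enthalpies, ecut):
    """Return (kept, discarded) index lists, keeping the `ecut` fraction
    of entries with the lowest enthalpy."""
    order = sorted(range(len(enthalpies)), key=lambda i: enthalpies[i])
    n_keep = int(round(ecut * len(enthalpies)))   # rounding rule is an assumption
    return order[:n_keep], order[n_keep:]

# Toy subset of 250 distinct enthalpies (eV/atom) in scrambled order
enthalpies = [-3.30 + 0.01 * ((7 * i) % 250) for i in range(250)]
kept, discarded = select_by_ecut(enthalpies, 0.90)
print(len(kept), len(discarded))
```

For a subset of 250 structures this keeps 225, the same ratio as in the ./DATA/100/ row of the parsing output shown on the next slide.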

Page 19: M A I S E - Bingweb

PARS/e*

$ maise

dir poscars Emin Emax Ecut path

0 113 112 226 -1.453728145000 3.385686635000 3.385686635000 ./DATA/000/

1 250 225 225 -3.308694100000 0.009592215000 -2.603072180966 ./DATA/100/

2 61 60 122 -3.070987127500 2.247194970000 2.247194970000 ./DATA/200/

Total 424 POSCAR.0 files are found in ./DATA/*

Structures marked for: TRAIN= 0 TEST= 0 TRAIN+TEST= 397 DISCARD= 27

Successfully parsed 397 POSCAR.0 out of total 424 structures!

setup DATA/ basis

JOBT 30 job type: data parsing (30)

NPAR 8 number of cores for parsing

NSPC 2 number of species types

TSPC 29 47 species types

NSYM 30 number of Behler-Parrinello symmetry functions

NCMP 82 total number of NN inputs

TEFS 1 train NN for: (0) E; (1) EF;

FMRK 0.5 fraction of atoms used for EF training

ECUT 0.90 parse only this fraction of lowest-energy structures (from 0 to 1)

EMAX 5.0 maximum energy from the lowest-energy structure that is parsed

FMAX 50.0 do not parse data with larger force

RAND 5 seed for random number generator

DEPO ./DATA path to the DFT dataset to be parsed

DATA ./PARS location to write the parsed data

PARS/index.dat PARS/stamp.dat PARS/ve.dat PARS/RDFP.dat

e000000 ./DATA/000/109/

e000001 ./DATA/000/066/

e000002 ./DATA/000/091/

e000003 ./DATA/000/069/

e000004 ./DATA/000/048/

e000005 ./DATA/000/141/

e000006 ./DATA/000/057/

e000007 ./DATA/200/38/

e000008 ./DATA/000/084/

[...]

Parsing the DFT-based data for construction of the training dataset of a neural network model.

maise/04-prs/

Reference Data Parsing

Parsing produces a set of e* files that contain the parsed information for each structure.

The index.dat file contains the list of e* files and the path to the corresponding structure.

A summary of the parsing task is provided in the stamp.dat file. Volume-energy data and the average RDF for the dataset are provided in the ve.dat and RDFP.dat files.

Behler-Parrinello symmetry functions can be customized in the basis file: number, types, parameters, Rcut.

Page 20: M A I S E - Bingweb

$ maise

Loading list of parsed data from ./PARS/index.dat

Total number of parameters: 1902

BFGS2 relaxation: 1040 adjustable parameters

1 1.1180920333757309 0.779057 5.034344 0.938831 7.022223

2 0.6960607178739942 0.623539 1.941889 0.660513 2.726337

[...]

Test error ENE FRC TOT 0.027726 0.180121 0.026405 39 102

setup

JOBT 41 training type: full training (40); stratified training (41)

NPAR 16 number of cores for parallel NN training

MINT 0 gsl minimizer type: (0) BFGS2; (1) CG-FR; (2) CG-PR; (3) steepest descent;

MITR 100 maximum N for NN training or cell optimization steps

ETOL 1e-6 error tolerance for training

NSPC 2 number of species types

TSPC 29 47 species types

NSYM 30 number of Behler-Parrinello symmetry functions

NCMP 82 total number of NN inputs

TEFS 1 train NN for: (0) E; (1) EF;

LREG 1e-8 regularization parameter

NTRN -90 number of structures for training (negative means percentage)

NTST -10 number of structures for testing (negative means percentage)

NNNN 2 number of hidden layers in MLP

NNNU 10 10 number of neurons in hidden layers in MLP

Training neural network models. The output includes model, err-out.dat (optimization steps), and energy and force errors for the testing set (err-ene.dat and err-frc.dat).
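The NTRN and NTST convention in the setup (negative values mean a percentage) can be made concrete with a short Python sketch. The function name is hypothetical and truncation toward zero is an assumption; it is used here because it reproduces the 357 training and 39 testing energy entries listed in the model summary if the 397 structures parsed on the previous slide form the dataset.

```python
def resolve_count(value, n_total):
    """Interpret a setup value: negative = percentage of n_total, positive = count."""
    if value < 0:
        return int(-value * n_total / 100)   # truncation is an assumption
    return int(value)

n_parsed = 397                      # structures parsed on the previous slide
n_train = resolve_count(-90, n_parsed)
n_test = resolve_count(-10, n_parsed)
print(n_train, n_test)
```

A positive setting such as NTRN 300 would simply be read as 300 structures.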

maise/05-trn/

NN Model Construction: Training

NNET/model NNET/err-out.dat NNET/err-ene.dat NNET/err-frc.dat

-----------------------------------------------------------------------------

| number of species | 2 |

| species names | Cu Ag |

| architecture | 82 10 10 1 |

| number of weights | 1902 |

-----------------------------------------------------------------------------

| train energy data | 357 |

| test energy data | 39 |

| train force data | 906 |

| test force data | 102 |

-----------------------------------------------------------------------------

| number of epochs | 100 |

| train energy error | 0.030774 eV/atom |

| test Energy error | 0.027726 eV/atom |

| train force error | 0.180638 eV/Ang |

| test Force error | 0.180121 eV/Ang |

-----------------------------------------------------------------------------

Page 21: M A I S E - Bingweb

NN Model Construction: Training

NN accuracy shows little dependence on

number of layers (1 or 2)
number of neurons (20-40)
weight initialization (random or restart)

Typical E&F training time for 100,000 steps

elements   weights   data     CPU hours
unary      641       38,000   300
binary     1,880     42,000   900
ternary    1,290     55,000   1,500

Page 22: M A I S E - Bingweb

Stratified Construction of NN Models for Compounds

Standard fitting procedure

Cu data + Pd data + CuPd data → NN CuPd model (CuCu, CuPd, PdPd, PdCu weights fitted together)

Binary data affects elemental weights

Unphysical change in subsystem description: for a Cu structure, E(Cu) from the NN Cu model ≠ E(Cu) from the NN CuPd model

Stratified fitting procedure

Cu data → NN Cu model; Pd data → NN Pd model; CuPd data → binary NN CuPd model

Elemental weights in the binary NN are fixed

Consistent unchanged subsystem description: for a Cu structure, E(Cu) from the NN Cu model = E(Cu) from the NN CuPd model

S. Hajinazar, J. Shao, and A.N. Kolmogorov, PRB 95, 014114 (2017)
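The key mechanical step of stratified fitting, keeping the elemental weights fixed while the interspecies weights are adjusted, can be sketched in Python as a masked parameter update. This is a schematic illustration only: plain gradient descent stands in for the BFGS-type minimizers listed in the setup, each weight block is reduced to a single number, and the names and values are hypothetical.

```python
def gradient_step(weights, grads, frozen, lr=0.1):
    """One gradient-descent update that leaves the frozen weight blocks untouched."""
    return {name: (w if name in frozen else w - lr * grads[name])
            for name, w in weights.items()}

# Weight blocks of a binary CuPd network (one number per block for brevity)
weights = {"CuCu": 0.5, "PdPd": -0.2, "CuPd": 0.0, "PdCu": 0.0}
grads = {"CuCu": 1.0, "PdPd": 1.0, "CuPd": 1.0, "PdCu": -1.0}

# Stratified training: elemental blocks come from the Cu and Pd models and stay fixed
frozen = {"CuCu", "PdPd"}
updated = gradient_step(weights, grads, frozen)
print(updated)
```

Since the CuCu and PdPd blocks never change, a pure Cu structure evaluated with the binary model sees the same weights as in the elemental Cu model, which is the consistency property stated on the slide.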

Page 23: M A I S E - Bingweb

Stratified Construction of NN Models for Alloys

S. Hajinazar, J. Shao, and A.N. Kolmogorov, PRB 95, 014114 (2017); S. Hajinazar et al., arxiv:2005.12131 (2020)

Stratified training procedure

has been previously used for tight-binding models and traditional potentials

allows for proper description of screening, charge transfer, etc. in multielement systems

is most natural for compounds with elements similar in size, electronegativity, etc.

speeds up training due to division of adjustable parameters

enables construction of reusable libraries of NN models

has been tested on several metals (e.g., Cu-Pd-Ag, Mg-Ca, etc.)

has been recently generalized for an arbitrary number of species

NN library built with MAISE-NET

Stratified training from the bottom up

Page 24: M A I S E - Bingweb

Are NNs better than traditional potentials for guiding ab initio search?

Structure Search Acceleration with Pre-trained NNs

Perform the search with a coarse but fast method, then examine a small low-E pool (~ΔE) with an accurate method

EAM, GP, SC: ~ 30 meV/atom; NN: ~ 10 meV/atom; NN+: ~ 5 meV/atom

Cost of Au50 calculation, GP : NN : DFT = 1 : 10^2 : 10^6

[Figure: CPU time of the build, search, and check stages for the DFT, NN, NN+, and EAM/GP/SC workflows]

A. Thorn, J. Rojas-Nunez, S. Hajinazar, S.E. Baltazar, and A.N. Kolmogorov, JPCC (2019)

Page 25: M A I S E - Bingweb

MAISE present features

Summary

space group solver: SPGLIB interface, https://atztogo.github.io/spglib/
fingerprint: structure comparison with RDF, PRL 105, 217003 (2010)
global optimization: DFT evolutionary search, PRL 109, 075501 (2012), etc.
NN data generation: Python MAISE-NET script, PRB 95, 014114 (2017); arXiv:2005.12131
NN development: stratified training, PRB 95, 014114 (2017)
NN application: structure prediction, PCCP 20, 27545 (2018); JPCC 21, 8729 (2019)

MAISE development information

MAISE 2.7: OpenMP parallelized C code (14364 lines)
MAISE-NET 2.0: Python script (3889 lines)
2009-present: A. Kolmogorov, S. Hajinazar, E. Sandoval, A. Thorn, S. Kharabadze

MAISE resources

open source code: https://github.com/maise-guide
NN model library: https://github.com/maise-guide/maise/tree/master/models
forum: http://harvey0.binghamton.edu/~akolmogo/forum
wiki documentation: http://maise.binghamton.edu/wiki
contact: A. Kolmogorov, for development of new simulation features and NN models for alloys

Transdisciplinary Areas of Excellence

DMR 1410514, DMR 1821815

Page 26: M A I S E - Bingweb

Generalized Stratified Training of NN for Compounds E1