Chemistry space-time

David A. Winkler a,b,c,*

a CSIRO Manufacturing, Bag 10, Clayton South 3169, Australia
b Monash Institute of Pharmaceutical Sciences, 390 Royal Parade, Parkville 3052, Australia
c Latrobe Institute for Molecular Science, Bundoora, Australia

Perspectives in Science (2015) 6, 2-14. Available online at www.sciencedirect.com (ScienceDirect). Journal homepage: www.elsevier.com/pisc

Received 19 December 2014; accepted 20 August 2015. Available online 23 October 2015.

KEYWORDS: Primordial chemistry; Radioastronomical spectroscopy; Amino acids; Materials space; Evolutionary algorithms; High throughput robotic synthesis

Abstract As Einstein identified so clearly, space and time are intimately related. We discuss the relationship between time and Euclidean space using spectroscopic and radioastronomical studies of interstellar chemistry as an example. Given the finite speed of light, we are clearly studying chemical reactions occurring tens of thousands of years ago that may elucidate the primordial chemistry of this planet several billion years ago. We also explore space of a different kind, chemical space, with many more dimensions than the four we associate with space-time. Vast chemical spaces also need very efficient (computational) methods for their exploration to overcome this ‘curse of dimensionality’. We discuss methods by which the time to explore these new spaces can be very substantially reduced, opening up the discovery of useful new materials that are the key to our future.

© 2015 The Author. Published by Elsevier GmbH. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Contents
The relationship between space and time
Interstellar space, chemistry, and relationship to time
Chemistry space
Exploring spaces faster: robotic and other accelerated synthesis and characterisation methods
Materials evolution
Representing the ‘materials genome’
Evolving white light LED phosphors
Where to next? Automatic ‘closed loop’ systems
Conflict of interest
Acknowledgements
References

This article is part of a special issue entitled ‘‘Proceedings of the Beilstein Bozen Symposium 2014: Chemistry and Time’’. Copyright by Beilstein-Institut www.beilstein-institut.de.
* Correspondence to: CSIRO Manufacturing, Bag 10, Clayton South 3169, Australia. E-mail address: [email protected]
http://dx.doi.org/10.1016/j.pisc.2015.10.002
2213-0209


I wasted time, and now doth time waste me;
For now hath time made me his numbering clock;
My thoughts are minutes.
William Shakespeare

The relationship between space and time

‘‘People like us, who believe in physics, know that the distinction between past, present, and future is only a stubbornly persistent illusion.’’ Albert Einstein

Most physical scientists are aware of the intimate relationship between space and time. Einstein quantified and clarified this relationship in the theories of General and Special Relativity. Space-time is a mathematical model that unites space and time into a single interwoven continuum. It combines space and time into a single manifold called Minkowski space, as opposed to the commonly experienced Euclidean space (Fig. 1).

In cosmology, the concept of space-time combines space and time into a single abstract universe. Mathematically it is a manifold consisting of ‘‘events’’ which are described by some type of coordinate system. Typically three spatial dimensions (length, width, height) and one temporal dimension (time) are required.

Figure 1 Albert Einstein.


General and Special Relativity show that space, time, and gravity are interlinked, and that gravity can warp space and distort time. The consequence of large spaces and the finite speed of light is long times. Astronomical observation necessitates seeing things far away and long ago.

Here we discuss the relationship between chemistry, time, and two kinds of ‘space’: interstellar (Euclidean) space, and chemical or materials space. In interstellar space we will discover primordial chemistry that is the key to our past, and in materials space we will discover new chemistry that is the key to our future.

Interstellar space, chemistry, and relationship to time

The Universe is approximately 13.5 billion years old, evolving steadily since the Big Bang. It is a very dynamic space with many very unusual non-equilibrium systems within it. Interstellar dust and gas clouds are the birthplaces of new stars, and are also where very interesting primordial chemistry occurs. Several of these interstellar gas clouds are in our Milky Way galaxy, and have been the subject of intense spectroscopic analysis. Infrared (vibrational) and, in particular, microwave (rotational) spectroscopy have been invaluable for identifying the types of molecules and chemistry that occur within these clouds. These methods can also be useful for determining the amount (column density) of small molecules present, and the temperature, velocity, and magnetic fields (Brown et al., 1980a) in the clouds. Molecular line radioastronomy has generated much information on the chemistry that occurs within interstellar clouds (Kroto, 1980) (Fig. 2).

The two most interesting and accessible molecular clouds in our galaxy are Sagittarius B2 (Sgr B2) near the galactic centre, and the Orion Nebula (M42) on an outer spiral arm of the galaxy. Sagittarius B2 is a giant molecular cloud of gas and dust that is located about 120 parsecs (390 light years (ly)) from the centre of the Milky Way and 27,000 ly from Earth. It is the largest molecular cloud near the centre, with a diameter of ∼45 parsecs (150 ly) and a mass of ∼3 million solar masses. The mean hydrogen density in the cloud is 3000 atoms per cm³, ∼20-40 times denser than typical molecular clouds (Fig. 3).

The Orion Nebula (M42) is located at a distance of 1344 ± 20 light years from the Earth and is the closest region of massive star formation to Earth. The M42 nebula is estimated to be 24 light years across. It has a mass of ∼2000 solar masses.

Molecule densities are very low (10⁶ hydrogen molecules/cm³ in very dense clouds) with average kinetic temperatures of 10-100 K, suggesting mean free paths of between 10⁹ and 10¹⁵ cm. The time between collisions (10⁴ to 10¹⁰ s) is of the order of relaxation times for rotationally excited molecules. The chemical processes that generate interstellar molecules are efficient over large regions of the clouds and are intimately related to the presence of both grains and stars enveloped in the interstellar clouds (Snyder, 1997).

Figure 2 The large molecular cloud in the constellation of Sagittarius B2 (SgrB2) near the galactic centre.

Figure 3 The Orion Molecular Cloud (OMC) in the outer arm of the Milky Way.

In very dense clouds, grains provide catalytic surfaces that enhance the production of larger polyatomic molecules, whereas there is ample evidence that gas-phase chemistry is the predominant mechanism for formation of smaller molecules in more tenuous regions (Giri et al., 2013) (Fig. 4). Very low barrier or barrier-less reactions are required at 10 K, generally radical or ion-molecule reactions (Snyder, 1997). In Sgr B2, methanimine, CH2=NH, has a rotational temperature of 45 K and column density of ∼10¹⁵/cm².

Figure 4 The formation of interstellar dust grains, the putative sites of many organic reactions that generate small molecules and possibly amino acids.

The interstellar gas clouds where reactions still occur under these rather extreme conditions are thought to be the birthplaces of primordial small molecules that are the precursors of some of the ‘molecules of life’ as we now know them. The simplest of these are the amino acids. While the exact nature of the interstellar chemistry is still somewhat uncertain, several researchers have proposed mechanisms by which the simplest amino acids, glycine and alanine, may be generated in cold clouds a few degrees above absolute zero (Kuan et al., 2004).

Figure 5 Interstellar methanol maser lines showing the narrowness of the lines and the high measurement precision that is possible.

Molecular line radioastronomy studies the types of chemistry occurring in the interstellar media by measuring the radio frequencies emitted by small molecules. These emission lines, which occur in the microwave region between 1 and 100 GHz, are very narrow and their frequencies can be identified with great accuracy (better than 1 part in 10⁹) after correcting for the Doppler shift of the emitting source relative to the Earth (Fig. 5). These frequencies can be measured in the laboratory using microwave spectrometers to assign the rotational spectra of candidate molecules. Many of these molecules are unstable under terrestrial conditions, and exist for only short periods of time in the laboratory, making measurement of their microwave spectra challenging.
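The Doppler correction mentioned above can be illustrated with a short calculation. This is a minimal sketch, assuming the standard relativistic Doppler formula for motion along the line of sight; the 100 GHz line and the 60 km/s source velocity are hypothetical values chosen only for illustration, and the function names are not taken from the original work.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s (exact SI value)


def observed_frequency(rest_ghz: float, radial_velocity_km_s: float) -> float:
    """Frequency seen by the observer for a source moving along the line of sight.

    A positive velocity means the source is receding (red shift).
    """
    beta = radial_velocity_km_s / C_KM_S
    return rest_ghz * math.sqrt((1.0 - beta) / (1.0 + beta))


def rest_frequency(observed_ghz: float, radial_velocity_km_s: float) -> float:
    """Invert the relativistic Doppler formula to recover the rest frequency."""
    beta = radial_velocity_km_s / C_KM_S
    return observed_ghz * math.sqrt((1.0 + beta) / (1.0 - beta))


if __name__ == "__main__":
    rest = 100.0     # GHz, hypothetical laboratory line
    velocity = 60.0  # km/s, hypothetical source velocity
    seen = observed_frequency(rest, velocity)
    print(f"observed frequency: {seen:.6f} GHz")
    print(f"fractional shift:   {(rest - seen) / rest:.2e}")
    print(f"recovered rest:     {rest_frequency(seen, velocity):.6f} GHz")
```

For these assumed values the fractional shift is roughly 2 parts in 10⁴, several orders of magnitude larger than the measurement precision quoted above, which is why the source velocity must be known before an astronomical line can be matched to a laboratory frequency.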

At least 180 molecules have been confirmed in the interstellar medium, with a further 10 unconfirmed, including glycine and graphene. Table 1 shows examples of molecules discovered in several molecular clouds, especially those in the Galactic Centre and the Orion Nebula.

Table 1 Examples of small molecules identified in the interstellar medium.

Molecule    Designation           Mass
C5          Linear C5             60
NH4+        Ammonium ion          18
CH4         Methane               16
CH3O        Methoxy radical       31
c-C3H2      Cyclopropenylidene    38
l-H2C3      Propadienylidene      38
H2CCN       Cyanomethyl           40
H2C2O       Ketene                42
H2CNH       Methylenimine         29
HNCNH       Carbodiimide          42
C4H         Butadiynyl            49
HC3N        Cyanoacetylene        51
HCC-NC      Isocyanoacetylene     51
HCOOH       Formic acid           46
NH2CN       Cyanamide             42
HC(O)CN     Cyanoformaldehyde     55


We studied two small molecules proposed to be the precursors of glycine and alanine in the interstellar environment, CH2=NH (methanimine) and CH3CH=NH (ethanimine) (Brown et al., 1980b, 1982). Putative chemical reactions that could lead to the formation of amino acids are shown in Scheme 1. The Bernstein Strecker mechanism relies on the interaction of aldehydes with amines and nitriles (Bernstein et al., 2002). The four other schemes, however, proceed from simple imines that are generated by reduction of the analogous nitriles in the case of the Woon radical-radical (Woon, 2002) and Elsila modified nitrile mechanisms (Elsila et al., 2007). These nitriles are known to be abundant in the interstellar medium. The mechanisms differ in that they either further reduce the imines to amines (Woon, 2002), react with nitriles to generate aminoacetonitriles (Elsila et al., 2007), or react with another abundant interstellar molecule, CO, to generate a transient cyclic aziridinone whose ring opens.

Methanimine, one of the early molecules to be discovered in the interstellar media, had been studied by microwave spectroscopy by Johnson and Lovas (1972). Its methyl analogue, ethanimine, had its microwave spectrum assigned more recently by Brown et al. (1980b). It can be formed by high temperature pyrolysis of a number of precursors (Scheme 2).

We assigned the rotational spectrum of the Z isomer of ethanimine and the hyperfine rotational spectrum of methanimine (e.g., Fig. 6) (Brown et al., 1980b, 1982). Subsequently, the rotational spectrum of the E isomer was also published (Lovas et al., 1980).

Very recently, ethanimine was unambiguously identified in SgrB2 in the galactic centre (Loomis et al., 2013). Fig. 6 shows part of the laboratory spectrum of Z-ethanimine, and Fig. 7 shows the same line observed in SgrB2 using the Green Bank radio telescope.

Several unsuccessful searches for interstellar glycine have been conducted by our group (Brown et al., 1979) and others. A potential positive detection was announced by Kuan et al. (2003) but could not be confirmed by subsequent observations by Snyder et al. (2005).

In summary, the coupling of chemistry with astronomy allows us to begin to understand the building blocks of life and the primordial chemistry that is the key to our past, and allows us to study the chemistry occurring thousands to millions of years ago.

Chemistry space

Chemistry also allows us to design molecules that will be essential for our future. Most chemists have begun to appreciate how large chemical spaces are. It has been roughly estimated that the number of drug-like molecules that obey the laws of chemistry may be as high as 10⁸⁰. Recent estimates of the size of materials composition space are even larger, around 10¹⁰⁰ (Martin, 2012; Shoichet, 2013; Polishchuk et al., 2013). As with the Cartesian space that we are very familiar with, larger spaces need technologies that can explore them more quickly to keep the search times tractable. Clearly, chemical or materials spaces and time are intimately related, and we must find much more efficient ways of exploring these spaces to find dramatically different new molecules or materials in timescales compatible with human lifetimes and technology development time frames.

Scheme 1 Several reactions proposed for the formation of simple amino acids in the interstellar media, mainly on the surface of interstellar grains.

Although accelerated synthesis and characterisation methods employing advanced robotics, combinatorial chemistry, etc. can greatly accelerate our exploration of chemical and materials spaces, we clearly cannot explore more than a tiny fraction of spaces as large as 10¹⁰⁰ even using the most optimistic projections. Computational methods are required to do this, as well as to manage and interpret the vast data sets that accelerated experimental methods are now beginning to generate.

This ‘curse of dimensionality’ gets worse when the experimental variables required to synthesise materials are also taken into account. Most materials consist of multiple components and their synthesis requires multiple reagents and conditions. This generates an ‘experimental space’ that can be vast when the number of components and experimental variables rises above 5-6. The number of experiments required to exhaustively locate the optimum conditions and components to synthesise new materials increases exponentially with the number of these components and conditions.

Figure 6 The 2₁₂-1₁₁ rotational transition of Z-ethanimine. The line is a multiplet because of the effects of quadrupole coupling interactions of the nitrogen atom. From Winkler, D.A. PhD thesis, Monash University, 1980.

Figure 7 The observed rotational lines of Z-ethanimine with the relevant source Doppler shifts marked (left). The measurements were made on the Green Bank radio telescope (right) (Wiki Commons).

For example, if a material has 7 components and 3 synthesis conditions, and each of these varies over a range of 10 values, the number of experiments required to explore this space completely is 10¹⁰. Much faster experimental methods, coupled to advanced informatics and computational chemistry methods, are required to explore large materials spaces.
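The arithmetic behind this example is easy to reproduce. The sketch below assumes a full-factorial grid (every combination of levels is tested once); the throughput of 10,000 experiments per day is the upper figure quoted for the RAMP facility in the caption of Fig. 8, and the function names are illustrative only.

```python
def exhaustive_experiments(n_variables: int, levels_per_variable: int) -> int:
    """Size of a full-factorial grid: every level of every variable combined."""
    return levels_per_variable ** n_variables


def years_to_complete(n_experiments: int, experiments_per_day: float) -> float:
    """Elapsed time in years at a constant daily throughput."""
    return n_experiments / experiments_per_day / 365.25


if __name__ == "__main__":
    # 7 components + 3 synthesis conditions, 10 values each
    total = exhaustive_experiments(7 + 3, 10)
    print(f"experiments needed: {total:.1e}")
    print(f"years at 10,000 experiments/day: {years_to_complete(total, 10_000):.0f}")
```

Even at that throughput the full grid would take on the order of thousands of years, and each additional variable multiplies the total by another factor of 10.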

The size of chemistry and materials spaces can be tackled in several ways:

• Combinatorial and other high-throughput synthesis and screening approaches.

• Fragment-based methods, which use small chemical moieties to explore chemical space constrained, for example, by a binding site in a protein.

• De novo molecular design, which again uses a constrained chemical space defined by a protein ligand binding site.

• Design of experiments, diversity libraries, and models. These use mathematical methods to explore spaces sparsely, and models to ‘fill in’ the missing regions of space.

• Supramolecular approaches that rely on weak covalent bonds forming large dynamic libraries of molecules that are ‘selected’ by appropriate binding sites.

• Evolutionary approaches that ‘evolve’ materials toward desired fitness functions using genetic mutation methods (see below).

Scheme 2 Two methods for pyrolytic synthesis of methanimine.

Although this list is biased toward small molecules with biological properties, some of the methods apply to materials as well. Some of them allow large spaces to be explored more quickly via virtual screening, others allow these spaces to be explored sparsely with models being used to interpolate between the experimental points, and others constrain the space using geometric constraints (e.g., ability to bind into a protein pocket) or exploit evolutionary methods to find molecules with locally optimal properties. Some of the above methods of tackling the size of materials space will be illustrated further by examples, based largely on materials.

Exploring spaces faster: robotic and other accelerated synthesis and characterisation methods

Materials science is now beginning to leverage some of the high-throughput methods developed by the pharmaceutical industries more than two decades ago. Robotic synthesis and characterisation facilities like the RAMP Centre in CSIRO Manufacturing Flagship (http://www.csiro.au/en/Research/MF/Areas/Chemicals-and-fibres/RAMP) are now being built and used widely in diverse materials design and discovery projects. High throughput synchrotron beamlines are increasingly being used to accelerate materials characterisation (Fig. 8).

High throughput experimentation needs to be matched by efficient informatics and data modelling methods. Developments in statistics and mathematics have provided these excellent modelling tools. We use Bayesian regularisation of neural networks to optimise model complexity, together with sparse feature selection methods employing sparse (Laplacian) priors (Burden et al., 2000, 2009; Burden and Winkler, 1999a,b, 2009a,b; Winkler and Burden, 2000). These provide models relating materials structure and processing conditions to their properties that have optimum predictive power. They also allow chemical or materials spaces to be explored virtually. The application of these methods to materials has been comprehensively reviewed recently, and a diverse range of large materials data sets have been successfully modelled and used to predict the properties of new materials (Le et al., 2012).

Figure 8 The RAMP facility at CSIRO carries out 100-10,000 experiments per day to rapidly produce, characterise and test diverse materials such as polymers, metal-organic frameworks, nanoparticles and small molecules (left). Its high-speed catalysis rig (centre) conducts 10-1000 scaled down but plant-like experiments per day in multiplexed reactors. The high throughput SAXS beamline at the Australian synchrotron (right) can carry out 5000 experiments per day.

Materials evolution

While accelerated synthesis and assessment methods will greatly increase the productivity of materials discovery, computational methods for materials evolution are also required. These techniques have the advantage of exploring very large spaces more efficiently than other computational methods. They embody a computational analogue of natural selection, first identified by Darwin, which is the dominant paradigm in biology. Interestingly, Charles Darwin was related to one of the first materials scientists, Josiah Wedgwood (ceramics), by marriage to his cousin, Emma Wedgwood (Fig. 9).

Figure 9 Charles Darwin (left) and Josiah Wedgwood (right).

Mathematical analogies of genetic processes have been studied widely, and genetic algorithms and related evolutionary methods have been shown to be very effective at searching vast multidimensional spaces. Paradoxically, evolutionary methods have not yet been widely employed in materials discovery. One could speculate that the reason may be that materials science has only recently adopted the high throughput synthesis and characterisation methods that were employed in the pharmaceutical industries more than two decades ago. We predict such methods will grow rapidly in prominence in the materials community.

Representing the ‘materials genome’

The components in a structure (or material) can be represented mathematically in a myriad of ways, e.g., as a binary string such as 00010100010101000101010011110100, where 0 = fragment not present in structure and 1 = fragment present in structure (perhaps multiple times). Two types of binary string (also called a fingerprint or, more recently, a genome) are used for discrete molecules (similar forms can be written for more complex materials):

• Structural keys: a fixed dictionary of fragments where there is a 1:1 relationship between the bit in the string and the fragment. These suffer from the disadvantage that structures may contain fragments that are not in the dictionary.

• Hashed fingerprints: the fragment description (e.g., C-C-N-C-O) is hashed to a bit position, e.g., between 1 and 1024, and this bit is set. This method can suffer from collisions, where two different fragments map to the same bit position, so that a set bit can no longer be traced back to a unique fragment.

Table 2 Examples of the types of variable making up a materials ‘genome’, and typical fitness functions used in materials evolution studies. Adapted from Moore et al. (2011).

Goals: Synthesis efficiency; Functional properties
Variables: Reagents; Catalysis; Processing conditions; Solvents or host lattices; Molecules: scaffolds, moieties, chemometrics; Materials: atomic/molecular composition, host lattice
Objectives: Rate of products formation; Product yield; Enantiomeric excess; Catalytic activity; Electromagnetic properties; Mechanical properties; Thermodynamic properties; Molecular binding constants

For materials that are more complex and whose properties may depend on the method of manufacture or subsequent processing steps:

• Compositional vectors: vectors of real numbers encoding the composition of the material.

• Structural strings: similar to the above examples for discrete molecules, but harder to achieve for complex materials like polymers.

• Process vectors: real number strings representing processing parameters.
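The two binary-string encodings described above for discrete molecules (structural keys and hashed fingerprints) can be sketched in a few lines. This is an illustrative sketch only: the fragment dictionary, the fragment labels, and the use of SHA-256 as the hash are assumptions made for the example, not the scheme used by any particular cheminformatics package, and bit positions are 0-based here rather than 1 to 1024.

```python
import hashlib

# Hypothetical fixed dictionary for the structural-key encoding.
FRAGMENT_DICTIONARY = ["C-C", "C-N", "C-O", "C=O", "N-H"]


def structural_key(fragments, dictionary=FRAGMENT_DICTIONARY):
    """One bit per dictionary entry; fragments outside the dictionary are lost."""
    present = set(fragments)
    return [1 if frag in present else 0 for frag in dictionary]


def bit_position(fragment, n_bits):
    """Deterministically hash a fragment label to a 0-based bit index."""
    digest = hashlib.sha256(fragment.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_bits


def hashed_fingerprint(fragments, n_bits=1024):
    """Set the bit that each fragment hashes to; no dictionary is needed."""
    bits = [0] * n_bits
    for frag in fragments:
        bits[bit_position(frag, n_bits)] = 1
    return bits


def collisions(fragments, n_bits):
    """Return {bit index: [fragments]} for every bit shared by 2+ fragments."""
    by_bit = {}
    for frag in set(fragments):
        by_bit.setdefault(bit_position(frag, n_bits), []).append(frag)
    return {pos: sorted(frags) for pos, frags in by_bit.items() if len(frags) > 1}


if __name__ == "__main__":
    molecule = ["C-C", "C-N", "C-S"]            # "C-S" is not in the dictionary
    print(structural_key(molecule))              # the C-S fragment is silently dropped
    print(sum(hashed_fingerprint(molecule)))     # number of bits set
    print(collisions(molecule, n_bits=2))        # tiny bit string forces a collision
```

The example shows both weaknesses named in the text: the structural key cannot represent a fragment missing from its dictionary, and shrinking the hashed fingerprint makes distinct fragments share a bit.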

Materials evolution follows a process akin to ‘survival of the fittest’ or natural selection. An initial population of candidate molecules or materials is generated and encoded by a genome. A ‘fitness function’, a desirable property of the molecule or material that needs to be optimised and that can be measured by some assay or test, is defined and used to evaluate the initial pool of candidates. Examples of fitness functions and variables used in molecule or materials genomes that have been employed in published studies are summarised in Table 2.

The best or ‘fittest’ members of the population are used to define the next generation through mutation operators or by being carried forward (elitism). This new population is then tested against the fitness function and ranked again, and the best members are used to define a subsequent pool of ever fitter molecules or materials (Fig. 10) (Zhou et al., 2013).

Figure 10 The materials evolution cycle. Adapted from Zhou et al. (2013).
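The cycle described above (generate, evaluate, rank, carry the best forward, mutate) can be sketched as a small loop. This is a toy illustration under stated assumptions, not the procedure of any cited study: the genome is a short bit string, the fitness function simply counts matches with a hypothetical target pattern in place of a real assay, and only single-bit mutation is used to create new candidates.

```python
import random

TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]  # hypothetical 'ideal' genome


def toy_fitness(genome):
    """Stand-in for an assay: number of positions that match TARGET."""
    return sum(1 for g, t in zip(genome, TARGET) if g == t)


def evolve(fitness, genome_length, pop_size=20, n_elite=4, generations=30, seed=0):
    """Run the evaluate / rank / elitism / mutate cycle on bit-string genomes.

    Returns the best genome found and the best fitness seen in each generation.
    """
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)  # test and rank
        history.append(fitness(ranked[0]))
        elite = ranked[:n_elite]                  # carried forward unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            child = list(rng.choice(elite))       # copy a fit parent
            i = rng.randrange(genome_length)
            child[i] = 1 - child[i]               # single-bit mutation
            children.append(child)
        population = elite + children
    best = max(population, key=fitness)
    return best, history


if __name__ == "__main__":
    best, history = evolve(toy_fitness, len(TARGET))
    print("best fitness per generation:", history)
    print("best genome:", best)
```

Because the elite members are copied unchanged into the next generation, the best fitness recorded in `history` can never decrease from one generation to the next.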

The most common mutation operators that operate on the molecule or material genome are single point mutations and crossover mutations (Fig. 11). Single point mutations flip or alter a single element of the genome, e.g., inverting a binary bit, or perturbing the value of a single element of a real value vector. This operation effectively conducts a local search of molecular or materials space in the vicinity of the current pool member. This is useful for optimising materials without moving too far from existing areas of promise in the materials space. Crossover mutations define a split point in the genomes of two molecules or materials, and the relevant fragments that result from splitting are reassembled in a new combination. This allows materials in a population to jump into a distant area of materials space, potentially identifying an interesting new area of compositional or processing space.
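The two operators described above can be written down directly. The following is a minimal sketch assuming genomes are stored as Python lists; the Gaussian perturbation and its width `sigma` are illustrative choices for the real-valued case, not parameters given in the text.

```python
import random


def point_mutation_bits(genome, rng):
    """Flip one randomly chosen bit of a binary genome."""
    child = list(genome)
    i = rng.randrange(len(child))
    child[i] = 1 - child[i]
    return child


def point_mutation_real(genome, rng, sigma=0.05):
    """Perturb one randomly chosen element of a real-valued genome."""
    child = list(genome)
    i = rng.randrange(len(child))
    child[i] += rng.gauss(0.0, sigma)  # assumed Gaussian step of width sigma
    return child


def crossover(parent_a, parent_b, rng):
    """Split two equal-length genomes at one point and swap the tails."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    split = rng.randrange(1, len(parent_a))  # never at the very ends
    child_1 = list(parent_a[:split]) + list(parent_b[split:])
    child_2 = list(parent_b[:split]) + list(parent_a[split:])
    return child_1, child_2


if __name__ == "__main__":
    rng = random.Random(1)
    print(point_mutation_bits([0, 0, 0, 0, 0, 0], rng))
    print(point_mutation_real([0.2, 0.5, 0.3], rng))
    print(crossover([0] * 8, [1] * 8, rng))
```

A point mutation leaves the child one step away from its parent (a local search), whereas crossover between two dissimilar parents can produce children far from both, which is the long-range jump described in the text.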

Materials property landscapes are multidimensional andomplex. There is a concern that the landscape may con-ain many local minima or optima. This is true of biological

ls evolution cycle.

Figure 11 Two common mutation operators: single point mutations in the materials genome (left), and crossover, where two genes are split at an arbitrary place and the fragments swapped (right).

Figure 12 Materials property landscapes containing local and global optima. The local optimum ‘traps’ make the materials hard to optimise (left) while those that are trap-free are easier to optimise (Rabitz, 2012).

Table 3 Examples of evolution of materials experiments from the literature.

| Ref. number | Number of control variables | Objective | Number of experiments to reach optimal outcome | Number of possible control sample points |
|---|---|---|---|---|
| 20 | 6 | Binding to stromelysin | 300 | 6.4 × 10^7 |
| 21 | 8 | Propane → propene | 328 | NA |
| 22 | 4 | Inhibition of thrombin | 400 | 1.6 × 10^5 |
| 23 | 8 | Propane → CO | 150 | NA |
| 24 | 8 | Propane → propene | 280 | NA |
| 25 | 13 | Propane → propene | 60 | NA |
| 26 | 23 | NH3 + CH4 → NCN | 644 | NA |
| 27 | 9 | CO → CO2 | 189 | NA |
| 28 | 4 | CO + CO2 + H2 → CH3OH | 115 | 2.7 × 10^9 |
| 29 | 5 | 3CO + 3H2 + H2 → CH3OH | 160 | 2.4 × 10^11 |
| 30 | 6 | CO + CO2 + H2 → CH3OH | 235 | 4.7 × 10^9 |
| 31 | 10 | n-Pentane isomerization | 72 | 1.44 × 10^4 |
| 32 | 7 | Propane → propene | 80 | NA |
| 33 | 8 | Isobutane → methacrolein | 90 | 10^9 |
| 34 | 8 | Membrane permeability | 192 | 9 × 10^21 |
| 35 | 4 | Cyclohexene epoxidation | 114 | NA |
| 36 | 3 | Protein inhibition | 160 | 10^16 |
| 37 | 6 | Red luminescence | 216 | NA |

Data from Rabitz (2012).

Figure 13 White light LEDs.

evolution as well, and it is now considered almost certain that the various species that have evolved to fit particular evolutionary niches are driven by local optima in the fitness landscape. Nevertheless, many of these local fitness optima are very impressive and are certainly useful and well suited to the particular environmental niche in which the species has evolved. Materials fitness landscapes may be simpler and of lower dimension than biological fitness landscapes, but they still exist in large multidimensional property spaces and probably also contain local optima. These features can constitute ‘traps’ that may prevent genetic algorithms or other computational methods of locating optima from finding the best solutions. It can be shown in fact that it is not possible to guarantee finding the global optimum in a complex surface by any method other than an exhaustive grid search with small steps, which is clearly impossible for spaces as large as materials space is projected to be. Surprisingly, preliminary experiments on materials evolution have suggested that this local optimum or ‘trap’ problem may not be as serious as first thought. Rabitz has studied examples in which two variables (this could be extended to many more in practice) control a yield or property value (Fig. 12) (Rabitz, 2012). The hypothetical associated landscape (a) has multiple local suboptimal maxima acting as traps for searches seeking the best possible outcome. Landscape (b) has rich structure but only a single maximum. In practice, Rabitz found that approximately 90% of the reported chemical and material landscapes have no evident traps. Even where traps exist, local optima on the fitness landscape are likely to be very useful, as they are in biological fitness landscapes where local optima often define different species. However, these conclusions have been drawn from a small set of examples and they may not apply to materials space generally.
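The difference between a trapped and a trap-free landscape can be illustrated with a minimal Python sketch. It assumes two hypothetical one-dimensional landscapes (`trap_free` and `trapped`) and a simple greedy search (`hill_climb`); none of these functions comes from Rabitz (2012), and real materials landscapes have many more dimensions.

```python
import math

def trap_free(x):
    # Hypothetical landscape with a single maximum at x = 0.7.
    return -(x - 0.7) ** 2

def trapped(x):
    # Hypothetical landscape: a low local peak near x = 0.2 (the trap)
    # and a higher, global peak near x = 0.8.
    local_peak = 0.5 * math.exp(-((x - 0.2) / 0.1) ** 2)
    global_peak = 1.0 * math.exp(-((x - 0.8) / 0.1) ** 2)
    return local_peak + global_peak

def hill_climb(f, x, step=0.01, lo=0.0, hi=1.0):
    """Greedy local search: move to the better neighbour until none improves."""
    while True:
        neighbours = [c for c in (x - step, x + step) if lo <= c <= hi]
        best = max(neighbours, key=f)
        if f(best) <= f(x):
            return x
        x = best

def grid_search(f, n=1001, lo=0.0, hi=1.0):
    """Exhaustive search over n evenly spaced points."""
    points = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return max(points, key=f)

start = 0.1
print(hill_climb(trap_free, start))  # ends near the single maximum
print(hill_climb(trapped, start))    # stalls on the local peak (the trap)
print(grid_search(trapped))          # the exhaustive search finds the global peak
```

In one dimension the exhaustive search needs only about a thousand evaluations. The same grid in many dimensions grows exponentially, which is why it is impractical for materials space.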

Although there are few examples in the literature where evolutionary methods have been used in practice to explore large and complex properties spaces (Le and Winkler, 2015), the use of evolutionary methods has been identified in a prominent, recent Editorial (Nature Materials paper reference) (Polishchuk et al., 2013). Most of the literature studies have involved catalysis and they have generated impressive results. Some of the studies (Table 3) have found effective (but necessarily local) solutions after 100–300 experiments from possible search spaces as large as 10^22.

We provide a literature example of the use of evolutionary methods to optimise the brightness and colour of phosphors for white light LEDs.

Figure 14 Methods for generating white light LEDs. From https://www1.eere.energy.gov/buildings/ssl/images/led-basics white light.jpg.

Figure 15 Improvement in white light LED phosphor colour across four generations. From Park et al. (2012).


Table 4 Example white light phosphor materials ‘genomes’.

| No. | CaO | SrO | BaO | YO3/2 | LaO3/2 | SiN4/3 | EuO3/2 |
|---|---|---|---|---|---|---|---|
| 1 | 0.1920 | 0.0384 | 0.1536 | 0.0000 | 0.0000 | 0.6000 | 0.0160 |
| 2 | 0.1382 | 0.0154 | 0.0000 | 0.0400 | 0.1600 | 0.6400 | 0.1600 |
| 3 | 0.0000 | 0.1536 | 0.0384 | 0.0800 | 0.0800 | 0.6400 | 0.0080 |
| 4 | 0.0154 | 0.0038 | 0.0192 | 0.0000 | 0.1600 | 0.8000 | 0.0016 |
| 5 | 0.0230 | 0.0461 | 0.1613 | 0.0400 | 0.0000 | 0.7200 | 0.0096 |
| 6 | 0.0000 | 0.1229 | 0.0307 | 0.0080 | 0.0320 | 0.8000 | 0.0064 |
| 7 | 0.0346 | 0.2074 | 0.1037 | 0.0000 | 0.0400 | 0.6000 | 0.0144 |

Adapted from Park et al. (2012).

Figure 16 Predictions of phosphor brightness from Bayesian neural net model for training set (left) and test set (right).

Evolving white light LED phosphors

Efficient use of resources and energy is essential, not only for extending the life of finite resources but also for minimising the effects of waste, especially greenhouse gases like CO2 and methane. Lighting consumes a significant amount of electricity, and light sources have evolved relatively recently from the incandescent filament invented by Swan and Edison many years ago. Incandescent globes have been replaced by fluorescent and metal vapour lamps and then, most recently, by white light LEDs (Fig. 13). These devices are up to fifty times more energy efficient than incandescent sources and have made a significant impact on lighting energy use and, consequently, CO2 emissions. White light LEDs are commonly made by one of three methods (Fig. 14).

Park et al. used a materials evolution approach to generate white light LED phosphors from sintered rare earth oxides (Park et al., 2012). Phosphors were synthesised by mixing seven ground metal oxides in different mole fractions then firing at 1525 °C for 4 h under a N2 gas flow in a sealed tube furnace. Each phosphor was encoded by a seven-dimensional ‘genome’ consisting of the mole fraction of each component in the phosphor (Table 4).

After the first generation, the phosphor intensity and colour of each individual were tested and analysed for fitness. The fittest individuals were used to generate a new population of individuals through a combination of crossover and mutation operations (they also used a more sophisticated Particle Swarm optimisation method for comparison). These new individuals composed the next generation, and the selection and evolution process was repeated over four generations. Brightness and colour improved in each generation. Park et al. also used Pareto ranking to find the best materials that also met other potentially conflicting criteria such as novelty (judged by structural similarity to known phosphors) and phase purity.
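The selection, crossover and mutation cycle described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure of Park et al.: `toy_fitness` stands in for the measured brightness and colour, and the population size, number of survivors and mutation scale are arbitrary choices. Genomes are rescaled after each operation so that the seven mole fractions sum to 1.

```python
import random

COMPONENTS = ["CaO", "SrO", "BaO", "YO3/2", "LaO3/2", "SiN4/3", "EuO3/2"]

def normalise(genome):
    """Rescale a genome so that its mole fractions sum to 1."""
    total = sum(genome)
    if total <= 0:
        raise ValueError("genome must contain at least one positive entry")
    return [g / total for g in genome]

def mutate(genome, rng, scale=0.05):
    """Single point mutation: perturb one randomly chosen gene."""
    child = list(genome)
    i = rng.randrange(len(child))
    child[i] = max(1e-6, child[i] + rng.uniform(-scale, scale))
    return normalise(child)

def crossover(a, b, rng):
    """Split both parents at one arbitrary place and swap the fragments."""
    cut = rng.randrange(1, len(a))
    return normalise(a[:cut] + b[cut:]), normalise(b[:cut] + a[cut:])

def toy_fitness(genome):
    # Placeholder for an experimental measurement (an assumption, not real data):
    # rewards closeness to an arbitrary target composition.
    target = [0.10, 0.10, 0.10, 0.00, 0.05, 0.64, 0.01]
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def evolve(fitness, pop_size=36, generations=4, keep=12, seed=0):
    rng = random.Random(seed)
    population = [normalise([rng.random() for _ in COMPONENTS])
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[:keep]
        children = list(parents)  # the fittest individuals survive unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            c1, c2 = crossover(a, b, rng)
            children.append(mutate(c1, rng))
            if len(children) < pop_size:
                children.append(c2)
        population = children
    return max(population, key=fitness), history

best, history = evolve(toy_fitness)
print(dict(zip(COMPONENTS, (round(x, 3) for x in best))))
```

In a real experiment, each call to the fitness function corresponds to synthesising and measuring one phosphor, which is why the number of evaluations is the quantity to minimise.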

They synthesised and tested 144 phosphors using this evolutionary approach. The seven-dimensional SrO–CaO–BaO–La2O3–Y2O3–Si3N4–Eu2O3 search space had a minimum size of 16,384 if each component could take one of four possible mole fractions (4^7). Excellent whiteness and brightness were achieved over the four cycles of evolution using only 144/16,384 or 0.88% of the entire virtual library or search space. The results of successive evolutionary cycles are summarised in Fig. 15.
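The search-space arithmetic quoted above can be checked with a few lines of Python. The variable names are illustrative only.

```python
levels = 4        # possible mole fractions per component (from the text)
components = 7    # dimensions of the phosphor genome
tested = 144      # phosphors synthesised and tested

library_size = levels ** components
fraction = tested / library_size

print(library_size)                # 16384
print(f"{100 * fraction:.2f}%")    # 0.88%
```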

While this early use of evolutionary methods to find excellent local solutions worked very well, we were interested in whether the data from the four successive cycles of evolutionary optimisation could be used to build a predictive model that could search the materials space more minutely for better solutions. We generated a model relating the mole fractions of the components in the sample (a seven-dimensional input vector) to the colour and brightness of the


resulting phosphor. We found that the relationship was very nonlinear, with linear models providing very low predictability of the phosphor properties. A nonlinear model constructed from a Bayesian regularised neural network with five nodes in the hidden layer (Burden and Winkler, 1999a,b; Winkler and Burden, 2000) could predict the training and test set properties with high fidelity (Fig. 16). We used this model to predict the properties of 10 million phosphors within the domain of the model, by allowing each of the seven components to adopt 10 possible mole fractions (ensuring the sum of the seven mole fractions was 1.0). We discovered several phosphors with brightnesses almost twice those of the best phosphors found by Park et al. (2012). The very significant nonlinearity of the response surface and the fact that using a nonlinear model discovered better solutions suggest that Park et al. could have also found even better solutions if they had run their evolutionary approach for more cycles.
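The virtual screening step can be sketched as an exhaustive enumeration scored by a trained model. In the Python sketch below, `toy_surrogate` is a hypothetical stand-in for the Bayesian regularised neural network, which is not reproduced here. The text does not say how the sum-to-one constraint was enforced, so rescaling each candidate is an assumption. With 10 levels for each of 7 components the enumeration yields 10^7 (10 million) raw combinations, as in the text; the example uses 3 levels to keep the run short.

```python
import heapq
from itertools import product

def enumerate_compositions(levels, n_components):
    """Yield every combination of raw levels, rescaled so the fractions sum to 1."""
    for raw in product(levels, repeat=n_components):
        total = sum(raw)
        if total > 0:  # skip the all-zero combination, which cannot be rescaled
            yield tuple(r / total for r in raw)

def toy_surrogate(x):
    # Hypothetical nonlinear response used only for illustration;
    # a trained model's predict function would be used in practice.
    return x[5] * (1.0 - x[5]) + 4.0 * x[0] * x[6]

def screen(surrogate, levels, n_components=7, top=5):
    """Score every candidate with the surrogate and return the best few."""
    candidates = enumerate_compositions(levels, n_components)
    return heapq.nlargest(top, candidates, key=surrogate)

for composition in screen(toy_surrogate, [0.0, 0.5, 1.0]):
    print([round(c, 3) for c in composition], round(toy_surrogate(composition), 4))
```

Because rescaling maps different raw combinations onto the same composition, some candidates are duplicates; a production version would remove these before scoring.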

Where to next? Automatic ‘closed loop’ systems

Evolutionary methods have been shown to be effective in materials discovery, helping with the ‘curse of dimensionality’. The approach is very complementary to the new high throughput materials synthesis, characterisation, and testing technologies (e.g., RAMP, flow chemistry, high throughput beam lines, combinatorial chemistry).

Ultimately an automatic, closed loop system could be developed where the fittest materials synthesised in a given generation are used to design the next generation of improved materials. Use of evolutionary and machine learning in silico methods and robotic synthesis and characterisation methods could explore large materials spaces and accelerate discovery of novel, useful materials.

Conflict of interest

The author declares that there is no conflict of interest.

Acknowledgements

The author gratefully acknowledges the contributions of many collaborators. Nottingham: Andrew Hook, Jing Yang, Paul Williams, Martyn Davies, Morgan Alexander. MIT: Ying Mei, Robert Langer and Daniel Anderson. Monash University: Ron Brown, Peter Godfrey (both deceased). CSIRO: Tu Le, Frank Burden, Vidana Epa (modelling team), Danielle Kennedy, Ben Muir, Shaun Howard (RAMP), Julianne Halley (now at York).

The author also gratefully acknowledges the financial support from CSIRO Advanced Materials Transformational Capability Platform, a Newton Turner Fellowship for Exceptional Senior Scientists, the CSIRO Complex Systems Science Initiative, the European COST Office, and the National Enabling Technologies Scheme of the Australian Government.

References

Bernstein, M.P., Dworkin, J.P., Sandford, S.A., Cooper, G.W., Allamandola, L.J., 2002. Racemic amino acids from the ultraviolet photolysis of interstellar ice analogues. Nature 416, 401–403, http://dx.doi.org/10.1038/416401a.

Brown, R.D., Godfrey, P.D., Storey, J.W.V., Bassez, M.-P., Robinson, B.J., Batchelor, R.A., McCulloch, M.G., Rydbeck, O.E.H., Hjalmarson, A.G., 1979. Search for interstellar glycine. Mon. Not. R. Astron. Soc. 186, P5–P8, http://dx.doi.org/10.1093/mnras/186.1.5p.

Brown, R.D., Godfrey, P.D., Winkler, D.A., 1980a. Detection of the 23-22 emission-line of sulfur monoxide and its relevance to magnetic-fields in Orion. Mon. Not. R. Astron. Soc. 190, 1–6, http://dx.doi.org/10.1093/mnras/190.1.1.

Brown, R.D., Godfrey, P.D., Winkler, D.A., 1980b. Microwave-spectrum of (Z)-ethanimine. Aust. J. Chem. 33, 1–7, http://dx.doi.org/10.1071/ch9800001.

Brown, R.D., Godfrey, P.D., Winkler, D.A., 1982. Hyperfine interactions in methanimine. Aust. J. Chem. 35, 667–672, http://dx.doi.org/10.1071/ch9820667.

Burden, F.R., Winkler, D.A., 1999a. Robust QSAR models using Bayesian regularized neural networks. J. Med. Chem. 42, 3183–3187, http://dx.doi.org/10.1021/jm980697n.

Burden, F.R., Winkler, D.A., 1999b. New QSAR methods applied to structure–activity mapping and combinatorial chemistry. J. Chem. Inf. Comput. Sci. 39, 236–242, http://dx.doi.org/10.1021/ci980070d.

Burden, F.R., Ford, M.G., Whitley, D.C., Winkler, D.A., 2000. Use of automatic relevance determination in QSAR studies using Bayesian neural networks. J. Chem. Inf. Comput. Sci. 40, 1423–1430, http://dx.doi.org/10.1021/ci000450a.

Burden, F.R., Polley, M.J., Winkler, D.A., 2009. Toward novel universal descriptors: charge fingerprints. J. Chem. Inf. Model. 49, 710–715, http://dx.doi.org/10.1021/ci800290h.

Burden, F.R., Winkler, D.A., 2009a. Optimal sparse descriptor selection for QSAR using Bayesian methods. QSAR Comb. Sci. 28, 645–653, http://dx.doi.org/10.1002/qsar.200810173.

Burden, F.R., Winkler, D.A., 2009b. An optimal self-pruning neural network and nonlinear descriptor selection in QSAR. QSAR Comb. Sci. 28, 1092–1097, http://dx.doi.org/10.1002/qsar.200810202.

Elsila, J.E., Dworkin, J.P., Bernstein, M.P., Martin, M.P., Sandford, S.A., 2007. Mechanisms of amino acid formation in interstellar ice analogs. Astrophys. J. 660, 911–918, http://dx.doi.org/10.1086/513141.

Giri, C., Goesmann, F., Meinert, C., Evans, A.C., Meierhenrich, U.J., 2013. Top. Curr. Chem. 333, 41–82, http://dx.doi.org/10.1007/128_2012_367.

Johnson, D.R., Lovas, F.J., 1972. Microwave detection of the molecular transient methyleneimine (CH2=NH). Chem. Phys. Lett. 15, 65–68, http://dx.doi.org/10.1016/0009-2614(72)87017-9.

Kroto, H.W., 1980. The detection of unstable species using microwave, photoelectron and radioastronomy techniques. Chimia 34, 313.

Kuan, Y.-J., Charnley, S.B., Huang, H.C., Tseng, W.L., Kisiel, Z., 2003. Interstellar glycine. Astrophys. J. 593, 848–867.

Kuan, Y.-J., Huang, H.-C., Charnley, S.B., Tseng, W.-L., Snyder, L.E., Ehrenfreund, P., Kisiel, Z., Thorwirth, S., Bohn, R.K., Wilson, T.L., 2004. Prebiologically important interstellar molecules. In: Norris, R.P., Stootman, F.H. (Eds.), Bioastronomy 2002: Life Among the Stars. IAU Symposium, vol. 213, pp. 185–188.

Le, T., Epa, V.C., Burden, F.R., Winkler, D.A., 2012. Quantitative structure–property relationship modeling of diverse materials properties. Chem. Rev. 112, 2889–2919, http://dx.doi.org/10.1021/cr200066h.

Le, T.C., Winkler, D.A., 2015. A bright future for evolutionary methods in drug design. ChemMedChem 10 (8), 1296–1300.

Loomis, R.A., Zaleski, D.P., Steber, A.L., Neill, J.L., Muckle, M.T., Harris, B.J., Hollis, J.M., Jewell, P.R., Lattanzi, V., Lovas, F.J., Martinez Jr., O., McCarthy, M.C., Remijan, A.J., Pate, B.H., Corby, J.F., 2013. The detection of interstellar ethanimine (CH3CH=NH) from observations taken during the GBT PRIMOS Survey. Astrophys. J. Lett. 765, L9, http://dx.doi.org/10.1088/2041-8205/765/1/l9.

Lovas, F.J., Suenram, R.D., Johnson, D.R., Clark, F.O., Tiemann, E., 1980. Pyrolysis of ethylamine. II. Synthesis and microwave spectrum of ethylidenimine (CH3CH=NH). J. Chem. Phys. 72, 4964–4972, http://dx.doi.org/10.1063/1.439783.

Martin, S., 2012. Lattice enumeration for inverse molecular design using the signature descriptor. J. Chem. Inf. Model. 52, 1787–1797, http://dx.doi.org/10.1021/ci3001748.

Moore, K.W., Pechen, A., Feng, X.-J., Dominy, J., Beltrani, V.J., Rabitz, H., 2011. Why is chemical synthesis and property optimization easier than expected? Phys. Chem. Chem. Phys. 13, 10048, http://dx.doi.org/10.1039/c1cp20353c.

Park, W.B., Shin, N., Hong, K.-P., Pyo, M., Sohn, K.-S., 2012. A new paradigm for materials discovery: heuristics-assisted combinatorial chemistry involving parameterization of material novelty. Adv. Funct. Mater. 22, 2258–2266, http://dx.doi.org/10.1002/adfm.201102118.

Polishchuk, P.G., Madzhidov, T.I., Varnek, A., 2013. J. Comput.-Aided Mol. Des. 27, 675–679.


Rabitz, H., 2012. Control in the sciences over vast length and time scales. Quant. Phys. Lett. 1, 1–19.

Shoichet, B.K., 2013. Drug discovery: nature’s pieces. Nat. Chem. 5, 9–10, http://dx.doi.org/10.1038/nchem.1537.

Snyder, L.E., 1997. The search for interstellar glycine. Orig. Life Evol. Biosph. 27, 115–133, http://dx.doi.org/10.1023/a:1006522230405.

Snyder, L.E., Lovas, F.J., Hollis, J.M., Friedel, D.N., Jewell, P.R., Remijan, A., Ilyushin, V.V., Alekseev, E.A., Dyubko, S.F., 2005. A rigorous attempt to verify interstellar glycine. Astrophys. J. 619, 914–930, http://dx.doi.org/10.1086/426677.

Winkler, D.A., Burden, F.R., 2000. Robust QSAR models from novel descriptors and Bayesian Regularised Neural Networks. Mol. Simul. 24, 243–258, http://dx.doi.org/10.1080/08927020008022374.

Woon, D.E., 2002. Pathways to glycine and other amino acids in ultraviolet-irradiated astrophysical ices determined via quantum chemical modeling. Astrophys. J. Lett. 571, L177–L180, http://dx.doi.org/10.1086/341227.

Zhou, J., Love, P.E.D., Wang, X., Teo, K.L., Irani, Z., 2013. J. Oper. Res. Soc. 64, 1091–1105, http://dx.doi.org/10.1057/jors.2012.174.