Computation and computational thinking in Chemistry
Paul Madden School of Chemistry
The “plan”
• My interest – atomistic, predictive calculations of the properties of materials
• Energy minimization – optimization ideas
• Cutting out the computer – application of optimization strategies in synthesis
Numerous technologies benefit from the capability to model thermodynamic and transport properties accurately & reliably
Pyroprocessing of Nuclear Waste
LiCl/KCl “solvent” – now fluorides
Are the (continuum) models of transport adequate representations of reality?
More principles:
Why simulate?
• interpretation/visualization
• provide data not obtainable by experiment
• answer problems of principle, test theory
Molecular Dynamics simulation:
Follow trajectory of interacting atoms
Newton’s Laws of Motion
Need a “Law of Force” – sometimes “pairwise additive”
(like gravitation, F ∝ 1/r²)
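A minimal sketch of the idea above: two atoms in one dimension, moved forward in time with Newton's laws. The Lennard-Jones form of the pairwise force law, the velocity Verlet integrator, the reduced units and all function names are assumptions chosen for illustration; the slides do not specify them.

```python
def pair_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones pair energy (an assumed, illustrative force law)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def pair_force(r, epsilon=1.0, sigma=1.0):
    """Force along the separation, -dU/dr; positive means repulsive."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


def run_md(x1=0.0, x2=1.5, mass=1.0, dt=0.001, steps=5000):
    """Follow the trajectory of two interacting atoms with velocity Verlet.

    Returns a list of (separation, total energy) pairs, one per time step.
    """
    v1 = v2 = 0.0
    f = pair_force(x2 - x1)  # force on atom 2; atom 1 feels -f
    trajectory = []
    for _ in range(steps):
        # Half-step velocity update from the current forces (F = m a).
        v1 += 0.5 * dt * (-f) / mass
        v2 += 0.5 * dt * f / mass
        # Full-step position update.
        x1 += dt * v1
        x2 += dt * v2
        # New forces at the new positions, then the second half-step.
        f = pair_force(x2 - x1)
        v1 += 0.5 * dt * (-f) / mass
        v2 += 0.5 * dt * f / mass
        kinetic = 0.5 * mass * (v1 * v1 + v2 * v2)
        trajectory.append((x2 - x1, kinetic + pair_energy(x2 - x1)))
    return trajectory


if __name__ == "__main__":
    traj = run_md()
    print("final separation:", traj[-1][0])
    print("final total energy:", traj[-1][1])
```

With a small enough time step the total energy should stay close to its starting value, which is a common sanity check on the integrator.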
Electron Densities and the “Force Laws”
Covalent Ionic, Non-bonding
Overlap of two spherical, non-bonding charge densities
A stiff spring between bonded atoms
Can model the dependence on interatomic separation
Because these force laws are simple, we can model (atomistically) molecular materials of great complexity
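To make the two kinds of force law concrete, here is a hedged sketch of each. The harmonic form for the "stiff spring", the Born-Mayer exponential for the overlap repulsion, and every parameter value (`k`, `r_eq`, `a`, `rho`) are illustrative assumptions, not values from the talk. The Coulomb constant 14.4 eV·Å is e²/4πε₀ in those units.

```python
import math


def bond_energy(r, k=500.0, r_eq=1.0):
    """Covalent bond as a stiff harmonic spring about the length r_eq."""
    return 0.5 * k * (r - r_eq) ** 2


def ionic_pair_energy(r, q1, q2, a=1000.0, rho=0.3, coulomb_const=14.4):
    """Non-bonded ion pair: overlap repulsion plus Coulomb interaction.

    a * exp(-r / rho) is an assumed Born-Mayer form for the repulsion that
    arises when two spherical charge densities overlap. Energies are in eV
    and distances in angstroms.
    """
    return a * math.exp(-r / rho) + coulomb_const * q1 * q2 / r


if __name__ == "__main__":
    for r in (0.9, 1.0, 1.1):
        print("bond", r, bond_energy(r))
    for r in (0.5, 2.0, 4.0):
        print("ion pair", r, ionic_pair_energy(r, +1, -1))
```

Both functions depend only on the interatomic separation, which is what makes such models cheap enough for very large systems.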
Cell membrane
Phospholipid
Can visualise (qualitatively)
"These movies were made by Dr. Aleksei Aksimentiev using VMD and are owned by the Theoretical and Computational Biophysics Group, NIH Resource for Macromolecular Modeling and Bioinformatics, at the Beckman Institute, University of Illinois at Urbana-Champaign."
Ion permeation through α-haemolysin
Electron Densities and the “Force Laws”
Covalent Ionic, Non-bonding
Overlap of two spherical, non-bonding charge densities
Now “easily” manipulated by chemistry (200 years)
Control the “liaisons” affected by thermal motion
Inhibition of Cyclin Dependent Kinases (CDKs)
CDK2
ATP binding pocket
CDK2 is involved in DNA replication
It is overexpressed in cancer cells, => Find inhibitors
Inhibition of Cyclin Dependent Kinases (CDKs)
NU2058
NU6027
9d-NU6027
ATP
NU6102
Staurosporine
SU9516
MD simulation:
Follow trajectory of interacting atoms
But this only works if the electrons are moving “trivially” with the nuclei
Newton’s Laws of Motion
Need a “Law of Force” – sometimes pairwise additive – and this makes large-scale simulation possible
Interatomic interactions mediated by local electron density
generally, this depends on instantaneous coordination environment
Electron density for a self-interstitial in Aluminium
Can obtain the forces directly from an electronic structure calculation
“First-Principles”
Such calculations can give accurate binding energies (v.i.)
Can obtain the forces directly from an electronic structure calculation (on-the-fly)
Additional benefit: obtain the electronic structure
E.g: mechanism of oxidation of a silicon surface (M. Payne)
The ab initio MD methods are general and particularly useful when covalent bonds are broken and formed
But they are very expensive, meaning that many issues, requiring large simulations or long runs, are out of reach
Why simulate?
• interpretation/visualization
• provide data not obtainable by experiment
• answer problems of principle, test theory
i.e. quantitative, realistic modelling
Properties of materials under extreme conditions
Mineralogy of the earth’s interior
Phase diagram of H2O -- or is it??
1 GPa = 10,000 atmospheres!!
Direct coexistence simulation – to obtain melting temperature
Determine T & P at which equilibrated solid and liquid coexist
Size Matters:
Gillan, Alfè
The ab initio MD methods are general and particularly useful when covalent bonds are broken and formed
But they are very expensive, meaning that many issues are out of reach
Maybe we can use a simpler representation of electronic structure in some cases
e.g. in ionic materials simple force laws do not work quantitatively
Ions are not spherical – they are deformed in this environment
Maybe in “ionic” materials: electron density in an AlF3 crystal
Incorporate such ideas into the interaction potential and parameterize from ab initio (A-I) calculations
Multiscale modelling
Direct coexistence simulation to determine the melting temperature of MgO
Determine T & P at which equilibration occurs
Melting curve of MgO ab initio model
Many problems may be regarded as optimization
e.g. lowest energy structures of a cluster or a crystal
Finding a global minimum may be easy, or hard
Energy Landscape concept
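A toy illustration of why finding the global minimum can be hard: on a landscape with more than one basin, plain downhill minimization ends in whichever minimum is nearest the starting point. The one-dimensional double-well energy function, the step size and the function names are assumptions made for this sketch only.

```python
def energy(x):
    """Toy one-dimensional energy landscape with two minima of unequal depth."""
    return x ** 4 - 2.0 * x ** 2 + 0.3 * x


def gradient(x):
    """Analytic derivative dE/dx of the toy landscape."""
    return 4.0 * x ** 3 - 4.0 * x + 0.3


def steepest_descent(x, step=0.01, iterations=10000):
    """Walk downhill from x until the fixed iteration budget is used up."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


if __name__ == "__main__":
    for start in (-1.5, 1.5):
        x_min = steepest_descent(start)
        print("start", start, "-> minimum at", x_min, "energy", energy(x_min))
```

Starting on the left reaches the deeper (global) minimum, while starting on the right gets trapped in the shallower local one. In a many-atom cluster the number of such basins grows very quickly with size.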
For “hard” problems, non-minimization strategies such as “genetic algorithms” have been adopted
Structures of virus capsids
Hard for minimization
Genetic algorithm
Parents: 110001101001001001110 and 00110110101101011100
Crossover
Offspring: 1100011010010 1011100 and 0011011010110 1001110
Mutation
“Fitness”
Start with a population of “parents” and evolve successive generations, by stochastically selecting moves, to improve fitness
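The scheme above can be sketched on bit strings as follows. The fitness function (count of 1 bits), the population size, the mutation rate, the single-child crossover and the elitist selection are all assumptions picked to keep the example short; a real application would encode the problem and its fitness very differently.

```python
import random


def fitness(bits):
    """Illustrative fitness: the number of 1s in the string."""
    return sum(bits)


def crossover(parent_a, parent_b, rng):
    """One-point crossover: head of one parent joined to the tail of the other."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(n_bits=20, pop_size=30, generations=40, rate=0.02, seed=0):
    """Evolve a population and return (best individual, best-fitness history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[:pop_size // 2]   # fitter half may reproduce
        children = [population[0]]             # keep the current best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        population = children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("best string:", "".join(map(str, best)))
    print("best fitness per generation:", history)
```

Because the best individual is always carried over unchanged, the best fitness can never decrease from one generation to the next.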
Representation of problems within GA paradigm
Folding a protein, which should be “hard”, must actually be easy (for nature – simulated annealing works!).
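For reference, simulated annealing can be sketched as a random walk that accepts uphill moves with a probability that shrinks as a fictitious temperature is lowered. The toy landscape, the geometric cooling schedule and all parameter values below are assumptions for illustration and say nothing about real protein energetics.

```python
import math
import random


def toy_energy(x):
    """Rugged one-dimensional landscape: a broad bowl with ripples on it."""
    return 0.1 * x * x + math.sin(3.0 * x)


def anneal(energy, x0, t_start=2.0, t_end=0.01, steps=20000,
           step_size=0.5, seed=0):
    """Metropolis-style simulated annealing; returns (best x, best energy)."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for i in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (i / (steps - 1))
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        # Always accept downhill moves; accept uphill ones with
        # probability exp(-dE / T).
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
            x, e = x_new, e_new
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


if __name__ == "__main__":
    x_best, e_best = anneal(toy_energy, x0=4.0)
    print("best x:", x_best, "energy:", e_best)
```

At high temperature the walker can cross barriers between basins; as the temperature falls it settles into a low-lying one.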
Primary Structure: Sequence
• The primary structure of a protein is the amino acid sequence
• Twenty different amino acids have distinct shapes and properties
Typical protein will contain ~ 200 links
Secondary Structure: α-helices, β-sheets, & loops
α-helices and β-sheets are stabilized by hydrogen bonds between backbone oxygen and hydrogen atoms
Tertiary Structure: A Protein Fold
Proteins only work when properly folded
Levinthal paradox, 1968
• A polypeptide chain of 100 residues (amino acids)
• Each residue has only 2 possible configurations
• 2^100 ~ 10^30 configurations
• 10^-11 second is required to convert one to another
• 10^19 seconds ~ 10^11 years!
• Doubling time for a bacterium is < 30 minutes
• Molten globule (microsecond ~ millisecond)
• Native state (millisecond ~ seconds)
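The arithmetic behind the estimate can be checked in a few lines. The function name and the use of a 365.25-day year are assumptions; the inputs are the numbers quoted above.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def levinthal_search_time(n_residues=100, states_per_residue=2,
                          seconds_per_step=1e-11):
    """Time to visit every configuration once, in seconds and in years."""
    configurations = states_per_residue ** n_residues
    seconds = configurations * seconds_per_step
    return configurations, seconds, seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    configurations, seconds, years = levinthal_search_time()
    print(f"configurations: {configurations:.2e}")
    print(f"seconds:        {seconds:.2e}")
    print(f"years:          {years:.2e}")
```

With these inputs the exhaustive search comes out at roughly 4 × 10^11 years, consistent with the order-of-magnitude figure on the slide and vastly longer than any observed folding time.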
Idea of a folding “funnel”
“Foldability” must be encoded in the amino acid sequence
Schematic representation of some of the states accessible to a polypeptide chain following its biosynthesis
We know the amino acid sequence from the genome project
A major objective is to be able to predict the fold from a knowledge of the sequence
The folded structures of some proteins are known from crystallography
Inhibition of Cyclin Dependent Kinases (CDKs)
Binding Energy (eV)
Must go out to large distances to get convergence
Can calculate binding energy of molecule at active site
Identifying drug molecules by direct calculation of energetics is far too slow for practical applications
Instead use QSAR: Quantitative Structure–Activity Relationships
Activity = function(prop1,prop2,prop3,prop4,…)
prop is a readily-determined property of each potential drug mol.
Use a training set of drug mols to “determine” function (neural net)
Search huge databases of mols (> 10^6)
=> targets for synthesis and testing
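The QSAR workflow above (fit a function on a training set, then rank a large library with it) might be sketched as follows. A plain linear model fitted by gradient descent stands in for the neural net mentioned on the slide, and the descriptors, the "true" activity weights and the library are synthetic, hypothetical data generated only so the example runs.

```python
import random


def predict(weights, bias, props):
    """Activity = function(prop1, prop2, ...); here simply a linear function."""
    return bias + sum(w * p for w, p in zip(weights, props))


def fit_linear(training_set, lr=0.5, epochs=5000):
    """Least-squares fit by gradient descent (a stand-in for a neural net)."""
    n_props = len(training_set[0][0])
    weights, bias = [0.0] * n_props, 0.0
    n = len(training_set)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_props, 0.0
        for props, activity in training_set:
            err = predict(weights, bias, props) - activity
            grad_b += err
            for j, p in enumerate(props):
                grad_w[j] += err * p
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def screen(weights, bias, library, top_n=5):
    """Rank a library by predicted activity and return the top candidates."""
    ranked = sorted(library, key=lambda props: predict(weights, bias, props),
                    reverse=True)
    return ranked[:top_n]


def demo(seed=0):
    rng = random.Random(seed)
    true_w, true_b = [2.0, -1.0, 0.5], 0.1    # hypothetical "real" relation
    training = []
    for _ in range(40):                        # synthetic training molecules
        props = [rng.random() for _ in range(3)]
        training.append((props, predict(true_w, true_b, props)))
    library = [[rng.random() for _ in range(3)] for _ in range(1000)]
    weights, bias = fit_linear(training)
    return weights, bias, screen(weights, bias, library)


if __name__ == "__main__":
    weights, bias, hits = demo()
    print("fitted weights:", weights, "bias:", bias)
    print("top candidates for synthesis and testing:", hits)
```

The point of the design is that evaluating the fitted function is cheap, so millions of candidate molecules can be ranked without any direct energy calculation.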
However, the “properties” of relevance are defined on 3-d grids
e.g. of the electrostatic potential, or the hydrophobicity of the molecule
- which should match that of the binding site
But, the molecule (& grid) must be aligned with pocket
And, property varies with the conformation of the molecule!
Leads to huge search problems – screen-savers: http://www.bellatrix.ox.ac.uk
Oxide glasses with many components
Step 1: prepare “gene pool” of 54 glasses made up with randomly chosen compositions
Step 2: measure their luminosities – “fitness”
Cutting out the computer!
“Engineering” to produce arrays of such chemicals and to screen them for desirable characteristics is now well-established
“Combinatorial Chemistry” e.g. Prof. Mark Bradley
(Huge arrays possible)
Generate a second generation stochastically and “evolve”
Drug Discovery Today !