OBER simulation science: direction/needs of the next 2-5 years
Doug Rotman, LLNL
Feb. 22, 2001
NERSC-NUGEX meeting February 22/23, 2001
OBER’s simulations will continue to challenge compute platforms of the next decade
• Climate modeling and carbon cycle
• Atmospheric chemistry and aerosols
• Computational biology
Understanding climate forcings
[Figure: Global, annual-mean radiative forcings (W m^-2) due to a number of agents from 1750 to present. The vertical line about the rectangular bar indicates an estimate of the uncertainty range [IPCC, 2000a].]
Climate modeling: current capabilities
• Coupled atmos/ocean (resolution)
– atmos: about 2 degrees
– ocean: about 1 degree
• Atmos: prescribed land types, substantial efforts in radiation physics (SW/LW), boundary layer physics, cloud physics, and meteorological processes
• Ocean: detailed ocean floor topography, convection
• includes atmospheric sulfate aerosols
• prescribed greenhouse gases - CO2, ...
• model top ~ 40-50 km
• Multi-century simulations are large productions
• Ensembles of multi-century simulations are heroic
Climate Modeling parallel computing characteristics
• Mostly 1-D domain decomposition
• At current resolution, focused on ~100-200 processors
• Typically not memory bound
• Throughput is major issue
• Climate simulations tend to be long, hence a queuing system that enables long-running jobs is optimal (but we are also quite talented at playing the queuing games!)
• more general 2-D (and 3-D!!) decompositions are coming ...
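The trade-off between 1-D and 2-D decompositions can be illustrated with a small sketch. It is not taken from any of the models discussed here: the 90 x 180 grid (roughly 2 degrees), the one-cell halo, and the function names `decomp_1d` and `decomp_2d` are all assumptions made for illustration.

```python
import math


def decomp_1d(nlat, nlon, nprocs):
    """Split the grid into latitude bands only (hypothetical helper).

    Returns (owned_cells, halo_cells) for an interior band, assuming a
    one-cell halo exchanged with the band to the north and to the south.
    """
    if nprocs > nlat:
        # A band needs at least one latitude row, so the processor count
        # is capped by the number of rows.
        raise ValueError("1-D decomposition cannot use more processors than latitude rows")
    rows = math.ceil(nlat / nprocs)
    owned = rows * nlon
    halo = 2 * nlon  # one full row from each of the two neighbours
    return owned, halo


def decomp_2d(nlat, nlon, plat, plon):
    """Split the grid into plat x plon blocks (hypothetical helper).

    Returns (owned_cells, halo_cells) for an interior block with a
    one-cell halo on all four sides (corner cells ignored).
    """
    rows = math.ceil(nlat / plat)
    cols = math.ceil(nlon / plon)
    owned = rows * cols
    halo = 2 * cols + 2 * rows  # north + south rows, east + west columns
    return owned, halo


if __name__ == "__main__":
    # Assumed 2-degree grid: 90 latitude rows by 180 longitude columns.
    print("1-D, 90 procs  :", decomp_1d(90, 180, 90))
    print("2-D, 9x10 procs:", decomp_2d(90, 180, 9, 10))
```

Under these assumptions both layouts give each of 90 processors 180 cells, but the 1-D band exchanges 360 halo cells against 56 for the 2-D block, and the 1-D layout cannot go past 90 processors at all.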
Moving to higher resolution climate models
• Topography, land types, clouds, precipitation, emissions of species, … all point to the need for higher resolution climate simulations to understand processes that impact climate prediction
• There are multiple scientific issues to be addressed at higher resolution, but, …
• To 1st order, computational limitations dominate
Higher resolution costs build quickly (15 year run in 8 hours wall clock)
Resolution (km)    Required (Gflops)    Required storage (Gbytes)
300                15                   25
200                32                   50
150                140                  120
125                220                  190
75                 1330                 620
60                 3300                 1450
40                 22000                4900
30                 42000                7600
Notes: resolution is the grid size; Gflops are sustained; storage covers just monthly averages.
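How quickly the costs in the table build can be estimated by fitting a power law, cost ~ resolution^(-p), to the tabulated values. The sketch below does this with an ordinary least-squares fit in log-log space; the function name `scaling_exponent` is hypothetical, and the fit is only a rough summary of the table, not a performance model from the talk.

```python
import math

# Values copied from the table: grid size, sustained Gflops, storage.
RESOLUTION_KM = [300, 200, 150, 125, 75, 60, 40, 30]
GFLOPS = [15, 32, 140, 220, 1330, 3300, 22000, 42000]
STORAGE_GB = [25, 50, 120, 190, 620, 1450, 4900, 7600]


def scaling_exponent(resolution_km, cost):
    """Return p in cost ~ resolution**(-p), from a log-log least-squares fit."""
    xs = [math.log(r) for r in resolution_km]
    ys = [math.log(c) for c in cost]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The slope is negative (finer grid, higher cost); report its magnitude.
    return -sxy / sxx


if __name__ == "__main__":
    print("flops exponent  :", round(scaling_exponent(RESOLUTION_KM, GFLOPS), 2))
    print("storage exponent:", round(scaling_exponent(RESOLUTION_KM, STORAGE_GB), 2))
```

The fitted exponent for flops falls between 3 and 4, and the one for storage between 2 and 3. That is broadly what one would expect if halving the grid spacing quadruples the number of columns and also shortens the time step, while stored monthly averages grow mainly with the number of grid points.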
Future climate models will include chemistry and more complete aerosol physics
• Accurate modeling of atmospheric processes and climate requires inclusion of realistic ozone chemistry and aerosol direct and indirect effects
• Chemistry and aerosols provide intense, but local calculations
• Transport of chemical species provides communication and accuracy challenges
[Figure labels: Obs; Increasing chemistry and physics]
Moving from Specified to Predicted CO2
• We must move from this:
Specified atmospheric CO2 concentration -> Climate model -> Future climate
• To this:
Specified CO2 emissions -> Integrated climate and carbon model -> Future climate, CO2 concentration
Carbon management requires knowledge of sources, sinks and reservoirs
(A) Ocean carbon column inventory and (B) fluxes of anthropogenic carbon as of 1995
CO2 injection near New York City at 3000 m depth. Shown is the amount of injected CO2 per unit surface area after 100 years of continuous injection.
DOCS/LLNL
Carbon cycle modeling requires interactive atmospheric, ocean and terrestrial ecosystem models
Need linkage to terrestrial ecosystem models
• How do vegetation and soils change with respect to changes in land use or climate?
[Figure: Net Primary Productivity (NPP) (g C/m2/yr); scale 0, 400, 800]
Next Generation Internet: Creating an Earth System Grid
• Goal: Enable a geographically distributed climate community [of thousands] to perform sophisticated, computationally intensive analyses and visualization on Petabytes of data
• Approach: We are integrating advanced data structures and algorithms for analysis and visualization of petabyte data in a distributed environment.
• Collaborators: NCAR, LBNL, ANL, LANL
Atmospheric chemistry: Current capabilities
• Separate stratospheric and tropospheric models (almost)
• resolution: about 2-4 degrees horizontal and 2 km vertical
• short simulations use more complete mechanisms (80-100 species), multi-year runs use smaller chemistry (30-50 species); still uncertainty on some rates, ...
• substantial parameterizations, but still large uncertainties (dry dep, scavenging, PBL diffusion, …)
• fixed emissions, need interactive ...
• aerosols use fixed size distribution and many times, fixed geographic distribution
• little feedback to climate model
[Figure: NO2 at 30 km]
We can now simulate ozone in a combined troposphere and stratosphere: will become standard
Chemistry coupling to biogeochemical ocean models
• Chlorophyll: provides feedback to DMS and sulfur emissions, which then impacts sulfate aerosol and climate forcing
[Figure panels: Dec 1996, pre-El Niño; Dec 1997, strong El Niño]
Interactive chemistry and aerosols
[Figure: Flight 6 [971020]; x-axis: Time (sec), 30000 to 55000; y-axis: 0 to 100]
• Formation of sulfate aerosols is dependent on local ozone concentration
• Rather than using monthly averaged ozone distributions, we are now moving forward to calculate aerosol formation using interactive and local ozone
Aerosol indirect effects may be more important than direct effects
[Figure units: W/m2]
• Direct effects of aerosols (scattering) have been included
• Indirect effects (brightness and lifetime of clouds) may be more important and need to be included
• microphysics plays a role; models will be implementing algorithms for the evolution of the aerosol size distribution via sedimentation, coagulation, nucleation, ….
• Interaction between aerosol microphysics and cloud physics is still very uncertain
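As an illustration of what evolving an aerosol size distribution involves, the sketch below takes explicit Euler steps of the discrete Smoluchowski coagulation equation. It is a toy under stated assumptions, not the scheme of any model mentioned here: the kernel is a constant, bin k holds particles made of k+1 monomers, collisions that would leave the grid are skipped, and sedimentation and nucleation are omitted. The function name and all numerical values are hypothetical.

```python
def coagulation_step(n, kernel, dt):
    """Advance number concentrations n by one explicit Euler step.

    n[k] is the number concentration of particles made of k+1 monomers.
    kernel is an assumed constant collision kernel (arbitrary units).
    """
    nbins = len(n)
    dn = [0.0] * nbins
    for i in range(nbins):
        for j in range(i, nbins):
            k = i + j + 1  # (i+1) + (j+1) monomers land in bin index i+j+1
            if k >= nbins:
                continue  # skip collisions whose product is off the grid
            rate = kernel * n[i] * n[j]
            if i == j:
                rate *= 0.5  # identical partners: avoid double counting pairs
            dn[i] -= rate * dt
            dn[j] -= rate * dt
            dn[k] += rate * dt
    return [a + b for a, b in zip(n, dn)]


def total_mass(n):
    """Total monomer count, which coagulation should conserve."""
    return sum((k + 1) * nk for k, nk in enumerate(n))


if __name__ == "__main__":
    # Hypothetical start: 1000 monomers per unit volume, nothing larger.
    n = [1000.0] + [0.0] * 9
    for _ in range(50):
        n = coagulation_step(n, kernel=1e-4, dt=0.1)
    print("total number:", sum(n))
    print("total mass  :", total_mass(n))
```

Each collision removes two particles and creates one, so the total number falls while the total mass stays fixed, which is a useful sanity check for any such scheme. Production codes typically use sectional or modal representations and size-dependent kernels rather than this brute-force form.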
General Computational needs for future climate/chemistry modeling
• Hardware
– Sustained performance of about 250 Gflops
– Peak flop to byte (on processor): 2 to 1
– Aggregate memory: 1-2 Tbytes
– Cache at least 8 MB, hopefully 16 MB
– Inter-node, bi-directional bandwidth: 1 - 5 Gbytes
– Latency: 5 micro-seconds
– Aggregate I/O bandwidth: 8 Gbytes/sec
– Disk needs: 10-50 Tbytes
• Software
– MPI and OpenMP
– BLAS, FFTs, LAPACK, SPHEREPACK, NetCDF
– F90, C, ??
– Totalview
– CDAT (PCMDI), IDL, ..
– Queuing: long running jobs
– parallel profilers
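The hardware figures listed above can be checked against each other with simple arithmetic, for example how long it would take to write the full aggregate memory to disk at the stated I/O bandwidth. The sketch assumes decimal units (1 Tbyte = 1000 Gbytes) and an I/O system that actually sustains 8 Gbytes/sec; the helper name is hypothetical.

```python
GBYTES_PER_TBYTE = 1000  # assumption: decimal units


def transfer_seconds(data_gbytes, bandwidth_gbytes_per_sec):
    """Time to move data_gbytes at a sustained bandwidth."""
    return data_gbytes / bandwidth_gbytes_per_sec


if __name__ == "__main__":
    io_bandwidth = 8  # Gbytes/sec, aggregate I/O bandwidth from the list
    for memory_tbytes in (1, 2):  # aggregate memory range from the list
        seconds = transfer_seconds(memory_tbytes * GBYTES_PER_TBYTE, io_bandwidth)
        print(f"{memory_tbytes} Tbytes of memory -> {seconds:.0f} s to write")
```

Under these assumptions a full memory dump takes a few minutes, which is small next to multi-hour production runs.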
Moving forward in Biology: from sequence to function
• Key elements of upcoming computational biological research
– Characterize the link between protein sequence and fold topology
– Quantitative determination of protein structure from folding or conformational searches
– Simulate the biochemical function of individual gene products
Towards, for example,
Individualized medicine
Re-engineering microbes for bio-remediation
See http://cbcg.lbl.gov/ssi-csb
Experimental and computational activities are becoming more co-dependent
Computational biology involves modeling at many different levels of description
Homology-based Structure Prediction
• Protein structures
• Structure-based homologies
Classical Molecular Dynamics and Molecular Mechanics
• Dynamic structural data
• Solvent distributions
• Docking
First Principles Quantum Mechanics
• Molecular structures
• Reaction energies
• Spectra
• Solvation energies
• Reaction rates
First Principles Molecular Dynamics
• Dynamic structural data (fast processes < 1 ps)
• Solvent distributions
• Quantitative energetics
[Diagram label: Increasing dependence on empirical data]
Chemical modeling plays two roles in support of biological research
1) Analytical: Predict accurate chemical properties:
• Molecular structures
• Chemical reaction energies
• Spectroscopic values
2) Qualitative: Explain observed phenomena:
• Factors favoring helix formation
• Structure of parallel DNA
• Conformation of DNA-adducts
[Figure: curves at pH 9.0, 7.0, 5.0 and 3.0; x-axis 220 to 380; y-axis 0 to 1.4; labeled values 234, 257, 273, 318, 343]
Loosely Coupled Clusters Provide High-Throughput Capacity for Comprehensive Biological Studies
[Figure: chemical structures of the compounds studied: Apigenin, Flavonol, Luteolin, Isorhamnetin, 5ohflavone, 5moflavone, Chrysin, Flavone, 6moflavone, Kaempferol, 4'moflavone, Flavanone, 7ohflavone, Diosmetin, 2'moflavone, Fisetin, Tangeretin, 4'moflavanone, Isosakuranetin, 2ohChalcone, 6moflavonol, Naringenin, Quercetin, Chalcone, 4'moChalcone, 5moflavanone, Pinostrobin, Hesperetin, 6moflavanone, 4'ohflavanone, Myricetin, Eriodictyol, Biochanin A, Phloretin, Robinetin, Morin, 6ohflavone, 2'ohflavanone, 6ohflavanone]
Ab initio quantum chemical calculations
Simulation of bioflavonoid cancer-preventative compounds for structure-activity study
Structures and barriers to ring planarity calculated using the Hartree-Fock method with a 6-31G* basis set. The energy to form a planar structure is correlated to bioactivity.
Simulation of binding energetics for natural and synthetic DNA bases
[Figure: base pairs G-C, A-T, Z-F, Z-T, A-F]
Binding energies and structures calculated using DFT/B3LYP, Hartree-Fock and Møller-Plesset perturbation theory with a 6-31G** basis set. Arrows indicate calculated dipole moments of individual bases.
MPP Computers Provide Unique Capability for Simulations at an Unprecedented Accuracy and Scale
First Principles Molecular Dynamics Simulations
Aqueous-phase reactions: HF-H2O mixture showing proton exchange and electron density (600 atom simulation took 12 days on 3840 processors of ASCI Blue)
[Figure labels: water, hydrogen fluoride, electron density isosurface]
Solvation effects on DNA backbone: Solvated dimethyl phosphate (3.5 ps took 30 days on 104 processors of ASCI Blue)
[Figure label: dimethyl phosphate]
Figures courtesy of Francois Gygi
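The scale of the two runs quoted above is easier to compare in processor-hours. The sketch below does the arithmetic, assuming the quoted days are wall-clock days with every processor in use throughout; the helper name is hypothetical.

```python
def processor_hours(days, processors):
    """Processor-hours for a run, assuming full use of all processors."""
    return days * 24 * processors


if __name__ == "__main__":
    hf_water = processor_hours(days=12, processors=3840)       # HF-H2O mixture, 600 atoms
    dimethyl_phosphate = processor_hours(days=30, processors=104)  # solvated run, 3.5 ps
    print("HF-H2O mixture    :", hf_water, "processor-hours")
    print("Dimethyl phosphate:", dimethyl_phosphate, "processor-hours")
    print("Per simulated ps  :", dimethyl_phosphate / 3.5, "processor-hours (dimethyl phosphate)")
```

Under that assumption the HF-H2O run comes to 1,105,920 processor-hours and the dimethyl phosphate run to 74,880.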
Computational Requirements for Biochemical Simulations
• 1-10 TeraFLOPs: First-principles dynamics for enzyme mechanisms. [Figure: Proposed active site for Exo III DNA nuclease (Barsky, et al. unpublished results.)]
• 0.1-1 PetaFLOPs: Mixed classical/first-principles dynamics for complete enzyme. [Figure: Experimental structure of DNA polymerase I with DNA binding site predicted by modeling. (Doublie, et al. Nature, 391 (1998) 251-258.)]
• >1 ExaFLOPs: Mixed classical/first-principles dynamics for multiprotein-nucleic acid complex. [Figure: Electron micrograph reconstruction of E. coli 70S ribosome (Frank, et al. Nature, 376 (1995) 441-444.)]
General Computational needs for future computational biology modeling
• Hardware
– several hundred Mbytes per processor
– Gigabyte per second inter-processor communication needs
• Interconnects
– community relies on quality access to dispersed databases and information
– Next Generation Internet or similar high bandwidth connections are essential