Genomes to Life: A Partnership between Biology and Computing
Post on 12-Jan-2016
1
OASCR Genomes to Life OBER
Genomes to Life: a partnership between Biology and Computing
http://www.doegenomestolife.org/
Gary Johnson, John Houghton
Office of Science
2
3
Office of Advanced Scientific Computing Research:
Mathematical, Information, and Computational Sciences
a brief overview
http://www.sc.doe.gov/production/octr/mics/index.html
4
MICS Mission
Discover, develop, and deploy the computational and networking advances that enable researchers in the scientific disciplines to analyze, model, simulate, and predict complex physical, chemical, and biological phenomena important to the Department of Energy (DOE).
• Support a broad research portfolio in advanced scientific computing: applied mathematics, computer science, networking, and collaboratory software
• Operate supercomputers, a high-performance network, and related facilities
5
Program Strategy
[Diagram: MICS program strategy]
Basic Research
• Applied Mathematics
• Computer Science
• Networking
Research to enable…
…simulation of complex systems
• Scientific Application Pilots
• Materials
• Chemical
• Combustion
• Accelerator
• HEP
• Nuclear
• Fusion
• Climate
• Astrophysics
…distributed teams, remote access to facilities
• Collaboratory Tools
• Collaboratory Pilots
• Grid enabling research
• Integrated Software Infrastructure Centers
Teams: mathematicians, computer scientists, application scientists, and software engineers
High Performance Computing and Network Facilities for Science
• Energy Sciences Network (ESnet)
• Advanced Computing Research Facilities
• National Energy Research Scientific Computing Center (NERSC)
• Topical Computing
BES, BER, FES, HEP, NP
• Nanoscience
• Computational Biology
6
Budget Request
FY2003: $166,625,000
[Pie chart: budget request by category]
Chart labels: Base Research, Comp. Bio., SciDAC, Facilities, SBIR/STTR
Chart values: 32%, 5%, 25%, 35%, 3%
Enhancements over FY2002
• Computational Biology +$5.6M
• SciDAC +$5.3M
• Facilities +$1.3M
7
[Plot: Time to Solution (axis from 0 to 200) versus Problem Size (increasing with number of processors, axis from 1 to 1000), with one curve labeled "unscalable" and one labeled "scalable"]
From the “simple”… …to the complex!
Linear Solvers: Ax = b
Nonlinear Solvers: F(u,x,y,z) = 0
PDE Solvers: F(u,u’,u’’,…,x,y,z,t) = 0
Eigensolvers: Ax = λBx
Algorithms must be scalable. Ideally, as the problem size and the number of processors grow, the solution time does not!
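The scalability point above can be illustrated with a toy model. The sketch below is not from the slides: it assumes the problem size N grows in proportion to the processor count P, that the work divides perfectly with no communication cost, and that a "scalable" algorithm does O(N) work while an "unscalable" one does O(N^2). The function names and constants (`unknowns_per_processor`, `cost_per_op`) are hypothetical and chosen only for illustration.

```python
# Toy weak-scaling model: problem size grows in proportion to processor count.
# All constants here are illustrative assumptions, not measurements.

def time_to_solution(processors, work_exponent,
                     unknowns_per_processor=1000, cost_per_op=1e-6):
    """Idealized time for an algorithm whose total work is N**work_exponent,
    split evenly over `processors` with no communication cost."""
    n = unknowns_per_processor * processors        # problem size N grows with P
    total_work = n ** work_exponent                # operations needed
    return cost_per_op * total_work / processors   # perfect parallel split


def scaling_table(processor_counts=(1, 10, 100, 1000)):
    """Return (P, scalable time, unscalable time) for each processor count."""
    rows = []
    for p in processor_counts:
        scalable = time_to_solution(p, work_exponent=1.0)    # O(N) work
        unscalable = time_to_solution(p, work_exponent=2.0)  # O(N^2) work
        rows.append((p, scalable, unscalable))
    return rows


if __name__ == "__main__":
    for p, t_scalable, t_unscalable in scaling_table():
        print(f"P={p:5d}  scalable={t_scalable:.3e} s  unscalable={t_unscalable:.3e} s")
```

Under these assumptions the O(N) algorithm's time stays flat as P grows, while the O(N^2) algorithm's time grows linearly with P even though every added processor is fully used.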
Combustion
~60 coupled, nonsymmetric, nonlinear, time-dependent PDEs on 10M mesh points. Time steps range from 10^-12 (for chemical reaction rates) to 10^-2 (for the speed of the flame front).
Protein Folding
Current simulations use 44 amino acids. An actual protein has ~300 amino acids. Run times using current techniques? Greater than the life of the universe!
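A rough sense of the size of the combustion example follows from the slide's own figures. The sketch below reads the time-step range as 10^-12 to 10^-2 (the slide gives no units) and assumes one unknown per PDE per mesh point, which is a simplification introduced here rather than something the slide states.

```python
# Back-of-the-envelope numbers for the combustion example, using only the
# figures quoted on the slide. Time-step units are not given in the source.

num_pdes = 60               # "~60 coupled ... PDEs"
mesh_points = 10_000_000    # "10M mesh points"
smallest_step = 1e-12       # set by chemical reaction rates
largest_step = 1e-2         # set by the speed of the flame front

# Assumption: one unknown per PDE per mesh point.
unknowns = num_pdes * mesh_points

# Ratio between the slowest and fastest time scales the solver must span.
scale_ratio = largest_step / smallest_step

print(f"unknowns: {unknowns:,}")
print(f"time-scale ratio: {scale_ratio:.0e}")
```

With these inputs the system has 600 million unknowns and the time scales differ by ten orders of magnitude, which is why such problems need scalable solvers.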
Applied Mathematical Sciences
8
AMS Base Research Program
Objectives
Advance our understanding of science and technology by supporting research in basic applied mathematics and in computational research that facilitates the use of the latest high-performance computer systems.
Ongoing Projects
Applied Mathematics Research:
• Linear Algebra
• Fluid Dynamics
• Differential Eqs.
• Optimization
• Grid Generation
• Predictability Analysis & Uncertainty Quantification
• Automated Reasoning
Advanced Numerical Algorithms:
• PETSc
• Aztec
• TAO
• ADIFOR / ADIC
• Hypre
• CHOMBO
• SuperLU
• PICO
Accomplishments
• Robust High-Performance Numerical Libraries
• Adaptive Mesh Refinement (AMR)
• Sustained Teraflop/s simulations
• Level Set / Fast Marching Methods
• Investment in Education: Computational Sciences Graduate Fellowship
Growth Opportunities
• Ultrascalable Algorithms (up to millions of PEs)
• Mathematical Microscopy
These opportunities will be explored through
• Genomes to Life (with BER)
• Comp. Nanoscience (with BES)
• Fusion Energy (FESAC-ASCAC workshop)
9
Computer Science Research
• Challenge: HPC for Science is (still, after fifteen years!)
– Hard to use
– Inefficient
– Fragile
– An unimportant vendor market
• Vision
– A comprehensive, integrated software environment which enables the effective application of high performance systems to critical DOE problems
• Goal
– Radical improvement in application performance, ease of use, and time to solution
[Diagram: HPC System Elements]
Scientific Applications, Software Development, System Admin
Chkpt/Rstrt, Math Libs, Debuggers, Viz/Data, Scheduler, PSEs, Res. Mgt, Frameworks, Compilers, Perf Tools, File Sys, Runtime Tools
User Space Runtime Support
OS Kernel, OS Bypass
Node and System Hardware Arch
10
Computer Science Technical Elements
[Pie chart: funding by technical element]
• Interoperability & Portability Tools: $6.5M (25%)
• System Software Environment: $4.7M (19%)
• Performance Evaluation & Optimization: $4.5M (18%)
• Programming Models & Runtime: $3.8M (15%)
• Visualization & Data Understanding: $5.8M (23%)
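The dollar figures and percentages on this slide can be cross-checked with simple arithmetic. The sketch below recomputes each element's share from the listed amounts. Pairing each amount with a percentage by magnitude is an inference made here, since the extracted slide lists the percentages separately from the labels.

```python
# Funding figures as listed on the slide, in millions of dollars.
elements = {
    "Interoperability & Portability Tools": 6.5,
    "System Software Environment": 4.7,
    "Performance Evaluation & Optimization": 4.5,
    "Programming Models & Runtime": 3.8,
    "Visualization & Data Understanding": 5.8,
}

# Total across the five technical elements.
total = sum(elements.values())

# Each element's share of the total, in percent.
shares = {name: 100 * amount / total for name, amount in elements.items()}

for name, amount in elements.items():
    print(f"{name:40s} ${amount:.1f}M  {shares[name]:5.1f}%")
print(f"{'Total':40s} ${total:.1f}M")
```

The five amounts sum to $25.3M, and the computed shares fall within one percentage point of the slide's 25%, 19%, 18%, 15%, and 23%.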
11
Major Accomplishments
• PVM – the first widely successful model for parallel computing
• MPI – the lingua franca of today’s parallel computing
• MPICH – the open source version of MPI that is the basis for all vendor adaptations
• Global Arrays – the distributed shared memory programming model that is at the core of NWChem, the motivating application for SciDAC
• CTSS – the first interactive operating system for high performance computers
• SUNMOS/Puma/Cougar – the most successful high performance parallel operating system
• OSCAR – a partnership with industry, the most widely used open source toolkit for management of Linux clusters
12
National Collaboratories: Why?
• The nature of how large-scale science is done is changing
– Distributed data, computing, people, instruments
– Instruments integrated with large-scale computing
– Human resources are seldom collocated with the resources needed for their science
• Additional drivers
– Large and international collaborations
– Management of unique national user facilities
– Large multi-laboratory science and engineering projects
13
An End-to-End Problem for Applications
Many different types of objects need to be connected to and coordinated by the networks
[Diagram labels: Scientist; ESnet; NERSC Supercomputing & Large-Scale Storage; PNNL; LBNL; ANL; ORNL; MDSCA; Europe; Asia-Pacific]
14
Staff
– Ed Oliver, Associate Director for Advanced Scientific Computing Research
– Dan Hitchcock, Senior Scientific Advisor
– Linda Twenty, Senior Budget & Financial Specialist
– Walt Polansky, Acting Director MICS
– Gary Johnson, ACRTs, Computational Biology
– Fred Johnson, Computer Science
– William (Buff) Miner, NERSC & Scientific Applications
– Thomas Ndousse-Fetter, Network Research
– Kimberly Rasar, Senior Info. Tech. (SciDAC)
– Chuck Romine, Applied Mathematics
– Mary Anne Scott, Collaboratories
– George Seweryniak, ESnet
– John van Rosendale, Computer Science: Visualization and Data Management
– Vacancies (2)
– Jane Hiegel
– Susan Kilroy
Phone: 301-903-5800
Fax: 301-903-7774
http://www.sc.doe.gov/production/octr/mics/index.html
15
OASCR Advisory Committee
• Committee Chair: Margaret Wright, NYU
• Subcommittee Chairs:
– Biology: Juan Meza, LBNL
– Computing Infrastructure: Jill Dahlberg, General Atomics
• Members in common with BERAC: Warren Washington, NCAR
• Next Meeting: 2-3 May 2002
Crowne Plaza Hotel
14th and K Streets
Washington, DC
16
Genomes to Life Program History
• Phased program startup
– FY 2002: OBER
– FY 2003: OASCR
• Precursor activity
– FN 01-21: Advanced Modeling and Simulation of Biological Systems
– 9 Awards, $3M
• Current solicitations
– FN 02-13: Genomes to Life
• Program planning
– 5 workshops
– Goal 4 roadmap
– Update to GTL roadmap
17
GTL Planning Activities
• 7-8 August GTL Computing Workshop
• 6-7 September Systems Biology & GTL Workshop
• 22-23 January Computing Infrastructure Workshop
• 6-7 March Computer Science for GTL Workshop
• 18-19 March Mathematics for GTL Workshop
• 19 April Draft Goal 4 Roadmap
• Future New Edition of the GTL Roadmap
18
GTL Goal 4 Roadmap
19
Genomes to Life Goals
Goal 1 Identify and Characterize the Molecular Machines
of Life – the Multiprotein Complexes that Execute
Cellular Functions and Govern Cell Form
Goal 2 Characterize Gene Regulatory Networks
Goal 3 Characterize the Functional Repertoire of Complex
Microbial Communities in their Natural Environments
at the Molecular Level
Goal 4 Develop the Computational Methods and Capabilities
to Advance Understanding of Complex Biological
Systems and Predict their Behavior
20
Three Computing Domains
• Bioinformatics/Data-Intensive Applications
• Biophysics/Compute-Intensive Applications
• Biosystems/Complex Systems Modeling
21
Biology & Computing Perspectives
22
Domain Challenges
• Bioinformatics
– Heterogeneous, large and growing data sets
– Legacy systems that don’t interoperate and don’t scale
• Biophysics
– Already bumping up against computational resources
– Needs: more computation, better algorithms, new theory
• Biosystems
– Too much data not to have models
– Data-poor and biology-poor
– Parts list short, but complex systems
23
Initial Thoughts on Computational Infrastructure