Genomes to Life: a partnership between Biology and Computing

Post on 12-Jan-2016

DESCRIPTION

Genomes to Life: a partnership between Biology and Computing. Gary Johnson, John Houghton, Office of Science. http://www.doegenomestolife.org/. Office of Advanced Scientific Computing Research: Mathematical, Information, and Computational Sciences. A brief overview. - PowerPoint PPT Presentation

TRANSCRIPT

1

OASCR Genomes to Life OBER

Genomes to Life: a partnership between Biology and Computing

http://www.doegenomestolife.org/

Gary Johnson, John Houghton

Office of Science

2

3

Office of Advanced Scientific Computing Research:

Mathematical, Information, and Computational Sciences

a brief overview

http://www.sc.doe.gov/production/octr/mics/index.html

4

MICS Mission

Discover, develop, and deploy the computational and networking advances that enable researchers in the scientific disciplines to analyze, model, simulate, and predict complex physical, chemical, and biological phenomena important to the Department of Energy (DOE).

• Support a broad research portfolio in advanced scientific computing – applied mathematics, computer science, networking, and collaboratory software

• Operate supercomputers, a high performance network, and related facilities

5

Program Strategy

Basic Research

…simulation of complex systems

…distributed teams, remote access to facilities

Energy Sciences Network (ESnet)

Advanced Computing Research Facilities

National Energy Research Scientific Computing Center (NERSC)

• Materials

• Chemical

• Combustion

• Accelerator

• HEP

• Nuclear

• Fusion

• Climate

• Astrophysics

• Applied Mathematics

• Computer Science

• Scientific Application Pilots

• Collaboratory Tools

• Collaboratory Pilots

BES, BER, FES, HEP, NP

• Integrated Software Infrastructure Centers

Teams: mathematicians, computer scientists, application scientists, and software engineers

High Performance Computing and Network Facilities for Science

Research to enable…

• Grid enabling research

• Topical Computing

• Networking

• Nanoscience

Computational Biology

6

Budget Request

FY 2003: $166,625,000

[Pie chart: budget shares. Categories: Base Research, Comp. Bio., SciDAC, Facilities, SBIR/STTR. Shares as they appear in the transcript: 32%, 5%, 25%, 35%, 3%]

Enhancements over FY2002

• Computational Biology +$5.6M

• SciDAC +$5.3M

• Facilities +$1.3M
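
As a quick arithmetic check of the figures on this slide, the sketch below converts the five pie-chart shares into dollar amounts against the $166,625,000 request and sums the listed enhancements. The shares are taken in transcript order and are deliberately not paired with category names, since that pairing cannot be read off the transcript. All variable names are illustrative.

```python
# Figures copied from the slide; names are illustrative, not from the source.
FY2003_REQUEST = 166_625_000          # dollars
SHARES_PERCENT = [32, 5, 25, 35, 3]   # pie-chart shares, in transcript order
ENHANCEMENTS_M = {                    # increases over FY2002, in $M
    "Computational Biology": 5.6,
    "SciDAC": 5.3,
    "Facilities": 1.3,
}


def share_to_dollars(total, percent):
    """Dollar amount corresponding to a percentage share of the total."""
    return total * percent / 100


share_dollars = [share_to_dollars(FY2003_REQUEST, p) for p in SHARES_PERCENT]
total_enhancement_m = round(sum(ENHANCEMENTS_M.values()), 1)

print("shares sum to", sum(SHARES_PERCENT), "percent")
for pct, dollars in zip(SHARES_PERCENT, share_dollars):
    print(f"{pct:>3}% -> ${dollars:,.0f}")
print(f"listed enhancements total ${total_enhancement_m}M")
```

The shares add up to 100%, so the five slices account for the whole request.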

7

[Plot: Time to Solution (axis 0 to 200) versus Problem Size, increasing with number of processors (axis 1 to 1000), with one curve labeled "unscalable" and one labeled "scalable"]

From the “simple”… …to the complex!

• Linear Solvers: Ax=b

• Nonlinear Solvers: F(u,x,y,z)=0

• PDE Solvers: F(u,u’,u’’,…,x,y,z,t)=0

• Eigensolvers: Ax=λBx
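
To make the "simple" end of this spectrum concrete, here is a minimal sketch of a direct dense solver for Ax=b using Gaussian elimination with partial pivoting. It is an illustration only and is not taken from any library named in this talk; the function name `solve_linear` and the 3×3 example system are assumptions.

```python
def solve_linear(A, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy so the caller's inputs are left untouched.
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Choose the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for k in range(col, n + 1):
                M[row][k] -= factor * M[col][k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(M[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (M[row][n] - s) / M[row][row]
    return x


# Hypothetical example system, chosen so the exact solution is (1, 2, 3).
A = [[4, -1, 0], [-1, 4, -1], [0, -1, 4]]
b = [2, 4, 10]
x = solve_linear(A, b)
print(x)
```

Dense elimination costs on the order of n³ operations, which is one reason large simulations rely on sparse and iterative solvers instead.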

Algorithms must be scalable. Ideally, as the problem size grows and the number of processors grows, the solution time does not!
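
This requirement is commonly expressed as weak-scaling efficiency: with the work per processor held fixed, efficiency is T(1)/T(P), and a scalable algorithm keeps it near 1. The sketch below uses two made-up timing models, one constant and one with an overhead term that grows with the processor count. The constants are assumptions chosen only for illustration and are not measurements from the plot.

```python
def weak_scaling_efficiency(time_on_one, time_on_p):
    """Weak-scaling efficiency: 1.0 means time to solution did not grow."""
    return time_on_one / time_on_p


def scalable_time(p, base=50.0):
    # Hypothetical model: time to solution is independent of processor count.
    return base


def unscalable_time(p, base=50.0, overhead_per_proc=0.15):
    # Hypothetical model: an overhead term grows linearly with P.
    return base + overhead_per_proc * (p - 1)


PROCESSOR_COUNTS = [1, 10, 100, 1000]
results = {
    p: (
        weak_scaling_efficiency(scalable_time(1), scalable_time(p)),
        weak_scaling_efficiency(unscalable_time(1), unscalable_time(p)),
    )
    for p in PROCESSOR_COUNTS
}

for p, (eff_scalable, eff_unscalable) in results.items():
    print(f"P={p:>4}  scalable={eff_scalable:.2f}  unscalable={eff_unscalable:.2f}")
```

Under these assumed models the scalable case stays at efficiency 1.0, while the unscalable case loses most of its efficiency by 1000 processors.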

Combustion

~60 coupled, nonsymmetric, nonlinear time-dependent PDEs on 10M mesh points. Time steps range from 10^-12 (for chemical reaction rates) to 10^-2 (for the speed of flame front)
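
Reading the two time steps as 10^-12 and 10^-2 (the exponents appear to have been flattened in the transcript), the fastest and slowest scales are ten orders of magnitude apart. The short calculation below shows what that implies for a scheme forced to step at the fastest scale, and how many unknowns are updated per step. The variable names are illustrative.

```python
FAST_STEP = 1e-12          # chemical reaction time scale, from the slide
SLOW_STEP = 1e-2           # flame-front time scale, from the slide
NUM_PDES = 60              # "~60 coupled" equations
MESH_POINTS = 10_000_000   # "10M mesh points"

# Ratio of slowest to fastest scale: a rough measure of stiffness.
scale_ratio = SLOW_STEP / FAST_STEP

# Steps needed to cover one slow time scale when stepping at the fast one.
steps_per_slow_scale = round(scale_ratio)

# Unknowns updated at every step: one value per equation per mesh point.
unknowns = NUM_PDES * MESH_POINTS

print(f"steps per slow time scale: {steps_per_slow_scale:.1e}")
print(f"unknowns per step: {unknowns:,}")
```

A spread like this is typical of stiff problems, which are usually handled with implicit or multi-rate time integration rather than uniform fine stepping.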

Protein Folding

Current simulations use 44 amino acids. Actual protein ~300 amino acids. Run times using current techniques? Greater than life of the universe!
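
One rough way to see how run time can exceed the age of the universe is a Levinthal-style counting argument. This is not necessarily the estimate behind the slide, which does not say how its figure was obtained. The sketch assumes three conformations per residue and an optimistic sampling rate; both numbers are assumptions, and only the 44 and 300 residue counts come from the slide.

```python
CONFORMATIONS_PER_RESIDUE = 3   # assumption (Levinthal-style counting)
SAMPLES_PER_SECOND = 10**13     # assumption: very optimistic sampling rate
AGE_OF_UNIVERSE_S = 4.35e17     # roughly 13.8 billion years, in seconds


def exhaustive_search_seconds(num_residues):
    """Seconds needed to visit every conformation once at the assumed rate."""
    conformations = CONFORMATIONS_PER_RESIDUE ** num_residues
    return conformations / SAMPLES_PER_SECOND


seconds_44 = exhaustive_search_seconds(44)     # size of current simulations
seconds_300 = exhaustive_search_seconds(300)   # size of an actual protein

print(f"44 residues:  {seconds_44:.2e} s")
print(f"300 residues: {seconds_300:.2e} s")
print("300 residues exceeds age of universe:", seconds_300 > AGE_OF_UNIVERSE_S)
```

The growth is exponential in chain length, so faster hardware alone does not close the gap between 44 and 300 residues.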

Applied Mathematical Sciences

8

AMS Base Research Program

Objectives

Advance our understanding of science and technology by supporting research in basic applied mathematics and in computational research that facilitates the use of the latest high-performance computer systems.

Ongoing Projects

• Applied Mathematics Research: Linear Algebra, Fluid Dynamics, Differential Eqs., Optimization, Grid Generation, Predictability Analysis & Uncertainty Quantification, Automated Reasoning

• Advanced Numerical Algorithms: PETSc, Aztec, TAO, ADIFOR / ADIC, Hypre, CHOMBO, SuperLU, PICO

Accomplishments

• Robust High-Performance Numerical Libraries

• Adaptive Mesh Refinement (AMR)

• Sustained Teraflop/s simulations

• Level Set / Fast Marching Methods

• Investment in Education: Computational Sciences Graduate Fellowship

Growth Opportunities

• Ultrascalable Algorithms (up to millions of PEs)

• Mathematical Microscopy

These opportunities will be explored through

• Genomes to Life (with BER)

• Comp. Nanoscience (with BES)

• Fusion Energy (FESAC-ASCAC workshop)

9

Computer Science Research

• Challenge – HPC for Science is (still after fifteen years!)

– Hard to use

– Inefficient

– Fragile

– An unimportant vendor market

• Vision

– A comprehensive, integrated software environment which enables the effective application of high performance systems to critical DOE problems

• Goal – Radical Improvement in

– Application Performance

– Ease of Use

– Time to Solution

[Diagram: HPC System Elements. Labels: Scientific Applications; Software Development; System Admin; Checkpoint/Restart; Math Libs; Debuggers; Viz/Data; Scheduler; PSEs; Resource Management; Frameworks; Compilers; Perf Tools; File System; Runtime Tools; User Space Runtime Support; OS Bypass; OS Kernel; Node and System Hardware Arch]

10

Computer Science Technical Elements

• Interoperability & Portability Tools: $6.5M (25%)

• System Software Environment: $4.7M (19%)

• Performance Evaluation & Optimization: $4.5M (18%)

• Programming Models & Runtime: $3.8M (15%)

• Visualization & Data Understanding: $5.8M (23%)
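
The percentages on this slide can be cross-checked against the dollar amounts. The sketch below pairs each technical element with its amount and slide percentage in transcript order (that pairing is an assumption), then recomputes each share from the total. Variable names are illustrative.

```python
# Amounts in $M and slide percentages, paired in transcript order (assumed).
FUNDING_M = {
    "Interoperability & Portability Tools": 6.5,
    "System Software Environment": 4.7,
    "Performance Evaluation & Optimization": 4.5,
    "Programming Models & Runtime": 3.8,
    "Visualization & Data Understanding": 5.8,
}
SLIDE_PERCENT = {
    "Interoperability & Portability Tools": 25,
    "System Software Environment": 19,
    "Performance Evaluation & Optimization": 18,
    "Programming Models & Runtime": 15,
    "Visualization & Data Understanding": 23,
}

total_m = round(sum(FUNDING_M.values()), 1)
shares_percent = {
    name: 100 * amount / total_m for name, amount in FUNDING_M.items()
}

print(f"total: ${total_m}M")
for name, share in shares_percent.items():
    print(f"{name}: computed {share:.1f}%, slide {SLIDE_PERCENT[name]}%")
```

Each computed share lands within one percentage point of the slide's figure. The first element computes to about 25.7%, while the slide shows 25%, presumably so that the five shares sum to exactly 100.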

11

Major Accomplishments

• PVM – the first widely successful model for parallel computing

• MPI – the lingua franca of today’s parallel computing

• MPICH – the open source version of MPI that is the basis for all vendor adaptations

• Global Arrays – the distributed shared memory programming model that is at the core of NWChem, the motivating application for SciDAC

• CTSS – the first interactive operating system for high performance computers

• SUNMOS/Puma/Cougar – the most successful high performance parallel operating system

• OSCAR – a partnership with industry, the most widely used open source toolkit for management of Linux clusters

12

National Collaboratories

• The nature of how large scale science is done is changing

– Distributed data, computing, people, instruments

– Instruments integrated with large-scale computing

– Human resources are seldom collocated with the resources needed for their science

• Additional drivers

– Large and international collaborations

– Management of unique national user facilities

– Large multi-laboratory science and engineering projects

Why?

13

An End-to-End Problem for Applications

Many different types of objects need to be connected to and coordinated by the networks

[Diagram labels: Scientist; ESnet; NERSC (Supercomputing & Large-Scale Storage); PNNL; LBNL; ANL; ORNL; MDSCA; Europe; Asia-Pacific]

14

Staff

– Ed Oliver, Associate Director for Advanced Scientific Computing Research

– Dan Hitchcock, Senior Scientific Advisor

– Linda Twenty, Senior Budget & Financial Specialist

– Walt Polansky, Acting Director MICS

– Gary Johnson, ACRTs, Computational Biology

– Fred Johnson, Computer Science

– William (Buff) Miner, NERSC & Scientific Applications

– Thomas Ndousse-Fetter, Network Research

– Kimberly Rasar, Senior Info. Tech. (SciDAC)

– Chuck Romine, Applied Mathematics

– Mary Anne Scott, Collaboratories

– George Seweryniak, ESnet

– John van Rosendale, Computer Science: Visualization and Data Management

– Vacancies (2)

– Jane Hiegel

– Susan Kilroy

Phone: 301-903-5800

Fax: 301-903-7774

http://www.sc.doe.gov/production/octr/mics/index.html

15

OASCR Advisory Committee

• Committee Chair: Margaret Wright, NYU

• Subcommittee Chairs:

– Biology: Juan Meza, LBNL

– Computing Infrastructure: Jill Dahlberg, General Atomics

• Members in common with BERAC: Warren Washington, NCAR

• Next Meeting: 2-3 May 2002, Crowne Plaza Hotel, 14th and K Streets, Washington, DC

16

Genomes to Life Program History

• Phased program startup

– FY 2002: OBER

– FY 2003: OASCR

• Precursor activity

– FN 01-21: Advanced Modeling and Simulation of Biological Systems

– 9 Awards, $3M

• Current solicitations

– FN 02-13: Genomes to Life

• Program planning

– 5 workshops

– Goal 4 roadmap

– Update to GTL roadmap

17

GTL Planning Activities

• 7-8 August GTL Computing Workshop

• 6-7 September Systems Biology & GTL Workshop

• 22-23 January Computing Infrastructure Workshop

• 6-7 March Computer Science for GTL Workshop

• 18-19 March Mathematics for GTL Workshop

• 19 April Draft Goal 4 Roadmap

• Future New Edition of the GTL Roadmap

18

GTL Goal 4 Roadmap

19

Genomes to Life Goals

Goal 1: Identify and Characterize the Molecular Machines of Life – the Multiprotein Complexes that Execute Cellular Functions and Govern Cell Form

Goal 2: Characterize Gene Regulatory Networks

Goal 3: Characterize the Functional Repertoire of Complex Microbial Communities in their Natural Environments at the Molecular Level

Goal 4: Develop the Computational Methods and Capabilities to Advance Understanding of Complex Biological Systems and Predict their Behavior

20

Three Computing Domains

• Bioinformatics/Data-Intensive Applications

• Biophysics/Compute-Intensive Applications

• Biosystems/Complex Systems Modeling

21

Biology & Computing Perspectives

22

Domain Challenges

• Bioinformatics

– Heterogeneous, large and growing data sets

– Legacy systems that don’t interoperate and don’t scale

• Biophysics

– Already bumping up against computational resources

    • More computation, better algorithms, new theory

• Biosystems

– Too much data not to have models

– Data-poor and biology-poor

– Parts list short, but complex systems

23

Initial Thoughts on Computational Infrastructure
