Data- and Compute-Driven Transformation of Modern Science: Update on the NSF Cyberinfrastructure Vision. People, Sustainability, Innovation, Integration. Edward Seidel, Acting Assistant Director, Mathematical and Physical Sciences, NSF (Director, Office of Cyberinfrastructure)

Upload: darcy-bryan

Post on 12-Jan-2016


TRANSCRIPT

Page 1:

Data- and Compute-Driven Transformation of Modern Science

Update on the NSF Cyberinfrastructure Vision

People, Sustainability, Innovation, Integration

Edward Seidel

Acting Assistant Director, Mathematical and Physical Sciences, NSF

(Director, Office of Cyberinfrastructure)

Page 2:

Profound Transformation of Science

Gravitational Physics

• Galileo, Newton usher in birth of modern science: c. 1600

• Problem: single “particle” (apple) in gravitational field (general 2-body problem already too hard)

Methods

• Data: notebooks (Kbytes)

• Theory: driven by data

• Computation: calculus by hand (1 Flop/s)

Collaboration

• 1 brilliant scientist, 1-2 students

Page 3:

Profound Transformation of Science

Collision of Two Black Holes

Year: 1972

• Science Result: The “Pair of Pants”

• Team size: 1 person (S. Hawking)

• Computation: Flop/s

• Data produced: ~ Kbytes (text, hand-drawn sketch)

• 400 years later…same!

Year: 1994

• Science Result: The “Pair of Pants”

• Team size: ~ 10

• Data produced: ~ 50 Mbytes

• Impact of HPC taking root

Year: 1998

• Science Result: 3D Collision

• Team size: ~ 15

• Data produced: ~ 50 Gbytes
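As a rough illustration of the growth in data volume across the three results above, the following Python sketch computes the orders of magnitude separating them. It assumes decimal prefixes and reads the slide's "~ Kbytes" as about 1 Kbyte; the dictionary name and the helper `orders_of_magnitude` are illustrative and do not come from the talk.

```python
import math

# Data volumes quoted on the slide, in bytes.
# Assumptions: decimal prefixes, and "~ Kbytes" read as roughly 1 Kbyte.
DATA_PRODUCED = {
    1972: 1e3,    # ~ Kbytes (text, hand-drawn sketch)
    1994: 50e6,   # ~ 50 Mbytes
    1998: 50e9,   # ~ 50 Gbytes
}


def orders_of_magnitude(start_bytes, end_bytes):
    """Return log10 of the growth factor between two data volumes."""
    return math.log10(end_bytes / start_bytes)


if __name__ == "__main__":
    baseline = DATA_PRODUCED[1972]
    for year, size in sorted(DATA_PRODUCED.items()):
        growth = orders_of_magnitude(baseline, size)
        print(f"{year}: {size:.0e} bytes, {growth:.1f} orders of magnitude above 1972")
```

Under these assumptions the 1994 simulation sits about 4.7 orders of magnitude above the 1972 sketch, and the 1998 simulation about 7.7.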

Page 4:

Now: Complexity of Universe: LHC, Gamma-ray bursts!

Gamma-ray bursts!

• GR now soluble: complex problems in relativistic astro can now be attacked

• All energy emitted in lifetime of sun bursts out in a few seconds: what are they?! Colliding BH-NS? SN?

• GR, hydrodynamics, nuclear physics, radiation transport, neutrinos, magnetic fields: globally distributed collab!

• Scalable algorithms, complex AMR codes, viz, PFlops*week, PB output!

LHC: What is the nature of mass? Higgs particle?

• ~10K scientists, 33+ countries, 25PB data, distributed!

• Planetary lab for scientific discovery!

Remote Instrument
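To give a sense of scale for the "PFlops*week" figure quoted for gamma-ray burst simulations, here is a small Python sketch. It assumes the figure means one PFlop/s sustained for one week, and compares that with the 1 Flop/s "calculus by hand" rate from the earlier slide. The constant and function names are illustrative, and the Julian year length is an assumption.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60        # 604,800 seconds
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60   # Julian year (assumption)
PFLOPS = 1e15                              # 1 PFlop/s expressed in Flop/s
BY_HAND_FLOPS = 1.0                        # "calculus by hand (1 Flop/s)"


def total_operations(rate_flops, seconds):
    """Floating-point operations performed at a sustained rate."""
    return rate_flops * seconds


def years_at_rate(operations, rate_flops):
    """Years needed to perform the given operations at the given rate."""
    return operations / rate_flops / SECONDS_PER_YEAR


if __name__ == "__main__":
    ops = total_operations(PFLOPS, SECONDS_PER_WEEK)
    print(f"Operations in one PFlops*week: {ops:.3e}")
    print(f"Years at 1 Flop/s: {years_at_rate(ops, BY_HAND_FLOPS):.1e}")
```

Under that reading, one PFlops*week is about 6 x 10^20 operations, which would take on the order of 2 x 10^13 years at 1 Flop/s.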

Page 5:

Grand Challenge Communities Combine it All...

Where is it going to go?

Same CI useful for black holes, hurricanes

Page 6:

Grand Challenge Communities

Complex problems require many disciplines, all scales of collaborations, advanced CI

• Individuals, groups, teams, communities

Multiscale Collaborations: Beyond teams

• Grand Challenge Communities assemble dynamically

• Emergency forecasting: flu, hurricane, tornado...

• Gamma-ray bursts, supernovae

• They can only work by sharing data

Place requirements on CI: software, networks, collaborative environments, data, sharing, computing, etc.

• Scientific culture, reproducibility, access, university structures

New social networking technologies will be needed for collaborations at this scale.

Allen, Schnetter, et al.

Page 7:

NSF Vision and National CI Blueprint

[Diagram labels: Track 1; Track 2; Campus; DataNet; Software; Nets; Learning & Work Force Needs & Opportunities; Virtual Organizations for Distributed Communities; High Performance Computing; Data & Visualization/Interaction]

Education Crisis: I need all of this to start to solve my problem!

Page 8:

What is Needed?

NSF-wide CI Framework for 21st Century Science & Engineering

Page 9:

CF21: Cyberinfrastructure Framework for 21st Century Science & Engineering

• High-end computation, data, visualization for transformative science; sustainability, extensibility

• Facilities/centers as hubs of innovation

• MREFCs and collaborations including large-scale NSF collaborative facilities, international partners

• Software, tools, science applications, and VOs critical to science, integrally connected to hardware

• Campuses fundamentally linked; grids, clouds, loosely coupled campus services, policy to support

• People. Comprehensive approach to workforce development for 21st century science and engineering

Comprehensive, balanced, integrated, national high performance CI; Dear Colleague Letter released December 2009 by all units

Page 10:

ACCI Task Forces

• Campus Bridging: Craig Stewart, IU (BIO)

• Computing: Thomas Zacharia, ORNL/UTK (DOE)

• Grand Challenge Communities/VOs: Tinsley Oden, Austin (ENG)

• Education & Workforce: Alex Ramirez, CEOSE

• Software: David Keyes, Columbia/KAUST (MPS)

• Data & Viz: Shenda Baker, Harvey Mudd (MPS); Tony Hey, (CISE)

Timelines: 12-18 months

• Advising NSF

• Workshop(s)

• Recommendations

• Input to NSF informs CF21 programs, 2012 CI Vision Plan

Page 11:

Preliminary Task Force (TF) Results

Computing TF Workshop Interim Report

• Rec: Address sustainability, people, innovation

• Developing CF21-oriented HPC program

Software TF Interim Report

• Rec: Address sustainability, create long term, multi-directorate, multi-level software program

• Developing CF21-oriented integrated program

GCC/VO TF Interim Report

• Rec: Address sustainability, OCI to nurture computational science across NSF units

• Concept paper coming to NSF

PITAC: “inadequate structures within the Federal government and the academy today do not effectively support computational science”

Page 12:

Roadmap and Timelines

[Timeline diagram spanning 2010, 2011, 2012 and 2013, with Track 2 and DataNet awards marked along it]

Task Force Reports and Workshops

• NSF CF21 Strategic Plan 2012-2017

• Integration

• Stronger interagency interaction

• New science activities enabled

• National Petascale Facility

• CF21 Computing program; hubs of innovation

• CF21 Software

• People, VOs

• Better campus integration

• Major facilities CI planning

Updating for 2013 & Beyond

Page 13:

OCI Special Role in CF21

Driver for integrative CI activity via CF21

• Working with all units, community

• Develop vision and implementation plan

• OCI budget ¼ NSF CI; other units critical!

Catalyst for coordinated, linked investments

• CI in all forms: campus, centers, MREFC

• Leadership in R&D for prototypes, pilots, best practices

• Looking for coherence, re-use of CI

• Science applications enabled by CI

• People: supporting next generation of CI researchers

Steward for NSF-wide computational science

• Working with all NSF units to provide sustainable home

Page 14:

2009 PetaApps, CDI, CI-Reuse

70% OCI ARRA: Innovations in software, apps, people

PetaApps: OCI led, NSF-wide

• Partners: MPS, CISE, ENG, GEO and SBE

• 2009: $16M from OCI, matched for total of $35M!

• 2007-9 Total: 42 awards, ~200 proposals, $60M

• Equivalent to entire Track-2 award (including O&M)

CDI: CISE led, NSF-wide

• OCI a “Big 4” contributor in FY09! (CISE, ENG, OCI, MPS…), $63M total

• OCI contributed to 22 awards, more than $10M

CI Re-Use: Internal OCI-led NSF program

• OCI venture fund of $4M to catalyze CISE, GEO, OPP, BIO and MPS

• 13 awards, > $20M investments catalyzed by OCI
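The leverage implied by the PetaApps and CI Re-Use figures on this slide can be checked with simple arithmetic. The Python sketch below uses only dollar amounts quoted on the slide and treats "> $20M" as a lower bound of $20M. The variable and function names are illustrative, and whether the $20M includes the $4M venture fund is not stated in the source.

```python
# Dollar amounts quoted on the slide, in millions of dollars.
PETAAPPS_OCI_2009 = 16.0     # "$16M from OCI"
PETAAPPS_TOTAL_2009 = 35.0   # "matched for total of $35M"
REUSE_VENTURE_FUND = 4.0     # "OCI venture fund of $4M"
REUSE_CATALYZED_MIN = 20.0   # "> $20M investments catalyzed" (lower bound)


def partner_contribution(oci_amount, total_amount):
    """Amount contributed by partners other than OCI."""
    return total_amount - oci_amount


def leverage(seed_amount, resulting_amount):
    """Ratio of the resulting investment to the seed amount."""
    return resulting_amount / seed_amount


if __name__ == "__main__":
    partners = partner_contribution(PETAAPPS_OCI_2009, PETAAPPS_TOTAL_2009)
    print(f"PetaApps 2009 partner contribution: ${partners:.0f}M")
    print(f"PetaApps 2009 total per OCI dollar: "
          f"{leverage(PETAAPPS_OCI_2009, PETAAPPS_TOTAL_2009):.2f}")
    print(f"CI Re-Use catalyzed per venture-fund dollar: at least "
          f"{leverage(REUSE_VENTURE_FUND, REUSE_CATALYZED_MIN):.1f}")
```

On these figures, partners added about $19M to the 2009 PetaApps round, and the CI Re-Use venture fund catalyzed at least five times its own size.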

Page 15:

MREFC Projects: NEON, and Cyber-GIS

Page 16:

National Ecological Observatory Network

New horizons for large-scale biology

James Collins, Assistant Director

Biological Sciences Directorate, NSF

Office of Management and Budget Briefing

October 5, 2009

Page 17:

What is NEON?

In theory all life is interconnected…

NEON is an integrated sensing system to detect, understand, and forecast the consequences of climate and land use change and the effects of invasive species on the biosphere of the U.S. at the regional and continental scales.

Enables research to address…

• How does the effect of climate change on biosphere processes vary along regional and continental gradients?

• What is the effect of the biosphere on regional climate?

• How will land use change affect the dispersion of invasive species through a region and across the continent?

• How do large scale physical processes produce regional to continental ecological responses?

Page 18:

Cyberinfrastructure

[Diagram labels: In situ Sensors; Airborne Remote Sensing; Satellite Remote Sensing; Biological Monitoring and Measurements; Future Sources; Data Acquisition; Data Process Management; Data Management (Raw, Data Products, Archive); Data Services; Portals; Operational and Support Systems (OSS); Research; Education; Decision Support; Distributed, Centralized, Distributed]

Page 19:

Land Use Analysis Package

The NEON Land Use Analysis Package (LUAP) provides information:

• On land use, land cover and land management -- drivers of change

• Across multiple spatial scales (local to the continental) for the entire NEON realm

• Across multiple temporal scales (days to decades to centuries) to help understand legacy effects of prior land use on ecosystem function and performance

• For use by ecological modelers and forecasters to extend models to a continental scale

(ISEP, NOD)

Goal to “… collate existing data … on past and current land use practices as well as economic and social data that are useful for prediction of future land use processes”

Page 20:

Land Use Analysis Package

NEON must scale from site to region to continent

• Remote sensing, aircraft borne, satellite. Spectral and LiDAR data converted into 3D biogeochemical fingerprints of earth surface including vegetation and human structures.

• GIS critical to convert sensor data to spatial data

• USGS will provide satellite data from MODIS, Landsat, etc.

• NEON will ingest other spatial data from and convert them into spatial data using GIS

Page 21:

New Approaches with CF21

Page 22:

Emerging CF21 Concepts

CF21 HPC program

• Sustainability, hubs of innovation + experimental

• Looking to develop new program in FY10

CF21 Software Institutes and Innovators

• Transform innovation into sustainable software

• Significant multiscale, long-term program

• Connected institutes, teams, investigators

• Integrated into CF21 framework w/Directorates

Hierarchical structures that link innovation and sustainability, integrate with national and campus activities

Page 23:

Concept for NSF-wide Fellowships for Transformative Computational Science

Goal: People! Build innovative researchers in computational science by supporting outstanding postdocs

• Emphasize central role of cyberscience in all sciences (physical, biological, geological, mathematical, social, behavioral, economic, computer, information and data)

• Support cyberscience research and education: CI-based, cross disciplinary boundaries

• Use CI to make revolutionary advances in their disciplines

• Research and develop CI that enables innovative computational practices

Page 24:

Summary

Science is being revolutionized through CI

• Compute, data, networking advance suddenly 9-12 orders of magnitude after 4 centuries

• All forms of CI, including GIS, needed for science

NSF responsive: developing much more comprehensive, integrated CF21 initiative

• All units involved; OCI, CISE play important roles

• Community deeply engaged in planning

• Activities ramp up in FY11-12 and beyond

People, sustainability, innovation, integration

• Longer term programs, better linked, hubs of innovation

• Support computational scientists who develop and/or use advanced CI