Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)




Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)

NSF-wide Cyberinfrastructure Vision: People, Sustainability, Innovation, Integration

Alan Blatecky, Acting Director, OCI

Framing the Challenge: Science and Society Transformed by Data

Modern science
• Data- and compute-intensive
• Integrative, multiscale

Multi-disciplinary Collaborations for Complexity
• Individuals, groups, teams, communities

Sea of Data
• Age of Observation
• Distributed, central repositories, sensor-driven, diverse, etc.

ACCI Task Force Reports

Task forces:
• Grand Challenges
• Campus Bridging
• Data and Viz
• Cyberlearning
• HPC (High Performance Computing)
• Software

Final recommendations presented to the NSF Advisory Committee on Cyberinfrastructure Dec 2010

More than 25 workshops and Birds of a Feather sessions and more than 1300 people involved

Final reports on-line

Recommendation of NSF Advisory Committee on Cyberinfrastructure (ACCI)

"The National Science Foundation should create a program in Computational and Data-Enabled Science and Engineering (CDS&E), based in and coordinated by the NSF Office of Cyberinfrastructure. The new program should be collaborative with relevant disciplinary programs in other NSF directorates and offices."

Grand Challenges Task Force Recommendations (Oden)

Permanent, integrative activities in CDS&E are critically needed at NSF to address current and emerging Grand Challenge Problems

An interagency group in CDS&E should be established to address national goals and priorities and to ensure coordination of efforts

Support of diverse HPC activities (hardware, methods, algorithms) should remain a high priority. University researchers need open access to these resources at all levels

The development of robust, reliable and usable software at all levels needs to be supported by NSF and recognized as an important component of the research portfolio of NSF

Support CI for data and visualization

Learn how to create grand challenge communities and VOs (and do it!)

Campus Bridging Recommendations (Stewart)

NSF should
• Study successful campus CI implementations to document and disseminate the best practices for strategies, governance, financial models and deployment
• Establish a blueprint and roadmap for national CI, including
  • Standard Authentication (InCommon)
  • MRI awards at campus level
  • National Data infrastructure, including national networking backbone

Campuses should
• Develop a Cyberinfrastructure master plan with the goal of identifying and planning for the changing research infrastructure needs of faculty and researchers
• Work toward a goal of providing their educators and researchers access to a seamless Cyberinfrastructure which supports and accelerates research and education

Software Task Force Recommendations (Keyes)

Develop a multi-level (individual, team, institute), long-term program to support scientific software

Promote verification, validation, sustainability, and reproducibility through software developed with federal support

Develop a uniform policy on open source that promotes scientific discovery and encourages innovation

Support software through collaborations among all of its divisions, related federal agencies, and private industry

Utilize its Advisory Committees (including Directorate level) to obtain community input on software priorities

Data Task Force Recommendations (Baker, Hey)

Infrastructure: NSF should recognize data infrastructure and services (including visualization) as essential research assets fundamental to today’s science and as long-term investments in national prosperity

Culture Change: NSF should reinforce expectations for data sharing; support the establishment of new citation models in which data and software tool providers and developers are credited with their contributions

Economic sustainability: NSF should develop and publish realistic cost models to underpin institutional/national business plans for research repositories/data services

Data Management Guidelines: NSF should identify and share best-practices for the critical areas of data management

Ethics and IP: NSF should train researchers in privacy-preserving data access


HPC Task Force Recommendations (Zacharia)

Develop a sustainable model to provide the academic research community with access to a rich mix of HPC systems: 20-100 PF, integrated nationally, supported at campus levels

Invest now for exascale systems by 2018-2020

Continue and grow a variety of education, outreach, and training programs to expand awareness and encourage the use of high-end modeling and simulation

Broaden outreach to improve the preparation of researchers and to engage industry, decision-makers, and new user communities in the use of HPC as a valuable tool

Provide funding for a digital data framework to address the issues of knowledge discovery, including co-location of archives and data resources with compute and visualization resources as appropriate.

Cyberlearning and Workforce Development Task Force Recommendations (Ramirez)

Overall: Continuous, Collaborative, Computation Cloud (C4)
• Pervasive/ubiquitous Internet-based, interacting devices, data sources, users to dominate research, education & all areas of human endeavor

Promote cross-disciplinary, transformative research and education
• Systemic change needed at all levels of education; university structures adjusted to train next generation scientists

Invest in efforts to understand learning and research mechanisms and organizations in the new world of CI
• Exploit and transform CI-enabled, STEM research advancements, tools, and resources for cyberlearning and workforce development purposes

Focus on lifelong learning and professional development

Strengthen leadership, fund research in broadening participation: elimination of underrepresentation of women, persons with disabilities, and minorities

Cyberinfrastructure Ecosystem (CIF21)

Discovery, Collaboration, Education

Maintainability, sustainability, and extensibility

Organizations
• Universities, schools
• Government labs, agencies
• Research and Medical Centers
• Libraries, Museums
• Virtual Organizations
• Communities

Expertise
• Research and Scholarship
• Education
• Learning and Workforce Development
• Interoperability and operations
• Cyberscience

Networking
• Campus, national, international networks
• Research and experimental networks
• End-to-end throughput
• Cybersecurity

Computational Resources
• Supercomputers
• Clouds, Grids, Clusters
• Visualization
• Compute services
• Data Centers

Data
• Databases, Data repositories
• Collections and Libraries
• Data Access; storage, navigation management, mining tools, curation, privacy

Scientific Instruments
• Large Facilities, MREFCs, telescopes
• Colliders, shake tables
• Sensor Arrays: ocean, environment, weather, buildings, climate, etc.

Software
• Applications, middleware
• Software development and support
• Cybersecurity: access, authorization, authentication

CIF21 – a metaphor

A goal of Virtual Proximity: “you are one with your resources”
• Continue to collapse the barrier of distance and remove geographic location as an issue
• ALL resources (including people) are virtually present, accessible and secure
• End-to-end integrated resources
• Science, discovery, innovation, education are the metrics

An organizing fabric and foundation for science, engineering and education

Broad Principles to Lead CIF21

Builds national infrastructure for S&E

Leverages common methods, approaches, and applications – focus on interoperability

Catalyzes other CI investments across NSF

Provides focus and is a vehicle for coordinating efforts and programs

Is a “force multiplier” across NSF

Shared governance; embedded into every directorate and office

Managed as a coherent program

Four Thrust Areas

Discovery, Collaboration, Education


Data-Enabled Science

New Computational Resources

Community Research Networks

Access and Connections to CI Resources

Education: integral and embedded

Data-Enabled Science (Thrust Area 1)

Data Services Program (data)
• Provide reliable digital preservation, access, integration, and analysis capabilities for science and/or engineering data over a decades-long timeline

Data Analysis and Tools Program (information)
• Data mining, manipulation, modeling, visualization, decision-making systems

Data-intensive Science Program (knowledge)
• Intensive disciplinary efforts, multi-disciplinary discovery and innovation

Dumped On by Data: Scientists Say a Deluge Is Drowning Research

Data Challenges

[Figure: data volume (gigabytes to exabytes) against year (2012, 2016, 2020), with labels Genomics, LHC, Blue Waters, Square Kilometer Array, LSST, and Climate, Environment]

Volume of data

Growth

Distribution of data

Interoperability of Data

Public Data Being Created

[Figure: data volume (terabytes to zettabytes) against year (2012, 2016, 2020), with labels Genomics, LHC, Blue Waters, Square Kilometer Array, LSST, and Climate, Environment; source labels: IDC, Sciencexpress]

New Computational Infrastructure (Thrust Area 2)

Computational and Data-enabled resources
• HPC, Clouds, Clusters, Data Centers

Long-term software for science and engineering
• Sustained software development and support

Discipline-specific activities
• Services, tools, compute environments that serve specific research efforts and communities

Creating Scalable Software Development Environments

Create a software ecosystem that scales from individual or small groups of software innovators to large hubs of software excellence

Focus on innovation

Focus on sustainability

Community Research Networks (Thrust Area 3)

New multidisciplinary research communities
• Address challenges beyond individuals and disciplinary research communities
• Support and optimize collaboration across small, mid-level and large community networks
• Support SEES and new research communities

Advanced research on community and social networks
• Structures, leadership, fostering and sustainability
• “Virtuous cycle” providing feedback through formal evaluation and program iteration

Access and Connectivity (Thrust Area 4)

Network connections and engineering program
• Real-time access to facilities and instruments; begins to tie in MREFC activities
• Integration and end-to-end performance to provide seamless access from researcher to resource

Cybersecurity – from innovation to practice
• Deployment of identity management systems
• Development of cybersecurity prototypes

CIF21 Strategic Plan

Development of a detailed CIF21 Roadmap for FY12 and beyond; updated as needed

Developing a plan and guide for CI investments across NSF

Established internal NSF working group

Exploring and developing data policies on open access, publications, citation, etc.

Multidirectorate/office “collective” programs designed to build critical infrastructure and capabilities

CIF21 Strategic Plan (cont’d)

Outcomes and metrics being identified for each Thrust Area

Spiral development model adopted for all components
• 3-5 year overlapping spirals
• Iteration and creation of new versions and capabilities with each spiral
• Ever increasing improvements

NSF Advisory Committee on Cyberinfrastructure to review CIF21 progress; individual directorate ACs as well

Transient & Data-intensive Astronomy

New era: seeing events as they occur

(Almost) here now
• ALMA, EVLA in radio
• IceCube neutrinos

On horizon
• 24-42m optical?
• LIGO south?
• LSST = SDSS (40TB) every night!
• SKA = exabytes

Simulations integrate all physics
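To give a rough sense of scale for the "LSST = SDSS (40TB) every night" figure above, the short Python sketch below turns that nightly rate into an annual volume. Only the 40 TB per night comes from the slide. The decimal unit convention (1 PB = 1000 TB), the assumption of observing on all 365 nights, and the function names are illustrative assumptions, so the result is an upper-bound estimate rather than a project figure.

```python
import math

# Figure taken from the slide: roughly 40 TB of data per night.
TB_PER_NIGHT = 40

# Assumptions for illustration only (not from the slides):
NIGHTS_PER_YEAR = 365   # observing every night, so this is an upper bound
TB_PER_PB = 1000        # decimal units: 1 PB = 1000 TB


def annual_volume_pb(tb_per_night: float, nights: int = NIGHTS_PER_YEAR) -> float:
    """Return the yearly data volume in petabytes for a given nightly rate."""
    return tb_per_night * nights / TB_PER_PB


def nights_to_reach_pb(target_pb: float, tb_per_night: float) -> int:
    """Return the number of whole nights needed to accumulate target_pb petabytes."""
    return math.ceil(target_pb * TB_PER_PB / tb_per_night)


if __name__ == "__main__":
    print(f"Annual volume: {annual_volume_pb(TB_PER_NIGHT):.1f} PB")
    print(f"Nights to reach 1 PB: {nights_to_reach_pb(1, TB_PER_NIGHT)}")
```

Under these assumptions a single instrument passes the petabyte mark in under a month of observing, which is the kind of growth the Data-Enabled Science thrust is meant to address.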

Earth Systems and CIF21

A Complex Interconnected Earth System

Earth–Human Knowledge Management System (Earth-Cubed)

for Discovery, Observing, Simulating, Sharing, Collaborating, Training, Learning, Informing, Broadening Participation

Impacts on NSF

CI as enabling infrastructure for S&E
• Sustainability and viability essential
• NSF has a unique role in this strategy

New role for Data
• Data-enabled and data-intensive science
• Management of data: compute, storage, use, access, etc.

Multi-disciplinary approaches essential

Education: embedded and integral
• Methodologies (compute and data intensive)
• Culture/society (interdisciplinary approaches, collaborative)

More coordinated post-award management
• Ensure value across the entire research enterprise

Thank you
