Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)
Cyberinfrastructure Framework for 21st Century Science & Engineering (CIF21)
NSF-wide Cyberinfrastructure Vision
People, Sustainability, Innovation, Integration
Alan Blatecky, Acting Director, OCI
Framing the Challenge: Science and Society Transformed by Data
Modern science
  Data- and compute-intensive
  Integrative, multiscale
Multi-disciplinary Collaborations for Complexity
  Individuals, groups, teams, communities
Sea of Data
  Age of Observation
  Distributed, central repositories, sensor-driven, diverse, etc.
ACCI Task Force Reports
Task forces: Grand Challenges; Campus Bridging; Data and Viz; Cyberlearning; HPC (High Performance Computing); Software
Final recommendations presented to the NSF Advisory Committee on Cyberinfrastructure, Dec 2010
More than 25 workshops and Birds of a Feather sessions, with more than 1300 people involved
Final reports online
Recommendation of NSF Advisory Committee on Cyberinfrastructure (ACCI)
"The National Science Foundation should create a program in Computational and Data-Enabled Science and Engineering (CDS&E), based in and coordinated by the NSF Office of Cyberinfrastructure. The new program should be collaborative with relevant disciplinary programs in other NSF directorates and offices."
Grand Challenges Task Force Recommendations (Oden)
Permanent, integrative activities in CDS&E are critically needed at NSF to address current and emerging Grand Challenge Problems
An interagency group in CDS&E should be established to address national goals and priorities and to ensure coordination of efforts
Support of diverse HPC activities (hardware, methods, algorithms) should remain a high priority. University researchers need open access to these resources at all levels
The development of robust, reliable and usable software at all levels needs to be supported by NSF and recognized as an important component of the research portfolio of NSF
Support CI for data and visualization
Learn how to create grand challenge communities and VOs (and do it!)
Campus Bridging Recommendations (Stewart)
NSF should
  Study successful campus CI implementations to document and disseminate the best practices for strategies, governance, financial models and deployment
  Establish a blueprint and roadmap for national CI, including
    • Standard authentication (InCommon)
    • MRI awards at campus level
    • National data infrastructure, including national networking backbone
Campuses should
  Develop a cyberinfrastructure master plan with the goal of identifying and planning for the changing research infrastructure needs of faculty and researchers
  Work toward a goal of providing their educators and researchers access to a seamless cyberinfrastructure which supports and accelerates research and education
Software Task Force Recommendations (Keyes)
Develop a multi-level (individual, team, institute), long-term program to support scientific software
Promote verification, validation, sustainability, and reproducibility through software developed with federal support
Develop a uniform policy on open source that promotes scientific discovery and encourages innovation
Support software through collaborations among all of its divisions, related federal agencies, and private industry
Utilize its Advisory Committees (including Directorate level) to obtain community input on software priorities
Data Task Force Recommendations (Baker, Hey)
Infrastructure: NSF should recognize data infrastructure and services (including visualization) as essential research assets fundamental to today’s science and as long-term investments in national prosperity
Culture Change: NSF should reinforce expectations for data sharing; support the establishment of new citation models in which data and software tool providers and developers are credited with their contributions
Economic sustainability: NSF should develop and publish realistic cost models to underpin institutional/national business plans for research repositories/data services
Data Management Guidelines: NSF should identify and share best practices for the critical areas of data management
Ethics and IP: NSF should train researchers in privacy-preserving data access
HPC Task Force Recommendations (Zacharia)
Develop a sustainable model to provide the academic research community with access to a rich mix of HPC systems: 20-100 PF, integrated nationally, supported at campus levels
Invest now for exascale systems by 2018-2020
Continue and grow a variety of education, outreach, and training programs to expand awareness and encourage the use of high-end modeling and simulation
Broaden outreach to improve the preparation of researchers and to engage industry, decision-makers, and new user communities in the use of HPC as a valuable tool
Provide funding for a digital data framework to address the issues of knowledge discovery, including co-location of archives and data resources with compute and visualization resources as appropriate
Cyberlearning and Workforce Development Task Force Recommendations (Ramirez)
Overall: Continuous, Collaborative, Computation Cloud (C4)
  Pervasive/ubiquitous Internet-based, interacting devices, data sources, users to dominate research, education & all areas of human endeavor
Promote cross-disciplinary, transformative research and education
  Systemic change needed at all levels of education; university structures adjusted to train next generation scientists
Invest in efforts to understand learning and research mechanisms and organizations in the new world of CI
  Exploit and transform CI-enabled STEM research advancements, tools, and resources for cyberlearning and workforce development purposes
Focus on lifelong learning and professional development
Strengthen leadership, fund research in broadening participation: elimination of underrepresentation of women, persons with disabilities, and minorities
Cyberinfrastructure Ecosystem (CIF21)
Discovery, Collaboration, Education
Maintainability, sustainability, and extensibility
Organizations: universities, schools; government labs, agencies; research and medical centers; libraries, museums; virtual organizations; communities
Expertise: research and scholarship; education; learning and workforce development; interoperability and operations; cyberscience
Networking: campus, national, international networks; research and experimental networks; end-to-end throughput; cybersecurity
Computational Resources: supercomputers; clouds, grids, clusters; visualization; compute services; data centers
Data: databases, data repositories; collections and libraries; data access; storage, navigation management, mining tools, curation, privacy
Scientific Instruments: large facilities, MREFCs, telescopes; colliders, shake tables; sensor arrays (ocean, environment, weather, buildings, climate, etc.)
Software: applications, middleware; software development and support; cybersecurity (access, authorization, authentication)
CIF21 – a metaphor
A goal of Virtual Proximity: “you are one with your resources”
Continue to collapse the barrier of distance and remove geographic location as an issue
ALL resources (including people) are virtually present, accessible and secure
End-to-end integrated resources
Science, discovery, innovation, education are the metrics
An organizing fabric and foundation for science, engineering and education
Broad Principles to Lead CIF21
Builds national infrastructure for S&E
Leverages common methods, approaches, and applications, with a focus on interoperability
Catalyzes other CI investments across NSF
Provides focus and is a vehicle for coordinating efforts and programs
Is a “force multiplier” across NSF
Shared governance; embedded into every directorate and office
Managed as a coherent program
Four Thrust Areas
Data-Enabled Science
New Computational Resources
Community Research Networks
Access and Connections to CI Resources
Education: integral and embedded
Data-Enabled Science: Thrust Area 1
Data Services Program (data)
  Provide reliable digital preservation, access, integration, and analysis capabilities for science and/or engineering data over a decades-long timeline
Data Analysis and Tools Program (information)
  Data mining, manipulation, modeling, visualization, decision-making systems
Data-intensive Science Program (knowledge)
  Intensive disciplinary efforts, multi-disciplinary discovery and innovation
Dumped On by Data: Scientists Say a Deluge Is Drowning Research
Data Challenges
[Figure: volume of data by project for 2012, 2016 and 2020, on a scale from gigabytes to exabytes. Projects shown: Genomics, LHC, Blue Waters, Square Kilometer Array, Climate/Environment, LSST.]
Volume of data
Growth
Distribution of data
Interoperability of data
Public Data Being Created
[Figure: volume of data by project for 2012, 2016 and 2020, on a scale from terabytes to zettabytes. Projects shown: Genomics, LHC, Blue Waters, Square Kilometer Array, Climate/Environment, LSST. Source labels: IDC, Sciencexpress.]
New Computational Infrastructure: Thrust Area 2
Computational and data-enabled resources
  HPC, Clouds, Clusters, Data Centers
Long-term software for science and engineering
  Sustained software development and support
Discipline-specific activities
  Services, tools, compute environments that serve specific research efforts and communities
Creating Scalable Software Development Environments
Create a software ecosystem that scales from individual or small groups of software innovators to large hubs of software excellence
Focus on innovation
Focus on sustainability
Community Research Networks: Thrust Area 3
New multidisciplinary research communities
  Address challenges beyond individuals and disciplinary research communities
  Support and optimize collaboration across small, mid-level and large community networks
  Support SEES and new research communities
Advanced research on community and social networks
  Structures, leadership, fostering and sustainability
  “Virtuous cycle” providing feedback through formal evaluation and program iteration
Access and Connectivity: Thrust Area 4
Network connections and engineering program
  Real-time access to facilities and instruments; begins to tie in MREFC activities
  Integration and end-to-end performance to provide seamless access from researcher to resource
Cybersecurity: from innovation to practice
  Deployment of identity management systems
  Development of cybersecurity prototypes
CIF21 Strategic Plan
Development of a detailed CIF21 Roadmap for FY12 and beyond; updated as needed
Developing a plan and guide for CI investments across NSF
Established internal NSF working group
Exploring and developing data policies on open access, publications, citation, etc.
Multidirectorate/office “collective” programs designed to build critical infrastructure and capabilities
CIF21 Strategic Plan (cont'd)
Outcomes and metrics being identified for each Thrust Area
Spiral development model adopted for all components
  3-5 year overlapping spirals
  Iteration and creation of new versions and capabilities with each spiral
  Ever-increasing improvements
NSF Advisory Committee on Cyberinfrastructure to review CIF21 progress; individual directorate ACs as well
Transient & Data-intensive Astronomy
New era: seeing events as they occur
(Almost) here now
  ALMA, EVLA in radio
  IceCube neutrinos
On horizon
  24-42m optical?
  LIGO south?
  LSST = SDSS (40TB) every night!
  SKA = exabytes
Simulations integrate all physics
Earth–Human Knowledge Management System (Earth-Cubed)
for Discovery, Observing, Simulating, Sharing, Collaborating, Training, Learning, Informing, Broadening Participation
Impacts on NSF
CI as enabling infrastructure for S&E
  Sustainability and viability essential
  NSF has a unique role in this strategy
New role for data
  Data-enabled and data-intensive science
  Management of data: compute, storage, use, access, etc.
Multi-disciplinary approaches essential
Education: embedded and integral
  Methodologies (compute and data intensive)
  Culture/society (interdisciplinary approaches, collaborative)
More coordinated post-award management
  Ensure value across the entire research enterprise