xsede and national cyberinfrastructure

37
October 26, 2015 XSEDE and National Cyberinfrastructure John Towns PI and Project Director, XSEDE Executive Director, Science & Technology, NCSA Deputy CIO, Research IT, University of Illinois [email protected]

Upload: john-towns

Post on 14-Apr-2017

261 views

Category:

Technology


1 download

TRANSCRIPT

Page 1: XSEDE and National Cyberinfrastructure

October 26, 2015

XSEDE and National Cyberinfrastructure

John Towns

PI and Project Director, XSEDE

Executive Director, Science & Technology, NCSA

Deputy CIO, Research IT, University of Illinois

[email protected]

Page 2: XSEDE and National Cyberinfrastructure

License terms

• Please cite as: Towns, John. XSEDE and National Cyberinfrastructure,October 2015, [http://www.slideshare.net/jtownsil/xsede-and-national-cyberinfrastructure]

• ORCID ID: http://orcid.org/0000-0001-7961-2277• Except where otherwise noted, by inclusion of a source URL or some other

note, the contents of this presentation are © by the Board of Trustees of University of Illinois. This content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/). This license includes the following terms: You are free to share – copy and redistribute the material in any medium or format; and to adapt – remix, transform, and build upon the material for any purpose, even commercially.

• This can be done under the following conditions: attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.

2

Page 3: XSEDE and National Cyberinfrastructure

XSEDE in Context

• XSEDE is an award made under the eXtreme Digital solicitation– TeraGrid Phase III: eXtreme Digital Resources for Science and

Engineering (XD), NSF 08-571– “an infrastructure to deliver the next generation of high-end

digital services, as national resources, that will provide researchers and educators with the capability to work with extremely large amounts of digitally represented information”

• Consistent with NSF’s vision and strategy statements

• NOTE: internationalization– Cyberinfrastructure == eScience Infrastructure == eInfrastructure == Digital Research

Infrastructure

3

Page 4: XSEDE and National Cyberinfrastructure

NSF’s Strategic Planning Documents

• Investing in Science, Engineering, and Education for the Nation's Future - National Science Foundation Strategic Plan for 2014-2018

• www.nsf.gov/pubs/2014/nsf14043/nsf14043.pdf

– Vision: A Nation that creates and exploits new concepts in science and engineering and provides global leadership in research and education.

• Cyberinfrastructure Framework for 21st Century Science and Engineering

• www.nsf.gov/cif21

• NSF’s Advanced Computing Infrastructure: Vision and Strategic Plan

• www.nsf.gov/pubs/2012/nsf12051/nsf12051.pdf

4

Page 5: XSEDE and National Cyberinfrastructure

Original Motivation for XSEDE

• Scientific advancement requires a variety of resources and services – and thus availability of comprehensive cyberinfrastructure

composed of heterogeneous digital resources

• Computational science better served if we leverage aggregate expertise of a small number of leading institutions– not fully centralized at a single institution; not fully

decentralized– full centralization less agile, single point of failure– different sites each offer a unique perspective and talent

to address a particular suite of community needs– best to have several leadership perspectives for addressing

the broad range of disciplinary needs

5

Page 6: XSEDE and National Cyberinfrastructure

Convenience Requirements will Always Increase

• Each generation of users requires more convenience than the former

• We must always be adding new capabilities while maintaining and extending existing reliability

• XSEDE has learned from the past– adds value in how we address

going forward

6

Change is the only Constant

– Heraclitis 535BC-475BC

No, his mind is not for rent

To any god or government.

Always hopeful, yet discontent,

He knows changes aren't permanent,

But change is.

– Rush - Tom Sawyer

Page 7: XSEDE and National Cyberinfrastructure

XSEDE – accelerating scientific discovery

XSEDE’s Vision: a world of digitally-enabled scholars, researchers, and

engineers participating in multidisciplinary collaborations while seamlessly accessing computing resources and sharing data to tackle society’s grand challenges.

XSEDE’s Mission: to substantially enhance the productivity of a growing

community of scholars, researchers, and engineers through access to advanced digital services that support open research;

and to coordinate and add significant value to the leading cyberinfrastructure resources funded by the NSF and other agencies.

7

Page 8: XSEDE and National Cyberinfrastructure

Vision/Mission: Enable Best Science

8

Page 9: XSEDE and National Cyberinfrastructure

XSEDE Factoids: high-order bits

• 5 year, US$121M project– plus US$9M, 5 year Technology Investigation Service

• separate award from NSF

– option for additional 5 years of funding upon major review after PY3

• No funding for major hardware– coordinate, support and create a national/international

cyberinfrastructure– coordinate allocations, support, training and documentation for

>US$100M of concurrent project awards from NSF

• ~112 FTE /~240 individuals funded across 20 partner institutions– this requires solid partnering!

9

Page 10: XSEDE and National Cyberinfrastructure

Total Research Funding Supported by XSEDE

in Program Years 1-4

10

$1.68 billion in research supported by XSEDE

in PY1-PY4 (July 2011-June 2015)

Research funding only. XSEDE leverages and integrates additional infrastructure, some

funded by NSF (e.g. Track 2 systems) and some not (e.g. Internet2).

Page 11: XSEDE and National Cyberinfrastructure

What is XSEDE?

• An ecosystem of advanced digital services accelerating scientific discovery– support a growing portfolio of resources and services

• advanced computing, high-end visualization, data analysis, and other resources and services

• interoperability with other infrastructures

• A virtual organization (partnership!) providing – dynamic distributed infrastructure– support services and technical expertise to enable researchers

engineers and scholars• addressing the most important and challenging problems facing the

nation and world

• More than just a project funded by the National Science Foundation– XSEDE is a path-finding experiment in how to develop, deploy

and support e-science infrastructure

11

Page 12: XSEDE and National Cyberinfrastructure

• World-class leadership

– partnership led by NCSA, NICS, PSC, TACC and SDSC

• CI centers with deep experience

– partners who strongly complement these CI centers with expertise in science, engineering, technology and education

XSEDE’s Distinguishing Characteristics:

Governance

12

Page 13: XSEDE and National Cyberinfrastructure

Science Requires Seamlessly Integrated

“Advanced Digital Services”

• Often use the terms “resources” and “services”– these should be interpreted very broadly– most are likely not operated by XSEDE

• Examples of resources– compute engines: HPC, HTC (high throughput computing), campus,

departmental, research group, project, …– data: simulation output, input files, instrument data, repositories, public

databases, private databases, …– instruments: telescopes, beam lines, sensor nets, shake tables, microscopes, …– infrastructure: local networks, wide-area networks, …

• Examples of services– collaboration: wikis, forums, telepresence, …– data: data transport, data management, sharing, curation, provenance, …– access/use: authentication, authorization, accounting, …– coordination: meta-queuing, … – support: helpdesk, consulting, ECSS, training, …– And many more: education, outreach, community building, …

13

Page 14: XSEDE and National Cyberinfrastructure

XSEDE Offers Efficient and Effective

Integrated Access to a Variety of Resources

• Leading-edge distributed memory systems• Very large shared memory systems• High throughput systems, including Open Science

Grid (OSG)• Visualization engines• Accelerators like GPUs and Xeon PHIs• Virtualization • Cloud-based resources (coming January 2016)

Many scientific problems have components that call for use of more than one architecture.

14

Page 15: XSEDE and National Cyberinfrastructure

XSEDEnet – Using Internet2’s AL2S

Page 16: XSEDE and National Cyberinfrastructure

Centralized/Coordinated Services Provide

Value Add

• User productivity enhancements

– XSEDE User Portal, single sign-on, allocation processes

• Centralized/coordinated support services

– coordination of problem resolution, extended support disciplinary breadth and depth

• National leadership function

• Training, Education, Outreach

– national scope

16

Page 17: XSEDE and National Cyberinfrastructure

XSEDE User Portal: THE User Site

portal.xsede.org

• XSEDE User Portal (XUP) is designed to be the only site a user needs to use XSEDE

• XUP presents information relevant to users– user info is easier to find– XUP also provides dynamic data about XSEDE systems– capabilities to manage usage, files, data

• As a user you can– request an allocation, and manage allocations– sign up for training– request help– manage file and data, and much more!

– Portal provides single sign-on to all XSEDE resources

Page 18: XSEDE and National Cyberinfrastructure

Enhanced User Productivity Examples

• The XSEDE User Portal as the place for users to go to get information and support– a single location for their needs– create a single account that gives you access to all XSEDE

resources: over 22,000 accounts!

• As a user you can– request an allocation, manage allocations, sign up for training,

request help, manage files and data, and much more!

• Single sign-on– use institutional credentials to authenticate to all XSEDE

resources and services– OAuth service to allow other services to leverage XSEDE’s

infrastructure• e.g. third party authentication for a science gateway

18

Page 19: XSEDE and National Cyberinfrastructure

Enhanced User Productivity Examples

• Single allocations process– single request to gain access to all XSEDE-allocated

resources– expert help in selecting the right resource from the entire

array of nationally-available resources of XSEDE– 11 compute resources (HPC and HTC), 2 visualization

resources, 6 storage resources, VM hosting service• https://www.xsede.org/resources/overview

• Unified tool set assures usability and reliability– distributed team collaborates in support of various

enterprise services– data management tools, usage accounting and account

management • Support XDMoD usage analysis portal:

https://xdmod.ccr.buffalo.edu/

19

Page 20: XSEDE and National Cyberinfrastructure

Direct interactions with the Community

• Facilitate broad range of ground-breaking research – provided in-depth support contributing to improved user

productivity– supported over 15,000 publications to date

• Seamlessly integrate and retire resources– transition community smoothly

• Pursue new disciplinary areas – increasing the diversity of disciplines utilizing advanced

digital services

• Campus Champions continue to reach new heights– over 250 Champions at more then 200 institutions– expanding program: Regional, Student, and Domain

Champions

20

Page 21: XSEDE and National Cyberinfrastructure

Mao Ye (U. of Illinois)

Computational Finance

21

• Showed that by using odd lots and rapid trading, traders were able to mask what they were doing

• His first findings contributed to a change in NASDAQ and New York Stock Exchange rules, such that all trades are reportable and visible moment-by-moment

• Later work suggests that ever-faster trades may be destabilizing the markets

Page 22: XSEDE and National Cyberinfrastructure

XSEDE Computational User Census

22

Page 23: XSEDE and National Cyberinfrastructure

Centralized/Coordinated Support Services

• Coordinated problem resolution– field and route over 10,000 tickets annually– work with Service Providers to resolve all reported issues

• Extended Collaborative Support Services (ECSS)– single, coordinated effort to bring the right expertise to bear on

issues raised by any user on any resource(s)– no unnecessary replication across Service Providers

• disciplinary breadth of expertise, allows coverage of domains composed of diverse sub-domains

– Novel and Innovative Projects• support of emerging & innovative research

– optimization of widely used community codes• prioritizing and coordinating effort • often optimized for multiple architectures• improving code substantially is as better than buying more hardware

23

Page 24: XSEDE and National Cyberinfrastructure

Diverse ECSS Expertise Possible Because of

Scale

• Fields of expertise: astrophysics,

bioinformatics, CFD, chemistry, computer science, climate modeling, engineering, genomics, hydrology, humanities , machine learning, molecular dynamics, phylogenetics, physics, seismology, statistics.

• Technologies: clusters, large

shared memory systems, MICs, GPUs

• Languages: C, C++,Fortran, MPI,

OpenMP, Java, JavaScript, shell programming, CUDA, OpenACC, Python, R, MATLAB

• Techniques: benchmarking, cloud

computing, Condor, data mining, databases, FFTs, finite element methods, grid generation, grid middleware, Lattice Boltzmann methods, libraries, linear algebra, Monte Carlo methods, parallel debugging, parallel I/O, petascalecomputing, scheduling, science gateways, visualization, workflows

24

Page 25: XSEDE and National Cyberinfrastructure

Training, Education, Outreach• Single set of programs of national scope

– Training & Education – Underrepresented Community Engagement– Campus Champions

• Programs serve a more diverse community – single coordinated set of programs without competition– one consistent message and set of technical information makes it

easier for technology adoption to spread organically

• Better ability to cover the entire nation in outreach:– XSEDE Conference– users in all 50 states, D.C., and US territories– Campus Champions – all 50 States– XSEDE staff physically located in 18 states + DC

• Over 32,000 training registrations over PY1-PY4!– HPCU and CI-Tutor, as well as center trainings, have been used in

universities around the country to prepare students to use the nation's pre-eminent computational resources

25

Page 26: XSEDE and National Cyberinfrastructure

Data-enabled Transformation of

Science

Astronomy 1500- 2000:

• Single scientist looks through

telescope

• Record KB of data in

notebook

• Require reproducibility

Sloan Digital Sky Survey

2000+

• Record data for decade

(40TB)

• Serve to entire world

• Thousands of scientists

work “together”

• DES (now)

• 200GB/night

• PB in decade

• LSST (6 years)

• Record data for

decade

• SDSS/night!

• 200 PB/decade

How can I publish, discover, verify

data in this new world?

Page 27: XSEDE and National Cyberinfrastructure

Big Data vs The Long Tail of

Science

• Many “Big Data” projects are “special”– Highly organized, singular sources of data,

professionally curated, a lot attention paid

• What about the “Long Tail” (the other 99%)?– 1000s of biologists sequencing communities of

organisms

– Thousands of chemists and materials scientists developing a “materials genome”

– Characteristics:• Heterogeneous, perhaps hand generated

• Not curated, reused, served, etc…

27

Page 28: XSEDE and National Cyberinfrastructure

Basic Vision for Open Data and

Publication Services

• Make it possible (easy) for anyone to:– Create a data collection and get an “identifier”…

– Deposit it somewhere where it can be kept safe…

– Provide services so others can find it, analyze it, repurpose it…

– Link it to traditional (open, please!) publications…• OA aspects very important to this

• With these capabilities in place– Many important things will happen…

Page 29: XSEDE and National Cyberinfrastructure

NDS: A Builders Consortium

• NDS vision requires collaboration of many kinds of institutions– compute and data services centers

– universities and project repositories

– discipline-specific federations

– publishers

• NDS Consortium to guide the building, governance of services– coordinate separately funded efforts to build NDS

components• ensure interoperability, integrate existing tools and

resources

– NDS Consortium Steering Committee formed

Page 30: XSEDE and National Cyberinfrastructure

NDS Lab and NDS Share• NDS Lab

– Target: friendly developers

– A community support environment for developing, coordinating, deploying prototype service

– Spinning disk, storage, virtual machines for developing and hosting services

– Available to NDS community members

• NDS Share – Target: friendly scientists

– Experimental platform for sharing data• Enable anyone to create data collections, store data, get DOI

– Include installations of community data sharing applications

– Will evolve over time

• Partnership between NCSA, ANL, TACC, and SDSC– Other interested partners?

• Became available in January 2015

Page 31: XSEDE and National Cyberinfrastructure

XSEDE Invited to Submit Renewal Proposal

• Letter received from NSF inviting a renewal proposal– non‐competitive submission – will be reviewed as rigorously as a competitive

proposal

• Some parameters for the proposal submission:– $100-$120M

• first round funding totaled ~$130M

– 5 years of operations, July 2016 – June 2021– page limit of 30 pages– no option for renewal– submitted June 15, 2015

31

Page 32: XSEDE and National Cyberinfrastructure

Priorities for PY6-PY10:

Extended Collaborative Support Service

• Continue to provide excellent support to the research community via ECSS

– this effort must continue to evolve as needs and technologies evolve

– support for data analysis and visualization (including analytics)

• support for sensitive data

– support for executing applications in virtual machines and containers

32

Page 33: XSEDE and National Cyberinfrastructure

Priorities for PY6-PY10:

Community Infrastructure

• Continue to evolve the XSEDE infrastructure– must provide support for and integration of “Track

2” resources

• Expose this architecture to the broader community– facilitate integration of broad range of services

• provide discoverability

– become the “connector of services” to support the research enterprise

• We need visualization and data analysis services!!!

33

Page 34: XSEDE and National Cyberinfrastructure

Priorities for PY6-PY10:

Toward “Sustainability” • Developing services on offer to others: providing basic

cyberinfrastructure services– expose services developed and put in place to operate the XSEDE– where necessary customize/extend for needs of other projects– charge incremental costs for operating/supporting services for other

projects

• Objective is not to make money!– provides mechanism for other NSF project investments to leverage the

XSEDE investment– can lead to significant cost saving across NSF CI investments– others can leverage this too: projects, institutions, regional consortia,

• Pilot under way: NCAR with XRAS– will use XRAS to support allocation of NCAR resources– expressions of interest from some campuses– contact Amy Schuele [email protected] if interested

34

Page 35: XSEDE and National Cyberinfrastructure

5th annual conference to showcase the discoveries,innovations, challenges and achievements of those whouse, research and support advanced digital resources andservices

DIVERSITY, BIG DATA, & SCIENCE AT SCALE

https://www.xsede.org/web/xsede16

Page 36: XSEDE and National Cyberinfrastructure

Questions?

Page 37: XSEDE and National Cyberinfrastructure