Update
January 11, 2017
What is Advanced Research Computing?
Supercomputing: Innovation’s Infrastructure
Supercomputing, or advanced research computing (ARC), is essential infrastructure used around the world to accelerate scientific discovery for national competitiveness and economic success.
ARC powers dynamic inventions and innovations in almost every sector of Canadian industry, from curing disease to aerospace and transportation to manufacturing and consumer goods.
It has transformed the way the world conducts scientific and engineering research.
Agenda
● Welcome and Introduction
● National Technology Deployment Update
● Lowering the Barrier to Advanced Research Computing
● Increasing Platform Demands
● Closing remarks
Compute Canada: An Effective Provider of Essential Digital Research Infrastructure
Compute Canada (CC) leads Canada’s national advanced research computing (ARC) platform.
We provide ~80% of the academic research ARC requirements in Canada. There is no other major supplier in Canada.
CC is a not-for-profit corporation. The membership includes 37 of Canada’s major research institutions and hospitals, grouped into 4 regional organizations: WestGrid, Compute Ontario, Calcul Québec, and ACENET.
Funding is through the Canada Foundation for Innovation with matching funds from provincial and institutional partners (40% federal / 60% provinces and institutions).
Consolidation and Renewing Services
Hardware Consolidation by 2018
● 5-10 Data Centres, 5-10 systems
● National Data Cyberinfrastructure
● New Cloud, HPC, HTC
● Major migration of data and users
CANARIE and regional Networks
CC Manages All Software & Middleware Services
Service Delivery
● Common middleware across sites
● New national services (Cloud, RDM)
● New documentation site
● 200 distributed experts, national teams
● Improve platform ease-of-use
Funding for Compute Canada
Capital: CFI (Cyberinfrastructure) + match
● Stage-1 spending in progress ($30M CFI) ← We Are Here!
● Stage-2 proposal being assessed ($20M CFI)
● Stage-3 planning assumption ($50M CFI in 2018)
Operating: CFI (MSI) + match
● 2012-2017, ending March 31, $61M CFI
● 2017-2022, $70M CFI, announced January 9th ← We Are Here!
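The 40% federal / 60% provincial and institutional split stated earlier gives a rough way to size the matching funds behind each CFI figure. The sketch below assumes, hypothetically, that the 40% federal share applies exactly to every award listed; the slides state only the overall ratio, so the totals are illustrative. The function name `total_with_match` is made up for this example.

```python
def total_with_match(cfi_millions, federal_pct=40):
    """Return (total, match) in $M, assuming CFI covers `federal_pct` percent of the total."""
    total = cfi_millions * 100 / federal_pct
    return total, total - cfi_millions

# CFI amounts ($M) quoted on the slide; an exact 40% share per award is an assumption.
awards = {"Stage-1": 30, "Stage-2": 20, "Stage-3": 50, "MSI 2017-2022": 70}

for name, cfi in awards.items():
    total, match = total_with_match(cfi)
    print(f"{name}: CFI ${cfi}M, match ${match:.0f}M, total ${total:.0f}M")
```

Multiplying by 100 and dividing by the integer percentage, rather than dividing by 0.4, keeps the arithmetic exact for these round figures.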
Serving Researchers in all Disciplines
Platform For Big Science in Canada
Partnerships
● Canadian Light Source
● Women in HPC
● Software Carpentry
● Mitacs
● The Canadian Association of Research Libraries (CARL)
● European Grid Infrastructure (EGI)
● XSEDE (US)
● PIMS (Jupyter notebooks for training researchers)
● Leadership Council for Digital Infrastructure
Interacting with Canada’s Researchers
Sustainable Planning for Advanced Research Computing (SPARC)
In 2016, we conducted our second major SPARC process:
● 18 town hall meetings
● 17 white papers received (disciplinary + institutional)
● 189 survey responses
SPARC timing chosen to inform MSI proposal, Stage 2 capital plan
Ongoing consultations on CFI grants.
● Consulted with more than 100 projects in 2015 and 2016.
Several councils of researchers: Advisory Council On Research, RAC-Chairs, International Advisory Committee
Impact Affirmation: Bibliometrics
Field-Weighted Citation Impact (FWCI) of CC-enabled papers
How Are We Doing?
Compute Canada delivers infrastructure capabilities and excellent service to researchers across the country. Compute Canada is making a quantifiable impact on Canadian research excellence.
Resource and service needs are growing. We are focussed on meeting those needs to enable continued Canadian excellence today and in the future.
CFI’s Cyberinfrastructure Initiative investments are warmly welcomed by the community. CC is working hard to translate those investments into production-ready systems now!
Canada needs sustainable, predictable investment in ARC to keep pace with Canadian research investments.
Technology Deployment Update
Technology Deployment Overview
● Busy (and exciting!) period for the technology office!
● Major deployment of new resources underway:
  ○ National Data Cyberinfrastructure
  ○ New Cloud resources
  ○ New HPC resources
  ○ New Services
● Technology brief published to community in November.
● Cloud Strategy document updated in December.
Stage-1 Deployment Status
System | RFP Issued | RFP Closed | Delivered | In Production
National Data Cyberinfrastructure | December 4, 2015 | January 21, 2016 | (Ongoing delivery) | Fall 2016 & ongoing
Arbutus - UVic (Cloud) | January 28, 2016 | February 29, 2016 | - | September 8, 2016
Cedar - SFU (General Purpose) | June 13, 2016 | July 26, 2016 | In progress | April 1, 2017
Graham - Waterloo (General Purpose) | August 26, 2016 | October 6, 2016 | In progress | April 1, 2017
Niagara - UofT (Large Parallel) | c. February 2017 | - | - | Late 2017
Stage 1 Award to Implementation
Target - all four major new systems in full production less than 2 years after award finalization. Software services development continues through 2018.
(Note - Niagara schedule purposely delayed by recommendation of CFI expert panel, to benefit from technology improvements)
Expanded Cloud Services - Arbutus
● Seeing growing demand for cloud and cloud-like services:
  ○ Research data portals
  ○ User-friendly interfaces to computing
  ○ Users controlling their own environments
● Substantially more cost-effective to provide flexible research-focused cloud resource than relying on commercial providers.
● Substantial expansion of CC cloud came online in September.
Arbutus at UVic, in production on September 8, 2016
Hybrid Systems - Cedar and Graham
● Systems capable of serving multiple needs:
  ○ Traditional HPC
  ○ Big Data Analytics
  ○ Machine Learning/AI
  ○ High throughput computing
  ○ Cloud/containers
● Two machines designed for workload portability, redundancy.
● Expect Cedar to be most powerful Canadian academic supercomputer (until we buy Niagara!). More FLOPS in a single machine than current CC national aggregate.
The SFU data centre, preparing to receive Cedar
International Comparisons
● Comparisons of CC normalized to giga-FLOPS (GF) per researcher give some insight into Canada’s relative position in the world vs. time.
● Recent trends:
  ○ We used to be #6 (2009)
  ○ We are now #24 (2015)
● Comparator countries for GF/researcher in charts that follow:
  ○ US - #3 in 2015
  ○ Germany - #5 in 2015
  ○ Czech Republic - #10 in 2015
● Project forward using:
  ○ Canada - same investment scenarios shown in previous slides
  ○ Other countries - world median growth rate (64% CAGR) in GF/researcher
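Projecting forward at a compound annual growth rate (CAGR) means multiplying by (1 + rate) once per year. A minimal sketch of that arithmetic, assuming a hypothetical baseline of 1.0 GF/researcher, since the slides give rankings rather than absolute values; the function names are illustrative only.

```python
import math

WORLD_MEDIAN_CAGR = 0.64  # 64% per year, the rate quoted on the slide

def project(value, cagr, years):
    """Compound `value` forward at annual rate `cagr` for `years` years."""
    return value * (1.0 + cagr) ** years

def doubling_time(cagr):
    """Years needed for a quantity growing at `cagr` to double."""
    return math.log(2.0) / math.log(1.0 + cagr)

# Hypothetical baseline; not a figure from the slides.
baseline_gf_per_researcher = 1.0

for year in range(6):
    projected = project(baseline_gf_per_researcher, WORLD_MEDIAN_CAGR, year)
    print(f"year +{year}: {projected:.2f} GF/researcher")

print(f"doubling time: {doubling_time(WORLD_MEDIAN_CAGR):.2f} years")
```

At 64% per year a value doubles in roughly 1.4 years, which is why a country with flat investment can fall many ranking places within a few years.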
International Rankings - Log Scale
GF = Gigaflop/s
National Middleware Initiatives
CFI Cyberinfrastructure Challenge-2 --> more than just hardware!
CFI Cyberinfrastructure Challenge-1, CANARIE-funded research software platforms, and other Research Platforms and Portals are making it easier for researchers to share data, and computational tools/methods.
Platforms have many common needs. They should be met in a common way on CC resources.
Example - common authentication, authorization, ID management:
● Common login across systems
● Exploit Canadian Access Federation (CANARIE) where possible
● Projects able to manage their own authorization
● Interoperability with international collaborators
Developers hired to work on common services, development underway.
Lowering Barriers to Research
Research Data Management (RDM)
As data volumes grow, data storage is not enough - data management is needed.
Granting councils have published a statement on data management principles.
There are significant gaps in the Canadian system, which inhibit RDM. No organization “owns” the problem.
CC has partnered with CARL (Canadian Association of Research Libraries), with the support of others (RDC), to address the gaps:
● CC brings big storage, scalable data movement, software expertise
● CARL brings metadata, curation, preservation and national on-campus support network
Provide RDM services useful for the “long tail” (and any other part of the distribution).
Federated Repositories, Federated Discovery.
Early Testing with Researchers Now!
RDM and Compute for Astronomy - CANFAR
The NRC is responsible for Canadian astronomy facilities and their data. Data managed through the Canadian Astronomy Data Centre (CADC). Both CADC and CC support Canada’s University-based astronomers.
CADC and CC are building a new version of the CANFAR system which will exploit CC’s national data and cloud infrastructures. Lowers the barrier to astronomy research!
This is a 3-year contract between NRC and CC
Plotting the Course to Pluto - New Horizons
Partnership with PIMS - Jupyter
● Compute Canada has signed an MOU with the Pacific Institute for the Mathematical Sciences (PIMS, an NSERC-supported math institute) to support development of “Jupyter Hubs” for Canadian researchers.
● Jupyter allows users to do sophisticated programming through a nice web interface. Lowers the barrier to scientific programming and data analysis!
● Becoming very popular among researchers. Need to be able to scale to a large user community.
● PIMS has developed a model which scales very nicely on the Compute Canada cloud. It has been quietly rolled out to several university communities for testing.
● Current allocation could support 8000 users. Relatively low impact on CC cloud resources.
Partnership with PIMS - Jupyter
Growing Needs
Continued Growth in User Base
Resource Allocation - 2017
● 2016 was a tough year for resource allocations. Increased demand, static supply.
● A number of changes for 2017:
  ○ Allocate new systems (Arbutus, Cedar, Graham)
  ○ Extend the lives of some (lower cost) old systems (still decommission significant capacity)
  ○ Shift schedule to fiscal year
  ○ Process changes recommended through researcher consultations.
● Demand has continued to grow. 2017 will also be a tough year.
● Replacing older (fragile) systems with larger more robust systems will help.
● Massive user migration coincides with new system commissioning and 2017 RAC allocation implementation.
Continued Growth in Number of Requests
Resource Allocation - 2017
Resource | 2017 Requests | 2016 Requests | % Change
Compute - CPU-years | 256,000 | 238,000 | +7.5%
Compute - GPU-years | 2,660 | 1,357 | +96%
Storage (PBs) | 55,000 | 28,660 | +92%

Resource | 2017 Requested Fraction Available | 2016 Requested Fraction Available
Compute - CPU | 54%* | 54%
Compute - GPU | 38% | 20%
Storage | 90+% | 90+%

Correlations between availability and requests. Significant increase in available GPU led some researchers to shift from CPU to GPU.
* 54% in 2017 includes 50k+ new cores with better performance
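The year-over-year changes in the request figures above can be recomputed directly from the 2016 and 2017 values. A small sketch using only the numbers quoted on the slide; the helper name `percent_change` is illustrative.

```python
def percent_change(new, old):
    """Relative change from `old` to `new`, in percent."""
    return (new - old) / old * 100.0

# (2017 requests, 2016 requests) as quoted on the slide
requests = {
    "Compute - CPU-years": (256_000, 238_000),
    "Compute - GPU-years": (2_660, 1_357),
    "Storage": (55_000, 28_660),
}

for resource, (y2017, y2016) in requests.items():
    print(f"{resource}: {percent_change(y2017, y2016):+.1f}%")
```

The GPU and storage results round to the slide's +96% and +92%. The CPU result comes out near +7.6%, marginally above the +7.5% shown, which suggests the slide's inputs or result were rounded.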
Summary/Conclusions
● Sustainable, predictable investment in this essential digital infrastructure.
● Alignment of ARC funding with national science priorities and overall investment in research.
● Investment in robust, professional and scalable software and common advanced research computing services.