Meeting with Fagrådet for e-infrastruktur (the e-infrastructure advisory council)
Customer meeting, UiO
25 April 2017
About UNINETT Sigma2
Established in December 2014 based on a decision from the 4 oldest universities and the Research Council of Norway
A long-term model of 5+5 years, with an evaluation of the company after 5 years (i.e. a minimum 10-year lifetime for the company)
Part of the UNINETT corporation, separate company
Collaboration agreement with the 4 oldest universities incl. 50 MNOK yearly funding
Contract with the Norwegian Research Council incl. 25 MNOK yearly funding
Granted infrastructure funding (75.7 MNOK investment 2016-2017) from the Norwegian Research Council
Operation and support contract with the 4 oldest universities
Frame agreement with the universities for project work
The Metacenter
National coordination and shared, consolidated resources have cost and efficiency advantages but create a “distance” to the end-users (researchers)
This is avoided by keeping the support staff and competence near where the research is going on, at the universities
Combined with a data-centric architecture for the e-infrastructure, this model combines the advantages of the centralized model and the local model
[Diagram: Sigma2 Metacenter, connecting Researchers, the RFK (RAC), AUS, and the IT departments at NTNU, UiB, UiO and UiT]
Item 4
Strategy for Sigma2
Input to the strategy work
Background for the strategy work
The Sigma2 board's own process
The working group for ICT strategy and integrated solutions
Organisation project in KD (the Ministry of Education and Research)
Guidance from the board so far
The current high-level objectives remain in place
High-level objectives
From the contract with the Research Council
Procure, operate and develop a critical national infrastructure
Promote e-infrastructure to new research communities
Lead and coordinate participation in international cooperation for e-infrastructure
Provide an attractive and sustainable e-infrastructure for all research communities, with the following characteristics:
• High reliability and availability
• Cost effectiveness
• Predictable access
• Interoperability within the national e-infrastructure (compute and storage) and between national and international infrastructures (e.g. PRACE, EUDAT)
Provide services for data analytics of large datasets (Big Data)
About the project
Phase 1: Development and current situation (December-February)
Activities:
1. Analyse the development since establishment: what has succeeded and what has not?
2. Carry out a high-level self-assessment in line with the proposed evaluation criteria.
3. Carry out an analysis of the current situation / SWOT
4. Review with the metacenter leaders
Deliverable:
• Short report from the analysis of development and current situation
Phase 2: Establish strategic main directions (January-?)
Activities:
1. Synthesise the findings from phase 1: what do the self-assessment and SWOT mean for possible lines of development for Sigma2?
2. Input from BOTT; define 2-4 strategic main directions
3. Analyse these along relevant axes, and assess how the most important stakeholders (employees, BOTT, the Research Council) will view the directions in question.
Deliverable:
• Strategic analysis
Phase 3: Decide the strategic main direction and define an action plan (?-?)
Activities:
1. Prepare the decision basis for the strategic main direction
2. Hold a board gathering
3. Based on the chosen strategy, prepare high-level action plans and a timeline.
Deliverables:
• Strategy document
• High-level action plan
The evaluation criteria
Finances
Organisation
Governance and management
Contact with stakeholders
Investments and procurement of services
Availability of infrastructure and data
The Resource Allocation Committee
User satisfaction
Support for research infrastructures
Relevance of the services for education
Operations
International activities
The actors… who provides what
International level
National level
University/institutional level
Departments / Faculties
Institute or research group
Service delivery: who provides what?
[Figure: services delivered at four levels: International, National, Institution, Dept./Project]
Competence strategy
Metacenter competence and skills
Basic foundation found in operation model
Item 5:
User survey
Critical infrastructure for research
… still critical
Excellent user support
… but documentation…
Satisfied users!
Very good response time for extra resources
User survey
Satisfied users!
Critical Infrastructure for research
Great user support, documentation not so good…
Very competent staff
Item 7
Status of FRAM, NIRD and B1
Status and plans HPC
User support
• User support is in transition from a 4-site support model to a common contact point at
Current hardware
• Currently the load is serviced by Abel, Stallo, Hexagon and Vilje
• The plan is to replace Hexagon and Vilje by 1 July with the new Fram machine
• From 1 July load will be serviced by Abel, Stallo and Fram
• Abel and Stallo will be in production until they are replaced by a new system at NTNU at the end of 2018
• Currently the only accelerators are the GPUs and Xeon Phis on Abel
• Most likely accelerators will be made available through the NIRD service platform
• Load not suited for HPC is subject to transfer from Fram to NIRD
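The hardware transition above can be sketched as a small lookup. This is only an illustration: the year 2017 for the 1 July cutover is an assumption inferred from the meeting date, the function name `systems_in_service` is hypothetical, and the sketch does not model the later replacement of Abel and Stallo.

```python
from datetime import date

# Assumption: "1 July" on the slide refers to 1 July 2017 (the meeting was held in April 2017).
FRAM_CUTOVER = date(2017, 7, 1)


def systems_in_service(day: date) -> list:
    """Return the HPC systems planned to service the load on a given day.

    Hypothetical helper; it only covers the period before Abel and Stallo
    are themselves replaced (planned for the end of 2018).
    """
    if day < FRAM_CUTOVER:
        # Before the cutover the load is serviced by the four existing machines.
        return ["Abel", "Stallo", "Hexagon", "Vilje"]
    # From the cutover, Fram replaces Hexagon and Vilje.
    return ["Abel", "Stallo", "Fram"]


if __name__ == "__main__":
    print(systems_in_service(date(2017, 4, 25)))
    print(systems_in_service(date(2017, 7, 1)))
```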
Status and plans HPC (cont’d)
HPC storage
• Except for fast scratch storage, home and project storage will be served through NIRD
Queueing systems
• Service is converging on Slurm on all platforms (Abel, Stallo and Fram)
• Will (re)introduce priority and non-priority (pri/non-pri) scheduling
• Will provide an “optimist” queue for jobs with internal checkpoint-restart (C-R)
• Support for commercial projects
• Service will first be made available on Fram and later also on Abel and Stallo
Status and plans HPC (cont’d)
Software platform
• A common software platform based on EasyBuild and Lmod will be provided on Fram
• User EasyBuild module builds will also be provided
• The platform is developed on Fram and will eventually be introduced on Abel and Stallo
Historic and future compute demand
[Chart: core hours (axis 0 to 450,000,000) per Notur compute period. Series: RFK allocatable core hours, granted core hours, RFK core hours used, linear usage extrapolation]
HPC resources
Status and plans: project storage and archive
Data-centric architecture
Current and planned research data services
data transfer: can be done via the command line (ssh/sftp) and using sync-n-share (Nextcloud). Labs generating large data volumes may wish to sync data automatically.
support: basic and advanced user support, DMP generator and training
data analysis: science documentation, visualisation and analysis using Jupyter Notebook, remote visualisation (individual software packages), data analytics (e.g. Spark), portals framework and on-demand containerised AppStore
publishing data: minimal metadata collection, data archive deposit, DOI and dataset publishing, search and retrieve published datasets (data reuse), long-term data curation
Services
[Figure: services arranged by service maturity. Labels: Archive; ssh/gridftp; cmd; web; HPC compute; <portals>; sigma-dmp (pilot); <on-demand compute services>; nextCloud; data analytics (pilot); long-term data access; data analysis; e-science support; data transfer; notebook(s); direct compute; training; advanced user support; basic support; Visualisation (pilot)]
Data-intensive science disciplines
Climate (IPCC production, ESGC data node, HPC intensive data): large datasets, avoid moving data, scalability, data longevity and integrity
Neuroscience (HumanBrain, Kavli Inst., INCF): sensitive data, raw sensor data, data mgmt tool
ELIXIR.NO (next generation sequencing, analysis/processing, sharing/archiving, data product delivery): portals, AAI, workflow mgmt, access to tools
CLARINO (structured data, corpus): AAI, data access, DOIs, centralising HPC+data
Biodiversity (GBIF, LifeWatch): portals, access/sharing, metadata, own PIDs, biobanks
Marine environment (sensor collection, basic service needs) …
EPOS (implementation phase, sensor collection) …
A future common architecture?
Item 9
New SFFs (Centres of Excellence, such as CTCC) are asking how and when the payment regime will be introduced, and how they will be affected by budget cuts at NFR (the Research Council of Norway)
Contribution model
Background
The Research Council requirement for financial user contribution
Research infrastructures supported by the National Financing Initiative for Research Infrastructure should as the main principle include user contribution as an element in the funding of the operational cost.
User contribution should also contribute to the development and delivery of services which are requested by users.
The introduction of the model for user contribution should not be a hindrance for projects to get free allocations based on scientific merit.
RCN intentions:
Increase the total funding for e-infrastructure by increasing the number of funding sources
Currently, only the Ministry of Education and Research contributes
All programmes in RCN should contribute to operational cost
The basic funding from RCN will not be reduced
Funding
[Chart: contributors (MNOK, axis 0 to 160) under former funding (SIGMA), new funding (SIGMA 2) and future funding (?). Series: RCN; Universities (UiO, UiB, NTNU, UiT); national infrastructure funding; users. Value labels as extracted: 17, 25, 25; 29, 50, 50; 25, 37.5, ?. Annotation: long-term funding]
Contribution model: General principles
All projects get X TB storage for free on project area. Archiving is free
Compute resources are free with 3 exceptions:
• A) Commercial research and industry
• B) Large projects with RCN funding. Suggested definition of «large» is RCN funding above 10 MNOK
• C) Non-commercial Projects needing Dedicated Resources
This model is planned to be introduced during 2017
Existing research projects will get a reasonable time to adapt to these new rules and make provisions for this in their future applications for funding (i.e. only projects with new funding from RCN after 2017, where RCN has required budgeting for e-infrastructure resources, are affected. So far this applies only to SFF and INFRA applications)
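The three exceptions above can be sketched as a small rule check. This is an illustration under stated assumptions, not the adopted model: the `Project` fields and function names are hypothetical, the 10 MNOK threshold is the suggested definition of «large», and the transition period for existing projects is not modelled.

```python
from dataclasses import dataclass
from typing import List

# Suggested definition of a "large" project: RCN funding above 10 MNOK.
LARGE_RCN_FUNDING_MNOK = 10.0


@dataclass
class Project:
    """Hypothetical project record; field names are illustrative only."""
    name: str
    commercial: bool = False
    rcn_funding_mnok: float = 0.0
    needs_dedicated_resources: bool = False


def contribution_reasons(project: Project) -> List[str]:
    """Return which of the exceptions A, B, C apply to a project."""
    reasons = []
    if project.commercial:
        reasons.append("A")  # commercial research and industry
    if project.rcn_funding_mnok > LARGE_RCN_FUNDING_MNOK:
        reasons.append("B")  # large project with RCN funding
    if project.needs_dedicated_resources and not project.commercial:
        reasons.append("C")  # non-commercial project needing dedicated resources
    return reasons


def compute_is_free(project: Project) -> bool:
    """Compute resources are free unless at least one exception applies."""
    return not contribution_reasons(project)


if __name__ == "__main__":
    small = Project("small-academic", rcn_funding_mnok=4.0)
    large = Project("large-centre", rcn_funding_mnok=25.0)
    print(small.name, compute_is_free(small), contribution_reasons(small))
    print(large.name, compute_is_free(large), contribution_reasons(large))
```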
Funding versus resource usage
The four oldest universities fund 44.4%; RCN funds the rest
The four oldest universities use approx. 85-89% of compute resources (period 2015.2) and approx. 60-75% of storage resources (2016)
Other universities use less than 1% of compute resources
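The 44.4% figure can be reproduced with a short calculation. The breakdown below is an assumption, not something the slides state: it combines the 50 MNOK yearly university funding and the 25 MNOK yearly Research Council contract with the 37.5 MNOK label from the funding chart, read here as yearly national infrastructure funding. The usage comparison uses the 85-89% compute range quoted above.

```python
# Yearly amounts in MNOK. The first two are stated in the deck; the third is
# the 37.5 label from the funding chart and is an ASSUMED yearly figure.
universities_mnok = 50.0
rcn_contract_mnok = 25.0
national_infra_mnok = 37.5

total_mnok = universities_mnok + rcn_contract_mnok + national_infra_mnok
university_share_pct = 100.0 * universities_mnok / total_mnok

# Compute usage by the four oldest universities, as a range in percent.
usage_low_pct, usage_high_pct = 85.0, 89.0

# Ratio of usage share to funding share (values above 1 mean the
# universities use a larger share than they fund).
ratio_low = usage_low_pct / university_share_pct
ratio_high = usage_high_pct / university_share_pct

if __name__ == "__main__":
    print(f"University funding share: {university_share_pct:.1f}%")
    print(f"Usage-to-funding ratio: {ratio_low:.2f} to {ratio_high:.2f}")
```

Under these assumptions the universities use roughly twice the share of compute resources that they fund directly.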
Item 10
The Sigma2 reorganisation is meant to free up more funds for AUS: when, where and how?
Advanced User Support (AUS)
Customer meeting, UiO
25 April 2017
Hans Eide, UNINETT Sigma2
Advanced user support
The ICT strategy for research (White Paper 18) points to AUS as a prerequisite for competitive development, on a par with storage and compute resources, and further states that access to advanced user support will be a prerequisite for getting more research communities to adopt new research methods.
The target picture in the report is that the national e-infrastructure makes AUS available to projects (directed at a specific project for a shorter period) or to disciplines (for establishing discipline-specific e-infrastructure, or services and competence within new methodological directions).
The new organisation of the national e-infrastructure is intended to help strengthen the resources available for AUS and the AUS offering.
The Metacenter model
National coordination and shared, consolidated resources have cost and efficiency advantages but create a “distance” to the end-users (researchers)
This is countered by keeping the support staff and competence near where the research is going on, at the universities
Combined with a data-centric architecture for the e-infrastructure, this model combines the advantages of the centralized model and the local model
[Diagram: Sigma2 Metacenter, connecting Researchers, the RFK (RAC), AUS, and the IT departments at NTNU, UiB, UiO and UiT]
Advanced User Support (AUS)
1) Project-based AUS:
can be the sole initiative of a researcher or a science area
granted by RFK, with 2-3 PMs spent over a maximum of 6 months
2) Discipline-specific AUS:
initiated by Sigma2 in cooperation with a science discipline
can have allocations of more than 12 PMs, spent over a maximum of 2 years
joint funding
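The limits for the two AUS types can be sketched as a simple check. This is a minimal sketch under assumptions: the class and function names are hypothetical, "2-3 PMs" is read as an upper bound of 3 person-months, and discipline-specific allocations are treated as having no upper PM bound since the slide only says they can exceed 12 PMs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AusRequest:
    """Hypothetical AUS request; names and fields are illustrative only."""
    kind: str                # "project" or "discipline"
    person_months: float     # requested effort in PMs
    duration_months: int     # period over which the PMs are spent


def within_stated_limits(request: AusRequest) -> bool:
    """Check a request against the limits stated for the two AUS types."""
    if request.kind == "project":
        # Project-based: 2-3 PMs (assumed upper bound 3) over at most 6 months.
        return request.person_months <= 3 and request.duration_months <= 6
    if request.kind == "discipline":
        # Discipline-specific: may exceed 12 PMs; at most 2 years.
        return request.duration_months <= 24
    raise ValueError(f"unknown AUS kind: {request.kind!r}")


if __name__ == "__main__":
    print(within_stated_limits(AusRequest("project", 3, 6)))
    print(within_stated_limits(AusRequest("project", 5, 6)))
    print(within_stated_limits(AusRequest("discipline", 18, 24)))
```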
Advanced User Support (AUS)
For the HPC services, project based advanced user support aims at helping scientists to improve or extend the performance and capabilities of their applications. This can be in a number of ways, including:
code parallelization
code porting
code profiling, optimization, benchmarking
improving user-interfaces
software development
For the storage services, project based advanced user support aims at:
assisting researchers in creating data plans
implementing best practices for collecting and handling data
identifying or defining meta-data schema
identifying suitable storage formats
identifying dedicated or specialised tools to help access or visualize data, utilise the facilities better
Strategy for advanced user support
Advanced user support is probably a service that is particularly demanding to deliver in such a way that users experience it as satisfactory. (White Paper 18, ICT strategy for research)
To meet this challenge, it is important that we within the Metacenter prioritise maintaining and building scientific and IT-technical competence that can be used for AUS.
More efficient operations, together with contributions from the user communities, will help increase the resources available for AUS and application management.
Plans for advanced user support
Realise the gains from more efficient operations soon and convert them into increased resources for AUS. At the same time, it is important to ensure that AUS activity actually increases in parallel, in order to retain resources at the sites.
Continuous call for AUS projects. Experience from other e-infrastructures is that this will raise quality and at the same time lead to better utilisation of the available AUS resources.
Contact with the researchers is essential. Activities such as courses and outreach must be used to promote AUS. Likewise, application management will continually come into contact with researchers who have AUS needs.
Shift in tasks
[Chart: task categories Operations (HPC + NIRD + service platform), Application management (HPC + service platform), AUS and Projects, shown for UiB, NTNU, UiT, UiO and in sum]
Resource planning, PM allocation
Ongoing discussion
                     2017    2018    2019
Ops & support         252     248     159
App. mgmt            60 ?      60      60
AUS                    39      70      94
Projects              155     144     129
Budget AUS (MNOK)    3.65    6.24    8.32
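The planned shift towards AUS can be made explicit with a little arithmetic on the table. This is a sketch only: the figures are taken from the table as extracted, the 2017 application management value is marked uncertain in the source and is used here as 60, and the derived quantities (totals, AUS share, budget per PM) are illustrative and not stated in the slides.

```python
YEARS = (2017, 2018, 2019)

# Person-months (PM) per category and year, as read from the table.
# Assumption: "60 ?" for application management in 2017 is taken as 60.
pm_plan = {
    "Ops & support": (252, 248, 159),
    "App. mgmt": (60, 60, 60),
    "AUS": (39, 70, 94),
    "Projects": (155, 144, 129),
}

# AUS budget per year in MNOK.
aus_budget_mnok = (3.65, 6.24, 8.32)

# Total PMs per year across all categories.
totals = [sum(row[i] for row in pm_plan.values()) for i in range(len(YEARS))]

# AUS share of the total PMs, in percent.
aus_share_pct = [100.0 * pm_plan["AUS"][i] / totals[i] for i in range(len(YEARS))]

# AUS budget per person-month, in thousand NOK.
aus_knok_per_pm = [1000.0 * aus_budget_mnok[i] / pm_plan["AUS"][i] for i in range(len(YEARS))]

if __name__ == "__main__":
    for i, year in enumerate(YEARS):
        print(
            f"{year}: total {totals[i]} PM, "
            f"AUS share {aus_share_pct[i]:.1f}%, "
            f"AUS budget per PM {aus_knok_per_pm[i]:.1f} kNOK"
        )
```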
END