A user-centric vision for future eInfrastructure and services in Norway
Hans A. Eide, PhD
Group leader, Research Computing Services
USIT, University of Oslo
eSOP seminar on eInfrastructure Use Roadmap, March 11, 2011
University of Oslo and IT, research, HPC
• Two-tier IT organization:
– Local: (at institutes / faculties)
– Central: University Center for Information Technology (USIT)
• USIT
– 240+ FTE and growing
– Covers all aspects of University IT activities
• Section for Education and Research Support (SUF)
– Provides resources, tools, support, competence for the primary production (education and research), 40 FTE
• Research Computing Services (VD – the HPC group)
– Research support, competence, operations
Research Computing Services group
• 14 people, 9 with research background (Ph.D.)
– “buffer” between advanced resources and researchers
– Advanced user support (e.g. parallelization, grid enabling)
– Computation, storage, visualization, emerging tech.
– Not limited to “hard sciences” or HPC
• Multi-source funding
– RCN (Notur, NorStore, Norgrid, projects)
– Research projects (life sci., astro, physics, etc.)
– UiO
• Training, support, operations, help-desk
Tomorrow’s eInfrastructure and services
• Must support all fields of research, be accessible
• Help maximize science production, to the benefit of society (social, economic, ..), while
• minimizing TCO (i.e. be effective)
• Environmentally friendly
• Quickly adapt to technology changes and new demands to give competitive edge
• Maintained at a sufficient and stable level relative to use/need
eInfrastructure really should mean the whole package, but is usually divided into two aspects:
• eInfrastructure
– Hardware, i.e. computing resources, storage, network, …
• Services
– Software
– Brainware (support services)
eInfrastructure
The eInfrastructure pyramid (anno 2011)
[Figure. Labels: PRACE, Multi-Petaflop, Petaflop, Nordic?, Sub-Petaflop; Development, Competence, Services, Support, Training, Portals, Tools, Databases, Data sources; Green(?), Greener, Greenest; Capacity, Capability; WLCG, NGIs, clouds]
Today’s situation (simplified) for computing and storage
Basic infrastructure (network)
UiT, NTNU, UiB, UiO
End of 2010: 300 kW (maxed out)
From 2011: 900 kW (sufficient to 2013+)
Limited space (and cooling)
Infinite power, space, and cooling
Alternative 1: go alone (x MW in 2015)
[Figure: UiO + Green datacenter]
Alternative 2: together (y MW in 2015+)
[Figure: UiT, NTNU, UiB, UiO + Green datacenters]
Alternative 3 (2020!)
[Figure: UiO, U of X, and U of Y, with green datacenters labelled “Life science”, “Climate”, “Particle physics”, and “Language technology”]
Services
Ideal eInfrastructure services:
• National core services together with local services
– Fully financed, permanent positions
– Close to local resources, users
– Pool of competence (advanced user support)
– Training, courses, outreach, marketing
– Technology watch, early adopters
– Partake in Nordic/EU/world-wide programs
– Members who are experienced with ICT in the research process (have background as researchers)
The four waves of extraordinary growth in use of ICT
[Figure: timeline with years 1820, 1946, 1968, 1991, 2010. Labels: Mechanical calculator; Towards the computer; A tool for many; A tool for “all”; Data systems everywhere; Research and development; Mainframe computers; PC (affordable); Internet applications; Advanced services and infrastructures. Axis: Number of users (inverse of skills needed by users)]
The evolution of the HPC computing pyramid (William Gropp, UIUC)
[Figure, 1993 to 2029. Labels: High Performance Workstations; Mid-Range Parallel Processors and Networked Workstations; Center Supercomputers; Tera Flop Class; Laptops, phones, wristwatches, eye glasses…; Single Cabinet Petascale Systems (or attack of the killer GPU successors); Center Exascale Supercomputers. Source: www.zettaflops.org]
Users needed to be “inside the box”; users “outside the box”
Tomorrow’s today’s (average) user
• Knows little (nothing) about HPC (and has no interest in it either)
• Most can’t program (at least not well)
• Doesn’t want to spend time learning something if it can be avoided
• Just wants results and to move on
• Doesn’t know what is available
• ..but expects to get services, resources, and support for free
SUIT 2010 – UiO user survey
SUIT 2010 – Research support
12) Do you use, or are you aware of, the following services from USIT?
Use / Have used / Aware of / Not aware of / Not applicable
Challenges
• Even HPC for dummies is too advanced (and why should users bother?)
• Knowledge about basic methodology seems to be declining in all fields, among students and researchers alike (e.g. statistics, mathematics)
• Hard to reach the “customers” with passive marketing (i.e. web-pages)
• Late adopters of new technologies/capabilities (“don’t ask me what I need, you should tell me what I need”)
• Serial jobs (not necessarily embarrassingly parallel)
(Some) solutions
• Make it simple to use, e.g. computing portals (can mitigate the problem of serial jobs by, for instance, using GPUs without the user even knowing!)
• Emphasis on using ICT methods and eInfrastructure in the education program – part of the curriculum!
• Tailored courses and training for user groups
• Forward-leaning marketing of services (e.g. approach and ask “why are you not using our xyz service in your research?”)
• Advanced support (enter early in the problem formulation/design process), competence
Example: Bioportal
• 2659 registered users, 700+ active
• 40+ applications (MrBayes, RAxML, BLAST, PAUP, structure, R, BEAST, and PhyML, …)
• Bio (life science), chemistry, statistics
• Tailored 454 sequencing work-flow
• Uses nearly 3 million CPU hours in 6 months
• Pre-compiled binaries allow advanced optimizations, e.g. use of GPUs and MPI, transparently to the users
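To put the Bioportal usage figure in perspective, here is a rough back-of-the-envelope sketch that converts CPU hours over a period into the average number of cores kept busy around the clock. The function name and the average month length are assumptions made for illustration; the 3 million CPU hours is the slide's "nearly 3 mill" treated as an upper bound.

```python
# Rough estimate: how many cores, busy around the clock, would consume
# the CPU hours quoted for the Bioportal?

HOURS_PER_DAY = 24
DAYS_PER_MONTH = 365 / 12  # assumption: average month length


def average_busy_cores(cpu_hours: float, months: float) -> float:
    """Return the mean number of cores kept busy over the period."""
    wall_clock_hours = months * DAYS_PER_MONTH * HOURS_PER_DAY
    return cpu_hours / wall_clock_hours


if __name__ == "__main__":
    # 3 million CPU hours is an upper bound taken from the slide.
    cores = average_busy_cores(cpu_hours=3_000_000, months=6)
    print(f"About {cores:.0f} cores busy on average")
```

Under these assumptions the portal load corresponds to somewhat under 700 cores running continuously for half a year.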
ICT services for hum-soc
• Qualitative methods
– Used extensively in humanities and social sciences
– Rich media (audio, video)
– Typical applications: NVIVO, HyperResearch, Transana
• Quantitative methods
– Statistics
– Potentially huge datasets
– Sometimes sensitive data
– Typical applications: STATA, SPSS, R
• Storage services (data intensive)
• Big need for training
eInfrastructure and services for sensitive data
• Sensitive data enters in many fields
– Life Science
– Medicine
– Psychology
– Social studies
– Pedagogic studies
• Lack of eInfrastructure and services for sensitive research data impairs ability to perform research
[Figure: Sensitive research data. Labels: DNA-sequencing, Questionnaires, Video/audio, MRI, Patient/clinical, Genetics, Industrial research]
eInfrastructure and services in the future
– This is the missing slide about clouds and virtualization
Thanks for your attention!
Questions