Using e-Infrastructure for Research Summer School
Monday 13th to Thursday 16th August
The Cosener’s House, Abingdon
• Annual summer schools
• Workshops
• Networks of Champions
http://www.ngs.ac.uk/communities/summer-school-2012
Logistics
Door code: 124
Toilets: through the lobby and double doors
Fire alarm: no tests, leave and assemble in main gardens
Breaks: refreshments in lobby
Lunch and dinner: main building. Dinner at 7pm each day.
Introduction to the e-Infrastructure landscape
Dr David Wallom
Associate Director - Innovation (Oxford e-Research Centre)
Technical Director (UK NGS)
Overview
• 4th paradigm of research
• Facilities & Providers
• Services and Resource Providers
• Partner Groups and organisations
• Connecting Internationally
Jim Gray’s Science Paradigms:
• Empirical -- described phenomena
• Theoretical -- used models & generalizations
• Computational -- characterized by simulations
• Data Exploration -- allows us to unify theory, experimentation & simulation
(1944-2007)
• A definition of e-Infrastructure:
  – Computing facilities and software for the support of research.
  – EPSRC support is concentrated around eScience, Software, High Performance Computing and Cloud.
National and International Research infrastructure Projects
• CCPs (4 & b)
• STFC CLF
• Diamond Light Source
• MOTT-2
• NSCCS
• NanoCMOS
• DSR (analysing requirements)
• MIMAS
• EDINA
• NeISS
• DiRAC
• ELIXIR
• LifeWatch
• CLARIN
• GridPP
• SKA
• SDSS
EPSRC Collaborative Computational Projects
CCP Title
CCP4 Macromolecular Crystallography
CCP5 The Computer Simulation of Condensed Phases
CCP9 Computational Electronic Structure of Condensed Matter
CCP12 High Performance Computing in Engineering
CCP-ASEArch Algorithms and Software for Emerging Architectures
CCP-BioSim Biomolecular simulation at the life sciences interface
CCP-EM Electron cryo-Microscopy
CCPi Tomographic Imaging
CCPN NMR
CCP-NC NMR Crystallography
CCPQ Quantum dynamics in Atomic, Molecular and Optical Physics
• Mott-2 (May 2005-2008 and 2008-2011)
• Aims
  – Purchase, install and operate the Mott2 cluster in support of the EPSRC grant GR/S84415/01 “Simulation on the Edge”, awarded to the Minerals and Ceramics Consortium of 11 UK universities and international partners, led by Prof Steve Parker (University of Bath)
  – Wide range of applications: e.g. biomaterials, batteries, nanostructures, semiconductors, fuel cells, catalytic converters
• Future
  – Integrated into NGS for support until March 2011
  – Consortium to bid for funding
(Chart: MOTT2 Usage 2009-10)
National Service for Computational Chemistry Software (NSCCS)
• Provides access to software, specialist consultation, computing resources and software training to support UK academics working across all fields of chemistry.
• Led by Imperial College, with physical hardware based at STFC and service software support from both STFC and Imperial.
• Future service development roadmap
  – Web-based portal job submission, locally at first and then via the WMS
  – Shibboleth authentication to the portal
  – Migration of the local bespoke accounting tool to NGS UAS/APEL/RUS
  – Make resources available as an NGS partner
  – Install the NGS software stack for job submission (likely to be gLite CREAM-CE, integrated into the WMS)
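The roadmap above mentions job submission via a gLite CREAM-CE and the WMS. For orientation, a minimal JDL (Job Description Language) sketch of the form such a submission takes is shown below; the executable, sandbox files and all names in it are hypothetical:

```
# example.jdl -- minimal gLite job description (all names hypothetical)
Executable    = "run_simulation.sh";
Arguments     = "input.dat";
StdOutput     = "job.out";
StdError      = "job.err";
InputSandbox  = {"run_simulation.sh", "input.dat"};
OutputSandbox = {"job.out", "job.err"};
```

In a standard gLite setup a file like this would be submitted through the WMS with `glite-wms-job-submit -a example.jdl`, which matches the job to a suitable CE.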
(Diagram: UK social science and Digital Economy e-research landscape)
Projects and hubs: DAMES, LifeGuide, eStat, PolicyGrid, Obesity e-Lab, DReSS, OeSS, Genesis, Rural Communities, Creative Industries, Social Inclusion, Entertainment, Healthcare, Highwire, Horizon, Media, Finance, Web Science, GeoVUE, MoSeS, MiMeG, CQeSS, HUB, NeISS, NCRM (phase 2)
Legend: Current Nodes, Original Nodes, DE Hubs, DE DTCs
National eInfrastructure for Social Simulation
• JISC-funded. Meets the demand for powerful simulation tools from social scientists and public- and private-sector policymakers.
• Problems being addressed:
  – Curation, sharing and re-use of simulation outputs
  – Design and implementation of standards for sharing data and methods
  – Controlling access to information which may be private, confidential, or under copyright
  – Manipulation of complex simulation outputs across multiple service components, providing real-time access to powerful computational resources
  – Facilitating access to research resources and expertise among a distributed community of users
DiRAC
• National service for theoretical astrophysics
• Moving to a federated service
• Services provided at 14 sites via SSH
  – 200 Tflops of InfiniBand cluster services
  – 6 Tflops of shared-memory service (2 TB RAM)
  – 2 PB of networked storage
  – 840 Tflops of Blue Gene services
• Serves 29 HEIs and has 300 active users
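Because the DiRAC sites are reached over plain SSH, access amounts to little more than an entry in a user's SSH client configuration. A sketch with a placeholder hostname, username and key path (each site publishes its own login node; nothing here is a real DiRAC address):

```
# ~/.ssh/config -- hypothetical DiRAC site entry; hostname, user and
# key path are placeholders, not real service addresses.
Host dirac-site
    HostName login.dirac-site.example.ac.uk
    User your_username
    IdentityFile ~/.ssh/id_ed25519
```

After which `ssh dirac-site` opens a session on that site's login node.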
European Strategic Framework for Research Infrastructures (ESFRI)
• Biological and Medical Sciences (6)
• Energy (3)
• Environmental Sciences (7)
• Materials and Analytical Facilities (5)
• Physical and Engineering Science (9)
• Social Science and Humanities (5)
• e-Infrastructure (1)
ELIXIR: Europe’s emerging infrastructure for biological information
AIM – To build a sustainable European infrastructure for biological information, supporting life science research and its translation to medicine, the environment, the bio-industries and society.
Services:
• Management of Europe’s growing volume and variety of biological data, which are heterogeneous, complex and heavily linked
• Interaction with and support for data in other ESFRI projects in medicine, agriculture and the environment
• Biological domain expertise
• Computer tools infrastructure
• Computational infrastructure
• Training centres for users of ELIXIR
• Industry translational services
• 3 million users, growing to 10 million in 2020
• Petabytes now, growing to exabytes in 2020
1800 terrestrial Long-Term Ecological Research (LTER) sites: increasingly sensor instrumented
>200 Marine reference and focal sites, with more to come: increasingly sensor instrumented
Hundreds of millions of specimens in natural science collections: >275m now indexed, increasing at 20% p.a.
Challenge of SCALE: > 25,000 users
Plus: all kinds of small, personal, group, and departmental datasets that need to get published
International Argo
• 31 countries working together to create and sustain an array of 3,000 profiling floats to sample the temperature and salinity of the upper 2,000 m of the global deep ocean every 10 days.
Euro-Argo project
• The coordination of European efforts within global Argo activities.
Introduction
CLARIN
Jisc Conference
London
13th April 2010
The CLARIN Mission
What? Create a research infrastructure that makes language resources and technologies available to scholars of all disciplines, especially the humanities and social sciences.
How? Unite existing digital archives into a federation of connected archives with unified web access, and provide language and speech technology tools as web services operating on the language data in those archives.
Steve Rawlings, OeRC, June 2010
The SKA ICT Challenge
Green SKA
• >10 Tb/s network + “Mount ExaFlop”
• ~30,000 40-TMAC DSP engines
• ~10,000 50-Tflop many-core processors
• >10 Pflop supercomputer
• Pb/s input to an exabyte archive
• ~100 MW power budget
Ian Bird, CERN
LHC Computing
Signal/noise: 10^-13 (10^-9 offline)
Data volume
• High rate × large number of channels × 4 experiments
• → 15 petabytes of new data each year
Compute power
• Event complexity × number of events × thousands of users
• → 200k of (today’s) fastest CPUs, 45 PB of disk storage
Worldwide analysis & funding
• Computing funded locally in major regions & countries
• Efficient analysis everywhere → GRID technology
Today: >250k cores, 100 PB of disk
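The 15 PB/year figure above already implies a substantial sustained ingest rate. A back-of-envelope conversion, using only the slide's figure and assuming decimal petabytes:

```python
# Back-of-envelope: average sustained data rate implied by 15 PB of new
# LHC data per year (decimal petabytes assumed).
PB = 10**15                      # bytes per petabyte
SECONDS_PER_YEAR = 365 * 24 * 3600

new_data_per_year = 15 * PB                       # bytes/year
avg_rate = new_data_per_year / SECONDS_PER_YEAR   # bytes/second

print(f"Average ingest rate: {avg_rate / 10**6:.0f} MB/s")  # roughly 476 MB/s
```

Peak rates during data-taking are of course far higher; the point is only that the annual total alone implies hundreds of megabytes per second, sustained around the clock.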
Computational Resources
• Institutional – Advanced Computing Centres
• Regional – New EPSRC funded mid range centres
• National - HECToR
UK Government decided there was a need for regional research infrastructure to link into national facilities
UK Tier 1 and Tier 2 Systems
National HPC
Strathclyde/Glasgow
MidPlus
HPC Midlands
EPSRC Regional HPC
• Emerald
  – GPU system: 372 NVIDIA Tesla processors
  – Sustained capability of 114 TF; on installation in March 2012 it was one of the largest GPU-based systems in Europe
  – Hosted by STFC e-Science
• Iridis
  – 12,000-core Intel Westmere-based system, ~108 TF
  – Capability/highly scaling work
• SGI supercomputer cluster
  – 83 SGI servers with Intel Xeon E5-2600 processors
  – Total of 5,312 cores
EPSRC Regional HPC
• Compute
  – New capability cluster: 2,700 cores, InfiniBand, some GPU and large-memory SMP nodes
  – High-throughput cluster: 2,900 cores, to facilitate projects that need to span large parameter spaces
• Data storage and archive facilities
  – Initially ~1 PB capacity, including metadata-based search and retrieval with secure implementation of a range of user-specified levels of privacy
MidPlus
HPC Midlands
• 3,000-core Bull HPC system
• 48 Tflop
• InfiniBand interconnect
HECToR
• HECToR: High End Computing Terascale Resource
• Procured for UK scientists by the Engineering and Physical Sciences Research Council (EPSRC)
• Hardware: Cray
• Management: UoE HPCX Ltd
• Computational Science and Engineering support: NAG
• Located at The University of Edinburgh
• Until recently the UK’s largest HPC system
UK e-Infrastructure
(Diagram: LHC, ISIS TS2 and HECToR at the national level; VREs, VLEs and IEs; regional and campus grids; community grids; HEIs. Users get common access, tools, information and nationally supported services through the NGS, integrated internationally.)
NGS Mission and Goal
Mission: To enable coherent electronic access for UK researchers to all computational and data-based resources and facilities required to carry out their research, independent of resource or researcher location.
Goal:
– To enable a UK-wide, integrated, production-quality e-infrastructure
  • Enable expansion to all Higher Education Institutes and UK-based research institutions
  • Supporting cutting-edge research
– To deliver core services and support
  • Support research computing groups within universities and research organisations to help them support users
  • Highlight collaborative research in key communities
– Integrate with international infrastructures, supporting UK participation in international projects with EU and US collaborations
Vision / What we do
• Enable collaborative research
• Support HEIs and research facilities
  – Enable sharing of resources
  – Tools/services to integrate resources
• Gateways to European (& international) e-infrastructure (NGI)
• Centre of excellence
  – Helpdesk
  – Training
  – Security co-ordination
  – Outreach activities
  – Deployment expertise
  – Standards engagement
Impact
• Improve accessibility to local and national resources
• ‘Use once, use anywhere’
• Support the Share/Trade/Buy/Sell of resources
• Facilitate collaboration nationally and internationally
Institutional Membership
• Personnel
  – Appointment of an institutional Campus Champion
• Resource exchanging
  – Commit institutional research computing resources to join the NGS
  – Nomination of a Collaboration Board member
• Partner
  – Supporting access by a significant body of NGS users
• Affiliate
  – Supporting only internal users
User Community
(Chart: users by discipline) Comp. Sci., Chemistry, Biology, Physics, Engineering, Biochemistry, Bioinformatics, Informatics, Maths, Medicine, Earth Systems, Genetics, other
(Chart: users by institution) Oxford 14%, Manchester 13%, Edinburgh 11%, Leeds 10%, UCL 10%, Glasgow 6%, Southampton 6%, Westminster 5%, Imperial 4%, Cambridge 4%, Other 4%, Liverpool 3%, Reading 2%, Warwick 2%, Bristol 2%, Nottingham 2%, Sheffield 2%, Bath 2%
(Chart: journal impact factors of user publications) <1.5: 14%; 1.5-3.0: 24%; 3.01-4.5: 38%; >4.5: 24%
~400 active users
Conclusion
• Research is increasingly dominated by digital generation, interpretation and storage of information
• e-Infrastructure is the full breadth of services which must be integrated to provide a common platform for all research
• During the next 3 days we will show:
  – How some services are already connected
  – How some services are being connected
  – How you can utilise all the services optimally
THANK YOU
[email protected]