Utilising Cloud Computing for Research through Infrastructure, Software and Desktop as a Service

Dr David Wallom, Associate Director

Post on 07-Aug-2015

Overview

• Infrastructure as a Service
– The EGI Federated Cloud

• Software as a Service
– Hub

• Desktop as a Service
– EOSCloud

www.egi.eu EGI-InSPIRE RI-261323

EGI-InSPIRE

The EGI Federated Cloud

The Federated Cloud

A federation of Cloud resources from the public, academic and private sectors, offering Cloud Services to all research communities

A ‘single’ cloud system to:
• scale
• integrate multiple providers irrespective of technology
• target the research community

Standards-based federation of IaaS clouds:
• Exposing a set of independent cloud services through a common standards profile
• Allowing deployment of services across multiple providers and capacity bursting
• Building on world-class EGI core services already proven
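
The idea of treating several providers as one pool and bursting capacity between them can be sketched roughly as follows. This is an illustration only: the `Provider` class, the `place_vms` function, and the site names and capacities are hypothetical and are not part of any EGI interface.

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """Hypothetical stand-in for one resource provider in a federation."""
    name: str
    stack: str       # e.g. "OpenStack" or "OpenNebula"; ignored by placement
    free_slots: int  # number of VMs the provider can still host


def place_vms(requested: int, providers: list[Provider]) -> dict[str, int]:
    """Fill providers in order, bursting to the next when one runs out."""
    capacity = sum(p.free_slots for p in providers)
    if requested > capacity:
        raise RuntimeError(f"federation is short by {requested - capacity} VM slots")

    placement: dict[str, int] = {}
    remaining = requested
    for provider in providers:
        if remaining == 0:
            break
        take = min(remaining, provider.free_slots)
        if take > 0:
            placement[provider.name] = take
            provider.free_slots -= take
            remaining -= take
    return placement


if __name__ == "__main__":
    federation = [
        Provider("site-a", "OpenStack", free_slots=3),
        Provider("site-b", "OpenNebula", free_slots=5),
    ]
    # Five VMs do not fit on site-a alone, so two burst over to site-b.
    print(place_vms(5, federation))  # {'site-a': 3, 'site-b': 2}
```

The placement logic never looks at `stack`, which is the point of a standards-based federation: the caller does not need to know which cloud management technology sits behind each provider.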

Usage Model

• Total control over deployed applications

• Elastic resource consumption based on real needs

• Workloads processed on-demand

• Endorsed and accredited applications shared across multiple communities

• Single sign-on at multiple, independent providers

• Centralised access to service information across multiple providers

VM Operator

Resource Provider

EGI Federated Cloud

[Diagram. Platforms: EGI Core Platform, Cloud Infrastructure Platform, Collaboration Platform. Services offered to the User Community: monitoring and control of utilisation; technical consultancy and support; uniform interfaces to Cloud compute and storage; secure endorsed application and service deployment. Roles: Consumer, VM Operator, Community Management, All.]

EGI Cloud Infrastructure

[Diagram: EGI Cloud Infrastructure]

• EGI Core Platform: Federated AAI, Service Registry, Monitoring, Accounting
• EGI Cloud Infrastructure Platform: Instance Management, Information Discovery, Storage Management
• Help and Support, Security Co-ordination, Training and Outreach, Sustainable Business Models
• EGI Collaboration Tools, EGI Application DB, Image Repository, EGI Cloud Service Marketplace
• User Community: monitoring and control of utilisation; technical consultancy and support; uniform interfaces to Cloud compute and storage; secure endorsed application and service deployment
• Cloud Management Stacks (OpenStack, OpenNebula, Synnefo, …)
• Standards and tools: GSI, GLUE2, Cloudinit, CDMI, SAM, UR, OVF, OCCI

Using open standards for VM Management

rOCCI-server
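
To give a flavour of what a standards-based VM request looks like, here is a minimal sketch that renders a compute-creation request in the OCCI 1.1 text rendering. The `Category` line and the `occi.compute.*` attribute names follow the OCCI infrastructure specification; the function name `render_compute_request` and the example values are illustrative assumptions. The sketch only builds the request text and does not contact rOCCI-server or any provider. A real request would usually also name OS and resource templates as mixins, which are provider-specific and omitted here.

```python
OCCI_INFRA_SCHEME = "http://schemas.ogf.org/occi/infrastructure#"


def render_compute_request(cores: int, memory_gb: float, hostname: str) -> str:
    """Render an OCCI compute 'kind' and its attributes as text/plain lines."""
    lines = [
        # The kind identifies what is being created: a compute resource.
        f'Category: compute; scheme="{OCCI_INFRA_SCHEME}"; class="kind"',
        # Numeric attributes are unquoted, string attributes are quoted.
        f"X-OCCI-Attribute: occi.compute.cores={cores}",
        f"X-OCCI-Attribute: occi.compute.memory={memory_gb}",
        f'X-OCCI-Attribute: occi.compute.hostname="{hostname}"',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical small VM; such a body would be POSTed to a provider's
    # compute collection with Content-Type: text/plain.
    print(render_compute_request(cores=1, memory_gb=4.0, hostname="demo-vm"))
```

Because the same text is accepted by any OCCI-compliant endpoint, the request does not change when the underlying stack is OpenStack, OpenNebula or Synnefo.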

Partnership

• Resources
– 21 certified resource providers from 13 countries
– 9 resources in certification process
– Worldwide interest & integration: Australia* (NeCTAR), South Africa* (SAGrid), South Korea* (KISTI), United States* (NIST, NSF A.C. Centres)

• Technology
– 12 x OpenStack
– 7 x OpenNebula
– 1 x Synnefo
– 1 x Emotive

* Not shown on map

EGI's Appliance Catalogue

• EGI's ‘App Store’
• 30 registered Virtual Appliances
• 21 supporting sites
• 9 supported Virtual Organizations: atlas, biomed, cms, demo.fedcloud.egi.eu, drihm.eu, fedcloud.egi.eu, highthroughputseq.egi.eu, lhcb, vo.chain-project.eu

DRIHM

• Scientific Discipline: Natural Science, Earth Sciences, Hydrology
• Status: Test & Integration (drihm.eu VO)

DRIHM in the EGI FedCloud:
• Running various hydrological models in the EGI Federated Cloud
• 1 VM: 1 core, 4/8 GB of RAM
• A few GB of storage
• Windows OS
• Contextualisation for Windows OS VM image
• Licence issue

DRIHM:
• Project funded by the EC, aiming to provide an open, fully integrated workflow platform for predicting, managing and mitigating the risks related to extreme weather phenomena.

Chipster

Chipster in the EGI FedCloud:
• ‘light’ VM (datasets removed)
• Chipster VM configured through contextualisation
• shared block storage exported as NFS for tools (500 GB)
• block storage for output (500 GB)

• Scientific Discipline: Natural Science, Biological Sciences, Bioinformatics
• Status: Production

ELIXIR Pilot Action Proposal: Using virtual machines and clouds in bioinformatics training

User-friendly analysis software for high-throughput data:
• NGS
• Microarray
• Proteomics
• sequence data

Use Case Discipline Classification

Usage since launch
- >600k VMs
- >40M CPU hours

Use cases
- 12 at launch
- 60 to date
- 11 in production

Hub

[Diagram: research lifecycle. Initial Experimental Idea, Experimental Design, Data Collection, Analysis, Publication]

• Use open source components to build a comprehensive LIMS

• Support all parts of the research lifecycle

• Integrate 3rd party services to increase value and capability

Architecture

[Architecture diagram. Data sources: digital microscope, electrophysiology rig, digital pens, PC. Components: Workflows, Drupal, continuous integration, Alfresco, Google Services. Shared services: Search, Metadata, AAI.]

Thoughts on Hub

• People are increasingly transient
– Stop losing the unknown knowns

• Living data is often the forgotten component in data management

• All data will be born digital

• Data management requirements mean responsibility for storage of raw data is increasingly important

• Laboratory equipment can directly record into Hub to ensure data management from birth to death

• Connecting all of the experiment to ensure institutional knowledge capture
– neat and rough notes, raw data, analysis applications and output

EOSCloud

Bio-Linux: A scalable solution

• Comprehensive, free bioinformatics workstation based on Ubuntu Linux

• 10 years & 8 major releases

• 200+ bioinformatics packages, including big integrative tools: QIIME, Galaxy Server, PredictProtein, EMBOSS, ...

Incorporates all software

• >7000 users in >1600 locations

[Diagram labels: Dual Boot, Linux Live, Local Servers, Cloud]

Why Cloud?

• Tools such as Bio-Linux are community enablers
• Data sets can be too big or restricted to easily move
– move the compute to the data
– Researcher work patterns are maintained
• Need more efficient use of shared resources
• Central maintenance of infrastructure
• Lower barrier to entry (compared to traditional HPC and Grid)

EOSCloud

• A NERC Big Data capital project
• A tenancy in the STFC JASMIN Unmanaged Cloud
• Each registered user receives two VMs: Bio-Linux and an Ubuntu Docker hosting environment
– With total responsibility for the instantiated system
– Accessible through standard remote desktop tools
• But,
– utilising a single scale of resources would be a waste
– Can we scale the user's virtual services to take demand into account?

Boosting Resource Capabilities

• Users' VMs operate in native state ‘Standard’
– Enough capability to access stored data
– Configure applications and workflows
– Free

• A user may boost a running VM to increased capability
– Enough to run analysis applications on a useful timescale
– Credit consumption only for Boosted instances

• Reference datasets available to users through shared storage

Name        Cores   Memory (GB)   Cost (credits/hour)
Standard    1       16            0
Standard+   2       40            1
Big         8       140           4
Max         16      500           8
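
As a worked example of the credit model in the table above, the following sketch totals the credits consumed by a set of sessions. The flavour names, sizes and hourly rates are taken from the table; the `credits_used` function, the assumption that cost scales linearly with hours, and the example week are illustrative and not a description of how EOSCloud actually bills.

```python
# Flavours as listed in the table: name -> (cores, memory in GB, credits per hour).
FLAVOURS = {
    "Standard": (1, 16, 0),
    "Standard+": (2, 40, 1),
    "Big": (8, 140, 4),
    "Max": (16, 500, 8),
}


def credits_used(sessions: list[tuple[str, float]]) -> float:
    """Sum credits over (flavour name, hours) sessions, assuming linear cost."""
    total = 0.0
    for flavour, hours in sessions:
        _cores, _memory_gb, rate = FLAVOURS[flavour]
        total += rate * hours
    return total


if __name__ == "__main__":
    # A hypothetical week: mostly Standard, with two boosted analysis runs.
    week = [("Standard", 30), ("Big", 6), ("Max", 2)]
    print(credits_used(week))  # 0*30 + 4*6 + 8*2 = 40.0
```

Time spent in the Standard state contributes nothing to the total, so a user only spends credits while an instance is boosted.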

Desktop as a Service for research

• Giving researchers an environment they are confident in by changing the infrastructure around them

• Location independent persistence of research environments

• Launch for pilot user communities 31st Mar 2015
– Moving beyond pilot user communities (e.g. Ocean Sampling Day)

• Investigating other key usage models such as teaching or online learning

Conclusions

• Cloud is (obviously) an enabler for research
– Allowing flexibility in infrastructure hitherto not possible
– User control rather than provider control

• It's not just about infrastructure and not just about single cloud providers

• Cloud is a way of allowing higher-level services to be built more easily and made accessible

• Open standards
– allow a marketplace of services to develop
– allow diverse resource providers to participate
– move the value-add from availability of service to quality of service

Thank you & Questions