Penguin Computing / IU Partnership: HPC “cluster as a service” and Cloud Services. CASC Spring Meeting 2012. Craig A. Stewart (stewart@iu.edu), Executive Director, Pervasive Technology Institute; Associate Dean, Research Technologies. Matthew Jacobs, SVP Corporate Development, penguincomputing.com.


Page 1:

Penguin Computing / IU Partnership
HPC “cluster as a service” and Cloud Services
CASC Spring Meeting 2012

Craig A. Stewart (stewart@iu.edu)
Executive Director, Pervasive Technology Institute
Associate Dean, Research Technologies

Matthew Jacobs
SVP Corporate Development
penguincomputing.com

Page 2:

Please cite as:

Stewart, C.A. and M. Jacobs. 2012. “Penguin Computing / IU Partnership HPC ‘cluster as a service’ and Cloud Services.” Presentation. Presented at Coalition of Academic Scientific Computation, 29 February 2012, Arlington, VA. http://hdl.handle.net/2022/14441

The image on slide 1 (title slide) and slides 3-7 are © Penguin Computing Inc., all rights reserved; they may not be reused without permission from Penguin Computing Inc.

Other slides (except where explicitly noted) are copyright 2011 by the Trustees of Indiana University, and this content is released under the Creative Commons Attribution 3.0 Unported license (http://creativecommons.org/licenses/by/3.0/)


Page 3:

What is POD?

On-demand HPC system
> Compute, storage, low-latency fabrics, GPU, non-virtualized

Robust software infrastructure
> Full automation
> User and administration space controls
> Secure and seamless job migration
> Extensible framework
> Complete billing infrastructure

Services
> Custom product design
> Site and workflow integration
> Managed services
> Application support

HPC support expertise
> Skilled HPC administrators
> Leverages 13 years serving the HPC market

Internet connectivity: 150 Mb/s, burstable to 1 Gb/s

Page 4:

Penguin HPC Cloud Services

Penguin Computing on Demand (POD)
> True HPC in the cloud on a pay-as-you-go basis
> Overflow, targeted workloads, targeted user sets

Post-Purchase Collocation
> Collocation services provided by Penguin
> Cost reduction, budget reallocation

Public-Private On-Demand Partnerships
> Penguin-owned and -operated PODs, hosted at academic or government facilities
> Revenue sharing, augment local resources, self-sustaining growth

POD Hybrid
> On-premise cluster sized for mean usage, plus POD for peaks
> Save on initial capital outlay while sustaining a high service level to users

OEM HPC Cloud
> POD distribution to internal or external customers
> Augment local resources and expertise, fund growth

HPC SaaS Platform
> Hosting platform for SaaS providers
> On-demand delivery platform for ISVs

Turnkey Managed Services
> Remote managed services for Penguin and non-Penguin clusters
> Augment local expertise, reduce costs

Page 5:

Scyld HPC Cloud Management System
Created by POD developers and administrators

> Create and manage user and group hierarchies
> Simultaneously manage multiple collocated clusters
> Create customer-facing web portals
> Use web services to integrate with back-end systems
> Deploy HTML5-based cluster management tools
> Securely migrate user workloads
> Efficiently schedule and manage cluster resources
> Create and deploy virtual headnodes for user-specific clusters

Page 6:

Current data centers: Salt Lake City, Indiana University, Mountain View

1,500 cores (AMD and Intel)

240 TB on-demand storage

12 million commercial jobs and counting…

> Replaced an in-house image-analysis cluster with POD and collocated storage
> Provides cloud analysis services on POD for worldwide bioinformatics customers
> Replaced Amazon AWS cloud usage with the PODTools workflow migration system
> Nihon ESI provides crash analyses to Honda R&D during Japan’s brown-outs

Page 7:

The POD Advantage

> Persistent, customized user environment
> High-speed Intel and AMD compute nodes (physical)
> Fast access to local storage (data guaranteed to be local)
> Highly secure (https, shared-key authentication, IP matching, VPN)
> Billed by the fractional core-hour
> HPC expertise included (Penguin’s core business for many years)
> Cluster software stack included
> Troubleshooting included in support
> Collocated storage options available
> Highly dependable and dynamically scalable
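Since usage is billed by the core-hour, a job’s rough cost is simply cores × hours × rate. The sketch below illustrates that arithmetic in shell; the $0.10-per-core-hour rate is a made-up placeholder for illustration, not Penguin’s actual price.

```shell
# Back-of-envelope core-hour cost estimate.
# RATE_CENTS is a hypothetical price, not an actual POD rate.
CORES=48        # e.g. all cores of one 4-socket, 12-core-per-socket node
HOURS=10
RATE_CENTS=10   # assumed 10 cents per core-hour

TOTAL_CENTS=$((CORES * HOURS * RATE_CENTS))
printf 'Estimated cost: $%d.%02d\n' $((TOTAL_CENTS / 100)) $((TOTAL_CENTS % 100))
```

With these illustrative numbers the estimate comes to $48.00; fractional core-hours simply scale the same product.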

Page 8:

Clouds look serene enough, but is ignorance bliss? In the cloud, do you know:

> Where your data are?

> What laws prevail over the physical location of your data?

> What license you really agreed to?

> What the security (electronic / physical) around your data is?

> And how exactly you get to that cloud, or get things out of it?

> How secure your provider is financially? (The fact that something seems unimaginable, like cloud provider such-and-such going out of business abruptly, does not mean it is impossible!)

Photo by http://www.flickr.com/photos/mnsc/ (http://www.flickr.com/photos/mnsc/2768391365/sizes/z/in/photostream/), licensed under http://creativecommons.org/licenses/by/2.0/

Page 9:

Penguin Computing & IU partner for “Cluster as a Service”

Just what it says: Cluster as a Service. The cluster is physically located on IU’s campus, in IU’s Data Center, and is available to anyone at a .edu or FFRDC (Federally Funded Research and Development Center).

To use it:

> Go to podiu.penguincomputing.com

> Fill out the registration form

> Verify via your email

> Get out your credit card

> Go computing

This builds on Penguin’s experience: Penguin currently hosts Life Technologies’ BioScope and LifeScope in the cloud (http://lifescopecloud.com)

Page 10:

We know where the data are … and they are secure


Page 11:

An example of NET+ Services / Campus Bridging

"We are seeing the early emergence of a meta-university — a transcendent, accessible, empowering, dynamic, communally constructed framework of open materials and platforms on which much of higher education worldwide can be constructed or enhanced.” Charles Vest, president emeritus of MIT, 2006

NET+ goal: achieve economy of scale and retain a reasonable measure of control. See: Brad Wheeler and Shelton Waggener. 2009. “Above-Campus Services: Shaping the Promise of Cloud Computing for Higher Education.” EDUCAUSE Review, vol. 44, no. 6 (November/December 2009): 52-67.

Campus Bridging goal: make it all feel like it’s just a peripheral to your laptop (see pti.iu.edu/campusbridging)


Page 12:

IU POD: Innovation Through Partnership

> True on-demand HPC for Internet2

> Creative public/private model to address the HPC shortfall

> Turning lost EC2 dollars into central IT expansion

> Tiered channel strategy expansion to the EDU sector

> Program- and discipline-specific enhancements under way

> Objective third-party resource for collaboration (EDU, Federal, and Commercial)

Page 13:

POD IU (Rockhopper) specifications

Server Information
> Architecture: Penguin Computing Altus 1804
> TFLOPS: 4.4
> Clock Speed: 2.1 GHz
> Nodes: 11 compute; 2 login; 4 management; 3 servers
> CPUs: 4 x 2.1 GHz 12-core AMD Opteron 6172 processors per compute node
> Memory Type: Distributed and Shared
> Total Memory: 1,408 GB
> Memory per Node: 128 GB 1333 MHz DDR3 ECC
> Local Scratch Storage: 6 TB locally attached SATA2
> Cluster Scratch: 100 TB Lustre

Further Details
> OS: CentOS 5
> Network: QDR (40 Gb/s) InfiniBand; 1 Gb/s Ethernet
> Job Management Software: SGE
> Job Scheduling Software: SGE
> Job Scheduling Policy: Fair Share
> Access: key-based ssh login to headnodes; remote job control via Penguin’s PODShell
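Given the access model above (key-based ssh to a headnode, SGE for scheduling), a minimal batch job might look like the following sketch. The job name and runtime limit are illustrative choices, and any queue or parallel-environment names would need to be checked against the actual POD IU configuration.

```shell
#!/bin/bash
# Minimal SGE job script (a sketch; resource values are illustrative).
#$ -N hello_rockhopper      # job name (hypothetical)
#$ -cwd                     # run from the submission directory
#$ -l h_rt=00:10:00         # ten-minute wall-clock limit

# SGE sets NSLOTS at run time; default to 1 so the script also runs locally.
MSG="Hello from $(hostname) using ${NSLOTS:-1} slot(s)"
echo "$MSG"
```

After logging in with ssh, such a script is submitted with `qsub hello.sh` and monitored with `qstat`; PODShell offers remote submission without an interactive login.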

Page 14:

Available applications at POD IU (Rockhopper)

> COAMPS: Coupled Ocean/Atmosphere Mesoscale Prediction System.

> Desmond: a software package developed at D. E. Shaw Research to perform high-speed molecular dynamics simulations of biological systems on conventional commodity clusters.

> GAMESS: a program for ab initio molecular quantum chemistry.

> Galaxy: an open, web-based platform for data-intensive biomedical research.

> GROMACS: a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.

> HMMER: used for searching sequence databases for homologs of protein sequences, and for making protein sequence alignments.

> Intel compilers and libraries

> LAMMPS: a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator.

> MM5: the PSU/NCAR mesoscale model, a limited-area, nonhydrostatic, terrain-following sigma-coordinate model designed to simulate or predict mesoscale atmospheric circulation. The model is supported by several pre- and post-processing programs, referred to collectively as the MM5 modeling system.

> mpiBLAST: a freely available, open-source, parallel implementation of NCBI BLAST.

> NAMD: a parallel molecular dynamics code for large biomolecular systems.

Page 15:

Available applications at POD IU (Rockhopper), continued

> NCBI BLAST: the Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.

> OpenAtom: a highly scalable and portable parallel application for molecular dynamics simulations at the quantum level. It implements the Car-Parrinello ab initio molecular dynamics (CPAIMD) method.

> OpenFOAM: the OpenFOAM® (Open Field Operation and Manipulation) CFD Toolbox is a free, open-source CFD software package produced by OpenCFD Ltd. It has a large user base across most areas of engineering and science, from both commercial and academic organisations. OpenFOAM has an extensive range of features to solve anything from complex fluid flows involving chemical reactions, turbulence and heat transfer, to solid dynamics and electromagnetics.

> OpenMPI: InfiniBand-based Message Passing Interface 2 (MPI-2) implementation.

> POP: an ocean circulation model derived from earlier models of Bryan, Cox, Semtner and Chervin, in which depth is used as the vertical coordinate. The model solves the three-dimensional primitive equations for fluid motions on the sphere under hydrostatic and Boussinesq approximations.

> Portland Group compilers

> R: a language and environment for statistical computing and graphics.

> WRF: the Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs. It features multiple dynamical cores, a 3-dimensional variational (3DVAR) data assimilation system, and a software architecture allowing for computational parallelism and system extensibility.
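To illustrate how one of the listed applications might be run, the sketch below generates an SGE job script that launches NAMD under OpenMPI across multiple nodes. The parallel-environment name `orte`, the slot count, and the input file name are assumptions for illustration, not confirmed details of the Rockhopper configuration.

```shell
#!/bin/bash
# Sketch: generate an SGE job script for a NAMD run under OpenMPI.
# PE name "orte", slot count 96, and input.namd are hypothetical.
JOB_SCRIPT=namd_example.sh

cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#$ -N namd_example
#$ -cwd
#$ -pe orte 96            # e.g. two 48-core Altus nodes over QDR InfiniBand
#$ -l h_rt=04:00:00

mpirun -np "$NSLOTS" namd2 input.namd
EOF

echo "Wrote $JOB_SCRIPT; submit from a login node with: qsub $JOB_SCRIPT"
```

Requesting slots through an SGE parallel environment lets the scheduler place ranks across nodes, with OpenMPI using the InfiniBand fabric listed in the Rockhopper specifications.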