AWS Webcast - An Introduction to High Performance Computing on AWS

KD Singh AWS Solutions Architect

Upload: amazon-web-services

Post on 05-Jul-2015

Category: Technology

DESCRIPTION

High Performance Computing (HPC) allows scientists and engineers to solve complex science, engineering, and business problems using applications that require high-bandwidth, low-latency networking and very high compute capabilities. Learn how the AWS cloud can cost-effectively provide the scalable computing resources, storage services, and analytic tools that enable running various kinds of HPC workloads.

Who should attend? Engineers, architects, product managers, data scientists, high performance computing specialists, and researchers from industry and academia, along with technically-minded business stakeholders looking to put data to work for their organization.

TRANSCRIPT

Page 1: AWS Webcast - An Introduction to High Performance Computing on AWS

KD Singh, AWS Solutions Architect

Page 2:

Agenda

• High performance and high throughput computing on AWS

• Integrating on-premises HPC environments with AWS

• HPC ecosystem – partners and tools

• Demo

Page 3:

HPC and HTC on AWS

Concepts, Patterns & Practices

Page 4:

Take a typical big computation task…

Page 5:

…for which an average cluster is too small (or simply takes too long to complete)…

Page 6:

…optimization of algorithms can give some leverage…

Page 7:

…and complete the task in hand…

Page 8:

Applying a large cluster…

Page 9:

…can sometimes be overkill

Page 10:

AWS instance clusters can be balanced to the job in hand…

Page 11:

…neither too large…

Page 12:

…nor too small…

Page 13:

…with multiple clusters running at the same time

Page 14:

…HPC clusters are too small when you need them most, …and too large the rest of the time

Jason Stowe, Cycle Computing

Page 15:

Why AWS for HPC?

Low cost with flexible pricing

Efficient clusters

Unlimited infrastructure

Faster time to results

Concurrent clusters on-demand

Increased collaboration

Page 16:

Benefits of Agility

Rigid On-Premises Resources: Predicted Demand vs. Actual Demand, leading to Waste or Customer Dissatisfaction

Elastic Cloud-Based Resources: Resources scaled to demand, following Actual demand

Page 17:

Cost Benefits of HPC in the Cloud

Pay As You Go Model

Use only what you need

Multiple pricing models

On-Premises Capital Expense Model

High upfront capital cost

High cost of ongoing support

Page 18:

Many Pricing Models to Support Different Workloads

Free Tier: Get started on AWS with free usage & no commitment. For POCs and getting started.

On-Demand: Pay for compute capacity by the hour with no long-term commitments. For spiky workloads, or to define needs.

Reserved: Make a low, one-time payment and receive a significant discount on the hourly charge. For committed utilization.

Spot: Bid for unused capacity, charged at a Spot Price which fluctuates based on supply and demand. For time-insensitive or transient workloads.

Dedicated: Launch instances within Amazon VPC that run on hardware dedicated to a single customer. For highly sensitive or compliance related workloads.
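The trade-off between the On-Demand and Reserved models can be sketched as a break-even calculation. This is a minimal illustration only: the three prices are hypothetical placeholders chosen for round numbers, not AWS list prices, and the function names are invented for this example.

```python
# All prices below are hypothetical placeholders, not AWS list prices.

def on_demand_cost(hours: float, hourly_rate: float) -> float:
    """Pay by the hour with no long-term commitment."""
    return hours * hourly_rate


def reserved_cost(hours: float, upfront: float, discounted_rate: float) -> float:
    """One-time payment plus a discounted hourly charge."""
    return upfront + hours * discounted_rate


def break_even_hours(hourly_rate: float, upfront: float, discounted_rate: float) -> float:
    """Usage level above which the reserved model becomes cheaper."""
    if discounted_rate >= hourly_rate:
        raise ValueError("reserved rate must be lower than the on-demand rate")
    return upfront / (hourly_rate - discounted_rate)


HOURLY = 1.00      # hypothetical on-demand rate, $/hour
UPFRONT = 2000.0   # hypothetical one-time reserved payment, $
DISCOUNTED = 0.50  # hypothetical reserved hourly rate, $/hour

threshold = break_even_hours(HOURLY, UPFRONT, DISCOUNTED)
print(f"Reserved becomes cheaper after {threshold:.0f} hours of use")
```

Below the threshold the workload is better served by On-Demand, which matches the slide's pairing of On-Demand with spiky workloads and Reserved with committed utilization.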

Page 19:

Customers running HPC Workloads on AWS

Page 20:

On-Demand Supercomputer!

484.14 TFLOPS

76th fastest supercomputer in the world (June 2014 Top500 list)

26,496-core cluster of C3 instances

Page 21:

“Supercomputing simulation employs 156,000 Amazon processor cores

To simulate 205,000 molecules as quickly as possible for a USC simulation, Cycle Computing fired up a mammoth amount of Amazon servers around the globe.” 1

• 8 Regions; 156,314 cores; 16,788 instances

• 1.21 petaFLOPS RPeak

• 264 compute years in 18 hours

• Supercomputing environment worth $68M cost $33K

1 c|net news, http://news.cnet.com/8301-1001_3-57611919-92/supercomputing-simulation-employs-156000-amazon-processor-cores/
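The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes that "compute years" means core-years and that a year has 365 days; both are assumptions, not statements from the slide.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a 365-day year

cores = 156_314
wall_clock_hours = 18
compute_years = 264          # assumed to mean core-years
cloud_cost_usd = 33_000
equivalent_hardware_usd = 68_000_000

# Upper bound on work if every core ran for the full 18 hours.
max_core_hours = cores * wall_clock_hours

# Work reported on the slide, converted to core-hours.
reported_core_hours = compute_years * HOURS_PER_YEAR

# Share of the theoretical maximum that the reported figure represents.
utilization = reported_core_hours / max_core_hours

cost_per_core_hour = cloud_cost_usd / reported_core_hours
cost_ratio = equivalent_hardware_usd / cloud_cost_usd

print(f"max core-hours:      {max_core_hours:,}")
print(f"reported core-hours: {reported_core_hours:,}")
print(f"implied utilization: {utilization:.0%}")
print(f"cost per core-hour:  ${cost_per_core_hour:.4f}")
print(f"hardware/cloud cost: {cost_ratio:.0f}x")
```

Under these assumptions the reported 264 compute years fit within what 156,314 cores could deliver in 18 hours, so the slide's numbers are mutually consistent.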

Page 22:

Characterizing HPC

Loosely Coupled: Embarrassingly parallel; Elastic; Batch workloads

Tightly Coupled: Interconnected jobs; Network sensitivity; Job specific algorithms

Supporting Services: Data management; Task distribution; Workflow management
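An embarrassingly parallel batch workload, the loosely coupled case named on this slide, is one where every task runs without talking to any other task. The sketch below illustrates the idea on a single machine, with threads standing in for instances; `simulate` is a hypothetical placeholder for a real job.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(task_id: int) -> int:
    """Stand-in for one independent job; a real task would run a simulation."""
    return task_id * task_id


def run_batch(task_ids, workers: int = 4):
    """Run independent tasks concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, task_ids))


results = run_batch(range(8))
print(results)
```

Because no task depends on another, adding workers changes only the elapsed time, not the result. Tightly coupled jobs do not have this property, which is why they are network sensitive.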

Page 23:

Characterizing HPC

Page 24:

Compute

Elastic Compute Cloud (EC2)

Basic unit of compute capacity

Range of CPU, memory & local disk options

35+ instance types available, from micro to cluster compute

Vertical Scaling: c3.large, c3.2xlarge, c3.8xlarge

Feature | Details

Flexible | Run Windows or Linux distributions

Scalable | Wide range of instance types from micro to cluster compute

Machine Images | Configurations can be saved as machine images (AMIs) from which new instances can be created

Full control | Full root or administrator rights

Secure | Full firewall control via Security Groups

Monitoring | Publishes metrics to CloudWatch

Inexpensive | On-Demand, Reserved and Spot instance types

VM Import/Export | Import and export VM images to transfer configurations in and out of EC2

Page 25:

Automation & Control

CLI, API and Console

Scripted configurations

ec2-run-instances ami-xxxxxxxx \
    --instance-count 3 \
    --availability-zone eu-west-1a \
    --instance-type m3.medium

http://docs.amazonwebservices.com/AWSEC2/latest/CommandLineReference/
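The same launch request can be scripted through the API instead of the CLI. The sketch below assumes the boto3 SDK, which the slides do not mention; it builds the keyword arguments for the EC2 client's `run_instances` call from the values on the slide. The helper names are invented for this example, and `launch` is defined but not invoked because it needs boto3 and AWS credentials.

```python
def build_run_instances_params(ami_id, count, zone, instance_type):
    """Build keyword arguments for the boto3 EC2 client's run_instances call."""
    return {
        "ImageId": ami_id,
        "MinCount": count,   # launch exactly `count` instances
        "MaxCount": count,
        "InstanceType": instance_type,
        "Placement": {"AvailabilityZone": zone},
    }


def launch(params, region="eu-west-1"):
    """Send the request; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the sketch loads without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.run_instances(**params)


# Values taken from the command on the slide; the AMI ID is a placeholder.
params = build_run_instances_params("ami-xxxxxxxx", 3, "eu-west-1a", "m3.medium")
print(params)
```

Keeping the parameters in a plain dictionary makes the configuration easy to store, review, and reuse, which is the point of scripted configurations.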

Page 26:

Auto Scaling

Automatic re-sizing of compute clusters based upon demand

as-create-auto-scaling-group MyGroup \
    --launch-configuration MyConfig \
    --availability-zones eu-west-1a \
    --min-size 2 \
    --max-size 200
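The effect of `--min-size` and `--max-size` can be sketched as a clamp on whatever capacity the demand calls for. This illustrates the idea rather than how the Auto Scaling service computes capacity; the backlog-based sizing rule and the `jobs_per_instance` parameter are assumptions made for the example.

```python
import math


def desired_capacity(pending_jobs: int, jobs_per_instance: int,
                     min_size: int = 2, max_size: int = 200) -> int:
    """Instances needed for the backlog, clamped to the group's size limits."""
    if jobs_per_instance <= 0:
        raise ValueError("jobs_per_instance must be positive")
    needed = math.ceil(pending_jobs / jobs_per_instance)
    return max(min_size, min(max_size, needed))


# Hypothetical backlogs: idle, moderate, and far beyond the group's maximum.
for backlog in (0, 37, 5000):
    print(backlog, "->", desired_capacity(backlog, jobs_per_instance=10))
```

The group never shrinks below 2 instances and never grows beyond 200, whatever the backlog.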

Page 27:

Monitoring & Alerting

CloudWatch alerts based upon CPU load, memory, I/O & user defined triggers

Trigger scaling policy
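The link between an alert and a scaling policy can be sketched as a threshold check over recent metric samples followed by a capacity adjustment. This is a simplified model, not the CloudWatch API: the sample values, the 80% threshold, and the three-period rule are all made up for illustration.

```python
def alarm_breached(datapoints, threshold: float, periods: int) -> bool:
    """True when the most recent `periods` datapoints all exceed the threshold."""
    if periods <= 0 or len(datapoints) < periods:
        return False
    return all(value > threshold for value in datapoints[-periods:])


def apply_scaling_policy(current: int, adjustment: int,
                         min_size: int, max_size: int) -> int:
    """Change capacity by a fixed amount, staying within the group limits."""
    return max(min_size, min(max_size, current + adjustment))


cpu_percent = [42.0, 55.0, 81.0, 88.0, 93.0]  # made-up metric samples
capacity = 4

# Three consecutive samples above 80% trigger a scale-out by two instances.
if alarm_breached(cpu_percent, threshold=80.0, periods=3):
    capacity = apply_scaling_policy(capacity, adjustment=2, min_size=2, max_size=200)

print(capacity)
```

Requiring several consecutive breaches avoids reacting to a single noisy sample.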

Page 28:

Elastic Capacity

Time: +00h

<10 cores

Page 29:

Elastic Capacity

Time: +24h

>1500 cores

Page 30:

Elastic Capacity

Time: +72h

<10 cores

Page 31:

Elastic Capacity

Time: +120h

>600 cores
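To see what the four Elastic Capacity snapshots imply, the sketch below compares core-hours consumed by an elastic cluster with those of a fixed cluster sized for the peak. The slides give only bounds ("<10", ">1500", ">600"), so the exact core counts, the step-function shape between snapshots, and the 144-hour window are all assumptions made for illustration.

```python
# Illustrative samples only: (hour, cores). The slides give bounds, not values.
samples = [(0, 10), (24, 1500), (72, 10), (120, 600)]
horizon_hours = 144  # hypothetical end of the observation window


def core_hours(points, end_hour):
    """Integrate a step function that holds each level until the next sample."""
    total = 0
    following = points[1:] + [(end_hour, 0)]
    for (start, cores), (next_start, _) in zip(points, following):
        total += cores * (next_start - start)
    return total


elastic = core_hours(samples, horizon_hours)
peak = max(cores for _, cores in samples)
fixed = peak * horizon_hours  # a static cluster sized for the peak

print(f"elastic: {elastic:,} core-hours")
print(f"fixed:   {fixed:,} core-hours")
print(f"elastic uses {elastic / fixed:.0%} of the fixed cluster's core-hours")
```

Under these assumed numbers, the elastic cluster consumes well under half of the core-hours a peak-sized static cluster would hold for the same period.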

Page 32:

Computational Chemistry project for Cancer treatment

Estimated computation time: 39 years

Estimated project cost: $40 million

87,000 Core AWS Cluster

Spot Instances

Completed in 9 hours

Total Cost $4,232

Page 33:

AWS Big Data Portfolio

When data sets and data analytics need to scale to the point that you have to start innovating around how to collect, store, organize, analyze and share it

COLLECT | STORE | ANALYZE | SHARE

Import/Export, Direct Connect, Kinesis, S3, Glacier, DynamoDB, Redshift, EMR, EC2, Data Pipeline

Page 34:

Analyzed more than 3 billion data points in 2.8 seconds instead of weeks or months

SEC used Tradeworx and the AWS Cloud to create an analytics platform at 10% the cost of a traditional environment in less than 4 months

AWS gives Tradeworx the ability to collect and analyze billions of data points over years, allowing the SEC to reconstruct any market event, down to the individual record

Page 35:

Characterizing HPC

Page 36:

What if you need to:

Implement MPI?

Code for GPUs?

Page 37:

Tightly coupled

Enhanced Networking EC2 Instances

Single Root I/O Virtualization (SR-IOV)

Higher packets per second, lower latencies, low network jitter

Implement HVM process execution

10 Gigabit Ethernet

Instance family | Processor | vCPUs | Local disk | RAM

R3 instances | Intel Xeon E5-2670 v2 2.5 GHz | 32 | 640 GB SSD | 244 GB

C3 instances | Intel Xeon E5-2680 v2 2.8 GHz | 32 | 640 GB SSD | 60 GB

I2 instances | Intel Xeon E5-2670 v2 2.5 GHz | 32 | 1.6 TB SSD | 244 GB

Page 38:

Tightly coupled

Network Placement Groups

Cluster instances can be launched within a Placement Group. All instances launched in a Placement Group have low latency, full bisection, 10 Gbps bandwidth between instances.
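"Full bisection" bandwidth means that if the instances are split into two equal halves, every instance in one half can talk to a partner in the other half at the full link rate at the same time. The sketch below works out the resulting aggregate figure; the function name is invented for this example, and the group sizes are arbitrary.

```python
def bisection_bandwidth_gbps(instances: int, link_gbps: float = 10.0) -> float:
    """Aggregate bandwidth across an even split of the group at full line rate."""
    if instances < 2:
        return 0.0
    # Each instance in one half pairs with one instance in the other half.
    return (instances // 2) * link_gbps


# Arbitrary example group sizes with the slide's 10 Gbps links.
for n in (2, 16, 64):
    print(n, "instances ->", bisection_bandwidth_gbps(n), "Gbps across the bisection")
```

This matters for tightly coupled jobs such as MPI codes, where many node pairs exchange data at once and an oversubscribed network would throttle all of them.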

Page 39:

Compute-intensive clinical trial simulations that previously took 60 hours are finished in only 1.2 hours on the AWS Cloud

http://aws.amazon.com/solutions/case-studies/bristol-myers-squibb/

BMS used AWS to build a secure, self-provisioning portal for hosting research so scientists can run clinical trial simulations on-demand while BMS is able to establish rules that keep compute costs low.

Running simulations 98% faster has led to more efficient and less costly clinical trials and better conditions for patients.
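The "98% faster" figure follows directly from the two run times on this slide, read as a 98% reduction in elapsed time. A short check:

```python
before_hours = 60.0  # run time before, from the slide
after_hours = 1.2    # run time on AWS, from the slide

# Speedup factor: how many times faster the new run is.
speedup = before_hours / after_hours

# Reduction in elapsed time, as a percentage of the original run time.
time_saved_pct = (before_hours - after_hours) / before_hours * 100

print(f"speedup: {speedup:.0f}x")
print(f"elapsed time reduced by {time_saved_pct:.0f}%")
```

A 98% reduction in run time corresponds to a 50x speedup, which is the more common way to state the result in HPC.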

Page 40:

GPU Computing

GPU compute instances

Intel® Xeon processors

NVIDIA GPUs

CUDA, OpenCL frameworks

Cluster GPU CG1: Intel Xeon X5570; 16 vCPUs; 10 Gigabit Ethernet; 2x NVIDIA Tesla Fermi M2050, 448 cores each

G2 instances: Intel Xeon E5-2670 2.5 GHz; 8 vCPUs; on-board hardware encoder; 1,536 CUDA cores; 15 GB RAM; 4 GB video memory

Page 41:

CUDA & OpenCL

Massively parallel clusters running on GPUs

NVIDIA GRID and Tesla cards in specialized instance types

Page 42:

National Taiwan University

50 x cg1.4xlarge instance types

100 NVIDIA Tesla M2050

“Our purpose is to break the record of solving the shortest vector problem (SVP) in Euclidean lattices…the vectors we found are considered the hardest SVP anyone has solved so far.”

Prof. Chen-Mou Cheng, the Principal Investigator of Fast Crypto Lab

$2,300 for using 100 Tesla M2050 for ten hours
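The unit cost behind this run can be derived from the slide's own numbers. The sketch below takes two GPUs per instance, which is what 50 instances and 100 GPUs imply.

```python
instances = 50
gpus_per_instance = 2   # implied by 50 instances carrying 100 GPUs
hours = 10
total_cost_usd = 2300

gpus = instances * gpus_per_instance

# Spread the total bill over GPU-hours and over instance-hours.
cost_per_gpu_hour = total_cost_usd / (gpus * hours)
cost_per_instance_hour = total_cost_usd / (instances * hours)

print(f"{gpus} GPUs for {hours} hours")
print(f"${cost_per_gpu_hour:.2f} per GPU-hour")
print(f"${cost_per_instance_hour:.2f} per instance-hour")
```

Stating the cost per GPU-hour makes it easier to compare against the purchase and operating cost of owning equivalent cards.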

Page 43:

Coming Soon…

New Compute-Optimized EC2 Instances

C4 family

C4 instances: Intel Xeon E5-2666 v3 (Haswell, custom); 36 vCPUs; 60 GB RAM; 2.9 GHz, up to 3.5 GHz with Turbo Boost

Larger and Faster Elastic Block Store (EBS) Volumes

Up to 16TB per volume

Up to 10,000 baseline IOPS per volume

Up to 20,000 provisioned IOPS per volume
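The relationship between volume size and the 10,000 baseline IOPS cap can be sketched as follows. The slide states only the cap; the rule of 3 IOPS per GiB and the floor of 100 IOPS are assumptions based on how General Purpose (SSD) volumes were described at the time, so treat them as illustrative parameters rather than current service limits.

```python
def baseline_iops(size_gib: int, iops_per_gib: int = 3,
                  floor: int = 100, cap: int = 10_000) -> int:
    """Baseline IOPS under an assumed per-GiB rule, bounded by floor and cap."""
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    return min(cap, max(floor, size_gib * iops_per_gib))


# Example sizes: small, medium, and the 16 TB maximum (16 * 1024 GiB).
for size in (10, 1000, 16 * 1024):
    print(f"{size:>6} GiB -> {baseline_iops(size):>6} baseline IOPS")
```

Under these assumptions a volume reaches the cap well before the maximum size, so very large volumes gain capacity rather than additional baseline IOPS.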

Page 44:

Characterizing HPC

Page 45:

Middleware Services

Data management

Fully managed SQL, NoSQL and object storage

Relational Database Service: Fully managed database (MySQL, Oracle, MSSQL)

DynamoDB: NoSQL, schemaless, provisioned throughput database

S3: Object datastore up to 5TB per object; 99.999999999% durability

Page 46:

Moving computation closer to the data

“Big Data” changes the dynamic of computation and data sharing

Collection: Direct Connect, Import/Export, S3, DynamoDB

Computation: EC2, GPUs, Elastic MapReduce

Collaboration: CloudFormation, Simple Workflow, S3, Zocalo

Page 47:

Middleware Services

Feeding workloads

Using highly available Simple Queue Service to feed EC2 nodes

Amazon SQS

Processing task/processing trigger

Processing results
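The queue-fed pattern on this slide can be sketched locally, with an in-memory queue standing in for Amazon SQS and threads standing in for EC2 nodes. This is an analogy, not the SQS API: each worker pulls a task when it is free, so faster nodes naturally take more work. The doubling step is a placeholder for real processing.

```python
import queue
import threading


def worker(tasks: queue.Queue, results: list, lock: threading.Lock) -> None:
    """Pull tasks until the queue is empty, like a node polling for messages."""
    while True:
        try:
            item = tasks.get_nowait()
        except queue.Empty:
            return
        outcome = item * 2  # placeholder for the real processing step
        with lock:
            results.append(outcome)
        tasks.task_done()


def process_all(items, node_count: int = 3):
    """Enqueue every item, then let `node_count` workers drain the queue."""
    tasks = queue.Queue()
    for item in items:
        tasks.put(item)
    results, lock = [], threading.Lock()
    nodes = [threading.Thread(target=worker, args=(tasks, results, lock))
             for _ in range(node_count)]
    for node in nodes:
        node.start()
    for node in nodes:
        node.join()
    return sorted(results)


processed = process_all(range(5))
print(processed)
```

Because the queue decouples producers from consumers, the number of workers can change without touching the code that submits tasks.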

Page 48:

Middleware Services

Coordinating workloads & task clusters

Handle long running processes across many nodes and task steps with Simple Workflow

Task A

Task B (Auto-scaling)

Task C

Grid Engine, cfncluster, LSF, OpenLava, Bright Cluster Manager
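A multi-step workflow of the kind shown here can be sketched as three stages, where the middle stage fans out across workers. This is an assumption-laden illustration: the slide does not say what Tasks A, B, and C do or in what order they run, so the split, process, and combine roles below are invented, and threads stand in for an auto-scaled pool of nodes. It does not use the Simple Workflow API.

```python
from concurrent.futures import ThreadPoolExecutor


def task_a(n: int):
    """Hypothetical first step: split the job into independent work items."""
    return list(range(1, n + 1))


def task_b(item: int) -> int:
    """Hypothetical middle step: process one item; this step scales out."""
    return item * item


def task_c(partials) -> int:
    """Hypothetical final step: combine the partial results."""
    return sum(partials)


def run_workflow(n: int, workers: int = 4) -> int:
    """Run A, then B across a worker pool, then C on the collected output."""
    items = task_a(n)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(task_b, items))
    return task_c(partials)


total = run_workflow(4)
print(total)
```

A workflow service adds what this sketch leaves out: durable state between steps, retries, and timeouts for long running tasks.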

Page 49:

Integrated Solutions

Page 50:

Cloud isn’t an ‘all or nothing’ choice

Legacy Data Centers

On-Premises Resources

Integration

Cloud Resources

Page 51:

Integrating AWS with your existing on-premises infrastructure

Legacy Data Centers | AWS

Active Directory, Shibboleth / SAML | Users & Access Rules (IAM)

Network Configuration | Your Private Network (VPC)

Encryption | Encryption (S3, RDS, HSM)

Backup Appliances | Backups (Storage Gateway)

Your On-Premises Apps | Your Cloud Apps

Connectivity: AWS Direct Connect, VPN

Page 52:

Example HPC Design Pattern

Availability Zones: AZ-1, AZ-2, with Public and Private subnets

Connectivity: Customer Gateway, VPN Connection, VPN Gateway, Internet Gateway, Internet

Compute: Master, Spot

Storage: Clustered Storage Servers, Amazon S3

Page 53:

AWS HPC Partners and Tools

Page 54:

HPC Software on AWS Marketplace

Page 55:

HPC Partners and Apps

Page 56:

HPC Applications and Tools

Use your current development tools

NVIDIA CUDA drivers pre-loaded

Intel MPI and Intel MKL® libraries

OpenMPI and MPICH2

Applications/Services

MathWorks MATLAB, Intel Lustre, OrangeFS, Ansys Fluent, COMSOL, OpenFOAM etc.

Use your favorite batch scheduler and configuration management tools

cfncluster, Univa, Sun Grid Engine, HTCondor, MIT StarCluster, Torque, Slurm, Rocks+ (StackIQ), AWS CloudFormation, OpenLava, Chef, Puppet, Elasticluster

Page 57:

Popular HPC Workloads on AWS

Oil and Gas: Seismic Data Processing; Reservoir Simulations, Modeling

Manufacturing & Engineering: Computational Fluid Dynamics (CFD); Finite Element Analysis (FEA)

Life Sciences: Genome Analysis; Molecular Modeling; Protein Docking

Media & Entertainment: Transcoding and Encoding; DRM, Encryption; Rendering

Scientific Computing: Computational Chemistry; High Energy Physics; Stochastic Modeling; Quantum Analysis; Climate Models

EDA: Simulation; Verification

Page 58:

CloudFormation cluster (cfncluster) demo

https://github.com/awslabs/cfncluster