HPC on OpenStack® Use Cases


Page 1: HPC on OpenStack® Use Cases

© Copyright 2015 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The OpenStack™ attribution statement should be used: The OpenStack wordmark and the Square O Design, together or apart, are trademarks or registered trademarks of the OpenStack Foundation in the United States and other countries, and are used with the OpenStack Foundation’s permission.

HPC on OpenStack® Use Cases

Glyn Bowden, Chief Technologist

Eric Lajoie, Snr. Solution Architect

HP Helion Professional Services

Page 2: HPC on OpenStack® Use Cases

Identify the Architecture…

Page 3: HPC on OpenStack® Use Cases

OpenStack Architecture

Page 4: HPC on OpenStack® Use Cases

Moab HPC Suite

Page 5: HPC on OpenStack® Use Cases

HPC and Cloud – Opposing Forces

Cloud

• Share Everything

• Generic Workloads

• Loosely Coupled

• Many small workloads

Page 6: HPC on OpenStack® Use Cases

HPC and Cloud – Opposing Forces

Cloud

• Share Everything

• Generic Workloads

• Loosely Coupled

• Many small workloads

HPC

• Share Nothing

• Specific, Niche Workloads

• Tightly Coupled (RDMA)

• Few Large Distributed Workloads

Page 7: HPC on OpenStack® Use Cases

But the same…

Cloud

• Highly Distributed

• Large Storage Pools

• Resource Management Key

• Performance Management

HPC

• Highly Distributed

• Large Storage Pools

• Resource Management Key

• Performance Management

Page 8: HPC on OpenStack® Use Cases

Background On HPC

• Two types of HPC:
  – Analytics
    • Big data sets
    • Simple operations repeated many times
    • Aggregation of results
  – Computationally Intensive
    • Smaller data sets
    • Very complex algorithms that need to be broken down
    • Sequential processing and summary
    • Often latency sensitive (RDMA, Lustre) or bandwidth sensitive (high-volume filesystems, large files)

Page 9: HPC on OpenStack® Use Cases

Background On HPC

• Two types of Computational HPC:
  – Batch Processing (see the sketch after this list)
    • Loosely coupled
    • Embarrassingly parallel
    • Limited / no shared resources during jobs
  – Realtime / Grid Computing
    • Tightly coupled
    • Requires high-performance networking for Remote Direct Memory Access (RDMA)
    • Usually a high-performance shared file system is required
    • CPU and memory architecture more critical than batch
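
A toy illustration of the batch, embarrassingly parallel style using only the Python standard library: independent work items with no shared state, fanned out to workers and aggregated only at the end. The workload function is invented for illustration.

```python
# Embarrassingly parallel batch sketch: each item is independent, so the
# workers never communicate and no shared filesystem or RDMA is needed.
from multiprocessing import Pool

def score_sample(sample_id: int) -> float:
    """Stand-in for one independent batch job (no communication with peers)."""
    return sum((sample_id * k) % 7 for k in range(1_000)) / 1_000

if __name__ == "__main__":
    with Pool(processes=8) as pool:
        results = pool.map(score_sample, range(10_000))  # loosely coupled fan-out
    print("aggregate:", sum(results) / len(results))     # summary step at the end
```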

Page 10: HPC on OpenStack® Use Cases

The Multi-Tenancy Challenge

• HPC schedulers and frameworks have been working very well for years, so why use OpenStack?
• Tenancy and security have always been an issue!
• In most HPC environments, if you have access to the bastion host you tend to have access to everything; security is usually a secondary thought.
• A single, flat but high-performance network means jobs are mixed, which is insecure.
• This was initially a problem for genomics: European and US organisations have laws about the security of human data, and human DNA is included!
• How do different teams use the cluster without sharing the data?
• The same can be true for pharmaceuticals and test data.

Page 11: HPC on OpenStack® Use Cases

The Multi-Tenancy Solution

• OpenStack allows tenants to leverage the same golden images for processing but use separated tenant networks
• Tenant-based storage and provider VLANs mean the data doesn’t mix
• Better scheduling options within Nova now allow tenants not to compete
• Use of either Availability Zones or Cells can provide complete segregation, even at the compute management layer (see the sketch below)
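
A minimal sketch of that separation with the openstacksdk Python library, assuming credentials in clouds.yaml; the cloud name, CIDR, availability zone and image/flavor IDs are placeholders rather than values from the slides.

```python
import openstack

conn = openstack.connect(cloud="hpc-cloud")  # credentials taken from clouds.yaml

# Dedicated tenant network + subnet: jobs and data stay inside the project.
net = conn.network.create_network(name="genomics-tenant-net")
conn.network.create_subnet(network_id=net.id, ip_version=4,
                           cidr="10.20.0.0/24", name="genomics-tenant-subnet")

# Boot from the shared golden image, but only on this tenant network and,
# optionally, in a dedicated availability zone for full segregation.
server = conn.compute.create_server(
    name="genomics-worker-01",
    image_id="GOLDEN_IMAGE_ID",
    flavor_id="HPC_FLAVOR_ID",
    networks=[{"uuid": net.id}],
    availability_zone="hpc-az1",
)
conn.compute.wait_for_server(server)
```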

Page 12: HPC on OpenStack® Use Cases

Federation

• Allows multiple separate clouds to communicate, share, or trust the same authentication
• Useful for abstracting cloud models: share API standards and the authentication model; everything else can be different
• Every cloud owner can manage their own authentication domains but leverage others’

Page 13: HPC on OpenStack® Use Cases

Compute

Page 14: HPC on OpenStack® Use Cases

Compute

• Multi-Tenancy
• Regions
  – Separate set of API endpoints per region
  – Reduces the impact of one tenant’s activities on another
  – Independent tenant scaling
  – Requires more hardware (or careful management!)
• Cells
  – Single API endpoint
  – Nova-centric
  – Hierarchical tree of cells
  – Each cell runs its own set of Nova services except the API, which is shared

Page 15: HPC on OpenStack® Use Cases

Compute cont.

• Low Latency (consistently)
  – RDMA over InfiniBand
  – Remote Procedure Calls (RPC) between nodes
  – Dataset shared amongst many nodes
• Non-Uniform Memory Access (NUMA) and Pinning (see the flavor sketch after this list)
  – Cache efficiency and memory localisation
  – vCPU -> pCPU pinning (libvirt support, April '14)
  – CPU affinity and Hyper-Threading
  – DPDK framework
• CPU Offload
  – DPDK framework
  – NVMe-based SSD (Samsung 950 Pro)
  – GPU PCI passthrough
  – NPIV
  – SR-IOV
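
A hedged sketch of how NUMA and pinning are exposed to guests through Nova flavor extra specs (hw:cpu_policy and hw:numa_nodes are standard Nova keys; the flavor name, sizes and credentials are made up). It uses python-novaclient with a keystoneauth1 session.

```python
from keystoneauth1.identity import v3
from keystoneauth1 import session
from novaclient import client as nova_client

auth = v3.Password(auth_url="http://controller:5000/v3",
                   username="admin", password="secret", project_name="admin",
                   user_domain_name="Default", project_domain_name="Default")
nova = nova_client.Client("2.1", session=session.Session(auth=auth))

# A flavor for pinned, NUMA-local HPC guests.
flavor = nova.flavors.create(name="hpc.pinned.16", ram=65536, vcpus=16, disk=40)
flavor.set_keys({
    "hw:cpu_policy": "dedicated",  # pin each vCPU to a dedicated pCPU
    "hw:numa_nodes": "1",          # keep the guest within a single NUMA node
})
```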

Presenter notes: Glyn left, Eric right
Page 16: HPC on OpenStack® Use Cases

NUMA & Pinning

Place VMs into your host in an ordered efficient fashion

Exposing resources in an efficient way

Glance image hints are used for NUMA placement

[Diagram: a two-socket host with two NUMA nodes. Each node has four cores sharing an L3 cache and its own local memory (Node 0 Memory / Node 1 Memory), linked by QPI and with PCI IO attached per socket; NICs (eth0-eth3) hang off each socket.]
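
For reference, the host-side NUMA layout that these hints are scheduled against can be read straight from sysfs; a small standard-library sketch:

```python
# Print each NUMA node's CPU list and local memory, the topology the
# scheduler and libvirt work against when honouring NUMA/pinning hints.
from pathlib import Path

def numa_topology():
    topo = {}
    for node in sorted(Path("/sys/devices/system/node").glob("node[0-9]*")):
        cpulist = (node / "cpulist").read_text().strip()      # e.g. '0-7,16-23'
        meminfo = (node / "meminfo").read_text()
        mem_kb = next(int(line.split()[3]) for line in meminfo.splitlines()
                      if "MemTotal" in line)
        topo[node.name] = {"cpus": cpulist, "mem_total_kb": mem_kb}
    return topo

if __name__ == "__main__":
    for node, info in numa_topology().items():
        print(node, info)
```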

Presenter notes: Eric
Page 17: HPC on OpenStack® Use Cases

Hyper-Threading and CPU Affinity

Presenter notes: Eric
Page 18: HPC on OpenStack® Use Cases

Fast, Low-Latency Networking

DPDK 2.0 Requirements w/ vhost-user

OVS 2.4

Hugepages (2M and 1G)

Realtime Kernel (3.19)

NUMA

QEMU w/ NUMA (libnuma)

IOMMU (VT-d)

VFIO (UIO as an option if the HW does not have an IOMMU)

SR-IOV (not all use cases work)

VirtIO in VMs (no change needed in VM to hit performance numbers)
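
A small standard-library sketch for sanity-checking two of the host prerequisites above (hugepages and an active IOMMU) before enabling DPDK/vhost-user:

```python
from pathlib import Path

def hugepage_summary():
    """Configured and free hugepages per supported size (2M and 1G show up here)."""
    sizes = {}
    for node in Path("/sys/kernel/mm/hugepages").glob("hugepages-*"):
        total = int((node / "nr_hugepages").read_text())
        free = int((node / "free_hugepages").read_text())
        sizes[node.name.split("-", 1)[1]] = {"total": total, "free": free}
    return sizes

def iommu_active():
    """True if the kernel has created IOMMU groups (VT-d / AMD-Vi enabled)."""
    groups = Path("/sys/kernel/iommu_groups")
    return groups.is_dir() and any(groups.iterdir())

if __name__ == "__main__":
    print("Hugepages:", hugepage_summary())
    print("IOMMU active:", iommu_active())
```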

Page 19: HPC on OpenStack® Use Cases

CPU and GPU

• GPU = Graphics Processing Unit

• GPU computing pioneered by NVIDIA in 2007

• Race to render more polygons for better gaming experience

• Architecture led to increased cores with dedicated GPU memory

• GPU cores are massively parallel compared to CPU cores, which number in the tens at most

• NVIDIA created CUDA (Compute Unified Device Architecture)

• CUDA designed for General Compute on GPUs

• Workload shared between GPU and CPU
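
As an illustration (not from the slides) of that CPU/GPU split, a minimal CUDA kernel written in Python with Numba: the CPU prepares the data and launch geometry, the GPU executes the kernel across thousands of threads.

```python
import numpy as np
from numba import cuda

@cuda.jit
def scale(out, x, factor):
    i = cuda.grid(1)              # absolute thread index across the whole grid
    if i < x.shape[0]:
        out[i] = factor * x[i]    # each GPU thread handles one element

x = np.arange(1_000_000, dtype=np.float32)
out = np.empty_like(x)

threads_per_block = 256
blocks = (x.size + threads_per_block - 1) // threads_per_block
scale[blocks, threads_per_block](out, x, 2.0)   # GPU does the parallel work
print(out[:4])                                  # CPU inspects / aggregates results
```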

Presenter notes: Glyn
Page 20: HPC on OpenStack® Use Cases

Compute on Baremetal

• Ironic provides OpenStack Baremetal

• Integrated project as of Kilo

• Allows baremetal servers to be provisioned by Nova

• Dedicated resources, no sharing

• Hardware agnostic (vendors are responsible for their own drivers for lights-out management, etc.)

• Image-based booting, not abstraction-free!
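
A hedged openstacksdk sketch of consuming Ironic: the enrolled nodes can be listed through the SDK's baremetal proxy, and provisioning goes through Nova exactly like a VM boot, just with a baremetal flavor. Names and IDs are placeholders.

```python
import openstack

conn = openstack.connect(cloud="hpc-cloud")

# What Ironic has enrolled and where each node is in its provisioning lifecycle.
for node in conn.baremetal.nodes():
    print(node.name, node.provision_state)

# Provision through Nova as usual; the flavor maps to a physical node class.
server = conn.compute.create_server(
    name="hpc-baremetal-01",
    image_id="WHOLE_DISK_IMAGE_ID",
    flavor_id="BAREMETAL_FLAVOR_ID",
    networks=[{"uuid": "PROVISIONING_NET_ID"}],
)
conn.compute.wait_for_server(server)
```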

Page 21: HPC on OpenStack® Use Cases

OpenStack Liberty Advances for HPC Compute

• VirtIO queue scaling
  – hw_vif_multiqueue_enabled flag on the image
  – Provides enhanced network performance for guests with >1 vCPU or large packet sizes
• Support for InfiniBand SR-IOV with libvirt virtualization
  – Allows increased performance of InfiniBand interfaces direct to VMs
• different_cells scheduler filter for Nova
  – Allows anti-affinity based on Cells
• Neutron QoS API
  – Ability to provide per-port Quality of Service configuration
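
A sketch of two of those knobs via openstacksdk: enabling virtio multiqueue on an image and attaching a Neutron QoS bandwidth-limit policy to a port. Treat the exact SDK method names and the image-property handling as assumptions; the image name, rates and port ID are placeholders.

```python
import openstack

conn = openstack.connect(cloud="hpc-cloud")

# Virtio multiqueue for guests booted from this image (Liberty-era property).
image = conn.image.find_image("hpc-golden-image")
conn.image.update_image(image, hw_vif_multiqueue_enabled="true")

# Neutron QoS: a policy with a bandwidth-limit rule, applied per port.
policy = conn.network.create_qos_policy(name="hpc-10g-cap")
conn.network.create_qos_bandwidth_limit_rule(policy, max_kbps=10_000_000,
                                             max_burst_kbps=1_000_000)
conn.network.update_port("PORT_ID", qos_policy_id=policy.id)
```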

Presenter notes: Glyn
Page 22: HPC on OpenStack® Use Cases

Storage

Page 23: HPC on OpenStack® Use Cases

Ephemeral Storage

Page 24: HPC on OpenStack® Use Cases

Ephemeral Storage

What is it?
• Persists only as long as the VM exists
• Usually located locally on the compute server

HPC Use Cases?
• User scratch space
• Work scratch space
• Operating environment

Page 25: HPC on OpenStack® Use Cases

Block Storage

Page 26: HPC on OpenStack® Use Cases

Block Storage

What is it?
• Persistent, non-shared block storage
• Can be provided by many sources: SAN arrays, local disk, etc.

HPC Use Cases?
• Supporting databases
• High performance scratch space

Page 27: HPC on OpenStack® Use Cases

Block Storage

• The OpenStack project is Cinder
• A large disk array is attached by Fibre Channel SAN to all of the compute nodes
• OpenStack uses Cinder drivers to create, mount and protect LUNs for the guests
• The guest is responsible for creating a file system on those LUNs
• Not shared with other guests
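
A minimal openstacksdk sketch of that flow: create a volume, attach it to a guest, and let the guest put its own filesystem on it (the attachment call is assumed from the SDK's compute proxy; sizes and names are placeholders).

```python
import openstack

conn = openstack.connect(cloud="hpc-cloud")

volume = conn.block_storage.create_volume(size=500, name="scratch-db-vol")
conn.block_storage.wait_for_status(volume, status="available")

server = conn.compute.find_server("hpc-db-01")
conn.compute.create_volume_attachment(server, volume_id=volume.id)
# Inside the guest: create a filesystem on the new device (e.g. /dev/vdb) and mount it.
```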

Page 28: HPC on OpenStack® Use Cases

Object Storage

Page 29: HPC on OpenStack® Use Cases

Object Storage

What is it?
• Persistent, scalable storage pools
• Access using a REST-based API
• Not bound to an individual guest

HPC Use Cases?
• Centralised data lakes
• Archives / backups of source data

Page 30: HPC on OpenStack® Use Cases

Object Storage

• The OpenStack project is Swift
• Usually uses large pools of local disk attached directly to the object servers
• Uses metadata to index the data and locate object blocks from unique identifiers
• Lots of plugins for the various analytics engines that are expanding the adoption of object as a centralised data lake
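
A short openstacksdk sketch of the data-lake pattern: one container per project, objects pushed and pulled over the REST API. Container and object names are placeholders.

```python
import openstack

conn = openstack.connect(cloud="hpc-cloud")

conn.object_store.create_container(name="genomics-raw-reads")
with open("sample-001.cram", "rb") as f:
    conn.object_store.upload_object(container="genomics-raw-reads",
                                    name="sample-001.cram", data=f.read())

# Any client with the right credentials can pull the object back, regardless
# of which tenant network its compute nodes live on.
blob = conn.object_store.download_object("sample-001.cram",
                                         container="genomics-raw-reads")
print(len(blob), "bytes retrieved")
```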

Page 31: HPC on OpenStack® Use Cases

File Storage

Page 32: HPC on OpenStack® Use Cases

File Storage

What is it?
• Shared, persistent storage
• Uses standard POSIX file system methods to access data

HPC Use Cases?
• User home directories
• Shared project data
• Scale-out file systems!

Page 33: HPC on OpenStack® Use Cases

File Storage

• The OpenStack project is Manila
• Manages the creation of storage pools on the provider service
• Creates the shares and applies the correct permissions
• Mounts those shares within the guests that need them
• Can be NFS or CIFS based today
• Plugin driven
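
A hedged sketch of that flow with python-manilaclient: create an NFS share and grant a tenant subnet read/write access, then mount it in the guests with normal NFS tooling. The client calls and argument names are assumptions; sizes, names and credentials are placeholders.

```python
from keystoneauth1.identity import v3
from keystoneauth1 import session
from manilaclient import client as manila_client

auth = v3.Password(auth_url="http://controller:5000/v3",
                   username="admin", password="secret", project_name="hpc",
                   user_domain_name="Default", project_domain_name="Default")
manila = manila_client.Client("2", session=session.Session(auth=auth))

# An NFS share for project home directories, with access for the tenant subnet.
share = manila.shares.create(share_proto="NFS", size=100, name="hpc-home-dirs")
manila.shares.allow(share, access_type="ip", access="10.20.0.0/24",
                    access_level="rw")
# Guests then mount the share's export location like any other NFS path.
```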

Page 34: HPC on OpenStack® Use Cases

Use Case - Lustre

Page 35: HPC on OpenStack® Use Cases

Storage Use Case

Page 36: HPC on OpenStack® Use Cases

What is Lustre?

• Massively Parallel Filesystem made up of key components…

– MGS – Management Server

– MGT – Management Target

– MDS – Meta Data Server

– MDT – Meta Data Target

– OSS – Object Storage Server

– OST – Object Storage Target

• 1 File can be spread over up to 2000 objects

• With ldiskfs, each of those objects can be up to 16 TB

• That’s 31.25 PB (Yes PETA bytes) for a single file using ldiskfs

• Up to 4 Billion files per MDT

• Up to 4096 MDTs!
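
A quick back-of-envelope check of those ldiskfs figures, as plain Python:

```python
# One file striped across up to 2000 objects, each up to 16 TiB (the slides
# say "TB", but the 31.25 PB figure only works out in binary units).
objects_per_file = 2000
object_size_tib = 16

max_file_tib = objects_per_file * object_size_tib      # 32,000 TiB
max_file_pib = max_file_tib / 1024                      # TiB -> PiB
print(f"Maximum single-file size: {max_file_pib} PiB")  # -> 31.25 PiB
```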

Page 37: HPC on OpenStack® Use Cases

Example for Genome Sequencing

Layering of Services

[Diagram: CRAM files stored on ZFS, layered onto Lustre OSTs; each OST is served by a pair of OSS nodes, with Cinder providing the underlying block storage.]

Page 38: HPC on OpenStack® Use Cases

Why ZFS?

• Lustre has limited data protection:
  – RAID 0
  – Protection from physical infrastructure
• Scale out – easy; scale up – hard
• ZFS has healing, snapshots (not necessarily a good idea here) and scale-up!
• ZFS cache pools for metadata or even data sets: huge acceleration potential
• Scale…

Page 39: HPC on OpenStack® Use Cases

Lustre and ZFS File Limits (ldiskfs)

Object Size: 16 TB
Maximum File Size: 31.25 PB
Max Files per MDT: 4 Billion
Max MDTs: 4096

Page 40: HPC on OpenStack® Use Cases

Lustre and ZFS File Limits

                      ldiskfs      ZFS
Object Size           16 TB        256 PB
Maximum File Size     31.25 PB     8 EB (2^63)
Max Files per MDT     4 Billion    4 Billion
Max MDTs              4096         4096

Page 41: HPC on OpenStack® Use Cases

Work in progress – Lustre as a Service

• Lustre can be used for high bandwidth and low latency
  – Low latency is challenging in a virtual environment
  – High bandwidth is easier (though not simple)
• Use OpenStack tools to provision Lustre components (see the sketch below)
  – Build small-scale, segregated clusters for multi-tenancy
  – Export via NFS with Manila on private networks
• Include ZFS and compression
• Use Cinder block storage for the shared storage element
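
A hedged openstacksdk sketch of the provisioning idea: stand up a small, per-tenant set of OSS nodes, each backed by a Cinder volume for its OST. Image/flavor/network IDs, counts and sizes are placeholders, and the volume-attachment call is assumed from the SDK's compute proxy.

```python
import openstack

conn = openstack.connect(cloud="hpc-cloud")

for i in range(2):                                   # two OSS nodes for this tenant
    oss = conn.compute.create_server(
        name=f"lustre-oss-{i}",
        image_id="LUSTRE_OSS_IMAGE_ID",              # image with ZFS + Lustre installed
        flavor_id="HPC_FLAVOR_ID",
        networks=[{"uuid": "TENANT_NET_ID"}],
    )
    conn.compute.wait_for_server(oss)

    ost = conn.block_storage.create_volume(size=2048, name=f"lustre-ost-{i}")
    conn.block_storage.wait_for_status(ost, status="available")
    conn.compute.create_volume_attachment(oss, volume_id=ost.id)
```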

Page 42: HPC on OpenStack® Use Cases

Networking

Page 43: HPC on OpenStack® Use Cases

Fat Tree Network Topology

• Simplest and most generic is Fat Tree

• Execution intelligence to pass local workloads to neighbours to reduce RTT.

• Looks very similar to standard core / edge network topology but can be many more layers depending on number of compute nodes and upstream bandwidth requirements

• Inter-Switch Link (ISL) counts increase the further up the tree you are, to allow for aggregated bandwidth increases

Page 44: HPC on OpenStack® Use Cases

3-Dimensional (wrap around) Torus

• Used in low-latency grid computing where RDMA is a major factor

• Features in a lot of GPU compute grids where RDMA is becoming more prominent

• Designed for nearest neighbour workloads

• Lower cost vs. Fat Tree due to reduced switch requirements if using direct connections between nodes

• Means network switching can be brought closer to the CPU by using HTX technology.

• 2D needs 4-port cards, 3D needs 6-port cards! (EXTOLL)

Page 45: HPC on OpenStack® Use Cases

Network Considerations

Provider VLAN (see the sketch below)
• 802.1q direct to the compute host
• OVS 2.4 with user-space virtio support

High Performance VM Considerations
• SR-IOV: east-west vs. north-south
• MSI-X issues with old VM kernels (Rx queue interrupts broken)
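
A sketch of the provider-VLAN option with openstacksdk: an admin maps a Neutron network straight onto an 802.1q tag that is trunked to the compute hosts. Physical network name, VLAN ID and addressing are placeholders.

```python
import openstack

conn = openstack.connect(cloud="hpc-cloud")   # needs admin credentials

net = conn.network.create_network(
    name="hpc-provider-vlan-200",
    provider_network_type="vlan",             # maps to provider:network_type
    provider_physical_network="physnet1",
    provider_segmentation_id=200,             # the 802.1q tag on the hosts
)
conn.network.create_subnet(network_id=net.id, ip_version=4,
                           cidr="192.168.200.0/24", gateway_ip="192.168.200.1")
```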

Page 46: HPC on OpenStack® Use Cases

DPDK With and Without E-HOS

[Charts: packets per second, throughput (Mb/s), frames dropped per second, and average latency (µs) plotted against the number of VMs, with and without E-HOS.]

Page 47: HPC on OpenStack® Use Cases

Summary

Page 48: HPC on OpenStack® Use Cases

Summary

• Initial interest in HPC on OpenStack is being driven by tenancy requirements
• Managing flexible HPC resources has been tricky; OpenStack makes that easier for HPCaaS
• Many areas need to work well together for success, and the OpenStack community is beginning to address that, as we have seen
• HPC on OpenStack is a reality, and many are pushing the boundaries and committing back to the community

Presenter notes: Eric
Page 49: HPC on OpenStack® Use Cases

Questions? Thank you very much.

Glyn Bowden - @bowdengl – [email protected]

Eric Lajoie – [email protected]