Speeding up Deep Learning Services: When GPUs meet Container Clouds
Dr. Seetharami Seelam & Dr. Yubo Li, Research Staff Members, IBM Research
GPU Tech Conference: San Jose, CA – May 10, 2017


Page 1

Dr. Seetharami Seelam & Dr. Yubo Li
Research Staff Members
IBM Research
GPU Tech Conference: San Jose, CA – May 10, 2017

Speeding up Deep Learning Services: When GPUs meet Container Clouds

Page 2

Outline

• Who are we
• Why should you listen to us
• What problems are we trying to solve
• Challenges with delivering DL on Cloud
• What have we done in Mesos and Kubernetes
• What is left to do
• How can you help

Page 3

Who are we

Dr. Seetharami Seelam is a Research Staff Member at the T. J. Watson Research Center. He is an expert at delivering hardware, middleware, and applications as-a-service using containers. He delivered Autoscaling, Business Rules, Containers on Bluemix, and multiple other services internally.

Dr. Yubo Li (李玉博) is a Research Staff Member at IBM Research, China. He is the architect of the GPU acceleration and deep learning service on SuperVessel, an open-access cloud running OpenStack on OpenPOWER machines. He is currently working on GPU support for several cloud container technologies, including Mesos, Kubernetes, Marathon, and OpenStack.

Page 4

Why should you listen to us

• We have multiple years of experience developing, optimizing, and operating container clouds

• Heterogeneous HW (POWER and x86)

• Long running and batch jobs

• OpenStack, Docker, Mesos, Kubernetes

• Container clouds with Accelerators (GPUs)

Page 5

What problems are we trying to solve

• Enable Deep Learning in the Cloud
  • Need flexible access to hardware (GPUs)
  • Training times in hours, days, weeks, months
  • Long-running inferencing services
  • Support old, new, and emerging frameworks
  • Share hardware among multiple workloads and users

[Figure: example workloads: Speech, Vision]

Page 6

DL in the Cloud: State-of-the-art

• Historically, DL is on-prem infrastructure and SW stack: a high-performance environment
  • Bare-metal GPU systems (x86 and POWER), Ethernet, IB network connectivity, GPFS
  • Spectrum LSF, MPI and RDMA support, single SW stack
• Cloud frees researchers & developers from infrastructure & SW stack
  • All infrastructure from Cloud as services: GPUs, object store, NFS, SDN, etc.
  • Job submission with APIs: Torch, Caffe, TensorFlow, Theano
  • 24/7 service, elastic and resilient
  • Appropriate visibility and control

Page 7

Challenges with DL on Cloud

• Data, data, data, data, …
• Access to different hardware and accelerators (GPU, IB, …)
• Support for different application models
• Visibility and control of infrastructure
• Dev and Ops challenges with a 24/7 stateful service

Page 8

Journey started in 2016… promised to deliver DL on Cloud

• Excellent promise: go ahead and build a DL cloud service
• Container support for GPUs: minimal or non-existent
• The idea could have died on day 1, but failure is not an option…
• We chose containers with Mesos and Kubernetes to address some of these challenges
• Developed and operated Mesos- and Kubernetes-based GPU clouds for over a year
• What follows are lessons learned from this experience

Page 9

DL on Containers: DevOps challenges

• Multiple GPUs per node → multiple containers per node: need to maintain the GPU ↔ container mapping (GPU Allocator)

• Images need NVIDIA Drivers: makes them non-portable (Volume Manager)

• Cluster quickly becomes heterogeneous (K80, M60, P100…): need to be able to pick GPU type (GPU Discovery)

• Fragmentation of GPUs is a real problem (Priority placement)

• Like everything else, GPUs fail → must identify and remove unhealthy GPUs from scheduling (Liveness check)

• Visibility, control, and sharing (to be done)

Page 10

High-level view of GPU support in container clouds

[Diagram: Master Node and Worker Node interaction. The worker node reports its resources to the master (Resource report: GPU: 2, …); the master distributes a task with a resource request (Task Id: xxx, Resource Request: GPU: 1); the job executor on the worker allocates GPU #1 and inserts the driver volume into the container.]

Page 11

GPU Allocator

• Allocator handles GPU number/device mapping
• Isolator uses cgroup (Mesos) / docker (k8s) to control GPU access permission inside the container

/dev
├── nvidia0 (data interface for GPU0)
├── nvidia1 (data interface for GPU1)
├── nvidiactl (control interface)
├── nvidia-uvm (unified virtual memory)
└── nvidia-uvm-tools (UVM tools, optional)

• Allocate/Release GPU (a minimal allocation sketch follows below)

[Diagram: allocation flow. A task requests 1 GPU; the NVIDIA GPU Allocator, tracking GPU0: in use / GPU1: idle, allocates GPU1 and marks it in use; GPU1 is exposed to the container as /dev/nvidia1 together with /dev/nvidiactl and /dev/nvidia-uvm.]
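To make the allocator/isolator idea concrete, here is a minimal sketch in Python. The class name GpuAllocator and the use of Docker's --device flags are illustrative assumptions for exposition, not the actual Mesos or Kubernetes implementation; the real isolators achieve the same effect through the devices cgroup (Mesos) or the Docker API (Kubernetes), as noted above.

```python
import threading

class GpuAllocator:
    """Minimal sketch of per-node GPU bookkeeping (illustrative only)."""

    # Control devices every GPU container needs in addition to its data device(s).
    CONTROL_DEVICES = ["/dev/nvidiactl", "/dev/nvidia-uvm"]

    def __init__(self, num_gpus):
        self._lock = threading.Lock()
        self._in_use = {i: None for i in range(num_gpus)}  # gpu index -> task id

    def allocate(self, task_id, count=1):
        """Reserve `count` idle GPUs for a task and return their /dev paths."""
        with self._lock:
            idle = [i for i, owner in self._in_use.items() if owner is None]
            if len(idle) < count:
                raise RuntimeError("not enough idle GPUs")
            chosen = idle[:count]
            for i in chosen:
                self._in_use[i] = task_id
            return [f"/dev/nvidia{i}" for i in chosen]

    def release(self, task_id):
        """Return all GPUs held by a task to the idle pool."""
        with self._lock:
            for i, owner in self._in_use.items():
                if owner == task_id:
                    self._in_use[i] = None

    def docker_device_args(self, task_id, count=1):
        """Build `docker run` --device flags exposing only the allocated GPUs."""
        devices = self.allocate(task_id, count) + self.CONTROL_DEVICES
        return [arg for d in devices for arg in ("--device", f"{d}:{d}")]


if __name__ == "__main__":
    alloc = GpuAllocator(num_gpus=2)
    # e.g. ['--device', '/dev/nvidia0:/dev/nvidia0', '--device', '/dev/nvidiactl:/dev/nvidiactl', ...]
    print(alloc.docker_device_args(task_id="task-xxx", count=1))
```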

Page 12

GPU Driver Challenges: drivers in the container are an issue

[Diagram: two containers, each packaging an application (Caffe/TF/…), CUDA libraries, and NVIDIA base libraries v1, run on a host with NVIDIA-kernel-module v1. After the host is upgraded to NVIDIA-kernel-module (v2), a container still shipping NVIDIA base libraries (v1) stops working.]

Changes in the host driver require all containers to update their drivers! (Credit to Kevin Klues, Mesosphere)

Page 13

GPU Driver Challenges: NVIDIA-Docker solves it

[Diagram: without NVIDIA-Docker, a container shipping NVIDIA base libraries (v1) breaks on a host running NVIDIA-kernel-module (v2). With NVIDIA-Docker, the matching NVIDIA base libraries (v2) live on the host and are injected into the container as a volume, so the container carries only the application (Caffe/TF/…) and CUDA libraries.]

An application will not work if the NVIDIA library and kernel module versions do not match; volume injection keeps them in sync with the host.

Page 14

GPU Volume Manager

• Mimics the functionality of nvidia-docker-plugin
• Finds all standard NVIDIA libraries / binaries on the host and consolidates them into a single place:

/var/lib/mesos/volumes
└── nvidia_XXX.XX (version number)
    ├── bin
    ├── lib
    └── lib64

• Injects the volume read-only ("ro") into the container when needed: a GPU container whose image carries the label com.nvidia.volumes.needed = nvidia_driver gets the volume mounted at /usr/local/nvidia (a sketch follows below)
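As a flavor of what such a volume manager does, here is a minimal sketch in Python using the pynvml bindings. The prefix list, copy logic, and directory layout are simplified stand-ins for what nvidia-docker-plugin actually does (which also consolidates binaries such as nvidia-smi and preserves symlink structure); they are assumptions for illustration only.

```python
import os
import shutil
import subprocess

import pynvml  # bindings to libnvidia-ml (pip package nvidia-ml-py); assumed for this sketch

VOLUME_ROOT = "/var/lib/mesos/volumes"
# Abbreviated, illustrative list of NVIDIA user-space library prefixes to collect.
NVIDIA_LIB_PREFIXES = ("libnvidia-ml", "libnvidia-fatbinaryloader", "libcuda")


def build_driver_volume():
    """Consolidate host NVIDIA user-space libraries into a versioned volume directory."""
    pynvml.nvmlInit()
    version = pynvml.nvmlSystemGetDriverVersion()
    if isinstance(version, bytes):  # older pynvml versions return bytes
        version = version.decode()
    pynvml.nvmlShutdown()

    volume = os.path.join(VOLUME_ROOT, f"nvidia_{version}")
    os.makedirs(os.path.join(volume, "lib64"), exist_ok=True)

    # `ldconfig -p` lists cached shared libraries as "libfoo.so (...) => /path/libfoo.so".
    cache = subprocess.run(["ldconfig", "-p"], capture_output=True, text=True).stdout
    for line in cache.splitlines():
        if "=>" not in line:
            continue
        name, path = line.split("=>", 1)
        if name.strip().startswith(NVIDIA_LIB_PREFIXES):
            shutil.copy2(path.strip(), os.path.join(volume, "lib64"))
    return volume


def docker_volume_arg(volume):
    """Mount the consolidated volume read-only at /usr/local/nvidia in the container."""
    return ["-v", f"{volume}:/usr/local/nvidia:ro"]
```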

Page 15

GPU Discovery

• The Mesos agent / kubelet could simply auto-detect GPU counts on its own
• Instead, we use the nvml library to detect the number of GPUs, their model, etc.

[Flowchart: the agent is a compiled binary with a dynamic link to the nvml library (NVIDIA GDK). At run time, if the nvml lib is found the agent discovers GPUs and runs in GPU mode; if the nvml lib is not found it runs in non-GPU mode. The same binary works on both GPU and non-GPU nodes. A Python sketch of this pattern follows below.]
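Here is a minimal sketch of that pattern in Python, using the pynvml bindings; the graceful ImportError/NVMLError fallback mirrors the "same binary on GPU and non-GPU nodes" behaviour, though the production agents link libnvidia-ml dynamically from their own languages rather than Python.

```python
def discover_gpus():
    """Return a list of (index, model name) tuples, or [] on a non-GPU node."""
    try:
        import pynvml  # bindings to libnvidia-ml; absent on non-GPU nodes
    except ImportError:
        return []  # nvml lib not found: run in non-GPU mode

    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        return []  # library present but no usable driver/GPU

    try:
        gpus = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            name = pynvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):  # older pynvml versions return bytes
                name = name.decode()
            gpus.append((i, name))
        return gpus
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    print(discover_gpus() or "no GPUs found: advertising CPU/memory only")
```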

Page 16

Fragmentation problem: GPU Priority Placement

Spread (default) placement across GPU nodes #1 to #4:

[Diagram: jobs J1, J2, and J3 are spread across the four nodes, leaving free GPUs scattered over all of them.]

• Three jobs are spread on 4 nodes
• New Job 4 needs 4 GPUs on a single node: can it run?
• Although there are 10 free GPUs, no single node has 4 of them free

Jx → Job x

Page 17

Fragmentation problem: GPU Priority Placement

Bundle placement across GPU nodes #1 to #4:

[Diagram: jobs J1, J2, and J3 are bundled onto fewer nodes, and Job 4's GPUs are placed in the space that remains.]

• Solution: bundle the jobs
• New Job 4 needs 4 GPUs

Jx → Job x

Page 18

Fragmentation problem: GPU Priority Placement

Bundle placement across GPU nodes #1 to #4:

[Diagram: with jobs J1, J2, and J3 bundled, Job 4's four GPUs fit together on a single node.]

• Solution: bundle the jobs
• New Job 4 needs 4 GPUs on a single node

Jx → Job x

Page 19

Fragmentation problem: GPU Priority Placement

• A GPU priority scheduler can bundle or spread GPU tasks across the cluster (a scoring sketch follows below)
  • Bundle: reserve large idle GPU nodes for large tasks
  • Spread: distribute the GPU workload over the cluster

[Diagram: Bundle places GPU task #1 and GPU task #2 together on GPU node #1, leaving GPU node #2 idle; Spread places GPU task #1 on GPU node #1 and GPU task #2 on GPU node #2.]
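To illustrate the two policies, here is a minimal node-scoring sketch in Python; the function name and the simple "fewest/most free GPUs" heuristics are assumptions for exposition, not the scheduler described in the talk.

```python
def score_nodes(free_gpus_by_node, requested_gpus, policy="bundle"):
    """Rank nodes that can fit the request.

    free_gpus_by_node: dict of node name -> number of idle GPUs.
    policy: 'bundle' packs tasks onto the fullest feasible node (keeps big
            nodes free for big jobs); 'spread' balances load across nodes.
    """
    feasible = {n: free for n, free in free_gpus_by_node.items()
                if free >= requested_gpus}
    if not feasible:
        return []  # fragmentation: enough GPUs in total, but no single node fits
    reverse = (policy == "spread")  # spread prefers the emptiest node
    return sorted(feasible, key=feasible.get, reverse=reverse)


if __name__ == "__main__":
    cluster = {"node1": 1, "node2": 3, "node3": 4}
    print(score_nodes(cluster, 2, policy="bundle"))  # ['node2', 'node3']: pack tightly
    print(score_nodes(cluster, 2, policy="spread"))  # ['node3', 'node2']: balance load
```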

Page 20

GPU Liveness Check

• GPU errors are caused by:
  • Insufficient power supply
  • Hardware damage
  • Overheating
  • Software bugs
  • …
• GPU liveness check (a probe sketch follows below)
  • The agent probes each GPU through nvml periodically
  • If a GPU probe fails, the GPU is marked unavailable and no future applications are scheduled on it

[Figure: sample GPU failure]
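A minimal liveness probe might look like the following Python sketch, again using the pynvml bindings. The probe choice (a temperature query) and the unhealthy set are illustrative assumptions, not the exact checks the agents run.

```python
import time

import pynvml  # bindings to libnvidia-ml


def probe_gpus(unhealthy):
    """Mark GPUs whose NVML queries fail so the scheduler can skip them."""
    pynvml.nvmlInit()
    try:
        for i in range(pynvml.nvmlDeviceGetCount()):
            if i in unhealthy:
                continue  # already removed from scheduling
            try:
                handle = pynvml.nvmlDeviceGetHandleByIndex(i)
                # Any cheap query works as a health probe; temperature is one option.
                pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)
            except pynvml.NVMLError as err:
                print(f"GPU {i} failed liveness check: {err}")
                unhealthy.add(i)  # stop scheduling new work on this GPU
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    unhealthy_gpus = set()
    while True:
        probe_gpus(unhealthy_gpus)
        time.sleep(60)  # probe period; tune to taste
```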

Page 21

Implementation in Mesos and Kubernetes

Apache Mesos:
• Open-source cluster manager
• Enables siloed applications to be consolidated on a shared pool of resources
• Rich framework ecosystem
• Emerging GPU support

Page 22

GPU Support on Apache Mesos

[Architecture diagram: a scheduler framework (Marathon) submits a resource definition to the Mesos Master. On the Mesos Agent, the Composing Containerizer combines the Docker Containerizer and the (unified) Mesos Containerizer behind the Containerizer API; the Isolator API covers CPU, memory, and GPU; the NVIDIA GPU Allocator and NVIDIA Volume Manager back the GPU support in the Mesos Containerizer. A Marathon GPU request sketch follows below.]
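For reference, a Marathon app definition can request GPUs with the gpus field when the app runs under the Mesos containerizer (per the community slide later in the deck, GPU support arrived after Marathon v1.3). The sketch below builds such a request as a Python dict; the app id, command, image, and sizes are made-up example values.

```python
import json

# Example payload for POST /v2/apps on Marathon (Mesos containerizer, illustrative values).
gpu_app = {
    "id": "/dl/train-job",                  # illustrative app id
    "cmd": "nvidia-smi",                    # illustrative command
    "cpus": 4,
    "mem": 8192,                            # MB
    "gpus": 1,                              # GPUs as a first-class Mesos resource
    "instances": 1,
    "container": {
        "type": "MESOS",                    # GPU support uses the Mesos containerizer
        "docker": {"image": "nvidia/cuda"}, # illustrative image
    },
}

print(json.dumps(gpu_app, indent=2))
```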

Page 23

Kubernetes

• Open-source orchestration system for Docker containers
• Handles scheduling onto nodes in a compute cluster
• Actively manages workloads to ensure that their state matches the user's declared intentions
• Emerging support for GPUs

Page 24

GPU Support on Kubernetes – Upstream

• Basic multi-GPU support in the 1.6 upstream release
  • GPU discovery
  • GPU allocator
  • GPU isolator
  • GPU resource type extension: alpha.kubernetes.io/nvidia-gpu

[Diagram: the kubelet's GPU discovery, allocator, and isolator back the alpha.kubernetes.io/nvidia-gpu: 2 resource reported on the node and requested by the pod. A pod spec sketch follows below.]
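To show how a pod asks for that resource, here is a minimal manifest sketch written as a Python dict that can be serialized to JSON for kubectl. The pod name and image are illustrative; note that on upstream 1.6 the driver libraries typically still have to be mounted into the container separately (the "No GPU driver in container" gap in the table on Page 27).

```python
import json

# Minimal pod spec requesting two NVIDIA GPUs via the 1.6 alpha resource name.
gpu_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "gpu-test"},        # illustrative name
    "spec": {
        "restartPolicy": "Never",
        "containers": [{
            "name": "cuda",
            "image": "nvidia/cuda",           # illustrative image
            "command": ["nvidia-smi"],
            "resources": {
                "limits": {"alpha.kubernetes.io/nvidia-gpu": 2},
            },
        }],
    },
}

# e.g. kubectl create -f gpu-pod.json
with open("gpu-pod.json", "w") as f:
    json.dump(gpu_pod, f, indent=2)
```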

Page 25

GPU Support on Kubernetes – Internal DL cloud

[Architecture diagram: on the worker node, the kubelet carries the community GPU support (CPU, memory, and GPU resources enforced through the Linux devices cgroup via the Docker API / CRI) plus our extensions: an NVIDIA GPU Manager that dynamically loads the libnvidia-ml library and provides the NVIDIA GPU Allocator, NVIDIA Volume Manager, and GPU Liveness Check. On the master, the kube-apiserver exposes GPU numbers for pods/nodes to kubectl (GPU resource type extension), and the kube-scheduler is extended with the GPU priority scheduler.]

Page 26

Demo

Page 27

Status of GPU Support in Mesos and Kubernetes

Function/Feature                              | nvidia-docker  | Mesos          | k8s upstream | k8s IBM
GPUs exposed to Dockerized applications       | ✔              | ✔              | ✔            | ✔
GPU vendor                                    | NVIDIA         | NVIDIA         | NVIDIA       | NVIDIA
Support multiple GPUs per node                | ✔              | ✔              | ✔            | ✔
No GPU driver in container                    | ✔              | ✔              | Future       | ✔
Multi-node management                         | ✘              | ✔              | ✔            | ✔
GPU isolation                                 | ✔              | ✔              | ✔            | ✔
GPU auto-discovery                            | ✔              | ✔              | ✔ (no nvml)  | ✔
GPU usage metrics                             | ✔              | On-going       | Future       | On-going
Heterogeneous GPUs in a cluster               | ✘              | ✔              | Partial      | ✔
GPU sharing                                   | ✔ (no control) | ✔ (no control) | Future       | Future
GPU liveness check                            | ✘              | Future         | Future       | ✔
GPU advanced scheduling                       | ✘              | Future         | Future       | ✔
Compatible with NVIDIA official Docker image  | ✔              | ✔              | Future       | ✔

Page 28

Our DL service

• Mesos/Marathon GPU support
  • Support for NVIDIA GPU resource management
  • Developing and operating deep learning and AI Vision internal services
  • Code contributed back to the community
  • Presentations at MesosCon EU 2016 and MesosCon Asia 2016

• Kubernetes GPU support
  • Support for NVIDIA GPU resource management
  • Developing and operating deep learning and AI Vision internal services
  • GPU support in IBM Spectrum Conductor for Containers (CfC)
  • Engagement with the community to bring in several of these features

Page 29

IBM Spectrum Conductor for Containers

§ Community Edition available now! Free to download and use as you wish (optional paid support)

• Customer-managed, on-premises Kubernetes offering from IBM on x86 or Power

• Simple container based installation with integrated orchestration & resource management

• Authorization and access control (built-in user registry or LDAP)

• Private Docker registry
• Dashboard UI
• Metrics and log aggregation
• Calico networking
• Pre-populated app catalog
• GPU support in 1.1; paid support in 1.2 (June)

§ Learn more and register on our community page: http://ibm.biz/ConductorForContainers

Demo on YouTube

Page 30

What is left to do

• GPU topology-aware scheduling (on-going)
  • Support GPU topology-aware scheduling to optimize performance
• GPU live metric collection (on-going; a collection sketch follows below)
  • Collect live GPU metrics (e.g., live core/memory usage)
• Support the CRI interface (planned)
  • Kubernetes moves to CRI after release 1.6, so we will no longer depend on the Docker API
• Support libnvidia-container (under discussion)
  • Use libnvidia-container instead of the nvidia-docker logic to manage GPU containers
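As a flavor of what live metric collection can look like, here is a minimal sketch with the pynvml bindings; the metric names and output format are illustrative, and a real agent would export these to its monitoring pipeline rather than print them.

```python
import pynvml  # bindings to libnvidia-ml


def collect_gpu_metrics():
    """Return per-GPU utilization and memory usage as a list of dicts."""
    pynvml.nvmlInit()
    try:
        metrics = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)  # core/mem busy %
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)          # bytes
            metrics.append({
                "gpu": i,
                "core_util_pct": util.gpu,
                "mem_util_pct": util.memory,
                "mem_used_mb": mem.used // (1024 * 1024),
                "mem_total_mb": mem.total // (1024 * 1024),
            })
        return metrics
    finally:
        pynvml.nvmlShutdown()


if __name__ == "__main__":
    for m in collect_gpu_metrics():
        print(m)
```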

Page 31

New OpenPOWER Systems with NVLink

S822LC "Minsky": 2 POWER8 CPUs with 4 NVIDIA® Tesla® P100 GPUs hooked to the CPUs using NVIDIA's NVLink high-speed interconnect
http://www-03.ibm.com/systems/power/hardware/s822lc-hpc/index.html

Page 32

Enabling Accelerators/GPUs on OpenPOWER

[Figure: the pieces brought together on OpenPOWER: containers and images, accelerators, and NVIDIA Docker.]

Page 33

How can you help

• Requirements, requirements, requirements

• Comment on issues

• Hack it and PR

• Review PRs

Page 34

Summary and Next Steps

• Cognitive, Machine and Deep Learning workloads are everywhere

• Containers can be leveraged with accelerators for agile deployment of these new workloads

• Docker, Mesos and Kubernetes are making rapid progress to support accelerators

• OpenPOWER and this emerging cloud stack make it the preferred platform for Cognitive workloads

• Join the community, share your requirements, and contribute code

Page 35

Community Activities: Mesos

• GPU features on Mesos/Marathon are supported upstream
  • GPU support for the Mesos Containerizer added after Mesos 1.0
  • GPU support added after Marathon v1.3
  • GPU usage: http://mesos.apache.org/documentation/latest/gpu-support/

• Companies collaborating on GPU support on Mesos/Marathon

• Collaborators
  • Benjamin Mahler
  • Vikrama Ditya
  • Yong Feng
  • Kevin Klues
  • Rajat Phull
  • Guangya Liu
  • Qian Zhang
  • Yu Bo Li
  • Seetharami Seelam

Page 36

Community Activities: Kubernetes

• Multi-GPU support starting in Kubernetes 1.6

• Collaborators
  • Vish Kannan
  • David Oppenheimer
  • Christopher M Luciano
  • Felix Abecassis
  • Derek Carr
  • Yu Bo Li
  • Seetharami Seelam
  • Hui Zhi
  • …

Page 37

Thank You

Contact: [email protected]@cn.ibm.com