
Hypervisor-based Prototyping of Disaggregated Memory and Benefits for VM Consolidation

Kevin Lim¹, Jichuan Chang², Jose Renato Santos², Yoshio Turner², Trevor Mudge¹, Parthasarathy Ranganathan², Steven Reinhardt¹,³, Thomas Wenisch¹

¹University of Michigan, Ann Arbor   ²Hewlett-Packard Labs   ³Advanced Micro Devices

Introduction

Problem: Impending memory capacity wall for commodity systems!

Opportunity: Optimizing for ensemble-level memory usage.

Proposed Solution: Disaggregated memory to provide memory capacity expansion and dynamic capacity sharing [1].

Prototype

Goal: Explore system-level implications of disaggregated memory.

What hypervisor changes are needed? Are new system-level optimizations enabled?

Current Research: Implementing a Xen hypervisor-based prototype of disaggregated memory (Figure B).

The majority of the code changes are in: (1) memory management, and (2) user-level tools for setting parameters.

Functionality: Detects accesses to remote memory; on a remote access, the remote page is swapped with a local page.
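The poster does not include the prototype's source, so the following is a minimal user-space C sketch of the detect-and-swap policy; every identifier (access_page, emulate_remote_latency, the frame tables) is invented for illustration. The real prototype traps remote accesses inside Xen's memory management rather than calling a helper function.

/* Minimal user-space sketch of the prototype's detect-and-swap policy.
 * All identifiers are hypothetical; the actual implementation traps
 * remote accesses inside the Xen hypervisor. */
#include <stdio.h>
#include <string.h>
#include <time.h>

#define PAGE_SIZE   4096
#define LOCAL_PAGES 512                 /* frames in "local" memory      */
#define TOTAL_PAGES 2048                /* remaining frames are "remote" */

static char frames[TOTAL_PAGES][PAGE_SIZE]; /* frames 0..511 local, rest remote */
static int  p2f[TOTAL_PAGES];               /* page -> frame                    */
static int  f2p[TOTAL_PAGES];               /* frame -> page                    */
static long remote_latency_ns = 4000;       /* configured latency (4 us here)   */

/* Busy-wait to emulate the configured remote-access latency. */
static void emulate_remote_latency(void)
{
    struct timespec s, n;
    clock_gettime(CLOCK_MONOTONIC, &s);
    do clock_gettime(CLOCK_MONOTONIC, &n);
    while ((n.tv_sec - s.tv_sec) * 1000000000L +
           (n.tv_nsec - s.tv_nsec) < remote_latency_ns);
}

/* On an access to a page held in a remote frame, pay the emulated
 * latency, swap it with a local victim frame, and remap both pages. */
static char *access_page(int p)
{
    static int next_victim = 0;         /* trivial round-robin victim choice */
    int f = p2f[p];
    if (f >= LOCAL_PAGES) {             /* remote frame detected */
        emulate_remote_latency();
        int vf = next_victim;
        next_victim = (next_victim + 1) % LOCAL_PAGES;
        int vp = f2p[vf];               /* page currently in the victim frame */
        char tmp[PAGE_SIZE];
        memcpy(tmp, frames[vf], PAGE_SIZE);
        memcpy(frames[vf], frames[f], PAGE_SIZE);
        memcpy(frames[f], tmp, PAGE_SIZE);
        p2f[p]  = vf; f2p[vf] = p;      /* accessed page is now local  */
        p2f[vp] = f;  f2p[f]  = vp;     /* victim page is now remote   */
    }
    return frames[p2f[p]];
}

int main(void)
{
    for (int i = 0; i < TOTAL_PAGES; i++) { p2f[i] = i; f2p[i] = i; }
    for (int i = 0; i < TOTAL_PAGES; i++) access_page(i)[0] = 1;
    printf("touched %d pages, %d of them initially remote\n",
           TOTAL_PAGES, TOTAL_PAGES - LOCAL_PAGES);
    return 0;
}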

Configuration: 2x quad-core AMD Opteron 2354, 32 GB DRAM.

Large-memory workloads: SPEC CPU (perlbench, gcc, bwaves, mcf, zeusmp), PARSEC (blackscholes), quicksort, TPC-H.

Results

Virtual machines (VMs) are set up with 2 or 8 GB of local and remote memory. The percentage and latency of remote memory are varied across runs. Results with the hypervisor-based prototype (Figures C and D) show that disaggregated memory provides remote capacity with minimal slowdown.

Virtual Machine Consolidation

Disaggregated memory offers independent memory and CPU scaling, removing the memory bottleneck for VM consolidation. Other benefits are possible through techniques that increase effective memory capacity, such as content-based page sharing (CBPS).
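As a toy illustration of CBPS (hypothetical code, not the prototype's implementation), a sharing pass can hash each page's contents and treat byte-identical pages as candidates to back with a single frame. Real systems, such as VMware ESX's page sharing, additionally break sharing via copy-on-write on the first write to a shared page.

/* Hypothetical CBPS sketch: detect pages with identical contents.
 * A real implementation maps matching pages to one frame and marks
 * them copy-on-write; this only counts the sharing opportunity. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096

/* 64-bit FNV-1a hash of a page's contents. */
static uint64_t page_hash(const unsigned char *page)
{
    uint64_t h = 1469598103934665603ULL;
    for (int i = 0; i < PAGE_SIZE; i++) {
        h ^= page[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Count pages whose contents duplicate an earlier page. Quadratic for
 * brevity; a real scanner keeps a hash table keyed by page_hash(). */
static int shareable_pages(unsigned char pages[][PAGE_SIZE], int n)
{
    int shared = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < i; j++)
            if (page_hash(pages[j]) == page_hash(pages[i]) &&
                memcmp(pages[i], pages[j], PAGE_SIZE) == 0) {
                shared++;        /* page i could share page j's frame */
                break;
            }
    return shared;
}

int main(void)
{
    static unsigned char pages[4][PAGE_SIZE]; /* zero-filled: identical */
    memset(pages[2], 0xAB, PAGE_SIZE);        /* make one page unique   */
    printf("shareable pages: %d\n", shareable_pages(pages, 4)); /* 2 */
    return 0;
}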

Conclusions

Memory disaggregation addresses the capacity wall while providing greater efficiency and OS transparency. Our prototype shows the feasibility of implementing disaggregated memory in a widely used open-source hypervisor. Importantly, it enables in-depth research into large-scale workloads and content-based page sharing.


Figure A. Disaggregated memory architecture, with a memory blade shared by multiple compute blades. The memory blade (enlarged) contains a protocol engine, memory controller, address mapping, and DIMMs; the compute blades, each with CPUs, DIMMs, and a software stack (app, OS, hypervisor), reach it over the backplane.

Figure B. Architecture of the implemented prototype. Modifications are made to the hypervisor; part of local memory emulates remote memory.

Future Work

We plan to use the prototype to model multiple systems sharing a memory blade. We will also explore coordinating page migration with VM scheduling for greater consolidation. One use case is virtual desktops, where non-shared pages can be evicted when a VM is descheduled (during user think time) and prefetched when the idle VM is rescheduled.
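A design sketch of that proposed policy in C follows; the hook names and data structures are hypothetical and do not exist in Xen today.

/* Hypothetical scheduler hooks for the virtual-desktop use case:
 * evict an idle VM's non-shared pages to the memory blade, and
 * prefetch them back before the VM runs again. */
#include <stdio.h>

struct page { int shared, on_blade; struct page *next; };
struct vm   { const char *name; struct page *pages; };

static void migrate_to_blade(struct page *pg)    { pg->on_blade = 1; }
static void prefetch_from_blade(struct page *pg) { pg->on_blade = 0; }

/* Called when the VM is descheduled (user think time). */
static void on_vm_descheduled(struct vm *vm)
{
    for (struct page *pg = vm->pages; pg; pg = pg->next)
        if (!pg->shared)               /* CBPS-shared pages stay local */
            migrate_to_blade(pg);
}

/* Called when the idle VM is rescheduled. */
static void on_vm_rescheduled(struct vm *vm)
{
    for (struct page *pg = vm->pages; pg; pg = pg->next)
        if (pg->on_blade)
            prefetch_from_blade(pg);   /* overlap fetch with wakeup */
}

int main(void)
{
    struct page p1 = { 1, 0, NULL }, p0 = { 0, 0, &p1 };
    struct vm desktop = { "vm1", &p0 };
    on_vm_descheduled(&desktop);       /* p0 evicted, p1 (shared) stays */
    on_vm_rescheduled(&desktop);       /* p0 prefetched back            */
    printf("%s: p0 on_blade=%d\n", desktop.name, p0.on_blade);
    return 0;
}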


Figure C. Performance of workloads as the percentage of remote memory (0% to 90%) is varied, measured as normalized runtime. SPEC workloads have 2 GB of memory; all others have 8 GB.

Figure D. Microbenchmark results showing measured remote memory latency (access time per remote page, µsec) versus configured latency (0, 4, 10, 15, and 20 µsec).

Figure E. Potential VM consolidation savings enabled by memory disaggregation with CBPS at the memory blade level, plotted over CBPS-only and over mBlade-only as the percentage of hot content in shared pages varies from 10% to 90%.


[1] K. Lim et al., “Disaggregated memory for expansion and sharing in blade servers,” in Proceedings of the 36th Annual International Symposium on Computer Architecture (ISCA), June 2009.



[Introduction chart: relative capacity (log scale, 1 to 1000) versus year (2003-2017), plotting growth in # of cores against GB of DRAM per server; the widening gap between the two curves is the capacity wall in per-server memory usage over time.]