Hypervisor-based Prototyping of Disaggregated Memory and Benefits for VM Consolidation

Kevin Lim¹, Jichuan Chang², Jose Renato Santos², Yoshio Turner², Trevor Mudge¹, Parthasarathy Ranganathan², Steven Reinhardt¹˒³, Thomas Wenisch¹

¹University of Michigan, Ann Arbor  ²Hewlett-Packard Labs  ³Advanced Micro Devices
Introduction

Problem: Impending memory capacity wall for commodity systems!
Opportunity: Optimizing for ensemble-level memory usage.
Proposed Solution: Disaggregated memory to provide memory capacity expansion and dynamic capacity sharing [1].
Prototype

Goal: Explore system-level implications of disaggregated memory. What hypervisor changes are needed? Are new system-level optimizations enabled?
Current Research: Implementing a Xen hypervisor-based prototype of disaggregated memory (Figure B). The majority of the code changes are in: (1) memory management, (2) user-level tools for setting parameters.
Functionality: Detects accesses to remote memory; on a remote access, swaps the remote page with a local page.
Configuration: 2x quad-core AMD Opteron 2354, 32 GB DRAM. Large-memory workloads: SPEC CPU (perlbench, gcc, bwaves, mcf, zeusmp), PARSEC (blackscholes), quicksort, TPC-H.
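The trap-and-swap functionality above can be sketched as a simplified model. This is illustrative Python, not the actual Xen code; the class and the victim-selection policy are hypothetical.

```python
# Simplified model of the prototype's remote-page mechanism: an access to a
# remote page traps into the hypervisor, which swaps the faulting remote page
# with a local victim page. All names and the victim policy are illustrative.

class DisaggregatedMemory:
    def __init__(self, local_pages, remote_pages):
        self.local = set(local_pages)    # pages resident in local DRAM
        self.remote = set(remote_pages)  # pages held on the memory blade
        self.remote_accesses = 0         # counts trapped remote accesses

    def access(self, page):
        if page in self.local:
            return "local hit"           # served with no hypervisor involvement
        # Remote access detected: swap the remote page with a local victim.
        self.remote_accesses += 1
        victim = next(iter(self.local))  # placeholder victim-selection policy
        self.local.remove(victim)
        self.remote.remove(page)
        self.local.add(page)
        self.remote.add(victim)
        return "remote swap"

mem = DisaggregatedMemory(local_pages=[1, 2], remote_pages=[3, 4])
print(mem.access(3))   # remote page: trapped and swapped in
print(mem.access(3))   # now local: served directly
```

After the first fault the page is local, so repeated accesses pay no further penalty; this locality is what keeps slowdowns small in the results below.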
Results

Virtual machines (VMs) are set up with 2 or 8 GB of local and remote memory; the percentage and latency of remote memory are varied across runs. Results with the hypervisor-based prototype (Figures C and D) show that disaggregated memory provides remote capacity with minimal slowdown.
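As a back-of-the-envelope check on why the slowdown stays small, expected slowdown can be estimated from the fraction of accesses that fault to remote memory and the per-fault penalty. The numbers below are illustrative, not measured values from the poster.

```python
# Illustrative average-access-time model for disaggregated memory: slowdown
# grows with the fraction of accesses that fault to remote memory and with
# the configured remote latency. All numbers here are made up.

def estimated_slowdown(base_ns, remote_fraction, remote_penalty_ns):
    """Ratio of effective access time to the all-local baseline."""
    effective = base_ns + remote_fraction * remote_penalty_ns
    return effective / base_ns

# If only 0.1% of accesses fault to remote memory at a 4 us swap penalty
# on top of a 100 ns baseline, the slowdown stays small:
print(round(estimated_slowdown(100, 0.001, 4000), 2))  # → 1.04
```

The model illustrates the design point: as long as page-level locality keeps the remote-fault rate low, even microsecond-scale blade latencies yield single-digit-percent slowdowns.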
Virtual Machine Consolidation

Disaggregated memory offers independent memory and CPU scaling, removing the memory bottleneck for VM consolidation. Other benefits are possible through techniques to increase effective memory capacity, such as content-based page sharing (CBPS).
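Content-based page sharing, mentioned above, can be illustrated with a minimal hashing sketch. This is hypothetical code, not the prototype's implementation: real systems also byte-compare pages on a hash match and mark shared copies copy-on-write.

```python
# Minimal sketch of content-based page sharing (CBPS): pages with identical
# contents are detected by hashing and backed by a single physical copy,
# raising effective memory capacity across consolidated VMs.
import hashlib

def share_pages(pages):
    """Map each page to a canonical copy; return (mapping, unique count)."""
    canonical = {}   # content hash -> index of first page with that content
    mapping = {}     # page index -> index of the shared physical copy
    for i, data in enumerate(pages):
        h = hashlib.sha256(data).hexdigest()
        mapping[i] = canonical.setdefault(h, i)
    return mapping, len(canonical)

pages = [b"\x00" * 4096, b"app-data", b"\x00" * 4096]  # two identical pages
mapping, unique = share_pages(pages)
print(unique)    # → 2 physical copies back 3 pages
print(mapping)   # page 2 shares page 0's copy
```

Performing this deduplication at the memory blade, as Figure E explores, lets sharing span all compute blades rather than a single host.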
Conclusions

Memory disaggregation addresses the capacity wall while providing greater efficiency and OS transparency. Our prototype shows the feasibility of implementing disaggregated memory in a widely used open-source hypervisor. Importantly, it enables in-depth research into large-scale workloads and content-based page sharing.
Figure A. Disaggregated memory architecture, with a memory blade shared by multiple compute blades.
Figure B. Architecture of the implemented prototype. Modifications are made to the hypervisor; part of local memory emulates remote memory.
Future Work

We plan to use the prototype to model multiple systems sharing a memory blade. We will also explore coordinating page migration with VM scheduling for greater consolidation. One use case is virtual desktops, where non-shared pages can be evicted when a VM is descheduled (user think time) and the evicted pages are prefetched when the idle VM is rescheduled.
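The virtual-desktop use case above can be sketched as a simple policy. The class and page sets are hypothetical; the actual coordination with the Xen scheduler is future work in the poster.

```python
# Sketch of coordinating page migration with VM scheduling: when a desktop
# VM is descheduled (user think time), its non-shared pages are evicted to
# the memory blade; on reschedule they are prefetched back. Illustrative only.

class DesktopVM:
    def __init__(self, private_pages, shared_pages):
        self.shared = set(shared_pages)              # pages shared via CBPS
        self.local = set(private_pages) | self.shared
        self.remote = set()                          # pages on the memory blade

    def deschedule(self):
        # Evict only non-shared pages; shared copies stay for other VMs.
        evict = self.local - self.shared
        self.local -= evict
        self.remote |= evict

    def reschedule(self):
        # Prefetch evicted pages back before the VM runs again.
        self.local |= self.remote
        self.remote.clear()

vm = DesktopVM(private_pages=[10, 11], shared_pages=[20])
vm.deschedule()
print(sorted(vm.local))   # → [20]
vm.reschedule()
print(sorted(vm.local))   # → [10, 11, 20]
```

Freeing a descheduled desktop's private pages this way is what allows more idle VMs to be packed onto each compute blade.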
Figure E. Potential VM consolidation savings enabled by memory disaggregation with CBPS at the memory blade level. (Axes: savings, 0%–180%, versus % of hot content in shared pages, 10%–90%; series: over CBPS-only and over mBlade-only.)
Figure C. Performance of workloads as percentage of remote memory is varied. SPEC workloads have 2 GB of memory; all others have 8 GB.
Figure D. Microbenchmark results showing measured remote memory latency versus configured latency.
(Figure B labels: server running our hypervisor, with CPUs, DIMMs, and a software stack of App, OS, and hypervisor.)
[1] K. Lim, et al., "Disaggregated memory for expansion and sharing in blade servers," Proceedings of the 36th Annual International Symposium on Computer Architecture, June 2009.
(Figure A labels: memory blade, enlarged, containing protocol engine, memory controller, address mapping, and DIMMs, connected via a backplane to multiple compute blades, each with CPUs, DIMMs, and a software stack of App, OS, and hypervisor.)
(Figure D axes: access time per remote page in µsec, 0–40, versus configured remote memory access latency, 0–20 µsec.)
(Figure C axes: normalized runtime, 0.7–1.6, for bwaves, gcc, mcf, perlbench, zeusmp, blackscholes, nsort, and tpch, with remote memory fraction varied from 0% to 90%.)
(Introduction figure: relative capacity, log scale 1–1000, by year 2003–2017, comparing # of cores against GB DRAM and illustrating the capacity wall; inset: per-server memory usage over time.)