xs boston 2008 memory overcommit

Memory Overcommit… without the Commitment. Speaker: Dan Magenheimer, Oracle Corporation, 2008

DESCRIPTION

Dan Magenheimer: Memory Overcommit...Without the Commitment

TRANSCRIPT

Page 1: XS Boston 2008 Memory Overcommit

Memory Overcommit… without the Commitment

Speaker: Dan Magenheimer

Oracle Corporation 2008

Page 2: XS Boston 2008 Memory Overcommit

Memory Overcommit without the Commitment (Xen Summit 2008) -Dan Magenheimer

Overview

•What is overcommitment?

•aka oversubscription or overbooking

•Why (and why not) overcommit memory?

•Known techniques for memory overcommit

•Feedback-directed ballooning

Page 3: XS Boston 2008 Memory Overcommit

CPU overcommitment

One 4-CPU physical server

Four underutilized 2-CPU virtual servers

Xen supports CPU overcommitment (aka “consolidation”)

Page 4: XS Boston 2008 Memory Overcommit

I/O overcommitment

One 4-CPU physical server with a 1Gb NIC

Four underutilized 2-CPU virtual servers each with a 1Gb NIC

Xen supports I/O overcommitment

Page 5: XS Boston 2008 Memory Overcommit

Memory overcommitment???

One 4-CPU physical server w/4GB RAM

Four underutilized 2-CPU virtual servers each with 1GB RAM

SORRY! Xen doesn’t support memory overcommitment!

Page 6: XS Boston 2008 Memory Overcommit

Why doesn’t Xen overcommit memory?

•Memory is cheap – buy more

•“When you overbook memory excessively, performance takes a hit”

•Most consolidated workloads don’t benefit much

•Overcommit requires swapping – and Xen doesn’t do I/O in the hypervisor

•Overcommit adds lots of complexity and latency to important features like save/restore/migration

•Operating systems know what they are doing and can use all the memory they can get

Page 7: XS Boston 2008 Memory Overcommit

Why doesn’t Xen overcommit memory?

•Memory is cheap – buy more… except when you’re out of slots or need BIG DIMMs.

•“When you overbook memory excessively, performance takes a hit”… yes, but true of overbooking CPU and I/O too. Sometimes tradeoffs have to be made.

•Most consolidated workloads don’t benefit much… but some workloads do!

•Overcommit requires swapping – and Xen doesn’t do I/O in the hypervisor… only if black-box swapping is required

•Overcommit adds lots of complexity and latency to important features like save/restore/migration… some techniques maybe… we’ll see…

•Operating systems know what they are doing and can use all the memory they can get… but an idle or lightly-loaded OS may not!

Page 8: XS Boston 2008 Memory Overcommit

Why should Xen support memory overcommitment?

•Competitive reasons

“VMware Infrastructure’s exclusive ability to overcommit memory gives it an advantage in cost per VM that others can’t match” *

•High(er) density consolidation can save money

•Sum of guest working sets is often smaller than available physical memory

•Inefficient guest OS utilization of physical memory (caching vs “hoarding”)

* E. Horschman, Cheap Hypervisors: A Fine Idea -- If you can afford them, blog posting, http://blogs.vmware.com/virtualreality/2008/03/cheap-hyperviso.html

Page 9: XS Boston 2008 Memory Overcommit

Problem statement

Oracle OnDemand businesses (both internal/external):

•would like to use Oracle VM (Xen-based)

•but use memory overcommit extensively

The Oracle VM team was asked… can we:

•implement memory overcommit on Xen?

•get it accepted upstream?

Page 10: XS Boston 2008 Memory Overcommit

Memory Overcommit Investigation

•Technology survey

•understand known techniques and implementations

•understand what Xen has today and its limitations

•Propose a solution

•OK to place requirements on guest

•e.g. black-box solution unnecessary

•soon and good is better than late and great

•phased delivery OK if necessary

•e.g. Oracle Enterprise Linux now, Windows later

•preferably high bang for the buck

•e.g. 80% of value with 20% of cost

Page 11: XS Boston 2008 Memory Overcommit

Techniques for memory overcommit

•Ballooning

•Content-based page sharing

•VMM-driven demand paging

•Hot-plug memory add/delete

•Ticketed ballooning

•Swapping entire guests

Black-box or gray-box*… or white-box?

* T. Wood, et al. Black-box and Gray-box Strategies for Virtual Machine Migration, In Proceedings NSDI ’07. http://www.cs.umass.edu/~twood/pubs/NSDI07.pdf

Page 12: XS Boston 2008 Memory Overcommit

WHAT IF…..?

Operating systems were able to:

•recognize when physical memory is not being used

efficiently and communicate relevant statistics

•surrender memory when it is underutilized

•reclaim memory when it is needed

And Xen/domain0 could balance the allocation of

physical memory, just as it does for CPU/devices?

…. Maybe this is already possible?!?

Page 13: XS Boston 2008 Memory Overcommit

Ballooning (gray-box)

•In-guest device driver

•“steals” / reclaims memory via guest in-kernel APIs

•e.g. get_free_page() and MmAllocatePagesForMdl()

•Balloon inflation increases guest memory pressure

•leverages guest native memory management algorithms

•Xen has ballooning today

•mostly used for domain0 autoballooning

•has problems, but recent patch avoids worst*

•VMware and KVM have it today too

Issues:

•driver must be installed

•not available during boot

•reclaim may not be fast enough; potential out-of-memory conditions

* J. Beulich, [PATCH] linux/balloon: don’t allow ballooning down a domain below a reasonable limit, Xen developers archive, http://lists.xensource.com/archives/html/xen-devel/2008-04/msg00143.html

Currently implemented by: [vendor logos; only the “KVM” label survives as text]
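The inflate/deflate accounting described on this slide can be sketched as a toy model. This is not the Xen balloon driver: the `Guest` class and its field names are hypothetical, and real drivers work on pages obtained through kernel allocators rather than on megabyte counters.

```python
from dataclasses import dataclass


@dataclass
class Guest:
    """Toy model of a guest whose memory is partly held by a balloon driver."""
    total_mb: int         # memory the guest was booted with
    balloon_mb: int = 0   # memory currently held by the balloon (handed back to the hypervisor)

    @property
    def usable_mb(self) -> int:
        """Memory the guest OS can still use for itself."""
        return self.total_mb - self.balloon_mb

    def inflate(self, mb: int) -> int:
        """Balloon takes memory from the guest; returns the MB actually taken."""
        taken = max(0, min(mb, self.usable_mb))
        self.balloon_mb += taken
        return taken

    def deflate(self, mb: int) -> int:
        """Balloon gives memory back to the guest; returns the MB actually released."""
        released = max(0, min(mb, self.balloon_mb))
        self.balloon_mb -= released
        return released
```

Inflating raises memory pressure inside the guest because `usable_mb` shrinks while the guest's own memory management stays in charge of what to evict.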

Page 14: XS Boston 2008 Memory Overcommit

Content-based page sharing (black-box)

•One physical page frame used for multiple identical pages

•sharing works both intra-guest and inter-guest

•hypervisor periodically scans for copies and “merges”

•copy-on-write breaks share

•Investigated on Xen, but never in-tree* **

•measured savings of 4-12%

•VMware*** has had it for a long time, KVM soon****

Issues:

•Performance cost of discovery scans, frequent share set-up/tear-down

•High complexity for relatively low gain

* J. Kloster et al, On the feasibility of memory sharing: content based page sharing in the Xen virtual machine monitor, Technical Report, 2006. http://www.cs.aau.dk/library/files/rapbibfiles1/1150283144.pdf

** G. Milos, Memory COW in Xen, Presentation at XenSummit, Nov 2007. http://www.xen.org/files/xensummit_fall2007/18_GregorMilos.pdf

*** C. Waldspurger, Memory Resource Management in VMware ESX Server, In Proceedings OSDI ’02, http://www.usenix.org/events/osdi02/tech/walspurger/waldspurger.pdf

**** A. Kivity, Memory overcommit with kvm, http://avikivity.blogspot.com/2008/04/memory-overcommit-with-kvm.html

Currently implemented by: [vendor logos; only the “KVM” label survives as text]
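The scan, merge and copy-on-write cycle on this slide can be illustrated with a small in-memory model. It is a sketch only: the `PageStore` class and its method names are hypothetical, the content hash (SHA-256) is an arbitrary choice, and no hypervisor's actual implementation is implied.

```python
import hashlib


class PageStore:
    """Toy model: guest pages map to physical frames that may be shared."""

    def __init__(self):
        self.frames = {}    # frame id -> page contents (bytes)
        self.refs = {}      # frame id -> number of guest pages mapped to it
        self.mapping = {}   # (guest, page number) -> frame id
        self._next_id = 0

    def _alloc(self, data):
        fid = self._next_id
        self._next_id += 1
        self.frames[fid] = data
        self.refs[fid] = 0
        return fid

    def _unmap(self, key):
        fid = self.mapping.pop(key)
        self.refs[fid] -= 1
        if self.refs[fid] == 0:          # last user gone: frame is free again
            del self.frames[fid], self.refs[fid]

    def write(self, guest, page_no, data):
        """Write a page. A shared frame is never modified in place: the
        writer gets a private frame, which is the copy-on-write break."""
        key = (guest, page_no)
        if key in self.mapping:
            self._unmap(key)
        fid = self._alloc(data)
        self.mapping[key] = fid
        self.refs[fid] = 1

    def scan_and_merge(self):
        """Periodic scan: map identical pages to one frame. Returns frames freed."""
        before = len(self.frames)
        canonical = {}                   # content digest -> frame id to keep
        for key, fid in list(self.mapping.items()):
            data = self.frames[fid]
            keep = canonical.setdefault(hashlib.sha256(data).digest(), fid)
            # Compare full contents too, so a hash collision cannot merge
            # pages that differ.
            if keep != fid and self.frames[keep] == data:
                self._unmap(key)
                self.mapping[key] = keep
                self.refs[keep] += 1
        return before - len(self.frames)
```

The scan walks every mapped page on each pass, which is where the discovery cost listed under "Issues" comes from.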

Page 15: XS Boston 2008 Memory Overcommit

Demand paging (black-box)

•VMM reclaims memory and swaps to disk

•VMware has today

•used as last resort

•randomized page selection

•Could potentially be done on Xen via domain0

Issues:

•“Hypervisor” must have disk/net drivers

•“Semantic gap”* → Double paging

* P. Chen et al. When Virtual is Better than Real. In Proceedings HOTOS ’01. http://www.eecs.umich.edu/~pmchen/papers/chen01.pdf

Currently implemented by: [vendor logos; only the “KVM” label survives as text]

Page 16: XS Boston 2008 Memory Overcommit

Hotplug memory add/delete (white-box)

•Essentially just ballooning with:

•larger granularity

•less fragmentation

•potentially unlimited maximum memory

•no kernel data overhead for unused pages

Issues:

•Not widely available yet (for x86)

•Larger granularity

•Hotplug delete requires defragmentation

J. Schopp et al, Resizing Memory with Balloons and Hotplug, Ottawa Linux Symposium 2006, https://ols2006.108.redhat.com/reprints/schopp-reprint.pdf

Currently implemented by:

Page 17: XS Boston 2008 Memory Overcommit

“Ticketed” ballooning (white-box)

•Proposed by Ian Pratt*

•A ticket is obtained when a page is surrendered to the balloon driver

•Original page can be retrieved if Xen hasn’t given the page to another domain

•Similar to a system-wide second-chance buffer cache or unreliable swap device

•Never implemented (afaik)

* http://lists.xensource.com/archives/html/xen-devel/2008-05/msg00321.html
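Since the slide notes that ticketed ballooning was never implemented, the following is purely an illustration of the idea as described: a surrendered page yields a ticket, and the ticket can be redeemed only while the page has not been reused. The class, its method names and the oldest-first reuse order are all assumptions made for this sketch.

```python
import itertools


class TicketedBalloon:
    """Toy hypervisor-side store for pages surrendered by guests."""

    def __init__(self):
        self._tickets = itertools.count(1)
        self._held = {}   # ticket -> page contents that are still intact

    def surrender(self, page: bytes) -> int:
        """Guest gives up a page and receives a ticket for it."""
        ticket = next(self._tickets)
        self._held[ticket] = page
        return ticket

    def give_to_other_domain(self) -> bool:
        """Hypervisor reuses the oldest surrendered page; its ticket becomes void.
        (Oldest-first is an assumption; dicts preserve insertion order.)"""
        if not self._held:
            return False
        oldest = next(iter(self._held))
        del self._held[oldest]
        return True

    def redeem(self, ticket: int):
        """Return the original page if it is still intact, else None, in which
        case the guest has to fetch the data from its own backing store."""
        return self._held.pop(ticket, None)
```

A `None` from `redeem` is what makes this behave like the "unreliable swap device" in the slide: the guest must always have another way to recover the data.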

Page 18: XS Boston 2008 Memory Overcommit

Whole-guest swapping (black?-box)

•Proposed by Keir Fraser*

•Forced save/restore of idle/low-priority guests

•Wake-on-LAN-like mechanism causes restore

•Never implemented (afaik)

Issues:

•Very long latency for guest resume

•Very high system I/O overhead when densely overcommitted

* http://lists.xensource.com/archives/html/xen-devel/2005-12/msg00409.html

Page 19: XS Boston 2008 Memory Overcommit

Observations

•Xen balloon driver works well

•recent patch avoids O-O-M problems

•works on hvm if pv-on-hvm drivers present

•ballooning up from “memory=xxx” to “maxmem=yyy” works (on pvm domains)

•ballooned-down domain doesn’t restrict creation of new domains

•Linux provides lots of memory-status information

•/proc/meminfo and /proc/vmstat

•Committed_AS is a decent estimator of current memory need

•Linux does OK when put under memory pressure

•rapid/frequent balloon inflation/deflation just works… as long as remaining available Linux memory is not too small

•properly configured Linux swap disk works when necessary; obviates need for “system-wide” demand paging

•Xenstore tools work for two-way communication even in hvm
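As an illustration of the Committed_AS observation, the value can be pulled out of /proc/meminfo text with a few lines of Python. The talk's own implementation used bash scripts, so this is only a sketch; the function names and the sample text are made up for the example.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: integer value}.
    Most fields are reported in kB; the unit suffix is ignored here."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def committed_as_mb(text: str) -> int:
    """Committed_AS converted from kB to whole MB."""
    return parse_meminfo(text)["Committed_AS"] // 1024


# Made-up sample in the usual /proc/meminfo layout.
SAMPLE = """\
MemTotal:         524288 kB
MemFree:          102400 kB
Committed_AS:     204800 kB
HugePages_Total:       0
"""
```

On a Linux guest the same functions can be fed the real file, for example `committed_as_mb(open("/proc/meminfo").read())`.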

Page 20: XS Boston 2008 Memory Overcommit

Proposed Solution: Feedback-directed ballooning (gray-box)

Use relevant Linux memory statistics to control balloon size

•Selfballooning:

•Local feedback loop; immediate balloon changes

•Eagerly inflates balloon to create memory pressure

•No management or domain0 involvement

•Directed ballooning:

•Memory stats fed from each domainU to domain0

•Policy module in domain0 determines balloon size, controls memory pressure for each domain (not yet implemented)
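The slide marks the domain0 policy module as not yet implemented, so what follows is only one hypothetical policy, shown to make the idea concrete. The function name, the single shared `floor_mb` and the proportional split are all assumptions: each domain gets its reported need when everything fits, and otherwise the memory above the floors is divided in proportion to need.

```python
def directed_targets(host_mb, needs_mb, floor_mb=64):
    """Hypothetical domain0 policy.

    needs_mb maps domain name -> estimated memory need in MB (for example
    a statistic reported by each guest). Returns domain -> balloon target.
    """
    wanted = {dom: max(need, floor_mb) for dom, need in needs_mb.items()}
    if sum(wanted.values()) <= host_mb:
        return wanted                      # everything fits; the rest stays free
    spare = host_mb - floor_mb * len(wanted)
    if spare < 0:
        raise ValueError("host cannot hold every domain at its floor")
    # Total demand above the floors; positive whenever this branch is reached.
    excess = sum(w - floor_mb for w in wanted.values())
    return {dom: floor_mb + spare * (w - floor_mb) // excess
            for dom, w in wanted.items()}
```

Leaving unclaimed memory free in the first branch is a design choice that keeps room for creating new domains.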

Page 21: XS Boston 2008 Memory Overcommit

Implementation: Feedback-directed ballooning

•No changes to Xen or domain0 kernel or drivers!

•Entirely implemented with user-land bash scripts

•Self-ballooning and stat reporting/monitoring only (for now)

•Committed_AS used (for now) as memory estimator

•Hysteresis parameters – settable to rate-limit balloon changes

•Minimum memory floor enforced to avoid O-O-M conditions

•same maxmem-dependent algorithm as recent balloon driver bugfix

•Other guest requirements:

•Properly sized and configured swap (virtual) disk for each guest

•HVM: pv-on-hvm drivers present

•Xenstore tools present (but not for selfballooning)
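One iteration of a selfballooning loop with hysteresis and a memory floor might look like the sketch below. The original was a set of bash scripts whose exact logic the slides do not give, so the function, its parameters and the "move 1/hysteresis of the gap per step" rule are assumptions, and `floor_mb` is simply passed in rather than derived from maxmem.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer division rounded up (for non-negative a, positive b)."""
    return -(-a // b)


def selfballoon_step(current_mb, committed_as_mb, floor_mb, max_mb,
                     down_hysteresis=10, up_hysteresis=1):
    """Return the next memory target for the guest, in MB.

    The goal is the memory estimate (Committed_AS), clamped between the
    floor and the guest's maximum. Each step closes 1/hysteresis of the
    gap, so a larger hysteresis value means slower, rate-limited changes.
    Assumes floor_mb <= max_mb.
    """
    goal = max(floor_mb, min(committed_as_mb, max_mb))
    gap = goal - current_mb
    if gap < 0:   # shrink the guest (inflate the balloon) gradually
        return current_mb - ceil_div(-gap, down_hysteresis)
    return current_mb + ceil_div(gap, up_hysteresis)   # grow (deflate)
```

With the defaults chosen here the guest gives memory up slowly but gets it back in a single step, which favors avoiding out-of-memory conditions over reclaiming quickly.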

Page 22: XS Boston 2008 Memory Overcommit

Feedback-directed Ballooning Results

•Overcommit ratio

•7:4 w/default configuration (7 512MB loaded guests, 2GB phys memory)

•15:4 w/aggressive config (15 512MB idle guests, 2GB phys memory)

•for pvm guests, arbitrarily higher due to “maxmem=”

•Preliminary performance

(Linux kernel make after make clean, 5 runs, mean of middle 3)

Memory (MB)  Selfballooning  Min (MB)  User (sec)  Sys (sec)  Elapsed (sec)  Major page faults  Down hysteresis
2048         Off             -         785         121        954            0                  -
2048         Self            -         778         104        1209           8940               10
512          Off             -         775         79         1167           8520               -
512          Self            -         775         80         1202           9650               10
1024         Off             -         775         95         1120           4650               -
1024         Self            238       775         93         1201           9490               10
256          Off             -         773         79         1186           9150               -
(unclear)    Self            172       775         80         1202           9650               10

Two further Min (MB) values, 106 and 368, appear on the slide; their rows are not recoverable from the transcript.

Selfballooning costly for large-memory domains (27% slower) but barely noticeable for smaller-memory domains (3% slower)
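The overcommit ratios quoted on this slide (7:4 and 15:4) follow directly from the guest count, guest size and physical memory given in the bullets. A small check, with a helper name chosen only for this example:

```python
from fractions import Fraction


def overcommit_ratio(guest_count: int, guest_mb: int, host_mb: int) -> str:
    """Total configured guest memory relative to physical memory, as 'a:b'."""
    ratio = Fraction(guest_count * guest_mb, host_mb)
    return f"{ratio.numerator}:{ratio.denominator}"


# 7 guests x 512 MB on a 2 GB host, and 15 guests x 512 MB on the same host
default_cfg = overcommit_ratio(7, 512, 2048)      # "7:4"
aggressive_cfg = overcommit_ratio(15, 512, 2048)  # "15:4"
```

By the same arithmetic, four 1GB guests on a 4GB host come out at 1:1, meaning no overcommit.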

Page 23: XS Boston 2008 Memory Overcommit

Domain0 screenshot with monitoring tool and xentop showing memory overcommitment

Page 24: XS Boston 2008 Memory Overcommit

Future Work

•Domain0 policy module for directed ballooning

•some combination of directed and self-ballooning??

•Improved feedback / heuristics

•Combine multiple memory statistics, check idle time

•Prototype kernel changes (“white-box” feedback)

•Better “idle memory” metrics

•Benchmarking in real world

•More aggressive minimum memory experiments

•Windows support

Page 25: XS Boston 2008 Memory Overcommit

Conclusions

•Xen does do memory overcommit today!

•Memory overcommit has some performance impact

•but still useful in environments where high VM density is more important than max performance

•Lots of cool research directions possible for “virtualization-aware” OS memory management
