Computer Systems Principles: Dynamic Memory Management

Post on 31-Dec-2015



University of Massachusetts Amherst • Department of Computer Science

Computer Systems Principles
Dynamic Memory Management

Emery Berger and Mark Corner
University of Massachusetts Amherst

Dynamic Memory Management

How the heap manager is implemented
– malloc, free
– new, delete

Memory Management

Programs ask memory manager
– to allocate/free objects (or multiple objects)
Memory manager asks OS
– to allocate/free pages (or multiple pages)

Layers, top to bottom:
User Program
– requests objects (new, malloc)
Allocator (java, libc)
– requests pages (mmap, brk)
Operating System
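The bottom layer can be sketched in C. This is a minimal illustration, assuming a POSIX system; the helper names `pages_alloc` and `pages_free` are hypothetical, and a real allocator would carve many objects out of each page rather than hand pages out directly.

```c
#define _DEFAULT_SOURCE          /* expose MAP_ANONYMOUS on strict compilers */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Ask the OS for `npages` fresh, zero-filled pages.
   Returns NULL on failure. (Hypothetical helper name.) */
void *pages_alloc(size_t npages) {
    assert(npages > 0);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, npages * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hand the pages back to the OS. Returns 0 on success. */
int pages_free(void *p, size_t npages) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return munmap(p, npages * page);
}
```

The user program never sees these calls: it asks the allocator for objects, and the allocator decides when a new page is needed.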

Memory Management

Ideal memory manager:
– Fast
  • Raw time, asymptotic runtime, locality
– Memory efficient
  • Low fragmentation
With multicore & multiprocessors:
– Scalable to multiple processors
New issues:
– Secure from attack
– Reliable in face of errors

Memory Manager Functions

Not just malloc/free
– realloc
  • Change size of object, copying old contents
– ptr = realloc (ptr, 10);
  • But: realloc (ptr, 0) = ?
  • How about: realloc (NULL, 16) ?
Other fun
– calloc
– memalign
Needs ability to locate size & object start
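The two realloc questions can be made concrete. In standard C, `realloc(NULL, n)` behaves like `malloc(n)`, while the result of `realloc(ptr, 0)` has differed across C standards and implementations, so portable code avoids it. The wrapper below is a sketch; `resize_buffer` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Resize *buf to new_size bytes without leaking it on failure.
   Returns 0 on success; on failure returns -1 and leaves *buf untouched. */
int resize_buffer(char **buf, size_t new_size) {
    assert(buf != NULL && new_size > 0);   /* sidestep realloc(ptr, 0) */
    char *tmp = realloc(*buf, new_size);   /* *buf == NULL acts like malloc */
    if (tmp == NULL)
        return -1;
    *buf = tmp;                            /* old contents were copied over */
    return 0;
}
```

Writing `ptr = realloc(ptr, 10);` directly, as on the slide, loses the only pointer to the old block if the call fails; the temporary above avoids that.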

Fragmentation

Intuitively, fragmentation stems from “breaking” up heap into unusable spaces
– More fragmentation = worse utilization
External fragmentation
– Wasted space outside allocated objects
Internal fragmentation
– Wasted space inside an object

Classical Algorithms

First-fit
– find first chunk of at least the desired size
Best-fit
– find chunk that fits best
  • Minimizes wasted space
Worst-fit
– find chunk that fits worst
– name is a misnomer!
– keeps large holes around
Reclaim space: coalesce free adjacent objects into one big object
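The three policies differ only in which sufficiently large chunk they pick. A minimal sketch, assuming the free chunks are given as an array of sizes (a real heap would walk a free list instead); the function names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Each function returns the index of the chosen free chunk,
   or -1 if no chunk can hold `want` bytes. */

int first_fit(const size_t *chunks, int n, size_t want) {
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        if (chunks[i] >= want)
            return i;              /* first chunk that is big enough */
    return -1;
}

int best_fit(const size_t *chunks, int n, size_t want) {
    int best = -1;
    for (int i = 0; i < n; i++)
        if (chunks[i] >= want && (best < 0 || chunks[i] < chunks[best]))
            best = i;              /* smallest chunk that still fits */
    return best;
}

int worst_fit(const size_t *chunks, int n, size_t want) {
    int worst = -1;
    for (int i = 0; i < n; i++)
        if (chunks[i] >= want && (worst < 0 || chunks[i] > chunks[worst]))
            worst = i;             /* largest chunk: biggest leftover hole */
    return worst;
}
```

First-fit can stop at the first match, while best-fit and worst-fit must examine every chunk in this unsorted layout.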

Quick Activity

Program asks for: 300, 25, 25, 100
– First-fit and best-fit allocations go where?
– Which ones cannot be fulfilled?
What about: 110, 54, 25, 70, 50?

Implementation Techniques

Freelists
– Linked lists of objects in same size class
  • Range of object sizes
First-fit, best-fit in this context?
– Which is faster?

Implementation Techniques

Segregated size classes
– Use free lists, but never coalesce or split
Choice of size classes
– Exact
– Powers-of-two
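Powers-of-two classes trade internal fragmentation for simplicity. The sketch below assumes a smallest class of 8 bytes (an assumption, not something the slide fixes) and shows how a request maps to a class and how many bytes the rounding wastes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_CLASS_SIZE 8u   /* assumed smallest size class */

/* Class 0 holds requests up to 8 bytes, class 1 up to 16, class 2 up to 32, ... */
unsigned size_class(size_t sz) {
    assert(sz <= SIZE_MAX / 2);          /* keep the shift below from overflowing */
    unsigned c = 0;
    size_t cap = MIN_CLASS_SIZE;
    while (cap < sz) {
        cap <<= 1;
        c++;
    }
    return c;
}

/* Bytes reserved for every object in class c. */
size_t class_size(unsigned c) {
    return (size_t)MIN_CLASS_SIZE << c;
}

/* Internal fragmentation: reserved bytes the program did not ask for. */
size_t wasted_bytes(size_t sz) {
    return class_size(size_class(sz)) - sz;
}
```

A 100-byte request lands in the 128-byte class and wastes 28 bytes; exact size classes would waste none but need many more free lists.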

Implementation Techniques

Big Bag of Pages (BiBOP)
– Page or pages (multiples of 4K)
– Usually segregated size classes
Header contains metadata
– Locate with bitmasking
Limits external fragmentation
Can be very fast
Secret Sauce for project
– Use free objects to track free objects
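Both tricks on this slide fit in a short sketch: the page header is found by masking off the low bits of an object's address, and each free object stores the link to the next free object in its own first bytes. The struct layout and function names are hypothetical, and a single 4K page obtained from C11 `aligned_alloc` stands in for pages obtained from the OS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define BIBOP_PAGE_SIZE 4096u   /* the slide's 4K page */

/* Hypothetical per-page header, stored at the start of the page. */
struct page_header {
    size_t obj_size;    /* every object on this page has this size */
    void  *free_list;   /* head of the list threaded through free objects */
};

/* Bitmasking: clear the low 12 bits of any object address on the page. */
struct page_header *header_of(void *p) {
    return (struct page_header *)((uintptr_t)p & ~(uintptr_t)(BIBOP_PAGE_SIZE - 1));
}

/* Create one page of objects of `obj_size` bytes, all initially free. */
struct page_header *page_new(size_t obj_size) {
    assert(obj_size >= sizeof(void *) && obj_size % sizeof(void *) == 0);
    unsigned char *page = aligned_alloc(BIBOP_PAGE_SIZE, BIBOP_PAGE_SIZE);
    if (page == NULL)
        return NULL;
    struct page_header *h = (struct page_header *)page;
    h->obj_size = obj_size;
    h->free_list = NULL;
    for (size_t off = sizeof *h; off + obj_size <= BIBOP_PAGE_SIZE; off += obj_size) {
        void *obj = page + off;
        *(void **)obj = h->free_list;   /* link lives inside the free object */
        h->free_list = obj;
    }
    return h;
}

void *page_alloc(struct page_header *h) {
    void *obj = h->free_list;
    if (obj != NULL)
        h->free_list = *(void **)obj;   /* pop */
    return obj;
}

void page_free(void *obj) {
    struct page_header *h = header_of(obj);   /* no per-object header needed */
    *(void **)obj = h->free_list;             /* push */
    h->free_list = obj;
}
```

Because the size lives in the page header, `free` and `realloc` can recover an object's size from its address alone, which is the "locate size & object start" requirement from the earlier slide.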

Runtime Analysis

Key components
– Cost of malloc (best, worst, average)
– Cost of free
– Cost of size lookup (for realloc & free)

Space Bounds

Fragmentation worst-case for “optimal”: O(log M/m)
– M = largest object size
– m = smallest object size
Best-fit = O(M * m) !

Performance Issues

Goal: perform well for typical programs
– Considerations:
  • Internal fragmentation
  • External fragmentation
  • Headers (metadata)
  • Scalability (later)
  • Reliability, too
“Canned” allocator often seen as slow

Custom Memory Allocation

Programmers replace new/delete
– Reduce runtime (often)
– Expand functionality (sometimes)
– Reduce space (rarely)
Very common
– Apache, gcc, lcc, STL, database servers…
– Language-level support in C++
– Widely recommended: “Use custom allocators”

Drawbacks of Custom Allocators

Avoiding system allocator:
– More code to maintain & debug
– Can’t use memory debuggers
– Not modular or robust:
  • Mix memory from custom and general-purpose allocators → crash!
Increased burden on programmers

(I) Per-Class Allocators

Recycle freed objects from a free list

a = new Class1;
b = new Class1;
c = new Class1;
delete a;
delete b;
delete c;
a = new Class1;
b = new Class1;
c = new Class1;

[Diagram: Class1 free list holding a, b, c]

+ Fast
  + Linked list operations
+ Simple
  + Identical semantics
  + C++ language support
- Possibly space-inefficient
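The slide's example uses C++ new/delete; the same recycling idea can be sketched in C. Here `struct class1` and its fields are hypothetical stand-ins for Class1, and `class1_new` / `class1_delete` play the roles of the overloaded operators.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for the slide's Class1; the fields are made up. */
struct class1 {
    int id;
    double payload;
};

/* A freed object doubles as a list node, so a slot must be able to hold a pointer. */
union class1_slot {
    struct class1 obj;
    union class1_slot *next_free;
};

static union class1_slot *class1_free_list = NULL;

struct class1 *class1_new(void) {
    union class1_slot *s = class1_free_list;
    if (s != NULL)
        class1_free_list = s->next_free;   /* recycle: O(1) pop */
    else
        s = malloc(sizeof *s);             /* list empty: fall back to malloc */
    return s != NULL ? &s->obj : NULL;
}

void class1_delete(struct class1 *p) {
    assert(p != NULL);                     /* sketch: caller passes a live object */
    union class1_slot *s = (union class1_slot *)p;
    s->next_free = class1_free_list;       /* O(1) push; memory never goes back to malloc */
    class1_free_list = s;
}
```

Running the slide's sequence against this sketch, the second round of allocations reuses the three freed objects in last-in, first-out order. The "possibly space-inefficient" drawback shows up in `class1_delete`: freed memory stays on this class's list and cannot serve any other type.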

(II) Custom Patterns

Tailor-made to fit allocation patterns
– Example: 197.parser (natural language parser)

char[MEMORY_LIMIT]

a = xalloc(8);
b = xalloc(16);
c = xalloc(8);
xfree(b);
xfree(c);
d = xalloc(8);

[Diagram: objects a, b, c, d in the array; end_of_array moves as each call executes]

+ Fast
  + Pointer-bumping allocation
- Brittle
  - Fixed memory size
  - Requires stack-like lifetimes
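The trace above suggests a stack-like allocator over a fixed array. The following is a sketch of that behavior, not the actual 197.parser code: the side table `blocks`, the limits, and the alignment choice are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MEMORY_LIMIT 4096u                     /* assumed arena size */
#define MAX_BLOCKS   256u                      /* assumed cap on live blocks */
#define XALIGN       _Alignof(max_align_t)

static _Alignas(max_align_t) unsigned char memory[MEMORY_LIMIT];
static size_t end_of_array = 0;                /* bump pointer: first unused byte */

/* Side table recording where each block starts and whether it was freed. */
static struct { size_t start; int freed; } blocks[MAX_BLOCKS];
static size_t nblocks = 0;

void *xalloc(size_t n) {
    assert(n > 0 && n <= MEMORY_LIMIT);
    size_t need = (n + XALIGN - 1) / XALIGN * XALIGN;
    if (nblocks == MAX_BLOCKS || need > MEMORY_LIMIT - end_of_array)
        return NULL;                           /* fixed memory size: no growth */
    blocks[nblocks].start = end_of_array;
    blocks[nblocks].freed = 0;
    nblocks++;
    end_of_array += need;                      /* pointer-bumping allocation */
    return memory + blocks[nblocks - 1].start;
}

void xfree(void *p) {
    size_t off = (size_t)((unsigned char *)p - memory);
    for (size_t i = nblocks; i-- > 0; ) {      /* frees are usually near the top */
        if (blocks[i].start == off) {
            blocks[i].freed = 1;
            break;
        }
    }
    /* Space is reclaimed only while the topmost block is free. */
    while (nblocks > 0 && blocks[nblocks - 1].freed) {
        nblocks--;
        end_of_array = blocks[nblocks].start;
    }
}
```

In the slide's sequence, `xfree(b)` reclaims nothing because c still sits above b; `xfree(c)` then pops both, so d lands where b used to be. A block freed out of stack order stays unusable until everything above it is freed, which is why the scheme requires stack-like lifetimes.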

(III) Regions

Separate areas, deletion only en masse

regioncreate(r)
regionmalloc(r, sz)
regiondelete(r)

+ Fast
  + Pointer-bumping allocation
  + Deletion of chunks
+ Convenient
  + One call frees all memory
- Risky
  - Dangling references
  - Too much space
Increasingly popular custom allocator
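A minimal region sketch using the three call names from the slide. The chunk size, the struct layouts, and the decision to take `r` by pointer are assumptions; production region libraries differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SIZE 4096u          /* assumed default chunk payload in bytes */

struct chunk {
    struct chunk *next;
    size_t used, cap;             /* bytes used / available in data */
    max_align_t data[];           /* flexible array keeps the payload aligned */
};

struct region {
    struct chunk *chunks;         /* newest chunk first */
};

void regioncreate(struct region *r) {
    r->chunks = NULL;
}

void *regionmalloc(struct region *r, size_t sz) {
    const size_t a = _Alignof(max_align_t);
    assert(sz > 0 && sz <= SIZE_MAX / 2);
    sz = (sz + a - 1) / a * a;                    /* keep later objects aligned */
    struct chunk *c = r->chunks;
    if (c == NULL || c->cap - c->used < sz) {     /* current chunk is full */
        size_t cap = sz > CHUNK_SIZE ? sz : CHUNK_SIZE;
        c = malloc(sizeof *c + cap);
        if (c == NULL)
            return NULL;
        c->next = r->chunks;
        c->used = 0;
        c->cap = cap;
        r->chunks = c;
    }
    void *p = (unsigned char *)c->data + c->used; /* pointer bump */
    c->used += sz;
    return p;
}

/* There is no per-object free: everything goes at once. */
void regiondelete(struct region *r) {
    while (r->chunks != NULL) {
        struct chunk *next = r->chunks->next;
        free(r->chunks);
        r->chunks = next;
    }
}
```

The two listed risks are visible here: any pointer returned by `regionmalloc` dangles after `regiondelete`, and an object that dies early keeps occupying space until the whole region is deleted.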

Custom Allocators Are Faster…

[Figure: Runtime - Custom Allocator Benchmarks. Normalized runtime (axis 0 to 1.75) of Custom vs. Win32 on 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle; benchmarks grouped into non-regions and regions.]

As good as and sometimes much faster than Win32

Not So Fast…

[Figure: Runtime - Custom Allocator Benchmarks. Normalized runtime (axis 0 to 1.75) of Custom vs. Win32 vs. DLmalloc on 197.parser, boxed-sim, c-breeze, 175.vpr, 176.gcc, apache, lcc, mudlle; benchmarks grouped into non-regions and regions.]

DLmalloc (Linux): as fast or faster for most benchmarks

Are custom allocators a win?

Generally not worth the trouble
– Just use good general-purpose allocator
  • Alternative: reaps (hybrid of regions & heaps)
However…
– Sometimes worth it for specialized apps
  • Especially pool allocation, as in Apache

Problems w/Unsafe Languages

C, C++: pervasive apps, but langs. unsafe
Numerous opportunities for security vulnerabilities, errors
– Double free
– Invalid free
– Uninitialized reads
– Dangling pointers
– Buffer overflows (stack & heap)
Can memory allocator help?

Soundness for Erroneous Progs

Normally: memory errors lead to crashes, but…consider infinite-heap allocator:
– All news fresh; ignore delete
  • No dangling pointers, invalid frees, double frees
– Every object infinitely large
  • No buffer overflows, data overwrites
Transparent to correct program
“Erroneous” programs sound

Probabilistic Memory Safety

Fully-randomized M-heap
– Approximates the infinite heap with a heap M times larger than needed, e.g., M=2
– Increases odds of benign errors
– Probabilistic memory safety
  • i.e., P(no error) ≥ n
– Errors independent across heaps
  • E(users with no error) ≥ n * |users|

DieHard

Key ideas:
– Isolate heap metadata
– Randomize allocation
– Trade space for robustness
– Replication (optional)
Key influence in design of Windows 7’s Fault-Tolerant Heap

Implementation Issues

Conventional, freelist-based heaps
– Hard to randomize, protect from errors
  • Double frees, heap corruption
What about bitmaps? (one bit per word)
– Catastrophic fragmentation!
  • Each small object likely to occupy one page

[Diagram: a few objects (obj) scattered across pages]

Randomized Heap Layout

Bitmap-based, segregated size classes
– Bit represents one object of given size
  • i.e., one bit = 2^(i+3) bytes, etc.
– Prevents fragmentation

[Diagram: metadata bitmaps 00000001 1010, kept separate from the heap]

Randomized Allocation

malloc(8):
– compute size class = ceil(log2 sz) – 3
– randomly probe bitmap for zero-bit (free)
Fast: runtime O(1)
– M=2 means E[# of probes] = 2

[Diagram: metadata bitmaps before allocation 00000001 1010, after allocation 00010001 1010]

Randomized Deallocation

free(ptr):
– Ensure object valid – aligned to right address
– Ensure allocated – bit set
– Resets bit
Prevents invalid frees, double frees

[Diagram: metadata bitmaps before free 00010001 1010, after free 00000001 1010]
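The allocation and deallocation steps can be sketched for a single size class. This is an illustration of the idea rather than DieHard's code: it assumes one region of 64 eight-byte slots, a 64-bit bitmap held outside the heap, M = 2 enforced by capping live objects at half the slots, and `rand()` as the random source. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define OBJ_SIZE  8u                  /* size class 0: 2^(0+3) bytes */
#define NUM_SLOTS 64u                 /* slots in this class's region */
#define MAX_LIVE  (NUM_SLOTS / 2)     /* M = 2: region never more than half full */

static _Alignas(max_align_t) unsigned char heap[NUM_SLOTS * OBJ_SIZE];
static uint64_t bitmap = 0;           /* metadata lives apart from the heap; bit i set = slot i in use */
static unsigned live = 0;

void *rand_malloc(void) {
    if (live >= MAX_LIVE)
        return NULL;
    for (;;) {                        /* each probe hits a free slot with probability >= 1/2 */
        unsigned i = (unsigned)rand() % NUM_SLOTS;
        if (((bitmap >> i) & 1u) == 0) {
            bitmap |= (uint64_t)1 << i;
            live++;
            return heap + (size_t)i * OBJ_SIZE;
        }
    }
}

/* Returns 0 on success, -1 if the request is invalid and was ignored. */
int rand_free(void *p) {
    uintptr_t addr = (uintptr_t)p, base = (uintptr_t)heap;
    if (addr < base || addr >= base + sizeof heap)
        return -1;                    /* invalid free: not from this heap */
    size_t off = (size_t)(addr - base);
    if (off % OBJ_SIZE != 0)
        return -1;                    /* invalid free: not at an object boundary */
    size_t i = off / OBJ_SIZE;
    if (((bitmap >> i) & 1u) == 0)
        return -1;                    /* double free: bit already clear */
    assert(live > 0);
    bitmap &= ~((uint64_t)1 << i);    /* reset bit */
    live--;
    return 0;
}
```

Because the bitmap is the only metadata and it sits outside the heap, a bad `free` cannot corrupt allocator state; it is simply detected and ignored.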

Randomized Heaps & Reliability

Objects randomly spread across heap
Different run = different heap
– Errors across heaps independent

[Diagram: two runs place the same numbered objects at different random slots within the regions for object size = 2^(i+3) and object size = 2^(i+4). My Mozilla: “malignant” overflow. Your Mozilla: “benign” overflow.]

Increasing Reliability

Space Shuttle
– 3 copies of everything (hw & sw)
– Votes on every action
Failure: majority rules

DieHard - Replication

Replication-based fault-tolerance
– Requires randomization! Makes errors independent

[Diagram: input is broadcast to replica1 (seed1), replica2 (seed2), replica3 (seed3), which execute as separate processes; a vote over their results produces the output]
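The voting step can be sketched as a majority check over three output buffers. This is only an illustration of "majority rules", assuming equal-length outputs already collected from the replicas; how DieHard actually gathers and compares replica output is not shown in the slides, and `vote3` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index (0, 1, or 2) of an output that at least one other
   replica agrees with, or -1 if all three outputs differ. */
int vote3(const char *out0, const char *out1, const char *out2, size_t len) {
    assert(out0 != NULL && out1 != NULL && out2 != NULL);
    if (memcmp(out0, out1, len) == 0 || memcmp(out0, out2, len) == 0)
        return 0;                 /* replica 0 is in the majority */
    if (memcmp(out1, out2, len) == 0)
        return 1;                 /* replicas 1 and 2 outvote replica 0 */
    return -1;                    /* no majority */
}
```

Voting only helps if replicas fail differently, which is why each replica gets its own seed: with identical heaps, one memory error would corrupt all three outputs the same way.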

DieHard Results

Empirical results
– Runtime overhead
– Error avoidance
  • Injected faults & actual applications
Analytical results (if time, pictures!)
– Buffer overflows
– Uninitialized reads
– Dangling pointer errors (the best)

Analytical Results: Buffer Overflows

Model overflow as random write of live data
Heap half full (max occupancy)

Analytical Results: Overflows

Replicas: Increase odds of avoiding overflow in at least one replica
P(Overflow in all replicas) = (½)^3 = 1/8
P(No overflow in ≥ 1 replica) = 1 − (½)^3 = 7/8
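The arithmetic generalizes to k replicas under the slide's assumptions: each replica is hit independently, with probability ½ when the heap is half full. A small sketch, with `p_some_replica_safe` as a hypothetical helper name:

```c
#include <assert.h>

/* Probability that at least one of k independent replicas avoids the
   overflow, given the per-replica probability of being hit. */
double p_some_replica_safe(double p_overflow, unsigned k) {
    assert(p_overflow >= 0.0 && p_overflow <= 1.0);
    double all_hit = 1.0;
    for (unsigned i = 0; i < k; i++)
        all_hit *= p_overflow;    /* independence: probabilities multiply */
    return 1.0 - all_hit;
}
```

With `p_overflow = 0.5` and `k = 3` this reproduces the 7/8 above; each extra replica halves the remaining chance that every copy is corrupted.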


Empirical Results: Runtime

Analytical Results: Buffer Overflows

Symbols:
– F = free space
– H = heap size
– N = # objects worth of overflow
– k = replicas
Overflow one object

Error Avoidance

Injected faults:
– Dangling pointers (@50%, 10 allocations)
  • glibc: crashes; DieHard: 9/10 correct
– Overflows (@1%, 4 bytes over)
  • glibc: crashes 9/10, inf loop; DieHard: 10/10 correct
Real faults:
– Avoids Squid web cache overflow
  • Crashes Boehm-Demers-Weiser (BDW) Collector & glibc
– Avoids dangling pointer error in Mozilla
  • DoS in glibc & Windows


The End


Backup Slides

Lea Allocator (Dlmalloc 2.7.0)

Mature general-purpose allocator
Optimized for common allocation patterns
– Per-size quicklists ≈ per-class allocation
Deferred coalescing
– combining adjacent free objects
– Highly-optimized fastpath
Space-efficient
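Coalescing itself is simple once free blocks are known by address. The sketch below merges adjacent entries in an address-sorted array; it illustrates the operation only and is not how dlmalloc stores or finds its free blocks. `struct free_block` and `coalesce` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct free_block {
    size_t start;   /* offset of the block within the heap */
    size_t size;    /* length in bytes */
};

/* Merge touching blocks in place. `b` must be sorted by start offset.
   Returns the number of blocks left after merging. */
size_t coalesce(struct free_block *b, size_t n) {
    assert(b != NULL || n == 0);
    if (n == 0)
        return 0;
    size_t out = 0;
    for (size_t i = 1; i < n; i++) {
        if (b[out].start + b[out].size == b[i].start)
            b[out].size += b[i].size;   /* adjacent: grow the current block */
        else
            b[++out] = b[i];            /* gap: start a new block */
    }
    return out + 1;
}
```

Deferring this work means a freed object can be handed straight back from a quicklist on the next request of the same size, and merging happens only when a larger block is actually needed.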
