
Memory management (part 2)

Virtual memory

15/11/2010, TU/e Computer Science, System Architecture and Networking

Dr. Tanir Ozcelebi, by courtesy of Igor Radovanović & Rudolf Mak

Agenda

• Recap memory management in early systems

• Principles of virtual memory

• Paging

• Segmentation

• Paging and Segmentation




Memory management: requirements

IDEALLY, memory needs to be

• Simple to use

• Private (isolation)

• Permanent (non-volatile)

• Fast (zero-time) access




• Huge (unlimited) capacity

• Cheap (cost-effective)


Memory Hierarchy

Primary (executable) memory:

• CPU registers

• L1 cache memory

• L2 cache memory

• “Main” memory

Secondary memory:

• Rotating magnetic memory

• Optical memory

• Sequentially accessed memory

Going down the hierarchy gives larger storage; going up gives faster access.

Early systems (properties)

• Every active process resides in its entirety in main memory (MM)

– A large program cannot execute (except with an overlay structure)

• Overlay: structure a program into independent parts (e.g. function calls) that can be overlaid in memory

[Figure: overlay structure of a program with parts A, B, C, D and E]

• Every active process is allocated a contiguous part of main memory. Three partitioning schemes:

– fixed partitions, dynamic partitions, relocatable dynamic partitions

Early systems (partitioning schemes)

• Fixed partitions

– limits the number and maximal size of the active processes

– internal fragmentation

• Dynamic partitions

– external fragmentation

• Relocatable dynamic partitions

– requires dynamic binding and relocatable load modules

– no fragmentation, but expensive compaction (or swapping)

Summary

Simple          Yes     With the exception of overlays
Private         Yes/No  Isolation can be provided with dynamic binding. No sharing
Permanent       No      Unless the programmer enforces this during execution
Fast            No      Process-based compaction and swapping are expensive
Huge            No      Process size cannot exceed main memory size
Cost-effective  Yes     Hardware hierarchy

Agenda



• Paging

• Segmentation

• Segmentation with paging





VM: abstraction

• Q: What needs to be done such that

– programs do not have to be stored contiguously, and

– not the entire program, but only parts are stored in main memory

Approach:

• Split the memory into segments and try to fit parts of the program into those segments




• Terminology:

– If segments are of different size: segmentation

– If segments are of the same size: paging

• Physical memory blocks: (page) frames

• Logical memory blocks: pages

• Paging and segmentation combined

Paging memory allocation

• Non-contiguous memory allocation

– Processes divided into pages of equal size

– Works well if page size is equal to page frame size

• and the disk’s section size (sectors)

• Advantages:




– An empty page frame is always usable

– The compaction scheme is not required

• No external and almost no internal fragmentation

• Disadvantage:

– Mechanism needed to keep track of page locations

Non-contiguous allocation -an example-

Process 1 (size = 350 lines) is divided into pages of 100 lines:

Page 0   1st 100 lines
Page 1   2nd 100 lines
Page 2   3rd 100 lines
Page 3   remaining 50 lines, followed by wasted space (internal fragmentation)

Main memory:

Page frame #   Contents
0              Operating system
5              Process 1 – Page 2
8              Process 1 – Page 0
10             Process 1 – Page 1
11             Process 1 – Page 3

(Page frames are numbered 0 to 12; the other frames carry no label in the figure.)

• The number of free page frames left = 6

• Any process of more than 600 lines has to wait until Proc1 ends its execution

• Any process of more than 1000 lines cannot fit into memory at all!

• Disadvantage:

– The entire process must be stored in memory during its execution
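The arithmetic behind this example can be checked with a small sketch. The function names are illustrative only and the page size of 100 lines is taken from the slide; nothing else is assumed.

```python
def pages_needed(process_size, page_size):
    """Number of pages required to hold the whole process (ceiling division)."""
    return -(-process_size // page_size)


def internal_fragmentation(process_size, page_size):
    """Unused space at the end of the last page of the process."""
    return pages_needed(process_size, page_size) * page_size - process_size


# Proc1 from the slide: 350 lines, pages of 100 lines
print(pages_needed(350, 100))            # 4
print(internal_fragmentation(350, 100))  # 50
```

Only the last page of a process can be partly empty, which is why paging has almost no internal fragmentation.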

VM: Demand paging

• Made virtual memory widely available

• The first allocation scheme that removed the restriction of having the entire job stored in memory

– Bring a page into memory only when it is needed

– Gives appearance of an almost infinite physical memory

• Takes advantage of the fact that programs are written sequentially, so not all pages are needed at once. For example:

– User-written error handling modules

– Mutually exclusive modules

– Certain program options are either mutually exclusive or not always accessible

– Tables are assigned a fixed amount of address space even though only a fraction of the table is actually used

VM: implementation issues

• Address translation

– must be done at run time

– HW support (MMU, TLBs) to lower access time & to enforce isolation

• Placement strategies

– simple for pages; just any free frame

– for segments, strategies are similar to those for dynamic partitions

• Replacement strategies

– decide which page(s) or segment(s) must be swapped out in case there is not enough free space

• Load control policies that determine

– how many pages of a process are resident

– when pages are loaded into memory (demand paging, prepaging)

• Sharing

– possible with both paging and segmentation

Agenda


• Paging

– address translation

• frame table, page table, TLBs

– replacement strategies





– load policies

• Segmentation


Frame table vs Page table

Page table (its entries match those of process 1 in the frame table):

Process page number   Page frame number
0                     2
1                     7
2                     5
3                     3

Frame table:

Page frame number   Process ID   Process page number
0                   3            9
1                   2            11
2                   1            0
3                   1            3
4                   3            8
5                   1            2
6                   3            12
7                   1            1

Frame tables

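[Figure: the frame table FT has one entry (pid, page) per frame, for frames 0 to F-1. The virtual address (id, p, w) is matched against the entries; the matching frame number f combined with the offset w gives the physical address.]

address_map (id, p, w) {
    pa = UNDEFINED;
    for (f = 0; f < F; ++f) {
        /* frame f holds page p of process id */
        if (FT[f].pid == id && FT[f].page == p)
            pa = (f+w);   /* f+w: frame number f combined with offset w */
    }
    return pa;
}

Note that the frame table is an associative memory in which searching is done in parallel on keys of the form (id, p)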
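A minimal software sketch of this lookup follows. It reuses the frame table contents shown on the "Frame table vs Page table" slide. The frame size of 100 words and the reading of f+w as f * FRAME_SIZE + w are assumptions made for illustration; real hardware searches all entries in parallel rather than in a loop.

```python
FRAME_SIZE = 100  # assumed frame size in words (not given in the slides)

# Frame table indexed by frame number; each entry is (process id, page number)
FT = [(3, 9), (2, 11), (1, 0), (1, 3), (3, 8), (1, 2), (3, 12), (1, 1)]


def address_map(pid, p, w):
    """Return the physical address of offset w in page p of process pid,
    or None if that page is not resident in any frame."""
    for f, (owner, page) in enumerate(FT):
        if owner == pid and page == p:
            return f * FRAME_SIZE + w
    return None


print(address_map(1, 0, 25))  # page 0 of process 1 is in frame 2 -> 225
print(address_map(2, 0, 0))   # not resident -> None
```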

Page tables

[Figure: register PTR points to the page table; entry p of the page table holds frame number f; frame f together with offset w locates the referenced word of page p in memory.]

address_map (p, w) {
    pa = *(PTR+p)+w;
    return pa;
}

For fast access a hardware register PTR is used. Its content is stored in the process table

Both reading and writing require two memory accesses (one for the page table entry, one for the data itself)
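The same translation can be sketched in Python, where a list lookup plays the role of dereferencing PTR+p. The page table contents are those of the "Frame table vs Page table" slide; the frame size of 100 words is an assumption for illustration.

```python
FRAME_SIZE = 100  # assumed frame size in words (not given in the slides)

# Page table of one process: index = page number, value = frame number
page_table = [2, 7, 5, 3]


def address_map(p, w):
    """Translate (page p, offset w) into a physical address."""
    frame = page_table[p]          # first memory access: read the page table entry
    return frame * FRAME_SIZE + w  # the second access uses this address


print(address_map(1, 40))  # page 1 is in frame 7 -> 740
```

Unlike the frame table, no search is needed: the page number indexes the table directly, at the cost of one table per process.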

Process table

Process Table (a)            Process Table (b)            Process Table (c)
Process size  PMT location   Process size  PMT location   Process size  PMT location
400           3096           400           3096           400           3096
200           3100           (empty)       (empty)        700           3100
500           3150           500           3150           500           3150

(a) PT has 3 entries initially; one for each process

(b) second process ends, entry in table is released and replaced by

(c) information about next process that is processed

Translation look-aside buffers

Problems

• Page tables can be huge

• Paging with page tables requires two memory accesses per reference

Approach

• Use a TLB, i.e., an associative memory implemented in hardware

– a cache that holds the page-to-frame translations of the most recently used pages

– does not contain actual data or instructions

– most common replacement scheme: least recently used (LRU)

• If the page cannot be found in the TLB, the page table is used

• If the page still cannot be found, a page fault is generated
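The three-step lookup order (TLB, then page table, then page fault) can be sketched as follows. This is a software model under stated assumptions: the class and function names are hypothetical, the page table is a plain dictionary containing only resident pages, and the TLB uses LRU replacement as mentioned above.

```python
from collections import OrderedDict


class PageFault(Exception):
    """Raised when a page is in neither the TLB nor the page table."""


class TLB:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # page -> frame, least recently used first

    def lookup(self, page):
        if page not in self.entries:
            return None
        self.entries.move_to_end(page)  # mark as most recently used
        return self.entries[page]

    def insert(self, page, frame):
        if page not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[page] = frame
        self.entries.move_to_end(page)


def translate(page, tlb, page_table):
    """Return (frame, how) or raise PageFault."""
    frame = tlb.lookup(page)
    if frame is not None:
        return frame, "TLB hit"
    if page in page_table:  # page_table holds resident pages only
        frame = page_table[page]
        tlb.insert(page, frame)
        return frame, "TLB miss"
    raise PageFault(page)
```

On a TLB hit no page table access is needed, which is how the TLB removes the extra memory access per reference.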

Page faults

• Page faults are handled by the OS (not the hardware)

• Page fault handler determines whether there are empty page frames in memory

– If not, it must decide which page to swap out

– This depends on predefined policy for page removal



• When a page is swapped out, the following tables must be updated:

– Page tables of two tasks (1 in 1 out)

– Table to locate free frames

• A memory reference may create several page faults

• Problem with swapping: thrashing

Agenda



• Paging








– load policies

• Segmentation


Replacement strategies

• Global strategies

– fixed number of pages shared by all processes

– evicted page need not be owned by the process that needs extra memory

• Local strategies

– each process has a set of pages called the working set

– when a process runs out of memory a page from its working set is evicted

• Comparison of replacement strategies is done using reference strings

– a reference string is an execution trace of a program in which only memory references are recorded, and in which only the page number of the referenced location is mentioned

– goodness criteria

• the number of generated page faults

• the total number of pages loaded at a page fault

• these two are equal under pure demand paging

Global replacement strategies

• Min replacement

– select the page that will not be used for the longest time in the future; this gives the minimal number of page faults

• Random replacement

– select a random page

• FIFO replacement

– select the page that has been resident for the longest time

• LRU replacement

– select the page that is least recently used

• Clock replacement (second chance, third chance)

– circular list of all resident pages equipped with a use-bit u

– search clockwise for u=0, while setting the use-bits to zero
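The clock search can be sketched as below. This is a minimal second-chance variant under assumptions: the function name is hypothetical, the circular list is modelled as a Python list with a hand index, and use-bits are set elsewhere whenever a page is referenced.

```python
def clock_select(use_bits, hand):
    """Search clockwise from position hand for a page with use-bit 0,
    clearing use-bits on the way. Returns (victim_index, new_hand)."""
    n = len(use_bits)
    while True:
        if use_bits[hand] == 0:
            return hand, (hand + 1) % n
        use_bits[hand] = 0  # give this page a second chance
        hand = (hand + 1) % n


pages = ["A", "B", "C", "D"]
use_bits = [1, 1, 0, 1]
victim, hand = clock_select(use_bits, 0)
print(pages[victim], hand, use_bits)  # C 3 [0, 0, 0, 1]
```

The loop always terminates: after at most one full turn every use-bit has been cleared, so the page under the hand becomes the victim.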

MIN policy (looks to the future)

Time:              1   2   3   4   5   6   7   8   9   10  11
Page requested:    A   B   A   C   A   B   D   B   A   C   D
Page frame 1:      A   A   A   A   A   A   D   D   D   D   D
Page frame 2:      -   B   B   C   C   B   B   B   A   C   C
Interrupt:         *   *       *       *   *       *   *

(- = empty frame, * = page interrupt)

• How each requested page is swapped into 2 available page frames using MIN. When the program is ready to be processed, all 4 pages are on secondary storage. Throughout the program, 11 page requests are issued. When the program calls a page that isn’t already in memory, a page interrupt is issued (shown by *).

7 page interrupts
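A short simulation can confirm the count of 7 page interrupts for the request sequence A B A C A B D B A C D with 2 page frames. The function name is illustrative; MIN needs the whole future reference string, so it serves as a yardstick rather than a practical policy.

```python
def min_faults(refs, num_frames):
    """Count page faults under MIN: evict the resident page whose next use
    lies furthest in the future (or that is never used again)."""
    frames = []
    faults = 0
    for t, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)
            continue
        future = refs[t + 1:]

        def next_use(q):
            return future.index(q) if q in future else float("inf")

        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults


print(min_faults(list("ABACABDBACD"), 2))  # 7
```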

FIFO policy

Time:              1   2   3   4   5   6   7   8   9   10  11
Page requested:    A   B   A   C   A   B   D   B   A   C   D
Page frame 1:      A   A   A   C   C   B   B   B   A   A   D
Page frame 2:      -   B   B   B   A   A   D   D   D   C   C
Interrupt:         *   *       *   *   *   *       *   *   *

(- = empty frame, * = page interrupt)

• How each requested page is swapped into 2 available page frames using FIFO.

9 page interrupts
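The FIFO count can be reproduced with a queue of resident pages, sketched here for the same request sequence A B A C A B D B A C D and 2 page frames. The function name is illustrative.

```python
from collections import deque


def fifo_faults(refs, num_frames):
    """Count page faults under FIFO: evict the page that was loaded first."""
    frames = deque()  # oldest resident page at the left
    faults = 0
    for page in refs:
        if page in frames:
            continue  # a hit does not change the load order
        faults += 1
        if len(frames) == num_frames:
            frames.popleft()
        frames.append(page)
    return faults


print(fifo_faults(list("ABACABDBACD"), 2))  # 9
```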

LRU policy

Time:              1   2   3   4   5   6   7   8   9   10  11
Page requested:    A   B   A   C   A   B   D   B   A   C   D
Page frame 1:      A   A   A   A   A   A   D   D   A   A   D
Page frame 2:      -   B   B   C   C   B   B   B   B   C   C
Interrupt:         *   *       *       *   *       *   *   *

(- = empty frame, * = page interrupt)

• Only 8 page interrupts

• Efficiency slightly better than FIFO

• The most widely used static replacement algorithm
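A sketch of LRU for the same request sequence A B A C A B D B A C D with 2 page frames is given below. The function name is illustrative, and an ordered dictionary stands in for whatever recency bookkeeping real hardware provides.

```python
from collections import OrderedDict


def lru_faults(refs, num_frames):
    """Count page faults under LRU: evict the least recently used page."""
    frames = OrderedDict()  # least recently used page first
    faults = 0
    for page in refs:
        if page in frames:
            frames.move_to_end(page)  # a hit refreshes the page's recency
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.popitem(last=False)
        frames[page] = True
    return faults


print(lru_faults(list("ABACABDBACD"), 2))  # 8
```

The only difference from a FIFO simulation is that a hit moves the page to the most-recent end, which is what saves one interrupt on this sequence.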

Page Table Extensions

Page   Status bit   Referenced bit   Modified bit   Page frame
0      1            1                1              5
1      1            0                0              9
2      1            0                0              7
3      1            1                0              12

(The status, referenced and modified bits are extra fields with respect to the PMT in static paging.)

• Status bit indicates whether page is currently in memory or not

• Referenced bit (use bit) indicates whether page has been referenced recently

– Used by LRU to determine which pages should be swapped out

• Modified bit (dirty bit) indicates whether page contents have been altered

– Used to determine if page must be rewritten to secondary storage when it is swapped out

• There may be more of these bits for other purposes, e.g., locking
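How the modified bit is used at swap-out time can be sketched with the table above. The class and function names are hypothetical, and the sketch only decides whether a write-back is needed; it performs no actual I/O.

```python
from dataclasses import dataclass


@dataclass
class PageTableEntry:
    status: int      # 1 if the page is currently in memory
    referenced: int  # 1 if the page has been referenced recently
    modified: int    # 1 if the page contents have been altered
    frame: int


# Entries taken from the table on this slide
pmt = {
    0: PageTableEntry(1, 1, 1, 5),
    1: PageTableEntry(1, 0, 0, 9),
    2: PageTableEntry(1, 0, 0, 7),
    3: PageTableEntry(1, 1, 0, 12),
}


def swap_out(pmt, page):
    """Mark the page as non-resident; return True if it must first be
    rewritten to secondary storage (that is, if its modified bit is set)."""
    entry = pmt[page]
    must_write_back = entry.modified == 1
    entry.status = 0
    entry.referenced = 0
    entry.modified = 0
    return must_write_back


print(swap_out(pmt, 0))  # True: page 0 was modified
print(swap_out(pmt, 1))  # False: page 1 is unchanged on disk
```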

Local Replacement strategies

• VMIN replacement

– looks to the future

– at each memory reference

• first, if there is a page fault, the requested page is loaded

• next, each resident page that is not referenced during the next τ memory references is removed

• Working set model

– looks at the past

– at each memory reference a working set is determined

– a process can run iff its entire working set is in main memory

– only the pages that belong to the working set reside in main memory

– the working set is given by W(t, τ) = { r_j | t-τ ≤ j ≤ t }, where r_j is the page referenced at time j

– hardware support in the form of aging registers

Working-set model (example)

• A set of active pages large enough to avoid thrashing

• Use a parameter τ to determine a working-set window

Page reference string:

… 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 1 3 2 3 4 4 4 …

With τ = 9, the window ending at t1 gives WS(t1) = {1,2,5,6,7} and the window ending at t2 gives WS(t2) = {3,4}

WSS_i: the working set size of process i

D: the total demand for frames, $D = \sum_i WSS_i$

For $D > m$ thrashing will occur; m is the total number of available frames
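The window computation and the thrashing condition can be sketched as follows. The positions chosen for t1 and t2 (0-based indices 9 and 25 in the visible part of the reference string) are assumptions, since the slide marks them only graphically; the function names are illustrative.

```python
def working_set(refs, t, tau):
    """Pages referenced in the window [t - tau, t] of the reference string."""
    start = max(0, t - tau)
    return set(refs[start:t + 1])


def thrashing_expected(working_set_sizes, m):
    """True if the total demand D (sum of all WSS_i) exceeds the m available frames."""
    return sum(working_set_sizes) > m


refs = [2, 6, 1, 5, 7, 7, 7, 7, 5, 1, 6, 2, 3, 4, 1, 2, 3,
        4, 4, 4, 3, 4, 3, 4, 4, 4, 1, 3, 2, 3, 4, 4, 4]

print(sorted(working_set(refs, 9, 9)))   # [1, 2, 5, 6, 7]
print(sorted(working_set(refs, 25, 9)))  # [3, 4]
print(thrashing_expected([5, 2], 6))     # True: D = 7 > m = 6
```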

Working Set –an example-

Time t             -2  -1  0   1   2   3   4   5   6   7   8   9   10
Reference string   e   d   a   c   c   d   b   c   e   c   e   a   d
Page a                     √   √   √   √   --  --  --  --  --  √   √
Page b                     --  --  --  --  √   √   √   √   --  --  --
Page c                     --  √   √   √   √   √   √   √   √   √   √
Page d                     √   √   √   √   √   √   √   --  --  --  √
Page e                     √   √   --  --  --  --  √   √   √   √   √
IN_t                           c*          b*      e*          a*  d*
OUT_t                              e       a           d   b

(√ = page is in the working set, * = page fault)

Working set: from t-τ to t; i.e. [t-τ, t]; in this case τ=3
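The table can be regenerated by sliding a window of τ+1 references over the reference string e d a c c d b c e c e a d, which starts at time -2. This is a sketch; the variable and function names are illustrative.

```python
REFS = list("edaccdbcecead")  # references at times -2, -1, 0, ..., 10
FIRST_T = -2                  # time of the first recorded reference
TAU = 3


def working_set(t):
    """Pages referenced during [t - TAU, t]."""
    i = t - FIRST_T  # index of time t in REFS
    return set(REFS[max(0, i - TAU):i + 1])


def trace():
    """Return the lists of (t, page) pairs entering and leaving the working set."""
    entered, left = [], []
    prev = working_set(0)
    for t in range(1, 11):
        cur = working_set(t)
        entered += [(t, p) for p in sorted(cur - prev)]
        left += [(t, p) for p in sorted(prev - cur)]
        prev = cur
    return entered, left


entered, left = trace()
print(entered)  # [(1, 'c'), (4, 'b'), (6, 'e'), (9, 'a'), (10, 'd')]
print(left)     # [(2, 'e'), (4, 'a'), (7, 'd'), (8, 'b')]
```

Each page that enters the working set corresponds to a page fault, so this reference string causes 5 faults for t = 1..10.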

Agenda



• Paging








– load policies

• Segmentation


Load control

• Paging strategy

– which and how many pages to load

• static paging: upon activation all pages of the process are loaded

– also called simple paging

• dynamic paging: upon a page fault one or more pages are loaded

– pure demand paging loads a single page

– demand paging with prepaging loads several pages upon (re)activation




• Degree of multiprogramming

– automatic, when using local replacement (working set)

– when global replacement is used, a separate policy is needed to determine the number of pages per process

• avoid thrashing

• service time of a page fault vs average time between page faults

Load control (cont’d)

• which process is to be suspended

– difficult and also dependent on scheduling

– lowest priority process

• follows the scheduling policy

– faulting process

• is blocked while waiting for its page




– last process activated

• considered to be the least important

– smallest process

• least expensive to swap

– largest process

• frees the largest number of frames

Thrashing

• A process may spend more time paging than executing

• When does it happen?

– Pages in active use are replaced by other pages in active use

– Happens with increased degree of multiprogramming




• Solution:

– Local replacement algorithms (not completely)

– Provide the process with as many frames as it needs

• Use locality model (working set)

Thrashing-an example-

/* Page 0 */
for (j = 1; j < 100; ++j)
{
    k = j * j;
/* Page 1 */
    m = a * j;
    printf("\n%d %d %d", j, k, m);
}
printf("\n");

The loop body spans both pages: swapping between these two pages

Agenda



• Paging

• Segmentation






Segmented memory allocation

• Based on the common practice by programmers of structuring their programs in modules (logical groupings of code)

– A segment is a logical unit such as: main program, subroutine, procedure, function, local variables, global variables, common block, stack, symbol table, or an array

• Main memory is not divided into page frames because the size of each segment is different

– Memory is allocated dynamically

• An address specifies a segment name and an offset

– This is in contrast to paging, where the user specifies only a single address

Segmentation

• Segments are useful for

– mirroring the logical structure of an application

– placing multiple dynamically changing entities in a single address space

• Each process has three obvious candidates

– code, data, and stack

– other candidates are




• stacks of the individual threads of a process

• memory-mapped files

• Virtual addresses consist of segment number s and offset w

• Segments can be either contiguous (pure segmentation) or paged

address_map (s, w) {
    pa = *(STR+s)+w;
    return pa;
}
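For pure (contiguous) segmentation the mapping can be sketched as below. The segment table contents are hypothetical. The limit check is an addition not shown in the slide's pseudocode; it is included because segments have different sizes, so an offset must be checked against the segment length to preserve isolation.

```python
# Hypothetical segment table: index = segment number, entry = (base, limit)
segment_table = [(1000, 400), (5000, 200), (3000, 500)]


def address_map(s, w):
    """Translate (segment s, offset w) into a physical address."""
    base, limit = segment_table[s]  # the list lookup plays the role of *(STR+s)
    if not 0 <= w < limit:
        raise IndexError(f"offset {w} outside segment {s} of length {limit}")
    return base + w


print(address_map(1, 50))  # 5050
```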

Agenda



• Paging

• Segmentation







• segment table, TLBs

– (re)placement strategies

– load policies

Segment + page tables

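[Figure: the virtual address (s, p, w) is translated in two steps. Register STR points to the segment table; entry s of the segment table points to the page table of segment s; entry p of that page table holds frame number f; offset w selects the word within frame f.]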

Segment + page tables (cont’d)

• Each memory reference requires three accesses to main memory

• Use a TLB to reduce the number of memory accesses per reference to one (on a TLB hit)

address_map (s, p, w) {
    pa = *(*(STR+s)+p)+w;
    return pa;
}

• Balance between s and p

– older systems: s large and p small

• hence the segment tables are large and can themselves be paged

• this results in yet another memory access

– multimedia applications, however, favor s small and p large

• now the page tables are large and must be paged
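The two-level lookup for paged segments can be sketched as follows. The table contents and the frame size of 100 words are hypothetical; nested lists stand in for the segment table and the per-segment page tables.

```python
FRAME_SIZE = 100  # assumed frame size in words (not given in the slides)

# Hypothetical tables: segment_table[s] is the page table of segment s,
# and each page table maps page number -> frame number
segment_table = [
    [2, 7],     # segment 0 has two pages
    [5, 3, 9],  # segment 1 has three pages
]


def address_map(s, p, w):
    """Translate (segment s, page p, offset w) into a physical address."""
    page_table = segment_table[s]  # first memory access: segment table entry
    frame = page_table[p]          # second memory access: page table entry
    return frame * FRAME_SIZE + w  # the third access uses this address


print(address_map(1, 2, 30))  # segment 1, page 2 is in frame 9 -> 930
```

The two table reads plus the data access itself account for the three memory accesses per reference mentioned above.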

Advantages of VM

• Works well in a multiprogramming environment

– most programs spend a lot of time waiting

• Process size is no longer restricted to MM size

– (or the free space within main memory)

• Memory is used more efficiently

– Eliminates external fragmentation when used with paging and eliminates internal fragmentation when used with segmentation

• Allows an unlimited amount of multiprogramming

• Allows a program to be loaded multiple times occupying a different memory location each time

• Allows sharing of code and data

• Facilitates dynamic linking of program segments

Disadvantages of VM

• Increased processor hardware costs

• Increased overhead for handling paging interrupts

• Increased software complexity to prevent thrashing




Summary

Simple          Yes       User has a linear address space
Private         Yes       VM also facilitates sharing
Permanent       No        Unless the programmer enforces this during execution
Fast            Moderate  Management overhead for tables and strategies
Huge            Yes       Memory size virtually unlimited
Cost-effective  Yes       Hardware support for VM can be expensive
