Post on 21-Dec-2015

CS333 / Cutler Amortized Analysis

Amortized Analysis

The average cost of a sequence of n operations on a given data structure.

Aggregate Analysis

Accounting Method

Amortized Analysis

• Amortized analysis computes the average time required to perform a sequence of n operations on a data structure

• Often worst-case analysis is not tight, and the amortized cost of an operation is less than its worst-case cost.

• Analogy (making coffee)

Applications of amortized analysis

• Vectors/ tables

• Disjoint sets

• Priority queues

• Heaps, Binomial heaps, Fibonacci heaps

• Hashing

Difference between amortized and average cost

• Average-case analysis relies on probability: it assumes a distribution over the inputs

• Amortized analysis needs no such assumptions

• We bound the average cost per operation over any sequence of n operations, including the worst one

Operations on Data Structures

• A data structure has a set of operations associated with it.

• Example: a stack with push(), pop() and MultiPop(k).

• Often some operations may be slow while others are fast.

• push() and pop() are fast.

• MultiPop(k) may be slow.

• Sometimes the time of a single operation can vary.

Methods

• Aggregate analysis: the total amount of time needed for the n operations is computed and divided by n

• Accounting: operations are assigned an amortized cost. Objects of the data structure are assigned a credit

• Potential: the prepaid work (money in the “bank”) is represented as “potential” energy that can be released to pay for future operations

Aggregate analysis

• n operations take T(n) time

• Amortized cost of an operation is T(n)/n

Stack - aggregate analysis

• A stack with operations Push, Pop and Multipop.

Multipop(S, k)
    while not empty(S) and k > 0 do
        Pop(S);
        k := k - 1
    end while
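The Multipop pseudocode above could be sketched in Python as follows. This is only an illustration: the class name `CountingStack` and the `moves` counter are assumptions added here to count how many elements are moved, and they are not part of the slides.

```python
class CountingStack:
    """Stack that counts element moves (one per push, one per pop)."""

    def __init__(self):
        self.items = []
        self.moves = 0  # elements pushed plus elements popped so far

    def push(self, x):
        self.items.append(x)
        self.moves += 1

    def pop(self):
        self.moves += 1
        return self.items.pop()

    def multipop(self, k):
        # Mirrors the pseudocode: pop until the stack is empty or k reaches 0.
        while self.items and k > 0:
            self.pop()
            k -= 1


s = CountingStack()
for x in "abc":
    s.push(x)
s.multipop(3)
print(s.moves)  # 3 pushes + 3 pops = 6 moves over 4 operations
```

A single Multipop can be expensive, but it can only remove elements that earlier Push operations paid to insert.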

Stack - aggregate analysis

• Push and Pop are O(1) (to move 1 data element)

• Multipop is O(min(s, k)), where s is the size of the stack and k is the number of elements to pop.

• Assume a sequence of n Push, Pop and Multipop operations

Stack - aggregate analysis

• Each object can be popped only once for each time it is pushed

• So the total number of times Pop can be called (directly or from Multipop) is bounded by the number of Pushes, which is at most n.

Stack - aggregate analysis

• A sequence of n Push, Pop and Multipop operations therefore takes O(n) time, and the amortized cost of each operation is O(n)/n = O(1)

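Stack Example: Moves/Operation <= 2

Sequence 1: 6 operations, 6 moves

Operation    Stack
Start        (empty)
push a       a
push b       ab
push c       abc
pop c        ab
pop b        a
pop a        (empty)

Sequence 2: 4 operations, 6 moves

Operation    Stack
Start        (empty)
push a       a
push b       ab
push c       abc
Multipop(3)  (empty)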

Accounting Method

• Charge each operation an (invented) amortized cost.

– Often different from the actual run time cost. Some operations may have an amortized cost larger than their run time, others may have less.

– Unlike businesses we do not want to make a profit

– We want to cover the actual cost

• The amount charged but not used in performing an operation is stored with objects of the data structure

Accounting method

• Later operations can use stored amount to pay for their actual cost

• Credit balance must not go negative (always enough to pay for performance of future operations)

Stack - amortized analysis

• We assign the amortized costs:

$2 for Push

$0 for both Pop and Multipop

• For a sequence of n Push, Pop and Multipop operations the total amortized cost is at most 2n, which is O(n)

Stack - amortized analysis

• Each time we do a Push we pay $1 for the actual cost of the Push and the element has a credit of $1.

• Each time an element is popped we take the $1 credit to pay for it

• Thus the balance is always nonnegative
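The credit argument can be checked with a small simulation. The sketch below is an assumption-laden illustration, not part of the slides: the function name `simulate`, the random mix of operations, and the choice to treat a Pop on an empty stack as a free no-op are all made up here. It charges $2 per Push and $0 per Pop or Multipop, and tracks the credit left after paying the actual cost.

```python
import random

PUSH_CHARGE = 2  # amortized charge per Push, as on the slide
POP_CHARGE = 0   # amortized charge per Pop or Multipop


def simulate(num_ops, seed=0):
    """Run random stack operations; return the lowest credit balance seen."""
    rng = random.Random(seed)
    stack = []
    balance = 0  # total charged so far minus actual cost paid so far
    lowest = 0
    for _ in range(num_ops):
        op = rng.choice(["push", "pop", "multipop"])
        if op == "push":
            stack.append(1)
            balance += PUSH_CHARGE - 1      # actual cost $1, $1 stays as credit
        else:
            k = 1 if op == "pop" else rng.randint(1, 5)
            popped = min(k, len(stack))     # popping an empty stack costs nothing
            del stack[len(stack) - popped:]
            balance += POP_CHARGE - popped  # each popped element spends its $1
        assert balance == len(stack)        # one stored dollar per element
        lowest = min(lowest, balance)
    return lowest


print(simulate(1000))  # 0: the balance never drops below zero
```

The inner assertion states the invariant behind the argument: the credit in the bank always equals the number of elements on the stack.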

k bit binary counter

Increment(A)
    i = 0
    while i < length[A] and A[i] = 1
        A[i] = 0
        i++
    if i < length[A]
        A[i] = 1

• Initially the counter contains 0

• Eventually it becomes 2^k - 1

• Next it is reset to 0

Figure: the array A, with bit positions k-1, …, 1, 0 from left to right

Aggregate analysis

• Count number of times a bit is flipped.

• Let the number of increments be n = 2^k (if n < 2^k the analysis is similar)

• A[0] is flipped n times

• A[1] is flipped n/2^1 times

• …

• A[k-1] is flipped n/2^(k-1) times

3-bit counter (k = 3):

Decimal  Bit 2  Bit 1  Bit 0
0        0      0      0
1        0      0      1
2        0      1      0
3        0      1      1
4        1      0      0
5        1      0      1
6        1      1      0
7        1      1      1
Flips    2      4      8

Total number of flips:

\sum_{i=0}^{k-1} n/2^i < n \sum_{i=0}^{\infty} 1/2^i = 2n
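The flip count can be reproduced by running the Increment procedure and counting flips. This is a minimal sketch: the helper `total_flips` and the returned flip count are additions made here for illustration, with A[0] taken as the low-order bit.

```python
def increment(A):
    """Increment the counter A in place (A[0] is the low-order bit).

    Returns the number of bits flipped by this increment.
    """
    flips = 0
    i = 0
    while i < len(A) and A[i] == 1:
        A[i] = 0
        flips += 1
        i += 1
    if i < len(A):
        A[i] = 1
        flips += 1
    return flips


def total_flips(k, n):
    """Total bit flips for n increments of a k-bit counter starting at 0."""
    A = [0] * k
    return sum(increment(A) for _ in range(n))


print(total_flips(3, 8))  # 8 + 4 + 2 = 14, the sum of the Flips row
```

For k = 3 and n = 8 the total of 14 is below 2n = 16, as the bound predicts.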

Accounting method

• Charge an amortized cost of $2 to set a bit to 1

• When a bit is set to 1, pay $1 for the actual cost and store $1 with the bit

• Note: at all times a bit with value 1 has $1

• When a bit is reset to 0, use its $1 to pay for the actual cost

• $2 per Increment operation, since each Increment sets at most one bit to 1

Accounting method

• Let the value stored in the counter be:

Bits:    1  1  0  0  1  0  0  1  1  1
Credit: $1 $1 $0 $0 $1 $0 $0 $1 $1 $1

• After an increment and a payment of $2 = $1 + $1:

Bits:    1  1  0  0  1  0  1  0  0  0
Credit: $1 $1 $0 $0 $1 $0 $1 $0 $0 $0
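The example can be replayed in code to confirm that the stored credit matches the number of 1 bits after the increment. The function name `increment_with_credit` and the bookkeeping variables are hypothetical names chosen for this sketch.

```python
def increment_with_credit(bits):
    """Increment a counter given as a list with bits[0] the low-order bit.

    Returns (new_bits, actual_cost), where actual_cost is the number of flips.
    """
    bits = list(bits)
    cost = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0  # paid for by the $1 stored with this bit
        cost += 1
        i += 1
    if i < len(bits):
        bits[i] = 1  # $1 pays for the flip, $1 is stored with the bit
        cost += 1
    return bits, cost


# The slide's value, written high-order bit first: 1100100111.
before = [int(b) for b in reversed("1100100111")]
after, cost = increment_with_credit(before)
bank_before = sum(before)            # $1 stored with every 1 bit
bank_after = bank_before + 2 - cost  # charge $2, pay the actual cost
print("".join(str(b) for b in reversed(after)), cost, bank_after)
# prints: 1100101000 4 4
```

The increment flips four bits, yet the $2 charge plus the three stored dollars covers it, and the remaining $4 again sits on the four 1 bits.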

Dynamic table (object table, hash table, vector, etc)

• The table is dynamic

• We can’t predict its maximum size

• We would like to avoid allocating a lot of unused space (keep a reasonable load factor)

• May not be able to avoid table overflow.

• Overflow should not cause run time failure

java.util.Vector

• A built in class which is a “growable” array.

• The user can set:

– initialCapacity: the initial capacity of the vector.

– capacityIncrement: the amount by which the capacity is increased when the vector overflows.

• By default capacityIncrement is 0, in which case the capacity doubles each time the vector overflows

Dynamic table

• Idea: allocate more memory as needed.

• Double the size of the table after each overflow.

• After each doubling, the elements must be copied from the old table to the new one

• We assume for now that the only operation is insert, and calculate its amortized cost

• Table sizes: 1, 2, 4, 8, …, 2^k

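Aggregate method

Op.  Size before  Size after  Cost
1    0            1           1
2    1            2           1+1
3    2            4           2+1
4    4            4           1
5    4            8           4+1
6    8            8           1
7    8            8           1
8    8            8           1
9    8            16          8+1

Total cost / number of operations = 24/9

Figure: the table after each overflow; the elements 1, 2, 3, ... are copied into a new table of twice the size.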

Aggregate Analysis

• Let the number of inserts n satisfy 2^k < n ≤ 2^(k+1)

• (Note: 2*2^k = 2^(k+1) < 2n)

• At this point the size of the table is 2^(k+1)

• After n inserts:

– The total cost for copy operations only is 1 + 2 + 4 + . . . + 2^k = 2^(k+1) - 1 < 2n

– The total cost for n inserts (without copy) is n.

• Total < 3n

• Therefore the amortized time is O(1).
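The 3n bound can be checked by simulating the doubling table and adding up the costs. This is a sketch under the cost model used above (1 per insert, 1 per copied element); the function name `doubling_insert_cost` is an assumption of this example.

```python
def doubling_insert_cost(n):
    """Total cost of n inserts into a table that doubles on overflow.

    Each insert costs 1 and each element copied during a doubling costs 1.
    """
    size = 0   # current capacity of the table
    count = 0  # number of elements stored
    total = 0
    for _ in range(n):
        if count == size:
            total += count  # copy every element into the new table
            size = 1 if size == 0 else 2 * size
        total += 1          # the insert itself
        count += 1
    return total


print(doubling_insert_cost(9))  # 24, the total cost of the 9 operations
```

Running it for other values of n gives totals that stay below 3n, so the cost per insert is bounded by a constant.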

Accounting analysis

• Charge each insert $3.

• When the table is not full, use $1 for the cost of the insert, and store $2 with the element

• When the table doubles from m to 2m:

– m/2 elements that never moved before have $2 credit,

– m/2 elements which already moved have $0

– The stored credit, m/2 * $2 = $m, pays for copying all m elements

– After the copy all m elements have $0 credit

Accounting analysis

Size = 4 (full): $0 $0 $2 $2 (2 copied elements with $0, 2 new elements with $2)

Size = 8 (after the copy): $0 $0 $0 $0 (4 copied elements with $0)

Size = 8 (full): $0 $0 $0 $0 $2 $2 $2 $2 (4 copied elements with $0, 4 new elements with $2)
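The $3 charge can be tested against the actual costs with a running bank balance. The sketch below tracks one aggregate balance rather than per-element credits, which is a simplification made here; the function name `credit_never_negative` and its `charge` parameter are hypothetical.

```python
def credit_never_negative(n, charge=3):
    """Check that the bank balance stays nonnegative over n doubling inserts.

    Each insert is charged `charge` dollars; the actual cost is $1 per
    insert plus $1 per element copied when the table doubles.
    """
    size = 0     # current capacity
    count = 0    # number of elements stored
    balance = 0  # total charged minus total actual cost
    for _ in range(n):
        balance += charge
        if count == size:
            balance -= count  # pay for copying the old table
            size = 1 if size == 0 else 2 * size
        balance -= 1          # pay for the insert itself
        count += 1
        if balance < 0:
            return False
    return True


print(credit_never_negative(10000))            # True
print(credit_never_negative(10000, charge=2))  # False
```

Under this cost model a charge of $2 runs out at the fifth insert, when 4 elements must be copied, while $3 always leaves enough credit.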

java.util.Vector fixed increment

• Let the number of inserts n satisfy

c0 + (m-1)c < n ≤ c0 + mc

• So (n - c0)/c ≤ m < 1 + (n - c0)/c and m = Θ(n)

• At this point the size of the array is c0 + mc

• After the nth insert:

– The total time for copy operations is ?

– The total time for n inserts (without copy) is n.

java.util.Vector fixed increment

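Cost for m vector copies = m*c0 + c(1 + 2 + 3 + … + (m-1)) = m*c0 + c*m(m-1)/2 = Θ(m^2) = Θ(n^2)

Figure: arrays 0, 1, 2, …, m-1, each made of the initial capacity c0 followed by one more block of the capacity increment c than the previous array.

Initial capacity c0

Capacity increment c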

java.util.Vector fixed increment

• After the nth insert:

– The total time for copy operations is Θ(n^2)

– The total time for n inserts (without copy) is n.

• Average insert time is Θ(n)
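The quadratic copy cost of a fixed increment can be seen in a short simulation. This sketch models only the growth policy, not java.util.Vector itself; the function name `fixed_increment_copy_cost` is an assumption, and c0 and c stand for the initial capacity and the capacity increment.

```python
def fixed_increment_copy_cost(n, c0, c):
    """Elements copied during n inserts when capacity grows by c on overflow."""
    capacity = c0
    count = 0
    copies = 0
    for _ in range(n):
        if count == capacity:
            copies += count  # copy everything into the larger array
            capacity += c
        count += 1
    return copies


# With c0 = c = 10 and n = 100 there are m = 9 copies:
# m*c0 + c*m*(m-1)/2 = 90 + 360 = 450.
print(fixed_increment_copy_cost(100, 10, 10))  # 450
```

Doubling n roughly quadruples the copy cost, in contrast with the linear total obtained when the capacity is doubled on each overflow.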