Equivalence Between Priority Queues and Sorting in External Memory

Zhewei Wei
Renmin University of China
MADALGO, Aarhus University

Ke Yi
The Hong Kong University of Science and Technology

Priority Queue

• Maintain a set of keys
• Support insertions, deletions and findmin (deletemin)
• Fundamental data structure
• Used as a subroutine in greedy algorithms
– Dijkstra’s single-source shortest path algorithm
– Prim’s minimum spanning tree algorithm

Sorting to Priority Queue

• A priority queue can do sorting
• Given N unsorted keys
– Insert the keys into the priority queue
– Perform N deletemin operations (find the minimum and delete it)
• If a priority queue supports insertion, deletion and findmin in S(N) time, then the sorting algorithm runs in O(N*S(N)) time.
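The sorting-by-priority-queue argument can be sketched in a few lines. This is only an illustration: Python's `heapq` binary heap stands in for an arbitrary priority queue, and the function name `pq_sort` is made up for this example.

```python
import heapq

def pq_sort(keys):
    """Sort by N insertions followed by N deletemin operations."""
    heap = []
    for key in keys:                      # N insertions
        heapq.heappush(heap, key)
    # N deletemin operations: each pop finds the minimum and deletes it
    return [heapq.heappop(heap) for _ in range(len(heap))]
```

With a binary heap each operation costs O(log N), so this instance runs in O(N log N) time, matching the O(N*S(N)) pattern with S(N) = log N.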

Priority Queue to Sorting

• Thorup [2007]: sorting can do priority queue!
– A sorting algorithm sorts N keys in N*S(N) time in the RAM model
– Use the sorting algorithm as a black box
– A priority queue supports all operations in O(S(N)) time
• O(N loglog N) sorting -> O(loglog N) priority queue
• O() sorting -> O() priority queue

The I/O Model [Aggarwal and Vitter 1988]

[Figure: CPU, memory of size M, and disk of unlimited size; data moves between memory and disk in blocks of size B]

• Complexity: # of block transfers (I/Os)
• CPU computations and memory accesses are free

Cache-Oblivious Model

[Figure: CPU, memory of unknown size, and disk of unlimited size; blocks of unknown size]

• Optimal without knowledge of M and B
• Optimal for all M and B

Sorting in the I/O Model

• Sorting bound: Sort(N) = Θ(N/B * log_{M/B} N) I/Os
• Upper bound: external merge sort
• Lower bound: holds for the comparison model or under the indivisibility assumption (treat keys as atoms)
• Conjecture: the lower bound holds for B not too small, even without the indivisibility assumption
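To make the merge-sort upper bound concrete, here is a rough sketch that counts block transfers for a textbook external merge sort. It is an assumption-laden model, not taken from the slides: the function name `merge_sort_ios` is hypothetical, runs of M keys are formed in the first pass, and each merge pass combines M/B - 1 runs while keeping one memory block for output.

```python
def merge_sort_ios(n, m, b):
    """Count block transfers of a textbook external merge sort.

    n: number of keys, m: memory size in keys, b: block size in keys.
    """
    if m < 3 * b:
        raise ValueError("need m >= 3*b for at least a 2-way merge")
    blocks = -(-n // b)            # ceil(n / b): blocks occupied by the input
    runs = -(-n // m)              # ceil(n / m): sorted runs after run formation
    fan_in = m // b - 1            # one memory block is kept for output
    passes = 1                     # the run-formation pass
    while runs > 1:                # one merge pass shrinks the run count by fan_in
        runs = -(-runs // fan_in)
        passes += 1
    return 2 * blocks * passes     # each pass reads and writes every block
```

The number of passes grows like the logarithm to base M/B of the number of runs, which is where the log_{M/B} factor in the sorting bound comes from.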

Priority Queue in External Memory

• I/O model: O(1/B * log_{M/B} N) amortized cost
– Buffer tree [Arge 1995]
– M/B-ary heaps [Fadel et al. 1999]
– Array heaps [Brodal and Katajainen 1998]
• Tree-based: these do not give any priority queue-to-sorting reduction

Priority Queue in External Memory

• Cache-oblivious priority queue [Arge et al. 2002]
– O(1/B * log_{M/B} N) with the tall cache assumption M > B^2
– Keys are moving around in loglog N levels
• Reduction: given an external sorting algorithm that sorts N keys in N*S(N)/B I/Os, there is an external priority queue that supports all operations in O(S(N)*loglog N/B) amortized I/Os

Our Results

• A sorting algorithm sorts N keys in N*S(N)/B I/Os in the I/O model
• Use the sorting algorithm as a black box
• A priority queue supports all operations in 1/B * Σ_{i≥0} S(B*log^(i)(N/B)) amortized I/Os
– The sum expands to S(N) + S(B*log N) + S(B*loglog N) + …
– O(S(N)/B) for S(N) = Ω(2^{log* N}), or M = Ω(B*log^(c) N)
– Otherwise O(S(N)*log* N/B)
• No new bounds for external priority queues
• External priority queue lower bound -> external sorting lower bound
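The sum over iterated logarithms can be evaluated numerically. The sketch below is illustrative only: the names `layer_sizes` and `amortized_ios` are hypothetical, logarithms are taken base 2 and rounded down, and the sum is cut off once the iterated logarithm reaches 1 (the slides do not state these details).

```python
def layer_sizes(n, b):
    """Return the arguments B*log^(i)(N/B) of the sum for i = 0, 1, 2, ...

    Uses floor(log2) and stops when the iterated logarithm reaches 1;
    both choices are assumptions made for this sketch.
    """
    sizes = []
    x = n // b                     # log^(0)(N/B) = N/B
    while x > 1:
        sizes.append(b * x)
        x = x.bit_length() - 1     # floor(log2(x)) for a positive integer
    return sizes

def amortized_ios(n, b, s):
    """Evaluate 1/B * sum_i S(B*log^(i)(N/B)) for a caller-supplied S."""
    return sum(s(size) for size in layer_sizes(n, b)) / b
```

For example, with B = 64 and N = 64 * 2^16 the arguments are N, 1024, 256 and 128, so only a handful of terms contribute and the first term S(N) dominates for any S that grows with its argument.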

Outline

• How Thorup did it (on a high level)

• How we extend it in external memory (on a high level)

• Open problems

Thorup’s Reduction

• Word RAM model:
– each word consists of w ≥ log N bits
– constant number of registers, each with capacity for one word
• Atomic heap [Han 2004]: supports insertions, deletions, and predecessor queries in a set of O(log^2 N) size in constant time

Thorup’s Reduction – O(S(N)*log N)

[Figure: O(log N) levels holding c, 2c, …, N/4, N/2, N keys]

• Keep min in the head
• Invariant: keys in a higher level are larger than keys in a lower level

Thorup’s Reduction – O(S(N)*log N)

• Rebalance cost for level 2^j: 2^j*S(N)
• # of sorts in N updates: N/2^j
• Amortized cost in level 2^j: S(N)
• O(log N) levels
• Cost: O(S(N)*log N)
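The amortized bound on this slide follows from a short calculation. The block below is a sketch that reads "level 2^j" as the level holding 2^j keys, which is an interpretation of the slide's notation:

```latex
\[
\frac{1}{N}\cdot
\underbrace{\frac{N}{2^{j}}}_{\text{sorts in } N \text{ updates}}\cdot
\underbrace{2^{j}\,S(N)}_{\text{cost per rebalance}}
= S(N)
\qquad\text{(amortized cost per update in level } 2^{j}\text{)}
\]
\[
\sum_{j=1}^{O(\log N)} S(N) = O\bigl(S(N)\log N\bigr)
\]
```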

Thorup’s Reduction

[Figure: O(log N) levels holding 1, 2, …, N/(4 log N), N/(2 log N), N/log N base sets; base-set label: log N]

• Split/merge base sets: S(N) amortized
• Rebalancing level 2^j: 2^j*S(N)/log N
• # of rebalances in N updates: N/2^j
• Amortized cost for level 2^j: S(N)/log N
• O(log N) levels: O(S(N)) amortized cost

Thorup’s Reduction

[Figure: the same levels of base sets, with an atomic heap of size log N at the head]

• Atomic heap of size log N: O(1) cost
• Levels of base sets: O(S(N)) amortized cost

Thorup’s Reduction

Amortized Cost: O(S(N))

[Figure: an atomic heap of size log N at the head; levels holding 1, 2, …, N/(4 log N), N/(2 log N), N/log N base sets; level buffers of size …, N/(2 log N), N/log N]

• Atomic heap: O(1) cost
• Levels and buffers: O(S(N)) amortized cost

Externalize Thorup’s Reduction

• Where does B come in?

• How to replace atomic heap?

• How to handle deletions in external memory?

Where does B come in?

[Figure: a buffer of size B*log N at the head; levels holding 1, 2, …, N/(4B log N), N/(2B log N), N/(B log N) base sets; level buffers of size …, N/(4 log N), N/log N; base-set label: B*log N]

I/O-efficient Flush Operation

[Figure: a buffer of size R flushed into k substructures]

• Sort keys in buffer: O(R*S(R)/B)
• Distribute keys to k substructures: O(R/B + k)
• Total I/O cost: O(R*S(N)/B + k)
• If k = O(R/B), the total flush cost is O(R*S(N)/B), and the amortized cost per key is O(S(N)/B)
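The flush step can be sketched as sort-then-scan. This is a simplified in-memory model, not the structure from the talk: the name `flush`, the `splitters` list (substructure i takes keys up to `splitters[i]`, the last one takes the rest) and the block accounting are all assumptions, and Python's `sorted` stands in for the black-box sorting algorithm, whose own I/O cost is not counted.

```python
import bisect

def flush(buffer, splitters, substructures, b):
    """Flush a buffer into k substructures and count distribution I/Os.

    Requires len(splitters) == len(substructures) - 1, splitters ascending.
    Returns the number of block transfers used by the distribution scan.
    """
    keys = sorted(buffer)              # stands in for the black-box sort
    buffer.clear()
    ios = -(-len(keys) // b)           # read the sorted buffer once: ceil(R/B)
    start = 0
    for i, sub in enumerate(substructures):
        # keys[start:end] is the contiguous run that belongs to substructure i
        if i < len(splitters):
            end = bisect.bisect_right(keys, splitters[i])
        else:
            end = len(keys)
        sub.extend(keys[start:end])
        ios += -(-(end - start) // b)  # blocks written to substructure i
        start = end
    return ios
```

Because each substructure receives one contiguous run of the sorted keys, the writes add up to at most R/B + k blocks, which is the O(R/B + k) distribution term above.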

Where does B come in?

• If a level holds 2^j keys:
– Largest buffer size: 2^j/log N
– Largest # of base sets: 2^j/(B*log N)
– Smallest base set (head) size: B*log N
• With R = 2^j/log N and k = 2^j/(B*log N) = R/B, the condition k = O(R/B) holds
• Amortized I/O cost for flushing level buffers: O(S(N)/B)

Replacing Atomic Heap

[Figure: flushing into the head, with R = B*log N and k = log N]

Replacing Atomic Heap

• Head of size O(B*log N)
• Amortized I/O cost: O(S(N)/B)
• Recursively build the structure in the head

Recursively Build Layers

[Figure: O(log* N) layers holding N, B*log(N/B), B*loglog(N/B), …, 2^c*B, cB keys]

• Levels rebalancing
– Move base sets around
– Redistribute buffer
– S(N)/(B*log N) for one level
– S(N)/B for one layer
– S(N)*log* N/B amortized I/O cost


• Layers rebalancing
– Rebuild the first (last) level
– S(N)/B for one layer
– S(N)*log* N/B amortized I/O cost




[Figure: the layers with a memory buffer of size O(B); R = B, k = log* N]


• Memory buffer of size O(B)
• Amortized cost: log* N/B
• I/O cost per update: O(S(N)*log* N/B)

Handle Deletions

• Following a pointer to perform a deletion takes 1 I/O per deletion
• Deleting signals: Delete x -> Insert (-, x)
• Perform the actual deletion afterwards
• Unlike the buffer tree, we don’t have access to the “leaves” (base sets)
• Invariant: only process deleting signals in the head
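The deleting-signal idea can be illustrated with a small in-memory sketch. It is not the external structure from the talk: the class name `SignalPQ` is hypothetical, Python's `heapq` stands in for the whole layered structure, the top of the heap plays the role of the head, and keys are assumed distinct with deletions issued only for keys currently stored.

```python
import heapq

DELETE, KEY = 0, 1   # a signal sorts just before the key it cancels

class SignalPQ:
    def __init__(self):
        self._heap = []

    def insert(self, x):
        heapq.heappush(self._heap, (x, KEY))

    def delete(self, x):
        # Delete x -> Insert (-, x): no search for x, just one more insertion.
        heapq.heappush(self._heap, (x, DELETE))

    def deletemin(self):
        while self._heap:
            x, kind = heapq.heappop(self._heap)
            if kind == KEY:
                return x
            # A signal reached the head: cancel the matching key, if present.
            if self._heap and self._heap[0] == (x, KEY):
                heapq.heappop(self._heap)
        raise IndexError("deletemin from an empty priority queue")
```

A deletion therefore costs the same as an insertion, and the signal meets its key only when both have travelled to the head, which mirrors the invariant that signals are processed only there.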

Schedule

• Avoid repeated sorting
• If the head or memory buffer is unbalanced:
– Flush stage: flush all overflowed buffers and rebalance all unbalanced base sets
– Push stage: rebalance all overflowed layers and levels (expand)
– Pull stage: deal with delete signals and rebalance all underflowed layers and levels (shrink)

Open problems

• Optimal reduction?
– Priority queue that supports insertions/deletions in O(1/B) I/O cost for a set of size O(B*log^(c) N)
– New reduction framework
• Better (than loglog N) reduction in the cache-oblivious model?
– Hard to do I/O-efficient flushing and rebalancing without knowing B

Thank You!
