Equivalence Between Priority Queues and Sorting in External Memory
Zhewei Wei, Renmin University of China
MADALGO, Aarhus University
Ke Yi, The Hong Kong University of Science and Technology
Priority Queue
• Maintain a set of keys
• Support insertions, deletions and findmin (deletemin)
• Fundamental data structure
• Used as a subroutine in greedy algorithms
– Dijkstra’s single source shortest path algorithm
– Prim’s minimum spanning tree algorithm
Sorting to Priority Queue
• Priority queue can do sorting
• Given N unsorted keys
– Insert the keys into the priority queue
– Perform N deletemin operations (find the minimum and delete it)
• If a priority queue supports insertion, deletion and findmin in S(N) time, then the sorting algorithm runs in O(N*S(N)) time
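The reduction on this slide can be sketched in a few lines. This is only an illustration: Python's `heapq` binary heap stands in for an arbitrary priority queue, and the function name `pq_sort` is made up for this example.

```python
import heapq

def pq_sort(keys):
    """Sort by inserting every key into a priority queue, then calling deletemin N times."""
    heap = []
    for key in keys:                 # N insertions
        heapq.heappush(heap, key)
    # N deletemin operations return the keys in nondecreasing order.
    return [heapq.heappop(heap) for _ in range(len(heap))]

if __name__ == "__main__":
    print(pq_sort([5, 1, 4, 1, 3]))
```

With a binary heap each operation costs O(log N), so this particular instance is just heapsort; the point of the slide is that any priority queue with cost S(N) per operation gives an O(N*S(N)) sort the same way.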
Priority Queue to Sorting
• Thorup [2007]: sorting can do priority queue!
• Given: a sorting algorithm that sorts N keys in N*S(N) time in the RAM model
• Result: a priority queue that supports all operations in O(S(N)) time
• The sorting algorithm is used as a black box
• O(N loglog N) sorting -> O(loglog N) priority queue
• O() sorting -> O() priority queue
The I/O Model [Aggarwal and Vitter 1988]
[Figure: CPU, memory of size M, and disk of unlimited size; data moves between memory and disk in blocks of size B]
• Complexity: # of block transfers (I/Os)
• CPU computations and memory accesses are free
Cache-Oblivious Model
[Figure: the same CPU, memory and disk layout; the disk has unlimited size, but the memory size and the block size are unknown to the algorithm]
• Optimal without knowledge of M and B
• Optimal for all M and B
Sorting in the I/O Model
• Sorting bound: Sort(N) = Θ((N/B) * log_{M/B} N) I/Os
• Upper bound: external merge sort
• Lower bound: holds for the comparison model or under the indivisibility assumption (keys are treated as atoms)
• Conjecture: the lower bound holds for B not too small, even without the indivisibility assumption
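To get a feel for the magnitude of the bound, the expression can simply be evaluated. The sketch below uses the formula exactly as written on the slide and drops the hidden constant; the function name and the sample parameters are illustrative assumptions.

```python
import math

def sort_io_bound(n, m, b):
    """Evaluate (N/B) * log_{M/B} N, the sorting bound from the slide, ignoring constants."""
    if not (m > b >= 1 and n >= 1):
        raise ValueError("need M > B >= 1 and N >= 1")
    return (n / b) * math.log(n, m / b)

if __name__ == "__main__":
    # N = 2^20 keys, memory M = 2^10 keys, block B = 2^5 keys (sample values only)
    print(sort_io_bound(2**20, 2**10, 2**5))
```

For these sample values N/B = 2^15 and the logarithm base is M/B = 32, so the logarithmic factor is only 4: in the I/O model the log term is typically a small number.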
Priority Queue in External Memory
• I/O model
– Buffer tree [Arge 1995]
– M/B-ary heaps [Fadel et al. 1999]
– Array heaps [Brodal and Katajainen 1998]
• O((1/B) * log_{M/B} N) amortized cost
• Tree-based: do not give any priority queue-to-sorting reduction
Priority Queue in External Memory
• Cache-oblivious priority queue [Arge et al. 2002]
– O((1/B) * log_{M/B} N) amortized cost with the tall cache assumption M > B^2
– Keys move around in loglog N levels
• Reduction: given an external sorting algorithm that sorts N keys in N*S(N)/B I/Os, there is an external priority queue that supports all operations in O(S(N) loglog N / B) amortized I/Os
Our Results
• Given: a sorting algorithm that sorts N keys in N*S(N)/B I/Os in the I/O model, used as a black box
• Result: a priority queue that supports all operations in (1/B) * Σ_{i≥0} S(B*log^(i)(N/B)) amortized I/Os, where log^(i) is the logarithm iterated i times
– The sum expands to S(N) + S(B*log N) + S(B*loglog N) + …
• The bound is O(S(N)/B) for S(N) = Ω(2^(log* N)), or M = Ω(B*log^(c) N)
• Otherwise it is O(S(N) log* N / B)
• No new bounds for external priority queue
• External priority queue lower bound -> external sorting lower bound
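The sum in the main result can be made concrete by listing its arguments. The sketch below is a numerical illustration only: the function names are hypothetical, logarithms are taken base 2, and the series is cut off once the iterated logarithm drops to 1, which is an assumption since the slides do not state where the sum stops.

```python
import math

def iterated_log_sizes(n, b):
    """Sizes B * log^(i)(N/B) for i = 0, 1, 2, ... while the iterated log stays above 1."""
    sizes = []
    x = n / b                      # log^(0)(N/B) = N/B, so the first size is N itself
    while x > 1:
        sizes.append(b * x)
        x = math.log2(x)           # next iterate of the logarithm
    return sizes

def pq_cost(n, b, s):
    """(1/B) * sum of S(size) over the sizes above, for a caller-supplied function S."""
    return sum(s(size) for size in iterated_log_sizes(n, b)) / b

if __name__ == "__main__":
    n, b = 65536 * 4, 4
    print(iterated_log_sizes(n, b))        # N, B*log(N/B), B*loglog(N/B), ...
    print(pq_cost(n, b, math.log2))        # sample choice S(x) = log2(x)
```

The sizes shrink extremely fast, so when S grows quickly enough the first term S(N) dominates the whole sum.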
Outline
• How Thorup did it (on a high level)
• How we extend it in external memory (on a high level)
• Open problems
Thorup’s Reduction
• Word RAM model:
– each word consists of w ≥ log N bits
– constant number of registers, each with capacity for one word
• Atomic heap [Han 2004]: supports insertions, deletions, and predecessor queries on a set of size O(log^2 N) in constant time
Thorup’s Reduction – O(S(N)*log N)
[Figure: O(log N) levels of doubling size: c keys, 2c keys, …, N/4 keys, N/2 keys, N keys; the minimum is kept in the head]
• Invariant: keys in a higher level are larger than keys in a lower level
Thorup’s Reduction – O(S(N)*log N)
• Rebalance cost for a level of size 2^j: 2^j * S(N)
• # of sorts of that level in N updates: N/2^j
• Amortized cost per level: S(N)
• log N levels
• Total amortized cost: O(S(N) * log N)
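The amortized argument above is plain arithmetic, and the sketch below simply tallies it. It assumes levels of size 1, 2, 4, …, N (that is, c = 1) and a fixed per-key sorting cost `s_n`; the function name and these simplifications are illustrative, not part of the original construction.

```python
def amortized_cost_per_update(n, s_n):
    """Tally the rebalancing work over N updates and divide by N.

    Returns (amortized cost per update, number of levels).
    """
    total = 0.0
    levels = 0
    size = 1
    while size <= n:
        rebalance_cost = size * s_n    # one sort of a level holding `size` keys
        rebalances = n / size          # that level is re-sorted N / size times in N updates
        total += rebalance_cost * rebalances
        levels += 1
        size *= 2
    return total / n, levels

if __name__ == "__main__":
    print(amortized_cost_per_update(1024, 3))
```

Every level contributes exactly `s_n` per update regardless of its size, so the total is `s_n` times the number of levels, which is the O(S(N) * log N) bound on the slide.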
Thorup’s Reduction
[Figure: O(log N) levels built from base sets of size log N; the levels hold 1, 2, …, N/(4 log N), N/(2 log N), N/log N base sets]
• Split/merge base sets: S(N) amortized
• Rebalancing a level of size 2^j: 2^j * S(N) / log N
• # of rebalances in N updates: N/2^j
• Amortized cost for a level of size 2^j: S(N)/log N
• O(log N) levels, so O(S(N)) amortized cost in total
Thorup’s Reduction
[Figure: the same levels of base sets, now with an atomic heap of size log N; the atomic heap has O(1) cost per operation and the levels have O(S(N)) amortized cost]
Thorup’s Reduction
[Figure: an atomic heap of size log N together with the levels of base sets; the levels have buffers of size N/log N, N/(2 log N), N/(4 log N), …]
• Atomic heap: O(1) cost
• Levels and buffers: O(S(N)) amortized cost
• Amortized cost: O(S(N))
Externalize Thorup’s Reduction
• Where does B come in?
• How to replace atomic heap?
• How to handle deletions in external memory?
Where does B come in?
[Figure: the level structure with block size B: a buffer of size B*log N; base sets of size B*log N; levels holding 1, 2, …, N/(4B log N), N/(2B log N), N/(B log N) base sets, with level buffers of size N/(4 log N), N/(2 log N), N/log N]
I/O-efficient Flush Operation
[Figure: a buffer R of size |R| is flushed into k substructures]
• Sort keys in buffer: O(|R|*S(|R|)/B)
• Distribute keys to k substructures: O(|R|/B + k)
• Total I/O cost: O(|R|*S(N)/B + k)
• If k = O(|R|/B), total flush cost is O(|R|*S(N)/B), amortized cost per key is O(S(N)/B)
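The two steps of a flush can be mimicked in memory. In the sketch below, `sorted` stands in for the black-box external sort, the substructures are assumed to be separated by splitter keys, and the returned I/O figure only counts the distribution step (one scan of the sorted buffer plus one partially filled block per substructure). All names are hypothetical.

```python
import bisect
import math

def flush(buffer, splitters, b):
    """Sort `buffer` and distribute its keys into len(splitters) + 1 substructures.

    Returns (buckets, distribution_ios).
    """
    keys = sorted(buffer)                       # stand-in for the black-box sort
    k = len(splitters) + 1
    buckets = [[] for _ in range(k)]
    for key in keys:
        # bucket i receives the keys in (splitters[i-1], splitters[i]]
        buckets[bisect.bisect_left(splitters, key)].append(key)
    distribution_ios = math.ceil(len(keys) / b) + k
    return buckets, distribution_ios

if __name__ == "__main__":
    print(flush([9, 2, 7, 4, 1, 8], [3, 7], b=2))
```

The additive k term is why the slide requires k = O(|R|/B): otherwise the partially filled blocks would dominate the scan.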
Where does B come in?
• If a level holds 2^j keys
– Largest buffer size: 2^j/log N
– Largest # of base sets: 2^j/(B log N)
– Smallest base set (head) size: B*log N
– With |R| = 2^j/log N and k = 2^j/(B log N), the condition k = O(|R|/B) holds
• Amortized I/O cost for flushing level buffers: O(S(N)/B)
Replacing Atomic Heap
[Figure: a buffer of size B*log N is flushed into the levels below it]
• R = B*log N
• k = log N
Replacing Atomic Heap
[Figure: a buffer of size B*log N and a head of size O(B log N)]
• Head of size O(B log N)
• Amortized I/O cost: O(S(N)/B)
• Recursively build the structure in the head
Recursively Build Layers
[Figure: O(log* N) layers holding N keys, B*log(N/B) keys, B*loglog(N/B) keys, …, 2^c*B keys, cB keys]
• Levels rebalancing
– Move base sets around
– Redistribute buffer
– S(N)/(B log N) for one level
– S(N)/B for one layer
– S(N) log* N / B amortized I/O cost
Recursively Build Layers
• Layers rebalancing
– Rebuild the first (last) level
– S(N)/B for one layer
– S(N) log* N / B amortized I/O cost
Recursively Build Layers
[Figure: the layers, with a memory buffer of size O(B) that is flushed into them]
• R = B
• k = log* N
Recursively Build Layers
• Memory buffer of size O(B)
• Amortized cost for the memory buffer: log* N / B
• I/O cost per update: O(S(N) log* N / B)
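The log* N factor counts how many times the logarithm can be applied before the value drops to 1, which matches the number of layers in the recursive construction. A small sketch, assuming base-2 logarithms and a fixed per-key sorting cost `s_n` (both choices, and the function names, are illustrative):

```python
import math

def log_star(n):
    """Number of times log2 must be applied to n before the result is at most 1."""
    count = 0
    x = float(n)
    while x > 1:
        x = math.log2(x)
        count += 1
    return count

def update_cost(n, b, s_n):
    """S(N) * log* N / B, the per-update I/O bound from the slide without its constant."""
    return s_n * log_star(n) / b

if __name__ == "__main__":
    print(log_star(65536))
    print(update_cost(65536, 4, 2.0))
```

Since log* N grows extremely slowly, the extra factor over S(N)/B stays a very small number for any realistic N.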
Handle Deletions
• Following a pointer to perform a deletion takes 1 I/O per deletion
• Deleting signals: Delete x -> Insert (-, x)
• Perform the actual deletion afterwards
• Unlike the buffer tree, we don’t have access to the “leaves” (base sets)
• Invariant: only process deleting signals in the head
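The idea of deleting signals can be illustrated with an in-memory analogue. In this sketch a deletion is recorded as a signal that sorts just before the key it cancels, and signals are only resolved when they reach the front of the queue, mirroring the invariant above. It assumes distinct keys and that only present keys are deleted; the class and constant names are hypothetical, and `heapq` stands in for the external structure.

```python
import heapq

DELETE, KEY = 0, 1    # a delete signal sorts just before the key it cancels

class SignalPQ:
    def __init__(self):
        self._heap = []

    def insert(self, x):
        heapq.heappush(self._heap, (x, KEY))

    def delete(self, x):
        # "Delete x -> Insert (-, x)": the deletion is stored as a signal
        heapq.heappush(self._heap, (x, DELETE))

    def deletemin(self):
        while self._heap:
            x, kind = heapq.heappop(self._heap)
            if kind == KEY:
                return x
            # A signal reached the front: cancel the matching key if it is next.
            if self._heap and self._heap[0] == (x, KEY):
                heapq.heappop(self._heap)
        raise IndexError("deletemin from an empty priority queue")

if __name__ == "__main__":
    pq = SignalPQ()
    for key in (5, 3, 8):
        pq.insert(key)
    pq.delete(3)
    print(pq.deletemin(), pq.deletemin())
```

A deletion costs no more than an insertion here, because no lookup of the deleted key is needed at the time of the delete call.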
Schedule
• Avoid repeated sorting
• If head or memory buffer unbalanced:
– Flush stage: flush all overflowed buffers and rebalance all unbalanced base sets
– Push stage: rebalance all overflowed layers and levels (expand)
– Pull stage: deal with delete signals and rebalance all underflowed layers and levels (shrink)
Open problems
• Optimal reduction?
– Priority queue that supports insertions/deletions in O(1/B) I/O cost for a set of size O(B*log^(c) N)
– New reduction framework
• Better (than loglog N) reduction in the cache-oblivious model?
– Hard to do I/O-efficient flushing and rebalancing without knowing B
Thank You!