Transcript
Page 1: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

The History ofI/O Models

Erik Demaine

MASSACHUSETTSINSTITUTE OFTECHNOLOGY

Page 2: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Memory Hierarchies in Practice

CPU Registers Level 1Cache

Level 2Cache

MainMemory

Disk

1MB

10GB

100KB100B

1TB

4 ms

14 ns

750 ps

Internet

1EB-1ZB

Outer Space< 1083

Presenter
Presentation Notes
http://en.wikipedia.org/wiki/Disk-drive_performance_characteristics#Rotational_latency 4ms on 7200rpm
Page 3: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Models, Models, ModelsModel Year Blocking Caching Levels SimpleIdealized2-level 1972 ✓ ✗ 2 ✓Red-blue pebble 1981 ✗ ✓ 2 ✓−External memory 1987 ✓ ✓ 2 ✓HMM 1987 ✗ ✓ ∞ ✓BT 1987 ~ ✓ ∞ ✓−(U)MH 1990 ✓ ✓ ∞ ✗Cache oblivious 1999 ✓ ✓ 2–∞ ✓+

Page 4: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Idealized Two-Level Storage[Floyd — Complexity of Computer Computations 1972]

PDP-11 (1970–1990s)

Ken Thompson

Dennis Ritchie

2.4MB RK03 disks

8KB core memory

Presenter
Presentation Notes
PDP-11 is 16-bit RK03 disk stores 1.2 million words [ftp://ftp.update.uu.se/pub/pdp11/doc/rk11-c.txt] http://cm.bell-labs.com/who/dmr/kd14.jpg from http://cm.bell-labs.com/who/dmr/picture.html 4K words of memory, 950 ns must be core: http://www.village.org/pdp11/faq.pages/price.images/pdp11-20-aug-1971/small/page1.gif http://www.psych.usyd.edu.au/pdp-11/Images/core_closeup.jpeg from http://www.psych.usyd.edu.au/pdp-11/core.html
Page 5: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Idealized Two-Level Storage[Floyd — Complexity of Computer Computations 1972]

• RAM = blocks of ≤ 𝐵𝐵 items• Block operation: Read up to 𝐵𝐵 items

from two blocks 𝑖𝑖, 𝑗𝑗 Write to third block 𝑘𝑘

• Ignore item order within block CPU operations considered free

• Items are indivisible

CPU

𝐵𝐵 items

𝑖𝑖

𝑗𝑗

𝑘𝑘

Page 6: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Permutation Lower Bound[Floyd — Complexity of Computer Computations 1972]

• Theorem: Permuting 𝑁𝑁 items to⁄𝑁𝑁 𝐵𝐵 (full) specified blocks needs

block operations, in average case Assuming 𝑁𝑁

𝐵𝐵> 𝐵𝐵 (tall disk)

• Simplified model: Move items instead of copy Equivalence: Follow item’s path from start to finish

CPU

𝐵𝐵 items

𝑖𝑖

𝑗𝑗

𝑘𝑘

Ω𝑁𝑁𝐵𝐵

log𝐵𝐵

Presenter
Presentation Notes
Original paper called blocks “pages”, used “p” for block size, used “w” for number of pages, and made the assumption w > p (didn’t call it anything).
Page 7: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Permutation Lower Bound[Floyd — Complexity of Computer Computations 1972]

• Potential:

Maximized in target configurationof full blocks (𝑛𝑛𝑖𝑖𝑖𝑖=𝐵𝐵): Φ = 𝑁𝑁 log𝐵𝐵

Random configuration with 𝑁𝑁𝐵𝐵

> 𝐵𝐵has E 𝑛𝑛𝑖𝑖𝑖𝑖 = 𝑂𝑂(1) ⇒ E Φ = 𝑂𝑂(𝑁𝑁) Claim: Block operation increases Φ by ≤ 𝐵𝐵

⇒ Number of block operations ≥ 𝑁𝑁 log 𝐵𝐵−𝑂𝑂(𝑁𝑁)𝐵𝐵

𝐵𝐵 items

𝑖𝑖

𝑗𝑗

𝑘𝑘

# items in block 𝑖𝑖 destined for block 𝑗𝑗

Φ = �𝑖𝑖,𝑖𝑖

𝑛𝑛𝑖𝑖𝑖𝑖 log𝑛𝑛𝑖𝑖𝑖𝑖

Page 8: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Permutation Lower Bound[Floyd — Complexity of Computer Computations 1972]

• Potential:

Maximized in target configurationof full blocks (𝑛𝑛𝑖𝑖𝑖𝑖=𝐵𝐵): Φ = 𝑁𝑁 log𝐵𝐵

Random configuration with 𝑁𝑁𝐵𝐵

> 𝐵𝐵has E 𝑛𝑛𝑖𝑖𝑖𝑖 = 𝑂𝑂(1) ⇒ E Φ = 𝑂𝑂(𝑁𝑁) Claim: Block operation increases Φ by ≤ 𝐵𝐵

oFact: 𝑥𝑥 + 𝑦𝑦 log 𝑥𝑥 + 𝑦𝑦 ≤ 𝑥𝑥 log 𝑥𝑥 + 𝑦𝑦 log 𝑦𝑦 + 𝑥𝑥 + 𝑦𝑦o So combining groups 𝑥𝑥 & 𝑦𝑦 increases Φ by ≤ 𝑥𝑥 + 𝑦𝑦

𝐵𝐵 items

𝑖𝑖

𝑗𝑗

𝑘𝑘

# items in block 𝑖𝑖 destined for block 𝑗𝑗

Φ = �𝑖𝑖,𝑖𝑖

𝑛𝑛𝑖𝑖𝑖𝑖 log𝑛𝑛𝑖𝑖𝑖𝑖

Page 9: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Permutation Bounds[Floyd — Complexity of Computer Computations 1972]

• Theorem: Ω 𝑁𝑁𝐵𝐵

log𝐵𝐵 Tight for 𝐵𝐵 = 𝑂𝑂(1)

• Theorem: 𝑂𝑂 𝑁𝑁𝐵𝐵

log 𝑁𝑁𝐵𝐵

Similar to radix sort,where key = target block index

Accidental claim: tight for all 𝐵𝐵 < 𝑁𝑁𝐵𝐵

We will see: tight for 𝐵𝐵 > log 𝑁𝑁𝐵𝐵

𝐵𝐵 items

𝑖𝑖

𝑗𝑗

𝑘𝑘

CPU

Page 10: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Idealized Two-Level Storage[Floyd — Complexity of Computer Computations 1972]

• External memory & word RAM: 𝐵𝐵 items

𝑖𝑖

𝑗𝑗

𝑘𝑘

Page 11: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Idealized Two-Level Storage[Floyd — Complexity of Computer Computations 1972]

• External memory & word RAM:

• Foreshadowing future models:

𝐵𝐵 items

𝑖𝑖

𝑗𝑗

𝑘𝑘

CPU

Page 12: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Red-Blue Pebble Game[Hong & Kung — STOC 1981]

CPU

𝑀𝑀 items

cache disk

TRS-80[1977–1981]

VIC-20[1980–1985]

Apple II[1977–1993]

4KB RAM

5KB RAM

4KB RAM

Presenter
Presentation Notes
First use of “I/O complexity” http://upload.wikimedia.org/wikipedia/commons/thumb/e/e4/TRS-80_Model_I_-_Rechnermuseum_Cropped.jpg/485px-TRS-80_Model_I_-_Rechnermuseum_Cropped.jpg from http://en.wikipedia.org/wiki/TRS-80 from http://en.wikipedia.org/wiki/TRS-80 http://upload.wikimedia.org/wikipedia/commons/thumb/3/39/CBMVIC20P8.jpg/800px-CBMVIC20P8.jpg from http://en.wikipedia.org/wiki/File:CBMVIC20P8.jpg from http://en.wikipedia.org/wiki/Commodore_VIC-20 http://upload.wikimedia.org/wikipedia/en/thumb/8/82/Apple_II_tranparent_800.png/651px-Apple_II_tranparent_800.png from http://en.wikipedia.org/wiki/File:Apple_II_tranparent_800.png from http://en.wikipedia.org/wiki/Apple_II_series#Apple_II
Page 13: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Pebble Game[Hopcroft, Paul, Valiant — J. ACM 1977]

• View computation as DAGof data dependencies

• Pebble = “in memory”• Moves: Place pebble on node

if all predecessors have a pebble Remove pebble from node

• Goal: Pebbles on all output nodes Minimize maximum number of pebbles over time

inputs:

outputs:

Presenter
Presentation Notes
http://dl.acm.org/citation.cfm?id=322015&dl=
Page 14: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Pebble Game[Hopcroft, Paul, Valiant — J. ACM 1977]

• Theorem: Any DAG can be “executed” using 𝑂𝑂 𝑛𝑛

log 𝑛𝑛maximum pebbles

• Corollary:

DTIME 𝑡𝑡 ⊆ DSPACE𝑡𝑡

log 𝑡𝑡

inputs:

outputs:

Presenter
Presentation Notes
http://dl.acm.org/citation.cfm?id=322015&dl=
Page 15: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Red-Blue Pebble Game[Hong & Kung — STOC 1981]

• Red pebble = “in cache”• Blue pebble = “on disk”• Moves: Place red pebble on node if all

predecessors have red pebble Remove pebble from node Write: Red pebble → blue pebble Read: Blue pebble → red pebble

• Goal: Blue inputs to blue outputs ≤ 𝑀𝑀 red pebbles at any time

inputs:

outputs:

minimize

Page 16: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Red-Blue Pebble Game[Hong & Kung — STOC 1981]

• Red pebble = “in cache”• Blue pebble = “on disk”

inputs:

outputs:

CPU

𝑀𝑀 items

cache disk

minimize number ofcache ↔ disk I/Os

(memory transfers)

Page 17: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Red-Blue Pebble Game Results[Hong & Kung — STOC 1981]

Computation DAG Memory Transfers Speedup

Fast Fourier Transform (FFT) Θ 𝑁𝑁 log𝑀𝑀 𝑁𝑁 Θ log𝑀𝑀

Ordinary matrix-vector multiplication Θ

𝑁𝑁2

𝑀𝑀Θ 𝑀𝑀

Ordinary matrix-matrix multiplication Θ

𝑁𝑁3

𝑀𝑀Θ 𝑀𝑀

Odd-even transposition sort Θ

𝑁𝑁2

𝑀𝑀Θ 𝑀𝑀

𝑁𝑁 × 𝑁𝑁 × ⋯× 𝑁𝑁 grid Ω𝑁𝑁𝑑𝑑

𝑀𝑀1/(𝑑𝑑−1) Θ 𝑀𝑀1/(𝑑𝑑−1)

𝑑𝑑

Presenter
Presentation Notes
Actual matrix results handle rectangular matrices, not just square. Speedups are the same.
Page 18: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Comparison

CPU

𝑀𝑀 items

cache disk

CPU

𝐵𝐵 items

Idealized two-level storage

[Floyd 1972]

Red-bluepebble game

[Hong & Kung 1981]

blocks but no cache cache but no blocks

Presenter
Presentation Notes
http://cache.gawkerassets.com/assets/images/8/2011/06/avp_2.jpg from http://io9.com/avp/
Page 19: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

I/O Model [Aggarwal & Vitter — ICALP 1987, C. ACM 1988]

• AKA: External Memory Model, Disk Access Model• Goal: Minimize number of I/Os (memory transfers)

CPU

𝐵𝐵 items

𝑀𝑀𝐵𝐵

blocks

cache𝐵𝐵 itemsdisk

cache and blocks

Page 20: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Scanning[Aggarwal & Vitter — ICALP 1987, C. ACM 1988]

• Visiting 𝑁𝑁 elements in order costs𝑂𝑂 1 + 𝑁𝑁

𝐵𝐵memory transfers

• More generally, can run ≤ 𝑀𝑀𝐵𝐵

parallel scans, keeping 1 block per scan in cache E.g., merge 𝑂𝑂 𝑀𝑀

𝐵𝐵lists of total size 𝑁𝑁

in 𝑂𝑂 1 + 𝑁𝑁𝐵𝐵

memory transfers

𝐵𝐵 items

Page 21: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Practical Scanning [Arge]

• Does the 𝐵𝐵 factor matter? Should I presort my linked list before traversal?

• Example: 𝑁𝑁 = 256,000,000 ~ 1GB 𝐵𝐵 = 8,000 ~ 32KB (small) 1ms disk access time (small)

𝑁𝑁 memory transfers take 256,000 sec ≈ 71 hours

𝑁𝑁𝐵𝐵

memory transfers take ⁄256 8 = 𝟑𝟑𝟑𝟑 seconds

Page 22: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Searching[Aggarwal & Vitter — ICALP 1987, C. ACM 1988]

• Finding an element 𝑥𝑥 among 𝑁𝑁 items requires Θ log𝐵𝐵+1 𝑁𝑁 memory transfers

• Lower bound: (comparison model) Each block reveals where 𝑥𝑥 fits among 𝐵𝐵 items ⇒ Learn ≤ log 𝐵𝐵 + 1 bits per read Need log 𝑁𝑁 + 1 bits

• Upper bound: B-tree Insert & delete

in 𝑂𝑂 log𝐵𝐵+1 𝑁𝑁

𝐵𝐵 items

Page 23: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Sorting and Permutation[Aggarwal & Vitter — ICALP 1987, C. ACM 1988]

• Sorting bound: Θ 𝑁𝑁𝐵𝐵

log ⁄𝑀𝑀 𝐵𝐵𝑁𝑁𝐵𝐵

• Permutation bound: Θ min 𝑁𝑁, 𝑁𝑁𝐵𝐵

log ⁄𝑀𝑀 𝐵𝐵𝑁𝑁𝐵𝐵

Either sort or use naïve RAM algorithm Solves Floyd’s two-level storage problem (𝑀𝑀 = 3𝐵𝐵)

𝐵𝐵 items

Page 24: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Sorting Lower Bound[Aggarwal & Vitter — ICALP 1987, C. ACM 1988]

• Sorting bound: Ω 𝑁𝑁𝐵𝐵

log ⁄𝑀𝑀 𝐵𝐵𝑁𝑁𝐵𝐵

Always keep cache sorted (free) Might as well presort each block Upon reading a block, learn how those 𝐵𝐵 items fit

amongst 𝑀𝑀 items in cache

⇒ Learn lg 𝑀𝑀+𝐵𝐵𝐵𝐵 ∼ 𝐵𝐵 lg 𝑀𝑀

𝐵𝐵bits

Need lg𝑁𝑁! ∼ 𝑁𝑁 lg𝑁𝑁 bits Know 𝑁𝑁 lg𝐵𝐵 bits from block presort

𝐵𝐵 items

Page 25: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Sorting Upper Bound[Aggarwal & Vitter — ICALP 1987, C. ACM 1988]

• Sorting bound: 𝑂𝑂 𝑁𝑁𝐵𝐵

log ⁄𝑀𝑀 𝐵𝐵𝑁𝑁𝐵𝐵

𝑂𝑂 𝑀𝑀𝐵𝐵

-way mergesort

𝑇𝑇 𝑁𝑁 = 𝑀𝑀𝐵𝐵𝑇𝑇 �𝑁𝑁 𝑀𝑀

𝐵𝐵+ 𝑂𝑂 1 + 𝑁𝑁

𝐵𝐵

𝑇𝑇 𝐵𝐵 = 𝑂𝑂 1𝐵𝐵 items

𝑁𝑁/𝐵𝐵

1 1 1

𝑁𝑁/𝐵𝐵

𝑁𝑁/𝐵𝐵

𝑁𝑁/𝐵𝐵

log ⁄𝑀𝑀 𝐵𝐵𝑁𝑁𝐵𝐵

levels

𝑀𝑀/𝐵𝐵

Page 26: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Distribution Sort[Aggarwal & Vitter — ICALP 1987, C. ACM 1988]

• ⁄𝑀𝑀 𝐵𝐵-way quicksort

1. Find ⁄𝑀𝑀 𝐵𝐵 partition elements,roughly evenly spaced

2. Partition array into ⁄𝑀𝑀 𝐵𝐵 + 1 pieces

Scan: 𝑂𝑂 𝑁𝑁𝐵𝐵

memory transfers

3. Recurse Same recurrence as mergesort

𝐵𝐵 items

Page 27: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Distribution Sort Partitioning[Aggarwal & Vitter — ICALP 1987, C. ACM 1988]

1. For first, second, … interval of 𝑀𝑀 items: Sort in 𝑂𝑂 ⁄𝑀𝑀 𝐵𝐵 memory transfers Sample every 1

4⁄𝑀𝑀 𝐵𝐵 th item

Total sample: 4𝑁𝑁/ ⁄𝑀𝑀 𝐵𝐵 items

2. For 𝑖𝑖 = 1,2, …, ⁄𝑀𝑀 𝐵𝐵: Run linear-time selection to find

sample element at ⁄𝑖𝑖 ⁄𝑀𝑀 𝐵𝐵 fraction

Cost: 𝑂𝑂 �𝑁𝑁⁄𝑀𝑀 𝐵𝐵

𝐵𝐵 each

Total: 𝑂𝑂 ⁄𝑁𝑁 𝐵𝐵 memory transf.

𝐵𝐵 items

Page 28: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Random vs. Sequential I/Os [Farach,Ferragina, Muthukrishnan — FOCS 1998]

• Sequential memory transfers are part ofbulk read/write of Θ 𝑀𝑀 items

• Random memory transfer otherwise• Sorting: 2-way mergesort achieves 𝑂𝑂 𝑁𝑁

𝐵𝐵log 𝑁𝑁

𝐵𝐵sequential

𝑜𝑜 𝑁𝑁𝐵𝐵

log ⁄𝑀𝑀 𝐵𝐵𝑁𝑁𝐵𝐵

random implies Ω 𝑁𝑁𝐵𝐵

log 𝑁𝑁𝐵𝐵

total

• Same trade-off forsuffix-tree construction

𝐵𝐵 items

Page 29: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Hierarchical Memory Model (HMM) [Aggarwal, Alpern, Chandra, Snir — STOC 1987]

• Nonuniform-cost RAM: Accessing memory location 𝑥𝑥 costs 𝑓𝑓 𝑥𝑥 = log 𝑥𝑥

“particularly simple model of computation that mimics the behavior of a memory hierarchy consisting of increasingly larger amounts of slower memory”

1 1 2 4 8 2𝑖𝑖1 1 1 1 1CPU

Presenter
Presentation Notes
http://dl.acm.org/citation.cfm?id=28428
Page 30: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Why 𝒇𝒇 𝒙𝒙 = 𝐥𝐥𝐥𝐥𝐥𝐥 𝒙𝒙? [Mead & Conway 1980]

Presenter
Presentation Notes
http://ai.eecs.umich.edu/people/conway/VLSI/VLSIText/IntroToVLSI.jpg from http://ai.eecs.umich.edu/people/conway/Retrospective4.html
Page 31: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

HMM Upper & Lower Bounds[Aggarwal, Alpern, Chandra, Snir — STOC 1987]

Problem Time SlowdownSemiring matrix multiplication Θ 𝑁𝑁3 Θ(1)

Fast Fourier Transform Θ(𝑁𝑁 log𝑁𝑁 log log𝑁𝑁) Θ log log𝑁𝑁

Sorting Θ(𝑁𝑁 log𝑁𝑁 log log𝑁𝑁) Θ log log𝑁𝑁Scanning input (sum, max, DFS, planarity, etc.)

Θ 𝑁𝑁 log𝑁𝑁 Θ log𝑁𝑁

Binary search Θ log2 𝑁𝑁 Θ log𝑁𝑁

Page 32: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

𝐇𝐇𝐇𝐇𝐇𝐇𝒇𝒇(𝒙𝒙)[Aggarwal, Alpern, Chandra, Snir — STOC 1987]

• Say accessing memory location 𝑥𝑥 costs 𝑓𝑓 𝑥𝑥• Assume 𝑓𝑓 2𝑥𝑥 ≤ 𝑐𝑐 𝑓𝑓(𝑥𝑥) for a constant 𝑐𝑐 > 0

(“polynomially bounded”)• Write 𝑓𝑓 𝑥𝑥 = ∑𝑖𝑖 𝑤𝑤𝑖𝑖 ⋅ [𝑥𝑥 ≥ 𝑥𝑥𝑖𝑖? ]

(weighted sum of threshold functions)

𝑥𝑥0 𝑥𝑥1 − 𝑥𝑥0𝑤𝑤0CPU 𝑥𝑥2 − 𝑥𝑥1

𝑤𝑤1 𝑤𝑤2 …0

𝑥𝑥𝑖𝑖

Presenter
Presentation Notes
http://upload.wikimedia.org/wikipedia/commons/0/05/Threshold_function.svg
Page 33: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Uniform Optimality [Aggarwal, Alpern, Chandra, Snir — STOC 1987]

• Consider one term 𝑓𝑓𝑀𝑀 𝑥𝑥 = [𝑥𝑥 ≥ 𝑀𝑀? ]• Algorithm is uniformly optimal

if optimal on HMM𝑓𝑓𝑀𝑀 𝑥𝑥 for all 𝑀𝑀• Implies optimality for all 𝑓𝑓 𝑥𝑥

∞CPU 𝑀𝑀 10

𝑀𝑀

CPU

𝑀𝑀 items

cache disk red-bluepebble game!

Presenter
Presentation Notes
http://upload.wikimedia.org/wikipedia/commons/0/05/Threshold_function.svg
Page 34: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

𝐇𝐇𝐇𝐇𝐇𝐇𝒇𝒇𝑴𝑴 𝒙𝒙 Upper & Lower Bounds[Aggarwal, Alpern, Chandra, Snir — STOC 1987]

Problem Time SpeedupSemiring matrix multiplication Θ

𝑁𝑁3

𝑀𝑀Θ 𝑀𝑀

Fast Fourier Transform Θ 𝑁𝑁 log𝑀𝑀 𝑁𝑁 Θ log𝑀𝑀

Sorting Θ 𝑁𝑁 log𝑀𝑀 𝑁𝑁 Θ log𝑀𝑀Scanning input (sum, max, DFS, planarity, etc.)

Θ 𝑁𝑁 −𝑀𝑀 1 + 1/𝑀𝑀

Binary search Θ log𝑁𝑁 − log𝑀𝑀 1 + 1/ log𝑀𝑀

upper bounds knownby Hong & Kung 1981

other bounds follow fromAggarwal & Vitter 1987

Page 35: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Implicit HMM Memory Management[Aggarwal, Alpern, Chandra, Snir — STOC 1987]

• Instead of algorithm explicitly moving data,use any conservative replacement strategy (e.g., FIFO or LRU) to evict from cache[Sleator & Tarjan — C. ACM 1985]

• 𝑇𝑇LRU 𝑁𝑁,𝑀𝑀 ≤ 2 ⋅ 𝑇𝑇OPT 𝑁𝑁, ⁄𝑀𝑀 2= 𝑂𝑂 𝑇𝑇OPT 𝑁𝑁,𝑀𝑀 assuming 𝑓𝑓 2𝑥𝑥 ≤ 𝑐𝑐 𝑓𝑓(𝑥𝑥)

• Not uniform!

∞CPU 𝑀𝑀 10

Page 36: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Implicit HMM Memory Management[Aggarwal, Alpern, Chandra, Snir — STOC 1987]

• For general 𝑓𝑓, split memory into chunks at 𝑥𝑥where 𝑓𝑓(𝑥𝑥) doubles (up to rounding)

𝑥𝑥0 𝑥𝑥1 − 𝑥𝑥0𝑤𝑤0CPU 𝑥𝑥2 − 𝑥𝑥1

𝑤𝑤1 𝑤𝑤2 …0

Page 37: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Implicit HMM Memory Management[Aggarwal, Alpern, Chandra, Snir — STOC 1987]

• For general 𝑓𝑓, split memory into chunks at 𝑥𝑥where 𝑓𝑓(𝑥𝑥) doubles (up to rounding)

• LRU eviction from first chunk into second;LRU eviction from second chunk into third; etc.

• 𝑇𝑇𝐿𝐿𝐿𝐿𝐿𝐿 𝑁𝑁 = 𝑂𝑂 𝑇𝑇𝑂𝑂𝑂𝑂𝑂𝑂 𝑁𝑁 + 𝑁𝑁 ⋅ 𝑓𝑓 𝑁𝑁 Like MTF

1 1 2 4 8 2𝑖𝑖1 1 1 1 1CPU

Page 38: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

HMM with Block Transfer (BT) [Aggarwal, Chandra, Snir — FOCS 1987]

• Accessing memory location 𝑥𝑥 costs 𝑓𝑓 𝑥𝑥• Copying memory interval from 𝑥𝑥 − 𝛿𝛿… 𝑥𝑥

to 𝑦𝑦 − 𝛿𝛿…𝑦𝑦 costs 𝑓𝑓 max 𝑥𝑥, 𝑦𝑦 + 𝛿𝛿 Models memory pipelining ~ block transfer Ignores block alignment, explicit levels, etc.

1 1 2 4 8 2𝑖𝑖1 1 1 1 1CPU

Presenter
Presentation Notes
Aggarwal, Chandra, Snir - Hierarchical Memory with Block Transfer - FOCS 1987 http://dx.doi.org/10.1109/SFCS.1987.31
Page 39: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

BT Results[Aggarwal, Chandra, Snir — FOCS 1987]

Problem 𝑓𝑓 𝑥𝑥 = log 𝑥𝑥 𝑓𝑓 𝑥𝑥 = 𝑥𝑥𝛼𝛼, 0 < 𝛼𝛼 < 1 𝑓𝑓 𝑥𝑥 = 𝑥𝑥 𝑓𝑓 𝑥𝑥 = 𝑥𝑥𝛼𝛼,

𝛼𝛼 > 1Dot product, merging lists Θ 𝑁𝑁 log∗ 𝑁𝑁 Θ 𝑁𝑁 log log𝑁𝑁 Θ(𝑁𝑁 log𝑁𝑁) Θ 𝑁𝑁𝛼𝛼

Matrix mult. Θ 𝑁𝑁3 Θ 𝑁𝑁3 Θ 𝑁𝑁3 Θ 𝑁𝑁𝛼𝛼

if 𝛼𝛼 > 1.5Fast Fourier Transform Θ 𝑁𝑁 log𝑁𝑁 Θ 𝑁𝑁 log𝑁𝑁 Θ 𝑁𝑁 log2 𝑁𝑁 Θ 𝑁𝑁𝛼𝛼

Sorting Θ 𝑁𝑁 log𝑁𝑁 Θ 𝑁𝑁 log𝑁𝑁 Θ 𝑁𝑁 log2 𝑁𝑁 Θ 𝑁𝑁𝛼𝛼

Binary search Θ

log2 𝑁𝑁log log𝑁𝑁 Θ 𝑁𝑁𝛼𝛼 Θ 𝑁𝑁 Θ 𝑁𝑁𝛼𝛼

Presenter
Presentation Notes
Aggarwal, Chandra, Snir - Hierarchical Memory with Block Transfer - FOCS 1987 http://dx.doi.org/10.1109/SFCS.1987.31
Page 40: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Memory Hierarchy Model (MH) [Alpern, Carter, Feig, Selker — FOCS 1990]

• Multilevel version of external-memory model• 𝑀𝑀𝑖𝑖 ↔ 𝑀𝑀𝑖𝑖+1 transfers happen in blocks of size 𝐵𝐵𝑖𝑖

(subblocks of 𝑀𝑀𝑖𝑖+1), and take 𝑡𝑡𝑖𝑖 time• All levels can be actively transferring at once

𝑀𝑀0 𝑀𝑀1 𝑀𝑀2 𝑀𝑀3 𝑀𝑀40 𝑡𝑡0 𝑡𝑡1 𝑡𝑡2 𝑡𝑡3CPU

𝐵𝐵0 𝐵𝐵1𝐵𝐵2

𝐵𝐵3𝐵𝐵4

Presenter
Presentation Notes
Conference version: Alpern, Carter, Feig: http://dx.doi.org/10.1109/FSCS.1990.89581 Journal version: Alpern, Carter, Feig, Selker: http://www.springerlink.com/index/W24P02H58XG51322.pdf
Page 41: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Uniform Memory Hierarchy (UMH) [Alpern, Carter, Feig, Selker — FOCS 1990]

• Fix aspect ratio 𝛼𝛼 = 𝑀𝑀/𝐵𝐵𝐵𝐵

, block growth 𝛽𝛽 = 𝐵𝐵𝑖𝑖+1𝐵𝐵𝑖𝑖

• 𝐵𝐵𝑖𝑖 = 𝛽𝛽𝑖𝑖

• 𝑀𝑀𝑖𝑖𝐵𝐵𝑖𝑖

= 𝛼𝛼 ⋅ 𝛽𝛽𝑖𝑖

• 𝑡𝑡𝑖𝑖 = 𝛽𝛽𝑖𝑖 ⋅ 𝑓𝑓(𝑖𝑖)

𝑀𝑀0 𝑀𝑀1 𝑀𝑀2 𝑀𝑀3 𝑀𝑀40 𝑡𝑡0 𝑡𝑡1 𝑡𝑡2 𝑡𝑡3CPU

𝐵𝐵0 𝐵𝐵1𝐵𝐵2

𝐵𝐵3𝐵𝐵4

2 parameters1 function

Presenter
Presentation Notes
Conference version: Alpern, Carter, Feig: http://dx.doi.org/10.1109/FSCS.1990.89581 Journal version: Alpern, Carter, Feig, Selker: http://www.springerlink.com/index/W24P02H58XG51322.pdf
Page 42: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

UMH Results[Alpern, Carter, Feig, Selker — FOCS 1990]Problem Upper Bound Lower Bound

Matrix transpose𝑓𝑓 𝑖𝑖 = 1 𝑂𝑂 1 +

1𝛽𝛽2

𝑁𝑁2 Ω 1 +1𝛼𝛼𝛽𝛽4

𝑁𝑁2

Matrix mult.𝑓𝑓 𝑖𝑖 = O 𝛽𝛽𝑖𝑖 𝑂𝑂 1 +

1𝛽𝛽3

𝑁𝑁3 Ω 1 +1𝛽𝛽3

𝑁𝑁3

FFT𝑓𝑓 𝑖𝑖 ≤ 𝑖𝑖 𝑂𝑂 1 Ω 1

General approach: Divide & conquer

Presenter
Presentation Notes
Conference version: Alpern, Carter, Feig: http://dx.doi.org/10.1109/FSCS.1990.89581 Journal version: Alpern, Carter, Feig, Selker: http://www.springerlink.com/index/W24P02H58XG51322.pdf
Page 43: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Cache-Oblivious Model [Frigo, Leiserson, Prokop, Ramachandran — FOCS 1999]

• Analyze RAM algorithm (not knowing 𝐵𝐵 or 𝑀𝑀) on external-memory model Must work well for all 𝐵𝐵 and 𝑀𝑀

CPU

𝐵𝐵 items

𝑀𝑀𝐵𝐵

blocks

cache𝐵𝐵 itemsdisk

Page 44: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Cache-Oblivious Model [Frigo,Leiserson, Prokop, Ramachandran — FOCS 1999]

• Automatic block transfers via LRU or FIFO• Lose factor of 2 in 𝑀𝑀 and number of transfers Assume 𝑇𝑇 𝐵𝐵, 2𝑀𝑀 ≤ 𝑐𝑐 𝑇𝑇(𝐵𝐵,𝑀𝑀)

CPU

𝐵𝐵 items

𝑀𝑀𝐵𝐵

blocks

cache𝐵𝐵 itemsdisk

Page 45: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Cache-Oblivious Model [Frigo,Leiserson, Prokop, Ramachandran — FOCS 1999]

• Clean model• Adapts to changing 𝐵𝐵 (e.g., disk tracks) and

changing 𝑀𝑀 (e.g., competing processes)• Adapts to multilevel memory hierarchy (MH) Assuming inclusion

𝑀𝑀0 𝑀𝑀1 𝑀𝑀2 𝑀𝑀3 𝑀𝑀40 𝑡𝑡0 𝑡𝑡1 𝑡𝑡2 𝑡𝑡3CPU

𝐵𝐵0 𝐵𝐵1𝐵𝐵2

𝐵𝐵3𝐵𝐵4

Page 46: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Scanning [Frigo, Leiserson, Prokop, Ramachandran — FOCS 1999]

• Visiting 𝑁𝑁 elements in order costs𝑂𝑂 1 + 𝑁𝑁

𝐵𝐵memory transfers

• More generally, can run 𝑂𝑂 1 parallel scans Assume 𝑀𝑀 ≥ 𝑐𝑐 𝐵𝐵 for appropriate constant 𝑐𝑐 > 0

• E.g., merge two lists in 𝑂𝑂 𝑁𝑁𝐵𝐵

𝐵𝐵 items

Page 47: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Searching [Prokop — Meng 1999]

“van Emde Boas layout”

Page 48: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Searching [Prokop — Meng 1999]

• height(△)

≥12

lg𝐵𝐵

• ≤ 2 memory transfers per △

• ≤ 4 log𝐵𝐵 𝑁𝑁 total

Page 49: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Cache-Oblivious Searching

• lg 𝑒𝑒 + 𝑜𝑜 1 log𝐵𝐵 𝑁𝑁 is optimal[Bender, Brodal, Fagerberg, Ge, He, Hu, Iacono,López-Ortiz — FOCS 2003]

• Dynamic B-tree in 𝑂𝑂 log𝐵𝐵 𝑁𝑁 per operation[Bender, Demaine, Farach-Colton — FOCS 2000][Bender, Duan, Iacono, Wu — SODA 2002][Brodal,Fagerberg,Jacob —SODA 2002]

Page 50: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Cache-Oblivious Sorting

• 𝑂𝑂 𝑁𝑁𝐵𝐵

log ⁄𝑀𝑀 𝐵𝐵𝑁𝑁𝐵𝐵

possible, assuming𝑀𝑀 ≥ Ω 𝐵𝐵1+𝜀𝜀 (tall cache) Funnel sort:

mergesort analog Distribution sort[Frigo, Leiserson, Prokop,Ramachandran — FOCS 1999;Brodal & Fagerberg — ICALP 2002]

• Impossible without tall-cache assumption[Brodal & Fagerberg — STOC 2003]

Page 51: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

http://courses.csail.mit.edu/6.851/

Page 52: The History of I/O ModelsSolves Floyd’s two-level storage problem (𝑀𝑀= 3𝐵𝐵) 𝐵𝐵items Sorting Lower Bound [Aggarwal & Vitter — ICALP 1987, C. ACM 1988] • Sorting

Models, Models, ModelsModel Year Blocking Caching Levels SimpleIdealized2-level 1972 ✓ ✗ 2 ✓Red-blue pebble 1981 ✗ ✓ 2 ✓−External memory 1987 ✓ ✓ 2 ✓HMM 1987 ✗ ✓ ∞ ✓BT 1987 ~ ✓ ∞ ✓−(U)MH 1990 ✓ ✓ ∞ ✗Cache oblivious 1999 ✓ ✓ 2–∞ ✓+


Top Related