the history of i/o modelssolves floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต)...

52
The History of I/O Models Erik Demaine MASSACHUSETTS INSTITUTE OF TECHNOLOGY

Upload: others

Post on 26-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

The History ofI/O Models

Erik Demaine

MASSACHUSETTSINSTITUTE OFTECHNOLOGY

Page 2: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Memory Hierarchies in Practice

CPU Registers Level 1Cache

Level 2Cache

MainMemory

โ€ฆ

Disk

1MB

10GB

100KB100B

1TB

4 ms

14 ns

750 ps

Internet

1EB-1ZB

Outer Space< 1083

Presenter
Presentation Notes
http://en.wikipedia.org/wiki/Disk-drive_performance_characteristics#Rotational_latency 4ms on 7200rpm
Page 3: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Models, Models, ModelsModel Year Blocking Caching Levels SimpleIdealized2-level 1972 โœ“ โœ— 2 โœ“Red-blue pebble 1981 โœ— โœ“ 2 โœ“โˆ’External memory 1987 โœ“ โœ“ 2 โœ“HMM 1987 โœ— โœ“ โˆž โœ“BT 1987 ~ โœ“ โˆž โœ“โˆ’(U)MH 1990 โœ“ โœ“ โˆž โœ—Cache oblivious 1999 โœ“ โœ“ 2โ€“โˆž โœ“+

Page 4: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Idealized Two-Level Storage[Floyd โ€” Complexity of Computer Computations 1972]

PDP-11 (1970โ€“1990s)

Ken Thompson

Dennis Ritchie

2.4MB RK03 disks

8KB core memory

Presenter
Presentation Notes
PDP-11 is 16-bit RK03 disk stores 1.2 million words [ftp://ftp.update.uu.se/pub/pdp11/doc/rk11-c.txt] http://cm.bell-labs.com/who/dmr/kd14.jpg from http://cm.bell-labs.com/who/dmr/picture.html 4K words of memory, 950 ns must be core: http://www.village.org/pdp11/faq.pages/price.images/pdp11-20-aug-1971/small/page1.gif http://www.psych.usyd.edu.au/pdp-11/Images/core_closeup.jpeg from http://www.psych.usyd.edu.au/pdp-11/core.html
Page 5: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Idealized Two-Level Storage[Floyd โ€” Complexity of Computer Computations 1972]

โ€ข RAM = blocks of โ‰ค ๐ต๐ต itemsโ€ข Block operation: Read up to ๐ต๐ต items

from two blocks ๐‘–๐‘–, ๐‘—๐‘— Write to third block ๐‘˜๐‘˜

โ€ข Ignore item order within block CPU operations considered free

โ€ข Items are indivisible

CPU

๐ต๐ต items

๐‘–๐‘–

๐‘—๐‘—

๐‘˜๐‘˜

Page 6: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Permutation Lower Bound[Floyd โ€” Complexity of Computer Computations 1972]

โ€ข Theorem: Permuting ๐‘๐‘ items toโ„๐‘๐‘ ๐ต๐ต (full) specified blocks needs

block operations, in average case Assuming ๐‘๐‘

๐ต๐ต> ๐ต๐ต (tall disk)

โ€ข Simplified model: Move items instead of copy Equivalence: Follow itemโ€™s path from start to finish

CPU

๐ต๐ต items

๐‘–๐‘–

๐‘—๐‘—

๐‘˜๐‘˜

ฮฉ๐‘๐‘๐ต๐ต

log๐ต๐ต

Presenter
Presentation Notes
Original paper called blocks โ€œpagesโ€, used โ€œpโ€ for block size, used โ€œwโ€ for number of pages, and made the assumption w > p (didnโ€™t call it anything).
Page 7: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Permutation Lower Bound[Floyd โ€” Complexity of Computer Computations 1972]

โ€ข Potential:

Maximized in target configurationof full blocks (๐‘›๐‘›๐‘–๐‘–๐‘–๐‘–=๐ต๐ต): ฮฆ = ๐‘๐‘ log๐ต๐ต

Random configuration with ๐‘๐‘๐ต๐ต

> ๐ต๐ตhas E ๐‘›๐‘›๐‘–๐‘–๐‘–๐‘– = ๐‘‚๐‘‚(1) โ‡’ E ฮฆ = ๐‘‚๐‘‚(๐‘๐‘) Claim: Block operation increases ฮฆ by โ‰ค ๐ต๐ต

โ‡’ Number of block operations โ‰ฅ ๐‘๐‘ log ๐ต๐ตโˆ’๐‘‚๐‘‚(๐‘๐‘)๐ต๐ต

๐ต๐ต items

๐‘–๐‘–

๐‘—๐‘—

๐‘˜๐‘˜

# items in block ๐‘–๐‘– destined for block ๐‘—๐‘—

ฮฆ = ๏ฟฝ๐‘–๐‘–,๐‘–๐‘–

๐‘›๐‘›๐‘–๐‘–๐‘–๐‘– log๐‘›๐‘›๐‘–๐‘–๐‘–๐‘–

Page 8: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Permutation Lower Bound[Floyd โ€” Complexity of Computer Computations 1972]

โ€ข Potential:

Maximized in target configurationof full blocks (๐‘›๐‘›๐‘–๐‘–๐‘–๐‘–=๐ต๐ต): ฮฆ = ๐‘๐‘ log๐ต๐ต

Random configuration with ๐‘๐‘๐ต๐ต

> ๐ต๐ตhas E ๐‘›๐‘›๐‘–๐‘–๐‘–๐‘– = ๐‘‚๐‘‚(1) โ‡’ E ฮฆ = ๐‘‚๐‘‚(๐‘๐‘) Claim: Block operation increases ฮฆ by โ‰ค ๐ต๐ต

oFact: ๐‘ฅ๐‘ฅ + ๐‘ฆ๐‘ฆ log ๐‘ฅ๐‘ฅ + ๐‘ฆ๐‘ฆ โ‰ค ๐‘ฅ๐‘ฅ log ๐‘ฅ๐‘ฅ + ๐‘ฆ๐‘ฆ log ๐‘ฆ๐‘ฆ + ๐‘ฅ๐‘ฅ + ๐‘ฆ๐‘ฆo So combining groups ๐‘ฅ๐‘ฅ & ๐‘ฆ๐‘ฆ increases ฮฆ by โ‰ค ๐‘ฅ๐‘ฅ + ๐‘ฆ๐‘ฆ

๐ต๐ต items

๐‘–๐‘–

๐‘—๐‘—

๐‘˜๐‘˜

# items in block ๐‘–๐‘– destined for block ๐‘—๐‘—

ฮฆ = ๏ฟฝ๐‘–๐‘–,๐‘–๐‘–

๐‘›๐‘›๐‘–๐‘–๐‘–๐‘– log๐‘›๐‘›๐‘–๐‘–๐‘–๐‘–

Page 9: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Permutation Bounds[Floyd โ€” Complexity of Computer Computations 1972]

โ€ข Theorem: ฮฉ ๐‘๐‘๐ต๐ต

log๐ต๐ต Tight for ๐ต๐ต = ๐‘‚๐‘‚(1)

โ€ข Theorem: ๐‘‚๐‘‚ ๐‘๐‘๐ต๐ต

log ๐‘๐‘๐ต๐ต

Similar to radix sort,where key = target block index

Accidental claim: tight for all ๐ต๐ต < ๐‘๐‘๐ต๐ต

We will see: tight for ๐ต๐ต > log ๐‘๐‘๐ต๐ต

๐ต๐ต items

๐‘–๐‘–

๐‘—๐‘—

๐‘˜๐‘˜

CPU

Page 10: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Idealized Two-Level Storage[Floyd โ€” Complexity of Computer Computations 1972]

โ€ข External memory & word RAM: ๐ต๐ต items

๐‘–๐‘–

๐‘—๐‘—

๐‘˜๐‘˜

Page 11: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Idealized Two-Level Storage[Floyd โ€” Complexity of Computer Computations 1972]

โ€ข External memory & word RAM:

โ€ข Foreshadowing future models:

๐ต๐ต items

๐‘–๐‘–

๐‘—๐‘—

๐‘˜๐‘˜

CPU

Page 12: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Red-Blue Pebble Game[Hong & Kung โ€” STOC 1981]

CPU

๐‘€๐‘€ items

cache disk

TRS-80[1977โ€“1981]

VIC-20[1980โ€“1985]

Apple II[1977โ€“1993]

4KB RAM

5KB RAM

4KB RAM

Presenter
Presentation Notes
First use of โ€œI/O complexityโ€ http://upload.wikimedia.org/wikipedia/commons/thumb/e/e4/TRS-80_Model_I_-_Rechnermuseum_Cropped.jpg/485px-TRS-80_Model_I_-_Rechnermuseum_Cropped.jpg from http://en.wikipedia.org/wiki/TRS-80 from http://en.wikipedia.org/wiki/TRS-80 http://upload.wikimedia.org/wikipedia/commons/thumb/3/39/CBMVIC20P8.jpg/800px-CBMVIC20P8.jpg from http://en.wikipedia.org/wiki/File:CBMVIC20P8.jpg from http://en.wikipedia.org/wiki/Commodore_VIC-20 http://upload.wikimedia.org/wikipedia/en/thumb/8/82/Apple_II_tranparent_800.png/651px-Apple_II_tranparent_800.png from http://en.wikipedia.org/wiki/File:Apple_II_tranparent_800.png from http://en.wikipedia.org/wiki/Apple_II_series#Apple_II
Page 13: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Pebble Game[Hopcroft, Paul, Valiant โ€” J. ACM 1977]

โ€ข View computation as DAGof data dependencies

โ€ข Pebble = โ€œin memoryโ€โ€ข Moves: Place pebble on node

if all predecessors have a pebble Remove pebble from node

โ€ข Goal: Pebbles on all output nodes Minimize maximum number of pebbles over time

inputs:

outputs:

Presenter
Presentation Notes
http://dl.acm.org/citation.cfm?id=322015&dl=
Page 14: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Pebble Game[Hopcroft, Paul, Valiant โ€” J. ACM 1977]

โ€ข Theorem: Any DAG can be โ€œexecutedโ€ using ๐‘‚๐‘‚ ๐‘›๐‘›

log ๐‘›๐‘›maximum pebbles

โ€ข Corollary:

DTIME ๐‘ก๐‘ก โŠ† DSPACE๐‘ก๐‘ก

log ๐‘ก๐‘ก

inputs:

outputs:

Presenter
Presentation Notes
http://dl.acm.org/citation.cfm?id=322015&dl=
Page 15: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Red-Blue Pebble Game[Hong & Kung โ€” STOC 1981]

โ€ข Red pebble = โ€œin cacheโ€โ€ข Blue pebble = โ€œon diskโ€โ€ข Moves: Place red pebble on node if all

predecessors have red pebble Remove pebble from node Write: Red pebble โ†’ blue pebble Read: Blue pebble โ†’ red pebble

โ€ข Goal: Blue inputs to blue outputs โ‰ค ๐‘€๐‘€ red pebbles at any time

inputs:

outputs:

minimize

Page 16: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Red-Blue Pebble Game[Hong & Kung โ€” STOC 1981]

โ€ข Red pebble = โ€œin cacheโ€โ€ข Blue pebble = โ€œon diskโ€

inputs:

outputs:

CPU

๐‘€๐‘€ items

cache disk

minimize number ofcache โ†” disk I/Os

(memory transfers)

Page 17: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Red-Blue Pebble Game Results[Hong & Kung โ€” STOC 1981]

Computation DAG Memory Transfers Speedup

Fast Fourier Transform (FFT) ฮ˜ ๐‘๐‘ log๐‘€๐‘€ ๐‘๐‘ ฮ˜ log๐‘€๐‘€

Ordinary matrix-vector multiplication ฮ˜

๐‘๐‘2

๐‘€๐‘€ฮ˜ ๐‘€๐‘€

Ordinary matrix-matrix multiplication ฮ˜

๐‘๐‘3

๐‘€๐‘€ฮ˜ ๐‘€๐‘€

Odd-even transposition sort ฮ˜

๐‘๐‘2

๐‘€๐‘€ฮ˜ ๐‘€๐‘€

๐‘๐‘ ร— ๐‘๐‘ ร— โ‹ฏร— ๐‘๐‘ grid ฮฉ๐‘๐‘๐‘‘๐‘‘

๐‘€๐‘€1/(๐‘‘๐‘‘โˆ’1) ฮ˜ ๐‘€๐‘€1/(๐‘‘๐‘‘โˆ’1)

๐‘‘๐‘‘

Presenter
Presentation Notes
Actual matrix results handle rectangular matrices, not just square. Speedups are the same.
Page 18: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Comparison

CPU

๐‘€๐‘€ items

cache disk

CPU

๐ต๐ต items

Idealized two-level storage

[Floyd 1972]

Red-bluepebble game

[Hong & Kung 1981]

blocks but no cache cache but no blocks

Presenter
Presentation Notes
http://cache.gawkerassets.com/assets/images/8/2011/06/avp_2.jpg from http://io9.com/avp/
Page 19: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

I/O Model [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988]

โ€ข AKA: External Memory Model, Disk Access Modelโ€ข Goal: Minimize number of I/Os (memory transfers)

CPU

๐ต๐ต items

๐‘€๐‘€๐ต๐ต

blocks

cache๐ต๐ต itemsdisk

cache and blocks

Page 20: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Scanning[Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988]

โ€ข Visiting ๐‘๐‘ elements in order costs๐‘‚๐‘‚ 1 + ๐‘๐‘

๐ต๐ตmemory transfers

โ€ข More generally, can run โ‰ค ๐‘€๐‘€๐ต๐ต

parallel scans, keeping 1 block per scan in cache E.g., merge ๐‘‚๐‘‚ ๐‘€๐‘€

๐ต๐ตlists of total size ๐‘๐‘

in ๐‘‚๐‘‚ 1 + ๐‘๐‘๐ต๐ต

memory transfers

๐ต๐ต items

Page 21: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Practical Scanning [Arge]

โ€ข Does the ๐ต๐ต factor matter? Should I presort my linked list before traversal?

โ€ข Example: ๐‘๐‘ = 256,000,000 ~ 1GB ๐ต๐ต = 8,000 ~ 32KB (small) 1ms disk access time (small)

๐‘๐‘ memory transfers take 256,000 sec โ‰ˆ 71 hours

๐‘๐‘๐ต๐ต

memory transfers take โ„256 8 = ๐Ÿ‘๐Ÿ‘๐Ÿ‘๐Ÿ‘ seconds

Page 22: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Searching[Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988]

โ€ข Finding an element ๐‘ฅ๐‘ฅ among ๐‘๐‘ items requires ฮ˜ log๐ต๐ต+1 ๐‘๐‘ memory transfers

โ€ข Lower bound: (comparison model) Each block reveals where ๐‘ฅ๐‘ฅ fits among ๐ต๐ต items โ‡’ Learn โ‰ค log ๐ต๐ต + 1 bits per read Need log ๐‘๐‘ + 1 bits

โ€ข Upper bound: B-tree Insert & delete

in ๐‘‚๐‘‚ log๐ต๐ต+1 ๐‘๐‘

๐ต๐ต items

Page 23: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Sorting and Permutation[Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988]

โ€ข Sorting bound: ฮ˜ ๐‘๐‘๐ต๐ต

log โ„๐‘€๐‘€ ๐ต๐ต๐‘๐‘๐ต๐ต

โ€ข Permutation bound: ฮ˜ min ๐‘๐‘, ๐‘๐‘๐ต๐ต

log โ„๐‘€๐‘€ ๐ต๐ต๐‘๐‘๐ต๐ต

Either sort or use naรฏve RAM algorithm Solves Floydโ€™s two-level storage problem (๐‘€๐‘€ = 3๐ต๐ต)

๐ต๐ต items

Page 24: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Sorting Lower Bound[Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988]

โ€ข Sorting bound: ฮฉ ๐‘๐‘๐ต๐ต

log โ„๐‘€๐‘€ ๐ต๐ต๐‘๐‘๐ต๐ต

Always keep cache sorted (free) Might as well presort each block Upon reading a block, learn how those ๐ต๐ต items fit

amongst ๐‘€๐‘€ items in cache

โ‡’ Learn lg ๐‘€๐‘€+๐ต๐ต๐ต๐ต โˆผ ๐ต๐ต lg ๐‘€๐‘€

๐ต๐ตbits

Need lg๐‘๐‘! โˆผ ๐‘๐‘ lg๐‘๐‘ bits Know ๐‘๐‘ lg๐ต๐ต bits from block presort

๐ต๐ต items

Page 25: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Sorting Upper Bound[Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988]

โ€ข Sorting bound: ๐‘‚๐‘‚ ๐‘๐‘๐ต๐ต

log โ„๐‘€๐‘€ ๐ต๐ต๐‘๐‘๐ต๐ต

๐‘‚๐‘‚ ๐‘€๐‘€๐ต๐ต

-way mergesort

๐‘‡๐‘‡ ๐‘๐‘ = ๐‘€๐‘€๐ต๐ต๐‘‡๐‘‡ ๏ฟฝ๐‘๐‘ ๐‘€๐‘€

๐ต๐ต+ ๐‘‚๐‘‚ 1 + ๐‘๐‘

๐ต๐ต

๐‘‡๐‘‡ ๐ต๐ต = ๐‘‚๐‘‚ 1๐ต๐ต items

๐‘๐‘/๐ต๐ต

1 1 1

๐‘๐‘/๐ต๐ต

๐‘๐‘/๐ต๐ต

๐‘๐‘/๐ต๐ต

log โ„๐‘€๐‘€ ๐ต๐ต๐‘๐‘๐ต๐ต

levels

๐‘€๐‘€/๐ต๐ต

Page 26: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Distribution Sort[Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988]

โ€ข โ„๐‘€๐‘€ ๐ต๐ต-way quicksort

1. Find โ„๐‘€๐‘€ ๐ต๐ต partition elements,roughly evenly spaced

2. Partition array into โ„๐‘€๐‘€ ๐ต๐ต + 1 pieces

Scan: ๐‘‚๐‘‚ ๐‘๐‘๐ต๐ต

memory transfers

3. Recurse Same recurrence as mergesort

๐ต๐ต items

Page 27: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Distribution Sort Partitioning[Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988]

1. For first, second, โ€ฆ interval of ๐‘€๐‘€ items: Sort in ๐‘‚๐‘‚ โ„๐‘€๐‘€ ๐ต๐ต memory transfers Sample every 1

4โ„๐‘€๐‘€ ๐ต๐ต th item

Total sample: 4๐‘๐‘/ โ„๐‘€๐‘€ ๐ต๐ต items

2. For ๐‘–๐‘– = 1,2, โ€ฆ, โ„๐‘€๐‘€ ๐ต๐ต: Run linear-time selection to find

sample element at โ„๐‘–๐‘– โ„๐‘€๐‘€ ๐ต๐ต fraction

Cost: ๐‘‚๐‘‚ ๏ฟฝ๐‘๐‘โ„๐‘€๐‘€ ๐ต๐ต

๐ต๐ต each

Total: ๐‘‚๐‘‚ โ„๐‘๐‘ ๐ต๐ต memory transf.

๐ต๐ต items

Page 28: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Random vs. Sequential I/Os [Farach,Ferragina, Muthukrishnan โ€” FOCS 1998]

โ€ข Sequential memory transfers are part ofbulk read/write of ฮ˜ ๐‘€๐‘€ items

โ€ข Random memory transfer otherwiseโ€ข Sorting: 2-way mergesort achieves ๐‘‚๐‘‚ ๐‘๐‘

๐ต๐ตlog ๐‘๐‘

๐ต๐ตsequential

๐‘œ๐‘œ ๐‘๐‘๐ต๐ต

log โ„๐‘€๐‘€ ๐ต๐ต๐‘๐‘๐ต๐ต

random implies ฮฉ ๐‘๐‘๐ต๐ต

log ๐‘๐‘๐ต๐ต

total

โ€ข Same trade-off forsuffix-tree construction

๐ต๐ต items

Page 29: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Hierarchical Memory Model (HMM) [Aggarwal, Alpern, Chandra, Snir โ€” STOC 1987]

โ€ข Nonuniform-cost RAM: Accessing memory location ๐‘ฅ๐‘ฅ costs ๐‘“๐‘“ ๐‘ฅ๐‘ฅ = log ๐‘ฅ๐‘ฅ

โ€œparticularly simple model of computation that mimics the behavior of a memory hierarchy consisting of increasingly larger amounts of slower memoryโ€

1 1 2 4 8 2๐‘–๐‘–1 1 1 1 1CPU

Presenter
Presentation Notes
http://dl.acm.org/citation.cfm?id=28428
Page 30: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Why ๐’‡๐’‡ ๐’™๐’™ = ๐ฅ๐ฅ๐ฅ๐ฅ๐ฅ๐ฅ ๐’™๐’™? [Mead & Conway 1980]

Presenter
Presentation Notes
http://ai.eecs.umich.edu/people/conway/VLSI/VLSIText/IntroToVLSI.jpg from http://ai.eecs.umich.edu/people/conway/Retrospective4.html
Page 31: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

HMM Upper & Lower Bounds[Aggarwal, Alpern, Chandra, Snir โ€” STOC 1987]

Problem Time SlowdownSemiring matrix multiplication ฮ˜ ๐‘๐‘3 ฮ˜(1)

Fast Fourier Transform ฮ˜(๐‘๐‘ log๐‘๐‘ log log๐‘๐‘) ฮ˜ log log๐‘๐‘

Sorting ฮ˜(๐‘๐‘ log๐‘๐‘ log log๐‘๐‘) ฮ˜ log log๐‘๐‘Scanning input (sum, max, DFS, planarity, etc.)

ฮ˜ ๐‘๐‘ log๐‘๐‘ ฮ˜ log๐‘๐‘

Binary search ฮ˜ log2 ๐‘๐‘ ฮ˜ log๐‘๐‘

Page 32: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

๐‡๐‡๐‡๐‡๐‡๐‡๐’‡๐’‡(๐’™๐’™)[Aggarwal, Alpern, Chandra, Snir โ€” STOC 1987]

โ€ข Say accessing memory location ๐‘ฅ๐‘ฅ costs ๐‘“๐‘“ ๐‘ฅ๐‘ฅโ€ข Assume ๐‘“๐‘“ 2๐‘ฅ๐‘ฅ โ‰ค ๐‘๐‘ ๐‘“๐‘“(๐‘ฅ๐‘ฅ) for a constant ๐‘๐‘ > 0

(โ€œpolynomially boundedโ€)โ€ข Write ๐‘“๐‘“ ๐‘ฅ๐‘ฅ = โˆ‘๐‘–๐‘– ๐‘ค๐‘ค๐‘–๐‘– โ‹… [๐‘ฅ๐‘ฅ โ‰ฅ ๐‘ฅ๐‘ฅ๐‘–๐‘–? ]

(weighted sum of threshold functions)

๐‘ฅ๐‘ฅ0 ๐‘ฅ๐‘ฅ1 โˆ’ ๐‘ฅ๐‘ฅ0๐‘ค๐‘ค0CPU ๐‘ฅ๐‘ฅ2 โˆ’ ๐‘ฅ๐‘ฅ1

๐‘ค๐‘ค1 ๐‘ค๐‘ค2 โ€ฆ0

๐‘ฅ๐‘ฅ๐‘–๐‘–

Presenter
Presentation Notes
http://upload.wikimedia.org/wikipedia/commons/0/05/Threshold_function.svg
Page 33: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Uniform Optimality [Aggarwal, Alpern, Chandra, Snir โ€” STOC 1987]

โ€ข Consider one term ๐‘“๐‘“๐‘€๐‘€ ๐‘ฅ๐‘ฅ = [๐‘ฅ๐‘ฅ โ‰ฅ ๐‘€๐‘€? ]โ€ข Algorithm is uniformly optimal

if optimal on HMM๐‘“๐‘“๐‘€๐‘€ ๐‘ฅ๐‘ฅ for all ๐‘€๐‘€โ€ข Implies optimality for all ๐‘“๐‘“ ๐‘ฅ๐‘ฅ

โˆžCPU ๐‘€๐‘€ 10

๐‘€๐‘€

CPU

๐‘€๐‘€ items

cache disk red-bluepebble game!

Presenter
Presentation Notes
http://upload.wikimedia.org/wikipedia/commons/0/05/Threshold_function.svg
Page 34: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

๐‡๐‡๐‡๐‡๐‡๐‡๐’‡๐’‡๐‘ด๐‘ด ๐’™๐’™ Upper & Lower Bounds[Aggarwal, Alpern, Chandra, Snir โ€” STOC 1987]

Problem Time SpeedupSemiring matrix multiplication ฮ˜

๐‘๐‘3

๐‘€๐‘€ฮ˜ ๐‘€๐‘€

Fast Fourier Transform ฮ˜ ๐‘๐‘ log๐‘€๐‘€ ๐‘๐‘ ฮ˜ log๐‘€๐‘€

Sorting ฮ˜ ๐‘๐‘ log๐‘€๐‘€ ๐‘๐‘ ฮ˜ log๐‘€๐‘€Scanning input (sum, max, DFS, planarity, etc.)

ฮ˜ ๐‘๐‘ โˆ’๐‘€๐‘€ 1 + 1/๐‘€๐‘€

Binary search ฮ˜ log๐‘๐‘ โˆ’ log๐‘€๐‘€ 1 + 1/ log๐‘€๐‘€

upper bounds knownby Hong & Kung 1981

other bounds follow fromAggarwal & Vitter 1987

Page 35: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Implicit HMM Memory Management[Aggarwal, Alpern, Chandra, Snir โ€” STOC 1987]

โ€ข Instead of algorithm explicitly moving data,use any conservative replacement strategy (e.g., FIFO or LRU) to evict from cache[Sleator & Tarjan โ€” C. ACM 1985]

โ€ข ๐‘‡๐‘‡LRU ๐‘๐‘,๐‘€๐‘€ โ‰ค 2 โ‹… ๐‘‡๐‘‡OPT ๐‘๐‘, โ„๐‘€๐‘€ 2= ๐‘‚๐‘‚ ๐‘‡๐‘‡OPT ๐‘๐‘,๐‘€๐‘€ assuming ๐‘“๐‘“ 2๐‘ฅ๐‘ฅ โ‰ค ๐‘๐‘ ๐‘“๐‘“(๐‘ฅ๐‘ฅ)

โ€ข Not uniform!

โˆžCPU ๐‘€๐‘€ 10

Page 36: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Implicit HMM Memory Management[Aggarwal, Alpern, Chandra, Snir โ€” STOC 1987]

โ€ข For general ๐‘“๐‘“, split memory into chunks at ๐‘ฅ๐‘ฅwhere ๐‘“๐‘“(๐‘ฅ๐‘ฅ) doubles (up to rounding)

๐‘ฅ๐‘ฅ0 ๐‘ฅ๐‘ฅ1 โˆ’ ๐‘ฅ๐‘ฅ0๐‘ค๐‘ค0CPU ๐‘ฅ๐‘ฅ2 โˆ’ ๐‘ฅ๐‘ฅ1

๐‘ค๐‘ค1 ๐‘ค๐‘ค2 โ€ฆ0

Page 37: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Implicit HMM Memory Management[Aggarwal, Alpern, Chandra, Snir โ€” STOC 1987]

โ€ข For general ๐‘“๐‘“, split memory into chunks at ๐‘ฅ๐‘ฅwhere ๐‘“๐‘“(๐‘ฅ๐‘ฅ) doubles (up to rounding)

โ€ข LRU eviction from first chunk into second;LRU eviction from second chunk into third; etc.

โ€ข ๐‘‡๐‘‡๐ฟ๐ฟ๐ฟ๐ฟ๐ฟ๐ฟ ๐‘๐‘ = ๐‘‚๐‘‚ ๐‘‡๐‘‡๐‘‚๐‘‚๐‘‚๐‘‚๐‘‚๐‘‚ ๐‘๐‘ + ๐‘๐‘ โ‹… ๐‘“๐‘“ ๐‘๐‘ Like MTF

1 1 2 4 8 2๐‘–๐‘–1 1 1 1 1CPU

Page 38: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

HMM with Block Transfer (BT) [Aggarwal, Chandra, Snir โ€” FOCS 1987]

โ€ข Accessing memory location ๐‘ฅ๐‘ฅ costs ๐‘“๐‘“ ๐‘ฅ๐‘ฅโ€ข Copying memory interval from ๐‘ฅ๐‘ฅ โˆ’ ๐›ฟ๐›ฟโ€ฆ ๐‘ฅ๐‘ฅ

to ๐‘ฆ๐‘ฆ โˆ’ ๐›ฟ๐›ฟโ€ฆ๐‘ฆ๐‘ฆ costs ๐‘“๐‘“ max ๐‘ฅ๐‘ฅ, ๐‘ฆ๐‘ฆ + ๐›ฟ๐›ฟ Models memory pipelining ~ block transfer Ignores block alignment, explicit levels, etc.

1 1 2 4 8 2๐‘–๐‘–1 1 1 1 1CPU

Presenter
Presentation Notes
Aggarwal, Chandra, Snir - Hierarchical Memory with Block Transfer - FOCS 1987 http://dx.doi.org/10.1109/SFCS.1987.31
Page 39: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

BT Results[Aggarwal, Chandra, Snir โ€” FOCS 1987]

Problem ๐‘“๐‘“ ๐‘ฅ๐‘ฅ = log ๐‘ฅ๐‘ฅ ๐‘“๐‘“ ๐‘ฅ๐‘ฅ = ๐‘ฅ๐‘ฅ๐›ผ๐›ผ, 0 < ๐›ผ๐›ผ < 1 ๐‘“๐‘“ ๐‘ฅ๐‘ฅ = ๐‘ฅ๐‘ฅ ๐‘“๐‘“ ๐‘ฅ๐‘ฅ = ๐‘ฅ๐‘ฅ๐›ผ๐›ผ,

๐›ผ๐›ผ > 1Dot product, merging lists ฮ˜ ๐‘๐‘ logโˆ— ๐‘๐‘ ฮ˜ ๐‘๐‘ log log๐‘๐‘ ฮ˜(๐‘๐‘ log๐‘๐‘) ฮ˜ ๐‘๐‘๐›ผ๐›ผ

Matrix mult. ฮ˜ ๐‘๐‘3 ฮ˜ ๐‘๐‘3 ฮ˜ ๐‘๐‘3 ฮ˜ ๐‘๐‘๐›ผ๐›ผ

if ๐›ผ๐›ผ > 1.5Fast Fourier Transform ฮ˜ ๐‘๐‘ log๐‘๐‘ ฮ˜ ๐‘๐‘ log๐‘๐‘ ฮ˜ ๐‘๐‘ log2 ๐‘๐‘ ฮ˜ ๐‘๐‘๐›ผ๐›ผ

Sorting ฮ˜ ๐‘๐‘ log๐‘๐‘ ฮ˜ ๐‘๐‘ log๐‘๐‘ ฮ˜ ๐‘๐‘ log2 ๐‘๐‘ ฮ˜ ๐‘๐‘๐›ผ๐›ผ

Binary search ฮ˜

log2 ๐‘๐‘log log๐‘๐‘ ฮ˜ ๐‘๐‘๐›ผ๐›ผ ฮ˜ ๐‘๐‘ ฮ˜ ๐‘๐‘๐›ผ๐›ผ

Presenter
Presentation Notes
Aggarwal, Chandra, Snir - Hierarchical Memory with Block Transfer - FOCS 1987 http://dx.doi.org/10.1109/SFCS.1987.31
Page 40: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Memory Hierarchy Model (MH) [Alpern, Carter, Feig, Selker โ€” FOCS 1990]

โ€ข Multilevel version of external-memory modelโ€ข ๐‘€๐‘€๐‘–๐‘– โ†” ๐‘€๐‘€๐‘–๐‘–+1 transfers happen in blocks of size ๐ต๐ต๐‘–๐‘–

(subblocks of ๐‘€๐‘€๐‘–๐‘–+1), and take ๐‘ก๐‘ก๐‘–๐‘– timeโ€ข All levels can be actively transferring at once

๐‘€๐‘€0 ๐‘€๐‘€1 ๐‘€๐‘€2 ๐‘€๐‘€3 ๐‘€๐‘€40 ๐‘ก๐‘ก0 ๐‘ก๐‘ก1 ๐‘ก๐‘ก2 ๐‘ก๐‘ก3CPU

๐ต๐ต0 ๐ต๐ต1๐ต๐ต2

๐ต๐ต3๐ต๐ต4

โ€ฆ

Presenter
Presentation Notes
Conference version: Alpern, Carter, Feig: http://dx.doi.org/10.1109/FSCS.1990.89581 Journal version: Alpern, Carter, Feig, Selker: http://www.springerlink.com/index/W24P02H58XG51322.pdf
Page 41: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Uniform Memory Hierarchy (UMH) [Alpern, Carter, Feig, Selker โ€” FOCS 1990]

โ€ข Fix aspect ratio ๐›ผ๐›ผ = ๐‘€๐‘€/๐ต๐ต๐ต๐ต

, block growth ๐›ฝ๐›ฝ = ๐ต๐ต๐‘–๐‘–+1๐ต๐ต๐‘–๐‘–

โ€ข ๐ต๐ต๐‘–๐‘– = ๐›ฝ๐›ฝ๐‘–๐‘–

โ€ข ๐‘€๐‘€๐‘–๐‘–๐ต๐ต๐‘–๐‘–

= ๐›ผ๐›ผ โ‹… ๐›ฝ๐›ฝ๐‘–๐‘–

โ€ข ๐‘ก๐‘ก๐‘–๐‘– = ๐›ฝ๐›ฝ๐‘–๐‘– โ‹… ๐‘“๐‘“(๐‘–๐‘–)

๐‘€๐‘€0 ๐‘€๐‘€1 ๐‘€๐‘€2 ๐‘€๐‘€3 ๐‘€๐‘€40 ๐‘ก๐‘ก0 ๐‘ก๐‘ก1 ๐‘ก๐‘ก2 ๐‘ก๐‘ก3CPU

๐ต๐ต0 ๐ต๐ต1๐ต๐ต2

๐ต๐ต3๐ต๐ต4

โ€ฆ

2 parameters1 function

Presenter
Presentation Notes
Conference version: Alpern, Carter, Feig: http://dx.doi.org/10.1109/FSCS.1990.89581 Journal version: Alpern, Carter, Feig, Selker: http://www.springerlink.com/index/W24P02H58XG51322.pdf
Page 42: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

UMH Results[Alpern, Carter, Feig, Selker โ€” FOCS 1990]Problem Upper Bound Lower Bound

Matrix transpose๐‘“๐‘“ ๐‘–๐‘– = 1 ๐‘‚๐‘‚ 1 +

1๐›ฝ๐›ฝ2

๐‘๐‘2 ฮฉ 1 +1๐›ผ๐›ผ๐›ฝ๐›ฝ4

๐‘๐‘2

Matrix mult.๐‘“๐‘“ ๐‘–๐‘– = O ๐›ฝ๐›ฝ๐‘–๐‘– ๐‘‚๐‘‚ 1 +

1๐›ฝ๐›ฝ3

๐‘๐‘3 ฮฉ 1 +1๐›ฝ๐›ฝ3

๐‘๐‘3

FFT๐‘“๐‘“ ๐‘–๐‘– โ‰ค ๐‘–๐‘– ๐‘‚๐‘‚ 1 ฮฉ 1

General approach: Divide & conquer

Presenter
Presentation Notes
Conference version: Alpern, Carter, Feig: http://dx.doi.org/10.1109/FSCS.1990.89581 Journal version: Alpern, Carter, Feig, Selker: http://www.springerlink.com/index/W24P02H58XG51322.pdf
Page 43: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Cache-Oblivious Model [Frigo, Leiserson, Prokop, Ramachandran โ€” FOCS 1999]

โ€ข Analyze RAM algorithm (not knowing ๐ต๐ต or ๐‘€๐‘€) on external-memory model Must work well for all ๐ต๐ต and ๐‘€๐‘€

CPU

๐ต๐ต items

๐‘€๐‘€๐ต๐ต

blocks

cache๐ต๐ต itemsdisk

Page 44: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Cache-Oblivious Model [Frigo,Leiserson, Prokop, Ramachandran โ€” FOCS 1999]

โ€ข Automatic block transfers via LRU or FIFOโ€ข Lose factor of 2 in ๐‘€๐‘€ and number of transfers Assume ๐‘‡๐‘‡ ๐ต๐ต, 2๐‘€๐‘€ โ‰ค ๐‘๐‘ ๐‘‡๐‘‡(๐ต๐ต,๐‘€๐‘€)

CPU

๐ต๐ต items

๐‘€๐‘€๐ต๐ต

blocks

cache๐ต๐ต itemsdisk

Page 45: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Cache-Oblivious Model [Frigo,Leiserson, Prokop, Ramachandran โ€” FOCS 1999]

โ€ข Clean modelโ€ข Adapts to changing ๐ต๐ต (e.g., disk tracks) and

changing ๐‘€๐‘€ (e.g., competing processes)โ€ข Adapts to multilevel memory hierarchy (MH) Assuming inclusion

๐‘€๐‘€0 ๐‘€๐‘€1 ๐‘€๐‘€2 ๐‘€๐‘€3 ๐‘€๐‘€40 ๐‘ก๐‘ก0 ๐‘ก๐‘ก1 ๐‘ก๐‘ก2 ๐‘ก๐‘ก3CPU

๐ต๐ต0 ๐ต๐ต1๐ต๐ต2

๐ต๐ต3๐ต๐ต4

โ€ฆ

Page 46: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Scanning [Frigo, Leiserson, Prokop, Ramachandran โ€” FOCS 1999]

โ€ข Visiting ๐‘๐‘ elements in order costs๐‘‚๐‘‚ 1 + ๐‘๐‘

๐ต๐ตmemory transfers

โ€ข More generally, can run ๐‘‚๐‘‚ 1 parallel scans Assume ๐‘€๐‘€ โ‰ฅ ๐‘๐‘ ๐ต๐ต for appropriate constant ๐‘๐‘ > 0

โ€ข E.g., merge two lists in ๐‘‚๐‘‚ ๐‘๐‘๐ต๐ต

๐ต๐ต items

Page 47: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Searching [Prokop โ€” Meng 1999]

โ€œvan Emde Boas layoutโ€

Page 48: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Searching [Prokop โ€” Meng 1999]

โ€ข height(โ–ณ)

โ‰ฅ12

lg๐ต๐ต

โ€ข โ‰ค 2 memory transfers per โ–ณ

โ€ข โ‰ค 4 log๐ต๐ต ๐‘๐‘ total

Page 49: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Cache-Oblivious Searching

โ€ข lg ๐‘’๐‘’ + ๐‘œ๐‘œ 1 log๐ต๐ต ๐‘๐‘ is optimal[Bender, Brodal, Fagerberg, Ge, He, Hu, Iacono,Lรณpez-Ortiz โ€” FOCS 2003]

โ€ข Dynamic B-tree in ๐‘‚๐‘‚ log๐ต๐ต ๐‘๐‘ per operation[Bender, Demaine, Farach-Colton โ€” FOCS 2000][Bender, Duan, Iacono, Wu โ€” SODA 2002][Brodal,Fagerberg,Jacob โ€”SODA 2002]

Page 50: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Cache-Oblivious Sorting

โ€ข ๐‘‚๐‘‚ ๐‘๐‘๐ต๐ต

log โ„๐‘€๐‘€ ๐ต๐ต๐‘๐‘๐ต๐ต

possible, assuming๐‘€๐‘€ โ‰ฅ ฮฉ ๐ต๐ต1+๐œ€๐œ€ (tall cache) Funnel sort:

mergesort analog Distribution sort[Frigo, Leiserson, Prokop,Ramachandran โ€” FOCS 1999;Brodal & Fagerberg โ€” ICALP 2002]

โ€ข Impossible without tall-cache assumption[Brodal & Fagerberg โ€” STOC 2003]

Page 51: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

http://courses.csail.mit.edu/6.851/

Page 52: The History of I/O ModelsSolves Floydโ€™s two-level storage problem (๐‘€๐‘€= 3๐ต๐ต) ๐ต๐ตitems Sorting Lower Bound [Aggarwal & Vitter โ€” ICALP 1987, C. ACM 1988] โ€ข Sorting

Models, Models, ModelsModel Year Blocking Caching Levels SimpleIdealized2-level 1972 โœ“ โœ— 2 โœ“Red-blue pebble 1981 โœ— โœ“ 2 โœ“โˆ’External memory 1987 โœ“ โœ“ 2 โœ“HMM 1987 โœ— โœ“ โˆž โœ“BT 1987 ~ โœ“ โˆž โœ“โˆ’(U)MH 1990 โœ“ โœ“ โˆž โœ—Cache oblivious 1999 โœ“ โœ“ 2โ€“โˆž โœ“+