comparison based dictionaries: fault tolerance versus i/o efficiency

23
Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency Gerth Stølting Brodal Allan Grønlund Jørgensen Thomas Mølhave University of Aarhus ADS 2007, 3rd Bertinoro Workshop on Algorithms and Data Structures University Residential Centre of Bertinoro, Italy, September 30-October 5, 2007

Upload: haviva-finley

Post on 01-Jan-2016

23 views

Category:

Documents


1 download

DESCRIPTION

Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency. Gerth Stølting Brodal Allan Grønlund Jørgensen Thomas Mølhave University of Aarhus. ADS 2007, 3rd Bertinoro Workshop on Algorithms and Data Structures - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Comparison Based Dictionaries: Fault Tolerance versus I/O Efficiency

Gerth Stølting BrodalAllan Grønlund Jørgensen

Thomas Mølhave

University of Aarhus

ADS 2007, 3rd Bertinoro Workshop on Algorithms and Data StructuresUniversity Residential Centre of Bertinoro, Italy, September 30-October 5, 2007

Page 2: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave2

Binary Searching

Fault tolerance

I/O Efficiency

This talk

Future work

Page 3: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave3

Search(17)

4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38

17

O(log N) comparisons

Page 4: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave4

Search(17)

4 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38

17?

9

soft memory error

Page 5: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave5

Faulty-Memory RAM ModelFinocchi and Italiano, STOC’04

Content of memory cells can get corrupted Corrupted and uncorrupted content cannot be distinguished O(1) safe registers Assumption: At most δ corruptions

Example: Sorting requires time Θ(N·log N+δ2)Finocchi, Grandoni, Italiano, ICALP‘06

Page 6: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave6

Faulty-Memory RAM:Searching

Lower bound

Upper bound

Θ(log N + δ) comparisons

Finocchi, Grandoni, Italiano, ICALP’06

Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07

Page 7: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave7

Faulty-Memory RAM:Searching

17?

94 7 10 13 14 15 16 18 19 23 25 26 27 29 30 31 32 33 34 36 38

Problem? High confidenceLow confidence

Requirement: If there exists an uncorrupted element equal to the search key, we should find such an element

Page 8: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave8

Faulty-Memory RAM:Searching

When are we done (δ=3)?

Contradiction, i.e. at least one fault

If range contains at least δ+1 and δ+1 then there is at least one uncorrupted and , i.e. x must be contained in the range

Page 9: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave9

If verification fails → contradiction, i.e. ≥1 memory-fault → ignore 4 last comparisons → backtrack one level of search

Faulty-Memory RAM:Θ(log N + δ) Searching

1 12 233 44 55

Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07

Page 10: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave10

Faulty-Memory RAM:Θ(log N + δ) Searching

1 12 233 44

Standard binary search + verification steps At most δ verification steps can fail/backtracking Detail: Avoid repeated comparison with the same

(wrong) element by grouping elements into blocks of size O(δ)

Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen, Moruz, Mølhave, ESA’07

Page 11: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave11

Faulty-Memory RAM: Reliable Values

Store 2δ+1 copies of value x - at most δ copies uncorrupted x = majority Time O(δ) using two safe registers (candidate and count)

δ=5 y y y x x y x x x y x

Candidate y y y y y y y – x – x

Count 1 2 3 2 1 2 1 0 1 0 1

Boyer and Moore ‘91

Page 12: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave12

Faulty-Memory RAM:Dynamic Dictionaries

Packed array Reliable pointers and keys Updates O(δ ·log2 N) Searches = fault tolerant O(log N+δ)

2-level buckets of size O(δ·log N) Root: Reliable pointers and keys Bucket search/update amortized O(log

N+δ)

... ...

Θ(δ·log N)elements

Itai, Konheim, Rodeh, 1981

Search and update amortized O(log N+δ)

...

Θ(δ) elements

Brodal, Fagerberg, Finocchi, Grandoni, Italiano, Jørgensen,

Moruz, Mølhave, ESA’07

Page 13: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave13

I/O Model

Page 14: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave14

I/O Model

N = problem size M = memory size B = I/O block size

One I/O moves B consecutive records from to disk Complexity = number of I/Os

CPU External

I/O

MemoryyromeM

Aggarwal and Vitter 1988

B

N

B

NBM /log Example: Sorting requires I/Os

Page 15: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave15

B-trees

O(logB N)

....

Ω(B)

Search path

Search and update O(logB N)

Page 16: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave16

Fault-Tolerance

versus

I/O Efficiency

Page 17: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave17

Lower Bound for Fault-Tolerant External Searching

Adversary argument

If Bε slabs per I/O → factor Bε reduction and B1-ε faults After k I/Os N/(Bε)k–k· B1-ε elements remain

Possible values

1

log1

BNB I/Os required [minimized wrt ε ]

Page 18: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave18

Randomized Upper Bound for Fault-Tolerant External Searching Sorted array + 2δ identical B-trees

(over N/(2δ) elements, stored in BFS layout) Search: Select random tree for each

node on search path + verification Probability no faults on path:

where Σβi≤δ

Search O(logB N+δ/B) expected

2

1

21

log

1

N

i

iB

....

Page 19: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave19

Sorted array

+ 2δ/B1-ε identical B-trees of degree Bε

+ B1-ε copies of each key + min/max Search: Verify against min/max in

each step – if fail, backtrack one

level and advance to next copy

Search I/Os

Deterministic Upper Bound for Fault-Tolerant External Searching

BB

NO B

1

log1

Page 20: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave20

Dynamic Fault-Tolerant External DictionariesStatic structure

+ Packed arrays

+ Buckets of size O(δ ·log3 N)

Deterministic

I/Os search and updates

Randomized

Expected O(logB N+δ/B) I/Os search and updates

1

log1

BNO B

... ...

Static

Page 21: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave21

Fault-tolerant external memory searching

I/Os

worst-case [minized wrt ε]

Randomized O(logB N+δ/B) I/Os

Conclusion

1

log1

BNB

Page 22: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave22

Future Work Fault Tolerance versus I/O Efficiency Randomized algorithms:

Memory faults in internal memory?

Sorting:

? ...

BB

N

B

NBM

2

/log

Page 23: Comparison Based Dictionaries:  Fault Tolerance versus I/O Efficiency

Dictionaries: Fault Tolerance versus I/O EfficiencyBrodal, Jørgensen, Mølhave23

THANKS