GARBAGE COLLECTION
SAURABH KADEKODI RAJAT KATEJA TIANYUAN DING
GOALS
PROBLEM STATEMENT
▸ To implement efficient garbage collection in Peloton
GOALS (REVISED)
▸ 75% - implement basic tuple recycling using vacuum
▸ 100% - implement epoch-based co-operative GC
▸ 112.5% - lock-free implementations, GetMemoryFootprint()
▸ 125% - DDL garbage collection (deferred after discussion)
METHODOLOGY
WITHOUT GC
[Figure: a tile group of tuple slots with empty tile slots, shown over several animation steps]
▸ Txn: add new tuple
▸ Txn: update tuple 3 - the old version's slot cannot be marked free yet (MVCC)
▸ Later the slot can be used again ("Can use this slot…"), but nothing reclaims it
▸ Txn: update tuple 1 - the same happens again
▸ Result: unused slots accumulate in the tile group
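The behaviour on this slide can be sketched with a toy model. This is a minimal illustration, not Peloton code: `TileGroup`, `TableWithoutGC`, `latest`, and `dead_slots` are hypothetical names, and the sketch assumes each update writes the new version into the next never-used slot while the old version's slot is simply left behind.

```python
class TileGroup:
    """Toy tile group: a fixed number of tuple slots, handed out in order."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots  # None means the slot was never used
        self.next_free = 0               # index of the next never-used slot

    def allocate(self, value):
        if self.next_free == len(self.slots):
            raise MemoryError("tile group is full")
        slot = self.next_free
        self.slots[slot] = value
        self.next_free += 1
        return slot


class TableWithoutGC:
    """Hypothetical table with no garbage collection (assumption, not Peloton's API)."""

    def __init__(self, num_slots):
        self.tile_group = TileGroup(num_slots)
        self.latest = {}      # tuple key -> slot of its newest version
        self.dead_slots = []  # slots holding superseded versions, never reused

    def insert(self, key, value):
        self.latest[key] = self.tile_group.allocate(value)

    def update(self, key, value):
        # MVCC-style update: write a new version, keep the old one in place.
        old_slot = self.latest[key]
        self.latest[key] = self.tile_group.allocate(value)
        self.dead_slots.append(old_slot)


table = TableWithoutGC(num_slots=8)
for key in (1, 2, 3, 4):
    table.insert(key, f"v0-{key}")
table.update(3, "v1-3")
table.update(1, "v1-1")
print(table.tile_group.next_free)  # 6 slots consumed for 4 live tuples
print(table.dead_slots)            # [2, 0]
```

Four live tuples end up occupying six slots; the two superseded versions are the "unused slots" of the slide.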
METHODOLOGY
WITH GC
[Figure: a tile group with numbered tuple slots (1-6) and two lists, "Possibly Free" and "Actually Free", shown over several animation steps]
▸ The slot of an old tuple version is added to possibly_free_list
▸ The slot is later moved to actually_free_list
▸ A later transaction uses the recycled tuple slot
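The two-list scheme can be sketched as follows. This is a rough model under stated assumptions, not the Peloton implementation: `RecyclingTable`, `collect`, `commit_id`, and `oldest_active_txn_id` are hypothetical names, and the rule used to decide when a slot is safe (its superseding commit is older than every active transaction) is an assumption that the slides do not spell out.

```python
from collections import deque


class RecyclingTable:
    """Toy table that recycles tuple slots through two lists (hypothetical sketch)."""

    def __init__(self):
        self.slots = []
        self.latest = {}                   # tuple key -> slot of newest version
        self.possibly_free_list = deque()  # (slot, commit id that superseded it)
        self.actually_free_list = deque()  # slots that are safe to reuse

    def _allocate(self, value):
        # Prefer a recycled slot; fall back to a brand-new one.
        if self.actually_free_list:
            slot = self.actually_free_list.popleft()
            self.slots[slot] = value
        else:
            slot = len(self.slots)
            self.slots.append(value)
        return slot

    def insert(self, key, value):
        self.latest[key] = self._allocate(value)

    def update(self, key, value, commit_id):
        old_slot = self.latest[key]
        self.latest[key] = self._allocate(value)
        # The old version may still be visible to running transactions.
        self.possibly_free_list.append((old_slot, commit_id))

    def collect(self, oldest_active_txn_id):
        """Move slots no active transaction can still see (assumed rule)."""
        moved = 0
        while (self.possibly_free_list
               and self.possibly_free_list[0][1] < oldest_active_txn_id):
            slot, _ = self.possibly_free_list.popleft()
            self.actually_free_list.append(slot)
            moved += 1
        return moved


table = RecyclingTable()
for key in (1, 2, 3, 4):
    table.insert(key, f"v0-{key}")
table.update(3, "v1-3", commit_id=10)         # old slot 2 becomes possibly free
moved = table.collect(oldest_active_txn_id=11)
table.update(1, "v1-1", commit_id=12)         # reuses recycled slot 2
print(moved, table.latest[1], len(table.slots))  # 1 2 5
```

With recycling, the second update lands in the slot freed by the first, so the table grows to five slots instead of six.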
METHODOLOGY
MEMORY FOOTPRINT
[Figure: the tile group with tuple slots 1-6 alongside the "Actually Free" and "Possibly Free" lists]
CONTRIBUTIONS
GC MODES
▸ Off (default)
▸ Vacuum
▸ Naïve co-operative
▸ Epoch-based co-operative
WORKLOAD
▸ YCSB - 80% updates, 20% reads, 10 terminals, 100000 tuples and 100 sec runtime
▸ YCSB - 50% updates, 50% inserts, 10 terminals, 100000 tuples and 100 sec runtime
CONTRIBUTIONS
VACUUM
▸ Separate thread started as Peloton bootstraps
▸ Periodically moves elements from possibly_free_list to actually_free_list
VACUUM TRADEOFFS
▸ Vacuum thread period vs GC efficiency
▸ # Recycled per vacuum invocation vs GC efficiency
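The two knobs named above (the vacuum thread's period and the number of slots recycled per invocation) can be sketched as a background thread. This is an illustrative sketch only: the class `Vacuum` and the parameters `period_sec` and `max_recycled` are hypothetical, the unit of the period is assumed to be seconds, and the check that a slot is really invisible to all transactions is left out for brevity.

```python
import threading
from collections import deque


class Vacuum:
    """Hypothetical vacuum: periodically moves slots between the two lists."""

    def __init__(self, period_sec, max_recycled):
        self.period_sec = period_sec      # sleep between invocations (assumed seconds)
        self.max_recycled = max_recycled  # upper bound on slots moved per invocation
        self.possibly_free_list = deque()
        self.actually_free_list = deque()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def vacuum_once(self):
        """One invocation: move at most max_recycled slots."""
        moved = 0
        with self._lock:
            while self.possibly_free_list and moved < self.max_recycled:
                self.actually_free_list.append(self.possibly_free_list.popleft())
                moved += 1
        return moved

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self.period_sec):
            self.vacuum_once()

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()


# Roughly the shape of a "V-5-1000" configuration, driven by hand here
# instead of by the background thread so the result is deterministic.
vacuum = Vacuum(period_sec=5, max_recycled=1000)
vacuum.possibly_free_list.extend(range(2500))
moved_per_pass = [vacuum.vacuum_once() for _ in range(3)]
print(moved_per_pass)  # [1000, 1000, 500]
```

In a running system `start()` would be called once at bootstrap and `stop()` at shutdown. A longer period or a smaller batch leaves more slots waiting in possibly_free_list between passes, which is the tradeoff listed above.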
CONTRIBUTIONS
VACUUM TRADEOFFS MEASURED
[Figure: throughput (rps) and memory usage (MB) for vacuum configurations V-<sleep>-<# recycled>: V-1-1000, V-2-1000, V-5-1000, V-5-2000, V-5-3000, V-5-5000]
CONTRIBUTIONS
NAÏVE CO-OPERATIVE (COOPERATIVE)
[Figure: animated diagram that ends with a "PERFORM GC" step]
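One way to read the "PERFORM GC" step is that transaction threads do the recycling themselves instead of leaving it to a background thread. The sketch below is an assumption about that design, not a description of the Peloton code: `CooperativeGC`, `perform_gc`, `run_transaction`, `max_recycled`, and `oldest_active_txn_id` are hypothetical names, and the visibility rule is the same simplified one as in a plain MVCC model.

```python
from collections import deque


class CooperativeGC:
    """Hypothetical co-operative GC: workers recycle a bounded number of slots."""

    def __init__(self, max_recycled):
        self.max_recycled = max_recycled   # cap on slots recycled per transaction
        self.possibly_free_list = deque()  # (slot, id of the txn that superseded it)
        self.actually_free_list = deque()

    def perform_gc(self, oldest_active_txn_id):
        moved = 0
        while (self.possibly_free_list
               and moved < self.max_recycled
               and self.possibly_free_list[0][1] < oldest_active_txn_id):
            slot, _ = self.possibly_free_list.popleft()
            self.actually_free_list.append(slot)
            moved += 1
        return moved


def run_transaction(gc, txn_id, superseded_slots, oldest_active_txn_id):
    """Hypothetical wrapper: do the transaction's work, then help with GC."""
    for slot in superseded_slots:
        gc.possibly_free_list.append((slot, txn_id))
    return gc.perform_gc(oldest_active_txn_id)


gc = CooperativeGC(max_recycled=2)
results = [
    run_transaction(gc, txn_id=1, superseded_slots=[0, 1, 2], oldest_active_txn_id=1),
    run_transaction(gc, txn_id=2, superseded_slots=[3], oldest_active_txn_id=2),
    run_transaction(gc, txn_id=3, superseded_slots=[], oldest_active_txn_id=3),
]
print(results)                      # [0, 2, 2]
print(list(gc.actually_free_list))  # [0, 1, 2, 3]
```

The cap plays the role of the `<# recycled>` parameter in the N-<# recycled> configurations: a small cap keeps each transaction's GC work short, while a large cap reclaims slots sooner.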
CONTRIBUTIONS
COOPERATIVE TRADEOFFS MEASURED
[Figure: throughput (rps) and memory usage (MB) for naïve configurations N-<# recycled>: N-1, N-10, N-25, N-50, N-100, N-1000, N-5000]
CONTRIBUTIONS
EPOCH CO-OPERATIVE (EPOCH)
[Figure: animated diagram of epochs (e); transactions join an epoch and later leave it, and the final frame adds a "PERFORM GC" step]
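The join/leave protocol in the figure resembles standard epoch-based reclamation, which can be sketched as below. Everything here is an assumption made for illustration: `EpochManager`, `retire`, `advance`, and `perform_gc` are hypothetical names, the rule "garbage from an epoch is reclaimable once no transaction remains in that epoch or any earlier one" is the textbook rule rather than something the slides state, and the `<# epochs>` parameter of the measured configurations is not modelled.

```python
from collections import defaultdict


class EpochManager:
    """Hypothetical epoch-based reclamation sketch (single-threaded model)."""

    def __init__(self):
        self.current_epoch = 0
        self.members = defaultdict(int)   # epoch -> transactions still inside it
        self.garbage = defaultdict(list)  # epoch -> slots retired during it
        self.actually_free_list = []

    def join(self):
        epoch = self.current_epoch
        self.members[epoch] += 1
        return epoch

    def leave(self, epoch):
        self.members[epoch] -= 1

    def retire(self, slot):
        # An old tuple version becomes garbage in the current epoch.
        self.garbage[self.current_epoch].append(slot)

    def advance(self):
        self.current_epoch += 1

    def perform_gc(self):
        # Oldest epoch that still has a member; with no members, every
        # epoch before the current one is safe.
        active = [e for e, n in self.members.items() if n > 0]
        safe_before = min(active) if active else self.current_epoch
        reclaimed = 0
        for epoch in sorted(e for e in self.garbage if e < safe_before):
            slots = self.garbage.pop(epoch)
            self.actually_free_list.extend(slots)
            reclaimed += len(slots)
        return reclaimed


em = EpochManager()
t1 = em.join()            # joins epoch 0
em.retire(3)              # slot 3 retired in epoch 0
em.advance()
t2 = em.join()            # joins epoch 1
em.retire(1)              # slot 1 retired in epoch 1
first = em.perform_gc()   # t1 is still in epoch 0, nothing is safe
em.leave(t1)
second = em.perform_gc()  # epoch 0 is now empty, slot 3 is reclaimed
em.leave(t2)
em.advance()
third = em.perform_gc()   # epoch 1 is empty too, slot 1 is reclaimed
print((first, second, third), em.actually_free_list)  # (0, 1, 1) [3, 1]
```

Compared with tracking individual transaction ids, grouping transactions into epochs makes the safety check a per-epoch counter lookup.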
CONTRIBUTIONS
EPOCH TRADEOFFS MEASURED
[Figure: throughput (rps) and memory usage (MB) for epoch configurations E-<# epochs>: E-1, E-5, E-10, E-INF]
CONTRIBUTIONS
OVERALL GC COMPARISON
[Figure: throughput (rps) and memory usage (MB) for the best configuration of each mode: OFF, VACUUM, NAÏVE, EPOCH]
CONTRIBUTIONS
CACHE MISSES MATTER AND THEY DON'T!
[Figure: cache misses (%) for the best configuration of each mode: OFF, VACUUM, NAÏVE, EPOCH]
PROBLEMS
ENTIRE TABLE TRUNCATIONS
▸ possibly_free_list may end up with a very large number of free slots
▸ Has to be handled as a special case
RECYCLING ACROSS TILE GROUPS
▸ Current recycling is at table granularity - i.e. across tile groups
▸ Different tile group schemas in the same table may become problematic
▸ Tradeoff: recycling granularity vs # tuples recycled
QUESTIONS?
Thank You…