kinetic modeling of data eviction in cache
TRANSCRIPT
Kinetic Modeling of Data Eviction in Cache
Xiameng Hu, Xiaolin Wang, Lan Zhou, Yingwei Luo, Peking University
Chen Ding, University of Rochester
Zhenlin Wang, Michigan Technological University
6/29/16 Usenix ATC'16 1
Background
• Miss Ratio Curve (MRC) is a powerful metric for cache optimization:
  • Allocation, partitioning, scheduling, QoS management…
• Online MRC profiling techniques have been developed for decades.
• Ultimate goals:
  • Less space consumption.
  • Lower time complexity.
[Figure: example miss ratio curve (MRC); vertical axis from 0% to 100%]
Background
• A brief history of MRC techniques.
Our Model: Average Eviction Time
• Linear time
• Constant space
• Composability
Eviction Time
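[Figure: LRU stack of a 4-block cache over time, with the residence time and eviction time of d marked]
time  data  stack (MRU → LRU)
1     a     a b c e
2     d     d a b c    1st access to d
3     a     a d b c
4     d     d a b c    2nd access to d
5     a     a d b c
6     c     c a d b
7     b     b c a d
8     e     e b c a    eviction of d
9     d     d e b c    3rd access to d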
Eviction Time
• The eviction time is the time between the last access and the eviction.
• Property of eviction time:
  • If the reuse time of an access is larger than its eviction time, it's a miss.
  • Reuse time: the time between an access and its next reuse. The reuse time of a cold miss is defined as infinite.
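The property above can be checked by replaying a short trace on a simulated LRU cache. The following is a minimal sketch, not the authors' tool: the function name `simulate_lru`, the pre-warmed contents `"ecba"`, and the trace are assumptions chosen to mirror the 4-block example on the slides. Pre-warmed blocks have no recorded history, so only the accesses to `d` are meaningful here.

```python
from collections import OrderedDict

def simulate_lru(trace, capacity, initial=()):
    """Replay `trace` on a fully associative LRU cache.

    Returns (events, evictions):
      events    -> list of (time, block, reuse_time, hit)
      evictions -> list of (block, eviction_time), where eviction_time is
                   the time between the block's last access and its eviction.
    """
    stack = OrderedDict()              # keys ordered from LRU to MRU
    for block in initial:              # hypothetical pre-warmed contents
        stack[block] = None
    last_access = {}
    events, evictions = [], []
    for time, block in enumerate(trace, start=1):
        # Cold misses get an infinite reuse time, as defined on the slide.
        reuse = time - last_access[block] if block in last_access else float("inf")
        hit = block in stack
        if hit:
            stack.move_to_end(block)   # move to the MRU position
        else:
            if len(stack) >= capacity:
                victim, _ = stack.popitem(last=False)   # evict the LRU block
                if victim in last_access:
                    evictions.append((victim, time - last_access[victim]))
            stack[block] = None
        last_access[block] = time
        events.append((time, block, reuse, hit))
    return events, evictions

events, evictions = simulate_lru("adadacbed", capacity=4, initial="ecba")
d_events = [e for e in events if e[1] == "d"]
```

In this run `d` is evicted with eviction time 4; its access with reuse time 2 hits, and its accesses with reuse time infinity and 5 miss.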
Back to the example
[Figure: the LRU stack example again (stacks shown MRU → LRU), annotated for d. Eviction time = 4]
time 2, access d: d a b c    Reuse time = ∞    Cold Miss!
time 4, access d: d a b c    Reuse time = 2    Hit!
time 8, access e: e b c a    d is evicted
time 9, access d: d e b c    Reuse time = 5    Miss!
Average Eviction Time
• Average Eviction Time (AET) is the mean eviction time of all data evictions in a fully associative LRU cache.
• We can assume all data references with a reuse time larger than AET are misses.
How to model AET?
• Move condition #1:
  • A cache hit on data at a lower-priority position moves that data to the top of the LRU stack.
• Move condition #2:
  • A cache miss inserts the missed data at the top of the LRU stack.
[Figure: d moves one position toward the LRU end (stacks shown MRU → LRU)]
access: a    d a b c → a d b c
access: e    d a b c → e d a b
How to model AET?
• Stay condition:
  • A cache hit on data at a higher-priority position moves that data to the top of the LRU stack.
[Figure: d stays in place (stacks shown MRU → LRU)]
access: b    a b d c → b a d c
How to model AET?
• We define the arrival time 𝑇_𝑚 as the expected time it takes for data being evicted to reach the 𝑚-th position (from its last access).
• A data block at position 𝑚 moves one step down whenever the reuse time of the current access is greater than 𝑇_𝑚.
• 𝑃(𝑡) is the probability that an access has a reuse time greater than 𝑡.
• The movement condition is now a probability. On every access, a data block at stack position 𝑚 moves down by 𝑃(𝑇_𝑚).
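Since everything that follows depends on 𝑃(𝑡), it may help to see how it can be tabulated from a trace. This is a minimal sketch under assumptions: the function names are hypothetical, time is counted in accesses, and cold misses are stored under the key `None` to stand for an infinite reuse time.

```python
from collections import Counter

def reuse_time_histogram(trace):
    """Count reuse times; first accesses (cold misses) are counted under None."""
    last_access = {}
    hist = Counter()
    for time, block in enumerate(trace):
        if block in last_access:
            hist[time - last_access[block]] += 1
        else:
            hist[None] += 1            # infinite reuse time
        last_access[block] = time
    return hist

def tail_probability(hist, max_t):
    """Return [P(0), ..., P(max_t)], where P(t) is the fraction of
    accesses whose reuse time is greater than t."""
    total = sum(hist.values())
    greater = total                    # every reuse time is >= 1, so P(0) = 1
    p = []
    for t in range(max_t + 1):
        greater -= hist.get(t, 0)      # accesses with reuse time exactly t
        p.append(greater / total)
    return p

hist = reuse_time_histogram("adadacbed")
p = tail_probability(hist, 5)
```

For this nine-access trace, three accesses have reuse time 2, one has reuse time 5, and five are cold misses, so `p[2]` is 6/9 and `p[5]` is 5/9.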
Kinetic Model
• Data travels in one direction with changing speed:
  • In general, if the time that the data being evicted has already traveled is 𝑡, its current eviction speed is 𝑃(𝑡).
[Figure: data d traveling from the top to the bottom of the LRU stack]
𝑉(𝑡) = 𝑃(𝑡)
Average Eviction Time
• Physics: the integral of speed over time is the travel distance.
• The travel distance of every eviction is the length of the LRU list, which is the cache size 𝑐:
$\int_{0}^{AET(c)} P(t)\,dt = c$
• With 𝑃, we calculate the AETs of different cache sizes in linear time.
• 𝑃 can be acquired online by monitoring the reuse time histogram.
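With a reuse time histogram, the integral can be approximated by a running sum. The discretization below is one plausible reading rather than the authors' exact formulation, and the symbols rt(i) and n are assumed notation that does not appear on the slides.

```latex
% Assumed notation: rt(i) = number of accesses with reuse time i,
% n = total number of accesses (cold misses have infinite reuse time).
P(t) = 1 - \frac{1}{n}\sum_{i=1}^{t} rt(i)

% Discrete form of \int_0^{AET(c)} P(t)\,dt = c:
AET(c) \approx \min\Bigl\{\, T : \sum_{t=0}^{T-1} P(t) \ge c \,\Bigr\}

% Worked example with P(0)=1,\ P(1)=1,\ P(2)=0.5,\ P(3)=0.5:
% running sums 1,\ 2,\ 2.5,\ 3 \;\Rightarrow\; AET(1)=1,\ AET(2)=2,\ AET(3)=4
```

Because 𝑃(𝑡) is never negative, the running sum never decreases, so a single pass over 𝑡 yields AET(𝑐) for every cache size, which is where the linear time bound comes from.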
From AET to MRC
• The miss ratio 𝑚𝑟(𝑐) at cache size 𝑐 is the probability that a reuse time is greater than the average eviction time 𝐴𝐸𝑇(𝑐):
𝑚𝑟(𝑐) = 𝑃(𝐴𝐸𝑇(𝑐))
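The two formulas combine into a single pass over 𝑃. The sketch below is illustrative only: `aet_and_mrc` is a hypothetical name, `p[t]` is assumed to hold 𝑃(𝑡) for integer 𝑡 counted in accesses, the integral is approximated by a running sum, and the example values of `p` are made up.

```python
def aet_and_mrc(p, cache_sizes):
    """Given p[t] = P(t), return {c: (AET(c), mr(c))}.

    AET(c) is taken as the smallest T with p[0] + ... + p[T-1] >= c.
    Cache sizes whose AET lies beyond len(p) are left out of the result.
    """
    sizes = sorted(cache_sizes)
    result = {}
    idx = 0
    travelled = 0.0                    # distance travelled down the LRU stack
    for t, speed in enumerate(p):
        travelled += speed             # distance after t + 1 time units
        while idx < len(sizes) and travelled >= sizes[idx]:
            aet = t + 1
            # mr(c) = P(AET(c)); fall back to the last known value of P
            mr = p[aet] if aet < len(p) else p[-1]
            result[sizes[idx]] = (aet, mr)
            idx += 1
    return result

example_p = [1.0, 1.0, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25]   # made-up P(t)
curve = aet_and_mrc(example_p, [1, 2, 3, 4])
```

Each entry of `p` is visited once regardless of how many cache sizes are requested, so the whole curve costs time linear in the length of `p`.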
AET Design Overview
[Figure: Program → Monitoring → Access Trace → Reuse Time Histogram → AET → Miss Ratio Curve]
Random Sampling
• Randomly pick the currently accessed data to monitor its reuse time.
• The distance between two samples is a random value.
• Constrain the range of the random value to control the sampling rate.
• A hash table is required. It maintains the currently monitored data.
• The space consumption is linear but limited.
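The bullets above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the function name, the `mean_gap` parameter, and the choice of drawing the gap uniformly from 1 to 2 × `mean_gap` − 1 (so that roughly one access in `mean_gap` is sampled) are all hypothetical.

```python
import random
from collections import Counter

def sample_reuse_times(trace, mean_gap, seed=0):
    """Monitor the reuse times of randomly sampled accesses.

    Returns (hist, monitored): the sampled reuse time histogram and the
    hash table of blocks still waiting for their reuse.
    """
    rng = random.Random(seed)
    monitored = {}                     # hash table: block -> time it was sampled
    hist = Counter()
    next_sample = rng.randint(1, 2 * mean_gap - 1)
    for time, block in enumerate(trace, start=1):
        if block in monitored:
            # The monitored block is reused: record its reuse time and stop monitoring.
            hist[time - monitored.pop(block)] += 1
        if time == next_sample:
            monitored[block] = time    # start monitoring the current access
            next_sample = time + rng.randint(1, 2 * mean_gap - 1)
    return hist, monitored

# With mean_gap=1 every access is sampled, so the full histogram is recovered.
full_hist, pending = sample_reuse_times("adadacbed", mean_gap=1)
sparse_hist, _ = sample_reuse_times(list("abcd") * 1000, mean_gap=50, seed=1)
```

The hash table only grows with the number of sampled blocks that have not been reused yet, which is why the space is linear in the sample count but limited by the sampling rate.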
Reservoir Sampling
• To bound the space cost to a constant, 𝑂(1).
• When the 𝑖-th sampled data arrives, reservoir sampling keeps the new data in the monitoring set with probability min(1, 𝑘/𝑖) and randomly discards an old data item when the set is full.
• It ensures an equal probability for every sampled reuse to update the reuse time histogram, while the number of samples recorded is bounded.
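The keep-with-probability-min(1, 𝑘/𝑖) rule is the classic reservoir sampling scheme. Below is a generic sketch of that rule over an arbitrary stream; how AET's monitoring set interacts with the reuse time histogram (for example, when entries leave the set after their reuse is observed) is not covered, and the function name is hypothetical.

```python
import random

def reservoir_sample(stream, k, seed=0):
    """Keep at most k items; the i-th item is kept with probability min(1, k/i)."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream, start=1):
        if len(reservoir) < k:
            reservoir.append(item)               # set not full yet: always keep
        elif rng.random() < k / i:
            reservoir[rng.randrange(k)] = item   # replace a randomly chosen old item
    return reservoir

monitoring_set = reservoir_sample(range(10000), k=8)
```

The set never holds more than `k` entries no matter how long the stream is, which is what bounds the space to a constant.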
AET in Shared Cache
• Composability: co-run behavior can be computed from the metrics of solo runs.
• When 𝑛 programs share a cache of size 𝑐, all 𝑛 + 1 co-run AETs, 𝐴𝐸𝑇_𝑖(𝑐) for each program 𝑖 and 𝐴𝐸𝑇(𝑐) for the group, are the same:
𝐴𝐸𝑇_1(𝑐) = … = 𝐴𝐸𝑇_𝑛(𝑐) = 𝐴𝐸𝑇(𝑐)
• Detailed modeling is described in the paper.
Evaluation
• AET vs Counter Stacks (OSDI'14)
• AET vs SHARDS (FAST'15)
• Shared Cache AET
AET vs Counter Stacks
• Counter Stacks:
  • Requires only an extremely small space while maintaining acceptable accuracy.
  • Uses HyperLogLog counters to track reuse distance.
  • Balances accuracy and space by limiting the number of counters.
• Benchmarks:
  • Microsoft Research Cambridge (MSR) storage traces.
  • Configured with only read requests of 4KB cache blocks.
AET vs Counter Stacks
Method                                               Mean Absolute Error   Average Space Cost   Average Throughput
AET Random Sampling (1 × 10^-5)                      0.96%                 452KB                63.99M reqs/sec
AET Reservoir Sampling, 8k entries                   1.12%                 384KB                61.99M reqs/sec
Counter Stacks High fidelity (d=1M, s=60, δ=0.02)    0.77%                 7363KB               1.73M reqs/sec
Counter Stacks Low fidelity (d=1M, s=3600, δ=0.1)    1.26%                 1292KB               5.86M reqs/sec
AET vs SHARDS
• SHARDS:
  • Hash-based spatial sampling.
  • A splay tree to track the reuse distances of the sampled data.
  • Limits the space overhead to a constant by adaptively lowering the sampling rate.
• Benchmarks:
  • "master" MSR, a 2.4 billion-access trace combining all 13 MSR traces by ranking the timestamps of all accesses.
AET vs SHARDS
Method                               Mean Absolute Error   Average Space Cost   Average Throughput
AET Random Sampling (1 × 10^-5)      1%                    1.7MB                79M reqs/sec
AET Reservoir Sampling, 8k samples   1%                    1.4MB                66.6M reqs/sec
SHARDS, 8k samples                   0.6%                  2.3MB                81.4M reqs/sec
Counter Stacks                       0.3%                  80MB                 3.2M reqs/sec
Shared Cache AET
• We choose four MSR storage traces {prn, src2, web, stg} as a co-run group.
• Generate a combined trace from the four traces under the equal-speed assumption.
• We compare the MRC composed from the individual AET modeling of each trace with the real MRC of the combined trace.
Summary
• A new model to characterize cache behavior.
• Enables fast MRC profiling with O(1) space and O(n) time.
• Predicts shared cache MRC without co-run testing.
• Perfect for online deployment with limited overhead.
AET vs StatStack
• StatStack:
  • Designed for CPU workloads.
  • It samples cache blocks and measures their reuse time using performance counters and watchpoints.
  • Reuse time histogram -> Reuse distance histogram.
• Benchmarks:
  • SPEC CPU2006, 30 benchmarks.
  • For each benchmark, we intercept 1 billion references from its execution using the instrumentation tool Pin.
• We measure the cumulative distribution function (CDF) of the absolute error of full-trace StatStack, full-trace AET, and sampling AET.