Meta algorithms for Hierarchical Web Caches
Nikolaos Laoutaris, Sofia Syntila
Ioannis Stavrakakis
Department of Informatics and Telecommunications, University of Athens
15784 Athens, Greece
{laoutaris,grad0585,ioannis}@di.uoa.gr
Introduction
The rapid growth of the Internet and the WWW has increased network traffic, user-perceived latency, and the load on web servers.
Caching has been employed in order to reduce access latency, reduce bandwidth consumption, balance server load, and improve data availability.
Contemporary hierarchical caches
A defining characteristic of contemporary hierarchical caches is Leave Copy Everywhere (LCE): a hit for a document at a level-l cache leads to the caching of the document in all intermediate caches on the path towards the leaf cache that received the initial request.
[Figure: a 3-level binary cache tree rooted at cache (3,1), with leaf caches (1,1)-(1,4); the client request misses at (1,1) and (2,1), hits at (3,1), and LCE leaves a copy at (2,1) and at (1,1) on the way back down.]
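The LCE behavior can be sketched in a few lines of Python. This is a hypothetical toy model (not the simulator behind the paper's results): caches on the path from the leaf to the root are searched in order, and a copy is inserted into every cache below the hit point. `LRUCache` and `lce_request` are illustrative names.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity LRU cache (the replacement policy used on the slides)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def __contains__(self, doc):
        return doc in self.store

    def touch(self, doc):
        self.store.move_to_end(doc)  # mark as most recently used

    def insert(self, doc):
        if doc in self.store:
            self.store.move_to_end(doc)
        else:
            if len(self.store) >= self.capacity:
                self.store.popitem(last=False)  # evict the LRU document
            self.store[doc] = True

def lce_request(path, doc):
    """LCE: search the caches on the path (leaf first, root last) and leave
    a copy in every cache below the hit point. Returns the hit level
    (0 = leaf), or len(path) if the document came from the origin server."""
    for level, cache in enumerate(path):
        if doc in cache:
            cache.touch(doc)
            hit_level = level
            break
    else:
        hit_level = len(path)  # served by the origin server
    for cache in path[:hit_level]:
        cache.insert(doc)  # LCE: copy into every cache below the hit
    return hit_level
```

After one request, all caches on the path hold the document, so the next request for it hits at the leaf.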
New approach
We introduce three new meta algorithms that revise the standard behavior of hierarchical caches by:
operating before, and independently of, the actual replacement algorithm running in each individual cache (hence the "meta")
keeping copies in a subset of the intermediate caches instead of all of them
We compare these algorithms against the de facto one (LCE) and the one proposed by Che, Tung and Wang (JSAC, Sep. 2002).
Additionally, we introduce a simple load balancing algorithm based on the concept of meta algorithms.
Advantages of the new algorithms
Significant reduction of the average hit distance (delay/traffic reduction gain) over LCE in most cases
Suitable for storage-constrained applications
Low complexity
Memoryless: they do not require additional information (e.g., object request frequencies)
Little or no change to the protocols used to implement existing hierarchical caches
The Prob algorithm
Each intermediate cache keeps a copy with probability p, and does not keep a copy with probability 1-p
[Figure: the same tree; the request misses at (1,1) and (2,1) and hits at (3,1); Prob leaves a copy at (2,1) and at (1,1), each independently with probability p.]
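The Prob placement step could be sketched as follows. Caches are modelled as plain sets, since the meta algorithm operates independently of the replacement policy; `prob_place` and the injectable `rng` parameter are our own illustrative choices.

```python
import random

def prob_place(path_caches, doc, hit_level, p, rng=random.random):
    """Prob placement: after a hit at `hit_level` (leaf = 0), each cache
    below the hit independently keeps a copy with probability p and skips
    it with probability 1 - p. Caches are plain sets; replacement is
    handled separately by each cache's own policy."""
    for cache in path_caches[:hit_level]:
        if rng() < p:
            cache.add(doc)
```

Passing a deterministic `rng` makes the behavior testable: always-low values copy everywhere (like LCE), always-high values copy nowhere.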
The LCD algorithm
Leave a copy only at the cache that resides immediately below the location of the hit on the path to the requesting client.
Requires multiple requests to bring document to a leaf cache
[Figure: the same tree; the request misses at (1,1) and (2,1) and hits at (3,1); LCD leaves a single copy, at (2,1) only.]
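The LCD placement step is even simpler to sketch (same set-based toy model; `lcd_place` is an illustrative name): a single copy goes to the cache immediately below the hit, so a document descends one level per request.

```python
def lcd_place(path_caches, doc, hit_level):
    """LCD placement: leave one copy in the cache immediately below the
    hit point (leaf = index 0; hit_level == len(path_caches) means the
    document was served by the origin server)."""
    if hit_level > 0:
        path_caches[hit_level - 1].add(doc)
```

Repeated requests illustrate the slide's point that multiple requests are needed to bring a document down to a leaf cache.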
The MCD algorithm
Similar to LCD, with the difference that a hit at level l moves the requested document to the underlying cache (whereas LCD copies it): the requested document is deleted from the cache where the hit occurred.
[Figure: the same tree; the request misses at (1,1) and (2,1) and hits† at (3,1); MCD copies the document to (2,1) and deletes it from (3,1).]
† The document does not have to be physically deleted but rather be marked for eviction
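In the same set-based sketch, MCD differs from LCD only in also removing the document from the cache where the hit occurred (`mcd_place` is an illustrative name; in a real cache the copy would just be marked for eviction, as the footnote notes).

```python
def mcd_place(path_caches, doc, hit_level):
    """MCD placement: move (not copy) the document one level down.
    The copy at the hit cache is removed, mimicking 'mark for eviction'."""
    if hit_level > 0:
        path_caches[hit_level - 1].add(doc)       # place one level below
    if hit_level < len(path_caches):              # hit inside the hierarchy
        path_caches[hit_level].discard(doc)       # delete at the hit cache
```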
The Filter algorithm (Che et al.)
Each cache is seen as a low-pass filter, with a cutoff frequency given by the inverse of its characteristic time.
The characteristic time T_m of cache m is approximated by:
T_m ≈ (current time) − (last access time of the replaced document)
A hit for document i at level l on behalf of client k leads to the caching of i in an intermediate cache m on the path to k when m satisfies the condition:
λ_ki ≥ 1/T_m
where λ_ki is the frequency with which client k requests document i.
Filter is non-memoryless (it requires frequency estimation).
Rarely requested objects are not cached; their requests thus pass the filter and flow to the upper levels.
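The two quantities above can be sketched directly (our reading of the slide's formulas; `characteristic_time` and `filter_admits` are illustrative names):

```python
def characteristic_time(now, last_access_of_evicted):
    """T_m ~ current time minus the last access time of the document
    that was just replaced at cache m."""
    return now - last_access_of_evicted

def filter_admits(lambda_ki, t_m):
    """Che-style filter: admit document i at cache m only if client k's
    request rate for i exceeds the cutoff frequency 1/T_m."""
    return lambda_ki >= 1.0 / t_m
```

A document requested at 0.5 req/s passes a cache with T_m = 10 s (cutoff 0.1 req/s); one requested at 0.05 req/s does not, so its requests flow upwards.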
The Filter algorithm (cont.)
When a document is evicted from a cache at level l the algorithm forces its caching at level l+1 (upwards) if not already cached there (this may lead to a domino effect)
[Figure: the same tree, with caches (1,1), (2,1), (3,1) assumed full; the request misses at (1,1) and (2,1) and hits at (3,1); a copy is left at cache m1 = (2,1), where λ_1i ≥ 1/T_m1, but not at m2 = (1,1), where λ_1i < 1/T_m2.]
Design Principles
Prob, LCD, and MCD take advantage of the following three design principles:
1. Avoid the amplification of replacement errors
2. Filter-out one-timer documents
3. Rationalize the degree of replication
1.Avoid the amplification of replacement errors
Replacement error: document i is evicted while there exists a document j whose eviction would instead lead to an improved hit ratio.
LCE: in an L-level hierarchical cache, a request for an unpopular document leads to its caching in all L caches, i.e., up to L replacement errors: an amplification of replacement errors.
Prob, LCD, and MCD reduce the extent of the amplification by reducing the number of copies triggered by a single request.
2.Filter-out one-timer documents
Measured proxy workloads contain a high percentage of so-called one-timer documents.
One-timers: documents that are requested only once. Caching a one-timer leads to the worst type of replacement error that can occur.
LCE: deprives popular documents of valuable storage capacity by allowing one-timers to clog all caches.
LCD, MCD: one-timers cannot affect any cache other than the root cache.
Prob: filters out one-timers by using a small caching probability p.
3.Rationalize the degree of replication
LCE places copies in all intermediate caches to achieve two goals:
have a nearby copy to service other clients connected to leaf caches
have a "backup" copy for the requesting client in case its leaf copy is evicted
Storing a large number of replicas is not always beneficial:
when the demand pattern is non-homogeneous
when storage capacity is limited
Prob, LCD, and MCD create fewer copies, allowing more distinct documents to be cached.
This improves the exclusivity† of caches (Wong, Wilkes, Usenix 2002). Exclusivity relates to the ability to avoid the ineffective caching of the same documents at multiple levels.
† We would like to thank an anonymous IPCCC reviewer for bringing Wong and Wilkes's work to our attention.
Synthetic Simulations
Zipf-like document popularity distribution (a = 0.9)
Simulated hierarchical cache: regular Q-ary tree with L levels (Q = 2, L = 3)
Documents originate from an origin server (level L+1)
Each client is co-located with a leaf cache; a client represents the population of an organization
Replacement policy at each cache: LRU
Storage capacity equally allocated to the caches
Further improvements are possible if the dimensioning of the caches is optimized (Laoutaris et al., Information Processing Letters, March 2004).
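The synthetic workload described above can be generated with a short sketch, assuming the usual Zipf-like form P(rank r) ∝ 1/r^a with a = 0.9 as on the slides (function names are our own):

```python
import random

def zipf_weights(n_docs, a=0.9):
    """Zipf-like popularity: probability of rank r proportional to 1/r^a."""
    weights = [1.0 / (r ** a) for r in range(1, n_docs + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def sample_requests(n_docs, n_reqs, a=0.9, seed=0):
    """Draw a request stream of document ranks (0 = most popular)."""
    rng = random.Random(seed)
    probs = zipf_weights(n_docs, a)
    return rng.choices(range(n_docs), weights=probs, k=n_reqs)
```

Feeding such a stream through the placement sketches above lets one reproduce the qualitative comparisons, though not the paper's exact numbers.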
Average hit distance for Prob
Prob: [+] a small p filters out one-timers more effectively; [-] the cost paid is slower convergence to steady state.
Average hit distance for LCE,Prob,LCD,MCD …
… Average hit distance for LCE,Prob,LCD,MCD
The following may be noted:
LCE has the worst performance
Prob(0.2) ranks second across all S
LCD's and MCD's performance is always better than that of LCE and Prob
Filter, although non-memoryless, is outperformed by LCD and closely matched by MCD
Non-stationary demand …
Non-stationary document sets are common in the web. Simulation scenario: every W requests, M documents out of the total N that can be requested are replaced by M new ones.
This models volatility in user access patterns.
… Non-stationary demand
Hit distance increases with the volatility (captured here by M).
LCE: for small M it is the worst performer; for large M it outperforms all other algorithms.
Why?
LCE is able to track the new demand more quickly by requiring a single request to bring a new document to the leaf level
Prob,LCD,MCD,Filter require multiple requests to bring a copy of a new document to the leaf cache
However, the required volatility to make LCE better than the new algorithms is too high and is not typical of measured workloads which appear quite stable (Chen et al., JSAC, Aug. 2003)
Trace-driven Simulations
Description of traces: traces were filtered to keep only requests for cacheable documents. Two types of caches were studied:
Leaf caches (duration: one week): UoA, NTUA
Root caches of the NLANR hierarchy (duration: one day): Boulder, Colorado; Palo Alto, California; Pittsburgh, Pennsylvania; Urbana-Champaign; San Diego, California; Silicon Valley, California
Trace                        Requests   Docs     1-timers
Urbana-Champaign             815194     279375   72%
Silicon Valley, California   1299024    726075   82%
Boulder, Colorado            698691     365060   81%
Pittsburgh, Pennsylvania     709180     405680   84%
San Diego, California        193769     94457    83%
Palo Alto                    273511     137497   76%
UoA                          282540     41088    71%
NTUA                         580460     234432   73%
Results
Filter is inferior to the best performing algorithm, LCD, across all traces; Filter is also more complicated than LCD.
Average hit distance (AHD): AHD_Prob > AHD_MCD > AHD_LCD
LCE compared to LCD: inferior under all six NLANR traces; almost as good under the UoA trace; slightly better under the NTUA trace.
LCE performs better when S/N is large (S: storage, N: # of docs).
Load Balancing…
LCE gives rise to the “filtering effect” (Williamson, ACM ToIT, Feb. 2002)
The "filtering effect": popular documents gather at the leaf caches. It leads to:
poor hit ratios at the upper levels
the servicing of most requests at the lower-level caches (causing load imbalance)
The solution? A simple load balancing mechanism: threshold-based and fully distributed. Each cache calculates its load and accepts new copies of documents only when its load is below the threshold.
Some popular documents are thus denied admission to the leaf level and reside only at upper levels; this allows load to flow upwards.
…Load Balancing
Load: we count as load only the requests that lead to hits (we neglect the relatively smaller load due to misses).
A nearby hit means small propagation delay, but this does not always lead to small total delivery delay. When? When the low-level cache is overloaded (processing then takes too long).
With the proposed load balancing mechanism we sacrifice an increase in propagation delay to gain in terms of end-system processing delay.
Simulations…
Load balancing may be applied to all discussed meta algorithms
Our experiments evaluate the effectiveness of the LB mechanism:
using trace data
under the LCE algorithm
LCE-LB: a variation of LCE that keeps copies at all intermediate caches, provided that a cache has not reached its load threshold TH.
TH = (Σ_j λ_j) / (n · k)
n: # of caches (λ_j: the load of cache j)
k: controls the intensity of the desired load balancing
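Assuming the threshold is the per-cache loads summed over all n caches and divided by n·k (our reading of the slide's formula; function names are our own), the admission test might look like:

```python
def load_threshold(cache_loads, k):
    """TH = (sum of per-cache hit loads) / (n * k), n = number of caches.
    k = 1 gives a loose bound near the mean load; larger k tightens it."""
    n = len(cache_loads)
    return sum(cache_loads) / (n * k)

def lce_lb_admits(current_load, cache_loads, k):
    """LCE-LB: a cache accepts a new copy only while below the threshold."""
    return current_load < load_threshold(cache_loads, k)
```

With loads [10, 20, 30, 40] and k = 1, TH = 25: the two lightly loaded caches still admit copies, the two heavily loaded ones stop, pushing load upwards.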
(1) LCE without LB
(2) LCE with LB (k=1)
No change relative to the no-LB case (previous slide); LB becomes effective after k = 2.
(3) LCE with LB (k=8)
The effect of LB becomes clear for k = 8. With k = 16 all levels get almost the same amount of load (see the paper for more results under several values of k).
Summary of LB related results
The previous figures show that:
As k increases, load tends to be more evenly distributed among levels.
The distribution of load under LCE-LB with k = 1 is almost identical to the one under LCE. Why? The load constraint under k = 1 is too loose: it is almost equal to the maximum load that is assigned under LCE.
Load balancing becomes effective for k > 2; almost perfect LB for high values of k.
The cost paid for having LB
The average hit distance (propagation delay) increases with the intensity of LB (i.e., with k).
Conclusions
We introduced three new meta algorithms and compared them against:
the de facto one (LCE)
the one proposed by Che, Tung and Wang
We showed that these algorithms are useful in a variety of situations
LCD (the best one) seems to perform well under all studied scenarios.
We introduced a simple load balancing algorithm, based on the concept of meta algorithms that deals effectively with the “filtering effect”
Post IPCCC work†
We have derived an approximate mathematical model for predicting the performance of LCD analytically.
The model predicts accurately the actual performance and gives further insights as to why LCD outperforms LCE
We have shown that LCD performs better than the DEMOTE algorithm of Wong and Wilkes (not discussed in this paper).
† Nikolaos Laoutaris, Hao Che, Ioannis Stavrakakis,
"The LCD interconnection of LRU caches and its analysis,"
submitted work, 2004.