the - usenix.org · http complian t cac he asso ciates with eac hcac hed ob ject an expiration time...

The Age Penalty

and its E�ect on Cache Performance

Edith Cohen Haim Kaplan

AT&T Labs{Research School of Computer Science

180 Park Avenue, Florham Park Tel-Aviv University

NJ 07932, USA Tel-Aviv 69978, Israel

[email protected] [email protected]

Abstract

Web content caching is recognized as ane�ective mechanism to decrease server load,network tra�c, and user-perceived latency.An HTTP compliant cache associates witheach cached object an expiration time calcu-lated according to directives set by the ob-ject's origin server. The cache incurs a miss

when it has no cached copy of a requestedobject or when the existing copy had expired(is not fresh). Upon a miss, the cache needsto fetch or validate a copy through exchangeswith another cache with a fresh copy or theorigin server. Thus, misses generate tra�cand prolong service times.

Caches are deployed as proxies, reverseproxies, and hierarchically and as a result,caches often serve other caches. As thishappens, content age at higher-level caches,in addition to availability and freshness,emerges as a performance factor. The age

of a cached copy of an object is the elapsedtime since fetched from the respective ori-gin. Fresh cached copies of the same ob-ject can have di�erent ages and older copiestypically expire sooner. Therefore, a proxycache would su�er a higher miss rate if itreceives older objects (e.g., from a reverse-proxy cache). Similarly, reverse-proxy cachesthat serve proxy-caches receive more requeststhan an origin server would have received.We refer to the increase in miss rate due toage as the age penalty. We use trace-basedsimulations to measure the extent of the agepenalty for content served by content deliv-ery networks and large caches. Even though

the age penalty had not been considered pre-viously, we demonstrate that it can be signif-icant, and moreover, can highly vary underdi�erent practices.

1 Introduction

Caching and replication of Web contentsare widely deployed mechanisms for reducingWeb servers load, network load, and user-perceived latency. The cache e�ectivenessdepends on the lifetime durations of cachedcopies. Web caching is governed by HTTP,which associates with each cached copy ofan object a freshness expiration time afterwhich it is considered stale. When a requestarrives and there is no cached copy (con-tent miss), the cache attempts to obtain afresh copy through another cache or the ori-gin server. When a request arrives and thereis a stale cached copy of the object, the cachemust attempt to validate it (ask for valida-tion or a modi�ed copy) before serving it tothe user Validations are performed througha conditional (If-Modified-Since or E-tagbased) GET request and involve communica-tion with the origin server or another cache.A large fraction of validation requests return\unmodi�ed" (we term such requests fresh-

ness misses), but very often the latency theyincur is comparable to that of a full- edgedcontent miss. Thus, both content and fresh-ness misses decrease the cache e�ectivenessand are important to avoid.

Caches determine an expiration time for a1

cached copy by computing its freshness life-

time and its (approximate) age. A copy be-comes stale when its age exceeds its freshnesslifetime. The freshness lifetime is essentiallydetermined by the origin server as it is com-puted using directives and values found in theobject's response headers. If explicit direc-tives are not provided, a heuristic is used.The copy's age is the elapsed time since itwas sent by its origin server. When a copy isobtained from another cache it has a positiveage and thus, typically a shorter time-to-live(TTL) than would have been if fetched di-rectly from the origin server.

Caches are being deployed throughout thenetwork. Proxy caches are placed in ISPsnetworks close to clients and reverse proxycaches are placed near network exit pointsand cache objects for hosted Web sites. Con-tent delivery services (CDNs) such as Aka-mai and Sandpiper place caching servers inmultiple locations [1, 10]. Therefore, it is be-coming increasingly more common that (ex-plicitly or transparently) content is served toa cache (such as a proxy cache) from anothercache (e.g., a reverse proxy). As this happens,another performance factor emerges, namely,the age of copies at higher-level caches. If anolder copy of an object is received by the low-level cache, the cached copy expires soonerand thus is less likely to remain fresh till asubsequent request arrives for the same ob-ject. Therefore, low-level caches that receiveolder copies incur more misses. We refer tothe relative di�erence in the number of fresh-ness misses incurred by a cache when it for-wards requests to a non-authoritative (high-level cache) vs. authoritative (origin server)source as the age penalty .

Cache performance issues related to con-tent age had been by and large overlooked.Our main contribution is introducing the agepenalty issue and assessing its magnitude.Aging issues take a di�erent spin for con-tent served by CDNs. CDNs such as Aka-mai act as reverse proxies, but have tighterrelation with their clients (Web sites). Inparticular, they deploy a consistency mech-anism other than HTTP/1.1 between themand their clients. Downstream client caches,however, still use the HTTP response head-ers to determine the object lifetime. The in-

teraction of two coherence mechanism givesrise to di�erent practices that greatly a�ectobject age, and as a result, the e�ectivenessof client caches, user-perceived latency, andtra�c. We discuss observed CDN practicesand measure the associated age penalty.

In Section 2 we overview HTTP freshnesscontrol and its usage, and introduce aging is-sues. Section 3 discusses our traces and ex-perimental methodology. Section 4 containssimulation results measuring the age penaltyfor objects served from a high-level cache.Section 5 is concerned with the age penaltyfor objects served by content delivery net-works. We conclude in Section 6.

2 Freshness control

Cache freshness control mechanisms arediscussed in the speci�cation of HTTP/1.1and its predecessors. Caches should deter-mine the freshness period conservatively, us-ing parameters in the response headers ofthe object. If the header does not includespeci�c directives, then it is suggested thatcaches apply a heuristic, by �rst matching theURL against some regular expression (e.g.,according to su�x which usually indicatesthe content type), and then determining thefreshness lifetime to be some fraction of theelapsed time between the object's date andits last modi�cation time. We provide a quickoverview of freshness control at caches. Forfurther details see [5, 2, 11, 8, 7].

The following response headers are used bySquid [11] for freshness control:

� Date. A time stamp indicating when anobject is sent by the origin server. Allorigin servers that have clocks must pro-vide a Date header. An object receivedwithout a Date header must be assignedone by the recipient if the object will becached. The Date header of an object is up-dated after a 304 (not modi�ed) response toan If-Modified-Since GET request. Thisheader is an end-to-end header not supposedto be changed by intermediate caches.

� Pragma: no-cache (HTTP/1.0) orCache-Control: no-cache (HTTP/1.1).

Explicit directive that the object is not tobe cached. Response without this directiveis considered cachable.

� Cache-Control: max-age (HTTP/1.1).Explicit freshness lifetime value assignmentby the origin server in seconds. This valueis typically �xed for copies of the object ob-tained at di�erent times.

� Expires: (HTTP/1.0). A time stampbeyond which the object stops being fresh.If Max-Age is also present then Max-Age

takes priority. In practice, Expires is of-ten either set to be the current time or timein the past (implying freshness lifetime ofzero), a far time in the future (years), or iscalculated dynamically as Date + T (pro-viding for a freshness lifetime of T ) and es-sentially behaves like a Max-Age directive.

� Last-Modified (HTTP/1.0). The timewhen the object was last modi�ed by theorigin server.

� E-tag (HTTP/1.1). An identi�er for thereceived version of the object. The E-tagis generated at the origin and can be usedfor validation instead of a Last-Modified

time.

� Age (HTTP/1.1). The cumulative timean object spent in caches when received bythe current cache. This response header isadded and modi�ed by caches and is presentif all caches along the response path areHTTP/1.1 compliant.

HTTP/1.1 speci�cation (and Squid) con-sider every object as cachable unless an ex-plicit no-cache directive is present.1 Thefreshness calculation for a cachable objectcompares the age of the object with its fresh-ness lifetime. If the age is smaller thanthe freshness lifetime the object is consid-ered fresh and otherwise it is considered stale.We refer to the di�erence between the fresh-ness lifetime and the age as the time-to-live(TTL).

Squid calculates the age of an object as thedi�erence between the current time (accord-ing to its own clock) and the time speci�edby the Date header. This method of age cal-culation is also described in HTTP/1.1 speci-�cation. If an Age header is present, the age

1There are few exceptions to this rule such as re-sponses to requests with authorization headers andresponses with vary headers that are not yet sup-ported by Squid and are very rarely used.

is taken to be the maximum of the above andwhat is implied by the Age header.

Squid implements freshness lifetime calcu-lation according to HTTP/1.1 speci�cationsas follows. First, if a Max-Age directiveis present, the value is used as the fresh-ness lifetime, and the TTL is the freshnesslifetime minus the age (or zero if negative).Otherwise, if Expires header is present, thefreshness lifetime is the di�erence between thetimes speci�ed by the Expires and Date

headers (zero if negative), and thus the TTLis the di�erence between the time speci�ed bythe Expires header and the current time (orzero in case this di�erence is negative). Oth-erwise, no explicit freshness lifetime is pro-vided by the origin server and a heuristic isused: The freshness lifetime is assigned tobe a fraction (denoted by CONF PERCENT,HTTP/1.1 mentions 10% as an example) ofthe time di�erence (denoted by LM AGE)between the time speci�ed by the Date

header and the time speci�ed by the Last-Modified header, subject to a maximum al-lowed value (denoted by CONF MAX and isusually a day, since HTTP/1.1 requires thatthe cache must attach a warning if heuristicexpiration is used and the object's age ex-ceeds a day). Origin servers are supposed tokeep their clocks fairly close to real time, andthe age calculation assume that this is indeedthe case.

2.1 Usage of freshness control

To get an idea for the distribution of fresh-ness control mechanisms used, we took one ofthe logs we used (6 day NLANR cache trace,see Section 3) and performed separate GETrequests to URLs appearing in the log. Wethen weighted the freshness mechanism bythe number of requests to the object recordedin the log. 3.4% of requests were to objectswith Max-Age speci�ed. 1.4% were with noMax-Age header and with Expires speci-�ed in a relative way (or to a time equal orbefore the one speci�ed by the Date header),and 0.8% were with Expires speci�ed in anabsolute way. The vast majority, 70%, didnot have either Max-Age or Expires speci-�ed but had a Last-Modified header which

allowed for a heuristic calculation of fresh-ness lifetime. Other requests either had nei-ther of these 3 header �elds, were explicitnoncachables (3%), or corresponded to ob-jects with response headers other than 200(when separately GETted), the most com-mon of which was 302 (HTTP redirect).

Figure 1 plots CDF (Cumulative Distri-bution Function) of freshness lifetime valuesweighted by respective number of client re-quests. The majority of freshness lifetime val-ues are 24 hours (86400 seconds), which ismostly due to the heuristic calculation (usingLast-Modified) with a CONF MAX set-ting of 24 hours. About 25% of values are

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 20000 40000 60000 80000 100000 120000

frac

tion

belo

w x

seconds

CDF of freshness lifetime values, weighted by requests

UC traceSD trace

Figure 1: CDF of freshness lifetime valuesweighted by requests (cachable objects only)

0 (due to Max-Age or Expires directive).The linearity of the line between very shortfreshness lifetime values and a value of 86400(one day) is due to recently-modi�ed objectsand the heuristic expiration calculation (ob-jects modi�ed less than 10 days ago anda CONF PERCENT value of 10%). Thesestatistics also show that most freshness life-time values are in e�ect �xed , being the samefor copies subsequently obtained from the ori-gin. The freshness lifetime value is �xed whenit is computed using Max-Age, relative useof Expires, Expires value before the currentDate (freshness lifetime of 0), or with theheuristic when the object is su�ciently old tohave freshness lifetime of CONF MAX (thatis, the elapsed time since its creation timesCONF PERCENT is at least CONF MAX).The two cases without �xed freshness life-time values are (i) recently-modi�ed objectswith no explicit freshness control directive,and (ii) the Expires value is �xed (that is,

the server returns the same value for later re-quests). As we shall see next, these distinc-tions are relevant for our study.

2.2 TTLs via di�erent sources

We consider the TTL of a cached copyof an object at the time it is received by acache. The age of a received copy, and thus,its TTL, may vary according to the source.A copy fetched through the origin is typi-cally received with zero age whereas a copyobtained from a cache has positive age. 2

The relation between age and TTL dependson the applicable freshness control directive:With a \�xed" freshness lifetime value, (aswith Max-Age header. dynamically-set Ex-pires header, or heuristic expiration for ob-jects that were not modi�ed very recently),the di�erence in TTL durations of two freshcopies is equal to the di�erence in their ages(the older copy becomes stale �rst). For ob-jects with a \static" Expires header, theTTL is the same for copies with di�erentages. For recently-modi�ed objects with noexplicit directives, the di�erence in TTL val-ues of two copies is in fact greater than thedi�erence in their ages.3

The following �gures illustrate these gaps.Figure 2(A) plots the TTL for an object withMax-Age or \relative" Expires freshnesscontrol when the object is served from its ori-gin and when it is cached at a top-level cache.The illustrated scenario is such that the cachedirectly contacts an authoritative source onlywhen its cached copy is stale. Figure 2(B)plots the same for objects with heuristic ex-piration based on Last-Modified header. Ifno-cache requests are received by the parent-

2Recall that the age of the copy is calculated asthe maximum of the age and the di�erence betweenthe current time and the Date header value. Hence,if clocks at di�erent hosts are synchronized, copiesobtained from a cache should have positive age.

3Recall that when Expires or Max-Age headersare not present, a heuristic is used to determine thefreshness lifetime. The freshness lifetime is deter-mined as a fraction of the di�erence between Date

and Last-Modified. In that case the TTL gets pe-nalized twice for the copy's age: �rst, the freshnesslifetime (fraction of the di�erence between Date andLast-Modified subject to a limit) is smaller for oldercopies, and second, the age of the older copy is larger.

TT

L

time

from origin

from cache

max_age

TT

L

from origin

from cache

conf_max

time

TT

L

time

no-cache requests

max_age

from cache

from origin

(A) (B) (C)

Figure 2: TTL for an object (i) when fetched from the origin and (ii) when fetched from acache. (A) the object hasMax-Age response header; (B) object has heuristic expiration basedon Last-Modified response header, The slope of the line corresponds to CONF PERCENTvalue and it levels o� at CONF MAX; (C) object with Max-Age response header when twoclient requests with a no-cache directive where received.

cache, the TTL would look as illustrated inFigure 2(C).

2.3 Age penalty

The age-penalty is measured by a what-ifscenario (illustrated in Figure 3). We com-pare the performance of a (low-level) cachethat forwards requests (transparently or ex-plicitly) to high-level caches to the perfor-mance if all requests were forwarded to ori-gin servers. We focus here on the freshnesshit-rate at the cache. Generally, the age-penalty e�ect depends on the relation of inter-request times and freshness lifetime durationand kicks in when the low-level cache receivesmore than one request per freshness lifetime.

Origin serverHigh-level cache

Low-level cache

Figure 3: A low-level cache with choice ofsources.

3 Experimental methodology

Our data included NLANR cache traces [6].We used two 6 days traces from NLANR

caches collected January 20th till January25th, 2000 and between July 25 to July 29,2000 (the second set of logs was used only inmeasuring CDN performance).

Our experiments required response headervalues in order to deduce freshness life-time durations. As these are not typicallylogged by high-volume caches (including theNLANR traces), we separately performedGET requests shortly after downloading thetrace. Timeliness and the distribution offreshness control directives and values (pre-sented in Section 2.1) suggest that the pro-jected values we obtained were fairly closeto actual freshness lifetime values that wouldhave been obtained at the time requests werelogged.

We run simulation using the original traceand the projected values. The NLANR logscontain various labels that characterize eachrequest by its type and cache performance.(Classi�cation of hits and misses, presence ofno-cache request header, whether the cachedcopy was stale or fresh, etc.). In the simula-tion we accounted for clients requests withno-cache header (forcing the cache to for-ward the request to the server even if thecache has a fresh copy of the object). Weused a heuristic to determine on which re-quests content had changed (treating requestson which the content deemed fresh by theNLANR cache as \no change."). Our heuris-tic used the labels associated with the re-quest and the logged size of the responseto the client. In the simulation we appliedthe Squid object freshness model (HTTP/1.1compliant), using a CONF PERCENT value

of 10%, a CONF MAX value of a day, and aCONF MIN value of 0 for all URLs. We usedthe projected freshness lifetime durations. Tosimulate requests served by origin servers, weset the TTLs to be the respective projectedlifetime durations. In various experiments weused TTL values lower than freshness lifetimeto simulate access through a cache.

Since we could not GET all URLs in areasonable time without adversely a�ectingour environment, we selected a subset. Weused all URLs that were requested more than12 times, and applied some weighted randomsampling to others. In total we fetched about224K distinct URLs (the original logs hadmillions, most of them requested only once). We factored the sampling back in by scal-ing the results and grouping URLs by numberof requests.

Our simulations assumed in�nite cachestorage capacity, which is consistent with cur-rent trends and with the desired performanceon the actual traces we used.

We used the freshness rate as our perfor-mance metric. We de�ne the freshness rateas the ratio of freshness hits to content hits.We de�ne content hits and freshness hits asfollows. A request for a fresh cached object isconsidered a content hit and a freshness hit.A request for a stale cached object is consid-ered a content hit only if an authority with afresh copy (e.g, the server or another cache)certi�es that it is not modi�ed. We refer torequests that constitute content hits but notfreshness hits as freshness misses . Contenthit requests exclude requests to explicit non-cachable objects, requests on which the clientspeci�ed to bypass caches (no-cache requestheader), and requests when the content hadactually changed. We also excluded the �rstrequest for each object. Content hits includerequests for objects with respective freshnesslifetime value of 0.

We note that the freshness rate capturesonly one performance aspect of a cache andwe later discuss how to interpret it and com-bine it with others.

The Squid logs recorded the response codereturned to the client and the cache action.

We only considered GET requests such thatthe cache returned a \200" or \304" responsecode to its client. We further classi�ed theserequests as follows, using the listed Squid la-bels.

� freshness hit (fhits):TCP HIT, TCP MEM HIT, TCP IMS HIT

� freshness miss (fmiss):TCP REFRESH HIT

� content miss (cmiss): we separately ac-counted for

{ cmiss-r (the cache had a stalecached copy, issued an IMS re-quest, and got a Modi�ed response):TCP REFRESH MISS

{ cmiss-d (there was no cached copy):TCP MISS

� no-cache request header:TCP CLIENT REFRESH MISS

The following table shows the fraction ofrequests of each type.

log fhit fmiss cmiss-d cmiss-r no-cache

UC 23% 10% 56% 1% 10%SD 19% 15% 56% 3% 7%

Requests classi�ed as fmisses, cmisses, orno-cache involve communication with theorigin server or another cache. Freshnessmisses constitute 13% (UC) and 19% (SD)of all requests directed outside. The cachesdirected most validation requests (fmissesand cmisses-r) outside the NLANR hierarchy(100% in the UC cache and 99.3% in the SDcache). All these requests were directed to anorigin server or a transparent reverse proxy.Our measurements suggested that the vastmajority of requests directed outside reachedthe origin server. 4 The great exception was

4We distinguish a transparent reverse proxy cachefrom an origin server by issuing two consecutive GETrequests within more that a few seconds apart to thesame IP-address of the same server. A cache returna Date header as obtained from the authoritativeserver. An authoritative server returns the currenttime according to its clock. Thus, we examined theDate header value of the two HTTP responses. Thesame value indicated a cache. Two di�erent values,where the di�erence approximates the time betweenour two HTTP GET request-response pairs, suggestorigin server, very short TTL (few seconds or less), ora non-cachable object (that is, no age-penalty e�ect).

requests directed to CDN servers (this is dis-cussed further in Section 5). Thus, by andlarge, the original trace was not subjected toage penalty. If more validation requests aredirected to a cache we expect to see someof the freshness hits transform to freshnessmisses. It is also apparent that the vast ma-jority (90% for UC and 95% for SD) of vali-dation requests return Not-Modi�ed.

4 High-level cache simulation

The following experiment attempts to mea-sure the age penalty when content is fetchedfrom nonauthoritative servers such as reverseproxy caches. As discussed above, a givenrequest can constitute a hit or a miss atthe higher-level cache. For request consti-tuting cache misses, performance would havebeen better had the higher-level cache notbeen there, as longer latency and more tra�cis incurred than through direct communica-tion between our cache and the origin server.The higher-level cache is also not as e�ec-tive on request constituting content misses.Thus, in order to distill the performance ef-fects of age, we simulate an optimistic sce-nario where all requests constitute cache hits(the higher-level cache always keeps a freshcopy). Note that overall performance wouldbe worse when this is not the case.

Our simulation corresponds to a situa-tion where all requests to a given URLare forwarded from our cache consistentlythrough the same top-level cache (e.g., a re-verse proxy) 5. These top-level caches main-tain continuous freshness by refreshing copiesthrough the respective origin servers as soonas it becomes stale. Under these assumptions,for objects with �xed freshness lifetime dura-tions, the TTL of the copy at the top-levelcache cycles from the lifetime duration to 0.We denote the freshness lifetime value by Tand compute the TTL obtained after each

5Interestingly, analysis shows that otherwise theage penalty is higher [3]

NLANR log & object freshness raterequests fraction source fhits=chits

UC 1 origin 52%UC 1 cache 43%UC 0.1 origin 35%UC 0.1 cache 28%

SD 1 origin 47%SD 1 cache 38%SD 0.1 origin 30%SD 0.1 cache 24%

Table 1: Freshness rates when requests aredirected to (i) a cache or (ii) origin servers.

(content or freshness) miss as

TTL = T � (time� log start time) mod T :

For requests labeled as CLIENT REFRESHin the trace (arriving with no-cache requestheader), we set the TTL to T in order to sim-ulate a situation where misses are forwardedto the origin server.

The respective freshness rates in the twoscenarios are listed in Table 1, and showthat the overall age penalty of going throughhigher-level caches amounts to 20%-25% de-crease in freshness hits. The logs UC0.1 andSD0.1 are reduced traces that included a ran-dom sample of 10% of the requests.

5 Content Delivery Networks

Many sites now use content delivery net-works (CDNs) to distribute some of their con-tent [1, 10]. CDNs place multiple servers dis-tributed inside di�erent ISP's networks. Eachof these servers is essentially a cache. Like re-verse proxy caches, these servers only processURLs within the CDN domain, but like proxycaches, they are located close to clients. Sinceclients can reach a CDN server through fewerrouter hops and peering points, the responsetime, packet loss, and the number of data re-transmissions typically improves. CDNs alsoo�-load origin sites.

The current architecture used by popularCDNs (such as Akamai [1]) involves URL

substitutions. The origin sites substitutethe original object URL to one within theCDN domain. For example, the origin sitesubstitutes the URL of the embedded imagehttp://cnn.com/images/icons/video.gif

with the URL http://a388.g.akamaitech

.net/7/388/21/e0a3b4215f9e4b/cnn.com/images/

icons/video.gif. The substituted URL istermed by Akamai the Akamaized URL orARL.

Clients are then directed to a \good" CDNserver (one that is likely to be less loaded,have a cached copy of the object, and be closeto the client). When the client local DNSserver resolves the hostname of the object,the Akamai DNS server returns an IP-addressof a \good" CDN server. CDNs mainly serverelatively static content such as images andjava applets but are striving to serve addi-tional content types including streaming me-dia, authenticated content and dynamic con-tent.

The URL substitution architecture forcesthe deployment of some coherence protocolbetween the CDN and the origin sites. Onenatural possibility is to adhere to HTTP/1.1freshness control - where copies on the CDNservers must be refreshed when they expire.This approach would make the aging issuessimilar to plain reverse proxy caches. Weshall see, however, strong indication that adi�erent mechanism is used.

Experimentation shows that the contentof the response is not sensitive to the par-ticular Akamai hostname used (e.g., whena388.g.akamaitech.net is substituted witha534.g.akamaitech.net). Furthermore, theresponse content is not sensitive to changesin the 3 �elds of the ARL precedingthe part that contains the origin URL(e.g., /388/21/e0a3b4215f9e4b/ in the exam-ple above). Values in the last of these�eld suggest that it encodes a time dura-tion, counter, or an object identi�er. It seemsthat the redundancy in URL to ARL trans-lation is used to encode freshness lifetimeand that the Akamai servers use the infor-mation embedded in the requested ARL todetermine when to refresh (through the orig-inal URL). In some cases (as in the exam-ple above) the ARL seems to be unique per

version of the URL, as it encodes the ver-sion identi�er (e0a3b4215f9e4b). Measuringlatency when requesting di�erent ARLs 6

strongly suggest that CDN servers refresha copy when a request is received with anew \Akamization" of the same URL.7 It isalso evident that Akamai servers do not con-duct any checks to determine if a particular\Akamization" was actually used by the ori-gin site, since they responded to requests witharbitrary values (even in the URL portion ofthe ARL, e.g., in the ARL above, the URLportion cnn.com/images/icons/video.gif canbe substituted with an arbitrary URL). WithCDN servers not performing checks, freshnesscontrol can be performed either by (i) CDNservers considering a copy as expired after apre-agreed duration had elapsed, or as thecontent of a particular embedded URL ismodi�ed at its origin site, the origin site hasto (ii) re-substitute a new ARL for it or (iii) totrigger removal of all cached copies at allCDN servers. The most plausible possibilityseems to be the �rst, that is, the use of pre-agreed durations, most likely encoded withinthe ARL itself.

We discuss how age a�ects performance un-der past and present CDN practices.

5.1 Leave response headers intact

The �rst practice we discuss was imple-mented by the two studied CDN sites un-til about March of 2000. The CDN servers,like plain HTTP caches should, populated theend-to-end header values of their responseswith the original settings provided for the ob-ject when it was fetched from the origin site.As a result, when an object is fetched to acache from a CDN server, its age includes the

6We conducted the following measurements withseveral URLs: We compared the latency of 500 GETrequests of the URL from the same CDN server when(i) the same ARL was used each time (ii) a di�erentARL was used each time (e.g., by replacing the stringe0a3b4215f9e4b). The cumulative latency in thelatter measurement was 25%-40% slower than in the�rst. This suggests that when seeing a new Akamiza-tion, the CDN server obtains a new copy of the URLfrom the origin server prior to sending the response.

7As of 11/2000, Akamai seem to use HTTP redi-rect (302 response code) to the original URL in somecases when a new Akamization is seen.

duration it resided on the CDN server. Morespeci�cally, when a CDN server responded toan HTTP request it returned the Date, Ex-pires, Max-Age, and Last-Modified re-sponse headers as they were when the objectwas fetched to the CDN server from the trueorigin site. The CDN servers, however, di�erfrom HTTP caches since they are consideredby caches to be the authoritative sources forthe substituted URLs (ARLs). Hence, theirresponses are always considered valid for thepresent request (even if already expired) andthey constitute a �nal destination of requestswith no-cache request headers.

To experimentally measure the age penaltye�ect for CDN-delivered objects we extractedfrom the (January, 2000) NLANR logs re-quests that used two CDNs Akamai [1] andDigital Island (Sandpiper) [10]. To that endwe extracted URLs whose hostname includedthe strings \akamai," \sandpiper", or had thepre�x \fp.cache." We note that this is only asubset of requests served by these providersbecause sometimes the domain name does notinclude these strings. We used the same sim-ulation methodology as outlined in Section 3.For these URLs we calculated TTLs in twoways: 1) According to the headers providedby the content provider, 2) As they wouldhave been calculated if a current copy was ob-tained from the origin (with the Date set tocurrent time and adjusting for dynamically-generated Expires headers.) Figure 4 showsthe fraction of URLs with TTL value below xfor varying x. When fetched directly from theorigin, most objects have TTL value of a day(this is consistent with what we get when con-sidering all objects in the log). When fetchedfrom the CDN server, TTL values drop dras-tically, and most of them become zero.

The table below includes for each providerand cache, the respective fraction of requestsin the log, the freshness rate as is (fetchedfrom the CDN behaving as an HTTP cachewith respect to the headers), and the fresh-ness rate as would-have-been if the originserver was serving the object.

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

0 20000 40000 60000 80000 100000 120000

frac

tion

of tt

l val

ues

< x

ttl (seconds)

ttl values from origin and from content-host

ttls from content-hostttls from origin

Figure 4: CDF of TTL values when contentis fetched from origin vs. content-host

CDN cache % of log fhits=chits thru:requests CDN origin

Sandpiper UC 0.4% 5% 76%Sandpiper SD 0.5% 6% 67%Akamai UC 1.7% 5% 61%Akamai SD 1.1% 6% 63%

These results show that requests to objectsserved through CDNs incur a ten to �fteenfold decrease in freshness hits and 2-3 fold in-crease in freshness misses relative to the samerequests directed to respective origins.

Freshness misses constitute 26%-38% (UCand SD Sandpiper) and 26%-40% (SD andUC Akamai) of all requests forwarded by theUC cache to the CDN, which is considerablemore than general statistics across all loggedrequests. Our simulation suggests that mostof these requests are due to the age penaltye�ect.

What caused the age-penalty through CDNservers to be so much worse than throughHTTP-compliant caches (see Section 4) wasdeployment of a freshness control mechanismother than HTTP/1.1 to decide when to re-fresh their copies of hosted objects. Thefreshness lifetime value used by the CDNserver was typically considerably longer thanthe HTTP-implied value, and thus, the typ-ical hosted object HTTP-expired well beforeit was refreshed by the CDN server. For ex-ample, often the content-host-served objectshave an age exceeding Max-Age value, re-sulting in a TTL of 0 and them becomingstale almost immediately at an HTTP/1.1compliant cache.

Around March, 2000, one of the CDNs,Akamai, addressed the issue by rewritingsome of the end-to-end response headers ofhosted objects. In particular, the Date

header value was set to the current timeat the Akamai server, and the Expires ismade consistent with the Max-Age direc-tive. The returned headers are then suchthat an HTTP/1.1 compliant cache assignsthe same TTL as it would have if it had con-tacted the origin server8. This rewriting ofresponse headers eliminated much of the un-necessary freshness misses. However, the in-teraction of the two freshness control proto-cols gives rise to further questions.

5.2 ARL freshness control

It is evident from the previous measure-ments that the freshness control mechanismbetween the CDN servers and the sites allowsconsiderably longer freshness lifetime dura-tions than HTTP freshness control. Thus,the freshness rate of client caches can be sig-ni�cantly improved if the freshness lifetimeknown to the CDN servers is translated intothe HTTP response headers. In particular,when the \Akamization" is unique per ver-sion, the Expires header can be set accord-ingly to a far time in the future, and as aresult, greatly reduce the number of fresh-ness misses at downstream caches and net-work load due to validations.

To asses the potential gain we used a 5 daylong trace from the UC NLANR cache takenbetween 25 and 29 of July, 2000. Note thatthis trace was downloaded after Akamai im-plemented the change of rewriting responseheader values. The trace included for eachrequest the response code returned from theUC cache to the client, the UC cache actiontaken to process it, and service times (the to-tal processing time from the UC cache per-spective).

In total there were 19K requests to 6Kdi�erent ARLs in the akamaitech.net domain

8This is true in all cases except when the Expiresheader is used in a relative way and there is no Max-

Age directive (see Section 2), and when clocks on theorigin site and the Akamai server are not synchro-nized

for which the UC cache returned a 200 (OK)or 304 (Not-Modi�ed) response code to theclient. 304 responses are freshness misses atthe client cache and 200 responses are contentmisses at the client cache. The breakdown ofrequests is given in Table 2. In each category,we further breakdown requests according toactions taken at the UC cache. In particular,we list the percentage of freshness hits atthe UC cache (LOCAL), misses forwardedto a sibling cache (SIBLING), and missesforwarded to the Akamai server (DIRECT)9.All request directed to a SIBLING werecache content misses (TCP MISS in Squidterminology). For DIRECT requests weaccounted separately for those that arrivedfrom the client with a no-cache requestheader (TCP CLIENT REFRESH MISS),those that were freshness misses at the UCcache (TCP REFRESH HIT), and othercache misses. We calculated average servicetime for requests in each category excludingvalues exceeding 10 seconds, since otherwisethe average is distorted by few outliers. Ser-vice times of less than 10 seconds includedall local responses and about 99.5% of allresponses. It is evident that average servicetimes highly varies by request category.LOCAL requests are handled signi�cantly(70%) faster than requests on which thecache contacted another server. The re-sponse time on 304 responses is about 40%shorter than on 200 responses.

We are now able to asses the potential re-duction in freshness misses, both at the UCcache and at caches of its clients. 10.4K outof the 19.4K (over 50%) of the requests di-rected to the UC cache from client cacheswere validation requests with 304 response.All of these requests except for those witha no-cache request header could have beeneliminated. Thus, 45% of total requests be-tween client caches and the UC caches wouldhave been eliminated. Considering averageservice times shows that the average latencygain would have amounted to about 150ms(plus RTT) per request. Requests from theUC cache to Akamai servers that potentiallycould have been eliminated are all validationrequests without a no-cache request header.This included about 19% (11% for 200 client

9no requests were forwarded to a parent cache

at client cache at UC cache

content miss (200) LOCAL DIRECT SIBLING

9K 208ms 35% 80ms 59% 269ms 6% 360msno-cache 304 miss

20% 1.1K 12% 0.62K 68% 3.6K

freshness miss (304) LOCAL DIRECT SIBLING

10.4K 157ms 28% 51ms 67% 196ms 5% 201msno-cache 304 miss

18% 1.3K 29% 2K 53% 3.5K

Table 2: Breakdown of requests according to cache action.

responses and 29% for 304 responses) of totalrequests issued to an outside server (siblingcache or origin).

On a side note, it is evident that the UCcache evicts items more aggressively than itsclients caches: the majority of requests where304 response was sent to the client were con-tent misses at the UC cache and on the otherhand 60% of the freshness misses at the UCcache were freshness misses (and content hits)at the client. This suggests that UC cacheperformance on these items (and the possi-ble gain from ARL uniqueness) would bene�tfrom increased storage or better replacementpolicy.

In an attempt to try to consider only ARLsthat are used in a unique fashion per URLversion we considered only requests to ARLswhere the value in the last �eld precedingthe URL seemed to contain a time-stamp,counter, or version identi�er. We identi�ed5.8K such requests which included 2.9K with304 response and 2.9K with 200 response.Out of the 304 responses there were 0.8K withno-cache headers. Thus, about 35% of totalrequests from clients to the UC cache wouldhave been eliminated. Out of the 5.8K re-quests, about 60% were DIRECT and about13% were DIRECT freshness hits. Thus,about 20% of requests to the Akamai servercould have been eliminated.

The above discussion asses the potentialgain from translating the proprietary CDNfreshness control into the HTTP headers, butleaves aside possible reasons for keeping twosets of directives. For example, hit-count andcollecting statistics, but these can usually be

performed through the referring HTML pageor using a single embedded object on eachpage.

6 Conclusion

We discussed the e�ects of object-age onthe performance of cascaded caches, andshowed, through trace-based simulations,that the age penalty can be signi�cant. Weremark that the age penalty can be elim-inated altogether if strong consistency ismaintained between high-level caches (e.g.,reverse-proxies or CDN servers) and originservers. Strong consistency, however, is ex-pensive and not facilitated through HTTP orotherwise widely supported.

client caches higher−levelcache

origin

Figure 5: Client caches, high-level cache, andthe origin

For future work, we propose two ap-proaches to alleviate the age penalty whileworking within the current consistency infras-tructure. The �rst, source selection, is de-ployed by \low-level" caches. In some archi-tectures a cache may have a choice of whereto forward requests on which a miss occurred.If so, it is preferable to forward requeststo a server that is more likely to have the\youngest" fresh copy of the object. Moregenerally, source selection should balance dis-tance, likelihood of fresh cached copy, and

age. The second approach, rejuvenation, isdeployed by \high-level" caches. We use theterm rejuvenation for pre-term validation ofselected copies (well before they expire), asa mean to decrease age. At the limit, fre-quent rejuvenation amount to strong consis-tency between the cache and the origin andthus, no age penalty. Rejuvenations reducetra�c between the cache and its clients (anduser-perceived latency), but increase tra�cbetween the cache and the origin servers (seeFigure 5). When the cache serves manyclients which request an object frequently, thebene�t of decreased age penalty could signif-icantly outweigh the cost. We analyse andexperimentally-evaluate rejuvenation in sub-sequent work [4, 3].

The age penalty can widely vary for contentserved by CDNs. The practice of intact end-to-end HTTP response headers resulted in anorder of magnitude decrease in freshness hitsand in a 2-3 fold increase in validation traf-�c. With edited response headers, it seemsthat performance could greatly improve ifARL freshness control is translated to HTTPfreshness control in the rewritten HTTP re-sponse headers. Generally, it is desirablethat CDN clients would need to use only oneset of freshness control speci�cations, eitherthrough HTTP headers, or through the CDNproprietary mechanism. CDNs can then ei-ther adhere to HTTP directives or re-writethem to be consistent with the proprietaryprotocol. A related discussion on de�ning therole of CDNs, possibly by incorporating spe-ci�c CDN directives into HTTP headers, isgiven in [9].

We conclude with the hope that our workwould contribute to better understanding,by researchers and practitioners, of the age-related facet of cache performance, and ulti-mately, spur further work aimed at improvingperformance of cascaded caches.

Acknowledgment

We thank Duane Wessels for answeringquestions on a (since corrected) Squid loggingbug, and Bruce Maggs and Mark Nottinghamfor answering questions on observed Akamai

practices. We also thank Je� Mogul for com-ments on an earlier version of this paper.

References

[1] Akamai. http://www.akamai.com.

[2] T. Berners-Lee, R. Fielding, andH. Frystyk. Hypertext Transfer Proto-col | HTTP/1.0. RFC 1945, MIT/LCS,May 1996.

[3] E. Cohen, E. Halperin, and H. Ka-plan. Performance aspects of dis-tributed caches using TTL-based consis-tency. Manuscript, 2000.

[4] E. Cohen and H. Kaplan. Agingthrough cascaded caches: performanceissues in the distribution of web content.Manuscript, 2000.

[5] R. Fielding, J. Gettys, J. Mogul,H. Frystyk, L. Masinter, and T. Leach,P. Berners-Lee. Hypertext Transfer Pro-tocol | HTTP/1.1. RFC 2616, ISI, June1999.

[6] A Distributed Testbed for National In-formation Provisioning.http://www.ircache.net.

[7] J. C. Mogul. Errors in timestamp-basedHTTP header values. Technical Report99/3, Compaq Western Research Lab,December 1999.

[8] M. Nottingham. Optimizing objectfreshness controls in Web caches. In The

4th International Web Caching Work-

shop, 1999.

[9] M. Nottingham. On de�ning a role fordemand-driven surrogate origin servers.In The 5th International Web Caching

and Content Delivery Workshop, 2000.

[10] Digital Island (Sandpiper).http://www.sandpiper.com.

[11] Squid internet object cache.http://squid.nlanr.net/Squid.

the - usenix.org · http complian t cac he asso ciates with eac hcac hed ob ject an expiration time...

Documents