cs193h: high performance web sites lecture 16: rule 13 – configure etags steve souders google...

Post on 27-Mar-2015

CS193H: High Performance Web Sites

Lecture 16: Rule 13 – Configure ETags

Steve Souders
Google

souders@cs.stanford.edu

announcements

11/17 – guest lecturer: Robert Johnson (Facebook), "Fast Data at Massive Scale - lessons learned at Facebook"

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip

XmoÛHþ\ÿFÖvã*wØoq...

Expires

expiration date determines freshness
can also use Cache-Control: max-age

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT

XmoÛHþ\ÿFÖvã*wØoq...
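
The freshness rule above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular browser implements it: the function name `is_fresh` and the `fetched_at` parameter are hypothetical, `max-age` is measured from the time the response was fetched, and details such as the Age header or heuristic freshness are ignored.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers: dict, fetched_at: datetime, now: datetime) -> bool:
    """Return True if a cached response may be reused without asking the server."""
    # Cache-Control: max-age wins over Expires when both are present.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return now < fetched_at + timedelta(seconds=int(value))
    expires = headers.get("Expires")
    if expires:
        return now < parsedate_to_datetime(expires)
    # No explicit freshness information: this sketch treats the entry as stale.
    return False


# Expires value taken from the response shown on the slide.
headers = {"Expires": "Fri, 26 Sep 2008 22:00:00 GMT"}
fetched = datetime(2008, 9, 24, 22, 0, tzinfo=timezone.utc)
print(is_fresh(headers, fetched, fetched + timedelta(hours=1)))  # True
print(is_fresh(headers, fetched, fetched + timedelta(days=3)))   # False
```

While the entry is fresh the browser makes no request at all; validation (the conditional GET) only comes into play once it has expired.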

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT

HTTP/1.1 304 Not Modified

Conditional GET (IMS)

IMS determines validity – does the browser's cached version match what's on the server?

the comparison is based on the resource's date
a 304 response is sent instead of all the data
IMS is used when Reload is pressed
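
A minimal sketch of the server-side date comparison, assuming the request carries only If-Modified-Since. The function name `respond_to_ims` is hypothetical; real servers also handle malformed dates and other conditional headers.

```python
from email.utils import parsedate_to_datetime


def respond_to_ims(if_modified_since: str, last_modified: str) -> int:
    """Return the status code for a conditional GET that only sends If-Modified-Since."""
    # Hit: the resource has not changed since the date the browser cached it,
    # so a bodiless 304 is enough.
    if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since):
        return 304
    # Miss: the resource is newer, so the full response is sent again.
    return 200


# Dates taken from the blogger.com example on the slide.
print(respond_to_ims("Mon, 22 Sep 2008 21:14:35 GMT",
                     "Mon, 22 Sep 2008 21:14:35 GMT"))  # 304
```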

sometime after 3pm PT 9/24/08:

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip

XmoÛHþ\ÿFÖvã*wØoq...

ETag Response Header

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT
ETag: "19f1e-7920-4525b037f0440"

XmoÛHþ\ÿFÖvã*wØoq...

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT
If-None-Match: "19f1e-7920-4525b037f0440"

HTTP/1.1 304 Not Modified

Conditional GET (INM)

alternative way to test validity

sometime after 3pm PT 9/24/08:

What is an ETag

http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.11

added in HTTP/1.1
used by clients and servers to validate expired resources
more flexible than Last-Modified date
"An entity tag consists of an opaque quoted string"
"An entity tag MUST be unique across all versions of all entities associated with a particular resource."

If-None-Match (hit)

"If any of the entity tags match the entity tag of the entity that would have been returned in the response to a similar GET request (without the If-None-Match header) on that resource[…], then the server MUST NOT perform the requested method, unless required to do so because the resource's modification date fails to match that supplied in an If-Modified-Since header field in the request. Instead, if the request method was GET or HEAD, the server SHOULD respond with a 304 (Not Modified) response,…"

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26

INM, IMS hit & miss

                         If-Modified-Since
                         hit     miss
If-None-Match   hit      304     full response
                miss

If-None-Match (miss)

"If none of the entity tags match, then the server MAY perform the requested method as if the If-None-Match header field did not exist, but MUST also ignore any If-Modified-Since header field(s) in the request. That is, if no entity tags match, then the server MUST NOT return a 304 (Not Modified) response."

INM, IMS hit & miss

                         If-Modified-Since
                         hit     miss
If-None-Match   hit      304     full response
                miss     full response   full response
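
The hit/miss table can be read as a small decision function. The sketch below is an illustration under simplifying assumptions: the name `conditional_status` is hypothetical, the If-None-Match value is split naively on commas, and weak validators and `*` are not handled.

```python
from email.utils import parsedate_to_datetime


def conditional_status(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 or 200 for a GET carrying If-None-Match and/or If-Modified-Since."""
    inm = request_headers.get("If-None-Match")
    ims = request_headers.get("If-Modified-Since")
    ims_hit = (
        ims is not None
        and parsedate_to_datetime(last_modified) <= parsedate_to_datetime(ims)
    )
    if inm is None:
        # Only the date validator is available.
        return 304 if ims_hit else 200
    inm_hit = etag in [tag.strip() for tag in inm.split(",")]
    if not inm_hit:
        # INM miss: If-Modified-Since must be ignored, so never a 304.
        return 200
    if ims is not None and not ims_hit:
        # INM hit but the modification date fails to match.
        return 200
    return 304


# Both validators hit, using the blogger.com values from the slides.
print(conditional_status(
    {"If-None-Match": '"19f1e-7920-4525b037f0440"',
     "If-Modified-Since": "Mon, 22 Sep 2008 21:14:35 GMT"},
    etag='"19f1e-7920-4525b037f0440"',
    last_modified="Mon, 22 Sep 2008 21:14:35 GMT",
))  # 304
```

Only the top-left cell of the table yields a 304, so a browser that sends both headers gets the small response only when both validators match.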

if not managed properly, sending both IMS and INM lowers the chances of a simple, small 304 response

How could it not be managed properly?!

Apache ETags

"19f1e-7920-4525b037f0440"

"inode-size-timestamp"

inode – used by filesystems to store file type, owner, group, permissions, etc.

inode for the same file differs across servers even if file size, timestamp, and directory are the same

http://stevesouders.com/images/arrow-right-9x13.png
ETag: "21f5315-d4-5d51f0c0"

http://1.cuzillion.com/images/arrow-right-9x13.png
ETag: "1ee57ec-d4-5d51f0c0"
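
To make the "inode-size-timestamp" layout concrete, here is a small sketch that rebuilds the two ETags above from their three components. It is not Apache's code: the function name `apache_style_etag` is hypothetical, and the assumption that each field is written as unpadded lowercase hexadecimal is inferred from the examples on the slide.

```python
def apache_style_etag(inode: int, size: int, timestamp: int) -> str:
    """Format an ETag in the "inode-size-timestamp" layout (hex fields assumed)."""
    return '"{:x}-{:x}-{:x}"'.format(inode, size, timestamp)


# Same image on two hosts: size and timestamp agree, only the inode differs.
server_a = apache_style_etag(0x21f5315, 0xd4, 0x5d51f0c0)
server_b = apache_style_etag(0x1ee57ec, 0xd4, 0x5d51f0c0)

print(server_a)              # "21f5315-d4-5d51f0c0"
print(server_b)              # "1ee57ec-d4-5d51f0c0"
print(server_a == server_b)  # False
```

Because the inode is part of the string, a byte-identical file produces a different ETag on each server.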

IIS ETags

"b4f35327edac51:113f"

"timestamp:changenumber"

changenumber – counter to track IIS configuration changes

changenumber rarely the same across servers

http://hp.msn.com/global/c/hpv10/favicon.ico
ETag: "b4f35327edac51:113f"
ETag: "b4f35327edac51:e6e"

example ETag miss

GET /global/c/hpv10/favicon.ico HTTP/1.1
Host: hp.msn.com
If-Modified-Since: Wed, 26 Oct 2005 22:39:58 GMT
If-None-Match: "b4f35327edac51:19bc"

HTTP/1.x 200 OK
Content-Length: 1406
Etag: "b4f35327edac51:d76"
Last-Modified: Wed, 26 Oct 2005 22:39:58 GMT
Expires: Wed, 06 Feb 2008 01:10:16 GMT

timestamp is the same
Last-Modified matches (but IMS misses)
changenumber differs, validation misses, entire body is resent
validation miss
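
The miss in this example can be traced by splitting the two ETags into their "timestamp:changenumber" parts. This is only an illustration: the helper name `split_iis_etag` is hypothetical, and the split on the first colon is assumed from the syntax shown on the slide.

```python
def split_iis_etag(etag: str) -> tuple:
    """Split a quoted "timestamp:changenumber" ETag into its two parts."""
    timestamp, _, changenumber = etag.strip('"').partition(":")
    return timestamp, changenumber


sent = split_iis_etag('"b4f35327edac51:19bc"')     # If-None-Match from the browser
current = split_iis_etag('"b4f35327edac51:d76"')   # Etag on the answering server

print(sent[0] == current[0])  # True: same timestamp, the file is unchanged
print(sent[1] == current[1])  # False: changenumber differs
print(sent == current)        # False: the ETag as a whole misses
```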

the problem with ETags

the default ETag syntax in Apache and IIS makes it unlikely that INM will match across servers, even when the resource is the same
probability of an incorrect INM miss: (n-1)/n where "n" is the number of servers
not an issue if you just have one server

http://www.apacheweek.com/issues/02-01-18
"can cause an unnecessary performance hit as resources are fetched more often than is required"

http://support.microsoft.com/kb/922703
"IIS 6.0 sends a 200 response because it considers the different change numbers to mean that [the resources] are not the same versions"
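
The (n-1)/n figure can be tabulated for a few cluster sizes. The sketch assumes each request lands on one of n servers uniformly at random, independently of which server issued the cached ETag, and that every server produces a different ETag for the same file; the function name `inm_miss_probability` is hypothetical.

```python
from fractions import Fraction


def inm_miss_probability(n_servers: int) -> Fraction:
    """Probability that a revalidation reaches a server other than the one that issued the ETag."""
    if n_servers < 1:
        raise ValueError("need at least one server")
    return Fraction(n_servers - 1, n_servers)


for n in (1, 2, 10):
    print(n, inm_miss_probability(n))
# 1 0
# 2 1/2
# 10 9/10
```

With a single server the probability is zero, which is why the problem only shows up on multi-server sites.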

the solution for ETags

if you're not leveraging ETags, turn them off
reduces size of requests and responses
reduces outbound traffic from your servers
increases proxy cache hit rate

Apache:
FileETag none

IIS:
synchronize changenumber across servers
http://support.microsoft.com/kb/922703/

ETags in the wild

site                       server      ETags?   default syntax?
www.aol.com                AOLserver   no       –
www.ebay.com               IIS         yes      yes
www.facebook.com           Apache      no       –
www.google.com/search      gws         no       –
search.live.com/results    ASP.NET     yes      no
www.msn.com                IIS         no       –
www.myspace.com            Apache      some     no
en.wikipedia.org/wiki      Apache      some     no
                           lighttpd    yes      ?
www.yahoo.com              YTS         no       –
www.youtube.com            btfe        no       –

possible uses for ETags???

Homework

11/7 11:59pm – rules 4-10 applied to your "Improving a Top Site" class project
11/12 3:15pm – Web 100 Double Check
• look at your rows in Web 100 spreadsheet
• double-check your entries for any rows in red
• update incorrect entries
• enter "y" in "Double Checked" column
read HPWS Chapter 14

Questions

Why were ETags introduced in HTTP/1.1?
What do "IMS" and "INM" stand for?
How do IMS and INM interplay during resource validation?
What's the default syntax for ETags in Apache and IIS?
What component in each default syntax hurts performance, and why?
What are three performance gains you can achieve by turning off ETags?
