
Page 1: CS193H: High Performance Web Sites Lecture 16: Rule 13 – Configure ETags Steve Souders Google souders@cs.stanford.edu

CS193H: High Performance Web Sites

Lecture 16: Rule 13 – Configure ETags

Steve Souders
Google

souders@cs.stanford.edu

Page 2:

announcements

11/17 – guest lecturer: Robert Johnson (Facebook), "Fast Data at Massive Scale - lessons learned at Facebook"

Page 3:

Expires

expiration date determines freshness
can also use Cache-Control: max-age

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT

XmoÛHþ\ÿFÖvã*wØoq...
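
The freshness test described on this slide can be sketched in Python. This is a minimal illustration, not how any particular browser implements it: the function name `is_fresh` and its parameters are hypothetical, and real caches apply further rules (heuristic freshness, `no-cache`, clock skew) that are ignored here.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers, fetched_at, now):
    """Return True if a cached response may be reused without asking the server.

    headers    -- dict of response headers stored with the cached copy
    fetched_at -- timezone-aware datetime when the response was received
    now        -- timezone-aware datetime of the new request
    (All three names are assumptions made for this sketch.)
    """
    # Cache-Control: max-age takes precedence over Expires when both are present.
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return now < fetched_at + timedelta(seconds=int(value))

    expires = headers.get("Expires")
    if expires is not None:
        return now < parsedate_to_datetime(expires)

    # No explicit freshness information: treat the copy as stale.
    return False


headers = {"Expires": "Fri, 26 Sep 2008 22:00:00 GMT"}
fetched = datetime(2008, 9, 23, tzinfo=timezone.utc)
print(is_fresh(headers, fetched, datetime(2008, 9, 26, 21, 0, tzinfo=timezone.utc)))  # True
print(is_fresh(headers, fetched, datetime(2008, 9, 26, 23, 0, tzinfo=timezone.utc)))  # False
```

While the copy is fresh the browser makes no request at all; only after it expires does validation come into play.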

Page 4:

Conditional GET (IMS)

IMS determines validity – does the browser's cached version match what's on the server?
the comparison is based on the resource's date
a 304 response is sent instead of all the data
IMS is used when Reload is pressed

sometime after 3pm PT 9/24/08:

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT

HTTP/1.1 304 Not Modified
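
On the server side, the date comparison behind an If-Modified-Since request might look roughly like the following Python sketch. The function `handle_ims` and its arguments are hypothetical names for illustration; a real server also handles malformed dates and other request headers.

```python
from email.utils import parsedate_to_datetime


def handle_ims(request_headers, last_modified, body):
    """Return (status, body) for a GET that may carry If-Modified-Since.

    last_modified is the resource's Last-Modified value as an HTTP date string.
    """
    ims = request_headers.get("If-Modified-Since")
    if ims is not None:
        # Hit: the resource has not changed since the date the browser holds.
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(ims):
            return 304, b""
    # Miss, or no validator sent: send the full resource.
    return 200, body


resource_date = "Mon, 22 Sep 2008 21:14:35 GMT"
request = {"If-Modified-Since": "Mon, 22 Sep 2008 21:14:35 GMT"}
status, payload = handle_ims(request, resource_date, b"...2066 bytes of script...")
print(status, len(payload))  # 304 0
```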

Page 5:

ETag Response Header

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip
Expires: Fri, 26 Sep 2008 22:00:00 GMT
ETag: "19f1e-7920-4525b037f0440"

XmoÛHþ\ÿFÖvã*wØoq...

Page 6:

Conditional GET (INM)

alternative way to test validity

sometime after 3pm PT 9/24/08:

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate
If-Modified-Since: Mon, 22 Sep 2008 21:14:35 GMT
If-None-Match: "19f1e-7920-4525b037f0440"

HTTP/1.1 304 Not Modified
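
The If-None-Match test is a comparison of opaque strings rather than dates. A simplified Python sketch follows; `inm_hit` is a hypothetical helper, and it deliberately ignores weak validators (the `W/` prefix) and tags that contain commas.

```python
def inm_hit(if_none_match, current_etag):
    """Return True if the request's If-None-Match value matches the current ETag.

    if_none_match -- raw header value, e.g. '"abc", "def"' or '*'
    current_etag  -- the quoted ETag the server would send now
    """
    if if_none_match.strip() == "*":
        # "*" matches any current version of the resource.
        return True
    candidates = [tag.strip() for tag in if_none_match.split(",")]
    return current_etag in candidates


print(inm_hit('"19f1e-7920-4525b037f0440"', '"19f1e-7920-4525b037f0440"'))  # True
print(inm_hit('"19f1e-7920-4525b037f0440"', '"2a0b1-7920-4525b037f0440"'))  # False
```

The second ETag in the usage lines is made up purely to show a miss.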

Page 7:

What is an ETag

http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.11

added in HTTP/1.1
used by clients and servers to validate expired resources
more flexible than Last-Modified date
"An entity tag consists of an opaque quoted string"
"An entity tag MUST be unique across all versions of all entities associated with a particular resource."

Page 8:

If-None-Match (hit)

"If any of the entity tags match the entity tag of the entity that would have been returned in the response to a similar GET request (without the If-None-Match header) on that resource[…], then the server MUST NOT perform the requested method, unless required to do so because the resource's modification date fails to match that supplied in an If-Modified-Since header field in the request. Instead, if the request method was GET or HEAD, the server SHOULD respond with a 304 (Not Modified) response,…"

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26

Page 9:

INM, IMS hit & miss

                           If-Modified-Since
                           hit              miss
If-None-Match    hit       304              full response
                 miss

Page 10:

If-None-Match (miss)

If none of the entity tags match, then the server MAY perform the requested method as if the If-None-Match header field did not exist, but MUST also ignore any If-Modified-Since header field(s) in the request. That is, if no entity tags match, then the server MUST NOT return a 304 (Not Modified) response.

Page 11:

INM, IMS hit & miss

                           If-Modified-Since
                           hit              miss
If-None-Match    hit       304              full response
                 miss      full response    full response

if not managed properly, sending both IMS and INM lowers the chances of a simple, small 304 response

How could it not be managed properly?!
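
The hit/miss table for a request that carries both headers can be written out as a small Python function. This is only a restatement of the table under the rules quoted on the preceding slides; `validation_status` is a hypothetical name.

```python
def validation_status(inm_hit, ims_hit):
    """Status code for a conditional GET that sends both INM and IMS."""
    if not inm_hit:
        # No entity tag matched: IMS must be ignored and a 304 is not allowed.
        return 200
    if not ims_hit:
        # The tag matched but the modification date did not: full response.
        return 200
    # Both validators agree that the cached copy is current.
    return 304


for inm in (True, False):
    for ims in (True, False):
        print(f"INM hit={inm!s:5} IMS hit={ims!s:5} -> {validation_status(inm, ims)}")
```

Only one of the four combinations produces the small 304 response, which is why a spurious INM miss is costly.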

Page 12:

Apache ETags

"19f1e-7920-4525b037f0440"
"inode-size-timestamp"

inode – used by filesystems to store file type, owner, group, permissions, etc.
inode for the same file differs across servers even if file size, timestamp, and directory are the same

http://stevesouders.com/images/arrow-right-9x13.png
ETag: "21f5315-d4-5d51f0c0"

http://1.cuzillion.com/images/arrow-right-9x13.png
ETag: "1ee57ec-d4-5d51f0c0"
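
To see which component differs between the two servers, the ETags above can be split into their three fields. The Python sketch below assumes each field is a hexadecimal number, which the examples suggest but the slide does not state; `ApacheETag` and `parse_apache_etag` are hypothetical names.

```python
from typing import NamedTuple


class ApacheETag(NamedTuple):
    inode: int
    size: int
    timestamp: int


def parse_apache_etag(etag):
    """Split an "inode-size-timestamp" ETag into integer fields (hex assumed)."""
    inode, size, timestamp = (int(part, 16) for part in etag.strip('"').split("-"))
    return ApacheETag(inode, size, timestamp)


a = parse_apache_etag('"21f5315-d4-5d51f0c0"')  # stevesouders.com
b = parse_apache_etag('"1ee57ec-d4-5d51f0c0"')  # 1.cuzillion.com
print(a.size == b.size, a.timestamp == b.timestamp, a.inode == b.inode)
# True True False
```

Size and timestamp agree, so the file is the same; only the inode makes the two tags differ.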

Page 13:

IIS ETags

"b4f35327edac51:113f"
"timestamp:changenumber"

changenumber – counter to track IIS configuration changes
changenumber rarely the same across servers

http://hp.msn.com/global/c/hpv10/favicon.ico
ETag: "b4f35327edac51:113f"
ETag: "b4f35327edac51:e6e"

Page 14:

example ETag miss

GET /global/c/hpv10/favicon.ico HTTP/1.1
Host: hp.msn.com
If-Modified-Since: Wed, 26 Oct 2005 22:39:58 GMT
If-None-Match: "b4f35327edac51:19bc"

HTTP/1.x 200 OK
Content-Length: 1406
Etag: "b4f35327edac51:d76"
Last-Modified: Wed, 26 Oct 2005 22:39:58 GMT
Expires: Wed, 06 Feb 2008 01:10:16 GMT

timestamp is the same
Last-Modified matches (but IMS misses)
changenumber differs, validation misses, entire body is resent
validation miss
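
The miss in this exchange can be reproduced by comparing the two tags field by field. A small Python sketch, with `split_iis_etag` as a hypothetical helper:

```python
def split_iis_etag(etag):
    """Split a "timestamp:changenumber" ETag into its two parts."""
    timestamp, _, changenumber = etag.strip('"').partition(":")
    return timestamp, changenumber


sent = '"b4f35327edac51:19bc"'    # If-None-Match value from the request
current = '"b4f35327edac51:d76"'  # Etag value in the response

sent_ts, sent_cn = split_iis_etag(sent)
cur_ts, cur_cn = split_iis_etag(current)

print(sent_ts == cur_ts)  # True: the file itself has not changed
print(sent_cn == cur_cn)  # False: the servers' change numbers differ
print(sent == current)    # False: the whole tag must match, so INM misses
```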

Page 15:

the problem with ETags

the default ETag syntax in Apache and IIS makes it unlikely that INM will match across servers, even when the resource is the same
probability of an incorrect INM miss: (n-1)/n where "n" is the number of servers
not an issue if you just have one server

http://www.apacheweek.com/issues/02-01-18
"can cause an unnecessary performance hit as resources are fetched more often than is required"

http://support.microsoft.com/kb/922703
"IIS 6.0 sends a 200 response because it considers the different change numbers to mean that [the resources] are not the same versions"
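
The (n-1)/n figure follows if each request is equally likely to reach any of the n servers and every server produces a different ETag for the same file: the validating request matches only when it lands on the server that issued the cached tag, which happens with probability 1/n. Both conditions are assumptions of this sketch. The Python below checks the formula by enumerating every pair of servers; the function names are hypothetical.

```python
from fractions import Fraction


def inm_miss_probability(n):
    """Closed form from the slide: (n-1)/n for n servers."""
    if n < 1:
        raise ValueError("need at least one server")
    return Fraction(n - 1, n)


def enumerated_miss_probability(n):
    """Count (issuing server, validating server) pairs that differ."""
    misses = sum(1 for issued in range(n) for asked in range(n) if issued != asked)
    return Fraction(misses, n * n)


for n in (1, 2, 10):
    print(n, inm_miss_probability(n), enumerated_miss_probability(n))
# 1 0 0
# 2 1/2 1/2
# 10 9/10 9/10
```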

Page 16:

the solution for ETags

if you're not leveraging ETags, turn them off
reduces size of requests and responses
reduces outbound traffic from your servers
increases proxy cache hit rate

Apache:
FileETag none

IIS:
synchronize changenumber across servers
http://support.microsoft.com/kb/922703/

Page 17:

ETags in the wild

site                       server       ETags?   default syntax?
www.aol.com                AOLserver    no       –
www.ebay.com               IIS          yes      yes
www.facebook.com           Apache       no       –
www.google.com/search      gws          no       –
search.live.com/results    ASP.NET      yes      no
www.msn.com                IIS          no       –
www.myspace.com            Apache       some     no
en.wikipedia.org/wiki      Apache       some     no
                           lighttpd     yes      ?
www.yahoo.com              YTS          no       –
www.youtube.com            btfe         no       –

Page 18:

possible uses for ETags???

Page 19:

Homework

11/7 11:59pm – rules 4-10 applied to your "Improving a Top Site" class project
11/12 3:15pm – Web 100 Double Check
• look at your rows in Web 100 spreadsheet
• double-check your entries for any rows in red
• update incorrect entries
• enter "y" in "Double Checked" column
read HPWS Chapter 14

Page 20:

Questions

Why were ETags introduced in HTTP/1.1?
What do "IMS" and "INM" stand for?
How do IMS and INM interplay during resource validation?
What's the default syntax for ETags in Apache and IIS?
What component in each default syntax hurts performance, and why?
What are three performance gains you can achieve by turning off ETags?