Page 1: CS193H: High Performance Web Sites Lecture 8: Rule 4 – Gzip Components Steve Souders Google souders@cs.stanford.edu

CS193H: High Performance Web Sites

Lecture 8: Rule 4 – Gzip Components

Steve Souders
Google

souders@cs.stanford.edu

Page 2: Announcements

Web 100 Performance Profile (round 1) class project has been graded. Contact Aravind if you want to know your grade.

Page 3: Compression (encoding)

typically reduces size by 70%; in this example: (6230 - 2066) / 6230 = 67%

Request without Accept-Encoding:

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1

Response (uncompressed):

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 6230

function d(s) {...

Request with Accept-Encoding:

GET /v-app/scripts/107652916-dom.common.js HTTP/1.1
Host: www.blogger.com
User-Agent: Mozilla/5.0 (…) Gecko/2008070208 Firefox/3.0.1
Accept-Encoding: gzip,deflate

Response (gzipped):

HTTP/1.1 200 OK
Content-Type: application/x-javascript
Last-Modified: Mon, 22 Sep 2008 21:14:35 GMT
Content-Length: 2066
Content-Encoding: gzip

XmoÛHþ\ÿFÖvã*wØoq...
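The savings figure on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not the tooling used for the lecture: `gzip_savings` and `sample_script` are hypothetical names, and the generated script text is only a stand-in for a real JavaScript file.

```python
import gzip

def gzip_savings(original: bytes) -> tuple[int, int, float]:
    """Return (original size, gzipped size, fraction of bytes saved)."""
    compressed = gzip.compress(original)
    saved = (len(original) - len(compressed)) / len(original)
    return len(original), len(compressed), saved

# Hypothetical stand-in for a script body (not the blogger.com file).
sample_script = "".join(
    f"function handler{i}(node) {{ return node.getAttribute('data-{i}'); }}\n"
    for i in range(200)
).encode("utf-8")

before, after, savings = gzip_savings(sample_script)
print(f"{before} bytes -> {after} bytes, savings {savings:.0%}")

# The slide's own arithmetic: Content-Length 6230 vs. 2066.
slide_savings = (6230 - 2066) / 6230
print(f"slide example: {slide_savings:.0%}")
```

Real scripts are less repetitive than the generated sample, so expect their savings to be closer to the slide's figure than to the sample's.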

Page 4: Gzip vs. Deflate

gzip (default settings) compresses more

                          Gzip              Deflate
Type        Original   Size   Savings   Size   Savings
Script      3.3K       1.1K   67%       1.1K   66%
Script      39.7K      14.5K  64%       16.6K  58%
Stylesheet  1.0K       0.4K   56%       0.5K   52%
Stylesheet  14.1K      3.7K   73%       4.7K   67%

Page 5: Pros and Cons

Pro: smaller transfer size

Con: CPU cycles – on client and server

Don't compress resources < 1K
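One reason to skip very small resources is that the gzip container adds fixed framing (at least 18 bytes of header and trailer), so a tiny body can grow. A minimal sketch, assuming Python's standard `gzip` module as a stand-in for the server's compressor; `gzip_delta` and both sample bodies are hypothetical.

```python
import gzip

def gzip_delta(body: bytes) -> int:
    """Bytes saved by gzip; a negative value means the gzipped form is larger."""
    return len(body) - len(gzip.compress(body))

tiny_body = b"var a=1;"                                   # 8 bytes
large_body = b"function noop() { return null; }\n" * 200  # several KB, repetitive

print("tiny body saves:", gzip_delta(tiny_body), "bytes")
print("large body saves:", gzip_delta(large_body), "bytes")
```

The CPU cost on both ends is paid either way, so for tiny bodies the trade is all cost and no benefit.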

Page 6: Gzip configuration

Apache 1.3: mod_gzip

mod_gzip_item_include file \.html$
mod_gzip_item_include mime ^text/html$
mod_gzip_item_include file \.js$
mod_gzip_item_include mime ^application/x-javascript$
mod_gzip_item_include file \.css$
mod_gzip_item_include mime ^text/css$

Apache 2.x: mod_deflate

AddOutputFilterByType DEFLATE text/html text/css application/x-javascript

control compression level: DeflateCompressionLevel
http://httpd.apache.org/docs/2.0/mod/mod_deflate.html
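To get a feel for what a compression-level setting trades off, the sketch below compresses the same data at several zlib levels. It assumes Python's `zlib` module as a stand-in for the server; the level range 1 to 9 is zlib's, and `sizes_by_level` and the generated stylesheet text are hypothetical.

```python
import zlib

# Hypothetical stylesheet-like sample data.
sample = "".join(
    f"#item-{i} .label {{ color: #{i % 4096:03x}; margin: {i % 9}px; }}\n"
    for i in range(500)
).encode("utf-8")

def sizes_by_level(data: bytes, levels=(1, 6, 9)) -> dict:
    """Map each zlib compression level to the compressed size in bytes."""
    return {level: len(zlib.compress(data, level)) for level in levels}

print(f"original: {len(sample)} bytes")
for level, size in sizes_by_level(sample).items():
    print(f"level {level}: {size} bytes")
```

Higher levels spend more CPU per response, which is the "Con" side of the trade-off, so it is worth measuring on real content before raising the level.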

Page 7: Gzip: not just for HTML

March 2007

                         HTML  Scripts  Stylesheets
amazon.com               x
aol.com                  x     some     some
cnn.com
ebay.com                 x
froogle.google.com       x     x        x
msn.com                  x     deflate  deflate
myspace.com              x     x        x
wikipedia.org            x     x        x
yahoo.com                x     x        x
youtube.com              x     some     some

October 2008

                         HTML  Scripts  Stylesheets
aol.com                  x     x        x
ebay.com                 x     some
facebook.com             x     x        x
google.com/search        x     x        na
search.live.com/results  x     x        x
msn.com                  x     x        x
myspace.com              x     x        x
en.wikipedia.org/wiki    x     some     some
yahoo.com                x     x        x
youtube.com              x     x        x

gzip scripts, stylesheets, XML, JSON (not images, Flash, PDF)
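The rule of thumb above (compress text formats, skip formats that are already compressed, skip bodies under 1K) can be written as a small predicate. This is a sketch under assumptions: the exact MIME type list and the name `should_gzip` are illustrative, not taken from the lecture.

```python
# Text-based types worth compressing (assumed list; adjust per site).
COMPRESSIBLE_TYPES = {
    "text/html",
    "text/css",
    "application/x-javascript",
    "text/xml",
    "application/xml",
    "application/json",
}

def should_gzip(content_type: str, body_size: int, min_size: int = 1024) -> bool:
    """Decide whether to gzip, based on media type and the 1K size floor."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type in COMPRESSIBLE_TYPES and body_size >= min_size

print(should_gzip("text/css; charset=utf-8", 4096))  # stylesheet
print(should_gzip("image/png", 4096))                # already compressed
print(should_gzip("text/html", 200))                 # under the 1K floor
```

Images, Flash, and PDF are left out because their formats already apply compression, so gzip costs CPU without shrinking them much.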

Page 8: Edge Case: Proxies

Diagram: Proxy, Origin Server. Steps in order:

1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip
4 main.js Content-Encoding: gzip
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 main.js Content-Encoding: gzip

proxies may serve gzipped content to browsers that don't support it, and vice versa

Page 9: Edge Case: Proxies w/ Vary

Diagram: Proxy, Origin Server. Steps in order:

1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Vary: Accept-Encoding
4 main.js Content-Encoding: gzip [Accept-Encoding: gzip]
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js (no Accept-Encoding)
8 main.js Vary: Accept-Encoding
9 main.js [Accept-Encoding: ]
10 main.js (no gzip)
11 GET main.js Accept-Encoding: gzip
12 main.js Content-Encoding: gzip
13 GET main.js (no Accept-Encoding)
14 main.js (no gzip)

add Vary: Accept-Encoding
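The two proxy sequences can be simulated with a toy cache. This is a simplified sketch, not a model of any real proxy: `origin_response` and `ToyProxy` are hypothetical names, and the cache matches the Accept-Encoding request header by exact string, which real caches need not do.

```python
def origin_response(accept_encoding: str, send_vary: bool) -> dict:
    """Toy origin: gzip when the request allows it, optionally add Vary."""
    headers = {}
    if "gzip" in accept_encoding:
        headers["Content-Encoding"] = "gzip"
    if send_vary:
        headers["Vary"] = "Accept-Encoding"
    return headers

class ToyProxy:
    def __init__(self, origin_sends_vary: bool):
        self.origin_sends_vary = origin_sends_vary
        self.entries = {}  # url -> list of (vary_key, response headers)

    def get(self, url: str, accept_encoding: str = "") -> dict:
        for vary_key, headers in self.entries.get(url, []):
            # Without Vary (vary_key is None) any request is a cache hit.
            if vary_key is None or vary_key == accept_encoding:
                return headers
        headers = origin_response(accept_encoding, self.origin_sends_vary)
        vary_key = accept_encoding if "Vary" in headers else None
        self.entries.setdefault(url, []).append((vary_key, headers))
        return headers

# No Vary: the second browser (no gzip support) gets the gzipped copy.
plain_proxy = ToyProxy(origin_sends_vary=False)
plain_proxy.get("main.js", "gzip")
print("without Vary:", plain_proxy.get("main.js", ""))

# With Vary: the proxy keeps one cached entry per Accept-Encoding value.
vary_proxy = ToyProxy(origin_sends_vary=True)
vary_proxy.get("main.js", "gzip")
print("with Vary:", vary_proxy.get("main.js", ""))
```

The bracketed `[Accept-Encoding: gzip]` and `[Accept-Encoding: ]` in the step list correspond to `vary_key` here: the request header value stored alongside each cached copy.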

Page 10: Edge Case: Bad Browsers

< 1% of browsers have problems with gzip

IE 5.5: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q313712
IE 6.0: http://support.microsoft.com/default.aspx?scid=kb;en-us;Q31249
Netscape 3.x, 4.x: http://www.schroepl.net/projekte/mod_gzip/browser.htm

User-Agent white list for gzip

Apache 1.3:
mod_gzip_item_include reqheader "User-Agent: MSIE [6-9]"
mod_gzip_item_include reqheader "User-Agent: Mozilla/[5-9]"

Apache 2.0:
BrowserMatch ^MSIE [6-9] gzip
BrowserMatch ^Mozilla/[5-9] gzip
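The white-list idea can be tried out against sample User-Agent strings. A minimal sketch, assuming the two patterns from the slide are applied as regular-expression searches over the User-Agent value; the anchoring choice, the name `ua_allows_gzip`, and the sample strings are assumptions for illustration.

```python
import re

# Patterns adapted from the slide's white list (anchoring is an assumption).
GZIP_WHITELIST = [
    re.compile(r"MSIE [6-9]"),
    re.compile(r"^Mozilla/[5-9]"),
]

def ua_allows_gzip(user_agent: str) -> bool:
    """True if the User-Agent matches any white-list pattern."""
    return any(pattern.search(user_agent) for pattern in GZIP_WHITELIST)

# Illustrative User-Agent strings.
samples = {
    "IE 6.0": "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
    "IE 5.5": "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)",
    "Firefox 3": "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.1) "
                 "Gecko/2008070208 Firefox/3.0.1",
    "Netscape 4": "Mozilla/4.08 [en] (WinNT; U)",
}

for name, ua in samples.items():
    print(f"{name}: gzip allowed = {ua_allows_gzip(ua)}")
```

A white list is the conservative choice: unknown browsers get uncompressed content, which is slower but always readable.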

Page 11: Edge Case: Bad Browsers (cont'd)

proxies could mix up responses
  give cached response from useragent1 to useragent2

could add Vary: User-Agent
  so many possibilities, defeats proxy caching

better to add Cache-Control: private
  downside: disables all proxy caches

is it a serious problem?
  hard to diagnose; problem getting smaller

Page 12: Edge Case: ETags

what happens when proxy makes Conditional GET requests?

Last-Modified date for gzipped vs. ungzipped is different => If-Modified-Since works fine

ETag is the same in Apache for gzipped & ungzipped => If-None-Match succeeds, proxy could give browser mismatched content

remove ETags! (Rule 13)

http://issues.apache.org/bugzilla/show_bug.cgi?id=39727

Page 13: Edge Case: ETags present

Diagram: Proxy, Origin Server. Steps in order:

1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Cache-Control: max-age=0 ETag: "de158-e58-c7ee4140"
4 main.js Content-Encoding: gzip Cache-Control: max-age=0 ETag: "de158-e58-c7ee4140"
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js If-None-Match: "de158-e58-c7ee4140"
8 304 Not Modified
9 main.js Content-Encoding: gzip

proxy gives browser mismatched content

Page 14: Edge Case: ETags removed

Diagram: Proxy, Origin Server. Steps in order:

1 GET main.js Accept-Encoding: gzip
2 GET main.js Accept-Encoding: gzip
3 main.js Content-Encoding: gzip Cache-Control: max-age=0 Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
4 main.js Content-Encoding: gzip Cache-Control: max-age=0 Last-Modified: Thu, 21 Aug 2008 23:53:57 GMT
5 main.js Content-Encoding: gzip
6 GET main.js (no Accept-Encoding)
7 GET main.js If-Modified-Since: Thu, 21 Aug 2008 23:53:57 GMT
8 main.js Cache-Control: max-age=0 Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
9 main.js Cache-Control: max-age=0 Last-Modified: Fri, 22 Aug 2008 09:43:15 GMT
10 main.js (no gzip)

removing ETags avoids the problem
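The difference between the two revalidation paths can be sketched as a toy origin that answers a proxy's conditional GET. This is a simplified model under assumptions: `Variant`, `origin_variants`, and `revalidate` are hypothetical names, the ETag and dates are the ones shown on the slides, and the proxy is assumed to prefer If-None-Match whenever its cached copy carries an ETag.

```python
from dataclasses import dataclass
from email.utils import parsedate_to_datetime
from typing import Optional

@dataclass
class Variant:
    body: str
    etag: Optional[str]
    last_modified: str

def origin_variants(with_etags: bool) -> dict:
    """Gzipped and plain copies; with ETags on, both share one ETag."""
    shared_etag = '"de158-e58-c7ee4140"' if with_etags else None
    return {
        "gzip": Variant("gzipped main.js", shared_etag,
                        "Thu, 21 Aug 2008 23:53:57 GMT"),
        "plain": Variant("plain main.js", shared_etag,
                         "Fri, 22 Aug 2008 09:43:15 GMT"),
    }

def revalidate(cached: Variant, wanted: Variant) -> str:
    """Return the body the proxy ends up sending after a conditional GET."""
    if cached.etag is not None:
        # If-None-Match: origin answers 304 when the ETags are equal.
        not_modified = cached.etag == wanted.etag
    else:
        # If-Modified-Since: 304 only if the wanted copy is not newer.
        not_modified = (parsedate_to_datetime(wanted.last_modified)
                        <= parsedate_to_datetime(cached.last_modified))
    return cached.body if not_modified else wanted.body

for with_etags in (True, False):
    variants = origin_variants(with_etags)
    # Proxy holds the gzipped copy; a browser without gzip support asks next.
    served = revalidate(cached=variants["gzip"], wanted=variants["plain"])
    print(f"ETags {'present' if with_etags else 'removed'}: served {served!r}")
```

With the shared ETag the origin answers 304 and the proxy reuses its gzipped copy for a browser that cannot decode it; with ETags removed, the differing Last-Modified dates force a full response with the plain copy.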

Page 15: Edge Case Fixes

Columns: Vary: Accept-Encoding, Cache-Control: private, ETag

aol.com x
ebay.com x x x (IIS)
facebook.com x
google.com/search x
search.live.com/results x x (IIS)
msn.com x (IIS)
myspace.com x x (Apa)
en.wikipedia.org/wiki x (Apa)
yahoo.com x
youtube.com x some

Vary: User-Agent – not used

March 2007, October 2008

Page 16: Homework

"Improving Top Site" class project:
• add improvements for Rule 4
• measure improvements using Hammerhead
• record results in your personal Web 100 sheet

read Chapter 5 of HPWS for 10/17

Page 17: Questions

How much are file sizes typically reduced by using gzip compression?

What types of resources (images, scripts, etc.) should not be compressed?

For the resource types that should be compressed, should they always be compressed?

How do you prevent proxies from serving gzipped resources to browsers that don't support gzip?

How can ETags cause proxies to serve mismatched content to browsers?