Varnish Cache Plus: Random notes for wise web developers

Carlos Abalde, Roberto Moreda {cabalde, moreda}@allenta.com
October 2014

Post on 27-Jun-2015


Agenda

1. Introduction

2. Varnish 101

3. Invalidations

4. HTTP headers

5. Content composition

6. VAC

7. VCS

8. Device detection

9. Varnish Plus 4.x

10. Q&A

1. Introduction

Disclaimer

๏ General understanding of ‘The Varnish Book’ is assumed

‣ This is not the official Varnish Cache training

‣ This is not a Varnish Cache internals course

‣ This is not a Varnish module development course

‣ This is a collection of random notes for web developers willing to make the most of Varnish Cache Plus

๏ OSS Varnish Cache vs. Varnish Cache Plus

‣ 3.x vs. 4.x

Varnish Cache 3.x

๏ The Varnish Book

‣ https://www.varnish-software.com/static/book/

๏ The Varnish Reference Manual

‣ https://www.varnish-cache.org/docs/.../index.html

๏ Default VCL

‣ https://www.varnish-cache.org/trac/.../default.vcl

What everybody should know

Varnish Cache Plus 3.x

๏ Support, advice & training

๏ Varnish Enhanced Cache Invalidation

‣ Hash Two, Hash Ninja…

๏ Varnish Administration Console (VAC)

๏ Varnish Custom Statistics (VCS)

๏ Device detection

Components I

Varnish Cache Plus 3.x

๏ Varnish Tuner

๏ Enhanced HTTP streaming

๏ Packaged binary VMODs

๏ Varnish Paywall

๏ … and more to come shortly!

Components II

Varnish Cache Plus 3.x

๏ 64 bits

๏ Distributions

‣ RedHat Enterprise Linux 5 & 6

‣ Ubuntu Linux 12.04 LTS (precise)

‣ Ubuntu Linux 14.04 LTS (trusty)

‣ Debian Linux 7 (wheezy)

Supported platforms

2. Varnish 101

Caching policy

๏ Varnish Cache Plus would require zero configuration in a perfect world with perfect HTTP citizens

‣ Correct HTTP caching headers

‣ Vary HTTP header used wisely

‣ HTTP cookies used conservatively

๏ By default, Varnish Cache Plus will not cache anything marked as private, carrying a cookie, or including a '*' Vary HTTP header
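Few sites are perfect HTTP citizens, so a common first step is to drop cookies where they cannot matter. A minimal sketch for Varnish 3.x, assuming static assets can be recognized by file extension (the extension list is an illustrative assumption, not taken from the slides):

```vcl
sub vcl_recv {
    # Assumption: responses for these extensions never depend on cookies.
    if (req.url ~ "\.(css|js|png|gif|jpg|ico)(\?|$)") {
        unset req.http.Cookie;
    }
}

sub vcl_fetch {
    # Same assumption on the way back: do not let a stray Set-Cookie
    # header make a static asset uncacheable.
    if (bereq.url ~ "\.(css|js|png|gif|jpg|ico)(\?|$)") {
        unset beresp.http.Set-Cookie;
    }
}
```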

VCL

๏ Varnish Configuration Language

‣ Domain specific state engine

‣ No loops, variables, functions…

‣ Command line configuration & Tunable parameters

๏ Translated to C code

๏ Loaded as a dynamically generated shared library

‣ Zero downtime & Blazingly fast

Overview

VCL

๏ Normalize client-input

๏ Pick a backend / director

๏ Re-write / extend client-input

๏ Decide caching policy based on client-input

๏ Access control

๏ Security barriers

vcl_recv I

VCL: vcl_recv II

sub vcl_recv {
    # Backend selection & URL normalization.
    if (req.http.host ~ "^blogs\.") {
        set req.backend = blogs;
        set req.http.host = regsub(req.http.host, "^blogs\.", "");
        set req.url = regsub(req.url, "^", "/blogs");
    } else {
        set req.backend = default;
    }

    # Poor man's device detection.
    if (req.http.User-Agent ~ "(iPad|iPhone|Android)") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}

VCL

๏ Sanitize / extend backend response

๏ Override cache duration

‣ beresp.ttl

- s-maxage & max-age in Cache-Control HTTP header

- Expires HTTP header

- Default TTL

‣ Beware of the TTL of hitpass objects

vcl_fetch I

VCL: vcl_fetch II

sub vcl_fetch {
    # Override caching TTL.
    if (beresp.http.Cache-Control !~ "s-maxage") {
        set beresp.ttl = 0s;
        if (bereq.url ~ "\.jpg(\?|$)") {
            set beresp.ttl = 30s;
        }
    }

    # Never cache a Set-Cookie header.
    if (beresp.ttl > 0s) {
        unset beresp.http.Set-Cookie;
    }

    # Create ban-lurker friendly objects.
    set beresp.http.X-Url = bereq.url;
}

VCL: Request flow I

VCL: Request flow II

Process architecture

VMODs

๏ Shared libraries extending the VCL core

‣ std VMOD

- std.toupper(), std.log(), std.fileread()…

‣ ABI (Application Binary Interface) mismatches

๏ cookie, header, var, curl, digest, geoip, boltsort, memcached, redis, dns…

๏ https://www.varnish-cache.org/vmods
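As a small illustration of calling into a VMOD, here is a hedged Varnish 3.x sketch that uses two std functions. The log prefix is a made-up example value.

```vcl
import std;

sub vcl_recv {
    # Normalize the Host header so "Example.com" and "example.com"
    # end up sharing the same cache entries.
    set req.http.host = std.tolower(req.http.host);

    # Write a line to the shared memory log (visible with varnishlog).
    # The "normalized-host:" prefix is purely illustrative.
    std.log("normalized-host:" + req.http.host);
}
```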

Backends

๏ Multiple backends

‣ Selected at request time based on any request property

๏ Probes

‣ Per-backend periodic health checks

- Interval, timeout, expected response…

๏ Directors

‣ Load balanced backend groups
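The three concepts above could be wired together roughly as follows in Varnish 3.x. All names, addresses, the probe URL and the timing values are assumptions chosen for illustration.

```vcl
# Hypothetical health check shared by both backends.
probe healthcheck {
    .url = "/health";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;        # Look at the last 5 probes...
    .threshold = 3;     # ...and require 3 of them to succeed.
    .expected_response = 200;
}

backend app1 {
    .host = "192.0.2.10";
    .port = "8080";
    .probe = healthcheck;
}

backend app2 {
    .host = "192.0.2.11";
    .port = "8080";
    .probe = healthcheck;
}

# Load balanced group; sick backends are skipped.
director apps round-robin {
    { .backend = app1; }
    { .backend = app2; }
}

sub vcl_recv {
    set req.backend = apps;
}
```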

Error handling

๏ Some backend may be sick for a particular object

‣ Other objects from the same backend can still be accessed

- Unless more than a set number of objects are added to the saint mode blacklist for a specific backend

๏ Do not request the object from that backend again for a period of time

‣ Grace mode is used when all possible backends for the requested object have been blacklisted

๏ Complement backend probes

Saint mode
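A minimal saint mode sketch for Varnish 3.x, assuming that any 5xx response should blacklist the backend for that object. The 20 second window and the restart limit are illustrative values.

```vcl
sub vcl_fetch {
    if (beresp.status >= 500 && req.restarts < 2) {
        # Do not ask this backend for this object during the next 20s...
        set beresp.saintmode = 20s;
        # ...and retry the request, possibly against another backend.
        return (restart);
    }
}
```

The per-backend limit mentioned above is controlled by the saintmode_threshold runtime parameter.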

Error handling

๏ A graced object is an object that has expired, but is still kept in cache

‣ beresp.ttl vs. beresp.grace

๏ Graced objects are used to

‣ Serve outdated content if the backend is down

- Probes or saint mode are required for this

‣ Serve slightly stale content while fresh versions are fetched

Grace mode
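One common way to combine both uses of graced objects in Varnish 3.x is sketched here. The durations are assumptions and should be tuned to how stale the content is allowed to get.

```vcl
sub vcl_recv {
    if (req.backend.healthy) {
        # Backend is fine: accept only slightly stale objects
        # while a fresh version is being fetched.
        set req.grace = 30s;
    } else {
        # Backend is down: accept much older content.
        set req.grace = 6h;
    }
}

sub vcl_fetch {
    # Keep objects around for up to 6 hours past their TTL.
    set beresp.grace = 6h;
}
```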

Beyond caching policy

๏ Why restrict VCL / VMODs to implementing the caching policy?

๏ Any logic modeled in VCL / VMODs is compiled, embedded & executed in the caching edge layer

‣ 1000x faster than typical Java / PHP apps

- Strong restrictions

‣ Accounting, paywalling, A/B testing…

varnishtest

๏ Powerful Varnish-specific testing tool

‣ Mocked clients & backends executing / processing HTTP requests against real Varnish Cache Plus instances

‣ http://www.clock.co.uk/...varnishtest

๏ Essential when implementing complex VCL logic

๏ Easily integrable in any CI infrastructure

FAQ

๏ When will SSL support be implemented?

‣ "[...] huge waste of time and effort to even think about it."

๏ When will SPDY support be implemented?

‣ "[...] Varnish is not speedy, Varnish is fast! [...]"

๏ What is the recommended value for this bizarre kernel / varnishd parameter I found in some random blog?

‣ Use Varnish Tuner + Fine tune based on necessity

‣ Pay attention to workspaces & syslog messages

3. Invalidations

Overview

๏ Updated objects may be available before TTL expiration

‣ Purges

‣ Forced misses

‣ Bans

‣ Hash Two / Hash Ninja / …

Purges

๏ VCL

๏ Eagerly discards an object along with all its variants

Overview

acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ internal) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

Purges

๏ What if the new object cannot be fetched after the invalidation?

‣ Soft-purges VMOD

‣ Forced misses

๏ What if multiple objects need to be invalidated? What if objects need to be invalidated too frequently?

‣ Bans

‣ Hash Two

Downsides I

Purges

๏ How to invalidate hitpass objects?

‣ Not possible in Varnish Cache Plus 3.x

- Redesigned in Varnish Cache Plus 4.x

- https://www.varnish-cache.org/trac/.../1033

‣ return(pass); during vcl_recv is preferred when possible

Downsides II
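A hedged Varnish 3.x sketch of both ideas: bypass the cache early for URLs known to be uncacheable, and keep hitpass objects short-lived when they cannot be avoided. The URL prefixes and the 60 second TTL are illustrative assumptions.

```vcl
sub vcl_recv {
    # Known-uncacheable areas: decide here, so that no hitpass
    # object is ever created for them.
    if (req.url ~ "^/(cart|checkout|account)") {
        return (pass);
    }
}

sub vcl_fetch {
    # Uncacheable responses discovered only after fetching: create a
    # hitpass object, but with a short TTL, since it cannot be
    # invalidated in 3.x.
    if (beresp.http.Set-Cookie) {
        set beresp.ttl = 60s;
        return (hit_for_pass);
    }
}
```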

Forced misses

๏ VCL

๏ Forces a cache miss for the request

‣ Useful for cache priming scripts

Overview

sub vcl_recv {
    if (req.http.X-Priming-Script) {
        ...
        set req.hash_always_miss = true;
    }
    ...
}

Forced misses

๏ Object will always be (re)fetched from the backend

๏ New object is put into cache and used from that point onward

‣ Old object is not evicted until it’s safe to do so

‣ Controls who takes the penalty of waiting for an updated object

๏ Old objects are not freed up until expiration

‣ This is considered a flaw and a fix is expected

Behavior

Bans

๏ VCL or CLI

๏ Lazily discards multiple objects matching an expression

‣ Logical operators + Object attributes + Regular expressions

‣ Only works on objects already in the cache

๏ Ban lurker

‣ Frees up memory + Keeps the ban list at a manageable size

‣ obj.* based expressions

Overview

Bans: Example

sub vcl_recv {
    if (req.request == "BAN") {
        ...
        if (!req.http.X-Ban-Url-Regexp) {
            error 400 "Empty URL regexp.";
        }
        ban("obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
    }
}

sub vcl_fetch {
    set beresp.http.X-Url = req.url;
}

sub vcl_deliver {
    unset resp.http.X-Url;
}

Hash Two

๏ VCL + VMOD

๏ Works around the scalability limits of bans

Overview

HTTP/1.x 200 OK
Transfer-Encoding: chunked
...
X-Tags: C10 P42 P236 P857
...

ban obj.http.X-Tags ~ "(\s|^)P42(\s|$)"

Hash Two: Example

import hashtwo;

sub vcl_recv {
    if (req.request == "PURGE") {
        ...
        if (hashtwo.purge(req.http.X-Tag) != 0) {
            error 200 "Purged.";
        } else {
            error 404 "Not found.";
        }
    }
}

sub vcl_fetch {
    set beresp.http.X-HashTwo = beresp.http.X-Tags;
}

4. HTTP headers

Cache related headers

๏ Expires

๏ Cache-Control

๏ Last-Modified

๏ If-Modified-Since

๏ If-None-Match

๏ ETag

๏ Pragma

๏ Vary

๏ Age

Cache-Control

๏ Specifies directives that must be applied by all caching mechanisms (from Varnish Cache Plus to browser cache)

Overview

‣ public | private

‣ no-store

‣ no-cache

‣ max-age

‣ s-maxage

‣ must-revalidate

‣ no-transform

‣ …

Cache-Control

๏ Ignored in incoming client HTTP requests

๏ Only s-maxage & max-age are used in backend HTTP responses to calculate the default TTL

‣ Always overrides Expires header

‣ Beware of Age header in client responses

- Objects not cached client side

- https://www.varnish-cache.org/...Caching

beresp.ttl
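The Age caveat appears when s-maxage is larger than max-age: an object that has spent longer in Varnish than max-age is delivered with an Age that already exceeds its client-side lifetime, so browsers will not cache it. One commonly used workaround, sketched here for Varnish 3.x and not necessarily the remedy the authors had in mind, is to reset Age on delivery. The trade-off is that downstream caches may then keep the object for up to max-age longer than the backend intended.

```vcl
sub vcl_deliver {
    # Hide the time the object has spent in Varnish from clients,
    # so that max-age is counted from delivery time.
    set resp.http.Age = "0";
}
```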

Vary

๏ Indicates the response returned by the backend server may vary depending on headers received in the request

๏ Object variants & Hit ratio

‣ Vary: Accept-Encoding

- Normalization of Accept-Encoding header is not required

‣ Vary: User-Agent

5. Content composition

Overview

๏ Break objects into smaller fragments

‣ Separate cache policy for each fragment

‣ Increase hit ratio

๏ Tools

‣ Edge Side Includes (ESI)

‣ AJAX

- Beware of RTT & Cross domain policy

Edge Side Includes

๏ Subset of ESI Language Specification 1.0

‣ <esi:include src="<URL>" />

‣ <esi:remove>...</esi:remove>

‣ <!--esi ... -->

๏ set beresp.do_esi = true;

‣ Separate Varnish requests

๏ Testing ESI in dev environment
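A minimal Varnish 3.x sketch for enabling ESI processing, assuming that only HTML responses contain ESI tags (the Content-Type test is an illustrative choice; a dedicated backend response header would work as well):

```vcl
sub vcl_fetch {
    # Parse ESI tags only where they are expected, to avoid
    # scanning images and other binary content.
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.do_esi = true;
    }
}
```

Each included fragment goes through vcl_recv as its own request, so fragments can get their own TTLs and caching rules.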

6. VAC

Overview

๏ Central control of Varnish Cache Plus servers

‣ Web UI + RESTful API

- Super Fast Purger

๏ Cache group management

‣ Real time statistics, VCL editor, ban submission…

๏ Varnish Agent 2

Super Fast Purger

๏ High performance intermediary distributing invalidation requests to groups of Varnish Cache Plus servers

‣ Leverages speed & flexibility of VCL

‣ Keep-alive workaround

๏ Part of the VAC RESTful API

‣ Trivially integrable in existing applications

Change management

๏ Easily integrable using the VAC RESTful API

‣ git, Mercurial… hooks

‣ Jenkins, Travis, GitLab… CI scripts

๏ Manual VCL bundle generation

๏ Orchestrated / programmed deployments, rollbacks, etc.

7. VCS

Overview

๏ Real-time aggregated statistics

‣ Multiple vstatdprobe daemons

‣ One vstatd daemon

‣ JSON + Time series API

๏ VSM log based

‣ Efficient circular in-memory data structure

‣ std.log("vcs-key:" + <key suffix>);

Some ideas

๏ Trending articles or sale products

๏ Cache hits and cache misses

๏ URLs with long load times

๏ URLs with the most 5xx response codes

๏ Where traffic is coming from

๏ …

Example

import std;

sub vcl_deliver {
    std.log("vcs-key:" + req.http.host);
    std.log("vcs-key:" + req.http.host + req.url);
    std.log("vcs-key:TOTAL");
    if (obj.hits == 0) {
        std.log("vcs-key:MISS");
    }
}

API I

๏ Stats (#requests, #misses, avg ttfb, acc body bytes, #2xx, #3xx…) for key named "example.com" during the last time window

‣ GET /key/example.com

๏ Keys that produced the most 5xx responses during the last time window

‣ GET /all/top_5xx

๏ Top 5 requested keys during the last time window

‣ GET /all/top/5?verbose=1

API II

๏ Top 10 most requested keys ending with '.gif' during the last time window

‣ GET /match/(.*)%5C.gif$/top

๏ Top 50 slowest backend requests aggregating the last 20 time windows

‣ GET /all/top_ttfb/50?b=20

8. Device detection

Overview

๏ VMOD

๏ DeviceAtlas

‣ https://deviceatlas.com

‣ Database locally deployed & Daily updated

๏ OSS alternatives

‣ https://github.com/serbanghita/Mobile-Detect

‣ …

Example

import deviceatlas;

sub vcl_recv {
    if (deviceatlas.lookup(req.http.User-Agent, "isMobilePhone") == "1") {
        set req.http.X-Device = "mobile";
    } elsif (deviceatlas.lookup(req.http.User-Agent, "isTablet") == "1") {
        set req.http.X-Device = "tablet";
    } else {
        set req.http.X-Device = "desktop";
    }
}

Some ideas

๏ Redirections based on device properties

๏ Backend selection based on device properties

๏ Normalization of the UA header

‣ Caching different versions (i.e. Vary header) of the same object based on normalized UAs

๏ …
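The UA normalization idea can be sketched as follows for Varnish 3.x. It assumes vcl_recv has already classified the client into an X-Device request header (a header name used in the examples of these notes, not a standard one), and that the backend does not emit Vary: X-Device itself.

```vcl
sub vcl_fetch {
    # Store one variant per device class instead of one per User-Agent.
    if (beresp.http.Vary) {
        set beresp.http.Vary = beresp.http.Vary + ", X-Device";
    } else {
        set beresp.http.Vary = "X-Device";
    }
}

sub vcl_deliver {
    # Downstream caches never see X-Device, so tell them the
    # response depends on the header it was derived from.
    if (resp.http.Vary) {
        set resp.http.Vary = regsub(resp.http.Vary, "X-Device", "User-Agent");
    }
}
```

With three device classes the cache holds at most three variants per URL, rather than one per distinct User-Agent string.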

9. Varnish Plus 4.x

Highlights

๏ Client / backend thread split

‣ Background content refreshing

๏ Redesigned purges

‣ return(purge); during vcl_recv

๏ Directors implemented as VMODs

‣ Consistent hashing director

๏ Distinction between error & synthetic responses
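For comparison with the 3.x purge example earlier in these notes, a hedged sketch of the same logic in VCL 4.0 syntax. It is a fragment: a complete 4.0 configuration also needs at least one backend definition.

```vcl
vcl 4.0;

acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    # req.request was renamed to req.method in 4.0.
    if (req.method == "PURGE") {
        if (client.ip !~ internal) {
            # Synthetic responses replace the 3.x "error" statement.
            return (synth(405, "Not allowed."));
        }
        # Purging is requested directly from vcl_recv; no vcl_hit /
        # vcl_miss handling is needed any more.
        return (purge);
    }
}
```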

10. Q&A