polishing your cache with varnish - davidsmalley » davidsmalley


Post on 03-Feb-2022


Page 1: Polishing your cache with Varnish - davidSmalley » davidSmalley

Polishing your cache with Varnish

David Smalley, Co-Founder of Litmus

I’m David Smalley - co-founder of Litmus

Talking about our newest site, Doctype.


http://doctype.com

Doctype is the newest project from Litmus

It’s a web design Q&A site

Heavily inspired by Stack Overflow et al.

In fact, when we got in touch with Jeff Atwood, he proposed we join his


Web League of Justice


We knew we’d get big traffic from the Stack Overflow affiliation

Most people would be anonymous users

Wanted to avoid embarrassment

Didn’t want to spend time and money on a big cluster pre-emptively

After we made the Rails site as efficient as possible we went looking at...


Caching


Most visitors would be anonymous, just hopping in from a search engine, browsing and leaving

Hopefully lots of spidering traffic as a result of good quality, fresh content and our association with Stack Overflow


Wikipedia set a good standard for caching


"Squid cache servers handle about 78% of requests, almost all which are made by viewers who are not logged in to the site.

During load surges from media mentions, the Squids handle almost all of the traffic."

http://meta.wikimedia.org/wiki/Cache_strategy

<Read quote>

Old quote from Wikimedia’s meta wiki, from 2005-ish


Caching

We all know about Rails caching

- page caching
- action caching
- fragment caching


Page caching

• Best caching for the anonymous access strategy
• Page gets parsed once and written to disk
• All subsequent requests get served by the web server from disk
• Problem is we have anonymous AND non-anonymous users
• Can't distinguish between the two with page caching
• ALSO - each server in a future cluster would have to look after its own cache. A cache clear would lead to the page being reparsed and recached on each app server


Action Caching

• Caches the output of an action to a Rails cache store
• Lets us run filters etc. first, so we can distinguish between logged in/out
• Can use memcached so things are only cached once across the cluster
• Still has to hit Rails and process before serving the cached content
• Potentially still runs all the queries you have in your controller


Fragment Caching

• Cache bits of the page into the Rails cache store
• Would be good to cache the post-Markdown processed questions/answers
• Still runs all the queries in the controller
• Still hitting Rails


Not happy with the options, I went back to Wikipedia


"Squid cache servers handle about 78% of requests..."

http://meta.wikimedia.org/wiki/Cache_strategy

So I went and researched reverse proxy caches and came across Varnish


“Varnish is a reverse Web accelerator designed for content-heavy dynamic web sites. In contrast to other HTTP accelerators, many of which began life as client-side proxies or origin servers, Varnish was designed from the ground up as an accelerator for incoming traffic.”

Used by search.twitter.com, hulu.com, wikia.com


As I dug deeper I found a Ruby library, klarlack, that handles cache purging via the Varnish CLI

Klarlack basically means “varnish” in German


Advantages of a reverse proxy cache server

Can load balance between app servers

Only caches things once across the cluster

Can use logic to determine how and when to cache and serve from cache


“There are only two hard things in Computer Science: cache invalidation and naming things”

- Phil Karlton

We’ve all heard this quote before

But we had a few things in our favour


Cache sweepers + a good Ruby library for communicating with Varnish

This fits right in with the way we normally handle our caches in Rails


We also had advantages in Doctype


Simple object model

Questions, Answers & Comments

Everything basically centres on a question page: change a question, an answer or a comment and we just purge the question page it relates to
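That rule can be sketched in a few lines of Ruby. This is only an illustration: the three structs are hypothetical stand-ins for the real models, the idea that a comment's parent may be a question or an answer is an assumption, and the `/questions/:id` URL scheme is made up for the example.

```ruby
# Hypothetical stand-ins for the three models; the real app uses ActiveRecord.
Question = Struct.new(:id)
Answer   = Struct.new(:id, :question)
Comment  = Struct.new(:id, :parent) # parent assumed to be a Question or an Answer

# Walk up from any record to the question it belongs to.
def question_for(record)
  case record
  when Question then record
  when Answer   then record.question
  when Comment  then question_for(record.parent)
  else raise ArgumentError, "no question known for #{record.class}"
  end
end

# The single path to purge when the record changes (URL scheme is assumed).
def purge_path_for(record)
  "/questions/#{question_for(record).id}"
end

question = Question.new(42)
answer   = Answer.new(7, question)
comment  = Comment.new(3, answer)

# All three resolve to the same question page.
puts purge_path_for(question)
puts purge_path_for(answer)
puts purge_path_for(comment)
```

Whatever changes, there is only ever one page to invalidate, which is what keeps the purge logic small.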


Cache Sweeping

With Rails cache sweepers and the klarlack library in hand, I wrote a plugin as some glue between the two


MDF

Imaginatively, on the varnish/wood theme, I called it MDF

A YAML file holds details of the cache servers and the port on which each one runs the Varnish CLI


The plugin then basically calls the purge command against each of the servers listed in the YAML file. I modified it slightly to include the HTTP host in the purge because my Varnish servers handle a few different sites
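A rough sketch of that fan-out, not the plugin's actual code. The YAML layout, the server addresses and the helper names are all hypothetical, and the purge command syntax is an assumption that varies between Varnish versions. The sender is injectable so the example runs without a Varnish server; a real CLI connection may also need to handle a banner or authentication.

```ruby
require 'yaml'
require 'socket'

# Hypothetical YAML layout: one entry per Varnish server and its CLI port.
CONFIG = <<~YAML
  servers:
    - host: 10.0.0.1
      port: 6082
    - host: 10.0.0.2
      port: 6082
YAML

# Build the CLI line. Matching on host and URL is an assumed syntax;
# check the CLI documentation for the Varnish version in use.
def purge_command(http_host, path)
  format('purge req.http.host == "%s" && req.url ~ "^%s$"',
         http_host, Regexp.escape(path))
end

# Default sender: open a TCP connection to the CLI port and write one line.
TCP_SENDER = lambda do |host, port, command|
  TCPSocket.open(host, port) { |socket| socket.puts(command) }
end

# Send the same purge to every server listed in the YAML config.
def purge_everywhere(yaml, http_host, path, sender = TCP_SENDER)
  YAML.safe_load(yaml).fetch('servers').each do |server|
    sender.call(server['host'], server['port'], purge_command(http_host, path))
  end
end

# Record the commands instead of sending them, so this runs without Varnish.
sent = []
recorder = ->(host, port, command) { sent << "#{host}:#{port} #{command}" }
purge_everywhere(CONFIG, 'doctype.com', '/questions/42', recorder)
puts sent
```

Including the host in the match is what stops a purge for one site from clearing the same path on another site behind the same Varnish instance.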


A normal-looking cache sweeper; it just passes the purge path through to the MDF plugin


Rails doesn’t like caching

Just look at the default headers you get back from it


It says everything is private, with a max-age of zero, which means no caching

We need to fix this in our code.


I added this method to application_controller

Unless we’re in development mode or someone is logged in, it sets a default cache age of 30 minutes and also sets “public” in the Cache-Control header, which tells proxies to cache the response.


What is s-maxage?

Basically, it's a max-age directive that only shared (public) caches listen to, not browser caches. This ensures we retain control over expiry with our backend cache purging antics
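To make the difference concrete, here is a small Ruby sketch of how the two kinds of cache could read the same header. It is a simplification of the HTTP caching rules (it ignores the Expires header and heuristic freshness), the function names are made up for illustration, and real caches, Varnish included, have their own defaults.

```ruby
# Parse a Cache-Control header into a hash of directive => value (or true).
def parse_cache_control(header)
  header.split(',').each_with_object({}) do |part, directives|
    name, value = part.strip.split('=', 2)
    directives[name.downcase] = value || true
  end
end

# Seconds a cache may treat the response as fresh.
# Shared caches (proxies) prefer s-maxage; browser caches ignore it.
def freshness_lifetime(header, shared:)
  directives = parse_cache_control(header)
  return 0 if shared && directives['private']

  seconds = shared ? directives['s-maxage'] || directives['max-age'] : directives['max-age']
  seconds.is_a?(String) ? seconds.to_i : 0
end

rails_default = 'private, max-age=0, must-revalidate'
with_s_maxage = 'public, s-maxage=1800'

puts freshness_lifetime(rails_default, shared: true)  # proxy: nothing to cache
puts freshness_lifetime(with_s_maxage, shared: true)  # proxy: 30 minutes
puts freshness_lifetime(with_s_maxage, shared: false) # browser: no explicit lifetime
```

Because only the proxy holds a copy, purging the proxy is enough to make every visitor see the new page.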

You *need* to call cache_control on every action you want to cache. Think carefully before you do this


Using our cache_control method

Throw a call to cache_control into any action we want to cache. With no options it’ll just do the default and set the age to 30 minutes


On some actions we may only want to cache for a short amount of time. Here, as the Sphinx index is updated via cron every 5 minutes and doesn’t tie into a cache sweeper, we set the cache time to 5 minutes.
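The slides with the code are not part of this transcript, so here is a minimal sketch of what such a helper and its two uses might look like. `SketchController` is a hypothetical stand-in for a Rails controller, the `development` and `current_user` arguments replace the real environment check and authentication helper, and the exact header string is an assumption.

```ruby
# Minimal stand-in for a Rails controller so the sketch runs on its own.
class SketchController
  DEFAULT_MAX_AGE = 30 * 60 # 30 minutes, in seconds

  attr_reader :headers

  def initialize(development: false, current_user: nil)
    @development = development
    @current_user = current_user
    @headers = {}
  end

  # Mark the response as cacheable by proxies, but only for anonymous
  # requests outside development mode.
  def cache_control(max_age: DEFAULT_MAX_AGE)
    return if @development || @current_user

    @headers['Cache-Control'] = "public, s-maxage=#{max_age}"
  end

  # An ordinary action: take the 30 minute default.
  def show
    cache_control
  end

  # Search results: the index is rebuilt every 5 minutes, so cache for 5.
  def search
    cache_control(max_age: 5 * 60)
  end
end

anonymous = SketchController.new
anonymous.show
puts anonymous.headers['Cache-Control']

searcher = SketchController.new
searcher.search
puts searcher.headers['Cache-Control']

logged_in = SketchController.new(current_user: 'david')
logged_in.show
puts logged_in.headers.inspect
```

Actions that never call the helper keep the framework's default headers, so caching stays opt-in per action.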


Back to Varnish config...

How do we make Varnish do anonymous-only caching?


Logged-in users have a user_credentials cookie.

As we’re using Authlogic, any logged-in user has a user_credentials cookie, so let’s differentiate on that


but....

Caches don’t like cookies

If Varnish sees a cookie in the request then it won’t cache. This is for safety, to ensure you don’t cache a user’s private data


however....

Varnish can meddle with the request - we know when cookies are needed and when they are not, so we can create a Varnish config that handles that correctly


Snippet of varnish config

If it’s an image, CSS, JavaScript or an icon - unset any cookies

If the user has a user_credentials cookie - skip the cache

If the user hits one of these URLs - skip the cache

If the request isn’t a HEAD or a GET - skip the cache

Otherwise, bin the cookie and check the cache
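The real rules live in the Varnish config, which is not reproduced in this transcript. As an illustration only, the same decision order can be modelled in Ruby so it can be exercised directly. The `Request` struct, the file-extension pattern and the list of uncached paths are all hypothetical.

```ruby
# Hypothetical request shape: HTTP method, path and a hash of cookies.
Request = Struct.new(:http_method, :path, :cookies)

STATIC_ASSET = /\.(png|gif|jpe?g|css|js|ico)(\?.*)?\z/i
# Hypothetical list; the URLs on the actual slide are not in the transcript.
UNCACHED_PATHS = %w[/login /logout /signup].freeze

# Returns [action, cookies_to_forward], where action is :lookup (try the
# cache) or :pass (go straight to the backend).
def route(request)
  # Static assets never need cookies, so strip them and use the cache.
  return [:lookup, {}] if request.path.match?(STATIC_ASSET)
  # Logged-in users always bypass the cache.
  return [:pass, request.cookies] if request.cookies.key?('user_credentials')
  # Some pages must always reach the backend.
  return [:pass, request.cookies] if UNCACHED_PATHS.any? { |prefix| request.path.start_with?(prefix) }
  # Only GET and HEAD are safe to serve from the cache.
  return [:pass, request.cookies] unless %w[GET HEAD].include?(request.http_method)

  # Anonymous page view: bin the cookies and check the cache.
  [:lookup, {}]
end

puts route(Request.new('GET', '/questions/42', { '_session' => 'abc' })).inspect
puts route(Request.new('GET', '/questions/42', { 'user_credentials' => 'xyz' })).inspect
puts route(Request.new('POST', '/questions', {})).inspect
```

The order matters: the static-asset rule comes first so that even logged-in users get cached images and stylesheets.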


This means that users can hit the login page, get a cookie which comes back in response to their POST request (not cached, remember) and then, once they’ve got the cookie, they’re cache free.

Otherwise, cache with extreme prejudice


I think you’ll find it’s a bit more complicated than that...

Actually, Varnish config is quite complicated. There’s a vcl_recv and a vcl_fetch section. One deals with incoming client requests, the other deals with the responses to backend requests.

Because both our clients and our backend can send cookies, we need to have the cookie filtering block repeated in those two sections.

I’ll post my Varnish config along with this presentation on my blog


Stats

Actually, I have no real stats to show you.

During peak times and search spidering the site definitely benefits from being cached.

However, in our case we definitely cached prematurely. We had good reason to do so.

We wanted to avoid a hammering when Jeff Atwood announced us on the Stack Overflow blog. But ultimately you don’t need this kind of caching to start out with unless you’re expecting a big traffic spike from being mentioned in a national newspaper or something.


Final thoughts

One thing I’ve learned in my time running successful commercial websites, and working for two hosting companies, is that caching is not a crutch.

If your site is slow and shit before you cache, it’ll be slow, shit and temperamental after you cache.

Badly written sites that cache heavily tend to fall to their knees when the cache is cleared for some reason - the expensive action gets hammered and multiple concurrent requests to recache it are triggered.

I’ve seen too many people use caching as a way to make a badly written site hobble along.

What you should be aiming for is writing an efficient site and then, once the load starts to build, applying caching selectively to maintain the same level of efficiency and throughput.

Not everything has to be cached; focus on your most-hit actions.


Different caching strategies for different situations

Varnish is just one of the caching strategies you can look at

It’s particularly suited for a site where most of the users are anonymous

For mostly private sites you should be looking at fragment caching using memcached


Further reading:

I was heavily inspired by this presentation: http://www.slideshare.net/schoefmax/caching-with-varnish-1642989

Varnish: http://varnish.projects.linpro.no/

Wikimedia Caching Strategy (a bit old): http://meta.wikimedia.org/wiki/Cache_strategy

Rails Guide to Caching: http://guides.rubyonrails.org/caching_with_rails.html

This talk, and some config snippets, will be posted on my blog: http://davidsmalley.com


Questions?