HTTP at your local BigCo: How the internet sausage gets made Peter Griess @pgriess

Upload: pgriess

Post on 08-Jul-2015


Page 1: HTTP at your local BigCo

HTTP at your local BigCo: How the internet sausage gets made

Peter Griess

@pgriess

Page 2: HTTP at your local BigCo

Goals and non-goals

• Basics of TCP/IP, DNS and HTTP and how they work together; pitfalls and optimizations

• A 1,000 foot view of scaling out HTTP infrastructure

– All manner of load balancing / traffic shaping

– Living on the edge

• Not: how to make a fast application (database access, rendering performance, etc)

Page 3: HTTP at your local BigCo

Background: DNS

• Map hostnames to IP(s)

– www.facebook.com 69.171.229.12, 69.171.228.40

• Resolution process

– Recursion (and what does the DNS server see?)

– Caching

• Latencies: on-host, cached in LAN, cached at ISP, miss
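
The caching tiers above all follow the same rule: answer from memory until the record's TTL expires, otherwise pay for a lookup further upstream. A minimal sketch of that rule, assuming a hypothetical `fake_upstream` function in place of a real recursive lookup (the class and function names are illustrative, not from the talk):

```python
import time
from typing import Callable, Dict, List, Tuple


class CachingResolver:
    """Toy resolver cache: answers from memory until the record's TTL expires."""

    def __init__(self, upstream: Callable[[str], Tuple[List[str], float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._upstream = upstream      # hypothetical: returns (ips, ttl_seconds)
        self._clock = clock
        self._cache: Dict[str, Tuple[List[str], float]] = {}
        self.misses = 0

    def resolve(self, hostname: str) -> List[str]:
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]           # cache hit: no network round trip
        self.misses += 1               # cache miss: pay for the upstream lookup
        ips, ttl = self._upstream(hostname)
        self._cache[hostname] = (ips, now + ttl)
        return ips


def fake_upstream(hostname: str) -> Tuple[List[str], float]:
    # Stand-in for a recursive lookup; the addresses are the ones on the slide,
    # and the 60-second TTL is an assumption.
    return ["69.171.229.12", "69.171.228.40"], 60.0
```

Only the first call, and the first call after the TTL runs out, reach `fake_upstream`. Every other call is served at the "on-host" latency.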

Page 4: HTTP at your local BigCo

Background: TCP

• Stateful protocol

• Negotiated by a synchronous 3-way handshake:

– 1xRTT before the client can send its first byte; 2xRTT before the first response byte arrives!

– e.g. USA => South America ~250ms RTT

• Seamless failover is hard (but not impossible)

• Load balancing must be aware of flows
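
A back-of-the-envelope sketch of the handshake cost: one round trip for SYN / SYN-ACK, then one more for the request and the first byte of the response. The function name is illustrative, and the model deliberately ignores DNS, TLS and server processing time.

```python
def ms_to_first_response_byte(rtt_ms: float, new_connection: bool = True) -> float:
    """Rough lower bound on time until the first response byte arrives."""
    handshake = rtt_ms if new_connection else 0.0   # SYN out, SYN-ACK back
    request_response = rtt_ms                        # request out, first byte back
    return handshake + request_response
```

With the ~250ms RTT from the slide, a fresh connection costs about 500ms before any response data shows up. A re-used connection costs about 250ms.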

Page 5: HTTP at your local BigCo

Background: HTTP

• Layered on top of TCP/TLS

• Has some useful bits

– Compression

– Connection re-use

– Pipelining

– Caching

• Kind of sucks

– Headers on all requests/responses

– Compression on bodies only

– Pipelining has to be disabled most of the time

– Pipelining suffers from head-of-line blocking
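
Head-of-line blocking is easiest to see with numbers. The sketch below assumes the server works on all pipelined requests concurrently, so response i is ready after `service_ms[i]`, but it still has to be delivered in request order. The function name and the timings are made up for illustration.

```python
from typing import List


def pipelined_delivery_ms(service_ms: List[float]) -> List[float]:
    """Delivery time of each response on one pipelined connection.

    A response cannot be sent before every earlier response has been sent,
    so each delivery time is the running maximum of the ready times.
    """
    delivered: List[float] = []
    latest = 0.0
    for ready_at in service_ms:
        latest = max(latest, ready_at)
        delivered.append(latest)
    return delivered
```

For example, `pipelined_delivery_ms([500, 10, 10])` gives 500 for all three responses: the two fast requests wait behind the slow one.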

Page 6: HTTP at your local BigCo

mycutekittens.tv

[Diagram labels: Big bad internet; HTTP; 68.193.17.4]

Page 7: HTTP at your local BigCo

Problem?

Page 8: HTTP at your local BigCo

Problem

• Availability

– Server goes down (kernel panic?)

– Network goes down (cable cut?)

– Datacenter goes down (EC2?)

• Overload

– Shed load (good, can be transparent)

– Get infinitely slow (not good)
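
One way to shed load rather than get infinitely slow is to cap the number of in-flight requests and reject the overflow immediately. This is a single-threaded sketch with hypothetical names (`LoadShedder`, `handle`). A real server would need locking and a tuned limit.

```python
class LoadShedder:
    """Admit at most `max_in_flight` concurrent requests; reject the rest fast."""

    def __init__(self, max_in_flight: int) -> None:
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def try_admit(self) -> bool:
        if self.in_flight >= self.max_in_flight:
            return False            # over capacity: caller answers 503 right away
        self.in_flight += 1
        return True

    def release(self) -> None:
        self.in_flight -= 1


def handle(shedder: LoadShedder) -> int:
    """Return the HTTP status a request would get; the real work is elided."""
    if not shedder.try_admit():
        return 503
    try:
        return 200
    finally:
        shedder.release()           # always free the slot, even on errors
```

A quick 503 lets a client or an upstream load balancer retry elsewhere, which is what can make shedding transparent.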

Page 9: HTTP at your local BigCo

mycutekittens.tv: multi-server

Big bad internet

???

Page 10: HTTP at your local BigCo

We have options

• DNS load balancing

• IP load balancing

• HTTP load balancing

Page 11: HTTP at your local BigCo

DNS load balancing

• mycutekittens.tv resolves to IPs: A, B, C, D

– Add new IPs to scale out

– Remove IPs when hosts go down

• Benefits

– Don’t need extra hardware to do load balancing

– Can span datacenters

– DNS servers are cheap / fast

• Drawbacks

– Hotspots due to caching

– Hotspots due to ordering in result list

– Hotspots due to resolver size

– TTL / flexibility trade-off
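
A toy model of the rotation and of the resolver-size hotspot. It assumes each resolver queries once and caches the answer, and that every client behind it uses the first IP in the list. `RoundRobinZone` and `first_choice_counts` are illustrative names, not a real DNS API.

```python
from collections import Counter, deque
from typing import Deque, List


class RoundRobinZone:
    """Toy authoritative server that rotates the record list on every query."""

    def __init__(self, ips: List[str]) -> None:
        self._ips: Deque[str] = deque(ips)

    def query(self) -> List[str]:
        answer = list(self._ips)
        self._ips.rotate(-1)        # the next query sees a different first entry
        return answer

    def remove(self, ip: str) -> None:
        self._ips.remove(ip)        # host went down: stop advertising it


def first_choice_counts(zone: RoundRobinZone,
                        clients_per_resolver: List[int]) -> Counter:
    """Clients landing on each IP when every resolver caches a single answer."""
    counts: Counter = Counter()
    for clients in clients_per_resolver:
        answer = zone.query()
        counts[answer[0]] += clients
    return counts
```

With four IPs and resolvers serving 1000, 10, 10 and 10 clients, the rotation is perfectly even per query, yet one IP still receives 1000 of the 1030 clients.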

Page 12: HTTP at your local BigCo

mycutekittens.tv: DNS

[Diagram labels: Big bad internet; DNS; DNS Server; 68.193.17.4, 68.193.17.5, 68.193.17.6]

Page 13: HTTP at your local BigCo

IP load balancing (1)

• mycutekittens.tv resolves to 1 public IP owned by an IP load balancer

– Add new backend hosts w/ private IPs to scale out

– Load balancer health-checks hosts actively or passively to avoid dead hosts

• Scheduling policies vs. failover

• DSR
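
Because TCP is stateful, an IP load balancer has to send every packet of a flow to the same backend. One simple way to sketch that is to hash the flow's addresses and ports over the list of healthy hosts. The names here are hypothetical, and plain modulo hashing is a simplification: it remaps existing flows whenever the healthy set changes, which is why real balancers tend to keep a connection table or use consistent hashing.

```python
import hashlib
from typing import List, Tuple

Flow = Tuple[str, int, str, int]   # (src_ip, src_port, dst_ip, dst_port)


def pick_backend(flow: Flow, healthy_backends: List[str]) -> str:
    """Map every packet of one TCP flow to the same backend."""
    if not healthy_backends:
        raise RuntimeError("no healthy backends")
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()           # stable across processes
    index = int.from_bytes(digest[:8], "big") % len(healthy_backends)
    return healthy_backends[index]
```

Health checking then amounts to maintaining `healthy_backends`: a host that fails its checks is dropped from the list and stops receiving new flows.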

Page 14: HTTP at your local BigCo

IP load balancing (2)

• Benefits

– Only 1 public IP (high DNS TTL)

– Backend network capacity/membership transparent to the internet

– Cheap-ish

– Failover is possible, not insanely difficult

• Drawbacks

– Can’t do what you can with HTTP

Page 15: HTTP at your local BigCo

mycutekittens.tv: IP

[Diagram labels: Big bad internet; 68.193.17.4; GW; LB; 10.0.0.1, 10.0.0.2, 10.0.0.3]

Page 16: HTTP at your local BigCo

HTTP load balancing (1)

• mycutekittens.tv resolves to 1 public IP owned by an HTTP load balancer

– Largely same as IP load balancing

– Terminates TCP connections (sees all bytes)

– Can make routing decisions based on HTTP

– Can autonomously serve requests (caching, access control, etc)

• Examples:

– Send requests for /foo/* to pool A

– Respond 401 to requests without cookie Q
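
The two example rules can be sketched as a single routing function. Checking the cookie before the path is an assumption, as is the catch-all pool name "default". The slide only gives the two rules, not their order.

```python
from typing import Dict, Optional, Tuple


def route(path: str, cookies: Dict[str, str]) -> Tuple[Optional[int], Optional[str]]:
    """Return (status, pool).

    A status means the load balancer answers by itself; a pool name means
    the request is forwarded to that backend pool.
    """
    if "Q" not in cookies:
        return 401, None            # rejected at the load balancer
    if path.startswith("/foo/"):
        return None, "A"            # /foo/* goes to pool A
    return None, "default"          # hypothetical catch-all pool
```

None of this is possible at the IP layer, because the balancer only sees the path and cookies once it has terminated the TCP connection and parsed the request.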

Page 17: HTTP at your local BigCo

HTTP load balancing (2)

• Benefits

– Largely the same as IP

– More flexible rules

– Can terminate TLS (security+, cost+)

• Drawbacks

– No DSR

– Failover difficult

– Not as performant as IP

Page 18: HTTP at your local BigCo

mycutekittens.tv: HTTP

[Diagram labels: Big bad internet; HTTP(S); LB; 68.193.17.4; 10.0.0.1, 10.0.0.2, 10.0.0.3]

Page 19: HTTP at your local BigCo

mycutekittens.tv: MOAR

• Eventually a single LB is going to be a problem

– Not enough capacity

– Availability

• Turtles all the way down

– LB of LBs!

– DNS load balancing between datacenters

– …

Page 20: HTTP at your local BigCo

HTTPS: myths and reality

• Too computationally expensive

– Only a few percent (imperialviolet.org); is your webserver actually CPU bound? doubt it

– SSL acceleration cards, GPUs, etc

• Too much latency

– Handshaking is 5-7xRTT

• Session resume

• False start

• Snap start

– Caching breaks

Page 21: HTTP at your local BigCo

My latency is huge in Japan

• RTT to the USA (or any single DC) can be huge

– Re-use connections (connection: keep-alive)

– Send work in parallel (pipelining)

– Use compression (content-encoding)

– Lots of tricks for static resources (bundling, CDNs, caching, etc)

– Pre-fetch data
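
As a small illustration of the compression item, the sketch below gzips a repetitive HTML-like body the way a server would before sending it with `Content-Encoding: gzip`. The payload is made up, and how much is saved depends entirely on the content.

```python
import gzip


def compress_body(body: bytes) -> bytes:
    """Gzip a response body, as a server would for Content-Encoding: gzip."""
    return gzip.compress(body)


# Made-up payload: markup tends to be repetitive, so it compresses well.
body = b"<li>cute kitten</li>\n" * 200
compressed = compress_body(body)
print(len(body), "bytes before,", len(compressed), "bytes after")
```

Fewer bytes on the wire means fewer round trips on a long, slow path, which is where the latency win comes from.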

Page 22: HTTP at your local BigCo

Let’s get crazy: SPDY

• Don’t limit yourself to HTTP; use a different protocol

– SPDY developed by Google, supported by Chrome, google.com (and soon facebook.com)

– Connection re-use w/o head-of-line blocking

– Headers always compressed

– Always SSL (but breaks caching)

Page 23: HTTP at your local BigCo

Let’s get crazy: TCP termination

• Synchronous RTTs: the silent killer

– Opening new TCP connections is very costly

• Run proxies close to users and proxy traffic back to core using optimized protocol

– Low RTT to proxy

– Do SPDY-like tricks between edge + core

– Potentially faster network to core than public internet

• Advertise these proxies via DNS

– Geo-targeting

– AS-adjacency

• Akamai CDN does this, sort of
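
A rough sketch of why terminating TCP at the edge helps. It assumes the proxy already holds a warm connection to the core, so only the short client-to-edge hop pays for a handshake. The function names and RTT figures are illustrative, and server time, TLS and DNS are ignored.

```python
def direct_ms(rtt_client_core: float) -> float:
    """New TCP connection straight to the core: handshake + request/response."""
    return 2 * rtt_client_core


def via_edge_ms(rtt_client_edge: float, rtt_edge_core: float) -> float:
    """New TCP connection to a nearby proxy that re-uses a warm link to the core."""
    handshake = rtt_client_edge                          # paid on the short hop only
    request_response = rtt_client_edge + rtt_edge_core   # request travels both hops
    return handshake + request_response
```

With a 250ms RTT straight to the core, `direct_ms(250)` is 500. Splitting the same path into a 20ms hop to the edge and a 230ms hop to the core, `via_edge_ms(20, 230)` is 270.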

Page 24: HTTP at your local BigCo

Let’s get crazy: DNS anycast

• Remember how DNS resolutions were slow?

– DNS servers could be far away from a user

• Advertise multiple network routes for the same DNS IP, let the IP stack pick the closest one
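
Conceptually, anycast leaves site selection to routing: the same IP is announced from several locations and each client's packets follow the cheapest route. A toy model of that choice, with hypothetical site names and path costs (real route selection is done by the network, not by application code):

```python
from typing import Dict


def anycast_site(path_cost_by_site: Dict[str, int]) -> str:
    """Pick the site whose route to the shared IP has the lowest cost."""
    return min(path_cost_by_site, key=lambda site: path_cost_by_site[site])


# Hypothetical costs as seen from one client's network.
costs_from_tokyo = {"us-west": 5, "us-east": 8, "tokyo": 1}
```

Here `anycast_site(costs_from_tokyo)` picks "tokyo", so the client's DNS queries never leave the region.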