Handling massive traffic with Python Òscar Vilaplana, Paylogic PyGrunn 2013

Upload: oscar-vilaplana

Post on 10-May-2015


DESCRIPTION

At Paylogic we handle massive online peak sales, with tens of thousands of customers arriving every second, each trying to get a chance to buy a ticket. We built a virtual queue to handle this load and sell the tickets in a fair order. This is how we did it (as much as I can tell you!). I presented this talk at PyGrunn 2013.

TRANSCRIPT

Page 1: Handling Massive Traffic with Python

Handling massive traffic with Python

Òscar Vilaplana, Paylogic

PyGrunn 2013

Page 2: Handling Massive Traffic with Python

What’s the problem?

• High Traffic (>10k hits/s)

• Redirect low traffic to Paylogic

• Change redirected TPS

• Expect things to break

• Be fair, respect FIFO (within reason)

• Keep users informed


Page 3: Handling Massive Traffic with Python

In more detail

• Open/hold/close sales

• Expect any server to go down

• Expect ALL servers to go down

• Expect users to disappear

• Display expected waiting time and other info

• Keep it working

• Prevent attacks


Page 4: Handling Massive Traffic with Python

How It Works

• A horde of customers appear!

• see a pretty page.

• get a position in the queue.

• page auto-refresh.

• your turn? to the Frontoffice!

• meanwhile info is shown.

• (waiting time, information from event managers…)


Page 5: Handling Massive Traffic with Python

Data Storage

• Estimates

  • Not much data, stored in the instances and synced.

• Tokens

  • A LOT of data!

  • Way too much to store and sync.

  • Use distributed storage

    • (the browsers)
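
The slides do not say how a token kept in the browser is protected against tampering. One common approach, sketched here purely as an assumption, is to sign the queue position with a secret that every queue instance knows, so any instance can validate a token without storing it. The names `SECRET_KEY`, `issue_token` and `validate_token` are hypothetical and do not come from the talk.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret; assumed to be deployed to every queue instance.
SECRET_KEY = b"change-me"


def issue_token(position, issued_at):
    """Pack a queue position into a signed token the browser can store."""
    payload = json.dumps({"pos": position, "ts": issued_at}, sort_keys=True).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def validate_token(token):
    """Return the payload dict if the signature matches, otherwise None."""
    try:
        encoded, signature = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:
        # Missing separator or invalid base64: treat as a forged token.
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    return json.loads(payload)
```

With this shape the server side stays stateless with respect to tokens: the browser sends the token back on each refresh and any instance can check it.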


Page 6: Handling Massive Traffic with Python

Architecture

• ELB

• Queue Instances

• Bouncer Process

• Syncer Process

• HTML/JS Queue Page in CloudFront


Page 7: Handling Massive Traffic with Python

ELB

• Auto-scales (but not fast enough).

• Many regions.

• Can boot/kill instances automatically.

• We don’t do it yet.


Page 8: Handling Massive Traffic with Python

Queue Instances

• EC2 instances, which handle the traffic.

• All identical; they sync with each other.

• They can be added or removed at will.

• If some (but not all) die, the users won’t notice.

• If all die, only the statistics will be affected.

• (Never happened).


Page 9: Handling Massive Traffic with Python

Users Handler

• Give out and validate tokens.

• Determine if the user should:

  • Keep waiting

  • Go to the Frontoffice

  • See the Sold Out page.

• Return the expected waiting time.

• Return the values configured by the Event Managers.
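
The decision described above could look roughly like the following. This is a minimal sketch under assumptions: the talk does not show the real handler, and `QueueState`, `handle_user`, the action names and the simple "users ahead divided by let-through rate" estimate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    admitted_up_to: int            # highest queue position already let through
    let_through_per_second: float  # rate currently configured for the sale
    sold_out: bool = False


def handle_user(position, state):
    """Return (action, expected_wait_seconds) for a user's queue position."""
    if state.sold_out:
        return "sold_out", None
    if position <= state.admitted_up_to:
        return "frontoffice", 0.0
    if state.let_through_per_second <= 0:
        # Sales are on hold: nobody advances, so no estimate can be given.
        return "wait", None
    users_ahead = position - state.admitted_up_to
    return "wait", users_ahead / state.let_through_per_second
```

For example, with 100 users already admitted and 50 users let through per second, the user at position 600 would be told to wait about 10 seconds.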


Page 10: Handling Massive Traffic with Python

Synchronization of Statistics

• Keep the Queue Instances synced so they know:

  • How many users are waiting.

  • How to calculate the waiting time.

  • How many users are being let through by the system.
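
The talk does not describe the sync protocol. One way identical instances can agree on a count without a central store is a grow-only counter: each instance only increments its own slot and merges the others by taking the maximum. The sketch below swaps in that technique as an illustration; `InstanceStats` and its methods are hypothetical names.

```python
class InstanceStats:
    """Per-instance view of how many tokens have been handed out overall."""

    def __init__(self, instance_id):
        self.instance_id = instance_id
        # One slot per known instance; each instance only increments its own.
        self.issued = {instance_id: 0}

    def record_new_user(self):
        self.issued[self.instance_id] += 1

    def merge(self, other_issued):
        """Fold in another instance's counters; repeated merges are harmless."""
        for instance_id, count in other_issued.items():
            self.issued[instance_id] = max(self.issued.get(instance_id, 0), count)

    def total_issued(self):
        return sum(self.issued.values())
```

Because merging takes the maximum per slot, messages can arrive late, twice or out of order without inflating the total, which fits a setup where instances are added and removed at will.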


Page 11: Handling Massive Traffic with Python

HTML/JS Queue Page in CloudFront

• Uses Handlebars

• Served by CloudFront so that the Queue keeps looking good even if all our servers are down.

• Updated frequently.

• Calls the Load Balancer. Error? Retry.

• Errors are very rare.
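
The real queue page is JavaScript, and the slides only say "Error? Retry." The sketch below shows the same idea in Python, with two assumptions added for illustration: a capped number of attempts and an exponentially growing delay between them. `call_with_retry` and its parameters are hypothetical; `fetch` stands for whatever call reaches the load balancer.

```python
import time


def call_with_retry(fetch, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts are used up."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as error:  # includes ConnectionError and TimeoutError
            last_error = error
            if attempt < attempts - 1:
                # Wait longer after each failure: 0.5 s, 1 s, 2 s, ...
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Spacing the retries out matters at this scale: thousands of pages retrying immediately would add load exactly when the backend is struggling.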


Page 12: Handling Massive Traffic with Python

Deployment

• Debs in private repos.

• Installed through tunnel.

• Custom python2deb tool (to be released).


Page 13: Handling Massive Traffic with Python

Stress test

• Custom client with human-like behaviour.

• Notify Amazon!
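
The talk does not say what "human-like" meant in the custom client. A plausible reading, sketched here as an assumption, is that simulated users poll at irregular intervals and sometimes give up before their turn. `plan_session` and all of its parameters are hypothetical.

```python
import random


def plan_session(rng, polls_until_turn, mean_interval=5.0, jitter=0.5, quit_chance=0.02):
    """Plan one simulated user's polls: return (delays_in_seconds, outcome)."""
    delays = []
    for _ in range(polls_until_turn):
        if rng.random() < quit_chance:
            # The user closes the tab before reaching the front of the queue.
            return delays, "abandoned"
        # Vary each wait so simulated clients do not poll in lockstep.
        delays.append(mean_interval * rng.uniform(1 - jitter, 1 + jitter))
    return delays, "admitted"


# Usage: a seeded generator keeps a stress-test run reproducible.
session_delays, outcome = plan_session(random.Random(2013), polls_until_turn=10)
```

A load generator could run many such sessions concurrently, sleeping for each planned delay before the next request.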


Page 14: Handling Massive Traffic with Python

What we learned

• Debugging distributed apps is hard.

• Last bugs are nasty.

• ELB doesn’t scale fast enough by itself.


Page 15: Handling Massive Traffic with Python

Q&A
