Python Load Testing - PyGotham 2012
DESCRIPTION
Getting ready for a big event on your website? Re-architecting for better performance or scalability? Releasing a hot new feature? Load testing can help you plan and provision accordingly. In this talk, I'll briefly discuss load testing strategies, then dive into how to DIY with Python using Corey Goldberg's open source library multi-mechanize, as well as how to gather performance data from your tests. Real-life examples against a deployment of a popular open-source Python web app (reddit!).
TRANSCRIPT
$finger dkuebric
• Currently: Tracelytics
• Before: Songza.com / AmieStreet.com
• Likes websites, soccer, and beer
2
MENU
• What is load testing?
• How does it work?
• One good way to do it
• (with Python, no less!)
• A few examples
• Survey: making sense of it all
3
WHAT IS LOAD TESTING
• AKA stress testing
• Putting demand on a system (web app) and measuring its response (latency, availability)
http://www.flickr.com/photos/60258236@N02/5515327431/
4
Q: WILL IT PERFORM?
Do my code + infrastructure provide the desired level of performance in terms of latency and throughput for real user workloads?
And if not, how can I get there?
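One way to make "latency and throughput" concrete is to summarize the per-request timings from a run. The sketch below is illustrative only: the `summarize` function, its field names, and the sample numbers are assumptions, not something from the talk or from multi-mechanize. It is written for Python 3.

```python
import math


def summarize(latencies_s, wall_clock_s):
    """Summarize one load-test run.

    latencies_s: per-request latencies in seconds (made-up sample data below)
    wall_clock_s: total duration of the run in seconds
    """
    ordered = sorted(latencies_s)

    def percentile(p):
        # Nearest-rank percentile: the smallest sample with at least
        # p percent of all samples at or below it.
        rank = max(1, math.ceil(p / 100.0 * len(ordered)))
        return ordered[rank - 1]

    return {
        "requests": len(ordered),
        "throughput_rps": len(ordered) / wall_clock_s,
        "mean_s": sum(ordered) / len(ordered),
        "p50_s": percentile(50),
        "p95_s": percentile(95),
        "max_s": ordered[-1],
    }


if __name__ == "__main__":
    samples = [0.12, 0.15, 0.11, 0.90, 0.14, 0.13, 0.16, 0.12, 0.18, 0.13]
    print(summarize(samples, wall_clock_s=2.0))
```

Percentiles are reported alongside the mean because a single slow outlier (0.90 s in the sample data) moves the mean far more than it moves the median.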
5
http://www.flickr.com/photos/dvs/3540320095/
6
WHEN TO LOAD TEST
• Launching a new feature (or site!)
• Expect a big traffic event
• Changing infrastructure
• “I just don’t like surprises”
7
A PAGEVIEW IS BORN
[Timeline diagram, t=0 to t=done: “give me search results” → DNS → first HTTP request → retrieve assets and run JS → “now I can lolcats”]
8
THE WATERFALL CHART
9
A PAGEVIEW IS BORN
• DNS, connection
• Fulfill HTTP request (“Time to first byte”)
• Download + render page contents (+js)
yslow?
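The server-side phases above can be timed roughly from Python with the standard library. This is a sketch under assumptions: the `time_request` helper and its phase names are invented for illustration, it covers plain HTTP only, and it does not measure asset retrieval or browser rendering (that is what waterfall tools are for). Python 3.

```python
import http.client
import socket
import time


def time_request(host, port=80, path="/"):
    """Return (status, body_length, timings) for one plain-HTTP GET.

    timings maps a phase name to its duration in seconds.
    """
    timings = {}

    # DNS: resolve the hostname to a socket address.
    t0 = time.perf_counter()
    info = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    sockaddr = info[0][4]
    timings["dns"] = time.perf_counter() - t0

    # Connection: TCP handshake to the resolved address.
    t1 = time.perf_counter()
    conn = http.client.HTTPConnection(sockaddr[0], port, timeout=10)
    conn.connect()
    timings["connect"] = time.perf_counter() - t1

    # Time to first byte: request sent until status line and headers arrive.
    t2 = time.perf_counter()
    conn.request("GET", path, headers={"Host": host})
    resp = conn.getresponse()
    timings["first_byte"] = time.perf_counter() - t2

    # Download: read the rest of the response body.
    t3 = time.perf_counter()
    body = resp.read()
    timings["download"] = time.perf_counter() - t3

    conn.close()
    return resp.status, len(body), timings
```

Calling `time_request("website.com")` would return the status code, the body size, and a dict with `dns`, `connect`, `first_byte`, and `download` entries.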
10
HOW TO TEST IT
(Listed from most common to least common)
• “F*ck it, we’ll do it live!”
• Synthetic HTTP requests (“Virtual Users / VUs”)
• Synthetic clicks in real browsers (“RBUs”)
11
VU vs RBU
VU
• Simulate individual protocol-level requests
• Low overhead
• Scripts must parallel user behavior
• Tough to get AJAX-heavy sites exactly right, maintain
RBU
• Operate a browser
• Higher overhead
• Scripts must parallel user behavior
• Accurate simulation of AJAX
12
MULTI-MECHANIZE
• FOSS (LGPL v3), based on mechanize
• Load generation framework for VUs
• Written in Python, scripted in Python
• Heavyweight for bandwidth-bound tests, but stocked with py-goodness
13
MULTI-MECHANIZE
• Basic idea:
• Write a few scripts that simulate user actions or paths
• Specify how you want to run them: x VUs in parallel on script A, y on script B, ramping up, etc.
• Run and watch
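To illustrate the basic idea (several VUs running a script in parallel, started gradually over a ramp-up period), here is a toy runner built on threads. It is not multi-mechanize's implementation: `run_group`, its parameters, and the stand-in `Transaction` are assumptions chosen to mirror the config options shown later. Python 3.

```python
import threading
import time


class Transaction(object):
    """Stand-in for a test script; a real one would issue HTTP requests."""

    def run(self):
        time.sleep(0.01)


def run_group(transaction_cls, threads, run_time, rampup):
    """Run `threads` virtual users for `run_time` seconds.

    VU start times are spread evenly across `rampup` seconds.
    Returns a list of (seconds_since_start, latency, error) tuples.
    """
    results = []
    lock = threading.Lock()
    start = time.time()
    deadline = start + run_time

    def virtual_user(delay):
        time.sleep(delay)                  # staggered start = ramp-up
        trans = transaction_cls()
        while time.time() < deadline:
            t0 = time.time()
            error = None
            try:
                trans.run()
            except Exception as exc:       # a failed assert counts as an error
                error = repr(exc)
            with lock:
                results.append((t0 - start, time.time() - t0, error))

    spacing = float(rampup) / threads if threads else 0.0
    workers = [
        threading.Thread(target=virtual_user, args=(i * spacing,))
        for i in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    out = run_group(Transaction, threads=5, run_time=1.0, rampup=0.5)
    print("%d transactions, %d errors" % (len(out), sum(1 for r in out if r[2])))
```

Running script A with x VUs and script B with y VUs would amount to calling `run_group` once per script, each in its own thread.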
14
A SIMPLE M-M SCRIPT
import requests

class Transaction(object):
    def run(self):
        r = requests.get('http://website.com/')
        r.raw.read()
get_index.py
15
A SIMPLE PROJECT
• A multi-mechanize “project” is a set of test scripts and a config that specifies how to run them:
dan@host:~/mm/my_project$ ls -1 ...
config.cfg       # config file
test_scripts/    # your tests are here
results/         # result files go here
16
A SIMPLE PROJECT
[global]
run_time: 60
rampup: 60
results_ts_interval: 60
console_logging: off
progress_bar: on

[user_group-1]
threads: 25
script: get_index.py
config.cfg
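The config is ordinary INI syntax, so its structure can be inspected with Python's standard `configparser`. The sketch below only shows how the sections and options fit together; `load_plan` is a made-up helper, and this is not necessarily how multi-mechanize itself reads the file. Python 3.

```python
import configparser

# Same content as config.cfg on the slide.
CONFIG_TEXT = """
[global]
run_time: 60
rampup: 60
results_ts_interval: 60
console_logging: off
progress_bar: on

[user_group-1]
threads: 25
script: get_index.py
"""


def load_plan(text):
    """Return (global settings, list of user groups) from config text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    settings = {
        "run_time": parser.getint("global", "run_time"),
        "rampup": parser.getint("global", "rampup"),
        "progress_bar": parser.getboolean("global", "progress_bar"),
    }
    # Every [user_group-N] section describes one script and its VU count.
    groups = [
        {
            "name": name,
            "threads": parser.getint(name, "threads"),
            "script": parser.get(name, "script"),
        }
        for name in parser.sections()
        if name.startswith("user_group-")
    ]
    return settings, groups


if __name__ == "__main__":
    settings, groups = load_plan(CONFIG_TEXT)
    print(settings)
    print(groups)
```

Adding a second section such as `[user_group-2]` with its own `threads` and `script` is how a mix of x VUs on one script and y on another would be expressed.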
17
A FEW M-M FEATURES
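import requests
import time

class Transaction(object):
    def __init__(self):
        # holds named timings; defined here so the script also runs standalone
        self.custom_timers = {}

    def run(self):
        # fail the transaction on a bad status code or an error page
        r = requests.get('http://website.com/a')
        r.raw.read()
        assert (r.status_code == 200), 'not 200'
        assert ('Error' not in r.text)

        # time the second request separately as custom timer 'b'
        t1 = time.time()
        r = requests.get('http://website.com/b')
        r.raw.read()
        latency = time.time() - t1
        self.custom_timers['b'] = latency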
features.py
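When a script has many timed steps, the start/stop bookkeeping can be wrapped in a small context manager. This is a sketch, not part of multi-mechanize: the `timed` helper is an assumption, and `time.sleep` stands in for the HTTP request so the example runs without a network.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(timers, name):
    """Store the wall-clock duration of the enclosed block in timers[name]."""
    start = time.time()
    try:
        yield
    finally:
        timers[name] = time.time() - start


class Transaction(object):
    def __init__(self):
        self.custom_timers = {}

    def run(self):
        with timed(self.custom_timers, 'b'):
            # stand-in for requests.get(...) plus reading the body
            time.sleep(0.01)


if __name__ == '__main__':
    # Running a script directly like this is a quick way to debug it
    # before putting it under load.
    trans = Transaction()
    trans.run()
    print(trans.custom_timers)
```

Because the timer is written in a `finally` clause, the duration is recorded even when an assert inside the block fails.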
18
[ $ multimech-run example ]
19
INTERACTION
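import mechanize as m

# u, p: login username and password (not defined on the slide)

# multi-mechanize expects the class to be named Transaction
class Transaction(object):
    def run(self):
        # browser that follows redirects and ignores robots.txt
        br = m.Browser()
        br.set_handle_equiv(True)
        br.set_handle_gzip(True)
        br.set_handle_redirect(True)
        br.set_handle_referer(True)
        br.set_handle_robots(False)
        br.set_handle_refresh(m._http.HTTPRefreshProcessor(), max_time=1)
        _ = br.open('http://reddit.tlys.us')

        # fill in and submit the login form (second form on the page)
        br.select_form(nr=1)
        br.form['user'] = u
        br.form['passwd'] = p
        r = br.submit()
        r.read()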
login.py
20
[ $ cat more-advanced-example.py ]
[ $ multimech-run more-advanced ]
21
GETTING INSIGHT
• First, is the machine working hard?
• Inspect basic resources: CPU/RAM/IO
• Ganglia/Munin/etc.
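Ganglia and Munin are the right tools for ongoing monitoring. For a quick look while a test runs, a few lines of standard-library Python can sample machine load in the background. This is a minimal sketch, assuming a Unix host; `watch_while` is a made-up helper and only the 1-minute load average is recorded.

```python
import os
import threading
import time


def watch_while(work, interval_s=1.0):
    """Run work() while sampling the 1-minute load average in the background.

    Returns a list of (timestamp, load) tuples; load is None where
    os.getloadavg() is unavailable.
    """
    samples = []
    stop = threading.Event()

    def sampler():
        while True:
            try:
                load = os.getloadavg()[0]      # Unix only
            except (AttributeError, OSError):
                load = None                    # unavailable on this platform
            samples.append((time.time(), load))
            # wait() returns True as soon as stop is set
            if stop.wait(interval_s):
                break

    thread = threading.Thread(target=sampler)
    thread.start()
    try:
        work()
    finally:
        stop.set()
        thread.join()
    return samples


if __name__ == "__main__":
    # time.sleep stands in for the actual load test
    for stamp, load in watch_while(lambda: time.sleep(3), interval_s=1.0):
        print(stamp, load)
```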
22
MUNIN
23
GETTING INSIGHT
• Second, why is the machine working hard?
• What is my app doing?
• What are my databases doing?
• How are my caches performing?
24
REDDIT: A PYLONS APP
[Architecture diagram: nginx, uwsgi + pylons, memcached, postgresql, cassandra, queued jobs]
25
REDDIT: A PYLONS APP
[Architecture diagram, annotated with “request start” and “request end”: nginx, uwsgi + pylons, memcached, postgresql, cassandra, queued jobs]
26
INSIDE THE APP
PROFILING
• Fine-grained analysis of Python code
• Easy to set up
• No visibility into DB, cache, etc.
• Distorts app performance
• profile, cProfile
VS
INSTRUMENTATION
• Gather curated set of information
• Requires monkey-patching (or code edits)
• Can connect DB, cache performance to app
• Little (tunable) overhead
• django-debug-toolbar, statsd, Tracelytics, New Relic
28
PROFILING 101
         ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
         892048   12.643    0.000   17.676    0.000  thing.py:116(__getattr__)
     14059/2526    9.475    0.001   34.159    0.014  template_helpers.py:181(_replace_render)
         562060    7.384    0.000    7.384    0.000  {posix.stat}
  204587/163113    6.908    0.000   51.302    0.000  filters.py:111(mako_websafe)
  115192/109693    6.590    0.000    9.700    0.000  {method 'join' of 'str' objects}
        1537933    6.584    0.000   15.437    0.000  registry.py:136(__getattr__)
1679803/1404938    5.294    0.000   11.767    0.000  {hasattr}
2579769/2434607    5.173    0.000   12.713    0.000  {getattr}
            139    4.809    0.035  106.065    0.763  pages.py:1004(__init__)
           8146    3.967    0.000   15.031    0.002  traceback.py:280(extract_stack)
          43487    3.942    0.000    3.942    0.000  {method 'recv' of '_socket.socket' objects}
         891579    3.759    0.000   21.430    0.000  thing.py:625(__getattr__)
          72021    3.633    0.000    5.910    0.000  memcache.py:163(serverHashFunction)
            201    3.319    0.017   38.667    0.192  pages.py:336(render)
            392    3.236    0.008    3.236    0.008  {Cfilters.uspace_compress}
        1610797    3.208    0.000    3.209    0.000  registry.py:177(_current_obj)
        2017343    3.113    0.000    3.211    0.000  {isinstance}
• ncalls: # of calls to method
• tottime: total time spent exclusively in that method
• cumtime: time spent in that method and all child calls
• Try repoze.profile for WSGI
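A table like the one above can be produced with the standard library's `cProfile` and `pstats`. The sketch below profiles a single call and prints the top entries sorted by `tottime`; `render_page` is a hypothetical stand-in for an expensive request handler, and `profile_call` is a made-up helper. Python 3.

```python
import cProfile
import io
import pstats


def render_page(n=2000):
    # Hypothetical stand-in for an expensive request handler.
    return "".join(str(i) for i in range(n))


def profile_call(func, *args, **kwargs):
    """Profile one call to func and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args, **kwargs)
    finally:
        profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by time spent exclusively in each function; show the top 10.
    stats.sort_stats("tottime").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_call(render_page))
```

Sorting by `cumtime` instead points at the high-level callers (such as `pages.py:1004(__init__)` in the table) rather than the hot leaf functions.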
29
INSTRUMENTATION
• DB queries
• Cache usage
• RPC calls
• Optionally profile critical segments of code
• Exceptions
• Associate with particular codepaths or URLs
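As a rough illustration of instrumentation by monkey-patching, the sketch below wraps a method so that each call's duration is recorded under a layer name such as "db" or "cache". Everything here is hypothetical (`instrument`, `TIMINGS`, `FakeDB`); real tools like statsd clients or Tracelytics do considerably more, including tying timings to specific URLs.

```python
import functools
import time
from collections import defaultdict

# layer name -> list of call durations in seconds
TIMINGS = defaultdict(list)


def instrument(obj, method_name, layer):
    """Monkey-patch obj.method_name so every call is timed under `layer`."""
    original = getattr(obj, method_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return original(*args, **kwargs)
        finally:
            # recorded even if the call raises
            TIMINGS[layer].append(time.time() - start)

    setattr(obj, method_name, wrapper)


class FakeDB(object):
    """Hypothetical database client."""

    def query(self, sql):
        time.sleep(0.01)
        return []


if __name__ == "__main__":
    instrument(FakeDB, "query", layer="db")
    FakeDB().query("SELECT 1")
    for layer, durations in TIMINGS.items():
        print(layer, len(durations), sum(durations))
```

The overhead is one timestamp pair per wrapped call, which is why instrumentation distorts app performance far less than a full profiler.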
30
DJANGO-DEBUG-TOOLBAR
31
STATSD/GRAPHITE
32
TRACELYTICS/NEW RELIC
33
THANKS!
• Multi-mechanize: testutils.org/multi-mechanize/
• Email me: [email protected]
• My job: tracelytics.com
(any questions?)
34
APPENDIX
• Multi-mechanize
• Download: testutils.org/multi-mechanize
• Development: github.com/cgoldberg/multi-mechanize
• Mechanize: wwwsearch.sourceforge.net/mechanize/
• Source: github.com/reddit/reddit
• Demo load test: github.com/dankosaur/reddit-loadtest
• RBU load testing
• BrowserMob: browsermob.com/performance-testing
• LoadStorm: loadstorm.com
• New, but promising: github.com/detro/ghostdriver
35
APPENDIX
• Machine monitoring:
• Ganglia: ganglia.sourceforge.net
• Munin: munin-monitoring.org
• Application monitoring:
• repoze.profile: docs.repoze.org/profile/
• Django-Debug-Toolbar: github.com/django-debug-toolbar/django-debug-toolbar
• Graphite: graphite.wikidot.com
• and statsd: github.com/etsy/statsd
• and django instrumentation: pypi.python.org/pypi/django-statsd/1.8.0
• Tracelytics: tracelytics.com
36
AND: TRACELYTICS FREE TRIAL
We’ve got a 14-day trial; with only minutes to install, you’ll have plenty of time for load testing. No credit card required!
http://tracelytics.com
37