performance tweaks and tools for linux

performance tweaks and tools for linux joe damato twitter: @joedamato blog: timetobleed.com

Upload: ice799

Post on 21-Nov-2014


Page 1: performance tweaks and tools for linux

performance tweaks and tools for linux

joe damato
twitter: @joedamato
blog: timetobleed.com

Page 2: performance tweaks and tools for linux
Page 3: performance tweaks and tools for linux
Page 4: performance tweaks and tools for linux

~12 hour flight + 11 hour time change

Page 5: performance tweaks and tools for linux

and

Page 6: performance tweaks and tools for linux
Page 7: performance tweaks and tools for linux

side effects are..

Page 8: performance tweaks and tools for linux
Page 9: performance tweaks and tools for linux
Page 10: performance tweaks and tools for linux
Page 11: performance tweaks and tools for linux
Page 12: performance tweaks and tools for linux

ejpphoto (flickr)

nasty bugs

Page 13: performance tweaks and tools for linux

codefatboyke (flickr)

Page 14: performance tweaks and tools for linux

37prime (flickr)

memory bloat

Page 15: performance tweaks and tools for linux

?

Page 16: performance tweaks and tools for linux

TOOLS

Page 17: performance tweaks and tools for linux

LSOF
list open files

lsof -nPp <pid>

Page 18: performance tweaks and tools for linux

lsof -nPp <pid>

-n
Inhibits the conversion of network numbers to host names.

-P
Inhibits the conversion of port numbers to names for network files.

FD   TYPE  NAME
cwd  DIR   /var/www/myapp
txt  REG   /usr/bin/ruby
mem  REG   /json-1.1.9/ext/json/ext/generator.so
mem  REG   /json-1.1.9/ext/json/ext/parser.so
mem  REG   /memcached-0.17.4/lib/rlibmemcached.so
mem  REG   /mysql-2.8.1/lib/mysql_api.so
0u   CHR   /dev/null
1w   REG   /usr/local/nginx/logs/error.log
2w   REG   /usr/local/nginx/logs/error.log
3u   IPv4  10.8.85.66:33326->10.8.85.68:3306 (ESTABLISHED)
10u  IPv4  10.8.85.66:33327->10.8.85.68:3306 (ESTABLISHED)
11u  IPv4  127.0.0.1:58273->127.0.0.1:11211 (ESTABLISHED)
12u  REG   /tmp/RackMultipart.28957.0
33u  IPv4  174.36.83.42:37466->69.63.180.21:80 (ESTABLISHED)

slide callouts: json, memcached, mysql, http
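The callouts on this slide come from reading the remote port of each established connection. A minimal sketch of that reading in Ruby, assuming the sample rows from the slide; the `SERVICES` table and the method name are made up for illustration and are not part of lsof:

```ruby
# Sketch: count established TCP connections in `lsof -nPp <pid>` output
# by remote port. Sample rows are copied from the slide.
SAMPLE_LSOF = <<~OUT
  3u  IPv4 10.8.85.66:33326->10.8.85.68:3306 (ESTABLISHED)
  10u IPv4 10.8.85.66:33327->10.8.85.68:3306 (ESTABLISHED)
  11u IPv4 127.0.0.1:58273->127.0.0.1:11211 (ESTABLISHED)
  12u REG  /tmp/RackMultipart.28957.0
  33u IPv4 174.36.83.42:37466->69.63.180.21:80 (ESTABLISHED)
OUT

# Assumption: default ports. Adjust for your own services.
SERVICES = { 3306 => "mysql", 11211 => "memcached", 80 => "http" }.freeze

def connections_by_service(lsof_text)
  counts = Hash.new(0)
  lsof_text.each_line do |line|
    # Match "local:port->remote:port" and capture the remote port.
    next unless line =~ /->[\d.]+:(\d+)/
    port = Regexp.last_match(1).to_i
    counts[SERVICES.fetch(port, "port #{port}")] += 1
  end
  counts
end

connections_by_service(SAMPLE_LSOF).each { |name, n| puts "#{name}: #{n}" }
```

Because `-n` and `-P` keep addresses and ports numeric, the same regex should also work on real lsof output piped in from the command line.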

Page 19: performance tweaks and tools for linux

STRACE
trace system calls and signals

strace -cp <pid>
strace -ttTp <pid> -o <file>

Page 20: performance tweaks and tools for linux

strace -cp <pid>

-c
Count time, calls, and errors for each system call and report a summary on program exit.

-p pid
Attach to the process with the process ID pid and begin tracing.

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 50.39    0.000064           0      1197       592 read
 34.65    0.000044           0       609           writev
 14.96    0.000019           0      1226           epoll_ctl
  0.00    0.000000           0         4           close
  0.00    0.000000           0         1           select
  0.00    0.000000           0         4           socket
  0.00    0.000000           0         4         4 connect
  0.00    0.000000           0      1057           epoll_wait
------ ----------- ----------- --------- --------- ----------------
100.00    0.000127                  4134       596 total
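The errors column is easy to overlook in a summary like this. As a rough sketch (the method name is illustrative, and the sample table is the one from the slide), the summary can be parsed and ranked by failure rate:

```ruby
# Sketch: rank syscalls from an `strace -c` summary by error rate.
STRACE_SUMMARY = <<~OUT
  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   50.39    0.000064           0      1197       592 read
   34.65    0.000044           0       609           writev
   14.96    0.000019           0      1226           epoll_ctl
    0.00    0.000000           0         4           close
    0.00    0.000000           0         1           select
    0.00    0.000000           0         4           socket
    0.00    0.000000           0         4         4 connect
    0.00    0.000000           0      1057           epoll_wait
  ------ ----------- ----------- --------- --------- ----------------
  100.00    0.000127                  4134       596 total
OUT

def parse_strace_summary(text)
  rows = text.each_line.map do |line|
    fields = line.split
    # Data rows start with a numeric "% time"; header and separator rows do not.
    next unless fields.first.to_s.match?(/\A\d+\.\d+\z/)
    next if fields.last == "total"
    # The errors column is blank when a syscall never failed.
    errors = fields.size == 6 ? fields[4].to_i : 0
    { syscall: fields.last, calls: fields[3].to_i, errors: errors }
  end
  rows.compact
end

failing = parse_strace_summary(STRACE_SUMMARY).select { |r| r[:errors] > 0 }
failing.sort_by { |r| -(r[:errors].to_f / r[:calls]) }.each do |r|
  rate = 100.0 * r[:errors] / r[:calls]
  puts format("%-10s %5d of %5d calls failed (%.1f%%)", r[:syscall], r[:errors], r[:calls], rate)
end
```

The summary alone does not say which errno was returned; a follow-up `strace -ttTp <pid>` run shows the individual failing calls.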

Page 21: performance tweaks and tools for linux

strace -ttTp <pid> -o <file>

-tt
If given twice, the time printed will include the microseconds.

-T
Show the time spent in system calls.

-o filename
Write the trace output to the file filename rather than to stderr.

epoll_wait(9, {{EPOLLIN, {u32=68841296, u64=68841296}}}, 4096, 50) = 1 <0.033109>
accept(10, {sin_port=38313, sin_addr="127.0.0.1"}, [1226]) = 22 <0.000014>
fcntl(22, F_GETFL) = 0x2 (flags O_RDWR) <0.000007>
fcntl(22, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008>
setsockopt(22, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000008>
accept(10, 0x7fff5d9c07d0, [1226]) = -1 EAGAIN <0.000014>
epoll_ctl(9, EPOLL_CTL_ADD, 22, {EPOLLIN, {u32=108750368, u64=108750368}}) = 0 <0.000009>
epoll_wait(9, {{EPOLLIN, {u32=108750368, u64=108750368}}}, 4096, 50) = 1 <0.000007>
read(22, "GET / HTTP/1.1\r"..., 16384) = 772 <0.000012>
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000007>
poll([{fd=5, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) <0.000008>
write(5, "1\0\0\0\0\0\0-\0\0\0\3SELECT * FROM `table`"..., 56) = 56 <0.000023>
read(5, "\25\1\0\1,\2\0x\234m"..., 16384) = 284 <1.300897>

Page 22: performance tweaks and tools for linux

read(22, "GET / HTTP/1.1\r"..., 16384) = 772 <0.0012>

http client connection

read 772 bytes

incoming http request

took 0.0012s

Page 23: performance tweaks and tools for linux

write(5, "SELECT * FROM `table`"..., 56) = 56 <0.0023>
read(5, "\25\1\0\1,\2\0x\234m"..., 16384) = 284 <1.30>

mysql connection

write sql query to db

read query response

slow query
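Spotting the slow read by eye works for a dozen lines; for a long trace file it helps to filter on the `<seconds>` suffix that `-T` appends. A minimal sketch, assuming trace lines shaped like the ones in this deck (the method name and the 0.1s threshold are arbitrary choices for illustration):

```ruby
# Sketch: list syscalls in `strace -ttT` output that took longer than a threshold.
# Raw heredoc so the backslashes in the trace lines stay literal.
TRACE = <<~'OUT'
  read(22, "GET / HTTP/1.1\r"..., 16384) = 772 <0.000012>
  write(5, "1\0\0\0\0\0\0-\0\0\0\3SELECT * FROM `table`"..., 56) = 56 <0.000023>
  read(5, "\25\1\0\1,\2\0x\234m"..., 16384) = 284 <1.300897>
OUT

def slow_calls(trace_text, threshold)
  calls = trace_text.each_line.map do |line|
    # Optional leading timestamp (from -tt), then "name(", then "<seconds>" at the end.
    m = line.match(/\A(?:[\d:.]+\s+)?(\w+)\(.*<(\d+\.\d+)>\s*\z/)
    next unless m
    { syscall: m[1], seconds: m[2].to_f, line: line.strip }
  end
  calls.compact.select { |c| c[:seconds] >= threshold }
end

slow_calls(TRACE, 0.1).each { |c| puts "#{c[:seconds]}s  #{c[:line]}" }
```

With a real trace, something like `slow_calls(File.read("<file>"), 0.1)` would do the same over the file written by `-o <file>`.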

Page 24: performance tweaks and tools for linux

strace ruby to see some interesting things...

Page 25: performance tweaks and tools for linux

stracing ruby: SIGVTALRM

--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 2207807 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 0 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 140734552062624 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 140734552066688 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 11333952 <0.000008>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 0 <0.000009>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
rt_sigreturn(0x1a) = 1 <0.000010>
--- SIGVTALRM (Virtual timer expired) @ 0 (0) ---

• ruby 1.8 uses signals to schedule its green threads

• process receives a SIGVTALRM signal every 10ms

Page 26: performance tweaks and tools for linux

stracing ruby: sigprocmask

• debian/redhat compile ruby with --enable-pthread

• uses a native thread timer for SIGVTALRM

• causes excessive calls to sigprocmask: 30% slowdown!

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.326334           0   3568567           rt_sigprocmask
  0.00    0.000000           0         9           read
  0.00    0.000000           0        10           open
  0.00    0.000000           0        10           close
  0.00    0.000000           0         9           fstat
  0.00    0.000000           0        25           mmap
------ ----------- ----------- --------- --------- ----------------
100.00    0.326334               3568685         0 total

Page 27: performance tweaks and tools for linux

TCPDUMP
dump traffic on a network

tcpdump -i eth0 -s 0 -nqA tcp dst port 3306

Page 28: performance tweaks and tools for linux

tcpdump -i <eth> -s <len> -nqA <expr>

tcpdump -i <eth> -w <file> <expr>

-i <eth>
Network interface.

-s <len>
Snarf len bytes of data from each packet.

-n
Don't convert addresses (host addresses, port numbers) to names.

-q
Quiet output. Print less protocol information.

-A
Print each packet (minus its link level header) in ASCII.

-w <file>
Write the raw packets to file rather than printing them out.

<expr>
libpcap expression, for example:
tcp src port 80
tcp dst port 3306

Page 29: performance tweaks and tools for linux

tcp dst port 80

19:52:20.216294 IP 24.203.197.27.40105 > 174.37.48.236.80: tcp 438
E...*[email protected].%&.....%0....POx..%s.oP.......
GET /poll_images/cld99erh0/logo.png HTTP/1.1
Accept: */*
Referer: http://apps.facebook.com/realpolls/?_fb_q=1

Page 30: performance tweaks and tools for linux

tcp dst port 3306

19:51:06.501632 IP 10.8.85.66.50443 > 10.8.85.68.3306: tcp 98
E..."K@[email protected]..[......W....
SELECT * FROM `votes` WHERE (`poll_id` = 72621) LIMIT 1

Page 31: performance tweaks and tools for linux

tcpdump -w <file>

Page 32: performance tweaks and tools for linux

PERFTOOLS
google's performance tools

CPUPROFILE=/tmp/myprof ./myapp
pprof ./myapp /tmp/myprof

Page 33: performance tweaks and tools for linux

download:
wget http://google-perftools.googlecode.com/files/google-perftools-1.6.tar.gz
tar zxvf google-perftools-1.6.tar.gz
cd google-perftools-1.6

compile:
./configure --prefix=/opt
make
sudo make install

setup:
# for linux
export LD_PRELOAD=/opt/lib/libprofiler.so

# for osx
export DYLD_INSERT_LIBRARIES=/opt/lib/libprofiler.dylib

profile:
CPUPROFILE=/tmp/ruby.prof ruby -e' 5_000_000.times{ "hello world" }'

report:
pprof `which ruby` --text /tmp/ruby.prof

Page 34: performance tweaks and tools for linux

Total: 103 samples
  95   92.2%  rb_yield_0
 103  100.0%  rb_eval
  12   11.7%  gc_sweep
  52   50.5%  rb_str_new3
   3    2.9%  obj_free
 103  100.0%  int_dotimes
  12   11.7%  gc_mark

pprof ruby ruby.prof --text

pprof ruby ruby.prof --gif

Page 35: performance tweaks and tools for linux

Profiling MRI

• 10% of production VM time spent in rb_str_sub_bang

• String#sub!

• called from Time.parse

return unless str.sub!(/\A(\d{1,2})/, '')
return unless str.sub!(/\A( \d|\d{1,2})/, '')
return unless str.sub!(/\A( \d|\d{1,2})/, '')
return unless str.sub!(/\A(\d{1,3})/, '')
return unless str.sub!(/\A(\d{1,2})/, '')
return unless str.sub!(/\A(\d{1,2})/, '')

Page 36: performance tweaks and tools for linux

Profiling EM + threads

• known issue: EM+threads = slow

• memcpy??

• thread context switches copy the stack w/ memcpy

• EM allocates huge buffer on the stack

• solution: move buffer to the heap

Total: 3763 samples
 2764  73.5%  catch_timer
  989  26.3%  memcpy
    3   0.1%  st_lookup
    2   0.1%  rb_thread_schedule
    1   0.0%  rb_eval
    1   0.0%  rb_newobj
    1   0.0%  rb_gc_force_recycle

Page 37: performance tweaks and tools for linux

PERFTOOLS.RB
perftools for ruby code

pprof.rb /tmp/myrbprof

github.com/tmm1/perftools.rb

Page 38: performance tweaks and tools for linux

gem install perftools.rb

RUBYOPT="-r`gem which perftools | tail -1`" \
CPUPROFILE=/tmp/myrbprof \
ruby myapp.rb

pprof.rb /tmp/myrbprof --text
pprof.rb /tmp/myrbprof --gif > /tmp/myrbprof.gif

Page 39: performance tweaks and tools for linux

• Sampling profiler:

• 232 samples total

• 83 samples were in /compute

• 118 samples had /compute on the stack (the 83 in /compute itself plus 35 in functions it called)

• /compute accounts for 50% of process, but only 35% of time was in /compute itself

require 'sinatra'

get '/sleep' do
  sleep 0.25
  'done'
end

get '/compute' do
  proc{ |n|
    a,b=0,1
    n.times{ a,b = b,a+b }
    b
  }.call(10_000)
  'done'
end

$ ab -c 1 -n 50 http://127.0.0.1:4567/compute
$ ab -c 1 -n 50 http://127.0.0.1:4567/sleep

== Sinatra has ended his set (crowd applauds)
PROFILE: interrupts/evictions/bytes = 232/0/2152

Total: 232 samples
  83  35.8%  35.8%  118  50.9%  Sinatra::Application#GET /compute
  56  24.1%  59.9%   56  24.1%  garbage_collector
  35  15.1%  75.0%  113  48.7%  Integer#times
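The percentages in the report are plain ratios of sample counts. A short worked example, using only the numbers printed on this slide (the variable and method names are illustrative):

```ruby
# Sketch: reproduce the percentages in the pprof.rb text report for /compute.
TOTAL_SAMPLES = 232  # total interrupts recorded by the sampling profiler
SELF_SAMPLES  = 83   # samples where /compute itself was executing
STACK_SAMPLES = 118  # samples where /compute was anywhere on the stack

def percent(part, whole)
  (100.0 * part / whole).round(1)
end

self_pct  = percent(SELF_SAMPLES, TOTAL_SAMPLES)   # time in /compute itself
stack_pct = percent(STACK_SAMPLES, TOTAL_SAMPLES)  # time in /compute plus callees
callees   = STACK_SAMPLES - SELF_SAMPLES           # samples spent in callees

puts "self: #{self_pct}%  with callees: #{stack_pct}%  callee samples: #{callees}"
```

The 35 callee samples match the self count reported for Integer#times, which is where the loop inside /compute spends the rest of its time.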

Page 40: performance tweaks and tools for linux

CPUPROFILE=app.prof

CPUPROFILE_REALTIME=1 CPUPROFILE=app-rt.prof

Page 41: performance tweaks and tools for linux

redis-rb bottleneck

Page 42: performance tweaks and tools for linux

why is rubygems slow?

Page 43: performance tweaks and tools for linux

faster bundle install

• 23% spent in Gem::Version#<=>

• simple patch to rubygems improved overall install performance by 15%

• http://gist.github.com/458185

Page 44: performance tweaks and tools for linux

CPUPROFILE_OBJECTS=1 CPUPROFILE=app-objs.prof

• object allocation profiler mode built-in

• 1 sample = 1 object created

• Time parsing is both CPU and object allocation intensive

• using mysql2 moves this to C

Page 45: performance tweaks and tools for linux

LTRACE
trace library calls

ltrace -cp <pid>
ltrace -ttTp <pid> -o <file>

Page 46: performance tweaks and tools for linux

ltrace -c ruby threaded_em.rb

% time     seconds  usecs/call     calls function
------ ----------- ----------- --------- --------------------
 48.65   11.741295         617     19009 memcpy
 30.16    7.279634         831      8751 longjmp
  9.78    2.359889         135     17357 _setjmp
  8.91    2.150565         285      7540 malloc
  1.10    0.265946          20     13021 memset
  0.81    0.195272          19     10105 __ctype_b_loc
  0.35    0.084575          19      4361 strcmp
  0.19    0.046163          19      2377 strlen
  0.03    0.006272          23       265 realloc
------ ----------- ----------- --------- --------------------
100.00   24.134999                 82999 total

ltrace -ttT -e memcpy ruby threaded_em.rb

01:24:48.769408 --- SIGVTALRM (Virtual timer expired) ---
01:24:48.769616 memcpy(0x1216000, "", 1086328) = 0x1216000 <0.000578>
01:24:48.770555 memcpy(0x6e32670, "\240&\343v", 1086328) = 0x6e32670 <0.000418>

01:24:49.899414 --- SIGVTALRM (Virtual timer expired) ---
01:24:49.899490 memcpy(0x1320000, "", 1082584) = 0x1320000 <0.000628>
01:24:49.900474 memcpy(0x6e32670, "", 1086328) = 0x6e32670 <0.000479>

Page 47: performance tweaks and tools for linux

LTRACE/LIBDL
trace dlopen’d library calls

ltrace -F <conf> -bg -x <symbol> -p <pid>

github.com/ice799/ltrace/tree/libdl

Page 48: performance tweaks and tools for linux

ltrace -F <conf> -b -g -x <sym>

-b
Ignore signals.

-g
Ignore libraries linked at compile time.

-F <conf>
Read prototypes from config file.

-x <sym>
Trace calls to the function sym.

-F ltrace.conf
int mysql_real_query(addr,string,ulong);
void garbage_collect(void);
int memcached_set(addr,string,ulong,string,ulong);

Page 49: performance tweaks and tools for linux

ltrace -x garbage_collect

19:08:06.436926 garbage_collect() = <void> <0.221679>
19:08:15.329311 garbage_collect() = <void> <0.187546>
19:08:17.662149 garbage_collect() = <void> <0.199200>
19:08:20.486655 garbage_collect() = <void> <0.205864>
19:08:25.102302 garbage_collect() = <void> <0.214295>
19:08:35.552337 garbage_collect() = <void> <0.189172>
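Each line above carries a start time and a duration, so the share of wall-clock time spent in GC over the traced window can be estimated. A rough sketch under the assumption that the trace does not cross midnight; the method and key names are made up for illustration:

```ruby
# Sketch: summarize garbage_collect() calls from `ltrace -ttT -x garbage_collect`.
GC_TRACE = <<~OUT
  19:08:06.436926 garbage_collect() = <void> <0.221679>
  19:08:15.329311 garbage_collect() = <void> <0.187546>
  19:08:17.662149 garbage_collect() = <void> <0.199200>
  19:08:20.486655 garbage_collect() = <void> <0.205864>
  19:08:25.102302 garbage_collect() = <void> <0.214295>
  19:08:35.552337 garbage_collect() = <void> <0.189172>
OUT

def gc_summary(trace_text)
  rows = trace_text.scan(/^(\d+):(\d+):(\d+\.\d+) garbage_collect\(\).*<(\d+\.\d+)>$/)
  return nil if rows.empty?
  # Convert HH:MM:SS.usec to seconds since midnight (no midnight wrap handling).
  starts = rows.map { |h, m, s, _d| h.to_i * 3600 + m.to_i * 60 + s.to_f }
  durations = rows.map { |row| row[3].to_f }
  # Window runs from the first GC start to the end of the last GC.
  window = starts.last + durations.last - starts.first
  { runs: rows.size,
    gc_seconds: durations.sum,
    window_seconds: window,
    gc_percent: 100.0 * durations.sum / window }
end

summary = gc_summary(GC_TRACE)
puts format("%d GC runs, %.3fs of %.3fs (%.1f%%)",
            summary[:runs], summary[:gc_seconds], summary[:window_seconds], summary[:gc_percent])
```

Six samples is a small window; a longer trace would give a steadier estimate.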

Page 50: performance tweaks and tools for linux

ltrace -x mysql_real_query

mysql_real_query(0x1c9e0500, "SET NAMES 'UTF8'", 16) = 0 <0.000324>
mysql_real_query(0x1c9e0500, "SET SQL_AUTO_IS_NULL=0", 22) = 0 <0.000322>
mysql_real_query(0x19c7a500, "SELECT * FROM `users`", 21) = 0 <1.206506>
mysql_real_query(0x1c9e0500, "COMMIT", 6) = 0 <0.000181>

Page 51: performance tweaks and tools for linux

ltrace -x memcached_set

memcached_set(0x15d46b80, "Status:33", 21, "\004\b", 366) = 0 <0.01116>
memcached_set(0x15d46b80, "Status:96", 21, "\004\b", 333) = 0 <0.00224>
memcached_set(0x15d46b80, "Status:57", 21, "\004\b", 298) = 0 <0.01850>
memcached_set(0x15d46b80, "Status:10", 21, "\004\b", 302) = 0 <0.00530>
memcached_set(0x15d46b80, "Status:67", 21, "\004\b", 318) = 0 <0.00291>
memcached_set(0x15d46b80, "Status:02", 21, "\004\b", 299) = 0 <0.00658>
memcached_set(0x15d46b80, "Status:34", 21, "\004\b", 264) = 0 <0.00243>

Page 52: performance tweaks and tools for linux

GDB
the GNU debugger

gdb <executable>
gdb attach <pid>

Page 53: performance tweaks and tools for linux

Debugging Ruby Segfaults

#include "ruby.h"

VALUE
segv()
{
  VALUE array[1];
  array[1000000] = NULL;
  return Qnil;
}

void
Init_segv()
{
  rb_define_method(rb_cObject, "segv", segv, 0);
}

def test
  require 'segv'
  4.times do
    Dir.chdir '/tmp' do
      Hash.new{ segv }[0]
    end
  end
end

sleep 10
test()

test_segv.rb:4: [BUG] Segmentation fault
ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9.7.0]

Page 54: performance tweaks and tools for linux

1. Attach to running process

$ ps aux | grep ruby
joe 23611 0.0 0.1 25424 7540 S Dec01 0:00 ruby test_segv.rb

$ sudo gdb ruby 23611
Attaching to program: ruby, process 23611
0x00007fa5113c0c93 in nanosleep () from /lib/libc.so.6
(gdb) c
Continuing.

Program received signal SIGBUS, Bus error.
segv () at segv.c:7
7 array[1000000] = NULL;

2. Use a coredump

$ sudo mkdir /cores
$ sudo chmod 777 /cores
$ sudo sysctl kernel.core_pattern=/cores/%e.core.%s.%p.%t

Process.setrlimit Process::RLIMIT_CORE, 300*1024*1024

$ sudo gdb ruby /cores/ruby.core.6.23611.1259781224
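The one-argument form of Process.setrlimit in step 2 sets both the soft and hard limits, which an unprivileged process cannot do if the requested size is above its current hard limit. A hedged variant that only raises the soft limit as far as the hard limit allows (the method name is hypothetical, and 300MB is the value from the slide):

```ruby
# Sketch: raise the soft core-dump size limit without touching the hard limit.
def enable_core_dumps(desired_bytes)
  _soft, hard = Process.getrlimit(Process::RLIMIT_CORE)
  # RLIM_INFINITY is a very large integer, so min picks desired_bytes in that case.
  new_soft = [desired_bytes, hard].min
  Process.setrlimit(Process::RLIMIT_CORE, new_soft, hard)
  new_soft
end

granted = enable_core_dumps(300 * 1024 * 1024)
puts "core dumps limited to #{granted} bytes"
```

If the hard limit is 0, this grants 0 and no core file is written; the hard limit then has to be raised outside the process, for example in the init script that starts it.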

Page 55: performance tweaks and tools for linux

def test
  require 'segv'
  4.times do
    Dir.chdir '/tmp' do
      Hash.new{ segv }[0]
    end
  end
end

test()

(gdb) where
#0 segv () at segv.c:7
#1 0x000000000041f2be in call_cfunc () at eval.c:5727
...
#13 0x000000000043ba8c in rb_hash_default () at hash.c:521
...
#19 0x000000000043b92a in rb_hash_aref () at hash.c:429
...
#26 0x00000000004bb7bc in chdir_yield () at dir.c:728
#27 0x000000000041d8d7 in rb_ensure () at eval.c:5528
#28 0x00000000004bb93a in dir_s_chdir () at dir.c:816
...
#35 0x000000000041c444 in rb_yield () at eval.c:5142
#36 0x0000000000450690 in int_dotimes () at numeric.c:2834
...
#48 0x0000000000412a90 in ruby_run () at eval.c:1678
#49 0x000000000041014e in main () at main.c:48

Page 56: performance tweaks and tools for linux

GDB.RB
gdb with MRI hooks

gem install gdb.rb
gdb.rb <pid>

github.com/tmm1/gdb.rb

Page 57: performance tweaks and tools for linux

(gdb) ruby eval 1+2
3

(gdb) ruby eval Thread.current
#<Thread:0x1d630 run>

(gdb) ruby eval Thread.list.size
8

Page 58: performance tweaks and tools for linux

(gdb) ruby threads list
0x15890 main thread  THREAD_STOPPED   WAIT_JOIN(0x19ef4)
0x19ef4      thread  THREAD_STOPPED   WAIT_TIME(57.10s)
0x19e34      thread  THREAD_STOPPED   WAIT_FD(5)
0x19dc4      thread  THREAD_STOPPED   WAIT_NONE
0x19dc8      thread  THREAD_STOPPED   WAIT_NONE
0x19dcc      thread  THREAD_STOPPED   WAIT_NONE
0x22668      thread  THREAD_STOPPED   WAIT_NONE
0x1d630 curr thread  THREAD_RUNNABLE  WAIT_NONE

Page 59: performance tweaks and tools for linux

(gdb) ruby objects
HEAPS 8
SLOTS 1686252
LIVE 893327 (52.98%)
FREE 792925 (47.02%)

scope     1641 (0.18%)
regexp    2255 (0.25%)
data      3539 (0.40%)
class     3680 (0.41%)
hash      6196 (0.69%)
object    8785 (0.98%)
array    13850 (1.55%)
string  105350 (11.79%)
node    742346 (83.10%)

Page 60: performance tweaks and tools for linux

(gdb) ruby objects strings
140 u'lib'
158 u'0'
294 u'\n'
619 u''

30503 unique strings
3187435 bytes

Page 61: performance tweaks and tools for linux

def test
  require 'segv'
  4.times do
    Dir.chdir '/tmp' do
      Hash.new{ segv }[0]
    end
  end
end

test()

(gdb) ruby threads

0xa3e000 main curr thread THREAD_RUNNABLE WAIT_NONE
  node_vcall segv in test_segv.rb:5
  node_call test in test_segv.rb:5
  node_call call in test_segv.rb:5
  node_call default in test_segv.rb:5
  node_call [] in test_segv.rb:5
  node_call test in test_segv.rb:4
  node_call chdir in test_segv.rb:4
  node_call test in test_segv.rb:3
  node_call times in test_segv.rb:3
  node_vcall test in test_segv.rb:9

Page 62: performance tweaks and tools for linux

rails_warden leak

(gdb) ruby objects classes
1197 MIME::Type
2657 NewRelic::MetricSpec
2719 TZInfo::TimezoneTransitionInfo
4124 Warden::Manager
4124 MethodOverrideForAll
4124 AccountMiddleware
4124 Rack::Cookies
4125 ActiveRecord::ConnectionAdapters::ConnectionManagement
4125 ActionController::Session::CookieStore
4125 ActionController::Failsafe
4125 ActionController::ParamsParser
4125 Rack::Lock
4125 ActionController::Dispatcher
4125 ActiveRecord::QueryCache
4125 ActiveSupport::MessageVerifier
4125 Rack::Head

middleware chain leaking per request

Page 63: performance tweaks and tools for linux

mongrel sleeper thread

0x16814c00 thread THREAD_STOPPED WAIT_TIME(0.47) 1522 bytes
  node_fcall sleep in lib/mongrel/configurator.rb:285
  node_fcall run in lib/mongrel/configurator.rb:285
  node_fcall loop in lib/mongrel/configurator.rb:285
  node_call run in lib/mongrel/configurator.rb:285
  node_call initialize in lib/mongrel/configurator.rb:285
  node_call new in lib/mongrel/configurator.rb:285
  node_call run in bin/mongrel_rails:128
  node_call run in lib/mongrel/command.rb:212
  node_call run in bin/mongrel_rails:281
  node_fcall (unknown) in bin/mongrel_rails:19

def run
  @listeners.each {|name,s| s.run }

  $mongrel_sleeper_thread = Thread.new { loop { sleep 1 } }
end

Page 64: performance tweaks and tools for linux

god memory leaks

(gdb) ruby objects arrays
elements instances
   94310 3
   94311 3
   94314 2
   94316 1
5369 arrays
2863364 member elements

many arrays with 90k+ elements!

5 separate god leaks fixed by Eric Lindvall with the help of gdb.rb!

 43 God::Process
 43 God::Watch
 43 God::Driver
 43 God::DriverEventQueue
 43 God::Conditions::MemoryUsage
 43 God::Conditions::ProcessRunning
 43 God::Behaviors::CleanPidFile
 45 Process::Status
 86 God::Metric
327 God::System::SlashProcPoller
327 God::System::Process
406 God::DriverEvent

Page 65: performance tweaks and tools for linux

BLEAK_HOUSE
ruby memory leak detector

ruby-bleak-house myapp.rb
bleak /tmp/bleak.<PID>.*.dump

github.com/fauna/bleak_house

Page 66: performance tweaks and tools for linux

• BleakHouse

• installs a patched version of ruby: ruby-bleak-house

• unlike gdb.rb, see where objects were created (file:line)

• create multiple dumps over time with `kill -USR2 <pid>` and compare to find leaks

191691 total objects
Final heap size 191691 filled, 220961 free
Displaying top 20 most common line/class pairs
89513 __null__:__null__:__node__
41438 __null__:__null__:String
 2348 ruby/site_ruby/1.8/rubygems/specification.rb:557:Array
 1508 ruby/gems/1.8/specifications/gettext-1.9.gemspec:14:String
 1021 ruby/gems/1.8/specifications/heel-0.2.0.gemspec:14:String
  951 ruby/site_ruby/1.8/rubygems/version.rb:111:String
  935 ruby/site_ruby/1.8/rubygems/specification.rb:557:String
  834 ruby/site_ruby/1.8/rubygems/version.rb:146:Array

Page 67: performance tweaks and tools for linux

IOPROFILE
summarizes strace and lsof

wget http://aspersa.googlecode.com/svn/trunk/ioprofile

Page 68: performance tweaks and tools for linux

strace -cp <pid>

ioprofile is a script that captures one sample of lsof output then starts strace for a specified amount of time.

after strace finishes, the results are processed.

below is an example that comes with ioprofile

$ ioprofile t/samples/ioprofile-001.txt
    total     pread      read    pwrite     write filename
10.094264 10.094264  0.000000  0.000000  0.000000 /data/data/abd_2dia/aia_227_228.ibd
 8.356632  8.356632  0.000000  0.000000  0.000000 /data/data/abd_2dia/aia_227_223.ibd
 0.048850  0.046989  0.000000  0.001861  0.000000 /data/data/abd/aia_instances.ibd
 0.035016  0.031001  0.000000  0.004015  0.000000 /data/data/abd/vo_difuus.ibd
 0.013360  0.000000  0.001723  0.000000  0.011637 /var/log/mysql/mysql-relay.002113
 0.008676  0.000000  0.000000  0.000000  0.008676 /data/data/master.info
 0.002060  0.000000  0.000000  0.002060  0.000000 /data/data/ibdata1
 0.001490  0.000000  0.000000  0.001490  0.000000 /data/data/ib_logfile1
 0.000555  0.000000  0.000000  0.000000  0.000555 /var/log/mysql/mysql-relay-log.info
 0.000141  0.000000  0.000000  0.000141  0.000000 /data/data/ib_logfile0
 0.000100  0.000000  0.000000  0.000100  0.000000 /data/data/abd/9fvus.ibd

Page 69: performance tweaks and tools for linux

strace -ttTp <pid> -o <file>

-c CELL
Specify what to put in the cells of the output: ‘times’, ‘count’, or ‘sizes’.

below is an example of -c sizes:

$ ioprofile -c sizes t/samples/ioprofile-001.txt
   total    pread   read pwrite  write filename
90800128 90800128      0      0      0 /data/data/abd_2dia/aia_227_223.ibd
52150272 52150272      0      0      0 /data/data/abd_2dia/aia_227_228.ibd
  999424        0      0 999424      0 /data/data/ibdata1
  638976   131072      0 507904      0 /data/data/abd/vo_difuus.ibd
  327680   114688      0 212992      0 /data/data/abd/aia_instances.ibd
  305263        0 149662      0 155601 /var/log/mysql/mysql-relay.002113
  217088        0      0 217088      0 /data/data/ib_logfile1
   22638        0      0      0  22638 /data/data/master.info
   16384        0      0  16384      0 /data/data/abd/9fvus.ibd
    1088        0      0      0   1088 /var/log/mysql/mysql-relay-log.info
     512        0      0    512      0 /data/data/ib_logfile0

Page 70: performance tweaks and tools for linux

/proc
lots of interesting things hiding in proc

Page 71: performance tweaks and tools for linux

• (as of linux 2.6.25)

• /proc/[pid]/pagemap - find out which physical frame each virtual page is mapped to and swap flag.

• /proc/kpagecount - stores number of times a particular physical page is mapped.

• /proc/kpageflags - flags about each page (locked, slab, dirty, writeback, ...)

• read more: Documentation/vm/pagemap.txt

/proc/[pid]/pagemap

Page 72: performance tweaks and tools for linux

• get a process’s stack trace

/proc/[pid]/stack

Page 73: performance tweaks and tools for linux

• lots of small bits of information

/proc/[pid]/status
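The status file is a list of "Key:<tab>value" lines, so it is simple to turn into a hash. A minimal sketch; the sample text is a hypothetical excerpt (the pid and RSS echo the ps output earlier in the deck), and the method name is made up:

```ruby
# Sketch: parse the "Key:\tvalue" lines of /proc/[pid]/status into a hash.
# Hypothetical excerpt; a real file has many more fields.
SAMPLE_STATUS = <<~OUT
  Name:\truby
  State:\tS (sleeping)
  Pid:\t23611
  VmRSS:\t    7540 kB
  Threads:\t1
OUT

def parse_proc_status(text)
  text.each_line.each_with_object({}) do |line, fields|
    key, value = line.split(":", 2)
    fields[key] = value.strip if value
  end
end

status = parse_proc_status(SAMPLE_STATUS)
puts "#{status['Name']} (pid #{status['Pid']}) rss=#{status['VmRSS']}"

# On Linux the same method can read the live file for the current process.
if File.readable?("/proc/self/status")
  live = parse_proc_status(File.read("/proc/self/status"))
  puts "this process: #{live['Name']} threads=#{live['Threads']}"
end
```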

Page 74: performance tweaks and tools for linux
Page 75: performance tweaks and tools for linux

• tell system where to output core dumps

• can also launch user program when core dumps happen.

• echo "|/path/to/core_helper.rb %p %s %u %g" > /proc/sys/kernel/core_pattern

• /proc/[pid]/ files are kept in place until your handler exits, so full state of the process at death may be inspected.

• http://gist.github.com/587443

/proc/sys/kernel/core_pattern
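A rough sketch of what a handler registered through core_pattern might look like. This is not the script from the gist; the file name core_helper.rb comes from the slide, while the method name, log path, and core file path are hypothetical. It assumes the argument order `%p %s %u %g` shown above and that the kernel pipes the core image to the handler's standard input:

```ruby
#!/usr/bin/env ruby
# Hypothetical core_helper.rb, invoked by the kernel as:
#   core_helper.rb <pid> <signal> <uid> <gid>

def describe_crash(argv)
  pid, signal, uid, gid = argv.map(&:to_i)
  name = Signal.signame(signal) || "unknown"
  "pid #{pid} (uid #{uid}, gid #{gid}) died from signal #{signal} (SIG#{name})"
end

if $PROGRAM_NAME == __FILE__ && ARGV.size == 4
  pid = ARGV[0]
  # /proc/<pid>/ is still present while the handler runs, so the dead
  # process can be inspected before the core is saved.
  cmdline = File.read("/proc/#{pid}/cmdline").tr("\0", " ").strip
  File.open("/tmp/core_helper.log", "a") do |log|   # hypothetical log path
    log.puts "#{describe_crash(ARGV)}: #{cmdline}"
  end
  # Save the core image arriving on stdin (hypothetical destination).
  File.open("/cores/core.#{pid}", "wb") { |f| IO.copy_stream($stdin, f) }
end
```

The handler runs outside your shell environment, so absolute paths throughout are the safer choice.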

Page 76: performance tweaks and tools for linux

and lots more

/proc/meminfo
/proc/scsi/*
/proc/net/*
/proc/sys/net/ipv4/*
. . .

Page 77: performance tweaks and tools for linux

MONITOR RAID STATUS

Page 78: performance tweaks and tools for linux

• how do you know when a hard drive in your RAID array fails?

• turns out there are some command line tools that most RAID vendors provide.

• adaptec - /usr/StorMan/arcconf getconfig 1 AL

• 3ware - /usr/bin/tw_cli

Page 79: performance tweaks and tools for linux

snooki:/# /usr/StorMan/arcconf getconfig 1 AL
Controllers found: 1
----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
   Controller Status : Optimal
   Channel description : SAS/SATA
   Controller Model : Adaptec 3405
   Controller Serial Number : 7C4911519E3
   Physical Slot : 2
   Temperature : 42 C/ 107 F (Normal)
   Installed memory : 128 MB
   Copyback : Disabled
   Background consistency check : Enabled
   Automatic Failover : Enabled
   Global task priority : High
   Stayawake period : Disabled
   Spinup limit internal drives : 0
   Spinup limit external drives : 0
   Defunct disk drive count : 0
   Logical devices/Failed/Degraded : 1/0/0
   --------------------------------------------------------
   Controller Version Information
   --------------------------------------------------------
   BIOS : 5.2-0 (17304)
   Firmware : 5.2-0 (17304)
   Driver : 1.1-5 (2461)
   Boot Flash : 5.2-0 (17304)
   --------------------------------------------------------
   Controller Battery Information
   --------------------------------------------------------
   Status : Optimal
   Over temperature : No
   Capacity remaining : 100 percent
   Time remaining (at current draw) : 3 days, 1 hours, 52 minutes

----------------------------------------------------------------------
Logical device information
----------------------------------------------------------------------
Logical device number 0
   Logical device name : RAID1-A
   RAID level : 1
   Status of logical device : Optimal
   Size : 239200 MB
   Read-cache mode : Enabled
   Write-cache mode : Enabled (write-back)
   Write-cache setting : Enabled (write-back) when protected by battery
   Partitioned : Unknown
   Protected by Hot-Spare : No
   Bootable : Yes
   Failed stripes : No
   Power settings : Disabled
   --------------------------------------------------------
   Logical device segment information
   --------------------------------------------------------
   Segment 0 : Present (0,0) WD-WCANY3977712
   Segment 1 : Present (0,1) WD-WCANY4141196

----------------------------------------------------------------------
Physical Device information
----------------------------------------------------------------------
   Device #0
      Device is a Hard drive
      State : Online
      Supported : Yes
      Transfer Speed : SATA 3.0 Gb/s
      Reported Channel,Device(T:L) : 0,0(0:0)
      Reported Location : Connector 0, Device 0
      Vendor : WDC
      Model : WD2500YS-01SHB1
      Firmware : 20.06C06
      Serial number : WD-WCANY3977712
      Size : 239372 MB
      Write Cache : Enabled (write-back)
      FRU : None
      S.M.A.R.T. : No
   Device #1
      Device is a Hard drive
      State : Online
      Supported : Yes
      Transfer Speed : SATA 3.0 Gb/s
      Reported Channel,Device(T:L) : 0,1(1:0)
      Reported Location : Connector 0, Device 1
      Vendor : WDC
      Model : WD2500YS-01SHB1
      Firmware : 20.06C06
      Serial number : WD-WCANY4141196
      Size : 239372 MB
      Write Cache : Enabled (write-back)
      FRU : None
      S.M.A.R.T. : No

Command completed successfully.

Page 80: performance tweaks and tools for linux

• write a script to parse that

• run it with cron

• the ugly ruby script i use for adaptec:

http://gist.github.com/643666
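For readers who cannot reach the gist, here is a separate minimal sketch of the idea (not the author's script). It checks a handful of fields from the arcconf output shown earlier; the method name and the choice of fields are assumptions, and the expected "healthy" values are taken from that sample output:

```ruby
# Sketch: flag unhealthy fields in `arcconf getconfig 1 AL` output.
# Abbreviated sample built from the output shown in this deck.
SAMPLE_ARCCONF = <<~OUT
  Controller Status                        : Optimal
  Defunct disk drive count                 : 0
  Logical devices/Failed/Degraded          : 1/0/0
  Status of logical device                 : Optimal
  Device #0
     State                                 : Online
  Device #1
     State                                 : Online
OUT

def raid_problems(arcconf_text)
  problems = []
  arcconf_text.each_line do |line|
    key, value = line.split(":", 2).map(&:strip)
    next unless value
    case key
    when "Controller Status", "Status of logical device"
      problems << "#{key}: #{value}" unless value == "Optimal"
    when "Defunct disk drive count"
      problems << "#{key}: #{value}" unless value == "0"
    when "State"
      problems << "drive state: #{value}" unless value == "Online"
    end
  end
  problems
end

problems = raid_problems(SAMPLE_ARCCONF)
puts problems.empty? ? "RAID OK" : problems.join("\n")
```

Run from cron, a script like this could print only when `problems` is non-empty, so that cron's mail-on-output behavior doubles as the alert.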

Page 81: performance tweaks and tools for linux

more well known tools:
iostat, vmstat, top, and free

Page 82: performance tweaks and tools for linux

LINUX TWEAKS

Page 83: performance tweaks and tools for linux

ADJUST TIMER FREQUENCY

Page 84: performance tweaks and tools for linux

CONFIG_HZ_100=y
CONFIG_HZ=100

• Set the timer interrupt frequency.

• Fewer timer interrupts means processes run with fewer interruptions.

• Servers (without interactive software) should have lower timer frequency.

Page 85: performance tweaks and tools for linux

CONNECTOR

Page 86: performance tweaks and tools for linux

CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y

• connector kernel module is useful for process monitoring.

• build or use a system like god to watch processes.

• when processes die the kernel notifies you

• you can restart/recover/etc.

Page 87: performance tweaks and tools for linux

TCP SEGMENTATION OFFLOADING

Page 88: performance tweaks and tools for linux

sudo ethtool -K eth1 tso on

• Allows kernel to offload large packet segmentation to the network adapter.

• Frees the CPU to do more useful work.

• After running the command above, verify with:

[joe@timetobleed]% dmesg | tail -1

[892528.450378] 0000:04:00.1: eth1: TSO is Enabled

Page 89: performance tweaks and tools for linux

http://kerneltrap.org/node/397

Tx/Rx TCP file send long (bi-directional Rx/Tx):

w/o TSO: 1500Mbps, 82% CPU

w/ TSO: 1633Mbps, 75% CPU

Tx TCP file send long (Tx only):

w/o TSO: 940Mbps, 40% CPU

w/ TSO: 940Mbps, 19% CPU

Page 90: performance tweaks and tools for linux

INTEL I/OAT DMA ENGINE

Page 91: performance tweaks and tools for linux

CONFIG_DMADEVICES=y
CONFIG_INTEL_IOATDMA=y
CONFIG_DMA_ENGINE=y
CONFIG_NET_DMA=y

• these options enable Intel I/OAT DMA engine present in recent Xeon CPUs.

• increases throughput because kernel can offload network data copying to the DMA engine.

• CPU can do more useful work.

• statistics about savings can be found in sysfs: /sys/class/dma/

Page 92: performance tweaks and tools for linux

check if I/OAT is enabled

[joe@timetobleed]% dmesg | grep ioat

ioatdma 0000:00:08.0: setting latency timer to 64

ioatdma 0000:00:08.0: Intel(R) I/OAT DMA Engine found, 4 channels, device version 0x12, driver version 3.64

ioatdma 0000:00:08.0: irq 56 for MSI/MSI-X

Page 93: performance tweaks and tools for linux

DIRECT CACHE ACCESS

Page 94: performance tweaks and tools for linux

CONFIG_DCA=y

• I/OAT includes Direct Cache Access (DCA)

• DCA allows a driver to warm CPU cache.

• Requires driver and device support.

• Intel 10GbE driver (ixgbe) supports this feature.

• Must enable this feature in the BIOS.

• Some vendors hide BIOS option so you will need a hack to enable DCA.

Page 96: performance tweaks and tools for linux

THROTTLE NIC INTERRUPTS

Page 97: performance tweaks and tools for linux

insmod e1000e.ko InterruptThrottleRate=1

• SOME drivers allow you to specify the interrupt throttling algorithm.

• e1000e is one of these drivers.

• Two dynamic throttling algorithms: dynamic (1) and dynamic conservative (3).

• The difference is the interrupt rate for “Lowest Latency” traffic.

• Algorithm 1 is more aggressive for this traffic class.

• Read driver documentation for more information.

• Be careful to avoid receive livelock.

Page 98: performance tweaks and tools for linux

PROCESS AFFINITY

Page 99: performance tweaks and tools for linux

Process Affinity

• Linux allows you to set which CPU(s) a process may run on.

• For example, set PID 123 to CPUs 4-6:

# taskset -pc 4,5,6 123

Page 100: performance tweaks and tools for linux

IRQ AFFINITY

Page 101: performance tweaks and tools for linux

IRQ Affinity

• NIC, disk, etc IRQ handlers can be set to execute on specific processors.

• The IRQ to CPU map can be found at:

/proc/interrupts

• Individual IRQs may be set in the file:

/proc/irq/[IRQ NUMBER]/smp_affinity
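The smp_affinity file takes a hexadecimal CPU bitmask rather than a CPU list, with bit N standing for CPU N. A small sketch of the conversion (the method name is illustrative; machines with more than 32 CPUs write the mask as comma-separated 32-bit groups, which this sketch does not produce):

```ruby
# Sketch: build the hex bitmask that smp_affinity expects from a list of CPUs.
def cpu_mask(cpus)
  cpus.reduce(0) { |mask, cpu| mask | (1 << cpu) }.to_s(16)
end

puts cpu_mask([4, 5, 6])  # bits 4, 5 and 6 set
puts cpu_mask([0])        # CPU 0 only
```

Writing the result is done as root, along the lines of `echo 70 > /proc/irq/[IRQ NUMBER]/smp_affinity`, with the IRQ number looked up in /proc/interrupts.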

Page 102: performance tweaks and tools for linux

IRQ affinity

• Can pin IRQ handlers for devices to specific CPUs.

• Can then use taskset to pin important processes to other CPUs.

• The result is NIC and disk will not interrupt important processes running elsewhere.

• Can also help preserve CPU caches.

Page 103: performance tweaks and tools for linux

irqbalance

• http://www.irqbalance.org

• “irqbalance is a Linux* daemon that distributes interrupts over the processors and cores you have in your computer system.”

Page 104: performance tweaks and tools for linux

OPROFILE

Page 105: performance tweaks and tools for linux

CONFIG_OPROFILE=y
CONFIG_HAVE_OPROFILE=y

• oprofile is a system wide profiler that can profile both kernel and application level code.

• oprofile has a kernel driver which collects data from CPU registers (on x86 these are MSRs).

• oprofile can also annotate source code with performance information.

• this can help you find and fix performance problems.

Page 106: performance tweaks and tools for linux

LATENCYTOP

Page 107: performance tweaks and tools for linux

CONFIG_LATENCYTOP=y

• LatencyTOP helps you understand what caused the kernel to block a process.

Page 108: performance tweaks and tools for linux

ejpphoto (flickr)

nasty bugs

Page 109: performance tweaks and tools for linux

codefatboyke (flickr)

Page 110: performance tweaks and tools for linux

37prime (flickr)

memory bloat

Page 111: performance tweaks and tools for linux

TOOLS

Page 112: performance tweaks and tools for linux

TWEAKS

Page 113: performance tweaks and tools for linux

Thanks. (spasiba?)

@joedamato
timetobleed.com