vertical scaling & performance tuning
TRANSCRIPT
-
8/10/2019 Vertical Scaling & Performance Tuning
Vertical Scaling & Performance Tuning
-
Chapter 1
introductory = boring
-
Scalability
Handle a growing amount of work in a capable manner and accommodate maximum growth
(Wikipedia)
-
Vertically
Add more resources to a single node
(Wikipedia)
-
Performance Tuning
Tune / modify a system to handle a higher load
(Wikipedia)
-
Performance Tuning - Vertically
Fully optimise all available resources for the maximum possible load
(anonymous)
-
Horizontally?
When vertical scaling + tuning are already maximised
(anonymous)
-
Chapter 2
vertical scaling
CPU · Core
still boring but essential
-
CPU
• CPU utilisation depends on the resources being accessed
• the Linux kernel has a scheduler, and the scheduler gives priorities to the different resources
• it schedules two kinds of resources:
  • threads
  • interrupts
-
CPU
• Scheduler priorities:
  • Hardware interrupts (highest priority): raised by hardware on the system to process data
    • e.g.:
      • by a disk when it has completed an I/O transaction
      • by a NIC when a packet has been received
-
CPU
• Scheduler priorities:
  • Soft interrupts (softirq) - related to maintenance of the kernel itself
  • Real-time threads - parallel processing / real-time programming
  • Kernel threads - all kernel processing
  • User threads - a.k.a. userland. All software applications run in user space, at the lowest priority of all
-
Cores
• Linux views each core of an n-way hyperthreaded processor as an independent processor
• a dual-core processor = two individual processors
-
Context Switches
• e.g. 20k threads in hand
• to keep it simple: only one CPU / single processor / core
• all 20k threads need to be scheduled and balanced; no thread stays scheduled / executing forever, so each thread:
  • is allotted a time quantum to spend on the processor
  • once it passes its allotted time, or is pre-empted by something of higher priority:
    • the thread is placed back in the queue
    • the higher-priority / next-in-queue thread is placed on the processor
• this switching of threads = context switch
-
The Run Queue
• each CPU maintains a run queue of threads
• process threads are either:
  • runnable
  • in a sleep state (blocked and waiting for I/O)
• the more heavily the CPU is utilised:
  • the larger the run queue
  • the longer it takes for process threads to execute
-
The Run Queue
• Load
  • describes the state of the run queue
• System load
  • equal to the number of process threads currently executing + the number of threads in the CPU run queue
• the top command reports load averages over the course of 1, 5, and 15 minutes
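A minimal sketch of reading the same load averages programmatically, assuming the usual Linux /proc/loadavg layout (the three averages come first). The function names and the sample line are illustrative and not from the slides.

```python
def parse_loadavg(text):
    """Return the (1 min, 5 min, 15 min) load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(value) for value in fields[:3])


def load_per_cpu(load, ncpus):
    """Normalise a load figure by the number of processors Linux sees."""
    return load / ncpus


# Illustrative sample line, not captured from a real system.
sample = "0.52 0.58 0.59 1/389 12345"
one_min, five_min, fifteen_min = parse_loadavg(sample)
```

On a Linux box the same function can be fed `open("/proc/loadavg").read()`; dividing by the core count makes figures from differently sized machines comparable.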
-
CPU Utilisation
• defined as the percentage of usage of a CPU
• CPU utilisation mostly falls under the following categories:
  • User time: the percentage of time the CPU spends executing threads in user space
  • System time: the percentage of time the CPU spends executing kernel threads and interrupts
-
CPU Utilisation
• Wait I/O: the percentage of time a CPU spends idle because all process threads are blocked waiting for I/O requests to complete
• Idle: the percentage of time a processor spends in a completely idle state
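These four categories can be derived from two samples of the aggregate "cpu" line in /proc/stat. The sketch below assumes the common field order (user, nice, system, idle, iowait, irq, softirq, steal); folding nice into user time and irq/softirq into system time is a simplification chosen here, and the sample lines are made up.

```python
def cpu_times(line):
    """Return the first eight counters of a /proc/stat 'cpu' line as ints."""
    parts = line.split()
    # Assumed order: user nice system idle iowait irq softirq steal
    return [int(value) for value in parts[1:9]]


def utilisation(before, after):
    """Percentage of time per category between two 'cpu' line samples."""
    deltas = [b - a for a, b in zip(cpu_times(before), cpu_times(after))]
    user, nice, system, idle, iowait, irq, softirq, steal = deltas
    total = sum(deltas)
    return {
        "user": 100.0 * (user + nice) / total,
        "system": 100.0 * (system + irq + softirq) / total,
        "iowait": 100.0 * iowait / total,
        "idle": 100.0 * idle / total,
    }


# Made-up samples taken some interval apart.
before = "cpu  100 0 50 800 50 0 0 0 0 0"
after = "cpu  165 0 80 803 52 0 0 0 0 0"
result = utilisation(before, after)
```

The counters are cumulative since boot, which is why two samples are needed: a single reading only gives the average since the machine started.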
-
Time Slicing
• a numeric value that represents how long a task can run until it is pre-empted
• the scheduler policy dictates the default time slice
• time slice too long = poor interactive performance
• time slice too short = a significant amount of processor time is wasted on the overhead of switching from one process to another (context switching)
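The trade-off can be put into rough numbers. This is a back-of-the-envelope sketch only: the formulas are a simple model chosen for illustration, and the 5 microsecond switch cost is a hypothetical figure, not a measurement.

```python
def switch_overhead(timeslice_us, switch_cost_us):
    """Fraction of processor time lost to context switching."""
    return switch_cost_us / (timeslice_us + switch_cost_us)


def worst_case_wait_us(runnable_threads, timeslice_us):
    """Longest wait before a thread runs again under plain round-robin."""
    return (runnable_threads - 1) * timeslice_us


# Hypothetical switch cost of 5 microseconds.
short_slice = switch_overhead(timeslice_us=5, switch_cost_us=5)   # half the CPU wasted
longer_slice = switch_overhead(timeslice_us=95, switch_cost_us=5)  # 5% wasted
# 20 runnable threads with a 100 ms slice: up to 1.9 s before a thread runs again.
wait = worst_case_wait_us(runnable_threads=20, timeslice_us=100_000)
```

A short slice pushes the overhead fraction up, while a long slice pushes the worst-case wait up, which is the interactive-performance side of the trade-off.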
-
Load Average
top command
-
Chapter 3
vertical scaling
CPU Performance Monitoring
hooray - hands-on lab exercise
-
CPU Performance Monitoring
• a matter of interpreting the performance of:
  • the run queue
  • utilisation
  • context switching
-
CPU Performance Monitoring
• General expectations:
  • Run queues: a run queue should have no more than 1 - 3 threads queued per processor
    • e.g. a dual-processor system should not have more than 6 threads in the run queue
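A small sketch of that rule of thumb. It assumes the runnable count is taken from the `procs_running` line of /proc/stat (which also counts threads currently on a CPU); the function names and the sample text are illustrative.

```python
def procs_running(stat_text):
    """Extract the runnable-thread count from /proc/stat text."""
    for line in stat_text.splitlines():
        if line.startswith("procs_running "):
            return int(line.split()[1])
    raise ValueError("procs_running line not found")


def run_queue_ok(runnable, ncpus, max_per_cpu=3):
    """Apply the 'no more than 3 threads per processor' expectation."""
    return runnable <= max_per_cpu * ncpus


# Made-up excerpt of /proc/stat.
sample_stat = "cpu  1 2 3 4 5 6 7 8 0 0\nprocs_running 7\nprocs_blocked 0\n"
runnable = procs_running(sample_stat)
```

The `r` column of vmstat reports a comparable figure and may be the more convenient source during a lab exercise.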
-
CPU Performance Monitoring
• General expectations:
  • CPU utilisation: if a CPU is fully utilised, then ideally the following balance of utilisation should be achieved:
    • 65% - 70%: user time
    • 30% - 35%: system time
    • 0% - 5%: idle time
-
CPU Performance Monitoring
• General expectations:
  • Context switches
    • a high number of context switches is acceptable if:
      • CPU utilisation stays within the previously mentioned balance
-
Context Switches
Where do you look to find how many context switches a known process makes?
(wta)
-
Context Switches
Where do you look to find the total number of context switches on your Linux box?
(wta)
-
Context Switches
• cat /proc/$pid/status | grep ctxt
• /usr/bin/time -v ls 2>&1 | grep context
• vmstat
(wta)
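The same two lookups can be sketched in code: per-process counts from the `voluntary_ctxt_switches` / `nonvoluntary_ctxt_switches` lines of /proc/$pid/status, and the system-wide total from the `ctxt` line of /proc/stat. The helper names and sample texts are illustrative.

```python
def process_ctxt_switches(status_text):
    """Return the context-switch counters found in /proc/<pid>/status text."""
    counts = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.endswith("ctxt_switches"):
            counts[key] = int(value)
    return counts


def total_ctxt_switches(stat_text):
    """Return the system-wide context-switch total from /proc/stat text."""
    for line in stat_text.splitlines():
        if line.startswith("ctxt "):
            return int(line.split()[1])
    return None


# Made-up excerpts of the two files.
status_sample = "Name:\tbash\nvoluntary_ctxt_switches:\t150\nnonvoluntary_ctxt_switches:\t12\n"
stat_sample = "cpu  1 2 3 4 5 6 7 8 0 0\nctxt 987654\nprocs_running 1\n"
per_process = process_ctxt_switches(status_sample)
system_total = total_ctxt_switches(stat_sample)
```

Both counters are cumulative, so a rate (as shown in the `cs` column of vmstat) comes from sampling twice and dividing the difference by the interval.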
-
Performance Monitoring Tools
• must be a low-overhead tool
• still practical to keep running on a heavily loaded system
• able to monitor the health of the system at a glance
-
Performance Monitoring Tools
• vmstat
-
Performance Monitoring Tools
• mpstat
-
Performance Monitoring Tools
• top
-
Case Study 1
-
Case Study 2
-
Case Study 3
-
CPU Performance Tuning
• no configurable / tunable parameters for kernel 2.6
• use ps / top / vmstat / mpstat
• familiarise yourself with the system
• find the offending applications
• move them to another, better server / system
-
Chapter 4
vertical scaling
Memory
meh - another theory
-
Virtual Memory
• uses disk as an extension of RAM
• the kernel writes unused blocks of memory to disk so that the memory can be used for another purpose
• when needed again, they are read back into memory
• reading from and writing to disk is slower
• a.k.a. swap
-
Virtual Memory Pages
• virtual memory is divided into pages
• on the x86 architecture, a VM page = 4 KB
• when the kernel writes memory to disk, it writes it in pages
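A quick worked example of the page arithmetic, assuming the 4 KB (4096-byte) page size stated above; `pages_needed` is an illustrative helper name.

```python
PAGE_SIZE = 4096  # bytes; the 4 KB x86 page size from the slide


def pages_needed(nbytes, page_size=PAGE_SIZE):
    """Number of whole pages required to hold nbytes (rounded up)."""
    return (nbytes + page_size - 1) // page_size


one_byte = pages_needed(1)              # even a single byte occupies a full page
one_mib = pages_needed(1024 * 1024)     # 1 MiB spans 256 pages
just_over = pages_needed(4097)          # one byte past a page boundary needs 2 pages
```

The page size of a running system can be checked with `os.sysconf("SC_PAGE_SIZE")` on Unix-like platforms rather than assumed.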
-
Virtual Size and Resident Set Size
• when an application starts, it requests a virtual memory size (VSZ)
• the kernel either grants or denies the virtual memory request
• as the application uses the requested memory, that memory is mapped into physical memory
• RSS is the amount of virtual memory that is physically mapped into memory
• in most cases an application's RSS is smaller than the amount it requested (VSZ)
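A sketch of comparing the two figures for one process, assuming the `VmSize` and `VmRSS` lines of /proc/$pid/status (reported in kB). The function name and sample text are illustrative.

```python
def vsz_and_rss_kb(status_text):
    """Return (VSZ, RSS) in kB from /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            values[key] = int(rest.split()[0])
    return values["VmSize"], values["VmRSS"]


# Made-up excerpt of a status file.
status_sample = "Name:\tjava\nVmSize:\t  123456 kB\nVmRSS:\t    7890 kB\n"
vsz, rss = vsz_and_rss_kb(status_sample)
resident_fraction = rss / vsz
```

`ps -o vsz,rss -p $pid` shows the same pair from the command line.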
-
Virtual Size and Resident Set Size
-
Paging and Swapping
• system paging is a normal activity
• if the system is low on RAM:
  • first the kernel attempts to write pages to the swap device to free RAM
  • if the kernel can't free enough RAM in time, it swaps out whole processes
• paging takes single memory pages
• swapping takes the entire memory region associated with certain processes and writes it to the swap device
-
Kernel Paging
• when pages in memory are modified by running processes, they become dirty
• dirty pages must be written back either to swap or to disk
• pdflush: the daemon responsible for syncing pages associated with a file on a filesystem back to disk
• when a file is modified in memory, the pdflush daemon writes it back to disk
• by default pdflush starts writing back to disk when 10% of the pages in memory are dirty:
  • cat /proc/sys/vm/dirty_background_ratio
-
Kernel Swapping
• the kswapd daemon is responsible for freeing memory in the event of a memory shortage
• kswapd scans memory pages and performs the following actions when free memory drops below its threshold:
  • if a page is unmodified, it places the page on the free list
  • if a page is modified and backed by a filesystem, it writes the contents of the page to disk
  • if a page is modified and not backed by a filesystem, it writes the contents of the page to the swap device
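The three cases above amount to a small decision table. The function below is a toy model of that table only; it is not how the kernel is implemented, and the name and return strings are made up for illustration.

```python
def kswapd_action(modified, file_backed):
    """Toy model of what happens to a scanned page, per the three cases above."""
    if not modified:
        return "place on free list"
    if file_backed:
        return "write contents to disk"
    return "write contents to swap device"


clean_page = kswapd_action(modified=False, file_backed=True)
dirty_file_page = kswapd_action(modified=True, file_backed=True)
dirty_anon_page = kswapd_action(modified=True, file_backed=False)
```

Only the last case produces swap traffic, which is why sustained swap-out activity in vmstat points at memory that has no file behind it.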
-
Memory Performance Monitoring Tool
• vmstat
-
Swappiness
• lets you decide how quickly the VM should reclaim mapped pages, rather than just trying to flush out dirty pages
• the algorithm is based on a combination of:
  • the percentage of the inactive list scanned
  • the amount of total system memory mapped
  • and the swappiness value
• /proc/sys/vm/swappiness
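A sketch of how those three inputs combine, assuming the heuristic used in 2.6-era kernels (mapped ratio halved, plus a "distress" term that grows as reclaim has to scan harder, plus swappiness, with mapped pages reclaimed once the sum reaches 100). Later kernels work differently, and the function and parameter names here are illustrative.

```python
def swap_tendency(mapped_ratio, distress, swappiness=60):
    """Assumed 2.6-era heuristic: mapped_ratio / 2 + distress + swappiness."""
    # mapped_ratio: percentage of total memory that is mapped (0-100)
    # distress: 0 when reclaim is easy, rising towards 100 under pressure
    # swappiness: value of /proc/sys/vm/swappiness (default 60)
    return mapped_ratio // 2 + distress + swappiness


def reclaims_mapped(mapped_ratio, distress, swappiness=60):
    """Mapped pages become reclaim candidates once the tendency reaches 100."""
    return swap_tendency(mapped_ratio, distress, swappiness) >= 100


relaxed = reclaims_mapped(mapped_ratio=40, distress=0)                   # 20 + 0 + 60 = 80
pressured = reclaims_mapped(mapped_ratio=80, distress=0)                 # 40 + 0 + 60 = 100
low_swappiness = reclaims_mapped(mapped_ratio=80, distress=0, swappiness=0)  # 40
```

Under this model, lowering swappiness means mapped pages are only touched once memory is heavily mapped or reclaim is in distress.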
-
Chapter 5
vertical scaling
Monitoring and Analysis Tools
-
Monitoring / Analysis Tools
• uptime
  • shows load average
  • only useful as a clue - use other tools to investigate
• top
  • system-wide and per-process summaries
• mpstat
  • mpstat -P ALL 1
  • check for hot threads, unbalanced workloads
-
Monitoring / Analysis Tools
• iostat
  • disk I/O statistics
  • the first output is a summary of stats since boot
• vmstat
  • virtual memory statistics and other high-level summaries
• free
  • memory usage summary
  • buffers: block device I/O cache
  • cached: virtual page cache
-
Monitoring / Analysis Tools
• dstat
  • a better vmstat-like tool
• sar
  • system activity reporter
  • e.g. paging statistics (-B):
    • sar -B 1
• netstat -s
  • network protocol statistics
• pidstat
  • monitors individual tasks managed by the Linux kernel
-
Monitoring / Analysis Tools
• strace
  • system call tracer
• blktrace
  • block device I/O event tracing
• iotop
  • disk I/O by process
• slabtop
  • kernel slab cache information in real time
  • shows where kernel memory is consumed
  • e.g. slabtop -sc
-
Monitoring / Analysis Tools
• sysctl
  • system settings
• /proc
  • source of statistics
  • e.g. cat /proc/meminfo
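Because /proc files are plain text, they are easy to consume from a script. A minimal sketch of parsing /proc/meminfo into a dictionary follows; the function name and sample text are illustrative, and most (not all) entries are in kB.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its integer value (kB for most entries)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields:
            info[key.strip()] = int(fields[0])
    return info


# Made-up excerpt of /proc/meminfo.
meminfo_sample = (
    "MemTotal:       16384000 kB\n"
    "MemFree:         2048000 kB\n"
    "Buffers:          512000 kB\n"
    "Cached:          4096000 kB\n"
    "Dirty:              1024 kB\n"
)
info = parse_meminfo(meminfo_sample)
# Memory that is free or readily reclaimable from the caches that `free` reports.
free_plus_caches_kb = info["MemFree"] + info["Buffers"] + info["Cached"]
```

On a real system the input would be `open("/proc/meminfo").read()`.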
-
Monitoring / Analysis Tools
• perf
  • profiling and tracing tool with numerous subcommands:
-
Monitoring / Analysis Tools
• perf stat - key performance counter summary:
  • e.g. perf stat bzip2 bigfile1
• perf list - list available events
  • e.g. perf list | grep Hardware
• $ perf stat -e instructions,L1-dcache-load-misses bzip2 bigfile
-
Monitoring / Analysis Tools
• perf record: profiling / sampling CPU activity
  • e.g. perf record -a -g sleep 10
    • -a: all CPUs
    • -g: call stacks
    • sleep 10: duration to sample, via a dummy command (10 seconds)
  • generates a perf.data file
• perf report - view the samples in interactive mode
• perf report --stdio: non-interactive mode
-
Monitoring / Analysis Tools
• perf probe: dynamic tracing - define custom probes in the kernel
  • perf probe --add=tcp_sendmsg
  • perf record -e probe:tcp_sendmsg -aR -g sleep 5
  • perf report
-
Exploiting Linux SMP for Net Performance
• watch -d -n 1 grep NET /proc/softirqs
• /sys/class/net/[net interface]/queues
  • ls
• RPS & XPS
  • Receive Packet Steering
  • Transmit Packet Steering
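RPS and XPS are configured by writing a hexadecimal CPU bitmask to files such as `queues/rx-0/rps_cpus` and `queues/tx-0/xps_cpus` under the interface directory. The helper below is a sketch for building that mask; its name is illustrative, and it assumes at most 32 CPUs (larger masks are written as comma-separated 32-bit groups, which this sketch does not produce).

```python
def cpus_to_mask(cpus):
    """Build the hex bitmask string for a list of CPU numbers (bit n = CPU n)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


first_four = cpus_to_mask([0, 1, 2, 3])   # CPUs 0-3
odd_pair = cpus_to_mask([1, 3])           # CPUs 1 and 3
disabled = cpus_to_mask([])               # a zero mask leaves steering off
```

The resulting string is what would be echoed into the chosen queue's `rps_cpus` file; watching /proc/softirqs afterwards shows whether NET_RX work has spread across the selected CPUs.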
-
Performance Tuning Stacks
vertical scaling
-
Stress Load Tool
• a load test must address the following issues:
  • it is representative of what users are doing (or are expected to do)
  • it is balanced in the same proportions, to mimic end user behavior
-
Stress Load Tools
• user logs in
• reads the disclaimer - thinking period
• clicks the first form
• takes 1 min to fill in the form manually
• ajax request - queries the DB, automatically fills in some fields
• submit
• logout
-
Stress Load Tools
• in order to mimic user behavior, a stress test tool must be able to:
  • record all user activities + behavior
  • distribute workers across multiple instances
-
Stress Load Tools
• tsung
  • stress testing tool
  • written in Erlang
  • GPL
• ab / siege
  • simple standalone tools
  • for quick stress load tests
  • monitor changes in tuned / configured parameters
-
Load Monitoring
• real-time load monitoring during the stress test
• understand behavior patterns
• easily digest / query information in order to understand certain issues
-
Load Monitoring
• ELK Stack
  • Elasticsearch
  • Logstash
  • Kibana
-
Load Monitoring