Resource Management: Beancounters
Pavel [email protected]
Denis [email protected]
Kirill [email protected]
Agenda
  Current state of resource management in the Linux kernel
  Beancounters overview
  User memory management
  I/O accounting
  Kernel memory management
  Network buffers accounting
  Performance
Current state
Per-process accounting and limiting (rlimits)
  Manages individual processes
  Memory limits are mostly ignored by the kernel
Group-based management
  Absent
Global statistics
  Not suitable for group isolation
Operating system resources
  Memory
  CPU time
  IO bandwidth
  Networking bandwidth
  Disk space
Beancounters basics
A beancounter manages a group of tasks
Resource counter parameters
  held – the current consumption level
  limit – the maximal allowed level of consumption
  barrier – the "shortage warning" line; each resource controller may take some precautions
  fails – the number of rejected allocations
A beancounter is assigned once during the process lifetime
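The four counter parameters can be read as a small piece of bookkeeping. The following is a minimal sketch under stated assumptions, not the kernel implementation: the class name `ResourceCounter`, the methods `charge`, `uncharge` and `above_barrier`, and the rule that a charge is rejected only when it would exceed `limit` are all hypothetical. The slides do not say which precautions a controller takes at the barrier, so the sketch only reports that the line has been crossed.

```python
from dataclasses import dataclass


@dataclass
class ResourceCounter:
    """One resource of a beancounter (hypothetical model, not kernel code)."""
    barrier: int
    limit: int
    held: int = 0
    fails: int = 0

    def charge(self, amount: int) -> bool:
        """Try to account `amount` units; return False if the charge is rejected."""
        if self.held + amount > self.limit:   # assumed rejection rule
            self.fails += 1                   # count the rejected allocation
            return False
        self.held += amount
        return True

    def above_barrier(self) -> bool:
        """True once consumption has crossed the "shortage warning" line."""
        return self.held > self.barrier

    def uncharge(self, amount: int) -> None:
        """Give back `amount` units, never dropping below zero."""
        self.held = max(0, self.held - amount)


# Example: a "number of tasks" resource with room for 10 tasks.
numtasks = ResourceCounter(barrier=8, limit=10)
for _ in range(12):                           # try to create 12 tasks
    numtasks.charge(1)
print(numtasks.held, numtasks.fails, numtasks.above_barrier())
```

With these numbers the first ten charges succeed and the last two are rejected, so the script prints `10 2 True`.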
Accounting details
[Diagram labels: Process (user space, kernel space), Beancounter, kernel object]
Beancounters controlled resources
User memory
  Length of mappings
  RSS
  Locked pages
Dirty page cache
Kernel memory
Network buffers
Miscellaneous resources
  Number of tasks
  Number of files
  Number of sockets
  Number of file locks
  Number of PTYs
  Number of signals
  Active dentry cache
User memory management
VMA lengths accounting
  Graceful rejects of VM region allocation
  Take precautions against overcommitment
RSS accounting
  Real memory usage
  OOM killer priorities
Dirty page cache accounting
  IO statistics and scheduling
VMA lengths accounting
VMAs classification
  unreclaimable: private and anonymous
  reclaimable: shared file mappings
Pages classification
  unused: parts of mapped regions
  used: touched pages
[Diagram: task address space divided into unreclaimable and reclaimable VMAs, each holding used and unused pages; labelled with the “Lengths of mappings” resource and the “RSS” resource]
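The two classifications are independent: one sorts VMAs, the other sorts the pages inside them. A minimal sketch of that bookkeeping follows. The `Vma` type, its fields and the `summarize` function are hypothetical, lengths are counted in pages, and treating every mapping that is not a shared file mapping as unreclaimable is a simplifying assumption the slides do not spell out.

```python
from dataclasses import dataclass


@dataclass
class Vma:
    """A mapped region, measured in pages (hypothetical model)."""
    length: int          # pages covered by the mapping
    touched: int         # pages actually touched so far
    shared_file: bool    # True for a shared file mapping


def summarize(vmas):
    """Split an address space by VMA class and by page class."""
    totals = {"unreclaimable": 0, "reclaimable": 0, "used": 0, "unused": 0}
    for vma in vmas:
        # Assumption: anything that is not a shared file mapping is unreclaimable.
        kind = "reclaimable" if vma.shared_file else "unreclaimable"
        totals[kind] += vma.length
        totals["used"] += vma.touched
        totals["unused"] += vma.length - vma.touched
    return totals


address_space = [
    Vma(length=256, touched=40, shared_file=False),   # private anonymous region
    Vma(length=64, touched=64, shared_file=True),     # shared file mapping
]
print(summarize(address_space))
```

Here the private region contributes 256 pages of mapping length but only 40 used pages, which is the gap between what VMA length accounting and RSS accounting each observe.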
VMA lengths accounting pros'n'cons
Pros
  A way to track the host commitment level
  Graceful rejects of address space growth
Cons
  Hard limiting of address space growth
RSS accounting
[Diagram labels: first touch, N touches; page, page beancounter, beancounter]
Drawbacks
  Additional pointer on the struct page
  Extra locking during page faults
Shared pages accounting
Account the page to the first beancounter
  Non-uniform statistics for similar beancounters
Account a whole page for each beancounter
  The values accounted are not related to the actual memory usage
Account fractions of the page to all the beancounters
  The “middle” way used in the beancounters
Page fractions accounting
[Diagram: one page shared by BC1, BC2, BC3 and BC4; share labels 1, ½, ½, ¼, ¼, ¼, ¼]
Algorithm benefits
  O(1) algorithm of adding and removing
  The sum of RSS over all beancounters equals the number of pages actually in use
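The property that per-beancounter shares of a page always add up to exactly one page can be illustrated with a small model. This is a sketch under assumptions, not the beancounters algorithm itself: the `SharedPage` class, the rule that a newcomer takes half of the share of the holder at the front of a queue, and the rule that a departing holder's share goes to the current front holder are all hypothetical choices made to keep the example short.

```python
from collections import deque
from fractions import Fraction


class SharedPage:
    """Tracks which beancounters share one page and in what fractions."""

    def __init__(self):
        self.share = {}        # beancounter name -> Fraction of the page
        self.queue = deque()   # holders, in the order they will donate

    def add(self, bc):
        """Account the page to a beancounter that is not yet a holder."""
        if not self.queue:
            self.share[bc] = Fraction(1)      # sole user owns the whole page
        else:
            donor = self.queue[0]
            half = self.share[donor] / 2      # split the donor's share in two
            self.share[donor] = half
            self.share[bc] = half
            self.queue.rotate(-1)             # donor moves to the back
        self.queue.append(bc)

    def remove(self, bc):
        """Drop a holder and hand its share to the holder at the front."""
        freed = self.share.pop(bc)
        self.queue.remove(bc)                 # linear in this toy model
        if self.queue:
            self.share[self.queue[0]] += freed


page = SharedPage()
for name in ("BC1", "BC2", "BC3", "BC4"):
    page.add(name)
    print(name, "added:", {k: str(v) for k, v in page.share.items()})
```

Adding a holder touches only the front of the queue, so it is constant-time even in this toy form. The `deque.remove` call is linear here; a constant-time removal would need each holder to keep a direct link to its queue position.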
Dirty page cache accounting
[Diagram labels: first touch, N touches, dirty, unmap, last unmap, clean; IO beancounter]
RSS accounting pros'n'cons
Pros
  Node memory utilization statistics
  Asynchronous IO scheduling
  Ground for fair page reclamation
Cons
  Performance issues
  Memory consumption by auxiliary data structures
Kernel memory management
Reason
  Limited normal zone
  Mainly for 32-bit arches
Major problem
  Object freeing context
    Reference counters
    RCU
Kernel MM data structures (pages)
Buddy page allocator
  Additional pointer on the struct page
Vmalloc
  0th page's pointer
[Diagram labels: struct vm_struct, page, ...]
Kernel MM data structures (slab)
Array of pointers after the slab
[Diagram labels: struct slab, kmem_bufctl_t[N], N objects, beancounters]
Kernel MM drawbacks
  A slab can carry fewer objects
  Slabs could become “offslab”

Slab name    # of objects      Offslab-ness
             Before  After     Before  After
Size-32      113     101       –       –
Size-64      59      56        –       –
Size-128     30      29        –       –
Size-256     15      15        –       –
Size-512     8       8         +       +
Size-1024    4       4         +       +
Size-2048    2       2         +       +
Size-4096    1       1         +       +
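The drop in object counts for the on-slab caches follows from simple arithmetic: each object now needs room for one more pointer inside the slab. The sketch below is a back-of-the-envelope model, not the allocator's real layout code. All four constants are assumptions (a 4 KiB slab, a 28-byte slab header, 4-byte `kmem_bufctl_t` entries and 4-byte beancounter pointers on a 32-bit kernel), and alignment and cache colouring are ignored. The off-slab caches (Size-512 and larger) keep their management data outside the slab and are not modelled.

```python
PAGE_SIZE = 4096       # assumed: one 4 KiB page per slab
SLAB_HEADER = 28       # assumed size of struct slab on a 32-bit kernel
BUFCTL_SIZE = 4        # assumed size of one kmem_bufctl_t entry
BC_POINTER_SIZE = 4    # assumed size of one beancounter pointer


def objects_per_slab(object_size, with_beancounters):
    """Objects fitting in an on-slab layout, ignoring alignment and colouring."""
    per_object = object_size + BUFCTL_SIZE
    if with_beancounters:
        per_object += BC_POINTER_SIZE     # extra array of pointers after the slab
    return (PAGE_SIZE - SLAB_HEADER) // per_object


for size in (32, 64, 128, 256):
    before = objects_per_slab(size, with_beancounters=False)
    after = objects_per_slab(size, with_beancounters=True)
    print(f"Size-{size}: {before} -> {after}")
```

Under these assumptions the model gives 113 -> 101, 59 -> 56, 30 -> 29 and 15 -> 15, matching the on-slab rows of the table. The agreement depends on the assumed header size and should be read as a plausibility check rather than a derivation.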
Kernel MM pros'n'cons
Pros
  Tracking of kernel memory usage
Cons
  None (all are already optimized out)
Network buffers accounting
Mainstream accounting shortcomings
  Slab overhead is not included
    up to 30% for usual Ethernet frames
    unpredictable difference for non-Ethernet MTU
  No way to recalculate skb->truesize
Implementation basics
  Separate accounting for
    send and receive buffers
    TCP and all the other types of traffic
  Implementation is straightforward: account actual memory usage for objects with undefined or infinite lifetime
  select(2) compatibility
  Buffer space guarantees
Packets context handling
[Diagram labels: beancounter, process, network, socket, SKB, SKB]
Performance
Test name            No RSS   Full
Process creation     97%      91%
Execl Throughput     99%      91%
Pipe Throughput      100%     99%
Shell Scripts        96%      87%
File Read            99%      98%
File Write           101%     99%

RSS accounting – the bottleneck
Main future directions
Optimization
  Pre-charging
    Kernel memory
    VMAs lengths
  On-demand accounting
    Active dentry cache
    RSS
RSS limits
  Page reclamation
Better TCP window management
That's all folks
Questions?
Comments?
http://download.openvz.org/~xemul/