Cantrill, B., Shapiro, M., and Leventhal, A. 2004. Dynamic instrumentation of production systems. Proceedings of the 2004 USENIX Annual Technical Conference. Onur Derin. Jan 25, 2007.


Page 1: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Cantrill, B., Shapiro, M., and Leventhal, A.

2004.

Dynamic instrumentation of production systems.

Proceedings of the 2004 Usenix Annual Technical Conference.

Onur Derin

Jan 25, 2007

Page 2: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Outline

Motivation
Expectations from a solution
DTrace Features
DTrace Architecture
Instrumentation Providers
D Language
Aggregations
Speculative Tracing
Future work
Examples
Further readings
Discussion

Page 3: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Motivation

Software Observability Problem
Since software is not physical, the only way to observe it is with more software.

if (tracing_enabled)
    printf("we got here");

This creates overhead (load, compare, branch) even when tracing is turned off.
One solution is conditional compilation, but it creates two versions of the software:
  one used in development and test,
  the other used in production systems.
But then how do you identify a problem that occurs while the system is in use?
  The problem first has to be reproduced in development.
  Usually, you end up finding a solution to a different problem.
An instrumentation solution should allow observability in production systems.

Page 4: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Motivation

Software abstraction is a good thing (web application, web server, DB server, OS).
It implies that, at higher levels, less code induces more work.
  Likewise, a smaller misstep induces more unintended consequences.
  Missteps accumulate as you go down the software layers (avalanche effect).
Therefore problems are observed first in the lowest layers,
  e.g. excessive memory demand, excessive I/O activity, excessive network traffic.
Seeing lowest-layer problems, a typical response is to use faster hardware,
  e.g. more RAM, more CPU, more bandwidth.
However, the real problem is in the higher layers of software, and the real solution is fixing it there.
Therefore identifying the real problem requires a system-wide view in instrumentation rather than a process-centric one.

Page 5: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Expectations from a Solution

Shift from development to production
  Zero disabled-probe effect
    Ship the product fully optimized.
    When it is to be observed, dynamically modify the code.
Shift from programs to systems
  The entire stack should be dynamically instrumentable,
    e.g. operating system, system libraries, high-level languages and environments.
  The kernel is involved, so the observability infrastructure must be absolutely safe.
    Disruptions during production are costly.
    The problematic state of the system is lost in case of a restart.
The first time software needs to be observed, it is already running in production.
  The solution shouldn't require special compilation options, access to source code, or restarting components.
These expectations formed the design guidelines for DTrace.

Page 6: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

DTrace Features

Dynamic Instrumentation: achieves zero disabled-probe effect.
Unified Instrumentation: instruments both user-level and kernel-level software.
Arbitrary-context Kernel Instrumentation: instruments all of the kernel, including the scheduler and synchronization facilities.
Data Integrity: reports errors in the handling of data during instrumentation.
Arbitrary Actions: lets the user specify arbitrary actions safely at any instrumentation point.
Predicates: actions are taken only when the predicate is true. Allows pruning of data at the source.
A High-level Control Language: for specifying predicates and actions.
A Scalable Mechanism for Aggregating Data: processes data at low levels.
Speculative Tracing: defers the decision to commit data or not to a later time.
Heterogeneous Instrumentation: a glue framework for different providers, from I/O to scheduler to network.
Scalable Architecture: efficient classification and selection of thousands of instrumentation points.
Virtualized Consumers: allows multiple, concurrent consumers of the framework.

How have these features been enabled by DTrace?

Page 7: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

DTrace Architecture

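[Figure: DTrace architecture. In userland, DTrace program source files (a.d, b.d) are fed to dtrace(1M); DTrace consumers such as dtrace(1M) and intrstat(1M) are built on libdtrace(3LIB), which talks to the kernel through dtrace(7D). In the kernel, the DTrace framework connects through an API to the DTrace providers: sysinfo, vminfo, fasttrap, syscall, profile, fbt. Annotations on the figure: "Virtualized consumers" (consumer side), "Scalable architecture" (framework), "Heterogeneous Inst." (provider side).]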

Page 8: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Internals

Providers are loadable kernel modules that carry out the instrumentation task. Providers communicate with DTrace Framework using a well-defined API.

DTraceFramework::determineInstrumentationPoints()
{
    for provider in all providers
    {
        provider.determineInstrumentationPoints(createProbe);
    }
}

Provider::determineInstrumentationPoints()
{
    Generate list of all instrumentation points
    for instPoint in all instrumentation points
    {
        probeID = DTraceFramework.createProbe(instPoint.moduleName,
                      instPoint.funcName, instPoint.semanticName);
        Associate probeID with instrumentation point
    }
}
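The registration handshake in the pseudocode above can be sketched in Python. This is only a toy model under stated assumptions: the class names `Framework` and `ToyProvider`, the method names, and the sample instrumentation points are invented for illustration and are not the real DTrace provider API.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Framework:
    """Toy stand-in for the framework's probe table (hypothetical)."""
    probes: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def create_probe(self, provider, module, function, name):
        # Hand out a fresh probe ID and remember the probe's full name.
        probe_id = next(self._ids)
        self.probes[probe_id] = (provider, module, function, name)
        return probe_id


class ToyProvider:
    """Hypothetical provider with a fixed list of instrumentation points."""

    def __init__(self, name, points):
        self.name = name
        self.points = points      # list of (module, function, semantic name)
        self.point_by_id = {}     # probe ID -> instrumentation point

    def determine_instrumentation_points(self, create_probe):
        for module, function, semantic in self.points:
            probe_id = create_probe(self.name, module, function, semantic)
            # Associate the probe ID with the instrumentation point.
            self.point_by_id[probe_id] = (module, function, semantic)


framework = Framework()
providers = [
    ToyProvider("syscall", [("", "read", "entry"), ("", "read", "return")]),
    ToyProvider("fbt", [("genunix", "kmem_alloc", "entry")]),  # sample names only
]
for provider in providers:
    provider.determine_instrumentation_points(framework.create_probe)
```

The framework only hands out IDs and records names; each provider keeps its own mapping from probe ID back to the place it would later have to modify.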

Page 9: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Internals

libdtrace(3LIB) advertises these probes to consumers.

libdtrace(3LIB)::enableProbe(providerName, moduleName, funcName, name)
{
    probe = DTraceFramework.getProbe(providerName, moduleName, funcName, name);
    if (!probe.isEnabled())
    {
        provider.enableProbe(probe.ID);
    }
    Create Enabling Control Block (i.e. ECB)
    Create per-CPU buffer associated with ECB
    Associate ECB and probe
}

Provider::enableProbe(probeID)
{
    Dynamically modify the instrumentation point such that, when hit,
    it calls DTraceFramework::probeFired(probeID)
}

The ECB enables virtualized consumers. A probe is associated with one ECB per enabling consumer. This association is kept in DTraceFramework.

Page 10: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Internals

DTraceFramework::probeFired(probeID)
{
    Disable interrupts
    for ecb in all ECBs where ecb.probeID = probeID
    {
        if (ecb.predicate)
            DTraceFramework.execute(ecb.actions);
    }
    Re-enable interrupts
}

An ECB holds a predicate and a list of actions.

ECB actions:
  may store data in the per-CPU buffer associated with the ECB.
  may update D variable state.
  may not store to kernel memory, modify registers, or change system state.
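To make the role of the ECB concrete, here is a minimal Python sketch of two consumers enabling the same probe with different predicates. All names (`ECB`, `Framework`, `probe_fired`, the context dictionary) are assumptions made for this illustration, and interrupt handling is not modeled.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ECB:
    """Toy enabling control block: one per (consumer, probe) enabling."""
    consumer: str
    predicate: Callable[[dict], bool]
    action: Callable[[dict], object]
    buffer: list = field(default_factory=list)  # stands in for the per-CPU buffer


class Framework:
    def __init__(self):
        self.ecbs_by_probe = defaultdict(list)  # probe ID -> list of ECBs

    def enable(self, probe_id, ecb):
        self.ecbs_by_probe[probe_id].append(ecb)

    def probe_fired(self, probe_id, context):
        # The slides run this loop with interrupts disabled; not modeled here.
        for ecb in self.ecbs_by_probe[probe_id]:
            if ecb.predicate(context):
                ecb.buffer.append(ecb.action(context))


fw = Framework()
all_pids = ECB("consumer-a", lambda ctx: True, lambda ctx: ctx["pid"])
only_sshd = ECB("consumer-b", lambda ctx: ctx["execname"] == "sshd",
                lambda ctx: ctx["execname"])
fw.enable(7, all_pids)    # probe ID 7 is arbitrary
fw.enable(7, only_sshd)

fw.probe_fired(7, {"pid": 2680, "execname": "bash"})
fw.probe_fired(7, {"pid": 2827, "execname": "sshd"})
```

Each consumer sees only what its own predicate let through, even though both share a single probe; that is the sense in which consumers are virtualized.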

Page 11: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Internals

DTraceFramework::storeDataInPerCPUBuffer(ecb, data)
{
    buffer = DTraceFramework.getBuffer(ecb);
    if (buffer.freeSpace() >= ecb.DATA_SIZE)
        buffer.store(data);
    else
        ecb.dropCount++;
}

To minimize dropCount, buffers should be read periodically.
How can buffers be read such that data integrity and wait-free probe processing are assured?
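The drop-instead-of-wait policy can be tried out with a small Python model. `PerCPUBuffer`, its `drain()` method, and the record sizes are hypothetical; the point is only that a full buffer increments a counter rather than blocking the probe.

```python
class PerCPUBuffer:
    """Toy fixed-capacity buffer that counts drops instead of blocking."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        self.records = []
        self.drop_count = 0

    def store(self, record, size):
        if self.capacity - self.used >= size:
            self.records.append(record)
            self.used += size
        else:
            self.drop_count += 1  # never wait for space: drop and count

    def drain(self):
        """Consumer-side read: hand back the records and free the space."""
        drained, self.records, self.used = self.records, [], 0
        return drained


buf = PerCPUBuffer(capacity=16)
for i in range(3):
    buf.store(f"rec{i}", size=8)  # the third store does not fit
first = buf.drain()
buf.store("rec3", size=8)         # fits again after the drain
```

Draining more often leaves less time for the buffer to fill, which is why periodic reads keep the drop count low.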

Page 12: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Buffers

Since buffer switching and probe processing cannot be interrupted, data integrity is assured.
What if interrupts were not disabled?

[Figure: timeline across CPU0 and CPU1. The consumer program on one CPU initiates a read-buffer operation and issues an xcall() to the other CPU. There, interrupts are disabled, the active and inactive buffers (Buffer1, Buffer2) swap roles, and interrupts are re-enabled. Then xcall() returns.]

Page 13: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Buffers

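Two inactive buffers, none writable.

[Figure: the same timeline across CPU0 and CPU1, without interrupts disabled. After the consumer program initiates the read-buffer operation and issues xcall(), the previously active buffer has already become inactive while the other buffer (Buffer1, Buffer2) is still inactive.]

A probe interrupts the switch, and its ECB action wants to store to the buffer.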
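The active/inactive scheme from the two buffer slides can be modeled in a few lines of Python. `BufferPair` and its methods are made-up names; the cross call and interrupt masking are represented only by a comment, so this shows the bookkeeping and not the concurrency.

```python
class BufferPair:
    """Toy model of one CPU's active/inactive buffer pair."""

    def __init__(self):
        self.active = []
        self.inactive = []

    def store(self, record):
        # Probe context only ever writes to the active buffer.
        self.active.append(record)

    def switch(self):
        # In the slides this step runs on the target CPU via xcall() with
        # interrupts disabled, so no probe can fire halfway through the swap.
        self.active, self.inactive = self.inactive, self.active

    def read_inactive(self):
        # The consumer drains the buffer that probes no longer write to.
        records = list(self.inactive)
        self.inactive.clear()
        return records


pair = BufferPair()
pair.store("a")
pair.store("b")
pair.switch()
pair.store("c")  # lands in the new active buffer
snapshot = pair.read_inactive()
```

Because the swap is a single step that probes cannot observe halfway, a probe always finds exactly one writable buffer and the consumer never reads a buffer that is still being written.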

Page 14: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

DIF

D Intermediate Format
An instruction set for specifying predicates and actions.
Its main purpose is to allow programmable actions to be executed safely in arbitrary contexts.
  DIF code is checked for validity when it is loaded.
  Only forward branches are allowed, to avoid infinite loops.
  Illegal loads (from misaligned addresses, memory-mapped I/O devices, unmapped memory) and division by zero are handled at run time by returning errors to the consumers.
  Arbitrary stores are not allowed.
  Only defined subroutines can be called at run time.
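A load-time check of the kind listed above might look like the following Python sketch. The instruction encoding, the opcode names ("ld", "br", "call", "st", "ret"), and the subroutine whitelist are all invented for illustration; real DIF has its own instruction set and validation rules.

```python
ALLOWED_SUBROUTINES = {"copyinstr", "strlen"}  # illustrative subset


def validate(program):
    """Return a list of problems found in a toy DIF-like program.

    Each instruction is a tuple (opcode, operand). A branch operand is the
    index of the target instruction.
    """
    problems = []
    for index, (opcode, operand) in enumerate(program):
        if opcode == "br" and not (index < operand < len(program)):
            # Forward-only branches guarantee that execution terminates.
            problems.append(f"{index}: branch target {operand} is not forward")
        elif opcode == "call" and operand not in ALLOWED_SUBROUTINES:
            problems.append(f"{index}: unknown subroutine {operand!r}")
        elif opcode == "st":
            problems.append(f"{index}: arbitrary stores are not allowed")
    return problems


good = [("ld", "arg0"), ("br", 3), ("call", "strlen"), ("ret", None)]
bad = [("ld", "arg0"), ("br", 0), ("st", "0xdeadbeef"), ("call", "panic")]
```

With only forward branches, every instruction runs at most once, so the run time of a program is bounded by its length. Errors such as illegal loads or division by zero cannot be caught this way and are left to run-time handling, as the slide notes.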

Page 15: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Instrumentation Providers

General properties
  No disabled-probe effect
  Mostly use dynamic code modification
Some examples
  syscall: traces all communication from userland to the kernel
  fbt: entry and return points of kernel functions
  sched: which threads run on which CPU, and for how long
  io: disk I/O requests
  mib: counters for IP, IPv6, etc.
  profile: time-triggered probing at specified intervals
  lockstat: kernel synchronization behaviour

Page 16: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Function Boundary Tracing implementation on SPARC

Production software:
    call x

Instrumented software (modified dynamically):
    ba y
y:  prepare probeID etc.
    call DTrace probeFired(probeID, ...)
    on return, call x is executed in y

Page 17: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

D Language

C-like, supports ANSI C operators
Strings exist
No if, no loops
Only integer arithmetic
No need to declare variables
Scalar variables
Associative arrays
  Collection of data elements
  No predefined number of elements
  Like hashes
  name[key] = expression

Page 18: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

D Language

Thread-local variables: variables private to each OS thread, referred to as self->variable-name
Clause-local variables: their storage is reused for each program clause, referred to as this->variable-name
Built-in variables (execname, pid, timestamp, curthread)
External variables
  Variables defined in the kernel and its modules (e.g. kmem_flags)

Page 19: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

D Language

General template

probe descriptions
/predicate/
{
    action statements
}

Probe description:
provider name:module name:function name:semantic name

The predicate is a D expression.

Actions:
  Recording actions (printf(), printa(), trace())
  Destructive actions (disabled by default)
  Special actions (copyinstr(), strlen(), rand(), etc.)

Page 20: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Aggregations (Cherry on the cake)

Aggregate data to look for trends and generate reports
General form
    @name[keys] = aggfunc(args);
Aggregation function property:
    f(f(x0) U f(x1) U ... U f(xn)) = f(x0 U x1 U ... U xn)
e.g. count(), min(), max(), sum(), avg(), quantize()
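The property above says that data can be summarized in pieces (for example per CPU) and the partial results combined later. A rough Python illustration follows; the table `AGGREGATIONS` and the helper names are assumptions of this sketch. Note that the combining step is not always the same function as the summarizing step: partial counts are combined by addition, and avg has to be carried as a (sum, count) pair.

```python
from functools import reduce

# name -> (summarize one chunk of values, merge two partial results)
AGGREGATIONS = {
    "count": (lambda xs: len(xs), lambda a, b: a + b),
    "sum":   (lambda xs: sum(xs), lambda a, b: a + b),
    "min":   (lambda xs: min(xs), min),
    "max":   (lambda xs: max(xs), max),
    # avg is kept as a (sum, count) pair so partial results can be merged.
    "avg":   (lambda xs: (sum(xs), len(xs)),
              lambda a, b: (a[0] + b[0], a[1] + b[1])),
}


def aggregate_per_cpu(name, per_cpu_values):
    """Aggregate each CPU's values separately, then merge the partials."""
    summarize, merge = AGGREGATIONS[name]
    return reduce(merge, (summarize(values) for values in per_cpu_values))


def aggregate_all(name, per_cpu_values):
    """Aggregate the union of all values in one pass, for comparison."""
    summarize, _ = AGGREGATIONS[name]
    return summarize([v for values in per_cpu_values for v in values])


per_cpu = [[32, 64, 48], [16], [256, 40]]  # made-up samples from three CPUs
results = {name: aggregate_per_cpu(name, per_cpu) for name in AGGREGATIONS}
```

Both routes give the same answer, so only the small partial state has to be kept where the probe fires, not every data point.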

Page 21: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Speculative Tracing

Trace data tentatively and decide later whether to commit it to a buffer or discard it
  When you cannot use a predicate because, at the time a probe fires, you do not yet know whether the data will be of interest
  When you have an error event and would like to know the history behind it and why that error occurred
Functions: speculation(), speculate(), commit(), discard()
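The four functions can be mimicked with a small Python class to show the intended flow. The class `Speculations`, its fields, and the sample trace strings are hypothetical; only the four operation names come from the slide.

```python
class Speculations:
    """Toy model of speculative tracing buffers."""

    def __init__(self):
        self._next_id = 1
        self._pending = {}   # speculation ID -> tentatively traced records
        self.principal = []  # records that the consumer will actually see

    def speculation(self):
        """Create a new speculative buffer and return its ID."""
        spec_id = self._next_id
        self._next_id += 1
        self._pending[spec_id] = []
        return spec_id

    def speculate(self, spec_id, record):
        self._pending[spec_id].append(record)

    def commit(self, spec_id):
        self.principal.extend(self._pending.pop(spec_id))

    def discard(self, spec_id):
        self._pending.pop(spec_id)


tracer = Speculations()

ok_call = tracer.speculation()
tracer.speculate(ok_call, "open entry /etc/passwd")
tracer.discard(ok_call)     # the call succeeded: its history is not needed

failed_call = tracer.speculation()
tracer.speculate(failed_call, "open entry /no/such/file")
tracer.speculate(failed_call, "open return -1")
tracer.commit(failed_call)  # the call failed: keep the history
```

Only the history leading up to the failure reaches the consumer; everything traced for the successful call is thrown away.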

Page 22: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Example D Programs

BEGIN
{
    trace("Hello world");
    exit(0);
}

# dtrace -s helloworld.d
dtrace: script 'helloworld.d' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0      1                           :BEGIN   Hello world


Page 23: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Example D Programs

syscall::read:entry
{
    printf("Process %d", pid);
}

# dtrace -s d2.d
dtrace: script 'd2.d' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0  44129                       read:entry Process 2680
  0  44129                       read:entry Process 2680
  0  44129                       read:entry Process 2827
  0  44129                       read:entry Process 2680
  0  44129                       read:entry Process 2680
  0  44129                       read:entry Process 2827

Page 24: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Example D Programs

syscall::write:entry
/execname == "sshd"/
{
    @[arg0] = quantize(arg2);
}

RESULT:

                4
           value  ------------- Distribution ------------- count
               8 |                                         0
              16 |@                                        1
              32 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@             24
              64 |@@@@@@                                   5
             128 |@                                        1
             256 |@@@@                                     3
             512 |                                         0
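For readers unfamiliar with quantize(), the following Python sketch reproduces the idea behind the histogram: each value is counted in the bucket labeled with the largest power of two that does not exceed it. The function names and the sample sizes are invented, negative values are left out, and the rendering only approximates the layout of real dtrace output.

```python
from collections import Counter


def bucket(value):
    """Largest power of two that is <= value (non-negative values only)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    return 0 if value == 0 else 1 << (value.bit_length() - 1)


def quantize(values):
    return Counter(bucket(v) for v in values)


def render(histogram, width=40):
    """Draw one row per non-empty bucket, scaled to `width` characters."""
    total = sum(histogram.values())
    lines = []
    for low in sorted(histogram):
        bar = "@" * round(histogram[low] * width / total)
        lines.append(f"{low:>16} |{bar:<{width}} {histogram[low]}")
    return "\n".join(lines)


sizes = [20, 36, 40, 52, 63, 64, 100, 300]  # made-up write sizes in bytes
hist = quantize(sizes)
print(render(hist))
```

Read this way, the row labeled 32 in the sshd output counts the writes whose size (arg2) was at least 32 and less than 64 bytes.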

Page 25: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Example D Programs

syscall::write:entry
/execname == "sshd" && arg0 == 5/
{
    @[ustack()] = quantize(arg2);
}

RESULT: next slide

Page 26: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

# dtrace -s d4.d
dtrace: script 'd4.d' matched 1 probe
^C

              libc.so.1`_write+0x15
              sshd`altprivsep_start_monitor+0x220
              sshd`main+0xe57
              sshd`0x805bad2

           value  ------------- Distribution ------------- count
               2 |                                         0
               4 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
               8 |                                         0

Page 27: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

              libc.so.1`_write+0x15
              pkcs11_softtoken.so.1`looping_write+0x32
              pkcs11_softtoken.so.1`C_SeedRandom+0xfd
              libpkcs11.so.1`C_SeedRandom+0xed
              mech_krb5.so.1`krb5_c_random_seed+0x3d
              mech_krb5.so.1`init_common+0x121
              mech_krb5.so.1`krb5_init_context+0xd
              mech_krb5.so.1`krb5_gss_get_context+0x3d
              mech_krb5.so.1`_C0095D0A+0x49
              libgss.so.1`__gss_get_mechanism+0xad
              libgss.so.1`gss_add_cred+0x79
              libgss.so.1`gss_acquire_cred+0xfb
              sshd`ssh_gssapi_server_mechs+0x7c
              sshd`ssh_gssapi_server_kex_hook+0x22
              sshd`0x807cc12
              sshd`kex_send_kexinit+0x2a
              sshd`kex_setup+0x74
              sshd`0x805e90f
              sshd`main+0xe05
              sshd`0x805bad2

           value  ------------- Distribution ------------- count
               4 |                                         0
               8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1
              16 |                                         0

Page 28: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Example D Programs

syscall::open:entry
{
    @files[copyinstr(arg0)] = count();
}

RESULT:

# dtrace -s d5.d
dtrace: script 'd5.d' matched 1 probe
^C
  /etc/resolv.conf                                                  1

Page 29: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

RESULT when copyinstr is removed:

# dtrace -s d5.d
dtrace: script 'd5.d' matched 1 probe
dtrace: error on enabled probe ID 1 (ID 44133: syscall::open:entry): invalid address (0x80fbdaf) in action #2 at DIF offset 28
dtrace: error on enabled probe ID 1 (ID 44133: syscall::open:entry): invalid address (0x80fbdaf) in action #2 at DIF offset 28
^C
  /lib/libc.so.1                                                    1
  /proc/2647/psinfo                                                 1
  /proc/2723/psinfo                                                 1
  /proc/2874/psinfo                                                 1
  /proc/4680/psinfo                                                 1
  /proc/4691/psinfo                                                 1
  /proc/4740/psinfo                                                 1
  /var/ld/ld.config                                                 1
  /dev/null                                                         2
  /etc/resolv.conf                                                  2
  /var/adm/utmpx                                                    2

Page 30: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Future Work

Performance counter provider
Helper actions: embracing high-level languages and their environments
User lock analysis: lock contention analysis of user-level multi-threaded processes
Fine-grained user-level providers
Software visualization

Page 31: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Further Readings to be Googled

DTrace Guide
Hidden in Plain Sight, Cantrill, B.
DTraceToolkit, a repository of D scripts classified by application domain (CPU, Disk, Mem, Kernel, Net, etc.)
DTrace & DTraceToolkit, Stefan Parvu

Page 32: Cantrill, B., Shapiro, M., and Leventhal, A.  2004

Discussion

No discussion of how much overhead is introduced when probes are enabled.
Safety is considered only in the sense of not crashing or halting the system.
  What about guaranteeing that other requirements of the system, such as real-time properties, are not violated?