Safe and Efficient Instrumentation

Paradyn Project. Paradyn / Dyninst Week, Madison, Wisconsin, April 12-14, 2010. Safe and Efficient Instrumentation. Andrew Bernat.



Page 1: Safe and Efficient Instrumentation

Paradyn Project

Paradyn / Dyninst Week
Madison, Wisconsin
April 12-14, 2010

Safe and Efficient Instrumentation

Andrew Bernat

Page 2: Safe and Efficient Instrumentation

Binary Instrumentation

• Instrumentation modifies the original code
  • Moves original code
  • Allocates new memory
  • Overwrites original code

• This affects the behavior of:
  • Moved code
  • Code that references moved code
  • Code that references changed memory

Page 3: Safe and Efficient Instrumentation

Sensitivity Models

• A program is sensitive to a particular modification if that modification changes the program’s behavior

• Current binary instrumenters rely on fixed sensitivity models

• Compensating for sensitivity imposes overhead

Example: a relocated ret

    ret

is emulated with an address-translation sequence:

    pop %eax
    call addr_translate
    jmp %eax

Page 4: Safe and Efficient Instrumentation

Efficiency vs Sensitivity

[Figure: chart with axes "Sensitivity" (Conventional Code, Optimized Code, Malware) and "Efficiency". Plotted: Pin, Valgrind, …; Dyninst; Safe and Efficient Approach.]

Page 5: Safe and Efficient Instrumentation

How do we do this?

• Formal model of code relocation
  • Visible behavior
  • Instruction sensitivity
  • External sensitivity

• Implementation in Dyninst
  • Analysis phase
  • Transformation phase

• Performance Results

Page 6: Safe and Efficient Instrumentation

Three Questions

• What program behavior do we wish to preserve?

• How does modification affect instructions?

• How do instructions change program behavior?


Page 7: Safe and Efficient Instrumentation

Approach

• Preserve visible behavior
  • Relationship of input to output

• Identify sensitive instructions
  • Those whose behavior is changed

• Emulate only externally sensitive instructions
  • Those whose sensitivity affects visible behavior

Page 8: Safe and Efficient Instrumentation

Visible Behavior

• Intuition: we can change anything that does not affect the output of the program

• Formalization: in terms of denotational semantics
  • Briefly: two programs P, P’ are equivalent if: [formula not preserved]

Page 9: Safe and Efficient Instrumentation

Visibly Equivalent Programs

[Figure: the Original Binary maps input X to output Y. The Instrumented Binary maps input X + A to output Y + B, where A is the Instrumentation Input and B is the Instrumentation Output.]
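The relationship between the two binaries on this slide can be illustrated with a small test harness. This is only a sketch under strong assumptions: programs are modeled as Python functions, the instrumentation input A and output B are assumed to be cleanly separable from X and Y, and equivalence is checked on sample inputs rather than proven. The names `original`, `instrumented`, and `visibly_equivalent` are hypothetical and do not come from the talk.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical model: a program maps an input X to an output Y.
# The instrumented program also takes instrumentation input A and
# produces instrumentation output B next to the original output.
Program = Callable[[int], int]
Instrumented = Callable[[int, int], Tuple[int, int]]  # (X, A) -> (Y, B)


def original(x: int) -> int:
    return x * 2 + 1


def instrumented(x: int, a: int) -> Tuple[int, int]:
    counter = a + 1              # B: for example, an execution counter
    return x * 2 + 1, counter    # Y must match the original program


def visibly_equivalent(p: Program, p_inst: Instrumented,
                       inputs: Iterable[int], a: int = 0) -> bool:
    """Check on sample inputs only that the Y component is unchanged."""
    return all(p_inst(x, a)[0] == p(x) for x in inputs)
```

A sampled check like this can only refute equivalence, never establish it, which is one reason a formal definition is used instead.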

Page 10: Safe and Efficient Instrumentation

Sensitivity

• What does instrumentation change?
  • Addresses of instructions
  • Contents of memory
  • Shape of the address space

• Sensitive instructions are directly affected
  • Access the PC (and are moved)
  • Read modified memory
  • Test allocated memory
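The three kinds of sensitivity listed on this slide can be written down as a simple classifier. The sketch below is illustrative only: the `Insn` record, its flags, and the `sensitivities` function are hypothetical, and a real instrumenter would derive these facts from instruction semantics rather than from hand-set flags.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Insn:
    """Hypothetical summary of one instruction."""
    text: str
    reads_pc: bool = False               # uses the program counter
    moved: bool = False                  # was relocated by instrumentation
    reads: FrozenSet[int] = frozenset()  # addresses the instruction reads
    probes_address_space: bool = False   # tests whether memory is allocated


def sensitivities(insn: Insn, modified: FrozenSet[int]) -> Tuple[str, ...]:
    """Return the reasons (possibly none) why `insn` is sensitive."""
    reasons = []
    if insn.reads_pc and insn.moved:
        reasons.append("pc")             # accesses the PC and was moved
    if insn.reads & modified:
        reasons.append("memory")         # reads memory that was overwritten
    if insn.probes_address_space:
        reasons.append("address-space")  # tests allocated memory
    return tuple(reasons)
```

An instruction with an empty result is not sensitive under this model and can be relocated unchanged.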

Page 11: Safe and Efficient Instrumentation

Sensitivity Examples

Call/Return pair:

main:
    push %ebp
    mov %esp, %ebp
    …
    call worker
    …
    leave
    ret

worker:
    push %ebp
    mov %esp, %ebp
    …
    ret

Jumptable:

jumptable:
    push %ebp
    mov %esp, %ebp
    call get_pc_thunk
    add $(offset), %ebx
    mov (%ebx, %eax, 4), %ecx
    jmp *%ecx

get_pc_thunk:
    mov (%esp), %ebx
    ret

Self-Unpacking Code (Simplified):

protect:
    call initialize
initialize:
    pop %esi
    mov $(unpack_base), %edi
    mov $0x0, %ebx
loop_top:
    mov (%esi, %ebx, 4), %eax
    call unpack
    mov %eax, (%edi, %ebx, 4)
    inc %ebx
    cmp $0x42, %ebx
    jnz loop_top
    jmp $(unpacked_base)

Page 12: Safe and Efficient Instrumentation

External Sensitivity

• An instruction is externally sensitive if it causes a visible change in behavior
  • Approximation: or changes control flow

• This requires:
  • The sensitive instruction must produce different values
  • These differences must reach an instruction that affects output (or control flow)
  • … and change its behavior
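The difference between "sensitive" and "externally sensitive" can be seen by running the same code at two load addresses and comparing what it outputs. The toy interpreter below is a sketch that uses concrete execution, not the static analysis described in the talk. Its instruction set (`getpc`, `addi`, `andi`, `out`) and the function names are invented for illustration.

```python
from typing import Dict, List, Tuple

Insn = Tuple  # ("getpc", dst) | ("addi", dst, imm) | ("andi", dst, imm) | ("out", src)


def run(prog: List[Insn], base: int) -> List[int]:
    """Execute a straight-line toy program loaded at address `base`."""
    regs: Dict[str, int] = {}
    output: List[int] = []
    for index, insn in enumerate(prog):
        op = insn[0]
        if op == "getpc":      # PC-sensitive: the value depends on placement
            regs[insn[1]] = base + index
        elif op == "addi":
            regs[insn[1]] = regs[insn[1]] + insn[2]
        elif op == "andi":
            regs[insn[1]] = regs[insn[1]] & insn[2]
        elif op == "out":      # visible behavior
            output.append(regs[insn[1]])
        else:
            raise ValueError(f"unknown op: {op}")
    return output


def externally_sensitive(prog: List[Insn], orig_base: int, reloc_base: int) -> bool:
    """True if moving the code changes what the program outputs."""
    return run(prog, orig_base) != run(prog, reloc_base)


# The PC value reaches the output: externally sensitive.
LEAKS_PC = [("getpc", "r0"), ("out", "r0")]
# The PC value is masked away before it is output: sensitive, but not externally.
MASKS_PC = [("getpc", "r0"), ("andi", "r0", 0), ("out", "r0")]
```

In `MASKS_PC` the sensitive instruction still produces a different value after relocation, but the difference never changes the behavior of the output instruction, so no compensation would be needed.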

Page 13: Safe and Efficient Instrumentation

Program Modification

[Figure: Original Binary (Code), Analysis, Compensation, Modified Binary (Code, Relocated Code).]

Page 14: Safe and Efficient Instrumentation

Analysis Phase

• Identify sensitive instructions
  • InstructionAPI: used and defined sets

• Determine affected instructions
  • DepGraphAPI: forward slice

• Analyze effects of modification
  • SymEval: symbolic expansion of the slice
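The second step, finding the instructions affected by a sensitive one, can be sketched as a forward slice computed from used and defined sets. The code below is a simplified stand-in and does not use the Dyninst APIs named on the slide. It handles straight-line code only, the used/defined sets are written by hand, and the `mov $0x0, %edx` line is an unrelated instruction added here for contrast.

```python
from typing import List, Set, Tuple

# (text, used locations, defined locations); sets are hand-written assumptions.
Insn = Tuple[str, Set[str], Set[str]]


def forward_slice(insns: List[Insn], start: int) -> List[int]:
    """Indices of instructions that depend on the values insns[start] defines."""
    tainted = set(insns[start][2])
    result = [start]
    for i in range(start + 1, len(insns)):
        _, used, defined = insns[i]
        if used & tainted:
            result.append(i)
            tainted |= defined   # outputs of an affected instruction are affected
        else:
            tainted -= defined   # overwritten with an unaffected value
    return result


JUMPTABLE: List[Insn] = [
    ("call get_pc_thunk",         {"pc"},         {"mem[esp]"}),
    ("mov (%esp), %ebx",          {"mem[esp]"},   {"ebx"}),
    ("ret",                       {"mem[esp]"},   {"pc"}),
    ("mov $0x0, %edx",            set(),          {"edx"}),  # unrelated, added for contrast
    ("add $(offset), %ebx",       {"ebx"},        {"ebx"}),
    ("mov (%ebx, %eax, 4), %ecx", {"ebx", "eax"}, {"ecx"}),
    ("jmp *%ecx",                 {"ecx"},        {"pc"}),
]
```

Starting the slice at the call selects every instruction except the unrelated `mov`, ending at the indirect jump, which is the control-flow instruction that makes the call externally sensitive.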

Page 15: Safe and Efficient Instrumentation

Analysis Example: Call/Return Pair

Call/Return pair:

main:
    push %ebp
    mov %esp, %ebp
    …
    call worker
    …
    leave
    ret

worker:
    push %ebp
    mov %esp, %ebp
    …
    ret

Sensitivity: call (moved, uses PC)

Slice: call, ret

Symbolic Expansion (expressions not preserved):
    call:
    ret:
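The reasoning for the call/return pair can be illustrated with a small model: the call pushes a return address that depends on where the code lives, but the matching ret consumes that same value, so execution resumes at the instruction after the call in either copy. This is a sketch under stated assumptions: the relocated copy is assumed to keep the same layout as the original, the call length of 5 bytes is an assumption, and the function names are hypothetical.

```python
def call_then_ret(base: int, call_offset: int, call_len: int = 5) -> int:
    """Return the address where execution resumes after a call/ret pair."""
    stack = []
    stack.append(base + call_offset + call_len)  # call: push the return address
    return stack.pop()                           # ret: pop it and jump there


def resumes_at_same_instruction(orig_base: int, reloc_base: int,
                                call_offset: int) -> bool:
    """Compare resume points relative to each copy's own base address."""
    orig_target = call_then_ret(orig_base, call_offset) - orig_base
    reloc_target = call_then_ret(reloc_base, call_offset) - reloc_base
    return orig_target == reloc_target
```

The pushed values differ between the two copies, yet the visible outcome is the same, which is why this pattern needs no compensation.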

Page 16: Safe and Efficient Instrumentation

Analysis Example: Jumptable

Jumptable:

jumptable:
    push %ebp
    mov %esp, %ebp
    call get_pc_thunk
    add $(offset), %ebx
    mov (%ebx, %eax, 4), %ecx
    jmp *%ecx

get_pc_thunk:
    mov (%esp), %ebx
    ret

Sensitivity: call (moved, uses PC)

Slice:
    call
    mov (%esp), %ebx
    add $0x42, %ebx
    mov (%ebx, %eax, 4), %ecx
    jmp *%ecx

Symbolic Expansion (expressions not preserved):
    call:
    ret:
    jmp:
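Why the jumptable case differs from the call/return pair can be shown by computing the address that the table load reads. The sketch below follows the slice step by step. The concrete addresses are made up, and `table_entry_address` is a hypothetical helper rather than anything from Dyninst.

```python
def table_entry_address(ret_addr: int, offset: int, index: int) -> int:
    """Address read by `mov (%ebx, %eax, 4), %ecx` after the thunk and the add."""
    ebx = ret_addr          # get_pc_thunk: mov (%esp), %ebx
    ebx += offset           # add $(offset), %ebx
    return ebx + index * 4  # effective address of the table load


ORIG_RET = 0x8048010    # assumed return address in the original code
RELOC_RET = 0x9000010   # assumed return address after relocation

orig_entry = table_entry_address(ORIG_RET, 0x42, 3)
reloc_entry = table_entry_address(RELOC_RET, 0x42, 3)
```

With these values the relocated code would read its jump target from an address shifted by the relocation distance, while the table itself has not moved, so the difference reaches the indirect jump and changes control flow.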

Page 17: Safe and Efficient Instrumentation

Analysis Example: Unpacking Code

Self-Unpacking Code (Simplified):

protect:
    call initialize
    …
initialize:
    pop %esi
    mov $(unpack_base), %edi
    mov $0x0, %ebx
loop_top:
    mov (%esi, %ebx, 4), %eax
    call unpack
    mov %eax, (%edi, %ebx, 4)
    inc %ebx
    cmp $0x42, %ebx
    jnz loop_top
    jmp $(unpacked_base)

Sensitivity: call (moved, uses PC)

Slice:
    call initialize
    pop %esi
    mov (%esi, %ebx, 4), %eax
    call unpack
    …

Symbolic Expansion (expressions not preserved):
    call:
    pop:
    mov:

Page 18: Safe and Efficient Instrumentation

Compensation Phase

• Generates the relocated code

• Two approaches:
  • Instruction transformation
  • Group transformation

Page 19: Safe and Efficient Instrumentation

Instruction Transformation

• Emulate each externally sensitive instruction
  • Replace some instructions (e.g., calls) with sequences

• Straightforward to implement

• Some sequences impose high overhead
  • e.g., address translation

Example: a relocated ret

    ret

is replaced with:

    pop %eax
    call addr_translate
    jmp %eax
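The cost of the ret replacement comes from the lookup performed on every return. The sketch below models one plausible reading of `addr_translate`: a table from original addresses to relocated addresses, with unknown addresses passed through unchanged. The slides do not specify its behavior, so the `AddrTranslator` class, the pass-through rule, and the addresses are assumptions.

```python
from typing import Dict, List


class AddrTranslator:
    """Hypothetical original-to-relocated address map."""

    def __init__(self, orig_to_reloc: Dict[int, int]) -> None:
        self.orig_to_reloc = dict(orig_to_reloc)

    def translate(self, addr: int) -> int:
        # Assumption: addresses that were never relocated are returned as-is.
        return self.orig_to_reloc.get(addr, addr)


def emulate_ret(stack: List[int], translator: AddrTranslator) -> int:
    """Model of: pop %eax; call addr_translate; jmp %eax."""
    target = stack.pop()                   # pop %eax
    target = translator.translate(target)  # call addr_translate
    return target                          # jmp %eax
```

Every emulated return pays for a table lookup, which is the kind of overhead the approach tries to avoid by emulating only externally sensitive instructions.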

Page 20: Safe and Efficient Instrumentation

Group Transformation

• Emulate the behavior of a group of instructions
  • Motivating example: thunks

• Open questions:
  • Which instructions are included in the group?
  • How is the replacement sequence determined?
  • Current status: hand-crafted templates

Page 21: Safe and Efficient Instrumentation

Transformation: Call/Return Pair

Original Code

main:
    push %ebp
    mov %esp, %ebp
    …
    call worker
    …
    leave
    ret

worker:
    push %ebp
    mov %esp, %ebp
    …
    ret

Relocated Code

main:
    push %ebp
    mov %esp, %ebp
    …
    call worker
    …
    leave
    ret

worker:
    push %ebp
    mov %esp, %ebp
    …
    ret

Page 22: Safe and Efficient Instrumentation

Transformation: Jumptable

Original Code

jumptable:
    push %ebp
    mov %esp, %ebp
    call get_pc_thunk
    add $(offset), %ebx
    mov (%ebx, %eax, 4), %ecx
    jmp *%ecx

get_pc_thunk:
    mov (%esp), %ebx
    ret

Relocated Code

jumptable:
    push %ebp
    mov %esp, %ebp
    mov $(orig_ret_addr), %ebx
    add $(offset), %ebx
    mov (%ebx, %eax, 4), %ecx
    jmp *%ecx
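The thunk replacement on this slide is the kind of hand-crafted template a group transformation could apply. The sketch below performs it as a text rewrite over instruction strings. It is purely illustrative: `replace_pc_thunk_calls` is a hypothetical function, a real rewriter would operate on decoded instructions rather than text, and the `$(orig_ret_addr)` placeholder is kept symbolic as in the slide.

```python
from typing import List


def replace_pc_thunk_calls(insns: List[str], thunk: str = "get_pc_thunk",
                           reg: str = "%ebx") -> List[str]:
    """Replace each call to the PC thunk with a load of the original return address."""
    rewritten = []
    for insn in insns:
        if insn.split() == ["call", thunk]:
            # The call/mov/ret group collapses to a single constant move.
            rewritten.append(f"mov $(orig_ret_addr), {reg}")
        else:
            rewritten.append(insn)
    return rewritten
```

Because the thunk body and its return disappear from the relocated path, this avoids the per-return address translation that instruction-by-instruction emulation would need.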

Page 23: Safe and Efficient Instrumentation

Transformation: Unpacking Code

Original Code

protect:
    call initialize
    …
initialize:
    pop %esi
    mov $(unpack_base), %edi
    mov $0x0, %ebx
loop_top:
    mov (%esi, %ebx, 4), %eax
    call unpack
    mov %eax, (%edi, %ebx, 4)
    inc %ebx
    cmp $0x42, %ebx
    jnz loop_top
    jmp $(unpacked_base)

Relocated Code

protect:
    jmp initialize
    …
initialize:
    mov $(orig_addr), %esi
    mov $(unpack_base), %edi
    mov $0x0, %ebx
loop_top:
    mov (%esi, %ebx, 4), %eax
    call unpack
    mov %eax, (%edi, %ebx, 4)
    inc %ebx
    cmp $0x42, %ebx
    jnz loop_top
    jmp $(unpacked_base)

Page 24: Safe and Efficient Instrumentation

Results

Percentage of PC-Sensitive Instructions (32-bit, GCC, static analysis)

Type of Binary       % PC Sensitive   % Externally Sensitive   % Unanalyzable
Executable (a.out)   9.0%             1.1%                     6.6%
Library (.so)        7.9%             6.9%                     9.1%

Instrumentation Overhead (go, 32-bit, 12.3s base time)

                         Dyninst         S&E (no memory)   S&E (memory)
go (uninstrumented)      21.3s (73.2%)   12.4s (0.8%)      15.0s (22.0%)
go (basic block count)   23.4s (90.2%)   16.3s (32.5%)     19.5s (58.5%)
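The percentages in the overhead table follow from the measured times and the 12.3s base time. The short check below recomputes them. The formula (measured minus base, divided by base) is inferred from the numbers rather than stated on the slide, and `overhead_percent` is a hypothetical helper.

```python
BASE_SECONDS = 12.3  # base time for go, from the table caption


def overhead_percent(measured_s: float, base_s: float = BASE_SECONDS) -> float:
    """Overhead relative to the base time, rounded to one decimal place."""
    return round((measured_s - base_s) / base_s * 100, 1)


# Measured times in seconds, copied from the table.
TIMES = {
    "go (uninstrumented)":    {"Dyninst": 21.3, "S&E (no memory)": 12.4, "S&E (memory)": 15.0},
    "go (basic block count)": {"Dyninst": 23.4, "S&E (no memory)": 16.3, "S&E (memory)": 19.5},
}

OVERHEADS = {
    row: {tool: overhead_percent(t) for tool, t in cols.items()}
    for row, cols in TIMES.items()
}
```

The recomputed values agree with the percentages shown in parentheses in the table.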

Page 25: Safe and Efficient Instrumentation

Future Work

• Memory sensitivity and compensation
  • Improved pointer analysis
  • Useful user intervention?

• Investigate group transformations
• Widen range of input binaries
• Expand supported platforms

Page 26: Safe and Efficient Instrumentation

Questions?


Page 27: Safe and Efficient Instrumentation

ASProtect code loop

8049756: call 8049761

8049761: mov EDX, ECX
8049763: pop EDI
8049764: push EAX
8049765: pop ESI
8049766: add EDI, 2183
804976c: mov ESI, EDI
804976e: push 0
8049773: jz 804977c

8049779: adc DH, 229

804977c: pop EBX
804977d: mov EAX, 2015212641

8049782: mov ECX, EBX(EDI)
8049785: jmp 804979c

804979c: add ECX, 1586986316
80497a2: xor ESI, 314333756
80497a8: xor ECX, 594915733
80497ae: jmp 80497c3

80497c3: sub ECX, 594948778
80497c9: sub ESI, 64260
80497ce: push ECX, ESP
80497cf: mov EAX, 884377321
80497d4: pop EBX(EDI)
80497d7: jmp 80497ed

80497ed: adc AL, 100
80497f0: sub EBX, 1595026050
80497f6: xor EAX, 34778
80497fb: add EBX, 1595026046
8049801: call 804980c

804980c: mov AX, 2783
8049810: pop ESI
8049811: cmp EBX, 4294965344
8049817: jnz 8049834

804981d: or ESI, 839181910
8049823: jmp 8049847

8049834: mov ESI, 1287570375
8049839: jmp 8049782

Page 28: Safe and Efficient Instrumentation

Emulation Examples

Original                     Emulation

add %eax, %ebx               add %eax, %ebx

jnz 0xf3e                    jnz 0xe498d3

call fprintf                 push $804391
                             jmp fprintf

mov (%esi, %ebx, 4), %eax    lea (%esi, %ebx, 4), %eax
                             call mem_addr_translate
                             mov (%eax), %eax

ret                          pop %eax
                             call addr_translate
                             jmp %eax
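Two of the examples on this slide, the call and the ret, can be expressed as a small rewriting function. This is a sketch only: `emulate` is a hypothetical helper working on instruction text, it covers just the call and ret cases, and it prints the pushed address in hexadecimal with a `0x` prefix, which is a formatting choice made here.

```python
from typing import List


def emulate(insn: str, orig_next_addr: int) -> List[str]:
    """Return the emulation sequence for one instruction (call and ret only)."""
    parts = insn.split(None, 1)
    if parts[0] == "call":
        # Push the return address the original call would have pushed,
        # then transfer control without pushing the relocated address.
        return [f"push ${orig_next_addr:#x}", f"jmp {parts[1]}"]
    if parts[0] == "ret":
        # Translate the original return address before jumping to it.
        return ["pop %eax", "call addr_translate", "jmp %eax"]
    return [insn]  # everything else is left unchanged in this sketch
```

For example, `emulate("call fprintf", 0x804391)` yields the push and jmp pair, so the callee sees the original return address on the stack.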