reverse engineering

Reverse Engineering

Grant Curell

What is reverse engineering?

Reverse engineering is the process of discovering the technological principles of a device, object, or system through analysis of its structure, function, and operation.

Why does the military care and, more specifically, why should you?

General reversing

• What is the x86 architecture?

• What is a computer?

• What is the general structure of a computer?

• What differentiates malicious instructions from legitimate instructions?

Registers

• EAX - Accumulator Register

• EBX - Base Register

• ECX - Counter Register

• EDX - Data Register

• ESI - Source Index

• EDI - Destination Index

• EBP - Base Pointer

• ESP - Stack Pointer

• EAX - All major calculations take place in EAX, making it similar to a dedicated accumulator register.

• EDX - The data register is an extension to the accumulator. It is most useful for storing data related to the accumulator's current calculation.

• ECX - Like the variable i in high-level languages, the count register is the universal loop counter.

• EDI - Every loop must store its result somewhere, and the destination index points to that place. With a single-byte STOS instruction to write data out of the accumulator, this register makes data operations much more size-efficient.

• ESI - In loops that process data, the source index holds the location of the input data stream. Like the destination index, ESI has a convenient one-byte instruction (LODS) for loading data out of memory into the accumulator.

• ESP - ESP is the sacred stack pointer. With the important PUSH, POP, CALL, and RET instructions requiring its value, there is never a good reason to use the stack pointer for anything else.

• EBP - In functions that store parameters or variables on the stack, the base pointer holds the location of the current stack frame. In other situations, however, EBP is a free data-storage register.

• EBX - In 16-bit mode, the base register was useful as a pointer. Now it is completely free for extra storage space.
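To make the register roles above concrete, here is a minimal Python sketch of a byte-copy loop written the way LODSB/STOSB/LOOP would run it. It is a model, not real x86: memory is a plain bytearray, the direction flag is assumed clear (addresses count upward), and the function name copy_bytes is made up for this slide.

```python
def copy_bytes(memory, esi, edi, ecx):
    """Model of a LODSB / STOSB / LOOP copy; returns the final (esi, edi, ecx)."""
    while ecx != 0:
        al = memory[esi]   # LODSB: load the byte at [ESI] into AL (low byte of EAX)
        esi += 1           # ... and advance the source index
        memory[edi] = al   # STOSB: store AL at [EDI]
        edi += 1           # ... and advance the destination index
        ecx -= 1           # LOOP: decrement the counter, repeat while non-zero
    return esi, edi, ecx


memory = bytearray(b"Meow\x00\x00\x00\x00")
print(copy_bytes(memory, esi=0, edi=4, ecx=4))  # (4, 8, 0)
print(memory)                                   # bytearray(b'MeowMeow')
```

ESI walks the input, EDI walks the output, ECX counts down, and the accumulator carries each byte in between.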

Commands

command operand1, operand2

command reg1, op1/reg2

command op1/reg1

Where an operand can be a fixed number or a register

Commands – The basics

• sub dest, src - The source is subtracted from the destination and the result is stored in the destination. (dest = dest - src)

• add dest, src - Adds "src" to "dest", replacing the original contents of "dest". Both operands are binary.

• mov dest, src - Copies a byte, word, or doubleword from the source operand to the destination operand.
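A rough Python model of the three basic commands above may help. It assumes 32-bit registers held in a dictionary and treats an operand as either a register name or a fixed number; the helper names (value, mov, add, sub) are illustrative only, and flags are ignored.

```python
MASK32 = 0xFFFFFFFF  # registers are 32 bits wide, so results wrap around


def value(regs, operand):
    """An operand is either a register name (str) or a fixed number (int)."""
    return regs[operand] if isinstance(operand, str) else operand & MASK32


def mov(regs, dest, src):
    regs[dest] = value(regs, src)                       # dest = src


def add(regs, dest, src):
    regs[dest] = (regs[dest] + value(regs, src)) & MASK32  # dest = dest + src


def sub(regs, dest, src):
    regs[dest] = (regs[dest] - value(regs, src)) & MASK32  # dest = dest - src


regs = {"eax": 0, "ebx": 0}
mov(regs, "eax", 10)
mov(regs, "ebx", 3)
sub(regs, "eax", "ebx")  # eax = 10 - 3 = 7
add(regs, "eax", 1)      # eax = 8
sub(regs, "ebx", 4)      # 3 - 4 wraps around to 0xFFFFFFFF
print({name: hex(v) for name, v in regs.items()})
```

Note that the destination is always both an input and the place the result lands, which is why the source value survives and the old destination value does not.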

Commands – push and pop

• push src - Decrements SP by the size of the operand (two or four, byte values are sign extended) and transfers one word from source to the stack top (SS:SP).

• pop dest - Transfers the word at the current stack top (SS:SP) to the destination, then increments SP by the size of the operand (two or four) to point to the new stack top. CS is not a valid destination.
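The push and pop descriptions above can be sketched in Python. This is a simplified model that assumes 4-byte operands only, a tiny stack held in a bytearray, and little-endian storage as on x86; the class name Stack and its size are hypothetical.

```python
import struct


class Stack:
    """Toy 32-bit stack: grows toward lower addresses, like the x86 stack."""

    def __init__(self, size=64):
        self.memory = bytearray(size)
        self.esp = size  # empty stack: ESP sits just past the highest address

    def push(self, value):
        self.esp -= 4  # decrement first ...
        self.memory[self.esp:self.esp + 4] = struct.pack("<I", value & 0xFFFFFFFF)

    def pop(self):
        (value,) = struct.unpack("<I", self.memory[self.esp:self.esp + 4])
        self.esp += 4  # ... increment after reading
        return value


stack = Stack()
stack.push(0x11111111)
stack.push(0x22222222)
print(stack.esp)         # 56: two pushes moved ESP down by 8
print(hex(stack.pop()))  # 0x22222222: last in, first out
```

Push moves ESP down before writing and pop moves it back up after reading, so ESP always points at the most recently pushed value.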

Commands – comparison operators

• xor dest, src - Performs a bitwise exclusive OR of the operands and returns the result in the destination.

• test dest, src - Performs a logical AND of the two operands updating the flags register without saving the result.

• cmp dest, src - Subtracts the source from the destination and updates the flags but does not save the result. The flags can subsequently be checked for conditions.
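As a hedged illustration of the three commands above, the Python sketch below models only the zero flag (ZF) and, for cmp, the carry flag (CF); the real instructions update more flags than this. Operands are assumed to be unsigned 32-bit values, and the function names are invented for the example.

```python
MASK32 = 0xFFFFFFFF


def x86_xor(dest, src):
    """xor saves its result: returns (new dest, ZF)."""
    result = (dest ^ src) & MASK32
    return result, result == 0


def x86_test(dest, src):
    """test ANDs the operands, sets flags, discards the result: returns ZF."""
    return (dest & src) & MASK32 == 0


def x86_cmp(dest, src):
    """cmp subtracts, sets flags, discards the result: returns (ZF, CF)."""
    result = (dest - src) & MASK32
    return result == 0, dest < src  # CF is set on an unsigned borrow


print(x86_xor(0x1234, 0x1234))  # (0, True): xor reg, reg zeroes a register
print(x86_test(0, 0))           # True: test reg, reg checks a register for zero
print(x86_cmp(3, 3))            # (True, False): equal operands set ZF
print(x86_cmp(2, 3))            # (False, True): dest below src sets CF
```

In disassembly these usually appear right before a conditional jump, which reads the flags they leave behind.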

Commands – call

• call proc-name - Pushes the Instruction Pointer (and the Code Segment for far calls) onto the stack and loads the Instruction Pointer with the address of proc-name. Code continues with execution at CS:IP.
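A small Python sketch of a near call, paired with the matching near ret so the round trip is visible. The stack is modeled as a list whose last element is the stack top, segments are ignored, and both addresses are hypothetical.

```python
def call(stack, return_address, target):
    """Near call: push the address of the next instruction, then jump to target."""
    stack.append(return_address)
    return target  # the new instruction pointer


def ret(stack):
    """Near ret: pop the saved return address back into the instruction pointer."""
    return stack.pop()


stack = []
eip = call(stack, return_address=0x401005, target=0x401100)
print(hex(eip), [hex(v) for v in stack])  # 0x401100 ['0x401005']
eip = ret(stack)
print(hex(eip), stack)                    # 0x401005 []
```

The key point for later slides: the return address is ordinary data on the stack, and ret trusts whatever value it finds there.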

Commands – leave and ret

• Leave –

Commands – jumps

Details on Jumps

http://www.penguin.cz/~literakl/intel/j.html

Commands – pointers and lea

• lea dest, src - Transfers offset address of "src" to the destination register.

Question: Is the value of ESI the same or different after each of these instructions? What is its value in each case?

MOV ESI, [EBX + 8*EAX + 4]

and

LEA ESI, [EBX + 8*EAX + 4]
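One way to check your answer is to model both instructions in Python. The sketch assumes a flat little-endian memory in a bytearray; the register values and the 0xDEADBEEF marker are made up so the difference is easy to see.

```python
import struct


def effective_address(ebx, eax):
    """The address expression both instructions share: EBX + 8*EAX + 4."""
    return (ebx + 8 * eax + 4) & 0xFFFFFFFF


def lea_esi(ebx, eax):
    """LEA: ESI receives the computed address itself; memory is never read."""
    return effective_address(ebx, eax)


def mov_esi(memory, ebx, eax):
    """MOV: ESI receives the 4 bytes stored AT the computed address."""
    (value,) = struct.unpack_from("<I", memory, effective_address(ebx, eax))
    return value


memory = bytearray(64)
struct.pack_into("<I", memory, 28, 0xDEADBEEF)  # plant a value at address 28

ebx, eax = 8, 2                        # 8 + 8*2 + 4 = 28
print(lea_esi(ebx, eax))               # 28
print(hex(mov_esi(memory, ebx, eax)))  # 0xdeadbeef
```

Same brackets, different meaning: LEA does the arithmetic and stops, while MOV does the arithmetic and then dereferences the result.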

Example 1

We’ll talk about these later.

The original code...

Calling Conventions

The C convention pushes arguments onto the stack from right to left (i.e., the first argument of the function is placed on the stack last, and thus appears on top). Deleting arguments from the stack is entrusted not to the function, but to the code calling the function.

The Pascal convention pushes arguments onto the stack from left to right (i.e., the first argument of the function is placed on the stack first, and thus appears on the bottom). The deletion of function arguments is entrusted to the function itself.
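The push order of the two conventions can be sketched in Python. The stack is a list whose last element is the stack top, the argument names are placeholders, and the cleanup side (caller versus callee) is only noted in the comments, not modeled.

```python
def push_args_c(args):
    """C convention: push right to left, so the first argument ends up on top.
    The caller removes the arguments after the call returns."""
    stack = []
    for arg in reversed(args):
        stack.append(arg)
    return stack


def push_args_pascal(args):
    """Pascal convention: push left to right, so the first argument is on the bottom.
    The called function removes the arguments itself."""
    stack = []
    for arg in args:
        stack.append(arg)
    return stack


args = ["first", "second", "third"]
print(push_args_c(args))       # ['third', 'second', 'first'] -> top is 'first'
print(push_args_pascal(args))  # ['first', 'second', 'third'] -> top is 'third'
```

When reading a disassembly, the order of the pushes before a call and whether the caller adjusts ESP afterward are the two clues to which convention is in use.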

Example 1 – What’s happening here?

Example 2

Meow

Example 2 – Where was our string stored?

Example 3: Now you know enough to be dangerous

Hint: You’ll need this ASCII TABLE

http://www.asciitable.com/

Some review – Buffer Overflows

Where is data stored?

• Generally: All local variables are stored on the stack.

• Dynamically allocated variables are generally placed in the heap.

Example 4

• There are two flaws in this program. Can you spot them?

What is an access violation and why did it occur? (Pay close attention because the answer will help you with your assignment.)

As compiled with Dev C++

A quick comparison with what it looked like when compiled with Visual Studio

What’s happening here?

What do we expect the stack to look like when we reach this point?

What is this for?

What is this?

Stack after the sub esp, 68

Can you identify what is on the stack right now?

Assuming NO access violation occurred, what do you think happened?

So now what? We want control.

How do you think we should do it?

Time for some math.

Start of our buffer

Return address

0x0022FF60 - 0x0022FF10 = 0x50, or 80 bytes: the first 76 bytes fill the buffer and the last 4 overwrite the return address.
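The same math in Python, taking the two addresses from the slide at face value. The variable names are made up, and the A/B filler simply mirrors the test input used in this example (76 A's, then 4 B's where the return address goes).

```python
import struct

START = 0x0022FF10  # start of our buffer (from the slide)
END = 0x0022FF60    # end of the region we need to fill (from the slide)

total = END - START  # bytes of input needed in all
filler = total - 4   # everything except the 4-byte return address
print(hex(total), total, filler)  # 0x50 80 76

# Test input: A's for the buffer, B's where the return address lives.
test_input = b"A" * filler + b"B" * 4
print(len(test_input))  # 80

# 'B' is 0x42 in the ASCII table, so a return address overwritten
# with four B's shows up in the debugger as 0x42424242.
(overwritten,) = struct.unpack("<I", test_input[-4:])
print(hex(overwritten))  # 0x42424242
```

Using a different letter for the last 4 bytes is what lets you confirm in the debugger that the offset is right: the return address should read 42424242, not 41414141.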

What it looks like after input – notice the return address overwritten by B instead of A.

Return address

So that’s great, but where are we gonna put our shell code?

Registers just before the return

Stack just before the return

Finding a suitable jmp ecx

Now we write some shell code…

Let’s take a look… right before gets

Right after gets… looks good

It didn’t work… why?

This still won’t work. Why?

So we need a new exploitation technique.

Example 5

SEH… What is that?

Let’s get a better look…

What it looks like in Ollydbg… your target

This is the unfiltered exception handler. It is called if no other EH can handle the problem.

So here’s generally what to do…

See later slides for why step 4 is different.

What your malicious payload should look like…

NOPs | Exploit Code | Short JMP | Address of Pop Pop Ret | 16 NOPs | Long Jump To NOP Sled | Large amount of garbage

One difference conceptually…

• Because we do not have space after the SEH, we must put our shellcode before. This means we will have to jump to the NOP sled above and let it “slide” down.

• The encoding you will want to use is: \xE9\x14\xFD\xFF\xFF
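To see what those five bytes actually do, here is a Python sketch that decodes them. Opcode E9 is a near jump followed by a signed 32-bit little-endian displacement, measured from the end of the instruction. The function name and the 0x1000 address are hypothetical, chosen only to show the arithmetic.

```python
import struct


def decode_near_jmp(encoding, address):
    """Decode an E9 rel32 jump located at `address`; returns (rel32, target)."""
    if len(encoding) != 5 or encoding[0] != 0xE9:
        raise ValueError("expected a 5-byte E9 near jump")
    (rel32,) = struct.unpack("<i", encoding[1:])  # signed, little-endian
    # The displacement is relative to the NEXT instruction (address + 5).
    target = (address + 5 + rel32) & 0xFFFFFFFF
    return rel32, target


rel32, target = decode_near_jmp(b"\xE9\x14\xFD\xFF\xFF", address=0x1000)
print(rel32)        # -748
print(hex(target))  # 0xd19
```

The displacement is negative, so this is a backward jump: it lands 748 bytes before the end of the jump instruction, which is how execution gets back up to the NOP sled that sits before the SEH record.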

Generating a Payload Using Backtrack

1) Open a console

2) Type “msfconsole”

3) Type “use payload/windows/messagebox”

4) Type “show options” to see exploit options. Mess with these as you wish.

5) Type “generate” to generate the payload

Note: My payload code is in the notes if you wish to use that.

Let’s take a pictorial look

That questions slide

Insert a generic picture of question marks here
