How to recover malware assembly codes

Jean-Yves Marion, LORIA

Jean-Yves Marion - Laboratoire de Haute Sécurité (LHS)

Duqu : The precursor to the next Stuxnet

Duqu is a targeted attack

Started in June 2010? Discovered in Sept. 2011 by CrySyS (Budapest)

See the Symantec white paper

Code injection

Duqu is similar to Stuxnet

➡ Same installation mechanisms and similar functionalities

➡ But anti-virus companies detected it only in Sept. 2011!

➡ None of the 43 anti-virus engines of VirusTotal was able to detect Duqu, despite knowing Stuxnet.

0-day exploit

Driver File (.sys)

Installer (.dll)

Decryption

DUQU Main DLL Service.exe

The decryption routine of the payload installer

Unpack a UPX file

The main DLL code is now decrypted and unpacked in memory only

Wave 1

Wave 2

Decryption

Duqu is a self-modifying program

A common protection scheme for malware

[Figure: a sequence of code waves (Wave 1, Wave 2, …), each one decrypting the next, down to the payload]

Self-modifying program schema

Self-modifying codes: a bare semantics

$\mu_0[c]$ : binary $c$ loaded into memory $\mu_0$

$\mu_n$ : memory

$\rho_n$ : registers

$\rho_n(ip)$ returns the address of the next instruction to run

Traces

➡ Traces are obtained by code instrumentation: we use Pin (Intel)

We collect an execution trace of P:

For each executed instruction, we gather

– its memory address

– its machine instruction

$(\mu_0[c], \rho_0) \to (\mu_1, \rho_1) \to \dots \to (\mu_n, \rho_n) \to \dots$

Self-modifying codes: dynamic typing of memories

$\tau(m) = (k_r, k_w, k_x)$ where $m$ is a memory address

$k_w$ is the writing level

$k_r$ is the reading level

$k_x$ is the execution level

$\tau_0(m) = (0, 0, 0)$

$(\mu_0[c], \rho_0, \tau_0) \to (\mu_1, \rho_1, \tau_1) \to \dots \to (\mu_n, \rho_n, \tau_n) \to \dots$

The execution level is the level of $\rho_n(ip)$, given by $\tau_n(\rho_n(ip))$


An instruction written at level k has an execution level of k+1

@a:      mov esi, $index
@b:      xor [@offset+esi], $key
@c:      sub esi, 4
@d:      jnz @b
@offset: [encrypted data]

Wave 1 @a,…,@d

Decrypt

Wave 2 @offset

$k_w$ is 1

The execution level of @offset is 2 because it is written by instructions in wave 1

So $k_x$ is 2
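To make the effect of wave 1 concrete, here is a minimal Python sketch that emulates the loop @a..@d on a byte array. It assumes the xor operates on 32-bit little-endian words (suggested by `sub esi, 4`, but not stated on the slide). The key, the payload bytes and the names `run_decryptor` and `demo` are hypothetical, chosen only for illustration.

```python
import struct


def run_decryptor(memory: bytearray, offset: int, index: int, key: int) -> None:
    """Emulate the wave-1 loop @a..@d: XOR 32-bit words in place."""
    if index <= 0 or index % 4 != 0:
        raise ValueError("index must be a positive multiple of 4")
    esi = index                                  # @a: mov esi, $index
    while True:
        addr = offset + esi
        (word,) = struct.unpack_from("<I", memory, addr)
        struct.pack_into("<I", memory, addr, word ^ key)  # @b: xor [@offset+esi], $key
        esi -= 4                                 # @c: sub esi, 4
        if esi == 0:                             # @d: jnz @b (loop while esi != 0)
            break


def demo() -> bytes:
    key = 0x5A5A5A5A                             # hypothetical key
    plain = bytes.fromhex("31c040c3909090c3")    # hypothetical 8-byte payload
    encrypted = bytes(b ^ 0x5A for b in plain)
    # @offset is address 0 here; the loop touches @offset+8 and @offset+4 only.
    memory = bytearray(4) + bytearray(encrypted)
    run_decryptor(memory, offset=0, index=8, key=key)
    return bytes(memory[4:])
```

Calling `demo()` returns the original payload bytes: every address the loop writes becomes wave-2 code once it is executed.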

Self-modifying codes

$k_w$ is the writing level

$k_x$ is the execution level

A self-modifying program $c$ is a program whose execution level is > 1 for some input

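Writing rule. $\rho_i(ip)$ points to an instruction like mov [m],eax, and the execution level is $k + 1$:

\[
\tau_{i+1}(m) =
\begin{cases}
(k_r, k + 1, k_x) & \text{if } m \text{ is written and } \tau_i(m) = (k_r, k_w, k_x) \\
\tau_i(m) & \text{otherwise}
\end{cases}
\]

Execution rule:

\[
\tau_{i+1}(m) =
\begin{cases}
(k_r, k, k + 1) & \text{if } m = \rho_i(ip) \text{ and } \tau_i(\rho_i(ip)) = (k_r, k, k_x) \\
\tau_i(m) & \text{otherwise}
\end{cases}
\]

Similar to the phase semantics of Dalla Preda, Giacobazzi, and Debray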
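The two rules can be sketched in Python as a single pass over a trace. This is a simplified illustration, not the tool's implementation: each instruction is assumed to occupy one address, a trace step is a hypothetical pair (executed address, list of written addresses), and the reading level is left at 0 because the slides give no rule for it. The names `type_trace` and `EXAMPLE_TRACE` are made up for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# One trace step: (address of the executed instruction, addresses it writes).
Step = Tuple[int, List[int]]
Levels = Tuple[int, int, int]  # (k_r, k_w, k_x)


def type_trace(trace: Iterable[Step]) -> Tuple[Dict[int, Levels], List[int]]:
    """Return the final typing and the execution level of every step."""
    tau: Dict[int, Levels] = defaultdict(lambda: (0, 0, 0))  # tau_0(m) = (0, 0, 0)
    levels: List[int] = []
    for ip, written in trace:
        k_r, k, _ = tau[ip]
        tau[ip] = (k_r, k, k + 1)          # execution rule: k_x = k_w + 1
        levels.append(k + 1)
        for m in written:
            m_r, _, m_x = tau[m]
            tau[m] = (m_r, k + 1, m_x)     # writing rule: k_w = current execution level
    return dict(tau), levels


# Toy version of the decryption loop: @a..@d are addresses 0..3 (hypothetical),
# the loop writes addresses 18 and 14, then address 14 is executed.
EXAMPLE_TRACE: List[Step] = [
    (0, []), (1, [18]), (2, []), (3, []),
    (1, [14]), (2, []), (3, []),
    (14, []),
]
```

On `EXAMPLE_TRACE`, every step of the loop runs at level 1 and the last step, which executes a written address, runs at level 2, so the toy program is self-modifying in the sense defined above.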

Packer protections: example (4/5)

• hostname packed with Themida

Different code waves with their relations

Themida packer

Yoda packer


UPX

7 Experimental results

Fig. 7.2: Analysis results

Binary name                    k    k    Blind / Decrypt / Check / Scrambled
hostname.exe (original)        1    1
!EPack_1..exe                  2    2    X X
acprotect-hostname.exe         18   882  X X X X
aspack-hostname.exe            2    3    X X X X
enigma_protector_1.16.exe      5    24   X X X X
exefog_1.1.exe                 3    5    X X X
expressor-hostname.exe         2    3    X X
fsg.exe                        2    2    X X X
mew11.exe                      2    2    X X X
molebox-hostname.exe           3    5    X X X X
morphine_1.9.exe               3    3    X X X
nakedpack.exe                  2    2    X
npack-hostname.exe             2    2    X
nspack.exe                     3    4    X X X
packman_1..exe                 2    2    X X X
pec2-hostname.exe              3    4    X X X X
pelock-hostname.exe            9    16   X X X X
pepack.exe                     1    1    X
pespin-hostname.exe            4    38   X X X X
petite.exe                     2    2    X X
rlpack_1.17_full_version.exe   2    2    X X X
rlpack-hostname.exe            2    2    X
telock_.98.exe                 2    2    X
themida_1.8.5.2.exe            11   164  X X X X
upx-hostname.exe               2    2    X X
vmprotect-hostname.exe         1    1    X
winupack-hostname.exe          3    4    X X X X
Yodas_Crypter_v1.3.exe         4    4    X X X X
yp-1.02-hostname.exe           4    6    X X X X

Legend:
– the first k is the maximal execution level under classical typing (chapter 5, definition 50);
– the second k is the maximal execution level under monotone typing;
– the X marks belong to the Blind, Decrypt, Check and Scrambled columns; rows with fewer than four marks do not show which columns are marked.

Where are we ?

$(\mu_0[c], \rho_0, \tau_0) \to (\mu_1, \rho_1, \tau_1) \to \dots \to (\mu_n, \rho_n, \tau_n) \to \dots$

Dynamic typed memory trace, which defines a sequence of waves

[Figure: Wave 1 → Decrypt → Wave 2 → Decrypt → … → Decrypt → Wave K]

Can we recover the assembly code of wave K?

Can we reconstruct the full CFG?

The inputs:

– An execution trace inside wave K

– The snapshot of the memory at wave K

The problem and its inputs

Can we recover the assembly code of this wave?

The inputs:

– An execution trace inside wave K

– The snapshot of the memory at wave K

Snapshot of the memory at the beginning of wave 5

Dynamic vs Static analysis

A trace obtained by dynamic analysis

Dynamic typed memory trace

Undiscovered code in white boxes

Why is it difficult to recover a CFG in x86 ?

Indirect jumps

100: jmp eax

– Fuzzing

– We need to have a robust approximation of x86 semantics

– Abstract interpretation

– SMT solver

What is the set of possible values of eax ?

Junk code insertion

Junk code insertion at the expected return address

100 : call @a

junk code

@a : …

How to determine the return address of a call ?

125 : pop esi

Modify the return address (125)

See Debray’s paper

Yet another difficulty: misalignment

01006e7a  fe 04 0b        inc byte [ebx+ecx]
01006e7d  eb ff           jmp +1
01006e7e  ff c9           dec ecx
01006e80  7f e6           jg 01006e68
01006e82  8b c1           mov eax, ecx

Figure 1. Overlapping assembly in tELock.

    010059f0  89 f9           mov ecx, edi
,=< 010059f2  79 07           jns +9
|   010059f4  0f b7 07        movzx eax, word [edi]
|   010059f7  47              inc edi
|   010059f8  50              push eax
|   010059f9  47              inc edi
|   010059fa  b9 57 48 f2 ae  mov ecx, aef24857
`-> 010059fb  57              push edi
    010059fc  48              dec eax
    010059fd  f2 ae           repne scasb
    010059ff  55              push ebp

Figure 2. Overlapping assembly in UPX.

2.2.1 tELock0.99

tELock0.99 uses an overlapping technique to obfuscate the code as follows. Figure 1 shows a recursive disassembly starting from the address 01006e7a. There is a jmp +1 instruction at address 01006e7d, encoded on the two bytes eb ff, that jumps to the address 01006e7d+1. The target is a dec ecx instruction (ff c9), which shares the byte ff at address 01006e7d+1 with the jmp instruction.

2.2.2 UPX

UPX uses overlapping to optimize the size of the final packed binary (Figure 2). The unpacker part uses a conditional jump to separate the control flow into two overlapping blocks, which both realign after a few instructions. (TODO: explain the two branches and, briefly, why they are useful)

2.2.3 Overlapping in state-of-the-art disassemblers

Existing disassemblers, even when doing recursive traversal, assume that code cannot overlap and fail to display the resulting disassembly.

With IDA Pro (v6.3), the tELock example looks as follows:

01006E7A  inc byte ptr [ebx+ecx]
01006E7D  jmp short near ptr loc_1006E7D+1
01006E7D ; ---------------------------------
01006E7F  db 0C9h ;
01006E80  db 7Fh ;
01006E81  db 0E6h ;
01006E82  db 8Bh ;
01006E83  db 0C1h ;

With Radare (TODO: recursive?), the tELock example is disassembled as follows:

01006e7a  fe040b  inc byte [ebx+ecx]
01006e7d  ebff    jmp 6e7e
01006e7f  c9      leave
01006e80  7fe6    jg 6e68
01006e82  8bc1    mov eax, ecx

Neither is able to follow the jmp: the target of the jmp lies inside an instruction that has already been disassembled, and is thus deemed invalid.

tELock


IDA fails because of jmp +1

BB [0x0 -> 0x2] (0x3)  0x0 inc byte [ebx+ecx]

BB [0x3 -> 0x4] (0x2)  0x3 jmp 0x4

BB [0x4 -> 0x5] (0x2)  0x4 dec ecx

BB [0x6 -> 0x7] (0x2)  0x6 jg 0xffffffee

BB [0x8 -> 0x9] (0x2)  0x8 mov eax, ecx

Figure 4. Control flow graph for the tELock sample
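The blocks of Figure 4 can be reproduced with a small recursive traversal that accepts overlapping instructions. The sketch below is only an illustration: `decode` is a hypothetical toy decoder that knows just the six encodings appearing in the tELock sample, not a real x86 decoder, and addresses are offsets from 01006e7a as in the figure. A linear sweep is included for contrast; it lands on the `leave` instruction that Radare displays.

```python
from typing import Dict, List, Optional, Tuple

# Bytes of Figure 1, starting at 01006e7a (offset 0x0 in Figure 4).
CODE = bytes.fromhex("fe040bebffc97fe68bc1")

# (mnemonic, length, branch target or None, falls through to next instruction)
Instr = Tuple[str, int, Optional[int], bool]


def decode(code: bytes, addr: int) -> Optional[Instr]:
    """Toy decoder: only the encodings of the tELock sample are known."""
    if code[addr:addr + 3] == b"\xfe\x04\x0b":
        return ("inc byte [ebx+ecx]", 3, None, True)
    if code[addr:addr + 2] == b"\xff\xc9":
        return ("dec ecx", 2, None, True)
    if code[addr:addr + 2] == b"\x8b\xc1":
        return ("mov eax, ecx", 2, None, True)
    if code[addr] in (0xEB, 0x7F) and addr + 1 < len(code):
        rel = code[addr + 1]
        rel = rel - 256 if rel > 127 else rel      # signed 8-bit displacement
        target = addr + 2 + rel
        if code[addr] == 0xEB:
            return ("jmp", 2, target, False)       # unconditional: no fall-through
        return ("jg", 2, target, True)
    if code[addr] == 0xC9:
        return ("leave", 1, None, True)
    return None


def recursive_traversal(code: bytes, entry: int = 0) -> Dict[int, Instr]:
    """Follow control flow from entry; overlapping instructions are allowed."""
    found: Dict[int, Instr] = {}
    work = [entry]
    while work:
        addr = work.pop()
        if addr in found or not 0 <= addr < len(code):
            continue
        ins = decode(code, addr)
        if ins is None:
            continue
        found[addr] = ins
        _, length, target, falls = ins
        if target is not None:
            work.append(target)
        if falls:
            work.append(addr + length)
    return found


def linear_sweep(code: bytes) -> Dict[int, Instr]:
    """Decode instructions back to back, ignoring control flow."""
    found: Dict[int, Instr] = {}
    addr = 0
    while addr < len(code):
        ins = decode(code, addr)
        if ins is None:
            addr += 1
            continue
        found[addr] = ins
        addr += ins[1]
    return found


def overlapping(found: Dict[int, Instr]) -> List[Tuple[int, int]]:
    """Pairs (a, b) with a < b where instruction a covers the first byte of b."""
    addrs = sorted(found)
    return [(a, b) for a in addrs for b in addrs if a < b < a + found[a][1]]
```

`recursive_traversal(CODE)` yields instructions at offsets 0x0, 0x3, 0x4, 0x6 and 0x8, and `overlapping` reports the pair (0x3, 0x4): the jmp and the dec ecx share the byte ff.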




2.2.2 UPX


The control flow graph for this overlapping code is given on figure ??.






Another example of misalignment

UPX

mov ecx, edi
jns +9

Branch not taken:
movzx eax, [edi]
inc edi
push eax
inc edi
mov ecx, aef24857

Branch taken:
push edi
dec eax
repne scasb

Bytes in common: the two branches share 4 bytes

Re-synchronized:
push ebp

Let’s recap the problem

First instruction

Last instruction

TRACER

W1 W2 W3 W4 W5

Snapshot of the memory at the beginning of wave 5

Goal : Reconstruct the full CFG

Problem inputs

Snapshot of the memory at the beginning of wave 5

An execution trace

A path in the woods

Junk code insertion after a call

100: call @a
     junk code
125: pop esi

@a:  pop ebp
     Modify the return address
@b:  ret

Trace: 100: call @a, @a: pop esi, …, @b: ret; 125: pop esi; …

A trace automatically provides the address 125

It is junk code only if it is not reachable

See the paper of Kruegel et al., USENIX 2004, for another approach
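A minimal sketch of how a trace gives the real return address: record, for every executed address, the addresses that were executed right after it. The numeric values 200 and 203 for @a and @b are hypothetical, as is the assumption that the call at 100 is 5 bytes long, so that the expected return address would be 105. Addresses that never appear in the trace are only candidates for junk code, since a single run does not prove they are unreachable.

```python
from collections import defaultdict
from typing import Dict, List, Set

# Addresses executed in order: call @a, @a, @b (ret), then 125.
# 200 and 203 stand in for @a and @b (hypothetical values).
TRACE: List[int] = [100, 200, 203, 125]


def observed_successors(trace: List[int]) -> Dict[int, Set[int]]:
    """Map each executed address to the set of addresses executed right after it."""
    succ: Dict[int, Set[int]] = defaultdict(set)
    for cur, nxt in zip(trace, trace[1:]):
        succ[cur].add(nxt)
    return dict(succ)


def never_executed(trace: List[int], start: int, end: int) -> List[int]:
    """Addresses in [start, end) that do not appear in the trace."""
    executed = set(trace)
    return [a for a in range(start, end) if a not in executed]


successors = observed_successors(TRACE)
# Assumed 5-byte call at 100: the bytes from 105 up to 125 are junk candidates.
junk_candidates = never_executed(TRACE, 105, 125)
```

Here `successors[203]` is `{125}`: the ret at @b goes to 125 rather than to the address following the call, and the same table gives the observed targets of indirect jumps.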

Method for misalignment

… 89 F9 79 07 0F B7 07 47 50 47 B9 57 48 F2 AE 55 …

mov ecx, edi
jns +9

push edi
dec eax
repne scasb

push ebp

An obfuscation similar to UPX

1/ The CFG construction follows the trace

2/ Then, we search for missing code

3/ We split blocks


Misalignment can come from indirect jumps … traces are then useful!

The overall method (work in progress)

• A partial CFG is an incomplete CFG

• Two partial CFGs are in conflict if they contain two misaligned instructions.

• Traces define a set of partial CFGs which are in conflict.

[Figure: partial CFGs of the UPX-like example (mov ecx, edi; jns +9 / push edi; dec eax; repne scasb / push ebp); the misaligned blocks share the same addresses]

• Edges between partial CFGs indicate misalignment

• Then we can synchronize partial CFGs

• There are orphan partial CFGs

• They are OK if there is an edge to a valid address

• Statistical recognition is useful at this stage
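The notion of conflict can be illustrated on the UPX listing of Figure 2. The sketch below takes the two branches as lists of (address, length, text) and reads "misaligned" as: two instructions that share at least one byte but start at different addresses. That reading, and the names `misaligned`, `conflicts`, `shared_byte_addresses` and `resync_address`, are assumptions made for the example, not the definitions used by the tool.

```python
from typing import List, Optional, Tuple

Instr = Tuple[int, int, str]  # (address, length in bytes, text)

# The two branches after "jns +9" in Figure 2.
FALL_THROUGH: List[Instr] = [
    (0x010059F4, 3, "movzx eax, word [edi]"),
    (0x010059F7, 1, "inc edi"),
    (0x010059F8, 1, "push eax"),
    (0x010059F9, 1, "inc edi"),
    (0x010059FA, 5, "mov ecx, aef24857"),
    (0x010059FF, 1, "push ebp"),
]
JUMP_TARGET: List[Instr] = [
    (0x010059FB, 1, "push edi"),
    (0x010059FC, 1, "dec eax"),
    (0x010059FD, 2, "repne scasb"),
    (0x010059FF, 1, "push ebp"),
]


def byte_addresses(ins: Instr) -> set:
    """Addresses of the bytes occupied by one instruction."""
    return set(range(ins[0], ins[0] + ins[1]))


def misaligned(a: Instr, b: Instr) -> bool:
    """Assumed definition: shared bytes, different start addresses."""
    return a[0] != b[0] and bool(byte_addresses(a) & byte_addresses(b))


def conflicts(p: List[Instr], q: List[Instr]) -> List[Tuple[Instr, Instr]]:
    """All misaligned pairs between two partial CFGs (here: instruction lists)."""
    return [(a, b) for a in p for b in q if misaligned(a, b)]


def shared_byte_addresses(p: List[Instr], q: List[Instr]) -> List[int]:
    """Byte addresses that belong to misaligned instructions of both sides."""
    shared: set = set()
    for a, b in conflicts(p, q):
        shared |= byte_addresses(a) & byte_addresses(b)
    return sorted(shared)


def resync_address(p: List[Instr], q: List[Instr]) -> Optional[int]:
    """First address where both sides start an instruction, if any."""
    starts_q = {ins[0] for ins in q}
    common = [ins[0] for ins in p if ins[0] in starts_q]
    return min(common) if common else None
```

On this data, `mov ecx, aef24857` conflicts with the three instructions of the other branch, the two branches share the 4 bytes from 010059fb to 010059fe, and they re-synchronize at 010059ff (`push ebp`).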

Conclusion and Questions

• We are developing a disassembler for self-modifying code:

BinVizz : Visualization of each code wave from a trace
