Intro. to Static Analysis
DESCRIPTION
Basic Static Analysis to Understand Program
TRANSCRIPT
![Page 1: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/1.jpg)
Intro. to Static Analysis
C.K. Chen 2014.09.27
![Page 2: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/2.jpg)
Who am I
• C.K. Chen (陳仲寬)
– Ph.D. student in DSNS Lab, NCTU
– BambooFox CTF
– Research in
• Reverse Engineering
• Malware Analysis
• Virtual Machine
• About this work
– CHI-WEI WANG, CHIA-WEI WANG, CHONG-KUAN CHEN
![Page 3: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/3.jpg)
About DSNS
• Prof. Shiuhpyng Shieh (謝續平)
• Lab research directions
– Malware analysis
– Virtual machines
– Digital forensics
– Network security
![Page 4: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/4.jpg)
Outline
• Intro. to Static Analysis
• Common Tools in Linux for Static Analysis
• Disassemble
• Reverse Assembly to C
– Fundamental ASM
• IDA Pro
– Practice
• Tips for Static Analysis
![Page 5: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/5.jpg)
Intro. to Static Analysis
• Static analysis
– Analyze malware without executing it
• Dynamic analysis
– Execute malware inside a controlled environment and monitor its behavior
![Page 6: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/6.jpg)
Information from Static Analysis
• What information can we get from static analysis?
![Page 7: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/7.jpg)
Information from Static Analysis
• What information can we get from static analysis?
– File structure
– Binary code
– Related modules
– Suspicious strings
![Page 8: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/8.jpg)
Information from Static Analysis
• What can we not get?
– Register values
– Memory values
– Packed code
– Encrypted messages
![Page 9: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/9.jpg)
Usage of Static Analysis
• In typical cases, static analysis is involved in these kinds of problems
– Reverse: Windows, Linux
– Pwn (exploit): Linux, Windows (rare)
• Complements dynamic analysis
![Page 10: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/10.jpg)
First Step to Static Analysis
• Several Linux commands give useful information about a file
– strings
– objdump
– hexdump
– file
![Page 11: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/11.jpg)
Linux Command
• strings
– For each file given, GNU strings prints the printable character sequences that are at least 4 characters long and are followed by an unprintable character.
– Gives clues about the file
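The behavior described above can be approximated in a few lines of Python. This is a minimal sketch, not the GNU implementation: the function name `extract_strings` and the sample bytes are made up for illustration, and only the ASCII range 0x20 to 0x7e is treated as printable.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # Assumption: "printable" means 0x20..0x7e; GNU strings has more options.
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


# Hypothetical file content: a short header, a library path, a too-short run
# ("ab") and a flag-like string.
sample = b"\x7fELF\x01\x00/lib/ld-linux.so.2\x00ab\x00flag{demo}\x00"
found = extract_strings(sample)
```

Runs shorter than four characters, such as `ELF` and `ab` here, are dropped, which is why the output tends to contain mostly meaningful text.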
![Page 12: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/12.jpg)
Linux Command
• file
– file tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests. The first test that succeeds causes the file type to be printed.
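The magic number test can be illustrated with a small lookup on the first bytes of a file. This is only a sketch: the table `MAGIC` and the function `guess_type` are hypothetical, and the real `file` command consults a much larger magic database.

```python
# A tiny, illustrative magic-number table (not the real magic database).
MAGIC = {
    b"\x7fELF": "ELF",
    b"MZ": "DOS/PE executable",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP archive",
}


def guess_type(header: bytes) -> str:
    """Classify a file by the magic number at the start of its content."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    return "data"  # fallback when no magic number matches
```

In practice one would pass the first few bytes read from the file, for example `guess_type(open(path, "rb").read(8))`.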
![Page 13: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/13.jpg)
Linux Command
• hexdump
– The hexdump utility is a filter which displays the specified files, or the standard input if no files are specified, in a user-specified format.
– Hex, octal, character, ...
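To make the output format concrete, here is a minimal Python sketch that renders bytes as offset, hex and character columns. The function `hexdump` below is an illustration written for this note, not the utility itself, and its column layout is only an approximation of a canonical hex+ASCII view.

```python
def hexdump(data: bytes, width: int = 16) -> list[str]:
    """Format data as lines of: offset, hex bytes, printable characters."""
    lines = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexpart = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as '.' in the character column.
        text = "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in chunk)
        lines.append(f"{off:08x}  {hexpart:<{width * 3 - 1}}  |{text}|")
    return lines


for line in hexdump(b"\x7fELF\x01\x01\x01\x00"):
    print(line)
```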
![Page 14: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/14.jpg)
Linux Command
• ldd - print shared library dependencies
– Libraries to be loaded
– Location of each library file
– Load address of each library
![Page 15: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/15.jpg)
Linux Command
• objdump
– Dumps information from an ELF file
– Rich information can be dumped
– Can be used to build the simplest malware analysis system
![Page 16: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/16.jpg)
Objdump
![Page 17: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/17.jpg)
Disassemble
• objdump -D
![Page 18: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/18.jpg)
Global Offset Table
• objdump -R (prints the dynamic relocation entries)
– The GOT is the key to shared libraries in Linux
– GOT hijacking
![Page 19: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/19.jpg)
Disassemble
• Disassembly is the procedure of converting binary machine code into assembly code
![Page 20: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/20.jpg)
Code Discovery Problem
• In a binary file, instructions and data may be mixed within a section
– It is not easy to discover the instructions in the binary
– Especially for variable-length instruction sets like x86
![Page 21: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/21.jpg)
Linear sweep
• Usually starts disassembling at the first byte of the code section and proceeds linearly
• Disassembles one instruction after another until the end of the section is reached
• Does not understand program flow
• Example: objdump
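A linear sweep can be sketched with a toy decoder. The code below is an illustration under strong assumptions: `decode` understands only three x86 opcodes (`jmp rel8`, `push imm8`, `pop eax`), the base address 0x401000 is made up, and everything else is emitted as a raw `db` byte.

```python
def decode(code: bytes, off: int, base: int):
    """Decode one instruction; returns (size, text). Toy subset of x86 only."""
    op = code[off]
    if op == 0xEB and off + 1 < len(code):      # jmp rel8
        rel = int.from_bytes(code[off + 1:off + 2], "little", signed=True)
        return 2, f"jmp 0x{base + off + 2 + rel:x}"
    if op == 0x6A and off + 1 < len(code):      # push imm8
        return 2, f"push 0x{code[off + 1]:x}"
    if op == 0x58:                              # pop eax
        return 1, "pop eax"
    return 1, f"db 0x{op:02x}"                  # unknown byte


def linear_sweep(code: bytes, base: int = 0x401000):
    """Decode byte after byte from the start, ignoring control flow."""
    off, listing = 0, []
    while off < len(code):
        size, text = decode(code, off, base)
        listing.append((base + off, text))
        off += size
    return listing


# jmp over one garbage byte (0x6a), then pop eax (0x58)
listing = linear_sweep(bytes.fromhex("eb016a58"))
```

Because the sweep never follows the `jmp`, it decodes the garbage byte together with the real instruction after it as `push 0x58`, and the intended `pop eax` is lost.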
![Page 22: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/22.jpg)
Recursive Traversal
• Instructions are classified as
– Sequential flow: passes execution to the instruction that immediately follows
– Conditional branching: if the condition is true, the branch is taken and the instruction pointer changes to the branch target; otherwise execution continues linearly (jnz, jne, ...). In a static context, the algorithm disassembles both paths
– Unconditional branching: the branch is taken without any condition; the algorithm follows the execution flow (jmp)
– Function call: like an unconditional jump, but execution returns to the instruction immediately following the call
– Return: every instruction that may modify the program flow adds its target address to a list of deferred disassembly. When a return instruction is reached, an address is popped from the list and the algorithm continues from there (recursive algorithm)
• Some issues
– Indirect code invocations
– Does returning from a call always allow for a faithful disassembly?
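The idea of following control flow with a list of deferred addresses can be sketched as follows. This is a simplified illustration, not a full algorithm: the toy `decode` knows only `jmp rel8`, `push imm8`, `pop eax` and `ret`, the base address 0x401000 is an assumption, and conditional branches and calls (which would queue both successors) are left out.

```python
def decode(code: bytes, off: int, base: int):
    """Return (size, text, kind, target) for a toy subset of x86."""
    op = code[off]
    if op == 0xEB and off + 1 < len(code):      # jmp rel8
        rel = int.from_bytes(code[off + 1:off + 2], "little", signed=True)
        target = base + off + 2 + rel
        return 2, f"jmp 0x{target:x}", "jmp", target
    if op == 0x6A and off + 1 < len(code):      # push imm8
        return 2, f"push 0x{code[off + 1]:x}", "seq", None
    if op == 0x58:                              # pop eax
        return 1, "pop eax", "seq", None
    if op == 0xC3:                              # ret
        return 1, "ret", "ret", None
    return 1, f"db 0x{op:02x}", "seq", None


def recursive_traversal(code: bytes, base: int = 0x401000):
    """Disassemble only bytes reachable by following the control flow."""
    pending, seen = [base], {}                  # deferred addresses, results
    while pending:
        addr = pending.pop()
        while base <= addr < base + len(code) and addr not in seen:
            size, text, kind, target = decode(code, addr - base, base)
            seen[addr] = text
            if kind == "jmp":                   # follow the jump target later
                pending.append(target)
                break
            if kind == "ret":                   # this path ends here
                break
            addr += size
    return sorted(seen.items())


# jmp over one garbage byte (0x6a), then pop eax; ret
listing = recursive_traversal(bytes.fromhex("eb016a58c3"))
```

On this input the garbage byte at 0x401002 is never decoded, because no control-flow edge leads to it.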
![Page 23: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/23.jpg)
Problem of Disassembly
• Remember that a disassembler's output is not always correct
• Linear sweep
– Source:
jmp .destination
db 0x6a ; garbage byte technique
.destination: pop eax
– Disassembler output:
eb 01 jmp 0x401003
6a 58 push 0x58
• Recursive traversal
– Source:
push DWORD .destination
jmp DWORD [esp]
db 0x6a ; garbage byte technique
.destination: pop eax
– Disassembler output:
push DWORD .destination
jmp DWORD [esp]
push 0x58
![Page 24: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/24.jpg)
Reverse Assembly to C
• Register architecture
• The EIP register contains the address of the next instruction to be executed if no branching is done.
![Page 25: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/25.jpg)
Memory Layout
• Stack
– Not stored in the executable
– Local variables
• Heap
– Not stored in the executable
– Dynamically allocated memory
• BSS section
– Uninitialized data
– Global and static variables that are initialized to zero or have no explicit initialization in the source code
• Data section
– Initialized data
– Global and static variables
![Page 26: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/26.jpg)
Variables
• Disassembled code for local and global variables
![Page 27: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/27.jpg)
Local Variables/Arguments
• Caller pushes the arguments onto the stack
• Caller pushes eip (the return address) via the call instruction
• Callee saves (pushes) the caller's ebp
• Callee reserves space for local variables
– sub (applied to esp)
Stack Growing Direction
![Page 28: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/28.jpg)
Data Movement
• MOV dst, src
– dst <= src
• LEA dst, src
– Loads the effective address of the operand into the specified register
– Used to calculate the address of a variable that does not have a fixed address
• Example
– mov eax, [ebp - 4] <= get the content at [ebp - 4]
– mov eax, ebp - 4 <= wrong, no such instruction
– lea eax, [ebp - 4] <= get the address of [ebp - 4]
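The difference between loading a value and loading an address can be modeled with a register dictionary and a memory dictionary. This is a rough sketch: the register values, the memory contents and the helper names `mov_load` and `lea` are all invented for illustration.

```python
regs = {"ebp": 0x1000, "eax": 0}
memory = {0x0FFC: 42}  # assumed local variable stored at [ebp - 4]


def mov_load(dst: str, base: str, disp: int) -> None:
    """mov dst, [base + disp]: read memory at the computed address."""
    regs[dst] = memory[regs[base] + disp]


def lea(dst: str, base: str, disp: int) -> None:
    """lea dst, [base + disp]: compute the address, do not touch memory."""
    regs[dst] = regs[base] + disp


mov_load("eax", "ebp", -4)
value = regs["eax"]       # content of the local variable

lea("eax", "ebp", -4)
address = regs["eax"]     # address of the local variable
```

In decompiled C, the first form corresponds to reading a local variable `x`, the second to taking `&x`.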
![Page 29: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/29.jpg)
Arithmetic Operator
• add dest, src
• sub dest, src
• mul arg
• div
– DIV r/m8
– DIV r/m16
– DIV r/m32
• inc
• dec
![Page 30: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/30.jpg)
Control Instructions
• Flags: each instruction updates some fields of the flags register for use by a later branch
• test
– Performs a bit-wise logical AND
– Sets the ZF (zero), SF (sign) and PF (parity) flags
• cmp
– Performs a comparison between arg1 and arg2
– Sets SF, ZF, PF, CF, OF and AF
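How these flags come about can be sketched in Python for 32-bit operands. This is a simplified model: the function names are made up, `cmp` is modeled as a subtraction whose result is discarded, and PF and AF are omitted for brevity.

```python
def flags_after_cmp(a: int, b: int, bits: int = 32) -> dict:
    """Flags set by `cmp a, b`, modeled as a - b (PF and AF omitted)."""
    mask, sign = (1 << bits) - 1, 1 << (bits - 1)
    a, b = a & mask, b & mask
    res = (a - b) & mask
    return {
        "ZF": res == 0,                           # operands are equal
        "SF": bool(res & sign),                   # result looks negative
        "CF": a < b,                              # unsigned borrow
        "OF": bool((a ^ b) & (a ^ res) & sign),   # signed overflow
    }


def flags_after_test(a: int, b: int, bits: int = 32) -> dict:
    """Flags set by `test a, b`, modeled as a & b (PF omitted)."""
    mask, sign = (1 << bits) - 1, 1 << (bits - 1)
    res = a & b & mask
    return {"ZF": res == 0, "SF": bool(res & sign), "CF": False, "OF": False}
```

A common idiom is `test eax, eax` followed by `je`, which checks whether `eax` is zero because the AND of a value with itself is zero only for zero.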
![Page 31: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/31.jpg)
Branch Instruction

| Instruction | Meaning | Condition |
| --- | --- | --- |
| JE | Jump if Equal | ZF=1 |
| JNE | Jump if Not Equal | ZF=0 |
| JG | Jump if Greater | (ZF=0) AND (SF=OF) |
| JGE | Jump if Greater or Equal | SF=OF |
| JL | Jump if Less | SF≠OF |
| JLE | Jump if Less or Equal | (ZF=1) OR (SF≠OF) |
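The conditions for these six branches can be written directly as predicates over the flags. The sketch below is illustrative: the `CONDITIONS` table and `taken` helper are hypothetical names, and the example flags are those a signed comparison of 1 against 2 would leave behind.

```python
# Branch conditions for signed comparisons, as predicates over the flags.
CONDITIONS = {
    "je":  lambda f: f["ZF"],
    "jne": lambda f: not f["ZF"],
    "jg":  lambda f: not f["ZF"] and f["SF"] == f["OF"],
    "jge": lambda f: f["SF"] == f["OF"],
    "jl":  lambda f: f["SF"] != f["OF"],
    "jle": lambda f: f["ZF"] or f["SF"] != f["OF"],
}


def taken(mnemonic: str, flags: dict) -> bool:
    """Return True if the conditional jump would be taken with these flags."""
    return bool(CONDITIONS[mnemonic](flags))


# Flags after comparing 1 with 2: not equal, negative result, no overflow.
flags = {"ZF": False, "SF": True, "OF": False}
```

When reversing, a `cmp a, b` followed by `jl target` therefore reads as `if (a < b) goto target;` for signed values.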
![Page 32: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/32.jpg)
Stack Operation
• The stack is a LIFO data structure
– PUSH: put data on top of the stack
– POP: take data from the top of the stack
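On 32-bit x86 the stack grows toward lower addresses, so PUSH decrements esp and POP increments it. A minimal model, assuming 4-byte slots and an arbitrary initial esp of 0x1000 (the `Stack` class is invented for this sketch):

```python
class Stack:
    """Toy model of the x86 stack with 4-byte slots."""

    def __init__(self, top: int = 0x1000):
        self.esp = top
        self.mem = {}

    def push(self, value: int) -> None:
        self.esp -= 4               # stack grows toward lower addresses
        self.mem[self.esp] = value

    def pop(self) -> int:
        value = self.mem[self.esp]
        self.esp += 4
        return value


s = Stack()
s.push(1)
s.push(2)
last = s.pop()   # the most recently pushed value comes out first
```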
![Page 33: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/33.jpg)
Function Call
• CALL
– Similar to jmp, but CALL first pushes the return address (the EIP of the next instruction) onto the stack
• RET
– Pops the address at the top of the stack (pointed to by esp) and jumps to that address
• RET num
– Pops the return address, additionally increases esp by num, and jumps to that address
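The interplay of CALL, RET and RET num can be traced with a small simulation. This is a sketch under assumptions: the `CPU` class, the addresses and the 5-byte call length (the size of a near `call rel32`) are chosen for illustration only.

```python
class CPU:
    """Toy model of eip/esp behavior for call and ret."""

    def __init__(self, eip: int = 0x401000, esp: int = 0x1000):
        self.eip, self.esp, self.mem = eip, esp, {}

    def push(self, value: int) -> None:
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self) -> int:
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target: int, insn_len: int = 5) -> None:
        self.push(self.eip + insn_len)   # return address = next instruction
        self.eip = target

    def ret(self, num: int = 0) -> None:
        self.eip = self.pop()            # jump back to the return address
        self.esp += num                  # `ret num` also discards arguments


cpu = CPU()
cpu.push(7)             # caller pushes one 4-byte argument
cpu.call(0x402000)      # enter the function
cpu.ret(4)              # callee returns and removes the argument
```

After `ret 4`, execution resumes right after the call and esp is back at its value from before the argument was pushed.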
![Page 34: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/34.jpg)
Function Pro
• Function prologue
– Save the current EBP (push ebp)
– Copy ESP into EBP (mov ebp, esp)
– Reserve space for local variables
• Function epilogue
– Set ESP to EBP
– Restore EBP
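The effect of the prologue and epilogue on esp and ebp can be traced step by step. In this sketch the starting state (register values, a return address 0x401005 and one argument 7 already on the stack) and the 8 bytes of local space are assumptions made for illustration.

```python
# State right after `call`: return address on top, first argument above it.
regs = {"esp": 0x0FF8, "ebp": 0x2000}
mem = {0x0FF8: 0x401005, 0x0FFC: 7}


def push(value: int) -> None:
    regs["esp"] -= 4
    mem[regs["esp"]] = value


def pop() -> int:
    value = mem[regs["esp"]]
    regs["esp"] += 4
    return value


# Prologue
push(regs["ebp"])               # push ebp
regs["ebp"] = regs["esp"]       # mov ebp, esp
regs["esp"] -= 8                # sub esp, 8 (space for two locals)

# Inside the function, everything is addressed relative to ebp.
first_arg = mem[regs["ebp"] + 8]
return_addr = mem[regs["ebp"] + 4]
mem[regs["ebp"] - 4] = 99       # first local variable

# Epilogue
regs["esp"] = regs["ebp"]       # mov esp, ebp
regs["ebp"] = pop()             # pop ebp
```

This is why, in disassembly, `[ebp + 8]` and above usually denote arguments while `[ebp - 4]` and below denote local variables.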
![Page 35: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/35.jpg)
Calling Convention
• How function arguments are passed must be maintained by the assembly programmer, but in most cases it is handled by the compiler
• stdcall
– Function arguments are passed from right to left
– The callee is in charge of cleaning up the stack
– Return values are stored in EAX
• cdecl
– cdecl (short for C declaration) is a calling convention that originates from the C programming language and is used by many C compilers for the x86 architecture
– The main difference between cdecl and stdcall is that in cdecl the caller, not the callee, is responsible for cleaning up the stack
• pascal
– The pascal calling convention originates from the Pascal programming language
– The main difference between it and stdcall is that the parameters are pushed onto the stack from left to right
• fastcall
– fastcall is a non-standardized calling convention
– Instead of pushing arguments onto the stack, the fastcall convention tends to load them into registers. This results in less memory interaction and increases the performance of a call
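The stack-based conventions above differ in two properties, which can be captured in a small table. This is only a summary sketch: the `Convention` dataclass and `push_sequence` helper are hypothetical, and fastcall is left out because its register usage is not standardized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Convention:
    push_right_to_left: bool   # order in which arguments are pushed
    stack_cleaner: str         # "caller" or "callee"


CONVENTIONS = {
    "stdcall": Convention(push_right_to_left=True, stack_cleaner="callee"),
    "cdecl": Convention(push_right_to_left=True, stack_cleaner="caller"),
    "pascal": Convention(push_right_to_left=False, stack_cleaner="callee"),
}


def push_sequence(name: str, args: list) -> list:
    """Order in which a call site pushes args under the given convention."""
    conv = CONVENTIONS[name]
    return list(reversed(args)) if conv.push_right_to_left else list(args)
```

When reading disassembly, an `add esp, N` right after a `call` hints at cdecl, while a `ret N` at the end of the callee hints at a callee-cleanup convention.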
![Page 36: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/36.jpg)
Function Call Structure
• Function Call Structure
![Page 37: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/37.jpg)
Branch Structure
• Branch Structure
![Page 38: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/38.jpg)
Do-For loop
• Do-For loop
![Page 39: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/39.jpg)
IDA Pro
• IDA Pro is the best-known disassembler/decompiler tool for reversing
– Disassembler
– Friendly GUI
– Decompiler
– Debugger
![Page 40: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/40.jpg)
Overview
Assembly and Control Flow View
Message View
Control Flow View
![Page 41: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/41.jpg)
Functionality (1)
Convert Current Location
• Data
• Instruction
• String
• Self-defined data structure
• Array
Convert Operand
• Offset
• Hex/Oct/Dec/Bin
• Constant char
• Segment-based var
• Stack-based var
• ...
Function Call Window
Xref Table
Graph
When the disassembler makes a mistake, you can fix it yourself
![Page 42: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/42.jpg)
Functionality (2)
Export Function
• Lists functions exported to other binaries
• DLL, entry point
Import Function
• Functions included from other files
• Imported functions can help you guess the behavior of the program
Names
• Function names
• Variable names
• Strings
• For problems with debugger information inside, names can be useful
Strings
• All strings used
• For some easy problems, this can help you get the flag
• For other problems, it still gives you a quick look at the program
![Page 43: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/43.jpg)
Useful Hotkeys
• List of useful hotkeys

| # | Function | Hotkey |
| --- | --- | --- |
| 1 | Strings | Shift+F12 |
| 2 | Jump to operand | Enter |
| 3 | Jump to previous position | ESC |
| 4 | Jump to next position | Ctrl+Enter |
| 5 | Jump to address | G |
| 6 | Jump to entry point | Ctrl+E |
| 7 | Sequence of bytes | Alt+B |
![Page 44: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/44.jpg)
Practice
• Reverse the encryption algorithm in bot.exe
– sub_418f50
– http://140.113.216.151/bot.exe
![Page 45: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/45.jpg)
Decompiler
• A decompiler can help you translate assembly into C code
– Easier to read
![Page 46: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/46.jpg)
But
• Decompiler results are not perfect
– Buggy most of the time
– Lack source-code-level information
• May not support all platforms
– ARM
– x86
– x64
– ...
![Page 47: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/47.jpg)
Reversing Concept
• Identify the important parts of the program
• Backward tracking of user data
• Forward tracking of interesting API functions
• Convert back to C code
![Page 48: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/48.jpg)
Identify important part of program
• Identify what interests you
– Strings: 'flag', 'key', ...
– Functions that read input: scanf(), gets(), ...
– Functions for network communication: recv(), send()
– Read/write file
– ...
![Page 49: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/49.jpg)
Backward tracking user data
• Most program vulnerabilities must be triggered by user input
– You cannot (or can hardly) attack a function that is independent of your input
• Keep track of the variables affected by your input
– Data propagation
• Data dependency
![Page 50: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/50.jpg)
Forward tracking interesting API function
• Most vulnerabilities are caused by certain functions
– strcpy()
– memcpy()
– scanf()
– printf()
– strcat()
– ...
• Try to trigger these functions
• Analyze the control flow and devise a strategy to force the program to reach these functions
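As a starting point for forward tracking, the imported function names of a binary can be checked against a list of functions like the ones above. This is a minimal sketch: the set `RISKY`, the helper `risky_imports` and the sample import list are illustrative, and in practice the names would come from a tool's import listing.

```python
# Functions named on the slides as frequent sources of vulnerabilities.
RISKY = {"strcpy", "memcpy", "scanf", "printf", "strcat", "gets"}


def risky_imports(imports: list[str]) -> list[str]:
    """Return the imported names worth tracking first, sorted by name."""
    return sorted(set(imports) & RISKY)


# Hypothetical import list of a target program.
imports = ["puts", "strcpy", "recv", "printf", "exit"]
targets = risky_imports(imports)
```

Each hit is a candidate to cross-reference in the disassembler and trace back to user-controlled data.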
![Page 51: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/51.jpg)
Convert back to C code
1. Gather information
– IAT
– Strings
– Dynamic analysis
2. Identify function of interest
3. Identify CALLs
4. Identify algorithms and data structures
5. Pseudo-code it!
6. Rename function(s), argument(s), variable(s)
![Page 52: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/52.jpg)
Problem of static analysis
• Encryption/self-modifying code
• Lack of runtime information
• Takes a lot of time to understand the program :(
![Page 53: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/53.jpg)
Advantage
• Why do we still need static analysis?
– Gives you a first impression of the program
– Overview of program flow
– Can be combined with dynamic analysis
![Page 54: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/54.jpg)
Summary
• This course introduces the basic ideas of static analysis
• Introduces some tools for static analysis
• Basic ASM
• How to reverse ASM to C
– Function call
– Memory
• Some tips for static analysis
![Page 55: Intro. to static analysis](https://reader034.vdocuments.net/reader034/viewer/2022052621/5583a0e8d8b42a03088b4981/html5/thumbnails/55.jpg)
Q&A