introduction to malware analysis

Introduction to Malware Analysis

Disclaimer

• This stuff requires the analyst to dive extremely deep into technical details

• This quick talk will attempt to give you a 1,000-foot view of malware analysis

• I draw a careful distinction between Malware Analysis and Reverse Engineering

Malware Analysis Overview

• Static Analysis: involves analyzing the code without actually running it

– File identification, header information, strings, etc.

– Disassembler – IDA Pro

• Dynamic Analysis: involves executing the code in a controlled manner and monitoring system changes

– Sysinternals, memory forensics, etc.

– Debuggers – Immunity Debugger, OllyDbg

Coding Terms

• Malware authors write code in a high-level programming language: C/C++

Static Analysis: File Identification

• Linux “file” utility

• Python-magic module
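
As a rough illustration of what the "file" utility and python-magic do under the hood, the sketch below compares the first bytes of a file against a few well-known signatures. The names SIGNATURES, identify, and identify_file are made up for this example, and the table is tiny compared to the signature database the real tools ship with.

```python
# Minimal magic-byte identification: a simplified stand-in for `file`.
# The signature list is illustrative and far from complete.
SIGNATURES = [
    (b"MZ", "DOS/Windows executable (MZ)"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (also DOCX, JAR, APK)"),
]


def identify(data: bytes) -> str:
    """Return a description based on the leading bytes of the data."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read just enough of the file to compare against the signatures."""
    with open(path, "rb") as handle:
        return identify(handle.read(8))
```

The point is that the file extension is never consulted: a sample named invoice.pdf that starts with MZ is still reported as an executable.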

Static Analysis: MD5 Hash

• Linux “md5sum” utility: md5sum <fileName>

• Python hashlib module:
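
A minimal sketch of hashing a sample with hashlib. The function name md5_of_file and the chunk size are choices made for this example; reading in chunks simply avoids loading a large sample into memory at once.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result should match what md5sum prints for the same file.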

Static Analysis: Strings

• Can be a quick way to gain intelligence from the file:

– Domains, IPs, URLs, function names, hardcoded information
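
A small sketch of what a strings pass looks like in Python: pull out runs of printable ASCII, then pick out anything that looks like a URL or an IPv4 address. The function names and the two regular expressions are assumptions for this example and are deliberately loose; expect false positives.

```python
import re

# Loose patterns for illustration only; they will match some junk.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII at least min_length characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


def find_indicators(strings: list) -> dict:
    """Collect URL-like and IPv4-like substrings from extracted strings."""
    hits = {"urls": [], "ips": []}
    for text in strings:
        hits["urls"].extend(URL_RE.findall(text))
        hits["ips"].extend(IPV4_RE.findall(text))
    return hits
```

This only covers single-byte ASCII strings; Windows binaries often store UTF-16 strings as well, which need a separate pass.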

Static Analysis: Packers

• Packers are used to obfuscate the code, which:

– Changes the file signature (MD5 hash)

– Obfuscates the file strings and code

– Compresses the file size (sometimes)

• Packed code can be identified by:

– Examining the PE sections and imports: a PE file that imports only LoadLibrary/GetProcAddress is normally packed

– Strings: UPX0, UPX1, aspack, adata, NSP0, NSP1, WinRAR SFX, PEC2, PECompact2, Themida, Orean.sys, NTkrnl, Secure Suite

• Tools like PEiD, LordPE, and the Python peutils module can help
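
The string check can be sketched in a few lines. This is a crude substring search over the raw bytes using a subset of the markers listed above, not how PEiD or peutils actually work (they match signatures at specific locations); the names PACKER_MARKERS and find_packer_markers are made up for this example.

```python
# Subset of the marker strings from the slide above.
PACKER_MARKERS = [
    b"UPX0", b"UPX1", b"aspack", b"adata", b"NSP0", b"NSP1",
    b"WinRAR SFX", b"PEC2", b"PECompact2", b"Themida",
]


def find_packer_markers(data: bytes) -> list:
    """Return the marker strings that occur anywhere in the data.

    Plain substring matching is noisy: a short marker such as "adata"
    also appears inside ordinary words like "metadata".
    """
    return [marker.decode("ascii") for marker in PACKER_MARKERS if marker in data]
```

Treat a hit as a hint to look closer at the section names and imports, not as proof that the file is packed.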

Static Analysis: Packers

• Unpacked vs. Packed Strings:

Data Encoding

• Malware uses encoding for a number of reasons, including to disguise internal workings, hide C2 information, and exfiltrate data

• Some simple encoding algorithms are:

– Character substitution

– XOR – uses a static key to XOR with the original value

– Base64 – can use the default or a custom character set

– Default Base64 character set: A-Z, a-z, 0-9, +, /

• We will examine two common data encoding techniques used in malware: XOR and Base64

Data Encoding: XOR

• Strings often have to be stored in a program so they can be passed as parameters to functions

• XOR once = encoded

• XOR again with same key = plaintext
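
The "XOR twice" property is easy to see in code. A minimal sketch, assuming a static key that repeats over the data; the function name xor_bytes and the sample key are made up for this example.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(byte ^ key[index % len(key)] for index, byte in enumerate(data))


plaintext = b"http://c2.example/checkin"
key = b"\x5a"

encoded = xor_bytes(plaintext, key)   # XOR once = encoded
decoded = xor_bytes(encoded, key)     # XOR again with the same key = plaintext
```

The same function encodes and decodes, which is exactly why malware authors like it: one tiny routine covers both directions.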

Data Encoding: Base64

• Storing base64 strings as HTML comments is how the APT group “Comment Crew” got their name. This technique is still leveraged today in malware

• Base64 is a common encoding scheme because it is very easy to decode
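
A sketch of decoding Base64 with both the default and a custom character set. The idea for the custom case is to translate the custom alphabet back to the default one and then decode normally; the function name decode_custom_base64 is made up for this example, and it assumes the custom alphabet still uses "=" for padding.

```python
import base64

STANDARD = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def decode_custom_base64(encoded: bytes, alphabet: bytes) -> bytes:
    """Decode Base64 that was produced with a non-default 64-character set."""
    if len(alphabet) != 64 or len(set(alphabet)) != 64:
        raise ValueError("alphabet must contain 64 unique characters")
    # Map each custom character back to its position in the default set.
    table = bytes.maketrans(alphabet, STANDARD)
    return base64.b64decode(encoded.translate(table))


# Default character set: the standard library handles it directly.
default_decoded = base64.b64decode(b"aGVsbG8=")
```

If a string looks like Base64 but decodes to garbage, a custom character set is a good first guess; the alphabet is usually sitting in the binary as a 64-character string.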

Static Analysis: PE File Format

• The PE data structure contains all the information required for the Windows OS loader to manage executable code. Common sections:

– .text – Instructions the CPU executes

– .rdata – Imports and Exports

– .data – Global data

– .rsrc – Resources (icons, images, strings, etc.)

• Useful information in the PE header:

– Imports and Exports – Give an idea of the malware's functionality

– Compilation Time, Language Settings, and strings

– Section Names – Packed code can have non-standard section names

• Tools to analyze PE header: pescanner.py, CFF Explorer, python pefile, Resource Hacker, Dependency Walker, LordPE, etc.
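
To show where the compilation time and section names live, here is a sketch that reads them straight from the headers with the standard library. In practice pefile does this (and far more) for you; the names parse_pe_header, unusual_sections, and STANDARD_SECTIONS are made up for this example, and the "standard" list only contains the four sections named above.

```python
import struct
from datetime import datetime, timezone

# Only the sections named on the slide; real binaries have other normal ones.
STANDARD_SECTIONS = {".text", ".rdata", ".data", ".rsrc"}


def parse_pe_header(data: bytes) -> dict:
    """Return the compile time and section names of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the signature.
    (_machine, section_count, timestamp, _symtab, _symcount,
     optional_size, _flags) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    # The section table follows the optional header. Each entry is 40 bytes
    # and starts with an 8-byte, NUL-padded name.
    table = pe_offset + 24 + optional_size
    names = []
    for index in range(section_count):
        start = table + 40 * index
        names.append(data[start:start + 8].rstrip(b"\x00").decode("ascii", "replace"))
    return {
        "compiled": datetime.fromtimestamp(timestamp, tz=timezone.utc),
        "sections": names,
    }


def unusual_sections(names: list) -> list:
    """Return section names that are not in the small standard list."""
    return [name for name in names if name not in STANDARD_SECTIONS]
```

Keep in mind the compile timestamp is just a field in the file and can be forged by the author.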

Windows API Calls:

• When performing advanced static or dynamic analysis it’s important to have a good understanding of Windows API calls

• By looking at the imported functions within the PE header you can see which Windows API functions the PE file wants to utilize

• Recognizing API calls lets you quickly get an idea of the malware’s functionality, both when reviewing strings output and during advanced static analysis with a disassembler

• An excellent resource for Windows API calls is MSDN. Google search “API_Function MSDN”

Windows API: MSDN Example

• The Parameters modify how the function will be used on the system.

• The return type is what the function will return after it is called in a program

Windows API: Disassembly

• Parameters are pushed onto the stack in Last In First Out (LIFO) order, which is why they appear in reverse order in the disassembly
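
A toy simulation of why the order flips, assuming the usual 32-bit x86 conventions where the caller pushes arguments right to left. The function names and the parameter labels are hypothetical; nothing here touches a real stack.

```python
def push_order(parameters: list) -> list:
    """Return the parameters in the order the caller pushes them (last first)."""
    return list(reversed(parameters))


def simulate_call(parameters: list) -> list:
    """Push the parameters, then pop them the way the callee would see them."""
    stack = []
    for value in push_order(parameters):
        stack.append(value)  # each append stands in for a PUSH instruction
    # The last value pushed is on top, so the first parameter comes off first.
    return [stack.pop() for _ in range(len(stack))]
```

For a call documented as f(first, second, third), the disassembly shows push third, push second, push first, then the call.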

Wake Up

• Okay, that was likely starting to bore some people – SORRY

• Let’s move to Dynamic analysis which is more flashy

Getting Infected

• Double clicking the executable doesn’t always work

– Sometimes you need to register the malware as a service or load it as a DLL (regsvr32.exe and rundll32.exe)

• Install the malware as a service

– Interact with the system like a normal user: the malware may be waiting for a certain application to open so it can inject code into it (Ex: Internet Explorer)

– It could require a CLI argument: one sample required <filename> /install in order to actually run the malware

– Static analysis is normally required to determine CLI switches

SysInternals Tool Suite

• If I could pick just one tool, I’d pick the 50+ in the Sysinternals tool suite

• Tools put out by Mark Russinovich, who now works for Microsoft

• Process Explorer, Process Monitor, Autoruns, etc.

Process Explorer

Process Monitor

• Very verbose tool that generates a lot of events

• Filtering is required to make sense of the data

Process Monitor Cont.

• Press Ctrl+L to bring up the filtering dialog box

• Quick filters are:

– Operation is WriteFile

– Category is Write
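
The same filtering can be done offline on an exported log. This sketch assumes the capture was saved from Process Monitor as CSV and that the export has an "Operation" column, as the default export does; the function name filter_events is made up for this example.

```python
import csv


def filter_events(lines, operation: str = "WriteFile") -> list:
    """Return the rows whose Operation column equals the given operation.

    `lines` can be an open text file or any iterable of CSV lines.
    """
    return [row for row in csv.DictReader(lines) if row.get("Operation") == operation]
```

For a real export, something like open("Logfile.CSV", newline="", encoding="utf-8-sig") is a reasonable way to open the file, since the export may start with a byte order mark; the file name here is only a placeholder.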

Malware Persistence - Autoruns

• This really is the key to identifying malware – how does it gain persistence?

• Autoruns can help enumerate persistence mechanisms:

Monitoring Network Activity

• Some interesting network indicators of malware are:

– SYNs out to an IP or domain

– UDP traffic to IP or domain

– HTTP GET/POST requests

– DNS Queries

– Connection attempt timing is important: every 1 min, 30 mins, etc.
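
Regular connection timing can be checked with a few lines of arithmetic. A rough sketch, assuming you already have connection timestamps in seconds (for example from a packet capture); the function name beacon_interval and the 10% tolerance are arbitrary choices for this example.

```python
from statistics import mean, pstdev


def beacon_interval(timestamps: list, tolerance: float = 0.1):
    """Return the average gap in seconds if the connections look periodic.

    The gaps count as periodic when their standard deviation is within
    `tolerance` of the average gap. Returns None otherwise.
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None  # not enough connections to judge
    average = mean(gaps)
    if average > 0 and pstdev(gaps) / average <= tolerance:
        return average
    return None
```

Malware that adds random jitter to its check-in time will slip past a tight tolerance, so treat a None result as "not obviously periodic" and nothing more.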

Automation? Sandboxes

• So far the basic dynamic analysis we have talked about can be automated

• Sandboxes are a good tool in any malware analyst’s toolbox – they have pros and cons:

– Pros: Speeds up analysis, fast, saves time

– Cons: Misses details, can be fooled

• Sandboxes can be open source or commercial:

– A really good free option is Cuckoo Sandbox:

• Install Tutorial: http://www.primalsecurity.net/im-cuckoo-for-malware-with-a-spice-of-reverse-engineering/


Summary

• Malware analysis requires both static and dynamic analysis techniques to accurately enumerate indicators of compromise

• As with any automated tool, an analyst will need to be able to validate findings manually
