![Page 1: Towards Network Containment in Malware Analysis Systemss3.eurecom.fr/slides/acsac12_graziano.slides.pdf · 2020-06-17 · Malware Analysis Scenario Analysis based on Sandboxes (API](https://reader030.vdocuments.net/reader030/viewer/2022041107/5f0a3fa77e708231d42abb6d/html5/thumbnails/1.jpg)
Towards Network Containment in Malware Analysis Systems
Mariano Graziano, Corrado Leita, Davide Balzarotti
ACSAC, Orlando, Florida, 3-7 December 2012
Malware Analysis Scenario
● Analysis based on sandboxes (API hooking, emulation)
● Complex and distributed infrastructure at security companies
● Malware behavior often depends on external factors (C&C servers)
● Sophisticated attacks involve multiple stages
Malware Execution Stages
● MALWARE → DNS: DNS name resolution
● MALWARE → WEBSERVER: download additional components, check Internet connectivity
● MALWARE → C&C SERVER: receive commands, exfiltrate information
● MALWARE → PCs: extend infected population
Repeatability & Containment
● MALWARE → DNS: DNS name resolution
● MALWARE → WEBSERVER: web server unreachable, impossible to download the components
● MALWARE → C&C SERVER: receive commands, exfiltrate information
● MALWARE → PCs: impossible to harm other machines
CONTAINMENT
Goal
● Goal:
– Model/Replay the network traffic for malware containment and experiment repeatability
● Motivation:
– Malware behavior often depends on the network context
– Experiments are not repeatable over time
– Sandbox containment of polymorphic variations
Malware Containment
● Only possible in case of:
– Polymorphic variations
– Re-execution of the same sample
● Full containment → Repeatable execution
● Current containment solutions:

| APPROACH | CONTAINMENT | QUALITY |
|---|---|---|
| Full Internet Access | ✗ | ~ |
| Filter/Redirect specific ports | ~ | ~ |
| Common service emulation | ✔ | ~ |
| Full Isolation | ✔ | ✗ |
Roadmap
● Introduction
● Protocol Inference
● System Overview
● Evaluation
ScriptGen [1]
● Existing suite of protocol learning techniques developed for high-interaction honeypots
● It rebuilds portions of a protocol's finite state machine (FSM) by observing samples of network interaction between a client and a server implementing that protocol
● No assumption is made about the protocol structure, and no a priori knowledge of the protocol semantics is assumed
[1] Corrado Leita, Ken Mermoud, Marc Dacier, “ScriptGen: an automated script generation tool for honeyd”, ACSAC 2005, 21st Annual Computer Security Applications Conference, December 5-9, 2005, Tucson, USA
Finite State Machine
● It is a tree:
– The vertices contain the server’s answer
– The edges contain the client’s request
[Figure: SMTP Finite State Machine]
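The tree described on this slide can be sketched in a few lines of Python. This is only an illustration of the data structure (answers on vertices, requests on edges), not ScriptGen's code: the names `Node`, `add_trace` and `replay`, the SMTP strings, and the exact byte-for-byte matching of requests are all assumptions made here for brevity. A real protocol learner would need to tolerate variable fields in the requests.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A vertex of the tree: holds the server's answer for this state."""
    answer: bytes
    # Outgoing edges, keyed by the client's request that triggers them.
    edges: Dict[bytes, "Node"] = field(default_factory=dict)


def add_trace(root: Node, trace: List[Tuple[bytes, bytes]]) -> None:
    """Insert one conversation, given as (client request, server answer) pairs."""
    node = root
    for request, answer in trace:
        # Reuse the edge if this request was already seen in this state.
        node = node.edges.setdefault(request, Node(answer))


def replay(root: Node, requests: List[bytes]) -> List[Optional[bytes]]:
    """Return the stored answer for each request, or None once the path is unknown."""
    answers: List[Optional[bytes]] = []
    node: Optional[Node] = root
    for request in requests:
        node = node.edges.get(request) if node is not None else None
        answers.append(node.answer if node is not None else None)
    return answers


# Hypothetical SMTP conversations; the root carries the server greeting.
smtp = Node(b"220 mail.example ESMTP\r\n")
add_trace(smtp, [(b"HELO bot\r\n", b"250 Hello\r\n"), (b"QUIT\r\n", b"221 Bye\r\n")])
add_trace(smtp, [(b"HELO bot\r\n", b"250 Hello\r\n"),
                 (b"MAIL FROM:<a@example>\r\n", b"250 OK\r\n")])
```

Both conversations share the `HELO` edge and then branch, which is what makes the structure a tree rather than a list of recorded sessions.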
Roadmap
● Introduction
● Protocol Inference
● System Overview
● Evaluation
System Overview
● Traffic Collection
– By running the sample in a sandbox or by using past analyses
● Endpoint Analysis
– Cleaning and normalization process
● Traffic Modeling
– Model generation (two ways: incremental learning or offline)
● Traffic Containment
– Two modes (full or partial containment)
Traffic Model Creation
[Figure: SANDBOX, NETWORK TRACES, ENDPOINT ANALYSIS (CLUSTERING, NORMALIZATION), TRAFFIC MODELING (SCRIPTGEN)]
Mozzie – Full Containment
[Figure: SANDBOX, TRAFFIC CONTAINMENT (FSM Player)]
Mozzie – Partial Containment
[Figure: SANDBOX, TRAFFIC CONTAINMENT (FSM Player, Refinement), REMOTE SERVER]
Partial containment
[Figure: SETUP PHASE, PROXY PHASE, FULL CONTAINMENT]
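One way to read the two containment modes is sketched below. It is not Mozzie's implementation: the class `Player`, the `Model` type, the `fake_remote` function and the IRC-like strings are hypothetical, and the behaviour on an unknown request (forward it to the remote server and store the answer as a refinement) is an assumption based only on the labels in the figures. A flat lookup table stands in for the FSM tree to keep the sketch short.

```python
from typing import Callable, Dict, Optional, Tuple

# Maps (state, client request) to (server answer, next state).
Model = Dict[Tuple[int, bytes], Tuple[bytes, int]]


class Player:
    """Replays a traffic model; with a remote callable it runs in partial containment."""

    def __init__(self, model: Model,
                 remote: Optional[Callable[[bytes], bytes]] = None) -> None:
        self.model = model
        self.remote = remote  # None means full containment
        self.state = 0
        # First state number not yet used by the model.
        self.next_free = 1 + max((nxt for _, nxt in model.values()), default=0)
        self.misses = 0

    def handle(self, request: bytes) -> Optional[bytes]:
        key = (self.state, request)
        if key in self.model:
            answer, self.state = self.model[key]
            return answer
        self.misses += 1
        if self.remote is None:
            return None  # full containment: nothing leaves the sandbox
        # Partial containment: proxy the request, then refine the model.
        answer = self.remote(request)
        self.model[key] = (answer, self.next_free)
        self.state = self.next_free
        self.next_free += 1
        return answer


def fake_remote(request: bytes) -> bytes:
    # Hypothetical stand-in for the real remote server.
    return b"PONG\r\n" if request.startswith(b"PING") else b"ERR\r\n"


model: Model = {(0, b"NICK bot\r\n"): (b"001 welcome\r\n", 1)}
partial = Player(model, remote=fake_remote)
first = [partial.handle(b"NICK bot\r\n"), partial.handle(b"PING 1\r\n")]
full = Player(model)  # same model, now refined, and no remote server
second = [full.handle(b"NICK bot\r\n"), full.handle(b"PING 1\r\n")]
```

In this sketch the first run needs the remote server once (`partial.misses` is 1), while the second run is answered entirely from the refined model, which is the point at which full containment becomes possible.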
Roadmap
● Introduction
● Protocol Inference
● System Overview
● Evaluation
Experiments
● Goals
– Find the minimum number of network traces needed to generate an FSM that fully contains the network traffic
– Learn optimal parameters for commonly used protocols (HTTP, IRC, DNS, SMTP) and custom protocols
● Two groups of experiments
– Offline
– Incremental learning
Offline Experiments

| Sample | Category | Containment | Normalization | Traces |
|---|---|---|---|---|
| W32/Virut | IRC Botnet | FULL | NO | 15 |
| PHP/PBot.AN | IRC Botnet | FULL | NO | 12 |
| W32/Koobface.EXT | HTTP Botnet | 72% | YES | 9 |
| W32/Agent.VCRE | Dropper | FULL | NO | 23 |
| W32/Agent.XIMX | Dropper | FULL | YES | 10 |
Incremental Learning Experiments

| Sample | Category | Runs | Containment | Normalization |
|---|---|---|---|---|
| W32/Banload.BFHV | Dropper | 23 | FULL | NO |
| W32/Downloader | Dropper | 25 | FULL | NO |
| W32/Troj_generic.AUULE | Ransomware | 4 | FULL | NO |
| W32/Obfuscated.X!genr | Backdoor | 6 | FULL | NO |
| SCKeylog.ANMB | Keylogger | 14 | FULL | YES |
Results
● Tested samples: 2 IRC botnets, 1 HTTP botnet, 4 droppers, 1 ransomware, 1 backdoor and 1 keylogger
● Required network traces range from 4 to 25 (average 14)
● Lower bound for DNS: 6 traces
● On average, the number of traces is reasonable (polymorphism, packing techniques)
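The range and average quoted above can be checked against the two experiment tables. The short calculation below assumes that the "Runs" column of the incremental learning table counts traces in the same way as the "Traces" column of the offline table; the variable names are illustrative.

```python
# Traces (offline) and runs (incremental learning), as listed in the two tables.
offline = {
    "W32/Virut": 15,
    "PHP/PBot.AN": 12,
    "W32/Koobface.EXT": 9,
    "W32/Agent.VCRE": 23,
    "W32/Agent.XIMX": 10,
}
incremental = {
    "W32/Banload.BFHV": 23,
    "W32/Downloader": 25,
    "W32/Troj_generic.AUULE": 4,
    "W32/Obfuscated.X!genr": 6,
    "SCKeylog.ANMB": 14,
}

counts = list(offline.values()) + list(incremental.values())
lowest, highest = min(counts), max(counts)
average = sum(counts) / len(counts)  # 141 traces over 10 samples
print(lowest, highest, average)      # 4 25 14.1
```

Under that assumption the ten samples give a minimum of 4, a maximum of 25 and a mean of 14.1, which matches the figures on the slide.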
Limitations
● Protocol-agnostic approach
✔ Find a good trade-off
● Analysis of encrypted protocols is impossible
✔ API-level solution
✔ MITM solution
● Malware with different behaviors (domain flux)
✔ Improve the training set
✔ Protocol-aware heuristics
Use Cases
● Repeat the analysis after weeks/months
● Analysis of similar variations (polymorphic) of the same sample
● Provide network containment for privacy/ethical issues
● Analysis of sophisticated attacks (Stuxnet/SCADA systems)