learning factor graphs for preempting multi-stage attacks in...
TRANSCRIPT
1 2 3 4 5 6 7 8 9 10Iteration
0.0
0.2
0.4
0.6
0.8
1.0
g(x,
z 1,z
2)
EM learning for g(x, z1, z2)
z1 = 0z1 = 1z2 = 0z2 = 1
Approach
Learning Factor Graphs for Preempting Multi-Stage Attacks in Cloud InfrastructurePhuong Cao, Alexander Withers, Zbigniew Kalbarczyk, Ravishankar IyerUniversity of Illinois at Urbana-Champaign
Result on learning parametersExpectation-Minimization algorithm shows a fast convergence rate, i.e., 3 iterations, of marginal probability of zi for a 3-variable clique.
Goals• Detect multi-stage attacks that make use of stolen credentials in large
enterprise networks, e.g., cloud infrastructure• Employ factor graphs, a probabilistic graphical model, to capture
attacker behavior and detect malicious activities.– Learning graph structure that represents dependencies among observed events and attack
stages
– Learning graph parameters, i.e., factor functions among observed events and hidden attack stages, that represents strength of their dependencies using a factor graph correspond to ongoing attack
Future Work1. Automatically build graphs for evaluating
security of pre-deployed cloud applications
2. Evaluate of learned graphs in terms of detection accuracy or model complexity
3. Build models to automatically respond to on-going attacks
References[1] Koller, Daphne, and Nir Friedman. Probabilistic graphical models: principles and techniques. MIT press, 2009. [2] Cao, Phuong, et al. "Preemptive intrusion detection." Proceedings of the 2014 Symposium and Bootcamp on the Science of Security. ACM, 2014.
AcknowledgementThis material is based upon work supported by the Maryland Procurement Office under Contract No. H98230-14-C-0141National Center for Supercomputing Applications staffs for insightful discussions on incident reports of multi-stage attacks
Attack Paths
</> —{} —————
Cloud ConfigMarkup Document
ConstructAttack Graph
…
Attack Graph
I ——C ——D——
Pastincidents
Events
Factor functions
Learnparameters
Build factor graph
DetermineAttackStage
𝑠1𝑠2𝑠3𝑠4𝑠5
=
0.10.30.10.40.1
Motivating Example: A Credential Stuffing Attack
Steal credentials / secret keys
Launch Denial of Service
Send spams
Mine Bitcoins
Target Site
Leaked Credentials
Botnet
Black Markets Password*********
Compromised Machine
Firewall
File system/dev/shm
Compile RHEL exploit
Escalate to #root
Attacker
Remote Location
srcseq
dst
ack…
3 x
Venom Rootkit
Port 9090
Loadable Kernel Module
Userland backdoor
Network scanning tools
X
Buy, steal, or fishcredentials
1
3 Drop exploit kit
ssh-rsaAAAAB3…
2 Masquerade
4 Escalate privilege
5 Deploy rootkit
6 Open port upon receivingport knocking sequence
8 Perform attack payloads
Invisible to network scanning tools
7 Send port knocking sequenceTCP dst + seq = 1221
</>Automated
guessing
Account takeover $$$
Credential Stuffing: Attacks that exploit leaked credentials for automated penetration of systems.
𝑧+ 𝑧, 𝑧- 𝑧. 𝑧/
𝑥+ 𝑥, 𝑥- 𝑥. 𝑥/ 𝑥1
𝑧1𝑧2𝑧3𝑧4𝑧5
=
0.10.30.10.40.1
𝑧𝑖 ∈ 𝑍, z8 ∈ 0,1 representhiddenattackstages𝑥𝑖 ∈ 𝑋,x𝑖 ∈ [1,|X|]representobservedevents
𝑓(𝑠1) 𝑧10.10 0
0.90 1
𝑔(𝑥1, 𝑧1, 𝑧2) 𝑥1 𝑧1 𝑧2? 1 0 0
? 1 0 1
? 1 1 0
? 1 1 1
𝒛𝟏 :initialcompromise𝒛𝟐 :hosthopping𝒛𝟑 :escalateprivileges𝒛𝟒 :maintainpresence𝒛𝟓 :deliverpayload
The most probable attack stage:”maintain presence”
𝑥1 :incorrectlogins𝑥2 :successfullogin𝑥3 :sensitivefilesyslocation𝑥4 :downloadsourcefile𝑥5 :compile𝑥6 :newkernelmodule
𝑷 𝒙, 𝒛|𝜽 =𝟏𝒁^𝜽𝒄𝑻𝒇𝒄(𝒙𝒄, 𝒛𝒄)𝒄∈𝓒
h(𝑧2) 𝑧20.70 0
0.30 1
Graph structure consist of edgesspecify link among variables
I ——C —D——
PastIncidentsDataset D
IndependenceTest
Dependent Events
𝑧,𝑥+
𝑧-𝑥-
𝑧/𝑥/
…
Events at run-time
ExpectationMaximization of 𝜃
Factor Functions
𝑧,𝑥+
𝒈(𝒙𝟏, 𝒛𝟏, 𝒛𝟐) 𝒙𝟏 𝒛𝟏 𝒛𝟐
? 1 0 0
… … … …
𝜃ef
Ongoing attacks
Bro logsSystem logsNetwork flows
Learning graph structure (offline and runtime)The goal of learning graph structure, i.e., factor graph, is to automatically establish dependencies among observed events and hidden attack stages by using an Χ,independence test on training data D. The dependencies are used to construct a set of model candidates 𝑚i ∈ 𝑀 , e.g., simple model using only strongest dependencies or complex model using all dependencies.
𝑠𝑐𝑜𝑟𝑒 opq 𝑚i 𝐷 = 𝑚𝑎𝑥tlog𝑃(𝑥, 𝑧,𝑚i, 𝐷, 𝜃) + 𝑙𝑜𝑔 𝜃𝑃 𝜃 𝑚i − 𝑑𝑖𝑚 𝑚i 𝑙𝑛|𝐷|
A model candidate𝑚i is scored based on three terms in respective order: - Goodness of fit with training data (𝑚𝑎𝑥tlog𝑃(𝑥, 𝑧,𝑚i, 𝐷, 𝜃))- Entropy of 𝜃 to avoid overfitting and favor model stability (𝑙𝑜𝑔 𝜃𝑃 𝜃 𝑚i
- Complexity of model and availability of training data (𝑑𝑖𝑚 𝑚i 𝑙𝑛|𝐷|)
Learning graph parameters (offline)The goal of learning graph parameters is to automatically define parameters 𝜃efof factor functions in tabular forms.
Expectation Maximization algorithm is used for learning parameters of each factor function because it can handle missing or incomplete training data, which is the case for most multi-stage attacks.
Input. Training dataset D of past attacksInit. Start with a random initialization of 𝜃|
Repeat each iteration until converge:E-step. Calculate expected likelihood of log likelihood function
𝑄 𝜃~�+ 𝜃~ = 𝐸 𝑧 𝑥, 𝜃~ [log(𝑃(𝑥, 𝑧, 𝜃~)]
M-step. Maximize parameters of 𝜃 ~�+
𝜃 ~�+ = 𝑎𝑟𝑔𝑚𝑎𝑥 t 𝑄 𝜃~�+ 𝜃~
Inference of ongoing attack stages (runtime)Given a factor graph of an ongoing attack at runtime, inference is to determine the most likely unknown attack stage and output a confidence level for each stage.
𝑧∗ = 𝑎𝑟𝑔𝑚𝑎𝑥 � P(x, z, 𝜃)At this stage, off-the-shelf inference techniques such as Belief Propagation, Monte Carlo Markov Chain, or Variational Inference can be employed.
Determine response to the identified attack stage, e.g., zi = maintain presence is the most probable attack stage in this example.0.1
0.3
0.1
0.4
0.1
0
0.1
0.2
0.3
0.4
0.5
1 2 3 4 5
P(z)
Illustration of the proposed approach in the context of a credential stuffing attack.