Predicting enterprise cyber incidents using social network analysis on the darkweb hacker forums
Soumajyoti Sarkar, Mohammad Almukaynizi, Jana Shakarian, Paulo Shakarian
Department of Computer Science
Arizona State University
Cyber Attack Prediction with Unconventional Signals
1. “When” will the cyber attack happen?
2. “What” are the indicators of the attack?
[Figure: attack ML chain with a feature extraction period, vuln. feature extraction and social signal extraction]
Social signal extraction
1. Use social network analysis on hacker forum discussions
2. Enterprise-specific trained models
[Figure: attack ML chain: interaction patterns from darkweb forums, ML model, predict future attacks]
Dataset – Enterprise attacks
❑ Data available from a research program** that provided cyber attack incidents from Armstrong corporation.
❑ Period of April 2016 to September 2017.
❑ Two kinds of attacks reported in the ground truth data from Armstrong:
– Malicious email: an email containing a malicious attachment or a link to a known malicious destination.
– Endpoint malware: malware discovered on an endpoint device.
** IARPA CAUSE
Dataset – Darkweb forums & CVE
❑ Use the SDK provided by a commercial entity that provides structured forum data in a given time frame.
– Attributes: content, user id, post time, forum id
❑ Filter out forums based on a threshold number of posts
– Timeframe of January 2016 to September 2017
– 53 forums considered in our study
❑ Common Vulnerabilities and Exposures (CVE): maintained on a platform operated by the MITRE corporation
❑ Vulnerabilities mentioned in the forums in the period from January 2016 to October 2017.
❑ Total number of unique CVE mentions across all forums: 3553
Prediction task
Machine learning models that take these time series as input and output incident chances
Use the time series of features to predict the time series of the attacks
2 hyperparameters: η and δ
[Figure: timeline over t showing the feature series, the window η and the daily attacks series]
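A minimal sketch of how the two hyperparameters could frame the task. The slide does not define them, so this assumes that η is the length of the feature window and δ is the gap in days between the end of that window and the predicted day. The function name `windowed_examples` and the toy series are hypothetical.

```python
from typing import List, Tuple


def windowed_examples(
    features: List[float],
    attacks: List[int],
    eta: int,
    delta: int,
) -> List[Tuple[List[float], int]]:
    """Pair each day t with the eta feature values that end delta days before t.

    Assumption (not from the slides): the window for day t covers
    days [t - delta - eta + 1, t - delta].
    """
    if len(features) != len(attacks):
        raise ValueError("feature and attack series must have equal length")
    examples = []
    for t in range(len(attacks)):
        end = t - delta        # last feature day used for day t
        start = end - eta + 1  # first feature day used for day t
        if start < 0:
            continue           # not enough history for this day yet
        examples.append((features[start:end + 1], attacks[t]))
    return examples


# Toy data: one feature value per day and a 0/1 attack flag per day.
daily_feature = [float(d) for d in range(20)]
daily_attack = [1 if d % 5 == 0 else 0 for d in range(20)]
examples = windowed_examples(daily_feature, daily_attack, eta=7, delta=8)
```

Each pair in `examples` is one training row: a window of past feature values and the attack label of the day being predicted.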
Networks for time series features
Goal: need to compute features from networks analyzing interactions in each forum on a daily basis
[Figure: timeline t1, t2, ..., tn with a historical window [t1 - ∆, t1 - ∆ + T] of length T starting ∆ before t1; experts highlighted in the network]
Daily Directed Networks
• Nodes: Users
• Edges: Thread replies
Assess streaming interaction patterns between users as signals
• Filter out edges
✓ Filter out nodes
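The daily directed networks above can be sketched as follows, using only the standard library. The record layout `(day, replying_user, replied_to_user)`, the edge direction (replier to replied-to user) and the choice to drop self-replies are assumptions for illustration, not details given on the slide.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Set, Tuple

# Hypothetical reply record: (day, replying_user, replied_to_user).
Reply = Tuple[date, str, str]


def daily_reply_networks(replies: List[Reply]) -> Dict[date, Dict[str, Set[str]]]:
    """Group replies by day into directed adjacency sets: replier -> replied-to users."""
    networks: Dict[date, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for day, src, dst in replies:
        if src != dst:  # assumption: self-replies carry no interaction signal
            networks[day][src].add(dst)
    # Convert nested defaultdicts to plain dicts for predictable lookups.
    return {day: dict(adj) for day, adj in networks.items()}


# Toy thread replies spread over two days.
replies = [
    (date(2017, 5, 1), "u2", "u1"),
    (date(2017, 5, 1), "u3", "u1"),
    (date(2017, 5, 1), "u3", "u3"),
    (date(2017, 5, 2), "u1", "u3"),
]
networks = daily_reply_networks(replies)
```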
2 steps
1. Extract experts from this historical network
2. Curate features analyzing interactions between experts and daily users
Assess interaction patterns between experts and daily users as signals
Expert users
Extract the set of users with credible knowledge about CVEs from the historical reply networks induced in T:
1. Mentioned a CVE at least once in their posts during the time period T
2. At least one CVE mentioned should be in the top software groups during T
3. The in-degree of the user should cross a threshold

1. An expert mentions important CVEs in their posts, and those posts also receive a higher number of replies
2. Attention broadcast from these experts is leveraged as an indicator of a correlating cyber attack
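The three expert criteria can be sketched as a simple filter. The input dictionaries, the placeholder CVE identifiers and the strict "greater than" reading of "cross a threshold" are assumptions for illustration; the slides do not give the threshold value or the data layout.

```python
from typing import Dict, Set


def extract_experts(
    cves_by_user: Dict[str, Set[str]],
    in_degree: Dict[str, int],
    top_group_cves: Set[str],
    min_in_degree: int,
) -> Set[str]:
    """Return users meeting all three criteria over the historical period T."""
    experts = set()
    for user, cves in cves_by_user.items():
        mentions_cve = len(cves) > 0                        # criterion 1
        mentions_top_cve = len(cves & top_group_cves) > 0   # criterion 2
        # Criterion 3, assuming "cross" means strictly greater than.
        enough_replies = in_degree.get(user, 0) > min_in_degree
        if mentions_cve and mentions_top_cve and enough_replies:
            experts.add(user)
    return experts


# Toy inputs with placeholder CVE identifiers (not real CVE numbers).
cves_by_user = {
    "alice": {"CVE-X-1", "CVE-X-2"},
    "bob": {"CVE-X-3"},
    "carol": set(),
    "dave": {"CVE-X-1"},
}
in_degree = {"alice": 12, "bob": 30, "carol": 50, "dave": 2}
experts = extract_experts(cves_by_user, in_degree, {"CVE-X-1"}, min_in_degree=5)
```

Here only `alice` qualifies: `bob` mentions no CVE from the top groups, `carol` mentions none at all, and `dave` receives too few replies.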
Hypothesis of experts
❑ Time periods of 3 widely known security breaches: the WannaCry ransomware attack, the Petya cyber attack and the Equifax breach
❑ Statistical testing
– Null hypothesis: experts interact less than regular posters with CVE mentions prior to important security breaches
– Alternate hypothesis: experts interact more than regular posters with CVE mentions
– Conduct a 2-sample t-test by randomly picking 4 weeks from the dataset
The null hypothesis is rejected at α = 0.01 with a p-value of 0.7e-4.
The null hypothesis is not rejected at α = 0.01 with a p-value of 0.05.
Suggests that during important security breaches
1. experts tend to interact more prior to the breach than other users who randomly post CVEs
2. interactions with these experts are more correlated with an important cyber security event.
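A sketch of the test statistic behind a 2-sample t-test. The slides do not say which variant was used, so Welch's unequal-variance form is swapped in here as an assumption, and the interaction counts are made-up toy numbers. Turning the statistic into a p-value needs the t-distribution CDF from a statistics library, which is not shown.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and approximate degrees of freedom for mean(a) - mean(b)."""
    na, nb = len(a), len(b)
    va = variance(a) / na  # squared standard error of mean(a)
    vb = variance(b) / nb  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Toy weekly interaction counts (hypothetical, not from the dataset).
expert_interactions = [12.0, 15.0, 14.0, 13.0]
regular_interactions = [8.0, 9.0, 7.0, 10.0]
t_stat, dof = welch_t(expert_interactions, regular_interactions)
```

A large positive `t_stat` would count against the null hypothesis that experts interact less than regular posters.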
Measuring impact of interactions - evolving networks
❑ Try to analyze the path structure between experts and regular posters
❑ See if posts that get attention from experts can be used as markers for attack prediction
[Figure: historical networks]
Features for the prediction task
Feature Group | Features | Brief description
Expert Centric | Graph Conductance, Shortest Path, Expert Replies, Common Communities | Path structure between the experts and the daily users, assessing information flow, shortest paths and the community structures
Forum/User metadata | # threads, # users, # expert threads, # CVE mentions | Metadata about the forums and the experts
Path structure measures
❑ Shortest Paths: paths between experts and non-experts
❑ Graph Conductance: probability that random walks originating from experts reach the rest of the interaction graph
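A sketch of the two path structure measures on a toy directed graph. The conductance shown is a simplified one-step variant (the share of edges leaving expert nodes that land outside the expert set); the standard definition normalizes by the smaller volume of the two sides, and the slides do not say which form was used. All names and the toy graph are hypothetical.

```python
from collections import deque
from typing import Dict, Optional, Set

Graph = Dict[str, Set[str]]  # directed adjacency: node -> successors


def shortest_path_length(graph: Graph, sources: Set[str], target: str) -> Optional[int]:
    """Fewest hops from any source node to target, via breadth-first search."""
    seen = set(sources)
    queue = deque((s, 0) for s in sources)
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in graph.get(node, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # target is unreachable from every source


def conductance(graph: Graph, group: Set[str]) -> float:
    """Fraction of edges leaving the group's nodes that end outside the group."""
    out_edges = sum(len(graph.get(u, set())) for u in group)
    crossing = sum(1 for u in group for v in graph.get(u, set()) if v not in group)
    return crossing / out_edges if out_edges else 0.0


# Toy interaction graph: e1 and e2 are experts, u1 and u2 are daily users.
graph: Graph = {"e1": {"e2", "u1"}, "e2": {"u1"}, "u1": {"u2"}, "u2": set()}
expert_set = {"e1", "e2"}
hops_to_u2 = shortest_path_length(graph, expert_set, "u2")
expert_conductance = conductance(graph, expert_set)
```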
Hyperparameter settings
❑ We consider a time window η spanning 1 week while keeping δ = 8 days.
❑ Class imbalance issue
– For malicious-email, attacks were reported on 97 of the 335 days considered in the dataset, a positive class ratio of around 29%.
– For endpoint-malware, there are 31 attack days out of the 306 days considered in the training dataset, a positive class ratio of around 10%.
❑ Training – test split
– 70%-30%, averaged to the nearest month, separately for each event type
– Malicious-email: October 2016 to May 2017 (8 months) as training data, June 2017 to August 2017 (3 months) as test data
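As a worked check, the class ratios follow directly from the day counts. The 8/11 and 3/11 fractions are computed here from the month counts on the slide and are not stated in the source.

```latex
\[
\text{positive class ratio} = \frac{\#\,\text{attack days}}{\#\,\text{days considered}},
\qquad
\frac{97}{335} \approx 0.29 \ (\text{malicious-email}),
\qquad
\frac{31}{306} \approx 0.10 \ (\text{endpoint-malware})
\]
\[
\text{train share} = \frac{8}{8 + 3} \approx 0.73,
\qquad
\text{test share} = \frac{3}{8 + 3} \approx 0.27
\]
```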
Results – Longitudinal Logistic regression
❑ Graph conductance (GC): precision 0.44, recall 0.65 and F1 score 0.53
❑ # threads: precision 0.43, recall 0.59 and F1 score 0.5
❑ Random guess over the days: F1 score 0.34
[Figure: malicious email results, comparing GC against random and # threads against random]
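For reference, a minimal sketch of how precision, recall and F1 are computed from daily 0/1 labels. The label vectors are toy data, not the study's predictions.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 for the positive (attack) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy daily labels: 1 = attack day, 0 = no attack.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Plugging the rounded GC values into the same formula, 2 × 0.44 × 0.65 / (0.44 + 0.65), gives about 0.52, close to the reported 0.53; the gap is plausibly due to rounding of the reported precision and recall.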
Prediction in high activity weeks
Malicious email
❑ Communities feature: precision 0.7, recall 0.63 and F1 score 0.67
❑ # times CVE mentioned: precision 0.51, recall 0.82 and F1 score 0.62
❑ Random (no priors): F1 score 0.48
❑ Unlike the results over all the days, for these specific weeks the model achieves high precision while maintaining comparable recall
Results – Feature Combination
Malicious email
❑ Training the model with all features over windowed time points would require regularization (p ≈ n)
❑ Use L1, L2 and Group Lasso regularization
❑ F1 score obtained at η = 7 days and δ = 8 days: 0.56, against 0.53 for the single GC feature.
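To illustrate why regularization enters when p ≈ n, here is a minimal sketch of logistic regression with an L2 penalty, trained by batch gradient descent in plain Python. Only the L2 case is shown; L1 and Group Lasso need different update rules. The penalty strength, learning rate, epoch count and toy data are assumptions, not the settings used in the study.

```python
from math import exp
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def fit_logistic_l2(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lam: float = 0.1,
    lr: float = 0.1,
    epochs: int = 2000,
) -> List[float]:
    """Minimize mean log-loss + (lam / 2) * ||w||^2; the last entry is an unpenalized bias."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            # zip stops after d terms, so the bias w[d] is added separately.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[d]) - yi
            for j in range(d):
                grad[j] += err * xi[j] / n
            grad[d] += err / n
        for j in range(d):
            grad[j] += lam * w[j]  # L2 penalty shrinks feature weights
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> int:
    """Predict 1 when the estimated attack probability is at least 0.5."""
    return int(sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1]) >= 0.5)


# Toy data: one centered feature, attacks on days with positive values.
X = [[-2.5], [-1.5], [-0.5], [0.5], [1.5], [2.5]]
y = [0, 0, 0, 1, 1, 1]
weights = fit_logistic_l2(X, y)
```

Raising `lam` pulls the feature weights toward zero, which is the lever that keeps the combined model from overfitting when the number of features approaches the number of training days.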
Results – Comparison with network centralities
Malicious email
❑ We obtain an F1 score of 0.41 from the outdegree and betweenness metrics
❑ Additionally, filtering users creating posts with CVE mentions does not help much.
❑ Computing node significance measures alone does not help much in prediction