privacy-preserving collaborative network anomaly detection haakon ringberg
![Page 1: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/1.jpg)
PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION
Haakon Ringberg
![Page 2: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/2.jpg)
Unwanted network traffic
Haakon Ringberg
2
Problem
• Attacks on resources (e.g., DDoS, malware)
• Lost productivity (e.g., instant messaging)
• Costs USD billions every year
Goal: detect & diagnose unwanted traffic
• Scale to large networks by analyzing summarized data
• Greater accuracy via collaboration
• Protect privacy using cryptography
![Page 3: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/3.jpg)
Network
Challenges with detection
Data volume
• Some commonly used algorithms analyze IP packet payload info
• Infeasible at edge of large networks
3
Haakon Ringberg
![Page 4: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/4.jpg)
Challenges with detection
Data volume
Attacks deliberately mimic normal traffic
• e.g., SQL injection, application-level DoS¹
4
Haakon Ringberg
Network
I’m not sure about Beasty
Let me in!
¹[Srivatsa TWEB ’08], ²[Jung WWW ’02]
AnomalyDetector
![Page 5: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/5.jpg)
Challenges with detection
Data volume
Attacks deliberately mimic normal traffic
• e.g., SQL injection, application-level DoS¹
Is it a DDoS attack or a flash crowd?²
• A single network in isolation may not be able to distinguish
¹[Srivatsa TWEB ’08], ²[Jung WWW ’02]
Network
![Page 6: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/6.jpg)
CNN.com
FOX.com
Collaborative anomaly detection
“Bad guys tend to be around when bad stuff happens”
6
Haakon Ringberg
I’m just not sure about Beasty :-/
![Page 7: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/7.jpg)
Collaborative anomaly detection
“Bad guys tend to be around when bad stuff happens”
• Targets (victims) could correlate attacks/attackers¹
“Fool us once, shame on you. Fool us, we can’t get fooled again!”²
¹[Katti IMC ’05], [Allman Hotnets ’06], [Kannan SRUTI ’06], [Moore INFOC ’03]
²George W. Bush
CNN.com
FOX.com
![Page 8: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/8.jpg)
Corporations demand privacy
Corporations are reluctant to share sensitive data
• Legal constraints
• Competitive reasons
8
Haakon Ringberg
I don’t want FOX to know my customers
CNN.com
FOX.com
![Page 9: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/9.jpg)
Common practice
Haakon Ringberg
9
AT&T Sprint
Every network for themselves!
![Page 10: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/10.jpg)
• Snort-like system • Greater scalability • Provide as a service
System architecture
Haakon Ringberg
10
AT&T
Sprint
• Collaboration infrastructure • For greater accuracy • Protects privacy
N.B. collaboration could also be performed between stub networks
![Page 11: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/11.jpg)
Dissertation Overview
Haakon Ringberg
11
| | Providing | Technologies | Venue |
|---|---|---|---|
| Collaboration Infrastructure | Privacy of participants and suspects | Cryptography | Submitted ACM CCS ’09 |
| Detection at a single network | Scalable Snort-like IDS system | Machine Learning | Presented IEEE Infocom ’09 |
| Collaboration Effectiveness | Quantifying benefits of coll. | Analysis of Measurements | To be submitted |
![Page 12: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/12.jpg)
Chapter I: scalable signature-based detection at individual networks
Work with AT&T Labs: • Nick Duffield • Patrick Haffner • Balachander Krishnamurthy
12
![Page 13: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/13.jpg)
Intrusion Detection Systems (IDSes)
• Protect the edge of a network
• Leverage known signatures of traffic
  • e.g., Slammer worm packets contain “MS-SQL” (say) in payload
  • or AOL IM packets use specific TCP ports and application headers
13
IP header
TCP header
App header
Payload
Background: packet & rule IDSes
Enterprise
![Page 14: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/14.jpg)
A predicate is a boolean function on a packet feature
• e.g., TCP port = 80
A signature (or rule) is a set of predicates
Benefits
• Leverage existing community: many rules already exist (CERT, SANS Institute, etc.)
• Classification “for free”
• Accurate (?)
14
Background: packet and rule IDSes
![Page 15: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/15.jpg)
Background: packet and rule IDSes
Too many packets per second
Packet inspection at the edge requires deployment at many interfaces
Drawbacks
15
A predicate is a boolean function on a packet feature e.g., TCP port = 80
A signature (or rule) is a set of predicates
Network
![Page 16: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/16.jpg)
Too many packets per second
Packet inspection at the edge requires deployment at many interfaces
DPI (deep-packet inspection) predicates can be computationally expensive
Drawbacks
16
Packet has:
• Port number X, Y, or Z
• Contains pattern “foo” within the first 20 bytes
• Contains pattern “bar” within the last 40 bytes
A predicate is a boolean function on a packet feature e.g., TCP port = 80
A signature (or rule) is a set of predicates
Background: packet and rule IDSes
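As a rough illustration of the definitions above, a signature can be modeled as a set of boolean predicates over packet fields. The sketch below is Python; the `Packet` type, the concrete port numbers standing in for X, Y, and Z, and the choice that a rule fires only when every predicate holds are assumptions made for illustration, not Snort's actual rule engine.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Packet:
    dst_port: int
    payload: bytes


# A predicate is a boolean function on a packet feature.
Predicate = Callable[[Packet], bool]

# Hypothetical stand-ins for "port number X, Y, or Z".
PORTS = {1434, 8080, 8443}

# A signature (rule) is a set of predicates; two of these need payload access.
signature: List[Predicate] = [
    lambda p: p.dst_port in PORTS,
    lambda p: b"foo" in p.payload[:20],    # pattern within the first 20 bytes
    lambda p: b"bar" in p.payload[-40:],   # pattern within the last 40 bytes
]


def matches(packet: Packet, rule: List[Predicate]) -> bool:
    # Assumption: the rule fires only when every predicate holds.
    return all(pred(packet) for pred in rule)
```

The two payload predicates are the ones that need deep-packet inspection, which is why they are the costly part of such a rule.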
![Page 17: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/17.jpg)
| src IP | dst IP | src Port | dst Port | Duration | # Packets |
|---|---|---|---|---|---|
| A | B | | | 5 min | 36 |
| … | … | … | … | … | … |
Our idea: IDS on IP flows
How well can signature-based IDSes be mimicked on IP flows?
• Efficient
  • Only fixed-offset predicates
  • Flows are more compact
• Flow collection infrastructure is ubiquitous
• IP flows capture the concept of a connection
![Page 18: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/18.jpg)
Idea
1. IDSes associate a “label” with every packet
2. An IP flow is associated with a set of packets
3. Our system associates the labels with flows
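The three steps above can be sketched as a small aggregation: group labeled packets by flow and carry the labels over. This is Python; the 5-tuple flow key, the field names, and the rule that a flow inherits the union of its packets' labels are assumptions for illustration rather than the system's documented behavior.

```python
from collections import defaultdict


def label_flows(packets):
    """Aggregate labeled packets into labeled flows.

    packets: iterable of (flow_key, timestamp, labels) tuples, where
    flow_key is a hypothetical (src_ip, dst_ip, src_port, dst_port, proto)
    tuple and labels is the set of IDS alarms raised by that packet.
    """
    flows = defaultdict(
        lambda: {"packets": 0, "first": None, "last": None, "labels": set()}
    )
    for key, ts, labels in packets:
        flow = flows[key]
        flow["packets"] += 1
        flow["first"] = ts if flow["first"] is None else min(flow["first"], ts)
        flow["last"] = ts if flow["last"] is None else max(flow["last"], ts)
        # Assumption: a flow carries every label raised by any of its packets.
        flow["labels"] |= set(labels)
    return dict(flows)
```

The resulting per-flow records (packet count, duration, labels) are the kind of training examples a flow-level classifier could learn from.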
![Page 19: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/19.jpg)
Snort rule taxonomy

| | Header-only | Meta-Information | Payload dependent |
|---|---|---|---|
| Description | Inspect only IP flow header | Inexact correspondence | Inspect packet payload |
| Example | e.g., port numbers | e.g., TCP flags | e.g., “contains abc” |

Inexact correspondence: relies on features that cannot be exactly reproduced in IP flow realm
![Page 20: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/20.jpg)
Simple translation
3. Our system associates the labels with flows
Simple rule translation would capture only flow predicates
• Low accuracy or low applicability
Slammer Worm
Snort rule: • dst port = MS SQL • contains “Slammer”
Only flow predicates: • dst port = MS SQL
![Page 21: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/21.jpg)
Machine Learning (ML)
3. Our system associates the labels with flows
Leverage ML to learn a mapping from “IP flow space” to label
• e.g., IP flow space = src port × # packets × flow duration
• The label takes one value if the alarm is raised and another otherwise
[Figure: flows plotted by src port and # packets]
![Page 22: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/22.jpg)
Boosting
Boosting combines a set of weak learners to create a strong learner
[Figure: weak learners h1, h2, h3 are combined, via a sign function, into H_final]
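To make the idea of combining weak learners concrete, here is a minimal sketch of AdaBoost over single-feature threshold rules ("decision stumps"). It is Python and only illustrative: the slides do not say which boosting variant or weak learners were used, and the toy flow features (dst port, packet size) and labels (+1 for an alarm, -1 otherwise) are assumptions.

```python
import math


def train_stump(X, y, w):
    """Return (error, feature, threshold, sign) of the best weighted stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def adaboost(X, y, rounds=3):
    """Train an ensemble of (alpha, stump) pairs on labels in {+1, -1}."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Up-weight misclassified examples, down-weight correct ones.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, yi, row in zip(w, y, X)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row):
    """H_final: sign of the weighted vote of the weak learners."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy flows: (dst port, packet size); alarm only for port 1434 with size 404.
X = [(1434, 404), (1434, 404), (1434, 60), (80, 404), (80, 1500), (443, 60)]
y = [1, 1, -1, -1, -1, -1]
model = adaboost(X, y, rounds=3)
```

No single stump separates this toy data, but the weighted vote of three of them does, which is the point of the slide.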
![Page 23: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/23.jpg)
Benefit of Machine Learning (ML)
ML algorithms discover new predicates to capture rule
• Latent correlations between predicates
• Capturing same subspace using different dimensions
Slammer Worm
Snort rule: • dst port = MS SQL • contains “Slammer”
Only flow predicates: • dst port = MS SQL
ML-generated rule: • dst port = MS SQL • packet size = 404 • flow duration
![Page 24: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/24.jpg)
Evaluation
• Border router on OC-3 link
• Used Snort rules in place
• Unsampled NetFlow v5 and packet traces
• Statistics
  • One month, 2 MB/s average, 1 billion flows
  • 400k Snort alarms
![Page 25: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/25.jpg)
Accuracy metrics
Receiver Operating Characteristic (ROC)
• Full FP vs TP tradeoff
• But need a single number
  • Area Under Curve (AUC)
  • Average Precision (AP)
An AP of $p$ corresponds to $\frac{1-p}{p}$ FP per TP
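The conversion from an average precision to false positives per true positive is simple arithmetic. The short Python sketch below treats the AP value p as a precision, so that FP/TP = (1 - p)/p; the function name is hypothetical.

```python
def fp_per_tp(ap: float) -> float:
    """False positives per true positive implied by a precision-like value."""
    if not 0 < ap <= 1:
        raise ValueError("ap must be in (0, 1]")
    return (1 - ap) / ap


# 0.95 and 0.70 are values that appear in the classifier accuracy results.
for ap in (0.95, 0.70):
    print(ap, round(100 * fp_per_tp(ap)), "FP per 100 TP")
```

For 0.95 this gives about 5 FP per 100 TP and for 0.70 about 43 FP per 100 TP, matching the callouts on the classifier accuracy slide.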
![Page 26: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/26.jpg)
Classifier accuracy
• Training on week 1, testing on week n
• Minimal drift within a month
• High degree of accuracy for header and meta

| Rule class | Week1-2 | Week1-3 | Week1-4 |
|---|---|---|---|
| Header rules | 1.00 | 0.99 | 0.99 |
| Meta-information | 1.00 | 1.00 | 0.95 |
| Payload | 0.70 | 0.71 | 0.70 |

Callouts: 5 FP per 100 TP (0.95); 43 FP per 100 TP (0.70)
![Page 27: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/27.jpg)
Variance within payload group
Accuracy is a function of correlation between flow and packet-level features

| Rule | Average Precision |
|---|---|
| MS-SQL version overflow | 1.00 |
| ICMP PING speedera | 0.82 |
| NON-RFC HTTP DELIM | 0.48 |
![Page 28: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/28.jpg)
Computational efficiency
Our prototype can support OC48 (2.5 Gbps) speeds:
1. Machine learning (boosting)
  • 33 hours per rule for one week of OC48
2. Classification of flows
  • 57k flows/sec on a 1.5 GHz Itanium 2
  • Line rate classification for OC48
Chapter II: Evaluating the effectiveness of collaborative anomaly detection
Work with: • Matthew Caesar • Jennifer Rexford • Augustin Soule
29
![Page 30: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/30.jpg)
Methodology
1. Identify attacks in IP flow traces
2. Extract attackers
3. Correlate attackers across victims
30
![Page 31: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/31.jpg)
Identifying anomalous events
Use existing anomaly detectors¹
• IP scans, port scans, DoS
• e.g., IP scan is more than n IP addresses contacted
Minimize false positives
• Correlate with DNS BL
• IP addresses exhibiting open proxy or spambot behavior
¹[Allan IMC ’07], [Kompella IMC ’04]
31
![Page 32: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/32.jpg)
Cooperative blocking
• A set ‘S’ of victims agree to participate
• Beasty is blocked following initial attack
• Subsequent attacks by Beasty on members of ‘S’ are deemed ineffective
CNN
FOX
Beasty is very bad!
32
![Page 33: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/33.jpg)
DHCP lease issues
Dynamic address allocation
• IP address first owned by Beasty
• Then owned by innocent Tweety
Should not block Tweety’s innocuous queries
10.0.0.1CNN
?
33
![Page 34: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/34.jpg)
DHCP lease issues
Dynamic address allocation
• IP address first owned by Beasty
• Then owned by innocent Tweety
Should not block Tweety’s innocuous queries
• Update DNS BL hourly
• Block IP addresses for a period shorter than most DHCP leases¹
¹[Xie SIGC ’07]
34
![Page 35: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/35.jpg)
Methodology
• IP flow traces from Géant
• DNS BL to limit FP
• Cooperative blocking of attackers for Δ hours
• Metric is fraction of potentially mitigated flows
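The blocking metric can be sketched as a replay over a time-ordered attack log. The Python below is a simplified illustration, not the study's measurement code: the event format, the choice to count only flows aimed at participating victims, and the rule that a block starts at an unblocked attack and lasts exactly Δ hours are all assumptions.

```python
def mitigated_fraction(events, participants, delta_hours):
    """Fraction of attack flows on participants that a shared block would stop.

    events: iterable of (time_in_hours, attacker, victim) tuples.
    participants: the set S of victims that share their blacklist.
    """
    blocked_until = {}
    mitigated = total = 0
    for t, attacker, victim in sorted(events):
        if victim not in participants:
            continue
        total += 1
        if blocked_until.get(attacker, float("-inf")) > t:
            # Attacker is already blacklisted by some member of S.
            mitigated += 1
        else:
            # First (or post-expiry) attack: block for the next delta hours.
            blocked_until[attacker] = t + delta_hours
    return mitigated / total if total else 0.0


events = [
    (0.0, "beasty", "cnn"),
    (0.2, "tweety", "fox"),
    (0.5, "beasty", "fox"),
    (3.0, "beasty", "cnn"),
]
print(mitigated_fraction(events, {"cnn", "fox"}, delta_hours=1.0))  # 0.25
```

With both victims collaborating and Δ = 1 hour, only Beasty's second attack (at 0.5 h) is stopped; the attack at 3.0 h arrives after the block expired.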
35
![Page 36: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/36.jpg)
Blacklist duration parameter Δ
• Collaboration between all hosts
• Majority of benefit can be had with small Δ
36
![Page 37: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/37.jpg)
Number of participating victims
• Randomly selecting n victims to collaborate in scheme
• Reported number is the average of 10 random selections
37
![Page 38: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/38.jpg)
Number of participating victims
• Collaboration between most victimized hosts
• Attackers are more likely to continue to engage in bad action “x” than a random other action
38
![Page 39: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/39.jpg)
Chapter conclusion
• Repeat-attacks often occur within one hour
  • Substantially less than average DHCP lease
• Collaboration can be effective
  • Attackers contact a large number of victims
  • 10k random hosts could mitigate 50%
• Some hosts are much more likely victims
  • Subsets of victims can see great improvement
39
![Page 40: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/40.jpg)
Chapter III: Privacy-preserving collaborative anomaly detection
Work with: • Benny Applebaum • Matthew Caesar • Michael J Freedman • Jennifer Rexford
40
![Page 41: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/41.jpg)
Privacy-Preserving Collaboration
[Figure: CNN, FOX, and Google each send encrypted data E( ) to Secure Correlation]
Protect privacy of
• Participants: do not reveal who suspected whom
• Suspects: only reveal suspects upon correlation
![Page 42: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/42.jpg)
System sketch
Trusted third party is a point of failure
• Single rogue employee
• Inadvertent data leakage
• Risk of subpoena
[Figure: CNN, FOX, Google, and MSFT connected to a central Secure Correlation service]
![Page 43: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/43.jpg)
System sketch
Trusted third party is a point of failure
• Single rogue employee
• Inadvertent data leakage
• Risk of subpoena
Fully distributed impractical
• Poor scalability
• Liveness issues
[Figure: CNN, FOX, Google, and MSFT connected directly to one another]
![Page 44: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/44.jpg)
Split trust
• Managed by separate organizational entities
• Honest but curious proxy, DB, participants (clients)
• Secure as long as proxy and DB do not collude
[Figure: CNN and FOX connect to Proxy, which connects to DB]
Recall: • Participant privacy • Suspect privacy
![Page 45: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/45.jpg)
Protocol outline
1. Clients send suspect IP addrs (x)
  • e.g., x = 127.0.0.1
2. DB releases IPs above threshold
[Figure: Client / Participant → Proxy → DB, carrying x; DB table maps x to count #: 1 → 23, 3 → 2]
But this violates suspect privacy!
Recall: • Participant privacy • Suspect privacy
![Page 46: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/46.jpg)
Protocol outline
1. Clients send suspect IP addrs (x)
2. DB releases IPs above threshold
[Figure: Client / Participant → Proxy → DB, carrying the hash of the IP address H(x); DB table maps H(x) to count #: 1 → 23, 3 → 2]
Still violates suspect privacy!
Recall: • Participant privacy • Suspect privacy
![Page 47: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/47.jpg)
Protocol outline
1. Clients send suspect IP addrs (x)
2. IP addrs blinded w/ Fs(x)
  • Keyed hash function (PRF)
  • Key s held only by proxy
3. DB releases IPs above threshold
[Figure: Client / Participant → Proxy → DB, carrying the keyed hash of the IP address Fs(x); DB table maps Fs(x) to count #: Fs(1) → 23, Fs(3) → 2]
Still violates suspect privacy!
Recall: • Participant privacy • Suspect privacy
![Page 48: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/48.jpg)
Protocol outline
1. Clients send suspect IP addrs (x)
2. IP addrs blinded w/ EDB(Fs(x))
  • Keyed hash function (PRF)
  • Key s held only by proxy
3. DB releases IPs above threshold
[Figure: Client / Participant → Proxy → DB, carrying the encrypted keyed hash of the IP address EDB(Fs(x)); DB table maps Fs(x) to count #: Fs(1) → 23, Fs(3) → 2]
But how do clients learn EDB(Fs(x))?
Recall: • Participant privacy • Suspect privacy
![Page 49: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/49.jpg)
Protocol outline
1. Clients send suspect IP addrs (x)
2. IP addrs blinded w/ EDB(Fs(x))
  • Keyed hash function (PRF)
  • Key s held only by proxy
3. EDB(Fs(x)) learned through secure function evaluation
4. DB releases IPs above threshold
[Figure: Client / Participant holds x, Proxy holds s; client obtains EDB(Fs(x)), DB sees Fs(x); DB table maps Fs(x) to count #: Fs(1) → 23, Fs(3) → 2]
Possible to reveal IP addresses at the end
Recall: • Participant privacy • Suspect privacy
![Page 50: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/50.jpg)
Protocol summary
• Clients send suspect IPs
  • Learn Fs(x) using secure function evaluation
• Proxy forwards to DB
  • Randomly shuffles suspects
  • Re-randomizes encryptions
• DB correlates using Fs(x)
• DB forwards bad IPs to proxy
[Figure: Client sends EDB(Fs(3)); DB table maps Fs(x) to count #: Fs(3) → 12; DB returns Fs(3); proxy recovers Ds(Fs(3)) = 3]
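The correlation step can be illustrated with a much simplified Python sketch. It is not the protocol itself: HMAC-SHA256 stands in for the keyed hash Fs, the client is handed the blinded value directly (in the protocol it is obtained through secure function evaluation so that the client never sees s and the proxy never sees x), the extra encryption under the DB's key and the proxy's shuffling are omitted, and "above threshold" is interpreted as "reported by at least `threshold` clients". All function names are hypothetical.

```python
import hashlib
import hmac
from collections import Counter


def blind(proxy_key: bytes, ip: str) -> str:
    """Stand-in for Fs(x): a keyed hash that the DB cannot invert or recompute."""
    return hmac.new(proxy_key, ip.encode(), hashlib.sha256).hexdigest()


def db_correlate(reports, threshold):
    """reports: {client_name: set of blinded suspects}.

    Returns the blinded values reported by at least `threshold` clients.
    The DB only ever sees blinded values, never raw IP addresses.
    """
    counts = Counter(b for suspects in reports.values() for b in set(suspects))
    return {b for b, n in counts.items() if n >= threshold}


s = b"key held only by the proxy"
reports = {
    "cnn": {blind(s, "10.0.0.3"), blind(s, "10.0.0.9")},
    "fox": {blind(s, "10.0.0.3")},
}
released = db_correlate(reports, threshold=2)
```

Because the blinding is deterministic under one key, equal suspects collide at the DB and can be counted, while a party without the key cannot test candidate IP addresses against the stored values.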
![Page 51: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/51.jpg)
Architecture
• Proxy split into client-facing and decryption oracles
• Proxies and DB are fully parallelizable
[Figure components: Clients, Client-Facing Proxies, Proxy Decryption Oracles, Front-End DB Tier, Back-End DB Storage]
51
![Page 52: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/52.jpg)
Evaluation
• All components implemented
  • ~5000 lines of C++
  • Utilizing GnuPG, BSD TCP sockets, and Pthreads
• Evaluated on custom test bed
  • ~2 GHz (single, dual, quad-core) Linux machines

| Algorithm | Parameter | Value |
|---|---|---|
| RSA / ElGamal | key size | 1024 bits |
| Oblivious Transfer | k | 80 |
| AES | key size | 256 bits |
![Page 53: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/53.jpg)
Scalability w.r.t. # IPs
Single CPU core for DB and proxy each
![Page 54: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/54.jpg)
Scalability w.r.t. # clients
Four CPU cores for DB and proxy each
![Page 55: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/55.jpg)
Scalability w.r.t. # CPU cores
55
n CPU cores for DB and proxy each
![Page 56: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/56.jpg)
Summary
• Collaboration protocol protects privacy of
  • Participants: do not reveal who suspected whom
  • Suspects: only reveal suspects upon agreement
• Novel composition of crypto primitives
  • One-way function hides IPs from DB; public key encryption allows subsequent revelation; secure function evaluation
• Efficient implementation of architecture
  • Millions of IPs in hours
  • Scales linearly with computing resources
56
![Page 57: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/57.jpg)
Conclusion
1. Speed
  • ML-based architecture supports accurate and scalable Snort-like classification on IP flows
2. Accuracy
  • Collaborating against mutual adversaries
3. Privacy
  • Novel cryptographic protocol supports efficient collaboration in privacy-preserving manner
![Page 58: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/58.jpg)
Future Work Highlights
1. ML-based Snort-like architecture
  • Cross-site: train on site A and test on site B
  • Performance on sampled flow records
2. Measurement study
  • Biased correlation results due to biased DNSBL (ongoing)
  • Rate at which information must be exchanged
  • Who should cooperate: end-points or ISPs?
3. Privacy-preserving collaboration
  • Other applications, e.g., Viacom-vs-YouTube concerns
58
![Page 59: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/59.jpg)
THANK YOU!
Collaborators: Jennifer Rexford, Benny Applebaum, Matthew Caesar, Nick Duffield, Michael J Freedman, Patrick Haffner, Balachander Krishnamurthy, and Augustin Soule
![Page 60: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/60.jpg)
Difference in rule accuracy
Accuracy is a function of correlation between flow and packet-level features

| Rule | Overall Accuracy | w/o dst port | w/o mean packet size |
|---|---|---|---|
| MS-SQL version overflow | 1.00 | 0.99 | 0.83 |
| ICMP PING speedera | 0.82 | 0.79 | 0.06 |
| NON-RFC HTTP DELIM | 0.48 | 0.02 | 0.22 |
![Page 61: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/61.jpg)
Choosing an operating point
• X = alarms we want raised
• Z = alarms that are raised
• Y = overlap of X and Z
$\text{Precision} = \frac{Y}{Z}$ (exactness)
$\text{Recall} = \frac{Y}{X}$ (completeness)
AP is a single number, but not most intuitive
• Precision & recall are useful for operators
• “I need to detect 99% of these alarms!”
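A small Python sketch of these two ratios, treating X and Z as sets of alarm identifiers. The function name and the convention of returning 0.0 for an empty denominator are assumptions.

```python
def precision_recall(wanted: set, raised: set):
    """wanted = X (alarms we want raised), raised = Z (alarms that are raised)."""
    overlap = wanted & raised  # Y: alarms that were both wanted and raised
    precision = len(overlap) / len(raised) if raised else 0.0
    recall = len(overlap) / len(wanted) if wanted else 0.0
    return precision, recall


# Four wanted alarms, three raised, two in common.
p, r = precision_recall({1, 2, 3, 4}, {3, 4, 5})
```

Here two of the three raised alarms were wanted (precision 2/3) and two of the four wanted alarms were raised (recall 1/2).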
![Page 62: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/62.jpg)
Choosing an operating point

| Rule | Precision w/ recall = 1.00 | Precision w/ recall = 0.99 |
|---|---|---|
| MS-SQL version overflow | 1.00 | 1.00 |
| ICMP PING speedera | 0.02 | 0.83 |
| CHAT AIM receive message | 0.02 | 0.11 |

AP is a single number, but not most intuitive
• Precision & recall are useful for operators
• “I need to detect 99% of these alarms!”
![Page 63: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/63.jpg)
Quantifying the benefit of collaboration
MSNBC FOX CNN
Effectiveness of collaboration is a function of
1. Whether different victims see the same attackers
2. Whether all victims are equally likely to be targeted
63
![Page 64: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/64.jpg)
IP address blinding
Haakon Ringberg
64
• DB requires injective and one-way function on IPs
  • Cannot use simple hash
• Fs(x) is keyed hash function (PRF) on IPs
  • Key s held only by proxy
[Figure: Client sends EDB(Fs(x))]
![Page 65: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/65.jpg)
Secure Function Evaluation
IP address blinding can be split into a per-IP-bit (xi) problem
• Client must learn EDB(Fs(xi))
• Client must not learn s
• Proxy must not learn xi
Oblivious Transfer (OT) accomplishes this¹,²
Amortized OT makes asymptotic performance equal to matrix multiplication³
[Figure: client holds x, proxy holds s; client obtains EDB(Fs(x))]
¹[Naor et al. SODA ’01], ²[Freedman et al. TCC ’05], ³[Ishai et al. CRYPTO ’03]
![Page 66: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/66.jpg)
Public key encryption
• Clients encrypt suspect IPs (x)
  • First w/ proxy’s pubkey
  • Then w/ DB’s pubkey
• Forwarded by proxy
  • Does not learn IPs
• Decrypted by DB
  • Does not learn IPs
• Does not allow for DB correlation due to padding (e.g., OAEP)
[Figure: Client sends EDB(EPX(x)); after the DB decrypts, EPX(x) remains]
![Page 67: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/67.jpg)
How client learns Fs(x)
Client must learn Fs(x)
• Client must not learn ‘s’
• Proxy must not learn ‘x’
Naor-Reingold PRF
• $s = \{ s_i \mid 1 \le i \le 32 \}$
• $F_s(x) = g^{\prod_{x_i = 1} s_i}$
Add randomness $u_i$ to obscure $s_i$ from client
• Message = $u_i \cdot s_i$
![Page 68: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/68.jpg)
How client learns Fs(x)
For each bit xi of the IP, the client learns
• ui * si, if xi is 1
• ui, if xi is 0
The user also learns ∏ ui
[Figure: x = (x0 = 0, x1 = 1, …, x31 = 1) and s = (s0, s1, …, s31); towards Fs(x) the client receives u0, u1 * s1, …, u31 * s31]
![Page 69: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/69.jpg)
How client learns Fs(x)
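User multiplies together all values
• Divides out ∏ ui
• Acquires Fs(x) w/o having learned ‘s’
$$\prod_{x_i=1} (u_i \cdot s_i) \cdot \prod_{x_i=0} u_i = \prod_i u_i \cdot \prod_{x_i=1} s_i$$
$$\frac{\prod_i u_i \cdot \prod_{x_i=1} s_i}{\prod_i u_i} = \prod_{x_i=1} s_i$$
$F_s(x) = \prod_{x_i=1} s_i$ (taken in the exponent of $g$ for the Naor-Reingold PRF)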
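The unblinding arithmetic can be checked numerically with a toy Python sketch. Everything concrete here is an assumption for illustration: the tiny group (P = 1019, Q = 509, G = 4), the function names, and the fact that one function hands the client its per-bit messages directly, whereas in the protocol they arrive through oblivious transfer so the proxy never sees the bits. The parameters are far too small to be secure.

```python
import random

# Toy group: P = 2Q + 1 with P, Q prime; G = 4 generates the order-Q subgroup.
P, Q, G = 1019, 509, 4


def prf(s, bits):
    """Naor-Reingold style PRF: g raised to the product of s_i where x_i = 1."""
    exp = 1
    for s_i, x_i in zip(s, bits):
        if x_i:
            exp = exp * s_i % Q
    return pow(G, exp, P)


def proxy_messages(s, bits, rng):
    """Per-bit values the client ends up with, plus the product of all u_i."""
    u = [rng.randrange(1, Q) for _ in s]
    msgs = [u_i * s_i % Q if x_i else u_i for u_i, s_i, x_i in zip(u, s, bits)]
    u_prod = 1
    for u_i in u:
        u_prod = u_prod * u_i % Q
    return msgs, u_prod


def client_unblind(msgs, u_prod):
    """Multiply all received values, divide out the u_i, exponentiate g."""
    total = 1
    for m in msgs:
        total = total * m % Q
    exp = total * pow(u_prod, -1, Q) % Q  # modular inverse (Python 3.8+)
    return pow(G, exp, P)


rng = random.Random(7)
s = [rng.randrange(1, Q) for _ in range(32)]
bits = [int(b) for b in format(0x7F000001, "032b")]  # 127.0.0.1
msgs, u_prod = proxy_messages(s, bits, rng)
```

The client's result equals the PRF value computed with direct knowledge of s, even though each s_i only ever reaches the client multiplied by a random u_i.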
![Page 70: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/70.jpg)
How client learns Fs(x)
User multiplies together all values
• Divides out ∏ ui
• Acquires Fs(x) w/o having learned ‘s’
But how does the client learn
• si * ui, if xi is 1
• ui, if xi is 0
without the proxy learning the IP x?
![Page 71: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/71.jpg)
Oblivious Transfer (details)
1. Client sends f(x=0) and f(x=1)
  • Proxy doesn’t learn x
2. Proxy sends
  • $v(0) = E_{g(f(0))}(1 + r)$
  • $v(1) = E_{g(f(1))}(s + r)$
3. Client decrypts v(x) with g(f(x))
  • Calculates g(f(x))
  • Cannot calculate g(f(1-x))
[Figure: client holds x and g(f(x)); proxy holds s; f(x) and g(x) are public; client sends f(0), f(1); proxy replies with v(0), v(1)]
![Page 72: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/72.jpg)
Oblivious Transfer (more details)
Haakon Ringberg
72
Preprocessing:
• Proxy chooses random c and r (at startup)
• Proxy publishes c and $g^r$
• Client chooses random k (for each bit)
Client:
1. $Key_x = g^k$ and $Key_{1-x} = c \cdot g^{-k}$; sends $Key_0$
2. $Key_x^r = (g^r)^k$, used to decrypt $y_x$
Proxy:
1. $Key_0^r = (Key_0)^r$ and $Key_1^r = c^r / Key_0^r$
2. $y_0 = AES_{Key_0^r}(u)$ and $y_1 = AES_{Key_1^r}(s \cdot u)$; sends $y_0$, $y_1$
![Page 73: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/73.jpg)
Oblivious Transfer (more details)
Haakon Ringberg
73
Client:
1. $Key_x = g^k$ and $Key_{1-x} = c \cdot g^{-k}$; sends $Key_0$
2. $Key_x^r = (g^r)^k$, used to decrypt $y_x$
Proxy:
1. $Key_0^r = (Key_0)^r$ and $Key_1^r = c^r / Key_0^r$
2. $y_0 = AES_{Key_0^r}(u)$ and $y_1 = AES_{Key_1^r}(s \cdot u)$; sends $y_0$, $y_1$
• Proxy never learns x
• Client can calculate $Key_x^r = (g^r)^k$ easily, but cannot calculate $c^r$ (due to lack of r), which is needed for $Key_{1-x}^r = c^r \cdot (g^r)^{-k}$
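The key exchange above can be exercised end to end with a toy Python sketch of this style of 1-out-of-2 oblivious transfer. It is a sketch under stated assumptions, not the dissertation's implementation: the group (P = 1019, Q = 509, G = 4) is tiny and insecure, a SHA-256-derived XOR pad stands in for AES, and all function names are hypothetical.

```python
import hashlib
import secrets

# Toy group: P = 2Q + 1 with P, Q prime; G = 4 generates the order-Q subgroup.
P, Q, G = 1019, 509, 4


def pad(key: int, n: int) -> bytes:
    # Stand-in for AES: derive an n-byte one-time pad (n <= 32) from the key.
    return hashlib.sha256(str(key).encode()).digest()[:n]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(i ^ j for i, j in zip(a, b))


def proxy_setup():
    """Proxy picks random r and c at startup; publishes c and g^r."""
    r = secrets.randbelow(Q - 1) + 1
    c = pow(G, secrets.randbelow(Q - 1) + 1, P)
    return r, c, pow(G, r, P)


def client_request(x: int, c: int):
    """Client picks k, sets Key_x = g^k and Key_{1-x} = c * g^-k, sends Key_0."""
    k = secrets.randbelow(Q - 1) + 1
    key_x = pow(G, k, P)
    key_other = c * pow(key_x, -1, P) % P
    return k, (key_x if x == 0 else key_other)


def proxy_respond(key0: int, r: int, c: int, m0: bytes, m1: bytes):
    """Proxy encrypts m0 under Key_0^r and m1 under Key_1^r = c^r / Key_0^r."""
    key0_r = pow(key0, r, P)
    key1_r = pow(c, r, P) * pow(key0_r, -1, P) % P
    return xor(m0, pad(key0_r, len(m0))), xor(m1, pad(key1_r, len(m1)))


def client_decrypt(x: int, k: int, g_r: int, y0: bytes, y1: bytes) -> bytes:
    """Client computes Key_x^r = (g^r)^k and decrypts only y_x."""
    y = y1 if x else y0
    return xor(y, pad(pow(g_r, k, P), len(y)))


def transfer(x: int, m0: bytes, m1: bytes) -> bytes:
    r, c, g_r = proxy_setup()
    k, key0 = client_request(x, c)
    y0, y1 = proxy_respond(key0, r, c, m0, m1)
    return client_decrypt(x, k, g_r, y0, y1)
```

Calling `transfer(0, b"u", b"s*u")` returns the first message and `transfer(1, b"u", b"s*u")` the second; the proxy only ever sees Key_0, which is a random-looking group element for either choice of x.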
![Page 74: PRIVACY-PRESERVING COLLABORATIVE NETWORK ANOMALY DETECTION Haakon Ringberg](https://reader036.vdocuments.net/reader036/viewer/2022062516/56649e4b5503460f94b3fe7b/html5/thumbnails/74.jpg)
Other usage scenarios
1. Cross-checking certificates
  • e.g., Perspectives¹
  • Clients = end users
  • Keys = Hash of certificates received
2. Distributed ranking
  • e.g., Alexa Toolbar²
  • Clients = Web users
  • Keys = Hash of web pages
¹[Wendlandt USENIX ’08], ²[www.alexa.com]