1
Detecting Malicious Flux Service Networks through Passive Analysis of Recursive DNS Traces
Roberto Perdisci, Igino Corona, David Dagon, Wenke Lee
ACSAC (Dec. 2009)
2010/3/2
2
Agenda
• Introduction
• Objective
• Detecting Malicious Flux Networks
• Experiments
• Conclusion
3
Agenda
• Introduction
• Objective
• Detecting Malicious Flux Networks
• Experiments
• Conclusion
Fast-Flux? In 2007
Fast-flux domain names
Malicious fast-flux network
4
Malicious flux service networks
• Can be viewed as illegitimate content-delivery networks (CDNs)
• The nodes of a malicious flux service network are called flux agents
• Commonly used to host phishing websites and illegal adult content, or to serve as malware propagation vectors
5
Related Work
• Prior work focused on detecting fast-flux domain names
• It characterized fast-flux domains and detailed the classification algorithms
• It was mainly limited to studying fast-flux domains advertised through spam emails
6
Approach
• Novel and passive
• Monitor the DNS queries and responses between the users and the RDNS, and selectively store information about potential fast-flux domains in a central DNS data collector
• Done by deploying sensors in front of the recursive DNS (RDNS) servers
7
Agenda
• Introduction
• Objective
• Detecting Malicious Flux Networks
• Experiments
• Conclusion
Focus on detecting malicious flux networks in the wild
Passive detection can benefit the accuracy of spam filtering applications
8
Agenda
• Introduction
• Objective
• Detecting Malicious Flux Networks
• Experiments
• Conclusion
9
Characteristics of Flux Domain Names
a) Short time-to-live (TTL)
b) The set of resolved IPs (i.e., the flux agents) returned at each query changes rapidly, usually after every TTL
c) The overall set of resolved IPs obtained by querying the same domain name over time is often very large
d) The resolved IPs are scattered across many different networks
10
Traffic Volume Reduction (F1) (1)
• q(d) = (t_i, T(d), P(d))
  – DNS query performed by a user at time t_i to resolve the set of IP addresses owned by domain name d
• T(d)
  – the time-to-live (TTL) of the DNS response
• P(d)
  – the set of resolved IPs returned by the RDNS server
11
Traffic Volume Reduction (F1) (2)
Three constraints in F1:
F1-a) T(d) ≤ 10800 seconds (i.e., 3 hours)
F1-b) |P(d)| ≥ 3 OR T(d) ≤ 30
F1-c) |prefix(P(d), 16)| / |P(d)| ≥ 1/3
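As an illustration of how the three F1 constraints could be checked for a single DNS response, here is a minimal Python sketch. It assumes that the three constraints are combined with AND, that the resolved IPs are IPv4 strings, and that prefix(P, 16) means the set of distinct /16 networks covering P. The function names `prefix16_ratio` and `passes_f1` are hypothetical and do not come from the slides.

```python
from ipaddress import ip_network


def prefix16_ratio(ips):
    """Distinct /16 networks divided by distinct IPs (assumed reading of F1-c)."""
    distinct = set(ips)
    if not distinct:
        return 0.0
    # strict=False lets ip_network mask the host bits of each address.
    prefixes = {ip_network(f"{ip}/16", strict=False) for ip in distinct}
    return len(prefixes) / len(distinct)


def passes_f1(ttl, ips):
    """Return True if a response (TTL in seconds, resolved IPs) satisfies F1.

    Assumption: F1-a, F1-b and F1-c must all hold at once.
    """
    distinct = set(ips)
    f1a = ttl <= 10800                       # TTL of at most 3 hours
    f1b = len(distinct) >= 3 or ttl <= 30    # several IPs, or a very short TTL
    f1c = prefix16_ratio(distinct) >= 1 / 3  # IPs scattered across /16 networks
    return f1a and f1b and f1c
```

A response with three IPs in three different /16 networks and a 300-second TTL passes, while four IPs inside a single /16 fail F1-c because the ratio is only 1/4.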
12
Periodic List Pruning (F2) (1)
• Candidate flux domain name d
  – d = (t_i, Q_i(d), T̂_i(d), R_i(d), G_i(d))
• t_i: the time when the last DNS query for d was observed
• Q_i(d): the total number of DNS queries related to d seen until t_i
• T̂_i(d): the maximum TTL ever observed for d
• R_i(d): the cumulative set of all the resolved IPs ever seen for d until time t_i
• G_i(d): a sequence of pairs {(t_j, r_j(d))}, j = 1..i
  – where r_j(d) = |R_j(d)| − |R_{j−1}(d)|
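The candidate tuple can be pictured as a small record that is updated every time a new response for d is observed. The sketch below is only an illustration: the class name `Candidate` and its field names are hypothetical, and it assumes that a pair (t_j, r_j) is stored only when r_j > 0, that is, when new IPs actually appeared. The slide does not state that last point.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Per-domain state mirroring (t_i, Q_i, T̂_i, R_i, G_i); names are illustrative."""
    last_seen: float = 0.0                        # t_i
    queries: int = 0                              # Q_i(d)
    max_ttl: int = 0                              # T̂_i(d)
    resolved: set = field(default_factory=set)    # R_i(d)
    growth: list = field(default_factory=list)    # G_i(d): (t_j, r_j) pairs

    def observe(self, t, ttl, ips):
        """Fold one DNS response seen at time t into the record."""
        before = len(self.resolved)
        self.resolved.update(ips)
        self.last_seen = t
        self.queries += 1
        self.max_ttl = max(self.max_ttl, ttl)
        new_ips = len(self.resolved) - before     # r_j = |R_j| - |R_{j-1}|
        if new_ips > 0:                           # assumption: keep growth events only
            self.growth.append((t, new_ips))
```

Keeping only the growth events makes `len(candidate.growth)` a count of how often the resolved set actually grew, which is the quantity a pruning rule would look at.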
13
Periodic List Pruning (F2) (2)
F2-a) Q_j(d) > 100 AND |G_j(d)| < 3 AND (|R_j(d)| ≤ 5 OR p ≤ 0.5)
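A possible reading of F2-a as a predicate is sketched below in Python. Several things here are assumptions rather than facts from the slide: the comparison operators were lost in the transcript and are taken as Q > 100, |G| < 3, |R| ≤ 5 and p ≤ 0.5; p is taken to be the /16 prefix ratio computed over the cumulative set R; and a domain that matches is presumed to be pruned from the candidate list. The function names are hypothetical.

```python
def prefix16_ratio(ips):
    """Distinct /16 prefixes divided by distinct IPv4 addresses."""
    distinct = set(ips)
    if not distinct:
        return 0.0
    # The first two octets of a dotted IPv4 string identify its /16 network.
    prefixes = {".".join(ip.split(".")[:2]) for ip in distinct}
    return len(prefixes) / len(distinct)


def matches_f2a(queries, growth_events, resolved_ips):
    """True if the domain looks popular but stable (assumed pruning condition)."""
    resolved = set(resolved_ips)
    p = prefix16_ratio(resolved)                  # assumption: p is computed over R
    return (
        queries > 100                             # queried often
        and growth_events < 3                     # resolved set rarely grew
        and (len(resolved) <= 5 or p <= 0.5)      # few IPs or low prefix diversity
    )
```

Under these assumptions, a domain queried 500 times that still resolves to two IPs in one /16 matches, while a domain with six IPs spread over six /16 networks does not.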
14
Domain Clustering (1)
• A similarity (or proximity) matrix P = {s_ij}, i,j = 1..n, consists of the similarities s_ij = sim(d_i, d_j)
  – over the set of domains D = {d_1, d_2, ..., d_n}
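The slide does not say how sim(d_i, d_j) is computed. As a stand-in, the Python sketch below uses the Jaccard index of the two domains' resolved IP sets, which is an assumption chosen for illustration. The names `jaccard` and `similarity_matrix` are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections of IPs (assumed stand-in for sim)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(resolved_by_domain):
    """Build P = {s_ij} for a mapping from domain name to its resolved IP set."""
    domains = sorted(resolved_by_domain)
    matrix = [
        [jaccard(resolved_by_domain[x], resolved_by_domain[y]) for y in domains]
        for x in domains
    ]
    return domains, matrix
```

The matrix is symmetric with ones on the diagonal for any domain that has at least one resolved IP, so only the upper triangle actually needs to be stored for a large D.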
15
Domain Clustering (2)
• The hierarchical clustering algorithm takes P as input and produces a dendrogram as output, i.e., a tree-like data structure in which the leaves represent the original domains in D
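The Experiments slides mention a single-linkage variant of this algorithm. Cutting a single-linkage dendrogram at a similarity level h gives the same groups as linking every pair of domains whose similarity is at least h and taking connected components, so a flat clustering can be sketched in Python with a small union-find. The threshold value and the function name `single_linkage_clusters` are hypothetical, and this sketch returns only the flat clusters, not the full dendrogram.

```python
def single_linkage_clusters(matrix, threshold):
    """Group indices whose similarity chain stays at or above `threshold`."""
    n = len(matrix)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] >= threshold:
                parent[find(i)] = find(j)     # merge the two clusters

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Domains 0 and 2 can end up in the same cluster even when their direct similarity is zero, as long as some domain 1 is similar enough to both. This chaining is characteristic of single linkage.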
16
Service Classifier (1)
• Some features are used to distinguish between malicious flux services and legitimate/non-flux services
• Both passive and active features
  – Passive: directly extracted from the information collected by passively monitoring the DNS queries
    • Ex: number of resolved IPs (feature 1)
  – Active: need some external information to be computed
    • Ex: country code diversity (feature 10)
17
Service Classifier (2)
• Employ the popular C4.5 decision-tree classifier to automatically classify a cluster Ci as either a malicious flux service or a legitimate/non-flux service
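C4.5 chooses its splits by gain ratio, that is, information gain divided by the entropy of the split itself. The Python sketch below computes that criterion for one numeric feature and one candidate threshold. It only illustrates the split criterion and is not a full C4.5 implementation; the function names are hypothetical, and the feature values and labels passed to it are up to the caller.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting on `value <= threshold` for one numeric feature."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0                                # degenerate split, nothing learned
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * log2(w_left) + w_right * log2(w_right))
    return gain / split_info
```

A threshold that separates the two classes cleanly scores higher than one that leaves a mixed branch, which is how the tree decides which feature and cut point to test first.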
18
Agenda
• Introduction
• Objective
• Detecting Malicious Flux Networks
• Experiments
• Conclusion
19
Collecting Recursive DNS Traffic
• Two sensors in front of two different RDNS servers of a large North American ISP
• Between March 1 and April 14, 2009
• More than 4 million users
• Monitored 2.5 billion DNS queries per day
• Set the epoch E to be one day
20
Clustering Candidate Flux Domains (1)
• Apply a single-linkage hierarchical clustering algorithm to group together domains that belong to the same network
• Needs 30 to 40 minutes per day per sensor
• Obtained 4000 domain clusters per day
21
Clustering Candidate Flux Domains (2)
• Manually verified the quality of the results for a subset of the clusters obtained every day
• With the help of a graphical interface
• Ex:
  – NTP server pool in Europe, North America, Oceania, etc.
22
Evaluation of the Service Classifier (1)
• Statistical supervised learning approach
• Label the domain clusters
  – according to network prefix diversity (feature 4),
  – cumulative number of distinct resolved IPs (feature 1),
  – the IP growth ratio (feature 6), etc.
23
Evaluation of the Service Classifier (2)
• Features shown (labeled 3, 5, and 6 on the slide): avg. TTL per domain, number of domains per network, IP growth ratio
• Classify between malicious flux networks and non-malicious flux networks
24
Can this Contribute to Spam Filtering? (1)
• Check the intersection between the set of domain names found in spam emails and the domains from the malicious flux networks identified by the detection system
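The intersection check described above amounts to a set operation once both lists are normalized. The Python sketch below shows one way to do it. The function name `flux_overlap`, the normalization rules (lowercasing and dropping a trailing dot), and the coverage measure are assumptions made for illustration.

```python
def normalize(domain):
    """Lowercase a domain name and drop surrounding spaces and a trailing dot."""
    return domain.strip().lower().rstrip(".")


def flux_overlap(spam_domains, flux_domains):
    """Return the shared domains and the fraction of spam domains they cover."""
    spam = {normalize(d) for d in spam_domains}
    flux = {normalize(d) for d in flux_domains}
    common = spam & flux
    coverage = len(common) / len(spam) if spam else 0.0
    return common, coverage
```

The coverage value answers the question in the slide title directly: it is the share of spam-advertised domains that the passive detector had already flagged.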
25
Can this Contribute to Spam Filtering? (2)
26
Agenda
• Introduction
• Objective
• Detecting Malicious Flux Networks
• Experiments
• Conclusion
27
Conclusion
• The detection system is based on passive analysis of recursive DNS (RDNS) traffic traces
• It is not limited to the analysis of suspicious domain names extracted from spam emails or precompiled domain blacklists
• It can benefit spam filtering applications