network awareness and network security room/oct13/pst.pdf · they range from network wide analyses...
TRANSCRIPT
Faculty Of Computer SciencePrivacy and Security Lab
Network Awareness andNetwork Security
John McHughCanada Research Chair in Privacy and Security
Director, Privacy and Security Laboratory12 October 2005
Faculty Of Computer SciencePrivacy and Security Lab
OverviewThe internet has grown to the point where
observational approaches offer one of the onlyapproaches to its understanding.• In this case, we have a view of the border of a large
composite network with > /8 address space and severalmillion hosts.
• Given a broad view, it is possible to look at bothsimilarities and differences in the traffic going to/fromvarious sites.
• In this talk, we will try to sketch a variety of analyses thatcan be performed when such data is available
Note: Similar data can be aggregated in a variety ofways. Opportunities to obtain similar views exist.
Faculty Of Computer SciencePrivacy and Security Lab
Network Data CollectionApproach
Look at flow abstractions for a large customer network.• No payload data just headers – Source, Destination IP and
ports; protocol; times; traffic volumes (e.g., packets and bytes)- Cisco NetFlow like sources
Comprehensive coverage• >95% coverage of the customer network• Multiple networks, at the minimum
Collect a lot of data• Requires a data center with large computational and storage
capacity for historical analysis• Scalable collection and analysis
Faculty Of Computer SciencePrivacy and Security Lab
Analysis ApproachNetflow Data
• Organized by hour, type, sensor (router), etc. for outsideto inside and inside to outside.
• Packed format for efficient linear search• 10s of TB and growing by 10s of GB/day
Primary operation is selection based on time, IP, flowcharacteristics (protocol, volume, etc.)• Creates files of raw data satisfying selection criteria• Statistics on files can be produced
Can create sets or bags (multisets, counted sets) ofIPs with selection characteristics• Sets can be used for further selection / filtering See
ESORICS paper next week
Faculty Of Computer SciencePrivacy and Security Lab
A few quick examplesThe next few slides show some typical samples of the
kinds of information that can be derived from the flowdata.
They range from network wide analyses toexaminations of the characteristics of specificsubnets and even of specific hosts.• The analysis system can be viewed as a powerful zoom
lens.• It is capable at looking at the overall traffic on a few percent
of the internet at its widest angle, or at a single host at itshighest magnification.
Faculty Of Computer SciencePrivacy and Security Lab
Protocol 6: TCP Routed Data (Bytes)
0
2
4
6
8
10
12
14
16
5/2
/02
5/9
/02
5/1
6/0
2
5/2
3/0
2
5/3
0/0
2
6/6
/02
6/1
3/0
2
6/2
0/0
2
6/2
7/0
2
7/4
/02
7/1
1/0
2
7/1
8/0
2
7/2
5/0
2
Bytes
July 5
May 24 July 1
Faculty Of Computer SciencePrivacy and Security Lab
Characteristics of small sample
We looked at 1 minute of data from themonitored network• Used set of active inside IPs to partition out into
• Hit data is addressed to “active” hosts• Miss data is addressed to “inactive” hosts
• Partitions have very different characteristics• Miss data appears to be mix of scans and noise
Faculty Of Computer SciencePrivacy and Security Lab
1 Min sample - destinationsIP Destination Analysis
0.0%
20.0%
40.0%
60.0%
80.0%
100.0%
1 2 3 4 5 6 7 8 9
Flows per Destination
Perc
en
t o
f to
tal IP
s
Miss set - 0.1% > 9 flows
Hit set - 2.4% > 9 flows
Faculty Of Computer SciencePrivacy and Security Lab
1 Min Sample - sourcesIP Source Analysis
0.0%
20.0%
40.0%
60.0%
80.0%
100.0%
1 2 3 4 5 6 7 8 9
Flows per Source
Perc
en
t o
f to
tal
IPs
Miss set - 36.1% > 9 flows
Hit set - 4.1% > 9 flows
Faculty Of Computer SciencePrivacy and Security Lab
top 5 in 1 min sample(39) lip $ readbag --count --print jcm-tcp-s-
10+.bag|\ sort -r -n | head
12994 AAA.BBB.068.218 - scan 4899 (Radmin) 6598 CCC.DDD.209.215 - scan 7100 (X-Font) 5944 EEE.FFF.125.117 - scan 20168 (Lovegate) 5465 GGG.HHH.114.052 - ditto 5303 III.JJJ.164.126 - scan 3127 (My doom)
Faculty Of Computer SciencePrivacy and Security Lab
Bottom of bag in 1 min sample3335 external hosts sent exactly one TCP flow
• SYN probes for port 8866 449 times• W32.Beagle.B@mm is a mass-mailing worm-back door
on TCP port 8866.• SYN probes for port 25 are seen 271 times.• Most remainder are SYNs to a variety of ports, mostly with
high port numbers.• There are a number of ACK/RST packets which are
probably associated with responses to spoofed DDoSattacks.
Faculty Of Computer SciencePrivacy and Security Lab
Host activity on NNN.OOO.0.0/16
0
20
40
60
80
100
120
140
1 2 3 4 5 6 7 8 9 10 11 12 13 14
Date in April 2004
Acti
ve H
osts
NNN.OOO.012.x
NNN.OOO.013.x
NNN.OOO.014.x
NNN.OOO.015.x
NNN.OOO.016.x
12 others 1 host ea
Faculty Of Computer SciencePrivacy and Security Lab
NNN.OOO in to out TCP dst
0
2000
4000
6000
8000
10000
12000
14000
16000
4/1 4/3 4/5 4/7 4/9 4/11 4/13 4/15
Date
Flo
ws p
er
ho
ur
80
443
25
3200
0
2847
2848
8080
8383
2112
9100
Faculty Of Computer SciencePrivacy and Security Lab
NNN.OOO in to out TCP src
0
200
400
600
800
1000
1200
1400
1600
1800
4/1 4/3 4/5 4/7 4/9 4/11 4/13 4/15
Date
Flo
ws /
Ho
ur
443
25
80
3838
2607
21
1361
3374
2310
4378
Faculty Of Computer SciencePrivacy and Security Lab
What does this mean?
This installation operates 5/8 or 6/8 for themost part.
It is sparsely populated• A handfull of /24s• 25% to 50% populated
Outgoing traffic typical of workstations in sportOutgoing traffic typical of https server in dportInteresting mix of minority ports, but numbers
very small after first few.
Faculty Of Computer SciencePrivacy and Security Lab
One week on another /16
0
500000
1000000
1500000
2000000
2500000
3000000
3500000
4000000
4500000
11-Jan 12-Jan 13-Jan 14-Jan 15-Jan 16-Jan 17-Jan 18-Jan
Date and Time
Ho
url
y F
low
Fil
e S
ize
Inbound Hits
Inbound misses
Faculty Of Computer SciencePrivacy and Security Lab
Host Characterization
Cpt. Damon Becknel, USA looked at hostcharacterization based on port mixes. Wedeveloped the following visualizations to helpin the process.• The log scale on flows helps when there are large
differences in flow volumes among ports• The data was NOT filtered by protocol and there
is “noise” in the port fields for some protocols,especially ICMP and possibly others.
Faculty Of Computer SciencePrivacy and Security Lab
Workstation?
Faculty Of Computer SciencePrivacy and Security Lab
Web Server
Faculty Of Computer SciencePrivacy and Security Lab
Analysis of Misconfigurations andMalicious Activity
Identify at different levels of detail:• Web server revisited• Non-client traffic routed through client networks• Client traffic inbound, Non-client traffic outbound• DDoS attack traffic• Worms of various kinds• Precursor activities• Scan Detection• Contact surface anomaly
Faculty Of Computer SciencePrivacy and Security Lab
Web Server
Faculty Of Computer SciencePrivacy and Security Lab
Routing Anomalies and Backdoors
3205
8078
0100020003000400050006000700080009000
5/28 6/4 6/1
16/1
86/2
5 7/2 7/9 7/16
7/23
7/30 8/6 8/1
38/2
08/2
7 9/3 9/10
9/17
9/24
Inbound Non-Client -> Non-Client
Faculty Of Computer SciencePrivacy and Security Lab
Examining Denial Of ServiceAttacksUnique class A Address Spaces during a DOS attack
0
50
100
150
200
250
16_05
16_15
17_00
17_10
17_19
18_04
18_15
19_00
19_10
19_19
20_04
20_14
20_23
21_09
21_18
22_03
22_13
Day/Hour
Un
iqu
e A
dd
res
se
s
Normal
Faculty Of Computer SciencePrivacy and Security Lab
Inbound Slammer Traffic
UDP Port 1434 Flows
0
5000000
10000000
15000000
20000000
25000000
30000000
35000000
40000000
0 2 4 6 8 10 12 14 16 18 20 22 0 2 4 6 8 10 12 14 16 18
Hour 1/24:00-1/25:18
Flo
ws
Faculty Of Computer SciencePrivacy and Security Lab
Slammer: Precursor DetectionUDP Port 1434 - Precursor
0
20000
40000
60000
80000
100000
120000
140000
160000
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 0 1 2 3 4Hour 1/24:00 1/25:04
Flow
s
Series1
Faculty Of Computer SciencePrivacy and Security Lab
Focused on hours 6, 7, 8, 13, 14Identified 3 primary sources, all from a known
adversaryAll 3 used a fixed patternIdentified responders: 2 out of 4 subsequently
compromised.
Slammer: Precursor Analysis
Faculty Of Computer SciencePrivacy and Security Lab
Scanner
Faculty Of Computer SciencePrivacy and Security Lab
Developing the contact surfaceIn looking for scanners, Carrie Gates asked if there is a
normal contact pattern?• i.e. how many external hosts contact how many internal
hosts per unit time ?One might expect clustering or a power law, but
NumberOf
Sources
Number of Destinations
Faculty Of Computer SciencePrivacy and Security Lab
Structure in the contact surfaceNow you see it (July 2003)
Faculty Of Computer SciencePrivacy and Security Lab
Structure in the contact surfaceAway it goes (August 2003)
Faculty Of Computer SciencePrivacy and Security Lab
Structure in the contact surfaceNow you don’t (September 2003)
Faculty Of Computer SciencePrivacy and Security Lab
Structure in the contact surfaceHere it is again (Feb / April /June 2004)
Faculty Of Computer SciencePrivacy and Security Lab
We are still studying thisPhase 1
• Persisted from Jan - Aug• details for 1 week in July• 91% port 80, 87% SYN,
96% unique sIP/dIP pairs• 49% to 60 /16 in 1 /8
• 50% of 49% to 5 /16• 14% to 1 /16
• 48 /24s <3.2%ea notarget
• 14% to another /8• 3 /8 srcs 46%, 34%, 20%
• 2 AP, 1 SA• Too slow to be DoS• Too persistent for scan?
Blaster released on Aug 11• Could this be the related?
Phase 2• From mid Feb - early June• Again SYN to Port 80• 2 of 3 /8 sources same
• 3rd is there but not asstrong
• 23% to a new /8Conclusion
• Appears to be wellcoordinated activity, butwhat or why?
Faculty Of Computer SciencePrivacy and Security Lab
Weekly Contact HoursContacts per Source (1 Week)
1
10
100
1000
10000
100000
1000000
10000000
0 50 100 150
Total contact hours
Nu
mb
er o
f s
ou
rces
Contacts per Source (1 Week)
Faculty Of Computer SciencePrivacy and Security Lab
Lack of stealthy activity• During the week there are about 3.3 million
sources that show up exactly once.• The distribution tapers off rapidly from there.• In a linear plot it looks like a straight line at
approximately zero, but the log plot shows aninteresting upturn at the end. This contains:• One connection that persists for the entire
week with flows in each hour.• A number of apparently scripted transfers that
occur every few minutes. At least two involveaccess to a weather site.
• At first glance, no evidence of hourly, stealthyactivity.
Faculty Of Computer SciencePrivacy and Security Lab
Partitioning DataA set can be used with rwfilter to partition data
into portions that have a destination (orsource) IP that are in the set and those thatare not.• The set of hosts in a network that have been
observed to emit traffic during some time intervalapproximates the active population of the network.
• Traffic sent to addresses that are not associatedwith hosts is arguably malicious
• The active host set can be used to separate thistraffic from traffic addressed to active hosts• Additional refinements are possible
Faculty Of Computer SciencePrivacy and Security Lab
Hit and Miss Sources above32000
Hit and Miss Sources above 32000 Flows/Hour
0
10
20
30
40
50
60
70
80
0 50 100 150
Hour in Week
Nu
mb
er
of
So
urc
es p
er
Ho
ur
Hit Sources
Miss Sources
Faculty Of Computer SciencePrivacy and Security Lab
So who are these sourcesThe following are in the heavy miss list for more than 50 Hours inthe week analyzed.
063.210.x.y 66 Level 3 Communications, Inc., CO199.099.x.y 135 Performance Systems International, DC200.074.x.y 51 Metropolis Intercom, Santiago, CL203.129.x.y 77 Software Technology Parks of India, Pune, IN217.107.x.y 113 RTComm.ru
Two are registered in ARIN, one in the Asian Registry, one in theLatin American Registry and one with RIPE.
Faculty Of Computer SciencePrivacy and Security Lab
And what are they doing?Flows from Heavy sources
0
100000
200000
300000
400000
500000
600000
700000
800000
900000
1000000
0 50 100 150
Hour in Week
Flo
ws p
er
ho
ur
063.210.x.y
199.099.x.y
200.074.x.y
203.129.x.y
217.107.x.y
Faculty Of Computer SciencePrivacy and Security Lab
199.99.x.y is scanningThis analysis is based on 2003/01/14:00• rwcut -fields=1,4,5,6,7,8 finds the flow signature• Clustering the lines shows a SYN scan
Clusters, counts, and order: 117075 records 2 clusters processsed. sIP|dPort|pro|packets| bytes| flags| 199.99.x.y| 80| 6 | 2| 88| S | 96842 1 199.99.x.y| 80| 6 | 1| 44| S | 20233 2End of input after 117075 records forming 2 clusters.• Forming the destination set shows that a single /16
readset --print-net=AB# 199.099.x.y-d.set AAA.BBB.0.0/16 : 51744 hosts in 228 /24s and 1785 /27s.• This is a fairly common pattern for high volume scanners
Faculty Of Computer SciencePrivacy and Security Lab
200.74.x.y is a more complexcase
Clusters, counts, and order: 376572 records 53 clusters processsed. sIP|dPort|pro|packets| bytes| flags| 200.74.x.y| 80| 6| 1| 40| S | 92354 1 200.74.x.y| 8080| 6| 1| 40| S | 91678 2 200.74.x.y| 3128| 6| 1| 40| S | 90959 3 200.74.x.y| 1080| 6| 1| 40| S | 88865 4• Futher down are clusters indicating responses
200.74.x.y| 1080 | 6| 2| 80| SR | 1246 8 200.74.x.y| 3128| 6| 2| 80| SR | 136 10 200.74.x.y| 8080| 6| 2| 80| SR | 64 15 200.74.x.y| 8080| 6| 7| 386| SRPA | 5 21 200.74.x.y| 8080| 6| 4| 254| FS PA | 4 22 200.74.x.y| 80| 6| 7| 338| SRPA | 4 23
Faculty Of Computer SciencePrivacy and Security Lab
Proactive use - Spyware
Outgoing traffic from spyware creates hot spots.• Aggregation of destinations hints at targets• Similarity of content provides additional evidence• The wider the spread of the spyware, the easier it
is to detect this way• The more data is aggregated the easier it is to
see this.
Zombie command and control networks mightbe detectable this way, also.
Faculty Of Computer SciencePrivacy and Security Lab
Summary / Conclusion / FutureWe have an unprecedented ability to examine network
traffic• Long periods of time, large volumes
Empirical Analyses• How to identify scans, Denials Of Service• What defenses work?• Evidence of compromise
Acquire Additional Data Sources• Different network topologies - Would like to look at interior
nodes and subnets as well as border.• Different types of networks (e.g., Wireless)
Faculty Of Computer SciencePrivacy and Security Lab
Credit where credit is dueTo the entire Situational Awareness team, especially:The late Suresh L. Konda
• Suresh was, and remains, the inspiration for this program
Mike Collins, Andrew Kompanek, and the SiLKtoolsdevelopers• Mike Duggan, Mark Thomas
CERT analysts and users of the data• Carrie Gates, Marc Kellner, Capt. Damon Becknel, USA
• Carrie and Damon are responsible many of the graphics
Tom Longstaff for managing the unmanagable