
Page 1: Locality, Network  Control and Anomaly Detection

Locality, Network Control and Anomaly Detection

Jim Binkley

[email protected]

Portland State University

Computer Science

Page 2: Locality, Network  Control and Anomaly Detection

2

Outline

intro to ourmon, a network monitoring system

network control and anomaly detection theory

pictures at an exhibition of attacks

port signatures

Gigabit Ethernet - flow measurement: what happens when you receive 1,488,000 64-byte packets a second? answer: not enough

more information

Page 3: Locality, Network  Control and Anomaly Detection

3

the network security problem

you used IE and downloaded a file and executed it perhaps automagically

you used outlook and got email and executed it. just click here to solve the password problem

you were infected by a worm/virus thingee trolling for holes which tries to spread itself with a known set of network-based attacks

result: your box is now controlled by a 16-year old boy in Romania, along with 1000 others

and may be a porn server, or is simply running an IRC client or server, or has illegal music or “warez” on it

and your box was just used to attack Microsoft or the White House

Page 4: Locality, Network  Control and Anomaly Detection

4

how do the worm/virus thingees behave on the network?

1. case the joint: try to find other points to infect via TCP SYN scanning of remote IPs and remote ports

• this is just L4, TCP SYNs, no data at L7

• ports 135-139, 445, 80, 443, 1433, 1434 are popular

• ports like 9898 are also popular, and represent worms looking for worms (backdoors)

2. once they find you, they can launch known and unknown attacks at you; now we have L7 data

3. this is a random process, although targeting of certain institutions/networks is certainly common

4. a tool like agobot can be aimed at dest X, and then dest X can be “DOSSED/scanned” with a set of attacks

Page 5: Locality, Network  Control and Anomaly Detection

5

what the heck is going on out there?

Vern Paxson said: the Internet is not something you can simulate

Comer defined the term Martian: a packet on your network that is unexpected

The notion of well-known ports (so you can state that port 80 is used *only* for web traffic) is defunct; media vendors “may” have had something to do with this

John Gilmore said: the Internet routes around censorship

We have more and more P2P apps where 1. you don’t know the ports used, 2. things are less server-centric, and 3. crypto may be in use

Page 6: Locality, Network  Control and Anomaly Detection

6

a lot of this work was motivated by this problem

gee, where’s the “anomaly”? now tell me what caused it … and what caused the drops too

Page 7: Locality, Network  Control and Anomaly Detection

7

ourmon introduction

ourmon is a network monitoring system with some similarities to (and differences from) traditional SNMP RMON II

• the name is a take-off on this (ourmon is not rmon)

• Linux ntop is somewhat similar; so is Cisco netflow

we deployed it in the PSU DMZ a number of years ago (2001)

first emphasis was on network stats

• how many packets, how much TCP vs UDP, etc.

• what the heck is going on out there?

recent emphasis is on detection of network anomalies

Page 8: Locality, Network  Control and Anomaly Detection

8

PSU network

Gigabit Ethernet backbone, including GE connections to Inet1 and Inet2

• I2 is from the University of Washington (OC-3)

• I1 is from the State of Oregon university net (NERO)

• PREN (L3) / NWAX (L2) exchange in Pittock

350 Ethernet switches at PSU, 10000 live ports, 5-6k hosts

4 logical networks: resnet, OIT, CECS, 802.11

10 Cisco routers in DMZ

ourmon shows 15-30k packets per second in DMZ (normally), 2G packets per day

Page 9: Locality, Network  Control and Anomaly Detection

9

PSU network DMZ: simplified overview

[diagram: Fricka, the Pittock PSU router at the NWAX exchange, connects over a gig E link to the local DMZ switch on the DMZ net; behind it sit the (often redundant) L3 routers for MCECS, PUBNET, OIT and RESNET, plus instrumentation including the ourmon front-end. Firewall ACLs exist for the most part on DMZ interior routers; ourmon.cat.pdx.edu/ourmon is within the MCECS network]

Page 10: Locality, Network  Control and Anomaly Detection

10

ourmon architectural overview

a simple 2-system distributed architecture

• front-end probe computer – gathers stats

• back-end graphics/report processor

front-end depends on Ethernet switch port-mirroring, like Snort

does NOT use ASN.1/SNMP; summarizes/condenses data for the back-end

data is ASCII; the summary file is copied via an out-of-band technique

• micro_httpd/wget, or scp, or rsync, or whatever

Page 11: Locality, Network  Control and Anomaly Detection

11

ourmon current deployment in PSU DMZ

[diagram: a Cisco gE switch with three links of interest: A, the gigabit connection to the exchange router (Inet); B, to Engineering; C, to the dorms, OIT, etc. Two ourmon probe boxes (P4s) take port-mirrors of A, B and C, and a separate ourmon graphics/display box runs the back-end]

Page 12: Locality, Network  Control and Anomaly Detection

12

ourmon architectural breakdown

[diagram: the probe box (FreeBSD) reads packets from the NIC via the kernel BPF buffer, driven by the ourmon.conf config file; at runtime it evaluates 1. N BPF expressions, 2. topn (hash tables) of flows and other things (lists), and 3. some hardwired C filters (scalars of interest), and writes the mon.lite report file plus tcpworm.txt; the graphics box (BSD or Linux) turns these into 1. RRDTOOL strip charts, 2. histogram graphs, and 3. various ASCII reports and hourly summaries]

Page 13: Locality, Network  Control and Anomaly Detection

13

the front-end probe

written in C; input file: ourmon.conf (filter spec)

1-6 BPF expressions may be grouped in a named graph, and count either packets or bytes

some hardwired filters written in C

topn filters (generate lists: #1, #2, ... #N); 3 kinds of topn tuples: flow, TCP syn, icmp/udp error

output files: mon.lite, tcpworm.txt - a summarization of stats by named filter; ASCII, but usually small (currently 5k)

Page 14: Locality, Network  Control and Anomaly Detection

14

the front-end probe

typically uses a 7-8 megabyte kernel BPF buffer

we only look at the traditional 68-byte snap size, a la tcpdump, meaning L2-L4 HEADERS only, not data (this is starting to change)

at this point, due to hash tuning, we rarely drop packets barring massive syn attacks

the front-end is basically 2-stage: gather packets and count according to filter type; write the report at a 30-second alarm period (see the sketch below)
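A minimal sketch (not ourmon's actual C code) of that two-stage loop: count packets as they arrive, then flush a mon.lite-style report every 30-second alarm period. The packet stream here is simulated.

    import time

    REPORT_PERIOD = 30.0                       # ourmon's alarm period, in seconds
    counts = {"tcp": 0, "udp": 0, "icmp": 0, "other": 0}
    deadline = time.monotonic() + REPORT_PERIOD

    def handle_packet(proto):
        # stage 1: count the packet under the matching filter
        counts[proto] = counts.get(proto, 0) + 1

    def maybe_report():
        # stage 2: at each 30-second alarm, dump and reset the counters
        global deadline
        if time.monotonic() >= deadline:
            print("fixed_ipproto:", counts)    # stand-in for writing mon.lite
            for k in counts:
                counts[k] = 0
            deadline += REPORT_PERIOD

    for proto in ["tcp", "tcp", "udp", "icmp"]:  # stand-in for the capture loop
        handle_packet(proto)
        maybe_report()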

Page 15: Locality, Network  Control and Anomaly Detection

15

ourmon.conf filter types

1. hardwired filters are specified as:

fixed_ipproto # tcp/udp/icmp/other pkts

the packet capture filter cannot be removed

2. user-mode bpf filters (configurable):

bpf “ports” “ssh” “tcp port 22”

bpf-next “p2p” “port 1214 or port 6881 or ...”

bpf-next “web” “tcp port 80 or tcp port 443”

bpf-next “ftp” “tcp port 20 or tcp port 21”

3. the topN filter (ip flows) is just: topn_ip 60, meaning top 60 flows
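To illustrate the grouping rule above (each bpf line opens a named graph, each bpf-next adds another expression, up to 6 per graph), here is a hypothetical Python parser for that spec; it illustrates the format only and is not ourmon's actual config reader.

    import shlex

    def parse_filter_spec(lines):
        graphs = []                      # list of (graph name, [(label, bpf expression), ...])
        for line in lines:
            fields = shlex.split(line)   # honors the quoted strings
            if not fields:
                continue
            if fields[0] == "bpf":
                graphs.append((fields[1], [(fields[2], fields[3])]))
            elif fields[0] == "bpf-next" and graphs:
                graphs[-1][1].append((fields[1], fields[2]))
        return graphs

    spec = [
        'bpf "ports" "ssh" "tcp port 22"',
        'bpf-next "web" "tcp port 80 or tcp port 443"',
        'bpf-next "ftp" "tcp port 20 or tcp port 21"',
    ]
    print(parse_filter_spec(spec))
    # [('ports', [('ssh', 'tcp port 22'), ('web', ...), ('ftp', ...)])]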

Page 16: Locality, Network  Control and Anomaly Detection

16

mon.lite output file roughly like this:

pkts: caught:670744 : drops:0:

fixed_ipproto: tcp:363040008 : udp:18202658 : icmp:191109 : xtra:1230670:

bpf:ports:0:5:ssh:6063805:p2p:75721940:web:102989812:ftp:7948:email:1175965:xtra:0

topn_ip : 55216 : 131.252.117.82.3112->193.189.190.96.1540(tcp): 10338270 : etc

useful for debug, but not intended for human use
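A small sketch of how one of the hardwired-filter lines above could be parsed; it assumes the simple "name: key:count : key:count" layout shown here and is not the real (perl) back-end.

    def parse_counts(line):
        name, _, rest = line.partition(":")
        fields = [f.strip() for f in rest.split(":") if f.strip()]
        # fields alternate key, count, key, count, ...
        return name.strip(), {k: int(v) for k, v in zip(fields[0::2], fields[1::2])}

    line = "fixed_ipproto: tcp:363040008 : udp:18202658 : icmp:191109 : xtra:1230670:"
    print(parse_counts(line))
    # ('fixed_ipproto', {'tcp': 363040008, 'udp': 18202658, 'icmp': 191109, 'xtra': 1230670})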

Page 17: Locality, Network  Control and Anomaly Detection

17

back-end does graphics

written in perl: one main program, but there are now a lot of helper programs

uses Tobias Oetiker’s RRDTOOL for some graphs (hardwired and BPF): integers, as used in cricket/mrtg and other apps popular with network engineers

• easy to baseline 1 year of data

• logs (the rrd database) have a fixed size at creation

top N uses a histogram (our program) plus perl reports for topn data and other things

hourly/daily summarization of interesting things
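The RRDTOOL pattern the back-end relies on looks roughly like this: create one fixed-size round-robin database per counter, then feed it a value every 30-second period. The file name, data-source name and heartbeat below are invented for the example (the real back-end is perl), and rrdtool itself must be installed.

    import subprocess

    # one RRD per counter: 30-second step, ~1 year of samples, fixed size at creation
    subprocess.run([
        "rrdtool", "create", "tcp_pkts.rrd",
        "--step", "30",
        "DS:pkts:GAUGE:90:0:U",            # value per period, 90-second heartbeat
        "RRA:AVERAGE:0.5:1:1051200",       # 365 * 24 * 3600 / 30 rows
    ], check=True)

    # every period, push the latest count ("N" means now)
    subprocess.run(["rrdtool", "update", "tcp_pkts.rrd", "N:363040008"], check=True)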

Page 18: Locality, Network  Control and Anomaly Detection

18

hardwired-filter #1: bpf counts/drops

this happens to be yet another SQL slammer attack. the front-end was stressed, as it lost packets due to the attack.

Page 19: Locality, Network  Control and Anomaly Detection

19

2. bpf filter output example

note: xtra means any remainder and is turned off in this graph. note: 5 bpf filters mapped to one graph

Page 20: Locality, Network  Control and Anomaly Detection

20

3. topN example (histogram)

Page 21: Locality, Network  Control and Anomaly Detection

21

ourmon has taught us a few hard facts about the PSU net

P2P never sleeps (although it does go up in the evening)

Internet2 wanted “new” apps. It got bittorrent.

PSU traffic is mostly TCP traffic (ditto worms)

web and P2P are the top apps (bittorrent/edonkey)

PSU’s security officer spends a great deal of time chasing multimedia violations ...

• sorry students: Disney doesn’t like it when you serve up Shrek

• it’s interesting that you have 1000 episodes of Friends on a server, but …

Page 22: Locality, Network  Control and Anomaly Detection

22

current PSU DMZ ourmon probe

has about 60 BPF expressions grouped in 16 graphs

• many are subnet-specific (e.g., watch the dorms)

• some are not (watch the tcp control expressions)

about 7 hardwired graphs, including a count of flows IP/TCP/UDP/ICMP, and a worm-counter …

topn graphs include (5 fundamentally): TCP syn’ners, IP flows (TCP/UDP/ICMP), top ports, ICMP error generators, UDP weighted errors

topn scans: ip src to ip dst, ip src to L4 ports

Page 23: Locality, Network  Control and Anomaly Detection

23

ourmon and intrusion detection

obviously it can be an anomaly detector

McHugh/Gates paraphrase: locality is a paradigm for thinking about normal behavior and the “outsider” threat (or insider threat, if you are at a university with dorms)

thesis: anomaly detection focused on

1. network control packets; e.g., TCP syns/fins/rsts

2. errors such as ICMP packets (UDP)

3. meta-data such as flow counts, # of hash inserts

this is useful for scanner/worm finding

Page 24: Locality, Network  Control and Anomaly Detection

24

inspired by noticing this ...

mon.lite file (reconstructed), Oct 1 2003; note the 30-second flow counts (how many tuples):

topn_ip: 163000 : topn_tcp: 50000 : topn_udp: 13000 : topn_icmp: 100000

normal icmp flow count: 1000/30 seconds

We should have been graphing the meta-data (the flow counts). Widespread Nachi/Welchia worm infection in PSU dorms.

oops ...

Page 25: Locality, Network  Control and Anomaly Detection

25

actions taken as a result:

we use BPF/RRDTOOL to graph:

1. network “errors”: TCP resets and ICMP errors

2. TCP syns/resets/fins

3. ICMP unreachables (admin prohibit, host unreachable, etc.)

we have RRDTOOL graphs for flow meta-data: topN flow counts, tcp SYN tuple weighted scanner numbers

the topn syn graph now shows more info: based on ip src, it sorts by a work weight and shows SYNS/FINS/RESETS

and the port report … based on SYNs / 2 functions

Page 26: Locality, Network  Control and Anomaly Detection

26

more new anomaly detectors

the TCP syn tuple is a rich vein of information

• top ip src syn’ners

• the front-end generates a 2nd output, the tcpworm.txt file (which is processed by the back-end): port signature report, top wormy ports, stats

• the work metric is a measure of TCP efficiency

• the worm metric is a set of IP srcs deemed to have too many syns

topN ICMP/UDP error: a weighted metric for catching UDP scanners

Page 27: Locality, Network  Control and Anomaly Detection

27

daily topn reports are useful

top N syn reports show us the cumulative synners over time (so far today, and days/week)

• if many syns, few fins, few resets: almost certainly a scanner/worm (or trinity?)

• many syns, same amount of fins: may be a P2P app

ICMP error stats show both top TCP and UDP scanning hosts, especially in the cumulative report logs

both of the above reports show MANY infected systems (and a few that are not)

Page 28: Locality, Network  Control and Anomaly Detection

28

front-end syn tuple

TCP syn tuple produced by the front-end: (ip src, syns sent, fins received, resets received, icmp errors received, total tcp pkts sent, total tcp pkts received, set of sampled ports)

There are three underlying concepts in this tuple:

1. 2-way info exchange

2. control vs. data plane

3. an attempt to sample TCP dst ports, with counts
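An illustrative (not ourmon's actual C struct) way to hold that per-source syn tuple:

    from dataclasses import dataclass, field

    @dataclass
    class SynTuple:
        ip_src: str
        syns_sent: int = 0
        fins_recv: int = 0
        resets_recv: int = 0
        icmp_errors_recv: int = 0
        tcp_pkts_sent: int = 0
        tcp_pkts_recv: int = 0
        sampled_dst_ports: dict = field(default_factory=dict)   # dst port -> count

    t = SynTuple("131.252.0.1", syns_sent=120, fins_recv=3, resets_recv=10,
                 tcp_pkts_sent=150, tcp_pkts_recv=20)
    t.sampled_dst_ports[445] = 90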

Page 29: Locality, Network  Control and Anomaly Detection

29

and these metrics used in particular places

TCP work weight (0..100%), per IP src: (S(s) + F(r) + R(r)) / total pkts

TCP worm weight (low-pass filter), per IP src: S(s) - F(r) > N

TCP error weight, per IP src: (S - F) * (R + ICMP_errors)

UDP work metric, per IP src: (U(s) - U(r)) * ICMP_errors
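Written out as code, the four metrics above look roughly like this; the notation follows the slide (S = SYNs sent, F = FINs received, R = RESETs received, U = UDP packets), and the worm-weight threshold N is site-tunable, so the value here is only a placeholder.

    def tcp_work_weight(syns_sent, fins_recv, rsts_recv, total_tcp_pkts):
        # 0..100%: share of this source's TCP traffic that is pure control
        if total_tcp_pkts == 0:
            return 0.0
        return 100.0 * (syns_sent + fins_recv + rsts_recv) / total_tcp_pkts

    def tcp_worm_flagged(syns_sent, fins_recv, threshold_n=60):     # N: placeholder
        # low-pass filter: too many SYNs without matching FINs
        return (syns_sent - fins_recv) > threshold_n

    def tcp_error_weight(syns_sent, fins_recv, rsts_recv, icmp_errors):
        return (syns_sent - fins_recv) * (rsts_recv + icmp_errors)

    def udp_work_metric(udp_sent, udp_recv, icmp_errors):
        return (udp_sent - udp_recv) * icmp_errors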

Page 30: Locality, Network  Control and Anomaly Detection

30

EWORM flag system

TCP syn tuple only:

E, if error weight > 100000

W, if the work weight is greater than 90%

O (ouch), if there are some SYNS and few FINS returned (no FINS, more or less)

R (reset), if there are RESETS

M (keeping mum flag), if no packets are sent back, therefore not 2-way

what does that spell? (often it spells WOM)
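A sketch of the flag logic described above, taking the per-source counters as plain integers; the only cutoffs stated on the slide are the two numeric ones (error weight > 100000, work weight > 90%), so the O test here is one reasonable reading of "some SYNs, no FINs more or less".

    def eworm_flags(work_weight, error_weight, syns_sent, fins_recv,
                    resets_recv, pkts_returned):
        flags = ""
        if error_weight > 100000:
            flags += "E"
        if work_weight > 90.0:
            flags += "W"
        if syns_sent > 0 and fins_recv == 0:     # some SYNs, essentially no FINs
            flags += "O"
        if resets_recv > 0:
            flags += "R"
        if pkts_returned == 0:                   # nothing came back: not 2-way
            flags += "M"
        return flags

    print(eworm_flags(work_weight=100, error_weight=5000, syns_sent=200,
                      fins_recv=0, resets_recv=3, pkts_returned=40))   # -> "WOR"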

Page 31: Locality, Network  Control and Anomaly Detection

31

6:00 am TCP attack - BPF net errors

Page 32: Locality, Network  Control and Anomaly Detection

32

topn RRD flow count graph

Page 33: Locality, Network  Control and Anomaly Detection

33

bpf TCP control

Page 34: Locality, Network  Control and Anomaly Detection

34

6 am TCP top syn (this filter is useful...)

Page 35: Locality, Network  Control and Anomaly Detection

35

topn syn syslog sort

start log time : instances : DNS/ip : syns/fins/resets total counts : end log time
----------------------------------------------------------------------------------
Wed Mar 3 00:01:04 2004 : 777 : host-78-50.dhcp.pdx.edu : 401550/2131/2983 : Wed Mar 3 07:32:36 2004
Wed Mar 3 00:01:04 2004 : 890 : host-206-144.resnet.pdx.edu : 378865/1356/4755 : Wed Mar 3 08:01:03 2004
Wed Mar 3 00:01:04 2004 : 876 : host-245-190.resnet.pdx.edu : 376983/1919/8041 : Wed Mar 3 08:01:03 2004
Wed Mar 3 00:01:04 2004 : 674 : host-244-157.resnet.pdx.edu : 348895/8468/29627 : Wed Mar 3 08:01:03 2004

Page 36: Locality, Network  Control and Anomaly Detection

36

1st graph you see in the morning:

Page 37: Locality, Network  Control and Anomaly Detection

37

BPF: in or out?

Page 38: Locality, Network  Control and Anomaly Detection

38

BPF ICMP unreachables

Page 39: Locality, Network  Control and Anomaly Detection

39

hmm... size is 100.500 bytes

Page 40: Locality, Network  Control and Anomaly Detection

40

flow picture: UDP counts up

Page 41: Locality, Network  Control and Anomaly Detection

41

bpf subnet graph:

OK, it came from the dorms (this is rare ..., it takes a strong signal)

Page 42: Locality, Network  Control and Anomaly Detection

42

top ICMP shows the culprit’s IP

8 out of 9 slots taken by ICMP errors back to 1 host

Page 43: Locality, Network  Control and Anomaly Detection

43

the worm graph (worm metric)

Page 44: Locality, Network  Control and Anomaly Detection

44

port signatures in tcpworm.txt

port signature report simplified: ip src : flags : work : port-tuples

131.252.208.59 () 39: (25,93)(2703,2) ...

131.252.243.162 () 22: (6881,32)(6882,34)(6883,5) ...

211.73.30.187 (WOM) 100: (80,11)(135,11)(139,11)(445,11)(1025,11)(2745,11)(3127,11)(5000,11)

211.73.30.200 (WOM) 100: port sig same as above

212.16.220.10 (WOM) 100: (135,100)

219.150.64.11 (WOM) 100: (5554,81)(9898,18)
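The port-tuples above pair a destination port with what appears to be that port's share of the source's sampled SYNs (the shares for each WOM host sum to roughly 100); under that assumption, a signature line could be built like this:

    from collections import Counter

    def port_signature(sampled_dst_ports):       # e.g. [5554, 5554, ..., 9898, ...]
        total = len(sampled_dst_ports)
        pairs = Counter(sampled_dst_ports).most_common()
        return "".join(f"({port},{round(100 * n / total)})" for port, n in pairs)

    print(port_signature([5554] * 81 + [9898] * 18 + [80]))
    # -> "(5554,81)(9898,18)(80,1)"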

Page 45: Locality, Network  Control and Anomaly Detection

45

tcpworm.txt stats show:

on May 20, 2004 the top 20 ports for the “W” metric are: 445, 135, 5554, 9898, 80, 2745, 6667, 1025, 139, 6129, 3127, 44444, 2125, 6666, 7000, 9900, 5000, 1433

www.incidents.org reports at this time: a port 135 traffic increase due to bobax.c; ports mentioned are 135, 445, 5000

Page 46: Locality, Network  Control and Anomaly Detection

46

Gigabit Ethernet speed testing

test questions: what happens when we hit ourmon and its various kinds of filters

1. with max MTU packets at gig E rate? can we do a reasonable amount of work? can we capture all the packets?

2. with min-sized packets (64 bytes)? same questions

3. is any filter kind better/worse than any other? topn in particular (answer: it is worse); and by the way, roll the IP addresses (insert-only)

Page 47: Locality, Network  Control and Anomaly Detection

47

Gigabit Ethernet - Baseline

acc. to TR-645-2, Princeton University, Karlin, Peterson, “Maximum Packet Rates for Full-Duplex Ethernet”:

3 numbers of interest for gE:

• min-packet theoretical rate: 1488 Kpps (64 bytes)

• max-packet theoretical rate: 81.3 Kpps (1518 bytes)

• min-packet end-to-end time: 672 ns

note: the min-pkt inter-frame gap for gigabit is 96 ns (not a lot of time between packets ...)

an IXIA 1600 packet generator can basically send min/max at those rates
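A back-of-the-envelope check of those numbers: on the wire each frame also carries an 8-byte preamble and a 12-byte (96 ns) inter-frame gap, so a minimum frame occupies 84 byte times.

    LINE_RATE = 1_000_000_000            # gigabit Ethernet, bits per second
    OVERHEAD = 8 + 12                    # preamble + inter-frame gap, bytes

    def pps(frame_bytes):
        return LINE_RATE / ((frame_bytes + OVERHEAD) * 8)

    print(round(pps(64)))                          # 1488095 -> the ~1488 Kpps figure
    print(round(pps(1518)))                        # 81274   -> the ~81.3 Kpps figure
    print((64 + OVERHEAD) * 8 / LINE_RATE * 1e9)   # 672.0 ns per minimum frame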

Page 48: Locality, Network  Control and Anomaly Detection

48

test setup for ourmon/bpf measurement

[diagram: an IXIA 1600 gE packet generator sends and receives (1) min-sized and (2) max-sized packets through a Packet Engines line-speed gigabit Ethernet switch; the IXIA send port is port-mirrored to a UNIX PC (a 1.7 GHz AMD 2000 running FreeBSD, 64-bit PCI SysKonnect NIC) running ourmon/bpf]

Page 49: Locality, Network  Control and Anomaly Detection

49

test notes

keep in mind that the ourmon snap length is 68 bytes (includes L4 headers, not data); the kernel BPF is not capturing all of a max-sized packet

an IDS like snort must capture all of the pkt

• it must run an arbitrary set of signatures over an individual packet

• reconstruct flows

• undo fragmentation

• shove it in a database? (may I say … GAAAA)

Page 50: Locality, Network  Control and Anomaly Detection

50

maximum packets results

with work including 32 bpf expressions, top N, and hardwired filters:

we can always capture all the packets, but we need an N-megabyte BPF buffer in the kernel

• add bpfs, add kernel buffer size: 8 megabytes (true for Linux too)

this was about the level of real work we were doing at the time in the real world

Page 51: Locality, Network  Control and Anomaly Detection

51

minimum packet results

using only the basic count/drop filter - NO OTHER WORK!

using any size of kernel buffer (it didn’t matter), we start dropping packets at around 80 mbit speed (10% of the line rate, with overhead)

this is only with the drop/count filter! if you want to do real work, 30-50 mbits is more like it

challenged by an opponent that has a 100mbit NIC card ... (or a distributed zombie)
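For scale, that ~80 mbit drop threshold converts to roughly 120-156 thousand minimum-sized packets per second, depending on whether the figure counts framing overhead; the slide does not say, so both readings are shown.

    rate = 80_000_000                      # bits per second
    print(round(rate / (64 * 8)))          # 156250 pps, counting frame payload only
    print(round(rate / ((64 + 20) * 8)))   # 119048 pps, counting preamble + gap too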

Page 52: Locality, Network  Control and Anomaly Detection

52

why is min performance so poor?

Two points of view that are complementary:

1. there is not enough time to do any real work (you have 500 ns or so)

2. the bottom half of the OS is at HW priority; interrupts prevent the top half from running (enough) to avoid drops

note that growing the kernel buffer doesn’t help

research question: what is to be done?

btw: this is why switch/router vendors usually publish performance stats on min-sized pkts

Page 53: Locality, Network  Control and Anomaly Detection

53

also top N has a problem

random inserts mean the bucket lookup always fails, followed by memory allocation/cache churn

• random IP src and/or random IP dst

how to deal with this? one obvious answer: make sure the hash algorithm is optimized as much as possible; improved lookup certainly does help

insert is logically: lookup to find the correct bucket + insert (node allocation/setup/chaining)

our hash bucket size was way too small ...
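A toy illustration of the failure mode (a plain dict standing in for ourmon's hash table): a scanner rolling through random destinations makes every lookup miss, so every packet pays the full insert cost and the table balloons.

    import random

    flows = {}

    def count_packet(src, sport, dst, dport, proto, nbytes):
        key = (src, sport, dst, dport, proto)
        entry = flows.get(key)           # lookup: misses for every never-seen flow
        if entry is None:
            flows[key] = nbytes          # insert: allocation/setup/chaining cost
        else:
            flows[key] = entry + nbytes  # the cheap case that normal traffic mostly hits

    for _ in range(1000):                # one host syn-scanning random addresses
        dst = "131.252.%d.%d" % (random.randrange(256), random.randrange(256))
        count_packet("10.0.0.1", 3112, dst, 445, "tcp", 64)

    print(len(flows))                    # close to 1000 distinct "flows" from one source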

Page 54: Locality, Network  Control and Anomaly Detection

54

this explains my long-standing question of

why does ourmon sometimes drop packets? (in the drop/count graph)

but you couldn’t find any reason for it. reason: we were looking at BIG things like flows, not small things like TCP syn attacks

which do not add up to anything in the way of a MegaByte flow

and may be distributed

small packets are evil...

Page 55: Locality, Network  Control and Anomaly Detection

55

ourmon gigabit test conclusions

min packets are a problem: no filters and it still overflows

topn needed optimization (now bpf needs optimization); does the topn problem apply to route caching in routers?

a different, parallel architecture is needed for min pkts

consequences for an IDS like the snort system are terrible

• easy to construct a DOS attack that can sneak packets by snort

• 80 mbits of 64-byte packets will likely clog it, to say nothing of a concerted zombie attack

• less could always do the trick, depending on the exact circumstances and the amount of work done in the monitor

Page 56: Locality, Network  Control and Anomaly Detection

56

current work

auto-capture packets with a packet-capture probe: front-end driven, with a simple trigger; e.g., auto-capture on the new big worm graph event

correlate ip sources across back-end topn reports; e.g., did ip src X show up in top_udp flows, udp scanners, icmp errors?

statistical analysis of the basic two TCP metrics under normal and adverse conditions

can we build a prototype intrusion prevention system? how about a jail for “bad” hosts?

Page 57: Locality, Network  Control and Anomaly Detection

57

other ideas

simple L7 scanning to determine information about P2P and the IRC hacker control plane

what about traffic analysis in a more general sense: what if all traffic is encrypted, L4 and up?

signal graph correlation analysis: RRDTOOL graphs can correlate in ourmon AND in the router/switch infrastructure, so can we find the switch port fast?

what about saving the last N days’ worth of packets for analysis? tons of data, but given IP X infected, can I determine how that happened? or was there an IRC correlation with a DDOS?

Page 58: Locality, Network  Control and Anomaly Detection

58

more ourmon information

ourmon.cat.pdx.edu/ourmon - PSU stats and download page

ourmon.cat.pdx.edu/ourmon/info.html - how it works (not totally out of date)

talk in http://www.cs.pdx.edu/~jrb/ourmon/ourmon.pdf

2 Technical reports … if you want to read them, write to me