1
Efficient Rule Matching for Large Scale Systems
Packet Classification – A Case Study
Alok Tongaonkar
Stony Brook University
2
Rule Based Systems
Applications in Security
  Intrusion Detection Systems
  Firewalls
  Access Control Systems
Policy specified in terms of a database of rules
Enforcement involves identifying the applicable rule(s)
3
Fundamental Operation
  Given an input p with attributes {p1, p2, ..., pk}, identify the rules Ri from {R1, R2, ..., Rn} that match p
  Ri: condition -> action
  e.g. R1: dhost == PLUTO && dport == HTTP && content: "Bad command" -> DENY
Challenge
  Rule matching algorithms do not scale well, either in space or in time
4
Matching Algorithms
n – no. of rules, k – no. of attributes
Linear Search
  Match one rule at a time
  Space efficient – O(n*k)
  Matching time grows linearly with the number of rules – O(n)
Table-based Search
  Columns correspond to attributes
  Rows correspond to rules
  Wastes space when many rules specify "*" for many attributes – O(n*k)
  Efficient matching in hardware/multiprocessor – match different attributes in parallel and combine results
  In a uniprocessor environment, matching time is O(n)
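The linear search described above can be sketched as follows. This is a minimal illustration, not the implementation behind the slides: the names `linear_match`, `Rule`, and `Packet`, and the sample rules and port numbers, are assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

Packet = Dict[str, object]
# Hypothetical encoding: a rule is (name, list of tests), and each test
# is a predicate over the packet's attributes.
Rule = Tuple[str, List[Callable[[Packet], bool]]]


def linear_match(rules: List[Rule], packet: Packet) -> List[str]:
    """Return the names of all rules whose tests all hold for the packet."""
    matched = []
    for name, tests in rules:                 # visits each of the n rules
        if all(test(packet) for test in tests):   # up to k tests per rule
            matched.append(name)
    return matched


rules = [
    ("R1", [lambda p: p["dhost"] == "PLUTO", lambda p: p["dport"] == 80]),
    ("R2", [lambda p: p["dport"] == 22]),
]
packet = {"dhost": "PLUTO", "dport": 80}
print(linear_match(rules, packet))
```

Every rule is visited for every packet, which is why the matching time grows with n even when most rules share tests.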
5
Matching Algorithms contd.
Decision Tree (Trie-like structure)
  Each node corresponds to a test on an attribute
  Matching time – O(k)
    No. of attributes is an order of magnitude smaller than no. of rules
  Size – can be exponential in n
  Minimization of a decision tree is an NP-complete problem!
Goal
  Develop efficient techniques for rule matching that scale to support thousands of rules
6
Outline
Problem Formulation
Techniques
  Minimize duplication
  Benign non-determinism
  Polynomial bound
  Utility
Results
7
Packet Classification
A mechanism that
  inspects network packets
  determines how to process a packet based on the values of header fields and the payload
Applications
  Firewalls – identify the highest priority matching rule
  Intrusion Detection Systems
    Use unordered rules
    Identify all matching rules
  Network Monitoring – determine whether a packet satisfies any of the conditions
8
Objective
Promote sharing of tests
  Not restricted to equality tests: we need to support inequalities, disequalities, and bit-masking operations
Flexibility to support diverse applications
  Ordered (firewalls) and unordered (intrusion detection) rule sets
  Packet-filtering (network monitoring)
9
Problem Formulation
Tests involve a variable x and one or two constants (denoted by c).
  Equality tests: x == c
    tcp_sport == 80
  Equality tests with bitmasks: x & c1 == c
    tcp_flags & 0x03 == 0x03
  Disequality tests: x != c
    tcp_sport != 80
  Disequality tests with bitmasks: x & c1 != c
    tcp_flags & 0x03 != 0x03
  Inequality tests: x <= c
    tcp_dport <= 1024
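The five test forms listed above can be captured in one small data type. The sketch below is only an illustration: the class name `FieldTest`, its field names, and the sample packet values are assumptions, not part of the original system.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FieldTest:
    """One primitive test: (var [& mask]) op const. Hypothetical encoding."""
    var: str
    op: str                      # "==", "!=", or "<="
    const: int
    mask: Optional[int] = None   # the bitmask c1; meaningful for == and !=

    def holds(self, packet: Dict[str, int]) -> bool:
        value = packet[self.var]
        if self.mask is not None:
            value &= self.mask   # apply the bitmask before comparing
        if self.op == "==":
            return value == self.const
        if self.op == "!=":
            return value != self.const
        if self.op == "<=":
            return value <= self.const
        raise ValueError(f"unsupported operator: {self.op}")


packet = {"tcp_sport": 80, "tcp_dport": 443, "tcp_flags": 0x13}
print(FieldTest("tcp_sport", "==", 80).holds(packet))
print(FieldTest("tcp_flags", "==", 0x03, mask=0x03).holds(packet))
print(FieldTest("tcp_dport", "<=", 1024).holds(packet))
```

Keeping tests as plain values (rather than opaque functions) is what makes it possible to compare tests across rules and share them.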
10
Rules and priorities
A rule R is a conjunction of tests
  (dport == 22) && (sport <= 1024) && (flags & 0xb == 0x3)
A set of rules may be partially ordered by a priority relation
  The priority of R is denoted as Pri(R).
A rule R matches a packet p if:
  the packet satisfies R, i.e., R(p) is true
  the packet does not satisfy any rule that has higher priority than R
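The matching definition above can be written down directly. The sketch assumes integer priorities where a larger number means higher priority (the slides only require a partial order), and the names `Rule` and `matches`, as well as rule R2, are made up for the example.

```python
from typing import Callable, Dict, List, NamedTuple

Packet = Dict[str, int]


class Rule(NamedTuple):
    name: str
    pri: int                         # assumption: larger value = higher priority
    cond: Callable[[Packet], bool]   # the conjunction of tests


def matches(rules: List[Rule], p: Packet) -> List[str]:
    """Rules that p satisfies and that are not shadowed by a higher priority rule."""
    satisfied = [r for r in rules if r.cond(p)]
    return [r.name for r in satisfied
            if not any(other.pri > r.pri for other in satisfied)]


ordered = [
    Rule("R1", 2, lambda p: p["dport"] == 22 and p["sport"] <= 1024
                            and p["flags"] & 0xB == 0x3),
    Rule("R2", 1, lambda p: p["dport"] == 22),
]
# Giving every rule the same priority models an unordered rule set.
unordered = [Rule(r.name, 0, r.cond) for r in ordered]

p = {"dport": 22, "sport": 1000, "flags": 0x3}
print(matches(ordered, p))
print(matches(unordered, p))
```

With distinct priorities only the highest priority satisfied rule is reported (firewall style); with equal priorities all satisfied rules are reported (intrusion detection style).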
11
Decision Tree for Packet Classification
R1: (icmp_type == ECHO)
R2: (icmp_type == ECHO_REPLY) && (ttl == 1)
R3: (ttl == 1)
[Figure: decision tree for R1, R2, R3. The root, labeled {R1, R2, R3} and {}, branches on icmp_type (ECHO, ECHO_REPLY, or neither); each child then branches on ttl == 1 versus ttl != 1. Nodes are labeled with rule sets such as {R1, R3}, {R2, R3}, {R3}, {R1}, and {}.]
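A decision tree for these three rules can be sketched as nested tuples and walked with a short loop. The tuple layout `(field, branches, default)` and the function names are assumptions for illustration; the leaf sets are derived from the rule definitions, treating the rules as unordered. The ICMP type numbers 8 (echo request) and 0 (echo reply) are the standard values.

```python
ECHO, ECHO_REPLY = 8, 0   # standard ICMP type numbers


def ttl_node(if_one, otherwise):
    # Inner node testing ttl: branch "1" versus everything else (ttl != 1).
    return ("ttl", {1: if_one}, otherwise)


# Hypothetical layout: (field, {value: subtree}, default subtree).
# Leaves are the sets of matching rules.
TREE = (
    "icmp_type",
    {
        ECHO: ttl_node({"R1", "R3"}, {"R1"}),
        ECHO_REPLY: ttl_node({"R2", "R3"}, set()),
    },
    ttl_node({"R3"}, set()),   # icmp_type != ECHO && icmp_type != ECHO_REPLY
)


def classify(node, packet):
    """Follow one branch per tested field until a leaf (a set) is reached."""
    while isinstance(node, tuple):
        field, branches, default = node
        node = branches.get(packet[field], default)
    return node


print(classify(TREE, {"icmp_type": ECHO, "ttl": 1}))
print(classify(TREE, {"icmp_type": ECHO_REPLY, "ttl": 64}))
```

Each packet is examined once per field on its path, regardless of how many rules there are.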
12
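Exponential Blowup
R1: x == 1   R2: x == 2   R3: x == 3   R4: x == 4
R5: y == 1   R6: y == 2   R7: y == 3   R8: y == 4
[Figure: decision tree that branches on x (1, 2, 3, 4, else) and then on y (1, 2, 3, 4, else) below each x branch; leaves are labeled with match sets such as {R1, R5}, {R1, R6}, {R2, R5}, {R2, R6}.]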
13
Decision Tree Construction
Decompose and reorder tests to increase sharing of tests among rules
R1: x == 5
R2: x & 0x03 != 1
[Figure: decision tree for R1 and R2 over the tests x & 0x03 == 1 / x & 0x03 != 1 and x == 5 / x != 5, with leaves labeled by match sets such as {R1}, {R2}, {R1, R2}, and {}.]
14
Condition Factorization
  Decomposing rules into combinations of more primitive tests
  Similar to factorization of integers
  Based on the residue operation – analogous to integer division
Residue
  We want to determine if there is a match for a rule C1
  We have so far tested a condition C2
  A residue captures the additional tests that need to be performed at this point to verify C1
15
Residue Operation
The residue C1/C2 is another condition C3 such that:
  1. C2 ∧ C3 ⇒ C1
  2. C1 ∧ C2 ⇒ C3
Examples
  C1: x ∈ [1, 20], C2: x ∈ [15, 25]   C3: x <= 20
  C1: x ∈ [1, 20], C2: x == 15        C3: true
  C1: x ∈ [1, 20], C2: x == 35        C3: false
  C1: x ∈ [1, 20], C2: y == 15        C3: x ∈ [1, 20]
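The four examples above can be reproduced with a small residue function restricted to closed integer ranges. This is a sketch under that restriction only: the `Interval` type and the `residue` function are hypothetical names, an equality test x == c is encoded as the range [c, c], and the other test forms from the problem formulation are not handled.

```python
from typing import NamedTuple, Optional, Union


class Interval(NamedTuple):
    var: str
    lo: Optional[int]   # None means "no lower bound"
    hi: Optional[int]   # None means "no upper bound"


def residue(c1: Interval, c2: Interval) -> Union[bool, Interval]:
    """Residue C1/C2, assuming both inputs have both bounds set."""
    if c1.var != c2.var:
        return c1                      # C2 says nothing about C1's variable
    if c2.hi < c1.lo or c2.lo > c1.hi:
        return False                   # C2 contradicts C1
    # Keep only the bounds of C1 that C2 does not already guarantee.
    lo = c1.lo if c2.lo < c1.lo else None
    hi = c1.hi if c2.hi > c1.hi else None
    if lo is None and hi is None:
        return True                    # C2 alone implies C1
    return Interval(c1.var, lo, hi)


c1 = Interval("x", 1, 20)
print(residue(c1, Interval("x", 15, 25)))   # only the upper bound x <= 20 remains
print(residue(c1, Interval("x", 15, 15)))
print(residue(c1, Interval("x", 35, 35)))
print(residue(c1, Interval("y", 15, 15)))
```

In the first case the lower bound of C1 is dropped because x >= 15 already implies x >= 1, which is exactly the work saved by computing a residue instead of re-testing C1 in full.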
16
Computing Residue on Tests
17
Build Algorithm
Recursive procedure
  Takes a node s as its first parameter
  Builds the sub-tree that is rooted at s
It takes two other parameters
  Candidate Set (Cs) – rules that have not yet completed a match but cannot be ruled out either.
  Match Set (Ms) – all rules for which a match can be announced at s.
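The recursion over a candidate set and a match set can be sketched for a heavily simplified setting. The version below assumes equality tests only, at most one test per variable in each rule, and at least one test per rule; the "most frequently mentioned variable" heuristic, the tuple layout of nodes, and the names `build` and `classify` are all assumptions for illustration and not the selection strategy from the slides.

```python
from collections import Counter
from typing import Dict, List, Tuple

Tests = Dict[str, int]   # remaining equality tests of a rule: variable -> constant


def build(candidates: List[Tuple[str, Tests]], matched: frozenset = frozenset()):
    """Return a leaf (frozenset of matched rules) or (var, branches, else_subtree)."""
    if not candidates:
        return matched
    # Hypothetical heuristic: test the variable that most candidates mention.
    var = Counter(v for _, tests in candidates for v in tests).most_common(1)[0][0]
    values = sorted({tests[var] for _, tests in candidates if var in tests})

    def descend(value):
        cs, ms = [], set(matched)
        for name, tests in candidates:
            if var not in tests:
                cs.append((name, tests))       # unaffected, stays a candidate
            elif value is not None and tests[var] == value:
                rest = {v: c for v, c in tests.items() if v != var}
                if rest:
                    cs.append((name, rest))    # residue still has tests left
                else:
                    ms.add(name)               # match can be announced here
            # otherwise the rule is ruled out on this branch
        return build(cs, frozenset(ms))

    return (var, {v: descend(v) for v in values}, descend(None))


def classify(tree, packet: Dict[str, int]) -> frozenset:
    while isinstance(tree, tuple):
        var, branches, default = tree
        tree = branches.get(packet.get(var), default)
    return tree


rules = [("R1", {"x": 1, "y": 1}), ("R2", {"x": 2, "y": 2}), ("R3", {"y": 3})]
tree = build(rules)
print(tree[0])                                  # variable tested at the root
print(sorted(classify(tree, {"x": 1, "y": 1})))
```

Each recursive call removes at least one pending test from the candidate set or drops a candidate, so the recursion terminates.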
18
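Minimize Duplication
R1: x == 1 && y == 1
R2: x == 2 && y == 2
R3: y == 3
[Figure: decision tree that tests x first (branches 1, 2, else) and then y under every branch. Below x == 1, y branches on 1, 3, else with leaves {R1}, {R3}, {}; below x == 2, y branches on 2, 3, else with leaves {R2}, {R3}, {}; below else, y branches on 3, else with leaves {R3}, {}. The test for R3 is repeated in all three sub-trees.]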
19
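Minimize Duplication
R1: x == 1 && y == 1
R2: x == 2 && y == 2
R3: y == 3
[Figure: decision tree that tests y first (branches 1, 2, 3, else). Below y == 1, x branches on 1, else with leaves {R1}, {}; below y == 2, x branches on 2, else with leaves {R2}, {}; y == 3 leads directly to {R3}; else leads to {}.]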
20
Benign Non-determinism
Two rules R1 and R2 are said to be independent of each other if they do not have a common test
Build separate trees for each independent set
Match packets against each tree – non-determinism without incurring any performance penalties
If R1 and R2 are independent, a packet may match R1, R2, both, or neither.
  Number of nodes of the tree for R1 is k1, for R2 is k2.
  Number of nodes (states) of the combined tree for R1 ∪ R2 is k1 * k2.
  Combined number of nodes of independent trees for R1 and R2 is k1 + k2.
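One way to find independent sets is to group rules that share something and keep the groups disjoint. The sketch below makes an assumption that should be stated clearly: it treats two rules as dependent when they test the same variable, which is one possible reading of "common test" and fits the x/y example from the exponential blowup slide. The function name `independent_groups` and the leaf-count comparison are illustrative only.

```python
from math import prod
from typing import Dict, List, Set, Tuple


def independent_groups(rules: Dict[str, Set[str]]) -> List[Set[str]]:
    """Partition rule names so that different groups test disjoint variables."""
    groups: List[Tuple[Set[str], Set[str]]] = []   # (variables, rule names)
    for name, variables in rules.items():
        vars_acc, names_acc = set(variables), {name}
        rest = []
        for g_vars, g_names in groups:
            if g_vars & vars_acc:        # shares a variable: merge into one group
                vars_acc |= g_vars
                names_acc |= g_names
            else:
                rest.append((g_vars, g_names))
        groups = rest + [(vars_acc, names_acc)]
    return [names for _, names in groups]


# R1..R4 test only x, R5..R8 test only y.
rules = {f"R{i}": {"x"} for i in range(1, 5)}
rules.update({f"R{i}": {"y"} for i in range(5, 9)})
groups = independent_groups(rules)
print([sorted(g) for g in groups])

# Each single-variable tree has 5 outcomes (values 1..4 plus "else").
outcomes_per_tree = [5, 5]
print(prod(outcomes_per_tree), sum(outcomes_per_tree))   # combined vs. separate leaves
```

A packet is then run through every tree and the per-tree match sets are combined, which is where the product of sizes turns into a sum.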
21
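Exponential Blowup
R1: x == 1   R2: x == 2   R3: x == 3   R4: x == 4
R5: y == 1   R6: y == 2   R7: y == 3   R8: y == 4
[Figure: a combined decision tree that branches on x and then on y, with leaves such as {R1, R5}, {R1, R6}, {R2, R5}, {R2, R6}, shown alongside two separate trees: one branching only on x with leaves such as {R1}, {R2}, and one branching only on y with leaves such as {R5}, {R6}.]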
22
Ensuring Polynomial Bounds
Breadth of a tree is a function of the breadth of its sub-trees
Select a polynomial bound to satisfy at each node
  Pick tests that satisfy the bound
  Otherwise, pick a test that comes closest to satisfying this constraint and make some outgoing edges nondeterministic
23
Improving Matching Time
Utility – how much a test goes towards checking a rule
  based on the notion of assigning costs to tests and rules
  compare the cost of a rule with the combined cost of a test and the residue of the rule w.r.t. the test
select strategy
Size reduction is more important than matching time
  1. Pick a discriminating test when available
     Pick the test with higher utility
  2. Examine opportunities for benign non-determinism
  3. Pick tests that satisfy the polynomial bound
24
Tree Size
[Figure: number of nodes (axis 0 to 70000) versus number of rules (axis 0 to 300), comparing Condition Factorization and Snort NG.]
25
Matching Time
[Figure: matching time per packet in ns (axis 0 to 90) versus number of rules (axis 0 to 300), comparing Condition Factorization, Snort NG, and Snort 2.]
26
Summary
Developed a new technique for fast packet classification
  Flexible – supports diverse applications in a uniform framework
  Promotes sharing of tests
Developed novel techniques for generating packet classification trees that
  have polynomial size
  have virtually constant matching time
Demonstrated the gains from our technique for intrusion detection systems and firewalls