hybridcuts: a scheme combining decomposition and cutting for packet classification author: wenjun...

24
HybridCuts: A Scheme Combining Decomposition and Cutting for Packet Classification Author: Wenjun Li, Xianfeng Li Publisher: 2013 IEEE 21 st Annual Symposium Presenter: Ching Hsuan Shih Date: 2013/12/11

Upload: erica-barton

Post on 16-Dec-2015

223 views

Category:

Documents


3 download

TRANSCRIPT

HybridCuts: A Scheme Combining Decomposition and Cutting for Packet Classification

Author: Wenjun Li, Xianfeng Li

Publisher: 2013 IEEE 21st Annual

Symposium

Presenter: Ching Hsuan Shih

Date: 2013/12/11

Outline

I. Introduction

II.Background

III.Related Work

IV.HybridCuts

V.Experimental Results

I. IntroductionTwo major approaches to packet classification:

• Architectural: TCAM

• Advantage: process packets at line speed

• Disadvantage: expensive, high power consumption, rule duplication

• Algorithmic:

• Decision-tree: Hicuts, HyperCuts, and EffiCuts

• Decomposition: BV

I. Introduction (Cont.)

Proposed scheme:

• HybridCuts: a combination of decomposition and decision-tree techniques

•A rule set decomposition algorithm based on the observation that most rules have at least one small field.

•A novel one-dimensional cutting algorithm called FiCuts.

•A two-stage cutting framework which combines one-dimensional cutting and multi-dimensional cutting techniques.

II. BackgroundA. The Packet Classification Problem

• To find a matching rule from a packet classifier for a packet.

• A classifier is a set of rules, with each rule R consisting of a tuple of F field values, and an action to be taken in case of a match.

II. Background (Cont.)

B. Complexity in Theory

• From a geometric point of view, for N non-overlapping hyper-rectangles in F-dimensional space:

• The best bounds for locating a point are either Θ(log N) time with Θ(NF) space, or Θ(logF-1 N) time with Θ(N) space.

• Impractical: with just 1000 rules and 4 fields, a solution is either impracticably (NF is about 1000G) or too slow (logF-1 N is about 1000 memory accesses).

II. Background (Cont.)C. Complexity in Practice• It is infeasible to design a single algorithm that can perform

well in all cases.

• Packet classification rules in real-life have some inherent characteristics that can be exploited to reduce the complexity.

• The protocol field is restricted to a small set of values.

• Rules specify a limited number of distinct transport port ranges.

• The number of address prefixes matching a given address is typically five or less.

• The number of rules matching a given packet is typically five or less.

• Many different rules share the same field values.

III. Related Work

A. Parallel Bit-Vector (BV)

• One of most representative decomposition based solutions.

• It works on the individual fields of rules independently for partially matching results.

• Then a bit-wise AND operation on all bit vectors is performed to get the final result.

III. Related Work (Cont.)

III. Related Work (Cont.)

B. HiCuts and HyperCuts

• Take a geometric view of the packet classification problem.

• HiCuts chooses one dimension to cut at one time.

• HyperCuts chooses multiple dimensions to cut at one time.

III. Related Work (Cont.)

C. EffiCuts

• Separate rule set into several subsets, depending on whether the value of each dimension is wildcard (or almost wildcard).

• Each subset creates its own decision-tree independently using HyperCuts.

III. Related Work (Cont.)

D. ParaSplit

• Employs a complex heuristic for rule set partitioning.

• It is different from EffiCuts in that its objective is for efficient hardware implementation using FPGAs.

IV. HybridCutsA. Decomposition-based FrameworkDefinitions:

• Given an N-dimensional rule R = (F1, F2, F3, …, FN), Leni represents the length of field Fi , and a threshold value vector T = (T1, T2, T3, …, TN).

• We call Fi is a small field if Leni≦Ti. Then we define the following concepts for R:

• Big rule: ∀i ∈{1, 2, 3 ... N}, Fi in R is a big field

• Small rule: ∃i ∈{1, 2, 3 ... N}, Fi in R is a small field

IV. HybridCuts (Cont.)

• We decompose a 5-tuple rule set into the following five subsets without duplicates among each other:

• 1. Big-subset: SA, DA, SP and DP are all big field

• 2. SA-subset: SA is a small field for each rule

• 3. DA-subset: DA is a small field for each rule

• 4. SP-subset: SP is a small field for each rule

• 2. DP-subset: DP is a small field for each rule

• When processing a rule R with two small fields: SA and DA, suppose the sizes of the SA-subset and the DA-subset up to now are N1 and N2 respectively, then rule R will go to SA-subset if N1≦N2, or to DA-subset vice versa.

IV. HybridCuts (Cont.)

IV. HybridCuts (Cont.)

B. FiCuts: A One-Dimensional Cutting Technique• Simplicity: FiCuts conducts cuttings on the subset along a fixed

dimension

• Adaptivity: FiCuts can decide when to stop FiCuts and resort to other more effective cutting methods

How to decide np:

spfac * number of rules at node r ≥ ∑ number of rules at each child of node r + np 

IV. HybridCuts (Cont.)

binth = 4

spfac = 2

IV. HybridCuts (Cont.)

C. Multi-dimensional Cutting• With the shrinking of the search space, rule replications begin to

rise with fine cuts.

• FiCuts makes the decision. If the number of cuts np at a node is less than a predefined MAXCUTS, FiCuts resorts to HyperCuts

IV. HybridCuts (Cont.)binth = 2spfac = 2

IV. HybridCuts (Cont.)

D. Optimization• Most rules have at least one small IP address field (SA or DA).

• Based on this observation, we reduce the number of decision trees. For a 5-tuple rule set, we decompose it into three subsets:

• 1. Big-subset: Both SA and DA are big field

• 2. SA-subset: SA is a small field for each rule

• 3. DA-subset: DA is a small field for each rule

IV. HybridCuts (Cont.)

Depends on SA, DA, SP, and DP

Depends on SA, and DA

V. Experimental Results

A. Memory Consumption

V. Experimental Results (Cont.)

A. Memory Accesses

V. Experimental Results (Cont.)C. Potential of Parallelization• It can be seen that in most cases, the worst-case height of a single

tree is half of or less than the overall number of memory accesses on all the trees. This means that the trees constructed are balanced in height among each other, amenable to a parallel implementation with a potential 2x speedup.