Pattern Rewriting Framework for Event Processing Optimization

Ella Rabinovich, Opher Etzion, Avigdor Gal
IBM Haifa Research Lab – Event Processing
© 2011 IBM Corporation

Upload: opher-etzion

Post on 09-May-2015


DESCRIPTION

DEBS 2011 presentation: Pattern Rewriting for Event Processing Optimization, by Ella Rabinovich, Opher Etzion and Avigdor Gal

TRANSCRIPT

Page 1

IBM Haifa Research Lab – Event Processing

© 2011 IBM Corporation

Pattern Rewriting Framework for Event Processing Optimization

Ella Rabinovich, Opher Etzion, Avigdor Gal

Page 2

Motivation

Previous studies indicate that there is a major performance degradation as application complexity increases.

Adi A., Etzion O. Amit - the situation manager. The VLDB Journal – The International Journal on Very Large Databases, Volume 13, Issue 2, 2004.

Mendes M., Bizarro P., Marques P. Benchmarking event processing systems: current state and future directions. WOSP/SIPEW 2010: 259-260.

[Figure: event processing system benchmark. Throughput by category: standby world, noisy world, filtered world, complex world.]

[Figure: event processing system benchmark. Latency (ms) by category: standby world, noisy world, filtered world, complex world. Series: performance time (ms).]

[Figure: performance study of event processing systems. Throughput (× 10^3) by category: selection and projection, aggregation over windows, joins, pattern detection. Series: system 1, system 2, system 3.]

Optimize complex scenarios

Page 3

Optimization tools

Blackbox optimizations: distribution, parallelism, scheduling, load balancing, load shedding

Whitebox optimizations: implementation selection, implementation optimization, pattern rewriting (our focus)

Page 4

An example of a complex scenario

[Diagram: event sequence E1, E2, E3, …, E15, E16]

A process has 16 steps that have to be executed in a predefined order; termination of each step creates an event with a status code (SC).

The process is reported as committed when the 16 steps have completed in the correct order (sequence pattern) and the pattern assertion is satisfied.

The assertion may look like:

E1.SC == E2.SC or E3.SC < 4

For this scenario we achieved a more than tenfold decrease in latency, or a more than 20% increase in throughput.

Page 5

Pattern Rewriting Approach

The goal: create an equivalent pattern that provides better performance

Subsumption of a common logic:
seq(E1,E2,E3,E4) and seq(E1,E2,E3,E5,E6) are rewritten as seq(E1,E2,E3), which derives DE, followed by seq(DE,E4) and seq(DE,E5,E6)

Splitting for parallel execution:
all(E1,E2,E3,E4) is rewritten as all(E1,E2), which derives DE1, and all(E3,E4), which derives DE2, followed by all(DE1,DE2)

Rewriting techniques exist in other domains, such as rule systems and SQL queries.
Due to the inherent complexity of event processing patterns, there are some unique challenges.
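The subsumption rewriting on this slide can be sketched in Python. This is only an illustration under simplifying assumptions, not the authors' implementation: patterns are plain lists of event type names, pattern assertions and policies are ignored, and the function name `factor_common_prefix` is hypothetical.

```python
from typing import List, Tuple


def factor_common_prefix(
    patterns: List[List[str]], derived: str = "DE"
) -> Tuple[List[str], List[List[str]]]:
    """Factor the longest shared prefix of several seq patterns into one sub-pattern.

    Returns (shared, rewritten). `shared` is the common prefix, assumed to
    derive the event named `derived`; each rewritten pattern starts with it.
    """
    shared: List[str] = []
    for column in zip(*patterns):
        if len(set(column)) != 1:
            break
        shared.append(column[0])
    if len(shared) < 2:
        # Nothing worth sharing: leave the patterns unchanged.
        return [], [list(p) for p in patterns]
    rewritten = [[derived] + list(p[len(shared):]) for p in patterns]
    return shared, rewritten


shared, rewritten = factor_common_prefix(
    [["E1", "E2", "E3", "E4"], ["E1", "E2", "E3", "E5", "E6"]]
)
print(shared)     # ['E1', 'E2', 'E3']
print(rewritten)  # [['DE', 'E4'], ['DE', 'E5', 'E6']]
```

Whether such a rewriting preserves the original matches depends on the assertions and policies, which this sketch leaves out.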

Page 6

Challenges: Assertion Split

A pattern assertion (PA) is a predicate that the event collection needs to satisfy for the pattern to be matched.

Example 1: seq(E1,E2,E3) with pattern assertion PA: E1.SC == E2.SC OR E3.SC < 4
Candidate rewriting: seq(E1,E2) with PA’: E1.SC == E2.SC, deriving DE, followed by seq(DE,E3) with PA’’: E3.SC < 4
The direct connection of the two patterns implies an “AND” operator between PA’ and PA’’

Example 2: seq(E1,E2,E3) with pattern assertion PA: E1.SC == E3.SC AND E2.SC = 0
Candidate rewriting: seq(E1,E3) with PA’: E1.SC == E3.SC, deriving DE, followed by seq(DE,E2) with PA’’: E2.SC = 0
The assertion should be separable in terms of its variables
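The first challenge can be checked by brute force. The Python sketch below compares each original assertion with the conjunction implied by chaining two patterns. It assumes a small domain of status codes (0 to 5) so the check can be exhaustive, and it tests only the predicate logic, not event ordering. All function names are hypothetical.

```python
from itertools import product

STATUS_CODES = range(6)  # assumed small domain, chosen only for exhaustive checking


def original_or(e1, e2, e3):
    return e1 == e2 or e3 < 4


def split_or(e1, e2, e3):
    # seq(E1,E2) with PA' feeds seq(DE,E3) with PA'': both must hold.
    return (e1 == e2) and (e3 < 4)


def original_and(e1, e2, e3):
    return e1 == e3 and e2 == 0


def split_and(e1, e2, e3):
    pa1 = e1 == e3  # PA'
    pa2 = e2 == 0   # PA''
    return pa1 and pa2


def mismatches(f, g):
    """Status-code triples (E1.SC, E2.SC, E3.SC) on which f and g disagree."""
    return [sc for sc in product(STATUS_CODES, repeat=3) if f(*sc) != g(*sc)]


print(mismatches(original_or, split_or)[0])   # (0, 0, 4)
print(mismatches(original_and, split_and))    # []
```

For the triple (0, 0, 4) the original OR assertion holds because E1.SC == E2.SC, but the chained version rejects it because E3.SC < 4 fails.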

Page 7

Assertion Split – Solution

Convert the pattern assertion expression into conjunctive normal form (CNF).

Identify independent participants’ sub-groups by generating the assertion variables dependency graph.

The maximal number of independent partitions implies the finest granulation of the assertion expression.

Example assertion:
(E1.SC > E2.SC) AND (E4.SC > E5.SC) AND NOT ((E5.SC==E6.SC) AND (E3.SC==77))

CNF:
(E1.SC > E2.SC) AND (E4.SC > E5.SC) AND (NOT(E5.SC==E6.SC) OR NOT(E3.SC==77))

[Figure: assertion variables dependency graph over E1, E2, E3, E4, E5, E6]
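One way to find the independent sub-groups is to treat each CNF clause as linking all the events it mentions and then take connected components. The Python sketch below does this with a small union-find, assuming the clauses are given as strings in the slide's notation. The names `variables`, `independent_groups` and `CLAUSES` are hypothetical, and the CNF conversion itself is not shown.

```python
import re
from typing import Dict, List, Set

# The three CNF clauses from the slide's example.
CLAUSES = [
    "(E1.SC > E2.SC)",
    "(E4.SC > E5.SC)",
    "(NOT(E5.SC==E6.SC) OR NOT(E3.SC==77))",
]


def variables(clause: str) -> Set[str]:
    """Event names referenced in a clause, e.g. 'E5.SC==E6.SC' -> {'E5', 'E6'}."""
    return set(re.findall(r"\b(E\d+)\.", clause))


def independent_groups(cnf_clauses: List[str]) -> List[Set[str]]:
    """Connected components of the graph linking events that share a clause."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for clause in cnf_clauses:
        names = sorted(variables(clause))
        for name in names:
            find(name)  # register every event, even if it appears alone
        for other in names[1:]:
            parent[find(other)] = find(names[0])

    groups: Dict[str, Set[str]] = {}
    for name in list(parent):
        groups.setdefault(find(name), set()).add(name)
    return sorted(groups.values(), key=sorted)


for group in independent_groups(CLAUSES):
    print(sorted(group))
# ['E1', 'E2']
# ['E3', 'E4', 'E5', 'E6']
```

In this example E3 ends up in the same group as E4, E5 and E6 because the third clause ties E3 to E5 and E6, while the second clause ties E5 to E4.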

Page 8

Pattern Matching - Policies

Pattern: seq(PG, ATM-W) within 10 minutes

Instance selection policy
[Diagram: PG1, PG2, ATM-W1]

Cardinality policy
[Diagram: PG1, PG2, ATM-W1, ATM-W2; first detection, additional detection?]

Consumption policy
[Diagram: PG1, PG2, ATM-W1, ATM-W2; first detection – are instances consumed?]

Page 9

Challenges: Policies Mapping

Naïve pattern split, keeping the original policies in the rewritten version, will result in incorrect matching:

Original: seq(E1,E2,E3) {single, last, …}
Rewritten: seq(E1,E2) {single, last, …}, deriving DE, followed by seq(DE,E3) {single, last, …}

[Diagram: timeline of e1.1, e2.1 (blood pressure measure), e2.2 (blood pressure measure) and e3.1, shown for the original and the rewritten patterns with their detection points]

Page 10

Policies Mapping – Solution

Mapping of policies in the rewritten alternative (f2’ + f2’’), based on the original pattern (f1):

f1: seq(E1,E2,E3); f2’: seq(E1,E2); f2’’: seq(DE,E3)

policy               original (f1)   rewritten (f2’)   rewritten (f2’’)
Cardinality          single          unrestricted      single
Instance selection   last            last              last
Consumption          -               reuse             -

policy               original (f1)   rewritten (f2’)   rewritten (f2’’)
Cardinality          unrestricted    unrestricted      unrestricted
Instance selection   last            each              last
Consumption          consume         reuse             consume

+ pattern assertion extensions
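The effect of the first mapping (original {single, last}) can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not the semantics of any real engine: events are (type, instance id) pairs, "last" means the most recent instance seen so far, and the functions `original`, `stage1` and `stage2` are hypothetical. The stream mirrors the slide's timeline of e1.1, e2.1, e2.2, e3.1.

```python
STREAM = [("E1", "e1.1"), ("E2", "e2.1"), ("E2", "e2.2"), ("E3", "e3.1")]


def original(stream):
    """seq(E1,E2,E3) {single, last}: on the first E3, match the last E1 and the last E2 after it."""
    e1 = e2 = None
    for etype, eid in stream:
        if etype == "E1":
            e1, e2 = eid, None
        elif etype == "E2" and e1 is not None:
            e2 = eid
        elif etype == "E3" and e2 is not None:
            return (e1, e2, eid)
    return None


def stage1(stream, single):
    """seq(E1,E2) deriving DE; E3 events pass through unchanged.

    single=True models cardinality 'single' (stop after the first detection);
    single=False models 'unrestricted' with E1 reused across detections.
    """
    out, e1, done = [], None, False
    for etype, eid in stream:
        if etype == "E1":
            e1 = eid
        elif etype == "E2" and e1 is not None and not done:
            out.append(("DE", (e1, eid)))
            done = single
        elif etype == "E3":
            out.append((etype, eid))
    return out


def stage2(stream):
    """seq(DE,E3) {single, last}: on the first E3, match the last DE."""
    de = None
    for etype, payload in stream:
        if etype == "DE":
            de = payload
        elif etype == "E3" and de is not None:
            return (*de, payload)
    return None


print(original(STREAM))                      # ('e1.1', 'e2.2', 'e3.1')
print(stage2(stage1(STREAM, single=True)))   # ('e1.1', 'e2.1', 'e3.1')  naive split
print(stage2(stage1(STREAM, single=False)))  # ('e1.1', 'e2.2', 'e3.1')  mapped policies
```

With the naive split, the first stage detects at e2.1 and never sees e2.2, so the final match carries the wrong E2 instance. Making the first stage unrestricted lets the second stage pick the last DE, which restores the original result in this model.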

Page 11

Equivalence assurance

Denotational semantics approach:

An event processing pattern is a function (f) mapping the pattern’s input (participant set, PS) into its output (matching set, MS). We formally demonstrate that for the same PS both alternatives produce the identical MS:

f1(PS, …) == f2’((f2’’(PS’, …) ∪ PS’’), …) ∀ PS

Original: seq(E1, …, EN) with PA, Policies
Rewritten: seq(E1, …, EK) with PA’, Policies’, deriving DE, followed by seq(DE, EK+1, …, EN) with PA’’, Policies’’
Both alternatives map the participant set (PS) to the matching set (MS).

Page 12

Throughput vs. Latency Tradeoff

Pattern throughput is the average rate of events the pattern can process.

The detecting event latency is the delay between the last input event causing a pattern detection and the detection itself, which results in the derivation of an output event.

Example: seq(E1,E2,E3) produces derived event DE

Detecting event latency = DE.detection_time - E3.detection_time
DE.detection_time: time DE was detected by the system
E3.detection_time: time E3 arrived at the system

Lazy evaluation: seq(E1, …, EN)
Eager evaluation: seq(E1, …, EN-2), deriving DE, followed by seq(DE, EN-1, EN)

[Diagram: each alternative is annotated with its throughput and latency]

Page 13

Bi-objective Performance Optimization

Define a bi-objective performance function
Assign a scalar weight to each objective to be optimized:
Weight of α to the pattern throughput (th)
Complementary weight (1-α) to the detecting event latency (lt)

Minimize the goal function of the form:

g = α*lt + C*(1-α)*(1/th)

Simulation-based approach to select the optimal rewriting alternative (minimizing the goal function g)
For a set of rewriting alternatives A = {A1, …, AK}, find argmin_Ai(g)
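The selection step can be sketched in Python using the throughput and latency figures reported in the Experimental Results table of this presentation. The sketch follows the goal function as written on the slide, with the bare weight multiplying lt and its complement multiplying C/th. The slide does not define C, so it is treated here as a scaling constant with an arbitrary default, and `goal`, `best_alternative` and `ALTERNATIVES` are hypothetical names.

```python
# (rewriting split, throughput in event/s, detecting event latency in ms),
# copied from the Experimental Results table.
ALTERNATIVES = [
    ("0 : 8", 140, 260),
    ("1 : 7", 110, 189),
    ("2 : 6", 142, 174),
    ("3 : 5", 155, 147),
    ("4 : 4", 165, 95),
    ("5 : 3", 172, 63),
    ("6 : 2", 163, 32),
    ("7 : 1", 95, 15),
]


def goal(th: float, lt: float, weight: float, c: float) -> float:
    """g = weight*lt + C*(1-weight)*(1/th), as written on the slide."""
    return weight * lt + c * (1 - weight) * (1 / th)


def best_alternative(alternatives, weight: float, c: float = 1.0) -> str:
    """Return the rewriting split that minimizes the goal function."""
    return min(alternatives, key=lambda a: goal(a[1], a[2], weight, c))[0]


print(best_alternative(ALTERNATIVES, weight=1.0))  # 7 : 1  (latency only)
print(best_alternative(ALTERNATIVES, weight=0.0))  # 5 : 3  (throughput only)
```

At the two extremes the function reduces to picking the minimum latency or the maximum throughput; intermediate weights depend on the choice of C, which has to bring the two terms to comparable scales.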

Page 14

Experimental Results

Simulation results for seq(E1, …, E16), split of pairs

#   rewriting   rewritten pattern throughput (event/s)   detected event latency (ms)
1   0 : 8       140                                      260
2   1 : 7       110                                      189
3   2 : 6       142                                      174
4   3 : 5       155                                      147
5   4 : 4       165                                      95
6   5 : 3       172                                      63
7   6 : 2       163                                      32
8   7 : 1       95                                       15

Table annotations: lazy, eager, the Pareto frontier, min latency, max throughput, the base pattern, not in the Pareto frontier
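The Pareto frontier of these results can be recomputed with a short Python sketch. It assumes the usual definition of dominance for this setting (higher throughput is better, lower latency is better); the names `dominates`, `pareto_frontier` and `RESULTS` are hypothetical.

```python
# (rewriting split, throughput in event/s, detected event latency in ms)
RESULTS = [
    ("0 : 8", 140, 260),
    ("1 : 7", 110, 189),
    ("2 : 6", 142, 174),
    ("3 : 5", 155, 147),
    ("4 : 4", 165, 95),
    ("5 : 3", 172, 63),
    ("6 : 2", 163, 32),
    ("7 : 1", 95, 15),
]


def dominates(a, b) -> bool:
    """a dominates b: no worse on both objectives and strictly better on at least one."""
    _, th_a, lt_a = a
    _, th_b, lt_b = b
    return th_a >= th_b and lt_a <= lt_b and (th_a > th_b or lt_a < lt_b)


def pareto_frontier(points):
    """Points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


print([name for name, _, _ in pareto_frontier(RESULTS)])
# ['5 : 3', '6 : 2', '7 : 1']
```

Under this definition the frontier runs from the maximum-throughput split (5 : 3) to the minimum-latency split (7 : 1), and the base pattern (0 : 8) is dominated.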

Page 15

Future Work

Pattern rewriting framework for event processing optimization
With more than tenfold performance improvement between the original pattern and its rewritten alternative

Future research and practical activities
Investigation of additional rewritings
Using patterns of the same type (e.g., for the all pattern)
Additional methods for rewriting (e.g., seq using all and filter agents)
Elaborating an algorithm for event processing network rewriting
Exploring a heuristic-based approach for selecting the rewriting alternative of the sequence pattern
