wso2 product release webinar - introducing the wso2 complex event processor

WSO2 Product Release Webinar WSO2 Complex Event Processor 2.0.1 Simplifying High Performant Data Processing S. Suhothayan (Suho) Software Engineer, Data Technologies Team.

Upload: wso2

Post on 13-Jul-2015


WSO2 Product Release Webinar

WSO2 Complex Event Processor 2.0.1

Simplifying High Performant Data Processing

S. Suhothayan (Suho) Software Engineer,

Data Technologies Team.

Outline

• What is Complex Event Processing?
• WSO2 CEP Server & SOA integration
• The Siddhi Runtime CEP Engine
• High availability, persistence and scalability of WSO2 CEP
• How CEP can be combined with Business Activity Monitoring (BAM)
• Demo

Complex Event Processing?

Complex event processing is about listening to events and detecting patterns in near real time without storing all events.


CEP Is & Is NOT!

• Is NOT!
  o Simple filters (Simple Event Processing)
    - E.g. Is this a gold or platinum customer?
  o Joining multiple event streams (Event Stream Processing)
• Is!
  o Processing multiple event streams
  o Identifying meaningful patterns among streams
  o Using temporal windows
    - E.g. Notify if there is a 10% increase in overall trading activity AND the average price of commodities has fallen 2% in the last 4 hours

WSO2 CEP Server

• Enterprise-grade server for CEP runtimes
• Supports several transports (network access)
• Supports several data formats
• Supports multiple CEP runtimes
• Governance
• Monitoring
• Tools (WSO2 Dev Studio)

WSO2 CEP Architecture

CEP Brokers

• A broker is an adaptor for receiving and publishing events
• Holds the configuration needed to connect to external endpoints
• Has a many-to-many relationship with CEP engines

CEP Brokers

• Support for several transports (network access) and data formats
  o SOAP/WS-Eventing
    - XML messages
  o REST
    - JSON messages
  o JMS
    - Map messages
    - XML messages
    - Text messages
  o SMTP (Email)
    - Text messages
  o Thrift
    - WSO2 data format (WSO2 Event)
    - High-performance event capturing and delivery framework; supports Java/C/C++/C# via Thrift language bindings
• Brokers are pluggable!

CEP Buckets

• A bucket is an isolated logical execution unit
• Each CEP bucket has a set of
  o Queries
  o Input & output event mappings
• Has a one-to-one relationship with a CEP backend runtime engine

Open Source CEP Runtimes for Buckets

• Siddhi
  o Apache License, a Java library, tuple-based event model
  o Supports distributed processing
  o Supports multiple query models
    - Based on a SQL-like language
    - Filters, Windows, Joins, Ordering and others
• Esper, http://esper.codehaus.org (Deprecated)
  o GPLv2 License, a Java library; events can be XML, Map, Object
  o Supports multiple query models
    - Based on a SQL-like language
    - Filters, Windows, Joins, Ordering and others
• Drools Fusion (Deprecated)
  o Apache License, a Java library
  o Support for temporal reasoning + windows

Management UI

• To define, manage & monitor
  o buckets
  o brokers (data adaptors)

Developer Studio UI

• Eclipse-based tool to define buckets
• Can manage the configurations throughout the production lifecycle
• Note: 2.1.0 does not yet support Text Output Mapping

Monitoring

• Provides real-time statistical visualizations of request & response counts over time, per CEP server, bucket, broker and topic.

Understanding Siddhi CEP Runtime Engine

Siddhi Queries

• Filters and Projection
• Windows
  o Events are processed within temporal windows (e.g. for aggregation and joins). Time window vs. length window.
• Joins
  o Join two streams
• Event ordering
  o Identify event sequences and patterns

Filters

• Filters the events by conditions
• Conditions
  o >, <, ==, >=, <=, !=
  o contains, instanceof
  o and, or, not
• Example

from <stream-name> [<conditions>]* insert into <stream-name>

from cseEventStream[price >= 20 and symbol=='IBM'] insert into StockQuote symbol, volume
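The filter example above keeps only events whose price is at least 20 and whose symbol is IBM, then projects the symbol and volume attributes. The following is a minimal plain-Java sketch of those semantics. It does not use the Siddhi API; the class names FilterSketch and StockEvent, and the volume attribute type, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: mimics the semantics of
//   from cseEventStream[price >= 20 and symbol=='IBM'] insert into StockQuote symbol, volume
// in plain Java. It is not the Siddhi API.
class FilterSketch {

    /** Assumed shape of an event on cseEventStream. */
    static final class StockEvent {
        final String symbol;
        final double price;
        final long volume;

        StockEvent(String symbol, double price, long volume) {
            this.symbol = symbol;
            this.price = price;
            this.volume = volume;
        }
    }

    /** Applies the filter condition, then projects each match to {symbol, volume}. */
    static List<Object[]> filterAndProject(List<StockEvent> input) {
        List<Object[]> stockQuote = new ArrayList<>();
        for (StockEvent e : input) {
            if (e.price >= 20 && "IBM".equals(e.symbol)) {
                stockQuote.add(new Object[] {e.symbol, e.volume});
            }
        }
        return stockQuote;
    }

    public static void main(String[] args) {
        List<StockEvent> events = Arrays.asList(
                new StockEvent("IBM", 25.0, 100),
                new StockEvent("IBM", 15.0, 200),
                new StockEvent("MSFT", 30.0, 300));
        for (Object[] out : filterAndProject(events)) {
            System.out.println(out[0] + ", " + out[1]);
        }
    }
}
```

Only the first event passes both conditions; the second fails the price check and the third fails the symbol check.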

Window

• Types of windows
  o (Time | Length) (Sliding | Batch) windows
• Types of aggregate functions
  o sum, avg, max, min
• Example

from <stream-name> [<conditions>]#window.<window-name>(<parameters>) insert [<output-type>] into <stream-name>

from cseEventStream[price >= 20]#window.lengthBatch(50) insert into StockQuote symbol, avg(price) as avgPrice group by symbol having avgPrice>50
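To make the length-batch example above more concrete, here is a rough plain-Java sketch of one plausible reading of it: filtered events are collected until the batch is full, then the average price per symbol is computed and only groups with an average above 50 are emitted. This is not Siddhi code, and the exact output behaviour of the real engine may differ; the class LengthBatchSketch and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not the Siddhi API) of a length-batch window with
// "avg(price) group by symbol having avgPrice > 50".
class LengthBatchSketch {
    private final int batchSize;
    private final List<String> symbols = new ArrayList<>();
    private final List<Double> prices = new ArrayList<>();

    LengthBatchSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Feeds one event; returns symbol -> avgPrice when the batch fills, otherwise an empty map. */
    Map<String, Double> onEvent(String symbol, double price) {
        Map<String, Double> result = new LinkedHashMap<>();
        if (price < 20) {
            return result; // filter condition: price >= 20
        }
        symbols.add(symbol);
        prices.add(price);
        if (symbols.size() < batchSize) {
            return result; // batch not full yet, nothing is emitted
        }
        // Batch is full: accumulate {sum, count} per symbol.
        Map<String, double[]> sumAndCount = new LinkedHashMap<>();
        for (int i = 0; i < symbols.size(); i++) {
            double[] sc = sumAndCount.computeIfAbsent(symbols.get(i), k -> new double[2]);
            sc[0] += prices.get(i);
            sc[1] += 1;
        }
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            double avgPrice = e.getValue()[0] / e.getValue()[1];
            if (avgPrice > 50) { // having avgPrice > 50
                result.put(e.getKey(), avgPrice);
            }
        }
        symbols.clear(); // start the next batch
        prices.clear();
        return result;
    }
}
```

The query in the slide would correspond to `new LengthBatchSketch(50)`; a smaller batch size is convenient when experimenting.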

Join

• Joins two streams based on a condition and window
• Unidirectional: only events arriving on the unidirectional stream trigger the join
• Example

from <stream>#<window> [unidirectional] join <stream>#<window> on <condition> within <time> insert into <stream>

from TickEvent[symbol=='IBM']#window.length(2000) join NewsEvent#window.time(5 min) insert into JoinStream *
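A windowed join can be pictured as two buffers: each arriving event is stored in its own window and paired with whatever is currently held in the other stream's window. The sketch below illustrates that idea in plain Java for the example above (a length window on IBM ticks, a time window on news, no join condition). It is not the Siddhi implementation; JoinSketch, News, and the string output format are assumptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch (not the Siddhi API) of a two-stream windowed join:
// a length window on ticks joined with a time window on news.
class JoinSketch {

    /** A news headline together with its arrival time. */
    static final class News {
        final String headline;
        final long arrivalMillis;

        News(String headline, long arrivalMillis) {
            this.headline = headline;
            this.arrivalMillis = arrivalMillis;
        }
    }

    private final int tickWindowLength;
    private final long newsWindowMillis;
    private final Deque<Double> tickPrices = new ArrayDeque<>();
    private final Deque<News> news = new ArrayDeque<>();

    JoinSketch(int tickWindowLength, long newsWindowMillis) {
        this.tickWindowLength = tickWindowLength;
        this.newsWindowMillis = newsWindowMillis;
    }

    /** A tick arrives: keep it if it is IBM, then pair it with all news still in the time window. */
    List<String> onTick(String symbol, double price, long nowMillis) {
        List<String> joined = new ArrayList<>();
        if (!"IBM".equals(symbol)) {
            return joined; // filter from the example query
        }
        tickPrices.addLast(price);
        if (tickPrices.size() > tickWindowLength) {
            tickPrices.removeFirst(); // sliding length window drops the oldest tick
        }
        expireNews(nowMillis);
        for (News n : news) {
            joined.add(price + " | " + n.headline);
        }
        return joined;
    }

    /** A news item arrives: store it, then pair it with every tick in the length window. */
    List<String> onNews(String headline, long nowMillis) {
        expireNews(nowMillis);
        news.addLast(new News(headline, nowMillis));
        List<String> joined = new ArrayList<>();
        for (double price : tickPrices) {
            joined.add(price + " | " + headline);
        }
        return joined;
    }

    /** Drops news items that have been in the window for newsWindowMillis or longer. */
    private void expireNews(long nowMillis) {
        while (!news.isEmpty() && nowMillis - news.peekFirst().arrivalMillis >= newsWindowMillis) {
            news.removeFirst();
        }
    }
}
```

The example query would correspond to `new JoinSketch(2000, 5 * 60 * 1000L)`. A unidirectional join would simply return nothing from one of the two methods while still updating its window.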

Pattern

• Checks whether condition A happens before/after condition B
• Can do iterative checks via the "every" keyword
• With "within <time>", Siddhi emits only events that are within that time of each other
• Example

from [every] <condition> -> [every] <condition> … <condition> within <time> insert into StockQuote (<attribute-name>* | * )

from every (a1 = purchase[price < 10]) -> a2 = purchase[price > 10000 and a1.cardNo==a2.cardNo] within 1 day insert into potentialFraud a1.cardNo as cardNo, a2.price as price, a2.place as place

Event sequence illustrated on the slide: a1 x1 k5 a2 n7 y1
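The fraud pattern above can be read as a small state machine: every cheap purchase opens a pending match, and a later expensive purchase on the same card within one day completes it, regardless of unrelated events in between. The sketch below shows that reading in plain Java. It is an illustration under assumptions, not Siddhi's matching algorithm; PatternSketch, Purchase, and the comma-separated output are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not the Siddhi API) of
//   every (a1 = purchase[price < 10]) -> a2 = purchase[price > 10000 and a1.cardNo == a2.cardNo] within 1 day
class PatternSketch {
    static final long WITHIN_MILLIS = 24L * 60 * 60 * 1000; // "within 1 day"

    /** Assumed shape of a purchase event. */
    static final class Purchase {
        final String cardNo;
        final double price;
        final String place;
        final long timeMillis;

        Purchase(String cardNo, double price, String place, long timeMillis) {
            this.cardNo = cardNo;
            this.price = price;
            this.place = place;
            this.timeMillis = timeMillis;
        }
    }

    // Cheap purchases (a1 candidates) waiting for a matching expensive purchase, keyed by card number.
    private final Map<String, List<Purchase>> pendingByCard = new HashMap<>();

    /** Feeds one purchase; returns "cardNo,price,place" for each completed match. */
    List<String> onPurchase(Purchase p) {
        List<String> potentialFraud = new ArrayList<>();
        if (p.price < 10) {
            // "every": each cheap purchase starts a new pending match.
            pendingByCard.computeIfAbsent(p.cardNo, k -> new ArrayList<>()).add(p);
        } else if (p.price > 10000) {
            // a2 consumes all pending a1 events for this card; stale ones are dropped.
            List<Purchase> pending = pendingByCard.remove(p.cardNo);
            if (pending != null) {
                for (Purchase a1 : pending) {
                    if (p.timeMillis - a1.timeMillis <= WITHIN_MILLIS) {
                        potentialFraud.add(a1.cardNo + "," + p.price + "," + p.place);
                    }
                }
            }
        }
        return potentialFraud; // other purchases neither start nor complete a match
    }
}
```

Purchases that match neither condition are ignored, which mirrors the way unrelated events between a1 and a2 do not break a pattern.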

Sequence

• Regular expressions supported
  o * - Zero or more matches (reluctant)
  o + - One or more matches (reluctant)
  o ? - Zero or one match (reluctant)
  o or - either event
• Events returned by * and + must be referred to with square brackets to access a specific occurrence of that event

from <event-regular-expression> within <time> insert into <stream>

from a1 = requestOrder[action == "buy"], b1 = cseEventStream[price > a1.price and symbol==a1.symbol]+, b2 = cseEventStream[price < b1.price] insert into purchaseOrder a1.symbol as symbol, b1[0].price as firstPrice, b2.price as orderPrice

Event sequence illustrated on the slide: a1 b1 b1 b2 n7 y1

• We compared Siddhi with Esper, the widely used open source CEP engine
• For evaluation, we set up different queries on both systems, pushed events into each system, and measured the time until all of them were processed
• We used an Intel(R) Xeon(R) X3440 @ 2.53GHz, 4 cores, 8M cache, 8GB RAM, running Debian with the 2.6.32-5-amd64 kernel
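The measurement approach described above (push a fixed number of events, time how long it takes to process them all) can be sketched as a tiny harness. This is not the benchmark code used for the webinar; ThroughputSketch, eventsPerSecond, and the synthetic price generator are assumptions, and a serious benchmark would also need warm-up runs and repeated trials.

```java
import java.util.function.DoubleConsumer;

// Hypothetical micro-harness: pushes eventCount synthetic prices through a
// query callback and reports throughput in events per second.
class ThroughputSketch {

    static double eventsPerSecond(int eventCount, DoubleConsumer query) {
        long start = System.nanoTime();
        for (int i = 0; i < eventCount; i++) {
            query.accept(i % 100); // synthetic price in the range 0..99
        }
        // Guard against a zero reading from a coarse clock.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return eventCount * 1_000_000_000.0 / elapsedNanos;
    }

    public static void main(String[] args) {
        long[] matched = {0};
        // Stand-in for the simple filter query: count events with price > 6.
        double rate = eventsPerSecond(1_000_000, price -> {
            if (price > 6) {
                matched[0]++;
            }
        });
        System.out.printf("%.0f events/sec, %d matched%n", rate, matched[0]);
    }
}
```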

Performance Results

Simple filter without window

Performance Comparison With ESPER

from StockTick[price > 6] return symbol, price

State machine query for pattern matching

Performance Comparison With ESPER

From f=FraudWarningEvent -> p=PINChangeEvent(accountNumber=f.accountNumber) return accountNumber;

Performance of WSO2 CEP

• Here we published data from two client publisher nodes to the CEP server node and sent the notifications triggered by CEP to a client subscriber node.
• To test the worst-case scenario, 100% of the data published to CEP is received at the subscriber node after processing (no data is filtered out).
• We used an Intel® Core™ i7-2630QM CPU @ 2.00GHz, 8 cores, 8GB RAM, running Ubuntu 12.04 with the 3.2.0-32-generic kernel, for running CEP, and Intel® Core™ i3-2350M CPU @ 2.30GHz, 4 cores, 4GB RAM, running Ubuntu 12.04 with the 3.2.0-32-generic kernel, for the three client nodes.

Simple filter without window

Performance of WSO2 CEP

from StockTick[price > 6] return symbol, price

WSO2 CEP Throughput

# Clients   Avg (kilo events/sec)
1           67
2           135
3           181
4           210
5           212
6           232
7           245
8           250
9           234
10          186
50          187
100         112

HA / Persistence

• Ability to recover runtime state in the case of a failure.
• Enables queries to span lifetimes much greater than server uptime.
• Takes periodic snapshots and stores all state information to a scalable persistence store (Apache Cassandra).
• Supports pluggable persistent stores.
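The snapshot-and-restore idea above can be illustrated with a small plain-Java sketch: a query periodically hands a copy of its state to a store behind an interface, and a fresh instance reloads it after a restart. Everything here is hypothetical (PersistenceStore, InMemoryStore, SummingQuery); it is not the WSO2 CEP persistence API, and the in-memory map merely stands in for a real store such as Cassandra.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of snapshot-based state recovery with a pluggable store.
class SnapshotSketch {

    /** Assumed pluggable store contract; a real deployment would back this with a database. */
    interface PersistenceStore {
        void save(String queryId, List<Double> state);
        List<Double> load(String queryId);
    }

    /** In-memory stand-in for a persistence store. */
    static final class InMemoryStore implements PersistenceStore {
        private final Map<String, List<Double>> data = new HashMap<>();

        public void save(String queryId, List<Double> state) {
            data.put(queryId, new ArrayList<>(state)); // defensive copy = snapshot
        }

        public List<Double> load(String queryId) {
            List<Double> state = data.get(queryId);
            return state == null ? new ArrayList<>() : new ArrayList<>(state);
        }
    }

    /** A toy stateful query that sums all values seen so far. */
    static final class SummingQuery {
        private final String id;
        private final PersistenceStore store;
        private List<Double> window = new ArrayList<>();

        SummingQuery(String id, PersistenceStore store) {
            this.id = id;
            this.store = store;
        }

        void onEvent(double value) { window.add(value); }

        double sum() {
            double total = 0;
            for (double v : window) {
                total += v;
            }
            return total;
        }

        void snapshot() { store.save(id, window); }  // called periodically
        void restore() { window = store.load(id); }  // called after a restart
    }
}
```

Events that arrive after the last snapshot are lost on failure in this sketch, which is the usual trade-off of periodic snapshots.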

Scaling

• Vertical scaling
  o Can be distributed as a pipeline
• Horizontal scaling
  o Queries such as windows, patterns and joins have shared state, and hence are hard to distribute!
  o Use a distributed cache (Hazelcast) to achieve this: shared memory and batch processing

Event Recording

• Ability to record all/some of the events for future processing
• A few options
  o Publish them to a Cassandra cluster using the WSO2 data bridge API or BAM (data in Cassandra can be processed with Hadoop using WSO2 BAM).
  o Write them to a distributed cache
  o Custom Thrift-based event recorder

WSO2 BAM

Diagram components: Data Receiving, Data Analyzing, Data Presentation, Data Publishing

CEP Role within WSO2 Platform

DEMO

Scenario

• Monitoring a stock exchange for game-changing moments
• Two input event streams
  o Event stream of stock quotes from a stock exchange
  o Event stream of word counts for various company names from Twitter pages
• Check whether the last traded price of a stock has changed significantly (by 2%) within the last minute, and people are tweeting about that company (> 10) within the last minute

Example Scenario

Input events

• Input events are JMS Maps
  o Stock Exchange Stream

Map<String, Object> map1 = new HashMap<String, Object>();
map1.put("symbol", "MSFT");
map1.put("price", 26.36);
publisher.publish("AllStockQuotes", map1);

  o Twitter Stream

Map<String, Object> map1 = new HashMap<String, Object>();
map1.put("company", "MSFT");
map1.put("wordCount", 8);
publisher.publish("TwitterFeed", map1);

Queries

from allStockQuotes[win.time(60000)]
insert into fastMovingStockQuotes
symbol, price, avg(price) as averagePrice
group by symbol
having ((price > averagePrice*1.02) or (averagePrice*0.98 > price))

from twitterFeed[win.time(60000)]
insert into highFrequentTweets
company as company, sum(wordCount) as words
group by company
having (words > 10)

from fastMovingStockQuotes[win.time(60000)] as fastMovingStockQuotes
join highFrequentTweets[win.time(60000)] as highFrequentTweets
on fastMovingStockQuotes.symbol==highFrequentTweets.company
insert into predictedStockQuotes
fastMovingStockQuotes.symbol as company, fastMovingStockQuotes.averagePrice as amount, highFrequentTweets.words as words
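As a rough illustration of what the first demo query computes, the plain-Java sketch below keeps a 60-second sliding window of prices per symbol and flags a quote when it deviates from the window average by more than 2%. It assumes the current quote is included in the average and that timestamps are supplied by the caller; FastMovingSketch and its members are hypothetical names, not part of WSO2 CEP.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not the Siddhi API) of the fastMovingStockQuotes query:
// per-symbol 60 s time window, avg(price), and a 2% deviation check.
class FastMovingSketch {
    static final long WINDOW_MILLIS = 60000;

    /** A price together with its arrival time. */
    static final class Entry {
        final long timeMillis;
        final double price;

        Entry(long timeMillis, double price) {
            this.timeMillis = timeMillis;
            this.price = price;
        }
    }

    private final Map<String, Deque<Entry>> windows = new HashMap<>();

    /** Returns true if this quote differs from its symbol's window average by more than 2%. */
    boolean onQuote(String symbol, double price, long nowMillis) {
        Deque<Entry> window = windows.computeIfAbsent(symbol, k -> new ArrayDeque<>());
        // Expire quotes that have been in the window for 60 s or longer.
        while (!window.isEmpty() && nowMillis - window.peekFirst().timeMillis >= WINDOW_MILLIS) {
            window.removeFirst();
        }
        window.addLast(new Entry(nowMillis, price)); // assumption: current quote counts toward the average
        double sum = 0;
        for (Entry e : window) {
            sum += e.price;
        }
        double averagePrice = sum / window.size();
        return price > averagePrice * 1.02 || averagePrice * 0.98 > price;
    }
}
```

The tweet-count query follows the same shape with sum(wordCount) in place of avg(price), and the final query joins the two outputs on the company symbol.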

Alert

• As an email

Hi,

Within the last minute, people have tweeted about {company} {words} times, and the last traded price of {company} has changed by 2% and it is now trading at ${amount}.

From CEP
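The placeholders in the email text above ({company}, {words}, {amount}) are filled from the attributes of the output event. A minimal sketch of that kind of substitution in plain Java follows; it is not the CEP output-mapping implementation, and AlertTemplateSketch and render are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of filling {placeholder} tokens in a text output mapping.
class AlertTemplateSketch {

    /** Replaces each {key} in the template with the string form of its value. */
    static String render(String template, Map<String, Object> values) {
        String result = template;
        for (Map.Entry<String, Object> e : values.entrySet()) {
            // String.replace is a literal replacement, so '$' and braces need no escaping.
            result = result.replace("{" + e.getKey() + "}", String.valueOf(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> event = new LinkedHashMap<>();
        event.put("company", "MSFT");
        event.put("words", 12);
        event.put("amount", 26.36);
        System.out.println(render(
                "People tweeted about {company} {words} times; now trading at ${amount}.", event));
    }
}
```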

Useful links

• WSO2 CEP 2.0.1
  http://wso2.com/products/complex-event-processor/
• Distributed Processing Sample With Siddhi CEP and ActiveMQ JMS Broker
  http://suhothayan.blogspot.com/2012/08/distributed-processing-sample-for-wso2.html
• Creating Custom Data Publishers to BAM/CEP
  http://wso2.org/library/articles/2012/07/creating-custom-agents-publish-events-bamcep
• WSO2 BAM 2.0.1
  http://wso2.com/products/business-activity-monitor/

Questions?

Thank you.