Vortex Tutorial Part II

Angelo Corsaro, PhD, Chief Technology Officer

Upload: angelo-corsaro

Post on 06-Jul-2015

DESCRIPTION

Vortex is a platform that provides seamless, ubiquitous, efficient and timely data sharing across mobile, embedded, desktop, cloud and web applications. Today Vortex is the enabling technology at the core of the most innovative Internet of Things and Industrial Internet applications, such as Smart Cities, Smart Grids, and Smart Traffic. This two-part tutorial (1) introduces the key concepts of Vortex, (2) gets you started with using Vortex to efficiently exchange data across mobile, embedded, desktop, cloud and web applications, and (3) provides a series of best practices, patterns and idioms to get the best out of Vortex. The only prerequisite to fully exploit this tutorial is a basic understanding of Java, C++ and JavaScript. Some knowledge of Scala and CoffeeScript will be a plus.

TRANSCRIPT

Page 1: Vortex Tutorial Part II

Page 2: Vortex Tutorial Part II

Recap

Page 4: Vortex Tutorial Part II

Building ChirpIt

Page 5: Vortex Tutorial Part II

Copyright PrismTech, 2014

To explore the various features provided by the Vortex platform we will be designing and implementing a micro-blogging platform called ChirpIt. Specifically, we want to support the following features:

ChirpIt users should be able to “chirp”, “re-chirp”, “like” and “dislike” trills as well as get detailed statistics

The ChirpIt platform should provide information on trending topics — identified by hashtags — as well as trending users

Third party services should be able to flexibly access slices of produced trills to perform their own trend analysis

ChirpIt should scale to millions of users

ChirpIt should be based on a Lambda Architecture

ChirpIt Requirements

Page 7: Vortex Tutorial Part II

Next Step

Page 8: Vortex Tutorial Part II

Trendy #hashtags

Page 12: Vortex Tutorial Part II

Quality of Service

Page 25: Vortex Tutorial Part II

For data to flow from a DataWriter (DW) to one or more DataReaders (DR), a few conditions have to apply:

The DR and DW domain participants have to be in the same domain

The partition expression of the DR’s Subscriber and the DW’s Publisher should match (in terms of regular expression match)

The QoS Policies offered by the DW should exceed or match those requested by the DR

QoS Model

[Figure: QoS model. A DomainParticipant joins a Domain Id. A Publisher produces-in and a Subscriber consumes-from a PARTITION. A DataWriter writes and a DataReader reads a Topic. The offered QoS is matched against the requested QoS for the RxO QoS Policies: DURABILITY, OWNERSHIP, DEADLINE, LATENCY BUDGET, LIVELINESS, RELIABILITY, DEST. ORDER.]
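The three matching conditions listed on this slide can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the Vortex API: the `Endpoint` class, its field names and the `matches` function are hypothetical, only three of the RxO policies are modelled, and partition expressions are compared with shell-style wildcards via Python's `fnmatch`.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Policy kinds ordered from weakest to strongest (hypothetical model).
DURABILITY = ["VOLATILE", "TRANSIENT_LOCAL", "TRANSIENT", "PERSISTENT"]
RELIABILITY = ["BEST_EFFORT", "RELIABLE"]


@dataclass
class Endpoint:
    """Hypothetical stand-in for a DataWriter or DataReader and its context."""
    domain_id: int
    partitions: tuple = ("",)           # partitions of the Publisher/Subscriber
    durability: str = "VOLATILE"
    reliability: str = "BEST_EFFORT"
    deadline_s: float = float("inf")    # deadline period in seconds


def partitions_match(pub_parts, sub_parts):
    # A match exists if any pair of partition expressions matches,
    # treating either side as a wildcard pattern.
    return any(fnmatchcase(p, s) or fnmatchcase(s, p)
               for p in pub_parts for s in sub_parts)


def matches(dw, dr):
    """True if data can flow from writer dw to reader dr in this model."""
    same_domain = dw.domain_id == dr.domain_id
    rxo_ok = (
        DURABILITY.index(dw.durability) >= DURABILITY.index(dr.durability)
        and RELIABILITY.index(dw.reliability) >= RELIABILITY.index(dr.reliability)
        and dw.deadline_s <= dr.deadline_s   # offered deadline must be at least as tight
    )
    return same_domain and partitions_match(dw.partitions, dr.partitions) and rxo_ok


writer = Endpoint(0, ("chirps.eu",), "TRANSIENT_LOCAL", "RELIABLE", 1.0)
reader = Endpoint(0, ("chirps.*",), "VOLATILE", "BEST_EFFORT", 2.0)
print(matches(writer, reader))
```

In this model a reader that requests more than the writer offers (for example PERSISTENT durability against a TRANSIENT_LOCAL writer) is simply not matched, which mirrors the "exceed or match" rule stated above.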

Page 26: Vortex Tutorial Part II

Useful QoS for ChirpIt

Page 27: Vortex Tutorial Part II

The DataWriter HISTORY QoS Policy controls the amount of data that can be made available to late-joining DataReaders under TRANSIENT_LOCAL Durability

The DataReader HISTORY QoS Policy controls how many samples will be kept in the reader cache

- Keep Last. DDS will keep the most recent “depth” samples of each instance of data identified by its key

- Keep All. DDS will keep all the samples of each instance of data identified by its key, until some configurable resource limits are reached

History QoS Policy

[Figure: three Pressure-versus-time plots, labelled KeepLast(3), KeepLast(1) and KeepAll]
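The difference between Keep Last and Keep All can be sketched with a per-instance cache. The `HistoryCache` class below is a hypothetical illustration, not part of any DDS or Vortex API, and it assumes that once a Keep All resource limit is reached new samples are simply rejected.

```python
from collections import deque


class HistoryCache:
    """Hypothetical model of a cache governed by a HISTORY policy.

    keep_last=None models Keep All; max_samples_per_instance models a
    resource limit that only applies in Keep All mode in this sketch.
    """

    def __init__(self, keep_last=None, max_samples_per_instance=None):
        self.keep_last = keep_last
        self.limit = max_samples_per_instance
        self.instances = {}              # key -> deque of samples

    def insert(self, key, sample):
        samples = self.instances.setdefault(key, deque())
        if self.keep_last is not None:
            if len(samples) == self.keep_last:
                samples.popleft()        # drop the oldest sample of this instance
            samples.append(sample)
            return True
        if self.limit is not None and len(samples) >= self.limit:
            return False                 # resource limit reached: sample rejected
        samples.append(sample)
        return True

    def samples(self, key):
        return list(self.instances.get(key, ()))


# Five pressure readings for one sensor instance, under three policies.
for cache in (HistoryCache(keep_last=3), HistoryCache(keep_last=1), HistoryCache()):
    for reading in range(5):
        cache.insert("sensor-1", reading)
    print(cache.samples("sensor-1"))
```

Because history is kept per instance, samples written for another key never evict the samples of `sensor-1`.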

Page 41: Vortex Tutorial Part II

Besides the service-specific configuration, which we won’t discuss here, it is important to understand that the amount of data that the durability service will maintain for a given topic is configured using the DurabilityService Policy

The DurabilityService Policy, defined for topics, can be used to store:

- The last n samples for each topic instance

- All samples ever produced for a given Topic (across all its instances)

Resource constraints can also be specified to limit the maximum amount of data taken by a topic

NOTE: beware that when you dispose an instance its data is removed from the Durability Service

Durability Service Configuration
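As a rough illustration of how these settings interact, the sketch below models a per-topic store. The field names `history_depth` and `max_samples` are borrowed from the DDS DurabilityService policy, but the `DurabilityServicePolicy` and `TopicStore` classes are hypothetical, and the sketch assumes a disposed instance is dropped immediately.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class DurabilityServicePolicy:
    """Hypothetical subset of the DurabilityService settings for one topic."""
    history_depth: Optional[int] = 1     # None models "keep all samples"
    max_samples: Optional[int] = None    # None models "no resource limit"


class TopicStore:
    """Hypothetical model of what the durability service keeps for a topic."""

    def __init__(self, policy):
        self.policy = policy
        self.instances = {}              # key -> deque of samples

    def total(self):
        return sum(len(q) for q in self.instances.values())

    def store(self, key, sample):
        samples = self.instances.get(key, deque())
        depth, limit = self.policy.history_depth, self.policy.max_samples
        if depth is not None and len(samples) == depth:
            samples.popleft()            # newest sample replaces the oldest one
        elif limit is not None and self.total() >= limit:
            return False                 # resource limit reached
        samples.append(sample)
        self.instances[key] = samples
        return True

    def dispose(self, key):
        # Assumption: disposing an instance removes its data right away.
        self.instances.pop(key, None)


store = TopicStore(DurabilityServicePolicy(history_depth=2, max_samples=3))
for msg in ("a", "b", "c"):
    store.store("@wolverine", msg)
store.dispose("@wolverine")
print(store.total())
```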

Page 42: Vortex Tutorial Part II

Data can be retrieved from the Durability Service in two ways:

Automatic Retrieval: Non-VOLATILE DataReaders automatically receive a set of historical data. How much data is received depends on the DR HISTORY setting and on the historical settings of the Durability Service (or of the DW for TRANSIENT_LOCAL)

Query-Based Retrieval: Any application can create a “special” data reader to query the Durability Service. Queries can predicate on content as well as time

- Get all Chirps made by @wolverine in the last 2 days:

• dr.getHistoricalData("uid = '@wolverine'", now() - 2 days, now())

- Get all Chirps containing the #hashtag “#xmen” in the last day:

• dr.getHistoricalData("msg like '*#xmen*'", now() - 1 day, now())

Getting Data From the Durability Service
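The two example queries combine a content predicate with a time window. The sketch below mimics that behaviour over an in-memory list; it is not the Vortex query API. The `Chirp` type and the `get_historical_data` helper are hypothetical, and the `like` operator is approximated with shell-style wildcard matching.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from fnmatch import fnmatchcase


@dataclass
class Chirp:
    """Hypothetical chirp sample with the fields used by the queries."""
    uid: str
    msg: str
    timestamp: datetime


def get_historical_data(store, predicate, start, end):
    """Return the stored chirps inside [start, end] that satisfy predicate."""
    return [c for c in store if start <= c.timestamp <= end and predicate(c)]


now = datetime(2015, 7, 6, 12, 0)        # fixed "now" to keep the example repeatable
store = [
    Chirp("@wolverine", "hello #xmen", now - timedelta(hours=5)),
    Chirp("@wolverine", "old news", now - timedelta(days=3)),
    Chirp("@storm", "weather report #xmen", now - timedelta(hours=30)),
]

# All chirps made by @wolverine in the last 2 days.
by_user = get_historical_data(
    store, lambda c: c.uid == "@wolverine", now - timedelta(days=2), now)

# All chirps containing the hashtag #xmen in the last day.
by_tag = get_historical_data(
    store, lambda c: fnmatchcase(c.msg, "*#xmen*"), now - timedelta(days=1), now)

print([c.msg for c in by_user], [c.msg for c in by_tag])
```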

Page 43: Vortex Tutorial Part II

VORTEX’s Durability can be leveraged to address several different use cases in ChirpIt

Batch Layer: VORTEX durability can be used to persist all the chirps ever received by our application. Scalability can be easily achieved by partitioning (more later)

Speed Layer: Views on the data-set can be efficiently created using the Durability Service Query API

Historical Data: Any analytics or end-user application can access historical data through either Automatic or Query-Based Retrieval

Exploiting VORTEX Durability in ChirpIt

Page 47: Vortex Tutorial Part II

Computing Trending #hashtags

Page 49: Vortex Tutorial Part II

ChirpIt’s #hashtag ranking function will be stateless

The latest rankings will be maintained by the VORTEX durability service. This makes it possible to restore the state after a failure and to easily compute aggregations (more later)

[Figure: #hashtag ranking function, with flows labelled Chirps (real-time data), Ranking (last ranking) and Ranking (latest ranking)]

As such, the ranking function will consume the latest ranking along with live chirps and produce the new ranking for the basic interval, that is, the shortest interval, which defines our granularity and from which aggregations will be created
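A stateless ranking function of this kind can be sketched as a pure function from the previous ranking and a batch of chirps to a new ranking. The names `rank` and `top` are hypothetical, and the sketch assumes that a ranking is a mapping from hashtag to count and that the new ranking adds the counts seen in the interval to the previous ones; the slides do not specify the actual scoring rule.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#\w+")


def rank(latest_ranking, chirps):
    """Pure function: (previous ranking, chirps of the interval) -> new ranking.

    No state is kept between calls, so after a failure the function can be
    restarted from the last ranking that was stored elsewhere.
    """
    counts = Counter(latest_ranking)     # copy, the input is left untouched
    for msg in chirps:
        counts.update(tag.lower() for tag in HASHTAG.findall(msg))
    return dict(counts)


def top(ranking, n):
    """The n highest-ranked hashtags, ties broken alphabetically."""
    return sorted(ranking.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


previous = {"#xmen": 2}
new_ranking = rank(previous, ["go #xmen", "#Vortex rocks #xmen"])
print(top(new_ranking, 2))
```

Keeping the function pure is what lets the externally stored ranking serve as the only state that has to survive a restart.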