ReStream: Accelerating Backtesting and Stream Replay with Serial-Equivalent Parallel Processing
TRANSCRIPT
ReStream: Accelerating Backtesting and Stream Replay
with Serial-Equivalent Parallel Processing
October 6, 2016
Johann Schleier-Smith, Erik T. Krogen, Joseph M. Hellerstein (UC Berkeley)
@jssmith @joe_hellerstein
Overview
• Motivations for backtesting and stream replay
• Alternatives for scaling throughput
• ReStream and Multi-Version Parallel Streaming (MVPS)
• Evaluation
Research Motivation
• Operating the Tagged and hi5 social networks
• >300 million registered users
• Millions of daily active users
• Practical pains, plus curiosity
Real-Time Spam Detection for a Dating Product
• >10 million active accounts
• >1,000 updates/sec
• Must respond to current activity
• Require near-instant decisions
• Facts recorded in an event log
• Real-time stream processing
• Need to evaluate new ideas quickly, e.g., simulate a model on the past 30 days of data in under 10 minutes
Replay: a powerful tool for creating and enhancing streaming applications
When latency matters… streaming shines
• Spam detection
• Payment fraud
• Money laundering
• Real-time recommendations
• Ad serving
• Dynamic pricing and inventory management for e-commerce, car services, etc.
• Financial trading
• Industrial monitoring
• IoT applications
• And more
Given a program that processes an ordered log sequentially
How can we achieve parallel speedup?
Serial-Equivalent Parallel Replay
[Animation: an ordered log of events (1, 2, 3, …) feeds the program; the log is then split between parallel program instances (A, B, C), each advancing its own logical time (t=4, t=5, t=9), with the requirement that the parallel execution produce the same output as a single serial program.]
Serial-Equivalent Parallel Replay
• Deterministic output
• More restrictive than transaction serializability
• Partition the input between multiple parallel programs
• Obtain same output as from one program
Developers’ Accelerated Replay Wish List
• Semantics of sequential operations with mutable state
• Full fine-grained temporal resolution
• Process months in minutes: 10,000x real-time rate
Want serial-equivalent parallel replay
Workload Assumptions
• Total order provided by log
• Abundant cloud resources available
• Per-event latency not a concern
Possible Solutions
Streaming Databases
Examples: StreamBase / Aurora, Truviso / TelegraphCQ, recent startups (PipelineDB, RethinkDB)
• Query interface derived from SQL
• Set-oriented approach allows query plan optimization, parallelism, and reordering
• Some programs can be difficult to express
• Most systems emphasize latency over replay throughput
OLTP Databases
Examples: PostgreSQL, IBM DB2, MS SQL Server, Oracle
• SQL interface
• Robust high-performance implementations
• Need to coordinate parallel replay programs
• Transactional serializability gives weaker consistency than serial equivalence
Parallel Big Data Systems
Examples: Hadoop, Apache Spark Streaming, Lambda architecture
• Routinely delivers desired log-processing throughput
• Easy to integrate arbitrary functions
• MapReduce foundation does not lend itself naturally to sequential processing
• Throughput and program semantics may be linked
Other Systems
• Other streaming: Google MillWheel, Yahoo! S4, Apache Storm, Twitter Heron, Apache Flink, Apache Samza, Walmart MUPD8
• Deterministic databases: Calvin, Bohm
• Transactional: VoltDB / S-Store
• Complex Event Processing: Esper, Tibco, JBoss
• Other recent systems: Trill, Naiad, Google Cloud Dataflow
ReStream
Challenge: serial equivalence and parallelism
Observation: causal dependencies are often sparse
• A consequence of the input data
• Suggests an opportunity for parallelism
• Can we maintain order when necessary, but not necessarily otherwise?
Multi-Versioned State
SET(timestamp=10, key=x, value=3) → stores version x=3@t=10
SET(timestamp=20, key=x, value=5) → stores version x=5@t=20
GET(timestamp=11, key=x) → 3
GET(timestamp=15, key=x) → 3
GET(timestamp=25, key=x) → 5
SET(timestamp=21, key=x, value=7) ✗ (a late write at t=21 conflicts with the earlier read at t=25)
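The versioned reads and writes above can be sketched as a small Scala class (Scala matching the sample code later in the talk). The class and method names here are illustrative, not ReStream's actual API; only the semantics come from the slide: a GET at logical time t returns the latest version with timestamp ≤ t.

```scala
import scala.collection.immutable.TreeMap

// Sketch of one multi-versioned cell: each set adds a version tagged with
// its logical timestamp; a get at time t returns the value of the latest
// version whose timestamp is <= t, or a default if none exists yet.
class MultiVersionedValue[V](default: V) {
  private var versions = TreeMap.empty[Long, V]

  def set(timestamp: Long, value: V): Unit =
    versions += (timestamp -> value)

  def get(timestamp: Long): V =
    versions.filter { case (t, _) => t <= timestamp } // versions visible at t
            .lastOption                               // latest such version
            .map { case (_, v) => v }
            .getOrElse(default)
}
```

With the slide's events: after `set(10, 3)` and `set(20, 5)`, both `get(11)` and `get(15)` return 3, while `get(25)` returns 5.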
Social Network Anti-Spam Example
sender has sent 2× as many messages to non-friends as to friends, AND
> 20% of messages sent from the IP contain an e-mail address
⇒ the message is spam
Social Network Anti-Spam Example
Express the program in four pieces:
A. Track friendships
B. Track how often a user sends to friends / non-friends
C. Track how often an IP address sends text containing an e-mail address
D. For each message, check B and C to label spam
Sample Code (piece A)
{ e: NewFriendshipEvent =>
  userPair = (e.userIdA, e.userIdB)
  friendships.merge(e.timestamp, userPair, _ => true)   // WRITE
}
Sample Code (piece B)
{ e: MessageEvent =>
  userPair = (e.senderId, e.recipientId)
  if (friendships.get(e.timestamp, userPair)) {         // READ
    friendMsgs.merge(e.timestamp, e.senderId, _ + 1)    // WRITE
  } else {
    nonfriendMsgs.merge(e.timestamp, e.senderId, _ + 1) // WRITE
  }
}
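The slides show code only for pieces A and B. To make the four-piece decomposition concrete, here is a runnable single-threaded Scala sketch of all four pieces, using plain mutable maps in place of ReStream's timestamped multi-versioned state. The event field names `ipAddress` and `text`, and the `containsEmail` helper, are assumptions for illustration, not from the talk.

```scala
import scala.collection.mutable

// Runnable single-threaded sketch of pieces A-D. Plain mutable maps stand
// in for multi-versioned state; field names and containsEmail are assumed.
sealed trait Event { def timestamp: Long }
case class NewFriendshipEvent(timestamp: Long, userIdA: Int, userIdB: Int) extends Event
case class MessageEvent(timestamp: Long, senderId: Int, recipientId: Int,
                        ipAddress: String, text: String) extends Event

object AntiSpam {
  val friendships   = mutable.Set.empty[(Int, Int)]
  val friendMsgs    = mutable.Map.empty[Int, Int].withDefaultValue(0)
  val nonfriendMsgs = mutable.Map.empty[Int, Int].withDefaultValue(0)
  val ipMsgs        = mutable.Map.empty[String, Int].withDefaultValue(0)
  val ipEmailMsgs   = mutable.Map.empty[String, Int].withDefaultValue(0)

  private def containsEmail(text: String): Boolean =
    text.matches(""".*\S+@\S+\.\S+.*""")

  // Run one event through all four pieces; returns Some(isSpam) for messages.
  def process(e: Event): Option[Boolean] = e match {
    case f: NewFriendshipEvent =>                        // A: track friendships
      friendships += ((f.userIdA, f.userIdB))
      friendships += ((f.userIdB, f.userIdA))
      None
    case m: MessageEvent =>
      // B: count messages sent to friends vs. non-friends
      if (friendships((m.senderId, m.recipientId)))
        friendMsgs(m.senderId) += 1
      else
        nonfriendMsgs(m.senderId) += 1
      // C: count messages, and e-mail-bearing messages, per IP
      ipMsgs(m.ipAddress) += 1
      if (containsEmail(m.text)) ipEmailMsgs(m.ipAddress) += 1
      // D: apply the spam rule from the earlier slide
      Some(nonfriendMsgs(m.senderId) >= 2 * friendMsgs(m.senderId) &&
           ipEmailMsgs(m.ipAddress) > 0.2 * ipMsgs(m.ipAddress))
  }
}
```

Note that B reads what A writes, and D reads what B and C write: exactly the read-write dependencies the next slide draws.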
Read-write dependency graph
[Figure: operators A–D and the state each reads (R) or writes (W). A writes friendships; B reads friendships and writes friendMsgs and nonfriendMsgs; C writes ipMsgs and ipEmailMsgs; D reads friendMsgs, nonfriendMsgs, ipMsgs, and ipEmailMsgs.]
Topological sort
[Animation: operators A–D, topologically sorted by their read-write dependencies, process events from the log. Successive builds show: reading from the log; reading from the log while writing shared state; loose coupling between operators; the need to respect dependencies (a read must not run ahead of the write it depends on: NO); and otherwise-safe out-of-order processing (OK).]
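The dependency analysis behind the topological sort can be sketched directly: operator X must precede operator Y (at a given logical time) whenever X writes state that Y reads. The read/write sets below come from the anti-spam example; the sort itself is standard Kahn's algorithm, shown only as an illustration, not ReStream's implementation.

```scala
// Build the operator DAG from read/write sets, then topologically sort it.
object DependencyOrder {
  val writes = Map(
    "A" -> Set("friendships"),
    "B" -> Set("friendMsgs", "nonfriendMsgs"),
    "C" -> Set("ipMsgs", "ipEmailMsgs"),
    "D" -> Set.empty[String])
  val reads = Map(
    "A" -> Set.empty[String],
    "B" -> Set("friendships"),
    "C" -> Set.empty[String],
    "D" -> Set("friendMsgs", "nonfriendMsgs", "ipMsgs", "ipEmailMsgs"))

  // Edge X -> Y when X writes something that Y reads.
  val edges: Set[(String, String)] = for {
    x <- writes.keySet
    y <- reads.keySet
    if x != y && (writes(x) & reads(y)).nonEmpty
  } yield (x, y)

  // Kahn's algorithm over the operator DAG.
  def topoSort: List[String] = {
    var inDeg = writes.keySet.map(op => op -> edges.count(_._2 == op)).toMap
    var ready = inDeg.collect { case (op, 0) => op }.toList.sorted
    var order = List.empty[String]
    while (ready.nonEmpty) {
      val op = ready.head
      ready = ready.tail
      order = order :+ op
      for ((src, dst) <- edges if src == op) {
        inDeg += dst -> (inDeg(dst) - 1)
        if (inDeg(dst) == 0) ready = (dst :: ready).sorted
      }
    }
    order
  }
}
```

For the example's sets, the edges are A→B, B→D, and C→D, so A, B, C, D is a valid order.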
MVPS: Multi-Version Parallel Streaming
[Figure: the log is partitioned (e.g., odd and even event streams) across parallel instances of the full operator pipeline A–D, which share globally multi-versioned state.]
Mini-batches for MVPS
[Figure: each partition processes the log in mini-batches (events 1–10, 11–20, 21–30, …), so coordination is amortized over a batch rather than paid per event.]
Multi-Versioned Parallel Streaming (MVPS)
• Partitioned parallel dataflow
• Input events passed to all operators
• Globally shared multi-versioned state
• Logical timestamps referenced throughout the computation
• Analyze the DAG of operators for potential read-write dependencies
• May use mini-batches to amortize coordination
• Serial-equivalent semantics
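The bullets above can be condensed into a toy replay driver. This is a deliberate simplification, not ReStream's scheduler: it cuts the log into mini-batches, partitions each batch, and runs the operators stage by stage, so any coordination cost is paid once per batch rather than per event. Real MVPS would run the partitions in parallel, relying on multi-versioned state and logical timestamps to keep results serial-equivalent.

```scala
// Illustrative mini-batch replay driver (sequential stand-in for MVPS).
def replay[E](log: Seq[E],
              operators: Seq[Seq[E] => Unit], // in topological order
              batchSize: Int,
              partitions: Int): Unit =
  for (batch <- log.grouped(batchSize)) {
    // round-robin partition of the mini-batch
    val parts = batch.zipWithIndex
                     .groupBy { case (_, i) => i % partitions }
                     .values.map(_.map(_._1))
    // stage by stage: each operator sees every partition of the batch;
    // the per-stage boundary is the (amortized) coordination point
    for (op <- operators; part <- parts) op(part.toSeq)
  }
```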
Evaluation
ReStream Evaluation Aims
• Demonstrate parallel speedup vs. single-thread (COST)
• Compare to alternative systems
• Understand limits to parallelism
ReStream Evaluation Workload
• Simulated social network spam detection
• Structure of read-write dependency graph linked to structure of social network
• Can tune workload characteristics by generating different social graphs
[Figure: two generated social graphs, one with a uniform degree distribution and one with a skewed degree distribution.]
Scaling Throughput
[Chart: throughput (events/s, 0–600,000) vs. number of hosts (1–32) for three execution engines: ReStream, MVPS on Spark, and a single-threaded baseline.]
Modeling Performance
• Greater parallel speedup possible when there are fewer read-write dataflow dependencies
• Track reads and writes of global state, compute critical path length along chained dependencies
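The critical-path measurement in the second bullet can be sketched as a single pass over the log: each event's chain depth is one more than that of the latest event it depends on, where a read depends on the last write to that key, and a write depends on the last write to and any later reads of that key. The longest chain bounds the achievable parallel speedup. The `Access` modeling here is an illustrative assumption.

```scala
import scala.collection.mutable

// One event = the set of state accesses it performs.
case class Access(key: String, isWrite: Boolean)

// Length of the longest chain of read-write-dependent events in the log.
def criticalPathLength(events: Seq[Seq[Access]]): Int = {
  val lastWriteDepth = mutable.Map.empty[String, Int].withDefaultValue(0)
  val lastReadDepth  = mutable.Map.empty[String, Int].withDefaultValue(0)
  var critical = 0
  for (accesses <- events) {
    // deepest chain among everything this event depends on
    val dep = accesses.map { a =>
      if (a.isWrite) math.max(lastWriteDepth(a.key), lastReadDepth(a.key))
      else lastWriteDepth(a.key)
    }.foldLeft(0)(math.max)
    val depth = dep + 1
    for (a <- accesses)
      if (a.isWrite) lastWriteDepth(a.key) = depth
      else lastReadDepth(a.key) = math.max(lastReadDepth(a.key), depth)
    critical = math.max(critical, depth)
  }
  critical
}
```

Two reads of the same written key chain to the write but not to each other, so they can proceed in parallel; this is the sparsity the model exploits.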
[Chart: throughput (events/s, 0–300,000) vs. the degree-distribution parameter α (1.5–3.0, skewed to uniform) for 2, 4, 8, and 16 hosts, with reference α values marked for web in-links, social networks, photo sharing, and web out-links.]
Modeling Performance
• Fit gives R² = 0.94
• Per-host batch size: 2,500–40,000 events (10,000 shown)
ReStream Summary
• Serial-equivalent results from parallel replay
• Throughput much greater than the real-time rate
• MVPS consistency (Multi-Versioned Parallel Streaming):
  - Analyze for potential read-write dependencies
  - Timestamped multi-versioned state
  - Track logical time at runtime
• May also apply to online stream processing and deterministic databases
@jssmith @joe_hellerstein This work was supported in part by AWS Cloud Credits for Research