Transcript
Page 1: Building Robust, Adaptive Streaming Apps with Spark Streaming

Building Robust, Adaptive Streaming

Apps with Spark Streaming

Tathagata “TD” Das@tathadas

Spark Summit East 2016

Page 2: Building Robust, Adaptive Streaming Apps with Spark Streaming

Who am I?

Project Management Committee (PMC) member of Spark

Started Spark Streaming in grad school - AMPLab, UC Berkeley

Current technical lead of Spark Streaming

Software engineer at Databricks

2

Page 3: Building Robust, Adaptive Streaming Apps with Spark Streaming

Streaming Apps in the Real World

Building a high volume stream processing system in production has many challenges

3

Fast and Scalable Spark Streaming is fast, distributed and scalable by design!

Running in production at many places with large clusters and high data volumes

Page 4: Building Robust, Adaptive Streaming Apps with Spark Streaming

Streaming Apps in the Real World

Building a high volume stream processing system in production has many challenges

4

Fast and Scalable Easy to program

Spark Streaming makes it easy to express complex streaming business logic

Interoperates with Spark RDDs, Spark SQL DataFrames/Datasets and MLlib

Page 5: Building Robust, Adaptive Streaming Apps with Spark Streaming

Streaming Apps in the Real World

Building a high volume stream processing system in production has many challenges

5

Fast and Scalable Easy to programFault-tolerant

Spark Streaming is fully fault-tolerant and can provide end-to-end semantic guarantees

See my previous Spark Summit talk for more details

Page 6: Building Robust, Adaptive Streaming Apps with Spark Streaming

Streaming Apps in the Real World

Building a high volume stream processing system in production has many challenges

6

Fast and Scalable Easy to programFault-tolerantAdaptive Focus of this talk

Page 7: Building Robust, Adaptive Streaming Apps with Spark Streaming

Adaptive Streaming Apps

Processing conditions can change dynamically Sudden surges in data ratesDiurnal variations in processing loadUnexpected slowdowns in downstream data stores

Streaming apps should be able to adapt accordingly

7

Page 8: Building Robust, Adaptive Streaming Apps with Spark Streaming

8

BackpressureMake apps robust

against data surges

Elastic ScalingMake apps scale with

load variations

Page 9: Building Robust, Adaptive Streaming Apps with Spark Streaming

9

BackpressureMake apps robust

against data surges

Page 10: Building Robust, Adaptive Streaming Apps with Spark Streaming

Motivation

Stability condition for any streaming app Receive data only as fast as the system can process it

Stability condition for Spark Streaming’s “micro-batch” modelFinish processing previous batch before next one arrives

10

Page 11: Building Robust, Adaptive Streaming Apps with Spark Streaming

Stable micro-batch operation

11

batch interval

1s 1s 1s 1s

batch processing time <=

Previous batch is processed before next one arrives => stable

Spark Streaming runs micro-batches at fixed batch intervals

time0s 2s 4s 6s 8s

Page 12: Building Robust, Adaptive Streaming Apps with Spark Streaming

Unstable micro-batch operation

12

batch processing time > batch interval

2.1s 2.1s 2.1s 2.1s

scheduling delay builds up

Batches continuously gets delayed and backlogged => unstable

time

Spark Streaming runs micro-batches at fixed batch intervals

0s 2s 4s 6s 8s

Page 13: Building Robust, Adaptive Streaming Apps with Spark Streaming

Backpressure: Feedback Loop

13

Backpressure introduces a feedback loop to dynamically adapt the system and avoid instability

Page 14: Building Robust, Adaptive Streaming Apps with Spark Streaming

Backpressure: Dynamic rate limiting

14

Batch processing timesScheduling delays

Batch processing times and scheduling delays used to continuously estimate current processing rates

Page 15: Building Robust, Adaptive Streaming Apps with Spark Streaming

Backpressure: Dynamic rate limiting

15

∑Batch processing timesScheduling delays

Max stable processing rate estimated with PID Controller theoryWell known feedback mechanism used in industrial control systems

PID Controller

Page 16: Building Robust, Adaptive Streaming Apps with Spark Streaming

Backpressure: Dynamic rate limiting

16

∑Batch processing timesScheduling delays

Rate limits on ingestion

Accordingly, the system dynamically adapts the limits on the data ingestion rates

PID Controller

Page 17: Building Robust, Adaptive Streaming Apps with Spark Streaming

Data buffered in Kafka, ensures Spark Streaming stays stable

Backpressure: Dynamic rate limiting

17

If HDFS ingestion slows down, processing times increase

SS lowers rate limits to slow down receiving from Kafka

Page 18: Building Robust, Adaptive Streaming Apps with Spark Streaming

Backpressure: Configuration

Available since Spark 1.5

Enabled through SparkConf, setspark.streaming.backpressure.enabled = true

18

Page 19: Building Robust, Adaptive Streaming Apps with Spark Streaming

19

Elastic ScalingMake apps scale with

load variations

Page 20: Building Robust, Adaptive Streaming Apps with Spark Streaming

Elastic Scaling (aka Dynamic Allocation)

Scaling the number of Spark executors according to the load

Spark already supports Dynamic Allocation for batch jobsScale down if executors are idleScale up if tasks are queueing up

Streaming “micro-batch” jobs need different scaling policyNo executor is idle for a long time!

20

Page 21: Building Robust, Adaptive Streaming Apps with Spark Streaming

Scaling policy with Streaming

21

Load is very lowLots idle time between batches

Cluster resources wasted

stable but inefficient

Page 22: Building Robust, Adaptive Streaming Apps with Spark Streaming

Scaling policy with Streaming

22

scale down the cluster

scaling down increases batch processing timesstable operation with less resources wasted

stable but inefficient

stable and efficient

Page 23: Building Robust, Adaptive Streaming Apps with Spark Streaming

Scaling policy with Streaming

23

scaling up reduces batch processing timesunstable system becomes stable again

scale up the cluster

unstable

stable

Page 24: Building Robust, Adaptive Streaming Apps with Spark Streaming

Elastic Scaling

24

Data buffered in Kafka starts draining, allowing app adapt to any data rate

If Kafka gets data faster than what backpressure allows

SS scales up cluster to increase processing rate

Page 25: Building Robust, Adaptive Streaming Apps with Spark Streaming

Elastic Scaling: Configuration

Will be available in Spark 2.0

Enabled through SparkConf, setspark.streaming.dynamicAllocation.enabled = true

More parameters will be in the online programming guide

25

Page 26: Building Robust, Adaptive Streaming Apps with Spark Streaming

Elastic Scaling: Configuration

Make sure there is enough parallelism to take advantage of max cluster size

# of partitions in reduce, join, etc.# of Kafka partitions# of receivers

Gives usual fault-tolerance guarantees with files, Kafka Direct, Kinesis, and receiver-based sources with WAL enabled

26

Page 27: Building Robust, Adaptive Streaming Apps with Spark Streaming

27

Backpressure Elastic Scaling

Page 28: Building Robust, Adaptive Streaming Apps with Spark Streaming

28

Awesome Adaptive

Apps

Page 29: Building Robust, Adaptive Streaming Apps with Spark Streaming

29

Demo

How app behaves when the data rate suddenly

increases 20x

Page 30: Building Robust, Adaptive Streaming Apps with Spark Streaming

Demo

30

Processing time increases with data rate, until equal to batch interval

Backpressure limits the ingestion rate lower than 20k recs/sec to keep app stable

Page 31: Building Robust, Adaptive Streaming Apps with Spark Streaming

Demo

31

Elastic Scaling detects heavy load and increases cluster size

Processing times reduces as more resources available

Page 32: Building Robust, Adaptive Streaming Apps with Spark Streaming

Demo

32

Backpressure relaxes limits to allow higher ingestion rate

But still less than 20x as cluster is fully utilized

Page 33: Building Robust, Adaptive Streaming Apps with Spark Streaming

Takeaways

Backpressure: makes apps robust to sudden changes

Elastic Scaling: makes apps adapt to slower changes

Backpressure + Elastic Scaling = Awesome Adaptive Apps

Follow me @tathadas


Top Related