Kostas Tzoumas - Stream Processing with Apache Flink®


Upload: dataartisans

Post on 08-Jan-2017


Page 1: Kostas Tzoumas - Stream Processing with Apache Flink®

Kostas Tzoumas (@kostas_tzoumas)

Big Data Ldn, November 4, 2016

Stream Processing with Apache Flink®

Page 2: Kostas Tzoumas - Stream Processing with Apache Flink®


Debunking Some Common Myths in Stream Processing

Page 3: Kostas Tzoumas - Stream Processing with Apache Flink®

Original creators of Apache Flink®

Providers of the dA Platform, a supported Flink distribution

Page 4: Kostas Tzoumas - Stream Processing with Apache Flink®

Outline

What is data streaming

Myth 1: The throughput/latency tradeoff

Myth 2: Exactly once not possible

Myth 3: Streaming is for (near) real-time

Myth 4: Streaming is hard

Page 5: Kostas Tzoumas - Stream Processing with Apache Flink®

The streaming architecture

Page 6: Kostas Tzoumas - Stream Processing with Apache Flink®

Reconsideration of data architecture

Better app isolation

More real-time reaction to events

Robust continuous applications

Process both real-time and historical data

Page 7: Kostas Tzoumas - Stream Processing with Apache Flink®

[Figure: three applications, each with its own app state, together with an event log and a query service]

Page 8: Kostas Tzoumas - Stream Processing with Apache Flink®

What is (distributed) streaming

Computations on never-ending “streams” of data records (“events”)

Stream processor distributes the computation in a cluster

[Figure: four parallel instances of "Your code"]
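One common way to picture how a computation is spread over a cluster is key-based partitioning: each record is routed to one of several parallel instances of the same user function according to a hash of its key. The sketch below is plain Python and does not use Flink's API; `route`, `distribute`, `NUM_INSTANCES`, and the sample events are hypothetical names chosen for illustration.

```python
import zlib
from collections import defaultdict

NUM_INSTANCES = 4  # hypothetical parallelism: four copies of "your code"


def route(key: str, num_instances: int = NUM_INSTANCES) -> int:
    """Pick the parallel instance for a key.

    CRC32 is a stable hash, so the same key always lands on the same instance.
    """
    return zlib.crc32(key.encode("utf-8")) % num_instances


def distribute(events, num_instances: int = NUM_INSTANCES):
    """Group (key, value) events by the instance that would process them."""
    per_instance = defaultdict(list)
    for key, value in events:
        per_instance[route(key, num_instances)].append((key, value))
    return dict(per_instance)


events = [("user-a", 1), ("user-b", 1), ("user-a", 1), ("user-c", 1)]
assignment = distribute(events)
```

Because routing depends only on the key, all events for `user-a` end up on one instance, which is what lets that instance keep per-key state locally.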

Page 9: Kostas Tzoumas - Stream Processing with Apache Flink®

What is stateful streaming

Computation and state
• E.g., counters, windows of past events, state machines, trained ML models

Result depends on history of stream

Stateful stream processor gives the tools to manage state
• Recover, roll back, version, upgrade, etc.

[Figure: "Your code" with attached state]
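A minimal sketch of what "result depends on history of stream" means, using the simplest kind of state named above, a counter. The class `CountPerKey` is a hypothetical stand-in for a stateful operator and is not taken from Flink.

```python
from collections import defaultdict


class CountPerKey:
    """A stateful operator: keeps a running count for every key it has seen."""

    def __init__(self):
        self.state = defaultdict(int)  # the operator's state: key -> count

    def process(self, key):
        """Update the state with one event and emit the new count for its key."""
        self.state[key] += 1
        return key, self.state[key]


counter = CountPerKey()
outputs = [counter.process(k) for k in ["a", "b", "a", "a"]]
```

The same input event `"a"` produces a different output each time, because the output is a function of the event and of everything seen before it. That accumulated `state` is what a stateful stream processor has to recover after a failure.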

Page 10: Kostas Tzoumas - Stream Processing with Apache Flink®

What is event-time streaming

Data records associated with timestamps (time series data)

Processing depends on timestamps

Event-time stream processor gives you the tools to reason about time
• E.g., handle streams that are out of order
• Core feature is watermarks – a clock to measure event time

[Figure: "Your code" with state; events arriving out of order (t3, t1, t2, t4) are grouped into windows t1-t2 and t3-t4]
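The role of a watermark can be sketched with a toy tumbling-window sum. This is an illustrative simplification, not Flink's watermark mechanism: it assumes the watermark trails the largest timestamp seen by a fixed bound, and `WINDOW_SIZE`, `MAX_OUT_OF_ORDER`, and `windowed_sums` are hypothetical names.

```python
from collections import defaultdict

WINDOW_SIZE = 10       # hypothetical window length, in event-time units
MAX_OUT_OF_ORDER = 5   # hypothetical bound on how far out of order events arrive


def windowed_sums(events):
    """Sum values per event-time window.

    A window [start, start + WINDOW_SIZE) is emitted once the watermark
    reaches its end, i.e. once no older events are expected.
    """
    open_windows = defaultdict(int)  # window start -> running sum
    watermark = float("-inf")
    results = []
    for ts, value in events:
        window_start = ts - ts % WINDOW_SIZE
        if window_start + WINDOW_SIZE <= watermark:
            continue  # too late: this window was already emitted
        open_windows[window_start] += value
        watermark = max(watermark, ts - MAX_OUT_OF_ORDER)
        for start in sorted(open_windows):
            if start + WINDOW_SIZE <= watermark:
                results.append((start, open_windows.pop(start)))
    # End of input: flush whatever is still open.
    for start in sorted(open_windows):
        results.append((start, open_windows[start]))
    return results


# Timestamps arrive out of order (3, 1, 12, 8, ...), each event carries value 1.
events = [(3, 1), (1, 1), (12, 1), (8, 1), (14, 1), (17, 1), (21, 1), (26, 1)]
```

With these inputs, the event at timestamp 8 arrives after the one at 12 but is still counted in the first window, because the watermark has not yet passed 10. Feeding the same events in sorted order gives the same windows.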

Page 11: Kostas Tzoumas - Stream Processing with Apache Flink®

What is streaming

Continuous processing on data that is continuously generated

I.e., pretty much all “big” data

It’s all about state and time

Page 12: Kostas Tzoumas - Stream Processing with Apache Flink®

Debunking some common stream processing myths

Page 13: Kostas Tzoumas - Stream Processing with Apache Flink®

Myth 1: Throughput/latency tradeoff

Myth 1: you need to choose between high throughput and low latency

Physical limits
• In reality, the network determines both the achievable throughput and latency
• A well-engineered system achieves these limits

Page 14: Kostas Tzoumas - Stream Processing with Apache Flink®

Flink performance

10s of millions of events per second on 10s of nodes, scaled to 1000s of nodes, with latency in single-digit milliseconds

Page 15: Kostas Tzoumas - Stream Processing with Apache Flink®

Myth 2: Exactly once not possible

Exactly once: under failures, system computes result as if there was no failure

In contrast to:
• At most once: no guarantees
• At least once: duplicates possible

Exactly once state versus exactly once delivery

Myth 2: Exactly once state not possible/too costly

Page 16: Kostas Tzoumas - Stream Processing with Apache Flink®

Transactions

“Exactly once” is transactions: either all actions succeed or none succeed

Transactions are possible

Transactions are useful

Let’s not start eventual consistency all over again…

Page 17: Kostas Tzoumas - Stream Processing with Apache Flink®

Flink checkpoints

Periodic asynchronous consistent snapshots of application state

Provide exactly-once state guarantees under failures
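The snapshot-and-replay idea behind checkpoints can be sketched in a few lines. The toy below is a single-process, synchronous simplification, not Flink's implementation; `run`, `checkpoint_every`, and `fail_at` are hypothetical names. It snapshots the source offset together with the operator state and, after a simulated crash, restores both.

```python
import copy


def run(records, checkpoint_every, fail_at=None):
    """Sum records with periodic checkpoints.

    If `fail_at` is given, crash once just before processing that index
    and recover from the last checkpoint.
    """
    state = {"sum": 0}
    checkpoint = (0, copy.deepcopy(state))  # (source offset, state snapshot)
    offset = 0
    failed = False
    while offset < len(records):
        if fail_at is not None and offset == fail_at and not failed:
            failed = True
            # Simulated crash: in-memory state is lost, so roll back to the
            # last checkpoint and replay the source from the saved offset.
            offset, snapshot = checkpoint
            state = copy.deepcopy(snapshot)
            continue
        state["sum"] += records[offset]
        offset += 1
        if offset % checkpoint_every == 0:
            checkpoint = (offset, copy.deepcopy(state))
    return state["sum"]


records = list(range(1, 11))
```

Some records are processed twice after the crash, yet each one affects the final state exactly once, because the state and the offset are restored together. Restoring only one of the two would either drop or double-count records.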

Page 18: Kostas Tzoumas - Stream Processing with Apache Flink®

End-to-end exactly once

Checkpoints double as transaction coordination mechanism

Source and sink operators can take part in checkpoints

Exactly once internally, "effectively once" end to end: e.g., Flink + Cassandra with idempotent updates

[Figure: transactional sinks]
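Why idempotent updates give "effectively once" results can be shown with a small sketch. A Python dict stands in for a keyed external store here; no Cassandra API is used, and `append_sink`, `upsert_sink`, and the sample deliveries are hypothetical.

```python
def append_sink(store, record_id, value):
    """Non-idempotent write: every delivery adds another entry."""
    store.append((record_id, value))


def upsert_sink(store, record_id, value):
    """Idempotent write: delivering the same record twice leaves the same result."""
    store[record_id] = value


# Hypothetical output of a job that replayed some work after a failure:
# the record for window "w2" is delivered twice.
deliveries = [("w1", 10), ("w2", 7), ("w2", 7), ("w3", 4)]

appended = []
upserted = {}
for record_id, value in deliveries:
    append_sink(appended, record_id, value)
    upsert_sink(upserted, record_id, value)
```

The append-style sink ends up with four entries, while the upsert-style sink holds exactly one value per record id, as if each record had been delivered once. This relies on replayed records carrying the same id and value, which is an assumption of the sketch.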

Page 19: Kostas Tzoumas - Stream Processing with Apache Flink®

State management

Checkpoints triple as state versioning mechanism (savepoints)

Go back and forth in time while maintaining state consistency

Ease code upgrades (Flink or app), maintenance, migration, debugging, what-if simulations, and A/B tests

Page 20: Kostas Tzoumas - Stream Processing with Apache Flink®

Myth 3: Streaming and real time

Myth 3: streaming and real-time are synonymous

Streaming is a new model
• Essentially, state and time
• Low latency/real time is the icing on the cake

Page 21: Kostas Tzoumas - Stream Processing with Apache Flink®

Low latency and high latency streams

[Figure: a timeline of hourly partitions (2016-3-1 12:00 am, 1:00 am, 2:00 am, … 2016-3-11 10:00 pm, 11:00 pm, 2016-3-12 12:00 am, 1:00 am, 2:00 am, 3:00 am, …), annotated with three ways to process it: Stream (low latency), Batch (bounded stream), Stream (high latency)]
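The idea that a batch is just a bounded stream can be illustrated with one function that runs unchanged on a finite list and on an endless generator. This is a plain-Python sketch; `running_max` and `sensor` are hypothetical names, and the sensor values are made up.

```python
import itertools


def running_max(stream):
    """Yield the maximum seen so far after each record; works on any iterable."""
    best = None
    for value in stream:
        best = value if best is None else max(best, value)
        yield best


def sensor():
    """Hypothetical endless source of readings."""
    for i in itertools.count():
        yield (i * 7) % 10


# Bounded input ("batch"): the stream ends, so the last value is the final answer.
batch_result = list(running_max([3, 1, 4, 1, 5]))

# Unbounded input: the same code keeps emitting results; take the first five.
stream_prefix = list(itertools.islice(running_max(sensor()), 5))
```

The only difference between the two runs is whether the input ever ends, which is the sense in which the same streaming logic covers low-latency, high-latency, and batch processing.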

Page 22: Kostas Tzoumas - Stream Processing with Apache Flink®

Robust continuous applications

Page 23: Kostas Tzoumas - Stream Processing with Apache Flink®

Accurate computation

Batch processing is not an accurate computation model for continuous data
• Misses the right concepts and primitives
• Time handling, state across batch boundaries

Stateful stream processing a better model
• Real-time/low-latency is the icing on the cake
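A small sketch of the "state across batch boundaries" problem, using user sessions as the example. The 30-minute gap, the function names, and the click times are assumptions made for illustration.

```python
SESSION_GAP = 30 * 60  # hypothetical: 30 minutes of inactivity ends a session
HOUR = 60 * 60


def count_sessions(timestamps, gap=SESSION_GAP):
    """Count sessions in a sorted list of event timestamps (in seconds)."""
    sessions = 0
    previous = None
    for ts in timestamps:
        if previous is None or ts - previous > gap:
            sessions += 1
        previous = ts
    return sessions


def count_sessions_hourly_batches(timestamps, gap=SESSION_GAP):
    """Sessionize each hourly batch on its own, carrying no state across batches."""
    batches = {}
    for ts in timestamps:
        batches.setdefault(ts // HOUR, []).append(ts)
    return sum(count_sessions(batch, gap) for batch in batches.values())


# One user clicking every 10 minutes from 00:40 to 01:20.
clicks = [40 * 60, 50 * 60, 60 * 60, 70 * 60, 80 * 60]
```

Processed as one continuous stream, the clicks form a single session. Processed as independent hourly batches, the same clicks are counted as two sessions, because the session that spans 01:00 is cut at the batch boundary.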

Page 24: Kostas Tzoumas - Stream Processing with Apache Flink®

Myth 4: How hard is streaming?

Myth 4: streaming is too hard to learn

You are already doing streaming, just in an ad hoc way

Most data is unbounded and the code changes slower than the data
• This is a streaming problem

Page 25: Kostas Tzoumas - Stream Processing with Apache Flink®

It's about your data and code

What's the form of your data?
• Unbounded (e.g., clicks, sensors, logs), or
• Bounded (e.g., ???*)

What changes more often?
• My code changes faster than my data
• My data changes faster than my code

* Please help me find a great example of naturally bounded data

Page 26: Kostas Tzoumas - Stream Processing with Apache Flink®

It's about your data and code

If your data changes faster than your code you have a streaming problem
• You may be solving it with hourly batch jobs depending on someone else to create the hourly batches
• You are probably living with inaccurate results without knowing it

Page 27: Kostas Tzoumas - Stream Processing with Apache Flink®

It's about your data and code

If your code changes faster than your data you have an exploration problem
• Using notebooks or other tools for quick data exploration is a good idea
• Once your code stabilizes you will have a streaming problem, so you might as well think of it as such from the beginning

Page 28: Kostas Tzoumas - Stream Processing with Apache Flink®

Flink in the real world

Page 29: Kostas Tzoumas - Stream Processing with Apache Flink®

Flink community

> 240 contributors, 95 contributors in Flink 1.1

42 meetups around the world with > 15,000 members

2x-3x growth in 2015, similar in 2016

Page 30: Kostas Tzoumas - Stream Processing with Apache Flink®

Powered by Flink

Zalando, one of the largest ecommerce companies in Europe, uses Flink for real-time business process monitoring.

King, the creator of Candy Crush Saga, uses Flink to provide data science teams with real-time analytics.

Bouygues Telecom uses Flink for real-time event processing over billions of Kafka messages per day.

Alibaba, the world's largest retailer, built a Flink-based system (Blink) to optimize search rankings in real time.

See more at flink.apache.org/poweredby.html

Page 31: Kostas Tzoumas - Stream Processing with Apache Flink®

30 Flink applications in production for more than one year. 10 billion events (2TB) processed daily

Complex jobs of > 30 operators running 24/7, processing 30 billion events daily, maintaining state of 100s of GB with exactly-once guarantees

Largest job has > 20 operators, runs on > 5000 vCores in 1000-node cluster, processes millions of events per second

Page 32: Kostas Tzoumas - Stream Processing with Apache Flink®

Page 33: Kostas Tzoumas - Stream Processing with Apache Flink®

Flink Forward 2016

Page 34: Kostas Tzoumas - Stream Processing with Apache Flink®

Current work in Flink

Page 35: Kostas Tzoumas - Stream Processing with Apache Flink®

Ongoing Flink development

[Figure: roadmap diagram. Areas: Operations, Ecosystem, Application Features, Broader Audience. Items shown: Connectors, Session Windows, (Stream) SQL, Library enhancements, Metric System, Metrics & Visualization, Dynamic Scaling, Savepoint compatibility, Checkpoints to savepoints, More connectors, Stream SQL, Windows, Large state maintenance, Fine grained recovery, Side in-/outputs, Window DSL, Security, Mesos & others, Dynamic Resource Management, Authentication, Queryable State]

Page 36: Kostas Tzoumas - Stream Processing with Apache Flink®

A longer-term vision for Flink

Page 37: Kostas Tzoumas - Stream Processing with Apache Flink®

Streaming use cases

Application                      Technology
(Near) real-time apps            Low-latency streaming
Continuous apps                  High-latency streaming
Analytics on historical data     Batch as special case of streaming
Request/response apps            Large queryable state

Page 38: Kostas Tzoumas - Stream Processing with Apache Flink®

Request/response applications

Queryable state: query Flink state directly instead of pushing results into a database

Large state support and query API coming in Flink

[Figure: queries issued directly against Flink state]
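The queryable-state idea can be sketched as an operator that answers lookups from its own state rather than writing every result to an external database. This is a hypothetical illustration in plain Python, not Flink's query API; `QueryableCounter` and its methods are made-up names.

```python
from collections import defaultdict


class QueryableCounter:
    """Hypothetical operator whose keyed state can be read by outside callers."""

    def __init__(self):
        self._counts = defaultdict(int)

    def process(self, key):
        """Update state for one event; nothing is written to an external store."""
        self._counts[key] += 1

    def query(self, key):
        """Serve a request straight from operator state (0 for unseen keys)."""
        return self._counts.get(key, 0)


op = QueryableCounter()
for page in ["home", "cart", "home"]:
    op.process(page)
```

A request/response application would call something like `op.query("home")` and get the current count without a database round trip. In a real deployment the state is distributed and the lookup crosses the network, which this sketch leaves out.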

Page 39: Kostas Tzoumas - Stream Processing with Apache Flink®

In summary

The need for streaming comes from a rethinking of data infra architecture
• Stream processing then just becomes natural

Debunking 4 common myths
• Myth 1: The throughput/latency tradeoff
• Myth 2: Exactly once not possible
• Myth 3: Streaming is for (near) real-time
• Myth 4: Streaming is hard

Page 40: Kostas Tzoumas - Stream Processing with Apache Flink®

Thank you!

@kostas_tzoumas @ApacheFlink @dataArtisans

Page 41: Kostas Tzoumas - Stream Processing with Apache Flink®

We are hiring!

data-artisans.com/careers