Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streaming: Getting to...
TRANSCRIPT
![Page 1: Flink Forward SF 2017: Cliff Resnick & Seth Wiesman - From Zero to Streaming: Getting to production with Apache Flink](https://reader033.vdocuments.net/reader033/viewer/2022052915/58f9a8ad1a28aba5278b4599/html5/thumbnails/1.jpg)
April 6, 2017
Getting to Production with Apache Flink
Zero to Streaming
MediaMath
MediaMath’s technology and services help brands and their agencies drive business outcomes through programmatic marketing. We believe that good advertising is customer-centric, delivering relevant and meaningful marketing experiences across channels, formats and devices. Powered by advanced machine learning algorithms that buy, optimize and report in real time, our platform gives sophisticated marketers access to first-, second- and third-party data and trillions of digital impressions across every media channel. Clients are supported by solutions and services experts that make it simple to activate our technology. Since launching the first Demand Side Platform (DSP) in 2007, MediaMath has grown to a global company of nearly 700 employees in 15 locations in every region of the world. MediaMath’s clients include all major holding companies and operating agencies as well as leading brands across top verticals.
Reporting on Flink
Next Generation Reporting Infrastructure
▪ Generate user-facing reports from raw log data
Currently one dataset with 600 million to 1 billion records a day
More datasets to come soon
▪ Data collected from multiple datacenters around the world
▪ Work is done in AWS
Proof of Concept
Requirements
▪ Not a lambda architecture
▪ Fit into the existing stack
▪ Can’t break the bank
▪ Must align with my sleep schedule
Current Stack
New Stack
Window Schema
Append
▪ Simple to reason about
▪ Difficult to guarantee idempotent updates
Upsert
▪ Better aligns with the streaming model
▪ Results in a larger state
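The difference between the two schemas under replay can be sketched as follows. This is a minimal illustration, not the speakers' implementation: `AppendTable`, `UpsertTable`, `replay`, and the `(window_start, report_key)` row identity are hypothetical names chosen for the example. It shows why re-emitting the same window results after a restore duplicates data under append but leaves an upsert table unchanged.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical row identity: (window_start, report_key). Assumed for illustration.
RowKey = Tuple[int, str]


class AppendTable:
    """Append schema: every emitted result becomes a new row."""

    def __init__(self) -> None:
        self.rows: List[Tuple[RowKey, int]] = []

    def write(self, key: RowKey, value: int) -> None:
        self.rows.append((key, value))

    def total(self) -> int:
        return sum(value for _, value in self.rows)


class UpsertTable:
    """Upsert schema: a result overwrites the previous row with the same key."""

    def __init__(self) -> None:
        self.rows: Dict[RowKey, int] = {}

    def write(self, key: RowKey, value: int) -> None:
        self.rows[key] = value

    def total(self) -> int:
        return sum(self.rows.values())


def replay(table, results: Iterable[Tuple[RowKey, int]], times: int) -> None:
    """Write the same results several times, as a restore-and-reprocess would."""
    results = list(results)
    for _ in range(times):
        for key, value in results:
            table.write(key, value)
```

Replaying two results twice doubles the total in the append table and leaves the upsert total unchanged. The upsert side only works if each write carries the full value for its key, which is presumably where the larger state comes from.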
Restore Penalty
S3
▪ S3 is not a file system
▪ 99.999999999% durability at the cost of eventual consistency
▪ What does Amazon promise?
read-after-write consistency for PUTS of new objects
eventual consistency for overwrite PUTS and DELETES
Consistent View
▪ EMR provides a consistent view of S3
DynamoDB is used as a source of truth
▪ Can we create our own source of truth?
▪ Flink is expected to be in a consistent state
Roll our own File System
▪ Treat Flink as a source of truth
▪ If Flink and S3 disagree
Assume S3 is inconsistent
Back off
Try again
▪ Works to an extent
S3 has no upper bound on time to consistency
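The "assume S3 is inconsistent, back off, try again" loop might look like the sketch below. It is an assumption-laden illustration rather than the file system described in the talk: `list_until_consistent`, `expected`, and `list_keys` are hypothetical names, and `list_keys` stands in for whatever S3 listing call is used. The set of keys Flink believes it wrote is treated as truth, and the listing is retried with exponential backoff until it agrees or the attempts run out.

```python
import time
from typing import Callable, Optional, Set


def list_until_consistent(
    expected: Set[str],
    list_keys: Callable[[], Set[str]],
    max_attempts: int = 5,
    base_delay_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Set[str]]:
    """Retry a listing until it contains every key Flink expects.

    Returns the listing once it is a superset of `expected`, or None if
    the store still disagrees after `max_attempts` tries.
    """
    delay = base_delay_s
    for attempt in range(max_attempts):
        seen = list_keys()
        if expected <= seen:
            return seen
        # Flink and the store disagree: assume the store is lagging.
        if attempt < max_attempts - 1:
            sleep(delay)
            delay *= 2  # exponential backoff
    return None
```

Returning `None` after a bounded number of attempts reflects the caveat on the slide: with no upper bound on time to consistency, the caller still needs a plan for the case where the retries are exhausted.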
The BucketingSink and S3
Roll our own Sink
▪ Treat S3 like a key-value store
▪ Write all files locally
▪ Copy to S3 once per checkpoint
▪ Sink writes are all or nothing
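One way to picture the sink described above is the sketch below. It is not the speakers' code: `LocalThenCopySink`, the `chk-<id>` key layout, and the `_MANIFEST` marker are all hypothetical, and a plain dictionary stands in for S3. Records are appended to local files, and only at a checkpoint is each file copied to the store under a brand-new key, so the store never sees an overwrite.

```python
import os
import shutil
import tempfile
from typing import Dict, List


class LocalThenCopySink:
    """Buffers records in local files and copies them out once per checkpoint."""

    def __init__(self, store: Dict[str, bytes], prefix: str) -> None:
        self.store = store  # stand-in for S3: key -> object bytes
        self.prefix = prefix
        self.local_dir = tempfile.mkdtemp(prefix="sink-")
        self.open_files: List[str] = []

    def write(self, file_name: str, record: str) -> None:
        """Append one record to a local file; nothing reaches the store yet."""
        path = os.path.join(self.local_dir, file_name)
        with open(path, "a", encoding="utf-8", newline="\n") as f:
            f.write(record + "\n")
        if file_name not in self.open_files:
            self.open_files.append(file_name)

    def on_checkpoint(self, checkpoint_id: int) -> List[str]:
        """Copy every local file to a new key, then write a manifest last."""
        keys: List[str] = []
        for file_name in self.open_files:
            path = os.path.join(self.local_dir, file_name)
            key = f"{self.prefix}/chk-{checkpoint_id}/{file_name}"
            with open(path, "rb") as f:
                self.store[key] = f.read()  # always a new object, never an overwrite
            os.remove(path)
            keys.append(key)
        # Hypothetical commit marker: readers trust a checkpoint only if it exists.
        manifest_key = f"{self.prefix}/chk-{checkpoint_id}/_MANIFEST"
        self.store[manifest_key] = "\n".join(keys).encode("utf-8")
        self.open_files = []
        return keys

    def on_restore(self) -> None:
        """Discard local files that were never part of a completed checkpoint."""
        shutil.rmtree(self.local_dir, ignore_errors=True)
        self.local_dir = tempfile.mkdtemp(prefix="sink-")
        self.open_files = []
```

Writing the manifest last is one assumed way to get the all-or-nothing behaviour: a failure midway through the copy leaves objects without a manifest, which readers can ignore.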
Where did the checkpoint barriers go?
▪ Went nearly 3 months without seeing a checkpoint
▪ Dug into Flink’s internals
▪ Built tools to better understand what was happening
A small number of windows were never checkpointing
Taking a step back
▪ We needed to relearn our dataset
8 hours looks very different than 15 minutes
▪ We have massive data skew
▪ 1/3 of all elements belong to 0.3% of the key-space
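A skew figure of this kind can be measured with a few lines of counting. The sketch below is illustrative only: `top_key_share` is a hypothetical helper, and any data passed to it in an example is synthetic, not MediaMath's. It reports what share of all elements falls on the most frequent fraction of keys.

```python
from collections import Counter
from typing import Hashable, Iterable


def top_key_share(keys: Iterable[Hashable], key_fraction: float) -> float:
    """Share of all elements that belong to the hottest `key_fraction` of keys."""
    counts = Counter(keys)
    if not counts:
        return 0.0
    # Number of distinct keys that make up the requested fraction (at least one).
    n_top = max(1, round(len(counts) * key_fraction))
    top_elements = sum(count for _, count in counts.most_common(n_top))
    return top_elements / sum(counts.values())
```

With 1,000 distinct keys and `key_fraction=0.003`, the function sums the three hottest keys; a result near 0.33 would correspond to the skew described on the slide.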
Handling data skew
▪ Constrain the key-space
Flink scales with windows
▪ We have three keys common across all reports
[Table columns: Primary Key | Secondary Key | Metrics]
Handling data skew
▪ Reduce the number of elements moving across the network
▪ Can we build something similar to Hadoop’s combiner?
▪ Low level control through the SingleInputOperatorInterface
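The combiner idea, independent of any Flink API, can be sketched like this. `pre_aggregate` and `max_keys` are hypothetical names, and summing integers is an assumed aggregation; the talk does not say which operator logic was used. Records are partially aggregated per key on the sending side, and the buffer is flushed downstream whenever it holds `max_keys` distinct keys.

```python
from typing import Dict, Iterable, Iterator, Tuple


def pre_aggregate(
    records: Iterable[Tuple[str, int]], max_keys: int
) -> Iterator[Tuple[str, int]]:
    """Combine values per key locally and emit partial sums.

    The downstream keyed operator still has to add the partial sums
    together; this step only reduces how many records are shuffled.
    """
    buffer: Dict[str, int] = {}
    for key, value in records:
        buffer[key] = buffer.get(key, 0) + value
        if len(buffer) >= max_keys:
            flushed = list(buffer.items())
            buffer.clear()
            yield from flushed
    # Emit whatever is left when the input ends.
    yield from buffer.items()
```

Ten input records dominated by one hot key can collapse to three partial sums, which is the reduction in network traffic the slide is after. A real streaming operator would also need to flush on checkpoints and watermarks; that is left out here.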
Flink in AWS
▪ Cluster: Standalone or YARN or …?
▪ Storage: HDFS or EFS or S3?
▪ Lease: Reserved, On-Demand or Spot? EMR?
Spot rides
Why YARN?
▪ Flink Standalone (Homogeneous instances)
▪ Flink on YARN (Heterogeneous instances)
[Diagram: boxes labeled 4 and 8]
Cluster Initialization Concepts
• Cluster is defined as one or more instances of a Config, keyed by Name
• Config is “userdata”, a user supplied shell script called as root on instance init.
• Cluster “Join” – like Workflow join – means that one cluster can block on another’s init.
Spot Cluster Creation
Running Flink
▪ Initialized by shell process; managed and monitored with YARN REST APIs.
▪ Flink job created and managed with Flink REST APIs (poll for events).
▪ Job exceptions and Checkpoint events are logged to database.
Monitoring
▪ Health checks:
Use HTTP pings and the YARN REST API to verify cluster health.
Expose and publish Flink job Current Low Watermark
• Alert on Watermark lag.
Trick! Use jolokia: -javaagent:/opt/jolokia-jvm-1.3.5-agent.jar=port=0,host=0.0.0.0
Publish checkpoint times and durations
• Alert on checkpoint lag.
collectd/graphite for machine stats (memory, load, disk).
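The two lag alerts could be evaluated along the lines of the sketch below. Everything in it is an assumption made for illustration: the function name `lag_alerts`, the thresholds, and the idea of comparing the low watermark with wall-clock time (which only makes sense if event time tracks real time closely). The sentinel reflects the assumption that Flink reports `Long.MIN_VALUE` as the low watermark until a first watermark arrives.

```python
from typing import List, Optional

# Assumed sentinel: Java's Long.MIN_VALUE, reported before any watermark is seen.
NO_WATERMARK = -(2**63)


def lag_alerts(
    now_ms: int,
    low_watermark_ms: int,
    last_checkpoint_end_ms: Optional[int],
    max_watermark_lag_ms: int,
    max_checkpoint_lag_ms: int,
) -> List[str]:
    """Return a message for each lag that exceeds its threshold."""
    alerts: List[str] = []
    if low_watermark_ms == NO_WATERMARK:
        alerts.append("no watermark yet")
    elif now_ms - low_watermark_ms > max_watermark_lag_ms:
        alerts.append(f"watermark lag {now_ms - low_watermark_ms} ms")
    if last_checkpoint_end_ms is None:
        alerts.append("no completed checkpoint")
    elif now_ms - last_checkpoint_end_ms > max_checkpoint_lag_ms:
        alerts.append(f"checkpoint lag {now_ms - last_checkpoint_end_ms} ms")
    return alerts
```

A monitor would call this on each poll with values read from the published metrics and raise an alert whenever the returned list is not empty.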
Spotty-agent
▪ Watches for spot reclaim (2 minute warning).
▪ POSIX passthrough: remote exec scripts and shell commands (init checks, stats).
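A watcher for the two-minute warning might be structured as below. This is a sketch, not the Spotty-agent itself: the function names and polling interval are hypothetical, and it assumes the EC2 instance metadata path `spot/termination-time`, which answers 404 until the instance is marked for reclaim, is reachable without a session token. The `fetch`, `notify`, and `sleep` callables are injected so the loop can be exercised away from EC2.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional

# Assumed metadata endpoint for the spot termination notice.
TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"


def fetch_termination_time(url: str = TERMINATION_URL, timeout_s: float = 1.0) -> Optional[str]:
    """Return the termination timestamp, or None while no reclaim is scheduled."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.read().decode("utf-8").strip() or None
    except urllib.error.HTTPError as err:
        if err.code == 404:  # not marked for termination yet
            return None
        raise


def watch_for_reclaim(
    fetch: Callable[[], Optional[str]],
    notify: Callable[[str], None],
    sleep: Callable[[float], None] = time.sleep,
    interval_s: float = 5.0,
    max_polls: Optional[int] = None,
) -> bool:
    """Poll until a termination time appears, then notify once and stop."""
    polls = 0
    while max_polls is None or polls < max_polls:
        when = fetch()
        if when is not None:
            notify(when)
            return True
        polls += 1
        sleep(interval_s)
    return False
```

On an instance this would be started as `watch_for_reclaim(fetch_termination_time, notify)`, where `notify` is whatever sends the notification to the monitor.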
Cluster Failover
1. Agent receives warning and sends notification (first one wins).
2. Monitor initiates Flink “cancel_with_savepoint”.
3. Monitor requests a whole new cluster.
4. When new cluster is up:
   1. Verify final checkpoint (may or may not be from cancel).
   2. Restart Flink session.
   3. Restart Flink job.
   4. Delete old clusters’ EC2 instances (may already be gone).
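The failover sequence above, including the "first one wins" rule, can be sketched as a small coordinator. `FailoverMonitor`, `STEP_ORDER`, and the step names are hypothetical; each step is an injected callable because the real calls (Flink REST, cluster provisioning, EC2 cleanup) are not shown in the slides.

```python
import threading
from typing import Callable, Dict

# Step names are illustrative labels for the sequence on the slide.
STEP_ORDER = [
    "cancel_with_savepoint",
    "request_new_cluster",
    "verify_final_checkpoint",
    "restart_flink_session",
    "restart_flink_job",
    "delete_old_instances",
]


class FailoverMonitor:
    """Runs the failover steps once, no matter how many agents report a warning."""

    def __init__(self, actions: Dict[str, Callable[[], None]]) -> None:
        missing = [name for name in STEP_ORDER if name not in actions]
        if missing:
            raise ValueError(f"missing actions: {missing}")
        self._actions = actions
        self._lock = threading.Lock()
        self._started = False

    def on_reclaim_warning(self, agent_id: str) -> bool:
        """Return True if this warning triggered the failover, False if it lost the race."""
        with self._lock:
            if self._started:
                return False  # another agent's notification already won
            self._started = True
        for name in STEP_ORDER:
            self._actions[name]()
        return True
```

Several instances usually receive the reclaim warning at about the same time, so the lock and flag ensure that only the first notification starts a failover.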
Task Manager on YARN config
▪ EC2 compute (C-type) instances have approx. 1.75 GB RAM per vCPU (reserving 600-700 MB for the Linux OS).
▪ Min YARN process = 1.75G
▪ TM of four slots = 4 x 1.75 = 7G
▪ After 5% Flink “yarn tax”, each YARN process = ~6.65G
▪ Set TM heap cut-off to .25, leaving 1.6G for RocksDB
[Diagram: memory split between Heap and RocksDB]
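The sizing above is plain arithmetic and can be checked with a short calculation. The function and parameter names below are hypothetical; the default values are the ones quoted on the slide.

```python
from typing import Dict


def task_manager_memory(
    slots: int = 4,
    gb_per_slot: float = 1.75,
    yarn_tax: float = 0.05,
    heap_cutoff: float = 0.25,
) -> Dict[str, float]:
    """Reproduce the slide's sizing for one TaskManager container, in GB."""
    container_gb = slots * gb_per_slot          # 4 x 1.75 = 7
    process_gb = container_gb * (1 - yarn_tax)  # 7 x 0.95 = ~6.65
    off_heap_gb = process_gb * heap_cutoff      # ~1.66, left for RocksDB
    heap_gb = process_gb - off_heap_gb          # ~4.99 for the JVM heap
    return {
        "container_gb": container_gb,
        "process_gb": process_gb,
        "off_heap_gb": off_heap_gb,
        "heap_gb": heap_gb,
    }
```

With the defaults, about 1.66 GB is left outside the heap, which matches the roughly 1.6G the slide reserves for RocksDB.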
RocksDB Snapshot to S3
▪ S3N and S3A default:
▪ S3A Fast Upload:
▪ Hadoop 3.0 code looks very promising!
Dev Tools
▪ Live metrics
Provide context to checkpoint and other pipeline problems
▪ Remote debugging
Effective, but limited scope
▪ Cluster Logging?
Continuous process – cannot use YARN log aggregation
Many (100s) of slots – each with own operator DAG
Spot Instances – log aggregators not easy to integrate, overkill for problem
Snoop “Loggy” Log
https://github.com/MediaMath/snoop-log
Tail your ephemeral cluster ephemerally!
flink/lib/logback-snoop.jar
Hacking Flink
▪ (demo)
Seth Wiesman
Cliff Resnick
4 World Trade Center, 45th Floor New York, NY 10007
THANK YOU!