Alibaba
Very Large Scale Stream Processing inside Alibaba
longda@alibaba
Current / Next / Future
Team Introduction
• Apache Storm PMC
• The first Storm team in China
• Storm 0.5.1/0.5.4/0.6.0/0.6.2/0.7.0/0.7.1
• JStorm 0.7.1/0.9.0/0.9.1/0.9.2/0.9.3/0.9.3.1/0.9.4/0.9.4.1/0.9.5/0.9.5.1/0.9.6/0.9.6.1/0.9.6.2/0.9.6.3/0.9.7/0.9.7.1/0.9.7.2/0.9.8/2.0.4/2.1.0
• Our job: do everything
  • Application development
  • JStorm platform evolution
  • JStorm/Storm technical support
  • Maintain all clusters
In Alibaba
• Everywhere
• 1,600 machines; deployment to 70K machines is planned
• More than 1,000 applications, 1,500 topologies
• 1.5 PB
• 2 trillion messages
Large Scale Application
• Tlog/eagleeye: 1,000 billion messages, 700 TB of logs; monitors logs from 200K machines
• RDS Monitor: 200 TB of logs
• CTU Security: 200 billion messages; monitors all trade/user actions; 500w
• DB Monitor: 200 billion messages; 500w
• BI Realtime Monitor: 200 billion messages; more than 2,000 KPIs
• Alimama Anti Cheat: 100 billion messages
• Living Room: 11.11 Living Room, 12.12 Living Room, Spring Festival Living Room
• Others: all kinds of monitoring systems
Advanced Features
• User-side functionality
• Stability enhancement
• Performance improvement
Stable
• Customer feedback
• Not a single accident since the Alimama cluster switched to JStorm
Advanced Features – Improve Stability
• Improve stability
  • Redesign the metric system
  • Backpressure
  • Resource isolation
  • Nimbus HA
  • TopologyMaster
  • Redesign ZK usage
  • Modify OS settings in the RPM
Redesign Metric System
• Key points:
  • Response time (RT) of every tuple stage, including wait time between stages and network cost
  • Avoid noise
• Pluggable
  • Provides an API to fetch all metrics
• Koala
  • Simple
  • Directly displays all metrics
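To make the "every tuple stage RT, including wait time" point concrete, here is a minimal sketch of per-stage timing that separates queue wait from processing time. It is an illustration only, not JStorm's metric code: the class `StageTimer`, its method names, and the stage name `"deserialize"` are all hypothetical, and timestamps are assumed to be integer milliseconds supplied by the caller.

```python
from collections import defaultdict


class StageTimer:
    """Hypothetical helper: accumulates wait and processing time per stage."""

    def __init__(self):
        self.wait_ms = defaultdict(int)      # time spent queued before the stage
        self.process_ms = defaultdict(int)   # time spent inside the stage
        self.count = defaultdict(int)        # tuples observed per stage

    def record(self, stage, enqueued_at, started_at, finished_at):
        # Wait time is the gap between enqueue and start; it captures
        # queueing (and, across workers, network) delay between stages.
        self.wait_ms[stage] += started_at - enqueued_at
        self.process_ms[stage] += finished_at - started_at
        self.count[stage] += 1

    def averages(self, stage):
        n = self.count[stage]
        if n == 0:
            raise KeyError(stage)
        return self.wait_ms[stage] / n, self.process_ms[stage] / n


timer = StageTimer()
timer.record("deserialize", enqueued_at=0, started_at=5, finished_at=8)
timer.record("deserialize", enqueued_at=10, started_at=11, finished_at=20)
avg_wait, avg_process = timer.averages("deserialize")
```

Keeping the two numbers apart shows whether a slow stage is slow because of its own work or because tuples sit waiting in front of it.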
New UI
Backpressure
• The Heron paper's description is too brief to use directly
• The design is complicated
• Works well on our online system, at 6 times the normal level
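The slide does not describe the backpressure design itself. As a rough illustration of the general idea, the sketch below uses a generic high/low watermark scheme on a receive queue: when the queue grows past the high mark, the upstream is asked to slow down, and it is released only after the queue drains to the low mark. This is an assumption for illustration, not JStorm's implementation; `WatermarkQueue` and its fields are hypothetical names.

```python
from collections import deque


class WatermarkQueue:
    """Hypothetical receive queue that raises a throttle flag between watermarks."""

    def __init__(self, high, low):
        if not 0 <= low < high:
            raise ValueError("require 0 <= low < high")
        self.high = high
        self.low = low
        self.items = deque()
        self.throttled = False  # True means: ask upstream to slow down

    def offer(self, item):
        self.items.append(item)
        if not self.throttled and len(self.items) >= self.high:
            self.throttled = True
        return self.throttled

    def poll(self):
        item = self.items.popleft()
        # Release only at the low watermark, so the flag does not flap
        # on every single enqueue/dequeue around one threshold.
        if self.throttled and len(self.items) <= self.low:
            self.throttled = False
        return item
```

The gap between the two watermarks is the design choice that matters: a single threshold would toggle the throttle signal constantly under steady load.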
Resource Isolation
• Cluster isolation, controlled through one unified portal: Koala
• Within one cluster:
  • Cgroups: share and limit CPU
  • User-defined scheduler: force a topology to run on specific nodes
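The "share + limit" combination can be illustrated with a small arithmetic model. The sketch assumes an idealized case where every group is CPU-bound: CPU is split in proportion to each group's share weight, no group exceeds its hard limit, and capacity freed by a capped group is redistributed to the others. It is a simplified model for intuition, not the kernel scheduler's algorithm, and the function name and group names are hypothetical.

```python
def allocate_cpu(total_cores, groups):
    """Idealized split of CPU among busy groups.

    groups maps a name to (share_weight, limit_in_cores).
    Returns a dict mapping each name to its allocated cores.
    """
    alloc = {}
    remaining = dict(groups)
    capacity = total_cores
    while remaining:
        total_shares = sum(share for share, _ in remaining.values())
        # Groups whose proportional slice would reach their limit get capped.
        capped = {
            name: limit
            for name, (share, limit) in remaining.items()
            if capacity * share / total_shares >= limit
        }
        if not capped:
            for name, (share, _) in remaining.items():
                alloc[name] = capacity * share / total_shares
            break
        for name, limit in capped.items():
            alloc[name] = limit
            capacity -= limit
            del remaining[name]
    return alloc


# Example on a 24-core machine (the weights and limits are made up).
result = allocate_cpu(24, {"a": (1024, 4), "b": (1024, 24), "c": (3072, 24)})
```

In this example group `a` would be entitled to 4.8 cores by weight but is capped at 4, and the freed capacity is split between `b` and `c` in a 1:3 ratio.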
Nimbus HA
• Nimbus HA
• Has run for more than 20 months
• Stable
TopologyMaster
• Central control for a topology; takes over some jobs from Nimbus
• Backpressure coordinator
• Metrics collector/calculator
• Heartbeat collector
Redesign ZK usage
• No dynamic data is stored on ZK, especially metrics and heartbeats
• With Storm, ZK can't support more than 400 nodes
• With JStorm, ZK can support 2,000 nodes; currently in Alibaba, many JStorm ZK clusters support 800 nodes
RPM Setting
• Easy JStorm installation
• Modifies:
  • Local temporary port range
  • Ulimit
  • Cron job
  • Environment variables
Advanced Features – From User Side
• User-side functionality
  • User-defined scheduler
  • User-defined log
  • User-defined metrics
  • Graceful shutdown
  • Dynamic expand/reload/restart
  • Customized memory usage
  • Different Netty policies
  • Classloader
User-Defined Scheduler
• Just use the API:
  • Customize every worker's CPU/memory usage
  • Customize topology assignment
  • Assign topologies by user
  • Bind several components (such as a spout and a bolt) into one worker
  • Bind upstream/downstream components into one worker
  • Force a component to run on specific machines
  • Force a component's tasks to run on different machines
  • Force a topology to run on specific machines
  • Force use of the old assignment
User-Defined Log
• Switch to the user's log configuration
• Switch between logback and log4j
• Redirect System.out to any file
• Add tags (clustername/hostname/topologyname/workerid/taskid)
• Dynamically change log settings:
  • Enable/disable debug; set the debug log sample rate
User-Defined Metrics
• Using Java metrics
  • User-defined metrics
  • Web UI display
• Using Alimonitor
  • All metrics are sent to Alimonitor
  • User-defined alarms
  • Display history
• Koala System, the JStorm portal
  • All metrics are sent to the Koala System
  • Display history
  • User-defined alarms
Graceful Shutdown
• Problems solved:
  • No data loss during shutdown
  • All workers must be killed
  • ZK is left clean
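The three goals above suggest an ordering: stop taking new input, drain what is in flight, stop the workers, then remove coordination state. The sketch below shows that ordering on a toy in-memory model. It is an assumption about how such a shutdown could be sequenced, not JStorm's code; `Topology`, `graceful_shutdown`, and the `registry` dict (standing in for the topology's entries in ZK) are hypothetical.

```python
from collections import deque


class Topology:
    """Toy stand-in for a running topology."""

    def __init__(self, name, pending):
        self.name = name
        self.accepting = True           # the source is still emitting
        self.pending = deque(pending)   # tuples in flight
        self.processed = []
        self.workers_alive = True


def graceful_shutdown(topology, registry):
    # 1. Stop the source first so no new tuples enter the pipeline.
    topology.accepting = False
    # 2. Drain in-flight tuples so nothing is lost.
    while topology.pending:
        topology.processed.append(topology.pending.popleft())
    # 3. Only then stop every worker.
    topology.workers_alive = False
    # 4. Finally remove the topology's coordination state.
    registry.pop(topology.name, None)


registry = {"demo": "assignment"}
topo = Topology("demo", [1, 2, 3])
graceful_shutdown(topo, registry)
```

Reversing steps 2 and 3 is what loses data in an abrupt kill, which is why the order matters more than any single step.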
Dynamic Expand/Reload/Restart
• Expand
  • Doesn't kill current workers or impact the current data flow
• Restart
  • Reset all configuration
  • Modify worker/component parallelism
• Reload
  • Reload the binary
  • Reload the configuration
Customized memory usage
• Customize worker memory: worker.memory.size
• Modify GC
  • worker.gc.childopts
  • Using the user-defined scheduler API
• Queue mode
  • Capacity limited/unlimited
Advanced Netty Feature
• Sync/async mode
• Async mode blocking policy
• Async cache policy
Classloader
• Resolves class conflicts between the application and JStorm
• 6 servers (24 cores/98 GB)
• 18 spouts/18 bolts/18 ackers
[Figure: Throughput vs. workers, JStorm vs. Storm. X-axis: workers (0 to 60). Y-axis: poll tuples/10 s (0 to 12,000,000). Data labels: 6243680, 6830500, 5595900, 5474180, 3379800, 9280598, 10818815, 9065965, 6819139, 5610201.]
Performance Improvement
1. Smart batch policy
2. Add one thread to deserialize tuples in every task
3. Remove the total send/receive stage
4. Separate the send and receive operations in the spout
5. Fix several bugs that caused the CPU to spin idle
6. Reduce the performance impact of the metrics system
7. Tune the Acker code
8. Tune GC
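The slide does not say what the smart batch policy does. One common way to batch tuples, shown here purely as an illustration, is to flush when either a size threshold or a maximum delay is reached, so that throughput improves under load without unbounded latency when traffic is light. `Batcher`, its parameters, and the caller-supplied clock are hypothetical; this should not be read as JStorm's actual policy.

```python
class Batcher:
    """Hypothetical size-or-time batcher; time values come from the caller."""

    def __init__(self, max_size, max_delay, flush):
        self.max_size = max_size    # flush once this many items are buffered
        self.max_delay = max_delay  # or once the oldest item has waited this long
        self.flush = flush          # callback that receives one list per batch
        self.buffer = []
        self.first_at = None        # arrival time of the oldest buffered item

    def add(self, item, now):
        if not self.buffer:
            self.first_at = now
        self.buffer.append(item)
        self._maybe_flush(now)

    def tick(self, now):
        """Call periodically so a partial batch is not held forever."""
        self._maybe_flush(now)

    def _maybe_flush(self, now):
        if not self.buffer:
            return
        full = len(self.buffer) >= self.max_size
        stale = now - self.first_at >= self.max_delay
        if full or stale:
            self.flush(self.buffer)
            self.buffer = []
            self.first_at = None


sent = []
batcher = Batcher(max_size=3, max_delay=10, flush=sent.append)
for i, item in enumerate(["a", "b", "c"]):
    batcher.add(item, now=i)   # the third add fills the batch and flushes it
```

Passing the clock in as `now` keeps the sketch deterministic; a real implementation would read a monotonic clock and drive `tick` from a timer.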
Architecture
[Diagram: ZooKeeper; UI, Nimbus, and three Supervisors; worker; task]
Merge into Storm
• Replace the Clojure core
Redesign our SQL Engine
• The current SQL engine is customized, not general-purpose
Where Should Storm/JStorm Go?
1. A more powerful SQL engine
2. A more powerful high-level programming framework
   1. Easier to learn and to debug
   2. Provides higher throughput
3. A high-level scheduler
   1. I prefer not to build on offline systems such as Hadoop/Spark/YARN
   2. I prefer online systems: elastic online scheduler/Docker/virtual machines
   3. More lightweight
Thanks!
Welcome to join us: QQ/WeChat: 32147704