storm 2012-03-29
TRANSCRIPT
![Page 1: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/1.jpg)
Real-time and long-time
Fun with Hadoop + Storm
![Page 2: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/2.jpg)
The Challenge
• Hadoop is great of processing vats of data
– But sucks for real-time (by design!)
• Storm is great for real-time processing
– But lacks any way to deal with batch processing
• It sounds like there isn’t a solution
– Neither fashionable solution handles everything
![Page 3: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/3.jpg)
This is not a problem.
It’s an opportunity!
![Page 4: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/4.jpg)
t
now
Hadoop is Not Very Real-time
UnprocessedData
Fully processed
Latest full period
Hadoop job takes this long for this data
![Page 5: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/5.jpg)
Need to Plug the Hole in Hadoop
• We have real-time data with limited state
– Exactly what Storm does
– And what Hadoop does not
• Can Storm and Hadoop be combined?
![Page 6: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/6.jpg)
t
now
Hadoop works great back here
Storm workshere
Real-time and Long-time together
Blended view
Blended view
Blended View
![Page 7: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/7.jpg)
An Example
• I want to know how many queries I get
– Per second, minute, day, week
• Results should be available
– within <2 seconds 99.9+% of the time
– within 30 seconds almost always
• History should last >3 years
• Should work for 0.001 q/s up to 100,000 q/s
• Failure tolerant, yadda, yadda
![Page 8: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/8.jpg)
Rough Design – Data Flow
Search Engine
Query Event Spout
Logger Bolt
Counter Bolt
Raw Logs
LoggerBolt
Semi Agg
Hadoop Aggregator
Snap
Long agg
Query Event Spout
Counter Bolt
Logger Bolt
![Page 9: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/9.jpg)
Counter Bolt Detail
• Input: Labels to count
• Output: Short-term semi-aggregated counts
– (time-window, label, count)
• Non-zero counts emitted if
– event count reaches threshold (typical 100K)
– time since last count reaches threshold (typical 1s)
• Tuples acked when counts emitted
• Double count probability is > 0 but very small
![Page 10: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/10.jpg)
Counter Bolt Counterintuitivity
• Counts are emitted for same label, same time window many times
– these are semi-aggregated
– this is a feature
– tuples can be acked within 1s
– time windows can be much longer than 1s
• No need to send same label to same bolt
– speeds failure recovery
![Page 11: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/11.jpg)
Design Flexibility
• Counter can persist short-term transaction log
– counter can recover state on failure
– log is normally burn after write
• Count flush interval can be extended without extending tuple timeout
– Decreases currency of counts
– System is still real-time at a longer time-scale
• Total bandwidth for log is typically not huge
![Page 12: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/12.jpg)
Counter Bolt No-nos
• Cannot accumulate entire period in-memory
– Tuples must be ack’ed much sooner
– State must be persisted before ack’ing
– State can easily grow too large to handle without disk access
• Cannot persist entire count table at once
– Incremental persistence required
![Page 13: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/13.jpg)
Guarantees
• Counter output volume is small-ish
– the greater of k tuples per 100K inputs or k tuple/s
– 1 tuple/s/label/bolt for this exercise
• Persistence layer must provide guarantees
– distributed against node failure
– must have either readable flush or closed-append
– HDFS is distributed, but no guarantees
– MapRfs is distributed, provides both guarantees
![Page 14: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/14.jpg)
Failure Modes
• Bolt failure– buffered tuples will go un’acked– after timeout, tuples will be resent– timeout ≈ 10s– if failure occurs after persistence, before acking, then
double-counting is possible
• Storage (with MapR)– most failures invisible– a few continue within 0-2s, some take 10s– catastrophic cluster restart can take 2-3 min– logger can buffer this much easily
![Page 15: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/15.jpg)
Presentation Layer
• Presentation must
– read recent output of Logger bolt
– read relevant output of Hadoop jobs
– combine semi-aggregated records
• User will see
– counts that increment within 0-2 s of events
– seamless meld of short and long-term data
![Page 16: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/16.jpg)
Mobile Network Monitor
16
Transaction data
Batch aggregation
Map
Real-time dashboard and alerts
Geo-dispersed ingest servers
Retro-analysisinterface
![Page 17: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/17.jpg)
Example 2 – Real-time learning
• My system has to
– learn a response model
and
– select training data
– in real-time
• Data rate up to 100K queries per second
![Page 18: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/18.jpg)
Door Number 3
• I have 15 versions of my landing page
• Each visitor is assigned to a version
– Which version?
• A conversion or sale or whatever can happen
– How long to wait?
• Some versions of the landing page are horrible
– Don’t want to give them traffic
![Page 19: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/19.jpg)
Real-time Constraints
• Selection must happen in <20 ms almost all the time
• Training events must be handled in <20 ms
• Failover must happen within 5 seconds
• Client should timeout and back-off
– no need for an answer after 500ms
• State persistence required
![Page 20: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/20.jpg)
Rough Design
DRPC SpoutQuery Event
Spout
Logger Bolt
Counter Bolt
Raw Logs
Model State
Timed Join Model
Logger Bolt
Conversion Detector
Selector Layer
![Page 21: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/21.jpg)
A Quick Diversion
• You see a coin
– What is the probability of heads?
– Could it be larger or smaller than that?
• I flip the coin and while it is in the air ask again
• I catch the coin and ask again
• I look at the coin (and you don’t) and ask again
• Why does the answer change?
– And did it ever have a single value?
![Page 22: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/22.jpg)
A First Conclusion
• Probability as expressed by humans is subjective and depends on information and experience
![Page 23: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/23.jpg)
A Second Conclusion
• A single number is a bad way to express uncertain knowledge
• A distribution of values might be better
![Page 24: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/24.jpg)
I Dunno
![Page 25: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/25.jpg)
5 and 5
![Page 26: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/26.jpg)
2 and 10
![Page 27: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/27.jpg)
Bayesian Bandit
• Compute distributions based on data
• Sample p1 and p2 from these distributions
• Put a coin in bandit 1 if p1 > p2
• Else, put the coin in bandit 2
![Page 28: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/28.jpg)
And it works!
11000 100 200 300 400 500 600 700 800 900 1000
0.12
0
0.01
0.02
0.03
0.04
0.05
0.06
0.07
0.08
0.09
0.1
0.11
n
reg
ret
ε- greedy, ε = 0.05
Bayesian Bandit with Gamma- Normal
![Page 29: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/29.jpg)
The Basic Idea
• We can encode a distribution by sampling
• Sampling allows unification of exploration and exploitation
• Can be extended to more general response models
![Page 30: Storm 2012-03-29](https://reader034.vdocuments.net/reader034/viewer/2022042614/55a2bb161a28ab3a5f8b46e5/html5/thumbnails/30.jpg)
• Contact:
– @ted_dunning
• Slides and such:
– http://info.mapr.com/ted-storm-2012-03