storm 2012-03-29

Real-time and long-time Fun with Hadoop + Storm

Upload: ted-dunning

Post on 13-Jul-2015


TRANSCRIPT

Page 1: Storm 2012-03-29

Real-time and long-time

Fun with Hadoop + Storm

Page 2: Storm 2012-03-29

The Challenge

• Hadoop is great for processing vats of data

– But sucks for real-time (by design!)

• Storm is great for real-time processing

– But lacks any way to deal with batch processing

• It sounds like there isn’t a solution

– Neither fashionable solution handles everything

Page 3: Storm 2012-03-29

This is not a problem.

It’s an opportunity!

Page 4: Storm 2012-03-29

Hadoop is Not Very Real-time

[Timeline diagram, t running to "now": data is fully processed up through the latest full period, but the most recent data remains unprocessed, because the Hadoop job takes as long to run as the span of data it covers.]

Page 5: Storm 2012-03-29

Need to Plug the Hole in Hadoop

• We have real-time data with limited state

– Exactly what Storm does

– And what Hadoop does not

• Can Storm and Hadoop be combined?

Page 6: Storm 2012-03-29

Real-time and Long-time together

[Timeline diagram, t running to "now": Hadoop works great on the older data, Storm works on the recent data, and a blended view spans both.]

Page 7: Storm 2012-03-29

An Example

• I want to know how many queries I get

– Per second, minute, day, week

• Results should be available

– within <2 seconds 99.9+% of the time

– within 30 seconds almost always

• History should last >3 years

• Should work for 0.001 q/s up to 100,000 q/s

• Failure tolerant, yadda, yadda

Page 8: Storm 2012-03-29

Rough Design – Data Flow

[Data-flow diagram: the Search Engine feeds a Query Event Spout, which feeds a Counter Bolt; Logger Bolts persist Raw Logs and semi-aggregated counts (Semi Agg); a Hadoop Aggregator rolls the semi-aggregates up into the long-term aggregate (Long agg) via snapshots (Snap).]

Page 9: Storm 2012-03-29

Counter Bolt Detail

• Input: Labels to count

• Output: Short-term semi-aggregated counts

– (time-window, label, count)

• Non-zero counts emitted if

– event count reaches threshold (typical 100K)

– time since last count reaches threshold (typical 1s)

• Tuples acked when counts emitted

• Double count probability is > 0 but very small
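The emit rule above (flush when either the event count or the elapsed time crosses its threshold, ack only after emitting) can be sketched as follows; `CounterBolt`, `process`, and `flush` are illustrative names, not Storm's actual bolt API:

```python
import time
from collections import defaultdict

class CounterBolt:
    """Sketch of the semi-aggregating counter bolt described above.

    Emits (time_window, label, count) tuples whenever the buffered
    event count or the time since the last flush crosses a threshold.
    """

    def __init__(self, count_threshold=100_000, time_threshold=1.0, emit=print):
        self.count_threshold = count_threshold   # typical 100K events
        self.time_threshold = time_threshold     # typical 1 s
        self.emit = emit
        self.counts = defaultdict(int)           # (window, label) -> count
        self.pending = 0                         # events since last flush
        self.last_flush = time.monotonic()

    def process(self, window, label):
        self.counts[(window, label)] += 1
        self.pending += 1
        if (self.pending >= self.count_threshold or
                time.monotonic() - self.last_flush >= self.time_threshold):
            self.flush()

    def flush(self):
        # Emit non-zero semi-aggregated counts; downstream layers
        # finish the sum, so the same (window, label) may be emitted
        # many times. Tuples would be acked here, after emission.
        for (window, label), n in self.counts.items():
            self.emit((window, label, n))
        self.counts.clear()
        self.pending = 0
        self.last_flush = time.monotonic()
```

Because a window's counts are only semi-aggregated, tuples can be acked within about a second even when the time window itself is much longer.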

Page 10: Storm 2012-03-29

Counter Bolt Counterintuitivity

• Counts are emitted for same label, same time window many times

– these are semi-aggregated

– this is a feature

– tuples can be acked within 1s

– time windows can be much longer than 1s

• No need to send same label to same bolt

– speeds failure recovery

Page 11: Storm 2012-03-29

Design Flexibility

• Counter can persist short-term transaction log

– counter can recover state on failure

– the log is normally burned (discarded) after writing

• Count flush interval can be extended without extending tuple timeout

– Decreases the currency (freshness) of counts

– System is still real-time at a longer time-scale

• Total bandwidth for log is typically not huge

Page 12: Storm 2012-03-29

Counter Bolt No-nos

• Cannot accumulate entire period in-memory

– Tuples must be ack’ed much sooner

– State must be persisted before ack’ing

– State can easily grow too large to handle without disk access

• Cannot persist entire count table at once

– Incremental persistence required
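The incremental-persistence rule above can be sketched with a simple JSON-lines transaction log; `persist_increment` and `recover` are hypothetical helpers, not MapR's or Storm's actual API:

```python
import json
import os

def persist_increment(log_path, window, label, count):
    """Append one semi-aggregated count to the short-term transaction
    log. State must be durable like this *before* the corresponding
    tuples are acked, so a bolt failure never loses counts."""
    record = json.dumps({"w": window, "l": label, "n": count})
    with open(log_path, "a") as f:
        f.write(record + "\n")
        f.flush()
        os.fsync(f.fileno())   # force the increment to disk before acking

def recover(log_path):
    """Replay the log to rebuild counter state after a bolt failure."""
    state = {}
    with open(log_path) as f:
        for line in f:
            r = json.loads(line)
            key = (r["w"], r["l"])
            state[key] = state.get(key, 0) + r["n"]
    return state
```

Appending one small record at a time is what makes the "cannot persist the entire count table at once" rule workable: each flush writes only the increments, never the whole table.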

Page 13: Storm 2012-03-29

Guarantees

• Counter output volume is small-ish

– the greater of k tuples per 100K inputs or k tuple/s

– 1 tuple/s/label/bolt for this exercise

• Persistence layer must provide guarantees

– distributed against node failure

– must have either readable flush or closed-append

– HDFS is distributed, but provides neither guarantee

– MapRfs is distributed, provides both guarantees

Page 14: Storm 2012-03-29

Failure Modes

• Bolt failure

– buffered tuples will go un-acked

– after timeout, tuples will be resent

– timeout ≈ 10 s

– if failure occurs after persistence but before acking, double-counting is possible

• Storage (with MapR)

– most failures are invisible

– a few continue within 0–2 s; some take 10 s

– a catastrophic cluster restart can take 2–3 min

– the logger can buffer this much easily

Page 15: Storm 2012-03-29

Presentation Layer

• Presentation must

– read recent output of Logger bolt

– read relevant output of Hadoop jobs

– combine semi-aggregated records

• User will see

– counts that increment within 0-2 s of events

– seamless meld of short and long-term data
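The blending step above can be sketched as a merge of the two semi-aggregated record streams, with Hadoop owning windows before some cutoff and the Logger bolt output owning the rest. `blended_view` and the in-memory lists are illustrative assumptions; the real presentation layer would read these records from files:

```python
from collections import Counter

def blended_view(long_term, recent, cutoff):
    """Merge semi-aggregated (window, label, count) records.

    long_term: records from the Hadoop aggregator, complete for
               windows before `cutoff`.
    recent:    records from the Logger bolt output, covering windows
               at or after `cutoff` (and possibly overlapping older
               ones, which we ignore to avoid double counting).
    """
    totals = Counter()
    for window, label, n in long_term:
        if window < cutoff:          # Hadoop owns fully processed windows
            totals[(window, label)] += n
    for window, label, n in recent:
        if window >= cutoff:         # Storm plugs the hole up to "now"
            totals[(window, label)] += n
    return totals
```

Summing semi-aggregates is all that is needed here, which is why emitting the same (window, label) many times upstream is a feature rather than a bug.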

Page 16: Storm 2012-03-29

Mobile Network Monitor

[Architecture diagram: geo-dispersed ingest servers collect transaction data; batch aggregation (map) feeds a real-time dashboard and alerts as well as a retro-analysis interface.]

Page 17: Storm 2012-03-29

Example 2 – Real-time learning

• My system has to

– learn a response model

and

– select training data

– in real-time

• Data rate up to 100K queries per second

Page 18: Storm 2012-03-29

Door Number 3

• I have 15 versions of my landing page

• Each visitor is assigned to a version

– Which version?

• A conversion or sale or whatever can happen

– How long to wait?

• Some versions of the landing page are horrible

– Don’t want to give them traffic

Page 19: Storm 2012-03-29

Real-time Constraints

• Selection must happen in <20 ms almost all the time

• Training events must be handled in <20 ms

• Failover must happen within 5 seconds

• Client should timeout and back-off

– no need for an answer after 500ms

• State persistence required

Page 20: Storm 2012-03-29

Rough Design

[Data-flow diagram: a DRPC Spout and a Query Event Spout feed the Selector Layer and a Conversion Detector; a Timed Join pairs selections with conversions to train the Model; Logger Bolts and a Counter Bolt persist Raw Logs and Model State.]

Page 21: Storm 2012-03-29

A Quick Diversion

• You see a coin

– What is the probability of heads?

– Could it be larger or smaller than that?

• I flip the coin and while it is in the air ask again

• I catch the coin and ask again

• I look at the coin (and you don’t) and ask again

• Why does the answer change?

– And did it ever have a single value?

Page 22: Storm 2012-03-29

A First Conclusion

• Probability as expressed by humans is subjective and depends on information and experience

Page 23: Storm 2012-03-29

A Second Conclusion

• A single number is a bad way to express uncertain knowledge

• A distribution of values might be better

Page 24: Storm 2012-03-29

I Dunno

Page 25: Storm 2012-03-29

5 and 5

Page 26: Storm 2012-03-29

2 and 10

Page 27: Storm 2012-03-29

Bayesian Bandit

• Compute distributions based on data

• Sample p1 and p2 from these distributions

• Put a coin in bandit 1 if p1 > p2

• Else, put the coin in bandit 2
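The four steps above are Thompson sampling. A minimal sketch with Beta posteriors over each bandit's payoff probability (the Gamma-Normal variant works the same way with a different posterior; all names here are illustrative):

```python
import random

def bayesian_bandit_choice(bandits):
    """One round of the Bayesian Bandit.

    Each bandit's record is [wins, losses]; its posterior is
    Beta(wins + 1, losses + 1). Sample one payoff probability from
    each posterior and play the bandit with the largest sample.
    """
    samples = [random.betavariate(w + 1, l + 1) for (w, l) in bandits]
    return max(range(len(bandits)), key=lambda i: samples[i])

# Simulate: bandit 0 pays off 10% of the time, bandit 1 pays off 50%.
random.seed(7)
true_p = [0.1, 0.5]
stats = [[0, 0], [0, 0]]          # [wins, losses] per bandit
for _ in range(1000):
    i = bayesian_bandit_choice(stats)
    if random.random() < true_p[i]:
        stats[i][0] += 1          # conversion observed
    else:
        stats[i][1] += 1
```

Sampling from the posterior, rather than always playing the current best estimate, is what unifies exploration and exploitation: an uncertain bandit occasionally produces a large sample and gets tried, while a clearly bad one almost never does.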

Page 28: Storm 2012-03-29

And it works!

[Plot: regret versus number of trials n (up to 1000) for ε-greedy with ε = 0.05 and for the Bayesian Bandit with a Gamma-Normal prior; the Bayesian Bandit achieves lower regret.]

Page 29: Storm 2012-03-29

The Basic Idea

• We can encode a distribution by sampling

• Sampling allows unification of exploration and exploitation

• Can be extended to more general response models