Flume: Reliable Distributed Streaming Log Collection (transcript)
Jonathan Hsieh, Henry Robinson, Patrick Hunt
Cloudera, Inc
7/15/2010
Scenario
• Situation:
– You have hundreds of services producing logs in a datacenter.
– They produce a lot of logs that you want to analyze.
– You have Hadoop, a system for processing large volumes of data.
• Problem:
– How do I reliably ship all my logs to a place where Hadoop can analyze them?
Use cases
• Collecting logs from nodes in Hadoop cluster
• Collecting logs from services such as httpd, mail, etc.
• Collecting impressions from custom apps for an ad network
• But wait, there’s more!
– Basic availability metrics
– Basic online in-stream analysis
It’s log, log .. Everyone wants a log!
A sample topology
[Diagram: a Master oversees an agent tier of Agents feeding a collector tier of Collectors, which write into HDFS under time-bucketed paths such as /logs/web/2010/0715/1200, /logs/web/2010/0715/1300, and /logs/web/2010/0715/1400.]
You need a “Flume”
• Flume is a distributed system that gets your logs from their source and aggregates them to where you want to process them.
• Open source, Apache v2.0 License
• Goals:
– Reliability
– Scalability
– Extensibility
– Manageability
(Image: Columbia Gorge, Broughton Log Flume)
Key abstractions
• Data path and control path
• Nodes are in the data path
– Nodes have a source and a sink
– They can take different roles: a typical topology has agent nodes and collector nodes, and optionally processor nodes.
• Masters are in the control path.
– Centralized point of configuration.
– Specify sources and sinks
– Can control flows of data between nodes
– Use one master, or use many with a ZooKeeper-backed quorum
[Diagram: a Master controlling an Agent and a Collector.]
A sample topology
[Diagram: the same topology annotated by tier: an agent tier of Agents, a collector tier of Collectors, and a storage tier (HDFS with time-bucketed paths such as /logs/web/2010/0715/1200), all overseen by the Masters.]
Outline
• What is Flume? Goals and architecture
• Reliability: fault tolerance and high availability
• Scalability: horizontal scalability of all nodes and masters
• Extensibility: Unix principle; all kinds of data, all kinds of sources, all kinds of sinks
• Manageability: centralized management supporting dynamic reconfiguration
RELIABILITY
The logs will still get there.
Failures
• Faults can happen at many levels
– Software applications can fail
– Machines can fail
– Networking gear can fail
– Networks can become congested or machines excessively loaded
– Nodes can go down for maintenance
• How do we make sure that events make it to a permanent store?
Tunable data reliability levels
• Best effort
– Fire and forget
• Store on failure + retry
– Local acks; local errors detectable
– Failover when faults are detected
• End-to-end reliability
– End-to-end acks
– Data survives compound failures, and may be retried multiple times
[Diagram: an Agent → Collector → HDFS pipeline illustrating each level.]
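These levels map onto the configuration language as a choice of agent sink. A minimal sketch, assuming the 0.9-era sink names (agentBESink, agentDFOSink, agentE2ESink) and a collector host named collector1; treat the exact names as assumptions:

```
web1 : tail("/var/log/httpd/access_log") | agentBESink("collector1");
web2 : tail("/var/log/httpd/access_log") | agentDFOSink("collector1");
web3 : tail("/var/log/httpd/access_log") | agentE2ESink("collector1");
```

Here web1 fires and forgets, web2 spools to local disk and retries on failure, and web3 holds data until an end-to-end ack comes back from the final sink.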
Dealing with Agent failures
• We do not want to lose data
• Make events durable at the generation point.
– If a log generator goes down, it is not generating logs.
– If the event generation point fails and recovers, data will reach the end point
• Data is durable and survives machine crashes and reboots
– Allows for synchronous writes in log generating applications.
• Watchdog program to restart agent if it fails.
Dealing with Collector Failures
• Data is durable at the agent:
– Minimize the amount of state and possible data loss
– Not necessary to durably keep intermediate state at collector
– Retry if collector goes down.
• Use hot failover so agents can use alternate paths:
– Master predetermines failovers to load balance when collectors go down.
Master Service Failures
• A master machine should not be a single point of failure!
• Masters keep two kinds of information:
– Configuration information (node/flow configuration): kept in a ZooKeeper ensemble for a persistent, highly available metadata store; failures are easily recovered from.
– Ephemeral information (heartbeat info, acks, metrics reports): kept in memory; failures will lose this data, but it can be lazily replicated.
SCALABILITY
(Image: Logs jamming the Kemi River)
Data path is horizontally scalable
• Add collectors to increase availability and to handle more data
– Assumes a single agent will not dominate a collector
– Fewer connections to HDFS
– Larger, more efficient writes to HDFS
• Agents have mechanisms for machine resource tradeoffs
– Write logs locally to avoid collector disk I/O bottlenecks and catastrophic failures
– Compression and batching (trade CPU for network)
– Push computation into the event collection pipeline (balance I/O, memory, and CPU resource bottlenecks)
[Diagram: Agents → Collector → HDFS.]
Load balancing
• Agents are logically partitioned and send to different collectors
• Use randomization to pre-specify failovers when many collectors exist
– Spread load if a collector goes down
– Spread load if new collectors are added to the system
[Diagram: partitioned Agents sending to three Collectors.]
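With master-computed randomized failover chains, the per-agent configuration can stay generic. As a sketch (the "auto" chain sink below is remembered from the 0.9-era sink catalog; treat the name as an assumption):

```
web1 : tail("/var/log/httpd/access_log") | autoE2EChain;
web2 : tail("/var/log/httpd/access_log") | autoE2EChain;
```

Each agent receives from the master a randomized, ordered list of collectors and fails over down that list, which spreads load when a collector dies or a new one joins.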
Control plane is horizontally scalable
• A master controls dynamic configuration of nodes
– Uses a consensus protocol to keep state consistent
– Scales well for configuration reads
– Allows for adaptive repartitioning in the future
• Nodes can talk to any master.
• Masters can talk to any ZooKeeper ensemble member.
[Diagram: three Masters, each paired with a ZooKeeper member (ZK1–ZK3), serving many Nodes.]
EXTENSIBILITY
Turn raw logs into something useful…
Flume is easy to extend
• Simple source and sink APIs
– Event granularity streaming design
– Many simple operations that compose into complex behavior
• End-to-end principle
– Put smarts and state at the end points. Keep the middle simple.
• Flume deals with reliability.
– Just add a new source or a new sink, and Flume’s primitives handle reliability for it.
Variety of Data sources
• Can deal with both push and pull sources.
• Supports many legacy event sources:
– Tailing a file
– Output from a periodically exec’ed program
– Syslog, syslog-ng
– Experimental: IRC / Twitter / Scribe / AMQP
[Diagram: applications feed Agents by pushing events, by being polled, or by embedding an agent.]
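A few of these sources in the configuration language, as a hedged sketch (the source names tail, syslogTcp, and execPeriodic are from memory of the 0.9 catalog; the paths, port, and collector name are invented for illustration):

```
web1  : tail("/var/log/httpd/access_log") | agentSink("collector1");
mail1 : syslogTcp(5140)                   | agentSink("collector1");
cron1 : execPeriodic("df -k", 60000)      | agentSink("collector1");
```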
Variety of Data output
• Send data to many sinks
– Files, HDFS, console, RPC
– Experimental: HBase, Voldemort, S3, etc.
• Supports an extensible variety of output formats and destinations
– Output to language-neutral, open data formats (JSON, Avro, text)
– Compressed output files in development
• Uses decorators to process event data in flight
– Sampling, attribute extraction, filtering, projection, checksumming, batching, wire compression, etc.
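Decorators wrap a sink and transform events on the way through. In the 0.9-era syntax this looks roughly like `{ decorator => sink }`; the decorator names below (intervalSampler, gzip) are assumptions from memory:

```
web1 : tail("/var/log/httpd/access_log") | { intervalSampler(10) => { gzip => agentSink("collector1") } };
```

This would keep every 10th event and compress event bodies before shipping them to the collector.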
MANAGEABILITY
Wheeeeee!
Centralized data flow management
• One place to specify node sources, sinks and data flows.
– Simply specify the role of the node: collector, agent
– Or specify a custom configuration for a node
• Control Interfaces:
– Flume Shell
– Basic web interface
– HUE + Flume Manager App (Enterprise users)
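A typical session with the Flume shell might look like the following; the command names and default admin port are remembered from the 0.9 docs and should be treated as assumptions:

```
$ flume shell
flume> connect master:35873
flume> exec config web1 'tail("/var/log/httpd/access_log")' 'agentSink("collector1")'
flume> getconfigs
flume> quit
```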
Output bucketing
• Automatic output file management
– Writes HDFS files into time-based buckets
7/15/2010 31
HDFS output:
/logs/web/2010/0715/1200/data-xxx.txt
/logs/web/2010/0715/1200/data-xxy.txt
/logs/web/2010/0715/1300/data-xxx.txt
/logs/web/2010/0715/1300/data-xxy.txt
/logs/web/2010/0715/1400/data-xxx.txt
…
Collector configuration:
node : collectorSource | collectorSink("hdfs://namenode/logs/web/%Y/%m%d/%H00", "data")
Simplified configurations
• To make configuring Flume nodes higher level, we use logical nodes.
– The Flume node process is a physical node
– Each Flume node process can host multiple logical nodes
• This:
– Reduces the amount of detail required in configurations
– Reduces process-centric management overhead
– Allows for finer-grained resource control and isolation within flows
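In the shell, logical nodes are placed on physical nodes and then configured independently; a sketch assuming a 0.9-era `exec map` command (node names invented for illustration):

```
flume> exec map host1 web-flow
flume> exec config web-flow 'tail("/var/log/httpd/access_log")' 'agentSink("collector1")'
```

Unmapping or remapping a logical node moves its flow to other hardware without touching the rest of host1's configuration.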
Flow Isolation
• Isolate different kinds of data when and where it is generated
– Have multiple logical nodes on a machine
– Each has its own data source
– Each has its own data sink
[Diagram: logical Agent nodes on shared machines sending separate flows to separate Collectors.]
For advanced users
• A concise and precise configuration language for specifying arbitrary data paths.
– Dataflows are essentially DAGs
– Control specific event flows
• Enable durability and failover mechanisms
• Tune the parameters of these mechanisms
– Dynamic updates of configurations
• Allows for live failover changes
• Allows for handling newly provisioned machines
• Allows for changing analytics
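An explicitly wired two-flow configuration might read as below. The %Y/%m%d escapes follow the collectorSink example earlier in the deck, while the hosts, ports, and paths are invented for illustration and the sink names are assumptions from the 0.9 catalog:

```
agentA     : tail("/var/log/app/a.log") | agentDFOSink("collectorA", 35853);
agentB     : tail("/var/log/app/b.log") | agentE2ESink("collectorB", 35853);
collectorA : collectorSource(35853) | collectorSink("hdfs://namenode/logs/a/%Y/%m%d", "adata");
collectorB : collectorSource(35853) | collectorSink("hdfs://namenode/logs/b/%Y/%m%d", "bdata");
```

Flow A tolerates collector outages with local disk failover, while flow B insists on end-to-end acks.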
CONCLUSIONS
Summary
• Flume is a distributed, reliable, scalable system for collecting and delivering high-volume continuous event data such as logs
– Tunable data reliability levels
– Reliable master backed by ZooKeeper
– Writes data to HDFS into buckets ready for batch processing
– Dynamically configurable nodes
– Simplified, automated management for agent+collector topologies
• Open source, Apache v2.0 license.
Contribute!
• GitHub source repo
– http://github.com/cloudera/flume
• Mailing lists
– User: https://groups.google.com/a/cloudera.org/group/flume-user
– Dev: https://groups.google.com/a/cloudera.org/group/flume-dev
• Development trackers
– JIRA (bugs / formal feature requests): https://issues.cloudera.org/browse/FLUME
– Review Board (code reviews): http://review.hbase.org -> http://review.cloudera.org
• IRC channel
– #flume @ irc.freenode.net
Image credits
• http://www.flickr.com/photos/victorvonsalza/3327750057/
• http://www.flickr.com/photos/victorvonsalza/3207639929/
• http://www.flickr.com/photos/victorvonsalza/3327750059/
• http://www.emvergeoning.com/?m=200811
• http://www.flickr.com/photos/juse/188960076/
• http://www.flickr.com/photos/23720661@N08/3186507302/
• http://clarksoutdoorchairs.com/log_adirondack_chairs.html
• http://www.flickr.com/photos/dboo/3314299591/