apache storm concepts

Apache Storm Concepts André Dias [email protected]

Upload: andre-dias

Post on 15-Jul-2015


Page 1: Apache Storm Concepts

Apache Storm Concepts

André Dias

[email protected]

Page 2: Apache Storm Concepts

Highlights

● Background
● Storm’s History
● Concepts
● Integrations
● Real Cases
● Q&A

Page 3: Apache Storm Concepts

Background

Page 4: Apache Storm Concepts

Background: Big Data V’s

Page 5: Apache Storm Concepts

Background: Big Data V’s

Volume: Petabytes / Terabytes

Velocity: Real-time / Near real-time

Variety: Sensors, blog posts, logs, social networks...

Page 6: Apache Storm Concepts

Background: Data Streaming

Page 7: Apache Storm Concepts

Background: Data Streaming

Page 8: Apache Storm Concepts

Background: Data Streaming

Now Please… I’m in traffic…

Page 9: Apache Storm Concepts

Background: Data Value Chain

● Discovery
● Ingest
● Process
● Persist
● Analyze
● Expose

Page 10: Apache Storm Concepts

Background: Data Value Chain (Storm Focus)

● Discovery
● Ingest
● Process
● Persist
● Analyze
● Expose

Page 11: Apache Storm Concepts

Background: Data Value Chain

● Discovery
● Ingest
● Process
● Persist
● Analyze
● Expose

Page 12: Apache Storm Concepts

Background: Data Value Chain (Ingest)

Stream Processing: Data in Motion

Batch Processing

Page 13: Apache Storm Concepts

Background: Lambda Architecture (LA)

● Data processing architecture

○ Generic

○ Scalable

○ Fault-tolerant (against human error and hardware failure)

● Low latency
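The idea behind the Lambda Architecture can be sketched in a few lines of plain Python. The layer names below (batch, speed, serving) follow the usual description of the architecture and are not taken from these slides; all function and class names are hypothetical, and a real deployment would use something like Hadoop for the batch layer and Storm for the speed layer.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute counts from the full, immutable event log."""
    return Counter(master_dataset)


class SpeedLayer:
    """Speed layer: incrementally counts events seen since the last batch run."""

    def __init__(self):
        self.view = Counter()

    def on_event(self, event):
        self.view[event] += 1


def query(key, batch, speed):
    """Serving layer: merge the batch view with the real-time view."""
    return batch[key] + speed.view[key]


batch = batch_view(["click", "view", "click"])  # slow, complete recomputation
speed = SpeedLayer()
speed.on_event("click")                         # fast, incremental update
print(query("click", batch, speed))             # 3
```

The batch view is recomputed from scratch, so a bug in the processing code can be fixed and the view rebuilt, which is where the tolerance to human error comes from. The speed layer only has to cover the gap until the next batch run.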

Page 14: Apache Storm Concepts

Background: Lambda Architecture (LA)

Page 15: Apache Storm Concepts

Storm’s History: Nathan Marz

● BackType (founded 2008), acquired by Twitter in 2011

● Creator of the Lambda Architecture

● September 2011: Storm open-sourced

● September 2013: Storm entered the Apache Incubator (ASF)

Page 16: Apache Storm Concepts

Why Use It?

● Scalable, like Hadoop

● No data loss, reliable

● Fault tolerant

● Language agnostic

● Real-time, real fast

Page 17: Apache Storm Concepts

Why Use It?

Storm is an Apache Top-Level Project (TLP)!

Page 18: Apache Storm Concepts

Concepts

Page 19: Apache Storm Concepts

Concepts

● Tuple
● Streams
● Spouts
● Bolts
● Topologies / Trident API
● Stream Groupings

Page 20: Apache Storm Concepts

Concepts: Spouts

● Source of Streams

● Data Consumers (Ingestion)

● Emits Tuples

Page 21: Apache Storm Concepts

Concepts: Bolts

● Units of work applied to tuples

● Data streaming logic

● Can emit tuples as well

● Data store integration
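The spout and bolt roles described on the last two slides can be illustrated with plain Python generators. This is only a sketch of the idea, not the Storm API: the function names are hypothetical, and in Storm these components would be classes running as parallel tasks across a cluster.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def sentence_spout(lines: Iterable[str]) -> Iterator[Tuple[str]]:
    """Spout: reads from a source and emits one-field tuples."""
    for line in lines:
        yield (line,)


def split_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: consumes sentence tuples and emits one tuple per word."""
    for (sentence,) in stream:
        for word in sentence.split():
            yield (word.lower(),)


def count_bolt(stream: Iterable[Tuple[str]]) -> Counter:
    """Terminal bolt: keeps a running count per word (stand-in for a data store)."""
    counts: Counter = Counter()
    for (word,) in stream:
        counts[word] += 1
    return counts


source = ["storm emits tuples", "bolts process tuples"]
counts = count_bolt(split_bolt(sentence_spout(source)))
print(counts["tuples"])  # 2
```

Chaining generators keeps the data "in motion": each tuple flows through every stage as soon as it is emitted, and nothing waits for the whole input to be read.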

Page 22: Apache Storm Concepts

Concepts: Topology

● Data Streaming Flow Representation

● DAG (Directed Acyclic Graph) of Spouts and Bolts

● Streaming computation

● Each node runs as an individual task (parallel execution)

● Stateless
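The DAG structure of a topology can be sketched with Python's standard graphlib module. The component names are hypothetical, and the ordering computed here only validates the wiring: a real topology runs all of its components concurrently and continuously rather than one after another.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical components; each key maps to the set of components it reads from.
upstream = {
    "sentences": set(),       # spout: no inputs
    "split": {"sentences"},   # bolt fed by the spout
    "count": {"split"},       # bolt fed by another bolt
    "report": {"count"},      # two bolts subscribe to the same stream
    "store": {"count"},
}


def execution_order(graph):
    """Return components so that every source precedes its consumers.

    Raises graphlib.CycleError if the wiring contains a cycle.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(upstream)
print(order)  # spout first, then the bolts in dependency order

try:
    execution_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("not a DAG")
```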

Page 23: Apache Storm Concepts

Concepts: Trident API

● Abstraction layer over the low-level Storm API

● Supports more complex topologies

● Stateful

● Micro-batch

● High-level API (similar to Pig / Cascading on Hadoop)

● Exactly-once processing semantics for state updates (core Storm guarantees at-least-once)
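One way a replayed batch can be made harmless for stateful processing is to store the id of the last applied batch next to each value, an idea the Trident documentation describes for transactional state. The sketch below is plain Python with hypothetical names, and it assumes that a replayed batch carries the same transaction id and the same tuples as the original attempt.

```python
class TransactionalCounter:
    """Toy state that stores, per key, the count and the id of the last batch applied."""

    def __init__(self):
        self._state = {}  # key -> (last_txid, count)

    def apply_batch(self, txid, words):
        # Aggregate the batch first so each key is written once per batch.
        partial = {}
        for word in words:
            partial[word] = partial.get(word, 0) + 1
        for key, delta in partial.items():
            last_txid, count = self._state.get(key, (None, 0))
            if last_txid == txid:
                continue  # this batch was already applied to the key: replay is a no-op
            self._state[key] = (txid, count + delta)

    def get(self, key):
        return self._state.get(key, (None, 0))[1]


state = TransactionalCounter()
state.apply_batch(1, ["a", "b", "a"])
state.apply_batch(1, ["a", "b", "a"])  # replay after a simulated failure
state.apply_batch(2, ["a"])
print(state.get("a"), state.get("b"))  # 3 1
```

Delivery is still at-least-once (batch 1 arrives twice), but the stored transaction id makes the state update behave as if each batch had been applied exactly once.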

Page 24: Apache Storm Concepts

Concepts: Trident API - Micro Batch

● Trident Batches

○ are Ordered

Page 25: Apache Storm Concepts

Concepts: Trident API - Micro Batch

● Trident Batches

○ can be Partitioned
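The two batch properties above, ordering and partitioning, can be sketched in plain Python. The function names, the batch size, and the use of CRC32 as a stable hash are all assumptions made for illustration; none of this is Trident's actual implementation.

```python
import zlib
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield (txid, batch) pairs; txids increase, so batches keep their order."""
    iterator = iter(stream)
    txid = 1
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield txid, batch
        txid += 1


def partition(batch, num_partitions, key=lambda item: item):
    """Split one batch into partitions using a stable hash of each item's key."""
    parts = [[] for _ in range(num_partitions)]
    for item in batch:
        index = zlib.crc32(str(key(item)).encode("utf-8")) % num_partitions
        parts[index].append(item)
    return parts


print(list(micro_batches(["a", "b", "c", "d", "e"], 2)))
# [(1, ['a', 'b']), (2, ['c', 'd']), (3, ['e'])]
```

Partitions of the same batch can be processed in parallel, while the increasing transaction id still lets state updates be applied in batch order.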

Page 26: Apache Storm Concepts

Concepts: Stream Groupings

● Control how tuples are routed between the tasks of a topology
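Storm ships several groupings; two of the most common are shuffle grouping (tuples are spread randomly across a bolt's tasks) and fields grouping (tuples with the same value for a chosen field always go to the same task). The sketch below mimics both in plain Python. The function names and the CRC32 hash are illustrative assumptions, not Storm's implementation.

```python
import random
import zlib


def shuffle_grouping(num_tasks, rng=random):
    """Pick a random task index; spreads tuples roughly evenly."""
    return rng.randrange(num_tasks)


def fields_grouping(tup, field_index, num_tasks):
    """Pick a task from a stable hash of one field: equal values, same task."""
    value = str(tup[field_index]).encode("utf-8")
    return zlib.crc32(value) % num_tasks


# Both tuples share the user id in field 0, so they land on the same task.
first = fields_grouping(("user-42", "click"), 0, 4)
second = fields_grouping(("user-42", "purchase"), 0, 4)
print(first == second)  # True
```

Fields grouping is what makes per-key work such as counting safe to parallelize, because all tuples for a given key meet in one task. Shuffle grouping fits stateless steps where only load balancing matters.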

Page 27: Apache Storm Concepts

Architecture

Page 28: Apache Storm Concepts

Architecture: Components - Nimbus

● Master node (similar to Hadoop's JobTracker)

● Monitors and distributes the processing workload across worker nodes

● Keeps its state in ZooKeeper

Page 29: Apache Storm Concepts

Architecture: Components - Supervisor

● Worker node (similar to Hadoop's TaskTracker)

● Listens for work assigned by Nimbus and starts or stops worker processes on its node

● Keeps its state in ZooKeeper or on local disk

Page 30: Apache Storm Concepts

Architecture: Overview

● Master-slave approach

● Cluster coordination (ZooKeeper)

● Nimbus HA
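The master-worker split can be pictured with a toy scheduler: a master spreads tasks over the available worker nodes and moves them elsewhere when a node disappears. Round-robin placement and every name below are assumptions made for illustration. This is not Storm's scheduler, and ZooKeeper coordination is left out entirely.

```python
from itertools import cycle


def assign_tasks(tasks, supervisors):
    """Round-robin tasks over supervisor names; returns {supervisor: [tasks]}."""
    if not supervisors:
        raise ValueError("no supervisors available")
    assignment = {name: [] for name in supervisors}
    for task, name in zip(tasks, cycle(supervisors)):
        assignment[name].append(task)
    return assignment


def reassign_on_failure(assignment, failed):
    """Return a new assignment with the failed supervisor's tasks moved to survivors."""
    survivors = [name for name in assignment if name != failed]
    if not survivors:
        raise ValueError("no supervisors left")
    updated = {name: list(assignment[name]) for name in survivors}
    for task, name in zip(assignment.get(failed, []), cycle(survivors)):
        updated[name].append(task)
    return updated


plan = assign_tasks(["spout-1", "split-1", "split-2", "count-1"], ["sup-a", "sup-b"])
print(plan)
# {'sup-a': ['spout-1', 'split-2'], 'sup-b': ['split-1', 'count-1']}
print(reassign_on_failure(plan, "sup-b"))
# {'sup-a': ['spout-1', 'split-2', 'split-1', 'count-1']}
```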

Page 31: Apache Storm Concepts

Integrations

Page 32: Apache Storm Concepts

Real Cases

● Collecting sensor information into a data lake

● Micro-batching user content, content feeds, and application logs

● Real-time music recommendations for users

Page 33: Apache Storm Concepts

Q&A

Page 34: Apache Storm Concepts

THANK YOU!

Use Storm