Stream Processing with Kafka and Samza

Post on 16-Feb-2017


Stream Processing with Kafka and Samza

Diego Pacheco @diego_pacheco Principal Software Architect

● Kafka: LinkedIn, 2011

● Implemented with Scala and Java

● Motivation: real-time data feeds

● Goals:

– Low latency

– High throughput

● Kafka at LinkedIn (2014):

– 300+ brokers

– 18k topics

– 140k partitions

– 220B messages per day

– 40 TB inbound

– 160 TB outbound

– Peak load: 3.25M messages/second
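To put the figures above in perspective, here is a small back-of-the-envelope calculation in Python. It uses only the numbers from the slide; the variable names are made up for illustration, and the inbound/outbound ratio is computed as-is because the slide does not state the period those volumes cover.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures taken from the slide (Kafka at LinkedIn, 2014).
messages_per_day = 220_000_000_000   # 220B messages per day
peak_per_second = 3_250_000          # peak load, messages/second
topics = 18_000
partitions = 140_000
inbound_tb = 40
outbound_tb = 160

# Average rate implied by the daily total.
avg_per_second = messages_per_day / SECONDS_PER_DAY

# How far the peak sits above the average.
peak_to_avg = peak_per_second / avg_per_second

# Average number of partitions behind each topic.
partitions_per_topic = partitions / topics

# Each byte written is read several times by different consumers.
fan_out = outbound_tb / inbound_tb

print(f"average rate:         {avg_per_second:,.0f} messages/second")
print(f"peak / average:       {peak_to_avg:.2f}")
print(f"partitions per topic: {partitions_per_topic:.1f}")
print(f"read fan-out:         {fan_out:.0f}x")
```

The daily total works out to roughly 2.5M messages/second on average, so the quoted peak is only about 1.3 times the average, and data is read about four times for every time it is written.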

● Use cases: activity stream, offline log processing

NO JMS

● Samza: LinkedIn, 2013

● Stream processing with save points (checkpoints).

● Multi-tenancy: 1 Thread per container

● State is simple

– You handle logging and restoring

– Single-threaded programming

● Works with YARN

● Works well with Kafka

● Simple API – Record-like.
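The save-point idea above can be sketched in a few lines of plain Python. This is not the Samza API: `CountingTask`, `run`, `crash_at`, and `checkpoint_every` are hypothetical names, and the save point here stores a snapshot of the state next to the input offset, which is a simplification chosen to keep the example short.

```python
class CountingTask:
    """Hypothetical single-threaded task: handles one message at a time."""

    def __init__(self, state=None):
        # Start from restored state if a save point provided one.
        self.counts = dict(state or {})

    def process(self, message):
        self.counts[message] = self.counts.get(message, 0) + 1


def run(stream, checkpoint, crash_at=None, checkpoint_every=2):
    """Process `stream` from the last save point; optionally simulate a crash."""
    task = CountingTask(checkpoint.get("state"))
    start = checkpoint.get("offset", 0)
    for offset in range(start, len(stream)):
        if offset == crash_at:
            return None  # simulated crash: in-memory state is lost
        task.process(stream[offset])
        if (offset + 1) % checkpoint_every == 0:
            # Save point: next offset to read plus a copy of the state.
            checkpoint["offset"] = offset + 1
            checkpoint["state"] = dict(task.counts)
    checkpoint["offset"] = len(stream)
    checkpoint["state"] = dict(task.counts)
    return task.counts


stream = ["view", "click", "view", "view", "click"]
checkpoint = {}

first = run(stream, checkpoint, crash_at=3)   # dies before offset 3
second = run(stream, checkpoint)              # resumes from offset 2

print(first)
print(second)
```

The message at offset 2 is processed twice across the two runs, but because the state is rolled back to the save point before replay, the final counts match a run that never crashed.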

● Stream Processing

● Low Latency

● Async Processing

● Local State

– Stores data locally on disk

– Same machine where the container runs

– Awesome fit for stateful processing
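A minimal sketch of local state, assuming a SQLite file stands in for the on-disk store and a Python list stands in for a durable, replayable log of state changes. `LocalStore` and its methods are hypothetical names for illustration, not part of Samza.

```python
import os
import sqlite3
import tempfile


class LocalStore:
    """Hypothetical key-value store kept in a file on the local disk."""

    def __init__(self, path, changelog):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v INTEGER)"
        )
        self.changelog = changelog

    def get(self, key, default=0):
        row = self.db.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else default

    def put(self, key, value, log=True):
        self.db.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )
        self.db.commit()
        if log:
            # Record the change so the state can be rebuilt elsewhere.
            self.changelog.append((key, value))

    def restore(self):
        """Rebuild the local file by replaying every logged change."""
        for key, value in self.changelog:
            self.put(key, value, log=False)


changelog = []  # stand-in for a durable log (an assumption for this sketch)

store = LocalStore(os.path.join(tempfile.mkdtemp(), "state.db"), changelog)
for user in ["ana", "bob", "ana"]:
    # Reads and writes hit the local disk, with no remote call per message.
    store.put(user, store.get(user) + 1)

# Simulate moving to another machine: empty file, then replay the log.
rebuilt = LocalStore(os.path.join(tempfile.mkdtemp(), "state.db"), changelog)
rebuilt.restore()

print(rebuilt.get("ana"), rebuilt.get("bob"))
```

Keeping the store on the same machine as the task makes every lookup a local read, and the log is only needed when the state has to be rebuilt somewhere else.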

● Tight Integration with Kafka

● Strong model for streams: ordered, highly available, partitioned, and durable (Kafka).

● Full feature set of Kafka

● Client Side Join
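The partitioned, ordered stream model and a join done on the consumer side can be illustrated with a short Python simulation. Everything here is an assumption for the sake of the example: the partition count, the CRC32 hash (a stand-in, not Kafka's own partitioner), the function names, and the simplification that the whole profile table is loaded before any event is joined.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Stable hash: the same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_stream(records, num_partitions=NUM_PARTITIONS):
    """Split (key, value) records by key, keeping order inside a partition."""
    partitions = defaultdict(list)
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


def join_partition(profile_updates, page_views):
    """Join one partition: build a local table, then enrich each event."""
    table = {}
    for user, profile in profile_updates:
        table[user] = profile  # later updates overwrite earlier ones
    return [(user, page, table.get(user)) for user, page in page_views]


profiles = [("ana", "engineer"), ("bob", "designer")]
views = [("ana", "/home"), ("bob", "/jobs"), ("ana", "/feed")]

profile_parts = partition_stream(profiles)
view_parts = partition_stream(views)

joined = []
for p in sorted(view_parts):
    # Both streams use the same key and hash, so matching records
    # are guaranteed to sit in the same partition number.
    joined.extend(join_partition(profile_parts.get(p, []), view_parts[p]))

for row in joined:
    print(row)
```

Because both inputs are partitioned by the same key, each partition can be joined independently with only local data, and events for a given user keep their original order.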

Thank You! Obrigado!
