kafka short
Post on 27-Jan-2015
1
Kafka for BigData Processing
Yanai Franchi, Tikal
2
Find “Hot” Places
3
4
Gogobot Checkin Heat-Map Service
Let's Develop “Gogobot Checkins Heat-Map”
5
Key Notes
● Collector Service: collects checkins as text addresses
– We need to use a GeoLocation Service
● When each interval elapses, the locations from the last interval are displayed as a Heat-Map in the GUI.
● Web-scale service: tens of thousands of checkins per second from all over the world (imaginary, but let's do it for the exercise).
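The key notes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FAKE_GEOCODER`, `geocode`, `grid_cell` and `heat_map` are hypothetical names, the lookup table stands in for the real GeoLocation Service, and the 0.1 degree grid is an arbitrary choice for grouping nearby locations.

```python
import math
from collections import Counter

# Hypothetical stand-in for the GeoLocation Service: a fixed lookup table.
# A real collector would call an external geocoding service here.
FAKE_GEOCODER = {
    "Times Square, New York": (40.7580, -73.9855),
    "Empire State Building, New York": (40.7484, -73.9857),
    "Eiffel Tower, Paris": (48.8584, 2.2945),
}

def geocode(address):
    """Return (lat, lng) for a text address, or None if it is unknown."""
    return FAKE_GEOCODER.get(address)

def grid_cell(lat, lng, cell_size_deg=0.1):
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_size_deg), math.floor(lng / cell_size_deg))

def heat_map(checkins, interval_start, interval_secs, cell_size_deg=0.1):
    """Count checkins per grid cell for one time interval.

    checkins is an iterable of (timestamp_secs, address) pairs.
    """
    counts = Counter()
    for ts, address in checkins:
        if not (interval_start <= ts < interval_start + interval_secs):
            continue  # checkin belongs to another interval
        location = geocode(address)
        if location is None:
            continue  # address could not be resolved
        counts[grid_cell(location[0], location[1], cell_size_deg)] += 1
    return counts
```

Calling `heat_map(checkins, 0, 60)` returns a counter keyed by grid cell, which is the kind of "last interval locations" payload a GUI could render as a heat map.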
6
Heat-Map Context
Text-Address
Checkins Heat-Map Service
Gogobot System
Gogobot Micro Service
Gogobot Micro Service
Gogobot Micro Service
Geo Location Service
Get-GeoCode(Address)
Heat-Map
Last Interval Locations
7
Tons of Addresses Arriving Every Second
8
First Reaction...
9
Checkin HTTP Reactor
Checkins Topic
Storm Heat-Map Topology
Hotzones Topic
Web App
Push via WebSocket
Publish Checkins
HDFS
Checkin HTTP Firehose
10
11
They Are All Good, But Not for All Use-Cases
12
Kafka: A Little Introduction
13
14
Why ?
15
LinkedIn Original Architecture
16
17
What LinkedIn Wants...
18
Looks Familiar: Use Messaging (e.g. JMS, RabbitMQ)
19
20
21
22
23
It Didn't Scale...
24
Paradigm Change: Do NOT Track Message Consumption
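One way to picture this paradigm change is an append-only log where each consumer keeps its own read position. The sketch below is a toy in-memory model, not Kafka's API; `PartitionLog` and `Consumer` are hypothetical names chosen for illustration.

```python
class PartitionLog:
    """Append-only log. It stores messages but keeps no per-consumer state."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Append a message and return its offset in the log."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=10):
        """Return up to max_messages messages starting at the given offset."""
        return self._messages[offset:offset + max_messages]


class Consumer:
    """Each consumer remembers its own position (offset) in the log."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset

    def poll(self, max_messages=10):
        """Fetch the next batch and advance this consumer's own offset."""
        batch = self.log.read(self.offset, max_messages)
        self.offset += len(batch)
        return batch
```

Because the log never records who read what, a slow consumer does not hold anything up, and any consumer can re-read old messages simply by resetting its `offset`.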
25
26
27
28
Stateless Broker & Doesn't Fear the File System
29
Topics
● Logical collections of partitions (the physical files).
● A broker contains some of the partitions for a topic.
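The two points above can be illustrated with a small sketch. It is an assumption-laden toy: `choose_partition` uses CRC32 as a stand-in hash and `spread_partitions` uses plain round-robin placement, neither of which is claimed to match what Kafka itself does.

```python
import zlib

def choose_partition(key, num_partitions):
    """Pick a partition for a message key (CRC32 is only a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def spread_partitions(num_partitions, brokers):
    """Place a topic's partitions on brokers round-robin (illustrative only)."""
    placement = {broker: [] for broker in brokers}
    for partition in range(num_partitions):
        placement[brokers[partition % len(brokers)]].append(partition)
    return placement
```

With six partitions and three brokers, `spread_partitions` gives each broker two partitions, so every broker holds only part of the topic, and messages with the same key always map to the same partition.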
30
A Partition is Consumed by Exactly One Consumer in Each Group
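A minimal sketch of that rule, assuming a simple round-robin strategy: `assign_partitions` is a hypothetical helper, and real Kafka clients use their own assignment strategies, so treat this only as an illustration of the invariant that each partition has exactly one owner within a group.

```python
def assign_partitions(partitions, consumers):
    """Give every partition to exactly one consumer of a single group."""
    if not consumers:
        raise ValueError("a group needs at least one consumer")
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for i, partition in enumerate(sorted(partitions)):
        # Round-robin: partition i goes to consumer i modulo the group size.
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment
```

Re-running the function after a consumer leaves the group hands its partitions to the remaining members, which mirrors what a rebalance has to achieve.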
31
Distributed & Fault-Tolerant
32
Broker 1, Broker 2, Broker 3
ZooKeeper
Consumer 1, Consumer 2
Producer 1, Producer 2
33-38
Broker 1, Broker 2, Broker 3, Broker 4
ZooKeeper
Consumer 1, Consumer 2
Producer 1, Producer 2
39-41
Broker 1, Broker 2, Broker 3
ZooKeeper
Consumer 1, Consumer 2
Producer 1, Producer 2
42-44
Broker 1, Broker 2, Broker 3
ZooKeeper
Consumer 1
Producer 1, Producer 2
45
Performance Benchmark
1 Broker, 1 Producer, 1 Consumer
46
47
48
LinkedIn Kafka Performance (2012)
● 8 nodes per datacenter
– ~20 GB RAM available for Kafka
– 6TB storage, RAID 10, basic SATA drives
● 10 billion messages/day
● Sustained peak:
– 172,000 messages/second written
– 950,000 messages/second read
● 367 topics
● 40 real-time consumers
● Many ad hoc consumers
● 9.5TB log retained (~ 6 days)
● End-to-end delivery time: A few seconds
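A quick back-of-the-envelope check puts these figures in proportion. The derived numbers below are not from the slides: they assume the 10 billion/day figure counts written messages, that TB means 10^12 bytes, and that the retention window is exactly 6 days, so the per-message size in particular is only a rough estimate.

```python
# Figures quoted on the slide.
MESSAGES_PER_DAY = 10_000_000_000
PEAK_WRITES_PER_SEC = 172_000
PEAK_READS_PER_SEC = 950_000
RETAINED_BYTES = 9.5e12      # 9.5 TB, assuming decimal terabytes
RETENTION_DAYS = 6           # slide says "~ 6 days"

SECONDS_PER_DAY = 24 * 60 * 60

# Average write rate implied by the daily volume.
avg_writes_per_sec = MESSAGES_PER_DAY / SECONDS_PER_DAY

# How far the sustained write peak sits above that average.
peak_to_avg = PEAK_WRITES_PER_SEC / avg_writes_per_sec

# Peak reads per peak write: each message is read several times over.
read_fanout = PEAK_READS_PER_SEC / PEAK_WRITES_PER_SEC

# Rough bytes of retained log per message (ignores compression and replication).
approx_bytes_per_message = RETAINED_BYTES / (RETENTION_DAYS * MESSAGES_PER_DAY)

print(f"average writes/sec: {avg_writes_per_sec:,.0f}")
print(f"peak / average writes: {peak_to_avg:.2f}")
print(f"read fan-out at peak: {read_fanout:.2f}")
print(f"approx bytes per message: {approx_bytes_per_message:.0f}")
```

Under these assumptions the average write rate is about 116 thousand messages per second, and the read peak is roughly 5.5 times the write peak, which is consistent with many consumers reading the same log.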
49
Thanks