Topics Covered - eclass.uoa.gr

Post on 25-Jul-2020

Topics Covered

What is Kafka

Why Kafka

High level overview

Use cases

Key terminology

Partitions distribution over brokers

Replication protocol

What is Kafka

publish-subscribe messaging system

fast

distributed by design

fault tolerant

scalable

durable

written in Scala and Java

free and open source

Building Data Pipelines

This is bad data pipelining.

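Kafka decouples data pipelines: producers write to Kafka and consumers read from it, so neither side connects to the other directly.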
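
One way to see what decoupling buys is to count connections. The sketch below is illustrative arithmetic only: the function names and the example system counts are assumptions, not taken from the slides, and it assumes that without a broker every source would feed every destination directly.

```python
def direct_links(sources: int, destinations: int) -> int:
    """Point-to-point pipelines: every source connects to every destination."""
    return sources * destinations


def brokered_links(sources: int, destinations: int) -> int:
    """With a broker in the middle, each system connects only to the broker."""
    return sources + destinations


if __name__ == "__main__":
    # Hypothetical system counts, chosen only for illustration.
    for n_src, n_dst in [(2, 2), (4, 6), (10, 10)]:
        print(
            f"{n_src} sources, {n_dst} destinations: "
            f"{direct_links(n_src, n_dst)} direct links vs "
            f"{brokered_links(n_src, n_dst)} links through a broker"
        )
```

The direct count grows with the product of the two sides, while the brokered count grows with their sum, which is why adding one more system is cheap once a broker sits in the middle.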

High level overview

Use cases

Messaging

Website Activity Tracking

Metrics

Log Aggregation

Real-Time Stream Processing

Event Sourcing

Commit Log

Internet of Things (IoT)

Key Terminology

Anatomy of a Topic

For each topic, the Kafka cluster maintains a partitioned log that looks like this:

http://kafka.apache.org/images/log_anatomy.png

The number of partitions for a topic is configurable. In this example, the topic has three partitions.
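
A partitioned log can be pictured as a handful of independent append-only lists, where a message's position in its list is its offset. The following is a minimal in-memory sketch of that idea, not Kafka's client API: the `ToyTopic` class, its method names, the CRC32 key hash, and the round-robin fallback for keyless messages are all assumptions made for illustration.

```python
import zlib
from typing import List, Optional, Tuple


class ToyTopic:
    """A toy in-memory topic: a fixed number of append-only partition logs."""

    def __init__(self, name: str, num_partitions: int = 3) -> None:
        self.name = name
        self.partitions: List[List[bytes]] = [[] for _ in range(num_partitions)]
        self._next = 0  # round-robin cursor for messages without a key

    def _choose_partition(self, key: Optional[bytes]) -> int:
        if key is None:
            chosen = self._next % len(self.partitions)
            self._next += 1
            return chosen
        # Stable hash, so the same key always lands in the same partition.
        # Assumption: CRC32 stands in for whatever hash a real client uses.
        return zlib.crc32(key) % len(self.partitions)

    def append(self, value: bytes, key: Optional[bytes] = None) -> Tuple[int, int]:
        """Append a message and return its (partition, offset)."""
        partition = self._choose_partition(key)
        log = self.partitions[partition]
        log.append(value)
        return partition, len(log) - 1

    def read(self, partition: int, offset: int) -> bytes:
        """Read one message by partition and offset."""
        return self.partitions[partition][offset]


if __name__ == "__main__":
    topic = ToyTopic("page-views", num_partitions=3)
    first = topic.append(b"click-1", key=b"user-42")
    second = topic.append(b"click-2", key=b"user-42")
    # Same key, same partition; the second message gets the next offset.
    print(first, second, topic.read(*second))
```

Ordering is only meaningful within one partition in this model: two messages with different keys may land in different lists, and nothing relates their offsets to each other.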

Partitions distribution

Who is responsible for these tasks?

Responsibilities of the Controller

managing the states of partitions and replicas

performing administrative tasks like reassigning partitions

Roles for a Partition

Each partition has one server which acts as the "leader" and zero or more servers which act as "followers".

The leader handles all read and write requests for the partition, while the followers passively replicate the leader.

If the leader fails, one of the followers automatically becomes the new leader.

Each server acts as a leader for some of its partitions and a follower for others, so load is well balanced within the cluster.
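
The leader and follower roles described above can be sketched as a small simulation. This is a toy model under stated assumptions, not a description of Kafka's actual assignment or election logic: the names `PartitionState`, `assign`, and `fail_broker` are hypothetical, replicas are placed round-robin, and the "first surviving replica becomes leader" rule is chosen only to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PartitionState:
    replicas: List[int]  # broker ids holding a copy of this partition
    leader: int          # broker id currently acting as leader

    @property
    def followers(self) -> List[int]:
        return [b for b in self.replicas if b != self.leader]


def assign(num_partitions: int, brokers: List[int], replication: int) -> Dict[int, PartitionState]:
    """Place replicas round-robin so leadership is shared between brokers."""
    if replication > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    states: Dict[int, PartitionState] = {}
    for p in range(num_partitions):
        replicas = [brokers[(p + i) % len(brokers)] for i in range(replication)]
        states[p] = PartitionState(replicas=replicas, leader=replicas[0])
    return states


def fail_broker(states: Dict[int, PartitionState], dead: int) -> None:
    """Drop a failed broker and promote a follower wherever it was the leader."""
    for state in states.values():
        if dead not in state.replicas:
            continue
        state.replicas.remove(dead)
        if state.leader == dead:
            if not state.replicas:
                raise RuntimeError("no surviving replica to promote")
            state.leader = state.replicas[0]  # toy rule: first surviving replica


if __name__ == "__main__":
    states = assign(num_partitions=3, brokers=[1, 2, 3], replication=2)
    print({p: (s.leader, s.followers) for p, s in states.items()})
    fail_broker(states, dead=1)
    print({p: (s.leader, s.followers) for p, s in states.items()})
```

With three partitions on three brokers, each broker starts out leading one partition and following another, which is the load-balancing effect the text describes. After broker 1 fails, the partition it led is taken over by its former follower.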

Replication Protocol
