Kafka Needs No Keeper - QConSF - Colin McCabe


Post on 03-Jun-2020


Page 1: Kafka Needs No Keeper 1 - QConSF · Kafka Needs No Keeper Colin McCabe. 2 Kafka has gotten its mileage out of Zookeeper But it is still a second system KIP-500 has been adopted by


Kafka Needs No Keeper

Colin McCabe

Introduction

● Kafka has gotten its mileage out of ZooKeeper
● But it is still a second system
● KIP-500 has been adopted by the community
● This is not a 1-1 replacement
● We’ve been headed this direction for years


Evolution of Apache Kafka Clients

Producer
- write to topics

Consumer
- read from topics
- offset fetch/commit
- group partition assignment

Admin Tools
- topic create/delete

Consumer Group Coordinator

Consumer
- read from topics
- offset fetch/commit
- group partition assignment

Consumer APIs
● Fetch
● OffsetCommit
● OffsetFetch
● JoinGroup
● SyncGroup
● Heartbeat

__offsets
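The slides pair the OffsetCommit and OffsetFetch APIs with an `__offsets` log. A minimal sketch of that idea follows, assuming a coordinator that appends each commit to a log and serves fetches from a last-write-wins view. The class name, method names, and the group and topic names are hypothetical and are not Kafka's actual implementation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for the offsets log: an append-only list
# of ((group, topic, partition), offset) records.
OffsetKey = Tuple[str, str, int]


class ToyGroupCoordinator:
    def __init__(self) -> None:
        self.offsets_log: List[Tuple[OffsetKey, int]] = []
        self.latest: Dict[OffsetKey, int] = {}

    def offset_commit(self, group: str, topic: str, partition: int, offset: int) -> None:
        key = (group, topic, partition)
        self.offsets_log.append((key, offset))  # record of the commit
        self.latest[key] = offset               # materialized view: last write wins

    def offset_fetch(self, group: str, topic: str, partition: int) -> Optional[int]:
        # Returns None when the group has never committed for this partition.
        return self.latest.get((group, topic, partition))


coordinator = ToyGroupCoordinator()
coordinator.offset_commit("billing", "payments", 0, 41)
coordinator.offset_commit("billing", "payments", 0, 42)
print(coordinator.offset_fetch("billing", "payments", 0))  # 42
print(coordinator.offset_fetch("billing", "payments", 1))  # None
```

In this sketch the consumer only ever talks to the coordinator, which is the encapsulation point the deck returns to later.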

Consumer

Producer

Admin Tools
- create/delete topics

Kafka Security and the Admin Client

Consumer

Producer

Admin Tools
- create/delete topics

AdminClient

ACL Enforcement

Admin APIs:
● CreateTopics
● DeleteTopics
● AlterConfigs
● ...

Producer

Consumer

AdminClient

Client APIs:
● Produce
● Fetch
● Metadata
● CreateTopics
● DeleteTopics
● ...

● Encapsulation
● Security
● Validation
● Compatibility

Inter-Broker Communication

Controller

Controller APIs:
● LeaderAndIsr
● UpdateMetadata
● StopReplica
● AlterIsr

Leader/ISR Push
Update Metadata
Stop/Delete Replica
ISR Management

Broker Registration
ACL Management
Dynamic Configuration
ISR Management
Controller Election

● Encapsulation
● Compatibility
● Ownership

Broker Liveness

ZK Session

/brokers/1 -> { host: 10.10.10.1:9092 rack: rack-1 }

Watch trigger
Broker 1 is offline
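The liveness mechanism on these slides ties a broker's registration node to its ZK session, and a watch fires when that node disappears. It can be sketched with a toy in-memory model. `FakeZooKeeper` and its methods are hypothetical stand-ins for illustration, not the API of any real ZooKeeper client.

```python
from typing import Callable, Dict, List


class FakeZooKeeper:
    """Toy in-memory stand-in for ZooKeeper (hypothetical, not a real client API)."""

    def __init__(self) -> None:
        self.ephemeral: Dict[str, dict] = {}   # path -> data
        self.owner: Dict[str, int] = {}        # path -> owning session id
        self.watches: Dict[str, List[Callable[[str], None]]] = {}

    def create_ephemeral(self, session_id: int, path: str, data: dict) -> None:
        self.ephemeral[path] = data
        self.owner[path] = session_id

    def watch(self, path: str, callback: Callable[[str], None]) -> None:
        self.watches.setdefault(path, []).append(callback)

    def expire_session(self, session_id: int) -> None:
        # When a session ends, its ephemeral nodes vanish and each watch fires once.
        for path in [p for p, s in self.owner.items() if s == session_id]:
            del self.ephemeral[path]
            del self.owner[path]
            for callback in self.watches.pop(path, []):
                callback(path)


events: List[str] = []
zk = FakeZooKeeper()
zk.create_ephemeral(session_id=1, path="/brokers/1",
                    data={"host": "10.10.10.1:9092", "rack": "rack-1"})
# The controller watches the broker's node and reacts when it disappears.
zk.watch("/brokers/1",
         lambda path: events.append(f"Broker {path.rsplit('/', 1)[-1]} is offline"))
zk.expire_session(1)
print(events)  # ['Broker 1 is offline']
```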

Network Partition Resilience

Case 1: Total partition

Case 2: Broker partition

Case 3: ZK partition

Case 4: Controller partition

Metadata Inconsistency

Metadata Source of Truth

Metadata Cache
- sync writes
- async updates

Metadata Cache
- async update

Metadata Cache
- async update

Last Resort:
> rmr /controller

New controller!

Load ALL Metadata

Push ALL Metadata

How do you know the metadata has diverged?

Performance of Controller Initialization

New controller!

Load ALL Metadata
Complexity: O(N)
N = number of partitions

Push ALL Metadata
Complexity: O(N*M)
N = number of partitions
M = number of brokers
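To make the two complexity figures concrete, here is a small back-of-the-envelope sketch. It counts one unit of work per partition record. The cluster sizes are made-up inputs, not measurements from the talk.

```python
def controller_failover_cost(num_partitions: int, num_brokers: int) -> dict:
    """Count work units for the two failover steps, one unit per partition record."""
    load = num_partitions                 # O(N): read every partition's state
    push = num_partitions * num_brokers   # O(N*M): send every partition to every broker
    return {"load": load, "push": push}


# Hypothetical cluster: 100,000 partitions spread over 50 brokers.
cost = controller_failover_cost(num_partitions=100_000, num_brokers=50)
print(cost)  # {'load': 100000, 'push': 5000000}
```

Doubling the broker count leaves the load step unchanged but doubles the push step.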


Metadata as an Event Log

- Each change becomes a message
- Changes are propagated to all brokers

- Clear ordering
- Can send deltas
- Offset tracks consumer position
- Easy to measure lag

...
924 Create topic “foo”
925 Delete topic “bar”
926 Add node 4 to the cluster
927 Create topic “baz”
928 Alter ISR for “foo-0”
929 Add node 5 to the cluster
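The properties listed above (ordering, deltas, offset as position, measurable lag) can be illustrated with a short sketch that replays the example entries. The tuple encoding of records and the `MetadataReplica` class are assumptions made for illustration, not the real metadata record format.

```python
from typing import List, Set, Tuple

# Records mirror the slide's example entries; the tuple encoding is hypothetical.
LOG: List[Tuple[int, str, str]] = [
    (924, "create_topic", "foo"),
    (925, "delete_topic", "bar"),
    (926, "add_node", "4"),
    (927, "create_topic", "baz"),
    (928, "alter_isr", "foo-0"),
    (929, "add_node", "5"),
]


class MetadataReplica:
    def __init__(self, topics: Set[str], nodes: Set[str], next_offset: int) -> None:
        self.topics, self.nodes, self.next_offset = set(topics), set(nodes), next_offset

    def apply(self, offset: int, kind: str, arg: str) -> None:
        if kind == "create_topic":
            self.topics.add(arg)
        elif kind == "delete_topic":
            self.topics.discard(arg)
        elif kind == "add_node":
            self.nodes.add(arg)
        # "alter_isr" would update partition state; omitted in this sketch.
        self.next_offset = offset + 1

    def catch_up(self, log: List[Tuple[int, str, str]]) -> int:
        """Apply only records at or after next_offset; return how many were applied."""
        delta = [record for record in log if record[0] >= self.next_offset]
        for record in delta:
            self.apply(*record)
        return len(delta)


def lag(replica: MetadataReplica, log: List[Tuple[int, str, str]]) -> int:
    """Number of log records the replica has not applied yet."""
    return log[-1][0] + 1 - replica.next_offset


# A replica that has already applied everything up to and including offset 926.
replica = MetadataReplica(topics={"foo"}, nodes={"1", "2", "3", "4"}, next_offset=927)
lag_before = lag(replica, LOG)        # 3
applied = replica.catch_up(LOG)       # 3 (offsets 927, 928, 929)
print(lag_before, applied, sorted(replica.topics), sorted(replica.nodes), lag(replica, LOG))
```

Because the replica's position is a single offset, the delta it needs and its lag both fall out of simple arithmetic.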

Consumer    Consumer    Consumer
offset=3    offset=1    offset=2

Broker      Broker      Broker
offset=3    offset=1    offset=2

Controller

Implementing the Controller Log

Can we use the existing Kafka log replication protocol?
- How do we elect the leader?

We need a self-managed quorum.

Enter Raft.

Leader election is by simple majority.
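A minimal sketch of "election by simple majority" follows. It covers only the vote tally for a single term; real Raft also compares terms and log freshness before a vote is granted, and none of that is modeled here. The function name and vote encoding are hypothetical.

```python
from typing import Dict, Optional


def elect_leader(votes: Dict[int, int], quorum_size: int) -> Optional[int]:
    """Return the candidate holding a strict majority of the quorum, if any.

    `votes` maps voter id -> candidate id for one term. Voters that are
    unreachable simply do not appear in the mapping.
    """
    tally: Dict[int, int] = {}
    for candidate in votes.values():
        tally[candidate] = tally.get(candidate, 0) + 1
    majority = quorum_size // 2 + 1
    for candidate, count in tally.items():
        if count >= majority:
            return candidate
    return None


# Three-node quorum: node 1 gets two of three votes and wins.
print(elect_leader({1: 1, 2: 1, 3: 3}, quorum_size=3))  # 1
# Five-node quorum with only two reachable voters: nobody can win.
print(elect_leader({1: 1, 3: 3}, quorum_size=5))        # None
```

The second call shows why a minority side of a partition cannot elect its own leader.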

                     Kafka                             Raft
Writes               Single Leader                     Single Leader
Fencing              Monotonically increasing epoch    Monotonically increasing term
Log reconciliation   Offset and epoch                  Term and index
Push/Pull            Pull                              Push
Commit Semantics     ISR                               Majority
Leader Election      From ISR through ZooKeeper        Majority
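The "Commit Semantics" comparison (ISR versus majority) can be sketched as two small functions. This is a simplified illustration: offsets and indexes are treated as the same unit, and Raft's extra rule that a committed entry must belong to the leader's current term is left out. The function names are hypothetical.

```python
from typing import Dict, Set


def isr_high_watermark(log_end_offsets: Dict[int, int], isr: Set[int]) -> int:
    """Kafka-style: a record is committed once every in-sync replica has it."""
    return min(log_end_offsets[replica] for replica in isr)


def majority_commit_index(match_index: Dict[int, int]) -> int:
    """Raft-style: the highest position stored on a majority of the voters."""
    ordered = sorted(match_index.values(), reverse=True)
    majority = len(ordered) // 2 + 1
    return ordered[majority - 1]


# Replica id -> how far that replica's log extends (made-up numbers).
replicas = {1: 10, 2: 9, 3: 4}
print(isr_high_watermark(replicas, isr={1, 2}))     # 9: slow replica 3 is out of the ISR
print(isr_high_watermark(replicas, isr={1, 2, 3}))  # 4: held back by the slowest ISR member
print(majority_commit_index(replicas))              # 9: two of three replicas have it
```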

The Controller Quorum

[Diagram: three Controllers; two Brokers, at offset=1 and offset=2]

The Controller Raft Quorum
- The leader is the active controller
- Controls reads / writes to the log
- Typically 3 or 5 nodes, like ZK

Instant Failover
- Low-latency failover via Raft election
- Standbys contain all data in memory
- Brokers do not need to re-fetch

Metadata Caching
- Brokers can persist metadata to disk
- Only fetch what they need
- Use snapshots if we’re too far behind

[Diagram: three Controllers; two Brokers with on-disk metadata at /mnt/logs/kafka/metadata, offset=1 and offset=2]
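One way to picture "only fetch what they need" and "use snapshots if we’re too far behind" is a controller that keeps a truncated log plus one snapshot. It answers a fetch with a delta when the requested offset is still in the log and with the snapshot otherwise. `ToyController`, its fields, and the record shape are hypothetical.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, str, str]  # (offset, key, value)


class ToyController:
    """Hypothetical controller that keeps a truncated log plus one snapshot."""

    def __init__(self, snapshot_offset: int, snapshot_state: Dict[str, str],
                 log: List[Record]) -> None:
        self.snapshot_offset = snapshot_offset  # snapshot covers offsets below this
        self.snapshot_state = snapshot_state
        self.log = log                          # records at or after snapshot_offset

    def fetch(self, from_offset: int) -> tuple:
        """Return ("delta", records) when possible, else ("snapshot", state, offset)."""
        if from_offset < self.snapshot_offset:
            # The broker is too far behind: the records it needs were truncated.
            return ("snapshot", dict(self.snapshot_state), self.snapshot_offset)
        return ("delta", [record for record in self.log if record[0] >= from_offset])


controller = ToyController(
    snapshot_offset=100,
    snapshot_state={"foo": "created"},
    log=[(100, "baz", "created"), (101, "foo", "deleted")],
)
print(controller.fetch(101))  # ('delta', [(101, 'foo', 'deleted')])
print(controller.fetch(7))    # ('snapshot', {'foo': 'created'}, 100)
```

A broker that persisted its last applied offset to disk can restart and ask for the delta instead of reloading everything.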

Broker Registration
- Building a map of the cluster
- What brokers exist in the cluster?
- How can they be reached?

Broker Registration
- Brokers send heartbeats to the active controller
- The controller uses this to build a map of the cluster
- The controller also tells brokers if they should be fenced or shut down

Fencing
- Brokers need to be fenced if they’re partitioned from the controller, or can’t keep up
- Brokers self-fence if they can’t talk to the controller
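The heartbeat and fencing rules above can be sketched with explicit timestamps. The timeout value, class name, method names, and the second broker's address are assumptions for illustration and are not taken from KIP-500.

```python
from typing import Dict


class ToyActiveController:
    """Tracks broker heartbeats; the timeout and method names are hypothetical."""

    def __init__(self, session_timeout: float) -> None:
        self.session_timeout = session_timeout
        self.endpoints: Dict[int, str] = {}        # the map of the cluster
        self.last_heartbeat: Dict[int, float] = {}

    def heartbeat(self, broker_id: int, endpoint: str, now: float) -> None:
        self.endpoints[broker_id] = endpoint
        self.last_heartbeat[broker_id] = now

    def is_fenced(self, broker_id: int, now: float) -> bool:
        last = self.last_heartbeat.get(broker_id)
        return last is None or now - last > self.session_timeout


def broker_should_self_fence(last_ack: float, now: float, session_timeout: float) -> bool:
    # Broker-side rule: without a recent controller response, stop serving.
    return now - last_ack > session_timeout


controller = ToyActiveController(session_timeout=10.0)
controller.heartbeat(1, "10.10.10.1:9092", now=0.0)
controller.heartbeat(2, "10.10.10.2:9092", now=0.0)
controller.heartbeat(1, "10.10.10.1:9092", now=8.0)  # broker 2 goes silent

print(controller.is_fenced(1, now=15.0))  # False: last heartbeat 7s ago
print(controller.is_fenced(2, now=15.0))  # True: last heartbeat 15s ago
print(broker_should_self_fence(last_ack=0.0, now=15.0, session_timeout=10.0))  # True
```

Both sides apply the same timeout, so a partitioned broker stops serving at about the time the controller starts treating it as fenced.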

Handling network partitions

Case 1: Total partition

Case 2: Broker partition

Case 3: Controller partition

Deployment

                       Current                                    KIP-500
Configuration File     Kafka and ZooKeeper                        Kafka
Metrics                Kafka and ZK                               Kafka
Administrative Tools   ZK Shell, Four letter words, Kafka tools   Kafka tools
Security               Kafka and ZK                               Kafka

Shared Controller Nodes
- Fewer resources used
- Single node clusters (eventually)

Separate Controller Nodes
- Better resource isolation
- Good for big clusters

Roadmap

Remove Client-side ZK dependencies
Incremental KIP-4 Improvements
- Create new APIs
- Deprecate direct ZK access

Remove Broker-side ZK dependencies
Broker-Side Fixes
- Remove deprecated direct ZK access for tools
- Create broker-side APIs
- Centralize ZK access in the controller

Controller Quorum
First Release without ZooKeeper
- Raft
- Controller quorum

Upgrade Issues
- Tools using ZK
- Brokers accessing ZK
- State in ZK

[Diagram: Older Kafka Release, KIP-500 Release]

Bridge Release
- No ZK access from tools, brokers (except controller)

[Diagram: Older Kafka Release, Bridge Release, KIP-500 Release]

Upgrading
- Starting from the bridge release
- Start new controller nodes (possibly combined)
- Quorum elects leader
- Claims leadership in ZK
- Roll nodes one by one as usual
- Controller continues sending LeaderAndIsr, etc. to old nodes
- When all brokers have been rolled, decommission ZK nodes

Conclusion

Apache ZooKeeper has served us well
- KIP-500 is not a 1:1 replacement, but a different paradigm

We have already started removing ZK from clients
- Consumer, AdminClient
- Improved encapsulation, security, upgradability

Metadata should be managed as a log
- Deltas, ordering, caching
- Controller Failover, Fencing
- Improved scalability, robustness, easier deployment

The metadata log must be self-managed
- Raft
- Controller quorum

It will take a few releases to implement KIP-500
- Additional KIPs for APIs, Raft, Metadata, etc.

Rolling upgrades will be supported
- Bridge release
- Post-ZK release

Kafka needs no Keeper

cnfl.io/meetups
cnfl.io/blog
cnfl.io/slack

THANK YOU

Colin McCabe