zookeeper - institute of electrical and electronics engineers · zookeeper servers zookeeper...

34
1 ZooKeeper Yahoo! Research NFIC 2010

Upload: others

Post on 29-Jul-2020

14 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

1

ZooKeeper

Yahoo! ResearchNFIC 2010

Page 2: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

2

Cloud Computing

• Elastic – size changes dynamically

• Scale – large number of servers

Page 3: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

3

Coordination is important

Page 4: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

4

Coordination is important

Page 5: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Internet-scale Challenges

• Lots of servers, users, data

• FLP, CAP

• Mere mortal programmers

Page 6: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Classic Distributed System

Master

Slave SlaveSlaveSlaveSlaveSlave

Page 7: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Fault Tolerant Distributed System

Master

Slave SlaveSlaveSlaveSlaveSlave

CoordinationService

Master

Page 8: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Fault Tolerant Distributed System

Master

Slave SlaveSlaveSlaveSlaveSlave

CoordinationService

Master

Page 9: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Fully Distributed System

Worker WorkerWorkerWorkerWorkerWorker

CoordinationService

Page 10: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

What is coordination?

• Group membership

• Leader election

• Dynamic Configuration

• Status monitoring

• Queuing

• Barriers

• Critical sections

Page 11: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Goals

• Been done in the past

– ISIS, distributed locks (Chubby, VMS)

• High Performance

– Multiple outstanding ops

– Read dominant

• General (Coordination Kernel)

• Reliable

• Easy to use

Page 12: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

wait-free

• Pros

– Slow processes cannot slow down fast ones

– No deadlocks

– No blocking in the implementations

• Cons

– Some coordination primitives are blocking

– Need to be able to efficiently wait for conditions

Page 13: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Serializable vs Linearizability

• Linearizable writes

• Serializable read (may be stale)

• Client FIFO ordering

Page 14: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Change Events

• Clients request change notifications

• Service does timely notifications

• Do not block write requests

• Clients get notification of a change before they see the result of a change

Page 15: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Solution

Order + wait-free + change events = coordination

Page 16: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

ZooKeeper API

String create(path, data, acl, flags)

void delete(path, expectedVersion)

Stat setData(path, data, expectedVersion)

(data, Stat) getData(path, watch)

Stat exists(path, watch)

String[] getChildren(path, watch)

void sync()

Stat setACL(path, acl, expectedVersion)

(acl, Stat) getACL(path)

Page 17: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Data Model

● Hierarchal namespace (like a file system)

● Each znode has data and children

● data is read and written in its entirety

/

services

users

apps

locks

workers

YaView

s-1

worker2

worker1

Page 18: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Create Flags

• Ephemeral: znode deleted when creator fails or explicitly deleted

• Sequence: append a monotonically increasing counter

/

services

users

apps

locks

workers

YaView

s-1

worker2

worker1

Ephemeralscreated bySession X

Sequenceappendedon create

Page 19: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Configuration

• Workers get configuration

– getData(“.../config/settings”, true)

• Administrators change the configuration

– setData(“.../config/settings”, newConf, -1)

• Workers notified of change and get the new settings

– getData(“.../config/settings”, true)

config

settings

Page 20: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Group Membership

• Register serverName in group

– create(“.../workers/workerName”, hostInfo, EPHEMERAL)

• List group members

– listChildren(“.../workers”, true)

workers

worker2

worker1

Page 21: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Leader Election

1 getData(“.../workers/leader”, true)

2 if successful follow the leader described in the data and exit

3 create(“.../workers/leader”, hostname, EPHEMERAL)

4 if successful lead and exit

5 goto step 1

workers

worker2

worker1

If a watch is triggered for“.../workers/leader”, followers willrestart the leader election process

leader

Page 22: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Locks

1 id = create(“.../locks/x-”, SEQUENCE|EPHEMERAL)

2 getChildren(“.../locks”/, false)

3 if id is the 1st child, exit

4 exists(name of last child before id, true)

5 if does not exist, goto 2)

6 wait for event

7 goto 2)

locks

x-19

x-11

x-20

Each znode watches one other.No herd effect.

Page 23: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Shared Locks

1 id = create(“.../locks/s-”, SEQUENCE|EPHEMERAL)

2 getChildren(“.../locks”/, false)

3 if no children that start with x- before id, exit

4 exists(name of the last x- before id, true)

5 if does not exist, goto 2)

6 wait for event

7 goto 2)

locks

x-19

s-11

x-20

x-19x-19

s-21

x-22

s-20

Page 24: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

ZooKeeper Servers

ZooKeeper Service

ServerServer ServerServerServerServer

• All servers have a copy of the state in memory

• A leader is elected at startup

• Followers service clients, all updates go through leader

• Update responses are sent when a majority of servers have persisted the change

We need 2f+1 machines to tolerate f failures

Page 25: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

ZooKeeper Servers

ZooKeeper Service

ServerServer ServerServerServerServer

Client ClientClientClientClientClient ClientClient

Leader

Page 26: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

26

Evaluation & Experience

Page 27: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

27

Evaluation

• Cluster of 50 servers

• Xeon dual-core 2.1 GHz

• 4 GB of RAM

• Two SATA disks

Page 28: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

28

Latency

Page 29: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

29

Throughput

Page 30: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

30

Load in production (Fetching Service)

Page 31: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

31

At Yahoo!...

●Has been used for:

● Cross-datacenter locks (wide-area locking)

● Web crawling

● Large-scale publish-subscribe (Hedwig: ZOOKEEPER-775)

● Portal front-end

●Largest cluster I’m aware of● Thousands of clients

Page 32: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

32

Faults in practice

●Bugzilla

● Ticket system for software defects, improvements, etc

●Fetching service queue

● Over 2 years running

● 9 tickets reporting issues with ZooKeeper

Page 33: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

33

Faults in practice

● Misconfiguration: 5 issues● System configuration, not ZK

● E.g., misconfigured net cards, DNS clash

● Application bugs: 2 issues

● Misunderstanding of the API semantics

● E.g., race condition using async api

● ZK bugs: 2 issues

● Really our fault

● API and server (affected all)

Page 34: ZooKeeper - Institute of Electrical and Electronics Engineers · ZooKeeper Servers ZooKeeper Service Server Server Server Server Server • All servers have a copy of the state in

Summary

• Easy to use

• High Performance

• General

• Reliable

• Release 3.3 on Apache

– See http://hadoop.apache.org/zookeeper

– Committers from Yahoo! and Cloudera