The Chubby Lock Service for Loosely-coupled Distributed Systems

Mike Burrows, Google Inc

Presented by Xin (Joyce) Zhan

Post on 16-Jan-2016

Outline

• Design
  – System structure
  – Locks, caching, failovers
  – Scaling mechanism
• Use and observations
  – As name service
  – Failover problems

Lock service for distributed systems

• Synchronize access to shared resources
• Other uses
  – Primary election, meta-data storage, name service
• Reliability, availability

System Structure

• Set of replicas
• Periodically elected master
  – Master lease
  – Paxos protocol
• All client requests are directed to master
  – updates propagated to replicas
• Replace failed replicas
  – master periodically polls DNS

Design

• Store small files
• Event notification mechanism
• Consistent caching
• Advisory locks (vs. mandatory)
  – conflict only when others attempt to acquire the same lock
• Coarse-grained locks
  – survive lock server failures
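
The advisory-lock bullet can be illustrated with a minimal Python sketch. The `Node` class and its method names are hypothetical and are not Chubby's API. The point is only that holding the lock blocks other acquirers, while access to the file itself never consults the lock.

```python
class Node:
    """A small file that doubles as an advisory lock (hypothetical toy model)."""

    def __init__(self):
        self.contents = b""
        self.lock_holder = None

    def try_acquire(self, client):
        # A conflict arises only when a different client already holds the lock.
        if self.lock_holder not in (None, client):
            return False
        self.lock_holder = client
        return True

    def release(self, client):
        if self.lock_holder == client:
            self.lock_holder = None

    def write(self, client, data):
        # Advisory: the lock is deliberately NOT checked here.
        # Well-behaved clients are expected to acquire it first.
        self.contents = data


node = Node()
print(node.try_acquire("client-a"))   # client-a gets the lock
print(node.try_acquire("client-b"))   # client-b conflicts with client-a
node.write("client-b", b"still allowed")
```

A mandatory lock would instead reject the final `write`; the advisory design leaves that discipline to the clients.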

Design - File Interface

• Ease distribution
  – /ls/foo/wombat/pouch
• Node meta-data includes Access Control Lists
• Handle
  – analogous to UNIX file descriptors
  – support for use across master changes

Design - Sequencer for lock

• Delayed / out-of-order messages
  – introduce sequence numbers into interactions that use locks
  – lock holder requests a sequencer, passes it to the file server to validate
• Alternative
  – lock-delay
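
The sequencer idea can be sketched in Python under some loud assumptions: the `Sequencer` fields, the `LockService` and `FileServer` classes, and the lock path are all hypothetical, and the toy `acquire` simply models the lock changing hands after the previous holder's session has failed. What it shows is a server rejecting a delayed request that carries a stale sequencer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sequencer:
    """Token describing the lock's state right after acquisition (assumed layout)."""
    lock_name: str
    generation: int   # bumped every time the lock changes hands


class LockService:
    """Toy lock service that tracks the current generation of each lock."""

    def __init__(self):
        self._generation = {}

    def acquire(self, lock_name, client):
        # Models a hand-over: the previous holder is assumed to have lost the lock.
        gen = self._generation.get(lock_name, 0) + 1
        self._generation[lock_name] = gen
        return Sequencer(lock_name, gen)

    def is_current(self, seq):
        return self._generation.get(seq.lock_name) == seq.generation


class FileServer:
    """Validates the sequencer before acting on a request."""

    def __init__(self, lock_service):
        self._locks = lock_service
        self.log = []

    def write(self, seq, data):
        if not self._locks.is_current(seq):
            return False          # stale sequencer: delayed or out-of-order request
        self.log.append(data)
        return True


locks = LockService()
server = FileServer(locks)
old = locks.acquire("/ls/cell/primary", "client-a")
new = locks.acquire("/ls/cell/primary", "client-b")   # lock changed hands
print(server.write(old, "late write from a"))          # rejected
print(server.write(new, "write from b"))               # accepted
```

Lock-delay addresses the same problem without touching the servers: the lock service simply refuses to hand a lock to a new holder for some time after the old holder becomes unreachable.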

Design - Events

• Client subscribes when creating handle
• Delivered asynchronously via up-call from client library
• Event types
  – file contents modified
  – child node added / removed / modified
  – Chubby master failed over
  – handle / lock have become invalid
  – lock acquired / conflicting lock request (rarely used)

Design - Caching

• Clients cache file data and meta-data
  – consistent, write-through
• Invalidation
  – master keeps list of what clients may have cached
  – master sends invalidations on top of KeepAlive
  – clients flush changed data, acknowledge with KeepAlive
  – master proceeds with the modification only after invalidation
• Clients also cache open handles and locks
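
A rough Python sketch of the invalidation scheme follows. The `Master` and `Client` classes are hypothetical, and the invalidation is a direct synchronous call here, whereas the slides describe it as piggybacked on KeepAlive replies and acknowledged by the next KeepAlive. The ordering is the part being illustrated: caches are flushed before the write is applied.

```python
class Master:
    """Toy master that remembers which clients may have cached each file."""

    def __init__(self):
        self._files = {}
        self._cachers = {}    # path -> set of clients that may hold a cached copy

    def read(self, client, path):
        self._cachers.setdefault(path, set()).add(client)
        return self._files.get(path)

    def write(self, path, data):
        # Invalidate every possible cached copy BEFORE applying the change.
        for client in self._cachers.pop(path, set()):
            client.invalidate(path)
        self._files[path] = data


class Client:
    """Toy client with a consistent, write-through cache."""

    def __init__(self, master):
        self._master = master
        self._cache = {}
        self.master_reads = 0

    def read(self, path):
        if path not in self._cache:
            self.master_reads += 1
            self._cache[path] = self._master.read(self, path)
        return self._cache[path]

    def write(self, path, data):
        self._master.write(path, data)   # write-through: nothing is buffered locally

    def invalidate(self, path):
        self._cache.pop(path, None)


master = Master()
writer, reader = Client(master), Client(master)
writer.write("/ls/cell/config", "v1")
print(reader.read("/ls/cell/config"), reader.master_reads)
print(reader.read("/ls/cell/config"), reader.master_reads)   # served from cache
writer.write("/ls/cell/config", "v2")                        # invalidates reader's copy
print(reader.read("/ls/cell/config"), reader.master_reads)
```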

Design - Sessions

• Session maintained through KeepAlives
  – handles, locks, cached data remain valid
  – lease
• Lease timeout advanced when
  – a session is created
  – a master fail-over occurs
  – the master responds to a KeepAlive RPC

Design - KeepAlive

• Master responds close to lease timeout
• Client sends another KeepAlive immediately
• Client maintains local lease timeout
  – conservative approximation
• When local lease expires
  – disable cache
  – session in jeopardy, client waits in grace period
  – cache enabled on reconnect
• Application informed about session changes
  – jeopardy / safe / expired events
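
The client-side behaviour above can be sketched as a small state machine in Python. The `ClientSession` class, its method names, and the default lease and grace-period values are assumptions made for illustration. Time is passed in explicitly so the example is deterministic, and the new lease is measured from when the KeepAlive was sent, which is one way to keep the local timeout conservative.

```python
class ClientSession:
    """Client-side session view: safe -> jeopardy -> expired (toy model)."""

    def __init__(self, now, lease_length=12.0, grace_period=45.0):
        # Both durations are illustrative defaults, in seconds.
        self.lease_length = lease_length
        self.grace_period = grace_period
        self.lease_expiry = now + lease_length
        self.state = "safe"
        self.cache_enabled = True
        self.events = []          # what the application would be told

    def tick(self, now):
        """Advance the state machine to time `now`."""
        if self.state == "safe" and now >= self.lease_expiry:
            self.state = "jeopardy"
            self.cache_enabled = False        # cache disabled while in jeopardy
            self.events.append("jeopardy")
        if self.state == "jeopardy" and now >= self.lease_expiry + self.grace_period:
            self.state = "expired"
            self.events.append("expired")

    def on_keepalive_reply(self, sent_at, now):
        """Handle a KeepAlive reply that was requested at `sent_at`."""
        self.tick(now)
        if self.state == "expired":
            return
        # Conservative: count the lease from when the request left the client.
        self.lease_expiry = sent_at + self.lease_length
        if self.state == "jeopardy":
            self.state = "safe"
            self.cache_enabled = True         # cache re-enabled on reconnect
            self.events.append("safe")


session = ClientSession(now=0.0)
session.tick(20.0)                              # no reply arrived: jeopardy
session.on_keepalive_reply(sent_at=29.0, now=30.0)
print(session.state, session.events)
```

If no reply arrives before the grace period ends, `tick` moves the session to `expired` and the application sees the corresponding event.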

Design - Failovers

• In-memory state discarded
  – sessions, handles, locks, etc.
• Lease timer “stops”
• Fast master election
  – clients reconnect before lease expires
• Slow master election
  – clients flush cache, enter grace period
• New master reconstructs a conservative approximation of the previous master's in-memory state

Design - Failovers

Steps of newly-elected master:
• Pick new epoch number
• Respond only to master location requests
• Build in-memory state for sessions / locks from database
• Respond to KeepAlives
• Emit fail-over events to sessions, flush caches
• Wait for acknowledgements / session expiry
• Allow all operations to proceed
• Allow clients to use handles created before fail-over
• Delete ephemeral files w/o open handles after an interval

Design - Backup and Mirroring

• Master writes snapshots every few hours
  – GFS server in different building
• Collection of files mirrored across cells
  – /ls/global/master mirrored to /ls/cell/slave
• Mostly for configuration files
  – Chubby’s own ACLs
  – files advertising presence / location
  – pointers to Bigtable cells

Design - Scaling Mechanisms

• 90,000 clients communicate with one cell
• Regulate the number of Chubby cells
  – clients use a nearby cell
• Increase lease time
• Client caching
• Protocol-conversion servers

Scaling - Proxies

• Proxies pass requests from clients to cell
• Reduce traffic of KeepAlive and read requests
  – not writes, but writes << 1% of workload
  – KeepAlive traffic is by far the most dominant
• Overheads:
  – additional RPC for writes / first-time reads
  – increased probability of unavailability

Scaling - Partitioning

• Namespace of a cell partitioned between servers
• N partitions, each with master and replicas
  – node D/C stored on P(D/C) = hash(D) mod N
  – meta-data for D may be on a different partition
• Little cross-partition communication
• Reduces R/W traffic, not necessarily KeepAlive traffic
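
The placement rule P(D/C) = hash(D) mod N can be written out as a short Python sketch. The function name is hypothetical and SHA-256 is an arbitrary choice made here to get a stable hash; the slides do not say which hash function is used.

```python
import hashlib
import posixpath


def partition_for(node_path: str, num_partitions: int) -> int:
    """Return the partition for node D/C, computed as hash(D) mod N."""
    directory = posixpath.dirname(node_path)          # D, the parent directory
    digest = hashlib.sha256(directory.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Siblings share a parent directory, so they always land on the same partition.
print(partition_for("/ls/cell/app/leader", 4) == partition_for("/ls/cell/app/config", 4))

# The directory node itself is placed by hashing ITS parent, so its meta-data
# may live on a different partition than its children.
print(partition_for("/ls/cell/app", 4), partition_for("/ls/cell/app/leader", 4))
```

Because the hash is taken over the parent directory, listing or locking siblings stays within one partition, which fits the "little cross-partition communication" bullet.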

Use and Observations

• Many files for naming
• Config, ACL, meta-data common
• 10 clients use each cached file, on avg.
• Few locks held, no shared locks
• KeepAlives dominate RPC traffic

Use as Name Service

• DNS uses TTL values
  – entries must be refreshed within that time
  – huge (and variable) load on DNS server
• Chubby’s caching uses invalidations, no polling
  – client builds up needed entries in cache
  – name entries further grouped in batches

Failover problems

• Master writes sessions to DB when created
  – overload when many processes start at once
• Instead, store session at first modification / lock acquisition etc.
• Active sessions recorded with some probability on KeepAlive
  – spreads out writes in time
  – young read-only sessions may be discarded in a fail-over
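
A minimal Python sketch of the revised policy, assuming a hypothetical `SessionRecorder` class and a made-up probability parameter: sessions are written to the database on their first modification, and otherwise only with some probability on each KeepAlive, so that the writes are spread out in time.

```python
import random


class SessionRecorder:
    """Toy model of lazily recording sessions in the master's database."""

    def __init__(self, record_probability, rng=None):
        self.record_probability = record_probability    # assumed tunable
        self.rng = rng if rng is not None else random.Random()
        self.database = set()

    def on_modification(self, session_id):
        # First modification or lock acquisition: always record the session.
        self.database.add(session_id)

    def on_keepalive(self, session_id):
        # Read-only sessions are recorded only occasionally, which spreads
        # database writes out instead of bunching them at process start-up.
        if session_id not in self.database:
            if self.rng.random() < self.record_probability:
                self.database.add(session_id)


recorder = SessionRecorder(record_probability=0.1, rng=random.Random(0))
recorder.on_modification("session-writer")
for _ in range(50):
    recorder.on_keepalive("session-reader")
print(sorted(recorder.database))
```

A read-only session that has not yet been picked is absent from the database, which is why a young read-only session can be lost in a fail-over.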

Failover problems

• New design
  – do not record sessions in database
  – recreate them like handles after fail-over
  – new master waits full lease time before operations proceed

Lessons learnt

• Developers rarely consider availability
  – should plan for short Chubby outages
• Fine-grained locking not essential
• Poor API choices
  – handles acquiring locks cannot be shared
• RPC use affects transport protocols
  – forced to send KeepAlives by UDP for timeliness

Q & A