distributed distributed systemscs.brown.edu/courses/csci1380/s20/lectures/l22_2020.pdf · 2020. 4....

Distributed Distributed Systems L22: Distributed File Systems Theophilus Benson CS1380 Spring 20


Post on 19-Sep-2020


Page 1: Distributed Distributed Systemscs.brown.edu/courses/csci1380/s20/lectures/L22_2020.pdf · 2020. 4. 30. · Distributed Distributed Systems L22: Distributed File Systems Theophilus

Distributed Distributed Systems

L22: Distributed File Systems
Theophilus Benson
CS1380 Spring 20


Today's Agenda

• General Distributed File Systems

• Industry Use Cases
  • Google File System (GFS)

• Next Class
  • MongoDB (Guest lecture)
  • Kafka (LinkedIn's Queue Processing)


What is a File?

• A blob of binary?

• A set of blobs?
  • Think of a book: table of contents + chapters
  • Index -> inode (maps ranges to data blocks)
  • Chapters -> data blocks

• How about directories?
• How about file permissions?

[Figure: File1 drawn as a single binary blob (1010101010100100), and File1 drawn as two data blocks, Data1 and Data2, each a binary blob]
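The "index -> inode, chapters -> data blocks" picture can be sketched in a few lines. This is only an illustration under assumptions that are not in the slides: the `BlockStore` class, the tiny `BLOCK_SIZE`, and the in-memory dictionary standing in for a disk are all hypothetical.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # bytes per data block; deliberately tiny (assumed value)


@dataclass
class Inode:
    """Index for one file: an ordered list of data-block IDs plus the length."""
    block_ids: list = field(default_factory=list)
    length: int = 0


class BlockStore:
    """Hypothetical in-memory stand-in for the disk's data blocks."""

    def __init__(self):
        self.blocks = {}
        self.next_id = 0

    def write_file(self, data: bytes) -> Inode:
        """Split data into fixed-size blocks and record them in a new inode."""
        inode = Inode(length=len(data))
        for off in range(0, len(data), BLOCK_SIZE):
            self.blocks[self.next_id] = data[off:off + BLOCK_SIZE]
            inode.block_ids.append(self.next_id)
            self.next_id += 1
        return inode

    def read(self, inode: Inode, offset: int, size: int) -> bytes:
        """Map the byte range [offset, offset + size) to blocks via the inode."""
        end = min(offset + size, inode.length)
        out = bytearray()
        pos = offset
        while pos < end:
            block = self.blocks[inode.block_ids[pos // BLOCK_SIZE]]
            start = pos % BLOCK_SIZE
            take = min(BLOCK_SIZE - start, end - pos)
            out += block[start:start + take]
            pos += take
        return bytes(out)


store = BlockStore()
file1 = store.write_file(b"chapter-one;chapter-two")
```

A read that spans several blocks, such as `store.read(file1, 8, 7)`, is served by looking up each block through the inode, which is the role the table of contents plays in the book analogy.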


What is a Directory?

• Directory -> maps names to file IDs
• A directory can also contain directories
• In Linux, a directory is also a file

Root Directory
• File1 -> IDX
• File2 -> IDY
• Dir1 -> IDZ

Dir1
• File3 -> IDM
• File4 -> IDC

[Figure: File1 -> Data1, Data2; File5 -> Data8, Data9]
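Since a directory is just a map from names to file IDs, path lookup reduces to a chain of dictionary lookups. The sketch below assumes numeric IDs in place of the slide's IDX/IDY/IDZ placeholders; the `directories` table and `resolve` function are hypothetical names.

```python
# A directory maps names to file IDs; a directory is itself a file with an ID.
ROOT_ID = 0  # hypothetical ID of the root directory

directories = {
    0: {"File1": 10, "File2": 11, "Dir1": 12},  # root directory
    12: {"File3": 20, "File4": 21},             # Dir1
}


def resolve(path: str) -> int:
    """Walk the path one component at a time, starting at the root."""
    current = ROOT_ID
    for name in path.strip("/").split("/"):
        if not name:
            continue  # empty component, e.g. the path "/"
        if current not in directories:
            raise NotADirectoryError(path)
        entries = directories[current]
        if name not in entries:
            raise FileNotFoundError(path)
        current = entries[name]
    return current
```

For example, `resolve("/Dir1/File3")` first finds the ID of `Dir1` in the root directory, then finds `File3` inside `Dir1`.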


What is a File System?

• File system -> a system that manages files

• Provides
  • API for applications to interact with files
  • Algorithms for securing files (access control)
  • Metadata maintenance for each file

[Figure: Applications -> API -> File System -> File1 ... FileN]

File Metadata
• File length (size)
• Timestamp
• Location
• Reference count
• Type
• Access control
• Owner

[Figure labels: Modifiable by app / Modifiable by file system]


Distributed File Systems (DFS)


Local Versus Distributed File System

• Failure implications
  • Local: all components are down
  • Distributed: only some components are down; the others keep operating

• Performance implications
  • Local: interactions are function calls -> very fast
  • Distributed: interactions are RPC calls -> variable speed

[Figure: a client with local storage, and a client connected to three storage servers with latencies of 50ms, 50ms, and 100ms]

Semantics
• At-least-once (1 or more calls)
• At-most-once (0 or 1 calls)
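The difference between the two semantics shows up when a reply is lost and the client retries. The sketch below is a toy model, not an RPC library: the `Server` class, the request ID `"req-1"`, and the `attempts_lost` parameter are assumptions made for illustration. Server-side deduplication by request ID is one common way to get at-most-once execution.

```python
class Server:
    """Toy server with a non-idempotent operation (append)."""

    def __init__(self):
        self.log = []
        self.seen = {}  # request_id -> cached reply (used for at-most-once)

    def append(self, value):
        self.log.append(value)
        return len(self.log)

    def append_at_most_once(self, request_id, value):
        # Duplicate request: return the cached reply instead of executing again.
        if request_id not in self.seen:
            self.seen[request_id] = self.append(value)
        return self.seen[request_id]


def call_with_retry(send, attempts_lost):
    """Client retries until a reply arrives; the first `attempts_lost` replies are dropped."""
    for attempt in range(attempts_lost + 1):
        reply = send()
        if attempt >= attempts_lost:
            return reply


# At-least-once: every retry executes the call again.
plain = Server()
call_with_retry(lambda: plain.append("x"), attempts_lost=2)

# At-most-once: retries carry the same request ID and are deduplicated.
dedup = Server()
call_with_retry(lambda: dedup.append_at_most_once("req-1", "x"), attempts_lost=2)
```

After both runs, `plain.log` holds three copies of the value while `dedup.log` holds one, which is why non-idempotent operations need the stronger guarantee.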


Transparency Properties of a Distributed File System (DFS)

Client Program/API
• Access -> same API for remote/local files
• Location -> same "name" for remote/local files
• Mobility -> client should be unaware of files moving

System-level Performance
• Performance -> as the workload grows, performance stays acceptable
• Scalability -> as the number of files grows, performance stays acceptable

[Figure: a client and three storage servers with latencies of 50ms, 50ms, and 100ms]


Performance Optimizations

• Caching: client versus server side
  • Client: minimizes load on server and improves read latency
  • Server: improves performance


Server-side Caching: Write Issues

• Write-through caching
  • Write-through: on every write, write to memory -> disk -> report OK
  • Every write persists to disk before the ack, which gives poor performance but good consistency

[Figure: client writing blocks to storage]


Server-side Caching: Write Issues

• Commits
  • Commit: on file close, commit/flush all writes to disk
  • Writes stay in memory until commit -> good performance, but consistency issues (writes not yet committed can be lost on a crash)

[Figure: client writing blocks to storage, followed by a commit]
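The two server-side write policies can be compared with a toy server. This is a sketch under stated assumptions: `CachingServer`, its `crash` method, and the `disk_writes` counter are hypothetical, and dictionaries stand in for memory and disk.

```python
class CachingServer:
    """Toy storage server: `memory` is the cache, `disk` is durable storage."""

    def __init__(self, write_through: bool):
        self.write_through = write_through
        self.memory = {}
        self.disk = {}
        self.disk_writes = 0  # counts disk writes, a rough proxy for cost

    def write(self, block_id, data):
        self.memory[block_id] = data
        if self.write_through:
            self._flush(block_id)  # disk write happens before the ack
        return "OK"

    def commit(self):
        """Called on file close: flush blocks that are not yet on disk."""
        for block_id, data in self.memory.items():
            if self.disk.get(block_id) != data:
                self._flush(block_id)

    def crash(self):
        """Simulate a crash: memory is lost, disk survives."""
        self.memory = {}

    def _flush(self, block_id):
        self.disk[block_id] = self.memory[block_id]
        self.disk_writes += 1
```

Writing the same block three times costs three disk writes with write-through and a single disk write with commit-on-close, but a crash before `commit()` leaves nothing on disk in the second case.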


Performance Optimizations

• Caching: client versus server side
  • Client: minimizes load on server and improves read latency
  • Server: improves performance

• Server side:
  • Write caching: potential consistency issues
    • Commit: on file close, commit/flush all writes to disk
      • Writes stay in memory until commit -> good performance, but consistency issues
    • Write-through: on every write, write to memory -> disk -> report OK
      • Every write persists to disk before the ack: poor performance, but good consistency
  • Read caching:
    • Store recently read blocks in memory for fast access


Client-Caching: Issues

• Locks/Leases
  • Writes/reads are local provided you hold a lock/lease

• Two types of locks/leases
  • Write: only one client can hold this lock
  • Read: multiple clients can hold a read lock. When a write lock is granted, read locks are revoked

[Figure: client with a cache of blocks copied from storage]
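The read/write rules above amount to a small lock table on the server. The following is a minimal sketch; the `LockServer` class and its method names are hypothetical, and a real system would also have to notify revoked clients over the network.

```python
class LockServer:
    """Toy per-file lock table: many readers or one writer."""

    def __init__(self):
        self.readers = {}  # file -> set of clients holding a read lock
        self.writer = {}   # file -> client holding the write lock

    def acquire_read(self, file, client):
        if file in self.writer:
            return False  # a writer holds the file; the reader must wait
        self.readers.setdefault(file, set()).add(client)
        return True

    def acquire_write(self, file, client):
        """Return (granted, revoked_readers)."""
        holder = self.writer.get(file)
        if holder is not None and holder != client:
            return (False, [])  # another writer already holds the lock
        # Granting the write lock revokes all read locks; the revoked
        # clients must drop their cached copies.
        revoked = sorted(self.readers.pop(file, set()) - {client})
        self.writer[file] = client
        return (True, revoked)

    def release_write(self, file, client):
        if self.writer.get(file) == client:
            del self.writer[file]
```

While the write lock is held, new read requests are refused, so no client can cache a copy that is about to become stale.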


Client-Caching Tradeoffs

[Figure: client with a cache of blocks copied from storage]


Locks Versus Leases

• Locks: client requests, server grants
  • Client explicitly revokes/gives up the lock

• Failure recovery requires tracking locks
  • Server must track all clients (heartbeats)
  • On client failure, need a complicated procedure to recover locks (revoke locks)

• Leases: time limit on how long you can hold a resource
  • Client must periodically renew the lease
  • If the client does not renew, the lease is lost

• Failure recovery is easy
  • Server doesn't need to track clients, just leases
  • On client failure, only need to wait until the lease times out before handing off the resource to someone else

Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency

[Figure, leases: messages between client and storage: getLease(K), lease(K, 60s), renewLease(K). Periodically renew or lose access]

[Figure, locks: messages between client and storage: getLock(K), OK, revokeLock(K)]
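The lease exchange in the figure (getLease, lease(K, 60s), renewLease) can be sketched as a table of expiry times. This is an illustrative model only: the `LeaseServer` class is hypothetical, and the current time is passed in explicitly as `now` so the behaviour is deterministic.

```python
class LeaseServer:
    """Toy lease table; the caller supplies the current time in seconds."""

    def __init__(self, duration=60):
        self.duration = duration
        self.leases = {}  # key -> (holder, expiry time)

    def get_lease(self, key, client, now):
        holder, expiry = self.leases.get(key, (None, 0))
        if holder not in (None, client) and now < expiry:
            return None  # someone else holds a live lease
        self.leases[key] = (client, now + self.duration)
        return self.duration

    def renew_lease(self, key, client, now):
        holder, expiry = self.leases.get(key, (None, 0))
        if holder != client or now >= expiry:
            return None  # lease lost; the client must re-acquire it
        self.leases[key] = (client, now + self.duration)
        return self.duration
```

Note that the server never checks whether a client is alive. If a holder crashes, its lease simply expires and the next `get_lease` call from another client succeeds, which is the "failure recovery is easy" point.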


Microsoft’s Opportunistic Lock (not to be confused with optimistic locking)

https://blogs.msdn.microsoft.com/openspecification/2009/05/22/client-caching-features-oplock-vs-lease/

Opportunistic Locking

Opportunistic because the server only grants the locks if/when convenient.


Performance Optimizations

• Caching: client versus server side
  • Client: minimizes load on server and improves read latency
  • Server: improves performance

• Server side:
  • Write caching: potential consistency issues
    • Commit: on file close, commit/flush all writes to disk
      • Writes stay in memory until commit -> good performance, but consistency issues
    • Write-through: on every write, write to memory -> disk -> report OK
      • Every write persists to disk before the ack: poor performance, but good consistency
  • Read caching:
    • Store recently read blocks in memory for fast access

• Client side:
  • Locks/leases are used to balance consistency versus performance


Security and Access Control

• Approaches
  • Capabilities: the client presents a security token that encodes its permissions. The server validates that the token is correct and uses the permissions in the token to control access to resources
  • Access lists: the server maintains a list of permissions; on every access the server consults this list to verify that the client has permission

• Approaches in DFS
  • On open, validate and give the client a 'capability'
    • Client uses the 'capability' with all future requests
  • For every request, the client includes identity information

[Figure, capabilities: Client -> Storage: RPC(API + capability). The capability includes access information]

[Figure, ACL: Client -> Storage: RPC(API + credentials). Storage consults its ACL list to validate that the credentials have access to the API]


Potential Security Trade-offs: Capabilities versus ACL

• Capabilities: hard to revoke/change permissions
  • Permissions only checked at the beginning
  • Must send a revocation list and force reissue

• ACL: since the list is centralized, it is easy to change and adapt
  • Every API call uses the list, so changes are reflected on the next call

[Figure, capabilities: Client -> Storage: RPC(API + capability). The capability includes access information; a revoke list is added to the diagram]

[Figure, ACL: Client -> Storage: RPC(API + credentials). Storage consults its ACL list to validate that the credentials have access to the API]
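The revocation trade-off can be made concrete with a signed token. The sketch below is one possible way to build a capability, using an HMAC from Python's standard library; the token layout, `SERVER_KEY`, and all function names are assumptions for illustration, not the scheme used by any particular file system.

```python
import hashlib
import hmac

SERVER_KEY = b"demo-secret"  # hypothetical server-side signing key


def issue_capability(client, path, perms):
    """On open: check permissions once, then hand the client a signed token."""
    body = f"{client}|{path}|{perms}"
    tag = hmac.new(SERVER_KEY, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}|{tag}"


def check_capability(token, path, perm, revoked=frozenset()):
    """Every later request: verify the signature; no ACL lookup is needed."""
    body, _, tag = token.rpartition("|")
    expected = hmac.new(SERVER_KEY, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(tag, expected) or token in revoked:
        return False
    _client, cap_path, perms = body.split("|")
    return cap_path == path and perm in perms


acl = {("alice", "/f"): "rw"}  # server-side access list


def check_acl(client, path, perm):
    """ACL approach: consult the list on every access."""
    return perm in acl.get((client, path), "")
```

If the ACL entry is later changed to read-only, `check_acl` denies the next write immediately, but a capability issued earlier keeps working until the server adds it to a revocation list (the `revoked` argument here).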


GFS: Google File System


GFS

• Two types of nodes!
  • Master
  • Chunk servers

• Masters (few API calls)
  • Metadata operations

• Chunk servers (most client API calls)
  • Store actual data


GFS Master

• Single/centralized master
  • Never stores file contents
  • Only stores metadata/attributes

• Benefits of centralization
  • Easy to write code
  • Can implement sophisticated algorithms
  • Store all metadata in memory -> performance boost

• Issues with centralization
  • Single point of failure -> 2 backups
    • Replicate to backups before responding to client
  • Not enough memory -> buy more!!!

[Figure: client (Gmail), ChunkServer, GFS master, and GFS master shadows (backup masters)]


GFS Attributes, Data, Metadata

[Figure: metadata mappings Dir -> Files, Files -> Chunks, Chunks -> server; DATA (i.e., chunks)]


GFS Attributes, Data, Metadata

• Chunk -> server mappings are not persistently stored at the master
  • A ChunkServer can die
  • An operator can manually change a ChunkServer
  • The ChunkServer is the authoritative voice on what it stores

• Each ChunkServer includes its list of chunks in heartbeat msgs

• Master rebuilds a map of chunk -> server locations after receiving heartbeat msgs

[Figure: client (Gmail), ChunkServer, GFS master, and GFS master shadows (backup masters); metadata mappings Dir -> Files, Files -> Chunks, Chunks -> server; DATA (i.e., chunks)]
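Rebuilding the chunk -> server map from heartbeats can be sketched as follows. This is a toy model, not GFS code: the `Master` class, the 30-second `timeout`, and the explicit `now` argument are assumptions made so the example is small and deterministic.

```python
from collections import defaultdict


class Master:
    """Toy master: chunk -> server locations are derived from heartbeats, never persisted."""

    def __init__(self, timeout=30):
        self.timeout = timeout  # seconds without a heartbeat before a server counts as dead
        self.last_seen = {}     # server -> time of last heartbeat
        self.reported = {}      # server -> set of chunks it says it stores

    def heartbeat(self, server, chunks, now):
        self.last_seen[server] = now
        self.reported[server] = set(chunks)  # the server's word is authoritative

    def locations(self, now):
        """Chunk -> sorted list of live servers, based on recent heartbeats."""
        result = defaultdict(list)
        for server, chunks in self.reported.items():
            if now - self.last_seen[server] <= self.timeout:
                for chunk in chunks:
                    result[chunk].append(server)
        return {chunk: sorted(servers) for chunk, servers in result.items()}
```

Because the map is always recomputed from what chunk servers report, a restarted master only has to wait for the next round of heartbeats, and a server that dies or is changed by an operator drops out of the map without any repair step.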


GFS Consistency Semantics

• Types of API calls
  • Metadata operations: create/delete/rename
  • Data operations: reads/writes

• All metadata -> master: linearizable because the master gives a global ordering

• Reads/writes -> ChunkServer -> potential consistency issues

[Figure: metadata mappings Dir -> Files, Files -> Chunks, Chunks -> server; DATA (i.e., chunks)]


• Use heartbeats to detect failures
• Maintain three replicas of each chunk
  • On a failed server, create a new replica

• Monitor load on each server
  • Periodically move replicas/chunks around to balance load

• The single master provides a global total ordering on metadata operations

• The master gives leases to coordinate writes on data
  • One replica is designated the leader (master) for the chunk's other replicas
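The "three replicas, re-create on failure" rule can be sketched as a planning function that the master might run after a heartbeat timeout. Everything here is hypothetical: the function name, its inputs, and the least-loaded placement policy are illustrative assumptions rather than a description of GFS's actual placement logic.

```python
REPLICAS = 3  # target replica count per chunk, as on the slide


def plan_rereplication(chunk_locations, live_servers, load):
    """Return {chunk: [new servers]} needed to restore REPLICAS live copies.

    chunk_locations: chunk -> set of servers believed to hold it
    live_servers:    set of servers whose heartbeats are still arriving
    load:            server -> number of chunks stored (used to pick targets)
    """
    load = dict(load)  # work on a copy so the caller's dict is untouched
    plan = {}
    for chunk in sorted(chunk_locations):
        alive = chunk_locations[chunk] & live_servers
        missing = REPLICAS - len(alive)
        if missing <= 0 or not alive:
            continue  # healthy, or no surviving copy to clone from
        # Prefer the least-loaded live servers that do not hold the chunk yet.
        candidates = sorted(live_servers - alive, key=lambda s: (load.get(s, 0), s))
        targets = candidates[:missing]
        for server in targets:
            load[server] = load.get(server, 0) + 1
        if targets:
            plan[chunk] = targets
    return plan
```

Picking the least-loaded server as the target ties the failure-handling bullet to the load-balancing bullet: new replicas land where there is spare capacity.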


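[Figure: the client (Gmail) calls Open() on the GFS master and receives a list of chunk servers; chunk servers send HeartBeats(Chunk List) to the master; the master grants a LeaderLease to one ChunkServer, the leader for the chunk replica; writes flow between the client and the chunk servers. GFS master shadows act as backup masters]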


Today

• Distributed file systems
  • Caching: performance versus consistency
  • Locks v. leases: opportunistic locking
  • Server v. client side caches

• GFS: Google File System
  • Centralized master
  • Consistency semantics