Distributed File Systems
Sarah Diesburg, Operating Systems, CS 3430
TRANSCRIPT
Distributed File System
Provides transparent access to files stored on a remote disk
Recurrent themes of design issues
• Failure handling
• Performance optimizations
• Cache consistency
No Client Caching
Use RPC to forward every file system request (open, seek, read, write) to the remote server
Server cache: X
Client A Client B
read write
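The no-client-caching design above can be sketched as a server that services every forwarded call directly, so client B's read immediately reflects client A's write. This is a minimal illustration; the class and method names are hypothetical, not a real RPC library.

```python
# Sketch of no client caching: every file operation is forwarded to the
# server, which holds the only copy of the data and all file positions.

class RemoteFSServer:
    """Plays the role of the file server in the diagram above."""
    def __init__(self):
        self.files = {}          # filename -> bytearray (the one true copy)
        self.positions = {}      # (client, filename) -> current offset

    def open(self, client, name):
        self.files.setdefault(name, bytearray())
        self.positions[(client, name)] = 0

    def seek(self, client, name, offset):
        self.positions[(client, name)] = offset

    def read(self, client, name, nbytes):
        pos = self.positions[(client, name)]
        data = bytes(self.files[name][pos:pos + nbytes])
        self.positions[(client, name)] += len(data)
        return data

    def write(self, client, name, data):
        pos = self.positions[(client, name)]
        buf = self.files[name]
        buf[pos:pos + len(data)] = data
        self.positions[(client, name)] += len(data)

# Every client call is just a forwarded server call -- no local cache.
server = RemoteFSServer()
server.open("A", "x.txt")
server.write("A", "x.txt", b"hello")
server.open("B", "x.txt")
print(server.read("B", "x.txt", 5))   # b'hello' -- B sees A's write at once
```

Because the server sees every operation, its view is always consistent; the cost is one network round trip per operation.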
No Client Caching
+ Server always has a consistent view of the file system
- Poor performance
- Server is a single point of failure
Network File System (NFS)
Uses client caching to reduce network load
Built on top of RPC
Server cache: X
Client A cache: X Client B cache: X
Network File System (NFS)
+ Performance better than no caching
- Has to handle failures
- Has to handle consistency
Failure Modes
If the server crashes
• Uncommitted data in memory are lost
• Current file positions may be lost
• The client may ask the server to perform unacknowledged operations again
If a client crashes
• Modified data in the client cache may be lost
NFS Failure Handling
1. Write-through caching
2. Stateless protocol: the server keeps no state about the client
• Each read is self-contained (effectively an open, seek, read, close)
• No server recovery is needed after a failure
3. Idempotent operations: repeated operations return the same result
• No static variables on the server
NFS Failure Handling
4. Transparent failures to clients: two options
• The client waits until the server comes back
• The client can return an error to the user application
• Do you check the return value of close?
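The interplay of statelessness and idempotence can be shown with a small sketch: each read carries the file handle and offset explicitly, so a client that loses a reply can simply resend the same request. The class and function names are illustrative, not NFS's actual wire protocol.

```python
# Sketch of stateless, idempotent reads: the server keeps no per-client
# state, so a retried request with the same arguments yields the same bytes.

import random

class StatelessServer:
    def __init__(self, files):
        self.files = files   # file handle -> bytes; no open-file table

    def read(self, handle, offset, count):
        # All the information needed is in the request itself.
        return self.files[handle][offset:offset + count]

def read_with_retries(server, handle, offset, count, max_tries=5):
    """Client stub: on a (simulated) lost reply, just resend the same
    request -- safe because the read is idempotent."""
    for _ in range(max_tries):
        reply = server.read(handle, offset, count)
        if random.random() < 0.5:      # pretend the reply was lost
            continue
        return reply
    return server.read(handle, offset, count)

server = StatelessServer({"fh1": b"abcdefgh"})
assert read_with_retries(server, "fh1", 2, 3) == b"cde"
```

A stateful `seek` would break this: if the server forgot the client's position after a crash, a retried read could return different bytes.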
NFS Weak Consistency Protocol
A write updates the server immediately
• Other clients poll the server periodically for changes
• No guarantees for multiple writers
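The weak consistency protocol can be sketched as a client that re-fetches only when a periodic poll sees a newer modification time on the server. The attribute names are illustrative; real NFS clients compare file attributes returned by the server.

```python
# Sketch of NFS-style weak consistency: writes hit the server immediately,
# but other clients only notice on their next poll.

class Server:
    def __init__(self, data):
        self.data = data
        self.mtime = 0           # stand-in for a real modification timestamp

    def write(self, data):
        self.data = data
        self.mtime += 1

class PollingClient:
    def __init__(self, server):
        self.server = server
        self.cached_data = server.data
        self.cached_mtime = server.mtime

    def poll(self):
        """Re-fetch only if the server's copy looks newer than ours."""
        if self.server.mtime != self.cached_mtime:
            self.cached_data = self.server.data
            self.cached_mtime = self.server.mtime

server = Server(b"v1")
client = PollingClient(server)
server.write(b"v2")              # another client's write
print(client.cached_data)        # b'v1' -- stale until the next poll
client.poll()
print(client.cached_data)        # b'v2'
```

Between polls the client can read stale data, and two clients writing between each other's polls can silently overwrite one another -- hence "no guarantees for multiple writers."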
NFS Summary
+ Simple and highly portable
- May occasionally become inconsistent (though not very often)
Andrew File System (AFS)
Developed at CMU
Design principles:
• Files are cached on each client's disk (NFS caches only in clients' memory)
• Callbacks: the server records who has a copy of each file
• Write-back cache on file close; the server then notifies all clients that hold an old copy
• Session semantics: updates are visible only on close
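The callback and write-back-on-close principles can be sketched together: the server remembers which clients cached each file and "breaks" their callbacks when a new version is stored on close. The class and method names are illustrative, not the real AFS interface.

```python
# Sketch of AFS-style callbacks with write-back on close (session semantics).

class AFSServer:
    def __init__(self):
        self.files = {}       # name -> bytes
        self.callbacks = {}   # name -> set of clients holding a copy

    def fetch(self, client, name):
        self.callbacks.setdefault(name, set()).add(client)
        return self.files.get(name, b"")

    def store(self, writer, name, data):
        self.files[name] = data
        # Break callbacks: tell every other holder its copy is stale.
        for client in self.callbacks.get(name, set()) - {writer}:
            client.invalidate(name)
        self.callbacks[name] = {writer}

class AFSClient:
    def __init__(self, server):
        self.server = server
        self.disk_cache = {}

    def open(self, name):
        if name not in self.disk_cache:          # whole file cached on disk
            self.disk_cache[name] = self.server.fetch(self, name)
        return self.disk_cache[name]

    def close(self, name, data):
        self.disk_cache[name] = data
        self.server.store(self, name, data)      # write back on close

    def invalidate(self, name):
        self.disk_cache.pop(name, None)

server = AFSServer()
a, b = AFSClient(server), AFSClient(server)
server.files["x"] = b"old"
b.open("x")                   # B caches "x"; server records B's callback
a.open("x")
a.close("x", b"new")          # A's close breaks B's callback
print(b.open("x"))            # b'new' -- B re-fetches the new version
```

Note the session semantics: until A closes the file, B keeps reading its old cached copy without contacting the server.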
AFS Illustrated
Server cache: X
Client A cache: X Client B
read X
read X
callback list of X: client A, client B
AFS Illustrated
Server cache: X
Client A cache: X Client B cache: X
read X
read X
callback list of X: client A, client B
AFS vs. NFS
AFS
• Less server load due to clients' disk caches
• Server not involved for read-only files
Both AFS and NFS
• Server is a performance bottleneck
• Single point of failure
Serverless Network File Service (xFS)
Idea: construct a file system as a parallel program and exploit the high-speed LAN
Four major pieces:
• Cooperative caching
• Write-ownership cache coherence
• Software RAID
• Distributed control
Cooperative Caching
Uses remote memory to avoid going to disk
• On a cache miss, check local memory and remote memory before checking the disk
• Before discarding the last cached in-memory copy, send the content to remote memory if possible
Write-Ownership Cache Coherence
Declares a client to be the owner of a file on writes
• No one else can have a copy
Write-Ownership Cache Coherence
Client C cache: Client D cache:
Client A cache: X Client B cache:
owner, read-write
Write-Ownership Cache Coherence
Client C cache: Client D cache:
Client A cache: X Client B cache:
owner, read-write
read X
Write-Ownership Cache Coherence
Client C cache: Client D cache:
Client A cache: X Client B cache:
read-only
read X
X
Write-Ownership Cache Coherence
Client C cache: X Client D cache:
Client A cache: X Client B cache:
read-only
read-only
X
Write-Ownership Cache Coherence
Client C cache: X Client D cache:
Client A cache: X Client B cache:
read-only
read-only
write X
Write-Ownership Cache Coherence
Client C cache: X Client D cache:
Client A cache: Client B cache:
owner, read-write
write X
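The sequence above can be sketched as a small coherence manager: a read downgrades the current owner to read-only and adds a shared copy, while a write invalidates all other copies and makes the writer the owner. The manager class is illustrative, not xFS's actual protocol code.

```python
# Sketch of write-ownership cache coherence, following the walkthrough above.

class CoherenceManager:
    def __init__(self, value):
        self.value = value
        self.copies = {}         # client -> "read-only" | "owner"

    def read(self, client):
        for c, mode in self.copies.items():
            if mode == "owner":
                self.copies[c] = "read-only"     # downgrade the owner
        self.copies[client] = "read-only"
        return self.value

    def write(self, client, value):
        for c in list(self.copies):
            if c != client:
                del self.copies[c]               # invalidate other copies
        self.copies[client] = "owner"            # writer becomes owner
        self.value = value

mgr = CoherenceManager("X0")
mgr.write("A", "X1")             # A is owner, read-write
mgr.read("C")                    # A downgraded; A and C are read-only
mgr.write("C", "X2")             # C becomes owner; A is invalidated
print(mgr.copies)                # {'C': 'owner'}
```

Because only the owner may write and owners never coexist with other copies, no client can ever read a stale value.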
Other components
Software RAID
• Stripe data redundantly over multiple disks
Distributed control
• File system managers are spread across all machines
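Redundant striping can be illustrated with XOR parity (RAID-5 style): each stripe of data blocks carries one parity block, so any single lost block can be rebuilt from the survivors. This is a generic parity sketch, not xFS's actual log-based striping.

```python
# Sketch of parity-based redundant striping: losing any one "disk" in a
# stripe is recoverable by XORing the remaining blocks.

from functools import reduce

def make_stripe(blocks):
    """Return the data blocks plus one XOR parity block."""
    parity = bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))
    return blocks + [parity]

def recover(stripe, lost_index):
    """Rebuild the block at lost_index from the surviving blocks."""
    survivors = [blk for i, blk in enumerate(stripe) if i != lost_index]
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*survivors))

stripe = make_stripe([b"AAAA", b"BBBB", b"CCCC"])   # 3 data + 1 parity disk
assert recover(stripe, 1) == b"BBBB"                # lost disk 1 rebuilt
```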
xFS Summary
Built on small, unreliable components
• Data, metadata, and control can live on any machine
• If one machine goes down, everything else continues to work
• When machines are added, xFS starts to use their resources