distributed systems cs 15-440 distributed file systems- part i lecture 19, nov 14, 2011 majd f....

44
Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Post on 20-Dec-2015

220 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Distributed SystemsCS 15-440

Distributed File Systems- Part I

Lecture 19, Nov 14, 2011

Majd F. Sakr, Mohammad Hammoud andVinay Kolar

1

Page 2: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Today…

Last 3 sessions Programming Models

Today’s session Distributed File Systems – Part I

Announcement: Project 3 is due today by midnight Problem solving assignment 4 (the last assignment) is going to

be posted by tonight Project 4 (the last project) is going to be posted before the end

of this week

2

Page 3: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Intended Learning Outcomes: Distributed File Systems

ILO7 Explain distributed file systems as a paradigm for general-purpose distributed systems, and analyze its various aspects and architectures

ILO7.1 Define distributed file systems (DFSs) and explain various architectures

ILO7.2 Analyze various aspects of DFSs including processes, communication, naming, synchronization, consistency and replication, and fault tolerance

Considered: a reasonably critical and comprehensive perspective.

Thoughtful: Fluent, flexible and efficient perspective.

Masterful: a powerful and illuminating perspective.

ILO7.1

ILO7.2

ILO7

Page 4: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Discussion on Distributed File Systems

4

Distributed File Systems (DFSs)

Basics DFS Aspects

Page 5: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Distributed File Systems

Why File Systems?

To organize data (as files) To provide a means for applications to store, access, and modify data

Why Distributed File Systems?

Big data continues to grow

In contrary to a local file system, a distributed file system (DFS) can hold big data and provide access to this data to many clients distributed across a network

Page 6: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

NAS versus SAN

Another term for DFS is network attached storage (NAS), referring to attaching storage to network servers that provide file systems

A similar sounding term that refers to a very different approach is storage area network (SAN)

SAN makes storage devices (not file systems) available over a network

Client Computer

Client Computer

Client Computer

Client Computer

LAN

File Server (Providing

NAS)

Database Server

SAN

Page 7: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Benefits of DFSs

DFSs provide:

1. File sharing over a network: without a DFS, we would have to exchange files by e-mail or use applications such as the Internet’s FTP

2. Transparent files accesses: A user’s programs can access remote files as if they are local. The remote files have no special APIs; they are accessed just like local ones

3. Easy file management: managing a DFS is easier than managing multiple local file systems

Page 8: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

DFS Components

DFS information can be typically categorized as follows:

1. The data state: This is the contents of files

2. The attribute state: This is the information about each file (e.g., file’s size and access control list)

3. The open-file state: This includes which files are open or otherwise in use, as well as describing how files are locked

Designing a DFS entails determining how its various components are placed. Specifically, by component placement we indicate:

What resides on the servers What resides on the clients

Page 9: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

DFS Component Placement (1)

The data and the attribute states permanently reside on the server’s local file system, but recently accessed or modified information might reside in server and/or client caches

The open-file state is transitory; it changes as processes open and close files

Data Cache

Attribute Cache

Open-File State

Data Cache

Attribute Cache

Open-File State

Local File System

Client Server

Network

Page 10: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

DFS Component Placement (2)

Three basic concerns govern the DFS components placement strategy:

1. Access speed: Caching information on clients improves performance considerably

2. Consistency: If clients cache information, do all parties share the same view of it?

3. Recovery: If one or more computers crash, to what extent are the others affected? How much information is lost?

Page 11: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Discussion on Distributed File Systems

11

Distributed File Systems (DFSs)

Basics DFS AspectsBasics

Page 12: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

DFS Aspects

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Page 13: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

DFS Aspects

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Page 14: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Architectures

Client-Server Distributed File Systems Cluster-Based Distributed File Systems Symmetric Distributed File Systems

Page 15: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Network File System

Many distributed file systems are organized along the lines of client-server architectures

Sun Microsystem’s Network File System (NFS) is one of the most widely-deployed DFS for Unix-based systems

NFS comes with a protocol that describes precisely how a client can access a file stored on a (remote) NFS file server

NFS allows a heterogeneous collection of processes, possibly running on different OSs and machines, to share a common file system

Page 16: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Remote Access Model

The model underlying NFS and similar systems is that of remote access model

In this model, clients: Are offered transparent access to a file system that is managed by a

remote server Are normally unaware of the actual location of files Are offered an interface to a file system similar to the interface offered by a

conventional local file system

File

File stays on server

ClientServer

Requests fromclient to access

remote file

Replies fromserver

Page 17: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Upload/Download Model

A contrary model, referred to as upload/download model, allows a client to access a file locally after having downloaded it from the server

The Internet’s FTP service can be used this way when a client downloads a complete file, modifies it, and then puts it back

All accesses are done on client

File

New File

File moved to client

Client

Server

When client is done,file is returned to server

Page 18: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

The Basic NFS Architecture

System call layer

Virtual File System (VFS) layer

Local file system interface

NFS client

RPC client stub

System call layer

Virtual File System (VFS) layer

NFS serverLocal file system

interface

RPC server stub

Network

Client Server

A Client Request in NFS

Page 19: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Architectures

Client-Server Distributed File Systems Cluster-Based Distributed File Systems Symmetric Distributed File Systems

Page 20: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Parallel Applications

Problems are increasingly becoming computationally challenging

Typically run on large parallel machines Critical to leveraging parallelism in

all phases

Data access is a huge challenge

We can also use parallelism in accessing data to obtain performance gains

File striping techniquesVisualization of entropy in TerascaleSupernova Initiative application. Image from Kwan-Liu Ma’s visualization team at UC Davis

Page 21: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

File Striping Techniques

Server clusters are often used for parallel applications and their associated file systems are adjusted to satisfy their requirements

One well-known technique is to deploy file-striping techniques, by which a single file is distributed across multiple servers

Hence, it becomes possible to fetch different parts in parallel

a b

c

d ea

c

d

a

e b b

e

d

c

Accessing file parts in parallel

Page 22: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Round-Robin Distribution (1)

How to stripe a file over multiple machines? Round-Robin is typically a reasonable default solution

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Logical File

Stripe Size

Striping Unit

Server 1 Server 2 Server 3 Server 4

0 1 2 34 5 6 10 14 7 11 159 138 12

Page 23: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Round-Robin Distribution (2)

Clients perform writes/reads of file at various regions

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Server 1 Server 2 Server 3 Server 4

0 1 2 34 5 6 10 14 7 11 159 138 12

Client I: 512K write, offset 0 Client II: 512K write, offset 512

0 4 1 5 2 6 3 7 8 12 9 13 10 14 11 15

Page 24: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

2D Round-Robin Distribution (1)

What happens when we have many servers (say 100s)? 2D distribution can help

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Logical File

Stripe Size

Striping Unit

Server 1 Server 2 Server 3 Server 4

0 12 34 56 10 147 11 159 138 12

Group Size = 2

Page 25: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

2D Round-Robin Distribution (2)

2D distribution can limit the number of servers per client

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Client I: 512K write, offset 0 Client II: 512K write, offset 512

0 4 1 5 2 6 3 7 8 12 9 13 10 14 11 15

Server 1 Server 2 Server 3 Server 4

0 12 34 56 10 147 11 159 138 12

Group Size = 2

Page 26: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

General Purpose Applications

For general-purpose applications, or those with irregular or many different types of data structures, file striping may not be effective

In those cases, it is often more convenient to partition the file system as a whole and simply store files on different servers

a

a

a

b

b

b

c

c

c

d

d

d

e

e

e

File block of file a

Page 27: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Google File System

Companies like Amazon and Google offer services to Web clients resulting in reads and updates to a massive number of files distributed across literally tens of thousands of computers

In such environments, the traditional assumptions concerning distributed file systems might no longer hold

Files tend to be very large (TBs or PBs) For example, we can expect that at any single moment there will be a

computer malfunctioning

To address these problems, Google, for example, has developed its own Google File System (GFS)

Page 28: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

GFS Architecture

GFS client Master

Chunk Server

Linux File System

Chunk Server

Linux File System

Chunk Server

Linux File System

File name, chunk index

Contact address

Instructions Chunk-server state

Chunk Id, range

Chunk data

Page 29: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

GFS Mechanics (1)

Each GFS cluster consists of a single master along with multiple chunk servers

Each GFS file is divided up into chunks of 64MB

These chunks are distributed across chunk servers (i.e., using file-striping technique)

The GFS master allocates chunks to chunk servers

The master keeps track of where a chunk is located

Page 30: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

GFS Mechanics (2)

The GFS master is contacted for metadata information

The GFS master maintains a name space, along with a mapping from file name to chunks

The chunk servers keep an account of what they have stored

Chunks are replicated (server-side- No data caching; but it does cache metadata) to handle failures (No RAID, No SAN)

Replicas are updated serially

Page 31: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

IBM General Parallel File System

The General Parallel File System (GPFS) created by IBM is another example of a high-performance clustered file system

A file that is written to the file system is broken up into blocks of a configured size (less than 1 MB each)

GPFS provides higher I/O performance by striping blocks of data from individual files over multiple disks

GPFS is vulnerable to disk failures -any one disk failing would be enough to lose data

To prevent data loss, the file system nodes have RAID controllers

Page 32: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Architectures

Client-Server Distributed File Systems Cluster-Based Distributed File Systems Symmetric Distributed File Systems

Page 33: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Ivy

Fully symmetric organizations that are based on peer-to-peer technology also exist

All current proposals use a DHT-based system for distributing data, combined with a key-based lookup mechanism

As an example, Ivy is a distributed file system that is built using a Chord DHT-based system

Data storage in Ivy is realized by a block-oriented (i.e., blocks are distributed over storage nodes) distributed storage called DHash

Page 34: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Ivy Architecture

Ivy consists of 3 separate layers:

Ivy

DHash

Chord

File System Layer

Block-Oriented Storage

DHT Layer

Ivy

DHash

Chord

Ivy

DHash

Chord

Node where a file system is rooted

Page 35: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Ivy

Ivy implements an NFS-like file system

To increase availability and improve performance, Ivy:

Replicates every block B to the k immediate successors of the server responsible for storing B

Caches looked up blocks along the route that the lookup request followed

Page 36: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

DFS Aspects

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Page 37: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Processes (1)

Cooperating Processes in DFSs are usually the storage servers and file manager(s)

The most important aspect concerning DFS processes is whether they should be stateless or stateful

1. Stateless Approach:

Doesn’t require that servers maintain any client state When a server crashes, there is essentially no need to enter

a recovery phase to bring the server to a previous state Locking a file cannot be easily done E.g., NFSv3

Page 38: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Processes (2)

2. Stateful Approach:

Requires that a server maintains some client state

Clients can make effective use of caches but this would entail an efficient underlying cache consistency protocol

Provides a server with the ability to support callbacks (i.e., the ability to do RPC to a client) in order to keep track of its clients

E.g., NFSv4

Page 39: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

DFS Aspects

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Page 40: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Communication

Communication in DFSs is mainly based on remote procedure calls (RPCs)

For example, in NFS, all communication between a client and server proceeds along the Open Network Computing RPC (ONC RPC)

The main reason for choosing RPC is to make the system independent from underlying OSs, networks, and transport protocols

Every NFS operation can be implemented as a single RPC call to a file server

Page 41: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

RPCs in NFS

Up until NFSv4, the client was made responsible for making the server’s life as easy as possible by keeping requests simple

The drawback becomes apparent when considering the use of NFS in a wide-area system

In that case, the extra latency of a second RPCleads to performance degradation

To circumvent such problems, NFSv4 supportscompound procedures

Lookup

Lookup name

Read

Read file data

Tim

e

Client Server

LookupOpenRead

Lookup name

Read file data

Tim

e

Client Server

Open file

Page 42: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

RPC in Coda (1)

Another enhancement to RPCs has been developed as part of the Coda file system (Kistler and Satyanarayanan, 1992) and referred to as RPC2

RPC2 is a package that offers reliable RPCs on top of the (unreliable) UDP protocol

Each time an RPC is made, the RPC2 client code starts a new thread that sends an invocation request to the server then blocks until it receives an answer

As request processing may take an arbitrary time to complete, the server regularly sends back messages to the client to let it know it is still working on the request

Page 43: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

RPC in Coda (2)

RPC2 supports multicasting

An important design issue in Coda is that servers keep track of which clients have a local copy of a file (i.e., stateful servers)

When a file is modified, a sever invalidates local copies at clients in parallel using multicasting

Client

Server

Client

Time

Sending an invalidation message one at a time

Client

Server

Client

Time

Sending an invalidation message using multicasting as in RPC2

Invalidate

Invalidate

Reply

Reply Invalidate

Invalidate

Reply

Reply

Page 44: Distributed Systems CS 15-440 Distributed File Systems- Part I Lecture 19, Nov 14, 2011 Majd F. Sakr, Mohammad Hammoud andVinay Kolar 1

Next Class: DFS Aspects

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?

Aspect Description

Architecture How are DFSs generally organized?

Processes • Who are the cooperating processes? • Are processes stateful or stateless?

Communication • What is the typical communication paradigm followed by DFSs?

• How do processes in DFSs communicate?

Naming How is naming often handled in DFSs?

Synchronization What are the file sharing semantics adopted by DFSs?

Consistency and Replication What are the various features of client-side caching as well as server-side replication?

Fault Tolerance How is fault tolerance handled in DFSs?