operating systems cmpsci 377 lecture 21: distributed file systems

21
UNIVERSITY OF NIVERSITY OF MASSACHUSETTS ASSACHUSETTS, A , AMHERST MHERST Department of Computer Science Department of Computer Science Emery Berger University of Massachusetts Amherst Operating Systems CMPSCI 377 Lecture 21: Distributed File Systems

Upload: leo-hurst

Post on 02-Jan-2016

41 views

Category:

Documents


1 download

DESCRIPTION

Operating Systems CMPSCI 377 Lecture 21: Distributed File Systems. Emery Berger University of Massachusetts Amherst. Distributed File Systems. Most common use of distributed systems Idea: Given set of disks attached to different nodes, share as if all were attached to every node Examples: - PowerPoint PPT Presentation

TRANSCRIPT

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science

Emery BergerUniversity of Massachusetts Amherst

Operating SystemsCMPSCI 377

Lecture 21: Distributed File Systems

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 2

Distributed File Systems

Most common use of distributed systems

Idea: Given set of disks attached to different

nodes,share as if all were attached to every node

Examples: Edlab: one server, workstations on LAN AppleShare: nodes are servers with disk &

client

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 3

Distributed File Systems: Issues

Naming & transparency Remote file access Caching Server with state or without Replication

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 4

Naming & Transparency

Issues How are files named? Do filenames reveal location? Do filenames change if file moves? Do filenames change if user moves?

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 5

Transparency

Location transparency: filename does not reveal physical storage location

Location independence: filename need not change if file’s storage location changes

In practice: Most naming schemes do not have

location independence Many have location transparency

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 6

Naming Strategies:Absolute Names

Disadvantages: User must know

complete name – aware of which files are local & which are remote

File is location dependent (cannot move)

Makes sharing harder

Not fault-tolerant

Advantages: Easy to find fully

specified filename Easy to add &

delete new names No global state Scales easily

<machine name, pathname>Examples: AppleShare, Windows

NT

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 7

Naming Strategies:Mount Points

Mount points (NFS – Network File System) Each host has set of local names for

remote locations Mount table (/etc/fstab): specifies

<remote pathname @ machine name, local pathname>

At boot: bind local name to remote Users refer to local pathnames

NFS manages mapping

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 8

Mount Points: Pros & Cons

Advantages: Location transparent Remote name can change across

reboots

Disadvantages: Single unified strategy hard to maintain Same file can have different names

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 9

NFS Example

Partial contents of /etc/fstab for Edlab:

/usr1/[email protected]:/var/spool/mail

/users/users1@elsrv1:/users/users1

/courses/cs300@elsrv3:/courses/cs300

/rcf/common@elsrv1:/exp/rcf/common

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 10

NFS Example

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 11

Naming Strategies:Global Name Space

Single name space: Examples:

AFS (CMU’s Andrew File System) Sprite (Berkeley)

No matter which node you are on,filenames remain the same

Client: gets filename structure from server(s)

When users access files, server sends copies to workstation, where they are cached

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 12

Global Name Space: Pros & Cons

Advantages: Naming – consistent Ensures all files are same regardless of where

you login Late binding of names ) moving them is easier

Disadvantages: Difficult for OS to keep files consistent (caching) Global name space may limit flexibility Performance issues

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 13

Distributed File Systems: Issues

Naming & transparency Remote file access Caching Server with state or without Replication

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 14

Remote File Access & Caching

Can access files: Remotely: returns results using RPC = remote

service Transfer part of file, perform local access =

caching

Caching issues: Where & when are file blocks cached? When are modifications propagated back to

remote file? What happens when multiple clients cache

same file?

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 15

Remote File Caching Local disk:

Reduces access time (compared to remote) Safe if node fails– Difficult to keep local copy consistent with

remote copy– Requires client to have disk!

Local memory: Quick access time Works without disks– Difficult to keep local copy consistent with

remote copy– Smaller cache size– Not fault-tolerant

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 16

Cache Update Policies

Write-through: write to remote disk Reliable– Low-performance = remote service for all

writes

Write-back: write only to cache Write to disk on evictions, periodic synch Quick Reduces network traffic (repeated writes to

same block)– User machine crashes ) data loss

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 17

Cache Consistency

Client-initiated consistency:client contacts server and checks consistency Can check every access, at given intervals, only upon opening a file

Server-initiated consistency:server detects potential conflicts, invalidates caches Server needs to know which clients have

cached which parts of which files which clients are readers & which are writers

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 18

Case Study:Sun’s Network File System

NFS: standard for distributed UNIX file access Designed to run on LANs

Nodes are both servers & clients Servers have no state Uses mount protocol to make global name local

/etc/exports: lists local names server willing to export

/etc/fstab: lists global names that local nodes import

Corresponding global name must be in /etc/exports on server

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 19

NFS Implementation NFS defines set of RPC operations for

remote file access:1. Directory search, reading directory entries2. Manipulating links & directories3. Accessing file attributes4. Reading/writing files

Does not rely on node homogeneity Heterogeneous nodes support NFS mount

& remote access protocols using RPC Users may need to know different names

depending upon which node they log on

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 20

NFS Implementation

UUNIVERSITY OF NIVERSITY OF MMASSACHUSETTSASSACHUSETTS, A, AMHERST • MHERST • Department of Computer Science Department of Computer Science 21

Summary

Distributed File Systems Naming & transparency Remote file access Caching Server with state or without Replication