distributed file systems - brown universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · cs...

62
CS 138 XV–1 Copyright © 2017 Thomas W. Doeppner. All rights reserved. Distributed File Systems

Upload: others

Post on 24-Jul-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–1 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Distributed File Systems

Page 2: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–2 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Outline

•  Failure •  Basic concepts •  NFS version 2 •  CIFS •  DCE DFS •  NFS version 4

Page 3: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–3 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Synchronous vs. Asynchronous

•  Execution speed – synchronous: bounded – asynchronous: unbounded

•  Message transmission delays – synchronous: bounded – asynchronous: unbounded

•  Local clock drift rate: – synchronous: bounded – asynchronous: unbounded

Page 4: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–4 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Failures

•  Omission failures – something doesn’t happen

-  process crashes -  data lost in transmission -  etc.

•  Byzantine (arbitrary) failures – something bad happens

-  message is modified -  message received twice -  etc.

•  Timing failures – something takes too long

Page 5: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–5 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Detecting Crashes

•  Synchronous systems –  timeouts

•  Asynchronous systems – ?

•  Fail-stop – an oracle lets us know

Page 6: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–6 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Gossip Scenario

Client

Client

Client Client

Replica Manager

Replica Manager

Replica Manager

Page 7: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–7 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Data Server

Data Server

Data Server

Data Server

DFS Scenario

Client

Client

Client Client File Server

Cache

Cache

Cac

he C

ache

Page 8: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–8 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Components

•  Data state –  file contents

•  Attribute state – size, access-control info, modification time,

etc. •  Open-file state

– which files are in use (open) –  lock state

Page 9: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–9 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Possible Locations data

cache

attr cache

open-file state

Client

data cache

attr cache

open-file state

Client

data cache

attr cache

Server

local file system

open-file state

Page 10: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–10 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

In Practice …

•  Data state – NFS

-  weakly consistent -  less weak if program uses locks

– CIFS and DFS -  strictly consistent

•  Lock state – must be strictly consistent

Page 11: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Thursday morning, November 17th At 7:00 a.m.

Maytag, the department’s central file server, will be taken down to kick off a filesystem consistency check.

Linux machines will hang. All Windows users should log off.

Normal operation will resume by 8:30 a.m. if all goes well. All windows users should log off before this time.

Questions/concerns to [email protected]

Page 12: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–12 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Failures in a Local File System

On-Disk File System

Cache

0 1 2 3

.

.

.

n–1

1 rw 0

Open-File State Server

Client

Client Client

Client

On-Disk File System

Cache

0 1 2 3

.

.

.

n–1

1 rw 0

Open-File State Server

Client

Client Client

Client

Page 13: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–13 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Distributed Failure

On-Disk File System

Cache

0 1 2 3

.

.

.

n–1

1 rw 0

Open-File State Server

Client

Client Client

Client

On-Disk File System

Cache

0 1 2 3

.

.

.

n–1

1 rw 0

Open-File State Server

Client

Page 14: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–14 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

In Practice … •  NFS version 2

–  relaxed approach to consistency – handles failures well

•  CIFS – strictly consistent –  intolerant of failures

•  DCE DFS – strictly consistent – sort of tolerant of failures

•  NFS version 4 – either relaxed or strictly-consistent – handles failures very well

Page 15: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–15 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

NFS Version 2

•  Released in mid 1980s •  Three protocols in one

–  file protocol – mount protocol – network lock manager protocol

Basic NFS Extended NFS

Page 16: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–16 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Distribution of Components data

cache

attr cache

open-file state

NFSv2 client

data cache

attr cache

open-file state

NFSv2 client

data cache

attr cache

NFSv2 server

local file system

Page 17: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–17 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Consistency in Basic NFSv2

file x block 1

file x block 5

file y block 2

file y block 17

Data cache

file x attrs

file y attrs

Attribute cache

validity period

validity period

Page 18: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–18 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

More …

•  All write RPC requests must be handled synchronously on the server

•  Close-to-Open consistency – client writes back all changes on close –  flushes cached file info on open

Page 19: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–19 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Client Crash Recovery

Data Cache

Server

Data Cache

Process A

Data Cache

Process B

Page 20: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–20 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Server Crash Recovery

Data Cache

Server

Data Cache

Process A

Data Cache

Process B

Page 21: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–21 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

File Locking

•  State is required on the server! –  recovery must take place in the event of client

and server crashes •  Locking Protocol is independent of the File

Protocol –  locking is advisory – one can lock a file and ask if a file is locked – not required to honor locks

-  may read/write a file locked by others!

Page 22: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–22 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Network Lock Manager Protocol

Data Cache

Server

Data Cache

Process A

Data Cache

Process B

lockd statd

lockd statd

lockd statd

Page 23: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–23 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

NFS Version 3

•  In use at Brown and in most of the rest of the world

•  Basically the same as NFSv2 –  improved handling of attributes – commit operation for writes – various other things

Page 24: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–24 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

CIFS

•  Common Internet File System – Microsoft’s distributed file system

•  Features – strictly consistent

•  Not featured … – depends on reliability of transport protocol –  loss of connection == loss of session

Page 25: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–25 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

History

•  Originally a simple means for sharing files – developed by IBM and called server message

block protocol (SMB) –  ran on top of NetBIOS

•  Microsoft took over –  renamed CIFS in late 1990s – uses SMB as RPC-like communication protocol

-  runs on NetBIOS -  usually layered on TCP -  sometimes no NetBIOS, just TCP

Page 26: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–26 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Consistency vs. Performance

•  Strict consistency is easy … – … if all operations take place on server – no client caching

•  Performance is good … – … if all operations take place on client – everything is cached on client

•  Put the two together …

ø – or you can do opportunistic locking

Page 27: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–27 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Opportunistic Locks

Server

Open A

OK, Op Lock

Client 1 Client 2

Open A

Revoke Op Lock

OK, changes

OK

Page 28: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–28 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Back to NFS

•  File system name space – how is distributed file system perceived on

clients? •  Cross-computer links

– how are files on other computers referred to?

Page 29: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–29 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

NFS Mount Protocol

Server

Client

Approved List

Page 30: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–30 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

File Handles

•  Servers provide opaque file handles to clients to refer to files

– contents mean nothing to clients –  identify files on server

•  Clients contact server via mount protocol to obtain file handles of roots of exported file systems

Page 31: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–31 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

File Handle Contents

•  File-System ID – which server file system

•  File ID – which file within file system

•  Generation # – guards against inode reuse

File-System ID File ID Generation #

Page 32: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–32 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Server File Systems

/

B A D E C

H I

Q P O N

U

3 2

T

1 Z

F G

M L K J

S

Y X

R

W V

Page 33: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–33 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Client vs. Server Mount Points (1)

/

B A D E C

H I

Q P O N

U

3 2

T

1 Z

F G

M L K J

S

Y X

R

W V

/

C1 C2

mount server:/B /C2

Page 34: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–34 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Client vs. Server Mount Points (2)

/

B A D E C

H I

Q P O N

U

3 2

T

1 Z

F G

M L K J

S

Y X

R

W V

/

C1 C2

mount server:/B /C2

mount server:/B/F/K /C2/F/K

Page 35: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–35 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Local vs. Global Namespace

•  Local namespace – each host configures its own file-system

namespace – NFS clients each mount the appropriate remote

file systems •  Global namespace

– all hosts share the same namespace – not done in early NFS

Page 36: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–36 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Mount Protocol Problems

•  Local namespaces don’t work •  Achieve global name space by having each

client mount everything consistently •  giving each client a table listing all possible

mounts is administratively difficult •  performing all possible mounts is time

consuming •  mounting is a “heavyweight” operation

Page 37: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–37 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Rather than this …

home etc dev

carlos rohil rodrigo max louisa twd atty haris ishan

Page 38: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–38 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

… this

home etc dev

Autofs

automount database

Page 39: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–39 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Automounting: 2000

•  Maintain description of global namespace in global database: NIS

•  Do mounts only when needed •  Automount times out after period of unuse

Page 40: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–40 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Automounting: 2017

•  Global namespace maintained in LDAP database

–  lightweight directory access protocol -  vendor neutral

– everything mounted at boottime -  fewer, but larger, file systems

– no timeout

Page 41: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–41 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DCE’s DFS fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

Client

Client

Client

Client

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Page 42: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–42 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Mount Points fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

Client

Client

Client

Client

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

bin.sol7 bin.osf1 users.twd proj.motif proj.dce

root.cell

bin.sol7 bin.osf1 users.twd proj.motif proj.dce

Page 43: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–43 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Strict Consistency in DFS fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

Client

Client

Client

Client

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Page 44: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–44 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Tokens (1) fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Client

File A: Read:0-4095

File B: Write:0-512

File A: Read:0-4095

File B: Write:513-4096

Page 45: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–45 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Tokens (2) fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Client

File A: Read:0-4095

File A: Read:0-4095

Write(A, 0-4095)

Page 46: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–46 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Tokens (3) fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Client

File A: Read:0-4095

File A: Read:0-4095

Revoke(A, Read, 0-4095)

Page 47: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–47 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Tokens (4) fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Client File A: Read:0-4095

Grant(A, write, 0-4095)

File A: Write:0-4095

Page 48: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–48 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Tokens (5) fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Client File A: Read:0-4095

File A: Write:0-4095

Read(A, 0-4095)

Page 49: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–49 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Tokens (6) fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Client File A: Read:0-4095

File A: Write:0-4095

Page 50: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–50 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Tokens (7) fs

system users projects

sol7 osf1

bin usr bin

bin

twd motif dce

Client

FLDB

FLDB

FLDB

Server

Server

Server

Server

Client File A: Read:0-4095

File A: Read:0-4095

Grant(A, read, 0-4095)

Page 51: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–51 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Crash Recovery

Server

Client

Client

Page 52: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–52 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Crash Recovery (notes continued)

Server

Client

Client

Page 53: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–53 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS Recovery Problems

•  Client application must participate! – must recognize that operation returns “timed-

out” error – must retry

•  Due to semantics of tokens, it isn’t feasible to provide NFS-style hard mount

Page 54: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–54 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

NFS Version 4

•  Better than … – NFS version 2 – NFS version 3 – CIFS – DFS –  (why aren’t we running it?)

Page 55: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–55 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

NFSv4: Why?

•  Problems with NFSv3 – doesn’t provide exact Unix file semantics – doesn’t support mandatory locks – doesn’t cope with byzantine failures

Page 56: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–56 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Server State

•  It’s required! – exact Unix semantics – mandatory locks

Page 57: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–57 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

State Recovery

•  Server crash recovery – clients reclaim state on server

-  grace period after crash during which no new state may be established

•  Client crash recovery – server detects crash and nullifies client state

information on server

Page 58: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–58 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Coping with Non-Responsiveness

•  Leases –  locks are granted for a fixed period of time

-  server-specified lease –  if lease not renewed before expiration, server

may (unilaterally) revoke locks and share reservations

-  most client RPCs renew leases – clients must contact server periodically

-  if clientid is rejected as stale, then server has restarted

-  server’s grace period is equal to lease period

Page 59: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–59 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Pathological Network Problems

1)  Client 1 obtains a lock on a portion of a file 2)  There’s a network partition such that client 1

and server can no longer communicate 3)  The server crashes and restarts 4)  Client 2 obtains a lock on the same portion

of the same file, modifies the file, and then releases the lock

5)  The server crashes and restarts and the network partition is repaired

6)  Client 1 recontacts the server and reclaims its lock

Page 60: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–60 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Coping …

•  Possibilities 1)  server keeps all client state in non-volatile

storage 2)  server keeps all client state in volatile storage

and refuses all reclaim requests (effectively emulating CIFS)

3)  something in between …

Page 61: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–61 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

Compromise

•  Keep enough client state in non-volatile memory to know which clients were active at time of crash

– will honor reclaim requests from these clients – will refuse reclaim requests from others

•  What to keep: – client ID –  the time of the client’s first acquisition of a share

reservation or lock after a server reboot or client lease expiration

– a flag indicating whether the client’s most recent state was revoked because of a lease expiration

Page 62: Distributed File Systems - Brown Universitycs.brown.edu/courses/cs138/s17/lectures/15dfs.pdf · CS 138 XV–11 Copyright © 2017 Thomas W. Doeppner. All rights reserved Thursday morning,

CS 138 XV–62 Copyright © 2017 Thomas W. Doeppner. All rights reserved.

DFS ≠ LFS

•  Servers might give up on non-crashed clients – clients may lose locks – clients may lose files – NFSv4 attempts to make such things unlikely