![Page 1: Shivaram Venkataramanpages.cs.wisc.edu › ~shivaram › cs537-sp20-notes › nfs-wrap › cs53… · QUIZ 31 The only costs to worry about are network costs. Assume "small" messages](https://reader034.vdocuments.net/reader034/viewer/2022042404/5f1986ac9fc3601ae930432a/html5/thumbnails/1.jpg)
NFS
Shivaram Venkataraman
CS 537, Spring 2020
Welcome to the penultimate lecture!
ADMINISTRIVIA
AEFIS feedback
Optional project: due Wed 10pm
Final exam details: check Piazza (Canvas quiz, randomization)
No discussion this week!
AGENDA / LEARNING OUTCOMES
How to design a distributed file system that can survive partial failures?
What are consistency properties for such designs?
↳ Distributed systems
RECAP
Distributed File Systems
Local FS: processes on same machine access shared files
Network FS: processes on different machines access shared files in same way
Goals
- Transparent access
- Fast + simple crash recovery (for both server and client failures)
- Reasonable performance?
Communication: RPC over TCP/UDP
NFS Architecture
[Figure: several clients (e.g., a desktop, a laptop, a mobile device) each talk to one file server over RPC. The server is a more powerful machine that stores files such as a.txt in its local FS on one or more disks.]
Strategy 1
Attempt: wrap regular UNIX system calls using RPC
- open() on client calls open() on server
- open() on server returns fd back to client
- read(fd) on client calls read(fd) on server
- read(fd) on server returns data back to client
int fd = open("foo", O_RDONLY);
read(fd, buf, MAX);
…
read(fd, buf, MAX);
Server crash!
Naming!
The server's local FS keeps an fd table that stores the inode and offset of each open file. When the server comes back up, the fd table is lost, so the offset is lost.
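To make the failure concrete, here is a minimal Python sketch of Strategy 1. The class `StatefulServer` and its methods are hypothetical stand-ins for RPC-wrapped open()/read(), not NFS code, and a crash is modeled simply by clearing the in-memory fd table.

```python
class StatefulServer:
    """Hypothetical server that keeps an fd table (fd -> [path, offset]) in memory."""

    def __init__(self, files):
        self.files = files          # path -> bytes, stands in for the local FS
        self.fd_table = {}          # volatile state, lost on crash
        self.next_fd = 3

    def open(self, path):
        fd = self.next_fd
        self.next_fd += 1
        self.fd_table[fd] = [path, 0]
        return fd

    def read(self, fd, size):
        path, offset = self.fd_table[fd]     # KeyError if the table was lost
        data = self.files[path][offset:offset + size]
        self.fd_table[fd][1] = offset + len(data)
        return data

    def crash_and_reboot(self):
        self.fd_table = {}          # disk contents survive, memory does not
        self.next_fd = 3


server = StatefulServer({"foo": b"hello world"})
fd = server.open("foo")
first = server.read(fd, 5)          # served from offset 0
server.crash_and_reboot()
try:
    server.read(fd, 5)
    second_read_ok = True
except KeyError:
    second_read_ok = False          # the server no longer knows fd or its offset
```

The client still holds a valid-looking fd, but the state that gave it meaning lived only on the server.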
Strategy 2: put all info in requests
“Stateless” protocol: server maintains no state about clients
Need API change. One possibility:
pread(char *path, buf, size, offset);
pwrite(char *path, buf, size, offset);
Specify path and offset each time. Server need not remember anything from clients.
Pros? Server can crash and reboot transparently to clients.
Cons? Too many path lookups.
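A small Python sketch of the stateless idea, under the assumption that the server is just a dictionary of files. `StatelessServer` and the `path_lookups` counter are illustrative names, not part of any real NFS implementation; the counter exists only to show the cost named under "Cons".

```python
class StatelessServer:
    """Hypothetical server: every request names the path and the offset."""

    def __init__(self, files):
        self.files = files          # path -> bytearray, stands in for the disk
        self.path_lookups = 0       # instrumentation for the "Cons" above

    def _resolve(self, path):
        self.path_lookups += 1      # a real server would walk the directory tree
        return self.files[path]

    def pread(self, path, size, offset):
        return bytes(self._resolve(path)[offset:offset + size])

    def pwrite(self, path, data, offset):
        self._resolve(path)[offset:offset + len(data)] = data
        return len(data)

    def crash_and_reboot(self):
        pass                        # no per-client state in memory to lose


server = StatelessServer({"foo": bytearray(b"hello world")})
offset = 0                          # the client, not the server, tracks the offset
first = server.pread("foo", 5, offset)
offset += len(first)
server.crash_and_reboot()
second = server.pread("foo", 6, offset)
```

The second read succeeds after the reboot because the request itself carries everything the server needs, at the price of one path resolution per call.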
Strategy 3: file handles
fh = open(char *path);
pread(fh, buf, size, offset);
pwrite(fh, buf, size, offset);
File Handle = <volume ID, inode #, generation #>
Opaque to client (client should not interpret internals)
Still stateless, and we don't have a path traversal on every pread/pwrite.
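The following Python sketch illustrates one plausible use of the three file-handle fields. The `Volume` class, the inode numbers, and the rule "bump the generation whenever an inode number is reused" are assumptions made for illustration; only the field names come from the slide.

```python
from collections import namedtuple

# Field names follow the slide; the values used below are made up.
FileHandle = namedtuple("FileHandle", ["volume_id", "inode", "generation"])


class Volume:
    """Hypothetical single-volume server keyed by inode number."""

    def __init__(self, volume_id):
        self.volume_id = volume_id
        self.inodes = {}            # inode number -> (generation, bytearray)
        self.names = {}             # path -> inode number

    def create(self, path, inode, data):
        old_gen = self.inodes.get(inode, (0, None))[0]
        self.inodes[inode] = (old_gen + 1, bytearray(data))  # bump on inode reuse
        self.names[path] = inode

    def open(self, path):
        inode = self.names[path]    # the only path lookup
        return FileHandle(self.volume_id, inode, self.inodes[inode][0])

    def pread(self, fh, size, offset):
        generation, data = self.inodes[fh.inode]
        if fh.volume_id != self.volume_id or fh.generation != generation:
            raise LookupError("stale file handle")
        return bytes(data[offset:offset + size])


vol = Volume(volume_id=1)
vol.create("/foo", inode=12, data=b"old contents")
fh = vol.open("/foo")
before = vol.pread(fh, 3, 0)

# /foo is deleted and inode 12 is reused for a different file.
del vol.names["/foo"]
vol.create("/bar", inode=12, data=b"new contents")
try:
    vol.pread(fh, 3, 0)
    stale_detected = False
except LookupError:
    stale_detected = True
```

The server keeps no record of which clients hold which handles; the generation number alone lets it reject a handle that refers to a file that no longer exists.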
The file handle for / is well known (obtained at mount time).
Opening /home/shivaram/foo.txt takes one Lookup message per path component, similar to a local path traversal, starting with Lookup(fh of /, "home").
If a message or its reply is lost, the client retries.
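As a rough illustration of component-by-component lookup, the Python sketch below resolves a path starting from a known root handle and counts the messages exchanged. The `tree` dictionary and the handle names are invented stand-ins for the server's lookup procedure.

```python
def lookup_path(tree, root_fh, path):
    """Resolve path one component at a time; returns (handle, messages sent).

    `tree` maps (directory handle, name) -> child handle and stands in for
    the server's lookup procedure. Each lookup costs a request and a reply.
    """
    fh = root_fh
    messages = 0
    for name in path.strip("/").split("/"):
        fh = tree[(fh, name)]       # one Lookup(dir handle, name) round trip
        messages += 2               # request + reply
    return fh, messages


# Made-up handles for /home/shivaram/foo.txt; "root" is known from mount time.
tree = {
    ("root", "home"): "fh_home",
    ("fh_home", "shivaram"): "fh_shivaram",
    ("fh_shivaram", "foo.txt"): "fh_foo",
}
handle, messages = lookup_path(tree, "root", "/home/shivaram/foo.txt")
```

Three components cost three round trips here, which is why a client would want to hold on to the final handle rather than repeat the walk for every read.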
Can NFS Protocol include Append?
fh = open(char *path);
pread(fh, buf, size, offset);
pwrite(fh, buf, size, offset);
append(fh, buf, size);
pwrite VS APPEND
file: AAAAAAAA
pwrite → file: ABBAAAAA
pwrite (retry) → file: ABBAAAAA
pwrite (retry) → file: ABBAAAAA
pwrite(file, "BB", 2, 2);
append(file, "BB");
pwrite names the offset, so a retry leaves the file unchanged.
Idempotent Operations
Solution: Design API so no harm to executing function more than once
If f() is idempotent, then:
f() has the same effect as f(); f(); … f(); f()
int fd = open("foo", O_RDONLY);
read(fd, buf, MAX);
write(fd, buf, MAX);
…
Server crash!
The client doesn't know whether the server crashed before or after doing the operation.
Retrying a function is only safe if f is idempotent.
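The difference between a retried pwrite and a retried append can be checked with a short Python sketch. The functions below operate on plain strings rather than real files, and `send_with_retry` is a hypothetical model of a client that resends a request each time a reply is lost. Offset 1 is an assumption chosen so that the result matches the ABBAAAAA picture on the pwrite vs append slide.

```python
def pwrite(data, buf, offset):
    """Overwrite len(buf) characters at an explicit offset; returns new contents."""
    return data[:offset] + buf + data[offset + len(buf):]


def append(data, buf):
    """Write buf at the current end of file; returns new contents."""
    return data + buf


def send_with_retry(op, data, lost_replies):
    """Apply op once, then once more per lost reply.

    Models a client that resends a request whenever it times out waiting for
    the reply, even though the server already executed the operation.
    """
    for _ in range(1 + lost_replies):
        data = op(data)
    return data


original = "AAAAAAAA"
pwrite_once = send_with_retry(lambda d: pwrite(d, "BB", 1), original, 0)
pwrite_retried = send_with_retry(lambda d: pwrite(d, "BB", 1), original, 2)
append_once = send_with_retry(lambda d: append(d, "BB"), original, 0)
append_retried = send_with_retry(lambda d: append(d, "BB"), original, 2)
```

With two lost replies the pwrite result is identical to the single-execution result, while the append result has grown by two extra copies of "BB".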
What operations are idempotent?
Idempotent
- any sort of read that doesn't change anything
- pwrite (the offset is specified)
Not idempotent
- append (not part of the NFS API, while pwrite is)
What about these?
- mkdir, e.g. mkdir("foo")
- creat
Write Buffers
[Figure: a client write travels over NFS to the server and on to its local FS; both the client and the server hold a write buffer.]
Server acknowledges write before write is pushed to disk.
What happens if server crashes?
Can we say OK before the data hits disk?
Server Write Buffer Lost
client:
- write A to 0
- write B to 1
- write C to 2
server mem: A B C
server disk:
Server acknowledges write before write is pushed to disk.
If the server crashes, the client waits and retries until the server is back.
Server Write Buffer Lost
server mem: Z
server disk: X B Z
Client:
write A to 0
write B to 1
write C to 2
write X to 0
write Y to 1
write Z to 2
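Problem: No write failed, but disk state doesn't match any point in time.
Point-in-time states would be A B C, X B C, X Y C or X Y Z; the disk holds X B Z because the acknowledged write of Y was lost with the write buffer.
Solutions?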
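The X B Z outcome can be reproduced with a toy Python model. `BufferingServer` is hypothetical, and the points at which the buffer is flushed and the server crashes are assumptions picked to match the slide; a real server flushes on its own schedule.

```python
class BufferingServer:
    """Hypothetical server that ACKs a write once it is in memory."""

    def __init__(self, nblocks):
        self.disk = [None] * nblocks
        self.mem = {}               # block number -> data not yet on disk

    def write(self, block, data):
        self.mem[block] = data
        return "ACK"                # acknowledged before reaching disk

    def flush(self):
        for block, data in self.mem.items():
            self.disk[block] = data
        self.mem = {}

    def crash_and_reboot(self):
        self.mem = {}               # buffered writes vanish silently


server = BufferingServer(3)
acks = []
for block, data in enumerate("ABC"):
    acks.append(server.write(block, data))
server.flush()                      # disk: A B C
acks.append(server.write(0, "X"))
server.flush()                      # disk: X B C
acks.append(server.write(1, "Y"))
server.crash_and_reboot()           # Y was ACKed but never reached disk
acks.append(server.write(2, "Z"))
server.flush()                      # disk: X B Z

consistent_states = [list("ABC"), list("XBC"), list("XYC"), list("XYZ")]
```

Every write returned "ACK", yet the final disk contents are not among the states the client could have observed at any single point in time.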
Write Buffers
[Figure: the client writes over NFS to the server and its local FS; only the server's write buffer is shown.]
Solutions:
- Don't use a server write buffer: force each write to disk before acknowledging it. Problem: slow?
- Use a persistent write buffer (more expensive), e.g. battery-backed DRAM.
QUIZ 31
The only costs to worry about are network costs. Assume a "small" message takes S units of time, whereas a "bigger" message (e.g., the size of a block = 4 KB) takes B units. If a message is larger than 4 KB, it takes longer (2B for 8 KB).
1. How long does it take to open a 100-block (400 KB) file called /a/b/c.txt for the first time? (Assume the root directory file handle is already available.)
2. How long does it take to read the whole file?
https://tinyurl.com/cs537-sp20-quiz31
Answers (from the annotations):
1. Three lookups, Lookup(/, "a"), Lookup(a, "b"), Lookup(b, "c.txt"), each a small request plus a small reply (2S): 2S + 2S + 2S = 6S.
2. 100 calls to read(fh, buf, 4 KB), each a small request plus a block-sized reply (S + B): 100 × (S + B) = 100S + 100B.
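The quiz arithmetic can be written down as a few Python helpers. They assume no caching, small lookup replies, and one 4 KB block per read request; the function names are made up for this sketch, and each result is a pair (number of small messages, number of block-sized messages).

```python
def open_cost(path):
    """Return (small, big) message counts to open `path` from a known root handle."""
    components = [c for c in path.split("/") if c]
    return 2 * len(components), 0   # each lookup: small request + small reply


def read_cost(nblocks):
    """Return (small, big) message counts to read nblocks, one block per request."""
    return nblocks, nblocks         # each read: small request + block-sized reply


def total_time(counts, S, B):
    """Convert message counts into time for given per-message costs S and B."""
    small, big = counts
    return small * S + big * B


open_counts = open_cost("/a/b/c.txt")    # 6 small messages, i.e. 6S
read_counts = read_cost(100)             # 100 small + 100 big, i.e. 100S + 100B
```

For instance, with S = 1 and B = 10 (arbitrary units), `total_time(read_counts, 1, 10)` evaluates 100·1 + 100·10.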
Cache Consistency
NFS can cache data in three places:
- server memory
- client disk
- client memory
How to make sure all versions are in sync?
Distributed Cache
[Figure: Client 1 and Client 2 each have an NFS cache; the server has a cache in front of its local FS. Client 1 opens and reads the file (A), then writes B; Client 2 reads A.]
Two questions:
1. If a client buffers a write, when do we send it to the server?
2. How does another client detect that its cached copy is stale?
Cache
[Figure: Client 1 (NFS cache: B) has just written; the server (local FS, cache: A) and Client 2 (NFS cache: A) still hold the old data.]
"Update Visibility" problem: server doesn't have latest version
What happens if Client 2 (or any other client) reads data?
Cache
[Figure: after Client 1 flushes, Client 1 (NFS cache: B) and the server (cache: B) agree; Client 2 (NFS cache: A) read the file before the flush.]
"Stale Cache" problem: client 2 doesn't have latest version
What happens if Client 2 reads data?
Problem 1: Update Visibility
When client buffers a write, how can server (and other clients) see update?
- Client flushes cache entry to server
When should client perform flush?
- NFS solution: flush on fd close
[Figure: Client 1 (NFS cache: B) has written; the server (local FS, cache: A) has not yet seen the update.]
Worst-case guarantee: once the writer closes the file, the server has the update.
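A minimal Python sketch of flush-on-close, assuming a single file and ignoring failures. `Server` and `Client` are illustrative classes, not an NFS client implementation.

```python
class Server:
    """Hypothetical server holding one file whose contents start as A."""

    def __init__(self):
        self.files = {"f": "A"}


class Client:
    """Hypothetical client that buffers writes and flushes them on close()."""

    def __init__(self, server):
        self.server = server
        self.dirty = {}             # path -> data written but not yet sent

    def write(self, path, data):
        self.dirty[path] = data     # stays in the client cache for now

    def close(self, path):
        if path in self.dirty:
            self.server.files[path] = self.dirty.pop(path)   # flush on close


server = Server()
client1 = Client(server)
client1.write("f", "B")
seen_before_close = server.files["f"]   # server still has the old data
client1.close("f")
seen_after_close = server.files["f"]    # update is now visible at the server
```

Between write() and close() the update lives only in Client 1's cache, which is exactly the window in which other clients cannot see it.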
Problem 2: Stale Cache
Problem: Client 2 has stale copy of data; how can it get the latest?
NFS solution:
- Clients recheck if cached copy is current before using data
[Figure: Client 1 has written B and closed the file, so the server holds cache: B; Client 2's NFS cache still holds A.]
Stale Cache Solution
Client cache records time when data block was fetched (t1).
Before using data block, client does a STAT request to server:
- gets the last-modified timestamp for this file (t2) (not block…)
- compares it to the cache timestamp
- refetches the data block if changed since the timestamp (t2 > t1)
[Figure: Client 2 (NFS cache: A, fetched at t1) sends a STAT (GETATTR) call to the server (cache: B, modified at t2).]
Note: the timestamp is at file granularity.
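The recheck can be sketched in Python as follows. To keep the example deterministic it uses a logical counter as the modification time, and it records as t1 the server's modification time seen at fetch rather than a wall-clock fetch time; both are simplifying assumptions, and the class names are hypothetical.

```python
class Server:
    """Hypothetical server with one file and a logical modification clock."""

    def __init__(self):
        self.data = "A"
        self.mtime = 1              # logical clock instead of wall-clock time

    def write(self, data):
        self.data = data
        self.mtime += 1

    def stat(self):
        return self.mtime           # t2: last-modified time of the whole file

    def read(self):
        return self.data


class CachingClient:
    """Hypothetical client that validates its cache with stat before each use."""

    def __init__(self, server):
        self.server = server
        self.cached = None
        self.t1 = None              # server mtime observed when data was fetched
        self.stat_calls = 0

    def read(self):
        self.stat_calls += 1
        t2 = self.server.stat()
        if self.cached is None or t2 > self.t1:
            self.cached = self.server.read()     # refetch: cache missing or stale
            self.t1 = t2
        return self.cached


server = Server()
client2 = CachingClient(server)
r1 = client2.read()                 # fetches A
server.write("B")                   # another client's flush reaches the server
r2 = client2.read()                 # stat shows a newer mtime, so refetch
r3 = client2.read()                 # cache is current, but stat is still sent
```

Note that the third read is served from the cache yet still costs a stat request, which is the overhead the next slide measures.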
Measure then Build
NFS developers found stat accounted for 90% of server requests
Why?
Because clients frequently recheck cache
In the common case, only one client opens and reads the file, yet each use of the cache still sends a stat to check validity.
Reducing Stat Calls
Solution: cache results of stat calls (attribute cache)
Consequence: never see updates on server!
Partial solution: make stat cache entries expire after a given time (e.g., 3 seconds) (discard t2 at client 2)
What is the consequence? Cached reads can be stale for up to 3 seconds.
[Figure: server (local FS, cache: B); Client 2 (NFS cache: A) with an attribute cache.]
If the attribute cache entry is older than 3 s, the client calls stat again.
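Here is a hedged Python sketch of an attribute cache with a timeout. The 3-second default comes from the slide; everything else (class names, passing `now` explicitly instead of reading a real clock, the logical modification counter) is an assumption made to keep the example deterministic.

```python
class Server:
    """Hypothetical server that counts the stat requests it receives."""

    def __init__(self):
        self.data, self.mtime, self.stat_requests = "A", 1, 0

    def write(self, data):
        self.data = data
        self.mtime += 1

    def stat(self):
        self.stat_requests += 1
        return self.mtime

    def read(self):
        return self.data


class Client:
    """Hypothetical client whose cached stat result expires after `ttl` seconds."""

    def __init__(self, server, ttl=3.0):
        self.server, self.ttl = server, ttl
        self.cached = None          # cached file data
        self.t1 = None              # server mtime when the data was fetched
        self.attr = None            # cached stat result (t2)
        self.attr_fetched_at = None

    def read(self, now):
        expired = self.attr is None or now - self.attr_fetched_at >= self.ttl
        if expired:
            self.attr = self.server.stat()
            self.attr_fetched_at = now
        if self.cached is None or self.attr > self.t1:
            self.cached = self.server.read()
            self.t1 = self.attr
        return self.cached


server = Server()
client2 = Client(server)
r0 = client2.read(now=0.0)          # first read: one stat, data fetched
server.write("B")                   # another client's flush reaches the server
r1 = client2.read(now=1.0)          # attribute cache still valid: update hidden
r2 = client2.read(now=3.0)          # entry expired: stat shows a newer mtime
```

Three reads cost only two stat requests, but the read at time 1.0 returns stale data, which is the trade-off the slide describes.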
NFS Summary
NFS handles client and server crashes very well; robust APIs that are:
- stateless: servers don't remember clients
- idempotent: doing things twice never hurts
Caching and write buffering are harder, especially with crashes
Problems:
- Consistency model is odd (client may not see updates until 3s after file closed)
- Scalability limitations as more clients call stat() on server (clients pull timestamps from the server)
NEXT STEPS
Next class: Review, Looking forward
Optional project due Wed
AEFIS feedback