NFS & Distributed Systems Issues

Vivek Pai
Dec 6, 2001

Mechanics
• A few words about Project 5
• It’s not just another webserver project

The Next Project
• Behavioral spec
• Implementation up to you
• Can assume max of 128 procs/threads
• Use a simple counter to implement simple counts
• I may release a tool to make testing easier

Behavioral Spec
• The following behavioral spec is important
  • If there aren’t enough free processes/threads, the server should spawn one per second
  • If there are too many free, one should be killed per second
  • This should not depend on any other activity in the system
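One way to read the spec above is as a once-per-second decision that looks only at the number of free workers. The following is a minimal sketch, not the required implementation. The thresholds `min_free` and `max_free` are hypothetical parameters (the slides do not say what counts as "enough" or "too many"), and the cap of 128 comes from the project slide.

```python
MAX_WORKERS = 128  # cap taken from the project slide


def pool_adjustment(free, total, min_free, max_free):
    """Return +1 to spawn one worker, -1 to kill one, 0 to do nothing.

    Meant to be called once per second from a timer, so the rate of
    change never depends on request traffic. min_free and max_free are
    hypothetical parameters, not values from the spec.
    """
    if free < min_free and total < MAX_WORKERS:
        return +1
    if free > max_free and total > 0:
        return -1
    return 0
```

A timer (for example a thread that sleeps for one second, or an alarm signal) would call `pool_adjustment` on each tick and act on the result, which keeps the spawn/kill rate independent of how busy the server is.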

Caching
• Mmap
  • Always use mmap
  • Keep cache of active & inactive maps
  • Total cache size in KB should be limited by command-line argument
  • Can only exceed this limit if all mappings are active
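The cache rules above can be sketched as a table of mappings with a reference count per entry: an entry is "active" while its count is above zero, and only inactive entries are eligible for eviction. This is an illustrative sketch under assumptions, not the project's required design. The class name `MmapCache`, its methods, and the oldest-first eviction order are all hypothetical choices.

```python
import mmap
import os
from collections import OrderedDict


class MmapCache:
    """Cache of read-only file mappings with a soft size limit in KB."""

    def __init__(self, limit_kb):
        self.limit = limit_kb * 1024
        self.entries = OrderedDict()  # path -> [mapping, size, refcount]
        self.total = 0                # bytes currently mapped

    def acquire(self, path):
        """Return a mapping for path and mark it active."""
        entry = self.entries.get(path)
        if entry is None:
            with open(path, "rb") as f:
                size = os.fstat(f.fileno()).st_size
                # Length 0 maps the whole file; empty files cannot be mapped.
                m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
            entry = [m, size, 0]
            self.entries[path] = entry
            self.total += size
        entry[2] += 1
        self.entries.move_to_end(path)  # most recently used goes last
        self._evict()
        return entry[0]

    def release(self, path):
        """Mark one use of path as finished; it may now become inactive."""
        self.entries[path][2] -= 1
        self._evict()

    def _evict(self):
        """Unmap inactive entries, oldest first, until under the limit."""
        for path in list(self.entries):
            if self.total <= self.limit:
                break
            m, size, refs = self.entries[path]
            if refs == 0:  # active mappings are never evicted
                m.close()
                del self.entries[path]
                self.total -= size
```

If every mapping is active, `_evict` finds nothing to remove and the total stays above the limit, which matches the one exception the slide allows.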

Man Pages You May Like
• mmap, munmap
• man -k pthread
• flock
• sleep
• signal
• alarm

Being A Good User
• Do not fork wildly
• Try to test on non-shared system

Imagine The Following
• Everyone has a desktop machine
• Each machine has a user
• Each user has a home directory
• What problems arise?
  • Can’t move between machines
  • Can’t easily share files with others
  • How does this data get backed up?

Was It Always Like This?
• No
• Think mainframes:
  • Big, centralized box
  • All disks attached
  • Programs ran on box
  • Only terminals/monitors on each desk

How Did We Get Here?
• Mainframe killers advocated little boxes
• Lots of little boxes are a distributed system
• Distributed systems introduce new problems

Why Use Little Boxes?
• Little boxes are cheap
  • Easier to order a PC than a mainframe
• Little boxes are disposable
  • No need for a maintenance contract
• Economy of scale
  • Design cost amortized over more units

Were Minis Immune?
• Minicomputers were “department”-sized versus “company”-sized
• Most information not shared among everyone
• Administrator per department OK
• Shared resources only within department OK

Why Not Just Shared Disk?
• Centralized storage
  • Easier administration/backup
  • Better use of capacity
  • Easier to build large filesystem cache
  • Easier to provide AC/power
• Problem: compare bandwidth
  • 10 Mbit/sec Ethernet at the time
  • Switched versus shared irrelevant

New Problem
• Single point of failure
  • Means everything depends on this item
  • In other cases, duplication helps
• Common failures = reboot
  • But all information (state) lost
  • All clients would have to be told
  • We’d need to keep track of all clients
    • On stable storage!

Toward Statelessness
• Make server as dumb as possible
• Shift burdens to client-side
• Client failure only harms that client
• Each operation is self-contained
• Repeating operations permissible
  • Idempotent: repeating causes no change
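A self-contained operation can be sketched as a request that names everything the server needs, so the server keeps no per-client position between calls. The function `handle_read` and the `files` dictionary below are hypothetical stand-ins for a server and its storage, not NFS's actual interface.

```python
def handle_read(files, handle, offset, count):
    """Serve one read using only what the request itself carries.

    files maps a file handle to its bytes. No per-client state is kept,
    so a retransmitted request returns exactly the same answer.
    """
    data = files[handle]
    return data[offset:offset + count]
```

Because the reply depends only on the arguments and the file's contents, a client that times out can resend the same request, and a server that reboots between the two attempts has nothing to recover.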

Idempotency
• Regular Unix system call
  • write(fd, buf, size)
  • Writes size bytes at current position, moves position forward by size
• Idempotent version
  • pwrite(fd, buf, size, offset)
• Idempotent operations in NFS hidden from user programs
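The difference between the two calls shows up as soon as the same request is applied twice, as it would be after a retransmission. The sketch below uses Python's `os.write` and `os.pwrite`, which wrap the Unix calls of the same names (`os.pwrite` is available on Unix only). The helper `repeat_write` is a hypothetical name used just for this demonstration.

```python
import os
import tempfile


def repeat_write(use_pwrite):
    """Apply the same 2-byte write twice and return the file contents."""
    fd, path = tempfile.mkstemp()
    try:
        for _ in range(2):  # second pass stands in for a retransmitted request
            if use_pwrite:
                os.pwrite(fd, b"ab", 0)  # explicit offset: same result each time
            else:
                os.write(fd, b"ab")      # implicit position: advances each call
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.close(fd)
        os.remove(path)
```

With `write`, the repeat appends a second copy of the data. With `pwrite` at a fixed offset, the repeat overwrites the same bytes and leaves the file unchanged.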

Distributed Caching
• Local filesystems have caches
• Use caches to offload network traffic
  • Same object replicated in many caches
  • No problem for reads
• What happens on write/update?
  • Multiple different copies of data?
  • What happens if it’s metadata?
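The write problem can be reproduced with two clients that each keep a private cache in front of one shared server. This is a toy model under assumptions: the `Client` class is hypothetical, the "server" is just a dictionary, and no invalidation of any kind is attempted.

```python
class Client:
    """A client with a private block cache in front of a shared server dict."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read(self, block):
        if block not in self.cache:   # miss: fetch from the server
            self.cache[block] = self.server[block]
        return self.cache[block]

    def write(self, block, data):
        self.cache[block] = data      # update this client's own copy
        self.server[block] = data     # write through to the server
```

If client A reads a block and client B then writes it, A's next read is served from its cache and still returns the old data, even though the server already holds the new version.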

Distributed Write Problem
• Possible approaches
  • Disallow caching on writes
    • What about emacs?
  • Disallow caching of shared files
    • What happens for really big files?
  • Disallow caching of metadata writes
• What disk blocks does OS care about?

Sun’s Write Philosophy
• File block write sharing not an issue
  • Very few programs do it
  • Correctness depends on program
• Reduce window of opportunity
  • Flush dirty blocks periodically
  • Flush can be asynchronous
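Periodic flushing can be sketched as a client cache that buffers dirty blocks and pushes them to the server once a fixed interval has elapsed. The class `WriteBackCache` and its `flush_interval` parameter are hypothetical, and the slides do not give an interval. The flush here runs inline for clarity, whereas the slide notes it can be done asynchronously.

```python
class WriteBackCache:
    """Buffer writes locally and push dirty blocks to the server periodically."""

    def __init__(self, server, flush_interval):
        self.server = server
        self.flush_interval = flush_interval  # seconds; hypothetical parameter
        self.dirty = {}
        self.last_flush = 0.0

    def write(self, block, data):
        self.dirty[block] = data  # the server does not see this yet

    def tick(self, now):
        """Flush if flush_interval seconds have passed; return blocks flushed."""
        if now - self.last_flush < self.flush_interval:
            return 0
        flushed = len(self.dirty)
        self.server.update(self.dirty)
        self.dirty.clear()
        self.last_flush = now
        return flushed
```

The interval bounds how long other clients can see stale data: a shorter interval narrows the window at the cost of more traffic.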

Metadata Operations
• Performed synchronously at server
• Must be reflected to disk
  • Why: stability
  • Overhead: disk op + network
• Can we speed up synchronous ops?
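"Reflected to disk" before replying can be sketched for a file creation as follows. This is a rough illustration on a POSIX system, not how an NFS server is actually implemented. The function `create_durably` is a hypothetical name, and syncing the parent directory is an assumption about what is needed to make the new directory entry stable.

```python
import os


def create_durably(dirpath, name):
    """Create an empty file and force it and its directory entry to disk."""
    path = os.path.join(dirpath, name)
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    try:
        os.fsync(fd)       # the file's own metadata reaches the disk
    finally:
        os.close(fd)
    dirfd = os.open(dirpath, os.O_RDONLY)
    try:
        os.fsync(dirfd)    # the directory entry reaches the disk
    finally:
        os.close(dirfd)
    return path            # only now would a server send its reply
```

Each `fsync` waits for the disk, which is the per-operation overhead the slide refers to, on top of the network round trip.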

New Statelessness Problems
• Stale file handle problem
  • cd ~vivek/temp1/temp in window A
  • rm -r ~vivek/temp1 in window B
  • “ls” in window A
• Stale inode problem
  • Machine A gets file for read
  • Filesystem reformatted by admin
  • Machine A modifies file, tries to write
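The slides do not say how a server notices that a handle has gone stale. One common approach, assumed here, is to store a generation number in the handle and bump it whenever an inode number is reused. The `HandleTable` class below is a hypothetical sketch of that idea, not NFS's actual data structure.

```python
import errno


class HandleTable:
    """Server-side table mapping inode numbers to [generation, data]."""

    def __init__(self):
        self.inodes = {}  # inode number -> [generation, data or None]

    def create(self, ino, data):
        slot = self.inodes.setdefault(ino, [0, None])
        slot[0] += 1           # reusing the inode number bumps the generation
        slot[1] = data
        return (ino, slot[0])  # the handle the client keeps

    def remove(self, ino):
        self.inodes[ino][1] = None

    def read(self, handle):
        ino, gen = handle
        slot = self.inodes.get(ino)
        if slot is None or slot[1] is None or slot[0] != gen:
            raise OSError(errno.ESTALE, "stale file handle")
        return slot[1]
```

A client holding a handle from before the removal gets an error instead of silently reading or overwriting whatever file now occupies the same inode number.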

What Slows Down Servers
• Network overhead
  • Disk DMA in 4KB pieces
  • Network processing in 1500 byte packets + manipulation
  • Multiple CPUs
• Synchronous operations
  • Nonvolatile memory + recovery
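The mismatch between disk and network units can be made concrete with a short calculation. The sketch below assumes each packet carries a full 1500 bytes of data and ignores protocol headers, so it is a lower bound. The function name `packets_per_block` is hypothetical.

```python
import math


def packets_per_block(block_bytes=4096, packet_bytes=1500):
    """Number of packets needed to carry one disk block, ignoring headers."""
    return math.ceil(block_bytes / packet_bytes)
```

With the defaults this gives 3 packets per 4 KB block, so every block read from disk has to be split up and processed several times on its way to the network.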
