

Ivy: A Read/Write P2P File System

Athicha Muthitacharoen, Robert Morris, Thomer Gil and Benjie Chen

Presented by Rachel Rubin

CS 294-4, Fall 2003

Overview
Basic Concepts
Design
Applications
Evaluation
Discussion

Main Ideas
Single file system image, like NFS, in a P2P setting
Log-based file system
Security and recovery provided
Close-to-open consistency of files

Contributions
Read/write P2P storage system
Designs a distributed file system with useful integrity properties on top of unreliable components
DHTs as a building block for systems

Challenges
Consistency
  Difficult with multiple shared writers
Unreliable participants
  Locking is unattractive
Participants may not trust each other or other machines
  Need to provide an undo
Partitions need to be supported

Background: Sprite
Non-P2P system
Represents a file system as a log of operations
Single log managed by a single server
Snapshots of i-number to i-node location mappings

Design Overview
Ivy is a set of logs
  One per participant
Each owner appends only to its own log but can read all of them
Log-head points to the most recent log record
Each participant maintains a private snapshot of the system

Design: DHash
Distributed P2P hash table mapping keys to arbitrary values
Store log records in the DHash
Two forms of integrity
  Block's key is the SHA-1 hash of the block's value
  Block's key is the public key of the owner
Log-head is stored under the owner's public key, so its key doesn’t change
Interface is a simple put(key, value); get(key)
In theory guarantees write/read consistency
  Requires careful replication and updates
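
The content-hash form of integrity can be sketched as follows. This is a toy in-memory stand-in, not DHash's real interface: `ToyDHash`, `put_content_block`, and `get_content_block` are hypothetical names chosen for illustration. Public-key blocks are left out because checking them needs a signature library.

```python
import hashlib


class ToyDHash:
    """In-memory stand-in for a DHT exposing put(key, value) / get(key)."""

    def __init__(self):
        self._store = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: bytes) -> bytes:
        return self._store[key]


def put_content_block(dht: ToyDHash, value: bytes) -> bytes:
    """Store an immutable block under the SHA-1 hash of its value."""
    key = hashlib.sha1(value).digest()
    dht.put(key, value)
    return key


def get_content_block(dht: ToyDHash, key: bytes) -> bytes:
    """Fetch a block and check that it still hashes to its key."""
    value = dht.get(key)
    if hashlib.sha1(value).digest() != key:
        raise ValueError("block does not match its content-hash key")
    return value
```

Because the key is derived from the value, a reader can detect a corrupted or substituted block without trusting the node that served it.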

Design: Log Data Structure
Linked list of immutable log records
Log record is a single file system modification
  Like an NFS operation
  Contains permissions
  i-numbers of affected files noted in the log

Fields in the log
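
A minimal sketch of one participant's log as a linked list of immutable records, assuming a plain dictionary in place of DHash. The record fields (`op`, `inum`, `seq`, `prev`) and the function names are illustrative assumptions, not the field layout from the original slide.

```python
import hashlib
import json

store = {}  # stand-in for DHash: key -> stored value


def append_record(owner_key, op, inum, seq):
    """Append one immutable record to the owner's log and move the log-head."""
    record = {"op": op, "inum": inum, "seq": seq, "prev": store.get(owner_key)}
    value = json.dumps(record, sort_keys=True)
    key = hashlib.sha1(value.encode()).hexdigest()
    store[key] = value        # the record lives under its content hash
    store[owner_key] = key    # the log-head now names the newest record
    return key


def walk_log(owner_key):
    """Yield records from newest to oldest by following prev pointers."""
    key = store.get(owner_key)
    while key is not None:
        record = json.loads(store[key])
        yield record
        key = record["prev"]
```

Only the log-head entry is ever overwritten; every record stays fixed once written, which is what lets its content hash serve as its key.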

Design: Views
A view is a set of logs that comprise the file system
A view-block points to all log-heads in the view
A view-block key is a hash of the public keys of participants so they can be easily identified
File system is named using the view-block key
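
One way to derive a view-block key from the participants' public keys is sketched below. The slide only says the key is a hash of the public keys; the sorting, the length prefix, and the name `view_block_key` are assumptions made for this example.

```python
import hashlib


def view_block_key(public_keys):
    """Derive a file system name from the participants' public keys."""
    digest = hashlib.sha1()
    for pk in sorted(public_keys):                  # order of listing must not matter
        digest.update(len(pk).to_bytes(4, "big"))   # length prefix avoids ambiguous joins
        digest.update(pk)
    return digest.hexdigest()
```

Any participant holding the same set of public keys computes the same name, and changing the set of participants produces a different file system name.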

Log Usage
Non-updating requests satisfied in a single pass
Updating requests require more
  Scan for changes
  Append a log record describing the update
  Fill in required fields
  Update log-head

Combining Logs
Ivy server consults all logs to find relevant information
Obey causality
All users should choose the same order
Logs ordered by sequence numbers and version vectors
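
The ordering requirement can be illustrated with version vectors. This sketch is not Ivy's exact rule: sorting by the sum of the vector entries and then by owner name is a simple stand-in that respects causality and gives every participant the same order. The record layout `(owner, version_vector, op)` is hypothetical.

```python
def happened_before(a, b):
    """True if version vector a is causally before version vector b."""
    keys = set(a) | set(b)
    no_later = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly_earlier = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_later and strictly_earlier


def merge_order(records):
    """Order (owner, version_vector, op) records the same way on every node.

    If one record happened before another, its vector sum is strictly
    smaller, so causal order is preserved; concurrent records are
    tie-broken by owner name.
    """
    return sorted(records, key=lambda r: (sum(r[1].values()), r[0]))
```

For example, a record from `bob` that has already seen `alice`'s first record sorts after it, while two concurrent records always land in the same relative order regardless of which node does the merge.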

Snapshots
Participants periodically create a private snapshot of the system
Prevents full log traversal
Stored in DHash to make it persistent
Built off of the previous snapshot
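
Building a snapshot from the previous one might look like the sketch below. It assumes a single log with increasing sequence numbers and a hypothetical record shape (`seq`, `op`, `inum`, `data`); a real snapshot would cover every log in the view.

```python
def apply_records(snapshot, records):
    """Return a new snapshot: the previous one plus any newer records.

    `records` must be ordered oldest to newest.
    """
    files = dict(snapshot["files"])   # copy so the old snapshot stays intact
    seen = snapshot["seq"]
    for rec in records:
        if rec["seq"] <= seen:
            continue                  # already reflected in the previous snapshot
        if rec["op"] == "write":
            files[rec["inum"]] = rec["data"]
        elif rec["op"] == "unlink":
            files.pop(rec["inum"], None)
        seen = rec["seq"]
    return {"seq": seen, "files": files}
```

Only records newer than the snapshot are replayed, which is the point of the slide: the cost of bringing state up to date depends on the log tail, not on the full log.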

Operations Supported
File system creation
File creation
File name lookup
File read
File attributes
Directory listings

Application Semantics
Write-read semantics
  Updates immediately visible
  Except in network partitions
Close-to-open file semantics
  Prevents fetching the log head on every read
Concurrent operations ordered and executed
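
Close-to-open semantics in general can be sketched with a toy client that refreshes its view only at open and publishes writes only at close. `ToyServer` and `ToyClient` are hypothetical names, and the shared dictionary stands in for the merged logs; this is not Ivy's implementation.

```python
class ToyServer:
    """Shared state that every client can reach."""

    def __init__(self):
        self.files = {}


class ToyClient:
    def __init__(self, server):
        self.server = server
        self.cache = {}      # file contents as of the last open()
        self.dirty = set()   # files written locally but not yet closed

    def open(self, name):
        # Fetch the latest state once, at open time.
        self.cache[name] = self.server.files.get(name, "")

    def read(self, name):
        # Served from the cache: no fetch per read.
        return self.cache[name]

    def write(self, name, data):
        self.cache[name] = data
        self.dirty.add(name)

    def close(self, name):
        # Writes become visible to other clients when the file is closed.
        if name in self.dirty:
            self.server.files[name] = self.cache[name]
            self.dirty.discard(name)
```

A reader that opened the file before the writer closed it keeps seeing the old contents until it opens the file again, which is the trade that avoids a fetch on every read.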

Application Semantics cont.
Exclusive create
  Except on partitions
Partitioned updates supported
  Version vectors used to make everything consistent later
  Conflict resolution tool in place

Security
Bad behavior is discovered and eradicated
Roll back logs to exclude malicious actions

Evaluation
Performance in different settings
  Local
  WAN
  Multiple participants
  Multiple DHash nodes
  Number of concurrent writers
  Snapshot interval
Modified Andrew Benchmark (MAB) used

Single User MAB on LAN
One log

Single User on WAN

Many logs

One Writer Many Writers

Snapshot Interval
There is a “sweet” spot
It’s a flat curve so snapshots can be done with more frequency

Questions
Are multiple logs a good idea?
Would Byzantine agreement be useful in this system?
Is the performance too bad?