Ivy: A Read/Write P2P File System
Athicha Muthitacharoen, Robert Morris, Thomer Gil, and Benjie Chen
Presented by Rachel Rubin
CS 294-4, Fall 2003
Main Ideas
  Single file system image, like NFS, in a P2P setting
  Log-based file system
  Security and recovery provided
  Close-to-open consistency of files
Contributions
  Read/write P2P storage system
  Designs a distributed file system with useful integrity properties on top of unreliable components
  DHTs as a building block for systems
Challenges
  Consistency
    Difficult with multiple shared writers
  Unreliable participants
    Locking is unattractive
  Participants may not trust each other or other machines
    Need to provide an undo
  Partitions need to be supported
Background: Sprite
  Non-P2P system
  Represents a file system as a log of operations
  Single log managed by a single server
  Snapshots of i-number to i-node location mappings
Design Overview
  Ivy is a set of logs
    One per participant
  Owner appends to own log only but can read all of them
  Log-head points to the most recent log record
  Each participant maintains a private snapshot of the system
Design: DHash
  Distributed P2P hash table mapping keys to arbitrary values
  Log records are stored in DHash
  Two forms of integrity
    Block's key is the SHA-1 hash of the block's value
    Block's key is the public key of the owner
  Log-head is stored under the public key, so its key doesn’t change
  Interface is a simple put(key, value); get(key)
  In theory guarantees write/read consistency
    Requires careful replication and updates
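
To make the first form of integrity concrete, here is a minimal sketch of a block keyed by the SHA-1 hash of its value. It is not DHash's actual code: ToyDHash, put_content_hash, and get_content_hash are hypothetical names, and the "network" is a local dictionary. Blocks keyed by a public key are left out because checking them needs a signature library.

```python
import hashlib


class ToyDHash:
    """In-memory stand-in for DHash; a real DHT spreads blocks across nodes."""

    def __init__(self):
        self._blocks = {}

    def put(self, key: bytes, value: bytes) -> None:
        self._blocks[key] = value

    def get(self, key: bytes) -> bytes:
        return self._blocks[key]


def put_content_hash(dht: ToyDHash, value: bytes) -> bytes:
    """Store value under the SHA-1 hash of its contents and return the key."""
    key = hashlib.sha1(value).digest()
    dht.put(key, value)
    return key


def get_content_hash(dht: ToyDHash, key: bytes) -> bytes:
    """Fetch a block and reject it if it does not hash to its key."""
    value = dht.get(key)
    if hashlib.sha1(value).digest() != key:
        raise ValueError("block failed integrity check")
    return value


dht = ToyDHash()
key = put_content_hash(dht, b"log record bytes")
fetched = get_content_hash(dht, key)

# Simulate an untrusted node replacing the stored value.
dht.put(key, b"forged bytes")
try:
    get_content_hash(dht, key)
    tampered_detected = False
except ValueError:
    tampered_detected = True
```

Because the reader recomputes the hash, it does not need to trust the node that served the block.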
Design: Log Data Structure
  Linked list of immutable log records
  Log record is a single file system modification
    Like an NFS operation
    Contains permissions
    i-numbers of affected files noted in log
  Fields in the log
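
The linked-list structure can be sketched as follows. This is an illustration under assumptions, not Ivy's record format: the field names (op, inumber, prev), the JSON encoding, and the two dictionaries standing in for DHash and the log-heads are all hypothetical.

```python
import hashlib
import json

store = {}       # hypothetical stand-in for DHash: key -> record bytes
log_heads = {}   # owner -> key of that owner's most recent record


def append_record(owner: str, op: str, inumber: int, **fields) -> str:
    """Append one immutable record to owner's log and move the log-head."""
    record = {
        "op": op,
        "inumber": inumber,
        "prev": log_heads.get(owner),  # link to the previous record, or None
        **fields,
    }
    value = json.dumps(record, sort_keys=True).encode()
    key = hashlib.sha1(value).hexdigest()
    store[key] = value       # records are never rewritten once stored
    log_heads[owner] = key   # only the log-head pointer changes
    return key


def scan(owner: str):
    """Yield owner's records from newest to oldest by following prev links."""
    key = log_heads.get(owner)
    while key is not None:
        record = json.loads(store[key])
        yield record
        key = record["prev"]


append_record("alice", "create", 7, name="notes.txt")
append_record("alice", "write", 7, offset=0, data="hello")
ops = [r["op"] for r in scan("alice")]  # ["write", "create"]
```

Appending touches only the owner's own log, which matches the rule that each participant writes one log and reads all of them.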
Design: Views
  A view is a set of logs that make up the file system
  A view-block points to all log-heads in the view
  A view-block key is a hash of the public keys of participants, so they can be easily identified
  File system is named using the view-block key
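
One way to picture how a view gets its name is the sketch below. The function view_block_key and the sorting and length-prefix steps are assumptions made for illustration; the slides only say that the key is a hash of the participants' public keys.

```python
import hashlib
from typing import Iterable


def view_block_key(public_keys: Iterable[bytes]) -> str:
    """Derive a file system name from the participants' public keys."""
    h = hashlib.sha1()
    for pk in sorted(public_keys):            # assumed: listing order is irrelevant
        h.update(len(pk).to_bytes(4, "big"))  # length prefix keeps key boundaries unambiguous
        h.update(pk)
    return h.hexdigest()


alice_pk = b"alice-public-key"   # placeholder bytes, not real keys
bob_pk = b"bob-public-key"

name_ab = view_block_key([alice_pk, bob_pk])
name_ba = view_block_key([bob_pk, alice_pk])
name_a = view_block_key([alice_pk])
```

In this sketch, adding or removing a participant changes the name, so two users holding the same name agree on exactly which logs belong to the file system.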
Log Usage
  Non-updating requests satisfied in a single pass
    Others require more
  Scan for changes
  Appends
    Log with description of update
    Fills in required fields
    Updates log-head
Combining logs
  Ivy server consults all logs to find relevant information
  Obey causality
  All users should choose the same order
  Logs ordered by sequence numbers and version vectors
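
The two requirements above (obey causality, and have every user pick the same order) can be sketched as below. This is a simplification rather than Ivy's exact rule: records are sorted by the sum of their version-vector entries, then by owner and sequence number. The sum is strictly smaller for a causally earlier record, so the order respects causality, and the remaining ties are broken deterministically. The record layout (owner, seq, vv) is hypothetical.

```python
def happened_before(a: dict, b: dict) -> bool:
    """True if version vector a is causally before version vector b."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and strictly_less


def merge_logs(logs: dict) -> list:
    """Flatten per-owner logs into one order that every participant computes alike."""
    records = [r for log in logs.values() for r in log]
    return sorted(
        records,
        key=lambda r: (sum(r["vv"].values()), r["owner"], r["seq"]),
    )


logs = {
    "alice": [
        {"owner": "alice", "seq": 1, "vv": {"alice": 1}},
        # Written after alice had seen bob's first record.
        {"owner": "alice", "seq": 2, "vv": {"alice": 2, "bob": 1}},
    ],
    "bob": [{"owner": "bob", "seq": 1, "vv": {"bob": 1}}],
}
merged = merge_logs(logs)
order = [f'{r["owner"]}:{r["seq"]}' for r in merged]
# order is ["alice:1", "bob:1", "alice:2"]
```

The first two records are concurrent, so their relative order is arbitrary; what matters is that every participant breaks the tie the same way.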
Snapshots
  Participants periodically create a private snapshot of the system
  Prevents full log traversal
  Stored in DHash to make it persistent
  Built off of the previous snapshot
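
The idea of building each snapshot from the previous one can be sketched as follows. The layout is assumed for illustration: a snapshot holds file contents keyed by i-number plus a count of records already applied from each log, and every record is treated as a whole-file write. Ordering between different owners' records is ignored here to keep the sketch short.

```python
def build_snapshot(prev: dict, logs: dict):
    """Extend prev with only the records appended since prev was taken.

    prev: {"files": {inumber: data}, "applied": {owner: count}}
    logs: {owner: [records, oldest first]}
    Returns the new snapshot and the number of records replayed.
    """
    files = dict(prev["files"])
    applied = dict(prev["applied"])
    replayed = 0
    for owner in sorted(logs):
        records = logs[owner]
        for record in records[applied.get(owner, 0):]:  # skip what prev already covers
            files[record["inumber"]] = record["data"]
            replayed += 1
        applied[owner] = len(records)
    return {"files": files, "applied": applied}, replayed


empty = {"files": {}, "applied": {}}
logs = {"alice": [{"inumber": 1, "data": "a1"}]}
snap1, replayed1 = build_snapshot(empty, logs)

logs["alice"].append({"inumber": 1, "data": "a2"})
logs["bob"] = [{"inumber": 2, "data": "b1"}]
snap2, replayed2 = build_snapshot(snap1, logs)
```

The second call replays two records instead of three, which is the saving the slide describes: the cost depends on what changed since the last snapshot, not on total log length.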
Operations Supported
  File system creation
  File creation
  File name lookup
  File read
  File attributes
  Directory listings
Application Semantics
  Write-read semantics
    Updates immediately visible
    Except in network partitions
  Close-to-open file semantics
    Prevents fetching the log-head on every read
  Concurrent operations ordered and executed
Application Semantics cont.
  Exclusive create
    Except on partitions
  Partitioned updates supported
    Version vectors used to make everything consistent later
    Conflict resolution tool in place
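
How version vectors can flag updates made on opposite sides of a partition is sketched below. This is not Ivy's conflict resolution tool: find_conflicts, dominates, and the record layout (id, inumber, vv) are hypothetical, and the sketch only reports pairs of records that touch the same i-number without either having seen the other.

```python
def dominates(a: dict, b: dict) -> bool:
    """True if version vector a has seen everything recorded in b."""
    return all(a.get(k, 0) >= n for k, n in b.items())


def find_conflicts(records: list) -> list:
    """Return pairs of record ids that update the same i-number concurrently."""
    conflicts = []
    for i, r in enumerate(records):
        for s in records[i + 1:]:
            same_file = r["inumber"] == s["inumber"]
            unordered = (not dominates(r["vv"], s["vv"])
                         and not dominates(s["vv"], r["vv"]))
            if same_file and unordered:
                conflicts.append((r["id"], s["id"]))
    return conflicts


records = [
    {"id": "a1", "inumber": 5, "vv": {"alice": 1}},            # written during a partition
    {"id": "b1", "inumber": 5, "vv": {"bob": 1}},              # written on the other side
    {"id": "a2", "inumber": 5, "vv": {"alice": 2, "bob": 1}},  # written after the partition healed
    {"id": "c1", "inumber": 9, "vv": {"carol": 1}},            # concurrent, but a different file
]
conflicts = find_conflicts(records)  # [("a1", "b1")]
```

Only the first two records are reported: the third has seen both of them, and the fourth touches a different file.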
Evaluation
  Performance in different settings
    Local
    WAN
    Multiple participants
    Multiple DHash nodes
    Number of concurrent writers
    Snapshot interval
  Modified Andrew Benchmark used
Snapshot interval
  There is a “sweet spot”
  It’s a flat curve, so snapshots can be done more frequently