Self Stabilizing Distributed File System

Shlomi Dolev and Ronen I. Kat, Department of Computer Science, Ben-Gurion University. Research sponsored by IBM.

Post on 30-Dec-2015

Page 1: Self Stabilizing Distributed File System

Self Stabilizing Distributed File System

Shlomi Dolev and Ronen I. Kat

Department of Computer Science, Ben-Gurion University

Research Sponsored by IBM

Page 2: Self Stabilizing Distributed File System

DFS Motivation

• Performance

• Fault tolerance

• Placing files closer to users

Page 3: Self Stabilizing Distributed File System

Related Work

• File systems
  • NFS – Network File System protocol
  • AFS – Andrew File System – CMU (1988)
  • Coda – CMU (1998)
  • InterMezzo – Peter J. Braam, CMU

• Peer to peer (2000)
  • Global storage: OceanStore – Berkeley
  • Serverless: Microsoft Farsite

Page 4: Self Stabilizing Distributed File System

Talk Overview

• Self-stabilization
• Design
• Algorithms
• File system implementation
• Future work

Page 5: Self Stabilizing Distributed File System

Self Stabilization

• Self healing
• Adaptiveness
• Automatic recovery
• Autonomic computing

Self-stabilization: Dijkstra, 1974

Page 6: Self Stabilizing Distributed File System

Self Stabilization

A self-stabilizing system is a system that can automatically recover following the occurrence of (transient) faults.

The idea is to design a system that can be started in an arbitrary state and still converge to a desired behaviour.

See, e.g., Self-Stabilization, S. Dolev.
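
To make the definition concrete, the sketch below simulates Dijkstra's K-state token ring, the classic example from the 1974 work credited on the previous slide. It is only an illustration of convergence from an arbitrary state and is not part of the file system; the function names and the seeded random scheduler are choices made for this example. A machine is "privileged" when its rule is enabled, and the legitimate behaviour is that exactly one machine is privileged at any time.

```python
import random


def privileged(states):
    """Return the indices of machines whose rule is currently enabled."""
    n = len(states)
    enabled = []
    for i in range(n):
        if i == 0:
            # Machine 0 is privileged when it equals its predecessor (the last machine).
            if states[0] == states[n - 1]:
                enabled.append(0)
        elif states[i] != states[i - 1]:
            # Every other machine is privileged when it differs from its predecessor.
            enabled.append(i)
    return enabled


def move(states, i, k):
    """Execute the rule of machine i (states are integers modulo k, with k >= n)."""
    if i == 0:
        states[0] = (states[0] + 1) % k
    else:
        states[i] = states[i - 1]


def run(initial, k, steps, seed=0):
    """Let a central scheduler pick one privileged machine per step."""
    rng = random.Random(seed)
    states = list(initial)
    for _ in range(steps):
        move(states, rng.choice(privileged(states)), k)
    return states


if __name__ == "__main__":
    # Arbitrary (possibly corrupted) starting state with several privileges.
    final = run([3, 1, 4, 1, 0], k=6, steps=200)
    print(final, "privileged:", privileged(final))
```

Whatever the starting values, the ring ends up with a single privilege that then circulates forever, which is the "arbitrary state, desired behaviour" property the slide describes.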

Page 7: Self Stabilizing Distributed File System

Self Stabilization Motivation

• The combination and type of faults cannot be totally anticipated in on-going systems

• Any on-going system must be self-stabilizing (or manually monitored)

• A self-stabilizing algorithm can recover from any arbitrary state reached due to the occurrence of faults

Page 8: Self Stabilizing Distributed File System

Design

Page 9: Self Stabilizing Distributed File System

Design

• Replication servers are joined to a spanning tree

• A spanning tree is constructed

• File updates are propagated using a self-stabilizing β-synchronizer

Page 10: Self Stabilizing Distributed File System

Design (Cont'd)

• Clients join the replication tree and form a caching tree

• File leases

• Global locking

Page 11: Self Stabilizing Distributed File System

Algorithms – Self Stabilizing

• Electing a leader (leader election)
• Collecting connectivity information
• Optimising communication costs
• β-synchronizer for file consistency

Page 12: Self Stabilizing Distributed File System

Leader Election

• A single leader coordinates construction

• If none exists, a server becomes a leader
• If more than one exists, one survives
• Messages are broadcast periodically

Page 13: Self Stabilizing Distributed File System

Leader Election Algorithm

• Every T1 do:
  – If (p = leader) then send-multicast(‘I’m a leader’)
  – Leader-exists = true

• Every T1+Td do:
  – If (not leader-exists) then leader = p
  – Leader-exists = false

• Upon arrival of message do:
  – If (p.volume = volume) then
      – If (p = leader) then leader = min(leader, sender)
      – Else leader = sender
  – Leader-exists = true
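
The three rules above can be sketched as a small simulation. This is one reading of the slide under stated assumptions, not the authors' code: the `Node` class, the `simulate` driver and the tick-based timers are hypothetical; T1 is modelled as one tick and T1+Td as two ticks; messages are delivered instantly; and `leader_exists` is set only by a node that actually announces itself or receives an announcement for its own volume (if it were set unconditionally, a network without a leader would never elect one).

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    volume: str
    leader: int                 # may hold an arbitrary (corrupted) value at start
    leader_exists: bool = False

    def on_announce_timer(self, multicast):
        """Every T1: a node that believes it is the leader announces itself."""
        if self.leader == self.node_id:
            multicast(self.node_id, self.volume)
            self.leader_exists = True  # assumption: only the announcer sets this

    def on_timeout_timer(self):
        """Every T1 + Td: take over if no announcement was seen since the last check."""
        if not self.leader_exists:
            self.leader = self.node_id
        self.leader_exists = False

    def on_message(self, sender, volume):
        """Upon arrival of an 'I'm a leader' message."""
        if volume != self.volume:
            return  # assumption: announcements for other volumes are ignored
        if self.leader == self.node_id:
            self.leader = min(self.leader, sender)  # two leaders: smaller ID survives
        else:
            self.leader = sender
        self.leader_exists = True


def simulate(nodes, ticks, timeout_every=2):
    """Drive all nodes for a number of ticks and return each node's leader."""
    def multicast(sender, volume):
        for node in nodes:
            if node.node_id != sender:
                node.on_message(sender, volume)

    for tick in range(1, ticks + 1):
        for node in nodes:
            node.on_announce_timer(multicast)
        if tick % timeout_every == 0:
            for node in nodes:
                node.on_timeout_timer()
    return [node.leader for node in nodes]


if __name__ == "__main__":
    # Every node starts out believing in a leader that does not exist.
    nodes = [Node(3, "vol", leader=99, leader_exists=True),
             Node(1, "vol", leader=42),
             Node(2, "vol", leader=7, leader_exists=True)]
    print(simulate(nodes, ticks=10))
```

Starting from stale leader values, the timeout rule eventually makes some node claim leadership, and the `min` rule removes any surplus leaders, matching the "if none exists" and "if more than one exists" cases on the previous slide.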

Page 14: Self Stabilizing Distributed File System

Algorithms – Self Stabilizing

• Electing a leader (leader election)
• Collecting connectivity information
• Optimising communication costs
• β-synchronizer for file consistency

Page 15: Self Stabilizing Distributed File System

Induced Graph Example

Page 16: Self Stabilizing Distributed File System

Update Algorithm

• Collect routing tables from all neighbours in the induced graph

• Elect a manager (local leader) for the tree: the server with the minimal ID

• Build a distributed BFS spanning tree

• The algorithm converges
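
As a rough illustration of why such a tree construction converges, the sketch below uses the classic distance-relaxation rule for self-stabilizing BFS trees. It is an assumption that the update algorithm behaves like this; the slides do not give its details. The shared `graph` dictionary stands in for the routing tables collected from neighbours, the rounds are synchronous, the root is simply taken to be the minimal ID, and the initial distances may be arbitrary non-negative values.

```python
def stabilize_bfs(graph, state, rounds):
    """Repeatedly apply the BFS relaxation rule.

    graph: dict node_id -> list of neighbour IDs (undirected, connected)
    state: dict node_id -> (distance, parent), possibly corrupted
    Returns the state after the given number of synchronous rounds.
    """
    root = min(graph)  # the manager: the server with the minimal ID
    for _ in range(rounds):
        new_state = {}
        for node, neighbours in graph.items():
            if node == root:
                new_state[node] = (0, None)
            else:
                # Adopt the neighbour that claims the smallest distance (ties: smaller ID).
                best = min(neighbours, key=lambda n: (state[n][0], n))
                new_state[node] = (state[best][0] + 1, best)
        state = new_state
    return state


if __name__ == "__main__":
    graph = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
    corrupted = {1: (7, 4), 2: (0, 5), 3: (3, 1), 4: (1, 2), 5: (0, 4)}
    for node, (dist, parent) in sorted(stabilize_bfs(graph, corrupted, len(graph)).items()):
        print(node, dist, parent)
```

After k rounds every node within distance k-1 of the root holds its correct distance and every other node claims at least k, so a number of rounds equal to the number of nodes is enough here.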

Page 17: Self Stabilizing Distributed File System

Algorithms – Self Stabilizing

• Electing a leader (leader election)
• Collecting connectivity information
• Optimising communication costs
• β-synchronizer for file consistency

Page 18: Self Stabilizing Distributed File System

Optimising Communication Costs

• Goal: find the minimal radius that keeps connectivity

• Increase the radius by a factor of 2
• Run a 2nd instance of update with a smaller radius
• Search for the minimal radius using binary search
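
The doubling and binary-search steps can be sketched as follows. This is a centralized stand-in under stated assumptions: the slides run a second instance of the update algorithm to test a smaller radius, whereas here a hypothetical predicate `connected_at(radius)` answers that question directly, radii are positive integers, and connectivity is assumed to be monotone in the radius. The helper `is_connected` with nodes placed on a line is only a toy model for the example.

```python
from collections import deque


def is_connected(positions, radius):
    """Toy model: nodes on a line are linked when at most `radius` apart."""
    if not positions:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(len(positions)):
            if j not in seen and abs(positions[i] - positions[j]) <= radius:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(positions)


def minimal_radius(connected_at, start=1, limit=1 << 20):
    """Smallest integer radius >= 1 for which connected_at(radius) is true."""
    lo, hi = 0, start            # lo is always a radius known (or assumed) to fail
    while not connected_at(hi):  # phase 1: double until connectivity is reached
        if hi >= limit:
            raise ValueError("no radius up to the limit keeps connectivity")
        lo, hi = hi, hi * 2
    while hi - lo > 1:           # phase 2: binary search between lo and hi
        mid = (lo + hi) // 2
        if connected_at(mid):
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    positions = [0, 3, 4, 9]     # the largest gap between neighbours is 5
    print(minimal_radius(lambda r: is_connected(positions, r)))
```

Doubling finds a working radius after a logarithmic number of tests, and the binary search then needs only a logarithmic number of further tests to settle on the minimum.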

Page 19: Self Stabilizing Distributed File System

Tree Structure

Page 20: Self Stabilizing Distributed File System

Caching Tree

• Extends the replication tree
• The update algorithm constructs both
• Servers execute two instances
• Caches execute one instance

Page 21: Self Stabilizing Distributed File System

Combined Spanning Tree

Page 22: Self Stabilizing Distributed File System

Algorithms – Self Stabilizing

• Electing a leader (leader election)
• Collecting connectivity information
• Optimising communication costs
• β-synchronizer for file consistency

Page 23: Self Stabilizing Distributed File System

Synchronization Mechanism

• Provide reliable command and timing
• Propagate commands between servers
• Collect and distribute information
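
A minimal sketch of the pattern these bullets suggest: a command is broadcast down a rooted tree and the replies are gathered back up (a convergecast), with the next pulse starting only after the previous one has completed. The `children` mapping, the `handle` callback and both function names are hypothetical, and the sketch leaves out fault handling entirely.

```python
def pulse(children, node, command, handle):
    """Deliver `command` to `node` and its whole subtree, then return all replies.

    children: dict node_id -> list of child IDs in the rooted spanning tree
    handle:   function (node_id, command) -> reply produced locally at that node
    """
    replies = {node: handle(node, command)}
    for child in children.get(node, []):
        # The reply of a subtree is available only after every node in it has answered.
        replies.update(pulse(children, child, command, handle))
    return replies


def run_pulses(children, root, commands, handle):
    """Issue commands one pulse at a time; pulse numbers give a common ordering."""
    log = []
    for pulse_no, command in enumerate(commands, start=1):
        log.append((pulse_no, pulse(children, root, command, handle)))
    return log


if __name__ == "__main__":
    tree = {1: [2, 3], 2: [4]}
    for entry in run_pulses(tree, 1, ["collect", "commit"], lambda n, c: f"{c}@{n}"):
        print(entry)
```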

Page 24: Self Stabilizing Distributed File System

Replication Consistency

• Verifies signatures
• Multiple signatures – a conflict
• Conflict resolution
• Broadcast resolved signature
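
The check described above can be sketched as follows. Everything beyond "more than one signature means a conflict" is an assumption made for illustration: the slides do not say how signatures are computed or how conflicts are resolved, so SHA-256 and the "most widely held signature wins, ties go to the smallest string" rule are placeholders, and `signature` and `reconcile` are hypothetical names.

```python
import hashlib
from collections import Counter


def signature(content: bytes) -> str:
    """Placeholder signature: a SHA-256 digest of the file content."""
    return hashlib.sha256(content).hexdigest()


def reconcile(reported):
    """Compare the signatures reported by the replicas of one file.

    reported: dict server_id -> signature
    Returns (conflict_detected, signature_to_broadcast).
    """
    if not reported:
        raise ValueError("no signatures reported")
    counts = Counter(reported.values())
    if len(counts) == 1:
        return False, next(iter(counts))
    # Hypothetical resolution rule: majority first, then smallest signature string.
    resolved = min(counts, key=lambda sig: (-counts[sig], sig))
    return True, resolved


if __name__ == "__main__":
    old, new = signature(b"version 1"), signature(b"version 2")
    print(reconcile({1: old, 2: new, 3: new}))
```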

Page 25: Self Stabilizing Distributed File System

Locking Table

• A (unified) global lock table
• Locks are requested
• Leader resolves multiple locks
• Locks are removed by cancelling the lock request
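
One way to picture the lock table is as a function of the outstanding requests, recomputed by the leader. The sketch below assumes, purely for illustration, that a current holder keeps its lock while its request stands and that competing requests for a free file are resolved in favour of the lowest server ID; the slides do not state the actual rule, and `resolve_locks` is a hypothetical name.

```python
def resolve_locks(requests, holders=None):
    """Compute the global lock table from the outstanding lock requests.

    requests: iterable of (server_id, path) pairs currently requested
    holders:  previous table (path -> server_id), if any
    Returns a dict path -> server_id holding the lock.
    """
    requests = set(requests)
    table = {}
    # A holder keeps its lock only while its request is still present.
    for path, holder in (holders or {}).items():
        if (holder, path) in requests:
            table[path] = holder
    # Free files go to the lowest requesting server ID (assumed tie-break).
    for server_id, path in sorted(requests):
        table.setdefault(path, server_id)
    return table


if __name__ == "__main__":
    requests = {(3, "/a"), (2, "/b")}
    table = resolve_locks(requests)
    requests.add((1, "/a"))          # a competing request does not evict the holder
    table = resolve_locks(requests, table)
    requests.discard((3, "/a"))      # cancelling the request releases the lock
    print(resolve_locks(requests, table))
```

Because the table is derived from the requests themselves, removing a request is all it takes to release a lock, which matches the last bullet above.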

Page 26: Self Stabilizing Distributed File System

File System Implementation

Page 27: Self Stabilizing Distributed File System

Accessing a File

[Flowchart. Decisions: "Update?" and "Cached?", each with Yes/No branches. Actions: "Lock file", "Get signature", "Get a copy", "Use local copy".]

Page 28: Self Stabilizing Distributed File System

Closing a File

[Flowchart. Decision: "Update?" with Yes/No branches. Actions: "Send new signature", "Confirm signature".]

Page 29: Self Stabilizing Distributed File System

Meta Access

• Globally processed
• Blocked until a lock is obtained

[Flow: Lock file → Execute command → Wait for confirmation]

Page 30: Self Stabilizing Distributed File System

Linux Based bgRFS

[Architecture diagram. Labels: Application; User level; Linux system calls; System calls; New implementation: open, close, lstat, mkdir, etc.; Up calls; SyncDaemon: cache manager & server; Network communication.]

Page 31: Self Stabilizing Distributed File System

Future Work

• Kernel VFS module
• Communication improvements:
  – Reducing update messages
  – Using timers with the β-synchronizer
• Performance enhancements
• Integrating disconnected operations
• Conflict resolution algorithms

Page 32: Self Stabilizing Distributed File System

Credits

Undergraduate Students: Amir Livneh, Granik, Lansky, Shmuel, Shish, Erlich, Chohen, Biran, Fridman, Bernard, Ferents, Feintuch, Shalev, Kraim, Hayuit

Faculty: Prof. Shlomi Dolev

Graduate Students: Ronen I. Kat

System Engineer: Albina Budker

Page 33: Self Stabilizing Distributed File System

Visit us at

www.cs.bgu.ac.il/~bgrfs