DFS Design and Implementation

Vijay Neelakandan


What is DFS?

• In computing, a distributed file system is a network file system where a single file system can be distributed across several physical computers[3].

• DFS allows administrators to group shared folders located on different servers by transparently connecting them to one or more DFS namespaces.


Characteristics of a DFS[3]

• Network transparency: same access operations as for local files
• Location transparency: a file name should not reveal its location
• Location independence: a file name should not be changed when its physical location changes
• User mobility: access to files from anywhere
• Fault tolerance
• Scalability
• File mobility: move files from one place to another in a running system


What and Why DFS?

• Distributed file systems generally include facilities for transparent replication and fault tolerance. That is, when a limited number of nodes in a file system go offline, the system continues to work without any data loss.

• Transparency
• Name service, directory service, caching and replication, access control and protection


Files and File Systems


Files and File systems

• Files are named data objects. Files hold structured data that are used by programs but that are not part of the programs themselves[1].

• The file system is responsible for the naming, creation, deletion, retrieval, modification, and protection of the files in the system.

• Logical components of a file, as seen by users: file name, file attributes, and data units.


Files and File systems[1]

• File name: symbolic name
– When accessing a file, its symbolic name is mapped to a unique file ID (ufid or file handle) that can locate the physical file
– This mapping is the primary function of the directory service

• File attributes: name, size, location, time, type, etc.

• Data units: organization
– Flat structure: a stream of bytes or a sequence of blocks
– Hierarchical structure: indexed records


Files and File systems

• File access
– Sequential access: a file position pointer indicates the position of the next data unit to be accessed.
– Direct access: fixed-size data units are referenced explicitly by their block numbers.
– Indexed sequential access: an index is used to locate the block in which the key/object pair resides, and the data in that block is then accessed until the object is found.
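The three access modes can be sketched in a few lines of Python. This is only an illustration: the 8-byte block size, the `key=value;` record format, and the function names are assumptions made for the example, not part of the slides.

```python
import io

BLOCK_SIZE = 8  # assumed (tiny) block size so the example stays readable


def read_sequential(f, n):
    """Sequential access: read n bytes from wherever the file pointer is."""
    return f.read(n)


def read_block(f, block_no):
    """Direct access: jump straight to a fixed-size block by its number."""
    f.seek(block_no * BLOCK_SIZE)
    return f.read(BLOCK_SIZE)


def read_indexed(f, index, key):
    """Indexed sequential access: the index names the block, then the
    block is scanned record by record until the key is found."""
    block = read_block(f, index[key])
    for record in block.decode().split(";"):
        if record.startswith(key + "="):
            return record.split("=", 1)[1]
    return None


# Two blocks of hypothetical "key=value;" records.
f = io.BytesIO(b"a=1;b=2;c=3;d=4;")
index = {"a": 0, "b": 0, "c": 1, "d": 1}  # key -> block number
print(read_sequential(f, 4), read_block(f, 1), read_indexed(f, index, "d"))
```

Note that only the sequential reader depends on the current file pointer; the other two position the pointer themselves before reading.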


Example

• UNIX

• Files are streams of characters for application programs and sequences of logical fixed-size blocks for the file system.

• Both sequential and direct access methods are supported. Other access methods can be built on top of the flat file structures.


Major Components in a file system

• Directory service: name resolution, addition and deletion of files
• Authorization service: capability and/or access control list
• File service
– Transaction: concurrency and replication management
– Basic: read/write files and get/set attributes
• System service: device, cache, and block management


Directory Service

• Directories are files that contain names and addresses of other files and subdirectories.

• Mapping and locating
• Search for a file
• Create a file
• Delete a file
• List a directory
• Rename a file
• Traverse the file system
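A minimal sketch of these directory operations, assuming a single flat directory that maps symbolic names to integer ufids. The class and method names are hypothetical; a real directory service would be hierarchical and persistent.

```python
import itertools


class DirectoryService:
    """Toy flat directory: symbolic name -> unique file id (ufid)."""

    def __init__(self):
        self._entries = {}
        self._next_ufid = itertools.count(1)

    def create(self, name):
        if name in self._entries:
            raise FileExistsError(name)
        self._entries[name] = next(self._next_ufid)
        return self._entries[name]

    def lookup(self, name):
        # Mapping and locating: raises KeyError if the name is unknown.
        return self._entries[name]

    def delete(self, name):
        del self._entries[name]

    def rename(self, old, new):
        if new in self._entries:
            raise FileExistsError(new)
        # The ufid is unchanged; only the symbolic name moves.
        self._entries[new] = self._entries.pop(old)

    def list_dir(self):
        return sorted(self._entries)


d = DirectoryService()
ufid = d.create("paper.txt")
d.rename("paper.txt", "book.txt")
print(d.lookup("book.txt") == ufid, d.list_dir())
```

Because a rename leaves the ufid untouched, clients holding the file handle are unaffected by the name change.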


Authorization Service

• File access must be regulated to ensure security
• Types of access
– Read
– Write
– Execute
– Append
– Delete
– List
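One way to regulate these access types is an access control list. The sketch below is an assumption-laden illustration: the `Access` flag names mirror the list above, while the users, file name, and `is_allowed` helper are hypothetical.

```python
from enum import Flag, auto


class Access(Flag):
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()
    APPEND = auto()
    DELETE = auto()
    LIST = auto()


# Hypothetical access control list: file name -> user -> granted rights.
acl = {
    "report.txt": {
        "alice": Access.READ | Access.WRITE,
        "bob": Access.READ,
    }
}


def is_allowed(acl, user, filename, wanted):
    """Allow the request only if every wanted right has been granted."""
    granted = acl.get(filename, {}).get(user, Access(0))
    return (granted & wanted) == wanted


print(is_allowed(acl, "alice", "report.txt", Access.WRITE))  # alice may write
print(is_allowed(acl, "bob", "report.txt", Access.WRITE))    # bob may not
```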


File Service – Basic Operations

• Create
– Allocate space
– Make an entry in the directory
• Write
– Search the directory
– Write is to take place at the location of the write pointer
• Read
– Search the directory
– Read is to take place at the location of the read pointer
• Reposition within file (file seek)
– Set the current file pointer to a given value
• Delete
– Search the directory
– Release all file space
• Truncate
– Reset the file to length zero
• Open(Fi)
– Search the directory structure
– Move the content of the directory entry to memory
• Close(Fi)
– Move the content in memory to the directory structure on disk
• Get/set file attributes
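On a local file system, these operations correspond closely to the low-level calls in Python's `os` module. The walk-through below is a sketch on a temporary local file (the file name is arbitrary); a DFS client would issue equivalent requests to a file server instead.

```python
import os
import tempfile


def demo_basic_operations():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "example.dat")
        # Create + open: allocate the file and make a directory entry.
        fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
        os.write(fd, b"hello world")     # write at the file pointer
        os.lseek(fd, 6, os.SEEK_SET)     # reposition within the file (seek)
        tail = os.read(fd, 5)            # read at the file pointer
        os.ftruncate(fd, 0)              # truncate: reset to length zero
        size = os.fstat(fd).st_size      # get a file attribute
        os.close(fd)                     # close
        os.remove(path)                  # delete: release the file's space
        exists_after_delete = os.path.exists(path)
    return tail, size, exists_after_delete


print(demo_basic_operations())
```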


System Service

System services are a file system's interface to the hardware and are transparent to users of the file system.

– Mapping of logical to physical block addresses
– Interfacing to services at the device level for file space allocation/de-allocation
– Actual read/write file operations
– Caching for performance enhancement
– Replicating for reliability improvement
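The first two responsibilities, block mapping and space allocation/de-allocation, can be sketched with a free list and a per-file block table. The `BlockMap` class and its first-fit policy are assumptions for illustration only.

```python
class BlockMap:
    """Toy block manager: free list plus a per-file block table."""

    def __init__(self, total_blocks):
        self.free = list(range(total_blocks))  # free physical blocks
        self.table = {}                        # ufid -> physical block numbers

    def allocate(self, ufid, n_blocks):
        if n_blocks > len(self.free):
            raise OSError("no space left on device")
        blocks = [self.free.pop(0) for _ in range(n_blocks)]
        self.table.setdefault(ufid, []).extend(blocks)
        return blocks

    def to_physical(self, ufid, logical_block):
        # Logical block i of a file is the i-th entry in its block table.
        return self.table[ufid][logical_block]

    def release(self, ufid):
        # De-allocation: return all of the file's blocks to the free list.
        self.free.extend(self.table.pop(ufid, []))


m = BlockMap(total_blocks=4)
m.allocate(ufid=1, n_blocks=2)
m.allocate(ufid=2, n_blocks=1)
m.release(1)
m.allocate(ufid=2, n_blocks=2)
print(m.table[2])  # file 2's logical blocks are not physically contiguous
```

The user of the file system only ever sees logical block numbers; the physical placement can change without affecting them.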


Interaction among services in a DFS

[Figure: diagram showing clients, directory services, authorization services, file services, and system service]


Organization of data files in a file system


File Mounting and Server Registration


File Mounting and Server Registration

• Attach a remote named file system to the client’s file system hierarchy at the position pointed to by a path name (the mounting point)
– A mounting point is usually a leaf of the directory tree that contains only an empty subdirectory

• Once files are mounted, they are accessed by using the concatenated logical path names, without referencing either the remote hosts or local devices
– Location transparency
– The linking information (mount table) is kept until the file systems are unmounted


File Mounting Example

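[Figure: the local client's tree is root → chow → {paper, book}; the remote server's tree is root → OS → {DFS, DSM}. The server exports OS and the client mounts it at book, so the remote /OS/DSM is accessed on the client as /chow/book/DSM.]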
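Using the paths from this example (/chow/book/DSM on the client, /OS/DSM on the server), path resolution through a mount table might be sketched as follows. The server name `remote-server` and the `resolve` function are hypothetical.

```python
def resolve(path, mount_table):
    """Translate a client path into (server, remote_path).

    Returns (None, path) when the path is purely local. The longest
    matching mount point wins, so nested mounts resolve correctly.
    """
    best = None
    for mount_point in mount_table:
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) > len(best):
                best = mount_point
    if best is None:
        return None, path
    server, exported = mount_table[best]
    return server, exported + path[len(best):]


# Mount table: local mounting point -> (server, exported remote directory).
mount_table = {"/chow/book": ("remote-server", "/OS")}

print(resolve("/chow/book/DSM", mount_table))
print(resolve("/chow/paper", mount_table))
```

The client-side name never mentions the server, which is what gives the user location transparency.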


File mounting and Server Registration

• Mounting Strategy

– Explicit mounting: clients make explicit mounting system calls whenever one is desired

– Boot mounting: a set of file servers is prescribed and all mountings are performed at the client’s boot time

– Auto-mounting: mounting of the servers is implicitly done on demand when a file is first opened by a client
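The auto-mounting strategy can be illustrated with a small sketch in which the mount happens lazily on the first open. The `AutoMounter` class, the export map, and the server name are assumptions; a real automounter would issue actual mount requests rather than updating a dictionary.

```python
class AutoMounter:
    """Toy automounter: mount a server the first time a file under it is opened."""

    def __init__(self, exports):
        self.exports = exports  # mounting point -> server (hypothetical map)
        self.mounted = {}       # mount table, filled on demand
        self.mount_calls = 0

    def _mount(self, mount_point):
        self.mount_calls += 1
        self.mounted[mount_point] = self.exports[mount_point]

    def open(self, path):
        """Return the server that serves path, mounting it if needed."""
        for mount_point in self.exports:
            if path == mount_point or path.startswith(mount_point + "/"):
                if mount_point not in self.mounted:
                    self._mount(mount_point)  # implicit, on first access
                return self.mounted[mount_point]
        return None  # local file, nothing to mount


am = AutoMounter({"/chow/book": "server-a"})
am.open("/chow/book/DSM")  # triggers the mount
am.open("/chow/book/DFS")  # reuses the existing mount
print(am.mount_calls)
```

With explicit or boot mounting, the `_mount` step would instead run when the client requests it or once at start-up for every prescribed server.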


A Simple Automounter for NFS


Server Registration

• The mounting protocol is not transparent: the initial mounting requires knowledge of the location of file servers

• Server registration
– File servers register their services, and clients consult the registration server before mounting
– Clients broadcast mounting requests, and file servers respond to clients’ requests
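Both discovery approaches can be sketched without any networking. In this illustration the `RegistrationServer` class, the `broadcast_mount_request` function, and the server addresses are hypothetical; the broadcast is simulated by asking every known server in turn.

```python
class RegistrationServer:
    """Approach 1: servers register, clients look up before mounting."""

    def __init__(self):
        self._services = {}  # exported file system -> server address

    def register(self, export, server_address):
        self._services[export] = server_address

    def locate(self, export):
        return self._services.get(export)  # None if nobody registered it


def broadcast_mount_request(export, servers):
    """Approach 2: the client asks everyone, and servers that hold the
    export respond. `servers` maps server address -> set of exports."""
    return [addr for addr, exports in servers.items() if export in exports]


registry = RegistrationServer()
registry.register("/OS", "server-a")
print(registry.locate("/OS"))

servers = {"server-a": {"/OS"}, "server-b": {"/home"}}
print(broadcast_mount_request("/OS", servers))
```

In both cases the client learns the server's location at mount time only; afterwards it uses the mounted path names.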


Stateful and Stateless File Servers


Stateful and stateless File Servers

• State information
– Opened files and their clients
– File descriptors and file handles
– Current file position pointers
– Mounting info
– Lock status
– Session keys
– Cache or buffer


Stateful and stateless File Servers

• A file server is called stateful if it maintains internally some of the state information, and stateless if it maintains none at all.

• Stateless file server: when a client sends a request to a server, the server carries out the request, sends the reply, and then removes from its internal tables all information about the request
– Between requests, no client-specific information is kept on the server
– Each request must be self-contained: full file name and offset…

• Stateful file server: file servers maintain state information about clients between requests
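The difference shows up directly in the shape of the read request. In the sketch below (class names, the in-memory `FILES` store, and the file name are all assumptions), the stateless server needs the file name and offset on every call, while the stateful server remembers the position behind a handle.

```python
FILES = {"notes.txt": b"distributed file systems"}  # hypothetical server storage


class StatelessServer:
    """Every request is self-contained: file name, offset, and count."""

    def read(self, name, offset, count):
        return FILES[name][offset:offset + count]


class StatefulServer:
    """The server keeps an open-file table with position pointers."""

    def __init__(self):
        self._open = {}  # handle -> [file name, current position]
        self._next_handle = 0

    def open(self, name):
        self._next_handle += 1
        self._open[self._next_handle] = [name, 0]
        return self._next_handle

    def read(self, handle, count):
        name, pos = self._open[handle]
        data = FILES[name][pos:pos + count]
        self._open[handle][1] = pos + len(data)  # server advances the pointer
        return data

    def close(self, handle):
        del self._open[handle]


stateless = StatelessServer()
print(stateless.read("notes.txt", 12, 4))

stateful = StatefulServer()
h = stateful.open("notes.txt")
stateful.read(h, 12)
print(stateful.read(h, 4))
```

If the stateful server restarts, its open-file table is lost and the handle becomes meaningless, whereas the stateless request can simply be resent.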


Comparing


Research


Integrated High performance DFS


1. Scientific computing applications running in a cluster environment require a high-performance distributed file system to store and share data[6].

2. The IncFS is a new approach to building a high-performance distributed file system by integrating many NFS servers.

3. The IncFS is aimed at providing a simple and convenient way to achieve high aggregate I/O bandwidth for scientific computing applications that require intensive concurrent file access[6].


IncFS continued

4. The IncFS uses a hyper structure to integrate multiple NFS file systems, and it provides multiple data layouts to distribute file data effectively among those NFS servers.
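The slides do not say which data layouts are provided. As one plausible example of distributing a file's data across several servers, the sketch below uses round-robin striping; the layout, function names, and parameters are assumptions and are not taken from the IncFS paper.

```python
def locate(offset, n_servers, stripe_size):
    """Map a file offset to (server index, offset within that server's part)."""
    stripe_no = offset // stripe_size
    server = stripe_no % n_servers
    local_offset = (stripe_no // n_servers) * stripe_size + offset % stripe_size
    return server, local_offset


def stripe(data, n_servers, stripe_size):
    """Split data into stripes and deal them out to servers round-robin."""
    parts = [bytearray() for _ in range(n_servers)]
    for start in range(0, len(data), stripe_size):
        server, _ = locate(start, n_servers, stripe_size)
        parts[server] += data[start:start + stripe_size]
    return [bytes(p) for p in parts]


data = b"abcdefghij"
parts = stripe(data, n_servers=3, stripe_size=2)
print(parts)

server, local = locate(7, n_servers=3, stripe_size=2)
print(parts[server][local] == data[7])
```

Because consecutive stripes land on different servers, concurrent reads of a large file can draw on the bandwidth of several servers at once.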


IncFS continued


Achieving High Availability[2]

1. Achieving high availability is a key goal for many distributed file systems. File replication is a well-known technique used to achieve this goal. It generally reduces client latency and increases file availability.

2. To achieve this goal, two algorithms are proposed: a primary replica assignment algorithm and an intelligent replica placement algorithm.


References

1. Upadhyaya, B.; Azimov, F.; Doan, T.T.; Eunmi Choi; SangBum Kim; Pilsung Kim. Distributed File System: Efficiency Experiments for Data Access and Communication. Networked Computing and Advanced Information Management, 2008 (NCM '08), Fourth International Conference on, Volume 2, 2-4 Sept. 2008.

2. Abdalla, S.; Ahmad, I.; Ewe Hong Tat; Gim Aik Teh; Yong Lee Kee. Towards Achieving a Highly Available Distributed File System. Advanced Communication Technology, The 9th International Conference on, Volume 3, 12-14 Feb. 2007.

3. Randy Chow, Theodore Johnson. Distributed Operating Systems and Algorithms. Addison-Wesley, 1997.

4. Pierre Boulet. Distributed File Systems. Masters Informatique TIIR et IAGL, September 28, 2006.

5. Glagoleva, 2000. A load balancing tool based on mining access patterns for Distributed File System servers. System Sciences, HICSS, Proceedings of the 35th Annual Hawaii International Conference.

6. Yi Zhao; Rongfeng Tang; Jin Xiong; Jie Ma. IncFS: an integrated high-performance distributed file system based on NFS. Networking, Architecture, and Storages, 2006 (IWNAS '06), International Workshop on, 1-3 Aug. 2006.


Thank you