
THE GOOGLE FILE SYSTEM
By Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung

1

INTRODUCTION
• Google
• Applications process lots of data
• Need a good file system
• Solution: Google File System

A large, distributed, highly fault-tolerant file system.

2

DESIGN MOTIVATIONS
1. Fault tolerance and auto-recovery need to be built into the system.

2. Standard I/O assumptions (e.g. block size) have to be re-examined.

3. Record appends are the prevalent form of writing.

4. Google applications and GFS should be co-designed.

3

INTERFACE
• Create
• Delete
• Open
• Close
• Read
• Write
• Snapshot
• Record Append
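A minimal sketch of what a client-side interface exposing these operations might look like. The names and signatures below are illustrative, not the actual GFS client library (which is C++ code linked into applications).

```python
from abc import ABC, abstractmethod


class FileHandle:
    """Opaque handle returned by open(); contents are implementation-defined."""
    def __init__(self, path: str):
        self.path = path


class GFSClient(ABC):
    """Illustrative client interface mirroring the operations listed above."""

    @abstractmethod
    def create(self, path: str) -> None: ...

    @abstractmethod
    def delete(self, path: str) -> None: ...

    @abstractmethod
    def open(self, path: str) -> FileHandle: ...

    @abstractmethod
    def close(self, handle: FileHandle) -> None: ...

    @abstractmethod
    def read(self, handle: FileHandle, offset: int, length: int) -> bytes: ...

    @abstractmethod
    def write(self, handle: FileHandle, offset: int, data: bytes) -> None: ...

    @abstractmethod
    def snapshot(self, src_path: str, dst_path: str) -> None: ...

    @abstractmethod
    def record_append(self, handle: FileHandle, data: bytes) -> int:
        """Append atomically; GFS (not the caller) chooses and returns the offset."""
```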

4

GFS ARCHITECTURE
On a single-machine FS:
• An upper layer maintains the metadata.
• A lower layer (i.e. the disk) stores the data in units called "blocks".

In GFS:
• A master process maintains the metadata.
• A lower layer (i.e. a set of chunk servers) stores the data in units called "chunks".
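A rough sketch of the read path implied by this split, assuming hypothetical RPC stubs (`master.lookup`, `chunkserver.read_chunk`) that stand in for the real network protocol.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks (see the next slides)


def gfs_read(master, path: str, offset: int, length: int) -> bytes:
    """Sketch of a read that stays within one chunk: metadata from the master,
    file data directly from a chunk server (never through the master)."""
    chunk_index = offset // CHUNK_SIZE
    # 1. Ask the master which chunk covers this offset and where its replicas live.
    chunk_handle, replica_locations = master.lookup(path, chunk_index)   # hypothetical RPC
    # 2. Read the bytes from one of the replicas (the closest one in practice).
    chunkserver = replica_locations[0]
    return chunkserver.read_chunk(chunk_handle, offset % CHUNK_SIZE, length)  # hypothetical RPC
```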

5

GFS ARCHITECTURE

6

CHUNK
• Analogous to a block, except larger.
• Size: 64 MB
• Stored on a chunk server as a file.
• A chunk handle (i.e. the chunk file name) is used to reference a chunk.
• Replicated across multiple chunk servers.
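A small sketch of the chunk-server side of this, assuming chunks are kept under a local directory and named by their handle; the path layout and helper names are assumptions, not the paper's.

```python
import os

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB


def chunk_path(storage_dir: str, chunk_handle: int) -> str:
    """A chunk is an ordinary file on the chunk server's local file system,
    referenced by its globally unique chunk handle."""
    return os.path.join(storage_dir, f"{chunk_handle:016x}.chunk")


def read_from_chunk(storage_dir: str, chunk_handle: int, offset: int, length: int) -> bytes:
    """Serve a read by seeking into the chunk file; offset is relative to the chunk."""
    assert 0 <= offset and offset + length <= CHUNK_SIZE
    with open(chunk_path(storage_dir, chunk_handle), "rb") as f:
        f.seek(offset)
        return f.read(length)
```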

7

CHUNK SIZE
• Advantages
  o Reduces client-master interaction
  o Reduces the size of the metadata (see the example below)
• Disadvantages
  o Hot spots. Solution: a higher replication factor.
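A back-of-the-envelope illustration of the metadata advantage. The ~64 bytes of master metadata per chunk is the paper's figure; the 64 KB comparison block size and the 1 TB file are illustrative assumptions.

```python
CHUNK_SIZE = 64 * 1024 * 1024      # GFS chunk size: 64 MB
SMALL_BLOCK = 64 * 1024            # a conventional block size for comparison: 64 KB (assumption)
METADATA_PER_UNIT = 64             # ~64 bytes of master metadata per unit (paper's per-chunk figure)

file_size = 1 * 1024**4            # a single 1 TB file (illustrative)

chunks = file_size // CHUNK_SIZE   # 16,384 chunks
blocks = file_size // SMALL_BLOCK  # 16,777,216 blocks

# Fewer units also means fewer client-master lookups for the same amount of data.
print(f"64 MB chunks: {chunks:>10,} units -> ~{chunks * METADATA_PER_UNIT / 2**20:.0f} MiB of metadata")
print(f"64 KB blocks: {blocks:>10,} units -> ~{blocks * METADATA_PER_UNIT / 2**20:.0f} MiB of metadata")
```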

8

MASTER
• A single, centralized master.
• Stores all metadata:
  o File namespace
  o File-to-chunk mappings
  o Chunk location information
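A sketch of those three kinds of metadata as in-memory structures. The Python types are illustrative; the real master persists the namespace and mappings via the operation log, while chunk locations are not persisted at all.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MasterMetadata:
    """Illustrative layout of the master's metadata (all kept in memory)."""
    # 1. File namespace: full path -> per-file attributes (placeholder dict here).
    namespace: Dict[str, dict] = field(default_factory=dict)
    # 2. File -> ordered list of chunk handles, one per 64 MB region of the file.
    file_chunks: Dict[str, List[int]] = field(default_factory=dict)
    # 3. Chunk handle -> chunk server locations. Not persisted: the master rebuilds
    #    this by asking chunk servers at startup and tracks it via heartbeats.
    chunk_locations: Dict[int, List[str]] = field(default_factory=dict)
```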

9

GFS ARCHITECTURE

10

SYSTEM INTERACTIONS (Write Control and Data Flow)
1. The client asks the master which chunk server holds the current lease for the chunk (the primary).
2. The master replies with the identity of the primary and the locations of the other replicas; the client caches this.
3. (3a, 3b, 3c) The client pushes the data to all replicas.
4. The client sends the write request to the primary.
5. The primary assigns the mutation a serial order, applies it locally, and forwards the write request to the secondary replicas.
6. The secondaries apply the mutation and report "operation completed" back to the primary.
7. The primary replies to the client: operation completed, or an error report.
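A condensed sketch of that flow from the client's side; the stub objects and RPC names (`get_lease_holder`, `push_data`, `apply_write`) are placeholders for the real protocol.

```python
def gfs_write(master, path: str, chunk_index: int, data: bytes):
    """Sketch of the write flow listed above (illustrative stubs, not the real API)."""
    # Steps 1-2: ask the master for the current lease holder (primary) and the
    # locations of all replicas; the client caches this for later mutations.
    primary, replicas = master.get_lease_holder(path, chunk_index)

    # Steps 3a-3c: push the data to every replica. In the real system the data is
    # pipelined along a chain of chunk servers rather than sent one-by-one.
    for replica in replicas:
        replica.push_data(data)

    # Step 4: send the write request to the primary. Steps 5-6: the primary assigns
    # the mutation a serial order, applies it locally, forwards the request to the
    # secondaries, and collects their acknowledgements.
    reply = primary.apply_write(secondaries=[r for r in replicas if r is not primary])

    # Step 7: the primary reports completion or an error; on error the client retries.
    if not reply.ok:
        raise IOError("write failed at one or more replicas; the client retries")
    return reply
```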

11

SYSTEM INTERACTIONS
• Record appends
  - The client specifies only the data; GFS chooses the offset (sketched below).
• Snapshot
  - Makes a copy of a file or a directory tree.
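A sketch of the primary's decision for a record append, using a toy in-memory Chunk class (the real chunk servers store chunks as local files). It shows why the client supplies only the data: the primary picks the offset.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB


class Chunk:
    """Toy in-memory stand-in for one replica's copy of a chunk."""
    def __init__(self):
        self.data = bytearray()

    @property
    def length(self) -> int:
        return len(self.data)

    def pad_to(self, size: int) -> None:
        self.data.extend(b"\0" * (size - len(self.data)))

    def write_at(self, offset: int, record: bytes) -> None:
        self.data[offset:offset + len(record)] = record


def record_append_at_primary(chunk: Chunk, record: bytes) -> dict:
    """The primary picks the offset, so concurrent appenders never overwrite each other."""
    if chunk.length + len(record) > CHUNK_SIZE:
        # The record does not fit: pad out the chunk (on all replicas, in the real
        # system) and tell the client to retry the append on the next chunk.
        chunk.pad_to(CHUNK_SIZE)
        return {"status": "retry_on_next_chunk"}
    offset = chunk.length                # chosen by the primary, not the client
    chunk.write_at(offset, record)       # applied locally, then forwarded to secondaries
    return {"status": "ok", "offset": offset}
```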

12

OPERATION LOG
• Historical record of critical metadata changes.
• Defines the order of concurrent operations.
• Critical:
  o Replicated on multiple remote machines.
  o The master responds to a client only after the log record has been flushed locally and remotely.
• Fast recovery by using checkpoints:
  o Checkpoints use a compact B-tree-like form that maps directly into memory.
  o Switch to a new log file and create the new checkpoint in a separate thread.
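A sketch of the "respond only after local and remote flush" discipline described above; the log, state, and mutation objects are placeholders.

```python
import pickle


def apply_metadata_mutation(local_log, remote_logs, metadata_state, mutation) -> str:
    """Sketch of the operation-log discipline: the client is answered only after the
    log record is durable locally AND on the remote log replicas."""
    record = pickle.dumps(mutation)

    local_log.write(record)            # 1. append the record to the local operation log
    local_log.flush()

    for remote in remote_logs:         # 2. replicate the record to remote machines
        remote.write(record)
        remote.flush()

    metadata_state.apply(mutation)     # 3. only now apply it to the in-memory metadata
    return "ok"                        # 4. ...and respond to the client
```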

13

MASTER OPERATIONS
• Namespace Management and Locking
• Chunk Creation
• Chunk Re-replication
• Chunk Rebalancing
• Garbage Collection

14

FAULT TOLERANCE AND DIAGNOSIS

1.High Availability

The overall system is kept highly available with two simple yet effective strategies: fast recovery and replication.

15

1.1 Fast Recovery: The master and chunk servers are designed to restart and restore their state in a few seconds.

1.2 Chunk Replication: Each chunk is replicated across multiple machines and across multiple racks.
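A sketch of rack-aware replica placement consistent with "across multiple machines, across multiple racks". The selection policy below is simplified; the real master also weighs disk utilization and recent creation activity.

```python
import random
from typing import Dict, List


def place_replicas(servers_by_rack: Dict[str, List[str]], n_replicas: int = 3) -> List[str]:
    """Spread replicas over distinct machines and racks so a rack-level failure
    (e.g. a shared switch going down) cannot take out every copy."""
    racks = list(servers_by_rack)
    random.shuffle(racks)

    # First pass: at most one replica per rack.
    chosen = [random.choice(servers_by_rack[rack]) for rack in racks[:n_replicas]]

    # If there are fewer racks than replicas, top up from any unused machine.
    if len(chosen) < n_replicas:
        remaining = [s for servers in servers_by_rack.values()
                     for s in servers if s not in chosen]
        random.shuffle(remaining)
        chosen.extend(remaining[:n_replicas - len(chosen)])
    return chosen


# Example: three racks with two chunk servers each.
print(place_replicas({"r1": ["cs1", "cs2"], "r2": ["cs3", "cs4"], "r3": ["cs5", "cs6"]}))
```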

16

1.3 Master Replication:

Log of all changes made to metadata.

Log replicated on multiple machines.

“Shadow” masters provide read-only access if the “real” master is down.

17

18

2. Data Integrity

Each chunk is broken into 64 KB blocks, and each block has an associated checksum.
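A sketch of block-level checksum verification on the chunk server. CRC32 is used here for illustration; the paper specifies 32-bit checksums over 64 KB blocks but not the exact checksum function.

```python
import zlib
from typing import List

BLOCK_SIZE = 64 * 1024  # checksums cover 64 KB blocks within a chunk


def block_checksums(chunk_data: bytes) -> List[int]:
    """One 32-bit checksum per 64 KB block of the chunk."""
    return [zlib.crc32(chunk_data[i:i + BLOCK_SIZE])
            for i in range(0, len(chunk_data), BLOCK_SIZE)]


def verify_read(chunk_data: bytes, stored: List[int], offset: int, length: int) -> bool:
    """Before returning data to a reader, verify every block that overlaps the
    requested range; a mismatch means this replica is corrupt and the read
    should be served from another replica while the chunk gets re-replicated."""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    for b in range(first, last + 1):
        block = chunk_data[b * BLOCK_SIZE:(b + 1) * BLOCK_SIZE]
        if zlib.crc32(block) != stored[b]:
            return False
    return True
```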

3. Diagnostic Logging

Diagnostic logs record the details of interactions between machines (the exact requests and responses sent on the wire, except for the file data being transferred).

19

MEASUREMENTS

They measured performance on a GFS cluster consisting of one master, two master replicas, 16 chunk servers, and 16 clients.

20

All machines are configured with:
1. Dual 1.4 GHz PIII processors
2. 2 GB of memory
3. Two 80 GB 5400 rpm disks
4. A 100 Mbps full-duplex Ethernet connection to an HP 2524 switch

21

22

23

Here too, the rate drops as the number of clients increases to 16; the append rate falls due to congestion and variance in the network transfer rates seen by different clients.

24

REAL WORLD CLUSTERS

Table 1 – Characteristics of two GFS clusters

25

Table 2 – Performance metrics for clusters A and B

26

RESULTS

1. Read and Write Rates
• The average write rate was 30 MB/s.
• When the measurements were taken, cluster B was in the middle of a write burst.
• Read rates were high; both clusters were in the middle of heavy read activity.
• Cluster A was using its resources more efficiently than B.

27

2. Master Loads

The master can easily keep up with 200 to 500 operations per second.

28

3. Recovery Time

• Killed a single chunk server (15,000 chunks containing 600 GB of data) in cluster B.

• All chunks were re-replicated in 23.2 minutes, at an effective replication rate of 440 MB/s.

29

• Killed two chunk servers (16,000 chunks and 660 GB of data).

• The failure reduced 266 chunks to having only a single replica.

30

• These 266 chunks were cloned at a higher priority and all restored within 2 minutes, putting the cluster in a state where it could tolerate another chunk server failure without data loss.

31

WORKLOAD BREAKDOWN

Clusters X and Y are used to show the breakdown of the workloads on two GFS clusters. Cluster X is used for research and development, while Y is used for production data processing.

32

Operations Breakdown by Size

Table 3 – Operation Breakdown by Size (%)

33

Bytes transferred breakdown by operation size

Table 4 – Bytes Transferred Breakdown by Operation Size (%)

34

Master Requests Breakdown by Type (%)

Table 5 – Master Request Breakdown by Type (%)

35

CONCLUSIONS

• GFS demonstrates the qualities essential for supporting large-scale data processing workloads on commodity hardware.

• It provides fault tolerance by constant monitoring, replicating crucial data and fast, automatic recovery.

• It delivers high aggregate throughput to many concurrent readers and writers by separating file system control from data transfer.

36

Thank You.

37

Q and A

38
