distributed storage system

DISTRIBUTED STORAGE SYSTEM Mr. Dương Công Lợi Company: VNG-Corp Tel: +84989510016 Email:[email protected]

Upload: cong-loi-duong

Post on 23-Jan-2015


Page 1: Distributed storage system

DISTRIBUTED STORAGE

SYSTEM

Mr. Dương Công Lợi

Company: VNG-Corp

Tel: +84989510016

Email:[email protected]

Page 2: Distributed storage system

CONTENTS

1. What is a distributed-computing system?

2. Principle of distributed database/storage system

3. Distributed storage system paradigm

4. UniversalDistributedStorage

Page 3: Distributed storage system

1. WHAT IS A DISTRIBUTED-COMPUTING SYSTEM?

Distributed computing is the process of solving a computational problem using a distributed system.

A distributed system is a computing system in which a number of components on multiple computers cooperate by communicating over a network to achieve a common goal.

Page 4: Distributed storage system

DISTRIBUTED DATABASE/STORAGE SYSTEM

In a distributed database system, the database is stored on several computers.

A distributed database is a collection of multiple, logically interrelated databases distributed over a computer network.

Page 5: Distributed storage system

DISTRIBUTED SYSTEM ADVANTAGES

Advantages

Avoids bottlenecks and single points of failure

More scalability

More availability

Routing model

Client routing: the client sends each request directly to the appropriate server to read/write data

Server routing: a server forwards the client's request to the appropriate server and returns the result to the client

* The two models above can be combined in one system
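The difference between the two routing models is only in who decides which server owns a key. A minimal sketch follows; the `Server` class, the `owner_of` helper and the modulo placement rule are all hypothetical and stand in for whatever the real system uses.

```python
class Server:
    """Hypothetical in-memory storage server."""

    def __init__(self, name, cluster):
        self.name = name
        self.cluster = cluster   # shared list of all servers
        self.data = {}
        self.forwarded = 0       # requests this server passed on to a peer

    def put_local(self, key, value):
        self.data[key] = value

    def put(self, key, value):
        """Server routing: accept any request, forward it if a peer owns the key."""
        owner = owner_of(key, self.cluster)
        if owner is not self:
            self.forwarded += 1
        owner.put_local(key, value)
        return owner.name


def owner_of(key, cluster):
    # Placeholder placement rule (assumption): integer key modulo cluster size.
    return cluster[key % len(cluster)]


def client_routed_put(cluster, key, value):
    """Client routing: the client computes the owner and contacts it directly."""
    owner = owner_of(key, cluster)
    owner.put_local(key, value)
    return owner.name


cluster = []
for i in range(3):
    cluster.append(Server(f"server{i + 1}", cluster))

direct = client_routed_put(cluster, 7, "a")   # client picks the owner itself
via = cluster[0].put(8, "b")                  # server1 forwards to the owner
```

Client routing saves one network hop but requires every client to know the cluster layout; server routing keeps clients simple at the cost of the extra forward.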

Page 6: Distributed storage system

DISTRIBUTED STORAGE SYSTEM

Store some data {1,2,3,4,6,7,8} on 1 server:

1,2,3,4,6,7,8

And store the same data across 3 distributed servers:

1,2,3

4,6

7,8

Page 7: Distributed storage system

2. PRINCIPLE OF DISTRIBUTED DATABASE/STORAGE SYSTEM

Shard data by key and store each item on the appropriate server, using a Distributed Hash Table (DHT)

The DHT must use consistent hashing:

Uniform distribution of the generated hash values

Consistent (the same key always maps to the same position)

Jenkins and Murmur are good choices; MD5 and SHA are slower
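The two required properties can be checked empirically, and the same experiment shows why plain "hash mod N" placement is not enough. This is a sketch only: neither Murmur nor Jenkins ships with Python's standard library, so MD5 from `hashlib` is used as a stand-in, and the key names and bucket counts are arbitrary assumptions.

```python
import hashlib
from collections import Counter


def h(key: str) -> int:
    """Stand-in hash (assumption): first 8 bytes of MD5 as an unsigned integer.

    Python's built-in hash() is not used because it is randomized per process
    for strings, so it would not be consistent across machines.
    """
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


keys = [f"item-{i}" for i in range(10_000)]

# Consistent: hashing the same key twice gives the same value.
same = h("item-42") == h("item-42")

# Uniform: the keys spread roughly evenly over 4 buckets.
buckets = Counter(h(k) % 4 for k in keys)

# Naive "hash mod N" placement moves most keys when N grows from 3 to 4.
# Consistent hashing exists to avoid exactly this reshuffle.
moved = sum(1 for k in keys if h(k) % 3 != h(k) % 4)
moved_fraction = moved / len(keys)
```

With a uniform hash, roughly three quarters of the keys change server when going from 3 to 4 servers under modulo placement, which is why the slides ask for a consistent-hashing scheme instead.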

Page 8: Distributed storage system

CANONICAL PROBLEMS IN DISTRIBUTED SYSTEMS

Distributed data independence

Distributed transactions: ACID (Atomicity, Consistency, Isolation, Durability) requirement

Fault tolerance

Transparency

Page 9: Distributed storage system

3. DISTRIBUTED STORAGE SYSTEM PARADIGM

Data Hashing/Addressing

Determine which server each data item is stored on

Data Replication

Store data on multiple server nodes for higher availability and fault tolerance

Page 10: Distributed storage system

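DISTRIBUTED STORAGE SYSTEM ARCHITECTURE

Data Hashing/Addressing

Use the DHT to map each server (by its server name) to a number, and place it on a circle called the key space

Use the DHT to address each data key k and find the server that stores it: successor(k) = ceiling(addressing(k)), i.e. the first server at or after the position of k on the circle, wrapping around past 0

successor(k): the server that stores k

[Figure: key-space circle with origin 0, with server3, server1 and server2 placed on it]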
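The successor(k) lookup can be sketched with a sorted list and a binary search. This is an illustration under assumptions, not the system's code: MD5 stands in for the hash function, the 32-bit key space is arbitrary, and the `Ring` class name is hypothetical.

```python
import bisect
import hashlib

KEY_SPACE = 2 ** 32  # size of the circular key space (assumption)


def addressing(name: str) -> int:
    """Map a server name or a data key to a point on the circle.

    MD5 is only a stand-in; the slides recommend Murmur or Jenkins.
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % KEY_SPACE


class Ring:
    def __init__(self, servers):
        # Sorted (position, server) pairs represent the circle.
        self.points = sorted((addressing(s), s) for s in servers)
        self.positions = [p for p, _ in self.points]

    def successor(self, key: str) -> str:
        """First server at or after the key's position, wrapping past 0."""
        i = bisect.bisect_left(self.positions, addressing(key))
        return self.points[i % len(self.points)][1]


ring = Ring(["server1", "server2", "server3"])
owner = ring.successor("user:1001")   # "user:1001" is an example key
```

Because each key only depends on the first server clockwise from it, removing one server moves only the keys that server owned; all other keys keep their owner.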

Page 11: Distributed storage system

DISTRIBUTED STORAGE SYSTEM ARCHITECTURE

Addressing – Virtual nodes

Each server node is assigned several node-ids (virtual nodes) so that data is distributed evenly and the load is balanced

Server1: n1, n4, n6

Server2: n2, n7

Server3: n3, n5

[Figure: key-space circle with origin 0; the virtual nodes n1 to n7 of server1, server2 and server3 are interleaved around it]
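Virtual nodes can be sketched by hashing several derived node-ids per server onto the same circle. The `server#i` naming scheme, the count of 200 virtual nodes per server and the MD5 stand-in hash are assumptions made for this illustration; the slide's own example uses only two or three node-ids per server.

```python
import bisect
import hashlib
from collections import Counter


def addressing(name: str) -> int:
    # MD5 is a stand-in for Murmur/Jenkins (assumption).
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest()[:4], "big")


class VirtualNodeRing:
    def __init__(self, servers, vnodes_per_server=200):
        # Each server contributes several node-ids: "server1#0", "server1#1", ...
        self.points = sorted(
            (addressing(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes_per_server)
        )
        self.positions = [p for p, _ in self.points]

    def successor(self, key: str) -> str:
        """Physical server owning the first virtual node at or after the key."""
        i = bisect.bisect_left(self.positions, addressing(key))
        return self.points[i % len(self.points)][1]


def load(ring, n_keys=30_000):
    """Count how many sample keys land on each physical server."""
    return Counter(ring.successor(f"item-{i}") for i in range(n_keys))


ring = VirtualNodeRing(["server1", "server2", "server3"])
counts = load(ring)
```

With only one point per server the arcs between servers can be very unequal; many small arcs per server average out, so each server ends up with roughly one third of the keys.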

Page 12: Distributed storage system

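DISTRIBUTED STORAGE SYSTEM ARCHITECTURE

Data Replication

Data k1 is stored on server1 as the master copy and on server2 as the slave copy

[Figure: key-space circle with origin 0, with server3, server1 and server2 placed on it and data key k1 marked]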
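One common way to pick the master and slave for a key is to take successor(k) as the master and the next server clockwise as the slave. The slide only states that a master and a slave copy exist, so this placement rule, the `ReplicatedRing` class and the MD5 stand-in hash are assumptions.

```python
import bisect
import hashlib


def addressing(name: str) -> int:
    # MD5 is a stand-in for Murmur/Jenkins (assumption).
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest()[:4], "big")


class ReplicatedRing:
    def __init__(self, servers, copies=2):
        self.points = sorted((addressing(s), s) for s in servers)
        self.positions = [p for p, _ in self.points]
        self.copies = min(copies, len(servers))
        self.stores = {s: {} for s in servers}   # per-server key/value store

    def replicas(self, key):
        """Master first, then slaves: walk clockwise from successor(key)."""
        start = bisect.bisect_left(self.positions, addressing(key))
        n = len(self.points)
        return [self.points[(start + j) % n][1] for j in range(self.copies)]

    def put(self, key, value):
        # Write the value to the master and to every slave.
        for server in self.replicas(key):
            self.stores[server][key] = value

    def get(self, key, down=()):
        """Read from the first replica that is not marked as down."""
        for server in self.replicas(key):
            if server not in down:
                return self.stores[server].get(key)
        raise RuntimeError("all replicas unavailable")


ring = ReplicatedRing(["server1", "server2", "server3"])
ring.put("k1", "value-1")
master, slave = ring.replicas("k1")
```

If the master is unavailable, `ring.get("k1", down={master})` still returns the value from the slave, which is the availability gain replication is meant to provide.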

Page 13: Distributed storage system

UNIVERSALDISTRIBUTEDSTORAGE

a distributed storage system

Page 14: Distributed storage system

4. UNIVERSALDISTRIBUTEDSTORAGE

UniversalDistributedStorage is a distributed storage system developed for:

Distributed data independence

Distributed transactions (ACID)

Fault tolerance

Leader election (deciding when server nodes join or leave)

Replication using multi-master replication

Transparency

Page 15: Distributed storage system

UNIVERSALDISTRIBUTEDSTORAGE ARCHITECTURE

Overview

[Figure: three identical stacks, each with a Business Layer, a Distributed Layer and a Storage Layer]

Page 16: Distributed storage system

ARCHITECTURE OVERVIEW

Page 17: Distributed storage system

UNIVERSALDISTRIBUTEDSTORAGE FEATURE

Data hashing/addressing

Uses the Murmur hash function

Page 18: Distributed storage system

UNIVERSALDISTRIBUTEDSTORAGE FEATURE

Leader election

Uses the Bully leader election algorithm
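In the Bully algorithm the live node with the highest id becomes the leader: a node that starts an election contacts all nodes with higher ids, and wins only if none of them answers. The sketch below is a simplified, synchronous simulation; real message passing and timeouts are replaced by a lookup in `alive_ids`, and integer node ids are an assumption since the slides do not say how nodes are ranked.

```python
def bully_election(initiator, alive_ids):
    """Return the leader elected when `initiator` starts an election.

    Simplification: the lowest higher-ranked live node takes over the
    election, which yields the same leader as the full protocol where
    every higher node that answers starts its own election.
    """
    if initiator not in alive_ids:
        raise ValueError("initiator must be alive")
    higher = sorted(n for n in alive_ids if n > initiator)
    if not higher:
        # Nobody with a higher id answered: the initiator wins and would
        # now broadcast a COORDINATOR message to all other nodes.
        return initiator
    # A higher node answered, so the initiator steps back and that node
    # continues the election.
    return bully_election(higher[0], alive_ids)


leader = bully_election(1, {1, 2, 3, 5})        # node 5 is the highest live id
after_crash = bully_election(2, {1, 2, 3})      # node 5 has left the cluster
```

The second call models a node leaving: once node 5 is gone, any remaining node that notices can start an election and node 3 takes over.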

Page 19: Distributed storage system
Page 20: Distributed storage system

UNIVERSALDISTRIBUTEDSTORAGE FEATURE

Multi-master replication

Problem of multi-master replication

Page 21: Distributed storage system

UNIVERSALDISTRIBUTEDSTORAGE FEATURE

Multi-master replication

Data is first stored on the main master (called the sub-leader); it is then posted to a queue to be synced to the other masters.
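The write path described here (store on the sub-leader, enqueue, sync to the other masters) can be sketched in memory as follows. The class names, the choice of the first master as sub-leader and the explicit `sync()` call are assumptions for illustration; a real system would drain the queue in the background over the network.

```python
from collections import deque


class Master:
    def __init__(self, name):
        self.name = name
        self.data = {}


class MultiMasterGroup:
    def __init__(self, names):
        self.masters = {n: Master(n) for n in names}
        self.sub_leader = names[0]   # main master (assumption: first name)
        self.sync_queue = deque()    # pending (key, value) updates

    def write(self, key, value):
        # 1. Store on the main master (sub-leader) first.
        self.masters[self.sub_leader].data[key] = value
        # 2. Post the update to the queue for later sync.
        self.sync_queue.append((key, value))

    def sync(self):
        """Drain the queue in FIFO order and apply each update to the other masters."""
        applied = 0
        while self.sync_queue:
            key, value = self.sync_queue.popleft()
            for name, master in self.masters.items():
                if name != self.sub_leader:
                    master.data[key] = value
            applied += 1
        return applied


group = MultiMasterGroup(["m1", "m2", "m3"])
group.write("k1", "v1")
pending = len(group.sync_queue)   # update is queued, not yet on m2 or m3
applied = group.sync()
```

Funnelling writes through one sub-leader and a FIFO queue means every master applies the updates in the same order, which sidesteps the conflicting-write problem that multi-master setups otherwise face.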

Page 22: Distributed storage system

UNIVERSALDISTRIBUTEDSTORAGE STATISTICS

System information:

3 machines, each with 8 GB RAM and a Core i5 CPU at 3.220 GHz

LAN/WAN network

7 physical servers on the 3 machines above

Concurrent write: 16,500,000 items in 3,680 s, rate ≈ 4,480 req/sec (computed at the client)

Concurrent read: 16,500,000 items in 1,458 s, rate ≈ 11,320 req/sec (computed at the client)

* This is not the limit of the system; the limit is at the clients (this test used 3 client threads)
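The quoted rates follow directly from the item counts and durations. The short check below recomputes them; the per-thread figures assume the three client threads shared the load equally, which the slide does not state.

```python
ITEMS = 16_500_000
CLIENT_THREADS = 3   # from the note on this slide


def rate(items, seconds):
    """Average requests per second over the whole run."""
    return items / seconds


write_rate = rate(ITEMS, 3680)    # about 4,484 req/sec
read_rate = rate(ITEMS, 1458)     # about 11,317 req/sec

# Assumption: equal share per client thread.
write_per_thread = write_rate / CLIENT_THREADS
read_per_thread = read_rate / CLIENT_THREADS
```

Both results agree with the rounded values on the slide (about 4,480 and 11,320 req/sec).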

Page 23: Distributed storage system

Q & A

Contact:

Duong Cong Loi

[email protected]

https://www.facebook.com/duongcong.loi