RED HAT STORAGE SERVER REPLICATION: PAST AND PRESENT
Jeff Darcy, Venky Shankar, Raghavan Pichai
GlusterFS/RHS Developers @ Red Hat


DESCRIPTION

"In this session, we’ll detail Red Hat Storage Server data replication strategies for both near replication (LAN) and far replication (over WAN), and explain how replication has evolved over the last few years. You’ll learn about: Past mechanisms. Near replication (client-side replication). Far replication using timestamps (xtime). Present mechanisms. Near replication (server side) built using quorum and journaling. Faster far replication using journaling. Unified replication. Replication using snapshots. Stripe replication using erasure coding."

TRANSCRIPT

Page 1

RED HAT STORAGE SERVER REPLICATION: PAST AND PRESENT
Jeff Darcy, Venky Shankar, Raghavan Pichai
GlusterFS/RHS Developers @ Red Hat

Page 2

Talk Outline

- Background
- Local replication
- Remote replication
- Next steps
- Questions

Page 3

Background
Types of replication, goals, and challenges

Page 4

Synchronous Replication

(Diagram: synchronous replication; the letters "SYNC" land on both copies in lockstep.)

+ high consistency
- network sensitive

Page 5

Quorum Enforcement

Replica #1 Replica #2 Replica #3

The majority can write; the minority can't.

There can only be one majority => no split brain
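
To make the majority rule concrete, here is a minimal Python sketch of the quorum check; the names (REPLICAS, can_write) are illustrative, not GlusterFS code:

```python
# Minimal sketch of quorum enforcement over a fixed replica set.
REPLICAS = {"replica1", "replica2", "replica3"}

def can_write(reachable: set[str]) -> bool:
    """A partition may accept writes only if it holds a strict majority.

    Two disjoint sets cannot both exceed half of the replica set, so at
    most one partition can ever be writable: hence no split brain.
    """
    return len(reachable & REPLICAS) > len(REPLICAS) // 2

assert can_write({"replica1", "replica2"})   # 2 of 3: majority, writable
assert not can_write({"replica3"})           # 1 of 3: minority, fenced off
```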

Page 6

Synchronous Replication Data Flows

(Diagram: two data flows for the same write. Chain: the client sends to one server, which forwards to the next. Fan out: the client sends to every server directly.)

Page 7

Fan Out Replication

(Diagram: the client sends a copy of the write to each server directly.)

Split bandwidth. Wait for slowest.
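
A minimal sketch of the fan-out flow, assuming a hypothetical write_to_server() helper: the client's uplink carries one copy per server, and the write completes only when the slowest replica acknowledges.

```python
import concurrent.futures
import time

def write_to_server(server: str, data: bytes, delay: float) -> str:
    time.sleep(delay)                      # stand-in for the network transfer
    return server

def fan_out_write(data: bytes, servers: dict[str, float]) -> None:
    # The client pushes N full copies in parallel, so its uplink
    # bandwidth is split N ways, and completion is gated on the
    # SLOWEST replica's acknowledgement.
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(write_to_server, s, data, d)
                   for s, d in servers.items()]
        for f in concurrent.futures.as_completed(futures):
            print(f"ack from {f.result()}")

fan_out_write(b"Y", {"serverA": 0.1, "serverB": 0.3})   # returns after ~0.3s
```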

Page 8

Chain Replication

(Diagram: the client sends one copy to the first server, which forwards it to the second.)

Full bandwidth. Two hops.
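
A matching sketch of the chain flow, again with illustrative names: the client's uplink carries only a single copy, but the write traverses two hops before it is complete.

```python
# Minimal sketch of chain replication: the client sends ONE copy, and
# each server forwards it down the chain.
def chain_write(data: bytes, chain: list[str], hop: int = 0) -> None:
    server = chain[hop]
    print(f"hop {hop + 1}: {server} stores {len(data)} byte(s)")
    if hop + 1 < len(chain):
        # Forward to the next replica over the server network, so the
        # client's uplink carries only one copy (full bandwidth)...
        chain_write(data, chain, hop + 1)
    # ...but completion has to travel through every hop (added latency).

chain_write(b"X", ["serverA", "serverB"])   # one copy, two hops
```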

Page 9

Asynchronous Replication

(Diagram: asynchronous replication; the letters "ASYNC" reach the second copy only after a delay.)

- low consistency
+ network insensitive

Page 10

Effect of Network Partitions

(Diagram: an asynchronous pair split by a network partition, each side left holding a different value.)

What’s the correct value?

Page 11

Tradeoff Space

(Diagram: a 2x2 tradeoff space. Synchronous (S) offers high consistency but is network sensitive; asynchronous (A) is network insensitive but offers low consistency.)

Page 12

Red Hat Storage
Synchronous Near-Replication
Raghavan P
Developer, Red Hat

Page 13

Traditional replication using AFR

- "Automatic File Replication"
- Client-based replication
- Replicates entries, metadata, and data
- Automated self-healing when bricks recover after failure

Page 14

AFR Sequence Diagram

(Sequence diagram: Client 1 runs Lock, Pre-op, Op, Post-op, Unlock against Servers A and B; Client 2's Lock is blocked until Client 1 unlocks, after which its Pre-op proceeds.)
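
The five-phase transaction above can be sketched as follows. The pending counters stand in for the extended attributes AFR actually tracks on each brick; all names are illustrative.

```python
import threading

class Brick:
    def __init__(self, name: str):
        self.name = name
        self.lock = threading.Lock()   # a second writer blocks here
        self.pending = 0               # stand-in for AFR's pending xattrs

def afr_transaction(bricks, op):
    for b in bricks:
        b.lock.acquire()               # 1. lock (conflicting client blocks)
    for b in bricks:
        b.pending += 1                 # 2. pre-op: mark the op in flight
    for b in bricks:
        op(b)                          # 3. the entry/data/metadata op itself
    for b in bricks:
        b.pending -= 1                 # 4. post-op: clear the pending marker
    for b in bricks:
        b.lock.release()               # 5. unlock
    # A brick that failed mid-transaction is left with pending > 0,
    # which tells self-heal which copy needs repair.

bricks = [Brick("A"), Brick("B")]
afr_transaction(bricks, lambda b: print(f"write on server {b.name}"))
```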

Page 15

AFR improvements

In the 3.4 release:
- Eager locking
- Piggybacking
- Server quorum

In the 3.5 release:
- Granular self-heal

In the 3.6 release:
- Rewrite of the code
- Pending counters
- Self-healing in the context of the self-heal daemon

Page 16

NSR: new-style (a.k.a. server-side) replication

- Replication happens on the back end (brick processes)
- Controlled by a designated "leader", also known as the sweeper
- Advantages: client-network bandwidth usage is optimized for direct (FUSE) mounts, and split brain is avoided
- The sweeper is elected using the majority principle, per term
- A changelog on the sweeper preserves the ordering of operations
- Variable consistency models allow trading consistency for performance

Page 17

NSR high-level blocks

NSR client-side translator:
- Sends IO to the sweeper

Sweeper (leader):
- Forwards IO to peers
- Commits after all peers complete

Non-sweeper (follower):
- Accepts IO only from the sweeper or from reconciliation
- Rejects IO from clients (the client retries)

Changelog:
- Per-term record of operations (preserves ordering)

Reconciliation:
- Uses membership to figure out which terms are missing
- Uses the changelogs to sync the corresponding terms
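
A minimal Python sketch of this write path, with illustrative class names rather than the real translator API:

```python
class Follower:
    def __init__(self, name: str):
        self.name = name
        self.log = []                          # per-term changelog

    def apply(self, term, op, from_sweeper: bool):
        if not from_sweeper:
            raise PermissionError("retry via sweeper")   # client must retry
        self.log.append((term, op))

class Sweeper:
    def __init__(self, followers):
        self.followers = followers
        self.term = 1                          # bumped on each re-election
        self.log = []

    def write(self, op):
        self.log.append((self.term, op))       # ordering fixed by the log
        for f in self.followers:               # forward to all peers
            f.apply(self.term, op, from_sweeper=True)
        return "committed"                     # only after all peers complete

sweeper = Sweeper([Follower("B"), Follower("C")])
print(sweeper.write("setattr /f"))             # client IO goes to the sweeper
```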

Page 18

NSR Sequence Diagram

(Sequence diagram: Clients 1 and 2 send their requests to the sweeper, which forwards each to the follower and commits once the follower completes.)

Page 19

Red Hat Storage Server
Geo-Replication
Venky Shankar
Developer, Red Hat

Page 20

Geo-Replication

- Asynchronous data replication: continuous, incremental
- Across geographies: one site (master) to another (slave)
- Multi-slave: cascading, fan-out
- Disaster recovery

Page 21

Remote Replication: Past

Page 22

Overview

- Single node
- Change detection: crawling (xtime-based crawl)
- Data synchronization: rsync
- Suboptimal processing of renames, deletes, hardlinks

Page 23

Crawling and xtime

- xtime: inode change time, marked up to the root (marker xlator)
- Crawling/scanning: directory crawl and file synchronization where xtime(master) > xtime(slave)
- Slave xtime maintained by the master
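
A minimal sketch of the xtime comparison crawl. Here xtimes are plain dicts, whereas in GlusterFS they are extended attributes propagated to the root by the marker translator; helper names are illustrative.

```python
def crawl(path, master_xtime, slave_xtime, children):
    # Skip whole subtrees that are already in sync.
    if master_xtime.get(path, 0) <= slave_xtime.get(path, 0):
        return
    for child in children.get(path, []):
        if children.get(child) is not None:    # directory: recurse
            crawl(child, master_xtime, slave_xtime, children)
        elif master_xtime.get(child, 0) > slave_xtime.get(child, 0):
            print(f"sync {child}")             # file: hand off to rsync
    slave_xtime[path] = master_xtime[path]     # master tracks the slave xtime

children = {"/": ["/a", "/b"], "/a": ["/a/f1"]}
crawl("/", {"/": 5, "/a": 5, "/a/f1": 5, "/b": 1},
      {"/": 1, "/a": 1, "/b": 1}, children)    # syncs /a/f1, skips /b
```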

Page 24

Page 25

Remote Replication: Present

Page 26

Overview

- Multi-node: distributed (parallel) synchronization, replica failover
- Change detection: consumable journals
- Data synchronization (configurable): rsync, or tar+ssh for large numbers of small files
- Efficient processing of renames, deletes, hardlinks

Page 27

Journaling

- Journaling translator (changelog): records FOPs efficiently, local to each brick (data, entry, metadata)
- Change detection: O(1) relative to the number of changes
- Consumer library (libgfchangelog): per brick; publish/subscribe mechanism; journals periodically published
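
A minimal sketch of journal-based change detection. The record layout and function names here are hypothetical, not the actual libgfchangelog API, but the point carries: consumption cost scales with the number of changes, never with filesystem size.

```python
import collections

journal = collections.deque()            # one journal per brick

def record_fop(kind, op, path):
    """Brick side: the changelog translator logs each FOP locally."""
    journal.append((kind, op, path))     # kind: E=entry, D=data, M=metadata

def consume_changes():
    """Consumer side: drain the published journal and replay each change."""
    while journal:
        kind, op, path = journal.popleft()
        print(f"replicate {op} ({kind}) on {path}")

record_fop("E", "CREATE", "/docs/a.txt")
record_fop("D", "WRITE", "/docs/a.txt")
consume_changes()                        # cost proportional to 2 changes
```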

Page 28

Remote Replication: Future

Page 29

Features

- Replicating snapshots
- Multi-master: vector clocks, conflict detection & resolution (see the sketch below)
- libgfapi integration: geo-replication to a Swift target
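
For the multi-master case, conflict detection with vector clocks reduces to the standard partial-order comparison: one counter per master site. This is an illustrative sketch, not GlusterFS code.

```python
def compare(vc_a: dict, vc_b: dict) -> str:
    """Compare two vector clocks; incomparable clocks mean a conflict."""
    sites = set(vc_a) | set(vc_b)
    a_behind = any(vc_a.get(s, 0) < vc_b.get(s, 0) for s in sites)
    b_behind = any(vc_b.get(s, 0) < vc_a.get(s, 0) for s in sites)
    if a_behind and b_behind:
        return "conflict"        # concurrent updates: needs resolution
    if b_behind:
        return "a newer"
    if a_behind:
        return "b newer"
    return "equal"

print(compare({"site1": 2, "site2": 1}, {"site1": 1, "site2": 1}))  # a newer
print(compare({"site1": 2, "site2": 0}, {"site1": 1, "site2": 3}))  # conflict
```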

Page 30

Red Hat Storage Server
Replication-related Features
Jeff Darcy
Developer, Red Hat

Page 31

Unified Replication

(Diagram: a leader's changelog feeds a local replica synchronously and a remote replica asynchronously; each replica maintains its own changelog.)
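
A minimal sketch of the idea: one leader-side changelog drives both the synchronous local replica and the asynchronous remote replica. The structure is illustrative, not the actual design's code.

```python
changelog = []                           # leader's log of operations

def leader_write(op):
    changelog.append(op)                 # journal first
    print(f"sync  -> local replica: {op}")   # local copy: ack after this

def drain_to_remote():
    """Runs periodically; replays the journal against the remote site."""
    while changelog:
        print(f"async -> remote replica: {changelog.pop(0)}")

leader_write("WRITE /file")              # client sees this committed
drain_to_remote()                        # the remote catches up afterwards
```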

Page 32

Erasure Coding (a.k.a. “disperse”)

(Diagram: each stripe is split into four data fragments (D1-D4) plus three parity fragments (P1-P3); a lost fragment such as D2 is rebuilt from the survivors.)
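
To show the recovery principle, here is a minimal sketch using a single XOR parity fragment. The disperse translator uses a stronger code (the 4 data + 3 parity layout above tolerates multiple losses), but rebuilding a missing fragment from the survivors works the same way.

```python
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data_fragments):
    """Append one parity fragment: the XOR of all data fragments."""
    return data_fragments + [reduce(xor, data_fragments)]

def recover(fragments, lost_index):
    """Rebuild any single lost fragment by XOR-ing the survivors."""
    survivors = [f for i, f in enumerate(fragments) if i != lost_index]
    return reduce(xor, survivors)

stripe = encode([b"D1", b"D2", b"D3", b"D4"])
assert recover(stripe, 1) == b"D2"       # lost D2, rebuilt from the rest
```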

Page 33

Also…

- Volume snapshots
- File snapshots
- Deduplication
- Compression + checksums

Page 34

Tiering (a.k.a. data classification)

- Tier 0: SSD, no replication
- Tier 1: normal disk, sync replication
- Tier 2: SMR disk, erasure coding, compression + checksums, async replication
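
A minimal sketch of placement by data temperature: the tier layouts mirror the slide, while the thresholds and the policy function are hypothetical.

```python
TIERS = {
    0: "SSD, no replication",
    1: "normal disk, sync replication",
    2: "SMR disk, erasure coding + compression + checksums, async replication",
}

def classify(accesses_per_day: float) -> int:
    """Hotter data lands on faster, less redundant tiers."""
    if accesses_per_day > 100:       # hypothetical hot threshold
        return 0
    if accesses_per_day > 1:         # hypothetical warm threshold
        return 1
    return 2                         # cold data

for rate in (500, 10, 0.01):
    tier = classify(rate)
    print(f"{rate:>7} accesses/day -> tier {tier}: {TIERS[tier]}")
```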

Page 35

Questions?