Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014
DESCRIPTION
Join us to learn how customers are building globally available applications using Couchbase. This talk will go into detail on cross datacenter replication, typical replication topologies, server groups, best practices, and things not to do.
TRANSCRIPT
Enabling High Availability with Multi-Site, Rack-Aware Replication
Alex Ma, Principal Solutions Engineer
[Diagram: a three-server Couchbase cluster. App Server 1 and App Server 2 each run the Couchbase client library with a cluster map and issue reads/writes/updates. Each server holds active shards (Server 1: 5, 2, 9; Server 2: 4, 7, 8; Server 3: 1, 3, 6) plus replica copies of shards whose active copy lives on another server (Server 1: 4, 1, 8; Server 2: 6, 3, 2; Server 3: 7, 9, 5).]
Couchbase Basics
• Docs are distributed evenly across servers
• Each server stores both active and replica docs; for any given doc, only one copy is active at a time
• The client library gives the app a simple interface to the database
• The cluster map tells the client which server a doc lives on, so the app never needs to know
• The app reads, writes, and updates docs
• Multiple app servers can access the same document at the same time
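To make this concrete, here is a minimal sketch of an app server doing these operations through the client library, written against the Couchbase Java SDK 2.x API (the generation that shipped alongside Couchbase Server 3.0); the host, bucket name, and document contents are made up for the example:

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;
import com.couchbase.client.java.document.json.JsonObject;

public class BasicKv {
    public static void main(String[] args) {
        // Bootstrap against any node; the SDK fetches the cluster map
        // and routes each key to the server holding its active shard.
        Cluster cluster = CouchbaseCluster.create("192.168.1.101");
        Bucket bucket = cluster.openBucket("default");

        // Write: goes to the active copy of the shard that owns this key.
        JsonObject user = JsonObject.create().put("name", "alex").put("visits", 1);
        bucket.upsert(JsonDocument.create("user::1001", user));

        // Read: the cluster map tells the SDK which server to ask;
        // the app never deals with shard placement itself.
        JsonDocument doc = bucket.get("user::1001");
        System.out.println(doc.content());

        // Update: mutate the content and write it back.
        doc.content().put("visits", doc.content().getInt("visits") + 1);
        bucket.replace(doc);

        cluster.disconnect();
    }
}
```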
Auto-Failover
[Diagram: the same cluster during auto-failover, now with Servers 4 and 5 added. Server 1 holds active shards 5, 2, 9 and replicas 4, 1, 8; Server 2 holds active 4, 7, 8 and replicas 6, 3, 2; Server 3 holds active 1, 3, 6 and replicas 7, 9, 5. Server 3 fails; replicas of its shards 1, 3, and 6 on the surviving servers are promoted to active, and both app servers receive an updated cluster map.]
• App servers are accessing shards
• Requests to Server 3 fail
• The cluster detects the failed server (auto-failover must be enabled; see the sketch after this list):
o Promotes replicas of its shards to active
o Updates the cluster map
• Requests for those docs now go to the appropriate server
• Typically a rebalance would follow
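Auto-failover is a cluster-wide setting that has to be switched on before the behavior above kicks in. A minimal sketch of enabling it over the REST API, assuming a node at 192.168.1.101 and placeholder admin credentials (the same setting is exposed in the console and couchbase-cli):

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class EnableAutoFailover {
    public static void main(String[] args) throws Exception {
        // Illustrative host and credentials; adjust for your cluster.
        URL url = new URL("http://192.168.1.101:8091/settings/autoFailover");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        String auth = Base64.getEncoder()
                .encodeToString("Administrator:password".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setDoOutput(true);

        // Enable auto-failover with a 30-second detection timeout
        // (30s is the minimum the server accepts).
        String body = "enabled=true&timeout=30";
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode());
    }
}
```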
Rack-Zone Awareness
• Grouping of servers into server groups so that each group is on a physically separate rack
• Ensures that replica data partitions are not on the same rack as the primary partitions
Servers 1, 2, 3 on Rack 1
Servers 4, 5, 6 on Rack 2
Servers 7, 8, 9 on Rack 3
Cluster has 2 replicas (3 copies of data)
This is a balanced configuration
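Server groups can be created from the console or over the REST API. Below is a sketch of creating the three rack groups above via REST, assuming the /pools/default/serverGroups endpoint available since Couchbase Server 2.5; host, credentials, and the exact form encoding of the group name are illustrative. Servers are then assigned to groups when added (or moved between groups), followed by a rebalance.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class CreateServerGroups {
    // POSTs one form-encoded body to the cluster's REST API.
    static void post(String path, String body) throws Exception {
        URL url = new URL("http://192.168.1.101:8091" + path);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        String auth = Base64.getEncoder()
                .encodeToString("Administrator:password".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(path + " -> HTTP " + conn.getResponseCode());
    }

    public static void main(String[] args) throws Exception {
        // One group per physical rack. The quoted group name follows the
        // documented curl examples; verify against your server version.
        post("/pools/default/serverGroups", "name=\"Rack1\"");
        post("/pools/default/serverGroups", "name=\"Rack2\"");
        post("/pools/default/serverGroups", "name=\"Rack3\"");
    }
}
```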
Rack-Zone Awareness
• If an entire server rack fails, data is still available
• If an entire cloud zone or region fails, data is still available
Ping Times Between EC2 Regions
Cross Datacenter Replication (XDCR)
• Continuously replicates data from a source cluster to remote clusters, which may be spread across geographies
• Supports unidirectional and bidirectional operation
• Applications can read and write to both clusters (active-active replication)
• Automatically handles node addition and removal
• Replication throughput scales out linearly
• Simplified administration via console, REST API, and CLI (a REST sketch follows this list)
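As an example of that REST administration, setting up a replication takes two calls against the source cluster: register the remote cluster, then create a continuous replication. A sketch, with hosts, credentials, and cluster/bucket names made up for illustration:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class SetupXdcr {
    // POSTs one form-encoded body to the source cluster's REST API.
    static void post(String path, String body) throws Exception {
        URL url = new URL("http://192.168.1.101:8091" + path);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        String auth = Base64.getEncoder()
                .encodeToString("Administrator:password".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setDoOutput(true);
        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println(path + " -> HTTP " + conn.getResponseCode());
    }

    public static void main(String[] args) throws Exception {
        // 1. Register the remote cluster on the source cluster.
        post("/pools/default/remoteClusters",
             "name=east&hostname=10.20.0.5:8091"
             + "&username=Administrator&password=password");

        // 2. Start a continuous replication from a local bucket to the
        //    remote bucket. Running the same two steps on the other
        //    cluster, pointing back, makes the replication bidirectional.
        post("/controller/createReplication",
             "fromBucket=default&toCluster=east&toBucket=default"
             + "&replicationType=continuous");
    }
}
```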
Cross Datacenter Replication (XDCR) – Single Node Type
[Diagram: XDCR data flow within a single node. The app server writes a doc into the managed cache; from there the doc flows into the disk queue (persisted to disk), into the replication queue (memory-to-memory replication to other nodes in the cluster), and into the XDCR queue (new in 3.0: memory-to-memory replication to the remote cluster).]
Cross Datacenter Replication (XDCR)
Unidirectional Replication
• Hot spare / Disaster Recovery
• Development/Testing copies
• Integrate with connectors, e.g. Solr, Elasticsearch
• Integrate with a custom consumer
Cross Datacenter Replication (XDCR)
Bidirectional Replication
• Multiple Active Masters
• Disaster Recovery
• Datacenter Locality
Putting it all together
[Diagram: two clusters connected by XDCR. The EC2 West cluster places Server Group 1 in zone West 1-a, Group 2 in West 1-b, and Group 3 in West 1-c; the EC2 East cluster mirrors this with Group 1 in East 1-a, Group 2 in East 1-b, and Group 3 in East 1-c.]
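One operational note this topology implies: client SDKs do not fail over between XDCR clusters on their own, so switching regions is an application (or DNS/load-balancer) decision. A hypothetical sketch of app-level fallback with the Java SDK 2.x, using made-up hosts and the bidirectional setup above:

```java
import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.Cluster;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.JsonDocument;

public class GeoFailoverReader {
    private final Bucket local;
    private final Bucket remote;

    GeoFailoverReader() {
        // Each app server talks to the cluster in its own region first
        // (datacenter locality); the other region is the fallback.
        Cluster west = CouchbaseCluster.create("10.10.0.5");  // EC2 West
        Cluster east = CouchbaseCluster.create("10.20.0.5");  // EC2 East
        this.local = west.openBucket("default");
        this.remote = east.openBucket("default");
    }

    JsonDocument read(String id) {
        try {
            return local.get(id);
        } catch (RuntimeException e) {
            // Local cluster unreachable: serve from the XDCR copy in the
            // other region. Bidirectional XDCR keeps both sides writable.
            return remote.get(id);
        }
    }
}
```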
Q&A