drbd

62
DRBD Pierre Mavro www.enovance.com September 11, 2013

Upload: pierre-mavro

Post on 28-Jan-2015

Category:

Technology


DESCRIPTION

This presentation covers the DRBD product, which we have been using at eNovance for several years. In these slides, you will find information on how we use it, use cases and Ninja tricks. This document was produced with a lot of feedback and draws on the strong knowledge of this technology that eNovance is able to provide.

TRANSCRIPT

Page 1: Drbd

DRBD

Pierre Mavro

www.enovance.com

September 11, 2013

Page 2: Drbd

DRBD: Summary

Summary

1 Introduction (page 3)
2 Basic usages and understandings (page 6)
3 Use cases (page 28)
4 Ninja tricks (page 41)

Pierre Mavro www.enovance.comDRBD 2 / 62

Page 3: Drbd

DRBD: Introduction

How it works

DRBD refers to block devices designed as a building block to form high availability (HA) clusters. This is done by mirroring a whole block device via an assigned network. DRBD can be seen as network-based RAID-1.

Page 4: Drbd

DRBD: Introduction

Synchronisations

DRBD supports three distinct replication modes, allowing three degrees of replication:

- A (Asynchronous replication): a write is considered completed once the master node disk write has finished and the replication packet has been placed in the local TCP send buffer.

- B (Semi-synchronous replication): a write is considered completed once the master node disk write has finished and the replication packet has reached the other node.

- C (Synchronous replication): a write is considered completed only once it has been written to both the local and the remote disk.

The synchronous replication protocol is the most robust protocol tested in production.
http://www.drbd.org/home/mirroring/

Page 5: Drbd

DRBD: Introduction

Data accessibility

A consequence of mirroring data at the block device level is that you can access your data (using a file system) only on the active node. This is not a shortcoming of DRBD but is caused by the nature of most file systems (ext3, XFS, JFS, ext4, ...). These file systems are designed for one computer accessing one disk, so they cannot cope with two computers accessing one (virtually) shared disk.

In spite of this limitation, there are still a few ways to access the data on the second node:

- Use DRBD on logical volumes and use LVM’s capabilities to take snapshots on the standby node, and access the data via the snapshot.

- Use DRBD’s primary-primary mode with a shared-disk file system (GFS, OCFS2). These systems are very sensitive to failures of the replication network.

- Mount the partition in read-only mode.

We are using DRBD in master/slave mode at eNovance.

Page 6: Drbd

DRBD: Basic usages and understandings

Plan

2 Basic usages and understandings (page 6)
- Configuration and initialization (page 6)
- Check replication status (page 15)
- Node switching and manual synchronization (page 22)
- Remove a DRBD device (page 26)

Page 7: Drbd

DRBD: Basic usages and understandings

Configuration

The DRBD configuration should be the same on both nodes. This drbd.conf file is the global configuration of the DRBD service, with fine-tuning options:

/etc/drbd.conf
    global {                        <-- Global configuration
        usage-count no;             <-- Do not report usage statistics to LINBIT
    }
    common {                        <-- All resources inherit the options set in this section
        protocol C;                 <-- C (Synchronous replication protocol)

        startup {
            wfc-timeout 1;          <-- Wait for connection timeout (in seconds)
            degr-wfc-timeout 1;     <-- Wait for connection timeout, if this node was a degraded cluster (in seconds)
        }

Page 8: Drbd

DRBD: Basic usages and understandings

Configuration

/etc/drbd.conf
        net {
            max-buffers 8192;           <-- Maximum number of requests to be allocated by DRBD
            max-epoch-size 8192;        <-- The highest number of data blocks between two write barriers
            sndbuf-size 512k;           <-- The size of the TCP socket send buffer
            unplug-watermark 8192;      <-- How often the I/O subsystem’s controller is forced to process pending I/O requests
            cram-hmac-alg sha1;         <-- The HMAC algorithm to enable peer authentication at all
            shared-secret "xxx";        <-- The shared secret used in peer authentication
            # Split brains
            after-sb-0pri disconnect;   <-- Split brain, resource is not in the Primary role on any host
            after-sb-1pri disconnect;   <-- Split brain, resource is in the Primary role on one host
            after-sb-2pri disconnect;   <-- Split brain, resource is in the Primary role on both hosts
            rr-conflict disconnect;     <-- Helps to solve the cases when the outcome of the resync decision is incompatible with the current role assignment
        }

        handlers {
            pri-on-incon-degr "echo node is primary, degraded and the local copy of the data is inconsistent | wall";
                                        <-- If the node is primary, degraded and if the local copy of the data is inconsistent
        }

Page 9: Drbd

DRBD: Basic usages and understandings

Configuration

/etc/drbd.conf
        disk {
            on-io-error pass_on;    <-- The node downgrades the disk status to inconsistent on I/O errors
            no-disk-barrier;        <-- Disable protecting data if power failure (done by hardware)
            no-disk-flushes;        <-- Disable the use of disk flushes on the backing device
            no-disk-drain;          <-- Do not let write requests drain before write requests of a new reordering domain are issued
            no-md-flushes;          <-- Disables the use of disk flushes and barrier BIOs when accessing the meta data device
        }

        syncer {
            rate 300M;              <-- The maximum bandwidth a resource uses for background re-synchronization
            al-extents 3833;        <-- Control how big the hot area (= active set) can get
        }
    }

Page 10: Drbd

DRBD: Basic usages and understandings

Configuration

Now the resources can be defined. One resource should match one DRBD device. The information for both nodes should be filled in:

/etc/drbd.d/resources.conf
    resource drbd1 {                <-- DRBD block device name
        syncer {
            after drbd0;            <-- Start drbd1 after drbd0 is up & running
        }

        on srv1 {                   <-- Master node name
            device /dev/drbd1;      <-- Block device name of the resource being described
            disk /dev/sda1;         <-- Block device to store and retrieve the data
            address x.x.x.x:7789;   <-- IP address and port of the local host
            meta-disk internal;     <-- The last part of the backing device is used to store the meta-data
        }

        on srv2 {                   <-- Slave node name
            device /dev/drbd2;
            disk /dev/sda1;
            address y.y.y.y:7789;
            meta-disk internal;
        }
    }

Page 11: Drbd

DRBD: Basic usages and understandings

Initialisation

Here is a summary of the process to create a DRBD replication. You need to follow these steps in order to get a working DRBD synchronization:

Init metadata device → Attach DRBD device → Connect DRBD device → Start synchronization

Page 12: Drbd

DRBD: Basic usages and understandings

Init metadata device

This step must be completed only on initial device creation. It initializes DRBD’s metadata (replace <drbd_volume> with the name of the DRBD device you want to initialize). You should complete these steps on both nodes:

Create device metadata on server 1
    srv1~$ drbdadm create-md <drbd_volume>
    Writing meta data...
    initialising activity log
    NOT initializing bitmap
    New drbd meta data block sucessfully created.

Create device metadata on server 2
    srv2~$ drbdadm create-md <drbd_volume>
    ...

Page 13: Drbd

DRBD: Basic usages and understandings

Attach and connect DRBD device

Attach a local backing block device to the DRBD resource’s device:

Attach DRBD device
    $ drbdadm attach <drbd_volume>

If the peer device is already configured, the two DRBD devices will connect:

Sets up the network configuration of the resource’s device
    $ drbdadm connect <drbd_volume>

Page 14: Drbd

DRBD: Basic usages and understandings

Start synchronization

To start the first synchronization, you need to ask the first server to sync all blocks to the secondary server. This will overwrite all blocks on the other server, so all data on it will be lost:

Start the first synchronization
    $ drbdadm -- --overwrite-data-of-peer primary <drbd_volume>

This action may take a while, depending on the network bandwidth and DRBD volume size.
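The four initialization steps can be summarised as a small dry-run helper. This is only a sketch: the function name `drbd_init_plan` and its `role` argument are hypothetical, and the helper prints the drbdadm commands from these slides instead of running them, so that the order can be reviewed before anything touches a disk.

```shell
#!/bin/sh
# Print, in order, the commands used to bring up a new DRBD resource.
# Nothing is executed: the output is meant to be reviewed and run by hand.
# $1 = resource name, $2 = "primary" or "secondary" (helper arguments,
# not drbdadm options).
drbd_init_plan() {
    res=$1
    role=$2
    echo "drbdadm create-md $res"   # both nodes: initialise the metadata
    echo "drbdadm attach $res"      # both nodes: attach the backing device
    echo "drbdadm connect $res"     # both nodes: connect to the peer
    if [ "$role" = "primary" ]; then
        # one node only: its data overwrites the data of the peer
        echo "drbdadm -- --overwrite-data-of-peer primary $res"
    fi
}

drbd_init_plan drbd1 primary
```

Calling the helper with `secondary` prints only the first three commands, which matches the rule that the overwrite step is launched from a single node.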

Page 16: Drbd

DRBD: Basic usages and understandings

Check replication status

To check the replication status, simply look at the DRBD file in /proc:

Check
    $ cat /proc/drbd
    version: 8.3.7 (api:88/proto:86-91)
    srcversion: EE47D8BF18AC166BE219757
     0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r.
        ns:912248 nr:0 dw:0 dr:920640 al:0 bm:55 lo:1 pe:388 ua:2048 ap:0
        [===>................] sync'ed: 21.9% (3283604/4194304)K
        finish: 1:08:24 speed: 580 (452) K/sec
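For scripting, the three key fields of each device line can be pulled out with awk. The sketch below assumes the usual DRBD 8.x layout of /proc/drbd, where a device line starts with the minor number followed by `cs:`, `ro:` and `ds:` fields without embedded spaces; the function name `drbd_status_fields` is hypothetical.

```shell
#!/bin/sh
# Read /proc/drbd-style text on stdin and print one line per device:
#   <minor> <connection state> <roles> <disk states>
drbd_status_fields() {
    awk '
        # Device lines look like: " 0: cs:SyncSource ro:Primary/Secondary ds:..."
        $1 ~ /^[0-9]+:$/ {
            minor = $1
            sub(/:$/, "", minor)
            cs = "?"; ro = "?"; ds = "?"
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^ro:/) ro = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print minor, cs, ro, ds
        }
    '
}

# Only read the real file when it exists (DRBD module loaded).
if [ -r /proc/drbd ]; then
    drbd_status_fields < /proc/drbd
fi
```

Header lines (version, srcversion) and counter lines (ns, nr, ...) are ignored because their first field is not a bare minor number.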

Page 17: Drbd

DRBD: Basic usages and understandings

Check replication status

Another solution is to launch this command, which provides information on all DRBD devices:

Check
    $ drbd-overview
      0:home  Connected Primary/Secondary UpToDate/UpToDate C r--- /home xfs 200G 158G 43G 79%

Page 18: Drbd

DRBD: Basic usages and understandings

Check replication status

Several pieces of information need to be checked to know the replication status of a DRBD device:

Replication status: Connection state (cs), Resource roles (ro), Disk states (ds)

Page 19: Drbd

DRBD: Basic usages and understandings

Check replication status

Connection states (cs):

- StandAlone: No network configuration available. The resource has not yet been connected, has been administratively disconnected or has dropped its connection due to failed authentication/split brain.

- Unconnected: Temporary state, prior to a connection attempt.

- WFConnection: This node is waiting until the peer node becomes visible on the network.

- Connected: A DRBD connection has been established, data mirroring is now active. This is the normal state.

- PausedSync: The local node is the source or target of an ongoing synchronization, but synchronization is currently paused. This may be due to a dependency on the completion of another synchronization process, or due to synchronization having been manually interrupted.

Page 20: Drbd

DRBD: Basic usages and understandings

Check replication status

Resource roles (ro) :

- Primary: The resource is currently in the primary role, and may be read from and written to. This role only occurs on one of the two nodes, unless dual-primary mode is enabled.

- Secondary: The resource is currently in the secondary role. It normally receives updates from its peer (unless running in disconnected mode), but may neither be read from nor written to. This role may occur on one or both nodes.

- Unknown: The resource’s role is currently unknown. The local resource role never has this status. It is only displayed for the peer’s resource role, and only in disconnected mode.

Page 21: Drbd

DRBD: Basic usages and understandings

Check replication status

Disk states (ds) :

- UpToDate: Consistent, up-to-date state of the data. This is the normal state.

- Inconsistent: The data is inconsistent. This status occurs immediately upon creation of a new resource, on both nodes (before the initial full sync). Also, this status is found on one node (the synchronization target) during synchronization.
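The "normal state" described in the lists above can be turned into a simple check for monitoring scripts. This is a minimal sketch, assuming the caller has already extracted the connection state and the disk states as strings; `drbd_is_healthy` is a hypothetical helper name, and roles are deliberately not judged.

```shell
#!/bin/sh
# Succeed (exit status 0) only when the device is in the normal state:
# connection state "Connected" and both disks "UpToDate".
# $1 = connection state (cs), $2 = disk states (ds), e.g. "UpToDate/UpToDate"
drbd_is_healthy() {
    cs=$1
    ds=$2
    [ "$cs" = "Connected" ] && [ "$ds" = "UpToDate/UpToDate" ]
}

if drbd_is_healthy Connected UpToDate/UpToDate; then
    echo "replication is in the normal state"
fi
if ! drbd_is_healthy SyncSource UpToDate/Inconsistent; then
    echo "replication needs attention (syncing or degraded)"
fi
```

Any other combination, such as an ongoing synchronization or a StandAlone node, is reported as needing attention; a real check would probably distinguish those cases.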

Page 23: Drbd

DRBD: Basic usages and understandings

Node switching

You may need to perform maintenance tasks that require switching a DRBD volume to the other node. Simply connect to the secondary node and launch:

Set secondary as primary
    $ drbdadm primary <drbd_volume>

You can also switch all volumes at once. Double-check that you really can do this before proceeding:

Set secondary as primary
    $ drbdadm primary all

Page 24: Drbd

DRBD: Basic usages and understandings

Node switching

You can also switch a Primary node to Secondary:

Set primary as secondary
    $ drbdadm secondary <drbd_volume>

Notes

This will not automatically set the old secondary node to primary state.
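Put together, a manual switchover in master/slave (single-primary) mode is usually done in two ordered steps: demote the current primary, then promote the other node, since a connected DRBD pair without dual-primary mode refuses a second primary. The helper below is a sketch that only prints the plan; `drbd_switch_plan` and the host labels are hypothetical.

```shell
#!/bin/sh
# Print a manual switchover plan for a single-primary resource.
# $1 = resource name, $2 = current primary host, $3 = future primary host.
# Host names are only used to label each line of the plan.
drbd_switch_plan() {
    res=$1
    old=$2
    new=$3
    # Demote first; any filesystem on the device must be unmounted before.
    echo "[$old] drbdadm secondary $res"
    # Then promote the other node.
    echo "[$new] drbdadm primary $res"
}

drbd_switch_plan drbd1 srv1 srv2
```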

Page 25: Drbd

DRBD: Basic usages and understandings

Manual synchronization

To start a manual synchronization, you need to invalidate the DRBD device on the current host, which will then be resynced from its peer:

Invalidate data on current host
    $ drbdadm invalidate <drbd_volume>

You can do the same in the other direction and invalidate the data of the other node:

Invalidate data on remote host
    $ drbdadm invalidate-remote <drbd_volume>

You can then check the status in /proc/drbd.

Page 27: Drbd

DRBD: Basic usages and understandings

Remove a DRBD device

Before removing a DRBD device, be sure you’ve backed up all data! Then disconnect the device on both nodes:

Disconnect DRBD device
    $ drbdadm disconnect <drbd_volume>

And delete it:

Remove DRBD device
    $ drbdsetup <volume_number> down

Replace volume_number with the number shown by the drbd-overview command. To finish, remove the config files.

Page 28: Drbd

DRBD: Use cases

Plan

3 Use cases (page 28)
- Stop, upgrade and commit changes (page 28)
- Stop, upgrade and rollback changes (page 31)
- Up, connect...but doesn’t want to come UpToDate (page 35)
- How to promote a primary when having a dual secondary (page 39)

Page 29: Drbd

DRBD: Use cases

Stop, upgrade and commit changes

Sometimes, you may need to perform maintenance tasks on a DRBD device. For example, you need to upgrade something on a DRBD device and, if it succeeds, commit the changes to the secondary:

[Diagram: DRBD Primary and DRBD Secondary. Steps shown: set secondary node in maintenance mode, upgrade success, resume sync, normal state.]

Page 30: Drbd

DRBD: Use cases

Stop, upgrade and commit changes

To set the secondary node in maintenance, simply stop the synchronisation on the secondary node:

Disconnect drbd device
    $ drbdadm disconnect <drbd_volume>

Do what you have to do on the DRBD primary device and, once you have finished and everything looks fine, resume the synchronization to get back to the normal state:

Connect drbd device
    $ drbdadm connect <drbd_volume>

Page 32: Drbd

DRBD: Use cases

Stop, upgrade and rollback changes

Sometimes, you may need to perform maintenance tasks on a DRBD device. For example, you need to upgrade something on a DRBD device and, if it fails, roll back the changes from the secondary:

[Diagram: DRBD Primary and DRBD Secondary. Steps shown: set secondary node in maintenance mode, upgrade failed, rollback sync, normal state.]

Page 33: Drbd

DRBD: Use cases

Stop, upgrade and rollback changes

To set the secondary node in maintenance, simply stop the synchronisation on the secondary node:

Disconnect drbd device
    $ drbdadm disconnect <drbd_volume>

Suppose the upgrade has failed and you want to roll back. You have to promote your secondary node to master node:

Promote device as primary
    $ drbdadm primary <drbd_volume>

Page 34: Drbd

DRBD: Use cases

Stop, upgrade and rollback changes

Then you need to invalidate the data on the old primary:

Invalidate drbd device data
    $ drbdadm invalidate <drbd_volume>

To finish, adjust and connect the device to perform the rollback on the new master node:

Adjust and connect
    $ drbdadm adjust <drbd_volume>
    $ drbdadm connect <drbd_volume>

Page 36: Drbd

DRBD: Use cases

Up, connect...but doesn’t want to come UpToDate

If, after several up, connect, etc., you still don’t see anything changing and still get this kind of status:

Check
    $ drbd-overview
      0:drbd0  StandAlone Primary/Unknown UpToDate/DUnknown r-----

you need to investigate further. The first thing to do is to look at the logs:

/var/log/syslog
    [ 6375.849509] block drbd0: Split-Brain detected but unresolved, dropping connection!
    [ 6375.850025] block drbd0: helper command: /sbin/drbdadm split-brain minor-0
    [ 6375.852859] block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
    [ 6375.852867] block drbd0: conn( WFReportParams -> Disconnecting )
    [ 6375.852873] block drbd0: error receiving ReportState, l: 4!
    [ 6375.853274] block drbd0: asender terminated
    [ 6375.853277] block drbd0: Terminating drbd0_asender
    [ 6375.853506] block drbd0: Connection closed
    [ 6375.853514] block drbd0: conn( Disconnecting -> StandAlone )
    [ 6375.853523] block drbd0: receiver terminated
    [ 6375.853526] block drbd0: Terminating drbd0_receiver

Page 37: Drbd

DRBD: Use cases

Up, connect...but doesn’t want to come UpToDate

The logs may still not be clear enough for you. You can use this command to get a better understanding of the situation:

Show all DRBD information
    $ drbdadm show-gi <drbd_volume>

           +--< Current data generation UUID >-
           |                +--< Bitmap's base data generation UUID >-
           |                |                +--< younger history UUID >-
           |                |                |                +-< older history >-
           V                V                V                V
    B63AABEFCFADD87D:2C30515E7D05B35D:82C4146DCB8E6BFF:82C3146DCB8E6BFF:1:1:1:0:0:0:0
                                                                        ^ ^ ^ ^ ^ ^ ^
                                            -< Data consistency flag >--+ | | | | | |
                                   -< Data was/is currently up-to-date >--+ | | | | |
                                        -< Node was/is currently primary >--+ | | | |
                                        -< Node was/is currently connected >--+ | | |
               -< Node was in the progress of setting all bits in the bitmap >--+ | |
                              -< The peer's disk was out-dated or inconsistent >--+ |
             -< This node was a crashed primary, and has not seen its peer since >--+

    flags: Primary, StandAlone, UpToDate

Page 38: Drbd

DRBD: Use cases

Up, connect...but doesn’t want to come UpToDate

Here the problem is clear: we’re in a split-brain situation, so there is no other solution than doing a full sync. But what if no split brain was detected? There is one last solution:

Adjust and connect the drbd device
    $ drbdadm adjust <drbd_volume>
    $ drbdadm connect <drbd_volume>

Now you can check that the replication is resuming.

Page 40: Drbd

DRBD: Use cases

How to promote a primary when having a dual secondary

If you get the dual secondary state after a node restart, check which node has the ’UpToDate’ state:

Check
    $ drbd-overview
      0:drbd0  Connected Secondary/Secondary UpToDate/UpToDate C r-----

Here, both are up to date. Simply select one and promote it as primary:

Set as primary
    $ drbdadm primary <drbd_volume>

You should now see one primary and one secondary, both synced.

Page 41: Drbd

DRBD: Ninja tricks

Plan

4 Ninja tricks (page 41)
- Create DRBD replication without syncing data (page 41)
- What to do in Split brain case (page 45)
- Mount secondary node as read only (page 50)
- All is good, but still doesn’t want to connect (page 53)
- Can’t "create-md" on an old DRBD device (page 55)
- My DRBD device is still there but not the configuration (page 58)
- Speed up sync transfer rate (page 60)

Page 42: Drbd

DRBD: Ninja tricks

Create DRBD replication without syncing data

When you have big devices to sync on a DRBD initialization, it could take several hours, or days in the worst cases. There is a way to sync only the bitmap and have your DRBD device available in a few seconds instead! Create the DRBD device on both nodes:

Create device metadata
    $ drbdadm create-md <drbd_volume>
    Writing meta data...
    initialising activity log
    NOT initializing bitmap
    New drbd meta data block sucessfully created.

Page 43: Drbd

DRBD: Ninja tricks

Create DRBD replication without syncing data

Then bring it up (still on both nodes):

Connect the volume
    $ drbdadm up <drbd_volume>

Only on the master node, initialize the bitmap and UUID of the DRBD device:

Only sync bitmap
    $ drbdadm -- --clear-bitmap new-current-uuid <drbd_volume>

Page 44: Drbd

DRBD: Ninja tricks

Create DRBD replication without syncing data

Promote this master node as primary node to replicate the bitmap:

Set primary volume
    $ drbdadm primary <drbd_volume>

Now you can check that the DRBD device is ready to use :-)

Check
    $ drbd-overview
      1:drbd1  Connected Primary/Secondary UpToDate/UpToDate C r-----

Page 46: Drbd

DRBD: Ninja tricks

What to do in Split brain case

If you see this kind of status in /proc/drbd:

Check status
    $ cat /proc/drbd
    primary/unknown

it means that you are in a split-brain case. First of all, you need to check the version on both nodes! You absolutely need to have the same version to avoid errors (8.0-8.3/8.4/9.0):

Check version
    $ drbdadm --version | grep DRBDADM_VERSION=
    DRBDADM_VERSION=8.3.13

Page 47: Drbd

DRBD: Ninja tricks

What to do in Split brain case

You will need to discard all data on the secondary node and reimport it from the master node.

Umount all drbd

Unmount all failed DRBD mounted devices before continuing!

On the master node, we need to connect all failed DRBD devices first:

Connect DRBD
    $ drbdadm connect <drbd_volume>

Page 48: Drbd

DRBD: Ninja tricks

What to do in Split brain case

Then, on the secondary node, reconnect while discarding its local data, so that everything is resynced from the master node:

Sync from master
    $ drbdadm -- --discard-my-data connect <drbd_volume>

Check the sync state in /proc/drbd:

Check sync state
    $ cat /proc/drbd
    version: 8.3.7 (api:88/proto:86-91)
    srcversion: EE47D8BF18AC166BE219757
     0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r.
        ns:912248 nr:0 dw:0 dr:920640 al:0 bm:55 lo:1 pe:388 ua:2048 ap:0
        [===>................] sync'ed: 21.9% (3283604/4194304)K
        finish: 1:08:24 speed: 580 (452) K/sec

Mount your DRBD device when finished.

Page 49: Drbd

DRBD: Ninja tricks

What to do in Split brain case

Note: there is a difference between "invalidate" and "discard-my-data":

- drbdadm invalidate: This forces the local device of a pair of connected DRBD devices into SyncTarget state, which means that all data blocks of the device are copied over from the peer. This command will fail if the device is not part of a connected device pair.

- drbdadm -- --discard-my-data: Use this option to manually recover from a split-brain situation. In case you do not have any automatic after-split-brain policies selected, the nodes refuse to connect. By passing this option you make this node a sync target immediately after successful connect.
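As a recap, a manual split-brain recovery can be written down as a short plan. The sketch below only prints the commands; the function name `drbd_split_brain_plan` and the "victim"/"survivor" labels are hypothetical. It assumes that the victim (the node whose changes are thrown away) is demoted to secondary before it reconnects with --discard-my-data, and that the survivor only needs `connect` if it is in StandAlone state.

```shell
#!/bin/sh
# Print a manual split-brain recovery plan.
# $1 = resource name
# $2 = victim host   (its local changes will be discarded)
# $3 = survivor host (its data is kept and becomes the sync source)
drbd_split_brain_plan() {
    res=$1
    victim=$2
    survivor=$3
    # The victim must not be primary (unmount the filesystem first).
    echo "[$victim] drbdadm secondary $res"
    # The victim reconnects as sync target.
    echo "[$victim] drbdadm -- --discard-my-data connect $res"
    # The survivor reconnects; only needed if it is StandAlone.
    echo "[$survivor] drbdadm connect $res"
}

drbd_split_brain_plan drbd0 srv2 srv1
```

Printing the plan with explicit host labels makes it harder to run --discard-my-data on the wrong node, which would throw away the data you wanted to keep.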

Page 51: Drbd

DRBD: Ninja tricks

Mount secondary node as read only

There are several ways to mount a secondary node in a read-only state, but also several ways to break your data on the secondary node. This method is the best one to avoid a full resync. First, create a snapshot of the logical volume that backs the DRBD device:

Create logical volume snapshot
    $ lvcreate -s -n <lv>-snapshot -L 2G /dev/<vg>/<lv>

You’re now able to mount the snapshot as a read-only device. To make it work, you absolutely need to specify the filesystem:

Mount filesystem
    $ mount -t ext4 -o ro /dev/<vg>/<lv>-snapshot /mountpoint

Back up all the data you need.

Page 52: Drbd

DRBD: Ninja tricks

Mount secondary node as read only

You can check at any time that your data is still being synced with the ’drbd-overview’ command. When you’ve finished, you need to unmount the mountpoint:

Unmount
    $ umount /mountpoint

and remove the snapshot logical volume:

Delete LV
    $ lvremove /dev/<vg>/<lv>-snapshot
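The whole snapshot cycle (create, mount read-only, unmount, remove) can be generated from a few parameters. The helper below is a dry-run sketch: `lvm_snapshot_plan` and its arguments are hypothetical names, the 2G snapshot size is simply the value used in these slides, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Print the commands for a read-only snapshot cycle on the secondary node.
# $1 = volume group, $2 = logical volume, $3 = mount point, $4 = filesystem type
lvm_snapshot_plan() {
    vg=$1
    lv=$2
    mnt=$3
    fstype=$4
    snap="/dev/${vg}/${lv}-snapshot"
    echo "lvcreate -s -n ${lv}-snapshot -L 2G /dev/${vg}/${lv}"  # take the snapshot
    echo "mount -t ${fstype} -o ro ${snap} ${mnt}"               # mount it read-only
    echo "umount ${mnt}"                                         # after the backup
    echo "lvremove ${snap}"                                      # drop the snapshot
}

lvm_snapshot_plan vg0 home /mnt/snap ext4
```

The snapshot must stay large enough to hold the blocks that change while it exists, so a long backup on a busy volume may need more than the 2G shown here.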

Page 54: Drbd

DRBD: Ninja tricks

All is good, but still doesn’t want to connect

Your configuration is all ok and your logs don’t show any errors, but the connect command doesn’t seem to do anything. The last resort is to try the ’adjust’ command :

Adjust
$ drbdadm adjust <drbd_volume>
$ drbdadm connect <drbd_volume>

This should work now
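To confirm that the connection really came up, the connection state can be polled after the two commands. The sketch below assumes that `drbdadm cstate <resource>` prints the connection state (as it does in DRBD 8.x) and treats `Connected`, `SyncSource` and `SyncTarget` as success. The helper names, the `DRBDADM` override variable and the retry count are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: run adjust + connect, then poll the connection state.
# DRBDADM can be overridden, e.g. to point at a wrapper (assumption for this sketch).
DRBDADM="${DRBDADM:-drbdadm}"

# wait_connected <resource> [tries]: poll once per second until the peer is reachable.
wait_connected() {
    res="$1"; tries="${2:-10}"; state=""
    while [ "$tries" -gt 0 ]; do
        state="$("$DRBDADM" cstate "$res" 2>/dev/null)"
        case "$state" in
            Connected|SyncSource|SyncTarget) return 0 ;;
        esac
        tries=$((tries - 1))
        sleep 1
    done
    echo "$res is still in state: ${state:-unknown}" >&2
    return 1
}

# reconnect <resource> [tries]: the two commands above, followed by the check.
reconnect() {
    "$DRBDADM" adjust "$1"
    "$DRBDADM" connect "$1"
    wait_connected "$1" "${2:-10}"
}
```

A call such as `reconnect <drbd_volume> 30` returns non-zero and reports the last state seen if the peer is still unreachable after about 30 seconds.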

Can’t "create-md" on an old DRBD device

If you get this kind of error when you try to recreate the metadata (md) on an old, existing device :

Initiate DRBD device
$ drbdadm create-md <drbd_volume>

Found some data
==> This might destroy existing data! <==

Do you want to proceed?
[need to type 'yes' to confirm] yes

drbdadm create-md nameResource: exited with code 40

To resolve the problem, wipe the existing DRBD metadata on the current device :

Flush disk data
$ drbdadm wipe-md <drbd_volume>

Now restart the create procedure.

If it doesn’t work, because you no longer have your configuration for example, you can zero the beginning of the backing device manually. This is destructive, so double-check the device name first :

Flush disk data
$ dd if=/dev/zero of=/dev/sdaX bs=1M count=128

Now restart the create procedure.
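Because the `dd` command overwrites whatever device it is given, a small guard in front of it can help. The following is a minimal sketch, not part of the original procedure: the helper names `is_mounted` and `wipe_head` are hypothetical, and the mount check assumes a Linux `/proc/mounts` file.

```shell
#!/bin/sh
# Sketch: refuse to zero something that is not a block device or is mounted.

# is_mounted <device> [mounts_file]: succeed if <device> is a mounted source.
is_mounted() {
    dev="$1"; mounts="${2:-/proc/mounts}"
    # The first field of each line in /proc/mounts is the mounted source.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# wipe_head <device>: zero the first 128 MiB, as in the manual command above.
wipe_head() {
    dev="$1"
    if [ ! -b "$dev" ]; then
        echo "$dev is not a block device" >&2
        return 1
    fi
    if is_mounted "$dev"; then
        echo "$dev is mounted, refusing to wipe it" >&2
        return 1
    fi
    dd if=/dev/zero of="$dev" bs=1M count=128
}
```

`wipe_head /dev/sdaX` then behaves like the manual command, but stops early on a mistyped path or a mounted device.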

My DRBD device is still there but not the configuration

You may have wanted to remove the DRBD device, but unfortunately you did it the wrong way, or an automatic script didn’t do it properly. The solution consists in recreating the configuration file to recover the device.

Warning

Before creating a configuration file in the drbd.d folder, check that you’ve configured the correct device and ports to avoid conflicting with already used DRBD devices.

You can check that you’ve recovered your device with drbd-overview. Then you’ll be able to follow the remove procedure.
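The conflict check from the warning can be sketched with `grep`. This is only a rough sketch under assumptions: it expects resource files named `*.res` in the configuration folder, with the usual `device /dev/drbdX;` and `address <ip>:<port>;` statements, and the function name `drbd_conflicts` is hypothetical. It does not parse the configuration properly, so commented-out lines also match.

```shell
#!/bin/sh
# Sketch: list resource files that already use a given DRBD device or TCP port.

# drbd_conflicts <conf_dir> <device> <port>: print the conflicting *.res files.
drbd_conflicts() {
    dir="$1"; device="$2"; port="$3"
    {
        # A "device" statement naming exactly the same /dev/drbdX.
        grep -l -E "(^|[[:space:]{;])device[[:space:]]+${device}([[:space:]]|;)" "$dir"/*.res
        # An "address" statement ending with the same TCP port.
        grep -l -E "(^|[[:space:]{;])address[[:space:]][^;]*:${port}[[:space:]]*;" "$dir"/*.res
    } 2>/dev/null | sort -u
}
```

For example, `drbd_conflicts /etc/drbd.d /dev/drbd3 7791` prints nothing when neither the device nor the port is referenced by an existing resource file.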

Speed up sync transfer rate

If you need to speed up the transfer rate because you need to do a full sync between two hosts, you can change it on the fly. It does not make sense to set a synchronization rate higher than the maximum write throughput of your secondary node, so set it to a suitable rate on the secondary node :

Set sync rate
$ drbdsetup /dev/<drbd_volume> syncer -r 900M

Then, when you’ve finished and want to revert to the value from your configuration :

Reset sync rate
$ drbdadm adjust <drbd_volume>
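The rule about not exceeding the secondary node’s write throughput can be sketched as a small helper that keeps the lower of two figures. This is an illustrative sketch: the function name `pick_sync_rate` is hypothetical, and both figures (replication link throughput and secondary disk write throughput, in whole MB/s) are assumed to have been measured beforehand.

```shell
#!/bin/sh
# Sketch: cap the sync rate at the slower of the link and the secondary's disk.

# pick_sync_rate <link_MBps> <disk_write_MBps>: print the lower value with an M suffix.
pick_sync_rate() {
    link="$1"; disk="$2"
    if [ "$link" -lt "$disk" ]; then
        echo "${link}M"
    else
        echo "${disk}M"
    fi
}
```

The result can then be passed to the command above, for example `drbdsetup /dev/<drbd_volume> syncer -r "$(pick_sync_rate 110 400)"`.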

www.enovance.com