a live data synchronization using clsync - linuxdays · inotify[choice].. +...

27
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A live data synchronization using clsync Andrey Savchenko NRNU MEPHI, Moscow, Russia 08 Oct 2016 Andrey Savchenko clsync 08 Oct 2016 1 / 26

Upload: others

Post on 11-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

.

...... A live data synchronization using clsync

Andrey Savchenko

NRNU MEPHI, Moscow, Russia

08 Oct 2016

Andrey Savchenko clsync 08 Oct 2016 1 / 26

Page 2: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Motivation

Our task:

Deploy HA/LB clusters:

hostingVoIPcorporate services

HPC systems management and synchronization

Backups of all the above

... and we have just limited hardware resources

Andrey Savchenko clsync 08 Oct 2016 2 / 26

Page 3: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Infrastructure

Separation by rings based on importance and trustHA and backups for each ring

Andrey Savchenko clsync 08 Oct 2016 3 / 26

Page 4: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Synchronization approaches

Possible solutions:

RO file systemslimited application area

block replicationunacceptable performance (DRBD + OCFS2)not resistant to split brain

network file systems (e.g. CEPH)high latency (on 1 Gb/s)kernel panics (year 2012)

file level replication

Andrey Savchenko clsync 08 Oct 2016 4 / 26

Page 5: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Choosing software

lsyncd: best match, but:high CPU load (>½ code on LUA)bugs and complexity of LUA code supportno threading, bad for large dirs (> 106 objects)limited events aggregationno *.so supportno BSD support

incronno recursion, no events

csync2no events, extremely slow

librnotifynot existed at that time :)bare inotify interface

Andrey Savchenko clsync 08 Oct 2016 5 / 26

Page 6: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

How clsync works

Handler can be anything, but common case is rsync.

Andrey Savchenko clsync 08 Oct 2016 6 / 26

Page 7: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Choosing monitoring subsystemLinux:.dnotify..

......

+ can trace any event- individual files can’t be monitored- umount is blocked- signal-based notification- requires man stat(), only fd is provided

.inotify [choice]..

......

+ epoll-based notification+ full event info provided- no recursion- watch ⇒ path map required

Andrey Savchenko clsync 08 Oct 2016 7 / 26

Page 8: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Choosing monitoring subsystem.fanotify..

......

+ recursion support+ returns fd and pid Rightarrow path known- no support for delete, move, rename events ;−(

.FreeBSD..

......

kqueue/keventlibinotify (using kqueue)BSM API (not original purpose)dtrace (unusable, path can’t be extracted)

There are no worthy Linux inotify replacements in FreeBSD.Best match is kqueue interface.

Andrey Savchenko clsync 08 Oct 2016 8 / 26

Page 9: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

How clsync works

On default settings:

...1 Initialization

...2 Set inotify watches

...3 Full file tree sync

...4 Resync for new events

...5 Wait, event aggregation and ⇑

Andrey Savchenko clsync 08 Oct 2016 9 / 26

Page 10: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Handlers

synchandler modes:

simple — per event per file app calldirect — per event app callshell — shell call on any sync eventrsyncdirect — direct rsync callrsyncshell — call for rsync wrapperrsyncso — clsyncapi_rsync() callback with listing suitablefor rsyncso — load *.so with generic clsyncapi_sync() callbackusing simple file listing

Andrey Savchenko clsync 08 Oct 2016 10 / 26

Page 11: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Security

Security features optionally engaged:privilege dropuse of capabilitiesnamespace isolationcgroups isolationthread splitting (priv + common)seccomp isolation

mprotect() banned ⇒ no thread splittingprocess splitting (priv + common)

Andrey Savchenko clsync 08 Oct 2016 11 / 26

Page 12: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Threading performance

pthread mutex is not designed for high-speed locks

Andrey Savchenko clsync 08 Oct 2016 12 / 26

Page 13: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Spinlock to mutex switch

tested on init sync procedure

Andrey Savchenko clsync 08 Oct 2016 13 / 26

Page 14: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Smart locks

Andrey Savchenko clsync 08 Oct 2016 14 / 26

Page 15: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

More features

threading support (thread per sync event)

regexp support and fs object type selection

UNIX-socket control interface

auto switch between spin_lock и mutex

fast initial sync

Andrey Savchenko clsync 08 Oct 2016 15 / 26

Page 16: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Overhead measurement

lsyncd v.2.1.xclsync v.0.4.x4789558 files and dirs on tmpfs-F allows to bypass rules on initial syncAndrey Savchenko clsync 08 Oct 2016 16 / 26

Page 17: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Usage

.Mirror directory..

......clsync -Mrsyncdirect -W/path/to/source_dir \-D/path/to/destination_dir

.One-time sync..

......

clsync --exit-on-no-events --max-iterations=20 \--mode=rsyncdirect -W/var/www_new -Srsync -- \%RSYNC-ARGS% /var/www_new/ /var/www/

Andrey Savchenko clsync 08 Oct 2016 17 / 26

Page 18: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Usage.Live correction of web site perms..

......

clsync -w1 -t1 -T1 -x1 \-W/var/www/site.example.org/root \-Mdirect -Schown --uid 0 --gid 0 -Ysyslog -b1 \--modification-signature uid,gid -- \--from=root www-data:www-data %INCLUDE-LIST%.Troubleshooting..

......

Cannot inotify_add_watch() on [...]:No space left on device (errno: 28)

Increase sysctl fs.inotify.max_user_watches:one watch per dirno recursionkernel default is 8192

Andrey Savchenko clsync 08 Oct 2016 18 / 26

Page 19: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Usage.Live correction of web site perms..

......

clsync -w1 -t1 -T1 -x1 \-W/var/www/site.example.org/root \-Mdirect -Schown --uid 0 --gid 0 -Ysyslog -b1 \--modification-signature uid,gid -- \--from=root www-data:www-data %INCLUDE-LIST%.Troubleshooting..

......

Cannot inotify_add_watch() on [...]:No space left on device (errno: 28)

Increase sysctl fs.inotify.max_user_watches:one watch per dirno recursionkernel default is 8192

Andrey Savchenko clsync 08 Oct 2016 18 / 26

Page 20: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Status

Main functionality is implementedMaintenance modeLatest release 0.4.2 a week ago ;−)

Distro support:

Gentoo official (w/o *-fbsd)Debian official (w/o fbsd)RH-based RPM availableFreeBSD ports available

The only mandatory dep is glib.

Andrey Savchenko clsync 08 Oct 2016 19 / 26

Page 21: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Summary

Contacts:Github https://github.com/xaionaro/clsyncIRC Freenode: #clsynce-mail [email protected]

[email protected]

Acknowledgements:Dmitry Yu. Okunev — main clsync authorArtyom A. Anikeev and Barak A. Pearlmutter for packagingoldlaptop and Enrique Martinez for their help

Thank you for your attention!

Andrey Savchenko clsync 08 Oct 2016 20 / 26

Page 22: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Backup slides

Backup slides follow...

Andrey Savchenko clsync 08 Oct 2016 21 / 26

Page 23: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Limited resources

What we have:

Limited hardware:About 10 blades1 Gb/s interconnect (not very stable)short on RAM and HDDs

> 100 tasks (containers):main university web sitedepartments sitesVoIPVPN and Wi-Fi for studentsinternal services (e-mail, ntp, sks,...)

Andrey Savchenko clsync 08 Oct 2016 22 / 26

Page 24: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Possible approaches

What can be done?

Virtual machines ⇒ containers (LXC)

Disk data deduplication:base image + aufs/overlayfs

Cheap HA/LB clusters

Andrey Savchenko clsync 08 Oct 2016 23 / 26

Page 25: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

File synchronization

Requirements:

performance (minimal overhead)

high availability (max few seconds downtime)

reliability (failure minimization)

versatility (wide range of use cases)

fine tuning over aggregation

Andrey Savchenko clsync 08 Oct 2016 24 / 26

Page 26: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Infrastructure core

LXCExt4 с clsync + rsyncPerconaAndrey Savchenko clsync 08 Oct 2016 25 / 26

Page 27: A live data synchronization using clsync - LinuxDays · inotify[choice].. + epoll-basednotification + fulleventinfoprovided - norecursion - watch) pathmaprequired ... A live data

..........

.....

.....................................................................

.....

......

.....

.....

.

Spinlock on single core

Andrey Savchenko clsync 08 Oct 2016 26 / 26