systemd @ facebook -- a year later
![Page 1: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/1.jpg)
![Page 2: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/2.jpg)
systemd @ FB – a year later
Davide Cavalca, Production Engineer
![Page 3: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/3.jpg)
Agenda
• Recap
• Tracking upstream
• Resource management
• Service monitoring
• Case studies
• Advocacy
![Page 4: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/4.jpg)
Recap
![Page 5: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/5.jpg)
![Page 6: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/6.jpg)
Recap: CentOS 7 migration
• 100% of the bare metal fleet on CentOS 7!
• Migrated countless services to systemd
• libsystemd integration in our build system
• Containers: see Zeal’s talk later today!
![Page 7: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/7.jpg)
Tracking upstream
![Page 8: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/8.jpg)
Tracking upstream: Staying up to date
• systemd 231 → 232 → 233 (→ 234 → 235)
• Also tracking util-linux, dbus, etc.
• Published our Rawhide-based backports on: https://github.com/facebookincubator/rpm-backports
• Binary RPMs based on it on: https://copr.fedorainfracloud.org/coprs/jsynacek/systemd-backports-for-centos-7/
![Page 9: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/9.jpg)
Tracking upstream: RPM issues
• Not specific to systemd
• Duplicate systemd RPMs: package-cleanup wrapper
• rpmdb corruption: dcrpm
• Mismatch between systemd and systemd-libs
    if ldd /usr/lib/systemd/systemd | grep 'systemd.*not found$'; then
      yum reinstall -y $systemd_packages
    fi
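The same mismatch check can be sketched in Python, for example as part of a health-check script. This is only an illustration under stated assumptions: the function names are hypothetical, and the library name in the usage note below is an example of what `ldd` might print, not output captured from a real host.

```python
import re
import subprocess
from typing import List

# Matches ldd lines such as "libsystemd-shared-233.so => not found".
MISSING_RE = re.compile(r"^\s*(\S*systemd\S*)\s+=>\s+not found\s*$")


def missing_systemd_libs(ldd_output: str) -> List[str]:
    """Return the systemd libraries that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        match = MISSING_RE.match(line)
        if match:
            missing.append(match.group(1))
    return missing


def check_binary(path: str = "/usr/lib/systemd/systemd") -> List[str]:
    """Run ldd on the binary and list its unresolved systemd libraries."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_systemd_libs(result.stdout)
```

A non-empty result from `check_binary()` would correspond to the case where the shell snippet triggers the `yum reinstall`.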
![Page 10: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/10.jpg)
Tracking upstream: Meson and compat-libs
• Rebuild packaging for the Meson transition
• Backported meson, ninja-build in CentOS
• Standalone systemd-compat-libs: https://github.com/facebookincubator/systemd-compat-libs
![Page 11: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/11.jpg)
Tracking upstream: tty woes with 234
• When rolling 234 we discovered a race in the kernel tty subsystem (repros all the way back to 4.0)
• Turns out both systemd and Tupperware use the real tty0
• Investigation still in progress, likely a use-after-free bug
• Tupperware should probably just use a pty here
![Page 12: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/12.jpg)
Resource management
![Page 13: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/13.jpg)
Resource management: Rolling out cgroup2
• See Chris’s talk tomorrow for all things cgroup2!
• Using systemd to partition services and apply limits
• Lightweight daemon to collect metrics from /sys/fs/cgroup
• Chef API to apply configurations and manage experiments
![Page 14: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/14.jpg)
Resource management: Slice hierarchy
    /
    |
    |-system.slice
    |
    |-workload.slice
    | |
    | +-critical-wdb.slice
    |
    +-tbd.slice
![Page 15: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/15.jpg)
Service monitoring
![Page 16: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/16.jpg)
Service monitoring: Getting metrics out of systemd
• systemd exposes lots of useful metrics over dbus
• Unit properties (e.g. *Timestamp*, NRestarts)
• Status events (e.g. unit state changes)
• Options: python-dbus, sd-bus, coreos/go-systemd/dbus
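To give a feel for the unit properties mentioned above, here is a minimal Python sketch. Note the swap: it shells out to `systemctl show` instead of talking to dbus directly, which is simpler to demonstrate but is not the approach the slide lists. The helper names are hypothetical.

```python
import subprocess
from typing import Dict, Iterable


def parse_show_output(text: str) -> Dict[str, str]:
    """Parse the KEY=VALUE lines printed by `systemctl show`."""
    props = {}
    for line in text.splitlines():
        # Split on the first "=" only, since values may contain "=" too.
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def unit_properties(unit: str, names: Iterable[str]) -> Dict[str, str]:
    """Fetch the selected properties of a unit via the systemctl CLI."""
    cmd = ["systemctl", "show", unit, "--property=" + ",".join(names)]
    out = subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout
    return parse_show_output(out)
```

For example, `unit_properties("sshd.service", ["NRestarts", "ActiveEnterTimestamp"])` would return a dictionary with those two keys on a host whose systemd version exposes both properties.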
![Page 17: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/17.jpg)
Service monitoring: systemdmon
• Lightweight daemon to feed systemd metrics to various monitoring systems
• Polling for unit properties, subscriptions for status events
• Initial implementation in golang
![Page 18: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/18.jpg)
Service monitoring: pystemd
• Thin Cython wrapper on top of sd-bus
• Expose systemd dbus object model
• ipython REPL for prototyping
• Will be open sourced together with systemdmon
![Page 19: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/19.jpg)
Case studies
![Page 20: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/20.jpg)
Case studies: dbus reliability
• Issues with dbus-daemon or the system bus affect systemd
• systemctl hanging or failing → Chef failing
• Easy to DoS the bus, especially with user services
• Hard to remediate without a reboot
• Looking forward to dbus-broker!
![Page 21: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/21.jpg)
Case studies: rpm macros for systemd services
• By default RPM macros will restart units on upgrade...
• ...which is a problem if you’ve also set up Chef to restart them
• Solution: knob in our internal packaging tool to optionally disable the restart macro
![Page 22: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/22.jpg)
Case studies: Logging
• journald setup: 10 MB of in-memory logging feeding rsyslog
• journalctl is awesome
• Double writing problem
• No way to set per-unit limits
![Page 23: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/23.jpg)
Case studies: Unit loops
• Easy to create loops with x-systemd.requires in fstab
• systemd will delete a random unit to break loops
• Solution: add _netdev to the fstab entry
• systemd-analyze to help debugging
systemd-tmpfiles-setup.service: Job systemd-tmpfiles-setup.service/start deleted to break ordering cycle starting with smc_proxy.service/start
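Because the unit that gets dropped is effectively arbitrary, it can help to surface these messages in monitoring. The sketch below pulls the relevant fields out of a log line shaped like the one above. The regular expression is derived from that single sample, so treat it as an assumption about the message format, not a stable interface. The names are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class BrokenCycle(NamedTuple):
    deleted_unit: str  # unit whose job was dropped
    job_type: str      # e.g. "start"
    cycle_start: str   # unit the ordering cycle starts with


CYCLE_RE = re.compile(
    r"Job (\S+)/(\w+) deleted to break ordering cycle "
    r"starting with (\S+)/\w+"
)


def parse_cycle_message(line: str) -> Optional[BrokenCycle]:
    """Return details of a broken ordering cycle, or None if no match."""
    match = CYCLE_RE.search(line)
    if match is None:
        return None
    return BrokenCycle(match.group(1), match.group(2), match.group(3))
```

Feeding it the message from the slide yields `systemd-tmpfiles-setup.service` as the deleted unit and `smc_proxy.service` as the start of the cycle.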
![Page 24: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/24.jpg)
Case studies: Transient unit creep
• systemd-run creates units in /run/systemd/transient
• If the unit fails, it sticks around in ‘failed’ state
• 10k failed units → 50% CPU usage for pid 1
• 30k failed units → 100% CPU usage for pid 1
• Fix: call systemctl reset-failed periodically
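One way to picture the periodic cleanup is the Python sketch below, which counts failed units and only calls `systemctl reset-failed` once a threshold is reached. The threshold, the function names and the injectable `run` parameter are all assumptions made for illustration. The slide only says to call `reset-failed` periodically, for instance from a timer or cron job.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_command(cmd: List[str]) -> str:
    """Run a command and return its standard output."""
    return subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout


def count_failed_units(run: Runner = run_command) -> int:
    """Count units currently in the 'failed' state."""
    out = run(
        ["systemctl", "list-units", "--state=failed", "--no-legend", "--plain"]
    )
    return sum(1 for line in out.splitlines() if line.strip())


def reset_failed_if_needed(threshold: int, run: Runner = run_command) -> bool:
    """Clear failed units once their count reaches the threshold."""
    if count_failed_units(run) < threshold:
        return False
    run(["systemctl", "reset-failed"])
    return True
```

Keep in mind that `reset-failed` without arguments clears the failed state of every unit, so anything that needs the failure details should collect them first.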
![Page 25: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/25.jpg)
Case studies: KillMode=process
• KillMode=process may leave stray processes in the cgroup
• Changes to unit slices don’t apply unless the old slice is empty
• Fix: move to KillMode=control-group
![Page 26: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/26.jpg)
Case studies: Unit escaping
• Escape logic relies on shell control characters: /dev/dm-1 → dev-dm\x2d1.swap
• Chef fix: https://github.com/chef/chef/pull/6230
• path_to_unit wrapper in fb_systemd
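To make the escaping rule concrete, here is a rough Python approximation of what `systemd-escape --path` does. It is a sketch, not the fb_systemd `path_to_unit` helper and not systemd's own code: the function names are hypothetical, and path simplification is reduced to collapsing slashes.

```python
import string

# Characters that are kept as-is; "-" and "\" are deliberately excluded
# because they carry meaning in escaped unit names.
_SAFE = set(string.ascii_letters + string.digits + ":_.")


def escape_path(path: str) -> str:
    """Approximate systemd's path escaping for unit names."""
    # Collapse repeated slashes and drop leading/trailing ones.
    parts = [part for part in path.split("/") if part]
    if not parts:
        return "-"  # the root directory maps to "-"
    out = []
    for index, byte in enumerate("/".join(parts).encode("utf-8")):
        char = chr(byte)
        if char == "/":
            out.append("-")
        elif char in _SAFE and not (index == 0 and char == "."):
            out.append(char)
        else:
            # Everything else becomes a C-style \xXX escape.
            out.append("\\x%02x" % byte)
    return "".join(out)


def path_to_unit_name(path: str, suffix: str) -> str:
    """Build a unit name such as 'dev-dm\\x2d1.swap' from a path."""
    return escape_path(path) + "." + suffix
```

The backslash in the result is what trips up shells and tools that treat it as an escape character, so the name has to be quoted carefully wherever it is passed on a command line.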
![Page 27: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/27.jpg)
Advocacy
![Page 28: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/28.jpg)
Advocacy
• Announce core package updates widely
• Tailor documentation to customer use cases
• Encourage people to engage upstream directly
• Tech talks
![Page 29: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/29.jpg)
Questions?
![Page 30: systemd @ Facebook -- a year later](https://reader030.vdocuments.net/reader030/viewer/2022020108/5a66478d7f8b9afe4c8b4787/html5/thumbnails/30.jpg)