Linux Kernel Summit, November 2010
Linux-CR: Transparent Application Checkpoint-Restart in Linux
Oren Laadan, Columbia University

Application C/R
◆ Application Checkpoint/Restart: a mechanism to save the state of running application(s) so that they can later resume execution from that point

Application C/R
[Figure: a checkpoint saves the original process hierarchy into a checkpoint image; a restart rebuilds the restored hierarchy from that image.]

What is it good for?
◆ Application roll back to the past
◆ Application suspend and resume
◆ Application migration

Application rollback
◆ Fault tolerance
  ◆ long running applications
  ◆ cloud, HPC, at work, at home
◆ Effective debugging
  ◆ Super-core-dump
    ● more details, multiple tasks
  ◆ re-run from checkpoint
    ● trace, profile, and instrument
◆ Fast application start-up
  ◆ from default/previous state (ccache...)
  ◆ improve desktop boot time
◆ Software testing
  ◆ repeat from specific point(s)
  ◆ distribute on multiple hosts
◆ Generic time-machine
  ◆ revive old server/desktop state
  ◆ retry a move in a game

Application suspend/resume
◆ Improved OOM handling
  ◆ suspend applications, don't kill
  ◆ smart “swap” on embedded
◆ Better system utilization
  ◆ suspend application to reduce load
◆ Suspend/resume a user's session
  ◆ mobile desktop on USB key
  ◆ linux based VPS/VDI systems

Application Migration
◆ Load balancing / resource sharing
  ◆ HPC (e.g. BlueWaters project)
  ◆ cloud environments
  ◆ linux-based VPS/VDI
◆ Zero-downtime maintenance
  ◆ live migration of applications
◆ High availability
  ◆ primary/backup in lock-step
  ◆ frequent incremental checkpoints

Application vs Virtual-Machine
| | Application C/R | Virtual Machine |
|---|---|---|
| granularity | specific applications | operating system as a whole unit |
| saved state | application state only | entire operating system state |
| overhead | none | visible |
| flexibility | application awareness | operating system is black box |
| deployment | linux only | same arch family |

Some examples
◆ HPC environments
  ◆ can extend linux-cr → distributed-cr
◆ Cloud deployments
  ◆ using linux containers
◆ Light-weight clusters of ARMs
  ◆ combine LXC and linux-cr

Some concrete examples
◆ BlueWaters
  ◆ NCSA's most powerful supercomputer
  ◆ checkpointing based on linux-cr
◆ OpenVZ
  ◆ VPS hosting with migration capabilities
◆ Canonical / Ubuntu
  ◆ add LXC & linux-cr in UEC cluster stack

Who and Who
◆ Who is doing ?
  ◆ Matt Helsley, Dan Smith, Serge Hallyn, Nathan Lynch, Sukadev Bhattiprolu, me
◆ Who is interested ?
  ◆ IBM, Canonical, OpenVZ, HPC industry, Kerrighed, Google (?), ...
◆ Who else does/did ?
  ◆ OS: AIX, OpenVZ, IRIX, Cray...
  ◆ Systems: Moab, BLCR/Beowulf, Condor...

Linux-C/R design goals
◆ Transparency
  ◆ applications oblivious to operation
  ◆ allow notification of checkpoint or restart
  ◆ allow application awareness
◆ Reliability
  ◆ checkpoint succeeds → restart succeeds
  ◆ report the reasons when something is not checkpointable
  ◆ checkpoint is non-intrusive
◆ Security/safety
  ◆ ptrace capabilities to checkpoint
  ◆ reuse kernel code to reconstruct state
◆ Performance
  ◆ zero impact on performance
  ◆ reasonable code footprint
◆ Maintainability
  ◆ next slide ...

Maintainability
◆ Placement of C/R code
  ◆ generic code in kernel/checkpoint/...
  ◆ most c/r code with or near subsystem code so subsystem maintainers see it
  ◆ c/r is well documented
◆ Extensive test-suite
  ◆ test large list (>120) of scenarios
  ◆ test before/during/after behavior
◆ Positive experience so far (2.6.27 → today)
  ◆ can ignore most kernel changes
  ◆ mainly need to add features
  ◆ minor changes to prior c/r code
    ● e.g. splice/pipe, syscalls #s, mm helpers
◆ Impact on developers
  ◆ understand what may affect c/r code
  ◆ at least notify c/r people when needed
  ◆ awareness will grow with exposure

Design Summary
◆ Save/restore state in-kernel
◆ Checkpoint container/subtree/self
◆ Image holds “user-visible” state
◆ Userspace image conversion
◆ Detailed error reporting

Checkpoint
(1) Freeze process hierarchy
(2) Save global data
(3) Save process hierarchy
(4) Save state of all tasks
(?) Filesystem snapshot
(5) Thaw/kill process hierarchy
[slide label: in-kernel]
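
The ordering of the checkpoint steps can be illustrated with a small userspace toy model. This is only a sketch: the `Task` class, the `checkpoint` function, and the JSON image layout are hypothetical names invented for illustration and are not the linux-cr interface or image format. The optional filesystem snapshot step is left out.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Task:
    """Toy stand-in for a process; not a real kernel task (hypothetical)."""
    pid: int
    ppid: int
    state: dict = field(default_factory=dict)
    frozen: bool = False


def checkpoint(tasks, global_data):
    """Follow the slide's order: freeze, save global data, save the
    hierarchy, save per-task state, then thaw."""
    # (1) Freeze every task so its state cannot change while being saved.
    for t in tasks:
        t.frozen = True
    try:
        image = {
            # (2) Global data shared by the whole hierarchy.
            "global": dict(global_data),
            # (3) Process hierarchy as [pid, parent pid] pairs.
            "hierarchy": [[t.pid, t.ppid] for t in tasks],
            # (4) Per-task state, keyed by pid.
            "tasks": {str(t.pid): dict(t.state) for t in tasks},
        }
        return json.dumps(image, sort_keys=True)
    finally:
        # (5) Thaw the hierarchy whether or not saving succeeded.
        for t in tasks:
            t.frozen = False
```

Thawing in a `finally` clause reflects the "checkpoint is non-intrusive" goal in this toy setting: a failed save still leaves the tasks running.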

Restart
(1) Create container
(?) Restore (stage) filesystem
(3) Create process hierarchy
(4) Restore state of all tasks
(5) Resume execution
[slide label: in-kernel]
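
The restart sequence can be sketched the same way, as a userspace toy model. Everything here is an assumption made for illustration: the `IMAGE` literal, the `restart` function, the dictionary "container", and the use of parent pid 0 to mark the root task are all hypothetical and do not describe the real linux-cr image or API. The optional filesystem staging step is omitted.

```python
import json

# Hypothetical image in a made-up toy format (not the linux-cr format).
IMAGE = json.dumps({
    "global": {"hostname": "demo"},
    "hierarchy": [[1, 0], [2, 1], [3, 1]],
    "tasks": {"1": {"pc": 10}, "2": {"pc": 20}, "3": {"pc": 30}},
})


def restart(image_text):
    image = json.loads(image_text)
    # (1) Create a fresh container to hold the restored tasks.
    container = {"global": dict(image["global"]), "tasks": {}, "running": False}
    # (3) Recreate the process hierarchy, parents before children.
    #     A parent pid of 0 marks the root task in this toy model.
    for pid, ppid in image["hierarchy"]:
        if ppid != 0 and ppid not in container["tasks"]:
            raise ValueError(f"parent {ppid} of task {pid} not yet created")
        container["tasks"][pid] = {"ppid": ppid, "state": None}
    # (4) Restore the saved state of every task.
    for pid_text, state in image["tasks"].items():
        container["tasks"][int(pid_text)]["state"] = dict(state)
    # (5) Resume execution only after all state is in place.
    container["running"] = True
    return container
```

Creating the whole hierarchy before restoring any task state mirrors the step order on the slide: state is restored into tasks that already exist.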

Current State
◆ Supported subsystems:
  ◆ tasks (threads, signals, credentials, etc)
  ◆ namespaces (all but mounts-ns)
  ◆ sysvipc (shm, msg, sem)
  ◆ files, dirs (regular, fifos/pipes, epoll, event, simple devices)
  ◆ sockets (unix, ipv4, ipv6)
  ◆ security (smack, selinux labels)

Current State
◆ What's missing
  ◆ [reviewed] file locks, leases, owner
  ◆ [reviewed] unlinked files/dirs
  ◆ [wip] fanotify/inotify/dnotify
  ◆ [wip] mounts, mount-ns
  ◆ /proc filesystem
  ◆ ptraced tasks
  ◆ more devices

Current State
◆ Supported architectures:
  ◆ x86-32
  ◆ x86-64
  ◆ s390x
  ◆ PowerPC
  ◆ ARM

Current State
◆ Code:
  ◆ ~23K lines in total
    ● ~1200 lines documentation
    ● ~600 lines per arch (x5)
    ● ~8K lines in kernel/checkpoint/* (base)
    ● ~7K lines “near-place” files (*/checkpoint.c)
    ● ~2K lines “in-place” save/restore
    ● ~2K lines for LSM
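
As a quick sanity check, the per-component figures can be summed to see that they agree with the ~23K total. The numbers below are the approximate values from the slide; the dictionary keys are just descriptive labels.

```python
# Approximate line counts as listed on the slide.
components = {
    "documentation": 1200,
    "per-arch code (5 architectures)": 600 * 5,
    "kernel/checkpoint/* (base)": 8000,
    "near-place files (*/checkpoint.c)": 7000,
    "in-place save/restore": 2000,
    "LSM": 2000,
}

total = sum(components.values())
print(f"total ~= {total} lines")  # prints "total ~= 23200 lines"
```

The sum is 23,200 lines, which rounds to the stated ~23K.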

Discussion ...
◆ Concrete path to mainline (mm/next?)
◆ Exposure to subsystem maintainers ?
◆ Image format tied to kernel version (userspace conversion tools)

Many thanks to those who reviewed, tested, and provided suggestions!
● Web page: http://www.linux-cr.org/
● Git tree(s): git://www.linux-cr.org/git/