Checklist crnext

From Linux Checkpoint / Restart Wiki

Introduction

The v19 branch of the checkpoint/restart tree is a full-featured version supporting real use cases. A container with an ssh session and a screen session under which MPI is running can be reliably checkpointed and restarted.

This page describes which features are and are not implemented. An application using features which are not implemented will not be checkpointable: an attempted checkpoint of such an application will return an error code, and an explanation of why the checkpoint failed will be written to syslog and to an optional user-provided logfile.

Some real HPC workloads are therefore checkpointable using v19, and we think the codebase so far is very clean. We want to make it clear that if checkpoint and restart of any feature (e.g. inotify) cannot be done cleanly, meaning that the maintainers of the modified subsystems object, then we will either keep working with the maintainers to find a clean solution or accept that the feature makes an application uncheckpointable.

In other words, just as we accept that it is not a valid goal of containers to fully 'trick' software into thinking it is running on its own bare hardware, we also accept that c/r will have limitations. However, c/r will not be a 'toy', so critical resources for a variety of real-world workloads must be reliably checkpointable.

What is implemented

* open regular files and directories:
  * ext2, ext3, ext4
* /dev/null, /dev/zero, /dev/random, /dev/urandom
* epoll fd's
* event fd's
* timer fd's
* signal fd's
* unix sockets
* ipv4 sockets (except time-wait sockets)
* SYSV IPC (message queues, semaphores, and shared memory)
  * except semaphore undo
* Unix98 ptys
* futexes
* process ids
* credentials (userid, group, and POSIX capabilities)
* Smack and SELinux LSM labels
* pipes
* FIFOs
* signals
* multiple processes, threads
* nested namespaces of:
  * user
  * IPC
  * UTS (hostname)
  * (nested pid namespaces only require userspace support)
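To relate the list above to a running process, the following is a minimal userspace sketch that labels a task's open descriptors by reading the `/proc/<pid>/fd` symlinks. It is not part of the checkpoint/restart tree and does not decide checkpointability. The names `FD_KINDS`, `classify_fd_target`, and `list_fd_kinds` are hypothetical, and the link-target patterns are assumptions about how the kernel labels these descriptors, which may differ between kernel versions. A socket link does not reveal its family, so unix, ipv4, and netlink sockets all appear as "socket".

```python
import os
import re

# Link-target patterns for /proc/<pid>/fd entries, mapped to the resource
# names used in the list above. These patterns are assumptions; verify them
# against the kernel you are running.
FD_KINDS = [
    (re.compile(r" \(deleted\)$"), "unlinked file"),
    (re.compile(r"^anon_inode:\[?eventpoll\]?$"), "epoll fd"),
    (re.compile(r"^anon_inode:\[?eventfd\]?$"), "event fd"),
    (re.compile(r"^anon_inode:\[?timerfd\]?$"), "timer fd"),
    (re.compile(r"^anon_inode:\[?signalfd\]?$"), "signal fd"),
    (re.compile(r"^anon_inode:\[?inotify\]?$|^inotify$"), "inotify"),
    (re.compile(r"^pipe:\[\d+\]$"), "pipe"),
    (re.compile(r"^socket:\[\d+\]$"), "socket"),
    (re.compile(r"^/dev/(null|zero|random|urandom)$"), "supported device node"),
    (re.compile(r"^/dev/pts/\d+$|^/dev/ptmx$"), "Unix98 pty"),
    (re.compile(r"^/dev/"), "other device node"),
]


def classify_fd_target(target):
    """Return a label for one /proc/<pid>/fd link target."""
    for pattern, kind in FD_KINDS:
        if pattern.search(target):
            return kind
    if target.startswith("/"):
        # Regular files, directories, and named FIFOs all show up as paths.
        return "path (file, directory, or FIFO)"
    return "unknown"


def list_fd_kinds(pid="self"):
    """Map each open descriptor number of a task to its label."""
    fd_dir = f"/proc/{pid}/fd"
    kinds = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        kinds[int(name)] = classify_fd_target(target)
    return kinds
```

Calling `list_fd_kinds()` on a Linux system labels the caller's own descriptors; passing a pid inspects another task, subject to the usual `/proc` permissions.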

What is NOT implemented


| Not Implemented | Refuses to Checkpoint (Yes/No) | Anticipated Solution Impact |
|---|---|---|
| mounts | Yes | |
| network devices, time-wait inet sockets | Yes | |
| unlinked files and directories | Yes | |
| inotify | Yes | Probably gross due to watches attaching to inodes |
| FUSE | Yes | Clean but requires per-FUSE-filesystem support |
| network and distributed filesystems | Yes | |
| SYSV IPC: semaphore undo | No | Small, clean |
| netlink | Yes | |
| new, experimental, and/or infrastructure-oriented network protocols | Yes | |
| file locks | Yes | |
| file leases | Yes | |
| fowner+sigio | Yes | |
| ptraced tasks | Yes | |
| aio | Yes | |
| time namespace | N/A | |
| restart of 64-bit task from 32-bit task and vice versa | Yes | |
| inet6 sockets (previously supported, just need to add it back in; minor) | Yes | |
| Hardware devices other than those mentioned above. Includes mmap, /dev/foo, Infiniband devices | Yes (mostly) | One idea |
| System-specific files in sysfs (things like UUID files) | Yes | |
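The table can also be read as a lookup: given the features a workload uses, which ones would cause a checkpoint to be refused? The sketch below encodes a subset of the rows as data. It is an illustration only, not an interface of the checkpoint/restart tree; `NOT_IMPLEMENTED` and `precheck` are hypothetical names. A "No" entry is read here as "the checkpoint is not refused even though the feature is not implemented", which is our interpretation of the column.

```python
# A subset of rows from the table above: feature -> refuses to checkpoint.
# None stands for the table's "N/A" entry. The keys are labels chosen for
# this sketch, not kernel identifiers.
NOT_IMPLEMENTED = {
    "mounts": True,
    "network devices": True,
    "time-wait inet sockets": True,
    "unlinked files and directories": True,
    "inotify": True,
    "FUSE": True,
    "network and distributed filesystems": True,
    "SYSV IPC semaphore undo": False,
    "netlink": True,
    "file locks": True,
    "file leases": True,
    "fowner+sigio": True,
    "ptraced tasks": True,
    "aio": True,
    "time namespace": None,
    "inet6 sockets": True,
}


def precheck(features_in_use):
    """Split the features in use into (refused, not_refused) per the table.

    Features absent from NOT_IMPLEMENTED are treated as implemented and
    appear in neither list.
    """
    refused = sorted(
        f for f in features_in_use if NOT_IMPLEMENTED.get(f) is True
    )
    not_refused = sorted(
        f for f in features_in_use if NOT_IMPLEMENTED.get(f) is False
    )
    return refused, not_refused
```

For example, `precheck({"pipes", "inotify"})` reports inotify as a reason for refusal, while pipes, being implemented, appear in neither list.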