Checklist crnext

From Linux Checkpoint / Restart Wiki
Revision as of 17:30, 21 December 2009 by Hallyn (Talk | contribs)



The v19 branch of the checkpoint/restart tree is a feature-rich version that supports real use cases. A container with an ssh session and a screen session under which MPI is running can be reliably checkpointed and restarted.

This page describes which features are and are not implemented. An application using features which are not implemented is not checkpointable: an attempted checkpoint of such an application returns an error code and writes an explanation of why the checkpoint failed to syslog and to an optional user-provided logfile.

Real HPC workloads are therefore checkpointable using v19, and we think the codebase so far is very clean. We want to make it clear that if checkpoint and restart of any feature (e.g. inotify) cannot be done cleanly, meaning that the maintainers of the modified subsystems object, then we will either keep working with the maintainers to find a clean solution or accept that the feature makes an application uncheckpointable.

In other words, just as we accept that it is not a valid goal of containers to fully 'trick' software into thinking it is running on its own bare hardware, we also accept that c/r will have limitations. However, c/r will not be a 'toy': if an application will not be restartable, userspace must be warned of that fact at checkpoint time.

What is implemented

* open regular files and directories:
  * ext2, ext3, ext4
* /dev/null, /dev/zero, /dev/random, /dev/urandom
* epoll fd's
* event fd's
* timer fd's
* signal fd's
* unix sockets
* ipv4 sockets (except time-wait sockets)
* SYSV IPC (message queues, semaphores, and shared memory)
  * except semaphore undo
* Unix98 ptys
* futexes
* process ids
* credentials (userid, group, and POSIX capabilities)
* Smack LSM labels
* pipes
* signals
* multiple processes, threads
* nested namespaces of:
  * user
  * IPC
  * UTS (hostname)
  * (nested pid namespaces only require userspace support)

What is NOT implemented

| Not Implemented | Refuses to Checkpoint | Anticipated Solution Impact |
|---|---|---|
| mounts | Yes | |
| network devices, time-wait inet sockets | Yes | |
| unlinked files and directories | Yes | |
| inotify | Yes | Probably gross due to watches attaching to inodes |
| FUSE | Yes | Clean but requires per-FUSE-filesystem support |
| network and distributed filesystems | Yes | |
| SYSV IPC: semaphore undo | No | Small, clean |
| netlink | Yes | |
| new, experimental, and/or infrastructure-oriented network protocols | | |
| file locks | Yes | |
| file leases | Yes | |
| fowner+sigio | Yes | |
| ptraced tasks | Yes | |
| time namespace | N/A | |
| restart of 64-bit task from 32-bit task and vice versa | | |
| inet6 sockets (previously supported, just need to add it back in. minor) | | |
| Hardware devices other than those mentioned above. Includes mmap, /dev/foo, Infiniband devices | Yes (mostly) | |
| System-specific files in sysfs (things like UUID files) | | |
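As a rough illustration of how userspace might spot some of the unimplemented features before attempting a checkpoint, the sketch below inspects the link targets under /proc/<pid>/fd. It is not part of the checkpoint/restart tree: the function names `classify_fd_target` and `scan_pid` are hypothetical, and it relies on two assumptions about current kernels, namely that inotify descriptors appear as `anon_inode:inotify` and that unlinked files carry a ` (deleted)` suffix. It covers only two entries from the list of unimplemented features and is a heuristic, not a substitute for the error reported at checkpoint time.

```python
import os


def classify_fd_target(target):
    """Return a reason string if the link target matches a feature listed
    as not implemented, else None. Hypothetical helper for illustration."""
    # Assumption: inotify fds are shown as anonymous inodes with this name.
    if target == "anon_inode:inotify":
        return "inotify"
    # Assumption: the kernel appends " (deleted)" to unlinked paths.
    if target.endswith(" (deleted)"):
        return "unlinked file or directory"
    return None


def scan_pid(pid):
    """Return a list of (fd, target, reason) for fds of `pid` that match."""
    fd_dir = "/proc/%d/fd" % pid
    results = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed while we were scanning
        reason = classify_fd_target(target)
        if reason is not None:
            results.append((int(name), target, reason))
    return results
```

Calling `scan_pid(os.getpid())` on a Linux system returns a list of descriptors that would likely cause a checkpoint to be refused; an empty list does not guarantee that the checkpoint will succeed.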