Checklist crnext
OBSOLETE CONTENT
This wiki has been archived and the content is no longer updated.
Introduction
The v19 branch of the checkpoint/restart tree is a full-featured version that supports real use cases. A container with an ssh session and a screen session under which MPI is running can be reliably checkpointed and restarted.
This page describes which features are and are not implemented. An application that uses an unimplemented feature is not checkpointable: an attempted checkpoint of such an application returns an error code, and the reason for the failure is written to syslog and to an optional user-provided logfile.
Some real HPC workloads are therefore checkpointable using v19, and we think the codebase so far is very clean. We want to make it clear that if checkpoint and restart of a feature (e.g. inotify) cannot be done cleanly, meaning that the maintainers of the modified subsystems object, then we will either keep working with the maintainers to find a clean solution or accept that the feature makes an application uncheckpointable.
In other words, just as we accept that it is not a valid goal of containers to fully 'trick' software into thinking it is running on its own bare hardware, we also accept that c/r will have limitations. However, c/r will not be a 'toy', so the critical resources for a variety of real-world workloads must be reliably checkpointable.
What is implemented
* open regular files and directories:
  * ext2, ext3, ext4
* /dev/null, zero, random, urandom
* epoll fd's
* event fd's
* timer fd's
* signal fd's
* unix sockets
* ipv4 sockets (except time-wait sockets)
* SYSV IPC (message queues, semaphores, and shared memory)
  * except semaphore undo
* Unix98 ptys
* futexes
* process ids
* credentials (userid, group, and POSIX capabilities)
* Smack and SELinux LSM labels
* pipes
* FIFOs
* signals
* multiple processes, threads
* nested namespaces of:
  * user
  * IPC
  * UTS (hostname)
* (nested pid namespaces only require userspace support)
What is NOT implemented
Not Implemented | Refuses to Checkpoint (Yes/No) | Anticipated Solution Impact |
---|---|---|
mounts | Yes | |
network devices, time-wait inet sockets | Yes | |
unlinked files and directories | Yes | |
inotify | Yes | Probably gross due to watches attaching to inodes |
FUSE | Yes | Clean but requires per-FUSE-filesystem support |
network and distributed filesystems | Yes | |
SYSV IPC: semaphore undo | No | Small, clean |
netlink | Yes | |
new, experimental, and/or infrastructure-oriented network protocols | Yes | |
file locks | Yes | |
file leases | Yes | |
fowner+sigio | Yes | |
ptraced tasks | Yes | |
aio | Yes | |
time namespace | N/A | |
restart of 64-bit task from 32-bit task and vice versa | Yes | |
inet6 sockets (previously supported; just needs to be added back in, minor) | Yes | |
Hardware devices other than those mentioned above. Includes mmap, /dev/foo, Infiniband devices | Yes (mostly) | One idea |
System-specific files in sysfs (things like UUID files) | Yes | |
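
The "Refuses to Checkpoint" column can be read as a simple lookup: a feature marked Yes causes the checkpoint to fail with an error and a logged reason, while a feature marked No (semaphore undo) does not block the checkpoint even though it is not implemented. The sketch below only models that reading of the table for a handful of rows. It is not the kernel code; the function name `try_checkpoint`, the constant `CHECKPOINT_REFUSED`, and the log message wording are all hypothetical.

```python
# Hypothetical placeholder; the page does not say which error code is returned.
CHECKPOINT_REFUSED = -1

# A few rows transcribed from the "What is NOT implemented" table.
# True  -> the checkpoint is refused
# False -> the checkpoint is not refused, although the feature is unimplemented
REFUSES_TO_CHECKPOINT = {
    "mounts": True,
    "inotify": True,
    "FUSE": True,
    "netlink": True,
    "file locks": True,
    "file leases": True,
    "ptraced tasks": True,
    "aio": True,
    "SYSV IPC: semaphore undo": False,
}


def try_checkpoint(features_in_use):
    """Model a checkpoint attempt for an application using the given features.

    Returns (status, log_lines). Features that are not listed in the table
    are assumed to be implemented and never block the checkpoint.
    """
    log_lines = [
        f"checkpoint refused: {feature} is not supported"
        for feature in sorted(features_in_use)
        if REFUSES_TO_CHECKPOINT.get(feature, False)
    ]
    status = CHECKPOINT_REFUSED if log_lines else 0
    return status, log_lines


if __name__ == "__main__":
    status, log_lines = try_checkpoint({"pipes", "futexes", "inotify"})
    print(status)
    for line in log_lines:
        print(line)
```

In this model an application that uses pipes, futexes, and inotify is refused because of inotify alone, which mirrors the page's point that a single unimplemented feature makes the whole application uncheckpointable.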