UncheckpointableFilesystems
OBSOLETE CONTENT
This wiki has been archived and the content is no longer updated.
Summary
If any task being checkpointed holds an open filesystem object (file, directory, etc.) whose file_operations struct lacks a .checkpoint operation, sys_checkpoint() fails. Called through the glibc syscall wrapper, it returns -1 and sets errno to EINVAL. Part of a checkpoint image may already have been written, but the image is terminated with an "error" description, so any attempt to restart from it reports failure.
Unsupported Files
The number of file_operations structs missing the .checkpoint operation, counted per top-level directory of the kernel source (with fs/ expanded):
    162 arch
      3 block
      1 crypto
      1 Documentation
    718 drivers
    178 fs
          3 9p
          8 afs
          1 autofs
          3 autofs4
          1 bad_inode.c
          3 binfmt_misc.c
          1 block_dev.c
          2 cachefiles
          1 char_dev.c
         15 cifs
          4 coda
          2 configfs
          3 debugfs
          8 dlm
          1 ext4
          1 fifo.c
          1 filesystems.c
          3 fscache
          9 fuse
          5 gfs2
          1 hugetlbfs
          1 jbd2
          6 jfs
          1 libfs.c
          1 locks.c
          2 ncpfs
          2 nfs
          5 nfsd
          1 no-block.c
          1 notify
          1 ntfs
         15 ocfs2
         55 proc
          1 reiserfs
          1 signalfd.c
          2 smbfs
          3 sysfs
          1 timerfd.c
          3 xfs
      1 include
      4 ipc
     88 kernel
      3 lib
     12 mm
    164 net
      1 samples
     35 security
     29 sound
      4 virt
(As of 2.6.33-rc8 + ckpt-v19-test on Feb 21st, 2010)
Notes:
- The missing checkpoint file operation in fs/fifo.c is only an artifact of the way fifo file ops are assigned. FIFOs are supported.
- The missing ext4 file operation is for the multiblock groups file in /proc.
Feel free to find the specific locations of these structs and/or generate histograms for your own tree using the scripts below.
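The counts above are the kind of output the shell script's --stat option produces by cutting each report line on "/" and counting duplicates. As a minimal sketch of that counting step in Python, assuming report lines in the FILE:START+LEN form that the shell script prints; the function name histogram and the sample paths are illustrative only and do not come from a real run:

```python
from collections import Counter


def histogram(lines, field):
    """Count report lines by their Nth '/'-separated field (1-based, like cut -f)."""
    counts = Counter()
    for line in lines:
        parts = line.strip().split("/")
        # Unlike cut, lines with too few fields are skipped rather than kept.
        if len(parts) >= field:
            counts[parts[field - 1]] += 1
    return counts


# Made-up sample lines in the FILE:START+LEN format; not real results.
SAMPLE_REPORT = [
    "./fs/proc/base.c:120+14",
    "./fs/proc/array.c:40+9",
    "./fs/fifo.c:151+8",
    "./net/socket.c:130+20",
]

if __name__ == "__main__":
    # Field 2 is the top-level directory because every path starts with "./".
    for name, count in sorted(histogram(SAMPLE_REPORT, 2).items()):
        print("%7d %s" % (count, name))
```

Passing 3 instead of 2 groups by the next path component, which is how a directory such as fs/ can be expanded.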
Scripts
Shell
A hackish script to run on the kernel source; it reports every file_operations struct that lacks a .checkpoint operation:
    #!/bin/bash
    #
    # Identify foo_operations structs missing the given operation.
    #
    # This is not a foolproof way to check for the operation because it assumes:
    #     the .field = foo syntax is used (rather than the field: foo syntax or
    #         positional assignment).
    #
    #     no '};' will be seen until the end of the fops struct. I think this implies:
    #         No structs embedded in fops
    #         No anonymous functions in fops
    #         No funky macro business in fops (e.g. ({ .. }) or do { .. } while(0))
    #
    # This script breaks when it scans for file_operations missing operations:
    #     ./net/mac80211/debugfs_key.c:31+242:
    #
    # Note that the fs/fifo.c def_fifo_fops is "special" in that it's used
    # to "bootstrap" to the correct file operations struct, so it's missing
    # lots of ops you might expect even from a fifo.
    #

    DO_STAT=""
    DO_DUMP=""
    STRUCT=file_operations
    OP="checkpoint"
    KDIRS=( )

    options=`getopt -o 'ao:d:s:v' --long 'auto-dirs,operation:,dir:,stat:,struct:,verbose' -- "$@"`
    eval set -- "$options"
    while true
    do
        case "$1" in
        --)
            shift
            break
            ;;
        --stat|-s)
            DO_STAT=$(( $2 + 0 ))
            DO_DUMP=""
            shift 2
            ;;
        --verbose|-v)
            DO_DUMP=":"
            shift
            ;;
        --struct)
            STRUCT="$2"
            shift 2
            ;;
        --operation|-o)
            OP="$2"
            shift 2
            ;;
        --dir|-d)
            KDIRS+=("$2")
            shift 2
            ;;
        --auto-dirs|-a)
            KDIRS+=( $(find ./ -mindepth 1 -maxdepth 1 -type d '!' -name '.git' -printf ' %p ') )
            shift 1
            ;;
        esac
    done

    KDIRS+=( "$@" )
    if (( ${#KDIRS[@]} < 1 )); then
        exit 0
    fi

    LIST=`mktemp check_fsop.XXXXX`
    trap "rm -f \"${LIST}\" ; exit 23" EXIT ERR

    rgrep -nHE 'struct[[:space:]]+'"${STRUCT}" "${KDIRS[@]}" |
        grep -v 'extern' |
        grep -v '\&' |
        grep -v 'sizeof' |
        grep '=' |
        grep -vE '[[:space:]]*=[[:space:]]*NULL[[:space:]]*;' > "${LIST}" || exit -1

    (
    for ENTRY in $(cat "${LIST}" | cut -d : -f 1,2) ; do
        FILE=$(echo "${ENTRY}" | cut -d : -f 1)
        START_FOPS=$(($(echo "${ENTRY}" | cut -d : -f 2) + 0))
        LEN_FOPS=$(($(tail -n "+${START_FOPS}" "${FILE}" | grep -m 1 -nE '}' | cut -d : -f 1) + 0))
        if [ -z "${LEN_FOPS}" ]; then
            continue
        fi
        ((LEN_FOPS + 0)) || continue
        tail -n "+${START_FOPS}" "${FILE}" | \
            head -n ${LEN_FOPS} | \
            grep -m 1 -E '\.'"${OP}" > /dev/null && continue
        echo "${FILE}:${START_FOPS}+${LEN_FOPS}${DO_DUMP}"
        if [ "${DO_DUMP}" == ":" ]; then
            cat -n "${FILE}" | tail -n "+${START_FOPS}" | head -n "${LEN_FOPS}"
        fi
    done
    ) | (
    if [ -n "${DO_STAT}" ]; then
        cat - | cut -d / -f ${DO_STAT} | sort | uniq -c
    else
        cat -
    fi
    )
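The heuristic at the heart of the script is: find a line that defines a struct of the given type, take the lines up to the first one containing '}', and report the struct if the operation does not appear in that range. A rough Python sketch of the same idea, run on an in-memory string so the limitations are easy to probe; the function name missing_op and the sample C snippet are hypothetical, and this is an approximation of the grep pipeline rather than a replacement for it:

```python
import re


def missing_op(source, struct="file_operations", op="checkpoint"):
    """Return (start_line, length) for each struct definition lacking .op.

    Mirrors the shell heuristic: a definition is a line naming the struct
    type and containing '='; it ends at the first later line with '}'.
    """
    lines = source.splitlines()
    decl = re.compile(r"struct\s+" + re.escape(struct))
    null_init = re.compile(r"=\s*NULL\s*;")
    field = re.compile(r"\." + re.escape(op))
    found = []
    for i, line in enumerate(lines):
        if not decl.search(line) or "=" not in line:
            continue
        # Same exclusions as the grep -v stages of the shell script.
        if "extern" in line or "&" in line or "sizeof" in line:
            continue
        if null_init.search(line):
            continue
        # Length up to and including the first line containing '}'.
        length = None
        for j in range(i, len(lines)):
            if "}" in lines[j]:
                length = j - i + 1
                break
        if length is None:
            continue
        if not any(field.search(text) for text in lines[i:i + length]):
            found.append((i + 1, length))
    return found


# Hypothetical C source used only to exercise the sketch.
SAMPLE = """\
static const struct file_operations good_fops = {
    .read = good_read,
    .checkpoint = generic_file_checkpoint,
};

static const struct file_operations bad_fops = {
    .read = bad_read,
};
"""

if __name__ == "__main__":
    print(missing_op(SAMPLE))  # prints [(6, 3)]
```

Because the struct is cut at the first '}', a nested initializer or macro body placed before .checkpoint would end the scan early and produce a false report, which is the weakness the script's header comment describes.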
Coccinelle
A more robust method to check for missing checkpoint operations would be to preprocess and parse the struct file_operations assignments before analyzing them. One tool that can be adapted to do this is coccinelle.
Run the coccinelle script below as follows and use its stdout:
spatch -sp_file fsops.cocci -dir ./
The contents of the script (fsops.cocci above) are:
    @initialize:python@
    #
    # Check for missing .checkpoint file operations
    # Check for missing .llseek file ops when the .checkpoint op is
    #     generic_file_checkpoint
    #
    # First we collect all file operations structures into a dictionary (aka hash
    # in perl).
    # Then we add their checkpoint expression and llseek expression to the dict.
    #
    # Finally we check each file operations struct with some simple python logic.
    # It is possible to check .checkpoint with less python code, but not possible
    # to look for missing .llseek operations when
    # .checkpoint = generic_file_checkpoint.
    #
    from coccilib.elems import Location

    f_ops = {}

    @all_f_ops@
    identifier I;
    position p;
    @@
    struct file_operations I@p = { ... };

    @script:python@
    f_op_name << all_f_ops.I;
    f_op_pos << all_f_ops.p;
    @@
    l = f_op_pos[0]
    fname = l.file
    line = l.line
    col = l.column
    k = (fname, line, col)

    # add to the f_ops ( file_operations struct identifier, checkpoint_expr, llseek_expr )
    f_ops[k] = { 'id' : f_op_name.ident }

    @checkpoint_f_op depends on all_f_ops@
    identifier all_f_ops.I;
    expression E;
    position all_f_ops.p;
    @@
    struct file_operations I@p = {
        ...
        .checkpoint = E,
        ...
    };

    @script:python@
    sub_f_op_pos << all_f_ops.p;
    expr << checkpoint_f_op.E;
    @@
    l = sub_f_op_pos[0]
    fname = l.file
    line = l.line
    col = l.column
    k = (fname, line, col)
    f_ops[k]['checkpoint'] = expr;

    @llseek_f_op depends on all_f_ops@
    identifier all_f_ops.I;
    expression E;
    position all_f_ops.p;
    @@
    struct file_operations I@p = {
        ...
        .llseek = E,
        ...
    };

    @script:python@
    pos << all_f_ops.p;
    f_op_name << all_f_ops.I;
    expr << llseek_f_op.E;
    @@
    l = pos[0]
    fname = l.file
    line = l.line
    col = l.column
    k = (fname, line, col)
    f_ops[k]['llseek'] = expr;

    @finalize:python@
    for i in f_ops.items():
        k = i[0]
        filename = k[0]
        line = k[1]
        column = k[2]
        f_op_var = i[1]
        name = f_op_var['id']
        position = "%s:%s@%s %s" % (filename, line, column, name)
        try:
            checkpoint_op = f_op_var['checkpoint']
        except KeyError:
            print "Missing .checkpoint in %s" % (position)
            continue
        checkpoint_op = str(checkpoint_op).strip()
        if not checkpoint_op == 'generic_file_checkpoint':
            continue
        try:
            llseek_op = f_op_var['llseek']
        except KeyError:
            print "Missing .llseek required by generic_file_checkpoint in %s" % (position)
Unlike the shell script, which accepts any struct type and operation through its --struct and --operation options, the coccinelle script is specific to struct file_operations.
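The decision logic in the finalize rule does not depend on coccinelle itself and can be tried in isolation. A minimal sketch of it as a plain Python 3 function, assuming the same dictionary layout as f_ops in the script (keys are (file, line, column) tuples; values hold 'id' and, when present, 'checkpoint' and 'llseek'); the function name check_f_ops and the sample entries are made up for illustration:

```python
def check_f_ops(f_ops):
    """Return one warning string per problematic file_operations entry."""
    warnings = []
    # Sorting is an addition here so the output order is deterministic.
    for (filename, line, column), info in sorted(f_ops.items()):
        position = "%s:%s@%s %s" % (filename, line, column, info["id"])
        checkpoint_op = info.get("checkpoint")
        if checkpoint_op is None:
            warnings.append("Missing .checkpoint in %s" % position)
            continue
        # Only generic_file_checkpoint triggers the .llseek check.
        if str(checkpoint_op).strip() != "generic_file_checkpoint":
            continue
        if "llseek" not in info:
            warnings.append(
                "Missing .llseek required by generic_file_checkpoint in %s"
                % position
            )
    return warnings


# Hypothetical entries; file names, positions and ops are invented.
SAMPLE_F_OPS = {
    ("fs/a.c", "10", "30"): {"id": "a_fops"},
    ("fs/b.c", "20", "30"): {
        "id": "b_fops",
        "checkpoint": "generic_file_checkpoint",
    },
    ("fs/c.c", "30", "30"): {
        "id": "c_fops",
        "checkpoint": "generic_file_checkpoint",
        "llseek": "generic_file_llseek",
    },
    ("fs/d.c", "40", "30"): {"id": "d_fops", "checkpoint": "d_checkpoint"},
}

if __name__ == "__main__":
    for warning in check_f_ops(SAMPLE_F_OPS):
        print(warning)
```

With these entries, a_fops is reported for the missing .checkpoint and b_fops for the missing .llseek, while c_fops and d_fops pass.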