UncheckpointableFilesystems

From Linux Checkpoint / Restart Wiki

Summary

Any task that holds an open filesystem object (file, directory, etc.) whose file_operations lack a .checkpoint operation causes sys_checkpoint() to fail. Through the glibc syscall wrapper, the call returns -1 and sets errno to EINVAL. Some portions of a checkpoint image may already exist at that point, but the image is terminated with an "error" description so that any attempt to restart from it reports failure.

Unsupported Files

The counts below show how many file_operations structs lack the .checkpoint operation in each top-level directory of the kernel tree, with fs/ broken down further:

    162 arch
      3 block
      1 crypto
      1 Documentation
    718 drivers
    178 fs
              3 9p
              8 afs
              1 autofs
              3 autofs4
              1 bad_inode.c
              3 binfmt_misc.c
              1 block_dev.c
              2 cachefiles
              1 char_dev.c
             15 cifs
              4 coda
              2 configfs
              3 debugfs
              8 dlm
              1 ext4
              1 fifo.c
              1 filesystems.c
              3 fscache
              9 fuse
              5 gfs2
              1 hugetlbfs
              1 jbd2
              6 jfs
              1 libfs.c
              1 locks.c
              2 ncpfs
              2 nfs
              5 nfsd
              1 no-block.c
              1 notify
              1 ntfs
             15 ocfs2
             55 proc
              1 reiserfs
              1 signalfd.c
              2 smbfs
              3 sysfs
              1 timerfd.c
              3 xfs
      1 include
      4 ipc
     88 kernel
      3 lib
     12 mm
    164 net
      1 samples
     35 security
     29 sound
      4 virt

(As of 2.6.33-rc8 + ckpt-v19-test on Feb 21st, 2010)

Notes:

  1. The missing checkpoint file operation in fs/fifo.c is only an artifact of the way fifo file ops are assigned. FIFOs are supported.
  2. The file operations struct counted under ext4 belongs to the multiblock groups file in /proc.

To find the specific locations of these structs, or to generate histograms for your own tree, use the shell script below.
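
As a rough illustration of how such a histogram comes about, the sketch below counts report lines by one path component, which is what the shell script's --stat option does with cut, sort and uniq -c. The sample lines are hypothetical; only their FILE:START+LEN shape is taken from the script. Unlike plain cut, the sketch strips the :START+LEN suffix first, so files that sit directly under the chosen directory are counted under their bare name.

```python
from collections import Counter


def histogram(report_lines, field):
    """Count report lines by the field-th '/'-separated path component.

    `field` is 1-based, like `cut -d / -f FIELD`; for paths that start
    with "./", field 2 is the top-level directory.
    """
    counts = Counter()
    for line in report_lines:
        path = line.split(":", 1)[0]      # drop the ":START+LEN" suffix
        parts = path.split("/")
        if field <= len(parts):           # skip paths that are too shallow
            counts[parts[field - 1]] += 1
    return counts


# Hypothetical report lines in the script's FILE:START+LEN format.
sample = [
    "./fs/proc/base.c:100+12",
    "./fs/proc/array.c:40+6",
    "./fs/fifo.c:150+8",
    "./kernel/cgroup.c:900+7",
]

top_level = histogram(sample, 2)   # per top-level directory
fs_level = histogram(sample, 3)    # one level deeper

for name, count in sorted(top_level.items()):
    print("%7d %s" % (count, name))
```

Field 2 yields the outer level of the histogram and field 3 the expanded level.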

Scripts

Shell

A hackish script to run from the top of the kernel source tree. It lists the file_operations structs that lack a .checkpoint operation:

#!/bin/bash
#
# Identify foo_operations structs missing the given operation.
#
# This is not a foolproof way to check for the operation because it assumes:
#
# 1. The .field = foo syntax is used (rather than the field: foo syntax or
#    positional assignment).
# 2. No '}' is seen before the end of the fops struct (the scan stops at the
#    first line containing one). I think this implies:
#       No structs embedded in fops
#       No anonymous functions in fops
#       No funky macro business in fops (e.g. ({ .. }) or do { .. } while(0))
#
# This script breaks when it scans for file_operations missing operations:
#       ./net/mac80211/debugfs_key.c:31+242:
#
# Note that the fs/fifo.c def_fifo_fops is "special" in that it's used
# to "bootstrap" to the correct file operations struct, so it's missing
# lots of ops you might expect even from a fifo.
#

DO_STAT=""
DO_DUMP=""

STRUCT=file_operations
OP="checkpoint"
KDIRS=( )

options=$(getopt -o 'ao:d:s:v' --long 'auto-dirs,operation:,dir:,stat:,struct:,verbose' -- "$@") || exit 1
eval set -- "$options"

while true
do
        case "$1" in
        --)
                shift
                break ;;
        --stat|-s)
                DO_STAT=$(( $2 + 0 ))
                DO_DUMP=""
                shift 2
                ;;
        --verbose|-v)
                DO_DUMP=":"
                shift
                ;;
        --struct)
                STRUCT="$2"
                shift 2
                ;;
        --operation|-o)
                OP="$2"
                shift 2
                ;;
        --dir|-d)
                KDIRS+=("$2")
                shift 2
                ;;
        --auto-dirs|-a)
                KDIRS+=( $(find ./ -mindepth 1 -maxdepth 1 -type d '!' -name '.git' -printf ' %p ') )
                shift 1
                ;;
        esac
done

KDIRS+=( "$@" )
if (( ${#KDIRS[@]} < 1 )); then
        echo "No directories given; use --dir DIR, --auto-dirs, or list directories as arguments." >&2
        exit 1
fi

LIST=$(mktemp check_fsop.XXXXX) || exit 1
# Clean up the temporary list on exit without overriding the exit status.
trap 'rm -f "${LIST}"' EXIT

# rgrep is not installed everywhere; grep -r is equivalent.
grep -rnHE 'struct[[:space:]]+'"${STRUCT}" "${KDIRS[@]}" | grep -v 'extern' | grep -v '\&' | grep -v 'sizeof' | grep '=' | grep -vE '[[:space:]]*=[[:space:]]*NULL[[:space:]]*;' > "${LIST}" || exit 1

(
for ENTRY in  $(cat "${LIST}" | cut -d : -f 1,2) ; do
        FILE=$(echo "${ENTRY}" | cut -d : -f 1)
        START_FOPS=$(($(echo "${ENTRY}" | cut -d : -f 2) + 0))
        LEN_FOPS=$(($(tail -n "+${START_FOPS}" "${FILE}" | grep -m 1 -nE '}' | cut -d : -f 1) + 0))
        if [ -z "${LEN_FOPS}" ]; then
                continue
        fi
        ((LEN_FOPS + 0)) || continue
        tail -n "+${START_FOPS}" "${FILE}" | \
                head -n ${LEN_FOPS} | \
                grep -m 1 -E '\.'"${OP}" > /dev/null && continue
        echo "${FILE}:${START_FOPS}+${LEN_FOPS}${DO_DUMP}"
        if [ "${DO_DUMP}" == ":" ]; then
                cat -n "${FILE}" | tail -n "+${START_FOPS}" | head -n "${LEN_FOPS}"
        fi
done
) | (
if [ -n "${DO_STAT}" ]; then
cat - | cut -d / -f ${DO_STAT} | sort | uniq -c
else
cat -
fi
)
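
The heuristic used by the shell script can be sketched in a few lines of Python, which also makes its limitation easy to see. This is only an approximation under the same assumptions as the script (designated initializers, and the first line containing '}' closes the struct). The sample source and the names good_fops, bad_fops, good_read and bad_read are hypothetical.

```python
import re


def missing_op(source, struct="file_operations", op="checkpoint"):
    """Return (start_line, length) for each `struct` initializer lacking `.op`."""
    decl = re.compile(r"struct\s+" + re.escape(struct))
    null_init = re.compile(r"=\s*NULL\s*;")
    lines = source.splitlines()
    report = []
    for i, line in enumerate(lines):
        # Same filters as the grep pipeline: keep only initialized definitions.
        if not decl.search(line):
            continue
        if "extern" in line or "&" in line or "sizeof" in line:
            continue
        if "=" not in line or null_init.search(line):
            continue
        # Same assumption as the script: the first '}' ends the initializer.
        length = next((n for n, l in enumerate(lines[i:], 1) if "}" in l), 0)
        if length == 0:
            continue
        body = lines[i:i + length]
        if not any("." + op in l for l in body):
            report.append((i + 1, length))
    return report


# Hypothetical source text; only generic_file_checkpoint is named in this page.
SAMPLE = """\
static const struct file_operations good_fops = {
        .read = good_read,
        .checkpoint = generic_file_checkpoint,
};

static const struct file_operations bad_fops = {
        .read = bad_read,
        .llseek = no_llseek,
};
"""

for start, length in missing_op(SAMPLE):
    print("sample.c:%d+%d" % (start, length))
```

Because the scan stops at the first '}', a nested initializer or a macro containing braces would cut the struct short and could produce a false report, which is the weakness the script's header comment describes.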


Coccinelle

A more robust method to check for missing checkpoint operations would be to preprocess and parse the struct file_operations assignments before analyzing them. One tool that can be adapted to do this is Coccinelle.

Run the Coccinelle script with the command below; the results are written to stdout:

spatch -sp_file fsops.cocci -dir ./

The contents of the script (fsops.cocci in the command above) are:

@initialize:python@
#
# Check for missing .checkpoint file operations
# Check for missing .llseek file ops when the .checkpoint op is
#       generic_file_checkpoint
#
# First we collect all file operations structures into a dictionary (aka hash
#       in perl).
# Then we add their checkpoint expression and llseek expression to the dict.
#
# Finally we check each file operations struct with some simple python logic.
# It is possible to check .checkpoint with less python code, but not possible
# to look for missing .llseek operations when
# .checkpoint = generic_file_checkpoint.
#

from coccilib.elems import Location

f_ops = {}

@all_f_ops@
identifier I;
position p;
@@
struct file_operations I@p = {
        ...
};
@script:python@
f_op_name << all_f_ops.I;
f_op_pos << all_f_ops.p;
@@
l = f_op_pos[0]
fname = l.file
line = l.line
col = l.column
k = (fname, line, col)
# add to the f_ops ( file_operations struct identifier, checkpoint_expr, llseek_expr )
f_ops[k] = { 'id' : f_op_name.ident }

@checkpoint_f_op depends on all_f_ops@
identifier all_f_ops.I;
expression E;
position all_f_ops.p;
@@
struct file_operations I@p = {
        ...
        .checkpoint = E,
        ...
};

@script:python@
sub_f_op_pos << all_f_ops.p;
expr << checkpoint_f_op.E;
@@
l = sub_f_op_pos[0]
fname = l.file
line = l.line
col = l.column
k = (fname, line, col)
f_ops[k]['checkpoint'] = expr;

@llseek_f_op depends on all_f_ops@
identifier all_f_ops.I;
expression E;
position all_f_ops.p;
@@
struct file_operations I@p = {
        ...
        .llseek = E,
        ...
};
@script:python@
pos << all_f_ops.p;
f_op_name << all_f_ops.I;
expr << llseek_f_op.E;
@@
l = pos[0]
fname = l.file
line = l.line
col = l.column
k = (fname, line, col)
f_ops[k]['llseek'] = expr;

@finalize:python@
for k, f_op_var in f_ops.items():
        (filename, line, column) = k
        name = f_op_var['id']
        position = "%s:%s@%s %s" % (filename, line, column, name)
        if 'checkpoint' not in f_op_var:
                # print() with one argument behaves the same in Python 2 and 3
                print("Missing .checkpoint in %s" % (position))
                continue
        checkpoint_op = str(f_op_var['checkpoint']).strip()
        if checkpoint_op != 'generic_file_checkpoint':
                continue
        if 'llseek' not in f_op_var:
                print("Missing .llseek required by generic_file_checkpoint in %s" % (position))

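Unlike the shell script, which accepts other ops structs and operations through its --struct and --operation options, the Coccinelle script is written specifically for struct file_operations.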
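
The decision logic of the finalize step can also be tried outside Coccinelle. The sketch below is a standalone approximation: it takes a dictionary shaped like f_ops and returns the messages instead of printing them. The sample entries are hypothetical, and sorting by position is an addition made here so that the output order is stable.

```python
def check_f_ops(f_ops):
    """Return the warnings the finalize step would emit, sorted by position."""
    warnings = []
    for (filename, line, column), ops in sorted(f_ops.items()):
        position = "%s:%s@%s %s" % (filename, line, column, ops["id"])
        checkpoint_op = ops.get("checkpoint")
        if checkpoint_op is None:
            warnings.append("Missing .checkpoint in %s" % position)
            continue
        # Only generic_file_checkpoint is treated as requiring .llseek.
        if str(checkpoint_op).strip() != "generic_file_checkpoint":
            continue
        if "llseek" not in ops:
            warnings.append(
                "Missing .llseek required by generic_file_checkpoint in %s"
                % position
            )
    return warnings


# Hypothetical entries keyed by (file, line, column), as in the script.
sample_f_ops = {
    ("a.c", "10", "30"): {"id": "a_fops"},
    ("b.c", "20", "30"): {"id": "b_fops",
                          "checkpoint": "generic_file_checkpoint"},
    ("c.c", "30", "30"): {"id": "c_fops",
                          "checkpoint": "generic_file_checkpoint",
                          "llseek": "generic_file_llseek"},
    ("d.c", "40", "30"): {"id": "d_fops", "checkpoint": "d_checkpoint"},
}

for message in check_f_ops(sample_f_ops):
    print(message)
```

Only the first two entries produce a message: a_fops has no .checkpoint at all, and b_fops uses generic_file_checkpoint without providing .llseek.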
