UncheckpointableFilesystems

From Linux Checkpoint / Restart Wiki

OBSOLETE CONTENT

This wiki has been archived and the content is no longer updated.

Summary

A call to sys_checkpoint() fails if any task being checkpointed holds an open filesystem object (file, directory, etc.) whose file_operations struct does not provide a .checkpoint operation. Through the glibc syscall wrapper, the call returns -1 and sets errno to EINVAL. Some portions of a checkpoint image may already have been written by then, but the image is terminated with an "error" description so that any attempt to restart from it reports failure.

Unsupported Files

The file_operations structs missing the .checkpoint operation are distributed across the kernel source tree as follows (counts per top-level directory, with fs/ expanded one level):

    162 arch
      3 block
      1 crypto
      1 Documentation
    718 drivers
    178 fs
              3 9p
              8 afs
              1 autofs
              3 autofs4
              1 bad_inode.c
              3 binfmt_misc.c
              1 block_dev.c
              2 cachefiles
              1 char_dev.c
             15 cifs
              4 coda
              2 configfs
              3 debugfs
              8 dlm
              1 ext4
              1 fifo.c
              1 filesystems.c
              3 fscache
              9 fuse
              5 gfs2
              1 hugetlbfs
              1 jbd2
              6 jfs
              1 libfs.c
              1 locks.c
              2 ncpfs
              2 nfs
              5 nfsd
              1 no-block.c
              1 notify
              1 ntfs
             15 ocfs2
             55 proc
              1 reiserfs
              1 signalfd.c
              2 smbfs
              3 sysfs
              1 timerfd.c
              3 xfs
      1 include
      4 ipc
     88 kernel
      3 lib
     12 mm
    164 net
      1 samples
     35 security
     29 sound
      4 virt

(As of 2.6.33-rc8 + ckpt-v19-test on Feb 21st, 2010)

Notes:

  1. The missing checkpoint file operation in fs/fifo.c is only an artifact of the way fifo file ops are assigned. FIFOs are supported.
  2. The missing ext4 file operation belongs to the multiblock groups file in /proc.

Feel free to find the specific locations of these structs and/or generate histograms for your own tree using the script below.
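
The histogram above amounts to counting report lines by one component of their path. The following is a minimal Python sketch of that grouping, assuming report lines in the FILE:START+LEN form printed by the shell script in the Scripts section. The function name path_histogram and the sample paths and line numbers are made up for illustration. The sketch drops the :START+LEN suffix before grouping so that files sitting directly in a directory aggregate by file name, as they do in the table.

```python
from collections import Counter

def path_histogram(report_lines, field):
    """Count report lines by the 1-based '/'-separated path component `field`.

    Roughly mirrors `cut -d / -f FIELD | sort | uniq -c`, after dropping
    the ":START+LEN" suffix from each line.
    """
    counts = Counter()
    for line in report_lines:
        path = line.strip().split(":", 1)[0]   # keep only the file path
        parts = path.split("/")
        if len(parts) >= field:                # skip paths that are too short
            counts[parts[field - 1]] += 1
    return dict(counts)

# Made-up report lines; real ones come from running the script on a tree.
sample = [
    "./fs/proc/base.c:10+5",
    "./fs/proc/array.c:20+4",
    "./fs/fifo.c:30+6",
    "./net/socket.c:40+7",
]

# Field 2 is the top-level directory; field 3 expands one level further.
for name, count in sorted(path_histogram(sample, 3).items()):
    print("%7d %s" % (count, name))
```

With paths that start with ./ the first field is ".", so field 2 selects the top-level directory and field 3 the next level.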

Scripts

Shell

A hackish script to run from the top of the kernel source tree. It lists every file_operations struct that lacks a .checkpoint operation:

#!/bin/bash
#
# Identify foo_operations structs missing the given operation.
#
# This is not a foolproof way to check for the operation because it assumes:
#
# 1. The .field = foo syntax is used (rather than the field: foo syntax or
#    positional assignment).
#
# 2. No '}' will be seen until the end of the fops struct. I think this implies:
#       No structs embedded in fops
#       No anonymous functions in fops
#       No funky macro business in fops (e.g. ({ .. }) or do { .. } while(0))
#
# This script breaks when it scans for file_operations missing operations:
#       ./net/mac80211/debugfs_key.c:31+242:
#
# Note that the fs/fifo.c def_fifo_fops is "special" in that it's used
# to "bootstrap" to the correct file operations struct, so it's missing
# lots of ops you might expect even from a fifo.
#

DO_STAT=""
DO_DUMP=""

STRUCT=file_operations
OP="checkpoint"
KDIRS=( )

options=`getopt -o 'ao:d:s:v' --long 'auto-dirs,operation:,dir:,stat:,struct:,verbose' -- "$@"`
eval set -- "$options"

while true
do
        case "$1" in
        --)
                shift
                break ;;
        --stat|-s)
                DO_STAT=$(( $2 + 0 ))
                DO_DUMP=""
                shift 2
                ;;
        --verbose|-v)
                DO_DUMP=":"
                shift
                ;;
        --struct)
                STRUCT="$2"
                shift 2
                ;;
        --operation|-o)
                OP="$2"
                shift 2
                ;;
        --dir|-d)
                KDIRS+=("$2")
                shift 2
                ;;
        --auto-dirs|-a)
                KDIRS+=( $(find ./ -mindepth 1 -maxdepth 1 -type d '!' -name '.git' -printf ' %p ') )
                shift 1
                ;;
        esac
done

KDIRS+=( "$@" )
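if (( ${#KDIRS[@]} < 1 )); then
        # Nothing to scan: say so instead of exiting silently.
        echo "usage: $0 [-a] [-d DIR]... [-o OP] [--struct STRUCT] [-s FIELD] [-v] [DIR...]" >&2
        exit 1
fi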

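LIST=$(mktemp check_fsop.XXXXX)
# Remove the temporary list on exit without overriding the exit status.
trap 'rm -f "${LIST}"' EXIT

# Collect candidate struct definitions: drop extern declarations, address-of
# uses, sizeof expressions, lines without an assignment and NULL assignments.
grep -rnHE 'struct[[:space:]]+'"${STRUCT}" "${KDIRS[@]}" | \
        grep -v 'extern' | grep -vF '&' | grep -v 'sizeof' | grep '=' | \
        grep -vE '[[:space:]]*=[[:space:]]*NULL[[:space:]]*;' > "${LIST}" || exit 1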

(
for ENTRY in $(cut -d : -f 1,2 "${LIST}") ; do
        FILE=$(echo "${ENTRY}" | cut -d : -f 1)
        START_FOPS=$(($(echo "${ENTRY}" | cut -d : -f 2) + 0))
        # Number of lines from the declaration to the first '}', inclusive.
        LEN_FOPS=$(($(tail -n "+${START_FOPS}" "${FILE}" | grep -m 1 -nE '}' | cut -d : -f 1) + 0))
        # LEN_FOPS is 0 when no closing brace was found; skip those entries.
        ((LEN_FOPS)) || continue
        tail -n "+${START_FOPS}" "${FILE}" | \
                head -n ${LEN_FOPS} | \
                grep -m 1 -E '\.'"${OP}" > /dev/null && continue
        echo "${FILE}:${START_FOPS}+${LEN_FOPS}${DO_DUMP}"
        if [ "${DO_DUMP}" == ":" ]; then
                cat -n "${FILE}" | tail -n "+${START_FOPS}" | head -n "${LEN_FOPS}"
        fi
done
) | (
if [ -n "${DO_STAT}" ]; then
        cut -d / -f "${DO_STAT}" | sort | uniq -c
else
        cat
fi
)
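
The heuristic used by the shell script, and the first limitation listed in its header comment, can be reproduced on a small string. This is a rough Python sketch rather than a port of the script: the function name structs_missing_op, the sample struct names and bad_read are hypothetical, and the filters only approximate the grep pipeline. It returns (start line, length) pairs with the same meaning as START+LEN in the script's output.

```python
import re

def structs_missing_op(source, struct="file_operations", op="checkpoint"):
    """Return (start_line, length) for each struct initializer lacking `.op`."""
    lines = source.splitlines()
    decl = re.compile(r"struct\s+" + re.escape(struct) + r"\b")
    op_re = re.compile(r"\." + re.escape(op))
    missing = []
    for i, line in enumerate(lines):
        if not decl.search(line):
            continue
        # Approximate the grep filters: keep only assignments that are not
        # extern declarations, address-of uses, sizeof or NULL assignments.
        if "extern" in line or "&" in line or "sizeof" in line or "=" not in line:
            continue
        if re.search(r"=\s*NULL\s*;", line):
            continue
        # The struct is assumed to end at the first line containing '}'.
        end = next((j for j in range(i, len(lines)) if "}" in lines[j]), None)
        if end is None:
            continue
        if not any(op_re.search(l) for l in lines[i:end + 1]):
            missing.append((i + 1, end - i + 1))
    return missing

SAMPLE = """\
static const struct file_operations good_fops = {
        .llseek = generic_file_llseek,
        .checkpoint = generic_file_checkpoint,
};

static const struct file_operations bad_fops = {
        .read = bad_read,
};

static const struct file_operations old_style_fops = {
        checkpoint: generic_file_checkpoint,
};
"""

for start, length in structs_missing_op(SAMPLE):
    print("sample.c:%d+%d" % (start, length))
```

In this sample bad_fops is reported correctly, while old_style_fops is reported even though it sets the operation, because the field: foo syntax never produces the text ".checkpoint". That false positive is the kind of case a parser-based tool avoids.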


Coccinelle

A more robust method to check for missing checkpoint operations would be to preprocess and parse the struct file_operations assignments before analyzing them. One tool that can be adapted to do this is coccinelle.

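Run the following Coccinelle script from the top of the kernel tree and read its report on stdout:

spatch -sp_file fsops.cocci -dir ./

(Recent Coccinelle releases spell these options --sp-file and --dir.)

The contents of the script (fsops.cocci in the command above) are: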

@initialize:python@
#
# Check for missing .checkpoint file operations
# Check for missing .llseek file ops when the .checkpoint op is
#       generic_file_checkpoint
#
# First we collect all file operations structures into a dictionary (aka hash
#       in perl).
# Then we add their checkpoint expression and llseek expression to the dict.
#
# Finally we check each file operations struct with some simple python logic.
# It is possible to check .checkpoint with less python code, but not possible
# to look for missing .llseek operations when
# .checkpoint = generic_file_checkpoint.
#

from coccilib.elems import Location

f_ops = {}

@all_f_ops@
identifier I;
position p;
@@
struct file_operations I@p = {
        ...
};

@script:python@
f_op_name << all_f_ops.I;
f_op_pos << all_f_ops.p;
@@
l = f_op_pos[0]
fname = l.file
line = l.line
col = l.column
k = (fname, line, col)
# add to the f_ops ( file_operations struct identifier, checkpoint_expr, llseek_expr )
f_ops[k] = { 'id' : f_op_name.ident }

@checkpoint_f_op depends on all_f_ops@
identifier all_f_ops.I;
expression E;
position all_f_ops.p;
@@
struct file_operations I@p = {
        ...
        .checkpoint = E,
        ...
};

@script:python@
sub_f_op_pos << all_f_ops.p;
expr << checkpoint_f_op.E;
@@
l = sub_f_op_pos[0]
fname = l.file
line = l.line
col = l.column
k = (fname, line, col)
f_ops[k]['checkpoint'] = expr

@llseek_f_op depends on all_f_ops@
identifier all_f_ops.I;
expression E;
position all_f_ops.p;
@@
struct file_operations I@p = {
        ...
        .llseek = E,
        ...
};

@script:python@
pos << all_f_ops.p;
f_op_name << all_f_ops.I;
expr << llseek_f_op.E;
@@
l = pos[0]
fname = l.file
line = l.line
col = l.column
k = (fname, line, col)
f_ops[k]['llseek'] = expr

@finalize:python@
# print(...) with a single argument behaves the same under Python 2 and 3.
for k, f_op_var in f_ops.items():
        filename, line, column = k
        name = f_op_var['id']
        position = "%s:%s@%s %s" % (filename, line, column, name)
        if 'checkpoint' not in f_op_var:
                print("Missing .checkpoint in %s" % (position))
                continue
        checkpoint_op = str(f_op_var['checkpoint']).strip()
        if checkpoint_op != 'generic_file_checkpoint':
                continue
        # generic_file_checkpoint requires .llseek to be set as well.
        if 'llseek' not in f_op_var:
                print("Missing .llseek required by generic_file_checkpoint in %s" % (position))

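Unlike the shell script, which accepts any struct and operation name through its --struct and --operation options, the Coccinelle script is written specifically for struct file_operations.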
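
The reporting rule applied in the finalize block can be tried outside spatch on a hand-built dictionary. This is only a sketch under stated assumptions: the function name check_f_ops, the file names and the struct names are hypothetical, and positions are given as strings, which is an assumption about how the values arrive from Coccinelle. The dictionary layout follows the f_ops dict built by the script.

```python
def check_f_ops(f_ops):
    """Return the report lines for a dict shaped like the script's f_ops."""
    messages = []
    for (filename, line, column), ops in sorted(f_ops.items()):
        position = "%s:%s@%s %s" % (filename, line, column, ops["id"])
        if "checkpoint" not in ops:
            messages.append("Missing .checkpoint in %s" % position)
            continue
        # Only generic_file_checkpoint carries the extra .llseek requirement.
        if str(ops["checkpoint"]).strip() != "generic_file_checkpoint":
            continue
        if "llseek" not in ops:
            messages.append(
                "Missing .llseek required by generic_file_checkpoint in %s" % position)
    return messages

# Hypothetical entries covering the four possible outcomes.
sample_f_ops = {
    ("fs/example_a.c", "10", "30"): {"id": "a_fops"},
    ("fs/example_b.c", "20", "30"): {"id": "b_fops",
                                     "checkpoint": "generic_file_checkpoint"},
    ("fs/example_c.c", "30", "30"): {"id": "c_fops",
                                     "checkpoint": "generic_file_checkpoint",
                                     "llseek": "generic_file_llseek"},
    ("fs/example_d.c", "40", "30"): {"id": "d_fops",
                                     "checkpoint": "d_checkpoint"},
}

for message in check_f_ops(sample_f_ops):
    print(message)
```

Only the first two entries produce a report: one has no .checkpoint at all, the other uses generic_file_checkpoint without .llseek.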
