Link-LXC-USERCR

From Linux Checkpoint / Restart Wiki
Revision as of 01:03, 8 April 2010 by Sukadevb (Talk | contribs)


Overview

The following instructions describe how to build and install the components needed to checkpoint/restart (C/R) LXC containers using the C/R implementation being pushed into mainline.

The three main components to be built/installed are:

* C/R enabled Linux kernel

* USERCR - the user-space component of checkpoint/restart

* LXC

Commit id

The instructions below identify a commit id or a git tag in each of the git trees. It is recommended that you create a branch at that commit id and apply any of the listed patches on that branch before building the component.
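As a rough illustration of the recommendation above, the following sketch creates a local branch at a given commit id and prints the commit that is currently checked out. The function names `pin_branch` and `current_commit` and the branch name used in the usage note are hypothetical; only standard git commands are used.

```shell
# Create a local branch pinned at a known-good commit id or tag.
# Usage: pin_branch <repo-dir> <branch-name> <commit-or-tag>
# (pin_branch is a hypothetical helper name, not part of any of the trees.)
pin_branch() {
    repo=$1 branch=$2 commit=$3
    git -C "$repo" checkout -q -b "$branch" "$commit"
}

# Print the commit id currently checked out in <repo-dir>,
# so it can be compared with the commit id the instructions were tested with.
current_commit() {
    git -C "$1" rev-parse HEAD
}
```

For example, `pin_branch /root/linux-cr tested 3522c57a9ec6f08a129a78322318abcb4467db28` would create a branch named `tested` (an arbitrary name) at the kernel commit listed in step 1.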

Basic debugging of C/R

These components depend strongly on one another (USERCR and the Linux kernel are tightly coupled, as are USERCR and LXC), so it is important to ensure that the versions match. Otherwise, attempts to checkpoint/restart typically fail with terse -EINVAL or -EBUSY errors.

If this happens, more debug information can usually be found in the dmesg output. If lxc-checkpoint fails, some additional error information can be found by running:

* $ /bin/ckptinfo -ev <checkpoint-statefile>
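The debugging advice above could be wrapped in a small helper, sketched here under some assumptions: the function name `run_with_diagnostics` is hypothetical, showing the last 20 dmesg lines is an arbitrary choice, and ckptinfo is only run if it exists at /bin/ckptinfo and the statefile is present.

```shell
# Run a C/R command; if it fails, print the diagnostics suggested above.
# Usage: run_with_diagnostics <statefile> <command> [args...]
# (hypothetical helper; the 20-line dmesg tail is an arbitrary assumption)
run_with_diagnostics() {
    statefile=$1
    shift
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "command failed with exit status $rc: $*" >&2
        echo "--- last kernel messages (dmesg) ---" >&2
        dmesg 2>/dev/null | tail -n 20 >&2 || true
        # ckptinfo needs an existing statefile to inspect
        if [ -x /bin/ckptinfo ] && [ -f "$statefile" ]; then
            echo "--- ckptinfo -ev $statefile ---" >&2
            /bin/ckptinfo -ev "$statefile" >&2 || true
        fi
    fi
    return "$rc"
}
```

A possible invocation is `run_with_diagnostics /root/lxc-foo.ckpt lxc-checkpoint --name foo --statefile /root/lxc-foo.ckpt`; the helper returns the exit status of the wrapped command.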

Terminology

* statefile / checkpoint-image - this refers to the file in which the application state is saved after a successful checkpoint.

* USERCR - The user-space commands/library components that utilize the kernel system calls to checkpoint/restart applications. Specifically this refers to the git tree: git://git.ncl.cs.columbia.edu/pub/git/user-cr.git

1. Build C/R-enabled Linux kernel

* $ cd /root

* $ git clone git://www.linux-cr.org/pub/git/linux-cr.git linux-cr

* $ cd linux-cr

* $ git checkout ckpt-v20-dev

Tested with commit 3522c57a9ec6f08a129a78322318abcb4467db28 as HEAD.

* # Ensure the following tokens are set in .config

CONFIG_CHECKPOINT_SUPPORT=y
CONFIG_SYSVIPC_CHECKPOINT=y
CONFIG_CHECKPOINT=y
CONFIG_CHECKPOINT_NETNS=y
CONFIG_CHECKPOINT_DEBUG=y
CONFIG_CGROUPS=y
CONFIG_CGROUP_FREEZER=y
CONFIG_NAMESPACES=y
CONFIG_CGROUP_NS=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
CONFIG_FREEZER=y
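Checking the tokens by eye is error-prone, so a small check along the following lines may help. This is only a sketch: the function name `check_cr_config` and the variable `REQUIRED_TOKENS` are hypothetical, and the token list is copied from the list above.

```shell
# Tokens from the list above; each must appear as TOKEN=y in the kernel .config.
REQUIRED_TOKENS="CONFIG_CHECKPOINT_SUPPORT CONFIG_SYSVIPC_CHECKPOINT
CONFIG_CHECKPOINT CONFIG_CHECKPOINT_NETNS CONFIG_CHECKPOINT_DEBUG
CONFIG_CGROUPS CONFIG_CGROUP_FREEZER CONFIG_NAMESPACES CONFIG_CGROUP_NS
CONFIG_UTS_NS CONFIG_IPC_NS CONFIG_USER_NS CONFIG_PID_NS CONFIG_NET_NS
CONFIG_FREEZER"

# Print every required token that is not set to =y in <config-file>;
# return non-zero if at least one is missing.
# Usage: check_cr_config <config-file>   (hypothetical helper name)
check_cr_config() {
    config=$1
    missing=0
    for token in $REQUIRED_TOKENS; do
        # Anchor the match so CONFIG_CHECKPOINT does not match CONFIG_CHECKPOINT_SUPPORT
        if ! grep -q "^${token}=y\$" "$config"; then
            echo "missing: ${token}=y"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_cr_config /root/linux-cr/.config` before building prints nothing and returns 0 when every token is present.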

* # Build, install, reboot on new kernel

* # After every reboot, ensure that the '-o newinstance' mount option for /dev/pts works (see Documentation/filesystems/devpts.txt for details). In short, run the following commands after each reboot:

* $ rm /dev/ptmx

* $ ln -s pts/ptmx /dev/ptmx

* $ chmod 666 /dev/pts/ptmx
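Since the three commands above have to be repeated after every reboot, they could be collected in a function such as the following sketch. The name `fix_ptmx` and the optional directory argument are hypothetical additions (the argument allows a trial run on a scratch directory); `rm -f` is used instead of `rm` so that a missing /dev/ptmx is not treated as an error. Running it against the real /dev requires root.

```shell
# Point <dev-dir>/ptmx at the devpts instance's own ptmx node.
# Usage: fix_ptmx [dev-dir]   (dev-dir defaults to /dev; hypothetical helper)
fix_ptmx() {
    dev=${1:-/dev}
    rm -f "$dev/ptmx"            # remove the old ptmx node, if any
    ln -s pts/ptmx "$dev/ptmx"   # relative symlink, as in the commands above
    chmod 666 "$dev/pts/ptmx"    # make the devpts ptmx node usable by all users
}
```

Calling `fix_ptmx` with no argument from a boot-time script performs the same three steps on /dev.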


2. Build USERCR

* $ cd /root

* $ git clone git://git.ncl.cs.columbia.edu/pub/git/user-cr.git user-cr

* $ cd user-cr

* $ git checkout ckpt-v20-dev

* Tested with commit e275f77e4a82d228c1df14dbeb691342e32cdac2 as HEAD.

* $ KERNELSRC=/root/linux-cr make

* This builds USERCR against the corresponding kernel source tree. It should create the restart.o and checkpoint.o files needed by LXC.

* You may need to compile checkpoint.o and restart.o with the -fPIC compiler option.

* $ make install

3. Build/install LXC

* $ cd /root

* $ git clone git://lxc.git.sourceforge.net/gitroot/lxc/lxc lxc.git

* $ cd lxc.git

* # Apply attached patches to LXC (I tested with these patches applied to commit 9ea8066aa67b808f71f46e346bd7a215e2a355f3)

* $ ./autogen.sh

* $ ./configure --with-libcr=/root/user-cr

* This will fail if /root/user-cr does not contain the checkpoint.o, restart.o and app-checkpoint.h files.

* $ make

* $ make install
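Because ./configure fails when the USERCR directory is incomplete, it can be convenient to check for the three files before running it. The sketch below does that; the function name `check_libcr_dir` is hypothetical, and the file names are the ones listed in the step above.

```shell
# Report which of the files LXC's configure step expects are missing from <dir>;
# return non-zero if any is missing.
# Usage: check_libcr_dir <user-cr-dir>   (hypothetical helper name)
check_libcr_dir() {
    dir=$1
    status=0
    for f in checkpoint.o restart.o app-checkpoint.h; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            status=1
        fi
    done
    return "$status"
}
```

For instance, `check_libcr_dir /root/user-cr && ./configure --with-libcr=/root/user-cr` only runs configure when all three files are in place.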

4. Checkpoint/restart a simple LXC container

* $ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000

* $ lxc-checkpoint --name foo --statefile /root/lxc-foo.ckpt

* $ lxc-stop --name foo

* $ lxc-restart --name foo --statefile /root/lxc-foo.ckpt

* $ lxc-stop --name foo
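When scripting the sequence above, it may be worth stopping the container only after the checkpoint has clearly succeeded. The following sketch calls lxc-checkpoint and lxc-stop with the same options as in the steps above; the function name `checkpoint_then_stop` is hypothetical, and treating an empty or missing statefile as a failure is an assumption, not documented lxc-checkpoint behaviour.

```shell
# Checkpoint container <name> into <statefile>, then stop it.
# The container is stopped only if lxc-checkpoint succeeded and left a
# non-empty statefile (the non-empty check is an assumption of this sketch).
# Usage: checkpoint_then_stop <name> <statefile>   (hypothetical helper)
checkpoint_then_stop() {
    name=$1 statefile=$2
    lxc-checkpoint --name "$name" --statefile "$statefile" || return 1
    if [ ! -s "$statefile" ]; then
        echo "statefile $statefile is missing or empty" >&2
        return 1
    fi
    lxc-stop --name "$name"
}
```

With the container from this step running in another terminal, `checkpoint_then_stop foo /root/lxc-foo.ckpt` would replace the separate lxc-checkpoint and lxc-stop commands.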

5. Checkpoint/restart other LXC containers

* As in step 4 above, checkpoint/restart other applications in containers. For example applications, clone the following git tree:

* cr-tests: git://git.sr71.net/~hallyn/cr_tests.git

and try checkpoint/restart of:

* a file-io session (see run-fileio1 in cr-tests[1])

* process-tree (see run-ptree1 in cr-tests[1])

6. Checkpoint/restart an LXC container running a VNC server

* Run a "vi" editing session inside a VNC server using "twm"

* $ cat /root/.vnc/xstartup

#!/bin/sh

xsetroot -solid grey
vncconfig -iconic &
xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
twm &

* $ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /usr/bin/vncserver :1

* $ vncviewer :1

* # Open a vi session in vnc viewer

* $ lxc-checkpoint --name foo --statefile /root/vnc.ckpt

* $ lxc-stop --name foo

* $ lxc-restart --pause --name foo --statefile /root/vnc.ckpt

* # Leaves the server frozen due to --pause

* $ lxc-unfreeze --name foo

* $ vncviewer :1

* # Should bring up the old VNC session with vi window
