Link-LXC-USERCR

From Linux Checkpoint / Restart Wiki

Latest revision as of 13:14, 11 January 2011

Note: This page is still under construction. Feel free to fix obvious mistakes or email sukadev@linux.vnet.ibm.com.

Overview

The following instructions describe how to build and install the components needed to checkpoint/restart (C/R) LXC containers, using the C/R implementation being pushed into mainline.

The three main components to be built/installed are:

  • C/R enabled Linux kernel
  • USERCR - the user-space component of checkpoint/restart
  • LXC

Each of these components must currently be built from source from their git trees. The instructions below describe how to get the sources, check out the appropriate versions, and build and install the components.

Terminology

Statefile / checkpoint-image - This refers to the file in which the application state is saved after a successful checkpoint.

LXC - This refers to the implementation of Linux containers. Specifically this refers to the git tree git://lxc.git.sourceforge.net/gitroot/lxc/lxc.

Stack - The C/R functionality has only been tested with these specific sets of commit ids for the three components (kernel, USERCR and LXC). In this document a stack refers to a set of matching commit ids and is identified by an LXC release.

USERCR - The user-space commands/library components that utilize the kernel system calls to checkpoint/restart applications. Specifically this refers to the git tree: git://www.linux-cr.org/pub/git/user-cr.git

Basic debugging of C/R

There is a strong dependency between each of these components (USERCR and Linux kernel are tightly coupled as are USERCR and LXC) so it is important to ensure the versions match up correctly. Otherwise attempts to checkpoint/restart typically fail with terse -EINVAL or -EBUSY errors.

If this happens, more debug information can usually be found in the dmesg output. If lxc-checkpoint fails, some additional error information can be found by running:

   $ /bin/ckptinfo -ev <checkpoint-statefile>

Get sources

Get the kernel, USERCR and LXC sources using the following commands:

Kernel sources

   $ cd /root
   $ git clone git://www.linux-cr.org/pub/git/linux-cr.git linux-cr

USERCR sources

   $ cd /root
   $ git clone git://www.linux-cr.org/pub/git/user-cr.git user-cr

LXC sources

   $ cd /root
   $ git clone git://lxc.git.sourceforge.net/gitroot/lxc/lxc lxc.git

Choose a stack

The commit ids for each of the components (kernel, USERCR, LXC) must be in sync for the entire "stack" to work correctly. In time, as new fixes go into each of these trees, new "stacks" become available. Choose a stack based on the LXC release below and use the commit ids from that stack in the build instructions to build the stack.

LXC 0.6.5 based stack

To use this stack, use the following values in the build instructions below

   - LXC_RELEASE = 0.6.5
   - KERNEL_COMMIT_ID = 0fdca57255b8b5bc9a4f107bee7f1e47d2630cb3
   - USERCR_COMMIT_ID = fa382a1758ef395858b1bea530ccdb4e1360b76f
   - LXC_COMMIT_ID = f78a1f32f41f6acbbf0b78e6498736dbd22e2301

You can use the following commands/output to verify the above commit ids.

   $ (cd /root/linux-cr && git log --pretty=short -1 $KERNEL_COMMIT_ID)
   commit 0fdca57255b8b5bc9a4f107bee7f1e47d2630cb3
   Author: Dan Smith <danms@us.ibm.com>
       Disable softirqs when taking the socket queue lock
   $ (cd /root/user-cr && git log --pretty=short -1 $USERCR_COMMIT_ID)
   commit fa382a1758ef395858b1bea530ccdb4e1360b76f
   Author: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
       Add keep_frozen field to struct app_restart_args
   $ (cd /root/lxc.git && git log --pretty=short -1 $LXC_COMMIT_ID)
   commit f78a1f32f41f6acbbf0b78e6498736dbd22e2301
   Author: Daniel Lezcano <daniel.lezcano@free.fr>
       fix when console is not specified

LXC-0.7.1 based stack

To use this stack, use the following values in the build instructions

   - LXC_RELEASE = 0.7.1
   - KERNEL_COMMIT_ID = c97ab066d8f45d2458f36fdb1e457630fb5f1278
   - USERCR_COMMIT_ID = f67877308e2ff8faedf79b204b26b70f03bcd562
   - LXC_COMMIT_ID = cba56779c893aac20d42d65cfa10db966c24d9b7
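
The verification and checkout commands in this document refer to the values above as shell variables. A minimal sketch, assuming a POSIX shell, of setting them for the LXC-0.7.1 stack so the later commands can be pasted as written (substitute the 0.6.5 values to use that stack instead). The length check is only an illustrative guard against copy/paste truncation and is not part of the original instructions.

```shell
# Values for the LXC-0.7.1 based stack, copied from the list above.
export LXC_RELEASE=0.7.1
export KERNEL_COMMIT_ID=c97ab066d8f45d2458f36fdb1e457630fb5f1278
export USERCR_COMMIT_ID=f67877308e2ff8faedf79b204b26b70f03bcd562
export LXC_COMMIT_ID=cba56779c893aac20d42d65cfa10db966c24d9b7

# Illustrative sanity check: a full git commit id is 40 hex characters.
for id in "$KERNEL_COMMIT_ID" "$USERCR_COMMIT_ID" "$LXC_COMMIT_ID"; do
    [ "${#id}" -eq 40 ] || echo "unexpected commit id length: $id" >&2
done
```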

You can use the following commands/output to verify the above commit ids.

   $ (cd /root/linux-cr && git log --pretty=short -1 $KERNEL_COMMIT_ID)
   commit c97ab066d8f45d2458f36fdb1e457630fb5f1278
   Author: Dan Smith <danms@us.ibm.com>
       Fix potential restart failure on uninitialized sockets
   $ (cd /root/user-cr && git log --pretty=short -1 $USERCR_COMMIT_ID)
   commit f67877308e2ff8faedf79b204b26b70f03bcd562
   Author: Christoffer Dall <christofferdall@christofferdall.dk>
       ARM: Added user space support for c/r on ARM
   $ (cd /root/lxc.git && git log --pretty=short -1 $LXC_COMMIT_ID)
   commit cba56779c893aac20d42d65cfa10db966c24d9b7
   Author: Daniel Lezcano <daniel.lezcano@free.fr>
       lxc-0.7.1
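
Comparing the git log output by eye works, but the same check can be scripted. The sketch below assumes the source trees were cloned under /root as described earlier; verify_commit is a hypothetical helper name, not part of any of the three components. It relies on "git cat-file -e", which exits with a non-zero status when the named object does not exist.

```shell
# verify_commit DIR COMMIT_ID   (hypothetical helper)
# Succeeds only if DIR is a git tree that contains COMMIT_ID.
verify_commit() {
    dir=$1
    id=$2
    if [ ! -d "$dir" ]; then
        echo "missing source tree: $dir" >&2
        return 1
    fi
    # "^{commit}" makes git insist that the object is a commit.
    if (cd "$dir" && git cat-file -e "${id}^{commit}" 2>/dev/null); then
        echo "ok: $dir has $id"
    else
        echo "commit $id not found in $dir" >&2
        return 1
    fi
}
```

For example, verify_commit /root/linux-cr "$KERNEL_COMMIT_ID" would be run once per tree before building.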

Build/install C/R-enabled Linux kernel

Checkout appropriate kernel-commit

Find the KERNEL_COMMIT_ID for the stack/LXC-RELEASE you chose above and run the following commands:

   $ cd /root/linux-cr
   $ git checkout -b test-1 $KERNEL_COMMIT_ID

Apply kernel patches

If any patches are needed for the kernel tree, they will be in the $LXC_RELEASE/kernel-patches directory of: http://lxc.sourceforge.net/patches/lxc+usercr/, where LXC_RELEASE is set above.

Get those patches, and use, say, git am -3 to apply each of the patches in the directory to the kernel tree.

   $ git am -3 kernel-patches/0001-patch
   $ git am -3 kernel-patches/0002-patch
   etc
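
When a patch directory holds more than a few files, the same sequence can be written as a loop. This is a sketch only: apply_patches and the DRY_RUN variable are hypothetical names introduced here, and the loop assumes the patch files are named so that shell glob order matches the intended order (0001-..., 0002-..., and so on).

```shell
# apply_patches DIR   (hypothetical helper)
# Applies every file in DIR to the current git tree, in glob (sorted) order.
# Set DRY_RUN=1 to print the commands instead of running them.
apply_patches() {
    dir=$1
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    for p in "$dir"/*; do
        [ -f "$p" ] || continue          # nothing to do for an empty directory
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "git am -3 $p"
        else
            git am -3 "$p" || return 1   # stop at the first failing patch
        fi
    done
}
```

The same helper would apply to the usercr-patches and lxc-patches directories when run from the corresponding tree.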

Setup kernel config

Ensure the following options are set in the kernel .config:

  CONFIG_CHECKPOINT_SUPPORT=y
  CONFIG_SYSVIPC_CHECKPOINT=y
  CONFIG_CHECKPOINT=y
  CONFIG_NETNS_CHECKPOINT=y
  CONFIG_CHECKPOINT_DEBUG=y
  CONFIG_CGROUPS=y
  CONFIG_CGROUP_FREEZER=y
  CONFIG_NAMESPACES=y
  CONFIG_CGROUP_NS=y
  CONFIG_UTS_NS=y
  CONFIG_IPC_NS=y
  CONFIG_USER_NS=y
  CONFIG_PID_NS=y
  CONFIG_NET_NS=y
  CONFIG_FREEZER=y
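
Checking fifteen options by hand is error-prone, so a small script can do it. The following is a sketch assuming a standard kernel .config file with one CONFIG_NAME=value entry per line; check_cr_config is a hypothetical helper name.

```shell
# check_cr_config FILE   (hypothetical helper)
# Prints each required option that is not set to "y" in FILE and
# returns non-zero if any is missing.
check_cr_config() {
    config=$1
    missing=0
    for opt in CHECKPOINT_SUPPORT SYSVIPC_CHECKPOINT CHECKPOINT \
               NETNS_CHECKPOINT CHECKPOINT_DEBUG CGROUPS CGROUP_FREEZER \
               NAMESPACES CGROUP_NS UTS_NS IPC_NS USER_NS PID_NS NET_NS \
               FREEZER; do
        # -x matches the whole line, so CONFIG_CGROUP_FREEZER=y is not
        # mistaken for CONFIG_FREEZER=y.
        if ! grep -qx "CONFIG_${opt}=y" "$config"; then
            echo "not set: CONFIG_${opt}"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_cr_config /root/linux-cr/.config before building should print nothing when every option in the list is enabled.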

To work around build issues with unexported symbols reported against this kernel version, ensure that the following symbols are each y or n, but not m: CONFIG_IPV6, CONFIG_MACVLAN, and CONFIG_VETH.
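
The "not m" constraint can be checked the same way. This sketch again assumes a standard .config layout; check_not_modular is a hypothetical helper name. An option that is absent or commented out as "is not set" counts as n and passes.

```shell
# check_not_modular FILE   (hypothetical helper)
# Reports any of the three options that are built as modules (=m).
check_not_modular() {
    config=$1
    bad=0
    for opt in IPV6 MACVLAN VETH; do
        if grep -qx "CONFIG_${opt}=m" "$config"; then
            echo "must be y or n, not m: CONFIG_${opt}"
            bad=1
        fi
    done
    return "$bad"
}
```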

Build, install kernel

Build and install the Linux kernel using the normal build procedure (make oldconfig, make, install vmlinuz, install modules, edit grub/lilo etc) and reboot into the new kernel.

Reboot on new kernel

After every reboot, ensure that the '-o newinstance' mount option for /dev/pts works (see Documentation/filesystems/devpts.txt for details). In short, run the following commands after each reboot:

  $ rm /dev/ptmx
  $ ln -s pts/ptmx /dev/ptmx
  $ chmod 666 /dev/pts/ptmx
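
Because these commands must be repeated after every reboot, it can help to confirm the result before starting a container. A minimal sketch, assuming readlink is available (it is on typical Linux systems); check_ptmx is a hypothetical helper name and its optional argument exists only so the check can be tried against a scratch directory.

```shell
# check_ptmx [DEVDIR]   (hypothetical helper)
# Succeeds if DEVDIR/ptmx is a symlink to pts/ptmx. DEVDIR defaults to /dev.
check_ptmx() {
    dev=${1:-/dev}
    if [ -L "$dev/ptmx" ] && [ "$(readlink "$dev/ptmx")" = "pts/ptmx" ]; then
        echo "ok: $dev/ptmx -> pts/ptmx"
    else
        echo "$dev/ptmx is not a symlink to pts/ptmx" >&2
        return 1
    fi
}
```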

Disable nscd if it is running -- nscd can cause file descriptors to be passed between mount namespaces, which is not supported by the current C/R code.

Build/install USERCR

Checkout appropriate USERCR commit

Find the USERCR_COMMIT_ID for the stack/LXC-RELEASE you chose above and run the following commands:

   $ cd /root/user-cr
   $ git checkout -b test-1 $USERCR_COMMIT_ID

Apply USERCR patches

If any patches are needed for the USERCR tree, they will be available in $LXC_RELEASE/usercr-patches directory of: http://lxc.sourceforge.net/patches/lxc+usercr/, where LXC_RELEASE is set above.

Get those patches, and use, say, git am -3 to apply each of the patches in the directory to the USERCR tree.

   $ git am -3 usercr-patches/0001-patch
   $ git am -3 usercr-patches/0002-patch
   etc

Build and install USERCR binaries

Build USERCR, pointing KERNELSRC at the kernel source tree used above, and install the binaries. The build should create restart.o and checkpoint.o, which LXC needs and which, for now, are left in the current directory.

Note: You may need to compile checkpoint.o and restart.o with the -fPIC compiler option.

   $ KERNELSRC=/root/linux-cr make 
   $ ls restart.o checkpoint.o
   restart.o   checkpoint.o
   $ make install

Build/install LXC

Checkout appropriate LXC commit

Find the LXC_COMMIT_ID for the stack/LXC-RELEASE you chose above and run the following commands:

   $ cd /root/lxc.git
   $ git checkout -b test-1 $LXC_COMMIT_ID

Apply LXC patches

Get the patch set from the $LXC_RELEASE/lxc-patches directory of http://lxc.sourceforge.net/patches/lxc+usercr/, where LXC_RELEASE is set above.

Apply each of the patches in the directory to the LXC git tree:

   $ git am -3 lxc-patches/0001-patch
   $ git am -3 lxc-patches/0002-patch
   etc

Configure, build, install LXC binaries

Configure, build and install the LXC binaries using the following commands. The configure command will fail if the /root/user-cr directory specified below does not contain the checkpoint.o, restart.o and app-checkpoint.h files.

   $ ./autogen.sh
   $ ./configure --with-libcr=/root/user-cr
   $ make
   $ make install
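
Since configure fails when the USERCR directory is incomplete, the three files can be checked beforehand. This is a sketch that mirrors the requirement stated above, not the test that configure itself performs; check_libcr is a hypothetical helper name.

```shell
# check_libcr DIR   (hypothetical helper)
# Succeeds if DIR holds the three files named in the text above.
check_libcr() {
    dir=$1
    status=0
    for f in checkpoint.o restart.o app-checkpoint.h; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            status=1
        fi
    done
    return "$status"
}
```

Running check_libcr /root/user-cr before ./configure reports which file, if any, still needs to be built.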

Checkpoint/restart a simple LXC container

Checkpoint/restart a simple container to verify a successful build of the LXC and Checkpoint/restart stack.

   $ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
   $ lxc-checkpoint --name foo --statefile /root/lxc-foo.ckpt
   $ lxc-stop --name foo
   $ lxc-restart --name foo --statefile /root/lxc-foo.ckpt
   $ lxc-stop --name foo

Checkpoint/restart an LXC container running a VNC server

VI editing session in a VNC session

Run a "vi" editing session inside a VNC server using the "twm" window manager:

   $ cat /root/.vnc/xstartup
   #!/bin/sh
   xsetroot -solid grey
   vncconfig -iconic &
   xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
   twm &
   $ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /usr/bin/vncserver :1
   $ vncviewer :1
   # Open a vi session in vnc viewer
   $ lxc-checkpoint --name foo  --statefile /root/vnc.ckpt
   $ lxc-stop --name foo
   $ lxc-restart --pause --name foo --statefile /root/vnc.ckpt
   # Leaves the server frozen due to --pause
   $ lxc-unfreeze --name foo
   $ vncviewer :1
   # Should bring up the old VNC session with vi window