LXD 2.0: Live migration [9/12]

This is the ninth blog post in this series about LXD 2.0.

Introduction

One of the very exciting features of LXD 2.0, albeit an experimental one, is the support for container checkpoint and restore.

Simply put, checkpoint/restore means that the running container state can be serialized to disk and then restored, either on the same host as a stateful snapshot of the container, or on another host, which equates to live migration.

Requirements

To have access to container live migration and stateful snapshots, you need the following:

  • A very recent Linux kernel, 4.4 or higher.
  • CRIU 2.0, possibly with some cherry-picked commits depending on your exact kernel configuration.
  • Run LXD directly on the host. It’s not possible to use those features with container nesting.
  • For migration, the target machine must implement at least the instruction set of the source, the target kernel must offer at least the same syscalls as the source, and any kernel filesystem that was mounted on the source must also be mountable on the target.

Ubuntu 16.04 LTS provides all the needed dependencies, so on that release all you need to do is install CRIU itself:

apt install criu
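
The first two requirements can be checked from a script. What follows is only a rough sketch: the kernel_at_least helper is a made-up name for this example, and it only compares the major and minor numbers reported by uname -r. It says nothing about cherry-picked CRIU commits or container nesting.

```shell
#!/bin/sh
# Hypothetical preflight helper; the function name is an assumption of this sketch.

# Succeeds if kernel release $1 (e.g. "4.4.0-43-generic") is at least
# version $2, given as "major.minor" (e.g. "4.4").
kernel_at_least() {
    have=${1%%-*}               # "4.4.0-43-generic" -> "4.4.0"
    have_major=${have%%.*}      # "4.4.0" -> "4"
    rest=${have#*.}             # "4.4.0" -> "4.0"
    have_minor=${rest%%.*}      # "4.0"   -> "4"
    want_major=${2%%.*}
    want_minor=${2#*.}
    [ "$have_major" -gt "$want_major" ] && return 0
    [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]
}

# Requirement 1: Linux kernel 4.4 or higher.
if kernel_at_least "$(uname -r)" 4.4; then
    echo "kernel: $(uname -r) is 4.4 or higher"
else
    echo "kernel: $(uname -r) is older than 4.4"
fi

# Requirement 2: CRIU is installed (the version is not checked here).
if command -v criu >/dev/null 2>&1; then
    echo "criu: found at $(command -v criu)"
else
    echo "criu: not found (on Ubuntu 16.04: apt install criu)"
fi
```

CRIU also has a "criu check" subcommand which probes the running kernel for the features it needs; running it as root is a more thorough test than the version comparison above.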

Using the thing

Stateful snapshots

A normal container snapshot looks like:

stgraber@dakara:~$ lxc snapshot c1 first
stgraber@dakara:~$ lxc info c1 | grep first
 first (taken at 2016/04/25 19:35 UTC) (stateless)

A stateful snapshot instead looks like:

stgraber@dakara:~$ lxc snapshot c1 second --stateful
stgraber@dakara:~$ lxc info c1 | grep second
 second (taken at 2016/04/25 19:36 UTC) (stateful)

This means that all the container runtime state was serialized to disk and included as part of the snapshot. Restoring such a snapshot works the same way as restoring a stateless one:

stgraber@dakara:~$ lxc restore c1 second
stgraber@dakara:~$
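
If a container has many snapshots, it can help to filter for the stateful ones. The sketch below assumes the "lxc info" output format shown above, where each snapshot line ends in "(stateful)" or "(stateless)"; the stateful_snapshots function name is made up for this example, and other LXD versions may print snapshots differently.

```shell
#!/bin/sh
# Hypothetical helper: reads `lxc info <container>` output on stdin and prints
# the names of the snapshots whose line ends in "(stateful)".
stateful_snapshots() {
    awk '/\(stateful\)[[:space:]]*$/ { print $1 }'
}

# Usage (needs a running LXD and a container called c1):
#   lxc info c1 | stateful_snapshots
```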

Stateful stop/start

Say you want to reboot your server for a kernel update or similar maintenance. Rather than having to wait for all the containers to start from scratch after the reboot, you can do:

stgraber@dakara:~$ lxc stop c1 --stateful

The container state will be written to disk and then picked up the next time you start it.

You can even look at what the state looks like:

root@dakara:~# tree /var/lib/lxd/containers/c1/rootfs/state/
/var/lib/lxd/containers/c1/rootfs/state/
├── cgroup.img
├── core-101.img
├── core-102.img
├── core-107.img
├── core-108.img
├── core-109.img
├── core-113.img
├── core-114.img
├── core-122.img
├── core-125.img
├── core-126.img
├── core-127.img
├── core-183.img
├── core-1.img
├── core-245.img
├── core-246.img
├── core-50.img
├── core-52.img
├── core-95.img
├── core-96.img
├── core-97.img
├── core-98.img
├── dump.log
├── eventfd.img
├── eventpoll.img
├── fdinfo-10.img
├── fdinfo-11.img
├── fdinfo-12.img
├── fdinfo-13.img
├── fdinfo-14.img
├── fdinfo-2.img
├── fdinfo-3.img
├── fdinfo-4.img
├── fdinfo-5.img
├── fdinfo-6.img
├── fdinfo-7.img
├── fdinfo-8.img
├── fdinfo-9.img
├── fifo-data.img
├── fifo.img
├── filelocks.img
├── fs-101.img
├── fs-113.img
├── fs-122.img
├── fs-183.img
├── fs-1.img
├── fs-245.img
├── fs-246.img
├── fs-50.img
├── fs-52.img
├── fs-95.img
├── fs-96.img
├── fs-97.img
├── fs-98.img
├── ids-101.img
├── ids-113.img
├── ids-122.img
├── ids-183.img
├── ids-1.img
├── ids-245.img
├── ids-246.img
├── ids-50.img
├── ids-52.img
├── ids-95.img
├── ids-96.img
├── ids-97.img
├── ids-98.img
├── ifaddr-9.img
├── inetsk.img
├── inotify.img
├── inventory.img
├── ip6tables-9.img
├── ipcns-var-10.img
├── iptables-9.img
├── mm-101.img
├── mm-113.img
├── mm-122.img
├── mm-183.img
├── mm-1.img
├── mm-245.img
├── mm-246.img
├── mm-50.img
├── mm-52.img
├── mm-95.img
├── mm-96.img
├── mm-97.img
├── mm-98.img
├── mountpoints-12.img
├── netdev-9.img
├── netlinksk.img
├── netns-9.img
├── netns-ct-9.img
├── netns-exp-9.img
├── packetsk.img
├── pagemap-101.img
├── pagemap-113.img
├── pagemap-122.img
├── pagemap-183.img
├── pagemap-1.img
├── pagemap-245.img
├── pagemap-246.img
├── pagemap-50.img
├── pagemap-52.img
├── pagemap-95.img
├── pagemap-96.img
├── pagemap-97.img
├── pagemap-98.img
├── pages-10.img
├── pages-11.img
├── pages-12.img
├── pages-13.img
├── pages-1.img
├── pages-2.img
├── pages-3.img
├── pages-4.img
├── pages-5.img
├── pages-6.img
├── pages-7.img
├── pages-8.img
├── pages-9.img
├── pipes-data.img
├── pipes.img
├── pstree.img
├── reg-files.img
├── remap-fpath.img
├── route6-9.img
├── route-9.img
├── rule-9.img
├── seccomp.img
├── sigacts-101.img
├── sigacts-113.img
├── sigacts-122.img
├── sigacts-183.img
├── sigacts-1.img
├── sigacts-245.img
├── sigacts-246.img
├── sigacts-50.img
├── sigacts-52.img
├── sigacts-95.img
├── sigacts-96.img
├── sigacts-97.img
├── sigacts-98.img
├── signalfd.img
├── stats-dump
├── timerfd.img
├── tmpfs-dev-104.tar.gz.img
├── tmpfs-dev-109.tar.gz.img
├── tmpfs-dev-110.tar.gz.img
├── tmpfs-dev-112.tar.gz.img
├── tmpfs-dev-114.tar.gz.img
├── tty.info
├── unixsk.img
├── userns-13.img
└── utsns-11.img

0 directories, 154 files
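
With that many files, a per-type count is easier to read than the full listing. This is a small sketch, not part of LXD or CRIU: summarize_images is a made-up name, and it simply strips the numeric suffix and the .img extension from each file name before counting.

```shell
#!/bin/sh
# Hypothetical helper: reads file names on stdin (one per line) and prints
# "<count> <image type>", most frequent type first.
summarize_images() {
    # "core-101.img" -> "core", "tmpfs-dev-104.tar.gz.img" -> "tmpfs-dev",
    # names without a .img suffix (dump.log, stats-dump) are left untouched.
    sed -E 's/(-[0-9]+)?(\.tar\.gz)?\.img$//' | sort | uniq -c | sort -k1,1nr -k2
}

# Usage (as root, with the state directory shown above):
#   ls /var/lib/lxd/containers/c1/rootfs/state/ | summarize_images
```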

Restoring the container can be done with a simple:

stgraber@dakara:~$ lxc start c1
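
For a host with several containers, the stop and start steps above could be wrapped in two small functions. This is a sketch under a few assumptions: the function names and the LXC variable are made up for this example, and the container names are passed in by hand rather than discovered.

```shell
#!/bin/sh
# LXC can be overridden for a dry run, e.g. LXC=echo (assumption of this sketch).
LXC=${LXC:-lxc}

# Checkpoint every container named on the command line before a reboot.
# Returns non-zero if at least one container could not be checkpointed.
stateful_stop_all() {
    failed=0
    for name in "$@"; do
        if "$LXC" stop "$name" --stateful; then
            echo "checkpointed: $name"
        else
            echo "could not checkpoint: $name" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Start the same containers again after the reboot; the saved state is
# picked up by a plain `lxc start`.
start_all() {
    for name in "$@"; do
        "$LXC" start "$name" || echo "could not start: $name" >&2
    done
}

# Usage:
#   stateful_stop_all c1 c2 && reboot
#   ... after the reboot ...
#   start_all c1 c2
```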

Live migration

Live migration is basically the same as the stateful stop/start above, except that the container directory and configuration are also moved to another machine.

stgraber@dakara:~$ lxc list c1
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| NAME |  STATE  |          IPV4         |                     IPV6                     |    TYPE    | SNAPSHOTS |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.178.150.197 (eth0) | 2001:470:b368:4242:216:3eff:fe19:27b0 (eth0) | PERSISTENT | 2         |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+

stgraber@dakara:~$ lxc list s-tollana:
+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

stgraber@dakara:~$ lxc move c1 s-tollana:

stgraber@dakara:~$ lxc list c1
+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

stgraber@dakara:~$ lxc list s-tollana:
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| NAME |  STATE  |          IPV4         |                     IPV6                     |    TYPE    | SNAPSHOTS |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.178.150.197 (eth0) | 2001:470:b368:4242:216:3eff:fe19:27b0 (eth0) | PERSISTENT | 2         |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+

Limitations

As I said before, checkpoint/restore of containers is still pretty new and we’re still very much working on this feature, fixing issues as we are made aware of them. We do need more people trying this feature and sending us feedback. I would, however, not recommend using it in production just yet.

The current list of issues we’re tracking is available on Launchpad.

We expect a basic Ubuntu container with a few services to work properly with CRIU in Ubuntu 16.04. However, more complex containers, such as those using device passthrough, complex network services or special storage configurations, are likely to fail.

Whenever possible, CRIU will fail at dump time, rather than at restore time. In such cases, the source container will keep running, the snapshot or migration will simply fail and a log file will be generated for debugging.

In rare cases, CRIU fails to restore the container, in which case the source container will still be around but will be stopped and will have to be manually restarted.
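
The two failure modes above can be told apart by looking at the source container after a failed move. The wrapper below is a hedged sketch: migrate_or_recover and the LXC variable are made-up names, and it assumes that "lxc info" prints a "Status: Stopped" line for a stopped container, in the same way it prints "Status: Running" for a running one.

```shell
#!/bin/sh
# LXC can be overridden for testing (assumption of this sketch).
LXC=${LXC:-lxc}

# migrate_or_recover <container> <remote>
# Tries a live migration and, if it fails, restarts the source container
# when the failed restore left it stopped.
migrate_or_recover() {
    name=$1
    remote=$2
    if "$LXC" move "$name" "$remote:"; then
        echo "migrated: $name -> $remote"
        return 0
    fi
    # Dump failure: the source keeps running, nothing to do.
    # Restore failure: the source is stopped and must be started by hand.
    if "$LXC" info "$name" 2>/dev/null | grep -q '^Status: Stopped'; then
        echo "restore failed, restarting $name on the source" >&2
        "$LXC" start "$name"
    else
        echo "migration failed, $name was left untouched on the source" >&2
    fi
    return 1
}

# Usage:
#   migrate_or_recover c1 s-tollana
```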

Sending bug reports

We’re tracking bugs related to checkpoint/restore against the CRIU Ubuntu package on Launchpad. Most of the work to fix those bugs will then happen upstream, either in CRIU itself or in the Linux kernel, but it’s easier for us to track things this way.

To file a new bug report, head here.

Please make sure to include:

  • The command you ran and the error message as displayed to you
  • Output of “lxc info” (*)
  • Output of “lxc info <container name>”
  • Output of “lxc config show --expanded <container name>”
  • Output of “dmesg” (*)
  • Content of “/proc/self/mountinfo” (*)
  • Output of “lxc exec <container name> -- cat /proc/self/mountinfo”
  • Output of “uname -a” (*)
  • The content of /var/log/lxd.log (*)
  • The content of /etc/default/lxd-bridge (*)
  • A tarball of /var/log/lxd/<container name>/ (*)

If reporting a migration bug as opposed to a stateful snapshot or stateful stop bug, please include the data for both the source and target for any of the above which has been marked with a (*).
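
Gathering all of this by hand is tedious, so here is a sketch of a script that collects the items from the list into one directory. The collect_report name, the output file names and the LXC variable are assumptions of this example; the commands and paths are the ones listed above. Steps that fail (for example a missing log file) are skipped rather than aborting the run.

```shell
#!/bin/sh
# LXC can be overridden for testing (assumption of this sketch).
LXC=${LXC:-lxc}

# collect_report <container> <output directory>
collect_report() {
    name=$1
    out=$2
    mkdir -p "$out" || return 1

    # Host-side items, marked (*) in the list: for migration bugs,
    # run this on both the source and the target.
    "$LXC" info               > "$out/lxc-info.txt"       2>&1
    dmesg                     > "$out/dmesg.txt"          2>&1
    cat /proc/self/mountinfo  > "$out/host-mountinfo.txt" 2>&1
    uname -a                  > "$out/uname.txt"          2>&1
    cp /var/log/lxd.log "$out/" 2>/dev/null
    cp /etc/default/lxd-bridge "$out/" 2>/dev/null
    tar -czf "$out/lxd-logs.tar.gz" -C /var/log/lxd "$name" 2>/dev/null

    # Container-specific items.
    "$LXC" info "$name"                   > "$out/$name-info.txt"   2>&1
    "$LXC" config show --expanded "$name" > "$out/$name-config.txt" 2>&1
    "$LXC" exec "$name" -- cat /proc/self/mountinfo \
                                          > "$out/$name-mountinfo.txt" 2>&1

    echo "report data written to $out"
}

# Usage (as root, so that dmesg and the log files are readable):
#   collect_report c1 /tmp/c1-report
```

The command you ran and the error message it printed still have to be added to the bug report by hand.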

Extra information

The CRIU website can be found at: https://criu.org

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on GitHub at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

11 Responses to LXD 2.0: Live migration [9/12]

  3. Gábor Mészáros says:

    Hello,

    for some reason, I had to apply the migratable profile for this to succeed, but I couldn’t find any reference to it here.
    Am I missing something? (Both hosts run xenial, with criu installed.)

  4. Mirko Corosu says:

    Hi,

    In order to avoid the network transfer of the container root, I’m going to install a distributed filesystem on each LXD node. Is that supported?

  5. naresh says:

    Hey, after running the command lxc move host1 p1, I am getting the error “renaming of running container not allowed”.

  6. Steffen Müller says:

    Hi,

    can you give me a hint on how to do live migration with curl using the REST API? I know how to start it:
    curl -s -k --cert crt.pem --key key.pem -X POST -d '{"migration": true}' https://192.168.168.167:8443/1.0/containers/rf-infraserver | jq
    I get “Operation created” with status code 100, but I don’t understand what to do next to really start the migration. You wrote in rest-api.md:
    “The migration does not actually start until someone (i.e. another lxd instance) connects to all the websockets and begins negotiation with the source.”
    What does this mean?

  7. Oleg says:

    I installed a fresh Ubuntu 16.04, and stateful snapshots do not work at all.

    # lxc launch ubuntu: c1
    Creating c1
    Starting c1

    # lxc snapshot c1 first --stateful
    error: snapshot dump failed
    (00.007832) Error (action-scripts.c:60): One of more action scripts failed
    (00.007868) Error (cr-dump.c:1621): Pre dump script failed with 32512!

    Without “--stateful” everything works OK.

  8. NL says:

    root@dnxovh-hy001 (node1):~# lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description: Ubuntu 16.04.1 LTS
    Release: 16.04
    Codename: xenial

    root@dnxovh-hy001 (node1):~# dpkg -l | egrep "criu|lxc|lxd" | awk '{print $2,$3,$4}'
    criu 2.0-2ubuntu3 amd64
    liblxc1 2.0.5-0ubuntu1~ubuntu16.04.1~ppa1 amd64
    lxc-common 2.0.5-0ubuntu1~ubuntu16.04.1~ppa1 amd64
    lxcfs 2.0.4-0ubuntu1~ubuntu16.04.1~ppa1 amd64
    lxd 2.4.1-0ubuntu1~ubuntu16.04.1~ppa1 amd64
    lxd-client 2.4.1-0ubuntu1~ubuntu16.04.1~ppa1 amd64

    root@dnxovh-hy001 (node1):~# lxc info
    apiextensions:
    - storage_zfs_remove_snapshots
    - container_host_shutdown_timeout
    - container_syscall_filtering
    - auth_pki
    - container_last_used_at
    - etag
    - patch
    - usb_devices
    - https_allowed_credentials
    - image_compression_algorithm
    - directory_manipulation
    - container_cpu_time
    - storage_zfs_use_refquota
    - storage_lvm_mount_options
    - network
    - profile_usedby
    - container_push
    apistatus: stable
    apiversion: "1.0"
    addresses:
    - 10.0.0.1:8443
    architectures:
    - x86_64
    - i686
    driver: lxc
    driverversion: 2.0.5
    kernel: Linux
    kernelarchitecture: x86_64
    kernelversion: 4.4.0-43-generic
    server: lxd
    serverpid: 15124
    serverversion: 2.4.1
    storage: zfs
    storageversion: "5"
    config:
      core.https_address: 10.0.0.1:8443
      core.trust_password: true
      storage.zfs_pool_name: zdata/lxd
    public: false

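    root@dnxovh-hy001 (node1):~# lxc list
    +--------------+---------+-----------------------+------+------------+-----------+
    |     NAME     |  STATE  |         IPV4          | IPV6 |    TYPE    | SNAPSHOTS |
    +--------------+---------+-----------------------+------+------------+-----------+
    | inxovh-db001 | RUNNING | 192.168.10.201 (eth0) |      | PERSISTENT | 0         |
    +--------------+---------+-----------------------+------+------------+-----------+
    | inxovh-db002 | RUNNING | 192.168.10.202 (eth0) |      | PERSISTENT | 0         |
    +--------------+---------+-----------------------+------+------------+-----------+
    | inxovh-db003 | RUNNING | 192.168.10.203 (eth0) |      | PERSISTENT | 0         |
    +--------------+---------+-----------------------+------+------------+-----------+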

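    root@dnxovh-hy001 (node1):~# lxc remote list
    +-----------------+------------------------------------------+---------------+--------+--------+
    |      NAME       |                   URL                    |   PROTOCOL    | PUBLIC | STATIC |
    +-----------------+------------------------------------------+---------------+--------+--------+
    | images          | https://images.linuxcontainers.org       | simplestreams | YES    | NO     |
    +-----------------+------------------------------------------+---------------+--------+--------+
    | local (default) | unix://                                  | lxd           | NO     | YES    |
    +-----------------+------------------------------------------+---------------+--------+--------+
    | node2           | https://10.0.0.2:8443                    | lxd           | NO     | NO     |
    +-----------------+------------------------------------------+---------------+--------+--------+
    | ubuntu          | https://cloud-images.ubuntu.com/releases | simplestreams | YES    | YES    |
    +-----------------+------------------------------------------+---------------+--------+--------+
    | ubuntu-daily    | https://cloud-images.ubuntu.com/daily    | simplestreams | YES    | YES    |
    +-----------------+------------------------------------------+---------------+--------+--------+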

    root@dnxovh-hy001 (node1):~# lxc info inxovh-db001
    Name: inxovh-db001
    Remote: unix:/var/lib/lxd/unix.socket
    Architecture: x86_64
    Created: 2016/10/19 13:41 UTC
    Status: Running
    Type: persistent
    Profiles: default
    Pid: 3148
    Ips:
      lo: inet 127.0.0.1
      lo: inet6 ::1
      lxdbr0: inet6 fe80::e4a2:e5ff:fe66:7180
      lxdbr0: inet6 fe80::1
      eth0: inet 192.168.10.201 vethPU4Y94
      eth0: inet6 fe80::216:3eff:fe5a:fc58 vethPU4Y94
    Resources:
      Processes: 70
      Disk usage:
        root: 166.26MB
      CPU usage:
        CPU usage (in seconds): 78
      Memory usage:
        Memory (current): 144.85MB
        Memory (peak): 215.60MB
      Network usage:
        eth0:
          Bytes received: 27.83MB
          Bytes sent: 243.20MB
          Packets received: 240888
          Packets sent: 204223
        lo:
          Bytes received: 6.56kB
          Bytes sent: 6.56kB
          Packets received: 96
          Packets sent: 96
        lxdbr0:
          Bytes received: 0 bytes
          Bytes sent: 470 bytes
          Packets received: 0
          Packets sent: 5
    root@dnxovh-hy001 (node1):~#

    Live migration always fails with the same message:

    root@dnxovh-hy001 (node1):~# lxc move inxovh-db002 node2:inxovh-db002
    error: Error transferring container data: migration restore failed
    (01.045004) Warn (cr-restore.c:1159): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
    (01.063601) 1: Error (cr-restore.c:1489): mkdtemp failed crtools-proc.Qi1baH: Permission denied
    (01.074712) Error (cr-restore.c:1352): 22488 killed by signal 9
    (01.123156) Error (cr-restore.c:2182): Restoring FAILED.

    /tmp of course has the right permissions on both hosts:

    root@dnxovh-hy001 (node1):~# ls -ld /tmp
    drwxrwxrwt 16 root root 86016 Oct 19 21:42 /tmp
    root@dnxovh-hy001 (node1):~# ssh node2 !!
    ssh node2 ls -ld /tmp
    drwxrwxrwt 9 root root 4096 Oct 19 21:42 /tmp

    Any clue is welcome!

    Thanks
