LXD 2.0: Live migration [9/12]

Posted on 2016/04/25 by Stéphane Graber

This is the ninth blog post in this series about LXD 2.0.


Introduction

One of the very exciting features of LXD 2.0, albeit an experimental one, is support for container checkpoint and restore.

Simply put, checkpoint/restore means that the running container state can be serialized to disk and then restored, either on the same host as a stateful snapshot of the container, or on another host, which equates to live migration.

Requirements

To have access to container live migration and stateful snapshots, you need the following:

A very recent Linux kernel, 4.4 or higher.
CRIU 2.0, possibly with some cherry-picked commits depending on your exact kernel configuration.
Run LXD directly on the host. It’s not possible to use those features with container nesting.
For migration, the target machine must at least implement the instruction set of the source, the target kernel must at least offer the same syscalls as the source and any kernel filesystem which was mounted on the source must also be mountable on the target.

All the needed dependencies are provided by Ubuntu 16.04 LTS, so on that release all you need to do is install CRIU itself:

apt install criu
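
The first two requirements can be checked with a few lines of shell. This is only a sketch: the function names are made up for this example, it only looks at the kernel version and whether a criu binary is in the PATH, and it does not verify the CRIU version or the nesting and target-compatibility requirements.

```shell
#!/bin/sh
# Hypothetical pre-flight check; function names are illustrative only.

# Succeeds if a kernel release string (e.g. "4.4.0-43-generic") is at
# least MAJOR.MINOR.
kernel_at_least() {
    release=$1 want_major=$2 want_minor=$3
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
}

# Prints one line per requirement and fails if any is unmet.
check_requirements() {
    status=0
    if kernel_at_least "$(uname -r)" 4 4; then
        echo "kernel: $(uname -r) (ok)"
    else
        echo "kernel: $(uname -r) (older than 4.4)"
        status=1
    fi
    if command -v criu >/dev/null 2>&1; then
        echo "criu: found"
    else
        echo "criu: not found, try 'apt install criu'"
        status=1
    fi
    return "$status"
}
```

Running check_requirements on both the source and the target before attempting a migration gives a quick first sanity check.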

Using the thing

Stateful snapshots

A normal container snapshot looks like:

stgraber@dakara:~$ lxc snapshot c1 first
stgraber@dakara:~$ lxc info c1 | grep first
 first (taken at 2016/04/25 19:35 UTC) (stateless)

A stateful snapshot instead looks like:

stgraber@dakara:~$ lxc snapshot c1 second --stateful
stgraber@dakara:~$ lxc info c1 | grep second
 second (taken at 2016/04/25 19:36 UTC) (stateful)

This means that all the container runtime state was serialized to disk and included as part of the snapshot. Restoring one such snapshot is done as you would a stateless one:

stgraber@dakara:~$ lxc restore c1 second
stgraber@dakara:~$
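
If you want to check from a script whether a given snapshot carries runtime state, the "lxc info" lines shown above can be parsed. The helper below is a sketch that assumes the output format stays exactly as in the examples above; the function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: reads `lxc info <container>` output on stdin and
# prints "stateful" or "stateless" for the named snapshot.
# Exits non-zero if the snapshot is not listed.
snapshot_kind() {
    awk -v name="$1" '
        $1 == name && $2 == "(taken" {
            kind = $NF                 # last field, e.g. "(stateful)"
            gsub(/[()]/, "", kind)     # strip the parentheses
            print kind
            found = 1
            exit
        }
        END { if (!found) exit 1 }
    '
}
```

Typical use would be: lxc info c1 | snapshot_kind second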

Stateful stop/start

Say you want to reboot your server for a kernel update or similar maintenance. Rather than have to wait for all the containers to start from scratch after reboot, you can do:

stgraber@dakara:~$ lxc stop c1 --stateful

The container state will be written to disk and then picked up the next time you start it.

You can even look at what the state looks like:

root@dakara:~# tree /var/lib/lxd/containers/c1/rootfs/state/
/var/lib/lxd/containers/c1/rootfs/state/
├── cgroup.img
├── core-101.img
├── core-102.img
├── core-107.img
├── core-108.img
├── core-109.img
├── core-113.img
├── core-114.img
├── core-122.img
├── core-125.img
├── core-126.img
├── core-127.img
├── core-183.img
├── core-1.img
├── core-245.img
├── core-246.img
├── core-50.img
├── core-52.img
├── core-95.img
├── core-96.img
├── core-97.img
├── core-98.img
├── dump.log
├── eventfd.img
├── eventpoll.img
├── fdinfo-10.img
├── fdinfo-11.img
├── fdinfo-12.img
├── fdinfo-13.img
├── fdinfo-14.img
├── fdinfo-2.img
├── fdinfo-3.img
├── fdinfo-4.img
├── fdinfo-5.img
├── fdinfo-6.img
├── fdinfo-7.img
├── fdinfo-8.img
├── fdinfo-9.img
├── fifo-data.img
├── fifo.img
├── filelocks.img
├── fs-101.img
├── fs-113.img
├── fs-122.img
├── fs-183.img
├── fs-1.img
├── fs-245.img
├── fs-246.img
├── fs-50.img
├── fs-52.img
├── fs-95.img
├── fs-96.img
├── fs-97.img
├── fs-98.img
├── ids-101.img
├── ids-113.img
├── ids-122.img
├── ids-183.img
├── ids-1.img
├── ids-245.img
├── ids-246.img
├── ids-50.img
├── ids-52.img
├── ids-95.img
├── ids-96.img
├── ids-97.img
├── ids-98.img
├── ifaddr-9.img
├── inetsk.img
├── inotify.img
├── inventory.img
├── ip6tables-9.img
├── ipcns-var-10.img
├── iptables-9.img
├── mm-101.img
├── mm-113.img
├── mm-122.img
├── mm-183.img
├── mm-1.img
├── mm-245.img
├── mm-246.img
├── mm-50.img
├── mm-52.img
├── mm-95.img
├── mm-96.img
├── mm-97.img
├── mm-98.img
├── mountpoints-12.img
├── netdev-9.img
├── netlinksk.img
├── netns-9.img
├── netns-ct-9.img
├── netns-exp-9.img
├── packetsk.img
├── pagemap-101.img
├── pagemap-113.img
├── pagemap-122.img
├── pagemap-183.img
├── pagemap-1.img
├── pagemap-245.img
├── pagemap-246.img
├── pagemap-50.img
├── pagemap-52.img
├── pagemap-95.img
├── pagemap-96.img
├── pagemap-97.img
├── pagemap-98.img
├── pages-10.img
├── pages-11.img
├── pages-12.img
├── pages-13.img
├── pages-1.img
├── pages-2.img
├── pages-3.img
├── pages-4.img
├── pages-5.img
├── pages-6.img
├── pages-7.img
├── pages-8.img
├── pages-9.img
├── pipes-data.img
├── pipes.img
├── pstree.img
├── reg-files.img
├── remap-fpath.img
├── route6-9.img
├── route-9.img
├── rule-9.img
├── seccomp.img
├── sigacts-101.img
├── sigacts-113.img
├── sigacts-122.img
├── sigacts-183.img
├── sigacts-1.img
├── sigacts-245.img
├── sigacts-246.img
├── sigacts-50.img
├── sigacts-52.img
├── sigacts-95.img
├── sigacts-96.img
├── sigacts-97.img
├── sigacts-98.img
├── signalfd.img
├── stats-dump
├── timerfd.img
├── tmpfs-dev-104.tar.gz.img
├── tmpfs-dev-109.tar.gz.img
├── tmpfs-dev-110.tar.gz.img
├── tmpfs-dev-112.tar.gz.img
├── tmpfs-dev-114.tar.gz.img
├── tty.info
├── unixsk.img
├── userns-13.img
└── utsns-11.img

0 directories, 154 files

Restoring the container can be done with a simple:

stgraber@dakara:~$ lxc start c1
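
For the reboot scenario, the same two commands can be wrapped in a loop over several containers. The sketch below takes the container names as arguments; the function names and the LXC variable (which lets you do a dry run with LXC=echo) are inventions of this example, not LXD features.

```shell
#!/bin/sh
# Hypothetical wrappers around `lxc stop --stateful` and `lxc start`.
# LXC can be overridden for a dry run, e.g. LXC=echo.
LXC=${LXC:-lxc}

# Before the reboot: checkpoint each named container to disk.
stateful_stop_all() {
    failed=0
    for name in "$@"; do
        if ! $LXC stop "$name" --stateful; then
            echo "stateful stop failed for $name, leaving it alone" >&2
            failed=1
        fi
    done
    return "$failed"
}

# After the reboot: start each container, picking its saved state back up.
start_all() {
    failed=0
    for name in "$@"; do
        $LXC start "$name" || failed=1
    done
    return "$failed"
}
```

For example, "stateful_stop_all c1 c2" before the reboot and "start_all c1 c2" once the host is back.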

Live migration

Live migration is basically the same as the stateful stop/start above, except that the container directory and configuration are also moved to another machine.

stgraber@dakara:~$ lxc list c1
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| NAME |  STATE  |          IPV4         |                     IPV6                     |    TYPE    | SNAPSHOTS |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.178.150.197 (eth0) | 2001:470:b368:4242:216:3eff:fe19:27b0 (eth0) | PERSISTENT | 2         |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+

stgraber@dakara:~$ lxc list s-tollana:
+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

stgraber@dakara:~$ lxc move c1 s-tollana:

stgraber@dakara:~$ lxc list c1
+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

stgraber@dakara:~$ lxc list s-tollana:
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| NAME |  STATE  |          IPV4         |                     IPV6                     |    TYPE    | SNAPSHOTS |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.178.150.197 (eth0) | 2001:470:b368:4242:216:3eff:fe19:27b0 (eth0) | PERSISTENT | 2         |
+------+---------+-----------------------+----------------------------------------------+------------+-----------+

Limitations

As I said before, checkpoint/restore of containers is still pretty new and we’re still very much working on this feature, fixing issues as we are made aware of them. We do need more people trying this feature and sending us feedback. I would, however, not recommend using it in production just yet.

The current list of issues we’re tracking is available on Launchpad.

We expect a basic Ubuntu container with a few services to work properly with CRIU in Ubuntu 16.04. However, more complex containers using device passthrough, complex network services or special storage configurations are likely to fail.

Whenever possible, CRIU will fail at dump time, rather than at restore time. In such cases, the source container will keep running, the snapshot or migration will simply fail and a log file will be generated for debugging.

In rare cases, CRIU fails to restore the container, in which case the source container will still be around but will be stopped and will have to be manually restarted.
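
The two failure modes above can be told apart from a script by looking at the source container after a failed move. What follows is a sketch, not a recommended procedure: the function names are made up, it assumes "lxc info" prints a "Status:" line with the value "Running" for a running container, and it treats any other status as "needs a manual restart".

```shell
#!/bin/sh
# Hypothetical helpers; names and the recovery policy are assumptions.

# Prints the value of the "Status:" line from `lxc info <container>`.
container_status() {
    lxc info "$1" | awk '$1 == "Status:" { print $2; exit }'
}

# Tries a live migration; on failure, reports what happened to the source.
migrate_or_recover() {
    name=$1 target=$2
    if lxc move "$name" "$target"; then
        echo "moved"
        return 0
    fi
    if [ "$(container_status "$name")" = "Running" ]; then
        # Dump-time failure: the source was never stopped.
        echo "still-running"
    else
        # Restore-time failure: the source is stopped, start it again.
        lxc start "$name" && echo "restarted"
    fi
    return 1
}
```

For example: migrate_or_recover c1 s-tollana: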

Sending bug reports

We’re tracking bugs related to checkpoint/restore against the CRIU Ubuntu package on Launchpad. Most of the work to fix those bugs will then happen upstream either on CRIU itself or the Linux kernel, but it’s easier for us to track things this way.

To file a new bug report, head here.

Please make sure to include:

The command you ran and the error message as displayed to you
Output of “lxc info” (*)
Output of “lxc info <container name>”
Output of “lxc config show --expanded <container name>”
Output of “dmesg” (*)
Output of “/proc/self/mountinfo” (*)
Output of “lxc exec <container name> -- cat /proc/self/mountinfo”
Output of “uname -a” (*)
The content of /var/log/lxd.log (*)
The content of /etc/default/lxd-bridge (*)
A tarball of /var/log/lxd/<container name>/ (*)

If reporting a migration bug as opposed to a stateful snapshot or stateful stop bug, please include the data for both the source and target for any of the above which has been marked with a (*).
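
Gathering all of that by hand gets tedious, so here is a sketch of a script that collects the items from the list into one directory. The function names and output file names are made up for this example; it simply runs the commands listed above and notes any that fail rather than stopping. The command you ran and its error message still have to be added by hand.

```shell
#!/bin/sh
# Sketch only: function names and output file names are made up here.

# Runs a command, saving stdout and stderr to FILE; a failure is noted in
# the file instead of aborting the whole collection.
capture() {
    file=$1; shift
    "$@" > "$file" 2>&1 || echo "command failed: $*" >> "$file"
}

collect_report() {
    name=$1 out=$2
    mkdir -p "$out" || return 1

    # Host-level items, marked (*) above: gather on source and target.
    capture "$out/lxc-info.txt"        lxc info
    capture "$out/dmesg.txt"           dmesg
    capture "$out/host-mountinfo.txt"  cat /proc/self/mountinfo
    capture "$out/uname.txt"           uname -a
    cp /var/log/lxd.log /etc/default/lxd-bridge "$out/" 2>/dev/null || true
    tar -czf "$out/lxd-logs.tar.gz" -C /var/log/lxd "$name" 2>/dev/null || true

    # Container-level items.
    capture "$out/container-info.txt"      lxc info "$name"
    capture "$out/container-config.txt"    lxc config show --expanded "$name"
    capture "$out/container-mountinfo.txt" lxc exec "$name" -- cat /proc/self/mountinfo
}
```

For example, "collect_report c1 ./criu-bug" on the source, and the same again on the target for a migration bug.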

Extra information

The CRIU website can be found at: https://criu.org

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

About Stéphane Graber

Project leader of Linux Containers, Linux hacker, Ubuntu core developer, conference organizer and speaker.


13 Responses to LXD 2.0: Live migration [9/12]

Gábor Mészáros says:

2016/05/13 at 4:37 AM

Hello,

for some reason, I had to apply the migratable profile for migration to succeed, but I couldn’t find any reference to it here.
Am I missing something? (xenial both hosts, criu installed)

1. Stéphane Graber says:
  
  2016/05/16 at 3:41 PM
  
  We deprecated the migratable profile quite a while ago; it’s not present on new installs at all. So it sounds like you may have some old LXD config around that is interfering with your containers.
  
  1. Gábor Mészáros says:
    
    2016/05/17 at 3:09 AM
    
    I know, I added that as per insights [1].
    Strangely enough this only happens to one upgraded system (from 15.04 -> 15.10 -> 16.04), others are working fine.
    Same package versions, latest everything.
    
    [1]: https://insights.ubuntu.com/2015/05/06/live-migration-in-lxd/
    
Mirko Corosu says:

2016/05/20 at 5:03 AM

Hi,

In order to avoid the network transfer of the container root, I’m going to install a distributed filesystem on each LXD node. Is that supported?

1. Dejan Lekic says:
  
  2017/04/06 at 11:38 AM
  
  I would like to know the answer to this question as well… 🙂
  
naresh says:

2016/07/05 at 9:58 AM

Hey, after running the command “lxc move host1 p1”, I am getting the error “renaming of running container not allowed”.

Steffen Müller says:

2016/07/07 at 12:05 PM

Hi,

can you give me a hint how to do live migration with curl using the REST API? I know how to start it:
curl -s -k --cert crt.pem --key key.pem -X POST -d '{"migration": true}' https://192.168.168.167:8443/1.0/containers/rf-infraserver | jq
I get “Operation created” status code 100 but I don’t understand what to do next to really start the migration. You wrote in rest-api.md
“The migration does not actually start until someone (i.e. another lxd instance) connects to all the websockets and begins negotiation with the source.”
What does this mean?

Oleg says:

2016/09/21 at 8:26 AM

I installed fresh ubuntu 16.04, and stateful snapshots do not work at all.

# lxc launch ubuntu: c1
Creating c1
Starting c1

# lxc snapshot c1 first --stateful
error: snapshot dump failed
(00.007832) Error (action-scripts.c:60): One of more action scripts failed
(00.007868) Error (cr-dump.c:1621): Pre dump script failed with 32512!

Without “--stateful” everything works OK.

NL says:

2016/10/19 at 3:42 PM

root@dnxovh-hy001 (node1):~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.1 LTS
Release: 16.04
Codename: xenial

root@dnxovh-hy001 (node1):~# dpkg -l | egrep "criu|lxc|lxd" | awk '{print $2,$3,$4}'
criu 2.0-2ubuntu3 amd64
liblxc1 2.0.5-0ubuntu1~ubuntu16.04.1~ppa1 amd64
lxc-common 2.0.5-0ubuntu1~ubuntu16.04.1~ppa1 amd64
lxcfs 2.0.4-0ubuntu1~ubuntu16.04.1~ppa1 amd64
lxd 2.4.1-0ubuntu1~ubuntu16.04.1~ppa1 amd64
lxd-client 2.4.1-0ubuntu1~ubuntu16.04.1~ppa1 amd64

root@dnxovh-hy001 (node1):~# lxc info
apiextensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
apistatus: stable
apiversion: "1.0"
addresses:
- 10.0.0.1:8443
architectures:
- x86_64
- i686
driver: lxc
driverversion: 2.0.5
kernel: Linux
kernelarchitecture: x86_64
kernelversion: 4.4.0-43-generic
server: lxd
serverpid: 15124
serverversion: 2.4.1
storage: zfs
storageversion: "5"
config:
  core.https_address: 10.0.0.1:8443
  core.trust_password: true
  storage.zfs_pool_name: zdata/lxd
public: false

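root@dnxovh-hy001 (node1):~# lxc list
+--------------+---------+-----------------------+------+------------+-----------+
|     NAME     |  STATE  |          IPV4         | IPV6 |    TYPE    | SNAPSHOTS |
+--------------+---------+-----------------------+------+------------+-----------+
| inxovh-db001 | RUNNING | 192.168.10.201 (eth0) |      | PERSISTENT | 0         |
+--------------+---------+-----------------------+------+------------+-----------+
| inxovh-db002 | RUNNING | 192.168.10.202 (eth0) |      | PERSISTENT | 0         |
+--------------+---------+-----------------------+------+------------+-----------+
| inxovh-db003 | RUNNING | 192.168.10.203 (eth0) |      | PERSISTENT | 0         |
+--------------+---------+-----------------------+------+------------+-----------+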

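root@dnxovh-hy001 (node1):~# lxc remote list
+-----------------+------------------------------------------+---------------+--------+--------+
|      NAME       |                   URL                    |   PROTOCOL    | PUBLIC | STATIC |
+-----------------+------------------------------------------+---------------+--------+--------+
| images          | https://images.linuxcontainers.org       | simplestreams | YES    | NO     |
+-----------------+------------------------------------------+---------------+--------+--------+
| local (default) | unix://                                  | lxd           | NO     | YES    |
+-----------------+------------------------------------------+---------------+--------+--------+
| node2           | https://10.0.0.2:8443                    | lxd           | NO     | NO     |
+-----------------+------------------------------------------+---------------+--------+--------+
| ubuntu          | https://cloud-images.ubuntu.com/releases | simplestreams | YES    | YES    |
+-----------------+------------------------------------------+---------------+--------+--------+
| ubuntu-daily    | https://cloud-images.ubuntu.com/daily    | simplestreams | YES    | YES    |
+-----------------+------------------------------------------+---------------+--------+--------+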

root@dnxovh-hy001 (node1):~# lxc info inxovh-db001
Name: inxovh-db001
Remote: unix:/var/lib/lxd/unix.socket
Architecture: x86_64
Created: 2016/10/19 13:41 UTC
Status: Running
Type: persistent
Profiles: default
Pid: 3148
Ips:
  lo: inet 127.0.0.1
  lo: inet6 ::1
  lxdbr0: inet6 fe80::e4a2:e5ff:fe66:7180
  lxdbr0: inet6 fe80::1
  eth0: inet 192.168.10.201 vethPU4Y94
  eth0: inet6 fe80::216:3eff:fe5a:fc58 vethPU4Y94
Resources:
  Processes: 70
  Disk usage:
    root: 166.26MB
  CPU usage:
    CPU usage (in seconds): 78
  Memory usage:
    Memory (current): 144.85MB
    Memory (peak): 215.60MB
  Network usage:
    eth0:
      Bytes received: 27.83MB
      Bytes sent: 243.20MB
      Packets received: 240888
      Packets sent: 204223
    lo:
      Bytes received: 6.56kB
      Bytes sent: 6.56kB
      Packets received: 96
      Packets sent: 96
    lxdbr0:
      Bytes received: 0 bytes
      Bytes sent: 470 bytes
      Packets received: 0
      Packets sent: 5
root@dnxovh-hy001 (node1):~#

Live migration always fails with the same message:

root@dnxovh-hy001 (node1):~# lxc move inxovh-db002 node2:inxovh-db002
error: Error transferring container data: migration restore failed
(01.045004) Warn (cr-restore.c:1159): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
(01.063601) 1: Error (cr-restore.c:1489): mkdtemp failed crtools-proc.Qi1baH: Permission denied
(01.074712) Error (cr-restore.c:1352): 22488 killed by signal 9
(01.123156) Error (cr-restore.c:2182): Restoring FAILED.

/tmp of course has the right permissions on both nodes:

root@dnxovh-hy001 (node1):~# ls -ld /tmp
drwxrwxrwt 16 root root 86016 Oct 19 21:42 /tmp
root@dnxovh-hy001 (node1):~# ssh node2 !!
ssh node2 ls -ld /tmp
drwxrwxrwt 9 root root 4096 Oct 19 21:42 /tmp

Any clue is welcome!

Thanks
