# LXD 2.0: Remote hosts and container migration [6/12]

This is the sixth blog post in this series about LXD 2.0.

# Remote protocols

LXD 2.0 supports two protocols:

• LXD 1.0 API: That’s the REST API used between the clients and a LXD daemon as well as between LXD daemons when copying/moving images and containers.
• Simplestreams: The Simplestreams protocol is a read-only, image-only protocol used by both the LXD client and daemon to get image information and import images from some public image servers (like the Ubuntu images).

Everything below will be using the first of those two.

# Security

Authentication for the LXD API is done through client certificate authentication over TLS 1.2 using recent ciphers. When two LXD daemons must exchange information directly, a temporary token is generated by the source daemon and transferred through the client to the target daemon. This token may only be used to access a particular stream and is revoked immediately after use, so it cannot be reused.

To avoid man-in-the-middle (MITM) attacks, the client tool also sends the certificate of the source server to the target. That means that for a particular download operation, the target server is provided with the source server URL, a one-time access token for the resource it needs and the certificate that the server is supposed to be using. This prevents MITM attacks and only gives temporary access to the object of the transfer.
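
As a rough illustration of the certificate check described above, the sketch below prints the SHA-256 fingerprint of whatever certificate a server presents, in the same lowercase, colon-free form as the fingerprint shown by `lxc remote add`. It is not part of LXD: the function names `normalize_fingerprint` and `server_fingerprint` and the host `foo.example.net` are made up for this example, and it assumes the `openssl` command-line tool is installed.

```shell
#!/bin/sh
# Turn the output of `openssl x509 -noout -fingerprint -sha256`
# (e.g. "SHA256 Fingerprint=FD:B0:...") into lowercase hex without colons.
normalize_fingerprint() {
    printf '%s\n' "$1" | sed -e 's/^.*=//' | tr -d ':' | tr 'A-F' 'a-f'
}

# Fetch the certificate presented by HOST:PORT and print its fingerprint.
# Needs network access; hypothetical helper, not an LXD command.
server_fingerprint() {
    raw=$(openssl s_client -connect "$1" </dev/null 2>/dev/null \
        | openssl x509 -noout -fingerprint -sha256) || return 1
    normalize_fingerprint "$raw"
}

# Example with network access (not run here):
#   server_fingerprint "foo.example.net:8443"
normalize_fingerprint "SHA256 Fingerprint=FD:B0:6D:90"
```

Comparing that value with the fingerprint obtained out of band from the server's administrator is one way to decide whether to answer "y" when adding a remote.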

# Network requirements

LXD 2.0 uses a model where the target of an operation (the receiving end) connects directly to the source to fetch the data.

This means that you must ensure that the target server can connect to the source directly, updating any firewalls along the way as needed.

We have a plan to allow this to be reversed and also to allow proxying through the client itself for those rare cases where draconian firewalls are preventing any communication between the two hosts.

# Interacting with remote hosts

Rather than making users provide a hostname or IP address and validate certificate information every time they want to interact with a remote host, LXD uses the concept of “remotes”.

By default, the only real LXD remote configured is “local:” which also happens to be the default remote (so you don’t have to type its name). The local remote uses the LXD REST API to talk to the local daemon over a unix socket.

Say you have two machines with LXD installed, your local machine and a remote host that we’ll call “foo”.

First you need to make sure that “foo” is listening to the network and has a password set, so get a remote shell on it and run:

lxc config set core.https_address [::]:8443
lxc config set core.trust_password something-secure

Your local LXD just needs to be made visible on the network so that containers and images can be transferred from it:

lxc config set core.https_address [::]:8443

Now that the daemon configuration is done on both ends, you can add “foo” to your local client with:

lxc remote add foo 1.2.3.4

You’ll see something like this:

stgraber@dakara:~$ lxc remote add foo 2607:f2c0:f00f:2770:216:3eff:fee1:bd67
Certificate fingerprint: fdb06d909b77a5311d7437cabb6c203374462b907f3923cefc91dd5fce8d7b60
ok (y/n)? y
Admin password for foo:
Client certificate stored at server: foo

You can then list your remotes and you’ll see “foo” listed there:

stgraber@dakara:~$ lxc remote list
+-----------------+-------------------------------------------------------+---------------+--------+--------+
|      NAME       |                         URL                           |   PROTOCOL    | PUBLIC | STATIC |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| foo             | https://[2607:f2c0:f00f:2770:216:3eff:fee1:bd67]:8443 | lxd           | NO     | NO     |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| images          | https://images.linuxcontainers.org:8443               | lxd           | YES    | NO     |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| local (default) | unix://                                               | lxd           | NO     | YES    |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| ubuntu          | https://cloud-images.ubuntu.com/releases              | simplestreams | YES    | YES    |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
| ubuntu-daily    | https://cloud-images.ubuntu.com/daily                 | simplestreams | YES    | YES    |
+-----------------+-------------------------------------------------------+---------------+--------+--------+
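
The URL column above follows a simple pattern: the address given to `lxc remote add` gets an `https://` scheme and the port 8443, and IPv6 literals are wrapped in square brackets. The sketch below only mimics that pattern for illustration. The function name `remote_url` is invented, and it makes the simplifying assumption that any address containing a colon is a bare IPv6 literal; it does not claim to reproduce the client's actual parsing.

```shell
#!/bin/sh
# remote_url ADDRESS [PORT]
# Print the HTTPS URL for a remote address, defaulting to port 8443.
remote_url() {
    addr=$1
    port=${2:-8443}
    case $addr in
        *:*) printf 'https://[%s]:%s\n' "$addr" "$port" ;;  # treated as IPv6 literal
        *)   printf 'https://%s:%s\n' "$addr" "$port" ;;    # hostname or IPv4
    esac
}

remote_url 1.2.3.4
remote_url 2607:f2c0:f00f:2770:216:3eff:fee1:bd67
```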

## Interacting with it

Ok, so we have a remote server defined, what can we do with it now?

Well, just about everything you saw in the posts until now, the only difference being that you must tell LXD what host to run against.

For example:

lxc launch ubuntu:14.04 c1

Will run on the default remote (“lxc remote get-default”) which is your local host.

lxc launch ubuntu:14.04 foo:c1

Will instead run on “foo”.

Listing running containers on a remote host can be done with:

stgraber@dakara:~$ lxc list foo:
+------+---------+---------------------+-----------------------------------------------+------------+-----------+
| NAME |  STATE  |         IPV4        |                     IPV6                      |    TYPE    | SNAPSHOTS |
+------+---------+---------------------+-----------------------------------------------+------------+-----------+
| c1   | RUNNING | 10.245.81.95 (eth0) | 2607:f2c0:f00f:2770:216:3eff:fe43:7994 (eth0) | PERSISTENT | 0         |
+------+---------+---------------------+-----------------------------------------------+------------+-----------+

One thing to keep in mind is that you have to specify the remote host for both images and containers. So if you have a local image called “my-image” on “foo” and want to create a container called “c2” from it, you have to run:

lxc launch foo:my-image foo:c2

Finally, getting a shell into a remote container works just as you would expect:

lxc exec foo:c1 bash
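
The `remote:name` notation used throughout this section can be taken apart with plain shell parameter expansion, which may be handy in wrapper scripts. This is only a sketch: `remote_of` and `name_of` are hypothetical helpers, and the fallback to “local” assumes the default remote has not been changed (the real default is whatever “lxc remote get-default” reports).

```shell
#!/bin/sh
# Print the remote part of a "remote:name" reference.
# Falls back to "local" when no remote is given (assumed default).
remote_of() {
    case $1 in
        *:*) printf '%s\n' "${1%%:*}" ;;
        *)   printf 'local\n' ;;
    esac
}

# Print the container, snapshot or image part of a reference.
name_of() {
    printf '%s\n' "${1#*:}"
}

remote_of foo:c1
name_of foo:c1
remote_of c1
name_of foo:c1/current
```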

## Copying containers

Copying containers between hosts is as easy as it sounds:

lxc copy foo:c1 c2

And you’ll have a new local container called “c2” created from a copy of the remote “c1” container. This requires “c1” to be stopped first, but you could just copy a snapshot instead and do it while the source container is running:

lxc snapshot foo:c1 current
lxc copy foo:c1/current c3
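
The snapshot-then-copy pattern above can be wrapped in a small helper. The sketch below is a dry run by default: it only prints the `lxc` commands it would execute, and runs them for real only when `DRY_RUN=0` is set. The names `run` and `copy_running` are invented for this example.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# copy_running SOURCE TARGET [SNAPSHOT]
# Snapshot a (possibly running) container, then copy that snapshot.
copy_running() {
    src=$1
    dst=$2
    snap=${3:-current}
    run lxc snapshot "$src" "$snap" &&
    run lxc copy "$src/$snap" "$dst"
}

copy_running foo:c1 c3
```

The snapshot stays on the source afterwards, so repeated runs need a different snapshot name each time or a cleanup step.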

## Moving containers

Unless you’re doing live migration (which will be covered in a later post), you have to stop the source container prior to moving it, after which everything works as you’d expect.

lxc stop foo:c1
lxc move foo:c1 local:

This example is functionally identical to:

lxc stop foo:c1
lxc move foo:c1 c1
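
A move is often followed by starting the container at its new location, so the three steps can be chained in a helper. Starting the container afterwards is an addition made for this sketch, not something `lxc move` does by itself. The helper names `run` and `move_container` are made up, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# move_container SOURCE TARGET
# Stop the source, move it, then start it at the destination.
# TARGET must include the container name (e.g. "local:c1", not "local:"),
# because it is also passed to "lxc start".
move_container() {
    src=$1
    dst=$2
    run lxc stop "$src" &&
    run lxc move "$src" "$dst" &&
    run lxc start "$dst"
}

move_container foo:c1 local:c1
```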

# How this all works

Interactions with remote containers work as you would expect: rather than using the REST API over a local Unix socket, LXD just uses the exact same API over a remote HTTPS transport.

Where it gets a bit trickier is when interaction between two daemons must occur, as is the case for copy and move.

In those cases the following happens:

1. The user runs “lxc move foo:c1 c1”.
2. The client contacts the local: remote to check for an existing “c1” container.
3. The client fetches container information from “foo”.
4. The client requests a migration token from the source “foo” daemon.
5. The client sends that migration token as well as the source URL and “foo”’s certificate to the local LXD daemon alongside the container configuration and devices.
6. The local LXD daemon then connects directly to “foo” using the provided token:
   1. It connects to a first control websocket.
   2. It negotiates the filesystem transfer protocol (zfs send/receive, btrfs send/receive or plain rsync).
   3. If available locally, it unpacks the image which was used to create the source container. This is to avoid needless data transfer.
   4. It then transfers the container and any of its snapshots as a delta.
7. If successful, the client then instructs “foo” to delete the source container.
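
The request the client sends to the local daemon in step 5 can be sketched roughly as follows. The field names are recalled from the LXD REST API documentation for `POST /1.0/containers` with a `migration` source and should be checked against it before relying on them. The helper `migration_request` and every value passed to it are placeholders invented for this illustration, the certificate is assumed to be already escaped for JSON, and the container configuration and devices sent alongside are left out.

```shell
#!/bin/sh
# migration_request NAME OPERATION_URL CERTIFICATE CONTROL_SECRET FS_SECRET
# Print a JSON body asking the target daemon to pull a container from the
# source operation. Field names are an assumption; see the note above.
migration_request() {
    cat <<EOF
{
  "name": "$1",
  "source": {
    "type": "migration",
    "mode": "pull",
    "operation": "$2",
    "certificate": "$3",
    "secrets": {"control": "$4", "fs": "$5"}
  }
}
EOF
}

# Placeholder values only; real secrets come from the source daemon.
migration_request c1 \
    "https://foo.example.net:8443/1.0/operations/OPERATION-UUID" \
    "PEM-CERTIFICATE" "CONTROL-SECRET" "FS-SECRET"
```

The point of the sketch is the shape of the exchange: the target receives where to connect (the operation URL), what proves it may connect (the secrets) and which certificate to expect from the source.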

# Try all this online

Don’t have two machines to try remote interactions and moving/copying containers?

That’s okay, you can test it all online using our demo service.
The included step-by-step walkthrough even covers it!

# Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on GitHub at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net


### 27 Responses to LXD 2.0: Remote hosts and container migration [6/12]

1. Mahesh says:

Once again, fantastic post Stephane. Very well explained. I think you missed the container configuration part in point number 5. While sending the certificate, token and URL, it also sends the container’s config. Correct me if I am wrong.

Mahesh

• You’re right, the container configuration is sent at that time.

Updated the post.

2. Jim says:

Having a remote LXD daemon sounds fantastic especially for backup and “HA” purposes.

However I can imagine scenarios where that remote daemon is not on the same LAN and where you don’t want to open up 8443 for the world.

Would this work with an SSH tunnel? and if so do you have an example of that?

• I don’t have an example but I’d certainly expect things to work with the usual -R and -L ssh options, redirecting the remote 8443 port to some random local port, then just “lxc remote add some-server localhost:“.

I believe recent ssh even lets you forward a unix socket, in which case you could just redirect the remote /var/lib/lxd/unix.socket path to say /tmp/some-server.socket and then add it with “lxc remote add some-server unix:///tmp/some-server.socket”

The obvious problem with this is that there won’t be an accessible path from that remote server and your local LXD, so you should be able to copy images and containers FROM that remote server, but not copy images and containers TO it.

3. Jonny says:

Fantastic post, thanks. Just a few questions-

Is there an easy way to route traffic between containers on two different hosts? To achieve this should I setup two different subnets? I can’t seem to get either working easily.

Thanks!

• Jonny says:

I have, at least temporarily, resolved this by adding the following to the host /etc/network/interfaces files

Host1:

Host2:

Would love to know if there’s a more conventional way of doing this or if not, if there’s room for this to be implemented properly.

Cheers!

4. Cristian says:

I have 2 hosts in the same network, firewall disabled on both of them, running LXD 2.0 on both.
On the first host I’ve created an image trying to export it on the 2nd host. I did all setup following your documentation, spent 5h trying to get pass this error:
root@TrustTharTemplate:~# lxc remote add name 10.205.197.31
error: Get https://10.205.197.31:8443: Forbidden

Any ideas will be much appreciated.
Thank you,
Cristian

5. Ram says:

Hi Stephane,

I was curious to know what the implications would be of using LXD to build a highly available ZFS based storage system. Has this been tried and if so, would you be able to point us towards how this can be done?

6. Ram says:

Sorry, as an addition to the previous post, I meant to add – both in a shared storage environment – two hosts sharing a set of drives, and in a non-shared storage environment.

7. NL says:

Hello

Live migration doesn’t work on my side with a straight forward installation on Ubuntu 16 LTS.

There is always same message :

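root@dnxovh-hy001 (node1):~# lxc remote list
+-----------------+------------------------------------------+---------------+--------+--------+
|      NAME       |                   URL                    |   PROTOCOL    | PUBLIC | STATIC |
+-----------------+------------------------------------------+---------------+--------+--------+
| images          | https://images.linuxcontainers.org       | simplestreams | YES    | NO     |
+-----------------+------------------------------------------+---------------+--------+--------+
| local (default) | unix://                                  | lxd           | NO     | YES    |
+-----------------+------------------------------------------+---------------+--------+--------+
| node2           | https://10.0.0.2:8443                    | lxd           | NO     | NO     |
+-----------------+------------------------------------------+---------------+--------+--------+
| ubuntu          | https://cloud-images.ubuntu.com/releases | simplestreams | YES    | YES    |
+-----------------+------------------------------------------+---------------+--------+--------+
| ubuntu-daily    | https://cloud-images.ubuntu.com/daily    | simplestreams | YES    | YES    |
+-----------------+------------------------------------------+---------------+--------+--------+
root@dnxovh-hy001 (node1):~# lxc list
+--------------+---------+-----------------------+------+------------+-----------+
|     NAME     |  STATE  |         IPV4          | IPV6 |    TYPE    | SNAPSHOTS |
+--------------+---------+-----------------------+------+------------+-----------+
| inxovh-db001 | RUNNING | 192.168.10.201 (eth0) |      | PERSISTENT | 0         |
+--------------+---------+-----------------------+------+------------+-----------+
| inxovh-db002 | RUNNING | 192.168.10.202 (eth0) |      | PERSISTENT | 0         |
+--------------+---------+-----------------------+------+------------+-----------+
| inxovh-db003 | RUNNING | 192.168.10.203 (eth0) |      | PERSISTENT | 0         |
+--------------+---------+-----------------------+------+------------+-----------+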
root@dnxovh-hy001 (node1):~# lxc move inxovh-db002 node2:inxovh-db002
error: Error transferring container data: migration restore failed
(00.017790) Warn (cr-restore.c:1159): Set CLONE_PARENT | CLONE_NEWPID but it might cause restore problem,because not all kernels support such clone flags combinations!
(00.021291) 1: Error (cr-restore.c:1489): mkdtemp failed crtools-proc.Obatld: Permission denied
(00.038977) Error (cr-restore.c:1352): 18657 killed by signal 9
(00.087416) Error (cr-restore.c:2182): Restoring FAILED.
root@dnxovh-hy001 (node1):~#

I even installed the PPA repo to have latest version and SAME error.

root@dnxovh-hy001 (node1):~# lxc --version
2.4.1

root@dnxovh-hy001 (node1):~# dpkg -l |egrep "lxc|lxd|criu"
ii criu 2.0-2ubuntu3 amd64 checkpoint and restore in userspace
ii liblxc1 2.0.5-0ubuntu1~ubuntu16.04.1~ppa1 amd64 Linux Containers userspace tools (library)
ii lxc-common 2.0.5-0ubuntu1~ubuntu16.04.1~ppa1 amd64 Linux Containers userspace tools (common tools)
ii lxcfs 2.0.4-0ubuntu1~ubuntu16.04.1~ppa1 amd64 FUSE based filesystem for LXC
ii lxd 2.4.1-0ubuntu1~ubuntu16.04.1~ppa1 amd64 Container hypervisor based on LXC - daemon
ii lxd-client 2.4.1-0ubuntu1~ubuntu16.04.1~ppa1 amd64 Container hypervisor based on LXC - client

root@dnxovh-hy001 (node1):~# uname -a
Linux dnxovh-hy001 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Please, give us (i think i’m not the only one) some clue to perform live migration with LXD, it would be fantastic, especially for maintenance tasks (kernel upgrade) on our servers.

Thanks
Regards
NL

8. Harry says:

“Somewhat related” to the topic of Container Migration, I have a need to migrate the lxd data directory repeatedly between 2 hosts, say between my office and home workstations. The idea being, to always be able to continue my work between office and home workstations.

I’d posted my query on Super User, but got no responses at all!

Would you please confirm if the current LxD behavior is a bug or a by-design feature, and if a feature indeed, then suggest a workaround to achieve my desired use-case?

• Harry says:

I answered the question myself on Super User.

In summary: My /etc/subuid and /etc/subgid files were having different numeric content on the 2 hosts.

9. AceSlash says:

I created a snapshot on host1: lxc snapshot container1 20161229_1203, but when I try to copy it, I have the following error:
# lxc copy host1:”container1/20161229_1203″ local:
error: Invalid container name: ‘/’ is reserved for snapshots

any idea?

• Markqq says:

The quotation marks (“) are causing the problem because they lead the container_name/snapshot combination to be treated as one, i.e. as just a container name (with a ‘/’ inside, which is invalid). So leave the quotations out, as in Stéphane’s example in the post.

10. Claneys says:

Great,

Can we copy a container to another host, then run it while the first one is stopped? And once the work is done on the running container, sync it with the one that is stopped?
I imagine having a container that I can carry with me.

Thanks you, interesting post 🙂

11. andrea montanari says:

I have a problem when trying to copy a container from original host to a new host

note that I get this error only with this particular container (other copies are ok)
Error I get eventually = error: websocket: bad handshake

below the log

Any suggestion?

to http://unix.socket/1.0/containers

DBUG[03-21|19:41:04] 1.0/operations/bd5ecdd5-2c5c-4cf8-a3ea-1c9c9f5f413d/wait
DBUG[03-22|12:52:43] Raw response: {“type”:”sync”,”status”:”Success”,”status_code”:200,”metadata”:{“id”:”bd5ecdd5-2c5c-4cf8-a3ea-1c9c9f5f413d”,”class”:”task”,”created_at”:”2017-03-21T19:41:04.544043316+01:00″,”updated_at”:”2017-03-21T19:41:04.544043316+01:00″,”status”:”Failure”,”status_code”:400,”resources”:{“containers”:[“/1.0/containers/app1″]},”metadata”:null,”may_cancel”:false,”err”:”migration dump failed\n(00.330070) Warn (criu/autofs.c:77): Failed to find pipe_ino option (old kernel?)\n(01.009088) Warn (criu/arch/x86/crtools.c:133): Will restore 15087 with interrupted system call\n(01.212448) Warn (criu/arch/x86/crtools.c:133): Will restore 15091 with interrupted system call\n(01.414757) Warn (criu/arch/x86/crtools.c:133): Will restore 15099 with interrupted system call\n(01.415034) Warn (criu/arch/x86/crtools.c:133): Will restore 15142 with interrupted system call\n(01.802547) Warn (criu/arch/x86/crtools.c:133): Will restore 15112 with interrupted system call\n(01.803512) Warn (criu/arch/x86/crtools.c:133): Will restore 15137 with interrupted system call\n(02.378560) Warn (criu/arch/x86/crtools.c:133): Will restore 15174 with interrupted system call\n(02.550403) Warn (criu/arch/x86/crtools.c:133): Will restore 15182 with interrupted system call\n(02.550687) Warn (criu/arch/x86/crtools.c:133): Will restore 15186 with interrupted system call\n(02.561101) Warn (criu/arch/x86/crtools.c:133): Will restore 15359 with interrupted system call\n(06.460960) Warn (criu/arch/x86/crtools.c:133): Will restore 15591 with interrupted system call\n(06.462071) Warn (criu/arch/x86/crtools.c:133): Will restore 15599 with interrupted system call\n(06.462329) Warn (criu/arch/x86/crtools.c:133): Will restore 15600 with interrupted system call\n(06.462548) Warn (criu/arch/x86/crtools.c:133): Will restore 15601 with interrupted system call\n(06.462834) Warn (criu/arch/x86/crtools.c:133): Will restore 15602 with interrupted system call\n(06.464217) Warn (criu/arch/x86/crtools.c:133): 
Will restore 15608 with interrupted system call\n(06.464387) Warn (criu/arch/x86/crtools.c:133): Will restore 15609 with interrupted system call\n(06.464958) Warn (criu/arch/x86/crtools.c:133): Will restore 15617 with interrupted system call\n(06.466880) Warn (criu/arch/x86/crtools.c:133): Will restore 7637 with interrupted system call\n(06.468412) Warn (criu/arch/x86/crtools.c:133): Will restore 17371 with interrupted system call\n(06.468548) Warn (criu/arch/x86/crtools.c:133): Will restore 22469 with interrupted system call\n(06.468757) Warn (criu/arch/x86/crtools.c:133): Will restore 9264 with interrupted system call\n(06.469159) Warn (criu/arch/x86/crtools.c:133): Will restore 9271 with interrupted system call\n(06.470061) Warn (criu/arch/x86/crtools.c:133): Will restore 3886 with interrupted system call\n(06.497080) Error (criu/sk-inet.c:202): Name resolved on unconnected socket\n(06.497096) Error (criu/cr-dump.c:1313): Dump files (pid: 15384) failed with -1\n(06.506079) Error (criu/cr-dump.c:1628): Dumping FAILED.”}}

to http://unix.socket/1.0/containers

DBUG[03-22|12:52:47] 1.0/operations/0a2267d8-9acf-4cb9-b366-37a7484c0d50/wait

12. Andrea says:

Hello,
I have a new issue copying container between hosts.

origin host:

root@hi1:~# lxc --version
2.0.8

root@hi1:~# lxc list
.
.
+---------+---------+--------------------------------+------+------------+-----------+
| pol1    | RUNNING | xx.xx.xx.xx (eth0)             |      | PERSISTENT | 7         |
|         |         | 172.16.11.84 (eth1)            |      |            |           |
+---------+---------+--------------------------------+------+------------+-----------+
.

destination host:
root@po1:~# lxc --version
2.0.9
(note origin host has lxd 2.0.8 and cannot upgrade)

root@po1:~# lxc list
+------+-------+------+------+------+-----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS |
+------+-------+------+------+------+-----------+

(so no container at all)
root@po1:~# ls /var/lib/lxd/containers/
lxc-monitord.log

When I run the command
lxc copy hi1:pol1/pol1snap pol1 --verbose --debug

I obtain the following..

…..MISSING PART
rver":"lxd","server_pid":3310,"server_version":"2.0.9","storage":"zfs","storage_version":"5"}}}

DBUG[05-28|21:50:47] POST {"migration":true}
to https://hi1.xxxx.yy:8443/1.0/containers/pol1/snapshots/pol1snap

to http://unix.socket/1.0/containers

DBUG[05-28|21:50:49] 1.0/operations/e2c3a1d6-b0b8-43a5-af54-c00965fe7ed2/wait

DBUG[05-28|22:06:02] Raw response: {"error":"Container 'pol1' already exists","error_code":500,"type":"error"}

and the origin container cannot be copied.
Any suggestion?

Thank you very much

Andrea.

• Andrea says:

I have set up a little test:

container copy from both 2.0.8 lxd version was successful

container copy from 2.0.8 to 2.0.9 lxd version was not successful (container ‘xxx’ already exists error)

container copy from 2.0.9 to 2.0.9 lxd version was not successful (container ‘xxx’ already exists error)

Is it a known issue in the 2.0.9 version?

Wow
I just migrated about 30 containers from one host running containers on ext4 to another running ZFS. Love LXD/LXC!

My biggest issue was a container that has SSSD for domain-joining to Microsoft AD. The migration of that container looked like it hung because it was taking a lot of time, and checking the disk usage on the receiving host I found that it took waaaay more than what was used on the original host!!
The issue was this file /var/log/lastlog that was gigantic (but it only appears to be big) see more here: http://www.noah.org/wiki/Lastlog_is_gigantic
Removing the file and I could migrate the last container!!

14. Luis Rodriguez says:

I had an issue trying to move a 2.12 container (ubuntu 17.04) into a 2.09 container (ubuntu 16.04) it seems that the metadata information is different, something related to architecture.

Besides that, migration seems to work fine from 17.04 (2.12) to 17.04 (2.12); however, it takes a loooong time if you are trying to move a 1.7GB container from one server to another, which makes the move slow over the network. I have some overhead since I am moving from a host 17.04 to a KVM 17.04 which is using a thin qcow2 image as partition, and I can see how the size is increasing over time with the move (this will not be production, but helps me to have LXD on VMs to simulate a production-like environment).

However.

What would be your suggestion to have live migration with a shared storage. is it possible? E.g. I used to have VM’s on a shared storage (NFS, iSCSI), and perform a live migration from one server to another and since both servers where sharing the same storage, the migration only moved the VM’s metadata, not the disk images itself.

It seems to me that live migration, on the other hand, is copying the entire 1.7GB container from one server to another, and even if it was a smaller (or bigger) container it will take time.

Would it be possible to have a shared storage between two LXD hypervisors? and perform a real-time live migration (e.g. HA)?

I am really new to LXD and ZFS by the way; it looks great. However, I had this issue, plus another issue: if I reinstall the host server (trying to preserve the storage pool), I can’t add an existing storage pool to the server. I have to back up the ZFS pool, delete the pool, then perform the init and later on move the backup into the new storage pool. (Am I missing something?)

Thanks
Luis

15. Luis Rodriguez says:

One more thing, the container base image without any data (basic software installed) is 600M not the 1.7G, however, I am using this as a CI runner for gitlab. Question. Is it possible to mount a shared storage inside the container (I read somewhere that in LXC is not — yes some tricks mounting the share in the host and adding the device) in order to have stateless container? (e.g. the working directory for gitlab could be a shared storage that is not in the container file system) so that when I perform the live migration, even though I have to copy all the container data from one host to another, I don’t have to copy the “state” of the container?

(I am really new to LXD, so maybe there are options like having two base images in two hosts, maybe a night backup, and only copying the delta if a live migration is needed )

Thanks
Luis

16. Andrei Rodrigues says:

Thank you for the tutorial. Very well explained and useful.

Keep up the good blog.