This is the seventh blog post in this series about LXD 2.0.
Why run Docker inside LXD
As I briefly covered in the first post of this series, LXD’s focus is system containers. That is, we run a full unmodified Linux distribution inside our containers. LXD for all intents and purposes doesn’t care about the workload running in the container. It just sets up the container namespaces and security policies, then spawns /sbin/init and waits for the container to stop.
Application containers such as those implemented by Docker or Rkt are pretty different in that they are used to distribute applications, will typically run a single main process inside them and be much more ephemeral than a LXD container.
Those two container types aren’t mutually exclusive and we certainly see the value of using Docker containers to distribute applications. That’s why we’ve been working hard over the past year to make it possible to run Docker inside LXD.
This means that with Ubuntu 16.04 and LXD 2.0, you can create containers for your users who will then be able to connect into them just like a normal Ubuntu system and then run Docker to install the services and applications they want.
Requirements
There are a lot of moving pieces needed to make all of this work, and we have included them all in Ubuntu 16.04:
- A kernel with CGroup namespace support (4.4 Ubuntu or 4.6 mainline)
- LXD 2.0 using LXC 2.0 and LXCFS 2.0
- A custom version of Docker (or one built with all the patches that we submitted)
- A Docker image which behaves when confined by user namespaces, or alternatively make the parent LXD container a privileged container (security.privileged=true)
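The kernel requirement above can be checked from a shell before going any further. What follows is only a minimal sketch: the helper name kernel_at_least is made up for this illustration, and a version number alone does not prove that a distribution kernel carries the CGroup namespace patches, so the sketch also looks for /proc/self/ns/cgroup, which the kernel exposes when that namespace type is available.

```shell
# Hypothetical helper: succeed if a kernel release string such as
# "4.4.0-21-generic" is at least the given major and minor version.
kernel_at_least() {
    kal_release=$1
    kal_want_major=$2
    kal_want_minor=$3
    kal_major=${kal_release%%.*}      # text before the first dot
    kal_rest=${kal_release#*.}        # text after the first dot
    kal_minor=${kal_rest%%[!0-9]*}    # leading digits of the remainder
    [ "$kal_major" -gt "$kal_want_major" ] ||
        { [ "$kal_major" -eq "$kal_want_major" ] &&
          [ "$kal_minor" -ge "$kal_want_minor" ]; }
}

if kernel_at_least "$(uname -r)" 4 4; then
    echo "kernel version is 4.4 or newer"
else
    echo "kernel is older than 4.4"
fi

# The version alone is not conclusive, so also check the namespace file.
if [ -e /proc/self/ns/cgroup ]; then
    echo "cgroup namespace is available"
else
    echo "cgroup namespace is not available"
fi
```

On a mainline kernel you would pass 4 6 instead of 4 4, matching the list above.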
Running a basic Docker workload
Enough talking, let’s run some Docker containers!
First of all, you need an Ubuntu 16.04 container which you can get with:
lxc launch ubuntu-daily:16.04 docker -c security.nesting=true
Now let’s make sure the container is up to date and install Docker:
lxc exec docker -- apt update
lxc exec docker -- apt dist-upgrade -y
lxc exec docker -- apt install docker.io -y
And that’s it! You’ve got Docker installed and running in your container.
Now let’s start a basic web service made of two Docker containers:
stgraber@dakara:~$ lxc exec docker -- docker run --detach --name app carinamarina/hello-world-app
Unable to find image 'carinamarina/hello-world-app:latest' locally
latest: Pulling from carinamarina/hello-world-app
efd26ecc9548: Pull complete
a3ed95caeb02: Pull complete
d1784d73276e: Pull complete
72e581645fc3: Pull complete
9709ddcc4d24: Pull complete
2d600f0ec235: Pull complete
c4cf94f61cbd: Pull complete
c40f2ab60404: Pull complete
e87185df6de7: Pull complete
62a11c66eb65: Pull complete
4c5eea9f676d: Pull complete
498df6a0d074: Pull complete
Digest: sha256:6a159db50cb9c0fbe127fb038ed5a33bb5a443fcdd925ec74bf578142718f516
Status: Downloaded newer image for carinamarina/hello-world-app:latest
c8318f0401fb1e119e6c5bb23d1e706e8ca080f8e44b42613856ccd0bf8bfb0d
stgraber@dakara:~$ lxc exec docker -- docker run --detach --name web --link app:helloapp -p 80:5000 carinamarina/hello-world-web
Unable to find image 'carinamarina/hello-world-web:latest' locally
latest: Pulling from carinamarina/hello-world-web
efd26ecc9548: Already exists
a3ed95caeb02: Already exists
d1784d73276e: Already exists
72e581645fc3: Already exists
9709ddcc4d24: Already exists
2d600f0ec235: Already exists
c4cf94f61cbd: Already exists
c40f2ab60404: Already exists
e87185df6de7: Already exists
f2d249ff479b: Pull complete
97cb83fe7a9a: Pull complete
d7ce7c58a919: Pull complete
Digest: sha256:c31cf04b1ab6a0dac40d0c5e3e64864f4f2e0527a8ba602971dab5a977a74f20
Status: Downloaded newer image for carinamarina/hello-world-web:latest
d7b8963401482337329faf487d5274465536eebe76f5b33c89622b92477a670f
With those two Docker containers now running, we can then get the IP address of our LXD container and access the service!
stgraber@dakara:~$ lxc list
+--------+---------+----------------------+----------------------------------------------+------------+-----------+
|  NAME  |  STATE  |         IPV4         |                     IPV6                     |    TYPE    | SNAPSHOTS |
+--------+---------+----------------------+----------------------------------------------+------------+-----------+
| docker | RUNNING | 172.17.0.1 (docker0) | 2001:470:b368:4242:216:3eff:fe55:45f4 (eth0) | PERSISTENT | 0         |
|        |         | 10.178.150.73 (eth0) |                                              |            |           |
+--------+---------+----------------------+----------------------------------------------+------------+-----------+
stgraber@dakara:~$ curl http://10.178.150.73
The linked container said... "Hello World!"
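If you want to script that last step instead of copying the address by hand, the eth0 address can be pulled out of the table text. This is a rough sketch that assumes the table layout printed by lxc list as shown in this post; the function name eth0_ipv4 is invented for the example, and the sample row is fed in by hand so the snippet runs without LXD installed.

```shell
# Hypothetical helper: read `lxc list` table output on stdin and print the
# first IPv4 address that is labelled (eth0).
eth0_ipv4() {
    sed -n 's/.*| *\([0-9.]*\) (eth0).*/\1/p' | head -n 1
}

# Sample rows copied from the table in this post.
printf '%s\n' \
    '| docker | RUNNING | 172.17.0.1 (docker0) | 2001:470:b368:4242:216:3eff:fe55:45f4 (eth0) | PERSISTENT | 0         |' \
    '|        |         | 10.178.150.73 (eth0) |                                              |            |           |' |
    eth0_ipv4

# Against a live system this would look something like:
#   curl "http://$(lxc list docker | eth0_ipv4)"
```

The IPv6 address is skipped because the pattern only accepts digits and dots before the (eth0) label.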
Conclusion
That’s it! It’s really that simple to run Docker containers inside a LXD container.
Now, as I mentioned earlier, not all Docker images will behave as well as my example. That’s typically because of the extra confinement that comes with LXD, specifically the user namespace.
Only the overlayfs storage driver of Docker works in this mode. That storage driver may come with its own set of limitations, which may further restrict how many images will work in this environment.
If your workload doesn’t work properly and you trust the user inside the LXD container, you can try:
lxc config set docker security.privileged true
lxc restart docker
That will de-activate the user namespace and will run the container in privileged mode.
Note however that in this mode, root inside the container is the same uid as root on the host. There are a number of known ways for users to escape such containers and gain root privileges on the host, so you should only ever do that if you’d trust the user inside your LXD container with root privileges on the host.
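One way to see which of the two modes a container is in is to look at its uid map from the inside. The sketch below assumes the standard /proc/self/uid_map format (uid inside the namespace, uid outside it, length of the range); the helper name root_is_host_root is made up for this post. When the first two columns of a line are both 0, root in the container is root on the host.

```shell
# Hypothetical helper: succeed when uid 0 in this namespace maps to uid 0
# outside it, meaning no user namespace shift is protecting the host.
# The optional argument lets you point it at a saved copy of a uid map.
root_is_host_root() {
    rihr_map=${1:-/proc/self/uid_map}
    awk '$1 == 0 && $2 == 0 { found = 1 } END { exit !found }' "$rihr_map"
}

if root_is_host_root; then
    echo "root here is root on the host (privileged, or not in a container)"
else
    echo "root here is mapped to an unprivileged uid on the host"
fi
```

You could run it with something like lxc exec docker -- sh after copying the function in, before and after changing security.privileged.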
Extra information
The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it
Why can’t I curl the container?
$ lxc list
+--------+---------+----------------------+------+------------+-----------+
|  NAME  |  STATE  |         IPV4         | IPV6 |    TYPE    | SNAPSHOTS |
+--------+---------+----------------------+------+------------+-----------+
| docker | RUNNING | 10.33.36.20 (eth0)   |      | PERSISTENT | 0         |
|        |         | 172.17.0.1 (docker0) |      |            |           |
+--------+---------+----------------------+------+------------+-----------+
$ curl http://10.33.36.20
curl: (7) Failed to connect to 10.33.36.20 port 80: Connection refused
Probably something went wrong with Docker inside that container. I’m unfortunately not very familiar with Docker but I think “docker ps” and “docker logs” may help you figure it out.
Your LXD bridge isn’t configured properly. Remove all images and containers, then run
sudo lxd init
Follow the prompt defaults and then retry this tutorial.
Hi Stéphane. A few days ago I wrote something similar to this post: Running Docker Swarm inside LXC. I know you care a lot about nesting (which is really appreciated!) and I thought you might be interested in the approach I’m using, and my considerations about AppArmor.
A few notes: it’d be great if LXC/LXD allowed unprivileged containers to create device nodes! It’d also be great if AppArmor supported “nesting” profiles, but I guess that’s not something you are going to work on…
Apparmor nesting (in fact, namespacing) is in the current Ubuntu 4.4 kernel, though we haven’t yet had time to add support for it to userspace, it is a very very new feature.
As for allowing mknod inside a user namespace, this simply isn’t going to happen as it would be a major security issue. If you were allowed to mknod things, you could mknod existing devices, say sda and then have an unprivileged user mess with that and wipe the disk on the host.
You can allow mknod without allowing access to them (like you do with privileged containers)
No you can’t.
Any unprivileged user can unshare a user namespace and there is no tie inside the kernel between the use of a user namespace and the use of the devices cgroup.
So I basically could do:
unshare -m -u -p -f -r
mknod sda b 8 0
chmod 666 sda
> sda
And wipe the host’s disk as an entirely unprivileged user.
You’re right, sorry. I realized that shortly after submitting my comment.
Stephane, is mknod still unsupported? I have an app that needs to mknod and it fails.
Thanks for the great work on this. The series has been easy to follow even for a beginner. The video of your talk from last year was also quite helpful.
https://www.youtube.com/watch?v=yEr_EIZG0ZM
On the storage set up I wanted to clarify your comment on using the overlay driver. I set up a zfs partition based on this article (https://insights.ubuntu.com/2016/02/16/zfs-is-the-fs-for-containers-in-ubuntu-16-04/), and then followed the steps in your article. I got a container with Docker started and the first docker image loaded, but the second one failed because it ran out of space (the zfs partition was 15G and the image was 0.7G). I’m assuming that is because Docker was using the default vfs driver on top of the zfs host device.
Apparently the overlay driver is not compatible with zfs:
https://docs.docker.com/engine/userguide/storagedriver/selectadriver/
How would you recommend setting the storage up on the host, lxd and Docker?
https://github.com/lxc/lxd/blob/master/doc/storage-backends.md
Thank you,
Having the same issue, please say there is a way to keep the ZFS partition usage near the actual image size? A bit of overhead is ok, but this is way too much =/
Kind regards
I was starting from scratch, so didn’t have a lot of built up docker infrastructure, but found that using the lxd / lxc containers where I would have used docker containers worked really well – mounting volumes, networking containers together, controlling remote containers from my local machine a la docker-machine, quickly spinning up and tearing down containers for development, etc. Seems to me like the best of both virtual machines and containerized apps without having to use both vagrant and docker.
I’ve dropped zfs as an option as it uses way too much space, downloading/extracting Docker images is many times slower than without it (not running on bare metal) and last but not least: OMG the RAM usage. I maxed out 6GB RAM with only two LXC containers running the above example.
BTRFS on the other hand works fine, speed is good, quotas are simple enough to implement, host server uses 400MB RAM per container but the full disk size is shown to when you df -h inside the container (not the quota) and I don’t get an error if I try to exceed the quota (just doesn’t write to disk).
I’m aware the two issues I have with BTRFS have nothing to do with LXC/LXD but I will need to find a solution for them if I want to use LXC/LXD.
The solution is to use fuse-overlayfs in the container, it is compatible with ZFS. Install the fuse-overlayfs package and configure docker to use it as the storage driver.
At some point in the future we’ll have two other options: overlay2 will be natively supported and with ZFS namespace delegation a ZFS dataset subtree can be managed from within the container. Both features are currently in development.
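As a rough sketch of the fuse-overlayfs suggestion above: Docker reads its storage driver from the "storage-driver" key of /etc/docker/daemon.json. The helper name write_storage_driver is hypothetical, and it overwrites the target file, so merge by hand if you already have other daemon settings. The demonstration writes to a scratch file instead of the live configuration.

```shell
# Hypothetical helper: write a minimal Docker daemon config selecting a
# storage driver. NOTE: this replaces the whole file at the target path.
write_storage_driver() {
    wsd_driver=$1
    wsd_target=${2:-/etc/docker/daemon.json}
    printf '{\n  "storage-driver": "%s"\n}\n' "$wsd_driver" > "$wsd_target"
}

# Demonstrate against a scratch file rather than the real configuration.
scratch=$(mktemp)
write_storage_driver fuse-overlayfs "$scratch"
cat "$scratch"
rm -f "$scratch"

# Inside the container you would install the fuse-overlayfs package, call
# write_storage_driver fuse-overlayfs with no second argument, and then
# restart the Docker service.
```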
Hi,
I tried following the steps, and I was able to install docker successfully, but when I did docker run, following errors popped up.
Error response from daemon: Cannot start container f171: [10] System error: write /sys/fs/cgroup/devices/docker/f171/devices.allow: operation not permitted.
Somehow, I think docker images are using character devices and there might be a setting to override it. I tried Andrea’s blog, and docker in lxc1 worked pretty well. Can you help me in this regard.
Yeah, it seems likely that the image you picked just doesn’t work in unprivileged containers.
You could give it a go with a privileged container which is most likely what you were doing with lxc1.
I get exactly the same on a 16.04 host with LXD 2.0.3, running it in a privileged container.
Steps I did:
- lxc init images:ubuntu/xenial/amd64 docker -p default -p docker
- lxc config set docker security.privileged true
- lxc start docker
- lxc exec docker bash
from bash shell in docker container:
- apt-get update && apt-get dist-upgrade -y
- apt-get install docker.io
- docker run --detach --name app carinamarina/hello-world-app
after it downloaded the docker image I get:
docker: Error response from daemon: Cannot start container [10] System error: write /sys/fs/cgroup/devices/docker//devices.allow: operation not permitted
The host is running on BTRFS by the way…
Hi,
The default Ubuntu docker.io package, installed as described in your post, works fine, thank you!
However, once you try to upgrade Docker to latest version with the official Docker installation script (https://docs.docker.com/linux/step_one/):
lxc exec docker -- curl -fsSL https://get.docker.com/ | sh
Docker no longer works properly:
lxc exec docker -- docker run hello-world
docker: Error response from daemon: Container command ‘/hello’ not found or does not exist..
Any idea how to fix this?
BR
panticz
So, is setting the container’s security.privileged to true any less secure than running Docker directly inside of a host without using LXD? (i.e. if I were to run Docker on Ubuntu 14/15.x without LXC)
“Only the overlayfs storage driver of Docker works in this mode.” — What about if the host is using btrfs? Can the Docker btrfs storage driver be used then? Or is this a theoretical possibility, but still not supported yet by Canonical?
For example, in the comments of your LXD in LXD nesting tutorial you stated that the nested LXD containers can still use btrfs storage driver, if the file system was mounted by the host with user_subvol_rm_allowed (presumably so that unprivileged nested LXD daemon can manage subvolumes). I would think a similar thing could be done with Docker?
Doing some more testing on my own… it looks like things do seem to work with Docker in LXD on btrfs if you use user_subvol_rm_allowed. I was able to run a hello world Docker container, and then delete it – and no btrfs subvolumes were left behind. BUT you should be aware of this bug when doing so: https://github.com/lxc/lxd/issues/2058 — until it’s fixed, use of snapshots with LXD is kind of broken if the container made btrfs subvolumes. (Same issue applies if trying LXD in LXD on btrfs.)
I’m still not sure if I’m doing something that’s ok to do or not, in light of the statement that only overlayfs is supported. But it seemed to work with user_subvol_rm_allowed btrfs mount option.
Hi, I´m facing exactly the same issue and I can confirm that adding user_subvol_rm_allowed to fstab solves the issue.
Like you, I am not sure whether this fix is “ok” or not.
Broken docker.io?
apt install docker.io ends with error:
May 31 14:44:08 test docker[2865]: time=”2016-05-31T14:44:08.102019345Z” level=fatal msg=”Error starting daemon: AppArmor enabled on system but the docker-default profile could not b
May 31 14:44:08 test systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
May 31 14:44:08 test systemd[1]: Failed to start Docker Application Container Engine.
May 31 14:44:08 test systemd[1]: docker.service: Unit entered failed state.
May 31 14:44:08 test systemd[1]: docker.service: Failed with result ‘exit-code’.
lxc container in ubuntu xenial
Thanks for your groundbreaking work on better resource utilisation and fostering free software!
Can’t wait to span LXC & Docker on LXD over Linode & DigitalOcean.
@Stéphane: Do you need tests from behind the Gre@t Firew@ll of Chin@, for example with Sh@dowsocks situations? I would have to check first but I’m confident that some of the more tech-savvy colleagues from http://www.blug.sh (Beijing Linux User Group) would be interested to contribute. Best from Beijing, hello[at]eco-git.org
Workaround run docker commands as user ubuntu, not as user root:
Tested 24 June 2016, using ubuntu-daily LXD image and the instructions above.
When running the docker command(s) as root in the container (which occurs by default), the docker images will not run.
When I run the docker command as user ubuntu, the docker images *do* run:
lxc exec docker -- su -l ubuntu -c "docker run --name hello2 hellow-world"
This was with the LXD container running unprivileged. It did not appear to make any difference whether running privileged or not; when trying to launch a privileged docker image inside the LXD, if the launching user was root it fails.
I ran `docker daemon --debug` to check for errors, and these are the messages that were issued whenever a docker image failed to run. Key phrases included “error locating sandbox” and “Container command not found or does not exist”:
ERRO[0013] error locating sandbox id 79d7: sandbox 79d7 not found
WARN[0013] failed to cleanup ipc mounts:
failed to umount /var/lib/docker/containers/853d/shm: invalid argument
ERRO[0013] Error unmounting container 853d: not mounted
ERRO[0013] Handler for POST /v1.22/containers/853d/start returned error: Container command not found or does not exist.
I checked the “merged” mountpoint, which was also empty.
for docker 1.11.2, I had to configure the container to be privileged as well as add the tuntap device to the docker profile to be able to run `docker run -it ubuntu /bin/bash` Unless the ubuntu container requires privileged, I assume this is a docker 1.11 thing.
`lxc config set security.privileged true `
`lxc profile device add docker tuntap unix-char path=/dev/net/tun`
I followed the steps here to launch an LXD which has Docker engine running in it. But when I run a Docker container, it will fail:
root@docker:~# docker run -it busybox /bin/sh
Unable to find image ‘busybox:latest’ locally
latest: Pulling from library/busybox
8ddc19f16526: Pull complete
Digest: sha256:a59906e33509d14c036c8678d687bd4eec81ed7c4b8ce907b888c607f6a1e0e6
Status: Downloaded newer image for busybox:latest
docker: Error response from daemon: Cannot start container 91de8306d177670453d0831b830807516b0863c13c8a6f5325a32fde6baa0835: [10] System error: write /sys/fs/cgroup/devices/docker/91de8306d177670453d0831b830807516b0863c13c8a6f5325a32fde6baa0835/devices.allow: operation not permitted.
I even see this error after I run the following commands:
lxc config set docker security.privileged true
lxc restart docker
Any ideas?
I think I have figured out how to reproduce this issue:
Right after I followed the steps here to launch an LXD container and installed Docker engine in it, I can run Docker containers in it, everything is good. But after I made it a privileged container and restarted it, then I found I can not run Docker containers anymore and Docker’s storage driver was changed from vfs to aufs, and all the images I pulled before the restart disappeared.
Then I changed the LXD container back to unprivileged and restarted it. This time I found that the Docker engine cannot even be started:
Aug 05 09:44:40 docker systemd[1]: Starting Docker Application Container Engine…
Aug 05 09:44:40 docker docker[327]: time=”2016-08-05T09:44:40.805938409Z” level=error msg=”[graphdriver] prior storage driver “aufs” failed: driver not supported”
Aug 05 09:44:40 docker docker[327]: time=”2016-08-05T09:44:40.806319580Z” level=fatal msg=”Error starting daemon: error initializing graphdriver: driver not supported”
Aug 05 09:44:40 docker systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Aug 05 09:44:40 docker systemd[1]: Failed to start Docker Application Container Engine.
Aug 05 09:44:40 docker systemd[1]: docker.service: Unit entered failed state.
Aug 05 09:44:40 docker systemd[1]: docker.service: Failed with result ‘exit-code’.
This seems strange to me: I can run a Docker container in an unprivileged LXD container, but I cannot do it in a privileged LXD container.
I tried the steps for running Docker inside of LXD and it works well. I also wanted to try the new Docker v1.12 in LXD, but it failed to start the Docker daemon. The error looks like this:
Sep 12 20:13:45 docker1 dockerd[294]: time=”2016-09-12T20:13:45.601263368Z” level=info msg=”libcontainerd: new containerd process, pid: 300″
Sep 12 20:13:45 docker1 dockerd[294]: time=”2016-09-12T20:13:45.601446148Z” level=fatal msg=”Failed to connect to containerd. Please make sure containerd is installed in your PATH or you have speci
Sep 12 20:13:45 docker1 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Sep 12 20:13:45 docker1 systemd[1]: Failed to start Docker Application Container Engine.
If I set the container to “security.privileged true”, then I can start the Docker daemon fine, but I would like to avoid that if possible. Is there a way to make this work? How can I disable AppArmor and give all capabilities to the processes in the LXC container (e.g. lxc.cap.drop =)?
Many thanks!
Lin
I tried running docker inside an LXD controlled LXC container. Most things worked…. except when it came time to delete docker images.
I’m running BTRFS and using Directories instead of zfs loopback.
Docker couldn’t delete the btrfs snapshots. I had to delete the lxc container they were inside to reclaim the space.
Did you setup your btrfs filesystem to allow for unprivileged subvolume removal?
If not, that could explain what you’re seeing.
See https://github.com/lxc/lxd/blob/master/doc/storage-backends.md#btrfs
Thanks Stéphane, after reading some of the other people’s experience on this page, I did add unprivileged sub volume removal and that problem went away.
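For anyone else hitting this, here is a small sketch of how one might verify the mount option from a shell. It assumes the usual /proc/mounts layout (device, mount point, filesystem type, options); the function name subvol_rm_allowed and the example mount point are made up for illustration.

```shell
# Hypothetical helper: succeed if the btrfs filesystem mounted at the given
# mount point lists user_subvol_rm_allowed among its mount options.
# The optional second argument is a mounts file in /proc/mounts format.
subvol_rm_allowed() {
    sra_mount_point=$1
    sra_mounts=${2:-/proc/mounts}
    awk -v mp="$sra_mount_point" '
        $2 == mp && $3 == "btrfs" &&
        $4 ~ /(^|,)user_subvol_rm_allowed(,|$)/ { ok = 1 }
        END { exit !ok }' "$sra_mounts"
}

# Example check; /var/lib/lxd is only an assumed location for the pool.
if subvol_rm_allowed /var/lib/lxd; then
    echo "unprivileged subvolume removal is allowed"
else
    echo "user_subvol_rm_allowed is not set for that mount point"
fi
```

If the check fails, the option goes in the options field of the filesystem’s /etc/fstab entry on the host.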
Hi Stéphane.
On a box with BTRFS, I run LXD, and create an LXC container to run docker inside. Maybe this is an odd use case.
However, the LXC container needs to communicate in its mount point the option of allowing the user to delete its btrfs subvolumes. Inside the LXC container (Ubuntu 16.04), /etc/fstab has only one line, which indicates an ext4 partition. However, /etc/mtab from the same nested area shows btrfs.
On the physical machine back at the top layer, the btrfs mount has the user_subvol_rm_allowed option. Where can the nested LXC container be configured to communicate to its Docker containers that user_subvol_rm_allowed is to be seen as an option for its mount points?
This has been great fun! I’m inspired to learn more about running containers. Also I’m trying to learn HAProxy so I can minimize the number of IP addresses taken up by the many web sites I intend to host. What sort of interesting problems are you working on now?
Regards,
Joe Baker
Now I’m not seeing the problem. I did now see the subvolume delete permissions and was able to use the docker rmi command to delete some old docker images from within a nested LXC container made for docker images.
the page changed from: doc/storage-backends.md to:
https://github.com/lxc/lxd/blob/master/doc/storage.md
unfortunately there are no instructions about:
setup your btrfs filesystem to allow for unprivileged subvolume removal
Hi, and thank you for these blog posts – they are really useful and well written!
I’m having trouble configuring the docker0 interfaces. When I launch two containers, both with docker installed, they use the same IPv4 address for their docker0 interface. (I’ve got only IPv4 available, so no IPv6 configuration.)
Commands:
lxc launch ubuntu-daily:16.04 number-one -p default -p docker
lxc exec number-one -- apt install docker.io -y
lxc launch ubuntu-daily:16.04 number-two -p default -p docker
lxc exec number-two -- apt install docker.io -y
[lxc list]
[lxc exec number-one -- journalctl -u docker.service]
[lxc exec number-two -- journalctl -u docker.service]
Shortened output of “lxc list”:
number-one 172.17.0.1 (docker0) 10.1.189.47 (eth0)
number-two 172.17.0.1 (docker0) 10.1.189.19 (eth0)
Is this the intended behavior? If so, what is the right way to assign different IP addresses?
If it is not the intended behavior, shall I create an issue for the lxc, lxd or Docker repository?
More information:
– Commands and relevant output: http://pastebin.com/aHzsbVsF
– Commands and full output: http://pastebin.com/X5Xi9FjP
– LXD network configuration: http://pastebin.com/ANRu5wv1
Hi,
I’m using LXD with a separate zfs partition on a host. I would like to start a container for nested docker containers. Which storage driver should I use in docker configuration?
Regards,
Matthew
Hi,
Do you have any updated tutorial for LXD 2.16 + zfs + docker CE + swarm?
It is not working for me
Using this tutorial I got an error due to zfs:
ERRO[0001] ‘overlay’ is not supported over zfs
Your article was useful for me during my first steps with LXD. Thank you.
Right now I wanted to run a Docker in Swarm mode inside LXD containers. Is it possible at all right now? I’m asking because Docker works fine for me, but when I try to run it in Swarm mode, it fails to create a service.
Have you or anyone else tried that?
How do I make the container privileged on an Ubuntu 18.04 base OS? I also need to configure a public IP and internet connection, and I need to create storage pools on the local machine.
This probably deserves an update given that it’s the first google hit on the topic.
An LVM sparse pool, or passing a source=/host/dir path for /var/lib/docker, works well with overlay, but you will not be able to load the module from inside the container. Behavior can therefore be inconsistent: if overlay is not already loaded on the host, Docker falls back to vfs, with the accompanying I/O and storage usage problems. Set linux.kernel_modules=overlay in the container config to prevent this.
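A quick way to tell whether you are about to hit that vfs fallback is to check whether the kernel currently offers the overlay filesystem. This is only a sketch: fs_supported is a name invented here, and it relies on /proc/filesystems listing one filesystem type per line, optionally prefixed with nodev.

```shell
# Hypothetical helper: succeed if the named filesystem type appears in
# /proc/filesystems (or in the file given as the second argument).
fs_supported() {
    fss_name=$1
    fss_file=${2:-/proc/filesystems}
    awk -v fs="$fss_name" '$NF == fs { ok = 1 } END { exit !ok }' "$fss_file"
}

if fs_supported overlay; then
    echo "overlay is available"
else
    echo "overlay is not loaded; Docker would likely fall back to vfs"
fi

# To have LXD load the module on the host before the container starts:
#   lxc config set docker linux.kernel_modules overlay
```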
Often building images may require security.privileged=true, but it can be dropped after shutting down the lxd container once the images are created and you are just running them.
Someone above was reporting that overlay may not be compatible with zfs.
A btrfs storage backend might also be usable with the docker btrfs storage driver in the container, but I haven’t personally tried it.
How to get docker up and running in Centos LXC Container?
How can I make Docker Process Up and Running in Centos LXC Container rather than Ubuntu LXC Container?
Thanks for the great write up! I collected all the steps I used on Ubuntu 18.04 for Docker in an LXD container backed by ZFS at https://gist.github.com/gbrayut/9ec570584dfd01620412e318ed987e31
I also included details on setting up a proxy for docker.sock, sharing folders into the container with correct permissions, and setting up the docker cli tools for “remote” access