LXC 1.0: Security features [6/10]

This is post 6 out of 10 in the LXC 1.0 blog post series.

When talking about container security most people either consider containers as inherently insecure or inherently secure. The reality isn’t so black and white and LXC supports a variety of technologies to mitigate most security concerns.

One thing to clarify right from the start is that you won’t hear any of the LXC maintainers tell you that LXC is secure so long as you use privileged containers. However, at least in Ubuntu, our default containers ship with what we think is a pretty good configuration of both the cgroup access and an extensive apparmor profile which prevents all attacks that we are aware of.

Below I’ll be covering the various technologies LXC supports to let you restrict what a container may do. Just keep in mind that unless you are using unprivileged containers, you shouldn’t give root access in a container to anyone you wouldn’t trust with root access on your host.

Capabilities

The first security feature which was added to LXC was Linux capabilities support. With that feature you can set a list of capabilities that you want LXC to drop before starting the container or a full list of capabilities to retain (all others will be dropped).

The two relevant configuration options are:

  • lxc.cap.drop
  • lxc.cap.keep

Both are lists of capability names as listed in capabilities(7).

This may sound like a great way to make containers safe and for very specific cases it may be, however if running a system container, you’ll soon notice that dropping sys_admin and net_admin isn’t very practical and short of dropping those, you won’t make your container much safer (as root in the container will be able to re-grant itself any dropped capability).

In Ubuntu we use lxc.cap.drop to drop sys_module, mac_admin, mac_override, sys_time which prevent some known problems at container boot time.
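
In a container’s configuration file that looks something like this (a single space-separated list, which may also be split over several lxc.cap.drop entries):

lxc.cap.drop = sys_module mac_admin mac_override sys_time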

Control groups

Control groups are interesting because they achieve multiple things which while interconnected are still pretty different:

  • Resource bean counting
  • Resource quotas
  • Access restrictions

The first two aren’t really security related, though resource quotas will let you avoid some obvious DoS of the host (by setting memory, cpu and I/O limits).

The last is mostly about the devices cgroup which lets you define which character and block devices a container may access and what it can do with them (you can restrict creation, read access and write access for each major/minor combination).

In LXC, configuring cgroups is done with the “lxc.cgroup.*” options which can roughly be defined as: lxc.cgroup.<controller>.<key> = <value>

For example to set a memory limit on p1 you’d add the following to its configuration:

lxc.cgroup.memory.limit_in_bytes = 134217728

This will set a memory limit of 128MB (the value is in bytes) and will be the equivalent to writing that same value to /sys/fs/cgroup/memory/lxc/p1/memory.limit_in_bytes
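
Other controllers work the same way. As a rough sketch (the values here are purely illustrative), giving the container half the default CPU weight and pinning it to the first two cores would look something like:

lxc.cgroup.cpu.shares = 512
lxc.cgroup.cpuset.cpus = 0,1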

Most LXC templates only set a few devices controller entries by default:

# Default cgroup limits
lxc.cgroup.devices.deny = a
## Allow any mknod (but not using the node)
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
## /dev/null and zero
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
## consoles
lxc.cgroup.devices.allow = c 5:0 rwm
lxc.cgroup.devices.allow = c 5:1 rwm
## /dev/{,u}random
lxc.cgroup.devices.allow = c 1:8 rwm
lxc.cgroup.devices.allow = c 1:9 rwm
## /dev/pts/*
lxc.cgroup.devices.allow = c 5:2 rwm
lxc.cgroup.devices.allow = c 136:* rwm
## rtc
lxc.cgroup.devices.allow = c 254:0 rm
## fuse
lxc.cgroup.devices.allow = c 10:229 rwm
## tun
lxc.cgroup.devices.allow = c 10:200 rwm
## full
lxc.cgroup.devices.allow = c 1:7 rwm
## hpet
lxc.cgroup.devices.allow = c 10:228 rwm
## kvm
lxc.cgroup.devices.allow = c 10:232 rwm

This configuration allows the container (usually udev) to create any device it wishes (that’s the wildcard “m” above) but blocks everything else (the “a” deny entry) unless it’s listed in one of the allow entries below it. This covers everything a container will typically need to function.

You will find reasonably up to date documentation about the available controllers, control files and supported values at:
https://www.kernel.org/doc/Documentation/cgroups/

Apparmor

A little while back we added Apparmor profiles support to LXC.
The Apparmor support is rather simple: there’s one configuration option, “lxc.aa_profile”, which sets which apparmor profile to use for the container.

LXC will then set up the container and ask apparmor to switch it to that profile right before starting the container. Ubuntu’s LXC profile is rather complex as it aims to prevent any of the known ways of escaping a container or causing harm to the host.

As things are today, Ubuntu ships with 3 apparmor profiles meaning that the supported values for lxc.aa_profile are:

  • lxc-container-default (default value if lxc.aa_profile isn’t set)
  • lxc-container-default-with-nesting (same as default but allows some needed bits for nested containers)
  • lxc-container-default-with-mounting (same as default but allows mounting ext*, xfs and btrfs file systems).
  • unconfined (a special value which will disable apparmor support for the container)

You can also define your own by copying one of the ones in /etc/apparmor.d/lxc/, adding the bits you want, giving it a unique name, then reloading apparmor with “sudo /etc/init.d/apparmor reload” and finally setting lxc.aa_profile to the new profile’s name.
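
As a rough sketch of that workflow (the file and profile names here are just an example):

# copy an existing profile
sudo cp /etc/apparmor.d/lxc/lxc-default /etc/apparmor.d/lxc/lxc-default-custom
# edit the copy, rename the profile to "lxc-container-default-custom" and add your rules
sudo /etc/init.d/apparmor reload

And then in the container’s configuration:

lxc.aa_profile = lxc-container-default-custom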

SELinux

The SELinux support is very similar to Apparmor’s. An SELinux context can be set using “lxc.se_context”.

An example would be:

lxc.se_context = unconfined_u:unconfined_r:lxc_t:s0-s0:c0.c1023

Similarly to Apparmor, LXC will switch to the new SELinux context right before starting init in the container. As far as I know, no distributions are setting a default SELinux context at this time, however most distributions build LXC with SELinux support (including Ubuntu, should someone choose to boot their host with SELinux rather than Apparmor).

Seccomp

Seccomp is a fairly recent kernel mechanism which allows for filtering of system calls.
As a user you can write a seccomp policy file and set it using “lxc.seccomp” in the container’s configuration. As always, this policy will only be applied to the running container and will allow or reject syscalls with a pre-defined return value.

An example (though limited and useless) of a seccomp policy file would be:

1
whitelist
103

Which would only allow syscall #103 (syslog) in the container and reject everything else.
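
Assuming you saved that policy as, say, /var/lib/lxc/p1/seccomp.policy (the path is entirely up to you), you’d then point the container at it with:

lxc.seccomp = /var/lib/lxc/p1/seccomp.policy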

Note that seccomp is a rather low level feature and only useful for some very specific use cases. All syscalls have to be referred to by their ID instead of their name and those may change between architectures. Also, as things are today, if your host is 64bit and you load a seccomp policy file, all 32bit syscalls will be rejected. We’d need per-personality seccomp profiles to solve that but it’s not been a high priority so far.

User namespace

And last but not least, what’s probably the only way of making a container actually safe. LXC now has support for user namespaces. I’ll go into more details on how to use that feature in a later blog post but simply put, LXC is no longer running as root so even if an attacker manages to escape the container, he’d find himself having the privileges of a regular user on the host.

All this is achieved by assigning ranges of uids and gids to existing users. Those users on the host will then be allowed to clone a new user namespace in which all uids/gids are mapped to uids/gids that are part of the user’s range.

This obviously means that you need to allocate a rather silly amount of uids and gids to each user who’ll be using LXC in that way. In a perfect world, you’d allocate 65536 uids and gids per container and per user. As this would likely exhaust the whole uid/gid range rather quickly on some systems, I tend to go with “just” 65536 uids and gids per user that’ll use LXC and then have the same range shared by all containers.
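
I won’t go into the configuration details here, but to give a rough idea of what such a mapping looks like, assuming your user was allocated 65536 uids and gids starting at 100000 (that range is just an example), the container configuration would contain something like:

lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536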

Anyway, that’s enough details about user namespaces for now. I’ll cover how to actually set that up and use those unprivileged containers in the next post.


LXC 1.0: Container storage [5/10]

This is post 5 out of 10 in the LXC 1.0 blog post series.

Storage backingstores

LXC supports a variety of storage backends (also referred to as backingstore).
It defaults to “none” which simply stores the rootfs under
/var/lib/lxc/<container>/rootfs but you can specify something else to lxc-create or lxc-clone with the -B option.

Currently supported values are:

directory based storage (“none” and “dir”)

This is the default backingstore, the container rootfs is stored under
/var/lib/lxc/<container>/rootfs

The --dir option (when using “dir”) can be used to override the path.

btrfs

With this backingstore LXC will set up a new subvolume for the container, which makes snapshotting much easier.

lvm

This one will use a new logical volume for the container.
The LV can be set with --lvname (the default is the container name).
The VG can be set with --vgname (the default is “lxc”).
The filesystem can be set with --fstype (the default is “ext4”).
The size can be set with --fssize (the default is “1G”).
You can also use LVM thinpools with --thinpool.
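
For example, creating an LVM-backed container with a 5GB ext4 LV in the default “lxc” VG (sizes and names picked arbitrarily) would look something like:

sudo lxc-create -t ubuntu -n p1-lvm -B lvm --fstype ext4 --fssize 5G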

overlayfs

This one is mostly used when cloning containers to create a container based on another one and storing any changes in an overlay.

When used with lxc-create it’ll create a container where any change done after its initial creation will be stored in a “delta0” directory next to the container’s rootfs.

zfs

Very similar to btrfs. As I’ve not used either of those myself, I can’t say much about them besides that it should also create some kind of subvolume for the container and make snapshots and clones faster and more space efficient.

Standard paths

One quick word about the way LXC usually works and where it stores its files:

  • /var/lib/lxc (default location for containers)
  • /var/lib/lxcsnap (default location for snapshots)
  • /var/cache/lxc (default location for the template cache)
  • $HOME/.local/share/lxc (default location for unprivileged containers)
  • $HOME/.local/share/lxcsnap (default location for unprivileged snapshots)
  • $HOME/.cache/lxc (default location for unprivileged template cache)

The default path, also called lxcpath, can be overridden on the command line with the -P option or once and for all by setting “lxcpath = /new/path” in /etc/lxc/lxc.conf (or $HOME/.config/lxc/lxc.conf for unprivileged containers).
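
For example, to use /srv/lxc instead (a made-up path), you could either pass it for a single command or set it permanently:

# one-off
sudo lxc-ls -P /srv/lxc

# permanent, system-wide
echo "lxcpath = /srv/lxc" | sudo tee -a /etc/lxc/lxc.conf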

The snapshot directory is always “snap” appended to lxcpath so it’ll magically follow lxcpath. The template cache is unfortunately hardcoded and can’t easily be moved short of relying on bind-mounts or symlinks.

The default configuration used for all containers at creation time is taken from
/etc/lxc/default.conf (no unprivileged equivalent yet).
The templates themselves are stored in /usr/share/lxc/templates.

Cloning containers

All those backingstores only really shine once you start cloning containers.
For example, let’s take our good old “p1” Ubuntu container and say you want to make a usable copy of it called “p4”. You can simply do:

sudo lxc-clone -o p1 -n p4

And there you go, you’ve got a working “p4” container that’ll be a simple copy of “p1” but with a new mac address and its hostname properly set.

Now let’s say you want to do a quick test against “p1” but don’t want to alter that container itself, yet you don’t want to wait the time needed for a full copy, you can simply do:

sudo lxc-clone -o p1 -n p1-test -B overlayfs -s

And there you go, you’ve got a new “p1-test” container which is entirely based on the “p1” rootfs and where any change will be stored in the “delta0” directory of “p1-test”.
The same “-s” option also works with lvm and btrfs (possibly zfs too) containers and tells lxc-clone to use a snapshot rather than copy the whole rootfs across.

Snapshotting

So cloning is nice and convenient, great for things like development environments where you want throw-away containers. But in production, snapshots tend to be a whole lot more useful for things like backups or just before you make possibly risky changes.

In LXC we have a “lxc-snapshot” tool which will let you create, list, restore and destroy snapshots of your containers.
Before I show you how it works, please note that “lxc-snapshot” currently doesn’t appear to work with directory based containers. With those it produces an empty snapshot; this should be fixed by the time LXC 1.0 is actually released.

So, let’s say we want to backup our “p1-lvm” container before installing “apache2” into it, simply run:

echo "before installing apache2" > snap-comment
sudo lxc-snapshot -n p1-lvm -c snap-comment

At which point, you can confirm the snapshot was created with:

sudo lxc-snapshot -n p1-lvm -L -C

Now you can go ahead and install “apache2” in the container.

If you want to revert the container at a later point, simply use:

sudo lxc-snapshot -n p1-lvm -r snap0

Or if you want to restore a snapshot as its own container, you can use:

sudo lxc-snapshot -n p1-lvm -r snap0 p1-lvm-snap0

And you’ll get a new “p1-lvm-snap0” container which will contain a working copy of “p1-lvm” as it was at “snap0”.


LXC 1.0: Some more advanced container usage [4/10]

This is post 4 out of 10 in the LXC 1.0 blog post series.

Running foreign architectures

By default LXC will only let you run containers of one of the architectures supported by the host. That makes sense since after all, your CPU doesn’t know what to do with anything else.

Except that we have this convenient package called “qemu-user-static” which contains a whole bunch of emulators for quite a few interesting architectures. The most common and useful of those is qemu-arm-static which will let you run most armv7 binaries directly on x86.

The “ubuntu” template knows how to make use of qemu-user-static, so you can simply check that you have the “qemu-user-static” package installed, then run:

sudo lxc-create -t ubuntu -n p3 -- -a armhf

After a rather long bootstrap, you’ll get a new p3 container which will be mostly running Ubuntu armhf. I’m saying mostly because the qemu emulation comes with a few limitations, the biggest of which is that any piece of software using the ptrace() syscall will fail and so will anything using netlink. As a result, LXC will install the host architecture version of upstart and a few of the networking tools so that the containers can boot properly.

stgraber@castiana:~$ file /bin/ls
/bin/ls: ELF 64-bit LSB  executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=e50e0a5dadb8a7f4eaa2fd715cacb9842e157dc7, stripped
stgraber@castiana:~$ sudo lxc-start -n p3 -d
stgraber@castiana:~$ sudo lxc-attach -n p3
root@p3:/# file /bin/ls
/bin/ls: ELF 32-bit LSB  executable, ARM, EABI5 version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=88ff013a8fd9389747fb1fea1c898547fb0f650a, stripped
root@p3:/# exit
stgraber@castiana:~$ sudo lxc-stop -n p3
stgraber@castiana:~$

Hooks

As we know people like to script their containers and our configuration can’t always accommodate every single use case, we’ve introduced a set of hooks which you may use.

Those hooks are simple paths to an executable file which LXC will run at some specific time in the lifetime of the container. Those executables will also be passed a set of useful environment variables so they can easily know what container invoked them and what to do.

The currently available hooks are (details in lxc.conf(5)):

  • lxc.hook.pre-start (called before any initialization is done)
  • lxc.hook.pre-mount (called after creating the mount namespace but before mounting anything)
  • lxc.hook.mount (called after the mounts but before pivot_root)
  • lxc.hook.autodev (identical to mount but only called if using autodev)
  • lxc.hook.start (called in the container right before /sbin/init)
  • lxc.hook.post-stop (run after the container has been shutdown)
  • lxc.hook.clone (called when cloning a container into a new one)

Additionally each network section may also define two additional hooks:

  • lxc.network.script.up (called in the network namespace after the interface was created)
  • lxc.network.script.down (called in the network namespace before destroying the interface)

All of those hooks may be specified as many times as you want in the configuration so you can use each hooking point multiple times.

As a simple example, let’s add the following to our “p1” container:

lxc.hook.pre-start = /var/lib/lxc/p1/pre-start.sh

And create the hook itself at /var/lib/lxc/p1/pre-start.sh:

#!/bin/sh
echo "arguments: $*" > /tmp/test
echo "environment:" >> /tmp/test
env | grep LXC >> /tmp/test

Make it executable (chmod 755) and then start the container.
Checking /tmp/test you should see:

arguments: p1 lxc pre-start
environment:
LXC_ROOTFS_MOUNT=/usr/lib/x86_64-linux-gnu/lxc
LXC_CONFIG_FILE=/var/lib/lxc/p1/config
LXC_ROOTFS_PATH=/var/lib/lxc/p1/rootfs
LXC_NAME=p1

Android containers

I’ve often been asked whether it was possible to run Android in an LXC container. Well, the short answer is yes. However it’s not very simple and it really depends on what you want to do with it.

The first thing you’ll need if you want to do this is to get your machine to run an Android kernel; you’ll need to have any modules needed by Android built and loaded before you can start the container.

Once you have that, you’ll need to create a new container by hand.
Let’s put it in “/var/lib/lxc/android/”; in there, you need a configuration file similar to this one:

lxc.rootfs = /var/lib/lxc/android/rootfs
lxc.utsname = armhf

lxc.network.type = none

lxc.devttydir = lxc
lxc.tty = 4
lxc.pts = 1024
lxc.arch = armhf
lxc.cap.drop = mac_admin mac_override
lxc.pivotdir = lxc_putold

lxc.hook.pre-start = /var/lib/lxc/android/pre-start.sh

lxc.aa_profile = unconfined

/var/lib/lxc/android/pre-start.sh is where the interesting bits happen. It needs to be an executable shell script, containing something along the lines of:

#!/bin/sh
mkdir -p $LXC_ROOTFS_PATH
mount -n -t tmpfs tmpfs $LXC_ROOTFS_PATH

cd $LXC_ROOTFS_PATH
cat /var/lib/lxc/android/initrd.gz | gzip -d | cpio -i

# Create /dev/pts if missing
mkdir -p $LXC_ROOTFS_PATH/dev/pts

Then get the initrd for your device and place it in /var/lib/lxc/android/initrd.gz.

At that point, when starting the LXC container, the Android initrd will be unpacked on a tmpfs (similar to Android’s ramfs) and Android’s init will be started which in turn should mount any partition that Android requires and then start all of the usual services.

Because there is no apparmor, cgroup or even network configuration applied to it, the container will have a lot of rights and will typically completely crash the machine. You unfortunately have to be familiar with the way Android works and not be afraid to modify its init scripts, if not its init process itself, to only start the bits you actually want.

I can’t provide a generic recipe there as it completely depends on what you’re interested in, what version of Android and what device you’re using. But it’s clearly possible to do and you may want to look at Ubuntu Touch to see how we’re doing it by default there.

One last note: Android’s init isn’t in /sbin/init, so you need to tell LXC where to find it with:

lxc-start -n android -- /init

LXC on Android devices

So now that we’ve seen how to run Android in LXC, let’s talk about running Ubuntu on Android in LXC.

LXC has been ported to bionic (Android’s C library) and while not feature-equivalent with its glibc build, it’s still good enough to be used.

Unfortunately, due to the kind of low level access LXC requires and the fact that our primary focus isn’t Android, installation could be easier… You won’t find LXC on the Google Play Store and we won’t provide you with a .apk that you can install.

Instead every time something changes in the upstream git branch, we produce a new tarball which can be downloaded here: https://jenkins.linuxcontainers.org/view/LXC/view/LXC%20builds/job/lxc-build-android/lastSuccessfulBuild/artifact/lxc-android.tar.gz

This build is known to work with Android >= 4.2 but will quite likely work on older versions too.

For this to work, you’ll need to grab your device’s kernel configuration and run lxc-checkconfig against it to see whether it’s compatible with LXC or not. Unfortunately it’s very likely that it won’t be… In that case, you’ll need to go hunt for the kernel source for your device, add the missing feature flags, rebuild it and update your device to boot your updated kernel.

As scary as this may sound, it’s usually not that difficult as long as your device is unlocked and you’re already using an alternate ROM like Cyanogen, which usually makes its kernel git tree easily available.

Once your device has a working kernel, all you need to do is unpack our tarball as root in your device’s / directory, copy an arm container to /data/lxc/containers/<container name>, get into /data/lxc and run “./run-lxc lxc-start -n <container name>”.
A few seconds later you’ll be greeted by a login prompt.
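
Put together, and with placeholder names for the tarball location and the container, that sequence looks roughly like this on the device (as root):

# unpack the LXC build over /
tar -C / -xzf /sdcard/lxc-android.tar.gz
# copy a pre-built arm container in place
cp -a /sdcard/precise-armhf /data/lxc/containers/precise-armhf
# start it
cd /data/lxc
./run-lxc lxc-start -n precise-armhf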


LXC 1.0: Advanced container usage [3/10]

This is post 3 out of 10 in the LXC 1.0 blog post series.

Exchanging data with a container

Because containers directly share their filesystem with the host, there are a lot of things that can be done to pass data into a container or to get stuff out.

The first obvious one is that you can access the container’s root at:
/var/lib/lxc/<container name>/rootfs/

That’s great, but sometimes you need to access data that’s in the container and on a filesystem which was mounted by the container itself (such as a tmpfs). In those cases, you can use this trick:

sudo ls -lh /proc/$(sudo lxc-info -n p1 -p -H)/root/run/

Which will show you what’s in /run of the running container “p1”.

Now, that’s great to have access from the host to the container, but what about having the container access and write data to the host?
Well, let’s say we want to have our host’s /var/cache/lxc shared with “p1”, we can edit /var/lib/lxc/p1/fstab and append:

/var/cache/lxc var/cache/lxc none bind,create=dir

This line means, mount “/var/cache/lxc” from the host as “/var/cache/lxc” (the lack of initial / makes it relative to the container’s root), mount it as a bind-mount (“none” fstype and “bind” option) and create any directory that’s missing in the container (“create=dir”).

Now restart “p1” and you’ll see /var/cache/lxc in there, showing the same thing as you have on the host. Note that if you want the container to only be able to read the data, you can simply add “ro” as a mount flag in the fstab.
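
The read-only variant of that fstab entry would look something like:

/var/cache/lxc var/cache/lxc none bind,ro,create=dir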

Container nesting

One pretty cool feature of LXC (though admittedly not very useful to most people) is support for nesting. That is, you can run LXC within LXC with pretty much no overhead.

By default this is blocked in Ubuntu, as allowing it at the moment requires letting the container mount cgroupfs, which would let it escape any cgroup restrictions that are applied to it. It’s not an issue in most environments, but if you don’t trust your containers at all, then you shouldn’t be using nesting at this point.

So to enable nesting for our “p1” container, edit /var/lib/lxc/p1/config and add:

lxc.aa_profile = lxc-container-default-with-nesting

And then restart “p1”. Once that’s done, install lxc inside the container. I usually recommend using the same version as the host, though that’s not strictly required.

Once LXC is installed in the container, run:

sudo lxc-create -t ubuntu -n p1

As you’ve previously bind-mounted /var/cache/lxc inside the container, this should be very quick (it shouldn’t rebootstrap the whole environment). Then start that new container as usual.

At that point, you may now run lxc-ls on the host in nesting mode to see exactly what’s running on your system:

stgraber@castiana:~$ sudo lxc-ls --fancy --nesting
NAME    STATE    IPV4                 IPV6   AUTOSTART  
------------------------------------------------------
p1      RUNNING  10.0.3.82, 10.0.4.1  -      NO       
 \_ p1  RUNNING  10.0.4.7             -      NO       
p2      RUNNING  10.0.3.128           -      NO

There’s no real limit to the number of levels you can go, though as fun as it may be, it’s hard to imagine why 10 levels of nesting would be of much use to anyone 🙂

Raw network access

In the previous post I mentioned passing raw devices from the host inside the container. One case where I use this relatively often is when working with a remote network over a VPN. That network uses OpenVPN and a raw ethernet tap device.

I needed to have a completely isolated system access that VPN so I wouldn’t get mixed routes and it’d appear just like any other machine to the machines on the remote site.

All I had to do to make this work was set my container’s network configuration to:

lxc.network.type = phys
lxc.network.hwaddr = 00:16:3e:c6:0e:04
lxc.network.flags = up
lxc.network.link = tap0
lxc.network.name = eth0

Then all I have to do is start OpenVPN on my host, which will connect and set up tap0, then start the container, which will steal that interface and use it as its own eth0. The container will then use DHCP to grab an IP and will behave just as if it were a physical machine connected directly to the remote network.


LXC 1.0: Your second container [2/10]

This is post 2 out of 10 in the LXC 1.0 blog post series.

More templates

So at this point you should have a working Ubuntu container that’s called “p1” and was created using the default template called simply enough “ubuntu”.

But LXC supports much more than just standard Ubuntu. In fact, in current upstream git (and daily PPA), we support Alpine Linux, Alt Linux, Arch Linux, busybox, CentOS, Cirros, Debian, Fedora, OpenMandriva, OpenSUSE, Oracle, Plamo, sshd, Ubuntu Cloud and Ubuntu.

All of those can usually be found in /usr/share/lxc/templates. They also all typically have extra advanced options which you can get to by passing “--help” after the “lxc-create” call (the “--” is required to split “lxc-create” options from the template’s).

Writing extra templates isn’t too difficult, they basically are executables (all shell scripts but that’s not a requirement) which take a set of standard arguments and are expected to produce a working rootfs in the path that’s passed to them.

One thing to be aware of is that due to missing tools not all distros can be bootstrapped on all distros. It’s usually best to just try. We’re always interested in making those work on more distros even if that means using some rather weird tricks (like is done in the fedora template) so if you have a specific combination which doesn’t work at the moment, patches are definitely welcome!

Anyway, enough talking for now, let’s go ahead and create an Oracle Linux container that we’ll force to be 32bit.

sudo lxc-create -t oracle -n p2 -- -a i386

On most systems, this will initially fail, telling you to install the “rpm” package first which is needed for bootstrap reasons. So install it along with “yum” and then try again.

After some time downloading RPMs, the container will be created, then it’s just a:

sudo lxc-start -n p2

And you’ll be greeted by the Oracle Linux login prompt (root / root).

At that point since you started the container without passing “-d” to “lxc-start”, you’ll have to shut it down to get your shell back (you can’t detach from a container which wasn’t started initially in the background).

Now, if you are wondering why Ubuntu has two templates: the “ubuntu” template which I’ve been using so far does a local bootstrap using “debootstrap”, basically building your container from scratch, whereas the Ubuntu Cloud template (“ubuntu-cloud”) downloads a pre-generated cloud image (identical to what you’d get on EC2 or other cloud services) and starts it. That image also includes cloud-init and supports the standard cloud metadata.

It’s a matter of personal choice which you like best. I personally have a local mirror so the “ubuntu” template is much faster for me and I also trust it more since I know everything was downloaded from the archive in front of me and assembled locally on my machine.

One last note on templates. Most of them use a local cache, so the initial bootstrap of a container for a given arch will be slow, any subsequent one will just be a local copy from the cache and will be much faster.

Auto-start

So what if you want to start a container automatically at boot time?

Well, that’s been supported for a long time in Ubuntu and other distros by using some init scripts and symlinks in /etc, but very recently (two days ago), this has now been implemented cleanly upstream.

So here’s how auto-started containers work nowadays:

As you may know, each container has a configuration file typically under
/var/lib/lxc/<container name>/config

That file is key = value with the list of valid keys being specified in lxc.conf(5).

The startup related values that are available are:

  • lxc.start.auto = 0 (disabled) or 1 (enabled)
  • lxc.start.delay = 0 (delay in seconds to wait after starting the container)
  • lxc.start.order = 0 (priority of the container, higher value means starts earlier)
  • lxc.group = group1,group2,group3,… (groups the container is a member of)

When your machine starts, an init script will ask “lxc-autostart” to start all containers of a given group (by default, all containers which aren’t in any) in the right order and waiting the specified time between them.

To illustrate that, edit /var/lib/lxc/p1/config and append those lines to the file:

lxc.start.auto = 1
lxc.group = ubuntu

And /var/lib/lxc/p2/config and append those lines:

lxc.start.auto = 1
lxc.start.delay = 5
lxc.start.order = 100

Doing that means that only the p2 container will be started at boot time (since by default only containers without a group are auto-started), its order value won’t matter since it’s alone, and the init script will wait 5s before moving on.

You may check what containers are automatically started using “lxc-ls”:

stgraber@castiana:~$ sudo lxc-ls --fancy
NAME    STATE    IPV4        IPV6                                    AUTOSTART     
---------------------------------------------------------------------------------
p1      RUNNING  10.0.3.128  2607:f2c0:f00f:2751:216:3eff:feb1:4c7f  YES (ubuntu)
p2      RUNNING  10.0.3.165  2607:f2c0:f00f:2751:216:3eff:fe3a:f1c1  YES

Now you can also manually play with those containers using the “lxc-autostart” command which lets you start/stop/kill/reboot any container marked with lxc.start.auto=1.

For example, you could do:

sudo lxc-autostart -a

Which will start any container that has lxc.start.auto=1 (ignoring the lxc.group value) which in our case means it’ll first start p2 (because of order = 100), then wait 5s (because of delay = 5) and then start p1 and return immediately afterwards.

If at that point you want to reboot all containers that are in the “ubuntu” group, you may do:

sudo lxc-autostart -r -g ubuntu

You can also pass “-L” with any of those commands which will simply print which containers would be affected and what the delays would be but won’t actually do anything (useful to integrate with other scripts).
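
For example, to see what the group reboot above would do without actually doing it, something like this should work:

sudo lxc-autostart -L -r -g ubuntu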

Freezing your containers

Sometimes containers may be running daemons that take time to shutdown or restart, yet you don’t want to run the container because you’re not actively using it at the time.

In such cases, “sudo lxc-freeze -n <container name>” can be used. That very simply freezes all the processes in the container so they won’t get any time allocated by the scheduler. However the processes will still exist and will still use whatever memory they used to.

Once you need the service again, just call “sudo lxc-unfreeze -n <container name>” and all the processes will be restarted.
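
With the “p1” container from the previous posts, that’s simply:

sudo lxc-freeze -n p1
sudo lxc-unfreeze -n p1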

Networking

As you may have noticed in the configuration file while you were setting the auto-start settings, LXC has a relatively flexible network configuration.
By default in Ubuntu we allocate one “veth” device per container which is bridged into a “lxcbr0” bridge on the host on which we run a minimal dnsmasq dhcp server.

While that’s usually good enough for most people, you may want something slightly more complex, such as multiple network interfaces in the container or passing through physical network interfaces, … The details of all of those options are listed in lxc.conf(5) so I won’t repeat them here, however here’s a quick example of what can be done.

lxc.network.type = veth
lxc.network.hwaddr = 00:16:3e:3a:f1:c1
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.network.name = eth0

lxc.network.type = veth
lxc.network.link = virbr0
lxc.network.name = virt0

lxc.network.type = phys
lxc.network.link = eth2
lxc.network.name = eth1

With this setup my container will have 3 interfaces, eth0 will be the usual veth device in the lxcbr0 bridge, eth1 will be the host’s eth2 moved inside the container (it’ll disappear from the host while the container is running) and virt0 will be another veth device in the virbr0 bridge on the host.

Those last two interfaces don’t have a mac address or network flags set, so they’ll get a random mac address at boot time (non-persistent) and it’ll be up to the container to bring the link up.

Attach

Provided you are running a sufficiently recent kernel, that is 3.8 or higher, you may use the “lxc-attach” tool. Its most basic feature is to give you a standard shell inside a running container:

sudo lxc-attach -n p1

You may also use it from scripts to run actions in the container, such as:

sudo lxc-attach -n p1 -- restart ssh

But it’s a lot more powerful than that. For example, take:

sudo lxc-attach -n p1 -e -s 'NETWORK|UTSNAME'

In that case, you’ll get a shell that says “root@p1” (thanks to UTSNAME); running “ifconfig -a” from there will list the container’s network interfaces, but everything else will be that of the host. Also passing “-e” means that the cgroup, apparmor, … restrictions won’t apply to any processes started from that shell.

This can be very useful at times to run software located on the host but inside the container’s network or pid namespace.

Passing devices to a running container

It’s great being able to enter and leave the container at will, but what about accessing some random devices on your host?

By default LXC will prevent any such access using the devices cgroup as a filtering mechanism. You could edit the container configuration to allow the right additional devices and then restart the container.

But for one-off things, there’s also a very convenient tool called “lxc-device”.
With it, you can simply do:

sudo lxc-device add -n p1 /dev/ttyUSB0 /dev/ttyS0

Which will add (mknod) /dev/ttyS0 in the container with the same type/major/minor as /dev/ttyUSB0 and then add the matching cgroup entry allowing access from the container.

The same tool also allows moving network devices from the host to within the container.
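
The syntax appears to be the same, so moving the host’s eth1 into the container (interface name picked arbitrarily) should be something along the lines of:

sudo lxc-device add -n p1 eth1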
