Category Archives: LXC

LXC in Ubuntu 12.04 LTS

Quite a few people have been asking for a status update on LXC in Ubuntu as of Ubuntu 12.04 LTS. This post is meant as an overview of the work we did over the past 6 months, with pointers to more detailed blog posts for some of the new features.

What’s LXC?

LXC is a userspace tool that drives the kernel's namespace and cgroup features to create system or application containers.

To give you an idea:

  • Feels like somewhere between a chroot and a VM
  • Can run a full distro using the “host” kernel
  • Processes running in a container are visible from the outside
  • Doesn’t require any specific hardware, works on all supported architectures

A libvirt driver for LXC exists (libvirt-lxc). However, it doesn’t use the “lxc” userspace tool, even though it uses the same kernel features.

Making LXC easier

One of the main focuses for 12.04 LTS was to make LXC dead easy to use. To achieve this, we’ve been working on a few different fronts, fixing known bugs and improving LXC’s default configuration.

Creating a basic container and starting it on Ubuntu 12.04 LTS is now down to:

sudo apt-get install lxc
sudo lxc-create -t ubuntu -n my-container
sudo lxc-start -n my-container

This will default to using the same version and architecture as your machine. Additional options are available (--help will list them). The login and password are both ubuntu.

Another thing we worked on to make LXC easier to work with is reducing the number of hacks required to turn a regular system into a container down to zero.
Starting with 12.04, we don’t do any modification to a standard Ubuntu system to get it running in a container.
It’s now even possible to take a raw VM image and have it boot in a container!

The ubuntu-cloud template also lets you get one of our EC2/cloud images and have it start as a container instead of a cloud instance:

sudo apt-get install lxc cloud-utils
sudo lxc-create -t ubuntu-cloud -n my-cloud-container
sudo lxc-start -n my-cloud-container

And finally, if you want to test the new cool stuff, you can also use juju with LXC:

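# Generate an SSH key if you don't already have one
[ ! -f ~/.ssh/id_rsa.pub ] && ssh-keygen -t rsa
sudo apt-get install juju apt-cacher-ng zookeeper lxc libvirt-bin --no-install-recommends
sudo adduser $USER libvirtd
# The first bootstrap only generates a sample ~/.juju/environments.yaml
juju bootstrap
# Switch the sample environment from EC2 to the local (LXC) provider
sed -i "s/ec2/local/" ~/.juju/environments.yaml
echo "    data-dir: /tmp/juju" >> ~/.juju/environments.yaml
# The second bootstrap actually starts the local environment
juju bootstrap
juju deploy mysql
juju deploy wordpress
juju add-relation wordpress mysql
juju expose wordpress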

# To tail the logs
juju debug-log

# To get the IPs and status
juju status

Making LXC safer

Another main focus for LXC in Ubuntu 12.04 was to make it safe. John Johansen did an amazing job of extending apparmor to let us implement per-container apparmor profiles and prevent most known dangerous behaviours from happening in a container.

NOTE: Until we have user namespaces implemented in the kernel and used by LXC, we will NOT say that LXC is root safe. However, the default apparmor profile as shipped in Ubuntu 12.04 LTS blocks any harmful action that we are aware of.

This mostly means that write access to /proc and /sys is heavily restricted. Mounting filesystems is also restricted, with only known-safe filesystems allowed by default. Capabilities are also restricted in the default LXC profile to prevent a container from loading kernel modules or controlling apparmor.

More details on this are available here:

Other cool new stuff

Emulated architecture containers

It’s now possible to use qemu-user-static with LXC to run containers of non-native architectures, for example:

sudo apt-get install lxc qemu-user-static
sudo lxc-create -n my-armhf-container -t ubuntu -- -a armhf
sudo lxc-start -n my-armhf-container
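
As a rough illustration of how the template options fit into the command line, here is a minimal Python sketch that only builds the argument list (it does not run anything). The function name lxc_create_argv is made up for this example; the flags themselves (-n, -t, -a, -r and the "--" separator) are the ones used in this post and in the armhf post further down.

```python
def lxc_create_argv(name, template="ubuntu", arch=None, release=None):
    """Build the lxc-create command line as a list of arguments.

    Hypothetical helper: options after "--" go to the template, not to lxc-create.
    """
    argv = ["sudo", "lxc-create", "-n", name, "-t", template]
    template_args = []
    if arch:
        template_args += ["-a", arch]       # container architecture, e.g. armhf
    if release:
        template_args += ["-r", release]    # Ubuntu release, e.g. precise
    if template_args:
        argv += ["--"] + template_args
    return argv


if __name__ == "__main__":
    print(" ".join(lxc_create_argv("my-armhf-container", arch="armhf")))
```

Leaving arch and release unset produces the plain command, which matches the default of following the host's version and architecture.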

Ephemeral containers

Quite a bit of work also went into lxc-start-ephemeral, the tool letting you start a copy of an existing container using an overlay filesystem, discarding any change you make on shutdown:

sudo apt-get install lxc
sudo lxc-create -n my-container -t ubuntu
sudo lxc-start-ephemeral -o my-container
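
To give an idea of what "overlay" and "discarding any change" mean here, the following is a toy Python model, not how the overlay filesystem or lxc-start-ephemeral is actually implemented. File paths and contents are invented for the example; the point is only that writes land in an upper layer while the original container stays untouched.

```python
from collections import ChainMap

# Lower layer: the existing container's files (invented sample data).
base = {"/etc/hostname": "my-container", "/etc/motd": "welcome"}

# Upper layer: starts empty and receives every write made in the ephemeral copy.
overlay = {}

# Lookups check the overlay first, then fall back to the base layer.
view = ChainMap(overlay, base)

view["/etc/motd"] = "changed in the ephemeral copy"
view["/tmp/scratch"] = "temporary data"

changed_motd = view["/etc/motd"]      # served from the overlay
untouched_base = base["/etc/motd"]    # the original container is unmodified

# "Shutdown": dropping the overlay discards every change.
overlay.clear()
motd_after_shutdown = view["/etc/motd"]
```

This toy model ignores deletions and directories, which a real overlay filesystem also has to handle.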

Container nesting

You can now start a container inside a container!
For that to work, you first need to create a new apparmor profile, as the default one doesn’t allow this for security reasons.
I already did that for you, so the few commands below will download it and install it in /etc/apparmor.d/lxc/lxc-with-nesting. This profile (or something close to it) will ship in Ubuntu 12.10 as an example of an alternate apparmor profile for containers.

sudo apt-get install lxc
sudo lxc-create -t ubuntu -n my-host-container
sudo wget https://www.stgraber.org/download/lxc-with-nesting -O /etc/apparmor.d/lxc/lxc-with-nesting
sudo /etc/init.d/apparmor reload
sudo sed -i "s/#lxc.aa_profile = unconfined/lxc.aa_profile = lxc-container-with-nesting/" /var/lib/lxc/my-host-container/config
sudo lxc-start -n my-host-container
(in my-host-container) sudo apt-get install lxc
(in my-host-container) sudo stop lxc
(in my-host-container) sudo sed -i "s/10.0.3/10.0.4/g" /etc/default/lxc
(in my-host-container) sudo start lxc
(in my-host-container) sudo lxc-create -n my-sub-container -t ubuntu
(in my-host-container) sudo lxc-start -n my-sub-container
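
The sed step above moves the nested LXC bridge from 10.0.3.x to 10.0.4.x so that it doesn't reuse the range already used by the outer bridge. A minimal Python sketch of that check follows; the /24 prefix length and the function name are assumptions made for the example, since the commands above only show the first three octets.

```python
import ipaddress


def subnets_conflict(outer, inner):
    """Return True when the nested bridge's subnet overlaps the outer one."""
    return ipaddress.ip_network(outer).overlaps(ipaddress.ip_network(inner))


# Assumed /24 networks, matching the 10.0.3 -> 10.0.4 substitution above.
outer_bridge = "10.0.3.0/24"
nested_default = "10.0.3.0/24"
nested_renumbered = nested_default.replace("10.0.3", "10.0.4")

if __name__ == "__main__":
    print(subnets_conflict(outer_bridge, nested_default))
    print(subnets_conflict(outer_bridge, nested_renumbered))
```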

Documentation

Beyond the existing manpages and the blog posts I mentioned throughout this post, Serge Hallyn did a very good job of creating a whole section dedicated to LXC in the Ubuntu Server Guide.
You can read it here: https://help.ubuntu.com/12.04/serverguide/lxc.html

Next steps

Next week we have the Ubuntu Developer Summit in Oakland, CA. There we’ll be working on the plans for LXC in Ubuntu 12.10. We currently have two sessions scheduled:

If you want to make sure the changes you want will be in Ubuntu 12.10, please make sure to join these two sessions. It’s possible to participate remotely in the Ubuntu Developer Summit, through IRC and audio streaming.

My personal hope for LXC in Ubuntu 12.10 is to have a clean liblxc library that can be used to create bindings for languages like python. Working towards that goal should make it easier to do automated testing of LXC and to clean up our current tools.

I hope this post made you want to try LXC or, for existing users, helped you discover some of the new features that appeared in Ubuntu 12.04. We’re actively working on improving LXC both upstream and in Ubuntu, so do not hesitate to report bugs (preferably with “ubuntu-bug lxc”).

Posted in Canonical voices, Conferences, LXC, Planet Ubuntu | Tagged | 64 Comments

Booting an Ubuntu 12.04 virtual machine in an LXC container

One thing that we’ve been working on for LXC in 12.04 is getting rid of any remaining LXC-specific hacks in our templates. This means that you can now run a perfectly clean Ubuntu system in a container without any change.

To better illustrate that, here’s a guide on how to boot a standard Ubuntu VM in a container.

First, you’ll need an Ubuntu VM image in raw disk format. The next few steps also assume a default partitioning where the first primary partition is the root device. Make sure you have the lxc package installed and up to date and lxcbr0 enabled (the default with recent LXC).

Then run kpartx -a vm.img. This will create loop devices in /dev/mapper for your VM partitions. In the following configuration, I’m assuming /dev/mapper/loop0p1 is the root partition.

Now write a new LXC configuration file (myvm.conf in my case) containing:

lxc.network.type = veth
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.utsname = myvminlxc

lxc.tty = 4
lxc.pts = 1024
lxc.rootfs = /dev/mapper/loop0p1
lxc.arch = amd64
lxc.cap.drop = sys_module mac_admin

lxc.cgroup.devices.deny = a
# Allow any mknod (but not using the node)
lxc.cgroup.devices.allow = c *:* m
lxc.cgroup.devices.allow = b *:* m
# /dev/null and zero
lxc.cgroup.devices.allow = c 1:3 rwm
lxc.cgroup.devices.allow = c 1:5 rwm
# consoles
lxc.cgroup.devices.allow = c 5:1 rwm
lxc.cgroup.devices.allow = c 5:0 rwm
#lxc.cgroup.devices.allow = c 4:0 rwm
#lxc.cgroup.devices.allow = c 4:1 rwm
# /dev/{,u}random
lxc.cgroup.devices.allow = c 1:9 rwm
lxc.cgroup.devices.allow = c 1:8 rwm
# /dev/pts/* and /dev/ptmx
lxc.cgroup.devices.allow = c 136:* rwm
lxc.cgroup.devices.allow = c 5:2 rwm
# rtc
lxc.cgroup.devices.allow = c 254:0 rwm
#fuse
lxc.cgroup.devices.allow = c 10:229 rwm
#tun
lxc.cgroup.devices.allow = c 10:200 rwm
#full
lxc.cgroup.devices.allow = c 1:7 rwm
#hpet
lxc.cgroup.devices.allow = c 10:228 rwm
#kvm
lxc.cgroup.devices.allow = c 10:232 rwm

The lxc.arch, lxc.rootfs and lxc.network.link values may need updating if you’re not using the same architecture, partition scheme or bridge as I am.
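
If you boot several VM images this way, the handful of values that change between setups can be templated. The following is a minimal Python sketch under that assumption; render_vm_config is a made-up helper and it only emits the machine-specific keys, so the rest of the configuration above (tty, pts, capabilities and device rules) still has to be appended unchanged.

```python
def render_vm_config(name, rootfs, arch="amd64", bridge="lxcbr0"):
    """Render the machine-specific part of the LXC configuration shown above.

    Hypothetical helper: the key names come from the configuration in this
    post, the defaults mirror the values used there.
    """
    lines = [
        "lxc.network.type = veth",
        "lxc.network.flags = up",
        f"lxc.network.link = {bridge}",
        f"lxc.utsname = {name}",
        f"lxc.rootfs = {rootfs}",
        f"lxc.arch = {arch}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only: a second partition as root and a custom bridge.
    print(render_vm_config("myvminlxc", "/dev/mapper/loop0p2", bridge="br0"))
```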

Then finally, run: lxc-start -n myvminlxc -f myvm.conf

And watch your VM boot in an LXC container.

I did this test with a desktop VM using network manager, so it didn’t mind LXC’s random MAC address. Server VMs might get stuck for a minute at boot time because of that, though.
In that case, either clean /etc/udev/rules.d/70-persistent-net.rules or set “lxc.network.hwaddr” to the same MAC address as your VM.
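
For the second option, the MAC address the VM expects can be read out of its persistent-net rules file. Here is a minimal Python sketch of that idea; the sample rule line is invented for the example (real files carry more attributes per rule) and the helper name is hypothetical.

```python
import re

# Matches the ATTR{address}=="xx:xx:xx:xx:xx:xx" part of a udev net rule.
ADDRESS_RE = re.compile(r'ATTR\{address\}=="([0-9a-fA-F:]{17})"')


def hwaddr_line(rules_text):
    """Return an lxc.network.hwaddr line for the first MAC found, or None."""
    match = ADDRESS_RE.search(rules_text)
    if match is None:
        return None
    return "lxc.network.hwaddr = " + match.group(1).lower()


# Invented sample rule, shortened for readability.
SAMPLE_RULE = (
    'SUBSYSTEM=="net", ACTION=="add", '
    'ATTR{address}=="08:00:27:AA:bb:cc", KERNEL=="eth*", NAME="eth0"'
)

if __name__ == "__main__":
    print(hwaddr_line(SAMPLE_RULE))
```

The resulting line can then be added to the container configuration so the interface keeps the address the VM already knows.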

Once done, run kpartx -d vm.img to remove the loop devices.


Ever wanted an armhf container on your x86 machine? It’s now possible with LXC in Ubuntu Precise

It took a while to get some apt resolver bugs fixed, a few packages marked for multi-arch and some changes in the Ubuntu LXC template, but since yesterday, you can now run (using up to date Precise):

  • sudo apt-get install lxc qemu-user-static
  • sudo lxc-create -n armhf01 -t ubuntu -- -a armhf -r precise
  • sudo lxc-start -n armhf01
  • Then log in with root as both login and password

And enjoy an armhf system running on your good old x86 machine.

Now, obviously it’s pretty far from what you’d get on real ARM hardware.
It’s using qemu’s user space CPU emulation (qemu-user-static), so it won’t be particularly fast, will likely use a lot of CPU and may give results pretty different from what you’d expect on real hardware.

Also, because of limitations in qemu-user-static, a few packages from the “host” architecture are installed in the container. These are mostly anything that requires the use of ptrace (upstart) or the use of netlink (mountall, iproute and isc-dhcp-client).
This is the bare minimum I needed to install to get the rest of the container to work using armhf binaries. I obviously didn’t test everything and I’m sure quite a few other packages will fail in such an environment.

This feature should be used as an improvement on top of a regular armhf chroot using qemu-user-static and not as a replacement for actual ARM hardware (obviously), but it’s cool to have around and nice to show what LXC can do.

I confirmed it to work for armhf and armel. powerpc should also work, though debootstrap failed when I tried it earlier today.

Enjoy!


Networking in Ubuntu 12.04 LTS

One of my focuses for this cycle is to get Ubuntu’s support for complex networking working in a predictable way. The idea was to review exactly what’s happening at boot time, get a list of possible scenarios that are used on servers in corporate environments and make sure these always work.

Bonding

Bonding basically means aggregating multiple physical links into one virtual link for high availability and load balancing. There are different ways of setting up such a link, though the industry standard is 802.3ad (LACP – Link Aggregation Control Protocol). In that mode your server will negotiate with your switch to establish an aggregate link, then send monitoring packets to detect failure. LACP also does load balancing (MAC, IP and protocol based depending on hardware support).
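
To make "MAC based" load balancing a bit more concrete, here is a deliberately simplified Python sketch: it hashes the source and destination MAC addresses and uses the result to pick one of the links. This is an illustration of the general idea only; the exact hash used by a given switch or by the Linux bonding driver differs, and the function name and addresses are invented.

```python
def pick_link(src_mac, dst_mac, link_count):
    """Pick a link index from the two MAC addresses (simplified XOR hash)."""
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return (src ^ dst) % link_count


if __name__ == "__main__":
    server = "00:16:3e:00:00:01"
    for peer in ("00:16:3e:00:00:02", "00:16:3e:00:00:03"):
        # Traffic between the same two hosts always uses the same link,
        # different peers may end up on different links.
        print(peer, "-> link", pick_link(server, peer, 2))
```

A consequence of this kind of scheme is that a single conversation never gets more than one link's worth of bandwidth; the aggregate only helps when there are several peers.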

One problem we had since at least Ubuntu 10.04 LTS is that Ubuntu’s boot sequence is event based, including bringing up network interfaces. The good old “ifup -a” is only done at the end of the boot sequence to try and fix anything that wasn’t brought up through events.

Unfortunately that meant that if your server takes a long time to detect the hardware, your bond would be initialised before the network cards have been detected, giving you a bond0 without a MAC address, making DHCP queries fail in pretty weird ways and making bridging or tagging fail with “Operation not permitted”.
As that all depends on hardware detection timing, it was racy, giving you random results at boot time.

Thankfully that should now be all fixed in 12.04. The new ifenslave-2.6 I uploaded a few weeks ago now initialises the bond whenever the first slave appears. If no slave has appeared by the time we get to the catch-all “ifup -a”, it’ll simply wait for up to an additional minute for a slave to appear before giving up and continuing the boot sequence.
To avoid another common race condition where a bridge is brought up with a bond as one of its members before the bond is ready, ifenslave will now detect a bond is part of a bridge and add it only once ready.
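
The "wait up to a minute, then give up" behaviour is a plain poll-with-deadline loop. The sketch below shows that pattern in Python; it is not the actual ifenslave code, and the function name, polling interval and the injectable clock and sleep parameters (there to make the loop testable without really waiting) are assumptions.

```python
import time


def wait_for_slave(slave_present, timeout=60.0, interval=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll until slave_present() is true or the timeout expires.

    Returns True if a slave appeared, False if we gave up so that
    the boot sequence can continue.
    """
    deadline = clock() + timeout
    while True:
        if slave_present():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Example: a slave that is already there returns immediately.
    print(wait_for_slave(lambda: True))
```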

Tagging

Another pretty common thing on corporate networks is the use of VLANs (802.1q), letting you create up to 4094 virtual networks on one link (the VLAN ID is a 12-bit field, so 4096 values, two of which are reserved).
In the past, Ubuntu would rely on the catch-all “ifup -a” to create any required vlan interface. Once again, that’s a problem when an interface that depends on that vlan interface is initialised before the vlan interface is created.

To fix that, Ubuntu 12.04’s vlan package now ships with a udev rule that triggers the creation of the vlan interface whenever its parent interface is created.
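
For reference, the VLAN ID in 802.1q is a 12-bit field, which is where the limit on the number of VLANs per link comes from. The short Python sketch below works through that arithmetic and builds an interface name in the common "parent.id" form; the helper name is made up and the naming scheme is only one of the conventions in use.

```python
VID_BITS = 12
TOTAL_IDS = 2 ** VID_BITS          # 4096 possible values
RESERVED_IDS = {0, TOTAL_IDS - 1}  # 0 and 4095 are reserved by the standard
USABLE_IDS = TOTAL_IDS - len(RESERVED_IDS)


def vlan_ifname(parent, vid):
    """Return a "parent.vid" style interface name for a valid VLAN ID."""
    if vid in RESERVED_IDS or not 0 <= vid < TOTAL_IDS:
        raise ValueError(f"VLAN ID {vid} is not usable")
    return f"{parent}.{vid}"


if __name__ == "__main__":
    print(TOTAL_IDS, USABLE_IDS)
    print(vlan_ifname("bond0", 100))
```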

Bridging

Bridging on Linux can be seen as creating a virtual switch on your system (including STP support).

Bridges have been working pretty well for a while on Ubuntu as we’ve been shipping a udev rule similar to the one for vlans for a few releases already. Members are simply added to the bridge as they appear on the system. The changes to ifenslave and the vlan package make sure that even bond interfaces with VLANs get properly added to bridges.

Complex network configuration example

My current test setup for networking on Ubuntu 12.04 is actually something I’ve been using on my network for years.

As you may know, I’m also working on LXC (Linux Containers), so my servers usually run somewhere between 15 and 80 containers. Each of these containers has a virtual ethernet interface that’s bridged.
I have one bridge per network zone, each of these network zones being a separate VLAN. These VLANs are created on top of a bond of two gigabit links.

At boot time, the following happens (roughly):

  1. One of the two network interfaces appears
  2. The bond is initialised and the first interface is enslaved
  3. This triggers the creation of all the VLAN interfaces
  4. Creating the VLAN interfaces triggers the creation of all the bridges
  5. All the VLAN interfaces are added to their respective bridge
  6. The other network interface appears and gets added to the bond
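
The sequence above is essentially a chain of "when X appears, create Y" events. The following Python sketch simulates that chain for a made-up topology (two VLANs and two bridges; all interface names are invented) to show the order in which things come up. It is a model of the event flow, not of the actual udev rules or scripts.

```python
from collections import deque

# Hypothetical topology: device -> the device whose appearance triggers it.
TRIGGERED_BY = {
    "bond0": "eth0",
    "bond0.10": "bond0",
    "bond0.20": "bond0",
    "br-lan": "bond0.10",
    "br-dmz": "bond0.20",
}


def bring_up(first_device, triggered_by=TRIGGERED_BY):
    """Return the order in which devices come up once first_device appears."""
    order = []
    queue = deque([first_device])
    while queue:
        device = queue.popleft()
        if device in order:
            continue
        order.append(device)
        # Every device waiting on this one is triggered next.
        queue.extend(d for d, parent in triggered_by.items() if parent == device)
    return order


if __name__ == "__main__":
    print(" -> ".join(bring_up("eth0")))
```

In this model nothing is ever created before the device it depends on, which is the property the event-based setup is meant to guarantee regardless of hardware detection timing.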

My /etc/network/interfaces can be found here:
http://www.stgraber.org/download/complex-interfaces

This contains the very strict minimum needed for LACP to work. One thing worth noting is that the two physical interfaces are listed before bond0. This ensures that even if the events don’t work and we have to rely on the fallback “ifup -a”, the interfaces will be initialised in the right order, avoiding the 60s delay.
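
The ordering point can be checked mechanically. Below is a minimal Python sketch that reads an interfaces file and verifies that every interface carrying a bond-master option is listed in an "auto" line before its bond. The sample configuration is an assumption written for this example, not the file linked above, and the parser only understands the few keywords it needs.

```python
def slaves_listed_before_bond(interfaces_text):
    """Check that each bond slave's "auto" entry precedes its bond's entry."""
    position = {}   # interface name -> order of appearance in "auto" lines
    masters = {}    # slave name -> bond named by its bond-master option
    current = None
    for line in interfaces_text.splitlines():
        words = line.split()
        if not words or words[0].startswith("#"):
            continue
        if words[0] == "auto":
            for name in words[1:]:
                position.setdefault(name, len(position))
        elif words[0] == "iface" and len(words) > 1:
            current = words[1]
        elif words[0] == "bond-master" and current and len(words) > 1:
            masters[current] = words[1]
    return all(
        slave in position and master in position
        and position[slave] < position[master]
        for slave, master in masters.items()
    )


# Assumed sample configuration, reduced to the lines relevant to ordering.
SAMPLE = """\
auto eth0
iface eth0 inet manual
    bond-master bond0

auto eth1
iface eth1 inet manual
    bond-master bond0

auto bond0
iface bond0 inet manual
    bond-mode 802.3ad
"""

if __name__ == "__main__":
    print(slaves_listed_before_bond(SAMPLE))
```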

Please note that this example will only reliably work with Ubuntu Precise (to become 12.04 LTS). It’s still a correct configuration for previous releases but race conditions may give you a random result.

I’ll be trying to push these changes to Ubuntu 11.10 as they are pretty easy to backport there, however it’d be much harder and very likely dangerous to backport these to even older releases.
For these, the only recommendation I can give is to add some “pre-up sleep 5” or similar to your bridges and vlan interfaces to make sure whatever interface they depend on exists and is ready by the time the “ifup -a” call is reached.

IPv6

Another interesting topic for 12.04 is IPv6. As a release that’ll be supported for 5 years on both servers and desktops, it’s important to get IPv6 right.

Ubuntu 12.04 LTS will be the first Ubuntu release shipping with IPv6 privacy extensions turned on by default. Ubuntu 11.10 already brought most of what’s needed for IPv6 support on the desktop and in the installer, supporting SLAAC (stateless autoconfiguration), stateless DHCPv6 and stateful DHCPv6.

Once we get a new ifupdown in Ubuntu Precise, we’ll have full support for IPv6 also for people that aren’t using Network Manager (mostly servers) which should at this point give us support for any IPv6 setup you may find.

The userspace has been working pretty well with IPv6 for years. I recently made my whole network dual-stack and now have all my servers and services defaulting to IPv6 for a total of 40% of my network traffic (roughly 1.5TB a month of IPv6 traffic). The only user space related problem I noticed is the lack of IPv6 support in Nagios’ nrpe plugin, meaning I can’t start converting servers to single stack IPv6 as I’d lose monitoring …

I also wrote a python script using pypcap to give me the percentage of IPv6 and IPv4 traffic going through a given interface. The script can be found here: http://www.stgraber.org/download/v6stats.py (start with: python v6stats.py eth0)
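
The classification step behind such a measurement is small enough to sketch here. The Python below is not the script linked above and does no capturing at all: it takes raw Ethernet frames that were captured some other way, looks at the EtherType field and reports the share of bytes per IP version. Counting bytes rather than packets, the function name and the sample frames are all assumptions.

```python
import struct

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


def ip_version_share(frames):
    """Return the percentage of bytes carried by IPv4 and IPv6 frames."""
    counts = {"ipv4": 0, "ipv6": 0}
    for frame in frames:
        if len(frame) < 14:
            continue  # too short to hold an Ethernet header
        # The EtherType sits right after the two 6-byte MAC addresses.
        (ethertype,) = struct.unpack("!H", frame[12:14])
        if ethertype == ETHERTYPE_IPV4:
            counts["ipv4"] += len(frame)
        elif ethertype == ETHERTYPE_IPV6:
            counts["ipv6"] += len(frame)
    total = sum(counts.values())
    if total == 0:
        return {"ipv4": 0.0, "ipv6": 0.0}
    return {name: 100.0 * size / total for name, size in counts.items()}


if __name__ == "__main__":
    # Two fake frames: 60 bytes of IPv4 and 40 bytes of IPv6.
    v4_frame = b"\x00" * 12 + b"\x08\x00" + b"\x00" * 46
    v6_frame = b"\x00" * 12 + b"\x86\xdd" + b"\x00" * 26
    print(ip_version_share([v4_frame, v6_frame]))
```

Frames carrying an 802.1q tag have a different EtherType in that position and are simply ignored by this sketch.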

What now?

At this point, I think Ubuntu Precise is pretty much ready as far as networking is concerned. The only remaining change is the new ifupdown and the small installer change that comes with it for DHCPv6 support.

If you have a spare machine or VM, now is the time to grab a recent build of Ubuntu Precise and make sure whatever network configuration you use in production works reliably for you and that any hack you needed in the past can now be dropped.
If your configuration doesn’t work at all or fails randomly, please file a bug so we can have a look at your configuration and fix any bug (or documentation).


Using Arkose for development and packaging

Since I last reinstalled my laptop, I try to keep my usually insanely long list of installed packages to a bare minimum. I’d usually have hundreds if not thousands of libraries and development packages as these are required by a bunch of packages I maintain or code I work on.

To achieve this and still be as productive as before (if not more), I’m using arkose quite a lot to generate temporary dev/build environments that are wiped as soon as I close the shell.
This helps keep the number of extra libraries to a minimum, avoids situations where something mysteriously works fine on my laptop but not on another machine, and avoids the maintenance needed when dealing with chroots.

[Screenshot: Arkose used to install libdbus-1-dev]

An example of this is when I’m working on ubiquity (the Ubuntu graphical installer).
Ubiquity depends on quite a few libraries and development packages that are required even if you just want to build its source package.

So having arkose installed on my system, I usually start working on a bug with:

sudo arkose -n -h -c "cd $PWD; $SHELL"

You can make that an alias if you use it quite often. At this point, you’ll see your shell showing a different hostname, like “arkose-tmpaF9yqa”. That’s how you know you’re in a container.
The command above creates a new container using copy-on-write for the whole file system except your home directory and lets the container access the network without any restriction.

I then install all the packages I’ll need to work:

sudo apt-get build-dep ubiquity

Then work as usual in that container, run debuild, dput, … everything should work as usual as it has direct access to my home directory.

Once I’m done and I don’t need all these packages anymore, I just exit that shell and all the changes done outside of /home will be lost.
