Author Archives: Stéphane Graber

About Stéphane Graber

Project leader of Linux Containers, Linux hacker, Ubuntu core developer, conference organizer and speaker.

LXC 2.0 has been released!

LXD logo

Introduction

Today I’m very pleased to announce the release of LXC 2.0, our second Long Term Support Release! LXC 2.0 is the result of a year of work by the LXC community with over 700 commits done by over 90 contributors!

It joins LXCFS 2.0 which was released last week and will very soon be joined by LXD 2.0 to complete our collection of 2.0 container management tools!

What’s new?

The complete changelog is linked below but the main highlights for me are:

  • More consistent user experience between the various LXC tools.
  • Improved checkpoint/restore support.
  • Complete rework of our CGroup handling code, including support for the CGroup namespace.
  • Cleaned up storage backend subsystem, including the addition of a new Ceph RBD backend.
  • A massive amount of bugfixes.
  • And lastly, we managed to get all that done without breaking our API, so LXC 2.0 is fully API compatible with LXC 1.0.

The focus with this release was stability and maintaining support for all the environments in which LXC shines. We still support all kernels back to 2.6.32, though the exact feature set obviously varies with the kernel. We also improved support for a bunch of architectures and fixed a lot of bugs and other rough edges.

This is the release you want to run in production for the next few years!

Support length

As mentioned, LXC 2.0 is a Long Term Support release.
This is the second time we have done such a release, the first being LXC 1.0.

Long Term Support releases come with a 5-year commitment from upstream to provide bugfixes and security updates and to put out new point releases when enough fixes have accumulated.

The end of life dates for the various LXC versions are as follows:

  • LXC 1.0, released in February 2014, will EOL on the 1st of June 2019
  • LXC 1.1, released in February 2015, will EOL on the 1st of September 2016
  • LXC 2.0, released in April 2016, will EOL on the 1st of June 2021

We therefore very strongly recommend that LXC 1.1 users update to LXC 2.0, as we will not be supporting that release for much longer.

We also recommend production deployments stick to our Long Term Support release.

Project information

Upstream website: https://linuxcontainers.org/lxc/
Release announcement: https://linuxcontainers.org/lxc/news/
Code: https://github.com/lxc/lxc
IRC channel: #lxcontainers on irc.freenode.net
Mailing-lists: https://lists.linuxcontainers.org

Try it online

Want to see what a container with LXC 2.0 installed feels like?
You can get one online to play with here.

LXCFS 2.0 has been released!

LXD logo

What’s LXCFS?

LXCFS is a side project of LXC and LXD. It’s basically a tiny FUSE filesystem which gets mounted in your containers and masks a number of proc files.

At present, it supports the following files:

  • /proc/cpuinfo
    Only returns the CPUs listed in your cpuset
  • /proc/diskstats
    Returns I/O usage from the container
  • /proc/meminfo
    Only shows the amount of memory and SWAP the container can use
  • /proc/stat
    Related to cpuinfo, only lists the right CPUs
  • /proc/swaps
    Related to meminfo, only shows your container’s swap consumption
  • /proc/uptime
    Shows the container uptime instead of the host’s

It’s basically a userspace workaround to changes which were deemed unreasonable to do in the kernel. It makes containers feel much more like separate systems than they would without it.

On top of the proc virtualization feature, lxcfs also supports rendering a partial cgroupfs view which can then be mounted into a container on top of /sys/fs/cgroup, allowing processes in the container to interact with the cgroups in a safe way.

This part is only enabled on kernels that do not support the cgroup namespace, as newer kernels (4.6 upstream, 4.4 Ubuntu) no longer need this.
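
To give you a quick idea of the effect (assuming an LXD container named “c1” with a memory limit applied; the name is just an example), you can compare the host view with the container view:

# On the host, this shows the full system memory
free -m
# Inside the container, only the amount the container can use is visible
lxc exec c1 -- free -m
# And the lxcfs mounts over /proc are visible from inside the container
lxc exec c1 -- grep lxcfs /proc/self/mounts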

Why do I need it?

lxcfs isn’t absolutely needed to run LXC or LXD containers.

That being said, you will want it if:

  • You want proper resource consumption reporting inside your container
  • You need to start a systemd based container on a system running a kernel older than 4.6 upstream (or 4.4 Ubuntu)

LXD in Ubuntu actually depends on LXCFS as we think it’s a critical part of offering a good container experience on Ubuntu.

How to get it?

LXCFS is available in quite a few distributions, so chances are you can just grab it with your package manager. It may take a few days/weeks for 2.0 to be available though.

Ubuntu users have had lxcfs available for a few years now and the 2.0 release is now in the Ubuntu development release. Up to date packages for all Ubuntu releases can also be found in our PPAs.
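
On Ubuntu for example, getting it is typically just a matter of (the package is simply called lxcfs):

sudo apt-get update
sudo apt-get install lxcfs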

What kind of support will this get?

LXCFS 2.0 is a long term support release. That means that upstream LXCFS will be pushing out bugfix and security releases for the next 5 years.

A separate stable branch will be set up upstream and bugfixes will be cherry-picked into it; when enough fixes have accumulated, a bugfix release (like 2.0.1) will be published.

Project information

Upstream website: https://linuxcontainers.org/lxcfs/
Release announcement: https://linuxcontainers.org/lxcfs/news/
Code: https://github.com/lxc/lxcfs
IRC channel: #lxcontainers on irc.freenode.net
Mailing-lists: https://lists.linuxcontainers.org

Try it online

Want to see what a container with LXCFS installed feels like?
You can get one online to play with here.

LXD 2.0: Image management [5/12]

This is the fifth blog post in this series about LXD 2.0.

LXD logo

Container images

If you’ve used LXC before, you probably remember those LXC “templates”, basically shell scripts that spit out a container filesystem and a bit of configuration.

Most templates generate the filesystem by doing a full distribution bootstrapping on your local machine. This may take quite a while, won’t work for all distributions and may require significant network bandwidth.

Back in LXC 1.0, I wrote a “download” template which would allow users to download pre-packaged container images, generated on a central server from the usual template scripts and then heavily compressed, signed and distributed over https. A lot of our users switched from the old style container generation to using this new, much faster and much more reliable method of creating a container.

With LXD, we’re taking this one step further by being all-in on the image based workflow. All containers are created from an image and we have advanced image caching and pre-loading support in LXD to keep the image store up to date.

Interacting with LXD images

Before digging deeper into the image format, let’s quickly go through what LXD lets you do with those images.

Transparently importing images

All containers are created from an image. The image may have come from a remote image server and have been pulled using its full hash, short hash or an alias, but in the end, every LXD container is created from a local image.

Here are a few examples:

lxc launch ubuntu:14.04 c1
lxc launch ubuntu:75182b1241be475a64e68a518ce853e800e9b50397d2f152816c24f038c94d6e c2
lxc launch ubuntu:75182b1241be c3

All of those refer to the same remote image (at the time of this writing). The first time one of these commands is run, the remote image is imported into the local LXD image store as a cached image and the container is then created from it.

The next time one of those commands is run, LXD will only check that the image is still up to date (when not referring to it by its fingerprint); if it is, it will create the container without downloading anything.

Now that the image is cached in the local image store, you can also just launch containers from it without even checking whether it’s up to date:

lxc launch 75182b1241be c4

And lastly, if you have your own local image under the name “my-image”, you can just do:

lxc launch my-image c5

If you want to change some of that automatic caching and expiration behavior, there are instructions in an earlier post in this series.
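
As a quick reminder, the server-wide keys involved look something like this (see that earlier post for the exact semantics and defaults):

# Expire unused cached images after 5 days instead of the default 10
lxc config set images.remote_cache_expiry 5
# Check for image updates every 24 hours
lxc config set images.auto_update_interval 24
# Only auto-update images which were explicitly flagged for it
lxc config set images.auto_update_cached false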

Manually importing images

Copying from an image server

If you want to copy some remote image into your local image store but not immediately create a container from it, you can use the “lxc image copy” command. It also lets you tweak some of the image flags, for example:

lxc image copy ubuntu:14.04 local:

This simply copies the remote image into the local image store.

If you want to be able to refer to your copy of the image by something easier to remember than its fingerprint, you can add an alias at the time of the copy:

lxc image copy ubuntu:12.04 local: --alias old-ubuntu
lxc launch old-ubuntu c6

And if you would rather just use the aliases that were set on the source server, you can ask LXD to copy them for you:

lxc image copy ubuntu:15.10 local: --copy-aliases
lxc launch 15.10 c7

All of the copies above are one-shot copies, copying the current version of the remote image into the local image store. If you want to have LXD keep the image up to date, as it does for the ones stored in its cache, you need to request it with the --auto-update flag:

lxc image copy images:gentoo/current/amd64 local: --alias gentoo --auto-update

Importing a tarball

If someone provides you with an LXD image as a single tarball, you can import it with:

lxc image import <tarball>

If you want to set an alias at import time, you can do it with:

lxc image import <tarball> --alias random-image

Now if you were provided with two tarballs, identify which one contains the LXD metadata. Usually the tarball name gives it away; if not, pick the smaller of the two, as metadata tarballs are tiny. Then import them both together with:

lxc image import <metadata tarball> <rootfs tarball>

Importing from a URL

“lxc image import” also works with some special URLs. If you have an https web server which serves a path with the LXD-Image-URL and LXD-Image-Hash headers set, then LXD will pull that image into its image store.

For example you can do:

lxc image import https://dl.stgraber.org/lxd --alias busybox-amd64

When pulling the image, LXD also sets some headers which the remote server could check to return an appropriate image. Those are LXD-Server-Architectures and LXD-Server-Version.

This is meant as a poor man’s image server. It can be made to work with any static web server and provides a user friendly way to import your image.

Managing the local image store

Now that we have a bunch of images in our local image store, let’s see what we can do with them. We’ve already covered the most obvious, creating containers from them, but there are a few more things you can do with the local image store.

Listing images

To get a list of all images in the store, just run “lxc image list”:

stgraber@dakara:~$ lxc image list
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+
|     ALIAS     | FINGERPRINT  | PUBLIC |                     DESCRIPTION                      |  ARCH  |   SIZE   |         UPLOAD DATE          |
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+
| alpine-32     | 6d9c131efab3 | yes    | Alpine edge (i386) (20160329_23:52)                  | i686   | 2.50MB   | Mar 30, 2016 at 4:36am (UTC) |
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+
| busybox-amd64 | 74186c79ca2f | no     | Busybox x86_64                                       | x86_64 | 0.79MB   | Mar 30, 2016 at 4:33am (UTC) |
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+
| gentoo        | 1a134c5951e0 | no     | Gentoo current (amd64) (20160329_14:12)              | x86_64 | 232.50MB | Mar 30, 2016 at 4:34am (UTC) |
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+
| my-image      | c9b6e738fae7 | no     | Scientific Linux 6 x86_64 (default) (20160215_02:36) | x86_64 | 625.34MB | Mar 2, 2016 at 4:56am (UTC)  |
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+
| old-ubuntu    | 4d558b08f22f | no     | ubuntu 12.04 LTS amd64 (release) (20160315)          | x86_64 | 155.09MB | Mar 30, 2016 at 4:30am (UTC) |
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+
| w (11 more)   | d3703a994910 | no     | ubuntu 15.10 amd64 (release) (20160315)              | x86_64 | 153.35MB | Mar 30, 2016 at 4:31am (UTC) |
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+
|               | 75182b1241be | no     | ubuntu 14.04 LTS amd64 (release) (20160314)          | x86_64 | 118.17MB | Mar 30, 2016 at 4:27am (UTC) |
+---------------+--------------+--------+------------------------------------------------------+--------+----------+------------------------------+

You can filter based on the alias or fingerprint simply by doing:

stgraber@dakara:~$ lxc image list amd64
+---------------+--------------+--------+-----------------------------------------+--------+----------+------------------------------+
|     ALIAS     | FINGERPRINT  | PUBLIC |               DESCRIPTION               |  ARCH  |   SIZE   |          UPLOAD DATE         |
+---------------+--------------+--------+-----------------------------------------+--------+----------+------------------------------+
| busybox-amd64 | 74186c79ca2f | no     | Busybox x86_64                          | x86_64 | 0.79MB   | Mar 30, 2016 at 4:33am (UTC) |
+---------------+--------------+--------+-----------------------------------------+--------+----------+------------------------------+
| w (11 more)   | d3703a994910 | no     | ubuntu 15.10 amd64 (release) (20160315) | x86_64 | 153.35MB | Mar 30, 2016 at 4:31am (UTC) |
+---------------+--------------+--------+-----------------------------------------+--------+----------+------------------------------+

Or by specifying a key=value filter of image properties:

stgraber@dakara:~$ lxc image list os=ubuntu
+-------------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
|    ALIAS    | FINGERPRINT  | PUBLIC |                  DESCRIPTION                |  ARCH  |   SIZE   |          UPLOAD DATE         |
+-------------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
| old-ubuntu  | 4d558b08f22f | no     | ubuntu 12.04 LTS amd64 (release) (20160315) | x86_64 | 155.09MB | Mar 30, 2016 at 4:30am (UTC) |
+-------------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
| w (11 more) | d3703a994910 | no     | ubuntu 15.10 amd64 (release) (20160315)     | x86_64 | 153.35MB | Mar 30, 2016 at 4:31am (UTC) |
+-------------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+
|             | 75182b1241be | no     | ubuntu 14.04 LTS amd64 (release) (20160314) | x86_64 | 118.17MB | Mar 30, 2016 at 4:27am (UTC) |
+-------------+--------------+--------+---------------------------------------------+--------+----------+------------------------------+

To see everything LXD knows about a given image, you can use “lxc image info”:

stgraber@castiana:~$ lxc image info ubuntu
Fingerprint: e8a33ec326ae7dd02331bd72f5d22181ba25401480b8e733c247da5950a7d084
Size: 139.43MB
Architecture: i686
Public: no
Timestamps:
 Created: 2016/03/15 00:00 UTC
 Uploaded: 2016/03/16 05:50 UTC
 Expires: 2017/04/26 00:00 UTC
Properties:
 version: 12.04
 aliases: 12.04,p,precise
 architecture: i386
 description: ubuntu 12.04 LTS i386 (release) (20160315)
 label: release
 os: ubuntu
 release: precise
 serial: 20160315
Aliases:
 - ubuntu
Auto update: enabled
Source:
 Server: https://cloud-images.ubuntu.com/releases
 Protocol: simplestreams
 Alias: precise/i386

Editing images

A convenient way to edit image properties and some of the flags is to use:

lxc image edit <alias or fingerprint>

This opens up your default text editor with something like this:

autoupdate: true
properties:
 aliases: 14.04,default,lts,t,trusty
 architecture: amd64
 description: ubuntu 14.04 LTS amd64 (release) (20160314)
 label: release
 os: ubuntu
 release: trusty
 serial: "20160314"
 version: "14.04"
public: false

You can change any property you want, turn auto-update on and off or mark an image as publicly available (more on that later).

Deleting images

Removing an image is a simple matter of running:

lxc image delete <alias or fingerprint>

Note that you don’t have to remove cached entries, those will automatically be removed by LXD after they expire (by default, 10 days after they were last used).

Exporting images

If you want to get image tarballs from images currently in your image store, you can use “lxc image export”, like:

stgraber@dakara:~$ lxc image export old-ubuntu .
Output is in .
stgraber@dakara:~$ ls -lh *.tar.xz
-rw------- 1 stgraber domain admins 656 Mar 30 00:55 meta-ubuntu-12.04-server-cloudimg-amd64-lxd.tar.xz
-rw------- 1 stgraber domain admins 156M Mar 30 00:55 ubuntu-12.04-server-cloudimg-amd64-lxd.tar.xz

Image formats

LXD right now supports two image layouts, unified or split. Both of those are effectively LXD-specific though the latter makes it easier to re-use the filesystem with other container or virtual machine runtimes.

LXD, being solely focused on system containers, doesn’t support any of the application container “standard” image formats out there, nor do we plan to.

Our images are pretty simple, they’re made of a container filesystem, a metadata file describing things like when the image was made, when it expires, what architecture it’s for, … and optionally a bunch of file templates.

See this document for up to date details on the image format.

Unified image (single tarball)

The unified image format is what LXD uses when generating images itself. It’s a single big tarball containing the container filesystem inside a “rootfs” directory, with the metadata.yaml file at the root of the tarball and any templates in a “templates” directory.

Any compression (or none at all) can be used for that tarball. The image hash is the sha256 of the resulting compressed tarball.
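
For example, if you were to assemble such a tarball by hand (file and directory names below are purely illustrative), the fingerprint would simply be:

# Assuming metadata.yaml, rootfs/ and (optionally) templates/ live in image-dir/
cd image-dir
tar -cJf ../my-image.tar.xz metadata.yaml rootfs templates
# The image fingerprint is the sha256 of the compressed tarball
sha256sum ../my-image.tar.xz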

Split image (two tarballs)

This format is most commonly used by anyone rolling their own images and who already have a compressed filesystem tarball.

They are made of two distinct tarballs. The first contains just the metadata bits that LXD uses, that is, the metadata.yaml file at the root and any templates in the “templates” directory.

The second tarball contains only the container filesystem directly at its root. Most distributions already produce such tarballs as they are common for bootstrapping new machines. This image format allows re-using them unmodified.

Any compression (or none at all) can be used for either tarball and they can absolutely use different compression algorithms. The image hash is the sha256 of the concatenation of the metadata and rootfs tarballs.
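
So to check the fingerprint of a split image yourself (file names are again illustrative):

cat meta.tar.xz rootfs.tar.xz | sha256sum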

Image metadata

A typical metadata.yaml file looks something like:

architecture: "i686"
creation_date: 1458040200
properties:
 architecture: "i686"
 description: "Ubuntu 12.04 LTS server (20160315)"
 os: "ubuntu"
 release: "precise"
templates:
 /var/lib/cloud/seed/nocloud-net/meta-data:
  when:
   - start
  template: cloud-init-meta.tpl
 /var/lib/cloud/seed/nocloud-net/user-data:
  when:
   - start
  template: cloud-init-user.tpl
  properties:
   default: |
    #cloud-config
    {}
 /var/lib/cloud/seed/nocloud-net/vendor-data:
  when:
   - start
  template: cloud-init-vendor.tpl
  properties:
   default: |
    #cloud-config
    {}
 /etc/init/console.override:
  when:
   - create
  template: upstart-override.tpl
 /etc/init/tty1.override:
  when:
   - create
  template: upstart-override.tpl
 /etc/init/tty2.override:
  when:
   - create
  template: upstart-override.tpl
 /etc/init/tty3.override:
  when:
   - create
  template: upstart-override.tpl
 /etc/init/tty4.override:
  when:
   - create
  template: upstart-override.tpl

Properties

The only two mandatory fields are the creation date (UNIX EPOCH) and the architecture. Everything else can be left unset and the image will import fine.

The extra properties are mainly there to help the user figure out what the image is about. The “description” property for example is what’s visible in “lxc image list”. The other properties can be used by the user to search for specific images using key/value search.

Those properties can then be edited by the user through “lxc image edit”; in contrast, the creation date and architecture fields are immutable.

Templates

The template mechanism allows for some files in the container to be generated or re-generated at some point in the container lifecycle.

We use the pongo2 templating engine for those and we export just about everything we know about the container to the template. That way you can have custom images which use user-defined container properties or normal LXD properties to change the content of some specific files.

As you can see in the example above, we’re using those in Ubuntu to seed cloud-init and to turn off some init scripts.
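
To give a rough idea, the cloud-init metadata seed could be generated from a template as simple as the following (this is just an illustration; the exact variables exposed to templates are covered in the image format document and “container.name” is an assumption here):

instance-id: {{ container.name }}
local-hostname: {{ container.name }}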

Creating your own images

LXD being focused on running full Linux systems means that we expect most users to just use clean distribution images rather than spin their own.

However, there are a few cases where having your own images is useful, such as having pre-configured images of your production servers or building your own images for a distribution or architecture that we don’t build images for.

Turning a container into an image

The easiest way by far to build an image with LXD is to just turn a container into an image.

This can be done with:

lxc launch ubuntu:14.04 my-container
lxc exec my-container bash
<do whatever change you want>
lxc publish my-container --alias my-new-image

You can even turn a past container snapshot into a new image:

lxc publish my-container/some-snapshot --alias some-image

Manually building an image

Building your own image is also pretty simple.

  1. Generate a container filesystem. This entirely depends on the distribution you’re using. For Ubuntu and Debian, it would be by using debootstrap.
  2. Configure anything that’s needed for the distribution to work properly in a container (if anything is needed).
  3. Make a tarball of that container filesystem, optionally compress it.
  4. Write a new metadata.yaml file based on the one described above.
  5. Create another tarball containing that metadata.yaml file.
  6. Import those two tarballs as an LXD image with:
    lxc image import <metadata tarball> <rootfs tarball> --alias some-name

You will probably need to go through this a few times before everything works, tweaking things here and there, possibly adding some templates and properties.
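
Putting those steps together, a very much simplified Ubuntu example could look like this (suite, paths and alias are all illustrative):

# 1. Generate a container filesystem
sudo debootstrap xenial rootfs http://archive.ubuntu.com/ubuntu
# 2. Configure whatever the distribution needs to run in a container (skipped here)
# 3. Make a tarball of that container filesystem
sudo tar -C rootfs -cJf rootfs.tar.xz .
# 4. Write a metadata.yaml, only the two mandatory fields are set here
cat > metadata.yaml << EOF
architecture: "x86_64"
creation_date: $(date +%s)
EOF
# 5. Create another tarball containing that metadata.yaml file
tar -cJf meta.tar.xz metadata.yaml
# 6. Import those two tarballs as an LXD image
lxc image import meta.tar.xz rootfs.tar.xz --alias my-xenial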

Publishing your images

All LXD daemons act as image servers. Unless told otherwise, all images loaded in the image store are marked as private, so only trusted clients can retrieve them. Should you want to run a public image server, all you have to do is tag a few images as public and make sure your LXD daemon is listening on the network.

Just running a public LXD server

The easiest way to share LXD images is to run a publicly visible LXD daemon.

You typically do that by running:

lxc config set core.https_address "[::]:8443"

Remote users can then add your server as a public image server with:

lxc remote add <some name> <IP or DNS> --public

They can then use it just as they would any of the default image servers. As the remote server was added with “--public”, no authentication is required and the client is restricted to images which have themselves been marked as public.

To change what images are public, just “lxc image edit” them and set the public flag to true.

Use a static web server

As mentioned above, “lxc image import” supports downloading from a static http server. The requirements are basically:

  • The server must support HTTPS with a valid certificate, TLS 1.2 and EC ciphers
  • When hitting the URL provided to “lxc image import”, the server must return an answer including the LXD-Image-Hash and LXD-Image-URL HTTP headers
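
In other words, a minimal response to the URL you give “lxc image import” could look like this (URL and hash are obviously made up):

HTTP/1.1 200 OK
LXD-Image-URL: https://images.example.com/lxd/busybox-amd64.tar.xz
LXD-Image-Hash: <sha256 of the image tarball>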

If you want to make this dynamic, you can have your server look for the LXD-Server-Architectures and LXD-Server-Version HTTP headers which LXD will provide when fetching the image. This allows you to return the right image for the server’s architecture.

Build a simplestreams server

The “ubuntu:” and “ubuntu-daily:” remotes aren’t using the LXD protocol (“images:” is); they instead use a different protocol called simplestreams.

simplestreams is basically an image server description format, using JSON to describe a list of products and files related to those products.

It is used by a variety of tools like OpenStack, Juju, MAAS, … to find, download or mirror system images and LXD supports it as a native protocol for image retrieval.

While certainly not the easiest way to start providing LXD images, it may be worth considering if your images can also be used by some of those other tools.

More information can be found here.

Conclusion

I hope this gave you a good idea of how LXD manages its images and how to build and distribute your own. The ability to have the exact same image easily available, bit for bit, on a bunch of globally distributed systems is a big step up from the old LXC days and leads the way to more reproducible infrastructure.

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net

And if you don’t want or can’t install LXD on your own machine, you can always try it online instead!

LXD 2.0: Resource control [4/12]

This is the fourth blog post in this series about LXD 2.0.

LXD logo

Available resource limits

LXD offers a variety of resource limits. Some of those are tied to the container itself, like memory quotas, CPU limits and I/O priorities. Some are tied to a particular device instead, like I/O bandwidth or disk usage limits.

As with all LXD configuration, resource limits can be dynamically changed while the container is running. Some may fail to apply, for example if setting a memory value smaller than the current memory usage, but LXD will try anyway and report back on failure.

All limits can also be inherited through profiles in which case each affected container will be constrained by that limit. That is, if you set limits.memory=256MB in the default profile, every container using the default profile (typically all of them) will have a memory limit of 256MB.

We don’t support resource limit pooling where a limit would be shared by a group of containers; there is simply no good way to implement something like that with the existing kernel APIs.

Disk

This is perhaps the most requested and obvious one: simply set a size limit on the container’s filesystem and have it enforced against the container.

And that’s exactly what LXD lets you do!
Unfortunately this is far more complicated than it sounds. Linux doesn’t have path-based quotas; instead, most filesystems only have user and group quotas, which are of little use to containers.

This means that right now LXD only supports disk limits if you’re using the ZFS or btrfs storage backend. It may be possible to implement this feature for LVM too but this depends on the filesystem being used with it and gets tricky when combined with live updates as not all filesystems allow online growth and pretty much none of them allow online shrink.

CPU

When it comes to CPU limits, we support 4 different things:

  • Just give me X CPUs
    In this mode, you let LXD pick a bunch of cores for you and then load-balance things as more containers and CPUs go online/offline.
    The container only sees that number of CPUs.
  • Give me a specific set of CPUs (say, core 1, 3 and 5)
    Similar to the first mode except that no load-balancing is happening, you’re stuck with those cores no matter how busy they may be.
  • Give me 20% of whatever you have
    In this mode, you get to see all the CPUs but the scheduler will restrict you to 20% of the CPU time but only when under load! So if the system isn’t busy, your container can have as much fun as it wants. When containers next to it start using the CPU, then it gets capped.
  • Out of every measured 200ms, give me 50ms (and no more than that)
    This mode is similar to the previous one in that you get to see all the CPUs but this time, you can only use as much CPU time as you set in the limit, no matter how idle the system may be. On a system without over-commit this lets you slice your CPU very neatly and guarantees constant performance to those containers.

It’s also possible to combine one of the first two with one of the last two, that is, request a set of CPUs and then further restrict how much CPU time you get on those.
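
For example, using the configuration keys shown later in this post, pinning a container to two specific cores and then capping it to half of their time (when under load) could look like:

lxc config set my-container limits.cpu 0,1
lxc config set my-container limits.cpu.allowance 50%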

On top of that, we also have a generic priority knob which is used to tell the scheduler who wins when you’re under load and two containers are fighting for the same resource.

Memory

Memory sounds pretty simple, just give me X MB of RAM!

And it absolutely can be that simple. We support that kind of limit as well as percentage based requests, just give me 10% of whatever the host has!

Then we support some extra stuff on top. For example, you can choose to turn swap on and off on a per-container basis and, if it’s on, set a priority so you can choose which containers will have their memory swapped out to disk first!

Oh and memory limits are “hard” by default. That is, when you run out of memory, the kernel out of memory killer will start having some fun with your processes.

Alternatively you can set the enforcement policy to “soft”, in which case you’ll be allowed to use as much memory as you want so long as nothing else is. As soon as something else wants that memory, you won’t be able to allocate anything until you’re back under your limit or until the host has memory to spare again.

Network I/O

Network I/O is probably our simplest looking limit, trust me, the implementation really isn’t simple though!

We support two things. The first is a basic bit/s limit on network interfaces. You can set a limit on ingress and egress or just set the “max” limit which then applies to both. This is only supported for “bridged” and “p2p” type interfaces.

The second thing is a global network I/O priority which only applies when the network interface you’re trying to talk through is saturated.

Block I/O

I kept the weirdest for last. It may look straightforward and feel like that to the user but there are a bunch of cases where it won’t exactly do what you think it should.

What we support here is basically identical to what I described in Network I/O.

You can set IOps or byte/s read and write limits directly on a disk device entry and there is a global block I/O priority which tells the I/O scheduler who to prefer.

The weirdness comes from how and where those limits are applied. Unfortunately the underlying feature we use to implement those uses full block devices. That means we can’t set per-partition I/O limits let alone per-path.

It also means that when using ZFS or btrfs which can use multiple block devices to back a given path (with or without RAID), we effectively don’t know what block device is providing a given path.

This means that it’s entirely possible, in fact likely, that a container may have multiple disk entries (bind-mounts or straight mounts) which are coming from the same underlying disk.

And that’s where things get weird. To make things work, LXD has logic to guess what block devices back a given path; this includes interrogating the ZFS and btrfs tools and even working things out recursively when it finds a loop-mounted file backing a filesystem.

That logic, while not perfect, usually yields a set of block devices that should have a limit applied. LXD then records that and moves on to the next path. When it’s done looking at all the paths, it gets to the very weird part: it averages the limits you’ve set for every affected block device and then applies those.

That means that “on average” you’ll be getting the right speed in the container, but it also means that you can’t have a “/fast” and a “/slow” directory both coming from the same physical disk and with differing speed limits. LXD will let you set it up but in the end, they’ll both give you the average of the two values.

How does it all work?

Most of the limits described above are applied through the Linux kernel Cgroups API. That’s with the exception of the network limits which are applied through good old “tc”.

LXD at startup time detects what cgroups are enabled in your kernel and will only apply the limits which your kernel supports. Should you be missing some cgroups, a warning will also be printed by the daemon which will then get logged by your init system.

On Ubuntu 16.04, everything is enabled by default with the exception of swap memory accounting, which requires you to pass the “swapaccount=1” kernel boot parameter.
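
If you need swap accounting, the usual way to add that parameter on an Ubuntu system (assuming GRUB is your bootloader) is something like:

# In /etc/default/grub, append swapaccount=1 to the existing options, e.g.
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash swapaccount=1"
# Then regenerate the bootloader configuration and reboot
sudo update-grub
sudo reboot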

Applying some limits

All the limits described above are applied directly to the container or to one of its profiles. Container-wide limits are applied with:

lxc config set CONTAINER KEY VALUE

or for a profile:

lxc profile set PROFILE KEY VALUE

while device-specific ones are applied with:

lxc config device set CONTAINER DEVICE KEY VALUE

or for a profile:

lxc profile device set PROFILE DEVICE KEY VALUE

The complete list of valid configuration keys, device types and device keys can be found here.

CPU

To just limit a container to any 2 CPUs, do:

lxc config set my-container limits.cpu 2

To pin to specific CPU cores, say the second and fourth:

lxc config set my-container limits.cpu 1,3

More complex pinning ranges like this work too:

lxc config set my-container limits.cpu 0-3,7-11

The limits are applied live, as can be seen in this example:

stgraber@dakara:~$ lxc exec zerotier -- cat /proc/cpuinfo | grep ^proces
processor : 0
processor : 1
processor : 2
processor : 3
stgraber@dakara:~$ lxc config set zerotier limits.cpu 2
stgraber@dakara:~$ lxc exec zerotier -- cat /proc/cpuinfo | grep ^proces
processor : 0
processor : 1

Note that to avoid utterly confusing userspace, lxcfs arranges the /proc/cpuinfo entries so that there are no gaps.

As with just about everything in LXD, those settings can also be applied in profiles:

stgraber@dakara:~$ lxc exec snappy -- cat /proc/cpuinfo | grep ^proces
processor : 0
processor : 1
processor : 2
processor : 3
stgraber@dakara:~$ lxc profile set default limits.cpu 3
stgraber@dakara:~$ lxc exec snappy -- cat /proc/cpuinfo | grep ^proces
processor : 0
processor : 1
processor : 2

To limit the CPU time of a container to 10% of the total, set the CPU allowance:

lxc config set my-container limits.cpu.allowance 10%

Or to give it a fixed slice of CPU time:

lxc config set my-container limits.cpu.allowance 25ms/200ms

And lastly, to reduce the priority of a container to a minimum:

lxc config set my-container limits.cpu.priority 0

Memory

To apply a straightforward memory limit run:

lxc config set my-container limits.memory 256MB

(The supported suffixes are kB, MB, GB, TB, PB and EB)

To turn swap off for the container (defaults to enabled):

lxc config set my-container limits.memory.swap false

To tell the kernel to swap this container’s memory first:

lxc config set my-container limits.memory.swap.priority 0

And finally if you don’t want hard memory limit enforcement:

lxc config set my-container limits.memory.enforce soft

Disk and block I/O

Unlike CPU and memory, disk and I/O limits are applied to the actual device entry, so you either need to edit the original device or mask it with a more specific one.

To set a disk limit (requires btrfs or ZFS):

lxc config device set my-container root size 20GB

For example:

stgraber@dakara:~$ lxc exec zerotier -- df -h /
Filesystem                        Size Used Avail Use% Mounted on
encrypted/lxd/containers/zerotier 179G 542M  178G   1% /
stgraber@dakara:~$ lxc config device set zerotier root size 20GB
stgraber@dakara:~$ lxc exec zerotier -- df -h /
Filesystem                       Size  Used Avail Use% Mounted on
encrypted/lxd/containers/zerotier 20G  542M   20G   3% /

To restrict speed you can do the following:

lxc config device set my-container root limits.read 30MB
lxc config device set my-container root limits.write 10MB

Or to restrict IOps instead:

lxc config device set my-container root limits.read 20iops
lxc config device set my-container root limits.write 10iops

And lastly, if you’re on a busy system with over-commit, you may want to also do:

lxc config set my-container limits.disk.priority 10

This increases the I/O priority for that container to the maximum.

Network I/O

Network I/O is basically identical to block I/O as far as the available knobs go.

For example:

stgraber@dakara:~$ lxc exec zerotier -- wget http://speedtest.newark.linode.com/100MB-newark.bin -O /dev/null
--2016-03-26 22:17:34-- http://speedtest.newark.linode.com/100MB-newark.bin
Resolving speedtest.newark.linode.com (speedtest.newark.linode.com)... 50.116.57.237, 2600:3c03::4b
Connecting to speedtest.newark.linode.com (speedtest.newark.linode.com)|50.116.57.237|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/octet-stream]
Saving to: '/dev/null'

/dev/null 100%[===================>] 100.00M 58.7MB/s in 1.7s 

2016-03-26 22:17:36 (58.7 MB/s) - '/dev/null' saved [104857600/104857600]

stgraber@dakara:~$ lxc profile device set default eth0 limits.ingress 100Mbit
stgraber@dakara:~$ lxc profile device set default eth0 limits.egress 100Mbit
stgraber@dakara:~$ lxc exec zerotier -- wget http://speedtest.newark.linode.com/100MB-newark.bin -O /dev/null
--2016-03-26 22:17:47-- http://speedtest.newark.linode.com/100MB-newark.bin
Resolving speedtest.newark.linode.com (speedtest.newark.linode.com)... 50.116.57.237, 2600:3c03::4b
Connecting to speedtest.newark.linode.com (speedtest.newark.linode.com)|50.116.57.237|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 104857600 (100M) [application/octet-stream]
Saving to: '/dev/null'

/dev/null 100%[===================>] 100.00M 11.4MB/s in 8.8s 

2016-03-26 22:17:56 (11.4 MB/s) - '/dev/null' saved [104857600/104857600]

And that’s how you throttle an otherwise nice gigabit connection to a mere 100Mbit/s one!

And as with block I/O, you can set an overall network priority with:

lxc config set my-container limits.network.priority 5

Getting the current resource usage

The LXD API exports quite a bit of information on current container resource usage, you can get:

  • Memory: current, peak, current swap and peak swap
  • Disk: current disk usage
  • Network: bytes and packets received and transferred for every interface

And now if you’re running a very recent LXD (only in git at the time of this writing), you can also get all of those in “lxc info”:

stgraber@dakara:~$ lxc info zerotier
Name: zerotier
Architecture: x86_64
Created: 2016/02/20 20:01 UTC
Status: Running
Type: persistent
Profiles: default
Pid: 29258
Ips:
 eth0: inet 172.17.0.101
 eth0: inet6 2607:f2c0:f00f:2700:216:3eff:feec:65a8
 eth0: inet6 fe80::216:3eff:feec:65a8
 lo: inet 127.0.0.1
 lo: inet6 ::1
 lxcbr0: inet 10.0.3.1
 lxcbr0: inet6 fe80::f0bd:55ff:feee:97a2
 zt0: inet 29.17.181.59
 zt0: inet6 fd80:56c2:e21c:0:199:9379:e711:b3e1
 zt0: inet6 fe80::79:e7ff:fe0d:5123
Resources:
 Processes: 33
 Disk usage:
  root: 808.07MB
 Memory usage:
  Memory (current): 106.79MB
  Memory (peak): 195.51MB
  Swap (current): 124.00kB
  Swap (peak): 124.00kB
 Network usage:
  lxcbr0:
   Bytes received: 0 bytes
   Bytes sent: 570 bytes
   Packets received: 0
   Packets sent: 0
  zt0:
   Bytes received: 1.10MB
   Bytes sent: 806 bytes
   Packets received: 10957
   Packets sent: 10957
  eth0:
   Bytes received: 99.35MB
   Bytes sent: 5.88MB
   Packets received: 64481
   Packets sent: 64481
  lo:
   Bytes received: 9.57kB
   Bytes sent: 9.57kB
   Packets received: 81
   Packets sent: 81
Snapshots:
 zerotier/blah (taken at 2016/03/08 23:55 UTC) (stateless)

Conclusion

The LXD team spent quite a few months iterating over the language we’re using for those limits. It’s meant to be as simple as it can get while remaining very powerful and specific when you want it to.

Live application of those limits and inheritance through profiles makes it a very powerful tool to live manage the load on your servers without impacting the running services.

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net

And if you don’t want or can’t install LXD on your own machine, you can always try it online instead!

LXD 2.0: Your first LXD container [3/12]

This is the third blog post in this series about LXD 2.0.

As there are a lot of commands involved with managing LXD containers, this post is rather long. If you’d instead prefer a quick step-by-step tour of those same commands, you can try our online demo instead!

LXD logo

Creating and starting a new container

As I mentioned in the previous posts, the LXD command line client comes pre-configured with a few image sources. Ubuntu is the best covered with official images for all its releases and architectures but there also are a number of unofficial images for other distributions. Those are community generated and maintained by LXC upstream contributors.

Ubuntu

If all you want is the best supported release of Ubuntu, all you have to do is:

lxc launch ubuntu:

Note however that the meaning of this will change as new Ubuntu LTS versions are released. So for scripting use, you should stick to mentioning the actual release you want (see below).

Ubuntu 14.04 LTS

To get the latest, tested, stable image of Ubuntu 14.04 LTS, you can simply run:

lxc launch ubuntu:14.04

In this mode, a random container name will be picked.
If you prefer to specify your own name, you may instead do:

lxc launch ubuntu:14.04 c1

Should you want a specific (non-primary) architecture, say a 32bit Intel image, you can do:

lxc launch ubuntu:14.04/i386 c2

Current Ubuntu development release

The “ubuntu:” remote used above only provides official, tested images for Ubuntu. If you instead want untested daily builds, as is appropriate for the development release, you’ll want to use the “ubuntu-daily:” remote instead.

lxc launch ubuntu-daily:devel c3

In this example, whatever the latest Ubuntu development release is will automatically be picked.

You can also be explicit, for example by using the code name:

lxc launch ubuntu-daily:xenial c4

Latest Alpine Linux

Alpine images are available on the “images:” remote and can be launched with:

lxc launch images:alpine/3.3/amd64 c5

And many more

A full list of the Ubuntu images can be obtained with:

lxc image list ubuntu:
lxc image list ubuntu-daily:

And of all the unofficial images:

lxc image list images:

A list of all the aliases (friendly names) available on a given remote can also be obtained with (for the “ubuntu:” remote):

lxc image alias list ubuntu:

Creating a container without starting it

If you want to just create a container or a batch of containers but not start them immediately, you can just replace “lxc launch” with “lxc init”. All the options are identical, the only difference is that it will not start the container for you after creation.

lxc init ubuntu:

Information about your containers

Listing the containers

To list all your containers, you can do:

lxc list

There are a number of options you can pass to change what columns are displayed. On systems with a lot of containers, the default columns can be a bit slow (due to having to retrieve network information from the containers), so you may instead want:

lxc list --fast

Which shows a different set of columns that require less processing on the server side.

You can also filter based on name or properties:

stgraber@dakara:~$ lxc list security.privileged=true
+------+---------+---------------------+-----------------------------------------------+------------+-----------+
| NAME |  STATE  |        IPV4         |                       IPV6                    |    TYPE    | SNAPSHOTS |
+------+---------+---------------------+-----------------------------------------------+------------+-----------+
| suse | RUNNING | 172.17.0.105 (eth0) | 2607:f2c0:f00f:2700:216:3eff:fef2:aff4 (eth0) | PERSISTENT | 0         |
+------+---------+---------------------+-----------------------------------------------+------------+-----------+

In this example, only containers that are privileged (user namespace disabled) are listed.

stgraber@dakara:~$ lxc list --fast alpine
+-------------+---------+--------------+----------------------+----------+------------+
|    NAME     |  STATE  | ARCHITECTURE |      CREATED AT      | PROFILES |    TYPE    |
+-------------+---------+--------------+----------------------+----------+------------+
| alpine      | RUNNING | x86_64       | 2016/03/20 02:11 UTC | default  | PERSISTENT |
+-------------+---------+--------------+----------------------+----------+------------+
| alpine-edge | RUNNING | x86_64       | 2016/03/20 02:19 UTC | default  | PERSISTENT |
+-------------+---------+--------------+----------------------+----------+------------+

And in this example, only the containers which have “alpine” in their names are listed (complex regular expressions are also supported).

Getting detailed information from a container

As the list command obviously can’t show you everything about a container in a nicely readable way, you can query information about an individual container with:

lxc info <container>

For example:

stgraber@dakara:~$ lxc info zerotier
Name: zerotier
Architecture: x86_64
Created: 2016/02/20 20:01 UTC
Status: Running
Type: persistent
Profiles: default
Pid: 31715
Processes: 32
Ips:
 eth0: inet 172.17.0.101
 eth0: inet6 2607:f2c0:f00f:2700:216:3eff:feec:65a8
 eth0: inet6 fe80::216:3eff:feec:65a8
 lo: inet 127.0.0.1
 lo: inet6 ::1
 lxcbr0: inet 10.0.3.1
 lxcbr0: inet6 fe80::c0a4:ceff:fe52:4d51
 zt0: inet 29.17.181.59
 zt0: inet6 fd80:56c2:e21c:0:199:9379:e711:b3e1
 zt0: inet6 fe80::79:e7ff:fe0d:5123
Snapshots:
 zerotier/blah (taken at 2016/03/08 23:55 UTC) (stateless)

Life-cycle management commands

Those are probably the most obvious commands of any container or virtual machine manager but they still need to be covered.

Oh and all of them accept multiple container names for batch operation.

start

Starting a container is as simple as:

lxc start <container>

stop

Stopping a container can be done with:

lxc stop <container>

If the container isn’t cooperating (not responding to SIGPWR), you can force it with:

lxc stop <container> --force

restart

Restarting a container is done through:

lxc restart <container>

And if not cooperating (not responding to SIGINT), you can force it with:

lxc restart <container> --force

pause

You can also “pause” a container. In this mode, all the container tasks will be sent the equivalent of a SIGSTOP which means that they will still be visible and will still be using memory but they won’t get any CPU time from the scheduler.

This is useful if you have a CPU hungry container that takes quite a while to start but that you aren’t constantly using. You can let it start, then pause it, then start it again when needed.

lxc pause <container>

delete

Lastly, if you want a container to go away, you can delete it for good with:

lxc delete <container>

Note that you will have to pass “--force” if the container is currently running.

Container configuration

LXD exposes quite a few container settings, including resource limitation, control of container startup and a variety of device pass-through options. The full list is far too long to cover in this post but it’s available here.

As far as devices go, LXD currently supports the following device types:

  • disk
    This can be a physical disk or partition being mounted into the container or a bind-mounted path from the host.
  • nic
    A network interface. It can be a bridged virtual ethernet interface, a point-to-point device, an ethernet macvlan device or an actual physical interface being passed through to the container.
  • unix-block
    A UNIX block device, e.g. /dev/sda
  • unix-char
    A UNIX character device, e.g. /dev/kvm
  • none
    This special type is used to hide a device which would otherwise be inherited through profiles.

Configuration profiles

The list of all available profiles can be obtained with:

lxc profile list

To see the content of a given profile, the easiest way is to use:

lxc profile show <profile>

And should you want to change anything inside it, use:

lxc profile edit <profile>

You can change the list of profiles which apply to a given container with:

lxc profile apply <container> <profile1>,<profile2>,<profile3>,...

Local configuration

For things that are unique to a container and so don’t make sense to put into a profile, you can just set them directly against the container:

lxc config edit <container>

This behaves the exact same way as “profile edit” above.

Instead of opening the whole thing in a text editor, you can also modify individual keys with:

lxc config set <container> <key> <value>

Or add devices, for example:

lxc config device add my-container kvm unix-char path=/dev/kvm

Which will set up a /dev/kvm entry for the container named “my-container”.

The same can be done for a profile using “lxc profile set” and “lxc profile device add”.

Reading the configuration

You can read the container local configuration with:

lxc config show <container>

Or to get the expanded configuration (including all the profile keys):

lxc config show --expanded <container>

For example:

stgraber@dakara:~$ lxc config show --expanded zerotier
name: zerotier
profiles:
- default
config:
 security.nesting: "true"
 user.a: b
 volatile.base_image: a49d26ce5808075f5175bf31f5cb90561f5023dcd408da8ac5e834096d46b2d8
 volatile.eth0.hwaddr: 00:16:3e:ec:65:a8
 volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
devices:
 eth0:
  name: eth0
  nictype: macvlan
  parent: eth0
  type: nic
  limits.ingress: 10Mbit
  limits.egress: 10Mbit
 root:
  path: /
  size: 30GB
  type: disk
 tun:
  path: /dev/net/tun
  type: unix-char
ephemeral: false

That one is very convenient to check what will actually be applied to a given container.

Live configuration update

Note that unless indicated in the documentation, all configuration keys and device entries are applied to affected containers live. This means that you can add and remove devices or alter the security profile of running containers without ever having to restart them.

Getting a shell

LXD lets you execute tasks directly into the container. The most common use of this is to get a shell in the container or to run some admin tasks.

The benefit of this compared to SSH is that you’re not dependent on the container being reachable over the network or on any software or configuration being present inside the container.

Execution environment

One thing that’s a bit unusual with the way LXD executes commands inside the container is that it’s not itself running inside the container, which means that it can’t know what shell to use, what environment variables to set or what path to use for your home directory.

Commands executed through LXD will always run as the container’s root user (uid 0, gid 0) with a minimal PATH environment variable set and a HOME environment variable set to /root.

Additional environment variables can be passed through the command line or can be set permanently against the container through the “environment.<key>” configuration options.
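
For example, to permanently set a variable for everything executed in a given container (the variable name is just an example):

lxc config set <container> environment.EDITOR vim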

Executing commands

Getting a shell inside a container is typically as simple as:

lxc exec <container> bash

That’s assuming the container does actually have bash installed.

More complex commands require the use of a separator for proper argument parsing:

lxc exec <container> -- ls -lh /

To set or override environment variables, you can use the “--env” argument, for example:

stgraber@dakara:~$ lxc exec zerotier --env mykey=myvalue env | grep mykey
mykey=myvalue

Managing files

Because LXD has direct access to the container’s file system, it can directly read and write any file inside the container. This can be very useful to pull log files or exchange files with the container.

Pulling a file from the container

To get a file from the container, simply run:

lxc file pull <container>/<path> <dest>

For example:

stgraber@dakara:~$ lxc file pull zerotier/etc/hosts hosts

Or to read it to standard output:

stgraber@dakara:~$ lxc file pull zerotier/etc/hosts -
127.0.0.1 localhost

# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
ff02::3 ip6-allhosts

Pushing a file to the container

Push simply works the other way:

lxc file push <source> <container>/<path>

Editing a file directly

Edit is a convenience function which simply pulls a given path, opens it in your default text editor and then pushes it back to the container when you close it:

lxc file edit <container>/<path>

Snapshot management

LXD lets you snapshot and restore containers. Snapshots include the entirety of the container’s state (including running state if --stateful is used), which means all container configuration, container devices and the container file system.

Creating a snapshot

You can snapshot a container with:

lxc snapshot <container>

It’ll get named snapX where X is an incrementing number.

Alternatively, you can name your snapshot with:

lxc snapshot <container> <snapshot name>
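
And if you want the container’s running state included too, you can ask for a stateful snapshot, which relies on the checkpoint/restore support mentioned in the LXC 2.0 announcement and needs a working CRIU setup on the host:

lxc snapshot <container> <snapshot name> --stateful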

Listing snapshots

The number of snapshots a container has is listed in “lxc list”, but the actual snapshot list is only visible in “lxc info”.

lxc info <container>

Restoring a snapshot

To restore a snapshot, simply run:

lxc restore <container> <snapshot name>

Renaming a snapshot

Renaming a snapshot can be done by moving it with:

lxc move <container>/<snapshot name> <container>/<new snapshot name>

Creating a new container from a snapshot

You can create a new container which will be identical to another container’s snapshot except for the volatile information being reset (MAC address):

lxc copy <source container>/<snapshot name> <destination container>

Deleting a snapshot

And finally, to delete a snapshot, just run:

lxc delete <container>/<snapshot name>

Cloning and renaming

Getting clean distribution images is all nice and well, but sometimes you want to install a bunch of things into your container, configure it and then branch it into a bunch of other containers.

Copying a container

To copy a container and effectively clone it into a new one, just run:

lxc copy <source container> <destination container>

The destination container will be identical in every way to the source one, except it won’t have any snapshots and its volatile keys (MAC address) will be reset.

Moving a container

LXD lets you copy and move containers between hosts, but that will get covered in a later post.

For now, the “move” command can be used to rename a container with:

lxc move <old name> <new name>

The only requirement is that the container be stopped, everything else will be kept exactly as it was, including the volatile information (MAC address and such).

Conclusion

This pretty long post covered most of the commands you’re likely to use in day to day operation.

Obviously a lot of those commands have extra arguments that let you be more efficient or tweak specific aspects of your LXD containers. The best way to learn about all of those is to go through the help for those you care about (--help).

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net

And if you don’t want or can’t install LXD on your own machine, you can always try it online instead!
