LXC 1.0: Your second container [2/10]

This is post 2 out of 10 in the LXC 1.0 blog post series.

More templates

So at this point you should have a working Ubuntu container that’s called “p1” and was created using the default template called simply enough “ubuntu”.

But LXC supports much more than just standard Ubuntu. In fact, in current upstream git (and daily PPA), we support Alpine Linux, Alt Linux, Arch Linux, busybox, CentOS, Cirros, Debian, Fedora, OpenMandriva, OpenSUSE, Oracle, Plamo, sshd, Ubuntu Cloud and Ubuntu.

All of those can usually be found in /usr/share/lxc/templates. They also all typically have extra advanced options, which you can list by passing “--help” after a “--” separator at the end of the “lxc-create” call (the “--” is required to split “lxc-create” options from the template’s).
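To check which templates a given installation actually ships, something like the following sketch can help. It assumes the templates are installed as files named “lxc-<name>” in the directory mentioned above; “list_templates” is a hypothetical helper written for this illustration, not an LXC tool.

```shell
#!/bin/sh
# List template names found in a directory.
# Assumption: templates are files named "lxc-<name>".
# list_templates is a hypothetical helper, not part of LXC.
list_templates() {
    dir="${1:-/usr/share/lxc/templates}"
    for f in "$dir"/lxc-*; do
        # Skip the unexpanded glob when nothing matches.
        [ -e "$f" ] || continue
        name="${f##*/}"
        printf '%s\n' "${name#lxc-}"
    done
}
```

Calling “list_templates” without an argument prints one template name per line; each name can then be passed to “lxc-create -t”.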

Writing extra templates isn’t too difficult. They are basically executables (all of the current ones are shell scripts, but that’s not a requirement) which take a set of standard arguments and are expected to produce a working rootfs in the path that’s passed to them.
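To give a rough idea of the shape of a template, here is a minimal sketch. It assumes the standard arguments arrive as “--name=…”, “--path=…” and “--rootfs=…”, and it only creates an empty directory skeleton. A real template would install a distribution into the rootfs and write the container configuration. “make_rootfs” is a hypothetical name used for illustration.

```shell
#!/bin/sh
# Minimal template sketch: parse the arguments and create an (empty) rootfs tree.
# Assumption: arguments arrive as --name=..., --path=..., --rootfs=...
# A real template would install a distribution into "$rootfs".
make_rootfs() {
    name="" path="" rootfs=""
    for arg in "$@"; do
        case "$arg" in
            --name=*)   name="${arg#--name=}" ;;
            --path=*)   path="${arg#--path=}" ;;
            --rootfs=*) rootfs="${arg#--rootfs=}" ;;
        esac
    done
    if [ -z "$name" ] || [ -z "$path" ]; then
        echo "usage: --name=NAME --path=PATH [--rootfs=ROOTFS]" >&2
        return 1
    fi
    # Fall back to <path>/rootfs when no explicit rootfs is given.
    [ -n "$rootfs" ] || rootfs="$path/rootfs"
    mkdir -p "$rootfs/etc" "$rootfs/dev" "$rootfs/proc" "$rootfs/sys"
    printf '%s\n' "$name" > "$rootfs/etc/hostname"
}
```

To turn this into a standalone executable, the last line of the script would call make_rootfs "$@".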

One thing to be aware of is that, due to missing tools, not all distros can be bootstrapped on all distros. It’s usually best to just try. We’re always interested in making those work on more distros, even if that means using some rather weird tricks (as is done in the fedora template), so if you have a specific combination which doesn’t work at the moment, patches are definitely welcome!

Anyway, enough talking for now, let’s go ahead and create an Oracle Linux container that we’ll force to be 32bit.

sudo lxc-create -t oracle -n p2 -- -a i386

On most systems, this will initially fail, telling you to install the “rpm” package first which is needed for bootstrap reasons. So install it and “yum” and then try again.

After some time downloading RPMs, the container will be created, then it’s just a:

sudo lxc-start -n p2

And you’ll be greeted by the Oracle Linux login prompt (root / root).

At that point since you started the container without passing “-d” to “lxc-start”, you’ll have to shut it down to get your shell back (you can’t detach from a container which wasn’t started initially in the background).

Now, you may be wondering why Ubuntu has two templates. The Ubuntu template which I’ve been using so far does a local bootstrap using “debootstrap”, basically building your container from scratch, whereas the Ubuntu Cloud template (ubuntu-cloud) downloads a pre-generated cloud image (identical to what you’d get on EC2 or other cloud services) and starts it. That image also includes cloud-init and supports the standard cloud metadata.

It’s a matter of personal choice which you like best. I personally have a local mirror so the “ubuntu” template is much faster for me and I also trust it more since I know everything was downloaded from the archive in front of me and assembled locally on my machine.

One last note on templates: most of them use a local cache, so the initial bootstrap of a container for a given arch will be slow, but any subsequent one will just be a local copy from the cache and will be much faster.

Auto-start

So what if you want to start a container automatically at boot time?

Well, that’s been supported for a long time in Ubuntu and other distros by using some init scripts and symlinks in /etc, but very recently (two days ago), this has now been implemented cleanly upstream.

So here’s how auto-started containers work nowadays:

As you may know, each container has a configuration file typically under
/var/lib/lxc/<container name>/config

That file is key = value with the list of valid keys being specified in lxc.conf(5).

The startup related values that are available are:

  • lxc.start.auto = 0 (disabled) or 1 (enabled)
  • lxc.start.delay = 0 (delay in seconds to wait after starting the container)
  • lxc.start.order = 0 (priority of the container; a higher value means it starts earlier)
  • lxc.group = group1,group2,group3,… (groups the container is a member of)

When your machine starts, an init script will ask “lxc-autostart” to start all containers of a given group (by default, all containers which aren’t in any) in the right order and waiting the specified time between them.

To illustrate that, edit /var/lib/lxc/p1/config and append those lines to the file:

lxc.start.auto = 1
lxc.group = ubuntu

And /var/lib/lxc/p2/config and append those lines:

lxc.start.auto = 1
lxc.start.delay = 5
lxc.start.order = 100

Doing that means that only the p2 container will be started at boot time (since only those without a group are by default), the order value won’t matter since it’s alone and the init script will wait 5s before moving on.
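If you’d rather script those edits than make them by hand, a small helper along these lines could do it. This is only a sketch: “enable_autostart” is a hypothetical function, and it blindly appends the keys without checking whether they are already present in the file.

```shell
#!/bin/sh
# Append auto-start keys to a container config file.
# enable_autostart is a hypothetical helper; it does not check for existing keys.
# Usage: enable_autostart CONFIG [ORDER] [DELAY]
enable_autostart() {
    config="$1" order="${2:-0}" delay="${3:-0}"
    {
        echo "lxc.start.auto = 1"
        echo "lxc.start.delay = $delay"
        echo "lxc.start.order = $order"
    } >> "$config"
}
```

Run as root, “enable_autostart /var/lib/lxc/p2/config 100 5” would append the same three lines used for p2 in this post.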

You may check what containers are automatically started using “lxc-ls”:

stgraber@castiana:~$ sudo lxc-ls --fancy
NAME    STATE    IPV4        IPV6                                    AUTOSTART     
---------------------------------------------------------------------------------
p1      RUNNING  10.0.3.128  2607:f2c0:f00f:2751:216:3eff:feb1:4c7f  YES (ubuntu)
p2      RUNNING  10.0.3.165  2607:f2c0:f00f:2751:216:3eff:fe3a:f1c1  YES

Now you can also manually play with those containers using the “lxc-autostart” command which lets you start/stop/kill/reboot any container marked with lxc.start.auto=1.

For example, you could do:

sudo lxc-autostart -a

Which will start any container that has lxc.start.auto=1 (ignoring the lxc.group value) which in our case means it’ll first start p2 (because of order = 100), then wait 5s (because of delay = 5) and then start p1 and return immediately afterwards.

If at that point you want to reboot all containers that are in the “ubuntu” group, you may do:

sudo lxc-autostart -r -g ubuntu

You can also pass “-L” with any of those commands which will simply print which containers would be affected and what the delays would be but won’t actually do anything (useful to integrate with other scripts).

Freezing your containers

Sometimes containers may be running daemons that take time to shut down or restart, yet you don’t want to keep the container running because you’re not actively using it at the time.

In such cases, “sudo lxc-freeze -n <container name>” can be used. That very simply freezes all the processes in the container so they won’t get any time allocated by the scheduler. However the processes will still exist and will still use whatever memory they used to.

Once you need the service again, just call “sudo lxc-unfreeze -n <container name>” and all the processes will resume where they left off.
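The two commands also combine nicely in scripts. The sketch below freezes a container, runs an arbitrary command on the host while it is frozen, then unfreezes it. “with_frozen” is a hypothetical wrapper and the use case (doing something on the host while the container is paused) is just an assumption for illustration.

```shell
#!/bin/sh
# Freeze a container, run a command on the host, then unfreeze it again.
# with_frozen is a hypothetical helper built on lxc-freeze / lxc-unfreeze.
# Usage: with_frozen CONTAINER COMMAND [ARGS...]
with_frozen() {
    container="$1"
    shift
    sudo lxc-freeze -n "$container" || return 1
    "$@"
    status=$?
    # Always resume the container, even if the command failed.
    sudo lxc-unfreeze -n "$container"
    return "$status"
}
```

The function returns the exit status of the command it ran, so it can be used in conditionals like any other command.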

Networking

As you may have noticed in the configuration file while you were setting the auto-start settings, LXC has a relatively flexible network configuration.
By default in Ubuntu we allocate one “veth” device per container which is bridged into a “lxcbr0” bridge on the host on which we run a minimal dnsmasq dhcp server.

While that’s usually good enough for most people, you may want something slightly more complex, such as multiple network interfaces in the container or passing through physical network interfaces. The details of all of those options are listed in lxc.conf(5) so I won’t repeat them here; however, here’s a quick example of what can be done.

lxc.network.type = veth
lxc.network.hwaddr = 00:16:3e:3a:f1:c1
lxc.network.flags = up
lxc.network.link = lxcbr0
lxc.network.name = eth0

lxc.network.type = veth
lxc.network.link = virbr0
lxc.network.name = virt0

lxc.network.type = phys
lxc.network.link = eth2
lxc.network.name = eth1

With this setup my container will have 3 interfaces, eth0 will be the usual veth device in the lxcbr0 bridge, eth1 will be the host’s eth2 moved inside the container (it’ll disappear from the host while the container is running) and virt0 will be another veth device in the virbr0 bridge on the host.

Those last two interfaces don’t have a mac address or network flags set, so they’ll get a random mac address at boot time (non-persistent) and it’ll be up to the container to bring the link up.
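If you want one of those interfaces to keep the same mac address across boots, one option is to generate an address once and add it to the config. The sketch below prints the two lines to add, using the same 00:16:3e prefix as the first interface in the example. “random_hwaddr” is a hypothetical helper, and reading /dev/urandom through “od” is just one way to get random bytes.

```shell
#!/bin/sh
# Print a random MAC address in the 00:16:3e prefix used in the example above.
# random_hwaddr is a hypothetical helper; it reads 3 random bytes from /dev/urandom.
random_hwaddr() {
    # od prints the bytes as " aa bb cc"; set -- splits them into $1 $2 $3.
    set -- $(od -An -N3 -tx1 /dev/urandom)
    printf '00:16:3e:%s:%s:%s\n' "$1" "$2" "$3"
}

# Emit config lines that pin the address and bring the link up.
printf 'lxc.network.hwaddr = %s\n' "$(random_hwaddr)"
printf 'lxc.network.flags = up\n'
```

The printed lines would go right after the “lxc.network.type” entry of the interface they should apply to.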

Attach

Provided you are running a sufficiently recent kernel, that is 3.8 or higher, you may use the “lxc-attach” tool. Its most basic feature is to give you a standard shell inside a running container:

sudo lxc-attach -n p1

You may also use it from scripts to run actions in the container, such as:

sudo lxc-attach -n p1 -- restart ssh

But it’s a lot more powerful than that. For example, take:

sudo lxc-attach -n p1 -e -s 'NETWORK|UTSNAME'

In that case, you’ll get a shell that says “root@p1” (thanks to UTSNAME), running “ifconfig -a” from there will list the container’s network interfaces. But everything else will be that of the host. Also passing “-e” means that the cgroup, apparmor, … restrictions won’t apply to any processes started from that shell.

This can be very useful at times to spawn software located on the host but inside the container’s network or pid namespace.
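As a sketch of that last use case, the wrapper below runs a binary from the host’s filesystem inside a container’s network namespace only. “in_container_net” is a hypothetical name; it uses the same “-e” and “-s” options as above, so keep in mind that the usual restrictions won’t apply to whatever it runs.

```shell
#!/bin/sh
# Run a command from the host's filesystem inside a container's network namespace.
# in_container_net is a hypothetical wrapper around the lxc-attach call shown above.
# Usage: in_container_net CONTAINER COMMAND [ARGS...]
in_container_net() {
    container="$1"
    shift
    sudo lxc-attach -n "$container" -e -s NETWORK -- "$@"
}
```

For example, “in_container_net p1 ip addr show” would run the host’s “ip” binary but list p1’s interfaces.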

Passing devices to a running container

It’s great being able to enter and leave the container at will, but what about accessing some random devices on your host?

By default LXC will prevent any such access using the devices cgroup as a filtering mechanism. You could edit the container configuration to allow the right additional devices and then restart the container.

But for one-off things, there’s also a very convenient tool called “lxc-device”.
With it, you can simply do:

sudo lxc-device add -n p1 /dev/ttyUSB0 /dev/ttyS0

Which will add (mknod) /dev/ttyS0 in the container with the same type/major/minor as /dev/ttyUSB0 and then add the matching cgroup entry allowing access from the container.

The same tool also allows moving network devices from the host to within the container.
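For the other route mentioned earlier in this section, editing the container configuration, you need the device’s type and major/minor numbers for the devices cgroup entry. The sketch below derives such a line from an existing device node on the host. “devices_allow_line” is a hypothetical helper, it assumes GNU stat, and it only prints the cgroup entry (the device node itself still has to exist inside the container).

```shell
#!/bin/sh
# Print the lxc.cgroup.devices.allow line matching an existing device node.
# devices_allow_line is a hypothetical helper; it assumes GNU stat (coreutils).
devices_allow_line() {
    node="$1"
    if [ -c "$node" ]; then
        kind=c
    elif [ -b "$node" ]; then
        kind=b
    else
        echo "$node is not a character or block device" >&2
        return 1
    fi
    # %t and %T print the major and minor numbers in hexadecimal.
    major=$((0x$(stat -c '%t' "$node")))
    minor=$((0x$(stat -c '%T' "$node")))
    echo "lxc.cgroup.devices.allow = $kind $major:$minor rwm"
}
```

Running “devices_allow_line /dev/ttyUSB0” on the host prints a line that can be appended to the container’s config before restarting it.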

About Stéphane Graber

Project leader of Linux Containers, Linux hacker, Ubuntu core developer, conference organizer and speaker.

32 Responses to LXC 1.0: Your second container [2/10]

  1. wwwwww says:

    sooooo, you claim that you can ‘add’ a device to an existing container? how is that supposed to work? i mean, /sys is not virtualized (with the exception of the net device stuff), so enumeration doesnt really work. it will tell you about devices that dont exist. moreover since udev doesnt run in containers you lack all the enumeration metadata apps require.

    aren’t you making a promise with this you cannot keep? this all looks very half-baked to me…

    1. lxc-device does what it says it does. It allows you to pass an existing character or block device from the host to the container. The container doesn’t need udev for that to work. Of course the container will be able to enumerate devices which it can’t access and that’ll be true until we get a device namespace but for normal use it’s not a problem.

      It’s also inaccurate that udev won’t run in a container. In fact we’ve been running it in all Ubuntu containers for the past 2 years.

      Udev in an Ubuntu container will let you enumerate any device that was added after the container started, our default configuration will let udev create device nodes but only access those that are allowed in the configuration.

      1. wwwwww says:

        A naked block device is such a useless thing if no app can enumerate them. I mean, this is not 1970’s UNIX where a device node was everything that existed and enumeration was done by listing contents of /dev…

        So you run the udev rules in each container you manage? Yikes, the rules contain active code, if you do that then things keep overriding each other.

        You can run udevd pretty much exactly once on each system without having the rules interfere with each other. Of course you can play games and not run udevd on the main system, but in one of its containers…

        But well, /sys and udev device management is not virtualized…, and you can lie to yourself as much as you want, but you are really just making a promise here that you cannot keep, and that will cause people to start enumerating /dev again, instead of /sys.

        1. Apparmor in Ubuntu will prevent any container from triggering uevents, so the only downside of having udev run in containers is that any new uevent coming from the host will be processed in the container. But that’s also exactly what you usually want.

          Our apparmor policy will avoid any of the usual udev cascade effect you have when multiple udevs are running with one triggering a uevent which triggers them all, etc…

          Anyway, this is far from ideal, but while we wait for people to figure out exactly how a device namespace should work (there are a dozen of proposals at the moment, all of which have been rejected by kernel developers in a way or another), this isn’t a bad compromise and this actually works out of the box for most things people care about (passing dri/drm devices, webcams, disk and partitions).

          Enumeration through /sys will always work in containers since as you said, it’s not namespaced, so you can always list all the devices from your host and thankfully most userspace software are sane enough to then hide anything they can’t actually access (so you end up seeing the subset of devices that are in /sys and have been allowed in the container).

          1. Using a USB video camera is a good example: one can set up `/dev/video0` in an lxc container, however, it will not work with the `cheese` webcam app until you create, by hand, a file `/run/udev/data/c81:0` and copy the host’s version of this into the container (c81:0 refers to the character device’s major:minor device number). Create this file, and cheese works in the container. Without it, nothing. That’s the issue with udev and lxc, in a nutshell.

  2. Mehmet Ali Buyukkarakas says:

    Hello.
    I’m trying to install my containers in a VirtualBox VM which is running Oracle Linux 6.5 on my laptop. I can install and run it successfully, but I can’t reach the container from outside networks.

    The VM has only one interface, which is running in bridged mode and getting an IP from my wireless network. The container is running in macvlan mode and also getting an IP from the wireless pool.

    I can ping the container from the VM and vice versa. I installed an Apache server in the container, but I can’t reach Apache from my laptop’s browser or anything else.

    Do you have any idea ?

    Thanks anyway.
    Mehmet

    1. Dwight Engen says:

      I’ve never had good luck using lxc macvlan with a vbox bridged adapter.
      Here is what I do to give lxc containers running under VirtualBox
      an IP on the VirtualBox’s host’s network in order to test network servers etc… in a container.

      In this example I’m using the names lxc-ol65-01 which is a container
      running on vbox-host, which is running on hw-host. In addition hw-host
      is on 192.168.1.x and I want the lxc-host to also be on 192.168.1.x:

      – In the virtual box settings for vbox-host, add an additional bridged
      adapter that will be used exclusively by one lxc container, bridged
      to your hw-host adapter that is on 192.168.1.x. Let’s assume it shows
      up in vbox-host as eth1.
      – Make sure vbox-host isn’t assigning an ip to eth1:
      vbox-host# cat /etc/sysconfig/network-scripts/ifcfg-eth1

      DEVICE=eth1
      BOOTPROTO=none
      ONBOOT=no
      NM_CONTROLLED=no
      TYPE=Ethernet

      You might need to chkconfig NetworkManager off to make sure it
      doesn’t configure it either, since it doesn’t seem to respect
      NM_CONTROLLED=no.

      – Set the network configuration for your container to use phys
      pass through mode:

      vbox-host# grep -i network /container/lxc-ol65-01/config

      # Networking
      lxc.network.type = phys
      lxc.network.name = eth0
      lxc.network.link = eth1
      lxc.network.flags = up
      lxc.network.mtu = 1500

      – Set the container to dhcp a name from the dhcp server on hw-host’s
      network, setting DHCP_HOSTNAME is nice so you can just refer to the
      container by name everywhere:

      vbox-host# cat /container/lxc-ol65-01/rootfs/etc/sysconfig/network-scripts/ifcfg-eth0

      DEVICE=eth0
      BOOTPROTO=dhcp
      ONBOOT=yes
      HOSTNAME=lxc-ol65-01
      DHCP_HOSTNAME=lxc-ol65-01
      NM_CONTROLLED=no
      TYPE=Ethernet

      Now when you lxc-start -n lxc-ol65-01 it should take over vbox-host’s
      eth1, which is bridged to hw-host’s interface and get an IP in the
      container in the 192.168.1.x range. Since the lxc oracle template runs
      sshd by default, you should straight away be able to ssh lxc-ol65-01
      from any host in 192.168.1.x. Hopefully this helps 🙂

  3. lxcfanboy says:

    Do I have to care about capabilities when running an unprivileged container?

    1. Not really no since any capability that a process in the container would have will be limited only to that user namespace and won’t apply outside of it.

  4. David Shwartz says:

    Hello,
    Could it be that lxc-device is only available when building with python
    enabled?
    I built lxc (from git) with ./configure --disable-python and ran make && make install as usual; lxc-device is not available:
    lxc-device
    bash: lxc-device: command not found

    Regards,
    David

    1. Yes, it’s a tool using our python3 API so you need to build with --enable-python to have it.

  5. cam says:

    Can you please describe how to enable the tun device to be able to run an openconnect vpn in a container?

    1. /dev/net/tun is allowed by default in our config (at least for Ubuntu), you may however need to create the /dev/net/tun device with “mknod /dev/net/tun c 10 200”.

  6. cam says:

    It does not look like mknod works on Fedora.
    Editing the template to allow /dev/net/tun does not seem to have any effect…
    Is there some configuration option that disables this stuff?

    1. Hmm, yes, on Fedora you’ll at least need “lxc.cgroup.devices.allow = c 10:200 rwm” and possibly clear some of the capabilities so you can actually perform the mknod (though I’m not sure which of the dropped capabilities could account for failure to mknod…).

  7. cam says:

    I was finally able to get an lxc installation on Fedora that can run a VPN.
    I had to create the tun device with mknod. Thanks.

  8. Jack says:

    Thanks for your work, which let me finish my homework! But I ran into a problem when using lxc. I defined and started a container with the virsh command, but it would not start, failing with the error [error : lxcControllerRun:1440 : root source /var/lib/lxc/o2/rootfs does not exist: No such file or directory], where “o2” is the container name.
    The setting xml file is as follows:

    o2
    332768

    exe
    /sbin/init

    1

    destroy
    restart
    destroy

    /usr/lib/libvirt/libvirt_lxc

    I do not know the cause of this problem and no official document mentions this error.
    Hoping for your help, thanks again!

  9. jfleroux says:

    I just want to say thanks for your (hard) work: just upgraded to 1.0.3 on Ubuntu 14.04 LTS and it works like a charm!

  10. The new shell will inherit the environment from the shell calling “Attach”. Is there a way to avoid this and make the new shell source its own environment?

  11. Christian says:

    If anyone gets a permission error when trying to lxc-start p2 on Ubuntu 14.04 with the LXC daily PPA, setting AppArmor to complain-mode for lxc-start solved it for me:

    # apt-get install apparmor-utils
    $ aa-complain /usr/bin/lxc-start

    (via https://github.com/docker/docker/issues/2702)

  12. Josh says:

    Thanks for the great tutorial just letting you know the link to lxc.conf(5) is broken and should be https://linuxcontainers.org/lxc/manpages/man5/lxc.conf.5.html instead. 🙂
    Cheers
    Josh

  13. A.Glaeser says:

    On Debian Jessie, I tried to assign two containers with static IPs as veth to the same bridge on the host: cbr0
    I could ping the second container’s network interface from the LAN, but could not reach the internet from within it.
    Probably I have to set a route inside it in order to reach the gateway. Some sources say, that each bridge can only connect two interfaces, virtual switching does not exist, besides the SpanningTreeProtocol.
    Or should I use libvirt in order to make multi-container networking easier??

  14. Suraj W says:

    hello…
    Does LXC only run on a 64-bit OS?

  15. Evgeny Eralev says:

    Dear Stephane,

    Could you please advise how I can permanently pass a device to a container via the config file?
    So lxc-device -n C1 add /dev/sdb /dev/sda is working, but I can’t find any reference on how to do it correctly in the config during container bootstrap.

    1. vance says:

      Where’s the answer to this?

  16. GH says:

    Hi all,

    I cannot use the device ttyS0 in an unprivileged container.
    The “open” function returns a permission denied

    I have defined “lxc.cgroup.devices.allow = c 4:64 rwm” and “chmod 666”
    “/dev/ttyS0” on the host (and even in the container). I type “lxc-device -n … add /dev/ttyS0”. The device is then in the container.

    It works in a privileged container.
    I have the same problem with another device (/dev/ttyUSB0).

    What did I miss ?

    I am using the 1.1.4 release.

    1. GH says:

      Also I added “lxc.mount.entry = /dev/ttyS0 dev/ttyS0 none bind,optional” in the configuration.

  17. Norbert says:

    Shutdown containers in ubuntu when started in autostart mode:

    Hello Stephane,
    I’m impressed by how nicely the containers work in Ubuntu LTS 14.04. One question from me, because I can’t figure it out from the manpages: do I have to do something to shut down a container at host shutdown, or is that handled automatically when the autostart boot option is enabled?
    Thanks for Reply,
    Norbert

  18. Robert Threet says:

    i’m still getting this:

    Got CONNECT response: HTTP/1.1 200 OK
    CSTP connected. DPD 30, Keepalive 20
    which: no ip in (/sbin:/usr/sbin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin:/usr/games)
    mknod: /dev/net/tun: Operation not permitted
    Failed to bind local tun device (TUNSETIFF): Inappropriate ioctl for device
    Set up tun device failed

    Tried this:

    root@slacks:/etc/lxc# cat lxc-usernet
    root veth tun lxcbr0 10

    root@slacks:/etc/lxc# cat default.conf
    lxc.network.type=veth
    lxc.network.link=br0
    lxc.network.flags=up
    lxc.cgroup.devices.deny = a
    lxc.cgroup.devices.allow = c 10:200 rwm
    lxc.hook.autodev = sh -c "modprobe tun; cd ${LXC_ROOTFS_MOUNT}/dev; mkdir net; mknod net/tun c 10 200; chmod 0666 net/tun"
    lxc.cgroup.devices.allow = c 10:200 rwm

  19. cam says:

    Here are my notes from when I set this up (last was on a Fedora 25 machine)
    To enable TUN/TAP in the container MYCONTAINER:
    1. create and make executable /var/lib/lxc/MYCONTAINER/autodev containing:
    #!/bin/sh

    cd ${LXC_ROOTFS_MOUNT}/dev
    mkdir net
    mknod net/tun c 10 200
    chmod 0666 net/tun

    2. add
    lxc.cgroup.devices.allow = c 10:200 rwm
    lxc.hook.autodev=/var/lib/lxc/MYCONTAINER/autodev
    to
    /var/lib/lxc/office/config

  20. vance says:

    How does one persist adding a block device to a container?
