LXD 2.0: Blog post series [0/12]

As we are getting closer and closer to tagging the final releases of LXC, LXD and LXCFS 2.0, I figured it would be a good idea to talk a bit about everything that went into LXD since we first started that project a year and a half ago.

LXD logo

This is going to be a blog post series similar to what I’ve done for LXC 1.0 a couple years back.

The topics that will be covered are:

Related posts:

I’m hoping to post a couple of those every week for the coming month and a half leading to the Ubuntu 16.04 release.

If you can’t wait for all of those to come out to play with LXD, you can also take the guided tour and play with LXD, online through our online demo page.

Posted in Canonical voices, LXD, Planet Ubuntu | Tagged | 83 Comments

NorthSec 2015: behind the scenes

TLDR: NorthSec is an incredible security event, our CTF simulates a whole internet for every participating team. This allows us to create just about anything, from a locked down country to millions of vulnerable IoT devices spread across the globe. However that flexibility comes at a high cost hardware-wise, as we’re getting bigger and bigger, we need more and more powerful servers and networking gear. We’re very actively looking for sponsors so get in touch with me or just buy us something on Amazon!

What’s NorthSec?

NorthSec is one of the biggest on-site Capture The Flag (CTF), security contest in North America. It’s organized yearly over a weekend in Montreal (usually in May) and since the last edition, has been accompanied by a two days security conference before the CTF itself. The rest of this post will only focus on the CTF part though.

the-room

A view of the main room at NorthSec 2015

Teams arrive at the venue on Friday evening, get setup at their table and then get introduced to this year’s scenario and given access to our infrastructure. There they will have to fight their way through challenges, each earning them points and letting them go further and further. On Sunday afternoon, the top 3 teams are awarded their prize and we wrap up for the year.

Size wise, for the past two years we’ve had a physical limit of up to 32 teams of 8 participants and then a bunch of extra unaffiliated visitors. For the 2016 edition, we’re raising this to 50 teams for a grand total of 400 participants, thanks to some shuffling at the venue making some more room for us.

Why is it special?

The above may sound pretty simple and straightforward, however there are a few important details that sets NorthSec apart from other CTFs.

  • It is entirely on-site. There are some very big online CTFs out there but very few on-site ones. Having everyone participating in the same room is valuable from a networking point of view but also ensures fairness by enforcing fixed size teams and equal network bandwidth and latency.
  • Every team gets its very own copy of the whole infrastructure. There are no shared services in the simulated world we provide them. That means one team’s actions cannot impact another.
  • Each simulation is its own virtual world with its own instance of the internet, we use hundreds of LXC containers and thousands of VLANs and networks FOR EVERY TEAM to provide the most realistic and complete environment you can think of.
World map of our fake internet

World map of our fake internet

What’s our infrastructure like?

Due to the very high bandwidth and low latency requirements, most of the infrastructure is hosted on premises and on our hardware. We do plan on offloading Windows virtual machines to a public cloud for the next edition though.

We also provide a mostly legacy free environment to our contestants, all of our challenges are connected to IPv6-only networks and run on 64bit Ubuntu LTS  in LXC with state of the art security configurations.

Our rack

Our rack, on location at NorthSec 2015

 

All in all, for 32 teams (last year’s edition), we had:

  • 48000 virtual network interfaces
  • 2000 virtual carriers
  • 16000 BGP routers
  • 17000 Ubuntu containers
  • 100 Windows virtual machines
  • 20000 routing table entries

And all of that was running on:

  • Two firewalls (DELL SC1425)
  • Two infrastructure servers (DELL SC1425)
  • One management server (HP DL380 G5)
  • Four main contest hosts (HP DL380 G5)
  • Three backup contest hosts (DELL C6100)

On average we had 7 full simulations and 21 virtual machines running on every host (the backup hosts only had one each). That means each of the main contest hosts had:

  • 10500 virtual network interfaces
  • 435 virtual carriers
  • 3500 BGP routers
  • 3700 Ubuntu containers
  • 21 Windows virtual machines
  • 4375 routing table entries

Not too bad for servers that are (SC1425) or are getting close (DL380 G5) to being 10 years old now.

Past infrastructure challenges

In the past editions we’ve found numerous bugs in the various technologies we use when put under such a crazy load:

  • A variety of switch firmware bugs when dealing with several thousand IPv6-only networks.
  • Multiple Linux IPv6 kernel bugs (and one security issue) also related to an excess of IPv6 multicast traffic.
  • Several memory leaks and other bugs in LXC and related components that become very visible when you’re running upwards of 10000 containers.
  • Several more Linux kernel bugs related to performance scaling as we create more and more namespaces and nested namespaces.

As our infrastructure staff is very invested in these technologies by being upstream developers or contributors to the main projects we use, those bugs were all rapidly reported, discussed and fixed. We always look forward to the next NorthSec as an opportunity to test the latest technology at scale in a completely controlled environment.

How can you help?

As I mentioned, we’ve been capped at 32 teams and around 300 attendees for the past two years. Our existing hardware was barely sufficient  to handle  the load during those two editions, we urgently need to refresh our hardware to offer the best possible experience to our participants.

We’re planning on replacing most if not all of our hardware with slightly more recent equivalents, also upgrading from rotating drives to SSDs and improving our network. On the software side, we’ll be upgrading to a newer Linux kernel, possibly to Ubuntu 16.04, switch from btrfs to zfs and from LXC to LXD.

We are a Canadian non-profit organization with all our staff being volunteers so we very heavily rely on sponsors to be able to make the event a success.

If you or your company would like to help by sponsoring our infrastructure, get in touch with me. We have several sponsoring levels and can get you the visibility you’d like, ranging from a mention on our website and at the event to on-site presence with a recruitment booth and even, if our interests align, inclusion of your product in some of our challenges.

We also have an Amazon wishlist of smaller (cheaper) items that we need to buy in the near future. If you buy something from the list, get in touch so we can properly thank you!

Oh and as I briefly mentioned at the beginning, we have a two days, single-track conference ahead of the CTF. We’re actively looking for speakers, if you have something interesting to present, the CFP is here.

Extra resources

Posted in Canonical voices, LXC, LXD, Planet Ubuntu | Tagged | 7 Comments

Getting started with LXD – the container lightervisor

Introduction

For the past 6 months, Serge Hallyn, Tycho Andersen, Chuck Short, Ryan Harper and myself have been very busy working on a new container project called LXD.

Ubuntu 15.04, due to be released this Thursday, will contain LXD 0.7 in its repository. This is still the early days and while we’re confident LXD 0.7 is functional and ready for users to experiment, we still have some work to do before it’s ready for critical production use.

LXD logo

So what’s LXD?

LXD is what we call our container “lightervisor”. The core of LXD is a daemon which offers a REST API to drive full system containers just like you’d drive virtual machines.

The LXD daemon runs on every container host and client tools then connect to those to manage those containers or to move or copy them to another LXD.

We provide two such clients:

  • A command line tool called “lxc”
  • An OpenStack Nova plugin called nova-compute-lxd

The former is mostly aimed at small deployments ranging from a single machine (your laptop) to a few dozen hosts. The latter seamlessly integrates inside your OpenStack infrastructure and lets you manage containers exactly like you would virtual machines.

Why LXD?

LXC has been around for about 7 years now, it evolved from a set of very limited tools which would get you something only marginally better than a chroot, all the way to the stable set of tools, stable library and active user and development community that we have today.

Over those years, a lot of extra security features were added to the Linux kernel and LXC grew support for all of them. As we saw the need for people to build their own solution on top of LXC, we’ve developed a public API and a set of bindings. And last year, we’ve put out our first long term support release which has been a great success so far.

That being said, for a while now, we’ve been wanting to do a few big changes:

  • Make LXC secure by default (rather than it being optional).
  • Completely rework the tools to make them simpler and less confusing to newcomers.
  • Rely on container images rather than using “templates” to build them locally.
  • Proper checkpoint/restore support (live migration).

Unfortunately, solving any of those means doing very drastic changes to LXC which would likely break our existing users or at least force them to rethink the way they do things.

Instead, LXD is our opportunity to start fresh. We’re keeping LXC as the great low level container manager that it is. And build LXD on top of it, using LXC’s API to do all the low level work. That achieves the best of both worlds, we keep our low level container manager with its API and bindings but skip using its tools and templates, instead replacing those by the new experience that LXD provides.

How does LXD relate to LXC, Docker, Rocket and other container projects?

LXD is currently based on top of LXC. It uses the stable LXC API to do all the container management behind the scene, adding the REST API on top and providing a much simpler, more consistent user experience.

The focus of LXD is on system containers. That is, a container which runs a clean copy of a Linux distribution or a full appliance. From a design perspective, LXD doesn’t care about what’s running in the container.

That’s very different from Docker or Rocket which are application container managers (as opposed to system container managers) and so focus on distributing apps as containers and so very much care about what runs inside the container.

There is absolutely nothing wrong with using LXD to run a bunch of full containers which then run Docker or Rocket inside of them to run their different applications. So letting LXD manage the host resources for you, applying all the security restrictions to make the container safe and then using whatever application distribution mechanism you want inside.

Getting started with LXD

The simplest way for somebody to try LXD is by using it with its command line tool. This can easily be done on your laptop or desktop machine.

On an Ubuntu 15.04 system (or by using ppa:ubuntu-lxc/lxd-stable on 14.04 or above), you can install LXD with:

sudo apt-get install lxd

Then either logout and login again to get your group membership refreshed, or use:

newgrp lxd

From that point on, you can interact with your newly installed LXD daemon.

The “lxc” command line tool lets you interact with one or multiple LXD daemons. By default it will interact with the local daemon, but you can easily add more of them.

As an easy way to start experimenting with remote servers, you can add our public LXD server at https://images.linuxcontainers.org:8443
That server is an image-only read-only server, so all you can do with it is list images, copy images from it or start containers from it.

You’ll have to do the following to: add the server, list all of its images and then start a container from one of them:

lxc remote add images images.linuxcontainers.org
lxc image list images:
lxc launch images:ubuntu/trusty/i386 ubuntu-32

What the above does is define a new “remote” called “images” which points to images.linuxcontainers.org. Then list all of its images and finally start a local container called “ubuntu-32” from the ubuntu/trusty/i386 image. The image will automatically be cached locally so that future containers are started instantly.

The “<remote name>:” syntax is used throughout the lxc client. When not specified, the default “local” remote is assumed. Should you only care about managing a remote server, the default remote can be changed with “lxc remote set-default”.

Now that you have a running container, you can check its status and IP information with:

lxc list

Or get even more details with:

lxc info ubuntu-32

To get a shell inside the container, or to run any other command that you want, you may do:

lxc exec ubuntu-32 /bin/bash

And you can also directly pull or push files from/to the container with:

lxc file pull ubuntu-32/path/to/file .
lxc file push /path/to/file ubuntu-32/

When done, you can stop or delete your container with one of those:

lxc stop ubuntu-32
lxc delete ubuntu-32

What’s next?

The above should be a reasonably comprehensive guide to how to use LXD on a single system. Of course, that’s not the most interesting thing to do with LXD. All the commands shown above can work against multiple hosts, containers can be remotely created, moved around, copied, …

LXD also supports live migration, snapshots, configuration profiles, device pass-through and more.

I intend to write some more posts to cover those use cases and features as well as highlight some of the work we’re currently busy doing.

LXD is a pretty young but very active project. We’ve had great contributions from existing LXC developers as well as newcomers.

The project is entirely developed in the open at https://github.com/lxc/lxd. We keep track of upcoming features and improvements through the project’s issue tracker, so it’s easy to see what will be coming soon. We also have a set of issues marked “Easy” which are meant for new contributors as easy ways to get to know the LXD code and contribute to the project.

LXD is an Apache2 licensed project, written in Go and which doesn’t require a CLA to contribute to (we do however require the standard DCO Signed-off-by). It can be built with both golang and gccgo and so works on almost all architectures.

Extra resources

More information can be found on the official LXD website:
https://linuxcontainers.org/lxd

The code, issues and pull requests can all be found on Github:
https://github.com/lxc/lxd

And a good overview of the LXD design and its API may be found in our specs:
https://github.com/lxc/lxd/tree/master/specs

Conclusion

LXD is a new and exciting project. It’s an amazing opportunity to think fresh about system containers and provide the best user experience possible, alongside great features and rock solid security.

With 7 releases and close to a thousand commits by 20 contributors, it’s a very active, fast paced project. Lots of things still remain to be implemented before we get to our 1.0 milestone release in early 2016 but looking at what was achieved in just 5 months, I’m confident we’ll have an incredible LXD in another 12 months!

For now, we’d welcome your feedback, so install LXD, play around with it, file bugs and let us know what’s important for you next.

Posted in Canonical voices, LXC, LXD, Planet Revolution-Linux, Planet Ubuntu | Tagged | 40 Comments

VPN in containers

I often have to deal with VPNs, either to connect to the company network, my own network when I’m abroad or to various other places where I’ve got servers I manage.

All of those VPNs use OpenVPN, all with a similar configuration and unfortunately quite a lot of them with overlapping networks. That means that when I connect to them, parts of my own network are no longer reachable or it means that I can’t connect to more than one of them at once.

Those I suspect are all pretty common issues with VPN users, especially those working with or for companies who over the years ended up using most of the rfc1918 subnets.

So I thought, I’m working with containers every day, nowadays we have those cool namespaces in the kernel which let you run crazy things as a a regular user, including getting your own, empty network stack, so why not use that?

Well, that’s what I ended up doing and so far, that’s all done in less than 100 lines of good old POSIX shell script 🙂

That gives me, fully unprivileged non-overlapping VPNs! OpenVPN and everything else run as my own user and nobody other than the user spawning the container can possibly get access to the resources behind the VPN.

The code is available at: git clone git://github.com/stgraber/vpn-container

Then it’s as simple as: ./start-vpn VPN-NAME CONFIG

What happens next is the script will call socat to proxy the VPN TCP socket to a UNIX socket, then a user namespace, network namespace, mount namespace and uts namespace are all created for the container. Your user is root in that namespace and so can start openvpn and create network interfaces and routes. With careful use of some bind-mounts, resolvconf and byobu are also made to work so DNS resolution is functional and we can start byobu to easily allow as many shell as you want in there.

In the end it looks like this:

stgraber@dakara:~/vpn$ ./start-vpn stgraber.net ../stgraber-vpn/stgraber.conf 
WARN: could not reopen tty: No such file or directory
lxc: call to cgmanager_move_pid_abs_sync(name=systemd) failed: invalid request
Fri Sep 26 17:48:07 2014 OpenVPN 2.3.2 x86_64-pc-linux-gnu [SSL (OpenSSL)] [LZO] [EPOLL] [PKCS11] [eurephia] [MH] [IPv6] built on Feb  4 2014
Fri Sep 26 17:48:07 2014 WARNING: No server certificate verification method has been enabled.  See http://openvpn.net/howto.html#mitm for more info.
Fri Sep 26 17:48:07 2014 NOTE: the current --script-security setting may allow this configuration to call user-defined scripts
Fri Sep 26 17:48:07 2014 Attempting to establish TCP connection with [AF_INET]127.0.0.1:1194 [nonblock]
Fri Sep 26 17:48:07 2014 TCP connection established with [AF_INET]127.0.0.1:1194
Fri Sep 26 17:48:07 2014 TCPv4_CLIENT link local: [undef]
Fri Sep 26 17:48:07 2014 TCPv4_CLIENT link remote: [AF_INET]127.0.0.1:1194
Fri Sep 26 17:48:09 2014 [vorash.stgraber.org] Peer Connection Initiated with [AF_INET]127.0.0.1:1194
Fri Sep 26 17:48:12 2014 TUN/TAP device tun0 opened
Fri Sep 26 17:48:12 2014 Note: Cannot set tx queue length on tun0: Operation not permitted (errno=1)
Fri Sep 26 17:48:12 2014 do_ifconfig, tt->ipv6=1, tt->did_ifconfig_ipv6_setup=1
Fri Sep 26 17:48:12 2014 /sbin/ip link set dev tun0 up mtu 1500
Fri Sep 26 17:48:12 2014 /sbin/ip addr add dev tun0 172.16.35.50/24 broadcast 172.16.35.255
Fri Sep 26 17:48:12 2014 /sbin/ip -6 addr add 2001:470:b368:1035::50/64 dev tun0
Fri Sep 26 17:48:12 2014 /etc/openvpn/update-resolv-conf tun0 1500 1544 172.16.35.50 255.255.255.0 init
dhcp-option DNS 172.16.20.30
dhcp-option DNS 172.16.20.31
dhcp-option DNS 2001:470:b368:1020:216:3eff:fe24:5827
dhcp-option DNS nameserver
dhcp-option DOMAIN stgraber.net
Fri Sep 26 17:48:12 2014 add_route_ipv6(2607:f2c0:f00f:2700::/56 -> 2001:470:b368:1035::1 metric -1) dev tun0
Fri Sep 26 17:48:12 2014 add_route_ipv6(2001:470:714b::/48 -> 2001:470:b368:1035::1 metric -1) dev tun0
Fri Sep 26 17:48:12 2014 add_route_ipv6(2001:470:b368::/48 -> 2001:470:b368:1035::1 metric -1) dev tun0
Fri Sep 26 17:48:12 2014 add_route_ipv6(2001:470:b511::/48 -> 2001:470:b368:1035::1 metric -1) dev tun0
Fri Sep 26 17:48:12 2014 add_route_ipv6(2001:470:b512::/48 -> 2001:470:b368:1035::1 metric -1) dev tun0
Fri Sep 26 17:48:12 2014 Initialization Sequence Completed


To attach to this VPN, use: byobu -S /home/stgraber/vpn/stgraber.net.byobu
To kill this VPN, do: byobu -S /home/stgraber/vpn/stgraber.net.byobu kill-server
or from inside byobu: byobu kill-server

After that, just copy/paste the byobu command and you’ll get a shell inside the container. Don’t be alarmed by the fact that you’re root in there. root is mapped to your user’s uid and gid outside the container so it’s actually just your usual user but with a different name and with privileges against the resources owned by the container.

You can now use the VPN as you want without any possible overlap or conflict with any route or VPN you may be running on that system and with absolutely no possibility that a user sharing your machine may access your running VPN.

This has so far been tested with 5 different VPNs, on a regular Ubuntu 14.04 LTS system with all VPNs being TCP based. UDP based VPNs would probably just need a couple of tweaks to the socat unix-socket proxy.

Enjoy!

Posted in Canonical voices, LXC, Planet Ubuntu | Tagged | 7 Comments

LXC 1.0 now available!

After 10 months of work, over a thousand contributions by 60 or so contributors, we’ve finally released LXC 1.0!

You may have followed my earlier series of blog post on LXC 1.0, well, everything I described in there is now available in a stable release which we intend to support for a long time.

In the immediate future, I expect most of LXC upstream will focus on dealing with the bug reports and questions which will no doubt follow this release, then we’ll have to discuss what our goals for LXC 1.1 are and setup a longer term roadmap to LXC 2.0.

But right now, I’m just happy to have LXC 1.0 out, get a lot more users to play with new technologies like unprivileged containers and play with our API in the various languages we support.

Thanks to everyone who made this possible!

Posted in Canonical voices, LXC, Planet Ubuntu | Tagged | 3 Comments