NVidia CUDA inside a LXD container

LXD logo

GPU inside a container

LXD supports GPU passthrough but this is implemented in a very different way than what you would expect from a virtual machine. With containers, rather than passing a raw PCI device and have the container deal with it (which it can’t), we instead have the host setup with all needed drivers and only pass the resulting device nodes to the container.

This post focuses on NVidia and the CUDA toolkit specifically, but LXD’s passthrough feature should work with all other GPUs too. NVidia is just what I happen to have around.

The test system used below is a virtual machine with two NVidia GT 730 cards attached to it. Those are very cheap, low performance GPUs, that have the advantage of existing in low-profile PCI cards that fit fine in one of my servers and don’t require extra power.
For production CUDA workloads, you’ll want something much better than this.

Note that for this to work, you’ll need LXD 2.5 or higher.

Host setup

Install the CUDA tools and drivers on the host:

wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
sudo apt update
sudo apt install cuda

Then reboot the system to make sure everything is properly setup. After that, you should be able to confirm that your NVidia GPU is properly working with:

ubuntu@canonical-lxd:~$ nvidia-smi 
Tue Mar 21 21:28:34 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   26C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+

And can check that the CUDA tools work properly with:

ubuntu@canonical-lxd:~$ /usr/local/cuda-8.0/extras/demo_suite/bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GT 730
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3059.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3267.4

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			30805.1

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Container setup

First lets just create a regular Ubuntu 16.04 container:

ubuntu@canonical-lxd:~$ lxc launch ubuntu:16.04 c1
Creating c1
Starting c1

Then install the CUDA demo tools in there:

lxc exec c1 -- wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
lxc exec c1 -- dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
lxc exec c1 -- apt update
lxc exec c1 -- apt install cuda-demo-suite-8-0 --no-install-recommends

At which point, you can run:

ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.

Which is expected as LXD hasn’t been told to pass any GPU yet.

LXD GPU passthrough

LXD allows for pretty specific GPU passthrough, the details can be found here.
First let’s start with the most generic one, just allow access to all GPUs:

ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
Tue Mar 21 21:47:54 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove c1 gpu
Device gpu removed from c1

Now just pass whichever is the first GPU:

ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu id=0
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
Tue Mar 21 21:50:37 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove c1 gpu
Device gpu removed from c1

You can also specify the GPU by vendorid and productid:

ubuntu@canonical-lxd:~$ lspci -nnn | grep NVIDIA
02:06.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce GT 730] [10de:1287] (rev a1)
02:07.0 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
02:08.0 VGA compatible controller [0300]: NVIDIA Corporation GK208 [GeForce GT 730] [10de:1287] (rev a1)
02:09.0 Audio device [0403]: NVIDIA Corporation GK208 HDMI/DP Audio Controller [10de:0e0f] (rev a1)
ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu vendorid=10de productid=1287
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
Tue Mar 21 21:52:40 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:06.0     N/A |                  N/A |
| 30%   30C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
|    1                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove c1 gpu
Device gpu removed from c1

Which adds them both as they are exactly the same model in my setup.

But for such cases, you can also select using the card’s PCI ID with:

ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu pci=0000:02:08.0
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- nvidia-smi
Tue Mar 21 21:56:52 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 0000:02:08.0     N/A |                  N/A |
| 30%   27C    P0    N/A /  N/A |      0MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
ubuntu@canonical-lxd:~$ lxc config device remove c1 gpu 
Device gpu removed from c1

And lastly, lets confirm that we get the same result as on the host when running a CUDA workload:

ubuntu@canonical-lxd:~$ lxc config device add c1 gpu gpu
Device gpu added to c1
ubuntu@canonical-lxd:~$ lxc exec c1 -- /usr/local/cuda-8.0/extras/demo_suite/bandwidthTest
[CUDA Bandwidth Test] - Starting...
Running on...

 Device 0: GeForce GT 730
 Quick Mode

 Host to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3065.4

 Device to Host Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			3305.8

 Device to Device Bandwidth, 1 Device(s)
 PINNED Memory Transfers
   Transfer Size (Bytes)	Bandwidth(MB/s)
   33554432			30825.7

Result = PASS

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Conclusion

LXD makes it very easy to share one or multiple GPUs with your containers.
You can either dedicate specific GPUs to specific containers or just share them.

There is no of the overhead involved with usual PCI based passthrough and only a single instance of the driver is running with the containers acting just like normal host user processes would.

This does however require that your containers run a version of the CUDA tools which supports whatever version of the NVidia drivers is installed on the host.

Extra information

The main LXD website is at: https://linuxcontainers.org/lxd
Development happens on Github at: https://github.com/lxc/lxd
Mailing-list support happens on: https://lists.linuxcontainers.org
IRC support happens in: #lxcontainers on irc.freenode.net
Try LXD online: https://linuxcontainers.org/lxd/try-it

This entry was posted in Canonical voices, LXD, Planet Ubuntu and tagged . Bookmark the permalink.

6 Responses to NVidia CUDA inside a LXD container

  1. Maxim says:

    Hi. How can I use multiple containers with same GPU? Can i passthrough the GPU without to use virtualgl?

  2. Simon Schneider says:

    Very cool this is going to be a great replacement for my VM-based cuda development/testing environment.

    Could I suggest naming the container something else then “cuda”, perhaps “cuda-container” or something else distinguishable? It’s rather easy to loose track of what the container name is in the commands because of “cuda” being such a central term in the context.

  3. Lee Lo says:

    Great tutorial! I have to try it and understand it!

  4. Steven says:

    Love the tutorial but it doesn’t seem to be working.

    Specifically, “lxc config device add c1 gpu gpu” gives the error “error: Invalid device type for device ‘gpu’ ”

    Any advice?

Leave a Reply

Your email address will not be published. Required fields are marked *