Getting started with containerized LambdaStack

A containerized version of LambdaStack is available on the SAIL nodes. This container provides a working installation of PyTorch, TensorFlow, CUDA and cuDNN.

This page is a survival guide to get started with this container and use it with your projects. It assumes that you have an open SSH session on one of the machines.

Start the container

First check that the container image is available using the docker image ls command:

$ docker image ls
REPOSITORY     TAG                       IMAGE ID       CREATED         SIZE
lambda-stack   20.04                     abe4a492cee1   6 hours ago     12GB
ubuntu         latest                    df5de72bdb3b   3 weeks ago     77.8MB
ubuntu         20.04                     3bc6e9f30f51   3 weeks ago     72.8MB
debian         latest                    07d9246c53a6   3 weeks ago     124MB
nvidia/cuda    11.0.3-base-ubuntu20.04   8017f5c31b74   5 weeks ago     122MB
hello-world    latest                    feb5d9fea6a5   11 months ago   13.3kB

You should see the lambda-stack image in the list; this is the one we will use for now.

To start a program in a container using this image, use the docker run --rm command:

$ docker run --rm lambda-stack:20.04 pwd
/root

The --rm flag ensures that the container is removed once the program exits.

To run an interactive command such as bash or the Python interpreter, add the -it flags:

$ docker run --rm -it lambda-stack:20.04 bash
root@fa416f6b82f5:~# pwd
/root
root@fa416f6b82f5:~# exit
$

So far, the container does not have access to the GPUs. To give it access, you need to change the runtime to nvidia and explicitly specify a list of GPUs. The following example uses the 4th and 5th GPUs, that is, indices 3 and 4 (indices start at 0):
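
A minimal sketch of such a command, assuming the NVIDIA runtime is registered with Docker under the name nvidia and selects GPUs through the NVIDIA_VISIBLE_DEVICES environment variable. The run_on_gpus helper is a hypothetical name used only for illustration, and nvidia-smi is run simply to list the GPUs visible inside the container.

```shell
# Hypothetical helper: run a command in the LambdaStack container on selected GPUs.
# Usage: run_on_gpus GPU_LIST COMMAND [ARGS...]
run_on_gpus() {
    gpu_list="$1"   # comma-separated host GPU indices, e.g. "3,4"
    shift
    docker run --rm --runtime=nvidia \
        -e NVIDIA_VISIBLE_DEVICES="$gpu_list" \
        lambda-stack:20.04 "$@"
}

# Example (4th and 5th GPUs of the host):
#   run_on_gpus 3,4 nvidia-smi
# which expands to:
#   docker run --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=3,4 lambda-stack:20.04 nvidia-smi
```

The expanded docker run line in the comment can also be typed directly; the helper only saves retyping the flags.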

Note that GPU device numbers are re-mapped to start from 0 in the container, so host GPUs 3 and 4 appear as GPUs 0 and 1 inside it.

This method does not prevent multiple containers from accessing the same GPUs. Therefore, make sure to check with other users which GPUs they are using.

It does, however, ensure that your container cannot mistakenly use any GPU other than the ones specified.

An alternative method to access GPUs is the --gpus option:
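
A minimal sketch of this option, assuming a Docker version that supports --gpus. The run_with_gpus_option helper is a hypothetical name; the value passed to --gpus is either all or a number of GPUs.

```shell
# Hypothetical helper: run a command in the LambdaStack container using --gpus.
# Usage: run_with_gpus_option all|COUNT COMMAND [ARGS...]
run_with_gpus_option() {
    gpu_request="$1"   # "all", or a number of GPUs such as "2"
    shift
    docker run --rm --gpus "$gpu_request" lambda-stack:20.04 "$@"
}

# Example (every GPU of the machine):
#   run_with_gpus_option all nvidia-smi
# which expands to:
#   docker run --rm --gpus all lambda-stack:20.04 nvidia-smi
```

A count such as 2 only says how many GPUs to attach, not which ones.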

Unless you are using all 8 GPUs, we strongly recommend against this syntax, as it does not let you choose precisely which GPUs are used.

Access a folder from the container

A container is isolated from the host environment by default. Bind mounts allow you to mount a folder from the host machine into the container.

Specify the path to the directory on the host and the corresponding path inside the container using the -v flag, written as -v HOST_PATH:CONTAINER_PATH.

For example, assuming you have a project folder in $HOME/my_project and want to access it as /my_project in the container, you would use:
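
A minimal sketch of that command, using the -v HOST_PATH:CONTAINER_PATH form of a bind mount. The run_with_mount helper is a hypothetical name, and starting bash interactively is only one possible choice of command.

```shell
# Hypothetical helper: run a command in the container with one host folder mounted.
# Usage: run_with_mount HOST_DIR CONTAINER_DIR COMMAND [ARGS...]
run_with_mount() {
    host_dir="$1"        # folder on the host, e.g. "$HOME/my_project"
    container_dir="$2"   # where it appears in the container, e.g. /my_project
    shift 2
    docker run --rm -it -v "$host_dir:$container_dir" lambda-stack:20.04 "$@"
}

# Example:
#   run_with_mount "$HOME/my_project" /my_project bash
# which expands to:
#   docker run --rm -it -v "$HOME/my_project:/my_project" lambda-stack:20.04 bash
```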

Any program running in the container can then access and modify files in the /my_project folder.

You can repeat the -v flag to mount multiple folders in the container.

Add packages to the container

The container image may not have all the packages you need. To add more packages, you can create a new container image based on the LambdaStack one.

To create a container image, you need a Dockerfile definition file. It contains the information about the base container image and the installation instructions for the additional packages.

In the following Dockerfile example, the Transformers library (PyTorch version) from Hugging Face is added to the LambdaStack container:

To build the corresponding container image, first create an empty folder and save the Dockerfile in it:
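
A minimal sketch of these two steps, written as shell commands that create the folder and write a two-line Dockerfile into it. The folder name pytorch-transformers, the availability of pip for python3 in the base image, and the exact install line are assumptions made for illustration.

```shell
# Create a new build folder (the folder name is an arbitrary choice).
build_dir="pytorch-transformers"
mkdir -p "$build_dir"

# Write the Dockerfile: start from the LambdaStack image, then add Transformers.
# Assumption: pip is available for python3 in the base image.
cat > "$build_dir/Dockerfile" <<'EOF'
FROM lambda-stack:20.04
RUN python3 -m pip install --no-cache-dir "transformers[torch]"
EOF
```

The FROM line names the base container image, and each RUN line is an installation instruction executed while the image is built.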

Then use the docker build command to generate the new container image:
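
A minimal sketch of the build step. The build_image helper is a hypothetical name, and the folder path ./pytorch-transformers is an assumption; use whichever folder holds your Dockerfile.

```shell
# Hypothetical helper: build an image from the Dockerfile found in BUILD_DIR.
# Usage: build_image TAG BUILD_DIR
build_image() {
    docker build -t "$1" "$2"
}

# Example:
#   build_image pytorch-transformers ./pytorch-transformers
# which expands to:
#   docker build -t pytorch-transformers ./pytorch-transformers
```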

The -t flag is used to tag the container image, making it easier to find and use later.

Use docker image ls to check the availability of the image:
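
Passing the image name to docker image ls restricts the listing to that image. For use in scripts, the check can be sketched as below; image_available is a hypothetical helper name, and it relies on the -q flag, which prints only image IDs.

```shell
# Hypothetical helper: succeed if an image with the given name exists locally.
# Usage: image_available IMAGE_NAME
image_available() {
    # -q prints only image IDs; empty output means no matching image.
    [ -n "$(docker image ls -q "$1")" ]
}

# Examples:
#   docker image ls pytorch-transformers        # human-readable listing
#   image_available pytorch-transformers && echo "image found"
```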

You can now use it in place of the LambdaStack image:
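
A minimal sketch, assuming the image was tagged pytorch-transformers and provides python3. The check_transformers helper is a hypothetical name; it imports the library inside the new image and prints its version.

```shell
# Hypothetical helper: print the Transformers version installed in the new image.
check_transformers() {
    docker run --rm pytorch-transformers \
        python3 -c 'import transformers; print(transformers.__version__)'
}

# Example:
#   check_transformers
# An interactive session works the same way as with the LambdaStack image:
#   docker run --rm -it pytorch-transformers bash
```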

Container images can be deleted using the docker image rm command. For example, docker image rm pytorch-transformers removes the pytorch-transformers container image.

Next steps

This document is a very short introduction to using Docker with LambdaStack. If you want to know more about Docker in general, we recommend this workshop material and the associated recording on YouTube.