...
First check that the container image is available using the `docker image ls` command:
```
$ docker image ls
REPOSITORY     TAG                       IMAGE ID       CREATED         SIZE
lambda-stack   20.04                     abe4a492cee1   6 hours ago     12GB
ubuntu         latest                    df5de72bdb3b   3 weeks ago     77.8MB
ubuntu         20.04                     3bc6e9f30f51   3 weeks ago     72.8MB
debian         latest                    07d9246c53a6   3 weeks ago     124MB
nvidia/cuda    11.0.3-base-ubuntu20.04   8017f5c31b74   5 weeks ago     122MB
hello-world    latest                    feb5d9fea6a5   11 months ago   13.3kB
```
You should see the `lambda-stack` image in the list; this is the one we will use for now.
To start a program in a container using this image, use the `docker run --rm` command:
```
$ docker run --rm lambda-stack:20.04 pwd
/root
```
The `--rm` flag ensures that the container is automatically removed, along with its writable filesystem layer, once the command exits.
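To see what `--rm` saves you from, run the same command without it: the exited container stays behind and has to be deleted by hand (a minimal sketch; the container ID reported by `docker ps -a` will differ on your machine):

```
$ docker run lambda-stack:20.04 pwd
/root
$ docker ps -a             # the exited container is still listed
$ docker rm <container id> # and must be removed manually
```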
To run an interactive command like `bash` or the `python` interpreter, add the `-it` flags:
```
$ docker run --rm -it lambda-stack:20.04 bash
root@fa416f6b82f5:~# pwd
/root
root@fa416f6b82f5:~# exit
$
```
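The same pattern starts an interactive Python session. Since the Lambda Stack image ships with the usual deep learning frameworks, you can import them directly (a sketch; the exact banner and versions depend on the image):

```
$ docker run --rm -it lambda-stack:20.04 python
>>> import torch
>>> torch.cuda.is_available()  # False here: the container has no GPU access yet, see below
```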
So far, the container does not have access to the GPUs. To give it access to them, you need to change the runtime to `nvidia` and explicitly specify a list of GPUs. The following example uses the 4th and 5th GPUs (indices start at 0):
```
$ docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=3,4 --rm lambda-stack:20.04 nvidia-smi
Thu Sep  1 22:02:25 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  On   | 00000000:4C:00.0 Off |                    0 |
| N/A   60C    P0   374W / 400W |  33549MiB / 81920MiB |     47%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM...  On   | 00000000:88:00.0 Off |                    0 |
| N/A   54C    P0   377W / 400W |  33549MiB / 81920MiB |     86%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
```
Note that GPU device numbers are re-mapped to start from 0 in the container.
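You can check this mapping by listing devices on the host and in the container; the two GPUs selected above reappear as devices 0 and 1 (a sketch; your GPU models and UUIDs will differ):

```
$ nvidia-smi -L                          # on the host: all GPUs, indices 0-7
$ docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=3,4 --rm \
    lambda-stack:20.04 nvidia-smi -L     # in the container: the same two GPUs, now 0 and 1
```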
Note: This method does not prevent multiple containers from accessing the same GPUs, so make sure to check with other users which GPUs they are using. It does, however, ensure that your container cannot mistakenly use any GPU other than the ones specified.
An alternative method to access GPUs is the `--gpus` option:
```
$ docker run --gpus 2 --rm lambda-stack:20.04 nvidia-smi
```
Unless you are using all 8 GPUs, we strongly recommend not using this syntax: it only specifies a number of GPUs, not which ones to select.
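If you do intend to take all the GPUs, the usual form is `--gpus all` (an addition to the example above, which only shows the count form):

```
$ docker run --gpus all --rm lambda-stack:20.04 nvidia-smi
```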
Access a folder from the container
A container is isolated from the host environment by default. Bind mounts allow you to mount a folder from the host machine into the container.
Specify the path to the directory on the host and the corresponding path inside the container using the `-v` flag:
```
-v <path on host>:<path in container>
```
For example, assuming you have a project folder in `$HOME/my_project` and want to access it as `/my_project` in the container, you would use:
```
$ docker run -v $HOME/my_project:/my_project --rm lambda-stack:20.04 ls /my_project
```
Any program running in the container can then access and modify files in the `/my_project` folder.
You can repeat the `-v` flag to mount multiple folders in the container.
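For instance, to mount a dataset folder read-only next to a writable results folder (the paths are purely illustrative; the `:ro` suffix marks a mount read-only):

```
$ docker run \
    -v $HOME/datasets:/datasets:ro \
    -v $HOME/results:/results \
    --rm lambda-stack:20.04 ls /datasets /results
```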
Add packages to the container
The container image may not have all the packages you need. To add more packages, you can create a new container image based on the LambdaStack one.
To create a container image, you need a `Dockerfile` definition file. It contains the information about the base container image and the installation instructions for the additional packages.
In the following `Dockerfile` example, the Transformers library (PyTorch version) from Hugging Face is added to the LambdaStack container:
```
FROM lambda-stack:20.04
RUN pip install transformers[torch]
```
To build the corresponding container image, first create an empty folder and save the `Dockerfile` in it:
```
$ mkdir hugging_container
$ echo "FROM lambda-stack:20.04" > hugging_container/Dockerfile
$ echo "RUN pip install transformers[torch]" >> hugging_container/Dockerfile
```
then use the `docker build` command to generate the new container image:
```
$ cd hugging_container
$ docker build -t pytorch-transformers .
```
The `-t` flag is used to tag the container image, making it easier to find and use later.
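A tag can also carry an explicit version after a colon, which helps keep successive builds apart (the version string below is purely illustrative):

```
$ docker build -t pytorch-transformers:v1 .
$ docker run --rm pytorch-transformers:v1 pip show transformers
```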
Use `docker image ls` to check the availability of the image:
```
$ docker image ls
REPOSITORY             TAG                       IMAGE ID       CREATED          SIZE
pytorch-transformers   latest                    432c6be0a999   13 seconds ago   12.1GB
lambda-stack           20.04                     abe4a492cee1   6 days ago       12GB
ubuntu                 latest                    df5de72bdb3b   4 weeks ago      77.8MB
ubuntu                 20.04                     3bc6e9f30f51   4 weeks ago      72.8MB
debian                 latest                    07d9246c53a6   4 weeks ago      124MB
nvidia/cuda            11.0.3-base-ubuntu20.04   8017f5c31b74   6 weeks ago      122MB
hello-world            latest                    feb5d9fea6a5   11 months ago    13.3kB
```
You can now use it in place of the LambdaStack image:
```
$ docker run --rm pytorch-transformers \
    python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('I love you'))"
No model was supplied, defaulted to distilbert-base-uncased-finetuned-sst-2-english and revision af0f99b (https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Downloading config.json: 100%|██████████| 629/629 [00:00<00:00, 1.45MB/s]
Downloading pytorch_model.bin: 100%|██████████| 255M/255M [00:10<00:00, 24.5MB/s]
Downloading tokenizer_config.json: 100%|██████████| 48.0/48.0 [00:00<00:00, 72.8kB/s]
Downloading vocab.txt: 100%|██████████| 226k/226k [00:00<00:00, 295kB/s]
[{'label': 'POSITIVE', 'score': 0.9998656511306763}]
```
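The pipeline above runs on the CPU; the GPU flags from earlier combine with the custom image in the obvious way (a sketch reusing the same device indices as before):

```
$ docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=3,4 --rm pytorch-transformers \
    python -c "import torch; print(torch.cuda.device_count())"
2
```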
Container images can be deleted using the `docker image rm` command. For example, remove the `pytorch-transformers` container image as follows:
```
$ docker image rm pytorch-transformers
Untagged: pytorch-transformers:latest
Deleted: sha256:432c6be0a999484db090c5d9904e5c783454080d8ad8bc39e0499ace479c4559
Deleted: sha256:623ae3b33709c2fc4c40bc2c3959049345fee0087d39b4f53eb95aefd1c16f7d
```
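Rebuilding an image under the same tag also leaves the previous build behind as an untagged, dangling image. Those can all be removed at once (a convenience not covered above; the command asks for confirmation before deleting anything):

```
$ docker image prune
```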
Next steps
This document is a very short introduction to Docker for using LambdaStack. If you want to know more about Docker in general, we recommend this workshop material and the associated recording on YouTube.