lightning/dockers
Akihiro Nitta d5f35ece72
CI/CD: Add CUDA version to docker image tags (#13831)
2022-08-10 10:37:50 +00:00
base-conda add testing PT 1.12 (#13386) 2022-07-15 19:41:23 +02:00
base-cuda Run GPU tests with PyTorch 1.12 (#13716) 2022-07-28 19:37:57 +05:30
base-ipu CI: fix requirements freeze (#13441) 2022-06-29 09:35:57 -04:00
base-xla CI: Update XLA from 1.9 to 1.12 (#14013) 2022-08-05 05:04:45 -04:00
ci-runner-hpu CI: debug HPU flow (#13419) 2022-07-20 12:35:01 +02:00
ci-runner-ipu Drop PyTorch 1.8 support (#13155) 2022-06-14 20:46:44 -04:00
nvidia bump base NGC image (#13346) 2022-07-15 21:36:19 +00:00
release CI/CD: Add CUDA version to docker image tags (#13831) 2022-08-10 10:37:50 +00:00
tpu-tests Improvements to standalone scripts (#13840) 2022-07-28 23:33:22 +00:00
README.md CI/CD: Add CUDA version to docker image tags (#13831) 2022-08-10 10:37:50 +00:00

README.md

Docker images

Build images from Dockerfiles

You can build the images yourself; note that it takes a long time, so be prepared.

git clone https://github.com/Lightning-AI/lightning.git

# build with the default arguments
docker image build -t pytorch-lightning:latest -f dockers/base-cuda/Dockerfile .

# build with specific arguments
docker image build -t pytorch-lightning:base-cuda-py3.9-torch1.11-cuda11.3.1 \
  -f dockers/base-cuda/Dockerfile \
  --build-arg PYTHON_VERSION=3.9 \
  --build-arg PYTORCH_VERSION=1.11 \
  --build-arg CUDA_VERSION=11.3.1 \
  .
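
If you would rather not build locally, the image used in the GPU example further down is published under pytorchlightning/pytorch_lightning on Docker Hub and can simply be pulled (availability of other tag combinations is assumed, not verified here):

docker pull pytorchlightning/pytorch_lightning:base-cuda-py3.9-torch1.11-cuda11.3.1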

To run your docker image, use:

docker image list
docker run --rm -it pytorch-lightning:latest bash
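
For interactive development it can be handy to mount your working directory into the container; the /workspace mount point below is just an example path, not something the image defines:

docker run --rm -it -v $(pwd):/workspace -w /workspace pytorch-lightning:latest bash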

and if you do not need it anymore, just clean it:

docker image list
docker image rm pytorch-lightning:latest
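
Repeated builds also tend to leave dangling intermediate layers behind; these can be removed with Docker's standard prune command (general Docker housekeeping, not specific to these images):

docker image prune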

Run docker image with GPUs

To run a docker image with access to your GPUs, you first need to install the NVIDIA Container Toolkit:

# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
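
To verify that the toolkit is wired up correctly, you can run nvidia-smi inside a CUDA-enabled container, for example the image used in the example below (a minimal check; it assumes the NVIDIA runtime injects nvidia-smi into the container, which is its default behavior):

docker run --rm --gpus all pytorchlightning/pytorch_lightning:base-cuda-py3.9-torch1.11-cuda11.3.1 nvidia-smi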

and then run the docker image with the --gpus all flag. For example:

docker run --rm -it --gpus all pytorchlightning/pytorch_lightning:base-cuda-py3.9-torch1.11-cuda11.3.1
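
To confirm that PyTorch inside the container actually sees the GPUs, a quick check along these lines should work (assuming python and torch are on the image's default PATH, as expected for the base-cuda images):

docker run --rm --gpus all pytorchlightning/pytorch_lightning:base-cuda-py3.9-torch1.11-cuda11.3.1 \
  python -c "import torch; print(torch.cuda.is_available())"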

Run Jupyter server

Inspiration comes from https://u.group/thinking/how-to-put-jupyter-notebooks-in-a-dockerfile

  1. Build the docker image:
    docker image build -t pytorch-lightning:v1.6.5 -f dockers/nvidia/Dockerfile --build-arg LIGHTNING_VERSION=1.6.5 .
    
  2. Start the server and map the ports:
    docker run --rm -it --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -p 8888:8888 pytorch-lightning:v1.6.5
    
  3. Connect from your local browser:
    • Copy the generated URL, e.g. http://hostname:8888/?token=0719fa7e1729778b0cec363541a608d5003e26d4910983c6
    • Replace the hostname with localhost
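
If the token URL scrolls by too quickly, the server can also be started in the background and the URL read from the container logs (a sketch; the container name pl-jupyter is just an example, and it assumes the image prints the Jupyter startup URL to stdout):

docker run -d --rm --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -p 8888:8888 --name pl-jupyter pytorch-lightning:v1.6.5
docker logs pl-jupyter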