# Docker images

## Build images from Dockerfiles

You can build the images yourself. Note that a build takes a long time, so be prepared.

```bash
git clone https://github.com/Lightning-AI/lightning.git
# the build commands below use paths relative to the repository root
cd lightning

# build with the default arguments
docker image build -t pytorch-lightning:latest -f dockers/base-cuda/Dockerfile .

# build with specific arguments
docker image build -t pytorch-lightning:base-cuda-py3.9-torch1.13-cuda11.7.1 -f dockers/base-cuda/Dockerfile --build-arg PYTHON_VERSION=3.9 --build-arg PYTORCH_VERSION=1.13 --build-arg CUDA_VERSION=11.7.1 .
```
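
The tag in the second build command repeats the values passed as build arguments. A minimal sketch of keeping the two in sync is shown below. The `IMAGE_TAG` variable and the `build_base_cuda` function are hypothetical names introduced here for illustration. They are not part of the repository.

```bash
# Versions to build with; these mirror the --build-arg names used above.
PYTHON_VERSION="3.9"
PYTORCH_VERSION="1.13"
CUDA_VERSION="11.7.1"

# Compose the tag following the same pattern as the example above.
# IMAGE_TAG is a hypothetical helper variable, not something the Dockerfile reads.
IMAGE_TAG="pytorch-lightning:base-cuda-py${PYTHON_VERSION}-torch${PYTORCH_VERSION}-cuda${CUDA_VERSION}"

# Hypothetical wrapper around the build command; run it from the repository root.
build_base_cuda() {
  docker image build -t "${IMAGE_TAG}" -f dockers/base-cuda/Dockerfile \
    --build-arg PYTHON_VERSION="${PYTHON_VERSION}" \
    --build-arg PYTORCH_VERSION="${PYTORCH_VERSION}" \
    --build-arg CUDA_VERSION="${CUDA_VERSION}" \
    .
}

# Show the tag that the wrapper would apply.
echo "${IMAGE_TAG}"
```

Calling `build_base_cuda` after changing any of the three version variables builds an image whose tag matches its contents.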

To run a container from your image, use:

```bash
docker image list
docker run --rm -it pytorch-lightning:latest bash
```

If you no longer need the image, remove it:

```bash
docker image list
docker image rm pytorch-lightning:latest
```

## Run docker image with GPUs

To run a Docker image with access to your GPUs, you first need to install the NVIDIA Container Toolkit:

```bash
# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
```

Then run the Docker image with `--gpus all`. For example:

```bash
docker run --rm -it --gpus all pytorchlightning/pytorch_lightning:base-cuda-py3.9-torch1.12-cuda11.7.1
```
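
To confirm that the container actually sees the GPUs, one option is to run `nvidia-smi` inside it instead of an interactive shell. The sketch below assumes that the NVIDIA Container Toolkit makes `nvidia-smi` available inside the container, which is its usual behavior but is not stated in this README. The `check_gpus` function is a hypothetical helper name.

```bash
# Image from the example above.
IMAGE="pytorchlightning/pytorch_lightning:base-cuda-py3.9-torch1.12-cuda11.7.1"

# Hypothetical helper: start a throwaway container and list the visible GPUs.
# Assumes nvidia-smi is exposed in the container by the NVIDIA runtime.
check_gpus() {
  docker run --rm --gpus all "$1" nvidia-smi
}
```

Run it with `check_gpus "${IMAGE}"`. If the toolkit is set up correctly, the output should list the host GPUs.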

## Run Jupyter server

1. Build the docker image:
   ```bash
   docker image build -t pytorch-lightning:v1.6.5 -f dockers/nvidia/Dockerfile --build-arg LIGHTNING_VERSION=1.6.5 .
   ```
1. Start the server and map ports:
   ```bash
   docker run --rm -it --gpus=all -p 8888:8888 pytorch-lightning:v1.6.5
   ```
1. Connect from a local browser:
   - copy the generated URL, e.g. `http://hostname:8888/?token=0719fa7e1729778b0cec363541a608d5003e26d4910983c6`
   - replace `hostname` with `localhost`
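
The hostname replacement in the last step can also be scripted. The following is a small sketch using `sed`. The variable names `jupyter_url` and `local_url` are hypothetical, and the token is the example value from above.

```bash
# URL as printed by the Jupyter server (example value from above).
jupyter_url='http://hostname:8888/?token=0719fa7e1729778b0cec363541a608d5003e26d4910983c6'

# Swap whatever host follows the scheme for localhost; port and token stay unchanged.
local_url=$(printf '%s\n' "${jupyter_url}" | sed -E 's#^(https?://)[^:/]+#\1localhost#')

echo "${local_url}"
```

Open the printed URL in the browser on the machine where port 8888 is mapped.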