
Docker images

Build images from the attached Dockerfiles

You can build the images yourself. Note that a build takes a long time, so be prepared.

git clone <git-repository>
docker image build -t pytorch-lightning:latest -f dockers/base-conda/Dockerfile .

or with specific arguments

git clone <git-repository>
docker image build \
    -t pytorch-lightning:py3.8-pt1.6 \
    -f dockers/base-cuda/Dockerfile \
    --build-arg PYTHON_VERSION=3.8 \
    --build-arg PYTORCH_VERSION=1.6 \
    .
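If you need images for several version combinations, the build command above can be wrapped in a small loop. The following is only a sketch: the helper names `image_tag` and `build_image`, the `DRY_RUN` variable, and the version lists are illustrative assumptions, not part of the repository. By default it only prints the commands; set `DRY_RUN=0` to actually run them from the repository root.

```shell
#!/bin/sh
# Sketch: build one image per Python/PyTorch combination.
# Helper names, DRY_RUN and the version lists are illustrative assumptions.
set -eu

# Compose the tag in the same scheme as above, e.g. pytorch-lightning:py3.8-pt1.6
image_tag() {
    echo "pytorch-lightning:py$1-pt$2"
}

# Print the build command for the given Python and PyTorch versions;
# execute it only when DRY_RUN=0.
build_image() {
    tag=$(image_tag "$1" "$2")
    set -- docker image build \
        -t "$tag" \
        -f dockers/base-cuda/Dockerfile \
        --build-arg "PYTHON_VERSION=$1" \
        --build-arg "PYTORCH_VERSION=$2" \
        .
    echo "$*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

for python_version in 3.7 3.8; do
    for pytorch_version in 1.6; do
        build_image "$python_version" "$pytorch_version"
    done
done
```

Printing the command before running it makes it easy to check the tag and build arguments first, which is useful because each build takes a long time.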

To run your Docker image, use

docker image list
docker run --rm -it pytorch-lightning:latest bash

and when you no longer need the image, remove it:

docker image list
docker image rm pytorch-lightning:latest

Run docker image with GPUs

To run a Docker image with access to your GPUs, you need to install the NVIDIA Container Toolkit:

# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

Then run the Docker image with the --gpus all flag, for example:

docker run --rm -it --gpus all pytorchlightning/pytorch_lightning:base-cuda-py3.7-torch1.6
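To confirm that the container can really see the GPUs, you can ask PyTorch from inside it. This is a minimal sketch that assumes the image provides `python` on the PATH with `torch` installed; the `gpu_check_cmd` helper and the `IMAGE` variable are hypothetical names introduced here for illustration. The script only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Sketch: check GPU visibility inside the container.
# Assumes the image ships python with torch installed; gpu_check_cmd and
# IMAGE are illustrative names, not part of the repository.
set -eu

# Image name taken from the example above; override IMAGE to check another one.
IMAGE="${IMAGE:-pytorchlightning/pytorch_lightning:base-cuda-py3.7-torch1.6}"

# Print the command that asks PyTorch inside the container whether CUDA is usable.
gpu_check_cmd() {
    echo "docker run --rm --gpus all $1 python -c 'import torch; print(torch.cuda.is_available())'"
}

gpu_check_cmd "$IMAGE"
# To execute it instead of printing it:
#   eval "$(gpu_check_cmd "$IMAGE")"
```

If the executed command prints False, a reasonable first step is to confirm that the toolkit installation and the Docker restart above completed without errors.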