PyTorch-Lightning Tests

Most PL tests train a full MNIST model under various trainer conditions (ddp, ddp2+amp, etc.). This covers most combinations of the important settings. A test passes only if the trained model reaches a reasonable test accuracy.
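The pass criterion above can be pictured with a small, framework-free sketch. The function names and the 0.5 threshold are hypothetical, chosen only for illustration; the actual tests use their own helpers and thresholds.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that match their labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def assert_reasonable_accuracy(
    predictions: Sequence[int], labels: Sequence[int], min_acc: float = 0.5
) -> float:
    """Fail, as a test would, when accuracy is below the hypothetical threshold."""
    acc = accuracy(predictions, labels)
    assert acc >= min_acc, f"accuracy {acc:.2f} is below the required {min_acc:.2f}"
    return acc


if __name__ == "__main__":
    # Toy digits standing in for MNIST predictions: 4 of 5 are correct.
    print(assert_reasonable_accuracy([1, 7, 3, 9, 0], [1, 7, 3, 4, 0]))
```

A threshold check like this tolerates run-to-run noise in training while still catching a configuration that fails to learn at all.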

Running tests

The automated Travis CI runs ONLY CPU-based tests. These cover most use cases, but to validate the full test suite, run it on a machine with at least 2 GPUs.

To run all tests, do the following:

Install Open MPI or another MPI implementation. See the Open MPI documentation for installation instructions.

git clone https://github.com/PyTorchLightning/pytorch-lightning
cd pytorch-lightning

# install AMP support
bash requirements/install_AMP.sh

# install dev deps
pip install -r requirements/devel.txt

# run tests
py.test -v

To test models that require a GPU, run the above commands on a GPU machine. The machine must have:

  1. At least 2 GPUs.
  2. NVIDIA apex installed.
  3. Horovod with NCCL support: HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_GPU_BROADCAST=NCCL pip install horovod
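Before starting the GPU tests, it may help to confirm the first requirement. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH and that `nvidia-smi -L` prints one line per device beginning with `GPU <index>:`. The helper names are hypothetical, and the script does not check for apex or Horovod.

```python
import shutil
import subprocess


def count_gpus(listing: str) -> int:
    """Count devices in `nvidia-smi -L` style output (one 'GPU <n>: ...' line each)."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def detect_gpu_count() -> int:
    """Return the number of visible NVIDIA GPUs, or 0 if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    return count_gpus(result.stdout) if result.returncode == 0 else 0


if __name__ == "__main__":
    n = detect_gpu_count()
    status = "OK" if n >= 2 else "too few for the full suite"
    print(f"{n} GPU(s) detected: {status}")
```

Keeping the parsing in `count_gpus` separate from the subprocess call means the logic can be checked on a machine without any GPU.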

Running Coverage

Make sure to run coverage on a GPU machine with at least 2 GPUs and NVIDIA apex installed.

cd pytorch-lightning

# generate coverage (coverage is also installed as part of dev dependencies under requirements/devel.txt)
coverage run --source pytorch_lightning -m py.test pytorch_lightning tests examples -v

# print coverage stats
coverage report -m

# exporting results
coverage xml
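The exported report can also be read programmatically, for example to enforce a minimum coverage in a script. This is a minimal sketch, assuming the Cobertura-style layout that `coverage xml` writes, in which the root element carries a `line-rate` attribute. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(xml_text: str) -> float:
    """Read the overall line coverage from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    rate = root.get("line-rate")
    if rate is None:
        raise ValueError("report root has no 'line-rate' attribute")
    return float(rate) * 100.0


if __name__ == "__main__":
    # Inline sample standing in for the file written by `coverage xml`.
    sample = '<coverage line-rate="0.875" branch-rate="0"><packages/></coverage>'
    print(f"{line_coverage_percent(sample):.1f}%")
```

To use it on a real report, pass the contents of the generated file (by default `coverage.xml`) instead of the inline sample.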

Building test image

You can build the image yourself. Note that the build takes a long time.

git clone <git-repository>
docker image build -t pytorch_lightning:devel-torch1.4 -f dockers/cuda-extras/Dockerfile --build-arg TORCH_VERSION=1.4 .

To build other versions, select a different Dockerfile.

docker image list
docker run --rm -it pytorch_lightning:devel-torch1.4 bash
docker image rm pytorch_lightning:devel-torch1.4
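When building several variants, the build command can be assembled programmatically. The sketch below reuses only the flags and tag pattern from the command above. The helper name is hypothetical, and whether a given Dockerfile accepts a particular `TORCH_VERSION` value is an assumption to verify against that Dockerfile.

```python
import shlex
from typing import List


def build_command(
    torch_version: str, dockerfile: str = "dockers/cuda-extras/Dockerfile"
) -> List[str]:
    """Assemble the `docker image build` call shown above for one torch version."""
    tag = f"pytorch_lightning:devel-torch{torch_version}"
    return [
        "docker", "image", "build",
        "-t", tag,
        "-f", dockerfile,
        "--build-arg", f"TORCH_VERSION={torch_version}",
        ".",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be reviewed first.
    print(shlex.join(build_command("1.4")))
```

Returning an argument list rather than a single string lets the result be passed directly to `subprocess.run` without shell quoting issues.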