lightning/tests
Travis Addair 7024177f7d
Added Horovod distributed backend (#1529)
* Initial commit of Horovod distributed backend implementation

* Update distrib_data_parallel.py

* Update distrib_data_parallel.py

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Update tests/models/test_horovod.py

Co-Authored-By: Jirka Borovec <Borda@users.noreply.github.com>

* Fixed tests

* Added six

* tests

* Install tox for GitHub CI

* Retry tests

* Catch all exceptions

* Skip cache

* Remove tox

* Restore pip cache

* Remove the cache

* Restore pip cache

* Remove AMP

Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: J. Borovec <jirka.borovec@seznam.cz>
2020-04-22 17:39:08 -04:00
base Added Horovod distributed backend (#1529) 2020-04-22 17:39:08 -04:00
loggers default test logger (#1478) 2020-04-21 20:33:10 -04:00
models Added Horovod distributed backend (#1529) 2020-04-22 17:39:08 -04:00
trainer default test logger (#1478) 2020-04-21 20:33:10 -04:00
Dockerfile CI: split tests-examples (#990) 2020-03-25 07:46:27 -04:00
README.md Added Horovod distributed backend (#1529) 2020-04-22 17:39:08 -04:00
__init__.py default test logger (#1478) 2020-04-21 20:33:10 -04:00
collect_env_details.py fix changelog (#1452) 2020-04-20 17:36:26 -04:00
conftest.py default test logger (#1478) 2020-04-21 20:33:10 -04:00
install_AMP.sh CI: split tests-examples (#990) 2020-03-25 07:46:27 -04:00
requirements.txt Added Horovod distributed backend (#1529) 2020-04-22 17:39:08 -04:00
test_deprecated.py fix deprecated default_save_path (#1449) 2020-04-10 14:32:56 -04:00
test_profiler.py default test logger (#1478) 2020-04-21 20:33:10 -04:00

README.md

PyTorch-Lightning Tests

Most PL tests train a full MNIST model under various trainer configurations (ddp, ddp2+amp, etc.). This covers most combinations of the important settings. To pass, each test requires the trained model to reach a reasonable test accuracy.
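The pass criterion described above can be sketched in plain Python. This is only an illustration: the helper names `accuracy` and `assert_min_accuracy` and the 0.5 threshold are assumptions made for this example, not names or values taken from the PL test suite.

```python
# Hypothetical helpers: the names and the 0.5 default threshold are
# illustrative assumptions, not part of the PL test suite.
def accuracy(predictions, targets):
    """Return the fraction of predictions that equal their target label."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


def assert_min_accuracy(predictions, targets, min_acc=0.5):
    """Fail, as a test would, when accuracy falls below the threshold."""
    acc = accuracy(predictions, targets)
    assert acc >= min_acc, f"accuracy {acc:.3f} is below the required {min_acc}"
    return acc
```

A test in this style trains the model, collects predictions on held-out data, and then calls something like `assert_min_accuracy(preds, labels)`, so a configuration that trains without crashing but learns nothing still fails.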

Running tests

The automated Travis CI jobs run ONLY the CPU-based tests. Although these cover most use cases, run the tests on a machine with at least 2 GPUs to validate the full test suite.

To run all tests, do the following:

git clone https://github.com/PyTorchLightning/pytorch-lightning
cd pytorch-lightning

# install AMP support
bash tests/install_AMP.sh

# install dev deps
pip install -r tests/requirements.txt

# run tests
py.test -v

To test models that require a GPU, run the above commands on a GPU machine. The machine must have:

  1. At least 2 GPUs.
  2. NVIDIA apex installed.
  3. Horovod with NCCL support: HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_GPU_BROADCAST=NCCL pip install horovod
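Before starting a long GPU run, it can help to check these prerequisites up front. The following is a minimal sketch, assuming the packages are importable as `torch`, `apex`, and `horovod`. The helper names are hypothetical, and the check does not confirm that Horovod was built with NCCL support.

```python
import importlib.util


def find_missing_packages(packages=("torch", "apex", "horovod")):
    """Return the names from `packages` that cannot be found for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


def visible_gpu_count():
    """Return the number of CUDA devices PyTorch sees, or 0 without PyTorch."""
    if importlib.util.find_spec("torch") is None:
        return 0
    import torch  # imported lazily so the script also runs without PyTorch

    return torch.cuda.device_count()


if __name__ == "__main__":
    missing = find_missing_packages()
    gpus = visible_gpu_count()
    print(f"missing packages: {missing or 'none'}")
    print(f"visible GPUs: {gpus} (the GPU tests need at least 2)")
```

Run as a script, this prints which of the three packages are missing and how many GPUs are visible; it only reports and does not install anything.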

Running Coverage

Make sure to run coverage on a GPU machine with at least 2 GPUs and NVIDIA apex installed.

cd pytorch-lightning

# generate coverage (coverage is also installed as part of dev dependencies under tests/requirements.txt)
coverage run --source pytorch_lightning -m pytest pytorch_lightning tests examples -v --doctest-modules

# print coverage stats
coverage report -m

# export results to coverage.xml
coverage xml
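
The exported report can also be read programmatically, for example to check a minimum coverage level in a script. This is a minimal sketch, assuming the default output file `coverage.xml` and its Cobertura-style root element with a `line-rate` attribute. The helper name `read_line_rate` is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_line_rate(source):
    """Return overall line coverage as a fraction from a Cobertura-style report.

    `source` is a file name or an open file object, as accepted by ET.parse.
    """
    root = ET.parse(source).getroot()
    return float(root.attrib["line-rate"])


if __name__ == "__main__":
    rate = read_line_rate("coverage.xml")
    print(f"line coverage: {rate:.1%}")
```

The function returns a fraction rather than a percentage, so a threshold check would look like `read_line_rate("coverage.xml") >= 0.8`, where 0.8 is only an example value.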