lightning/tests
kumuji 619f984c36
Option to provide seed to random generators to ensure reproducibility (#1572)
* Option to provide seed to random generators to ensure reproducibility

I added small function in utilities which imports torch, numpy, python
random and sets seed for all of the libraries to ensure reproducibility
of results.

* Apply recommendations from core contributors on seeding

1. Moved the seeding code to another file
2. Make deterministic as a parameter for trainer class
3. Add assertions for seeding numpy
4. Added warnings
5. torch.manual_seed should be enough for seeding torch

* Revert "Apply recommendations from core contributors on seeding"

This reverts commit a213c8e6882eec8a9e7408b9418926d2db7c5461.

* Revert "Revert "Apply recommendations from core contributors on seeding""

This reverts commit 59b2da53c62878de7aab0aa3feb3115e105eea06.

* Change in test, for correct seeding

* Allow seed equal to 0

* Allow seed to be uint32.max

* Added deterministic to benchmarks

* Cuda manual seed as in benchmark seeding

* Seeding should be done before model initialization

* cuda manual_seed is not necessary

* Fixing seed test_cpu_lbfgs

On some seeds seems like lbfgs doesn't converge.
So I fixed the seed during testing.

* rebasing issue with old reproducibility.py

* Improved documentation and ability to seed before initializing Train
class

* Change in docs

* Removed seed from trainer, update for documentation

* Typo in the docs

* Added seed_everything to _all_

* Fixing old changes

* Model initialization should be earlier than Trainer

* Update pytorch_lightning/trainer/__init__.py

From Example to testcode

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Fixing according to the contributors suggestions

* Moving horovod deterministic to Trainer class

* deterministic flag affects horovod docs update

* Improved static typing

* Added deterministic to test runners of horovod

It is failing on some versions, not very predictable

* static seeds for horovod tests

* Change for reset_seed function in tests

* Seeding horovod using reset_seed from tutils

* Update pytorch_lightning/trainer/__init__.py

* chlog

* Update trainer.py

* change "testcode" to "Example" in trainer init documentation

* Update pytorch_lightning/trainer/seed.py, first line in comment

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-05-12 07:53:20 -04:00
base Option to provide seed to random generators to ensure reproducibility (#1572) 2020-05-12 07:53:20 -04:00
callbacks Fix lr key name in case of param groups (#1719) 2020-05-10 17:05:34 -04:00
loggers Fix NeptuneLogger to work in ddp mode (#1753) 2020-05-10 13:19:18 -04:00
models Option to provide seed to random generators to ensure reproducibility (#1572) 2020-05-12 07:53:20 -04:00
trainer made ddp the default if no backend specified with multiple GPUs (#1789) 2020-05-12 06:54:23 -04:00
Dockerfile Tests/docker (#1573) 2020-04-23 12:52:59 -04:00
README.md Tests/docker (#1573) 2020-04-23 12:52:59 -04:00
__init__.py default test logger (#1478) 2020-04-21 20:33:10 -04:00
collect_env_details.py fix changelog (#1452) 2020-04-20 17:36:26 -04:00
conftest.py test deprecation warnings (#1470) 2020-04-23 17:34:47 -04:00
install_AMP.sh CI: split tests-examples (#990) 2020-03-25 07:46:27 -04:00
requirements-devel.txt Tests/docker (#1573) 2020-04-23 12:52:59 -04:00
requirements.txt Tests/docker (#1573) 2020-04-23 12:52:59 -04:00
test_deprecated.py Tests: refactor cleanup (#1744) 2020-05-10 13:15:28 -04:00
test_profiler.py RC & Docs/changelog (#1776) 2020-05-11 21:57:53 -04:00

README.md

PyTorch-Lightning Tests

Most PL tests train a full MNIST model under various trainer conditions (ddp, ddp2+amp, etc.). This covers most combinations of the important settings. A test passes only if the trained model reaches a reasonable test accuracy.
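
Because a test passes or fails on an accuracy threshold, runs need to be repeatable. The commit at the top of this page describes a helper that seeds Python's `random`, NumPy and torch for that reason. The following is a minimal sketch of that idea, not the library's actual code: the name `seed_all` and the range check are illustrative assumptions, and NumPy and torch are seeded only if they are installed.

```python
import random

MAX_SEED = 2**32 - 1  # largest uint32 value; seeds 0..MAX_SEED are accepted


def seed_all(seed: int) -> int:
    """Seed every available random generator (hypothetical helper)."""
    if not 0 <= seed <= MAX_SEED:
        raise ValueError(f"seed must be in [0, {MAX_SEED}], got {seed}")
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed, nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass  # torch not installed, nothing to seed
    return seed


# Seeding twice with the same value should reproduce the same draw.
seed_all(0)
first = random.random()
seed_all(0)
second = random.random()
print(first == second)
```

In a test, such a helper would be called before the model is created, so that weight initialization is covered by the seed as well.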

Running tests

The automatic Travis CI runs ONLY CPU-based tests. These cover most use cases, but to validate the full test suite, run the tests on a machine with at least 2 GPUs.

To run all tests, do the following:

git clone https://github.com/PyTorchLightning/pytorch-lightning
cd pytorch-lightning

# install AMP support
bash tests/install_AMP.sh

# install dev deps
pip install -r tests/requirements-devel.txt

# run tests
py.test -v

To test models that require a GPU, run the above commands on a GPU machine. The GPU machine must have:

  1. At least 2 GPUs.
  2. NVIDIA-apex installed.
  3. Horovod with NCCL support: HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_GPU_BROADCAST=NCCL pip install horovod
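
The requirements above can be checked before starting a long test run. The sketch below is one possible pre-flight check, not part of the test suite: the function name `gpu_test_prerequisites` is hypothetical, and it only tests whether the `apex` and `horovod` packages can be found, not whether Horovod was built with NCCL support.

```python
import importlib.util


def gpu_test_prerequisites() -> dict:
    """Report which GPU-test prerequisites appear to be met (hypothetical helper)."""

    def installed(name: str) -> bool:
        # find_spec returns None when a top-level package cannot be found
        return importlib.util.find_spec(name) is not None

    gpu_count = 0
    if installed("torch"):
        import torch
        gpu_count = torch.cuda.device_count()

    return {
        "two_gpus": gpu_count >= 2,
        "apex": installed("apex"),
        "horovod": installed("horovod"),
    }


status = gpu_test_prerequisites()
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```

On a CPU-only machine every entry is expected to report `missing`, which matches the note that only the CPU-based tests can be validated there.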

Running Coverage

Make sure to run coverage on a GPU machine with at least 2 GPUs and NVIDIA apex installed.

cd pytorch-lightning

# generate coverage (coverage is also installed as part of dev dependencies under tests/requirements-devel.txt)
coverage run --source pytorch_lightning -m py.test pytorch_lightning tests examples -v --doctest-modules

# print coverage stats
coverage report -m

# exporting results
coverage xml
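
The exported XML can be read by other tools. As a rough sketch, assuming the report follows the Cobertura-style layout in which the root `coverage` element carries a `line-rate` attribute, the overall line coverage can be extracted as shown below. The helper name `line_rate` and the sample numbers are made up for illustration.

```python
import xml.etree.ElementTree as ET


def line_rate(xml_text: str) -> float:
    """Return overall line coverage (0.0 to 1.0) from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    return float(root.attrib["line-rate"])


# Tiny stand-in for the contents of the exported report (made-up numbers).
sample = '<coverage line-rate="0.87" branch-rate="0"><packages/></coverage>'
rate = line_rate(sample)
print(f"line coverage: {rate:.0%}")
```

For a real report, pass the file contents instead of `sample`, for example `line_rate(open("coverage.xml").read())`, assuming the default output file name.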

Building test image

You can build the image yourself. Note that the build takes a long time.

git clone <git-repository>
docker image build -t pytorch_lightning:devel-pt_1_4 -f tests/Dockerfile --build-arg TORCH_VERSION=1.4 .

To build other versions, select a different Dockerfile.

docker image list
docker run --rm -it pytorch_lightning:devel-pt_1_4 bash
docker image rm pytorch_lightning:devel-pt_1_4