PyTorch-Lightning Tests

Most PL tests train a full MNIST model under various trainer conditions (ddp, ddp2+amp, etc.). This covers most combinations of important settings. A test passes only if the trained model reaches a reasonable test accuracy.
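The pattern described above (enumerate trainer conditions, then require a minimum accuracy) can be sketched roughly as follows. This is only an illustration: `trainer_conditions`, `assert_ok_accuracy`, the dictionary keys and the 0.5 threshold are hypothetical names and values, not the helpers the PL test suite actually uses.

```python
from itertools import product


def trainer_conditions(backends, precisions):
    """Enumerate every (backend, precision) combination to test.

    The keys "backend" and "precision" are illustrative placeholders.
    """
    return [{"backend": b, "precision": p} for b, p in product(backends, precisions)]


def assert_ok_accuracy(test_acc, min_acc=0.5):
    """Fail if the trained model's test accuracy is below the threshold.

    The default threshold is an assumption made for this sketch.
    """
    if test_acc < min_acc:
        raise AssertionError(
            f"test accuracy {test_acc:.3f} is below the required {min_acc:.3f}"
        )


if __name__ == "__main__":
    for condition in trainer_conditions(["ddp", "ddp2"], [16, 32]):
        # A real test would train an MNIST model under `condition` here
        # and pass the measured accuracy instead of a fixed number.
        assert_ok_accuracy(0.9)
        print("ok:", condition)
```

A threshold check like this keeps the tests meaningful (the model must actually learn) without tying them to an exact accuracy value.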

Running tests

The automatic Travis tests run ONLY CPU-based tests. These cover most use cases, but to validate the full test suite, run the tests on a machine with at least 2 GPUs.

To run all tests, do the following:

First, install Open MPI or another MPI implementation (see the Open MPI documentation for installation instructions). Then run:

git clone https://github.com/PyTorchLightning/pytorch-lightning
cd pytorch-lightning

# install AMP support
bash requirements/install_AMP.sh

# install dev deps
pip install -r requirements/devel.txt

# run tests
py.test -v

To test models that require a GPU, run the above commands on a GPU machine. The machine must have:

At least 2 GPUs.
NVIDIA apex installed.
Horovod with NCCL support: HOROVOD_GPU_OPERATIONS=NCCL pip install horovod
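
Before starting a long GPU run, it can help to check these prerequisites up front. The following is a minimal sketch, not part of the PL test suite: the function names are hypothetical, it assumes `nvidia-smi` is on the PATH when GPUs are present, and it only checks that `apex` and `horovod` are importable. It does not verify that Horovod was built with NCCL support.

```python
import importlib.util
import shutil
import subprocess


def count_gpus():
    """Count GPUs listed by `nvidia-smi -L`; return 0 if the tool is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            capture_output=True,
            text=True,
            check=False,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return 0
    if result.returncode != 0:
        return 0
    # `nvidia-smi -L` prints one line per device, each starting with "GPU <index>:".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def check_gpu_prerequisites(min_gpus=2):
    """Report which of the listed requirements appear to be met."""
    return {
        "gpus": count_gpus() >= min_gpus,
        "apex": importlib.util.find_spec("apex") is not None,
        "horovod": importlib.util.find_spec("horovod") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_gpu_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```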

Running Coverage

Make sure to run coverage on a GPU machine with at least 2 GPUs and NVIDIA apex installed.

cd pytorch-lightning

# generate coverage (coverage is also installed as part of dev dependencies under requirements/devel.txt)
coverage run --source pytorch_lightning -m py.test pytorch_lightning tests examples -v

# print coverage stats
coverage report -m

# exporting results
coverage xml
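
The XML report written by `coverage xml` (by default `coverage.xml`, in Cobertura format) can also be read programmatically, for example to fail a script when coverage drops. A minimal sketch follows, assuming the default file name; `line_coverage_percent` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(xml_path="coverage.xml"):
    """Return the overall line coverage, in percent, from a Cobertura-style report."""
    root = ET.parse(xml_path).getroot()
    # The root <coverage> element stores the overall rate as a fraction in [0, 1].
    return float(root.attrib["line-rate"]) * 100.0


if __name__ == "__main__":
    print(f"line coverage: {line_coverage_percent():.1f}%")
```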

Building test image

You can build the test image yourself. Note that the build takes a long time.

git clone <git-repository>
docker image build -t pytorch_lightning:devel-torch1.4 -f dockers/cuda-extras/Dockerfile --build-arg TORCH_VERSION=1.4 .

To build other versions, select a different Dockerfile.

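# list local images
docker image list
# start an interactive shell in the image
docker run --rm -it pytorch_lightning:devel-torch1.4 bash
# remove the image when it is no longer needed
docker image rm pytorch_lightning:devel-torch1.4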
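
When building images for several PyTorch versions, the build command can be generated instead of edited by hand. The sketch below only assembles the argument list and does not run Docker. `docker_build_command` is a hypothetical helper, and reusing the `devel-torch<version>` tag pattern for other versions is an assumption based on the single example above.

```python
import shlex


def docker_build_command(torch_version, dockerfile="dockers/cuda-extras/Dockerfile"):
    """Assemble the `docker image build` arguments for one PyTorch version."""
    # Assumed tag pattern, taken from the one example in this README.
    tag = f"pytorch_lightning:devel-torch{torch_version}"
    return [
        "docker", "image", "build",
        "-t", tag,
        "-f", dockerfile,
        "--build-arg", f"TORCH_VERSION={torch_version}",
        ".",
    ]


if __name__ == "__main__":
    # Print the command for review; pass the list to subprocess.run to execute it.
    print(shlex.join(docker_build_command("1.4")))
```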