lightning/tests
Rohit Gupta ff5361604b
Weekly patch release v1.6.5 (#13481)
* update NGC docker (#13136)

* update docker
* Apply suggestions from code review

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Decouple pulling legacy checkpoints from existing GHA workflows and docker files (#13185)

* Add pull-legacy-checkpoints action
* Replace pulls with the new action and script
* Simplify

* Merge pull request #13250 from PyTorchLightning/ci/rm-base

CI: Remove simple test `ci_test-base.yml`

* Update rich requirement from !=10.15.*,<=12.0.0,>=10.2.2 to >=10.2.2,!=10.15.0.a,<13.0.0 in /requirements (#13047)

* Update rich requirement in /requirements

Updates the requirements on [rich](https://github.com/willmcgugan/rich) to permit the latest version.
- [Release notes](https://github.com/willmcgugan/rich/releases)
- [Changelog](https://github.com/Textualize/rich/blob/master/CHANGELOG.md)
- [Commits](https://github.com/willmcgugan/rich/compare/v10.2.2...v12.4.1)

---
updated-dependencies:
- dependency-name: rich
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Fix torch.distributed._sharded_tensor DeprecationWarning (#13261)

* update tutorials (#13268)

* [BUG] `estimated_stepping_batches` requires distributed comms in `configure_optimizers` for `DeepSpeedStrategy` (#13350)

* Update torchmetrics requirement from <=0.7.2,>=0.4.1 to >=0.4.1,<0.9.2 in /requirements (#13275)

Update torchmetrics requirement in /requirements

Updates the requirements on [torchmetrics](https://github.com/PyTorchLightning/metrics) to permit the latest version.
- [Release notes](https://github.com/PyTorchLightning/metrics/releases)
- [Changelog](https://github.com/PyTorchLightning/metrics/blob/master/CHANGELOG.md)
- [Commits](https://github.com/PyTorchLightning/metrics/compare/v0.4.1...v0.9.1)

---
updated-dependencies:
- dependency-name: torchmetrics
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix mypy errors for model summary utilities (#13384)

* rename org Lightning AI

* Modified python version check to accommodate for legacy version styles (#13420)

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

(cherry picked from commit b332b66328)

* Call `set_epoch` for distributed batch samplers (#13396)

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

(cherry picked from commit 2dd332f9c7)

* _RICH_AVAILABLE

* _FAIRSCALE_AVAILABLE

* _BAGUA_AVAILABLE

* redefine

* chlog spaces

* CI: Fix `fatal: unsafe repository` (#13515)

* update release date

* CI: azure rename

* Restore log step during restart (#13467)

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* remove redundant test

* Update CI setup (#13291)

* drop mamba
* use legacy GPU machines

* fix schema check

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Adam J. Stewart <ajstewart426@gmail.com>
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Martino Sorbaro <martinosorb@users.noreply.github.com>
2022-07-12 19:40:14 -04:00
accelerators Enable all ddp params for hpu parallel strategy (#13067) 2022-06-01 08:04:16 -04:00
benchmarks Specify `Trainer(benchmark=False)` in parity benchmarks (#13182) 2022-06-01 08:04:16 -04:00
callbacks xfail flaky quantization test blocking CI (#13177) 2022-06-01 08:04:16 -04:00
checkpointing Threading support for legacy loading of checkpoints (#12814) 2022-05-03 14:54:54 -04:00
core Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
deprecated_api Fix `trainer.logger` deprecation message (#12671) 2022-05-03 14:54:54 -04:00
helpers Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
lite Fix tests failing on a single GPU (#11753) 2022-06-01 08:04:16 -04:00
loggers Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
loops Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
models Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
overrides Fix retrieval of batch indices when dataloader num_workers > 0 (#10870) 2021-12-02 10:36:10 +00:00
plugins Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
profiler Drop PyTorch 1.7 support (#12432) 2022-03-27 21:31:20 +00:00
strategies Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
trainer Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
tuner Avoid redundant callback restore warning while tuning (#13026) 2022-06-01 08:04:16 -04:00
utilities Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
README.md Weekly patch release v1.6.5 (#13481) 2022-07-12 19:40:14 -04:00
__init__.py Replace `yapf` with `black` (#7783) 2021-07-26 13:37:35 +02:00
conftest.py Drop PyTorch 1.7 support (#12432) 2022-03-27 21:31:20 +00:00
standalone_tests.sh Fix standalone test collection (#13177) 2022-06-01 08:04:16 -04:00

README.md

PyTorch-Lightning Tests

Most of the tests in PyTorch Lightning train a BoringModel under various trainer conditions (ddp, ddp2+amp, etc.). Want to add a new test case and not sure how? Talk to us!
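
For orientation, here is a minimal sketch of what such a test typically looks like (an illustration assuming the BoringModel helper from tests/helpers/boring_model.py and pytest's built-in tmpdir fixture; import paths may differ between versions):

from pytorch_lightning import Trainer
from tests.helpers.boring_model import BoringModel


def test_trainer_fit_runs(tmpdir):
    # run a single batch of train/val to exercise the Trainer loop quickly
    model = BoringModel()
    trainer = Trainer(default_root_dir=tmpdir, fast_dev_run=True)
    trainer.fit(model)
    assert trainer.state.finished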

Running tests

Local: Testing your work locally will help you speed up the process, since it allows you to focus on particular (failing) test cases. To set up a local development environment, install both local and test dependencies:

# clone the repo
git clone https://github.com/PyTorchLightning/pytorch-lightning
cd pytorch-lightning

# install required dependencies
python -m pip install ".[dev, examples]"
# install pre-commit (optional)
python -m pip install pre-commit
pre-commit install

Additionally, to test backward compatibility with older versions of PyTorch Lightning, you also need to download all saved version checkpoints from the public AWS storage. Run the following script to fetch them:

bash .actions/pull_legacy_checkpoints.sh

Note: These checkpoints are generated to set baselines for maintaining backward compatibility with legacy versions of PyTorch Lightning. Details of the backward-compatibility checkpoints can be found here.

You can run the full test suite in your terminal via this make script:

make test

Note: if your machine does not have multiple GPUs or a TPU, these tests are skipped.

GitHub Actions: For convenience, you can also rely on your own GitHub Actions build, which is triggered on each commit. This is useful if you do not test against all required dependency versions locally.

Docker: Another option is to use the PyTorch Lightning CUDA base Docker image. You can then run:

python -m pytest pytorch_lightning tests pl_examples -v

You can also run a single test as follows:

python -m pytest -v tests/trainer/test_trainer_cli.py::test_default_args

Conditional Tests

To test models that require a GPU, make sure to run the above command on a GPU machine. The machine must have at least 2 GPUs to run the distributed tests.

Note that this setup will not run tests that require specific packages to be installed, such as Horovod, FairScale, NVIDIA/apex, NVIDIA/DALI, etc. You can rely on our CI to make sure all of these tests pass.
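
Many of these conditional tests are guarded with the RunIf marker defined in tests/helpers/runif.py. A hedged sketch of how such a guard is typically used (keyword arguments like min_gpus or fairscale are assumptions and may vary across releases):

from pytorch_lightning import Trainer
from tests.helpers.boring_model import BoringModel
from tests.helpers.runif import RunIf


@RunIf(min_gpus=2, fairscale=True)  # skipped unless 2 GPUs and FairScale are available
def test_ddp_sharded_fit(tmpdir):
    # exercise the sharded DDP strategy for a single batch
    trainer = Trainer(default_root_dir=tmpdir, accelerator="gpu", devices=2,
                      strategy="ddp_sharded", fast_dev_run=True)
    trainer.fit(BoringModel())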

Standalone Tests

There are certain standalone tests, which you can run using:

PL_RUN_STANDALONE_TESTS=1 python -m pytest -v tests/trainer/
# or
./tests/standalone_tests.sh tests/trainer
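
Standalone tests are those that must run in their own interpreter process, for example because they initialize a process group. A sketch of how such a test is typically marked (assuming the RunIf marker supports a standalone flag, as in tests/helpers/runif.py):

from tests.helpers.runif import RunIf


@RunIf(min_gpus=2, standalone=True)  # skipped unless PL_RUN_STANDALONE_TESTS=1 is set
def test_needs_its_own_process(tmpdir):
    ...  # body omitted; typically runs a distributed Trainer like the sketch above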

Running Coverage

Make sure to run coverage on a GPU machine with at least 2 GPUs and NVIDIA apex installed.

cd pytorch-lightning

# generate coverage (coverage is also installed as part of dev dependencies under requirements/devel.txt)
coverage run --source pytorch_lightning -m pytest pytorch_lightning tests pl_examples -v

# print coverage stats
coverage report -m

# exporting results
coverage xml

Building test image

You can build the image on your own; note that it takes a long time, so be prepared.

git clone <git-repository>
docker image build -t pytorch_lightning:devel-torch1.9 -f dockers/cuda-extras/Dockerfile --build-arg TORCH_VERSION=1.9 .

To build other versions, select a different Dockerfile.

docker image list
docker run --rm -it pytorch_lightning:devel-torch1.9 bash
docker image rm pytorch_lightning:devel-torch1.9