Commit Graph

4917 Commits

Author SHA1 Message Date
Rohit Gupta 7ca41734da
Add `dataloader_idx` to batch transfer hooks (#6241)
* replace with kwargs

* chlog

* fix

* add test

* fix

* device

* deepspeed

* pep

* optional

* docs

* bc

* comments

* pep

* mypy

* pep

* Apply suggestions from code review

* kwargs

* docs

* .

* .

* 1.3 -> 1.4

* kwargs -> step_kwargs
2021-05-13 23:03:55 +05:30
Carlos Mocholí a584196abf
Default `seed_everything(workers=True)` in the `LightningCLI` (#7504) 2021-05-13 12:18:03 +02:00
Adrian Wälchli dd1a17b071
Refactor result handling in training loop (#7506)
* refactor results

* rename dic -> dict

* simplify

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* changelog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix None check

* chlog wording

* move process_closure_result to the end

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-13 09:30:34 +01:00
Jirka Borovec 298f9e5c2d
Prune deprecated utils modules (#7503)
* argparse_utils

* model_utils

* warning_utils

* xla_device_utils

* chlog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-13 07:24:42 +00:00
Jirka Borovec 946aee0c7b
prune data parallel (#7510) 2021-05-13 06:23:02 +01:00
Carlos Mocholí 072ad52b6b
Add `trainer.predict(ckpt_path)` (#7430)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-05-13 01:49:58 +02:00
Jirka Borovec d4ec75164c
Prune deprecated trainer attributes (#7501)
* use_single_gpu

* use_horovod

* use_ddp2

* use_ddp

* use_dp

* on_gpu

* use_tpu

* on_tpu

* on_cpu

* cleaning

* chlog

* Apply suggestions from code review

* Apply suggestions from code review
2021-05-12 20:10:15 +00:00
Jirka Borovec 96981091c7
Prune deprecated classif. metrics (#7499)
* stat_scores_multiple_classes

* precision_recall

* precision

* recall

* auc

* auroc

* multiclass_auroc

* iou

* clean-up

* chlog

* flake8

* imports

* prune
2021-05-12 18:03:34 +00:00
Jirka Borovec 140b0c727e
Prune deprecated trainer attributes 2 (#7502)
* accelerator_backend

* get_model

* clean

* chlog

* flake8
2021-05-12 10:19:30 -07:00
Carlos Mocholí 83283fdb20
Fix yapf-isort conflict (#7500) 2021-05-12 15:44:57 +02:00
Jirka Borovec db54b30776
Update README to 1.3 (#7489) 2021-05-12 13:36:52 +02:00
Federico Simonetta 8cdbd03d02
MLFlow now uses env variable as default tracking uri (#7457)
* Clarify logger flag

Clarify behavior of boolean values on the logger flag for Trainer.

* Update docs/source/common/trainer.rst

* doc

* MLFlow now uses env variable as default tracking uri

Solves https://github.com/PyTorchLightning/pytorch-lightning/issues/6894

* Update pytorch_lightning/loggers/mlflow.py

Co-authored-by: thomas chaton <thomas@grid.ai>

* changelog

Co-authored-by: SpontaneousDuck <kennywitham4@gmail.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: jirka <jirka.borovec@seznam.cz>
2021-05-12 11:26:57 +02:00
Christopher Ehmann b9a52fa2ef
added stage param to LightningDataModule.setup example (#7483)
Co-authored-by: Sileadim <christopher@omnius.com>
2021-05-11 23:43:22 +05:30
shuyingsunshine21 8538c1f61e
Accelerator model state dict (#7474)
* Fix some test errors

* checkpoint consolidation

* Update ddp_spawn.py

* Update test_metric_result_integration.py

* Update test_results.py

* Update utils.py

* Update utils.py

* Update test_all_gather_grad.py

* Update test_all_gather_grad.py

* Update test_results.py

* Revert "Update test_results.py"

This reverts commit 9d4a2b891d.

* Revert "Merge pull request #1 from shuyingsunshine21/shuyingsunshine21-checkpoint_consolidate"

This reverts commit c5053da789, reversing
changes made to 0d23d75bc9.

* Revert "Update test_all_gather_grad.py"

This reverts commit 0d23d75bc9.

* Revert "Update utils.py"

This reverts commit 70fe5da9c6.

* Revert "Update utils.py"

This reverts commit a9aae99f6e.

* Revert "Update test_results.py"

This reverts commit ea74906878.

* Revert "Update test_metric_result_integration.py"

This reverts commit bf70e431b3.

* Revert "Update ddp_spawn.py"

This reverts commit f17210183b.

* Revert "checkpoint consolidation"

This reverts commit 536c1323b0.

* Revert "Revert "checkpoint consolidation""

This reverts commit 3a9fde915a.

* Revert "Revert "Revert "checkpoint consolidation"""

This reverts commit 7a369f47e1.

* Revert "Revert "Update ddp_spawn.py""

This reverts commit 8222dc98ea.

* Revert "Revert "Update test_metric_result_integration.py""

This reverts commit 6c095b2370.

* Revert "Revert "Update test_results.py""

This reverts commit 250d0aaaa2.

* Revert "Revert "Update utils.py""

This reverts commit 8651d54d79.

* Revert "Revert "Update test_all_gather_grad.py""

This reverts commit dcdcd29731.

* modify distributed environment to make test pass

* modify model state dict to training type plugin

* remove changes

* add changelog

* fixing isort for pre-commit failure

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address code review

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: SeanNaren <sean@grid.ai>
2021-05-11 16:39:04 +01:00
Adrian Wälchli a1a655d006
Reduce log output size in special tests (#7481) 2021-05-11 17:36:20 +02:00
Justus Schock 7b283e3c46
Bugfix/Multiple dataloaders (#7433)
* Update supporters.py

* Update apply_func.py

* Update supporters.py

* Update model_train_dataloaders.py

* Update model_train_steps.py

* Update test_dataloaders.py

* Update CHANGELOG.md

* Update model_train_steps.py

* Update test_dataloaders.py

* Update test_dataloaders.py

* Update supporters.py

* Update test_supporters.py

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update tests/trainer/test_dataloaders.py

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* Apply suggestions from code review

Co-authored-by: Edgar Riba <edgar.riba@gmail.com>

* Update supporters.py

* Update supporters.py

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Edgar Riba <edgar.riba@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-05-11 16:33:29 +02:00
Jirka Borovec d7c44cc649
Docs: sync chlog 1.3.1 (#7478) 2021-05-11 12:44:22 +02:00
ananthsub fdf50a5e4b
Mark certain Trainer APIs as protected (#7420) 2021-05-11 11:53:41 +02:00
Adrian Wälchli ad9118f04a
remove trainer hidden state | sanity refactor [1 / n] (#7437)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-11 11:09:08 +02:00
David Fidalgo 4a1134db64
Log epoch metrics before firing the `on_evaluation_end` hook (#7272)
* Log epoch metrics before firing the `on_evaluation_end` hook (addresses #7166)

* test that epoch metrics are logged before `on_evaluation_end` hook

* update CHANGELOG

* Shorter test

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-05-11 10:54:31 +02:00
Carlos Mocholí b65ae79478
Automatically check `DataModule.has_{setup,teardown,prepare_data}` [2/2] (#7238)
* Automatically check `DataModule.has_{setup,teardown,prepare_data}`

* Use variable

* Spacing

* Docs

* Update CHANGELOG

* Remove `_DataModuleWrapper`

* Add test

* Update docs/source/extensions/datamodules.rst

* Bad merge

* add test for invalid name

* Remove ValueError

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-05-11 10:53:00 +02:00
pre-commit-ci[bot] 8660d8cf03
[pre-commit.ci] pre-commit autoupdate (#7475)
updates:
- [github.com/pre-commit/pre-commit-hooks: v2.3.0 → v3.4.0](https://github.com/pre-commit/pre-commit-hooks/compare/v2.3.0...v3.4.0)
- [github.com/PyCQA/isort: 5.7.0 → 5.8.0](https://github.com/PyCQA/isort/compare/5.7.0...5.8.0)
- [github.com/pre-commit/mirrors-yapf: v0.30.0 → v0.31.0](https://github.com/pre-commit/mirrors-yapf/compare/v0.30.0...v0.31.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-11 14:56:27 +08:00
Justus Schock f6fe715e73
Fix Sphinx argument deprecation (#7464) 2021-05-10 17:30:23 +02:00
edenlightning 3ec54203bb
Fix slack link (#7452)
* Update README.md

* Update Slack link

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2021-05-10 08:50:14 +00:00
edenlightning 159308e4ef
Update introduction_guide.rst (#7453) 2021-05-10 08:49:54 +00:00
Akihiro Nitta 6d82dc832b
Pin `Sphinx<4.0` (#7456)
* Don't use sphinx 4.0.0

* Don't use sphinx 4.0.0

* Update comment

* Simple 

There is no other release between 3.5 and 4.0

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-05-10 10:27:37 +02:00
Adrian Wälchli 6bc616d78f
fix display bug (#7395) 2021-05-10 11:26:15 +08:00
Adrian Wälchli 1af42d7d1e
fix 1.9 test (#7441) 2021-05-08 20:03:51 +02:00
shuyingsunshine21 987530cd38
Set `num_nodes` and `sync_batchnorm` From Trainer for Manually Passed Training Type Plugin (#7026)
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-08 11:25:51 +00:00
Akihiro Nitta 710b144b9b
Restore `trainer.current_epoch` after tuning (#7434)
* Add a test

* Save and restore current_epoch

* Update CHANGELOG

* alphabetical order
2021-05-08 07:15:52 +02:00
Ethan Harris 45143fd825
Improve val step logging (#7351)
* Fix val step logging

* Add a type

* Fix

* Update CHANGELOG.md
2021-05-07 22:58:03 +00:00
ananthsub f9e050c5e5
Move DP warning suppression to the DataParallel Plugin (#7421) 2021-05-07 23:02:44 +02:00
ananthsub fecce50355
Deprecate TrainerModelHooksMixin (#7422)
* Deprecate TrainerModelHooksMixin

* Update CHANGELOG.md

* Update model_hooks.py

* Update model_hooks.py
2021-05-07 13:19:36 -07:00
Carlos Mocholí 8208c330eb
Use `torch.nn.utils.clip_grad_norm_` and add `clip_grad_by_value` support for TPU (#7025)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-05-07 16:41:39 +00:00
Carlos Mocholí 9ba76ce60c
Unify `configure_optimizers` docs (#7399) 2021-05-07 16:10:24 +02:00
Carlos Mocholí 7dcddb27f0
Refactor tests to use `BoringModel` (#7401) 2021-05-07 15:59:32 +02:00
Louis Taylor 2b7e65b747
Add base IPU dockerfiles (#7252) 2021-05-07 12:07:29 +00:00
Jirka Borovec 1a27c12b26
update ngc for 1.3 (#7414) 2021-05-07 13:13:54 +02:00
Leonard Lausen 98b94b810c
Fix DeepSpeedPlugin with IterableDataset (#7362)
* deepspeed add train_micro_batch_size_per_gpu argument

* Update naming and doc

* Modify to use auto naming convention, add test

* Add iterable tests

* Fix tests, attempt by mocking

* Import correct package

* Fix comparison

* Set as special test

* Remove import

* Add Changelog

Co-authored-by: SeanNaren <sean@grid.ai>
2021-05-07 10:46:03 +01:00
Jirka Borovec 28103c67c2
show must go on (#7413)
* chlog + version

* readme

* .
2021-05-06 19:06:21 -04:00
Jirka Borovec fbc8b209f2
update versions (#7409)
* update versions

* chlog

* win

* str
2021-05-06 20:35:39 +00:00
Jirka Borovec b181b8c646
release 1.3.0 (#7404)
* v1.3.0

* ci event

* chlog

* badge

* formatting
2021-05-06 15:05:35 -04:00
Florian Müller-Fouarge d4d959b342
Call `super().__init__()` in `MilestonesFinetuning` example (#7398) 2021-05-06 21:11:36 +05:30
Akihiro Nitta a2a65abf68
[ci] Unpin pip==20.1 (#6375)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: jirka <jirka.borovec@seznam.cz>
2021-05-06 13:41:33 +00:00
Gyeongjae Choi d9bdc56b6a
Add _gpus_arg_default in argparse_utils for backward compatibility (#7402) 2021-05-06 13:35:12 +00:00
Sean Naren 94f6c3e160
Advanced GPU Documentation (#7259)
* Added advanced gpu section

* Small changes

* Better documentation

* Address code review

* Add warning about using trainer.model, clean up some of the examples

* Add section for ddp, remove references and old sequential documentation

* Remove Fully Sharded documentation for now

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Address code review

* Address code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-05-06 13:53:20 +01:00
Louis Taylor 1a62f7f5ff
ci: adjust torch version requirements in IPU pipeline (#7383)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-05-06 18:20:05 +05:30
Jirka Borovec d52e0a8f3e
v0.1.3.0rc3 + changelogs (#7388)
* v0.1.3.0rc3

* spaces

* wip

* wip

* wip

* wip

* prune

* wip

* wip

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-05-06 07:28:10 -04:00
Martin Kristiansen c3fc0313ef
Updating docs and error message: half precision not available on CPU (#7384)
* Updating docs and error message to specify that half precision not available on CPU

* update messages

Co-authored-by: Martin Kristiansen <martinkristiansen@sixgill.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: jirka <jirka.borovec@seznam.cz>
2021-05-06 09:05:50 +00:00
Adrian Wälchli dea7a0230d
group all loop tests in a folder (#7394)
* move files

* rename
2021-05-06 10:03:25 +01:00