49 Commits

Author SHA1 Message Date
Adrian Wälchli d4d197070f
Add `SyncBatchNormPlugin` (#11754)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2022-03-01 19:41:40 +05:30
Carlos Mocholí 8fd17f2edf
[IPU] Support manually instantiating the `poptorch.DataLoader` (#12116) 2022-02-28 09:36:26 +00:00
Adrian Wälchli d0f54609de
Fix `is_interactive_compatible` logic after AcceleratorConnector rewrite (#12008)
* fix is_interactive_compatible

* improve tests

* update message

* address review

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-02-22 19:20:54 +05:30
Rohit Gupta d541cf4c64
remove ddp procs collection from script launcher (#12029) 2022-02-22 12:39:48 +01:00
Adrian Wälchli de1815f4ba
Remove `DDPSpawnStrategy.get_mp_spawn_kwargs` in favor of launchers (#11966) 2022-02-22 11:28:21 +00:00
Adrian Wälchli 57aae5912e
Refactor signature for launcher (#11967) 2022-02-21 21:11:50 +01:00
Kushashwa Ravi Shrimali 0374fe65db
Support gradient accumulation using Horovod's `backward_passes_per_step` (#11911)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-19 02:54:04 +01:00
ananthsub cf64f34434
Refactor `Strategy._move_optimizer_states` as utility functions (#11758)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-02-18 08:36:07 +00:00
four4fish 6e14209185
Rewrite accelerator_connector (#11448) 2022-02-17 23:38:39 +00:00
Rohit Gupta 25b505508d
Add process launchers (#11643)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-17 21:16:51 +00:00
ananthsub 4dba492fb5
Update horovod.py (#11917) 2022-02-16 21:58:54 -08:00
ananthsub 62ebd42ce0
Update ddp.py (#11929) 2022-02-16 17:29:07 -08:00
edward-io 87bd54aedf
fix typos (#11937) 2022-02-16 17:27:51 -08:00
Carlos Mocholí 8822117200
Return the output of the optimizer step (#11711)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-02-09 09:37:13 +00:00
Rohit Gupta 182c18d319
Configure native deepspeed schedulers with interval='step' (#11788) 2022-02-09 08:20:50 +00:00
Rohit Gupta 9ed44dee0d
Fix to avoid moving batch to device for DataParallel (#11780)
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2022-02-07 14:26:18 +00:00
ananthsub 0ba25d3cac
Update DDPStrategy to use optimizers property from within class (#11777) 2022-02-07 13:28:37 +01:00
Rohit Gupta 7ec1e66e17
reduce only loss with dp (#11594)
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-02-07 17:00:29 +05:30
ananthsub a64438c897
Centralize rank_zero_only utilities into their own module (#11747)
* Centralize rank_zero_only utilities into their own module

Fixes #11746

* PossibleUserWarning

* Update test_warnings.py

* update imports

* more imports

* Update CHANGELOG.md

* Update mlflow.py

* Update cli.py

* Update api_references.rst

* Update meta.py

* add deprecation tests

* debug standalone

* fix standalone tests

* Update CHANGELOG.md
2022-02-07 08:09:55 +00:00
ananthsub dfda970572
Update TPU Spawn to use root_device instead of LightningModule's device (#11750) 2022-02-06 06:26:38 +00:00
Dan Dale 9d8faecdb2
Allow Horovod `teardown()` to complete gracefully if exception thrown in callback setup (#11752) 2022-02-05 11:13:21 -08:00
ananthsub 241c97e6eb
Update HorovodStrategy to use optimizers property from within class (#11728) 2022-02-05 10:04:55 +01:00
Adrian Wälchli cc43d07db1
Remove legacy dead code in DDP script launch (#11678)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-02-05 11:40:16 +05:30
ananthsub 72db64d294
Use the strategy's `root_device` instead of the LightningModule's device property (#11734) 2022-02-05 04:33:25 +01:00
wangraying 8c07d8bf90
Add `Trainer(strategy="bagua")` (#11146)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-02-04 17:02:09 +00:00
four4fish d43fd0d4d6
Lazy initialize Strategy.parallel_devices (#11572)
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-03 04:25:16 +00:00
Aki Nitta fbc1f9f1d9
Rename `Strategy.lr_schedulers` to `Strategy.lr_scheduler_configs` (#11549) 2022-02-02 22:10:01 +00:00
Carlos Mocholí d7944a13cd
Teardown all internal components on exception (#11620) 2022-02-02 21:10:19 +00:00
ananthsub 1bd6fc979e
Remove `Strategy.on_tpu` property (#11536) 2022-01-20 08:25:26 +01:00
ananthsub f41d1e5e5e
Remove `Strategy.on_gpu` (#11537) 2022-01-19 21:27:12 +00:00
Carlos Mocholí 62818dbace
Use a dataclass as the scheduler config (#11443) 2022-01-18 20:23:32 +01:00
Carlos Mocholí 344ab1e0a5
Move the `lightning_optimizers` ownership to the `Strategy` (#11444) 2022-01-18 12:58:56 +01:00
Carlos Mocholí 5914fb748f
Add typing to accelerators/gpu.py (#11333) 2022-01-12 19:44:51 +00:00
Rohit Gupta 82c8875f33
Add `LightningModule.lr_scheduler_step` (#10249)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-01-12 03:53:49 +00:00
edward-io 6107ce8e0d
Add DETAIL logs for batch use cases (#11008) 2022-01-12 01:22:48 +01:00
Rohit Gupta 06b8f82b8a
Update API references in doc (#11357) 2022-01-07 15:56:17 +01:00
Kaushik B 42a1c72660
Add Accelerators section to Lightning docs (#10755) 2022-01-06 19:12:44 +05:30
Adrian Wälchli 9c8f52ccd1
Fix restoring lr scheduler states with deepspeed strategy (#11322)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-01-06 12:34:16 +00:00
Danielle Pintz 5b59c951e2
Deprecate `TrainerDataLoadingMixin` and move logic to `DataConnector` (#11282)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-01-05 21:23:57 +01:00
Adrian Wälchli 9906a1a54d
Update optimizer configuration info message in `DeepSpeedStrategy` (#11327) 2022-01-05 18:20:06 +00:00
Kaushik B 70c975a9f3
Fix exception message for FSDP running on CPU (#11325) 2022-01-05 18:02:31 +01:00
Kaushik B 93223ff5ce
Introduce StrategyRegistry (#11233)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-01-05 17:14:18 +05:30
Adrian Wälchli a8bd7ac73f
Fix lr scheduler state not being dumped to checkpoint in deepspeed strategy (#11307) 2022-01-05 08:38:08 +00:00
Danielle Pintz b082715103
Remove `Strategy.optimizer_zero_grad` (#11246) 2022-01-03 13:46:57 +01:00
Adrian Wälchli 4eede7c30b
Add deprecation path for renamed training type plugins (#11227)
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-01-03 13:41:05 +01:00
Danielle Pintz ca9b25db80
Remove `Strategy.init_optimizers` (#11236) 2021-12-23 18:48:21 +00:00
Kaushik B 0adcd6a048
Rename training_type_plugin file to strategy (#11239)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-23 14:01:23 +00:00
Danielle Pintz a6a28e08d2
Deprecate `TrainerOptimizersMixin` and move functionality to `core/optimizer.py` (#11155) 2021-12-22 17:56:37 -08:00
Kaushik B 576a5d62a0
Introduce strategies directory for Training Strategies (#11226)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-22 20:23:30 +00:00