lightning

Commit Graph

Author	SHA1	Message	Date
Adrian Wälchli	d4d197070f	Add `SyncBatchNormPlugin` (#11754) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>	2022-03-01 19:41:40 +05:30
Carlos Mocholí	8fd17f2edf	[IPU] Support manually instantiating the `poptorch.DataLoader` (#12116)	2022-02-28 09:36:26 +00:00
Adrian Wälchli	d0f54609de	Fix `is_interactive_compatible` logic after AcceleratorConnector rewrite (#12008) * fix is_interactive_compatible * improve tests * update message * address review * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2022-02-22 19:20:54 +05:30
Rohit Gupta	d541cf4c64	remove ddp procs collection from script launcher (#12029)	2022-02-22 12:39:48 +01:00
Adrian Wälchli	de1815f4ba	Remove `DDPSpawnStrategy.get_mp_spawn_kwargs` in favor of launchers (#11966)	2022-02-22 11:28:21 +00:00
Adrian Wälchli	57aae5912e	Refactor signature for launcher (#11967)	2022-02-21 21:11:50 +01:00
Kushashwa Ravi Shrimali	0374fe65db	Support gradient accumulation using Horovod's `backward_passes_per_step` (#11911) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-02-19 02:54:04 +01:00
ananthsub	cf64f34434	Refactor `Strategy._move_optimizer_states` as utility functions (#11758) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: thomas chaton <thomas@grid.ai>	2022-02-18 08:36:07 +00:00
four4fish	6e14209185	Rewrite accelerator_connector (#11448)	2022-02-17 23:38:39 +00:00
Rohit Gupta	25b505508d	Add process launchers (#11643) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-02-17 21:16:51 +00:00
ananthsub	4dba492fb5	Update horovod.py (#11917)	2022-02-16 21:58:54 -08:00
ananthsub	62ebd42ce0	Update ddp.py (#11929)	2022-02-16 17:29:07 -08:00
edward-io	87bd54aedf	fix typos (#11937)	2022-02-16 17:27:51 -08:00
Carlos Mocholí	8822117200	Return the output of the optimizer step (#11711) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>	2022-02-09 09:37:13 +00:00
Rohit Gupta	182c18d319	Configure native deepspeed schedulers with interval='step' (#11788)	2022-02-09 08:20:50 +00:00
Rohit Gupta	9ed44dee0d	Fix to avoid moving batch to device for DataParallel (#11780) Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>	2022-02-07 14:26:18 +00:00
ananthsub	0ba25d3cac	Update DDPStrategy to use optimizers property from within class (#11777)	2022-02-07 13:28:37 +01:00
Rohit Gupta	7ec1e66e17	reduce only loss with dp (#11594) Co-authored-by: Aki Nitta <nitta@akihironitta.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2022-02-07 17:00:29 +05:30
ananthsub	a64438c897	Centralize rank_zero_only utilities into their own module (#11747) * Centralize rank_zero_only utilities into their own module Fixes #11746 * PossibleUserWarning * Update test_warnings.py * update imports * more imports * Update CHANGELOG.md * Update mlflow.py * Update cli.py * Update api_references.rst * Update meta.py * add deprecation tests * debug standalone * fix standalone tests * Update CHANGELOG.md	2022-02-07 08:09:55 +00:00
ananthsub	dfda970572	Update TPU Spawn to use root_device instead of LightningModule's device (#11750)	2022-02-06 06:26:38 +00:00
Dan Dale	9d8faecdb2	Allow Horovod `teardown()` to complete gracefully if exception thrown in callback setup (#11752)	2022-02-05 11:13:21 -08:00
ananthsub	241c97e6eb	Update HorovodStrategy to use optimizers property from within class (#11728)	2022-02-05 10:04:55 +01:00
Adrian Wälchli	cc43d07db1	Remove legacy dead code in DDP script launch (#11678) Co-authored-by: Jirka <jirka.borovec@seznam.cz> Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2022-02-05 11:40:16 +05:30
ananthsub	72db64d294	Use the strategy's `root_device` instead of the LightningModule's device property (#11734)	2022-02-05 04:33:25 +01:00
wangraying	8c07d8bf90	Add `Trainer(strategy="bagua")` (#11146) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com> Co-authored-by: Sean Naren <sean@grid.ai> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: ananthsub <ananth.subramaniam@gmail.com> Co-authored-by: thomas chaton <thomas@grid.ai>	2022-02-04 17:02:09 +00:00
four4fish	d43fd0d4d6	Lazy initialize Strategy.parallel_devices (#11572) Co-authored-by: Aki Nitta <nitta@akihironitta.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-02-03 04:25:16 +00:00
Aki Nitta	fbc1f9f1d9	Rename `Strategy.lr_schedulers` to `Strategy.lr_scheduler_configs` (#11549)	2022-02-02 22:10:01 +00:00
Carlos Mocholí	d7944a13cd	Teardown all internal components on exception (#11620)	2022-02-02 21:10:19 +00:00
ananthsub	1bd6fc979e	Remove `Strategy.on_tpu` property (#11536)	2022-01-20 08:25:26 +01:00
ananthsub	f41d1e5e5e	Remove `Strategy.on_gpu` (#11537)	2022-01-19 21:27:12 +00:00
Carlos Mocholí	62818dbace	Use a dataclass as the scheduler config (#11443)	2022-01-18 20:23:32 +01:00
Carlos Mocholí	344ab1e0a5	Move the `lightning_optimizers` ownership to the `Strategy` (#11444)	2022-01-18 12:58:56 +01:00
Carlos Mocholí	5914fb748f	Add typing to accelerators/gpu.py (#11333)	2022-01-12 19:44:51 +00:00
Rohit Gupta	82c8875f33	Add `LightningModule.lr_scheduler_step` (#10249) Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>	2022-01-12 03:53:49 +00:00
edward-io	6107ce8e0d	Add DETAIL logs for batch use cases (#11008)	2022-01-12 01:22:48 +01:00
Rohit Gupta	06b8f82b8a	Update API references in doc (#11357)	2022-01-07 15:56:17 +01:00
Kaushik B	42a1c72660	Add Accelerators section to Lightning docs (#10755)	2022-01-06 19:12:44 +05:30
Adrian Wälchli	9c8f52ccd1	Fix restoring lr scheduler states with deepspeed strategy (#11322) Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: thomas chaton <thomas@grid.ai>	2022-01-06 12:34:16 +00:00
Danielle Pintz	5b59c951e2	Deprecate `TrainerDataLoadingMixin` and move logic to `DataConnector` (#11282) Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Aki Nitta <nitta@akihironitta.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>	2022-01-05 21:23:57 +01:00
Adrian Wälchli	9906a1a54d	Update optimizer configuration info message in `DeepSpeedStrategy` (#11327)	2022-01-05 18:20:06 +00:00
Kaushik B	70c975a9f3	Fix exception message for FSDP running on CPU (#11325)	2022-01-05 18:02:31 +01:00
Kaushik B	93223ff5ce	Introduce StrategyRegistry (#11233) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2022-01-05 17:14:18 +05:30
Adrian Wälchli	a8bd7ac73f	Fix lr scheduler state not being dumped to checkpoint in deepspeed strategy (#11307)	2022-01-05 08:38:08 +00:00
Danielle Pintz	b082715103	Remove `Strategy.optimizer_zero_grad` (#11246)	2022-01-03 13:46:57 +01:00
Adrian Wälchli	4eede7c30b	Add deprecation path for renamed training type plugins (#11227) Co-authored-by: Kaushik B <kaushikbokka@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>	2022-01-03 13:41:05 +01:00
Danielle Pintz	ca9b25db80	Remove `Strategy.init_optimizers` (#11236)	2021-12-23 18:48:21 +00:00
Kaushik B	0adcd6a048	Rename training_type_plugin file to strategy (#11239) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-12-23 14:01:23 +00:00
Danielle Pintz	a6a28e08d2	Deprecate `TrainerOptimizersMixin` and move functionality to `core/optimizer.py` (#11155)	2021-12-22 17:56:37 -08:00
Kaushik B	576a5d62a0	Introduce strategies directory for Training Strategies (#11226) Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>	2021-12-22 20:23:30 +00:00

49 Commits