24 Commits

Author SHA1 Message Date
four4fish 1eff3b53c1
Update fairscale version (#11567)
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-03-21 11:38:55 +00:00
Rohit Gupta 865c54f308
Fix deepspeed scheduler initialization (#12031) 2022-03-21 10:31:00 +00:00
ananthsub 4277845fa7
Add support for specifying process group backend to relevant distributed strategies (#11745) 2022-03-17 23:38:03 -07:00
edward-io 90a9da5abb
check trainerfn == FITTING before configuring sync_batchnorm (#11919)
Co-authored-by: edward-io <me@edward.io>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
2022-03-12 03:52:59 +00:00
four4fish 15364c18c8
Check `parallel_devices` passed through `strategy` is consistent with the `accelerator` flag (#12105) 2022-03-03 10:30:24 -08:00
Adrian Wälchli d4d197070f
Add `SyncBatchNormPlugin` (#11754)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2022-03-01 19:41:40 +05:30
Jan Stratil c877d54c04
Fix passing _ddp_params_and_buffers_to_ignore (#11949) 2022-02-24 17:22:48 +00:00
Kaushik B dcad2ea4bc
Move strategy tests from accelerators to strategies directory (#11329) 2022-02-22 05:14:18 +00:00
four4fish 6e14209185
Rewrite accelerator_connector (#11448) 2022-02-17 23:38:39 +00:00
Rohit Gupta 25b505508d
Add process launchers (#11643)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-17 21:16:51 +00:00
edward-io 87bd54aedf
fix typos (#11937) 2022-02-16 17:27:51 -08:00
guyang3532 79c4e5de60
Refine the pytorch profiler (#11268) 2022-02-11 14:50:18 +01:00
Rohit Gupta 182c18d319
Configure native deepspeed schedulers with interval='step' (#11788) 2022-02-09 08:20:50 +00:00
wangraying 8c07d8bf90
Add `Trainer(strategy="bagua")` (#11146)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-02-04 17:02:09 +00:00
Krishna Kalyan 6291af5c19
Replace occurrences of `on_before_accelerator_backend_setup_called` with `setup` (#11568)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-02-03 04:14:33 +00:00
ananthsub 1bd6fc979e
Remove `Strategy.on_tpu` property (#11536) 2022-01-20 08:25:26 +01:00
ananthsub f41d1e5e5e
Remove `Strategy.on_gpu` (#11537) 2022-01-19 21:27:12 +00:00
Carlos Mocholí 62818dbace
Use a dataclass as the scheduler config (#11443) 2022-01-18 20:23:32 +01:00
Carlos Mocholí 344ab1e0a5
Move the `lightning_optimizers` ownership to the `Strategy` (#11444) 2022-01-18 12:58:56 +01:00
Adrian Wälchli 9c8f52ccd1
Fix restoring lr scheduler states with deepspeed strategy (#11322)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-01-06 12:34:16 +00:00
Kaushik B 70c975a9f3
Fix exception message for FSDP running on CPU (#11325) 2022-01-05 18:02:31 +01:00
Kaushik B 93223ff5ce
Introduce StrategyRegistry (#11233)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-01-05 17:14:18 +05:30
Adrian Wälchli a8bd7ac73f
Fix lr scheduler state not being dumped to checkpoint in deepspeed strategy (#11307) 2022-01-05 08:38:08 +00:00
Kaushik B 650c710efa
Rename training plugin test files & names to strategy (#11303) 2022-01-04 14:32:45 +01:00