four4fish
1eff3b53c1
Update fairscale version (#11567)
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-03-21 11:38:55 +00:00
Rohit Gupta
865c54f308
Fix deepspeed scheduler initialization (#12031)
2022-03-21 10:31:00 +00:00
ananthsub
4277845fa7
Add support for specifying process group backend to relevant distributed strategies (#11745)
2022-03-17 23:38:03 -07:00
edward-io
90a9da5abb
check trainerfn == FITTING before configuring sync_batchnorm (#11919)
Co-authored-by: edward-io <me@edward.io>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
2022-03-12 03:52:59 +00:00
four4fish
15364c18c8
Check `parallel_devices` passed through `strategy` is consistent with the `accelerator` flag (#12105)
2022-03-03 10:30:24 -08:00
Adrian Wälchli
d4d197070f
Add `SyncBatchNormPlugin` (#11754)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2022-03-01 19:41:40 +05:30
Jan Stratil
c877d54c04
Fix passing _ddp_params_and_buffers_to_ignore (#11949)
2022-02-24 17:22:48 +00:00
Kaushik B
dcad2ea4bc
Move strategy tests from accelerators to strategies directory (#11329)
2022-02-22 05:14:18 +00:00
four4fish
6e14209185
Rewrite accelerator_connector (#11448)
2022-02-17 23:38:39 +00:00
Rohit Gupta
25b505508d
Add process launchers (#11643)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-17 21:16:51 +00:00
edward-io
87bd54aedf
fix typos (#11937)
2022-02-16 17:27:51 -08:00
guyang3532
79c4e5de60
Refine the pytorch profiler (#11268)
2022-02-11 14:50:18 +01:00
Rohit Gupta
182c18d319
Configure native deepspeed schedulers with interval='step' (#11788)
2022-02-09 08:20:50 +00:00
wangraying
8c07d8bf90
Add `Trainer(strategy="bagua")` (#11146)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-02-04 17:02:09 +00:00
Krishna Kalyan
6291af5c19
Replace occurrences of `on_before_accelerator_backend_setup_called` with `setup` (#11568)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-02-03 04:14:33 +00:00
ananthsub
1bd6fc979e
Remove `Strategy.on_tpu` property (#11536)
2022-01-20 08:25:26 +01:00
ananthsub
f41d1e5e5e
Remove `Strategy.on_gpu` (#11537)
2022-01-19 21:27:12 +00:00
Carlos Mocholí
62818dbace
Use a dataclass as the scheduler config (#11443)
2022-01-18 20:23:32 +01:00
Carlos Mocholí
344ab1e0a5
Move the `lightning_optimizers` ownership to the `Strategy` (#11444)
2022-01-18 12:58:56 +01:00
Adrian Wälchli
9c8f52ccd1
Fix restoring lr scheduler states with deepspeed strategy (#11322)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-01-06 12:34:16 +00:00
Kaushik B
70c975a9f3
Fix exception message for FSDP running on CPU (#11325)
2022-01-05 18:02:31 +01:00
Kaushik B
93223ff5ce
Introduce StrategyRegistry (#11233)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-01-05 17:14:18 +05:30
Adrian Wälchli
a8bd7ac73f
Fix lr scheduler state not being dumped to checkpoint in deepspeed strategy (#11307)
2022-01-05 08:38:08 +00:00
Kaushik B
650c710efa
Rename training plugin test files & names to strategy (#11303)
2022-01-04 14:32:45 +01:00