Akash Kwatra
bc1c8b926c
Deprecate `BaseProfiler` in favor of `Profiler` ( #12150 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2022-03-21 20:17:03 +00:00
DuYicong515
31c68d107e
Remove `AcceleratorConnector.num_gpus` and deprecate `Trainer.num_gpus` ( #12384 )
2022-03-21 18:06:39 +01:00
Danielle Pintz
caed77f155
Refactor `TorchElasticEnvironment.detect` to use `torch.distributed.is_torchelastic_launched` ( #12376 )
...
* Refactor TorchElasticEnvironment.detect to use native utility from torch.distributed
* fix version and tests
* fix version
* Update tests/accelerators/test_accelerator_connector.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-03-21 16:51:24 +01:00
four4fish
1eff3b53c1
Update fairscale version ( #11567 )
...
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-03-21 11:38:55 +00:00
Rohit Gupta
865c54f308
Fix deepspeed scheduler initialization ( #12031 )
2022-03-21 10:31:00 +00:00
DuYicong515
523200971d
Remove `AcceleratorConnector.root_gpu` and deprecate `Trainer.root_gpu` ( #12262 )
2022-03-19 23:53:50 +00:00
jjenniferdai
6ba66789ae
[2/n] add `Stateful` functionality support for Callbacks ( #12232 )
2022-03-19 20:20:50 +00:00
DuYicong515
ed2bcc5ab3
Deprecate `Trainer.devices` in favor of `Trainer.num_devices` and `Trainer.device_ids` ( #12151 )
2022-03-18 12:38:57 -07:00
ananthsub
4277845fa7
Add support for specifying process group backend to relevant distributed strategies ( #11745 )
2022-03-17 23:38:03 -07:00
Danielle Pintz
601948a4bf
Deprecate `Trainer.use_amp` ( #12312 )
2022-03-18 06:14:35 +00:00
Danielle Pintz
2360049744
Deprecate `LightningModule.use_amp` ( #12315 )
2022-03-18 03:49:18 +01:00
Danielle Pintz
f8e50f9cf5
Fix the case where logger=None is passed to Trainer ( #12249 )
2022-03-18 02:18:28 +00:00
edward-io
90a9da5abb
check trainerfn == FITTING before configuring sync_batchnorm ( #11919 )
...
Co-authored-by: edward-io <me@edward.io>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
2022-03-12 03:52:59 +00:00
four4fish
4d74f379a5
Only allow one value for each plugin type in `plugins` flag ( #12083 )
2022-03-11 19:36:23 +00:00
Jirka Borovec
c90174ca31
unify logger testing ( #9081 )
...
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-03-11 14:24:30 +00:00
Jirka Borovec
8577ef7bba
Skip horovod 0.24.0 only ( #12248 )
...
* try skip horovod 0.24.0 only
* HOROVOD_BUILD_CUDA_CC_LIST
* fix test
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-03-10 16:01:08 +00:00
jjenniferdai
d31126c331
Support passing `storage_options` in `trainer.save_checkpoint()` API ( #11891 )
2022-03-09 18:35:50 +00:00
Carlos Mocholí
49a4a36ad4
Have the outputs match the loops format ( #12182 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-03-08 18:10:18 +00:00
Carlos Mocholí
8fa156948a
Add `LightningCLI(auto_registry)` ( #12108 )
2022-03-08 12:26:10 -05:00
jjenniferdai
f3253070c4
Deprecate `LightningDataModule.on_save/load_checkpoint` ( #11893 )
2022-03-07 18:21:46 -08:00
Carlos Mocholí
aea96e45a4
Integrate global step with progress tracking ( #11805 )
2022-03-07 19:21:37 +00:00
Rohit Gupta
fc499bf56f
Disable tuner with distributed strategies ( #12179 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-03-07 08:45:07 +00:00
four4fish
91052dc6d5
Move ipu precision flag check to IPUPrecisionPlugin init ( #12148 )
2022-03-05 09:03:24 +00:00
ananthsub
9c3d6b8fc7
Deprecate `LightningModule.on_pretrain_routine_{start/end}` ( #12122 )
2022-03-04 22:17:08 -08:00
Akash Kwatra
eff67d7a02
Deprecate `AbstractProfiler` in favor of `BaseProfiler` ( #12106 )
2022-03-05 02:35:57 +00:00
Danielle Pintz
0b682b807a
Mark `logger_connector` as protected ( #12195 )
2022-03-05 02:33:42 +00:00
Louis Taylor
73bda54e63
CI: update poplar sdk version ( #12226 )
2022-03-04 23:49:30 +00:00
Ethan Harris
ac735db0a0
Remove `data_pipeline` attribute patch ( #12204 )
2022-03-04 23:09:37 +00:00
Akash Kwatra
1f7298d326
Deprecate `LoggerCollection` in favor of `trainer.loggers` ( #12147 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-03-04 23:01:43 +00:00
jjenniferdai
5d2a3eab69
add `state_dict`/`load_state_dict` to base `Callback` ( #11998 )
2022-03-04 02:41:48 +00:00
four4fish
15364c18c8
Check `parallel_devices` passed through `strategy` is consistent with the `accelerator` flag ( #12105 )
2022-03-03 10:30:24 -08:00
jjenniferdai
d923dff627
Deprecate `PrecisionPlugin.on_save/load_checkpoint` ( #11978 )
2022-03-02 10:14:55 -08:00
jjenniferdai
89d37569d8
add `accelerator.is_available()` check ( #12104 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2022-03-02 10:07:49 +00:00
Adrian Wälchli
0e24140fe4
Improve mechanism to reset the seed after sanity check ( #11870 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-03-01 23:27:30 +00:00
Adrian Wälchli
d4d197070f
Add `SyncBatchNormPlugin` ( #11754 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2022-03-01 19:41:40 +05:30
Danielle Pintz
0fe3379fa4
Deprecate `weights_save_path` from the Trainer constructor ( #12084 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-02-28 22:45:26 +00:00
Carlos Mocholí
6309a59c3c
Do not prefetch when possible ( #12101 )
2022-02-28 18:31:18 +00:00
Kushashwa Ravi Shrimali
02ccd874b9
Stop loading a few properties if checkpoint's `dirpath` has changed ( #12045 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-28 16:42:09 +00:00
Kaushik B
a52a6ea030
Add support for pluggable Accelerators ( #12030 )
...
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-02-28 21:36:23 +05:30
Carlos Mocholí
a9024ce870
[CLI] Fix `SaveConfigCallback` with DDP spawn ( #12011 )
2022-02-28 13:27:42 +00:00
Cai Q.T
01c31ae434
Fix `LightningModule.{un,}toggle_model` when only 1 optimizer is used ( #12088 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-28 12:41:51 +00:00
Rohit Gupta
17bb815d01
Add `estimated_stepping_batches` property to `Trainer` ( #11599 )
2022-02-28 12:40:48 +00:00
Rohit Gupta
5b342f14a6
fix to avoid common hook warning if no hook is overridden ( #12131 )
2022-02-28 18:07:05 +05:30
Carlos Mocholí
db1c709519
Clean loop fetching usage ( #12103 )
2022-02-28 10:51:33 +00:00
Carlos Mocholí
5f920dc088
Refactor Horovod NCCL check ( #11948 )
2022-02-28 10:45:32 +00:00
Mauricio Villegas
54b9a85227
Unit test for CLI with subcommands and a common default config file ( #12061 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-28 10:17:49 +00:00
DuYicong515
c9af112801
Remove `AcceleratorConnector.num_nodes` ( #12107 )
...
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-02-28 09:53:38 +00:00
Carlos Mocholí
8fd17f2edf
[IPU] Support manually instantiating the `poptorch.DataLoader` ( #12116 )
2022-02-28 09:36:26 +00:00
DuYicong515
0b677ecf2b
Remove `AcceleratorConnector.has_tpu` ( #12109 )
2022-02-27 14:16:03 +00:00
DuYicong515
b2932337bc
Remove `AcceleratorConnector.has_ipu` ( #12111 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-02-27 13:36:36 +00:00