Commit Graph

3824 Commits

Author SHA1 Message Date
twsl 0b9034baef
Return only unique names/versions for LoggerCollection (#10976)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-12-23 00:35:38 +00:00
Kaushik B 576a5d62a0
Introduce strategies directory for Training Strategies (#11226)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-22 20:23:30 +00:00
Carlos Mocholí eb5b350f9a
Remove explicit isinstance checks in strategies for checkpoint io (#11177)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-22 04:41:45 +00:00
Adrian Wälchli b6dd1a3878
Fix typing in `pl.callbacks.lr_monitor` (#10802)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-12-22 03:50:00 +00:00
Adrian Wälchli ba8e7cd787
Fix BF16 teardown for TPU precision plugin (#10990)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-12-22 03:47:14 +00:00
four4fish cf5ef32f7b
Deprecate Trainer.training_type_plugin in favor of trainer.strategy (#11141)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-22 02:11:43 +00:00
Adrian Wälchli 17ad1a4c00
Rename `ParallelPlugin` to `ParallelStrategy` (#11123)
2021-12-22 01:09:17 +00:00
four4fish 4bfe5bda0f
Rename the DDPSpawnShardedPlugin to DDPSpawnShardedStrategy (#11210)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-22 00:27:36 +00:00
Aki Nitta 28ce9105e4
Rename `SingleDevicePlugin` to `SingleDeviceStrategy` (#11181)
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-21 23:56:14 +00:00
four4fish f98cd78e9e
Renamed the `DDPSpawnPlugin` to `DDPSpawnStrategy` (#11145)
2021-12-21 23:06:14 +00:00
four4fish 0c69c757d4
Rename the `DataParallelPlugin` to `DataParallelStrategy` (#11183)
2021-12-21 22:00:24 +00:00
Aki Nitta c3cd4d050f
Rename `SingleTPUPlugin` to `SingleTPUStrategy` (#11182)
2021-12-21 20:09:30 +00:00
four4fish 1c5a5c3dfe
Renamed the DDP2Plugin to DDP2Strategy (#11185)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-21 19:21:00 +00:00
Carlos Mocholí b2c3d01b3e
Fix master import conflict (#11203)
2021-12-21 18:47:56 +00:00
Danielle Pintz ac8dc2c2f3
Deprecate `TrainerCallbackHookMixin` (#11148)
2021-12-21 09:47:08 -08:00
four4fish caab69aabb
Renamed DDPShardPlugin to DDPShardStrategy (#11187)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-21 17:18:25 +00:00
Carlos Mocholí f696326060
Remove `should_rank_save_checkpoint` property from TTP (#11070)
2021-12-21 18:11:20 +01:00
Carlos Mocholí 3692eba807
Drop Python 3.6 support (#11117)
2021-12-21 17:06:15 +00:00
Aki Nitta 9da78a94bd
Rename `TPUSpawnPlugin` to `TPUSpawnStrategy` (#11190)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-21 16:36:16 +00:00
Danielle Pintz 1177389d5a
Move `TrainerCallbackHookMixin.on_save/load_checkpoint` to `Trainer` and rename for clarity (#11179)
2021-12-21 17:30:01 +01:00
Kaushik B 2e947a88e0
Rename IPUPlugin to IPUStrategy (#11193)
2021-12-21 15:55:41 +00:00
Kaushik B 283bdece0a
Rename DeepSpeedPlugin to DeepSpeedStrategy (#11194)
2021-12-21 15:18:01 +00:00
Oliver Borchert 17aceafa80
Suppress Warning in `PredictionEpochLoop` (#11189)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-12-21 14:40:41 +00:00
Kaushik B ba0c901395
Rename HorovodPlugin to HorovodStrategy (#11195)
2021-12-21 14:31:41 +01:00
Rohit Gupta 93ce2d7cc9
Avoid torch amp cuda warning with bf16 on cpu (#11161)
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-21 18:24:26 +05:30
four4fish b64dea9dc3
Rename `DDPPlugin` to `DDPStrategy` (#11142)
* Rename DDPPlugin to DDPStrategy

* Change ddp_plugin to ddp_strategy

* update changelog

* rename occurrences in docs

* rename more occurrences

* fix line too long

* more fixes

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-21 08:55:51 +00:00
jjenniferdai 31f39c9578
Move `CheckpointConnector.fault_tolerant_auto_save_path` out of `CheckpointConnector.hpc_resume_path` (#11092)
2021-12-21 02:24:01 +01:00
Rohit Gupta 787f41eff6
update optimizer_step example in docs (#10420)
2021-12-21 08:19:40 +09:00
Adrian Wälchli 08e661ff72
Rename `restore_checkpoint_after_pre_dispatch` to `restore_checkpoint_after_setup` (#11166)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-12-20 17:16:52 +00:00
Carlos Mocholí e8169bbd46
Fix setter usage for checkpoint io and precision in TTP (#11071)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-12-20 17:45:32 +01:00
Adrian Wälchli f5c2881b68
3/n Simplify spawn plugins: Merge `pre_dispatch` and `setup` logic (#11137)
2021-12-20 17:41:22 +01:00
Adrian Wälchli 2e47e2f4ae
Set spawn_method on initialization (#11162)
2021-12-20 17:39:54 +01:00
four4fish 0ee78e96ef
Rename `DDPFullyShardedPlugin` to `DDPFullyShardedStrategy` (#11143)
* Rename DDPFullyShardedPlugin to DDPFullyShardedStrategy

* update fsdp_plugin to fsdp_strategy

* update changelog

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-12-20 17:11:20 +01:00
ORippler 86a3c5e2a3
Add required states for resumed ModelCheckpoint GC (#10995)
* Add required states for resumed ModelCheckpoint GC

* Add backwards compatibility with legacy ckpts

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>

* Add test to check if attrs are written to ckpt

Note that we do not yet check for proper loading/reinstantiation of
ModelCheckpoint based on the ckpt written to disk

* Test if attributes are restored properly from ckpt

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix broken `test_callbacks_state_fit_ckpt_path`

`ModelCheckpoint` is configured to save after every epoch,
but `trainer.fit` is called with `max_steps = 1`

Note there may be a better way of doing this, where `ModelCheckpoint`
is called after `training_step`

* Update test_restore.py

* Update test_restore.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Check that all attributes are restored properly

* revert changes, use fix on master

* Convert to proper unit test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Refactor `test_mode_checkpoint_saveload_ckpt`

* First save, then load ckpt.
* Instantiate ModelCheckpoint twice.

Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-20 17:05:15 +01:00
Danielle Pintz b1baf460d9
Include hook's object name when profiling (#11026)
2021-12-20 15:18:24 +01:00
Adrian Wälchli 29eb9cccf2
Rename the `TrainingTypePlugin` base to `Strategy` (#11120)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: four4fish <88516121+four4fish@users.noreply.github.com>
2021-12-20 12:50:11 +00:00
guyang3532 cc4a978bf6
Safely disable profiler (#11167)
2021-12-20 11:51:46 +00:00
Carlos Mocholí 7ed3dbf191
Fix evaluation logging on epoch end with multiple dataloaders (#11132)
2021-12-19 15:51:01 +01:00
Danielle Pintz f95976d602
rename _call_ttp_hook to _call_strategy_hook (#11150)
2021-12-18 17:53:03 -08:00
Rohit Gupta 3461af0ddb
Add support for returning callback from `LightningModule.configure_callbacks` (#11060)
2021-12-18 10:46:35 +00:00
Rafał Jankowski 3cc69f992b
Fixed NeptuneLogger when using DDP (#11030)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-12-18 01:40:13 +00:00
Carlos Mocholí 62f1e82e03
Fix CVE-2020-1747 and CVE-2020-14343 (#11099)
2021-12-17 20:27:15 +00:00
Carlos Mocholí 8508cce37d
Mark all result classes as protected (#11130)
2021-12-17 19:35:17 +00:00
Rohit Gupta 860959fb3f
Enable logging hparams only if there are any (#11105)
2021-12-17 19:40:56 +01:00
Carlos Mocholí dbb7f56b35
Deprecate `Trainer.verbose_evaluate` (#10931)
2021-12-17 19:26:32 +01:00
Carlos Mocholí 75d96d9897
Reset the current progress tracking state during double evaluation (#11119)
2021-12-17 19:20:11 +01:00
Adrian Wälchli 978f5e6ad6
Fix AttributeError when using CombinedLoader in prediction (#11111)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-12-17 18:02:25 +00:00
quancs 179b4dd415
remove redundant methods in RichProgressBar (#11100)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-12-17 17:40:31 +00:00
Carlos Mocholí 7e10f6d41f
Save the loop progress state by default (#10784)
2021-12-17 16:00:27 +00:00
Carlos Mocholí fa6d17c96f
Fix typing for utilities.warnings (#11115)
2021-12-17 15:07:27 +01:00