Danielle Pintz
01f5f99919
Deprecate callback hooks `on_init_start` and `on_init_end` ( #10940 )
2021-12-08 07:42:19 +00:00
Danielle Pintz
aeb0b5595f
Deprecate `call_hook` ( #10979 )
2021-12-08 00:52:47 +00:00
Rohit Gupta
6369e3b77f
Update Changelog after 1.5.5 release ( #10977 )
2021-12-07 12:35:20 -08:00
Adrian Wälchli
6bfc0bbc56
Remove `TrainingTypePlugin.post_dispatch` in favor of `teardown` ( #10939 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-12-06 22:27:30 +00:00
four4fish
629ca09e09
Fix TypeError causing failure in signal_connector teardown ( #10961 )
2021-12-06 21:48:31 +00:00
four4fish
63bb4ec77d
4/n Move Accelerator into strategy - remove X_step() from accelerator ( #10890 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-06 20:16:54 +00:00
Adrian Wälchli
6c79b2e969
Change temporary spawn checkpoint name ( #10934 )
2021-12-06 16:08:55 +00:00
Adrian Wälchli
3e1f8aa312
Fix spawn plugins not deleting temp checkpoint ( #10935 )
2021-12-06 13:41:19 +00:00
four4fish
2fc64e9656
2/n Move Accelerator into strategy - remove dispatch functions from Accelerator ( #10885 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-06 09:51:14 +00:00
Rajath Bharadwaj
7914e5c157
added UserWarnings if max_epochs not set in the Trainer class ( #10700 )
2021-12-06 09:44:25 +00:00
Kaushik B
6599ced17d
Don't import torch_xla.debug for torch-xla<1.8 ( #10836 )
2021-12-06 06:31:38 +00:00
Luca Moschella
7792b77932
Resolve: 'DummyExperiment' object does not support item assignment ( #10917 )
...
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-12-03 17:54:05 +00:00
four4fish
6fe3211573
Unroll dict input before calling Accelerator X_steps ( #10908 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-03 17:00:52 +00:00
Rohit Gupta
8ba3b383c0
Fix filtration logic for eval results with multiple dataloaders ( #10810 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-12-03 14:34:46 +00:00
four4fish
e646ca1d59
Remove `setup_optimizers_in_pre_dispatch` logic ( #10906 )
2021-12-03 15:05:08 +01:00
Adrian Wälchli
c55bc433ce
Fix retrieval of batch indices when dataloader num_workers > 0 ( #10870 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-12-02 10:36:10 +00:00
Adrian Wälchli
98cb7e8790
1/n Simplify spawn plugins: Simplify handling of multiprocessing queue ( #10034 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-12-02 10:30:44 +00:00
Rohit Gupta
5b9995da04
Fix schedule reset logic in pytorch profiler ( #10837 )
2021-12-02 14:22:49 +05:30
four4fish
9beeabbced
Removed unnecessary `_move_optimizer_state` method overrides ( #10849 )
...
* Update tpu tp share same logic with ttp
* run test
* Update tpu_spawn.py
* debug
* Add changelog
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update training_type_plugin.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update training_type_plugin.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-12-02 05:03:30 +00:00
four4fish
45dd8066e7
3/n Move Accelerator into strategy - remove model_sharded_context() ( #10886 )
...
* 3/n Move Accelerator into strategy - remove model_sharded_context()
* update ttp function
* update changelog
* update changelog
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-12-02 03:34:51 +00:00
four4fish
44cd412e91
Remove precision_plugin pre_dispatch() method ( #10887 )
...
* Remove precision_plugin pre_dispatch() method
* update changelog
2021-12-01 18:42:17 -08:00
Carlos Mocholí
a7aed2af7a
[CLI] Add support for `ReduceLROnPlateau` ( #10860 )
2021-12-01 15:41:22 +00:00
Rafał Jankowski
c6478414ee
Fixed uploading best model checkpoint in NeptuneLogger ( #10369 )
2021-12-01 13:58:54 +00:00
Aka.Fido
72cc8b7ca9
Disable validation completely when `overfit_batches>0` ( #9709 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-12-01 13:57:57 +00:00
Adrian Wälchli
7514adf814
Remove `return_result` argument from `DDPSpawnPlugin.spawn()` ( #10867 )
2021-12-01 13:29:08 +00:00
Kaushik B
ec0fb2fd95
Raise exception if rich is less than 10.2.2 ( #10839 )
2021-12-01 06:14:19 +00:00
Kaushik B
3c9488f62f
Update changelog after v1.5.4 release ( #10843 )
2021-11-30 23:26:25 +00:00
Mauricio Villegas
f3b0a06e90
Fix `SignalConnector._has_already_handler` check for callable type ( #10483 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-30 22:47:52 +00:00
Adrian Wälchli
25473acddb
Restore signals on teardown ( #10611 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-30 22:07:14 +00:00
Rohit Gupta
1437be5e98
Disable batch_size extraction for torchmetric instances ( #10815 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-30 20:47:05 +00:00
four4fish
1d2878523a
2/n Move Precision Plugin into strategy - move optimizer related logics ( #10596 )
...
Co-authored-by: Danielle Pintz <38207072+daniellepintz@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-11-30 08:31:23 +00:00
four4fish
8bf7f9cce7
1/n Move Accelerator into strategy - move batch_to_device to strategy ( #10649 )
...
* 1/n Integrate Device Specific Accelerator Logic with strategy - move batch_to_device to strategy
* add changelog
* add model is not none check
* Apply suggestions from code review
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Update CHANGELOG.md
* Update test_datamodules.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update test_hooks.py
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update dp.py
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-11-29 12:11:21 -08:00
Rohit Gupta
753cc4dfad
Fix default logging levels for train step specific hooks ( #10756 )
2021-11-29 19:51:17 +00:00
Carlos Mocholí
d3b7492bd0
[CLI] Add support for `--key.help=class` ( #10767 )
2021-11-29 14:12:53 +00:00
Adrian Wälchli
49d09aa28b
Update changelog after 1.5.3 release ( #10744 )
2021-11-27 05:28:23 +00:00
Adrian Wälchli
c752060712
Consolidate state when retrieving sharded state dict in Lite ( #10746 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-11-27 04:54:45 +00:00
thomas chaton
e94aff1c5b
Fault Tolerant: Add support for fault tolerant dataloader validator ( #10465 )
2021-11-26 19:33:47 +00:00
thomas chaton
6fe6e9e414
Delete TensorBoardLogger experiment before spawning the processes. ( #10777 )
2021-11-26 17:07:57 +00:00
thomas chaton
412d507a73
Fault Tolerant: move signal to SIGTERM ( #10605 )
2021-11-26 13:37:27 +00:00
Kaushik B
e507bc9027
Fix compare version for packages ( #10762 )
2021-11-26 09:15:22 +00:00
thomas chaton
3d6262b7a9
Fault Tolerant Manual: Add support for DDP ( #10638 )
2021-11-25 18:31:53 +01:00
Kaushik B
e0b4bb2ea3
Deprecate `DeviceType` in favor of `_AcceleratorType` ( #10503 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-25 16:41:03 +01:00
Carlos Mocholí
f8b2d5b128
Improve error message on `TypeError` during `DataLoader` reconstruction ( #10719 )
2021-11-24 21:51:11 +00:00
thomas chaton
0066ff0129
Fault Tolerant Manual: Enable the feature ( #10707 )
2021-11-24 17:36:08 +00:00
Adrian Wälchli
30ec4815cb
Support re-instantiation for custom DataLoader in Lightning ( #10680 )
...
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-11-24 15:58:51 +01:00
thomas chaton
e51a8ee7a3
Fault Tolerant Manual: utilities cleanup ( #10703 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-24 15:01:55 +01:00
thomas chaton
b28ab34ff5
Fault Tolerant Manual: Add loading to reload the states ( #10699 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-11-23 17:18:36 +00:00
thomas chaton
7cf6374bd0
Fault Tolerant Manual: Add support for collecting states across processes ( #10639 )
2021-11-23 14:27:33 +00:00
Adrian Wälchli
ee9f7c0421
Update DeepSpeed precision handling after moving PrecisionPlugin ( #10657 )
2021-11-23 13:51:41 +00:00
thomas chaton
1702036c14
Fault Tolerant Manual: Add stateful dataloader iter ( #10674 )
2021-11-23 12:30:50 +00:00
thomas chaton
2036dfb5df
Fault Tolerant Manual: Add _rotate_worker_indices utility ( #10647 )
2021-11-22 19:52:04 +00:00
thomas chaton
6acfef680f
Fault Tolerant Manual: Add is_obj_stateful utility ( #10646 )
2021-11-22 18:48:32 +00:00
Andres Algaba
6fc7c54c3a
refactor slurm_job_id ( #10622 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-11-22 17:41:08 +00:00
Rohit Gupta
d431ce14a1
Raise an error if batch_size cannot be inferred from current batch ( #10541 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-22 16:55:19 +00:00
Carlos Mocholí
a6dedcf492
Fix `move_metrics_to_cpu` with evaluation ( #10631 )
2021-11-22 15:58:21 +00:00
thomas chaton
991cd895c6
1/n Add `FaultTolerantMode` ( #10645 )
2021-11-22 14:58:23 +00:00
Kaushik B
ce0a977742
Moved `env_vars_connector._defaults_from_env_vars` to `utilities.argparse._defaults_from_env_vars` ( #10501 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-22 08:06:35 +00:00
ananthsub
a18b6409d1
Check torch.distributed availability before sharded tensor state dict hook registration ( #10621 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-19 17:34:23 +00:00
Mauricio Villegas
5d748e560b
LightningCLI changes for jsonargparse>=4.0.0 ( #10426 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-11-19 17:03:14 +00:00
Rohit Gupta
ec27313be2
Fix batch size extraction when set by the user in `LightningModule.log` ( #10408 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-19 16:48:26 +00:00
Biho-Kim
e83e8ae305
Respect the passed dtype with `self.log` ( #10076 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-11-19 15:16:33 +00:00
thomas chaton
94390aba56
Lite: Don't pop values if they don't exist ( #10613 )
2021-11-19 14:04:33 +00:00
Kaushik B
137b62d80d
Add `refresh_rate` to RichProgressBar ( #10497 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-11-19 05:59:57 +00:00
thomas chaton
7d3ad5b76e
Don't register signal in thread ( #10610 )
2021-11-19 04:13:35 +01:00
four4fish
700521c7d3
1/n Move precision plugin into strategy - update reference ( #10570 )
...
* 1/n move precision plugin into strategy - update reference
* update precision plugin reference in tpu_spawn
* add missing reference in error message
* add back removed license line
* update references in tests
* update reference in trainer
* update return annotation for precision_plugin property on TTP
* simplify access to precision plugin reference in sharded plug
* add changelog
* remove precision property from ttp and add deprecation message
* fix make doc and update precision reference
* simplify a reference to precision
accidentally overridden Adrian's change, now add it back
* Update CHANGELOG.md
add Adrian's change back
* Update accelerator precision
Add Adrian's change back
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add none check for precision plugin
just to be safe
* Update ipu.py
* update precision_plugin param deprecation message
* Update accelerator.py
* Remove deprecated warning
Tests will fail after 9940
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-11-19 00:39:01 +00:00
Adrian Wälchli
0f6d89422b
Control automatic resubmission on SLURM ( #10601 )
2021-11-18 17:48:53 +00:00
Adrian Wälchli
261ea90822
Update changelog after 1.5.2 release ( #10590 )
2021-11-17 23:31:09 +00:00
Adrian Wälchli
d50e1696f9
Fix propagation of device and dtype properties in Lite modules ( #10559 )
2021-11-16 17:26:46 +00:00
Carlos Mocholí
edebd8a3bc
Fix scripting causing false positive deprecation warnings ( #10555 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-11-16 15:52:09 +00:00
Sean Naren
e98ace3adc
[DeepSpeed] Do not fail if batch size could not be inferred for logging ( #10438 )
2021-11-16 11:42:25 +00:00
Rohit Gupta
de7ef41fea
remove deprecated `reload_dataloaders_every_epoch` from `Trainer` ( #10481 )
2021-11-16 06:47:43 +00:00
Rohit Gupta
60850ef510
fix overfit_batch sampler replacement logic ( #10486 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-11-15 22:31:45 +00:00
Carlos Mocholí
65ebfed3ae
Fix `to_torchscript()` causing false positive deprecation warnings ( #10470 )
2021-11-15 22:12:55 +00:00
Carlos Mocholí
dcafc95f2b
Avoid deprecated `progress_bar_refresh_rate` usage ( #10520 )
...
Co-authored-by: Danielle Pintz <38207072+daniellepintz@users.noreply.github.com>
2021-11-15 22:04:48 +01:00
thomas chaton
1de3539eac
Resolve instantiation problem with init_meta_context ( #10493 )
2021-11-15 19:13:01 +00:00
Kaushik B
ae71284627
Remove deprecated `disable_validation` property from Trainer ( #10450 )
2021-11-15 18:42:00 +00:00
Kaushik B
01cf7a2ac5
Deprecate `DistributedType` in favor of `StrategyType` ( #10505 )
2021-11-15 17:10:08 +00:00
Shivam Mehta
794c4b08c0
Remove deprecated `is_overridden(model=...)` ( #10507 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-15 12:56:30 +00:00
puhuk
8b0cb47cc0
Remove deprecated `hpc_load` in `CheckpointConnector` ( #10525 )
...
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
2021-11-15 11:54:47 +00:00
thomas chaton
ffb40060c0
shutdown workers on failure ( #10463 )
2021-11-15 10:03:46 +00:00
Rohit Gupta
a8c2725ff8
remove deprecated signature for `transfer_batch_to_device` ( #10480 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-11-13 19:32:30 +00:00
Kaushik B
fabb364402
Remove deprecated `mode` argument from ModelSummary ( #10449 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-11-12 19:32:43 +00:00
Carlos Mocholí
847e24011a
Squeeze the early stopping monitor ( #10461 )
2021-11-12 18:03:47 +00:00
Rohit Gupta
fa0ed17f8a
remove deprecated train_loop ( #10482 )
...
* remove deprecated train_loop
* chlog
2021-11-12 12:42:25 +00:00
Kaushik B
d577f461a4
Remove deprecated `utilities.distributed.rank_zero_{warn,deprecation}` ( #10451 )
2021-11-10 07:35:48 -08:00
ananthsub
aad86423f7
Remove more deprecated methods from base `Accelerator` class ( #10448 )
2021-11-10 12:58:24 +05:30
a-gardner1
ce149f6451
Fix support for dataclasses with ClassVar/InitVar in `apply_to_collection` ( #9702 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-10 04:42:27 +00:00
Carlos Mocholí
d515bcac96
Remove deprecated profiler import ( #10443 )
2021-11-09 23:13:02 +01:00
Justus Schock
eeef5a80ac
Update Changelog for v1.5.1 ( #10439 )
...
* Missing Changelogs
* Add 1.5.1 entry to changelog
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-11-09 21:25:54 +00:00
thomas chaton
8d810d6144
Enable distributed training with CombinedDataLoader and max_size_cycle ( #10374 )
...
* solve combinedloader
* update
* update changelog
* update on comments
* resolve iterable dataset support
* update test description
* update
* update on comments
* update
* Accelerator auto
* Address review
* Refactor
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-09 20:06:10 +00:00
Carlos Mocholí
c413b69240
Remove deprecated `task_idx` ( #10441 )
2021-11-09 18:54:38 +00:00
Carlos Mocholí
ebab4be3e4
Remove deprecated `DeviceDtypeModuleMixin` import ( #10442 )
2021-11-09 18:35:53 +00:00
Ross Johnstone
c2f25d42ab
Make `monitor` required arg of EarlyStopping callback ( #10328 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-09 18:08:03 +00:00
Carlos Mocholí
069ec1005a
Do not autodetach extras ( #10424 )
...
* Do not autodetach extras
* Update CHANGELOG
* Use foo
2021-11-09 16:07:16 +00:00
thomas chaton
7fb277f260
Resolve workers being forcibly deleted with `persistent_workers=True` ( #10434 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-09 14:58:31 +00:00
Carlos Mocholí
edbf27430d
Remove deprecated `self.log` arguments ( #10423 )
2021-11-09 15:49:55 +01:00
Adrian Wälchli
aaa6aa75e9
Fix converting only float type tensors in Lite ( #10429 )
...
* fix
* less code
* add test case
* add test cases
* update input
* add test cases
* add type hint
* add changelog note
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-11-09 15:21:00 +01:00
Kaushik B
5eeca87e98
Fix deadlocks for distributed training for RichProgressBar ( #10428 )
2021-11-09 18:30:37 +05:30
Rohit Gupta
21eafafcb0
disable step logging in epoch hooks ( #10409 )
...
* disable step logging in epoch hooks
* chlog
* Apply suggestions from code review
* chlog
2021-11-09 16:53:27 +05:30
four4fish
0ed5e3dc8a
Raise exceptions when torch distributed is not available ( #10418 )
...
* Raise exceptions when torch distributed is not available
* add changelog
2021-11-09 09:11:05 +00:00