5776 Commits

Author SHA1 Message Date
Kaushik B 14fb076a30
Fix deprecation test version for accelerator collective (#9892) 2021-10-12 11:50:31 +05:30
Gili Tzabari 4afe53791b
Clarify lr scheduler frequency (#9843) 2021-10-12 01:44:07 +00:00
Sean Naren 83acb8671d
Update DeepSpeed version, fix failing tests (#9898) 2021-10-11 22:35:33 +00:00
Adrian Wälchli f9d2612102
fix qconfig import for pytorch 1.10 (#9899) 2021-10-11 22:30:34 +00:00
Kaushik B c3aa6e9818
Prepare v1.5.0rc0 (#9893) 2021-10-11 20:36:01 +01:00
yopknopixx 173f4c8466
Deprecate `terminate_on_nan` Trainer argument in favor of `detect_anomaly` (#9175)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-11 17:17:43 +00:00
Adrian Wälchli 6a0c47a014
remove redundant accumulation normalization in manual optimization (#9769) 2021-10-11 15:26:12 +00:00
Ranuga-Disansa f915a8a283
Removed a redundant warning with `ModelCheckpoint(monitor=None)` callback (#9875)
* Update README.md

* Update README.md

* Create evaluation.py

* Update README.md

* Update evaluation.py

* Create evaluation.py

* Create evaluation.py

* Update evaluation.py

* Create nlp.py

* Update evaluation.py

* Create evaluation.py

* Update nlp.py

* Update nlp.py

* Update evaluation.py

* Create evaluation.py

* Update nlp.py

* Update nlp.py

* Update requirements.txt

* Update evaluation.py

* Create data_loader.py

* Update nlp.py

* Update evaluation.py

* Update data_loader.py

* Update nlp.py

* Update data_loader.py

* Update requirements.txt

* Update model_checkpoint.py

* Delete evaluation.py

* Delete data_loader.py

* Delete nlp.py

* Update requirements.txt

* Update model_checkpoint.py

* Update README.md

* Update pytorch_lightning/callbacks/model_checkpoint.py

* Update CHANGELOG.md

* Update test_model_checkpoint.py

* Update model_checkpoint.py

* update

* update

* chlog update

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-11 14:54:07 +00:00
Rohit Gupta 54d4b4b21d
use existing logic to configure optimizers in lr_finder (#9789)
* use predefined logic

* patch init_optimizers

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-11 13:05:27 +00:00
Sean Naren 66ce4436c6
[docs] Add Torch Distributed Run (#9890) 2021-10-11 12:15:46 +00:00
theory-in-progress 4ecb0d8bc9
Updated quantization imports in PyTorch 1.10 (#9878)
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-11 11:23:21 +00:00
Rohit Gupta 46fa703853
disable_logger (#9837) 2021-10-11 16:36:59 +05:30
Boris Dayma 2db9ea3500
feat(wandb): support media logging (#9545) 2021-10-11 10:15:36 +01:00
Rohit Gupta ce8233e6f0
use public format checkpoint method (#9818)
* use public method

* document

* Apply suggestions from code review
2021-10-11 09:23:47 +01:00
Rohit Gupta d71501d97f
Reset `val_dataloader` in `tuner/batch_size_scaling` (#9857)
* reset val

* chlog
2021-10-11 09:13:33 +01:00
kingyiusuen 8740c801bb
Fix typo in _validate_scheduler_optimizer() (#9886) 2021-10-11 09:16:17 +02:00
Siddhartha c395766300
use ModuleNotFoundError instead of ImportError (#9867)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-11 05:57:37 +00:00
Franz Rieger 2f313285b9
Correct function name (#9859) 2021-10-09 06:41:05 +00:00
ananthsub 5206e52786
Add support for `torch.set_detect_anomaly` (#9848)
* Add support for `detect_anomaly`

* Update CHANGELOG.md
2021-10-07 16:03:56 +00:00
Rohit Gupta 4decbc0d95
Deprecate `dataloader_idx` from `on_train_batch_start/end` (#9816)
* deprecate hooks

* dep todo

* explicit

* Apply suggestions from code review

* Apply suggestions from code review

* code review

* base
2021-10-07 10:18:11 +00:00
Danielle Pintz 0561fd6925
Fix test_quantization with Pytorch 1.10 (#9808) 2021-10-07 08:54:06 +01:00
Rohit Gupta 8a8ecb8d01
Update the logic to check for accumulation steps with deepspeed (#9826)
* support_dict

* chlog

* fix test

* epochs
2021-10-06 17:50:10 +01:00
Rohit Gupta b303b4f895
Fix restoring training state during `trainer.fit` only (#9413)
* reload state on fit

* trainer.state

* add test

* chlog

* revert

* review

* review

* rev and amend

* fix test and logic

* update

* code review

* Apply suggestions from code review

* better assertions

* better assertions

* Apply suggestions from code review

* add loop test

* Apply suggestions from code review

* Split for typing

* review comments

* review comments

* use if_else

* code review

* code review

* code review

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Remove unnecessary pieces from the test

* move test

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-06 14:57:40 +00:00
Jirka Borovec b3e9dff32d
rename callback FineTune arg `round` (#9711)
* rename CB Tune arg round

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-06 09:39:36 +01:00
Kaushik B f94faa9cd3
Enable auto parameters tying for TPUs (#9525)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-06 10:16:44 +02:00
Elad Segal 86ad941d06
Fix missing arguments when saving hyperparams from parent class only (#9800)
* Fix missing arguments when saving hyperparams from parent class only

* fix antipattern
2021-10-06 08:32:29 +01:00
edwardpwtsoi 7c6efbc8a8
Resolved wrong mv usage for extracted directory (#9678)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-05 12:56:33 +00:00
Danielle Pintz 9e2347f8ff
Fix broken `test_is_picklable` with PT1.10 (#9810) 2021-10-05 14:22:21 +02:00
pre-commit-ci[bot] 9e621b451f
[pre-commit.ci] pre-commit suggestions (#9819) 2021-10-05 09:21:16 +02:00
Tobias 9317fbfc25
Make DDP and Horovod batch_size scaling examples explicit (#9813)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-05 09:12:26 +02:00
Danielle Pintz 3392215ef6
Fix broken `test_cpu_amp_precision_context_manager` (#9809)
* @RunIf(min_gpus=1)

* dtype -> fast_dtype
2021-10-04 12:14:13 +00:00
kingyiusuen 6d530373c0
Add warnings regarding unsupported keys in optim config and OneCycleLR (#9666)
* Add warnings regarding unsupported keys in optim config and OneCycleLR

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix docstring

* Update CHANGELOG.md

* Split  into two parts

* Use difference operator to find extra keys

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-04 08:25:05 +00:00
thomas chaton 5841ca9782
[Feat] Add auto_restart for fault tolerant training (#9722) 2021-10-01 16:37:17 +00:00
Carlos Mocholí 6ef4e5ac76
Remove return value from the backward closure (#9770) 2021-10-01 16:53:00 +02:00
Sean Naren 38f8029874
Fix Deepspeed and lightning calling scheduler (#9788) 2021-10-01 14:35:44 +00:00
Rohit Gupta 617e798f3b
Raise an exception if using `amp_level` with native `amp_backend` (#9755)
* add exception

* chlog

* code review

* Apply suggestions from code review

Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-01 14:27:05 +02:00
Rohit Gupta 9d982080df
Fix some flaky tests in tuner/lr_finder (#9766)
* update tests

* fix more tests
2021-10-01 11:15:16 +05:30
Adrian Wälchli ab207921b9
update changelog after 1.4.9 release (#9762) 2021-09-30 15:45:17 +02:00
Adrian Wälchli 9e11d97af6
Merge pull request #9690 from PyTorchLightning/feature/codeowners-rohit
add Rohit Gupta to default codeowners
2021-09-30 09:13:48 -04:00
Adrian Wälchli b054d21493
disable warnings summary for pytest (#9743) 2021-09-30 10:51:56 +02:00
ananthsub 0d3325ea20
Add support for `torch.use_deterministic_algorithms` (#9121)
* re-add changes

* Update test_data_parallel.py

* Update CHANGELOG.md

* Update test_legacy_checkpoints.py

* Update test_horovod.py

* Update test_horovod.py

* Update accelerator_connector.py

* update tests
2021-09-30 04:40:09 +00:00
Carlos Mocholí fb81e738fa
Refactor `grad_norm` function (#9742) 2021-09-30 02:54:08 +00:00
Carlos Mocholí 7f95fd04d7
Remove unnecessary `pytest.param` usage (#9760) 2021-09-30 02:42:11 +00:00
Sean Naren 8c9cb0c133
[3/n] add additional rich version check (#9757) 2021-09-29 17:24:51 +00:00
Sean Naren 0df3543137
[2/n] Fix rich model summary for tuples (#9756) 2021-09-29 17:13:27 +00:00
Carlos Mocholí 32003159f0
Remove legacy pytest markers (#9761) 2021-09-29 17:08:26 +00:00
Carlos Mocholí 19008ce98f
IPU hotfix for #9721 (#9759) 2021-09-29 15:36:39 +02:00
Carlos Mocholí 0ddd6a8c19
Remove `_NATIVE_AMP_AVAILABLE` checks (#9747) 2021-09-29 15:34:26 +02:00
Carlos Mocholí 44aed17aff
Remove duplicated native AMP + LBFGS check (#9748) 2021-09-29 13:14:03 +00:00
Carlos Mocholí 9ebfbbc349
Remove unused `post_optimizer_step` (#9746) 2021-09-29 13:09:22 +00:00