Commit Graph

1167 Commits

Author SHA1 Message Date
Adrian Wälchli d3e5a43546
Restrict setup methods to accept a single model (#10064)
2021-10-25 16:32:57 +00:00
manipopopo cfb2d87765
Disable quantization aware training observers (#8540)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2021-10-25 15:46:09 +00:00
Adrian Wälchli f8a7f3fde0
Add Yield loop example (#9983)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-25 14:26:36 +00:00
thomas chaton 454e93bace
Add support for init_meta_context, materialize_module (#9920)
2021-10-21 15:48:31 +01:00
Adrian Wälchli 4ea72a9365
Update setup logic in training type plugins (sharded) [4 / 4] (#10028)
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-21 10:35:01 +02:00
Kaushik B aa1540410f
Add XLACheckpointIO (#9972)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-21 02:39:16 +05:30
Adrian Wälchli d41902883a
Update `optimizer_step` methods in accelerator and plugins (#10023)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-20 21:36:27 +01:00
Rohit Gupta 1599c77d16
Fix `LearningRateMonitor` logging with multiple param groups optimizer with no scheduler (#10044)
2021-10-20 22:13:00 +05:30
Carlos Mocholí f0b3e0f4de
Default to `precision=bf16` on CPU when `precision=16` is passed (#10033)
2021-10-20 13:25:13 +00:00
Adrian Wälchli 2c16f1d6b9
remove dataloader patching on the LightningModule (#9764)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-20 15:23:20 +02:00
Carlos Mocholí ad8d6c83da
[CLI] Shorthand notation to instantiate datamodules (#10011)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-20 00:49:48 +00:00
Adrian Wälchli e0c83ee6df
Update `TPUSpawnPlugin` spawn methods (#10022)
2021-10-20 01:59:11 +02:00
Carlos Mocholí e44921ee21
Fix `self.log(on_epoch=True, reduce_fx=sum)` on_batch_start (#9791)
2021-10-20 01:56:37 +02:00
Carlos Mocholí d45897d522
Rename `TPUHalfPrecisionPlugin` to `TPUBf16PrecisionPlugin` (#10026)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-19 21:09:37 +00:00
Ning 0b68f2abf8
Remove `reset_train_val_dataloaders` from Trainer and move data reloading logic to loop (#9671)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-10-19 21:45:52 +02:00
Adrian Wälchli 3ea534754e
Update setup logic in training type plugins (deepspeed) [2 / n] (#10009)
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-19 18:23:11 +00:00
Carlos Mocholí e8beceb631
Add `TPUPrecisionPlugin` (#10020)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-19 17:48:57 +00:00
Adrian Wälchli 4aaca17fce
Update setup logic in training type plugins (data-parallel) [3 / n] (#10010)
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-19 19:47:36 +02:00
Adrian Wälchli 854bdc042d
Update setup logic in training type plugins [1 / n] (#9994)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-19 17:45:36 +02:00
Adrian Wälchli bcb94de90e
Add `DDPSpawnPlugin.spawn()` (#10018)
2021-10-19 14:34:47 +00:00
Rohit Gupta 0aa220b46b
Remove deprecated `distributed_backend` from `Trainer` (#10017)
* rm distributed_backend from Trainer

* unused

* chlog

* internal distributed_backend

* Docstring

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-19 13:54:37 +00:00
thomas chaton 86df7dcee7
Add KFold Loop example (#9965)
2021-10-18 16:27:12 +01:00
Adrian Wälchli a99b7440b5
Add unit tests for `pl.utilities.grads` (#9765)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-18 18:58:51 +05:30
Rohit Gupta 4dc32ad7db
Fix logic to check for spawn in worker_check (#9902)
* fix

* update tests

* chlog

* skip windows
2021-10-18 13:02:46 +00:00
Adrian Wälchli 10d0b41977
Introduce `PrecisionPlugin.forward_context()` (#9988)
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-18 12:58:19 +00:00
Carlos Mocholí c69a79c86f
Fix `self.log(on_epoch=True)` on_batch_start (#9780)
2021-10-18 14:02:16 +02:00
Elad Segal 8c76cf5ae1
reset val dataloader for binsearch (#9975)
2021-10-18 12:54:26 +02:00
ronif 7b4df7bf91
Fix issue with no-init dataclass fields in move_to_device (#9963)
Co-authored-by: ronif <ronif@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-17 07:10:47 +00:00
Mauricio Villegas 1f09cf2432
Fixed use of LightningCLI in computer_vision_fine_tuning.py example (#9934)
2021-10-16 17:04:02 +01:00
kingyiusuen 6429de8944
Add support for `len(datamodule)` (#9895)
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-15 14:19:50 +02:00
Danielle Pintz 16213b1635
Deprecate `log_gpu_memory`, `gpu_metrics`, and util funcs in favor of `DeviceStatsMonitor` callback (#9921)
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-14 22:45:44 +02:00
Oliver Borchert afbf703684
Single-process multi-node CPU training (#9603)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-14 22:21:41 +02:00
Danielle Pintz 6feda08109
Deprecate `GPUStatsMonitor` and `XLAStatsMonitor` in favor of `DeviceStatsMonitor` (#9924)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-14 15:52:45 +00:00
four4fish a002f872ea
[2/n] Directly call TrainingTypePlugin APIs instead of going through the Accelerator (#9901)
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-14 17:38:22 +02:00
Viraj Bagal 15698698c4
Log LR using LearningRateMonitor even when LR Scheduler is not defined. (#9786)
* LR logging works even with no lr scheduler, wrote few extra tests as well

* updated changelog

* modified code as suggested by DeepSource

* added helper functions

* opt with no scheduler

* rename

* chlog

* update test

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2021-10-14 13:28:19 +00:00
Danielle Pintz 940b910d27
[2/4] Add DeviceStatsMonitor callback (#9712)
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-10-13 18:29:36 +00:00
Kaushik B 05b15e63f0
Add `strategy` argument to Trainer (#8597)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-13 12:34:06 +00:00
ananthsub 28fc8d2016
Add `enable_model_summary` flag and deprecate `weights_summary` (#9699)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2021-10-13 17:20:54 +05:30
Kaushik B b1e215d036
Remove `should_rank_save_checkpoint` property from Trainer (#9433)
2021-10-13 11:36:24 +00:00
Rohit Gupta 0f8fd20443
Remove epoch from `trainer.logged_metrics` (#9904)
2021-10-13 11:30:27 +02:00
ananthsub 4610fddb19
Mark `Trainer.terminate_on_nan` protected and deprecate public property (#9849)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-12 20:23:22 +00:00
Chris Chow f14a47a0b2
guard against None in pytorch get_xla_supported_devices (#9572)
Co-authored-by: Chris Chow <cchow@nianticlabs.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-12 12:01:32 +00:00
Rohit Gupta f2b0db60f1
Raise a `MisconfigurationException` when trainer functions are called with `ckpt_path="best"` but `checkpoint_callback` isn't configured (#9841)
* add check

* chlog

* Apply suggestions from code review

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* Apply suggestions from code review

Co-authored-by: thomas chaton <thomas@grid.ai>

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-12 15:35:55 +05:30
Adrian Wälchli 64d1c46623
Update error message for interactive incompatible plugins (#9896)
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-10-12 15:10:49 +05:30
Sean Naren 6da5829e53
DeepSpeed support for device IDs (#9847)
2021-10-12 09:24:46 +00:00
ananthsub f16bfe9bdd
Mark `trainer.config_validator` as protected (#9779)
2021-10-12 09:29:05 +01:00
Rohit Gupta db322f4bbb
Deprecate `checkpoint_callback` from the `Trainer` constructor in favour of `enable_checkpointing` (#9754)
* enable_chekpointing

* update codebase

* chlog

* update tests

* fix warning

* Apply suggestions from code review

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Apply suggestions from code review

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>

* Apply suggestions from code review

Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-12 07:55:07 +00:00
Kaushik B c3aa6e9818
Prepare v1.5.0rc0 (#9893)
2021-10-11 20:36:01 +01:00
yopknopixx 173f4c8466
Deprecate `terminate_on_nan` Trainer argument in favor of `detect_anomaly` (#9175)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-11 17:17:43 +00:00
Ranuga-Disansa f915a8a283
Removed a redundant warning with `ModelCheckpoint(monitor=None)` callback (#9875)
* Update README.md

* Update README.md

* Create evaluation.py

* Update README.md

* Update evaluation.py

* Create evaluation.py

* Create evaluation.py

* Update evaluation.py

* Create nlp.py

* Update evaluation.py

* Create evaluation.py

* Update nlp.py

* Update nlp.py

* Update evaluation.py

* Create evaluation.py

* Update nlp.py

* Update nlp.py

* Update requirements.txt

* Update evaluation.py

* Create data_loader.py

* Update nlp.py

* Update evaluation.py

* Update data_loader.py

* Update nlp.py

* Update data_loader.py

* Update requirements.txt

* Update model_checkpoint.py

* Delete evaluation.py

* Delete data_loader.py

* Delete nlp.py

* Update requirements.txt

* Update model_checkpoint.py

* Update README.md

* Update pytorch_lightning/callbacks/model_checkpoint.py

* Update CHANGELOG.md

* Update test_model_checkpoint.py

* Update model_checkpoint.py

* update

* update

* chlog update

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-11 14:54:07 +00:00