Adrian Wälchli
d3e5a43546
Restrict setup methods to accept a single model ( #10064 )
2021-10-25 16:32:57 +00:00
manipopopo
cfb2d87765
Disable quantization aware training observers ( #8540 )
...
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2021-10-25 15:46:09 +00:00
Adrian Wälchli
f8a7f3fde0
Add Yield loop example ( #9983 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-25 14:26:36 +00:00
thomas chaton
454e93bace
Add support for init_meta_context, materialize_module ( #9920 )
2021-10-21 15:48:31 +01:00
Adrian Wälchli
4ea72a9365
Update setup logic in training type plugins (sharded) [4 / 4] ( #10028 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-21 10:35:01 +02:00
Kaushik B
aa1540410f
Add XLACheckpointIO ( #9972 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-21 02:39:16 +05:30
Adrian Wälchli
d41902883a
Update `optimizer_step` methods in accelerator and plugins ( #10023 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-20 21:36:27 +01:00
Rohit Gupta
1599c77d16
Fix `LearningRateMonitor` logging with multiple param groups optimizer with no scheduler ( #10044 )
2021-10-20 22:13:00 +05:30
Carlos Mocholí
f0b3e0f4de
Default to `precision=bf16` on CPU when `precision=16` is passed ( #10033 )
2021-10-20 13:25:13 +00:00
Adrian Wälchli
2c16f1d6b9
Remove dataloader patching on the LightningModule ( #9764 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-20 15:23:20 +02:00
Carlos Mocholí
ad8d6c83da
[CLI] Shorthand notation to instantiate datamodules ( #10011 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-20 00:49:48 +00:00
Adrian Wälchli
e0c83ee6df
Update `TPUSpawnPlugin` spawn methods ( #10022 )
2021-10-20 01:59:11 +02:00
Carlos Mocholí
e44921ee21
Fix `self.log(on_epoch=True, reduce_fx=sum)` on_batch_start ( #9791 )
2021-10-20 01:56:37 +02:00
Carlos Mocholí
d45897d522
Rename `TPUHalfPrecisionPlugin` to `TPUBf16PrecisionPlugin` ( #10026 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-19 21:09:37 +00:00
Ning
0b68f2abf8
Remove `reset_train_val_dataloaders` from Trainer and move data reloading logic to loop ( #9671 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-10-19 21:45:52 +02:00
Adrian Wälchli
3ea534754e
Update setup logic in training type plugins (deepspeed) [2 / n] ( #10009 )
...
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-19 18:23:11 +00:00
Carlos Mocholí
e8beceb631
Add `TPUPrecisionPlugin` ( #10020 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-19 17:48:57 +00:00
Adrian Wälchli
4aaca17fce
Update setup logic in training type plugins (data-parallel) [3 / n] ( #10010 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-19 19:47:36 +02:00
Adrian Wälchli
854bdc042d
Update setup logic in training type plugins [1 / n] ( #9994 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-19 17:45:36 +02:00
Adrian Wälchli
bcb94de90e
Add `DDPSpawnPlugin.spawn()` ( #10018 )
2021-10-19 14:34:47 +00:00
Rohit Gupta
0aa220b46b
Remove deprecated `distributed_backend` from `Trainer` ( #10017 )
...
* rm distributed_backend from Trainer
* unused
* chlog
* internal distributed_backend
* Docstring
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-19 13:54:37 +00:00
thomas chaton
86df7dcee7
Add KFold Loop example ( #9965 )
2021-10-18 16:27:12 +01:00
Adrian Wälchli
a99b7440b5
Add unit tests for `pl.utilities.grads` ( #9765 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-18 18:58:51 +05:30
Rohit Gupta
4dc32ad7db
Fix logic to check for spawn in worker_check ( #9902 )
...
* fix
* update tests
* chlog
* skip windows
2021-10-18 13:02:46 +00:00
Adrian Wälchli
10d0b41977
Introduce `PrecisionPlugin.forward_context()` ( #9988 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-18 12:58:19 +00:00
Carlos Mocholí
c69a79c86f
Fix `self.log(on_epoch=True)` on_batch_start ( #9780 )
2021-10-18 14:02:16 +02:00
Elad Segal
8c76cf5ae1
Reset val dataloader for binsearch ( #9975 )
2021-10-18 12:54:26 +02:00
ronif
7b4df7bf91
Fix issue with no-init dataclass fields in move_to_device ( #9963 )
...
Co-authored-by: ronif <ronif@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-17 07:10:47 +00:00
Mauricio Villegas
1f09cf2432
Fixed use of LightningCLI in computer_vision_fine_tuning.py example ( #9934 )
2021-10-16 17:04:02 +01:00
kingyiusuen
6429de8944
Add support for `len(datamodule)` ( #9895 )
...
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-15 14:19:50 +02:00
Danielle Pintz
16213b1635
Deprecate `log_gpu_memory`, `gpu_metrics`, and util funcs in favor of `DeviceStatsMonitor` callback ( #9921 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-14 22:45:44 +02:00
Oliver Borchert
afbf703684
Single-process multi-node CPU training ( #9603 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-14 22:21:41 +02:00
Danielle Pintz
6feda08109
Deprecate `GPUStatsMonitor` and `XLAStatsMonitor` in favor of `DeviceStatsMonitor` ( #9924 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-14 15:52:45 +00:00
four4fish
a002f872ea
[2/n] Directly call TrainingTypePlugin APIs instead of going through the Accelerator ( #9901 )
...
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-14 17:38:22 +02:00
Viraj Bagal
15698698c4
Log LR using LearningRateMonitor even when LR Scheduler is not defined. ( #9786 )
...
* LR logging works even with no lr scheduler, wrote few extra tests as well
* updated changelog
* modified code as suggested by DeepSource
* added helper functions
* opt with no scheduler
* rename
* chlog
* update test
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2021-10-14 13:28:19 +00:00
Danielle Pintz
940b910d27
[2/4] Add DeviceStatsMonitor callback ( #9712 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-10-13 18:29:36 +00:00
Kaushik B
05b15e63f0
Add `strategy` argument to Trainer ( #8597 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-13 12:34:06 +00:00
ananthsub
28fc8d2016
Add `enable_model_summary` flag and deprecate `weights_summary` ( #9699 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2021-10-13 17:20:54 +05:30
Kaushik B
b1e215d036
Remove `should_rank_save_checkpoint` property from Trainer ( #9433 )
2021-10-13 11:36:24 +00:00
Rohit Gupta
0f8fd20443
Remove epoch from `trainer.logged_metrics` ( #9904 )
2021-10-13 11:30:27 +02:00
ananthsub
4610fddb19
Mark `Trainer.terminate_on_nan` protected and deprecate public property ( #9849 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-12 20:23:22 +00:00
Chris Chow
f14a47a0b2
Guard against None in PyTorch `get_xla_supported_devices` ( #9572 )
...
Co-authored-by: Chris Chow <cchow@nianticlabs.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-12 12:01:32 +00:00
Rohit Gupta
f2b0db60f1
Raise a `MisconfigurationException` when trainer functions are called with `ckpt_path="best"` but `checkpoint_callback` isn't configured ( #9841 )
...
* add check
* chlog
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Apply suggestions from code review
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-12 15:35:55 +05:30
Adrian Wälchli
64d1c46623
Update error message for interactive incompatible plugins ( #9896 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-10-12 15:10:49 +05:30
Sean Naren
6da5829e53
DeepSpeed support for device IDs ( #9847 )
2021-10-12 09:24:46 +00:00
ananthsub
f16bfe9bdd
Mark `trainer.config_validator` as protected ( #9779 )
2021-10-12 09:29:05 +01:00
Rohit Gupta
db322f4bbb
Deprecate `checkpoint_callback` from the `Trainer` constructor in favour of `enable_checkpointing` ( #9754 )
...
* enable_checkpointing
* update codebase
* chlog
* update tests
* fix warning
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-12 07:55:07 +00:00
Kaushik B
c3aa6e9818
Prepare v1.5.0rc0 ( #9893 )
2021-10-11 20:36:01 +01:00
yopknopixx
173f4c8466
Deprecate `terminate_on_nan` Trainer argument in favor of `detect_anomaly` ( #9175 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-11 17:17:43 +00:00
Ranuga-Disansa
f915a8a283
Removed a redundant warning with `ModelCheckpoint(monitor=None)` callback ( #9875 )
...
* Update README.md
* Update README.md
* Create evaluation.py
* Update README.md
* Update evaluation.py
* Create evaluation.py
* Create evaluation.py
* Update evaluation.py
* Create nlp.py
* Update evaluation.py
* Create evaluation.py
* Update nlp.py
* Update nlp.py
* Update evaluation.py
* Create evaluation.py
* Update nlp.py
* Update nlp.py
* Update requirements.txt
* Update evaluation.py
* Create data_loader.py
* Update nlp.py
* Update evaluation.py
* Update data_loader.py
* Update nlp.py
* Update data_loader.py
* Update requirements.txt
* Update model_checkpoint.py
* Delete evaluation.py
* Delete data_loader.py
* Delete nlp.py
* Update requirements.txt
* Update model_checkpoint.py
* Update README.md
* Update pytorch_lightning/callbacks/model_checkpoint.py
* Update CHANGELOG.md
* Update test_model_checkpoint.py
* Update model_checkpoint.py
* update
* update
* chlog update
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-11 14:54:07 +00:00