Rohit Gupta
93266e2c22
Avoid deprecated warnings from accelerator and checkpoint connector ( #10142 )
2021-10-26 14:10:30 +02:00
Danielle Pintz
a5235d5b01
Remove `model_connector.py` ( #10111 )
2021-10-26 11:52:14 +02:00
Rohit Gupta
34d5980df6
Raise `MisconfigurationException` if `trainer.eval` is missing required methods ( #10016 )
2021-10-25 23:12:08 -07:00
Danielle Pintz
13d6d7bad1
Remove `optimizer_connector.py` ( #10120 )
2021-10-26 00:52:43 +00:00
Eric Wiener
0e20119d24
Change default value of the `max_steps` Trainer argument from `None` to `-1` ( #9460 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2021-10-25 20:21:33 +00:00
Danielle Pintz
1f7bd6650c
Mark accelerator connector as protected ( #10032 )
2021-10-25 19:24:54 +00:00
jjenniferdai
6d79184ec5
Unify checkpoint load paths [redo #9693 ] ( #10061 )
2021-10-25 19:05:31 +00:00
Adrian Wälchli
76081fb846
Mark SLURM detection methods in `AcceleratorConnector` as protected ( #10101 )
...
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-10-25 17:52:15 +00:00
Carlos Mocholí
b376799430
Minor fixes related to clipping ( #10130 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-25 16:40:22 +00:00
Adrian Wälchli
aff80477b7
Remove dead code in accelerator connector ( #10100 )
...
* remove dead code in accelerator connector
* remove slurm "fake_slurm_managing_tasks" dead code
2021-10-25 13:37:40 +00:00
Danielle Pintz
e94dcf6936
Mark `trainer.data_connector` as protected ( #10031 )
...
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-25 12:29:09 +01:00
thomas chaton
454e93bace
Add support for init_meta_context, materialize_module ( #9920 )
2021-10-21 15:48:31 +01:00
jjenniferdai
2d9db211b5
Revert "Support serialized checkpoint loading ( #9605 )" ( #10057 )
...
This reverts commit f0e6f1b58a.
2021-10-21 02:51:22 +02:00
Kaushik B
56bc55db71
Update strategy flag in docs ( #10000 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-20 21:02:53 +05:30
Carlos Mocholí
f0b3e0f4de
Default to `precision=bf16` on CPU when `precision=16` is passed ( #10033 )
2021-10-20 13:25:13 +00:00
Adrian Wälchli
2c16f1d6b9
remove dataloader patching on the LightningModule ( #9764 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-20 15:23:20 +02:00
jjenniferdai
f0e6f1b58a
Support serialized checkpoint loading ( #9605 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-20 09:38:35 +01:00
Carlos Mocholí
53c62f63e8
Constrain IPU precision choices ( #10030 )
2021-10-20 00:52:01 +00:00
Carlos Mocholí
e44921ee21
Fix `self.log(on_epoch=True, reduce_fx=sum)` on_batch_start ( #9791 )
2021-10-20 01:56:37 +02:00
Carlos Mocholí
d45897d522
Rename `TPUHalfPrecisionPlugin` to `TPUBf16PrecisionPlugin` ( #10026 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-19 21:09:37 +00:00
Ning
0b68f2abf8
Remove `reset_train_val_dataloaders` from Trainer and move data reloading logic to loop ( #9671 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-10-19 21:45:52 +02:00
Carlos Mocholí
e8beceb631
Add `TPUPrecisionPlugin` ( #10020 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-19 17:48:57 +00:00
Adrian Wälchli
854bdc042d
Update setup logic in training type plugins [1 / n] ( #9994 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-19 17:45:36 +02:00
Rohit Gupta
0aa220b46b
Remove deprecated `distributed_backend` from `Trainer` ( #10017 )
...
* rm distributed_backend from Trainer
* unused
* chlog
* internal distributed_backend
* Docstring
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-19 13:54:37 +00:00
Danielle Pintz
83ce1bf515
Make `verify_loop_configurations` a utility function ( #9976 )
2021-10-18 23:52:45 +00:00
Danielle Pintz
203737bfce
Don't raise DeprecationWarning for `LoggerConnector.gpus_metrics` ( #9959 )
2021-10-18 22:51:09 +00:00
thomas chaton
86df7dcee7
Add KFold Loop example ( #9965 )
2021-10-18 16:27:12 +01:00
Adrian Wälchli
a99b7440b5
Add unit tests for `pl.utilities.grads` ( #9765 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-18 18:58:51 +05:30
Rohit Gupta
4dc32ad7db
Fix logic to check for spawn in worker_check ( #9902 )
...
* fix
* update tests
* chlog
* skip windows
2021-10-18 13:02:46 +00:00
Carlos Mocholí
e0470cc244
Update `resume_from_checkpoint` docs ( #9952 )
2021-10-18 17:40:47 +05:30
Carlos Mocholí
c69a79c86f
Fix `self.log(on_epoch=True)` on_batch_start ( #9780 )
2021-10-18 14:02:16 +02:00
Carlos Mocholí
01b304ec57
Update accelerator connector messages after the addition of strategy ( #9937 )
2021-10-18 01:10:48 +00:00
Carlos Mocholí
e5dfdf34f9
Avoid deprecation warning after #9901 ( #9951 )
2021-10-16 17:36:25 +01:00
Carlos Mocholí
db4e770004
Validate the precision input earlier ( #9763 )
2021-10-15 17:30:00 +00:00
Danielle Pintz
16213b1635
Deprecate `log_gpu_memory`, `gpu_metrics`, and util funcs in favor of `DeviceStatsMonitor` callback ( #9921 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-14 22:45:44 +02:00
Oliver Borchert
afbf703684
Single-process multi-node CPU training ( #9603 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-14 22:21:41 +02:00
four4fish
a002f872ea
[2/n] Directly call TrainingTypePlugin APIs instead of going through the Accelerator ( #9901 )
...
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-14 17:38:22 +02:00
Rohit Gupta
23e8b59ae7
Add `configure_gradient_clipping` hook in `LightningModule` ( #9584 )
...
* init hook
* docs
* dep train args
* update tests
* doc
* doc
* .gitignore
* not dep
* add trainer args
* add & update tests
* fix tests
* pre-commit
* docs
* add docs
* add exception
* code review
* deepspeed
* update tests
* not
* try fix
* Apply suggestions from code review
* update deepspeed
* disable some tests
* disable some tests
* enable all tests
2021-10-13 20:15:13 +05:30
Kaushik B
05b15e63f0
Add `strategy` argument to Trainer ( #8597 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-13 12:34:06 +00:00
ananthsub
28fc8d2016
Add `enable_model_summary` flag and deprecate `weights_summary` ( #9699 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2021-10-13 17:20:54 +05:30
Kaushik B
b1e215d036
Remove `should_rank_save_checkpoint` property from Trainer ( #9433 )
2021-10-13 11:36:24 +00:00
Rohit Gupta
0f8fd20443
Remove epoch from `trainer.logged_metrics` ( #9904 )
2021-10-13 11:30:27 +02:00
ananthsub
4610fddb19
Mark `Trainer.terminate_on_nan` protected and deprecate public property ( #9849 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-12 20:23:22 +00:00
Danielle Pintz
dd6d797e0e
Remove type error handling in _configure_checkpoint_callbacks ( #9823 )
...
* remove type error handling in _configure_checkpoint_callbacks
* rm test
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-12 20:13:02 +00:00
Rohit Gupta
f2b0db60f1
Raise a `MisconfigurationException` when trainer functions are called with `ckpt_path="best"` but `checkpoint_callback` isn't configured ( #9841 )
...
* add check
* chlog
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Apply suggestions from code review
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-10-12 15:35:55 +05:30
Adrian Wälchli
64d1c46623
Update error message for interactive incompatible plugins ( #9896 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-10-12 15:10:49 +05:30
ananthsub
f16bfe9bdd
Mark `trainer.config_validator` as protected ( #9779 )
2021-10-12 09:29:05 +01:00
Rohit Gupta
db322f4bbb
Deprecate `checkpoint_callback` from the `Trainer` constructor in favour of `enable_checkpointing` ( #9754 )
...
* enable_checkpointing
* update codebase
* chlog
* update tests
* fix warning
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-12 07:55:07 +00:00
yopknopixx
173f4c8466
Deprecate `terminate_on_nan` Trainer argument in favor of `detect_anomaly` ( #9175 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-11 17:17:43 +00:00
Rohit Gupta
46fa703853
disable_logger ( #9837 )
2021-10-11 16:36:59 +05:30