jjenniferdai
d4a4b77906
[3/3] Update lightning callbacks to `Stateful`, deprecations for old `on_save/load_checkpoint` signatures ( #11887 )
2022-03-25 00:06:10 +00:00
Rohit Gupta
5b342f14a6
fix to avoid common hook warning if no hook is overridden ( #12131 )
2022-02-28 18:07:05 +05:30
Rohit Gupta
5d2d9b09df
Avoid patching common `DataHooks` to the `LightningModule` ( #10603 )
2022-02-25 09:26:59 +01:00
Carlos Mocholí
789fae828d
Fix `current_epoch` value on training end ( #8578 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-02-10 17:55:59 +01:00
Rohit Gupta
400201712f
added warning for distributedsampler in case of evaluation ( #11479 )
2022-02-03 18:42:13 +00:00
Rohit Gupta
7948ed703d
Avoid enforcing `shuffle=False` for eval dataloaders ( #11575 )
2022-02-03 09:35:31 +00:00
Krishna Kalyan
6586dd23b7
Mark `CheckpointConnector` as protected ( #11550 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-03 02:26:08 +00:00
Maaz Karim
16a04b29eb
Mark SignalConnector as protected ( #11513 )
...
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2022-01-20 08:39:59 +01:00
jjenniferdai
4b5761539e
Remove `hpc_save` ( #11101 )
2022-01-03 12:23:13 +00:00
jjenniferdai
31f39c9578
Move `CheckpointConnector.fault_tolerant_auto_save_path` out of `CheckpointConnector.hpc_resume_path` ( #11092 )
2021-12-21 02:24:01 +01:00
jjenniferdai
6e21dd3767
Deprecate `on_hpc_{save/load}` hooks ( #10911 )
...
* first commit
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update pr #
* test filterwarnings
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add a todo comment
* updates
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update pytorch_lightning/core/saving.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Update pytorch_lightning/core/saving.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* model --> LightningModule Update pytorch_lightning/core/saving.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* model --> LightningModule Update pytorch_lightning/core/saving.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-12-08 14:56:15 -08:00
four4fish
629ca09e09
fix TypeError cause failure in singal_connector teardown ( #10961 )
2021-12-06 21:48:31 +00:00
Danielle Pintz
6043179931
Re-design `call_hook` interface ( #10575 )
2021-12-04 16:39:55 -05:00
Mauricio Villegas
f3b0a06e90
Fix `SignalConnector._has_already_handler` check for callable type ( #10483 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-30 22:47:52 +00:00
Adrian Wälchli
25473acddb
Restore signals on teardown ( #10611 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-30 22:07:14 +00:00
thomas chaton
412d507a73
Fault Tolerant: move signal to SIGTERM ( #10605 )
2021-11-26 13:37:27 +00:00
thomas chaton
7d3ad5b76e
Don't register signal in thread ( #10610 )
2021-11-19 04:13:35 +01:00
Adrian Wälchli
0f6d89422b
Control automatic resubmission on SLURM ( #10601 )
2021-11-18 17:48:53 +00:00
Carlos Mocholí
dcafc95f2b
Avoid deprecated `progress_bar_refresh_rate` usage ( #10520 )
...
Co-authored-by: Danielle Pintz <38207072+daniellepintz@users.noreply.github.com>
2021-11-15 22:04:48 +01:00
Carlos Mocholí
7a9a08c5d3
Drop torch 1.6 testing ( #10390 )
...
* Drop torch 1.6 support
* Drop 1.6 support
* Update CHANGELOG
* Fixes
* Split change
* Undo change
* 1.7 -> 1.7.1
https://github.com/pytorch/pytorch/issues/47354
* Force trigger nightly
* Update .github/workflows/events-nightly.yml
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
* Revert 1.7.1 change - try wildcard
* Update adjust versions and test it
* Undo test changes
* Revert "Undo test changes"
This reverts commit 3a6acadd11.
* Update CHANGELOG.md
Co-authored-by: Aki Nitta <nitta@akihironitta.com>
2021-11-13 20:35:03 +00:00
Ross Johnstone
c2f25d42ab
Make `monitor` required arg of EarlyStopping callback ( #10328 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-09 18:08:03 +00:00
Kaushik B
45c45dc7b0
Deprecate `ProgressBar` and rename it to `TQDMProgressBar` ( #10134 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-11-01 11:42:21 +00:00
thomas chaton
255e3edc98
resolve failing test ( #10191 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-28 15:27:03 +00:00
jjenniferdai
6d79184ec5
Unify checkpoint load paths [redo #9693 ] ( #10061 )
2021-10-25 19:05:31 +00:00
jjenniferdai
2d9db211b5
Revert "Support serialized checkpoint loading ( #9605 )" ( #10057 )
...
This reverts commit f0e6f1b58a.
2021-10-21 02:51:22 +02:00
Adrian Wälchli
2c16f1d6b9
remove dataloader patching on the LightningModule ( #9764 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-20 15:23:20 +02:00
jjenniferdai
f0e6f1b58a
Support serialized checkpoint loading ( #9605 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-20 09:38:35 +01:00
ananthsub
28fc8d2016
Add `enable_model_summary` flag and deprecate `weights_summary` ( #9699 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2021-10-13 17:20:54 +05:30
Rohit Gupta
db322f4bbb
Deprecate `checkpoint_callback` from the `Trainer` constructor in favour of `enable_checkpointing` ( #9754 )
...
* enable_chekpointing
* update codebase
* chlog
* update tests
* fix warning
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Apply suggestions from code review
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-12 07:55:07 +00:00
Rohit Gupta
b303b4f895
Fix restoring training state during `trainer.fit` only ( #9413 )
...
* reload state on fit
* trainer.state
* add test
* chlog
* revert
* review
* review
* rev and ammend
* fix test and logic
* update
* code review
* Apply suggestions from code review
* better assertions
* better assertions
* Apply suggestions from code review
* add loop test
* Apply suggestions from code review
* Split for typing
* review comments
* review comments
* use if_else
* code review
* code review
* code review
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Remove unnecessary pieces from the test
* move test
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-06 14:57:40 +00:00
thomas chaton
5841ca9782
[Feat] Add auto_restart for fault tolerant training ( #9722 )
2021-10-01 16:37:17 +00:00
Danielle Pintz
b3a5c7f442
Add `enable_progress_bar` to Trainer constructor ( #9664 )
2021-09-24 22:53:31 -07:00
Rohit Gupta
8fcdcb598b
Fix `accumulate_grad_batches` on init ( #9652 )
...
* fix accumuate_grad_batches on init
* chlog
* update error
* move to callback connector
* add test with callback
* fix tests
* Update pytorch_lightning/trainer/connectors/callback_connector.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* update ipu logic
* rev
* rev
* rev
* pls work
* code review
Co-authored-by: Rohit Gupta <goku@rmac.local>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-24 18:51:54 +00:00
thomas chaton
c7451b3ccf
[Feat] Add graceful detection of signal to exit + SignalConnector and merge SlurmConnector. ( #9566 )
...
Co-authored-by: Sean Naren <sean@grid.ai>
2021-09-17 19:13:59 +00:00
Kaushik B
d773407e59
feat: Add ModelSummary Callback ( #9344 )
...
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-10 12:42:42 +00:00
Jirka Borovec
6e124e7207
CI: precommit - docformatter ( #8584 )
...
* CI: precommit - docformatter
* fix deprecated
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 12:49:09 +00:00
Adrian Wälchli
b9443a07b9
[2 / 3] improvements to saving and loading callback state ( #7187 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-08-24 17:35:19 +00:00
Carlos Mocholí
e1442d247e
Always use `trainer.call_hook` ( #8498 )
2021-08-20 18:22:03 +02:00
Carlos Mocholí
ed13040729
Connect the model to the training type plugin at the start of run ( #8536 )
2021-08-04 17:43:34 +02:00
Adrian Wälchli
8c27fa71fa
[1 / 3] improvements to saving and loading callback state ( #6886 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-29 00:12:32 +02:00
Carlos Mocholí
a64cc37394
Replace `yapf` with `black` ( #7783 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 13:37:35 +02:00
Adrian Wälchli
6b7b40473b
deprecate hpc_load() and integrate it with restore() ( #7955 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-06-14 12:20:01 +00:00
Carlos Mocholí
3df02b880a
Add checkpoint parameter to on_save_checkpoint ( #6072 )
...
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-02-25 21:18:19 +05:30
Adrian Wälchli
b8619a695f
new LightningModule hook "configure_callbacks" ( #5621 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-12 19:27:44 -05:00
Jirka Borovec
a0f7831278
fix miss-leading imports in tests ( #5873 )
...
* fix imorts
* .
2021-02-09 05:10:52 -05:00
Jirka Borovec
f83cca6107
formatting flake8 & isort ( #5824 )
...
* formatting
* isort
* make
* yapf
* isort
2021-02-05 18:33:12 -05:00
Adrian Wälchli
9555043a29
Force ModelCheckpoint callback to run last ( #5731 )
2021-02-03 16:40:57 -05:00