Arnaud Gelas
2373858b33
Fix pre-commit isort failure on tests/checkpointing/*.py ( #5427 )
...
* Remove tests.checkpointing from skipped module in pyproject.toml
* Fix pre-commit isort failure on tests/checkpointing/*.py
2021-01-12 03:31:51 -05:00
Alan Du
f6dc354349
Throw MisconfigurationError on unknown mode ( #5255 )
...
* Throw MisconfigurationError on unknown mode
* Add tests
* Add match condition for deprecation message
2021-01-12 02:31:26 -05:00
Jirka Borovec
059f4630c8
prune check on Trainer fit result ( #5453 )
...
* prune check on Trainer fit result
* flake8
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* .
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-01-11 19:36:48 -05:00
Jirka Borovec
beb8cacf1c
fix formatting - flake8 + isort
2021-01-06 21:31:48 +01:00
Carlos Mocholí
3ee3c42035
Prepare 1.1.3 release ( #5365 )
...
* Prepare 1.1.3 release
* Fix flake8 error
* suppress
* Remove 1.1.4 section
* Add missing commits to CHANGELOG
* Update PR template
* Add missing commit
* fix
* Update CHANGELOG.md
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
(cherry picked from commit 4d9db866a1)
2021-01-06 15:17:27 +01:00
Jirka Borovec
9610ea817b
refactor imports of logger dependencies ( #4860 )
...
* refactor imports of logger dependencies
* fix
* fix
* fix
* name
* fix
* mocks
* fix tests
* fix mlflow
* fix test tube
* fix wandb import check
* whitespace
* name
* name
* hack
* hack
* rev
* fix
* update mlflow import check
* try without installing conda dep
* .
* .
* .
* .
* .
* .
* .
* .
* .
Co-authored-by: Adrian Wälchli <adrian.waelchli@inf.unibe.ch>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
(cherry picked from commit ec0fb7a3ec)
2021-01-06 15:16:06 +01:00
chaton
56437e98a6
[bug-fix] Trainer.test points to latest best_model_path ( #5161 )
...
* resolve bug
* update code
* add set -e
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* update test
* Update tests/checkpointing/test_trainer_checkpoint.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* Update tests/checkpointing/test_trainer_checkpoint.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* update on comments
* resolve test
* convert to set
* update
* add error triggering
* update
* update on comments
* update
* resolve import
* update
* update
* Update pytorch_lightning/plugins/rpc_plugin.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
(cherry picked from commit d5b367871f)
2021-01-06 15:14:10 +01:00
Rohit Gupta
9cfbf8d609
Disable checkpointing, earlystopping and logging with fast_dev_run ( #5277 )
...
* Disable checkpointing, earlystopping and logger with fast_dev_run
* docs
* chlog
* disable callbacks and enable DummyLogger
* add log
* use dummy logger method
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
(cherry picked from commit f740245521)
2021-01-06 12:57:24 +01:00
Rohit Gupta
81e9d4260e
Fix saved filename in ModelCheckpoint if it already exists ( #4861 )
...
* disable version if not required
* disable version if not required
* pep
* chlog
* improve test
* improve test
* parametrize test and update del_list
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* try appending version to already saved ckpt_file
* Revert "try appending version to already saved ckpt_file"
This reverts commit 710e05e01f738d982aabf1f36c09fa59293e5c0c.
* add more assertions
* use BoringModel
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2021-01-05 09:57:37 +01:00
Jirka Borovec
b72ed71d4e
Refactor: clean trainer device & distrib setters ( #5297 )
...
* naive replace
* simplify
* clean
* .
* fix
* .
* fix
* fix
2021-01-04 17:10:13 +00:00
Jirka Borovec
af833f673c
drop deprecated TrainResult ( #5323 )
...
* drop TrainResult
* .
* .
* .
* .
* .
* .
2021-01-04 09:54:21 +08:00
Jirka Borovec
fb90eec515
drop deprecated checkpoint filepath ( #5321 )
...
* drop deprecated checkpoint filepath
* tests
2021-01-02 00:08:29 +01:00
Jirka Borovec
35fd6e93c7
refactor - check E501 ( #5200 )
2020-12-21 14:23:09 +05:30
Carlos Mocholí
398f122a42
Improve some tests ( #5049 )
...
* Improve some tests
* Add TrainerState asserts
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-12-13 23:04:16 +08:00
Jirka Borovec
05f25f3a54
update usage of deprecated checkpoint_callback ( #5006 )
...
* drop usage of deprecated checkpoint_callback
* fix
* fix
2020-12-09 14:14:34 -05:00
Jan-Henrik Lambrechts
b00991efd8
Added changeable extension variable for model checkpoints ( #4977 )
...
* Added changeable extension variable for model checkpoints
* Removed whitespace
* Removed the last bit of whitespace
* Wrote tests for FILE_EXTENSION
* Fixed formatting issues
* More formatting issues
* Simplify test by just using defaults
* Formatting to PEP8
* Added dummy class that inherits ModelCheckpoint; run only one batch instead of epoch for integration test
* Fixed too much whitespace formatting
* some changes
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2020-12-06 22:58:50 +05:30
chaton
c2e6e68c7e
optimizer clean up ( #4658 )
...
* add LightningOptimizer
* typo
* add mock closure
* typo
* remove logic in optimizer_step
* update
* update
* update
* deactivate LightningOptimizer for horovod
* resolve flake
* typo
* check optimizer name
* change name
* added backward to LightningOptimizer
* remove use_lightning_optimizer
* move update
* simplify init
* resolve comments
* resolve bug
* update
* update
* resolve bugs
* resolve flake8
* set state
* work manual_optimizer_step
* add doc
* add enable_pl_optimizer
* make optimizer_step
* add make_optimizer_step
* add examples
* resolve test
* add test_optimizer_return_options_enable_pl_optimizer
* add enable_pl_optimizer=True
* update
* update tests
* resolve bugs
* update
* set Trainer to False
* update
* resolve bugs
* update
* remove from doc
* resolve bug
* typo
* update
* set to True
* simplification
* typo
* resolve horovod
* unwrap horovod
* remove Optimizer
* resolve horovod
* move logic to amp_backend
* doesn't seem to be picklable
* update
* add again
* resolve some bugs
* cleanup
* resolve bug with AMP
* change __repr__
* round at -12
* update
* update
* update
* remove from horovod
* typo
* add convert_to_lightning_optimizers in each accelerators
* typo
* forgot
* forgot a convert_to_lightning_optimizers
* update
* update
* update
* increase coverage
* update
* resolve flake8
* update
* remove useless code
* resolve comments + add support for LightningOptimizer base class
* resolve flake
* check optimizer get wrapped back
* resolve DDPSharded
* reduce code
* lightningoptimizer
* Update pytorch_lightning/core/optimizer.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Update pytorch_lightning/core/lightning.py
* remove reference to step function
* Apply suggestions from code review
* update on comments
* resolve
* Update CHANGELOG.md
* add back training_step in apex and native_amp
* rename optimizer_step
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-01 00:09:46 +00:00
Jeff Yang
7d96fd1168
[tests/checkpointing] refactor with BoringModel ( #4661 )
...
* [tests/checkpointing] refactor with BoringModel
* [tests/checkpointing] refactor with BoringModel
* [tests/checkpointing] refactor with BoringModel
* LessBoringModel -> LogInTwoMethods
* LessBoringModel -> LogInTwoMethods
* LessBoringModel -> TrainingStepCalled
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Ananya Harsh Jha <ananya@pytorchlightning.ai>
2020-11-24 01:23:12 +01:00
Roger Shieh
42e59c6add
Cast hparams to dict when not using omegaconf ( #4770 )
...
* init fix
* init test
* more specific dict assert
* update changelog
* Update tests/checkpointing/test_model_checkpoint.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-20 19:53:05 +08:00
Carlos Mocholí
396a46f55f
Add current_score to ModelCheckpoint.on_save_checkpoint ( #4721 )
...
* Add current_score to ModelCheckpoint.on_save_checkpoint
* Update CHANGELOG
[ci skip]
* fix
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* fix2
* Add test for NaN
* Fix failing tests
* Simplify line
* Add test docstrings
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-11-18 08:09:44 +00:00
Jirka Borovec
e1955e3c89
isolate PL debugger in tests ( #4643 )
...
* isolate PL debugger in tests
* miss
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-11-14 11:22:56 +00:00
Kai Zhang
30ad3e2ad3
Replace a MisconfigurationException with warning in ModelCheckpoint callback ( #4560 )
...
* replace MisconfigurationException with warning
* update test
* check raising UserWarning
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2020-11-10 10:44:43 +01:00
Rohit Gupta
ad2556b669
Disable saving checkpoints if not trained ( #4372 )
...
* Disable saving checkpoints if not trained
* chlog
* update test
* fix
Co-authored-by: chaton <thomas@grid.ai>
2020-11-03 11:38:32 +05:30
Jirka Borovec
ef03c39ab7
Add step index in checkpoint name ( #3807 )
...
* true final value of global step
* ch check
* tests
* save each validation interval
* wip
* add test
* add test
* wip
* fix tests, revert old edits, fix merge conflicts, update doctests
* test + bugfix
* sort files
* format test
* suggestion by ananth
* added changelog
* naming
* docs
* example
* suggestion
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* fix test
* pep
* pep
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2020-11-02 15:05:58 +01:00
Adrian Wälchli
6ae4c6ec85
update docs on checkpoint_callback Trainer argument ( #4461 )
...
* docs update
* update callbacks docs
* docs
* notebook examples
* warning
* line length
* update deprecation
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Roger Shieh <55400948+s-rog@users.noreply.github.com>
2020-11-02 06:18:20 +01:00
Jeff Yang
0f584faa6b
PyTorch 1.7 Stable support ( #3821 )
...
* prepare for 1.7 support [ci skip]
* tpu [ci skip]
* test run 1.7
* all 1.7, needs to fix tests
* couple with torchvision
* windows try
* remove windows
* 1.7 is here
* on purpose fail [ci skip]
* return [ci skip]
* 1.7 docker
* back to normal [ci skip]
* change to some_val [ci skip]
* add seed [ci skip]
* 4 places [ci skip]
* fail on purpose [ci skip]
* verbose=True [ci skip]
* use filename to track
* use filename to track
* monitor epoch + changelog
* Update tests/checkpointing/test_model_checkpoint.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-30 15:42:14 +00:00
Adrian Wälchli
d1234c592d
deprecate passing ModelCheckpoint instance to Trainer(checkpoint_callback=...) ( #4336 )
...
* first attempt
* update tests
* support multiple
* test bugfix
* changelog
* pep
* pep
* import order
* import
* improve test for resuming
* test
* update test
* add references test
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* docstring suggestion deprecation
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
* paramref
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-30 04:47:37 +01:00
Carlos Mocholí
00cc69aed7
Add "monitor" to saved ModelCheckpoints ( #4383 )
...
* Add key
* Remove unused variables
* Update CHANGELOG [skip ci]
* best_model_monitor -> monitor
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-10-28 15:21:08 +05:30
chaton
3abfec8962
[HOTFIX] ModelCheckpoint - Don't increase current_epoch and global_step if not trained ( #4291 )
...
* add two tests w/wo tempdir
* resolve flake8
* this test is failing
* update bug report
* resolve bug and add test
* remove bug_report
* resolve flake8
* resolve bug
* resolve pep8
* resolve pep8
Co-authored-by: Teddy Koker <teddy.koker@gmail.com>
2020-10-23 11:17:50 +01:00
Rohit Gupta
4c7ebdc32b
Add dirpath and filename parameter in ModelCheckpoint ( #4213 )
...
* Add dirpath and filename parameter in ModelCheckpoint
* remove old function
* chlog
* codefactor
* update tests
* docs
* fix doctest and added tests
* pathlib dirpath
* dep version and docs
* try fix doctest
* pep
* suggestions
Co-authored-by: carmocca <carlossmocholi@gmail.com>
* suggestions
* fix test
* pep
* trigger tests
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* suggestions
* try fix windows test
* add and update some tests
* trigger tests
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
2020-10-23 09:59:12 +05:30
William Falcon
8a20d6af51
make save fx part of model checkpoint cb ( #4284 )
2020-10-21 10:06:42 -04:00
Sean Naren
98eb736496
Added getstate/setstate method for torch.save serialization ( #4127 )
...
* Added getstate/setstate method for torch.save serialization, added additional Optional Typing to results object
* Added tests to ensure torch.save does not fail
* Added flags to ensure compatible ddp cpu environment
* Removed torch version check due to minimum already being 1.3, reduced epochs for speed
* Moved tests to separate file
* Update to accelerator, move to ddp_spawn to prevent hanging ddp
2020-10-13 16:47:23 -04:00
William Falcon
09c2020a93
notices ( #4118 )
2020-10-13 07:18:07 -04:00
Jirka Borovec
8873750cf0
remove deprecated early_stop_callback ( #3982 )
2020-10-08 06:30:33 -04:00
Sean Naren
2aebf65241
Test to ensure ckpt filepath contains correct val score ( #3933 )
...
* Added test to ensure ckpt filepath contains the correct val score reported from the trainer
* Modified to check all saved ckpt files
2020-10-07 07:43:17 -04:00
Jirka Borovec
6ac0958166
fix init nan for checkpointing ( #3863 )
...
* add test for checkpoint nan
* fix
* pep
2020-10-05 07:36:12 -04:00
William Falcon
d9656d166c
fixed model checkpoint frequency ( #3852 )
...
* fixed model checkpoint frequency
* fixed model checkpoint frequency
* fixed model checkpoint frequency
* fixed model checkpoint frequency
* merged
2020-10-04 21:49:20 -04:00