* removed monitor default value and added deprecation message
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* format change
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* requested changes
* added test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* format changes
* typehint change
* Update CHANGELOG.md
* requested changes
* regex
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update code
Co-authored-by: EliaCereda
* More property updates
* Move properties. Introduce trainer._fitting
* Use trainer.fitting
* Fix reset dataloaders
* Unused code
* RunningStage.SANITY_CHECKING
* Use setters
* Fix bugs
* TrainerState.{FITTING,VALIDATING,TESTING,PREDICTING,TUNING}
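  A minimal sketch of the states this commit introduces, assuming a str-mixin enum in the style of Lightning's other enums (the exact values are illustrative):
  ```python
  from enum import Enum

  class TrainerState(str, Enum):  # str mixin, mirroring LightningEnum
      FITTING = "fit"
      VALIDATING = "validate"
      TESTING = "test"
      PREDICTING = "predict"
      TUNING = "tune"
  ```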
* Fix bugs
* Fix tests
* Update CHANGELOG. Add deprecation warning. Fix tests
* Unused imports
* Optional trainer
* More deprecation. More refactoring
* Correct version
* Use properties
* Address comments
* flake8
* Missed renamings
* Typo
* is -> ==
Using `is` is recommended for Enums since they are singletons; however, since `LightningEnum` subclasses `str`, it's not a good idea in case a user sets the state/stage with a plain `str`.
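  A minimal sketch of the pitfall, assuming a simplified str-mixin enum (names are illustrative):
  ```python
  from enum import Enum

  class RunningStage(str, Enum):
      TRAINING = "train"

  stage = RunningStage.TRAINING
  assert stage is RunningStage.TRAINING      # identity holds for the member itself

  stage = "train"                            # a user may assign a plain str
  assert stage == RunningStage.TRAINING      # equality still holds via the str mixin
  assert stage is not RunningStage.TRAINING  # but identity breaks, so `is` is unsafe
  ```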
* Also for tests
* Typo
* Address @tchaton's comments
* PEP8
* Correct property
* Update CHANGELOG
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update pytorch_lightning/trainer/trainer.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Remove called sanity check
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Fix for multiple callbacks
* Add CHANGELOG.md
* Remove old params
* Skip tests on windows using ddp
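  A hedged example of the skip pattern, assuming a pytest-based test suite (the test name and body are placeholders):
  ```python
  import platform

  import pytest

  @pytest.mark.skipif(platform.system() == "Windows",
                      reason="DDP is not supported on Windows")
  def test_ddp_training():
      ...
  ```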
* Rename the variable so it does not clash with `should_stop`, which is separate
* Apply suggestions from code review
* Fix params
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* add test
* resolve bug
* update test
* fix wrong copy / paste
* update test
* resolve a second bug
Co-authored-by: Ubuntu <ubuntu@ip-172-31-62-109.ec2.internal>
* Fix info message when EarlyStopping 'mode' not provided
* fixup! Fix info message when EarlyStopping 'mode' not provided
* Apply suggestions from code review
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
* tpu device check
* replaced with xmp spawn
* Revert "replaced with xmp spawn"
This reverts commit 6835380f
* replaced all instances of XLA_AVAILABLE
* moved inner_f to global scope
* made refactors
* added changelog
* added TPU_AVAILABLE variable
* fix codefactor issues
* removed from trainer and early stopping
* add TORCHXLA_AVAILABLE check
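  A hedged sketch of what such an availability probe can look like; `TORCHXLA_AVAILABLE` matches the commit above, but the probe function is an illustrative assumption, not the exact pytorch_lightning implementation:
  ```python
  import importlib.util

  # True when the torch_xla package is importable at all
  TORCHXLA_AVAILABLE = importlib.util.find_spec("torch_xla") is not None

  def tpu_device_exists() -> bool:
      """Best-effort check that an XLA/TPU device can actually be acquired."""
      if not TORCHXLA_AVAILABLE:
          return False
      import torch_xla.core.xla_model as xm
      try:
          return xm.xla_device() is not None
      except RuntimeError:
          return False
  ```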
* added tests
* refactoring
* Update pytorch_lightning/utilities/xla_device_utils.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* updated function names
* fixed bug
* updated CHANGELOG.md
* added todo
* added type hints
* isort and black
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
* ref: fix metric err
* ref: merge
* ref: decoupled ddp2
* ref: clean up ddp before final fix
* ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax)
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax)
* force crash when max_epochs < epochs in a checkpoint
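  A hedged sketch of the guard, assuming the checkpoint dict stores the epoch it was saved at (the function name, message, and exception choice are illustrative):
  ```python
  from pytorch_lightning.utilities.exceptions import MisconfigurationException

  def check_resume_epochs(checkpoint: dict, max_epochs: int) -> None:
      # resuming a checkpoint already past max_epochs indicates a
      # misconfiguration, so crash loudly instead of silently doing nothing
      if checkpoint["epoch"] > max_epochs:
          raise MisconfigurationException(
              f"You restored a checkpoint with current_epoch={checkpoint['epoch']}, "
              f"but the Trainer was configured with max_epochs={max_epochs}."
          )
  ```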
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* Fix ModelCheckpoint's name formatting
* Fix failing tests
* Add dot to CHECKPOINT_SUFFIX
* Set variables to their default values at the end of tests
* Fix logic for filepath='' and filename=None. Add test
* Fix Windows tests
* Fix typo. Remove leading line break and zeroes
* Remove CHECKPOINT_SUFFIX
* Fix typos. Use appropriate f-string format
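  For context, a hedged usage example of the name formatting being fixed here; the exact argument set varied around this release, so treat it as illustrative:
  ```python
  from pytorch_lightning.callbacks import ModelCheckpoint

  # metric names in braces are filled in at save time,
  # producing files like "epoch=02-val_loss=0.32.ckpt"
  ckpt = ModelCheckpoint(filename="{epoch:02d}-{val_loss:.2f}", monitor="val_loss")
  ```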
* Apply suggestions from code review
* Fix broken tests after #3320
* Finish changes suggested by Borda
* Use explicit test var names
* Apply suggestions
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Apply suggestions
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Update CHANGELOG
* Apply suggestions from code review
* for
* prepend whitespace in warn msg
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Fixes the test for early stopping without val step.
The expression which checked whether early stopping was triggered had an off-by-one error and hence was true even if early stopping was not triggered.
Furthermore, set patience to 0 and max_epochs to 10 to ensure the loss has enough time to flatten.
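  A hedged illustration of that off-by-one, assuming zero-indexed epochs (names are illustrative, not the actual test code):
  ```python
  max_epochs = 10

  def early_stopping_triggered(current_epoch: int) -> bool:
      # with zero-indexed epochs, a full run ends at current_epoch == 9,
      # so `current_epoch < max_epochs` is true even when training never
      # stopped early (the off-by-one):
      # return current_epoch < max_epochs        # buggy
      return current_epoch < max_epochs - 1      # fixed check

  assert early_stopping_triggered(3)       # stopped at epoch 3 -> True
  assert not early_stopping_triggered(9)   # ran all 10 epochs -> False
  ```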
* Fixes early stopping without val step.
The issue was that only the `early_stop_on` key was checked and not an arbitrary monitor key.
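  A minimal sketch of the fix's shape, with hypothetical names (the real callback logic is more involved):
  ```python
  def get_monitor_value(logs: dict, monitor: str):
      # before: only the hard-coded key was consulted
      # return logs.get("early_stop_on")
      # after: respect whatever key the user configured as `monitor`
      return logs.get(monitor)

  assert get_monitor_value({"my_loss": 0.1}, "my_loss") == 0.1
  ```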
* Fixes branch, which checks whether early stopping is done during validation.
Before, only `val_early_stop_on` was checked. Since arbitrary keys can be used, the set of possible validation keys cannot be exhaustive. Hence, this disables early stopping in `on_train_epoch_end` via an instance attribute if early stopping was already executed in `on_validation_epoch_end`.
Furthermore, adds a test which ensures that arbitrary keys work.
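  A hedged sketch of that gating, with an illustrative attribute name (not the exact Lightning code):
  ```python
  class EarlyStoppingSketch:
      def __init__(self):
          # set once the check has run on eval results for this epoch
          self.based_on_eval_results = False

      def on_validation_epoch_end(self, trainer):
          self._run_early_stopping_check(trainer)
          self.based_on_eval_results = True

      def on_train_epoch_end(self, trainer):
          if self.based_on_eval_results:
              return  # validation already handled early stopping
          self._run_early_stopping_check(trainer)

      def _run_early_stopping_check(self, trainer):
          ...  # compare the monitored value against the best seen + patience
  ```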
* Improve check whether eval results are used.
Only disable the early-stopping check on train results if eval results are actually used. Before, it was always disabled in ``on_validation_epoch_end``.
Rename and document the instance variable to make it clearer.
* Remove wrong documentation on the behaviour of early stopping with the train result dict.
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* change `t()` to `transpose()` as XLA devices do not support `.t()` on 1-dim tensors
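  A small example of the substitution, relying on standard PyTorch semantics where `transpose(0, 0)` is a no-op on 1-dim tensors:
  ```python
  import torch

  x = torch.randn(5)       # 1-dim tensor
  # y = x.t()              # reportedly fails on XLA devices for 1-dim input
  y = x.transpose(0, 0)    # equivalent result, works on all devices
  assert torch.equal(x, y)
  ```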
* detach tensor before copying
* Revert "detach tensor before copying"
This reverts commit 37cc7bbe
* changed dims
* added test_result_obj_on_tpu
* detach before copying
* replace torch.cat with sum
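  The exact call site isn't shown in the log; a hedged sketch of the general substitution (shapes and names are illustrative):
  ```python
  import torch

  parts = [torch.randn(3) for _ in range(4)]   # per-step tensors
  # before: concatenate, then reduce
  # total = torch.cat(parts).sum()
  # after: reduce each part and sum the scalars, avoiding the concatenation
  total = sum(p.sum() for p in parts)
  ```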