* tpu device check
* replaced with xmp spawn
* Revert "replaced with xmp spawn"
This reverts commit 6835380f
* replaced all instances of XLA_AVAILABLE
* moved inner_f to global scope
* made refactors
* added changelog
* added TPU_AVAILABLE variable
* fix codefactor issues
* removed from trainer and early stopping
* add TORCHXLA_AVAILABLE check
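A minimal sketch of what an availability check along these lines might look like; the helper and constant names are illustrative, not the exact contents of `xla_device_utils.py`:

```python
import importlib.util

# True when the torch_xla package is importable at all
XLA_AVAILABLE = importlib.util.find_spec("torch_xla") is not None


def tpu_device_exists() -> bool:
    """Return True if an XLA (TPU) device can actually be acquired."""
    if not XLA_AVAILABLE:
        return False
    import torch_xla.core.xla_model as xm  # import lazily, only when torch_xla is installed
    try:
        return xm.xla_device() is not None
    except RuntimeError:
        return False


TPU_AVAILABLE = tpu_device_exists()
```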
* added tests
* refactoring
* Update pytorch_lightning/utilities/xla_device_utils.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* updated function names
* fixed bug
* updated CHANGELOG.md
* added todo
* added type hints
* isort and black
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
* Fix.
* Fix #2550: allow loading a model from a checkpoint if self.save_hyperparameters() was not called.
* Fix? Cleaner way of not calling self.save_hyperparameters in EvalModelTemplate.
* Fix? `_load_model_state` cleanup
* Fix?
* Fixed side effect in `test_load_model_from_checkpoint_extra_args`.
* Apply suggestions from code review
* fix
* try
* fixed missing arg in evalmodel
* fix
* update
* fix loading
* add test
* prune
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka Borovec <jirka@pytorchlightning.ai>
Co-authored-by: William Falcon <waf2107@columbia.edu>
* make current_epoch and global_step the same as the trainer's after model restore.
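A rough sketch of the delegation this describes (not the actual library code); the properties mirror the attached Trainer so the values stay in sync after a restore, and as noted further down they also need to go on the TorchScript ignore list:

```python
class LightningModuleSketch:
    def __init__(self):
        self.trainer = None  # set by the Trainer when the model is attached

    @property
    def current_epoch(self) -> int:
        # mirror the trainer so the value is correct after checkpoint restore
        return self.trainer.current_epoch if self.trainer else 0

    @property
    def global_step(self) -> int:
        return self.trainer.global_step if self.trainer else 0
```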
* remove assignment here
* test
* minor modification
* Update pytorch_lightning/core/lightning.py
type check, better clarity
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* comments for current_epoch and global_step properties
* Update tests/models/test_restore.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update comments according to the changes made
* Update tests/models/test_restore.py
* add current_epoch, global_step to jit ignore list
* Add comments to CHANGELOG
* Update CHANGELOG.md
* Update tests/models/test_restore.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* ref: fix metric err
* ref: merge
* ref: decoupled ddp2
* ref: clean up ddp before final fix
* ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax)
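A hedged sketch of the Result-style syntax this refers to (the `EvalResult` API of that era); `loss_fn` is an assumed attribute of the module:

```python
import pytorch_lightning as pl


def validation_step(self, batch, batch_idx):
    x, y = batch
    loss = self.loss_fn(self(x), y)
    # the value passed as checkpoint_on becomes the default quantity for
    # ModelCheckpoint to monitor, so no explicit monitor key is needed
    result = pl.EvalResult(checkpoint_on=loss)
    result.log("val_loss", loss)
    return result
```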
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* force crash when max_epochs < epochs in a checkpoint
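An illustrative guard of the kind described, not the exact library code:

```python
def check_resume_epochs(checkpoint: dict, max_epochs: int) -> None:
    """Refuse to resume when the checkpoint already trained past max_epochs."""
    ckpt_epoch = checkpoint.get("epoch", 0)
    if max_epochs is not None and ckpt_epoch > max_epochs:
        raise ValueError(
            f"Checkpoint was trained for {ckpt_epoch} epochs, but "
            f"Trainer(max_epochs={max_epochs}) is lower. "
            "Increase max_epochs or start training from scratch."
        )
```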
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* cleaning up stale logger tests
* script
* docs
* simple test
* move test
* fix doctest
* no grad context
* extend tests
* datamodule test
* clean up test
* docs
* name
* fix import
* update changelog
* fix import
* skip pytorch 1.3 in test
* update codeblock
* skip bugged 1.4
* typehints
* doctest not working on all pytorch versions
* rename TestGAN to prevent pytest interference
* add note about pytorch version
* fix torchscript version inconsistency in tests
* reset training state + tests
* update docstring
* Apply suggestions from code review
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* update docstring, dict return
* add docs to index
* add link
* doc eval mode
* forward
* optional save to file path
* optional
* test torchscript device
* test save load with file path
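A hedged usage sketch of the `to_torchscript` API these commits describe; the module is a toy stand-in:

```python
import torch
from torch import nn
import pytorch_lightning as pl


class TinyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(8, 2)

    def forward(self, x):
        return self.layer(x)


model = TinyModel()

# return a ScriptModule in memory ...
scripted = model.to_torchscript()

# ... or pass the optional file path to also save it to disk
model.to_torchscript(file_path="tiny_model.pt")

# the saved artifact loads without the Lightning class definition being importable
loaded = torch.jit.load("tiny_model.pt")
```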
* pep
* str
* Commit typing suggestion
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* skip test if cuda not available
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* change t() to transpose() as XLA devices do not support .t() on 1-dim tensors
* detach tensor before copying
* Revert "detach tensor before copying"
This reverts commit 37cc7bbe
* changed dims
* added test_result_obj_on_tpu
* detach before copying
* replace torch.cat with sum
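An illustrative sketch of the TPU-friendly substitutions described above; shapes and names are made up:

```python
import torch

weights = torch.randn(3, 5)

# spell out the dimensions rather than relying on .t(), which XLA rejects in some cases
weights_t = weights.transpose(0, 1)   # instead of weights.t()

# summing a Python list of scalar losses avoids building an intermediate concatenated tensor
losses = [torch.tensor(0.1), torch.tensor(0.2)]
total = sum(losses)                   # instead of torch.cat(...).sum()

# detach before copying so the copy does not carry the autograd graph along
snapshot = total.detach().clone()
```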
When a LightningModule inherits from a class that implements `__new__()` such as `typing.Generic`, `inspect.signature(cls)` short-circuits and returns the signature of `__new__()` instead of `__init__()`. So, we need to be more specific and call inspection directly on the init function.
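A small demonstration of the issue (on Python versions where `typing.Generic` defines `__new__`); the class and parameter names are illustrative:

```python
import inspect
from typing import Generic, TypeVar

T = TypeVar("T")


class Model(Generic[T]):
    def __init__(self, hidden_dim: int = 128, lr: float = 1e-3):
        self.hidden_dim = hidden_dim
        self.lr = lr


# inspecting the class may pick up Generic.__new__ and lose the init parameters,
# so inspect the __init__ function directly to recover them reliably
init_params = inspect.signature(Model.__init__).parameters
print(list(init_params))  # ['self', 'hidden_dim', 'lr']
```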
* add ddp script variations
* add ddp test
* rename
* shell
* test
* try call
* try without subprocess
* test
* display the error
* list all variations
* try string
* try copy env
* debug
* pythonpath
* path
* update test
* change
* simple ddp test
* replace
* remove random port
* random port
* str
* clean up
* check run spawn
* clean up
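A hedged sketch of the subprocess-based DDP script test these commits iterate on; the script path and CLI flags are hypothetical:

```python
import os
import subprocess
import sys


def run_ddp_script(script: str) -> None:
    env = os.environ.copy()            # copy the environment so the child sees CUDA vars etc.
    env["PYTHONPATH"] = os.getcwd()    # make the package importable from the spawned process
    cmd = [sys.executable, script, "--max_epochs", "1", "--gpus", "2"]
    proc = subprocess.run(cmd, env=env, capture_output=True, text=True)
    # surface the child's stderr so failures are visible in the test output
    assert proc.returncode == 0, proc.stderr
```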
* docs
* update test
* docs
* changelog
* override dist backend when using tpus
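An illustrative sketch of the override described, with a placeholder backend string rather than the library's real one:

```python
def resolve_distributed_backend(requested: str, tpu_cores) -> str:
    # when TPU cores are requested, ignore the user-supplied backend and force the TPU one
    if tpu_cores:
        return "tpu"  # placeholder; the actual backend string in the library may differ
    return requested
```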
* added test
* updated doc string
* drop redundant info...
* more redundant info
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>