* Fix bug comparing max_steps to global_step, which is initialized at 0
* Added test to ensure accumulate_grad_batches works with max_steps
* check fix with TODO test
* correct call counts
* Add check to ensure we've finished accumulation for this global step before exiting the loop in conjunction with max_steps
* Remove + 1 check in test as this was incorrect
* Update incorrect expected outputs in lr finder test
* Added brackets for clarity
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
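Roughly, the off-by-one being fixed above: `global_step` starts at 0, so training should stop once `global_step >= max_steps`, and only after gradient accumulation for the current step has completed. A minimal sketch of that stopping condition (function and argument names are illustrative, not the actual trainer internals):

```python
def should_stop(global_step: int, max_steps, batch_idx: int, accumulate_grad_batches: int) -> bool:
    """Illustrative stopping check: global_step starts at 0, so `>=` is the
    correct comparison, and we only exit once accumulation for this step is done."""
    if max_steps is None:
        return False
    accumulation_done = (batch_idx + 1) % accumulate_grad_batches == 0
    return global_step >= max_steps and accumulation_done
```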
* closure for all optimizers
* rename hook and handle alternating backward calls
* add comment
* training_loop_fix
* closure whenever possible
* training_loop
* simple tests that count backward calls
* fix test to work with closure
* remove debugging statement
* better place
* check grads after backward
* start fixing manual optimization
* skip step when result returned by closure was None
* fix gradient clipping test to work with closure
* attribute dict result only for automatic optimization
* adjust backward calls in accelerator
* adjust where to call gradient clipping
* adjust backward calls in tests
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* pass kwargs to xla optimizer
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
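The closure changes above follow the standard PyTorch idiom of wrapping forward + backward in a callable the optimizer can re-evaluate. A rough sketch, assuming a `training_step` hook that returns a loss tensor or `None` (this is not the exact Lightning training loop):

```python
import torch

def run_training_batch(model, batch, optimizer: torch.optim.Optimizer):
    """Sketch: build a closure doing zero_grad + forward + backward, then skip
    the optimizer step entirely when the closure produced no loss."""
    def closure():
        optimizer.zero_grad()
        loss = model.training_step(batch)  # hypothetical hook; may return None
        if loss is not None:
            loss.backward()
        return loss

    loss = closure()
    if loss is None:
        return None        # nothing to optimize for this batch, skip the step
    optimizer.step()       # LBFGS-style optimizers would receive the closure here instead
    return loss.detach()
```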
* ref: decouple apex second attempt part 4/n
* Update lightning.py
* ref: decouple apex second attempt part 4/n
* make current_epoch and global_step the same as the trainer's after model restore
* remove assignment here
* test
* minor modification
* merge with parent's master
* [bug-fix]: update trainer properties
* minor comment fix
* minor comment fix
* reset train loader in `on_train_epoch_start` hook
* makes sure the changes work
* minor change
* update changelog
* adding unit test for reload_dataloaders_every_epoch arg
* modified changelog, to add PR number
* revert imports
* changes to unit test
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
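The reload behaviour added here amounts to re-creating the train dataloader at the start of every epoch when `reload_dataloaders_every_epoch` is set. A small sketch of that idea (class and attribute names are made up for illustration, assuming a datamodule-like object exposing `train_dataloader()`):

```python
class EpochLoopSketch:
    def __init__(self, datamodule, reload_dataloaders_every_epoch: bool = False):
        self.datamodule = datamodule
        self.reload_dataloaders_every_epoch = reload_dataloaders_every_epoch
        self.train_dataloader = None

    def on_train_epoch_start(self):
        # Re-create the dataloader each epoch when requested, otherwise reuse it.
        if self.train_dataloader is None or self.reload_dataloaders_every_epoch:
            self.train_dataloader = self.datamodule.train_dataloader()
```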
* Make sure logging never happens from non rank-zero processes
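The intent is the usual rank-zero guard: in multi-process training only global rank 0 should emit log lines. A minimal sketch, assuming the rank is exposed via the `RANK` environment variable (one common convention, not necessarily how the trainer resolves it):

```python
import logging
import os

log = logging.getLogger(__name__)

def rank_zero_info(message: str) -> None:
    """Only the process with global rank 0 logs, so DDP runs don't print duplicates."""
    if int(os.environ.get("RANK", "0")) == 0:
        log.info(message)
```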
* added bug report model
* fix local model
* make current_epoch and global_step the same as the trainer's after model restore
* remove assignment here
* test
* minor modification
* Update pytorch_lightning/core/lightning.py
type check, better clarity
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* comments for current_epoch and global_step properties
* Update tests/models/test_restore.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update comments according to the changes made
* Update tests/models/test_restore.py
* add current_epoch, global_step to jit ignore list
* Add comments to CHANGELOG
* Update CHANGELOG.md
* Update tests/models/test_restore.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
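The property change above can be pictured as the module deferring to its trainer rather than caching its own counters, so the values stay correct after a checkpoint restore. A sketch of that shape (the TorchScript ignore-list handling mentioned above is omitted; names are illustrative):

```python
class LitModuleSketch:
    """Illustrative only: current_epoch and global_step read through to the
    attached trainer instead of holding stale copies on the module."""

    def __init__(self):
        self.trainer = None  # set when a trainer attaches the module

    @property
    def current_epoch(self) -> int:
        return self.trainer.current_epoch if self.trainer is not None else 0

    @property
    def global_step(self) -> int:
        return self.trainer.global_step if self.trainer is not None else 0
```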
* ref: fix metric err
* ref: merge
* ref: decoupled ddp2
* ref: clean up ddp before final fix
* ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax)
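The result refactor above makes the checkpoint callback fall back to the `checkpoint_on` value when no explicit monitor key is given. Very roughly (names are illustrative, not the actual `ModelCheckpoint` code):

```python
def resolve_monitor(user_monitor, logged_metrics: dict):
    """Sketch: prefer an explicit monitor key, otherwise default to the
    'checkpoint_on' value attached to the step/epoch result."""
    if user_monitor is not None:
        return user_monitor
    if "checkpoint_on" in logged_metrics:
        return "checkpoint_on"
    return None
```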
* Update pytorch_lightning/callbacks/model_checkpoint.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* ref: result 1/n (make monitor default to checkpoint_on to simplify result syntax)
* force crash when max_epochs < epochs in a checkpoint
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
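The forced crash amounts to failing fast when a restored checkpoint is already past `max_epochs`, instead of silently training for zero epochs. A hedged sketch of that validation (the exact exception type used by the trainer may differ):

```python
def validate_resume_epoch(checkpoint_epoch: int, max_epochs: int) -> None:
    """Raise instead of silently doing nothing when the checkpoint epoch
    already exceeds the configured max_epochs."""
    if checkpoint_epoch > max_epochs:
        raise ValueError(
            f"Checkpoint was saved at epoch {checkpoint_epoch}, but max_epochs "
            f"is {max_epochs}; increase max_epochs to resume training."
        )
```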