Commit Graph

3285 Commits

Author SHA1 Message Date
Kaushik B d773407e59
feat: Add ModelSummary Callback (#9344)
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-10 12:42:42 +00:00
Carlos Mocholí 4f8c3ba4a5
Type the Loop base class as generic (#9418) 2021-09-10 12:24:25 +00:00
Carlos Mocholí e0f2e041b9
Share the training step output data via `ClosureResult` (#9349) 2021-09-10 11:40:20 +00:00
Kaushik B d028e36946
Add remove_checkpoint to CheckpointIO plugin to simplify ModelCheckpo… (#9373)
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-10 11:55:04 +01:00
ananthsub c963bf6568
[loops] Reset reference to dataloader iterator on run end (#9386)
2021-09-10 04:18:58 +00:00
Danielle Pintz 160e7e1289
Deprecate LightningModule.get_progress_bar_dict (#8985)
* Move get_progress_bar_dict from lightning module to progress bar callback
2021-09-09 20:53:47 +00:00
Adrian Wälchli 089ae9b3e8
convert state to tuple explicitly when setting python random state (#9401)
* convert state to tuple explicitly

* update changelog
2021-09-09 19:27:28 +01:00
Yi Wang f515dd8125
Remove redundant quotes in an error message (#9392) 2021-09-09 15:19:59 +02:00
Carlos Mocholí 3070a9ea6e
Fix hiddens type annotation (#9377) 2021-09-09 08:45:52 +01:00
Artsiom 41ba639859
Fix logging of nan parameters (#9364)
2021-09-09 00:39:23 +00:00
Binh Tang a079d7fccc
Enable inference mode for testing and predicting (#8813)
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2021-09-08 21:38:04 +00:00
Carlos Mocholí 8407238d66
Keep hidden state in the optimization loops (#9368) 2021-09-08 13:43:40 +00:00
Carlos Mocholí f239b96320
Fix `replace_sampler` missing the batch size under specific conditions (#9367) 2021-09-08 12:27:59 +02:00
Carlos Mocholí 15d943089d
Enforce that the optimizer closure is executed when `optimizer_step` is overridden (#9360) 2021-09-08 12:24:57 +02:00
Adrian Wälchli 91ce0d0a99
Remove checkpoint tracking from internal debugger (#9326)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-08 00:42:31 +00:00
Adrian Wälchli ca679cd78f
Add `ManualOptimization` loop (#9266)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-08 02:26:39 +02:00
Sean Naren a79c351a6a
Add a warning to deepspeed when inferring batch size (#9221) 2021-09-07 16:24:00 +00:00
Carlos Mocholí 6892d533ea
Run plugin closure before `on_before_optimizer_step` [1/2] (#9288)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-07 11:52:20 +00:00
Sean Naren d49709e29c
Remove todo, ensure we only check rank 0 for deepspeed warning (#9311) 2021-09-07 11:20:29 +00:00
Yi Wang 0135a4bd1c
Remove some incorrect comments in ddp.py (#9319) 2021-09-07 09:15:29 +01:00
Marten Lienen 98e2f56db0
Clear reference to training loss at the end of train step (#9336)
Without clearing this reference, the loss tensor stays live through the next training
step. This can be a problem for memory intensive models that produce very deep backward
graphs such as neural ODEs. For these models, keeping the backward graph of the previous
loss in memory can lead to OOM errors in the next training step even though the step might
have succeeded if we had cleared (and thus GC'd) the previous backward graph.
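The effect described above can be sketched with plain Python objects. This is an illustration only, not the code from the pull request: `FakeGraph`, `FakeLoop`, `last_loss`, and `graph_survives` are hypothetical names, and a weak reference stands in for checking whether the backward graph is still in memory.

```python
import gc
import weakref


class FakeGraph:
    """Hypothetical stand-in for a loss tensor and its backward graph."""


class FakeLoop:
    """Hypothetical loop that caches the most recent loss on an attribute."""

    def __init__(self):
        self.last_loss = None

    def train_step(self, clear_reference):
        graph = FakeGraph()
        self.last_loss = graph        # the loop now keeps the object alive
        probe = weakref.ref(graph)    # lets us observe whether it was freed
        if clear_reference:
            self.last_loss = None     # drop the reference at the end of the step
        return probe


def graph_survives(clear_reference):
    """Return True if the object is still alive after the step has returned."""
    loop = FakeLoop()
    probe = loop.train_step(clear_reference)
    gc.collect()
    return probe() is not None


print(graph_survives(clear_reference=False))
print(graph_survives(clear_reference=True))
```

Under these assumptions, the object outlives the step only when the loop still holds a reference to it, which is the situation the commit message describes for the previous loss.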

Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-06 13:37:27 +00:00
Jirka Borovec 6e124e7207
CI: precommit - docformatter (#8584)
* CI: precommit - docformatter
* fix deprecated

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 12:49:09 +00:00
Sean Naren 72bb0186fb
Update requirements, update test (#9345) 2021-09-06 12:58:54 +01:00
Carlos Mocholí 05ff1b2085
Remove unnecessary `TrainingEpochLoop` return (#9298) 2021-09-06 13:54:33 +02:00
Adrian Wälchli 9a14f04322
Fix mypy typing errors in optimizer loop (#9317) 2021-09-06 13:54:07 +02:00
thomas chaton 9149b64908
[bugfix] Resolve PyTorch Profiling for Manual Optimization (#9316)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-06 10:45:34 +00:00
Roger Shieh 904dde7573
Fix inspection of unspecified args for container hparams (#9125)
* Update parsing.py

* add todo (for single arg)

* unblock non container single arg

* init test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update CHANGELOG.md

* pep8 line length

* Update pytorch_lightning/utilities/parsing.py

* remove dict namespace conversion

* add omegaconf support

* add dict test

* add omegaconf test

* Update CHANGELOG.md

* Update pytorch_lightning/utilities/parsing.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* Update pytorch_lightning/utilities/parsing.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-09-06 09:48:11 +00:00
Carlos Mocholí 73fca23bed
Add typing for `ResultCollection` [3/3] (#9271) 2021-09-06 09:34:40 +00:00
Adrian Wälchli 50198d7483
fix progress bar restart with fault-tolerant training enabled (#9310)
* reset progress updates
* update docs
* add test

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 10:43:59 +02:00
Adrian Wälchli f9132e8db6
remove early stopping tracking from internal debugger (#9327)
* replace dev debugger in early stopping

* remove unused imports
2021-09-06 10:43:03 +02:00
Kaushik B dc3391beae
Remove deprecation warnings being called for `on_{task}_dataloader` (#9279)
* Avoid deprecation warnings being called when hooks are not implemented
* Update tests & changelog
* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-09-06 10:03:30 +02:00
Danielle Pintz 912fd31131
Deprecate on_keyboard_interrupt callback hook (#9260)
* add on_exception callback hook

* deprecate on_keyboard_interrupt

* Apply suggestions from code review

* raise keyboard interrupt

* Delete cluster

* update changelog

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-06 09:57:00 +02:00
Carlos Mocholí 49c0485d50
Avoid optional `Tracker` attributes and enable mypy (#9320) 2021-09-06 00:20:44 +00:00
Eric Wiener cf1a589956
Support infinite training (#8877)
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-04 23:33:43 +00:00
John St. John c30d9b9fae
Update call to `amp.autocast` from `fast_dtype` to `dtype` (#9211)
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-04 02:59:11 +00:00
Gili Tzabari 908e60dc85
Renamed `lr_dict` to `lr_scheduler_config` (#9313) 2021-09-04 00:47:43 +00:00
thomas chaton f6d40871bd
Prevent loss from being moved to the CPU before the backward call. (#9308) 2021-09-03 16:26:26 +00:00
jjenniferdai e97c28a02b
Typing `tuner.auto_gpu_select` (#9292)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-09-03 15:49:58 +01:00
Carlos Mocholí d5ee8d8e3f
Disable `{save,check}_on_train_epoch_end` with `check_val_every_n_epoch>1` (#9156) 2021-09-03 14:27:44 +00:00
Carlos Mocholí 171d242a89
Add typing for `_FxValidator` [1/3] (#9269) 2021-09-03 13:41:05 +00:00
Carlos Mocholí f745aa9ce1
Move tracking epoch end outputs logic to the `EvaluationEpochLoop` (#9261) 2021-09-03 15:02:34 +02:00
Adrian Wälchli b91747ef75
remove backward from training batch loop (#9265) 2021-09-03 00:15:40 +00:00
Carlos Mocholí 285db62ba2
Improve progress.py docstrings (#9284) 2021-09-03 00:15:09 +00:00
Carlos Mocholí ddb4dc2659
Add typing for `LoggerConnector` [2/3] (#9270)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-03 01:39:06 +02:00
Carlos Mocholí 1e08b044ec
Allow easy CLI trainer re-instantiation (#9241)
* Allow easy CLI trainer re-instantiation

* Update CHANGELOG

* Allow passing any trainer argument

* Do not modify the previous config
2021-09-03 00:56:30 +02:00
Burhanuddin Rangwala ead2404aac
Added doc strings to base logger file (#9232)
* added doc strings to base logger

* updated docs
2021-09-03 00:55:12 +02:00
B. Kerim Tshimanga f0788b3bbc
scheduled removal of auto_move_data decorator (#9231)
* scheduled removal of auto_move_data decorator

* update CHANGELOG.md

* remove unused import

* remove test_decorators.py

* fix missed merge conflict

Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-03 00:54:36 +02:00
Himanshu Dutta 5fbf04a145
DataModule compatibility with Python dataclass (#9039)
* added support and checks required for use of datamodule as python dataclass
* made changes required for dataclass support for LightningDataModule and required tests
* made the code compliant with future releases
* edited tests - removed training call. left dataclass decorator to defaults.
* added tests to check for multilevel inheritance and make sure init isn't called on the parent of defined class
* modified __new__ to ensure calling of init on LightningDataModule implicitly
* added relevant tests for multilevel inheritance cases
* removed default values from tests

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-09-03 00:43:38 +02:00
four4fish 69cdb79e33
Add check for uninitialized _sync_dir in DDP Plugin to avoid errors during error handling (#9267) 2021-09-02 14:14:47 -07:00
Carlos Mocholí 071ae49808
Fix `LightningOptimizer.step` signature (#9289) 2021-09-02 22:23:48 +02:00