Carlos Mocholí
e0f2e041b9
Share the training step output data via `ClosureResult` ( #9349 )
2021-09-10 11:40:20 +00:00
Kaushik B
d028e36946
Add remove_checkpoint to CheckpointIO plugin to simplify ModelCheckpo… ( #9373 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-10 11:55:04 +01:00
Danielle Pintz
160e7e1289
Deprecate LightningModule.get_progress_bar_dict ( #8985 )
...
* Move get_progress_bar_dict from lightning module to progress bar callback
2021-09-09 20:53:47 +00:00
Adrian Wälchli
25af4b137e
rewrite and improve tests for truncated back-propagation ( #9369 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-08 20:32:59 +00:00
Carlos Mocholí
8407238d66
Keep hidden state in the optimization loops ( #9368 )
2021-09-08 13:43:40 +00:00
Carlos Mocholí
f239b96320
Fix `replace_sampler` missing the batch size under specific conditions ( #9367 )
2021-09-08 12:27:59 +02:00
Carlos Mocholí
15d943089d
Enforce that the optimizer closure is executed when `optimizer_step` is overridden ( #9360 )
2021-09-08 12:24:57 +02:00
Adrian Wälchli
91ce0d0a99
Remove checkpoint tracking from internal debugger ( #9326 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-08 00:42:31 +00:00
Adrian Wälchli
ca679cd78f
Add `ManualOptimization` loop ( #9266 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-08 02:26:39 +02:00
Sean Naren
a79c351a6a
Add a warning to deepspeed when inferring batch size ( #9221 )
2021-09-07 16:24:00 +00:00
Carlos Mocholí
6892d533ea
Run plugin closure before `on_before_optimizer_step` [1/2] ( #9288 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-07 11:52:20 +00:00
Sean Naren
d49709e29c
Remove todo, ensure we only check rank 0 for deepspeed warning ( #9311 )
2021-09-07 11:20:29 +00:00
Carlos Mocholí
392c577825
Add test assertion ( #9309 )
2021-09-06 16:06:26 +00:00
Marten Lienen
98e2f56db0
Clear reference to training loss at the end of train step ( #9336 )
...
Without clearing this reference, the loss tensor stays live through the next training
step. This can be a problem for memory-intensive models that produce very deep backward
graphs such as neural ODEs. For these models, keeping the backward graph of the previous
loss in memory can lead to OOM errors in the next training step even though the step might
have succeeded if we had cleared (and thus GC'd) the previous backward graph.
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-06 13:37:27 +00:00
Jirka Borovec
6e124e7207
CI: precommit - docformatter ( #8584 )
...
* CI: precommit - docformatter
* fix deprecated
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 12:49:09 +00:00
Sean Naren
72bb0186fb
Update requirements, update test ( #9345 )
2021-09-06 12:58:54 +01:00
Carlos Mocholí
05ff1b2085
Remove unnecessary `TrainingEpochLoop` return ( #9298 )
2021-09-06 13:54:33 +02:00
Adrian Wälchli
9a14f04322
Fix mypy typing errors in optimizer loop ( #9317 )
2021-09-06 13:54:07 +02:00
thomas chaton
9149b64908
[bugfix] Resolve PyTorch Profiling for Manual Optimization ( #9316 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-06 10:45:34 +00:00
Roger Shieh
904dde7573
Fix inspection of unspecified args for container hparams ( #9125 )
...
* Update parsing.py
* add todo (for single arg)
* unblock non container single arg
* init test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update CHANGELOG.md
* pep8 line length
* Update pytorch_lightning/utilities/parsing.py
* remove dict namespace conversion
* add omegaconf support
* add dict test
* add omegaconf test
* Update CHANGELOG.md
* Update pytorch_lightning/utilities/parsing.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/utilities/parsing.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-09-06 09:48:11 +00:00
Carlos Mocholí
73fca23bed
Add typing for `ResultCollection` [3/3] ( #9271 )
2021-09-06 09:34:40 +00:00
Adrian Wälchli
50198d7483
fix progress bar restart with fault-tolerant training enabled ( #9310 )
...
* reset progress updates
* update docs
* add test
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 10:43:59 +02:00
Adrian Wälchli
f9132e8db6
remove early stopping tracking from internal debugger ( #9327 )
...
* replace dev debugger in early stopping
* remove unused imports
2021-09-06 10:43:03 +02:00
Kaushik B
dc3391beae
Remove deprecation warnings being called for `on_{task}_dataloader` ( #9279 )
...
* Avoid deprecation warnings being called when hooks are not implemented
* Update tests & changelog
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-09-06 10:03:30 +02:00
Danielle Pintz
912fd31131
Deprecate on_keyboard_interrupt callback hook ( #9260 )
...
* add on_exception callback hook
* deprecate on_keyboard_interrupt
* Apply suggestions from code review
* raise keyboard interrupt
* Delete cluster
* update changelog
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-06 09:57:00 +02:00
Carlos Mocholí
49c0485d50
Avoid optional `Tracker` attributes and enable mypy ( #9320 )
2021-09-06 00:20:44 +00:00
Eric Wiener
cf1a589956
Support infinite training ( #8877 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-04 23:33:43 +00:00
John St. John
c30d9b9fae
Update call to `amp.autocast` from `fast_dtype` to `dtype` ( #9211 )
...
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-04 02:59:11 +00:00
Gili Tzabari
908e60dc85
Renamed `lr_dict` to `lr_scheduler_config` ( #9313 )
2021-09-04 00:47:43 +00:00
thomas chaton
f6d40871bd
Prevent the loss from being moved to the CPU before the backward call ( #9308 )
2021-09-03 16:26:26 +00:00
Carlos Mocholí
d5ee8d8e3f
Disable `{save,check}_on_train_epoch_end` with `check_val_every_n_epoch>1` ( #9156 )
2021-09-03 14:27:44 +00:00
Carlos Mocholí
171d242a89
Add typing for `_FxValidator` [1/3] ( #9269 )
2021-09-03 13:41:05 +00:00
Carlos Mocholí
f745aa9ce1
Move tracking epoch end outputs logic to the `EvaluationEpochLoop` ( #9261 )
2021-09-03 15:02:34 +02:00
Adrian Wälchli
b91747ef75
remove backward from training batch loop ( #9265 )
2021-09-03 00:15:40 +00:00
Carlos Mocholí
1e08b044ec
Allow easy CLI trainer re-instantiation ( #9241 )
...
* Allow easy CLI trainer re-instantiation
* Update CHANGELOG
* Allow passing any trainer argument
* Do not modify the previous config
2021-09-03 00:56:30 +02:00
B. Kerim Tshimanga
f0788b3bbc
scheduled removal of auto_move_data decorator ( #9231 )
...
* scheduled removal of auto_move_data decorator
* update CHANGELOG.md
* remove unused import
* remove test_decorators.py
* fix missed merge conflict
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-03 00:54:36 +02:00
Himanshu Dutta
5fbf04a145
DataModule compatibility with Python dataclass ( #9039 )
...
* added support and checks required for use of datamodule as python dataclass
* made changes required for dataclass support for LightningDataModule and required tests
* made the code compliant with future releases
* edited tests - removed training call. left dataclass decorator to defaults.
* added tests to check for multilevel inheritance and make sure init isn't called on the parent of the defined class
* modified __new__ to ensure init is called implicitly on LightningDataModule
* added relevant tests for multilevel inheritance cases
* removed default values from tests
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-09-03 00:43:38 +02:00
Adrian Wälchli
a5e2f2b432
fix state extraction from batch when fault-tolerant training ( #9281 )
2021-09-02 11:57:40 -07:00
Adrian Wälchli
e802f519ea
Tighten the checks for `Trainer.terminate_on_nan` ( #9190 )
2021-09-02 18:35:22 +02:00
Adrian Wälchli
75350938ca
extract optimizer loop ( #9191 )
2021-09-02 12:40:05 +01:00
four4fish
a451997c4d
Avoid wrapping LightningModule in DDP plugins when not fitting ( #9096 )
...
* Avoid wrapping LightningModule in DDP plugins when not fitting
2021-09-02 02:23:59 +00:00
Pavel Grunt
e2ecb8f859
Allow exporting to onnx when input is tuple ( #8800 )
...
Fixes #8799
2021-09-02 03:36:20 +02:00
B. Kerim Tshimanga
35876bb75f
remove lightning module datamodule property ( #9233 )
...
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-02 00:43:47 +02:00
B. Kerim Tshimanga
65b3dc4495
scheduled removal of DeepSpeedPlugin.cpu_offload* parameters ( #9244 )
2021-09-01 12:02:30 +02:00
Danielle Pintz
b046bd0670
Add on_exception callback hook ( #9183 )
2021-09-01 10:49:00 +02:00
Kaushik B
f21f1bedf2
Deprecate `process_position` from the Trainer constructor ( #9222 )
2021-08-31 15:14:23 +00:00
B. Kerim Tshimanga
f6614b370c
scheduled removal of BaseProfiler.output_filename in favor of dirpath… ( #9214 )
2021-08-31 09:30:43 +00:00
Soham Tiwari
861f8afeea
[bugfix] Changed CometLogger to stop modifying metrics in place ( #9150 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-08-31 08:21:16 +00:00
B. Kerim Tshimanga
07ee8fc9a0
Remove deprecated property `ModelCheckpoint.period` in favor of `ModelCheckpoint.every_n_epochs` ( #9213 )
2021-08-31 10:04:29 +02:00
B. Kerim Tshimanga
34053ef85e
Remove deprecated `Trainer.running_sanity_check` ( #9209 )
2021-08-31 01:44:33 +02:00