117 Commits

Author SHA1 Message Date
Kaushik B 7b0d1183db
Update `gpus` flag with `accelerator` and `devices` flag (#12156)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-03-23 19:52:12 +00:00
jjenniferdai 6ba66789ae
[2/n] add `Stateful` functionality support for Callbacks (#12232) 2022-03-19 20:20:50 +00:00
Rohit Gupta 5ea811b1d9
Avoid loading dataloaders if `limit_batches=0` (#11576) 2022-02-22 11:33:53 +00:00
jjenniferdai d69b33f1f0
Introduce `Stateful` PrecisionPlugin (#11638)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2022-02-14 15:56:09 +05:30
Carlos Mocholí 789fae828d
Fix `current_epoch` value on training end (#8578)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-02-10 17:55:59 +01:00
Danielle Pintz 9e63281a4c
remove todos (#11804) 2022-02-09 08:30:27 +00:00
jjenniferdai 1203094a20
Introduce `Stateful` DataModule (#11637) 2022-02-07 21:13:24 +01:00
Rohit Gupta 581bf7f2f2
Deprecate `on_epoch_start/on_epoch_end` hook (#11578) 2022-02-07 14:15:27 +00:00
Rohit Gupta 82c8875f33
Add `LightningModule.lr_scheduler_step` (#10249)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2022-01-12 03:53:49 +00:00
Carlos Mocholí dcffca73d4
Parametrize deepspeed hook test (#11308) 2022-01-05 19:38:25 +00:00
Adam Viola 1fc046cde2
Fix `_should_reload_dl_epoch` causing inconsistent validation dataloader reloading (#11036)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-28 02:20:57 +01:00
Kaushik B 0adcd6a048
Rename training_type_plugin file to strategy (#11239)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-23 14:01:23 +00:00
Kaushik B 576a5d62a0
Introduce strategies directory for Training Strategies (#11226)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-22 20:23:30 +00:00
four4fish cf5ef32f7b
Deprecate Trainer.training_type_plugin in favor of trainer.strategy (#11141)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-22 02:11:43 +00:00
Adrian Wälchli f5c2881b68
3/n Simplify spawn plugins: Merge `pre_dispatch` and `setup` logic (#11137) 2021-12-20 17:41:22 +01:00
Adrian Wälchli 29eb9cccf2
Rename the `TrainingTypePlugin` base to `Strategy` (#11120)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: four4fish <88516121+four4fish@users.noreply.github.com>
2021-12-20 12:50:11 +00:00
Carlos Mocholí 7e10f6d41f
Save the loop progress state by default (#10784) 2021-12-17 16:00:27 +00:00
Rohit Gupta 61a744f5c6
Fix support for logging within callbacks returned from `LightningModule` (#10991)
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-12-14 19:41:29 +01:00
Carlos Mocholí 1b43e43e9f
Minor changes in preparation for saving the loops state (#10783) 2021-11-30 19:37:04 +05:30
four4fish 8bf7f9cce7
1/n Move Accelerator into strategy - move batch_to_device to strategy (#10649)
* 1/n Integrate Device Specific Accelerator Logic with strategy - move batch_to_device to strategy

* add changelog

* add model is not none check

* Apply suggestions from code review

Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update CHANGELOG.md

* Update test_datamodules.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update test_hooks.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update dp.py

Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-11-29 12:11:21 -08:00
Carlos Mocholí 152eb57def
Rename special to standalone (#10779) 2021-11-26 17:13:14 +00:00
Rohit Gupta 823bfa6f8a
Update `LightningModule` docs (#10637) 2021-11-23 01:02:04 +05:30
Carlos Mocholí 0de8ab4f2e
Fix failing master due to an interaction between PRs (#10627) 2021-11-19 02:04:53 +00:00
Carlos Mocholí 35f6cbe09f
Use `update_wrapper` in test_hooks.py (#10578) 2021-11-19 01:52:55 +01:00
Carlos Mocholí 0fa07da987
Fail the test when a `DeprecationWarning` is raised (#9940) 2021-11-17 23:41:50 +01:00
Carlos Mocholí ba036fdeea
Support special test parametrizations (#10569) 2021-11-17 15:46:14 +00:00
Rohit Gupta de7ef41fea
remove deprecated `reload_dataloaders_every_epoch` from `Trainer` (#10481) 2021-11-16 06:47:43 +00:00
Carlos Mocholí 81d15c5986
Implement double optimizer closure for hook structure consistency (#10167) 2021-10-29 13:03:04 +00:00
Carlos Mocholí 03f01fb5ec
Fix gradient norm tracking and gradient clipping (#9287)
* WIP

* Progress

* Undo test change

* Fix plugin closure execution order

* Update CHANGELOG

* Fix manual optimization on AMP and skipping backward

* Fix for deepspeed

* Typo

* Hook test for manual closure

* Add skipping test with AMP

* You are hideous, apex

* Add deepspeed test

* Update CHANGELOG

* Fix for broken master

* Add RunIf

* FIXMEs

* Rename

* Fix grad norm

* add a simple test

* update test

* update test

* update test

* fix merge conflicts

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Sea of changes

* Undo change

* Introduce TPUPrecisionPlugin

* Undo changes

* Undo changes

* Resolve FIXME

* Undo change

* Undo change

* Undo change

* Fix FIXMEs

* Fix FIXME

* Correct value

* Bad merge

* Fix circular imports

* WIP

* Fixing clipping

* Fixes

* Bad merge

* Move optimizer step and clipping into the `PrecisionPlugin`

* Fix AMP

* Update CHANGELOG

* Fix tests

* Underscore

* Progress

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove pre_optimizer_step

* Missed one

* Progress

* Progress

* Fix test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update FIXMEs

* Fix test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix test

* DeepSpeed warning. mypy

* Rename

* Finish tests

* Update CHANGELOG

* Dumb fixes

* accelerator=auto

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update on comments

* Use ClassifModule

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-28 15:23:27 +00:00
jjenniferdai 6d79184ec5
Unify checkpoint load paths [redo #9693] (#10061) 2021-10-25 19:05:31 +00:00
Carlos Mocholí b376799430
Minor fixes related to clipping (#10130)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-25 16:40:22 +00:00
Kaushik B 56bc55db71
Update strategy flag in docs (#10000)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-20 21:02:53 +05:30
Kaushik B 5e8829b97d
(1/n) tests: Use strategy flag instead of accelerator for training strategies (#9931)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-16 20:40:25 +05:30
Rohit Gupta 23e8b59ae7
Add `configure_gradient_clipping` hook in `LightningModule` (#9584)
* init hook

* docs

* dep train args

* update tests

* doc

* doc

* .gitignore

* not dep

* add trainer args

* add & update tests

* fix tests

* pre-commit

* docs

* add docs

* add exception

* code review

* deepspeed

* update tests

* not

* try fix

* Apply suggestions from code review

* update deepspeed

* disable some tests

* disable some tests

* enable all tests
2021-10-13 20:15:13 +05:30
ananthsub 28fc8d2016
Add `enable_model_summary` flag and deprecate `weights_summary` (#9699)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2021-10-13 17:20:54 +05:30
Sean Naren 83acb8671d
Update DeepSpeed version, fix failing tests (#9898) 2021-10-11 22:35:33 +00:00
Rohit Gupta 4decbc0d95
Deprecate `dataloader_idx` from `on_train_batch_start/end` (#9816)
* deprecate hooks

* dep todo

* explicit

* Apply suggestions from code review

* Apply suggestions from code review

* code review

* base
2021-10-07 10:18:11 +00:00
Carlos Mocholí 0ddd6a8c19
Remove `_NATIVE_AMP_AVAILABLE` checks (#9747) 2021-09-29 15:34:26 +02:00
Danielle Pintz b3a5c7f442
Add `enable_progress_bar` to Trainer constructor (#9664) 2021-09-24 22:53:31 -07:00
Danielle Pintz 160e7e1289
Deprecate LightningModule.get_progress_bar_dict (#8985)
* Move get_progress_bar_dict from lightning module to progress bar callback
2021-09-09 20:53:47 +00:00
Carlos Mocholí 6892d533ea
Run plugin closure before `on_before_optimizer_step` [1/2] (#9288)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-07 11:52:20 +00:00
Jirka Borovec 6e124e7207
CI: precommit - docformatter (#8584)
* CI: precommit - docformatter
* fix deprecated

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 12:49:09 +00:00
Carlos Mocholí d0efb55b0f
Delete `TrainingEpochLoop._dataloader_idx` which always equals 0 (#8911) 2021-08-16 13:34:42 +02:00
Carlos Mocholí c99e2fe0d2
Test `Callback.on_load_checkpoint` order (#8588) 2021-07-29 12:28:29 +02:00
Carlos Mocholí 47c47faeae
Remove `outputs` in `on_train_epoch_end` hooks (#8587) 2021-07-28 18:27:54 +02:00
Sean Naren aadd2a9d9c
Load ckpt path when model provided in validate/test/predict (#8352)
* Change trainer loading behaviour for validate/test/predict

* Fix

* Fix/add tests

* remove

* Cleanups

* Space

* cleanups

* Add CHANGELOG.md

* Move after setup

* Cleanups on logic

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove

* fix test

* feedback

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update pytorch_lightning/trainer/properties.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Feedback

* Same fix

* Same fix

* Add test for behaviour, modify based on feedback

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Wording

* Apply suggestions from code review

Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Cleanup docs

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>

* feedback

* Fixes to test API

* Add carlos description

* Move logic further

* Move checkpoint connector logic

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-07-28 10:12:46 +00:00
Carlos Mocholí a64cc37394
Replace `yapf` with `black` (#7783)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 13:37:35 +02:00
Carlos Mocholí 321689f52e
Add `ModelCheckpoint(save_on_train_epoch_end)` (#8389)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-07-13 14:47:59 +00:00
Dusan Drevicky 1b06edf2f2
Add the `on_before_optimizer_step` hook (#8048)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-07-09 13:30:52 +02:00
thomas chaton 1c825a2a9c
Add the `on_before_backward` hook (#7865)
* Add callback to hook tests and add predict test

* Fix lambda callback test

* Simplify lambda call test

* Use LambdaCallback

* Dynamically append to called for the model

* Remove print

* Consistency

* Consistency

* Prepare args/kwargs testing

* yapf doesn't like dict literals

* Add arguments for fit no val test

* Add arguments for fit no val test

* add before_backward_hook

* add test

* resolve flake8

* resolve tests

* update changelog

* add on_before_backward to LightningModule

* update on comments

* Test arguments

* Datamodule refactor

* Fix eval test

* remove extra file

* resolve bug

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move to hooks

* update

* resolve flake8

* update on comments

* Update full fit + val test

* Update test

* Remove FIXME

* Remove FIXME

* Undo change

* Fix

* Parametrize fit hook test

* Comment

* Parametrize fit hook test with different precision plugins

* Fix tests

* Parametrize fit hook test with manual optimization

* Unnecessary parenthesis

* WIP

* Comments

* Fix message

* Test CI error

* Revert "Test CI error"

This reverts commit 39c4a85a83.

* Add ddp training type teardown

* Update CHANGELOG

* Adrian's fix

* Use destructor

* Update CHANGELOG.md

* RPC destructor

* Update pytorch_lightning/plugins/training_type/ddp.py

* Why do you not work :(

* Missing condition

* Fix deepspeed test

* GC collect in conftest

* Do not show warnings for special tests

* Needs to run on 1.8

To avoid: "RuntimeError: NCCL error in: /pytorch/torch/lib/c10d/ProcessGroupNCCL.cpp:32, unhandled cuda error, NCCL version 2.4.8"

* Run torch 1.8

* Skip test due to 'Python bus error'

* Debug NCCL

* shm size

* Disable warnings for special tests

* Remove NCCL_DEBUG statement

* Try smaller shm size

* Revert "Skip test due to 'Python bus error'"

This reverts commit e0a3e8785d.

* README and adjust versions

* Avoid self.on_gpu call

* empty cache cleanup

* More garbage collection

* Unroll parametrizations

* Do not reuse mock

* Undo changes

* Undo notebooks modification

* resolve test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* delete file

* Undo

* Fix test

* Revert "WIP"

This reverts commit f5828a8c42.

* Rename

* Remove optimizers

* Fix bug with LightningOptimizer

* Add optimizers

* update

* update

* Update CHANGELOG

* On after backward refactor

* Do not call super

* Fixes

* Remove should_accumulate

* pre/post backward refactor

* Call the LM backward hook

* Update tests

* Remove dev debug patch

* Fix test

* Remove optimizer arguments and typing

* Docs fixes

* Fix comment

* Undo changes

* Split manual and auto

* Undo change

* Deepsource

* Remove optimizers

* Undo changes

* Call the hook

* Docs

* Docs

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-07-09 06:15:57 +00:00