Commit Graph

30 Commits

Author SHA1 Message Date
four4fish 6fe3211573
Unroll dict input before calling Accelerator X_steps (#10908)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-03 17:00:52 +00:00
four4fish 1d2878523a
2/n Move Precision Plugin into strategy - move optimizer-related logic (#10596)
Co-authored-by: Danielle Pintz <38207072+daniellepintz@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-11-30 08:31:23 +00:00
Carlos Mocholí 3089dc3829
Improve typing for loops (#10749)
* Improve typing for loops

* Free memory
2021-11-26 18:39:09 +00:00
Carlos Mocholí 31bb6e69ca
Avoid optional instances in Loops (#10735)
* Avoid optional instances in Loops

* More cleanup
2021-11-26 18:00:18 +00:00
Kaushik B e0b4bb2ea3
Deprecate `DeviceType` in favor of `_AcceleratorType` (#10503)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-25 16:41:03 +01:00
Carlos Mocholí 069ec1005a
Do not autodetach extras (#10424)
* Do not autodetach extras

* Update CHANGELOG

* Use foo
2021-11-09 16:07:16 +00:00
Carlos Mocholí 03f01fb5ec
Fix gradient norm tracking and gradient clipping (#9287)
* WIP

* Progress

* Undo test change

* Fix plugin closure execution order

* Update CHANGELOG

* Fix manual optimization on AMP and skipping backward

* Fix for deepspeed

* Typo

* Hook test for manual closure

* Add skipping test with AMP

* You are hideous, apex

* Add deepspeed test

* Update CHANGELOG

* Fix for broken master

* Add RunIf

* FIXMEs

* Rename

* Fix grad norm

* add a simple test

* update test

* update test

* update test

* fix merge conflicts

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Sea of changes

* Undo change

* Introduce TPUPrecisionPlugin

* Undo changes

* Undo changes

* Resolve FIXME

* Undo change

* Undo change

* Undo change

* Fix FIXMEs

* Fix FIXME

* Correct value

* Bad merge

* Fix circular imports

* WIP

* Fixing clipping

* Fixes

* Bad merge

* Move optimizer step and clipping into the `PrecisionPlugin`

* Fix AMP

* Update CHANGELOG

* Fix tests

* Underscore

* Progress

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove pre_optimizer_step

* Missed one

* Progress

* Progress

* Fix test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update FIXMEs

* Fix test

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Fix test

* DeepSpeed warning. mypy

* Rename

* Finish tests

* Update CHANGELOG

* Dumb fixes

* accelerator=auto

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update on comments

* Use ClassifModule

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-28 15:23:27 +00:00
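
The clipping and grad-norm changes in #9287 sit behind a few user-facing Trainer flags. Below is a minimal sketch, assuming the 1.5-era flag names (`gradient_clip_val`, `gradient_clip_algorithm`, `track_grad_norm`); the tiny module and random data are illustrative only, not taken from the repository.

    # Minimal sketch, assuming the 1.5-era Trainer flags; the module and data are illustrative.
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    import pytorch_lightning as pl

    class TinyModule(pl.LightningModule):
        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(8, 1)

        def training_step(self, batch, batch_idx):
            x, y = batch
            return torch.nn.functional.mse_loss(self.layer(x), y)

        def configure_optimizers(self):
            return torch.optim.SGD(self.parameters(), lr=0.1)

    train_loader = DataLoader(TensorDataset(torch.randn(64, 8), torch.randn(64, 1)), batch_size=16)
    trainer = pl.Trainer(
        max_epochs=1,
        gradient_clip_val=0.5,           # clip before each optimizer step
        gradient_clip_algorithm="norm",  # or "value"
        track_grad_norm=2,               # log the 2-norm of the gradients
    )
    trainer.fit(TinyModule(), train_loader)
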
Carlos Mocholí b376799430
Minor fixes related to clipping (#10130)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-25 16:40:22 +00:00
Alessio Bonfiglio 2a2fa5a56a
Group all the logged gradients under the same sub-folder (#7756) 2021-10-20 15:48:36 +00:00
Carlos Mocholí e95f9b71c1
Set the optimization output result class as a class attribute (#9977) 2021-10-19 16:33:08 +01:00
Carlos Mocholí bb2dc68792
Simplify track grad norm condition (#9992) 2021-10-19 15:00:16 +02:00
Adrian Wälchli 7a9151637c
loop customization docs (#9609)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
2021-10-18 09:43:11 +00:00
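
The loop customization docs added in #9609 describe the `Loop` base interface (`done`, `reset`, `advance`). The sketch below only illustrates that interface and is not an example taken from the docs; the `SimpleLoop` class and its payload are hypothetical.

    # Hedged sketch of the Loop interface covered by the new docs; SimpleLoop is hypothetical.
    from pytorch_lightning.loops import Loop

    class SimpleLoop(Loop):
        def __init__(self, items):
            super().__init__()
            self.items = items
            self.index = 0

        @property
        def done(self) -> bool:
            # stop once every item has been processed
            return self.index >= len(self.items)

        def reset(self) -> None:
            # called by `run()` before iteration starts
            self.index = 0

        def advance(self) -> None:
            # one unit of work per iteration of `run()`
            print("processing", self.items[self.index])
            self.index += 1

    SimpleLoop(["a", "b", "c"]).run()
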
four4fish a002f872ea
[2/n] Directly call TrainingTypePlugin APIs instead of going through the Accelerator (#9901)
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-14 17:38:22 +02:00
Rohit Gupta 23e8b59ae7
Add `configure_gradient_clipping` hook in `LightningModule` (#9584)
* init hook

* docs

* dep train args

* update tests

* doc

* doc

* .gitignore

* not dep

* add trainer args

* add & update tests

* fix tests

* pre-commit

* docs

* add docs

* add exception

* code review

* deepspeed

* update tests

* not

* try fix

* Apply suggestions from code review

* update deepspeed

* disable some tests

* disable some tests

* enable all tests
2021-10-13 20:15:13 +05:30
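
The `configure_gradient_clipping` hook added in #9584 lets a `LightningModule` take over clipping per optimizer. A minimal sketch follows, assuming the hook signature from the 1.5 series; the per-optimizer condition is purely illustrative.

    # Minimal sketch, assuming the 1.5-era hook signature; the optimizer_idx check is illustrative.
    import pytorch_lightning as pl

    class ClippingModule(pl.LightningModule):
        def configure_gradient_clipping(
            self, optimizer, optimizer_idx, gradient_clip_val=None, gradient_clip_algorithm=None
        ):
            # only clip the first optimizer; delegate the actual clipping back to Lightning
            if optimizer_idx == 0:
                self.clip_gradients(
                    optimizer,
                    gradient_clip_val=gradient_clip_val,
                    gradient_clip_algorithm=gradient_clip_algorithm,
                )
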
ananthsub 4610fddb19
Mark `Trainer.terminate_on_nan` protected and deprecate public property (#9849)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-12 20:23:22 +00:00
Adrian Wälchli 6a0c47a014
remove redundant accumulation normalization in manual optimization (#9769) 2021-10-11 15:26:12 +00:00
Carlos Mocholí 6ef4e5ac76
Remove return value from the backward closure (#9770) 2021-10-01 16:53:00 +02:00
Carlos Mocholí 44aed17aff
Remove duplicated native AMP + LBFGS check (#9748) 2021-09-29 13:14:03 +00:00
Carlos Mocholí bc50591d49
reduce loop structure leakage into the `TrainingEpochLoop` (#9490)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-28 13:22:22 +00:00
four4fish 37469cd3e8
Fix `_DDPSinkBackward` in-place view modification error for PyTorch nightly 1.10 (#9649)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-22 19:14:24 +00:00
thomas chaton 89ab2470c1
[Refactor] 1/2 Move reset_on_restart within the loop reset (#9561)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-09-17 16:11:32 +00:00
Adrian Wälchli b84541464d
multiple optimizer restart with fault-tolerant training (#9537)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-16 09:54:59 +00:00
Adrian Wälchli b9fa69ea57
mark `FitLoop.should_accumulate` as protected (#9515)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-15 13:32:14 +00:00
Adrian Wälchli 200ed9eb9f
mark `OptimizerLoop.backward` method protected (#9514)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-09-15 14:58:01 +02:00
Carlos Mocholí 23450e2905
Add custom logic to each `OutputResult` subclass [2/2] (#9424) 2021-09-15 12:18:19 +00:00
Adrian Wälchli 0421f08742
fix optimizer loop with frequencies (#9507) 2021-09-14 21:21:45 +01:00
Carlos Mocholí b1ed1db089
Keep global step update in the loop (#8856) 2021-09-14 19:21:39 +05:30
Carlos Mocholí 48d3a10c9b
Add `OutputResult` [1/2] (#9437)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-09-14 15:48:27 +02:00
Adrian Wälchli 6ff43cbff7
fix resuming from checkpoint for fault-tolerant in case of no failure (#9371)
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-09-10 17:25:46 +00:00
Carlos Mocholí 9eccb3148e
Loop and test restructuring (#9383) 2021-09-10 13:18:24 +00:00