Commit Graph

14 Commits

Author SHA1 Message Date
thomas chaton 5841ca9782
[Feat] Add auto_restart for fault tolerant training (#9722) 2021-10-01 16:37:17 +00:00
Carlos Mocholí 8dcba38e0e
Add `is_last_batch` to progress tracking (#9657) 2021-09-23 12:54:41 +00:00
Adrian Wälchli 0421f08742
fix optimizer loop with frequencies (#9507) 2021-09-14 21:21:45 +01:00
Carlos Mocholí e0f2e041b9
Share the training step output data via `ClosureResult` (#9349) 2021-09-10 11:40:20 +00:00
Adrian Wälchli ca679cd78f
Add `ManualOptimization` loop (#9266)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-08 02:26:39 +02:00
Carlos Mocholí 49c0485d50
Avoid optional `Tracker` attributes and enable mypy (#9320) 2021-09-06 00:20:44 +00:00
Adrian Wälchli 75350938ca
extract optimizer loop (#9191) 2021-09-02 12:40:05 +01:00
Adrian Wälchli b13749b4ec
add fault-tolerance for global random state in map-style datasets (#8950)
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-08-26 12:13:31 +00:00
thomas chaton dd8216a6b8
Save the `ResultCollection` in the loops state dict (#8641)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-08-02 20:52:24 +00:00
Carlos Mocholí a64cc37394
Replace `yapf` with `black` (#7783)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 13:37:35 +02:00
Adrian Wälchli 7d93d70110
Loop specialization (#8226)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-07-19 15:08:53 +02:00
thomas chaton 7bb810f143
Add progress tracking on Loops - 2/n (#8362)
* resolve issues

* update

* update

* update

* add more exceptions

* resolve bug

* update

* update

* update changelog

* resolve bug

* resolve comments

* update

* update

* update changelog

* update

* update

* remove space

* update

* add progress tracking to loops

* validate json

* update

* convert to dict for better readability

* validate reload

* update

* update

* update on comments

* remove deadcode

* clean changelog

* clean changelog

* update

* update on comments

* CHANGELOG

* CHANGELOG

* Update pytorch_lightning/loops/base.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* whitespace suggestions

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* make fault_tolerant_enabled protected

* whitespace fixes around Args

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

* typo it's -> its

* fix copy-paste typo in progress docstring

* Delete classes

* Minor change

* docs

* protected get_loops_state

* merge restore_loops with restore_progress

* Fix tests after removals

* explicit save with trainer.save_checkpoint()

* handle optimization restart based on optimizer_idx

* update increments

* update val batch progress and remove iteration count

* update progress tracking for dataloader loops

* remove self.dataloader_idx from eval_epoch_loop

* add batch progress to predict loop

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* incorporate progress tracking for current_epoch

* Fix test

* Actually remove it

* Remove unused TrainingEpochProgress

* Fix optimization progress - missing scheduler

* Restarting changes

* Scheduler progress

* Unused property, reset on epoch

* Resolve FIXME

* Remove FIXME

* fix test_progress (wip)

* fix batch_progress.current.reset

* Hold off on split progress. Out of scope of this PR

* Unnecessary if

* fix structure in test_progress

* structure

* clean up unused variables in test_progress

* refactor naming and organization in test_progress

* Unnecessary variable

* Remove unnecessary diff

* Improve comment

* Undo typing change to avoid polluting everything with mypy fixes

* Fix and improve test_loops.py

* Fix and organize `test_loop_state_dict`

* Remove unnecessary checks in test

* Update test after disallowing updates on None attributes

* Typing

* Minor test cleanup

* Fix and move loop test

* Move test from progress to loops

* Reset the scheduler progress

* SchedulerProgress fix

* Consistent whitespace

* Fix final test

* Minor test changes

* One test to rule them all

* Formatting

* Rename and clean variables

* Shorter names

* Shorter scheduler name

* Fix optimizer step calculation for stop_batch=2

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove empty connects

* Update CHANGELOG

* Holy shit finally got the formula right

* Fix final thing!!!

* Do not check state dicts

* parametrize multiple_dataloader progress test

* Update CHANGELOG.md

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <justus.schock@posteo.de>
2021-07-19 08:31:45 +00:00
thomas chaton 370fa67004
[Refactor] Improve loops API 1/n (#8334)
* resolve issues

* update

* update

* update

* add more exceptions

* resolve bug

* update

* update

* update changelog

* resolve bug

* resolve comments

* update

* update

* update changelog

* update

* update

* remove space

* update

* re-order protected trainer attr

* move public method up

* add docs to state dict methods

* combine __load with load_state_dict

* rename shadowed variable

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move changelog entry to refactor section

* refactor loop_progress property for test helper function

* update trainer setter docstring

* Update CHANGELOG.md

* Update pytorch_lightning/loops/base.py

* remove trainer check

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-07-12 11:13:50 +00:00
thomas chaton d51b0ae7fc
Add `state_dict` to loops (#8197)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-07-01 15:54:37 +00:00