ananthsub
aad86423f7
Remove more deprecated methods from base `Accelerator` class ( #10448 )
2021-11-10 12:58:24 +05:30
puhuk
f9b9cdb0d1
Remove deprecated accelerator pass through functions in Accelerator ( #10403 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-11-08 17:36:37 +00:00
Adrian Wälchli
a270a79ed9
Rename "master" methods to "main" in ClusterEnvironment plugins ( #10103 )
...
* rename occurrences of master port, master address, master node, master process
* rename properties
* add property decorators
* occurrences in docs
* update changelog
* update changelog
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add lost method
* create deprecation
* add changelog
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix typo (but it was already there!!!)
* Apply suggestions from code review
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
* add todo
* update more occurrences
* add types
* add missing import
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2021-11-08 12:32:58 +00:00
Carlos Mocholí
9237106451
Clip before step ( #10248 )
2021-10-30 11:27:49 +01:00
Kaushik B
cedaebfcbb
Add `auto_device_count` method to `Accelerators` ( #10222 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-29 22:31:32 +02:00
Carlos Mocholí
81d15c5986
Implement double optimizer closure for hook structure consistency ( #10167 )
2021-10-29 13:03:04 +00:00
Carlos Mocholí
03f01fb5ec
Fix gradient norm tracking and gradient clipping ( #9287 )
...
* WIP
* Progress
* Undo test change
* Fix plugin closure execution order
* Update CHANGELOG
* Fix manual optimization on AMP and skipping backward
* Fix for deepspeed
* Typo
* Hook test for manual closure
* Add skipping test with AMP
* You are hideous, apex
* Add deepspeed test
* Update CHANGELOG
* Fix for broken master
* Add RunIf
* FIXMEs
* Rename
* Fix grad norm
* add a simple test
* update test
* update test
* update test
* fix merge conflicts
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Sea of changes
* Undo change
* Introduce TPUPrecisionPlugin
* Undo changes
* Undo changes
* Resolve FIXME
* Undo change
* Undo change
* Undo change
* Fix FIXMEs
* Fix FIXME
* Correct value
* Bad merge
* Fix circular imports
* WIP
* Fixing clipping
* Fixes
* Bad merge
* Move optimizer step and clipping into the `PrecisionPlugin`
* Fix AMP
* Update CHANGELOG
* Fix tests
* Underscore
* Progress
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Remove pre_optimizer_step
* Missed one
* Progress
* Progress
* Fix test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update FIXMEs
* Fix test
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix test
* DeepSpeed warning. mypy
* Rename
* Finish tests
* Update CHANGELOG
* Dumb fixes
* accelerator=auto
* Apply suggestions from code review
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Update on comments
* Use ClassifModule
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-28 15:23:27 +00:00
Carlos Mocholí
48b6292cf0
Move optimizer step and clipping into the `PrecisionPlugin` ( #10143 )
2021-10-26 17:26:26 +02:00
Rohit Gupta
93266e2c22
Avoid deprecated warnings from accelerator and checkpoint connector ( #10142 )
2021-10-26 14:10:30 +02:00
Carlos Mocholí
b376799430
Minor fixes related to clipping ( #10130 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-10-25 16:40:22 +00:00
Adrian Wälchli
d41902883a
Update `optimizer_step` methods in accelerator and plugins ( #10023 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-10-20 21:36:27 +01:00
Carlos Mocholí
ef5a12212a
Isolate optimizer step logic to the `PrecisionPlugin` ( #10029 )
2021-10-20 15:43:08 +00:00
Carlos Mocholí
e8beceb631
Add `TPUPrecisionPlugin` ( #10020 )
...
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-10-19 17:48:57 +00:00
Carlos Mocholí
e5dfdf34f9
Avoid deprecation warning after #9901 ( #9951 )
2021-10-16 17:36:25 +01:00
four4fish
a002f872ea
[2/n] Directly call TrainingTypePlugin APIs instead of going through the Accelerator ( #9901 )
...
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-14 17:38:22 +02:00
Danielle Pintz
940b910d27
[2/4] Add DeviceStatsMonitor callback ( #9712 )
...
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-10-13 18:29:36 +00:00
Rohit Gupta
4decbc0d95
Deprecate `dataloader_idx` from `on_train_batch_start/end` ( #9816 )
...
* deprecate hooks
* dep todo
* explicit
* Apply suggestions from code review
* Apply suggestions from code review
* code review
* base
2021-10-07 10:18:11 +00:00
Carlos Mocholí
0ddd6a8c19
Remove `_NATIVE_AMP_AVAILABLE` checks ( #9747 )
2021-09-29 15:34:26 +02:00
Carlos Mocholí
9ebfbbc349
Remove unused `post_optimizer_step` ( #9746 )
2021-09-29 13:09:22 +00:00
four4fish
15cd6ad45b
Call TrainingTypePlugin collective functions directly instead of going through the Accelerator ( #9677 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-09-27 14:52:57 +02:00
Danielle Pintz
ab069876cb
[1/4] Add get_device_stats to accelerator interface ( #9586 )
2021-09-26 21:09:16 -07:00
ananthsub
41e3be197f
Remove `call_configure_sharded_model` lifecycle property ( #9612 )
2021-09-24 03:57:53 +02:00
Aki Nitta
f5608e90d6
Document exceptions in accelerators ( #9558 )
...
* Document exceptions in ipu.py
* Document exceptions in tpu.py
* Document exceptions in gpu.py
2021-09-18 15:14:08 +09:00
Carlos Mocholí
b1ed1db089
Keep global step update in the loop ( #8856 )
2021-09-14 19:21:39 +05:30
Kaushik B
b294c5760e
Fix type hint for filepath ( #9434 )
2021-09-10 21:38:54 +00:00
Danielle Pintz
cc2ac02dd1
Move add_to_queue/get_from_queue to DDPSpawnPlugin ( #9118 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-09-10 20:58:02 +00:00
Carlos Mocholí
3070a9ea6e
Fix hiddens type annotation ( #9377 )
2021-09-09 08:45:52 +01:00
Jirka Borovec
6e124e7207
CI: precommit - docformatter ( #8584 )
...
* CI: precommit - docformatter
* fix deprecated
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 12:49:09 +00:00
four4fish
f01a9a6cd2
Remove `BasePlugin` ( #9066 )
...
* Remove BasePlugin
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-08-25 19:10:28 +00:00
Sean Naren
bac8b1be81
Add support for CPU AMP autocast ( #9084 )
2021-08-25 12:18:00 +00:00
four4fish
c912ebf889
Remove TrainingTypePlugin.on_save and Accelerator.on_save ( #9023 )
...
* Remove TrainingTypePlugin.on_save and Accelerator.on_save
2021-08-23 10:11:00 -07:00
ananthsub
8a931732ae
Remove unused `on_train_epoch_end` hook in accelerator ( #9035 )
2021-08-23 00:20:10 +05:30
four4fish
13e64e6a80
Remove deprecated functions from accelerator.py ( #9019 )
2021-08-22 00:25:42 +02:00
Carlos Mocholí
d0efb55b0f
Delete `TrainingEpochLoop._dataloader_idx` which always equals 0 ( #8911 )
2021-08-16 13:34:42 +02:00
Carlos Mocholí
93ab24d1ee
Replace DataLoader sampler once for IPUs ( #8858 )
2021-08-16 11:28:05 +02:00
Carlos Mocholí
ed13040729
Connect the model to the training type plugin at the start of run ( #8536 )
2021-08-04 17:43:34 +02:00
Caleb Robinson
9ca02f58ae
Fix an import deprecation warning ( #8687 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-08-03 22:17:28 +00:00
Jirka Borovec
f67892ea96
CI: yesqa ( #8564 )
...
* add yesqa
* fix flake8
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-08-02 16:05:56 +00:00
Sean Naren
07b7dc9c17
[Fix] Add delay property for checkpointing, refactor loading checkpoint (DeepSpeed Checkpointing Fix 1/n) ( #8627 )
...
* Add property to delay checkpointing, move loading checkpoint file into the run function to allow deepspeed engine to be loaded
* Add a small test
* Apply suggestions from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update pytorch_lightning/accelerators/accelerator.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Address review
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-30 11:31:08 +01:00
Santiago Castro
b256d6acd3
Avoid unnecessary list creation ( #8595 )
2021-07-28 13:36:45 +05:30
Carlos Mocholí
a64cc37394
Replace `yapf` with `black` ( #7783 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 13:37:35 +02:00
thomas chaton
c9af1a7aec
[bugfix] Reduce memory leaks ( #8490 )
...
* reduce memory leak
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update changelog
* Apply suggestions from code review
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
* resolve flake8
* update on comments
* resolve bug
* update
* Undo whitespace changes
* remove bug
* resolve flake8
* revert change
* update on comments
* delete the ddp wrapper as it holds memory
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* resolve flake8
* update on comments
* update changelog
* resolve test
* Update CHANGELOG
* Refactor teardown
* Fix comment
* Do it for non-gpu too
* remove ref when the model is not a lightning_module
* Fix import error
* move down
* resolve bug
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* resolve assignment
* update
* move above
* Fix device calls to support tpu training
* Update todo
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Ethan Harris <ewah1g13@soton.ac.uk>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2021-07-21 11:37:05 +02:00
Carlos Mocholí
6ce77a102b
Set minimum PyTorch version to 1.6 ( #8288 )
...
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2021-07-13 17:12:49 +00:00
Carlos Mocholí
c5a120ed9d
Update to Mypy>0.9 ( #8386 )
2021-07-13 08:23:36 +02:00
Carlos Mocholí
eb6d991218
Refactor plugins backward ( #8328 )
2021-07-08 16:02:09 +02:00
Adrian Wälchli
d73c32ab51
move `torch.cuda.set_device()` to enable collective calls earlier in setup ( #8312 )
2021-07-07 13:15:41 +02:00
Adrian Wälchli
ea5cfd2005
move batch to device before sending it to hooks ( #7378 )
...
* update train step
* test
* x
* limits
* val
* typeo
* x
* x
* step
* min gpus
* run all loops
* x
* limit test
* profiler
* clean up accelerator code
* move files
* rename
* move tests
* changelog
* reorder callbacks and model hooks
* add test description
* replace unnecessary method
* fix chlog
* adjust batch_to_device for DP Plugin
* update tests for dataloader idx
* unused imports
* hook change
* switch None
* clear memory
* change to None
* None
* None
* memory savings
* remove redundant todo
* hack
* cheat
* Revert "cheat"
This reverts commit a8433bd0b4.
* Revert "hack"
This reverts commit 43a6d1edeb.
* update new epoch loop
* remove from old loop code
* update chlog
* update hook test
* changelog
* teardown
* integrate changes in new eval loop
* fix hook calls
* add prediction step
* bad merge
* Revert "bad merge"
This reverts commit 488080863c.
* fix train batch hook test
* rm -rf _notebooks
* update chlog
* release memory
* fix type
* notebooks mess
* debug
* Revert "debug"
This reverts commit eec4ee2f77.
* teardown
* fix teardown bug
* debug
* x
* debug
* Revert "debug"
This reverts commit a6e6101946.
Revert "debug"
This reverts commit 5ddeaec069.
debug
debug
Revert "debug"
This reverts commit 605be746f7daedf265b2c05a1c153ce543394435.
Revert "Revert "debug""
This reverts commit a7612d5410409ed886cfb609457349ecf44cbfa8.
debug
x
x
x
s
tol
x
tol
* Fix changelog
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-07-05 09:31:39 +01:00
Carlos Mocholí
74eb6cc7e9
Clean `cuda.empty_cache` usage ( #8199 )
2021-06-30 13:04:24 +02:00
deepsource-autofix[bot]
03154eb30a
Refactor unnecessary `else` / `elif` when `if` block has a `return` statement ( #8156 )
...
Co-authored-by: deepsource-autofix[bot] <62050782+deepsource-autofix[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2021-06-28 15:27:41 +05:30
Carlos Mocholí
4d9b72b8a9
Nuke RPC ( #8101 )
2021-06-23 18:31:13 +00:00