Commit Graph

20 Commits

Jirka Borovec 0f36525e8f
fix/enable - check F401 (#5201)
* refactor - check F401

* missed

* fix
2020-12-21 10:15:04 +01:00
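
For context: flake8's F401 rule reports imports that are never used. A module that intentionally re-exports a name can keep the import and satisfy the check by listing the name in __all__. A minimal illustrative sketch (the module and re-exported name are hypothetical, not taken from the PR):

    # hypothetical __init__.py that only re-exports a name
    from os.path import join  # flake8 would report F401 ("imported but unused") if never referenced

    # listing the name in __all__ marks the import as an intentional re-export
    __all__ = ["join"]
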
Jirka Borovec 2d54116baa
annotate unused vars (#5017)
* annotate all unused vars

* rank_zero_warn

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* f1 fixed

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2020-12-19 13:53:06 +01:00
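
The convention behind "annotate all unused vars": values that are deliberately discarded are bound to _ so linters stop reporting them. A tiny illustrative sketch, not code from the PR:

    batch = (1.0, 2.0, 3.0)

    # keep only the first element; the rest is deliberately discarded
    first, *_ = batch

    # a loop counter that is never read
    for _ in range(3):
        pass
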
Jirka Borovec 059eaecbb4
set xxx_AVAILABLE as protected (#5082)
* set xxx_AVAILABLE as protected

* docs
2020-12-14 20:19:05 +05:30
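
The xxx_AVAILABLE flags are module-level booleans recording whether an optional dependency can be imported; the leading underscore added here marks them as protected, internal API. A minimal sketch of the pattern, assuming a helper along these lines (the helper name and the packages checked are illustrative):

    from importlib.util import find_spec


    def _module_available(module_path: str) -> bool:
        """Return True if the module can be imported, without actually importing it."""
        try:
            return find_spec(module_path) is not None
        except ModuleNotFoundError:
            return False


    # the leading underscore signals these flags are internal (protected)
    _APEX_AVAILABLE = _module_available("apex.amp")
    _XLA_AVAILABLE = _module_available("torch_xla")
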
chaton 2c3d43dcb5
Initialize trainer with None in DDPAccelerator (#4915)
* Initialize trainer with None

* add typing to all accelerators

* resolve imports

* update

* add typing

* removed typo

* update

* Fix formatting and imports in accelerator

Co-authored-by: maxjeblick <maxjeblick@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-12-10 15:24:44 +01:00
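
A sketch of the shape of this change: accelerator constructors accept an optional trainer so they can be built before a trainer is attached, and the arguments gain type hints. Class and attribute names below are illustrative rather than the exact Lightning code:

    from typing import Any, Optional


    class Accelerator:
        def __init__(self, trainer: Optional["Trainer"] = None,
                     cluster_environment: Optional[Any] = None) -> None:
            self.trainer = trainer
            self.cluster_environment = cluster_environment


    class DDPAccelerator(Accelerator):
        def __init__(self, trainer: Optional["Trainer"] = None,
                     cluster_environment: Optional[Any] = None) -> None:
            # the trainer may be attached later, hence the None default
            super().__init__(trainer, cluster_environment)
            self.nickname = "ddp"
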
Rohit Gupta bcbba3b702
Simplify GPU and TPU accelerator (#5024) 2020-12-09 14:12:44 -05:00
Sean Naren ee9b3fe574
[feat] pp 1/n (#5016)
* Added changes for RPC plugin

* Add missing kwargs

* Fix code format

* Loading refactors by introducing is_distributed var, fix optimizer step flow

* Add rpc guard

* Added docstrings and typing

* resolve comments

* Add additional rpc hook, refactor name of exit process hook for clarity

* remove annotation

* Modify behaviour to allow optional return, add test for rpc plugin

* resolve tests

* rename is_ddp_based

* update

* update for windows

* update

* resolve test

* code smell

* Revert back to init_ddp_connection for backwards compat

* Swap to explicit name for property

* Add missing speed parity increase for CI variability, fix call counts for child process

Co-authored-by: tchaton <thomas@grid.ai>
2020-12-08 22:02:10 +00:00
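
The "rpc guard" mentioned above means RPC-specific setup and teardown only run when the plugin is active and torch.distributed.rpc is usable. A heavily simplified, hypothetical sketch; the hook names are illustrative and not the actual plugin interface:

    import torch.distributed.rpc as rpc


    class RPCPlugin:
        """Hypothetical skeleton of an RPC-based plugin."""

        def init_rpc_connection(self, global_rank: int, world_size: int) -> None:
            rpc.init_rpc(name=f"worker{global_rank}", rank=global_rank, world_size=world_size)

        def exit_rpc_process(self) -> None:
            # block until outstanding RPCs finish, then shut the RPC agent down
            rpc.shutdown()
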
chaton 02152c1729
Simplify optimization Logic (#4984)
* Rely on ddp plugin for blocking sync behaviour, and skip if we're using manual optimization

* debug

* Revert "debug"

This reverts commit ccca6b6b

* Expose manual reduce for automatic optimization

* Add input arguments

* Enable parity test

* clean imports

* Expose hook after to ensure we reset

* Fix naming

* add

* fix test

* uniformize optimizer logic

* resolve test

* resolve flake8

* resolve amp bug

* update tests

* remove bug

* remove optimizer_step in accelerators

* typo

* update lightning optimizer

* set doesn't work with ddp_spawn

* resolve flake8

* update threshold

* ignore pyright

* correct codeFactor

* remove useless if

* remove zero_grad function

* simplify step

* remove typo

* resolve bug

* Apply suggestions from code review

* update on comments

* resolve bugs

* remove tests

* Update pytorch_lightning/trainer/configuration_validator.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* simplify testing

* add more tests

Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2020-12-07 12:55:49 +00:00
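
The first bullet (relying on the DDP plugin for blocking sync behaviour) refers to skipping the gradient all-reduce on intermediate accumulation steps. A minimal sketch of the idea, assuming the wrapped model is a DistributedDataParallel instance; the function name is illustrative:

    from contextlib import contextmanager


    @contextmanager
    def block_ddp_sync_behaviour(model, should_block: bool):
        """Skip DDP's gradient all-reduce (via no_sync) while accumulating gradients."""
        if should_block and hasattr(model, "no_sync"):
            with model.no_sync():
                yield
        else:
            yield
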
Lezwon Castelino 12cb9942a1
Tpu save (#4309)
* convert xla tensor to cpu before save

* move_to_cpu

* updated CHANGELOG.md

* added on_save to accelerators

* if accelerator is not None

* refactors

* change filename to run test

* run test_tpu_backend

* added xla_device_utils to tests

* added xla_device_utils to test

* removed tests

* Revert "added xla_device_utils to test"

This reverts commit 0c9316bb

* fixed pep

* increase timeout and print traceback

* lazy check tpu exists

* increased timeout
removed barrier for tpu during test
reduced epochs

* fixed torch_xla imports

* fix tests

* define xla utils

* fix test

* aval

* chlog

* docs

* aval

* Apply suggestions from code review

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-02 13:05:11 +00:00
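
The move_to_cpu step exists because XLA tensors live in TPU memory, so a checkpoint is copied to the host before torch.save writes it. A simplified sketch of such a helper (it ignores custom containers such as namedtuples):

    import torch


    def move_to_cpu(obj):
        """Recursively copy any tensors in obj (including XLA tensors) to CPU."""
        if isinstance(obj, torch.Tensor):
            return obj.cpu()
        if isinstance(obj, dict):
            return {k: move_to_cpu(v) for k, v in obj.items()}
        if isinstance(obj, (list, tuple)):
            return type(obj)(move_to_cpu(v) for v in obj)
        return obj
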
chaton c2e6e68c7e
optimizer clean up (#4658)
* add LightningOptimizer

* typo

* add mock closure

* typo

* remove logic in optimizer_step

* update

* update

* update

* deactivate LightningOptimizer for horovod

* resolve flake

* typo

* check optimizer name

* change name

* added backward to LightningOptimizer

* remove use_lightning_optimizer

* move update

* simplify init

* resolve comments

* resolve bug

* update

* update

* resolve bugs

* resolve flake8

* set state

* work manual_optimizer_step

* add doc

* add enable_pl_optimizer

* make optimizer_step

* add make_optimizer_step

* add examples

* resolve test

* add test_optimizer_return_options_enable_pl_optimizer

* add enable_pl_optimizer=True

* update

* update tests

* resolve bugs

* update

* set Trainer to False

* update

* resolve bugs

* update

* remove from doc

* resolve bug

* typo

* update

* set to True

* simplification

* typo

* resolve horovod

* unwrap horovod

* remove Optimizer

* resolve horovod

* move logic to amp_backend

* doesn't seem to be picklable

* update

* add again

* resolve some bugs

* cleanup

* resolve bug with AMP

* change __repr__

* round at -12

* update

* update

* update

* remove from horovod

* typo

* add convert_to_lightning_optimizers in each accelerators

* typo

* forgot

* forgot a convert_to_lightning_optimizers

* update

* update

* update

* increase coverage

* update

* resolve flake8

* update

* remove useless code

* resolve comments + add support for LightningOptimizer base class

* resolve flake

* check optimizer get wrapped back

* resolve DDPSharded

* reduce code

* lightningoptimizer

* Update pytorch_lightning/core/optimizer.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update pytorch_lightning/core/lightning.py

* remove reference to step function

* Apply suggestions from code review

* update on comments

* resolve

* Update CHANGELOG.md

* add back training_step in apex and native_amp

* rename optimizer_step

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-01 00:09:46 +00:00
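
At its core, LightningOptimizer is a thin wrapper that delegates to the underlying torch.optim.Optimizer and funnels every update through step with a closure. A stripped-down sketch, not the real class:

    class LightningOptimizer:
        """Minimal sketch of an optimizer wrapper; the real class does considerably more."""

        def __init__(self, optimizer):
            self._optimizer = optimizer

        def __getattr__(self, name):
            # anything not defined here (param_groups, zero_grad, ...) falls through
            return getattr(self._optimizer, name)

        def step(self, closure=None, **kwargs):
            if closure is None:
                def closure():  # no-op closure keeps the call signature uniform
                    return None
            return self._optimizer.step(closure=closure, **kwargs)

Usage would look roughly like opt = LightningOptimizer(torch.optim.SGD(model.parameters(), lr=0.1)) followed by opt.step().
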
Jirka Borovec 442d57f1e9
simplify imports xla / TPU (#4872)
* xla

* tpu

* fix

* fix

* flake8
2020-11-27 00:37:48 +01:00
chaton 4018237c30
[FEAT] Add lambda closure to manual_optimizer_step (#4618)
* added lambda_closure

* move to types

* add 2 new tests

* make example more complex

* add complex example to doc

* added more tests

* resolve doc

* typo

* update

* update tpu optimizer_step

* Apply suggestions from code review

* Update pytorch_lightning/core/lightning.py

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>

* update

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-12 19:22:06 +00:00
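
The lambda closure wraps forward and backward in a callable so the optimizer (or a precision plugin) can re-evaluate the loss when it needs to, as LBFGS does. A self-contained sketch of the idea; the helper name is an assumption, not the Lightning API:

    def make_closure(model, batch, target, loss_fn, optimizer):
        """Bundle zero_grad + forward + backward into a callable for optimizer.step."""
        def closure():
            optimizer.zero_grad()
            loss = loss_fn(model(batch), target)
            loss.backward()
            return loss
        return closure


    # the training loop would then call: optimizer.step(closure=make_closure(...))
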
Sean Naren bacabaebaf
Sharded Accelerator 1/n: Expose clip gradients to plugins via abstract class (#4639)
* Added abstract precision plugin to expose clip_gradients function, use within accelerator to clip gradients

* Exclude model from override, keep optimizer (needed for sharded clip gradients), add override for O2 support apex

* Fix doc

* Applied codereview changes

* Refactored clip function to encapsulate tpu changes with tpu accelerator. Default to standard clip function for vanilla torch

* Pass correct grad clip val

* Moved var to property

* Apply code review suggestions
2020-11-12 17:18:09 +00:00
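
A sketch of the abstract-class shape described in the bullets: the precision plugin owns clip_gradients, the default implementation clips with vanilla torch, and sharded or TPU plugins can override it. Class names are illustrative:

    from abc import ABC, abstractmethod

    import torch


    class PrecisionPlugin(ABC):
        @abstractmethod
        def clip_gradients(self, optimizer, clip_val: float) -> None:
            ...


    class NativePrecisionPlugin(PrecisionPlugin):
        def clip_gradients(self, optimizer, clip_val: float) -> None:
            # default path: clip every parameter the optimizer manages with vanilla torch
            params = [p for group in optimizer.param_groups for p in group["params"]]
            torch.nn.utils.clip_grad_norm_(params, clip_val)
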
Sean Naren 33470ba605
Prevent crash if sync_dist=True on CPU (#4626)
* Added test/fix for sync_dist raising NotImplementedError

* Fixed comments/formatting

* Revert base class change, enforce sync tensors across accelerators, added GPU test
2020-11-11 22:04:05 +00:00
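
The fix amounts to treating the reduction as a no-op when no distributed process group has been initialized, instead of raising NotImplementedError. A minimal sketch (the function name is illustrative):

    import torch
    import torch.distributed as dist


    def sync_tensor(tensor: torch.Tensor) -> torch.Tensor:
        """All-reduce across processes when a group exists; otherwise return the tensor unchanged."""
        if dist.is_available() and dist.is_initialized():
            dist.all_reduce(tensor, op=dist.ReduceOp.SUM)
        return tensor
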
William Falcon ee35907170
Accelerator docs (#4583)
* accelerator docs

* accelerator docs
2020-11-08 17:24:41 -05:00
Ananya Harsh Jha 01ab2a933d
[bug] [docs] Clearer optimizer_step override instructions (#4455)
* fix

* flags

* remove defaults
2020-11-02 22:13:34 +00:00
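
The documentation being clarified covers overriding optimizer_step in a LightningModule. A hedged sketch of such an override doing a simple learning-rate warm-up; the hook's exact argument list has changed between Lightning versions, so treat the signature as illustrative:

    import pytorch_lightning as pl


    class WarmupModule(pl.LightningModule):
        def optimizer_step(self, epoch, batch_idx, optimizer, optimizer_idx,
                           optimizer_closure, **kwargs):
            # linearly warm the learning rate up over the first 500 steps
            if self.trainer.global_step < 500:
                scale = (self.trainer.global_step + 1) / 500.0
                for pg in optimizer.param_groups:
                    pg["lr"] = scale * 1e-3
            optimizer.step(closure=optimizer_closure)
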
Akihiro Nitta b45b57cc58
Use `Optional` for arguments set to `None` by default (#4164)
* Use `Optional` for variables set to `None` by default

* Use `Optional` instead of `Union[None, ...]` for consistency
2020-10-15 23:02:50 +02:00
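
The rule is simple: an argument whose default is None gets an Optional[...] annotation, and Optional[X] is preferred over the equivalent Union[X, None]. A tiny example:

    from typing import Optional


    def resolve_seed(seed: Optional[int] = None) -> int:
        """Return the given seed, or a fixed fallback when None is passed."""
        return 1234 if seed is None else seed
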
William Falcon b9f2682b7d
clean docs, enable grad clip in manual mode (#4078)
* docs

* docs
2020-10-11 13:12:35 -04:00
William Falcon 7ffe05a3d1
ref: accelerator names (#4066)
* ref: accelerator names

* docs
2020-10-11 01:05:14 -04:00
William Falcon 4dbd761a1c
refactor 3/n (#2709)
* refactor into gpu accelerator (the same bullet repeated across 21 squashed commits)
2020-07-25 20:56:50 -04:00
William Falcon b34217e410
Refactor 2/n (#2708)
* refactor into gpu accelerator (the same bullet repeated across 7 squashed commits)
2020-07-25 17:31:34 -04:00