Jirka Borovec
b72ed71d4e
Refactor: clean trainer device & distrib setters ( #5297 )
...
* naive replace
* simplify
* clean
* .
* fix
* .
* fix
* fix
2021-01-04 17:10:13 +00:00
Jirka Borovec
a884866ff0
Unify names in Utils ( #5199 )
...
* warnings
* argparse
* mutils
* xla device
* deprecated
* tests
* simple
* flake8
* fix
* flake8
* 1.4
2020-12-22 00:23:33 +01:00
Jirka Borovec
0f36525e8f
fix/enable - check F401 ( #5201 )
...
* refactor - check F401
* missed
* fix
2020-12-21 10:15:04 +01:00
Jirka Borovec
05f25f3a54
update usage of deprecated checkpoint_callback ( #5006 )
...
* drop usage of deprecated checkpoint_callback
* fix
* fix
2020-12-09 14:14:34 -05:00
chaton
ef8ef12fd0
[feat] pp 2/n ( #5026 )
...
* Added changes for RPC plugin
* Add missing kwargs
* Fix code format
* Loading refactors by introducing is_distributed var, fix optimizer step flow
* Add rpc guard
* Added docstrings and typing
* resolve comments
* Add additional rpc hook, refactor name of exit process hook for clarity
* remove annotation
* Modify behaviour to allow optional return, add test for rpc plugin
* resolve tests
* rename is_ddp_based
* update
* update for windows
* update
* resolve test
* code smell
* Added sequential plugin
* resolve bug
* update
* cleanup
* add Exception
* resolve docs
* Remove ddp support
* Revert distributed -> ddp
* Update pl_examples/basic_examples/conv_sequential_example.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pl_examples/basic_examples/conv_sequential_example.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Address code review points
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Add missing return
* Fix formatting, add datamodule args
* add small comment
* resolve comments
* resolve comments
* update source for fairscale
* update extras
* remove staticmethod
* resolve flake8
* Skip tests that are failing due to bug upstream with multiple optimizers and shard
* update
* update on comments
* clean test
* latest comments
* remove old comments
* add todo
* Update version
* update
* resolve bugs
* resolve bugs
* update test
* remove hanging test
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* resolve on comments
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* resolve on comments
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* Update pytorch_lightning/plugins/ddp_sequential_plugin.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* remove ImportError
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2020-12-09 12:56:51 +00:00
Jirka Borovec
53d7c9555c
drop usage of deprecated distributed_backend ( #5009 )
...
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
2020-12-09 09:18:23 +01:00
Sean Naren
ee9b3fe574
[feat] pp 1/n ( #5016 )
...
* Added changes for RPC plugin
* Add missing kwargs
* Fix code format
* Loading refactors by introducing is_distributed var, fix optimizer step flow
* Add rpc guard
* Added docstrings and typing
* resolve comments
* Add additional rpc hook, refactor name of exit process hook for clarity
* remove annotation
* Modify behaviour to allow optional return, add test for rpc plugin
* resolve tests
* rename is_ddp_based
* update
* update for windows
* update
* resolve test
* code smell
* Revert back to init_ddp_connection for backwards compat
* Swap to explicit name for property
* Add missing speed parity increase for CI variability, fix call counts for child process
Co-authored-by: tchaton <thomas@grid.ai>
2020-12-08 22:02:10 +00:00
Gianluca Scarpellini
16fa4ed1e5
Fixed PYTHONPATH for ddp test model ( #4528 )
...
* Fixed PYTHONPATH for ddp test model
* Removed debug calls
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-05 20:09:47 +00:00
Lezwon Castelino
12cb9942a1
Tpu save ( #4309 )
...
* convert xla tensor to cpu before save
* move_to_cpu
* updated CHANGELOG.md
* added on_save to accelerators
* if accelerator is not None
* refactors
* change filename to run test
* run test_tpu_backend
* added xla_device_utils to tests
* added xla_device_utils to test
* removed tests
* Revert "added xla_device_utils to test"
This reverts commit 0c9316bb
* fixed pep
* increase timeout and print traceback
* lazy check tpu exists
* increased timeout
removed barrier for tpu during test
reduced epochs
* fixed torch_xla imports
* fix tests
* define xla utils
* fix test
* aval
* chlog
* docs
* aval
* Apply suggestions from code review
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
2020-12-02 13:05:11 +00:00
Jirka Borovec
add387c6a7
CI cleaning ( #4941 )
...
* set
* cut
* env
* once
* env
* env
* env
2020-12-02 10:00:05 +00:00
Jirka Borovec
11e73ceaa6
fix import and typo in AMP ( #4871 )
...
* fix import and typo
* docs
* apex
* fix
* typo
2020-11-26 23:45:52 +01:00
chaton
4803f681b0
[FEAT] DDP: Create DDPLauncher ( #4515 )
...
* test
* poc
* add simpler test for ddp
* typo
* resolve pep8
* try coverage testing
* trying to add coverage inside ddp
* resolve flake8
* update
* forgot coverage
* move .coveragerc
* update rcfile path
* update
* test
* update
* adding description
* add DDPLauncher decorator
* add undecorated
* push update
* update ddp testing
* Update tests/backends/launcher.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update tests/backends/launcher.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update on comments
* update on comments
* resolve comments
* resolve isort
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2020-11-20 10:17:46 +00:00
William Falcon
624f5b5938
ref: unify slurm and TE under backendPlugin 3/n ( #4581 )
2020-11-08 15:32:37 -05:00
William Falcon
bfaf014096
ref: unify slurm and TE under backendPlugin 2/n ( #4580 )
2020-11-08 15:07:16 -05:00
William Falcon
0f64f15f52
ref: unify slurm and TE under backendPlugin 1/n ( #4578 )
...
* ref: unify slurm and TE under backendPlugin
* ref: unify slurm and TE under backendPlugin
2020-11-08 14:28:55 -05:00
Jirka Borovec
f37444fa3e
CI: add flake8 ( #4239 )
2020-10-19 21:20:17 +01:00
William Falcon
09c2020a93
notices ( #4118 )
2020-10-13 07:18:07 -04:00
William Falcon
5b645d713e
Covv1 ( #4072 )
...
* temporary drop metrics tests while speeding them up
* cov
* cov
* docs
2020-10-11 10:21:53 -04:00
William Falcon
7ffe05a3d1
ref: accelerator names ( #4066 )
...
* ref: accelerator names
* docs
2020-10-11 01:05:14 -04:00
William Falcon
5b261a230e
enable passing in custom accelerators ( #4050 )
...
* enable custom accelerators
* ref: finish decoupling apex, LM and backward
* ref: finish decoupling apex, LM and backward
* ref: finish decoupling apex, LM and backward
2020-10-10 09:21:08 -04:00
William Falcon
2b255a3df4
ref: enable custom clusters (1/n) ( #4048 )
...
* enable cluster plugins
* enable cluster plugins + test backend choices
* enable cluster plugins + test backend choices
* enable cluster plugins + test backend choices
* enable cluster plugins + test backend choices
* enable cluster plugins + test backend choices
* enable cluster plugins + test backend choices
2020-10-10 08:09:29 -04:00
Adrian Wälchli
cc9781a0ad
Deprecate early_stop_callback Trainer argument (part 2) ( #3845 )
...
* update tests with EarlyStopping default
* imports
* revert legacy tests
* fix test
* revert
* revert
2020-10-04 17:36:47 -04:00
William Falcon
70e792344a
test selecting the correct backend. temp backends while slurm and TE are decoupled ( #3848 )
...
* test selecting the correct backend. temp backends while slurm and TE are decoupled
* test selecting the correct backend. temp backends while slurm and TE are decoupled
2020-10-04 15:44:50 -04:00
William Falcon
35d1111994
[WIP] ref: decoupled ddp, ddp spawn (finish 3733) ( #3819 )
...
* ref: finish #3733
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* remove deprecated test
* Update pytorch_lightning/accelerators/ddp_backend.py
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
* remove deprecated test
* remove deprecated test
* remove deprecated test
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
2020-10-03 14:05:31 -04:00
William Falcon
e17712e5c3
part 5 of #3733 ( #3774 )
...
* ref: part 4 of #3733
* ref: part 4 of #3733
* ref: part 4 of #3733
2020-10-01 12:34:12 -04:00
William Falcon
622c5c3982
ref: part 4 of #3733 ( #3773 )
...
* ref: part 4 of #3733
* ref: part 4 of #3733
* ref: part 4 of #3733
* ref: part 4 of #3733
2020-10-01 11:26:58 -04:00
William Falcon
ac2b0f0f06
ref: continue #3733 ( #3767 )
...
* ref: #3733 part 2
* ref: #3733 part 2
2020-10-01 09:25:33 -04:00