Vatsalya Chaubey
|
ce93d8bcfd
|
Handle errors due to uninitialized parameters (#7642)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
|
2021-06-14 15:56:03 +00:00 |
Adrian Wälchli
|
02fa32b7bc
|
Handle torch.jit scripted modules in layer summary (#6511)
|
2021-03-15 03:17:42 +01:00 |
Jirka Borovec
|
55dd3a4c64
|
Typing for tests 1/n (#6313)
* typing
* yapf
* typing
|
2021-03-09 11:27:15 +00:00 |
Jirka Borovec
|
b46d22197d
|
Refactor: skipif for AMPs 3/n (#6293)
* args
* native
* apex
* isort
|
2021-03-02 18:13:53 +05:30 |
Jirka Borovec
|
0f9134e043
|
Refactor: skipif for Windows 2/n (#6268)
* win
* isort
* flake8
|
2021-03-02 09:36:01 +00:00 |
Jirka Borovec
|
eb815000f6
|
Refactor: skipif for multi-gpus 1/n (#6266)
* ngpus
* gpu
* isort
* pt
* flake8
|
2021-03-02 09:03:32 +01:00 |
Justus Schock
|
da6dbc8d1d
|
PoC: Accelerator refactor (#5743)
* restoring the result from subprocess
* fix queue.get() order for results
* add missing "block_backward_sync" context manager
* add missing "block_backward_sync" context manager
* fix sync_batchnorm
* fix supported gpu-ids for tuple
* fix clip gradients and inf recursion
* accelerator selection: added cluster_environment plugin
* fix torchelastic test
* fix reduce early stopping decision for DDP
* fix tests: callbacks, conversion to lightning optimizer
* fix lightning optimizer does not pickle
* fix setting benchmark and deterministic option
* fix slurm amp test
* fix prepare_data test and determine node_rank
* fix retrieving last path when testing
* remove obsolete plugin argument
* fix test: test_trainer_config
* fix torchscript tests
* fix trainer.model access
* move properties
* fix test_transfer_batch_hook
* fix auto_select_gpus
* fix omegaconf test
* fix test that needs to simulate slurm ddp
* add horovod plugin
* fix test with named arguments
* clean up whitespace
* fix datamodules test
* remove old accelerators
* fix naming
* move old plugins
* move to plugins
* create precision subpackage
* create training_type subpackage
* fix all new import errors
* fix wrong arguments order passed to test
* fix LR finder
* Added sharded training type and amp plugin
* Move clip grad to precision plugin
* Added sharded spawn, select accelerators based on distributed_backend + enable custom fp16 plugin automatically
* Fix import issue, attempting to fix tests
* Fix initial test
* Reflect hook logic from master, should wrap model after move to device
* Optional state consolidation, since master has optimizers not wrapped
* change attribute for instance test
* reset optimizers
optimizers are not used in main process, so state would be wrong.
* legacy
* imports in accel
* legacy2
* trainer imports
* fix import errors after rebase
* move hook to new setup location
* provide unwrapping logic
* fix trainer callback system
* added ddp2 implementation
* fix imports .legacy
* move plugins
* restore legacy
* drop test.py from root
* add tpu accelerator and plugins
* fixes
* fix lightning optimizer merge
* reset bugreportmodel
* unwrapping
* step routing forward
* model access
* unwrap
* opt
* integrate distrib_type
* sync changes
* sync
* fixes
* add forgotten generators
* add missing logic
* update
* import
* missed imports
* import fixes
* isort
* mv f
* changelog
* format
* move helper to parallel plugin
* d
* add world size
* clean up
* duplicate
* activate ddp_sharded and tpu
* set nvidia flags
* remove unused colab var
* use_tpu <-> on_tpu attrs
* make some ddp_cpu and clusterplugin tests pass
* Ref/accelerator connector (#5742)
* final cleanup
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* connector cleanup
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* trainer cleanup
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* accelerator cleanup + missing logic in accelerator connector
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* add missing changes to callbacks
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* reflect accelerator changes to lightning module
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* clean cluster envs
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* cleanup plugins
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* add broadcasting
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* yapf
* remove plugin connector
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* plugins
* manual optimization
* update optimizer routing
* add rank to torchelastic
* fix memory mixed precision
* setstate on trainer for pickling in ddp spawn
* add predict method
* add back commented accelerator code
* adapt test for sync_batch_norm to new plugin
* fix deprecated tests
* fix ddp cpu choice when no num_processes are given
* yapf format
* skip a memory test that cannot pass anymore
* fix pickle error in spawn plugin
* x
* avoid
* x
* fix cyclic import in docs build
* add support for sharded
* update typing
* add sharded and sharded_spawn to distributed types
* make unwrap model default
* refactor LightningShardedDataParallel similar to LightningDistributedDataParallel
* update sharded spawn to reflect changes
* update sharded to reflect changes
* Merge 1.1.5 changes
* fix merge
* fix merge
* yapf isort
* fix merge
* yapf isort
* fix indentation in test
* copy over reinit scheduler implementation from dev1.2
* fix apex tracking calls with dev_debugger
* reduce diff to dev1.2, clean up
* fix trainer config test when gpus>0 and num_processes >0 and ddp_cpu
* sort plugin tests legacy/new
* fix error handling for amp on cpu
* fix merge
fix merge
fix merge
* [Feat] Resolve manual_backward (#5837)
* resolve manual_backward
* resolve flake8
* update
* resolve for ddp_spawn
* resolve flake8
* resolve flake8
* resolve flake8
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
* fix tests/accelerator tests on cpu
* [BugFix] Resolve manual optimization (#5852)
* resolve manual_optimization
* update
* update
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
* Remove copy trainer parameters to happen earlier within the loop and add safe guard to get ref model (#5856)
* resolve a bug
* Accelerator refactor sharded rpc (#5854)
* rpc branch
* merge
* update handling of rpc
* make devices etc. Optional in RPC
* set devices etc. later if necessary
* remove devices from sequential
* make devices optional in rpc
* fix import
* uncomment everything
* fix cluster selection
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
* resolve bug
* fix assert in rpc test
* resolve a test
* fix docs compilation
* accelerator refactor - fix for sharded parity test (#5866)
* fix memory issue with ddp_spawn
* x
x
x
x
x
x
x
x
x
* x
* Remove DDP2 as this does not apply
* Add missing pre optimizer hook to ensure lambda closure is called
* fix apex docstring
* [accelerator][BugFix] Resolve some test for 1 gpu (#5863)
* update
* revert init
* resolve a bug
* update
* resolve flake8
* update
* update
* update
* revert init
* resolve a bug
* update
* resolve flake8
* update
* update
* update
* update
* update
* revert init
* resolve a bug
* update
* resolve flake8
* update
* update
* update
* revert init
* update
* resolve flake8
* update
* update
* update
* update
* update
* all_gather
* update
* make plugins work, add misconfig for RPC
* update
* update
* remove breaking test
* resolve some tests
* resolve flake8
* revert to ddp_spawn
Co-authored-by: root <root@ip-172-31-88-60.ec2.internal>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Justus Schock <justus.schock@rwth-aachen.de>
* yapf isort
* resolve flake8
* fix apex doctests
* fix apex doctests 2
* resolve docs
* update drone
* clean env
* update
* update
* update
* update
* merge
* Fix RPC related tests, clean out old API, update for new accelerator API [skip ci] (#5881)
* Fix RPC related tests, clean out old API, update for new accelerator API
* Move tests out of legacy folder, update paths and names
* Update test_remove_1-4.py
* Expose properties for tpu cores/gpus/num_gpus
* Add root GPU property
* Move properties to properties.py
* move tests that were previously in drone
* Fix root GPU property (#5908)
* Move root GPU to property, remove horovod set as this is handled in horovod plugin, ensure we mock correctly to set GPU accelerator
* Add missing tests back
* fix best model path transfer when no checkpoint callback available
* Fix setup hook order [wip] (#5858)
* Call trainer setup hook before accelerator setup
* Add test case
* add new test
* typo
* fix callback order in test
Co-authored-by: tchaton <thomas@grid.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* rename ddp sequential -> rpc sequential for special test
* revert
* fix stupid merge problem
* Use property in connector for sampler (#5913)
* merge the import conflicts
* fix spawning of processes in slurm
* [wip] Fix some bugs for TPU [skip ci] (#5878)
* fixed for single tpu
* fixed spawn
* fixed spawn
* update
* update
* wip
* resolve bugs
* resolve bug
* update on comment
* removed decorator
* resolve comments
* set to 4
* update
* update
* need cleaning
* update
* update
* update
* resolve flake8
* resolve bugs
* exclude broadcast
* resolve bugs
* change test
* update
* update
* skip if meet fails
* properly raise trace
* update
* add catch
* wrap test
* resolve typo
* update
* typo
Co-authored-by: Lezwon Castelino <lezwon@gmail.com>
Co-authored-by: Your Name <you@example.com>
* resolve some tests
* update
* fix imports
* update
* resolve flake8
* update azure pipeline
* skip a sharded test on cpu that requires a gpu
* resolve tpus
* resolve bug
* resolve flake8
* update
* update utils
* revert permission change on files
* suggestions from carlos
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* remove unrelated formatting changes
* remove incomplete comment
* Update pytorch_lightning/accelerators/__init__.py
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* remove unrelated formatting change
* add types
* warn 1.7 ddp manual backward only if ddp kwarg unset
* yapf + isort
* pep8 unused imports
* fix cyclic import in docs
* Apply suggestions from code review
* typer in accelerator.py
* typo
* Apply suggestions from code review
* formatting
* update on comments
* update typo
* Update pytorch_lightning/trainer/properties.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* update
* suggestion from code review
* suggestion from code review
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: SeanNaren <sean@grid.ai>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-88-60.ec2.internal>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
Co-authored-by: root <root@ip-172-31-88-60.ec2.internal>
Co-authored-by: Lezwon Castelino <lezwon@gmail.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2021-02-12 15:48:56 -05:00 |
Jirka Borovec
|
a0f7831278
|
fix misleading imports in tests (#5873)
* fix imports
* .
|
2021-02-09 05:10:52 -05:00 |
Jirka Borovec
|
bd920b4102
|
Refactor simplify tests (#5861)
* add new
* restructure
* yapf
* move
* fix
|
2021-02-08 11:52:02 +01:00 |
Jirka Borovec
|
4faaef7758
|
formatting tests: 4/n (#5846)
* models
* ckpt
* core
* log
|
2021-02-06 12:07:26 +01:00 |
chaton
|
d0aaf983b9
|
[Feat] Adding PruningCallback (#5618)
* wip
* add pruning callback
* add condition for duplicated weights
* update on comments
* update on comments
* update on comments
* add more tests
* resolve flake8
* resolve on comments
* update changelog
* update on comments
* update on comments
* change order
* remove ddp_spawn skip
* update
* typo
* Update pytorch_lightning/callbacks/pruning.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/callbacks/pruning.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update on comments
* forgot platform
* update on comments
* remove @rank_zero_only
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2021-01-27 01:00:42 -05:00 |
chaton
|
5f3372871a
|
[feat] Add PyTorch Profiler. (#5560)
* add profiler
* add profiler
* update
* resolve flake8
* update doc
* update changelog
* clean doc
* delete prof file
* merge pr codebase
* update
* update doc
* update doc
* update doc
* update on comments
* update docstring
* update docstring
* try
* update test
* Update pytorch_lightning/profiler/__init__.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* Update pytorch_lightning/profiler/__init__.py
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
* update on comments
* remove old code
* add support for ddp
* resolve flake8
* Update pytorch_lightning/profiler/__init__.py
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
* resolve tests
* resolve flake8
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Sean Naren <sean.narenthiran@gmail.com>
|
2021-01-26 06:48:54 -05:00 |
Jirka Borovec
|
7b30133a82
|
flake8 & isort (#5647)
|
2021-01-25 14:31:38 -05:00 |
NeuralLink
|
db784225eb
|
summarize total size of model params in bytes (#5590)
* simplified model size calc
* fix spaces
* fix newlines
* minor refactor
* Update pytorch_lightning/core/memory.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* make model size property
* fix doctest
* Update pytorch_lightning/core/memory.py
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* remove explicit doctest from file
* better docs
* model precalculate size 1.0 mbs
* better comment
* Update tests/core/test_memory.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* Update tests/core/test_memory.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* merge _model_size into model_size property itself
* minor comment fix
* add feature to changelog
* added precision test
* isort
* minor def name typo
* remove monkeypatch set env as boringmodel won't need any torch hub cache
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
|
2021-01-25 09:35:29 +01:00 |
Rohit Gupta
|
29bcf30984
|
[tests/core] Updated with BoringModel and added BoringDataModule (#5432)
* update with BoringModel and introduce BoringDataModule
* isort
* fix
* rm random_split
* fix test
* fix test
* update
* update test_results
* val_step
* update tests
* rebase
* rebase
|
2021-01-13 01:48:37 -05:00 |
Rohit Gupta
|
704e00ee7f
|
Fix invalid value for weights_summary (#5296)
* Fix weights_summary
* use mode
* fix
* optional
* what was I thinking
(cherry picked from commit 062800aa99)
|
2021-01-06 12:59:32 +01:00 |
William Falcon
|
09c2020a93
|
notices (#4118)
|
2020-10-13 07:18:07 -04:00 |
Adrian Wälchli
|
6bfcfa8671
|
fix dtype conversion of example_input_array in model summary (#2510)
* fix dtype conversion
* changelog
|
2020-07-05 07:17:22 -04:00 |
Adrian Wälchli
|
f972ab3a82
|
Fix summary hook handles not getting removed (#2298)
* detach hooks after completion
* detach hook
* update docs
* add test
* docs
* changelog
|
2020-06-20 07:38:47 -04:00 |
Adrian Wälchli
|
7dc58bd286
|
Refactor model summary + generalize example input array (#1773)
* squash
variant a
variant b
add test
revert rename
add changelog
docs
move changelog entry to top
use hooks
wip
wipp
layer summary
clean up, refactor
type hints
rename
remove obsolete code
rename
unused imports
simplify formatting of table and increase readability
doctest
superclass object
update examples
print unknown sizes
more docs and doctest
testing
unknown layers
add rnn test
remove main
restore train mode
test device wip
device
constant
simplify model forward transfer
return summary object in method
extend tests
fix summary for empty module
extend tests
refactor and added hook
variant a
variant b
add test
revert rename
add changelog
docs
move changelog entry to top
remove hardcoded string
simplify
test unknown shapes and all others
comments for tests
fix hparams attribute
* update default
* unused import
* clean up
* replace hardcoded strings
* fix doctest
* fix top/full
* black
* fix rnn test
* fix rnn
* update debugging docs
update docs
typo
update docs
update docs
* add changelog
* extract constant
* setter and getter
* move parity models to test folder
* parameterize mode
|
2020-06-15 17:05:58 -04:00 |