Commit Graph

75 Commits

puhuk 412f0a4d24
Remove deprecated dataloader arguments in Trainer methods (#10325)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-11-04 11:03:39 +01:00
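
For context, a minimal runnable sketch of the call style this removal finalizes; the model and data below are placeholders, and only the keyword names reflect the surviving Trainer API:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

loader = DataLoader(TensorDataset(torch.randn(64, 8), torch.randn(64, 1)), batch_size=8)

trainer = pl.Trainer(max_epochs=1)
# surviving keyword names; the removed, deprecated forms were the singular
# `train_dataloader` and the stage-specific `val_dataloaders`/`test_dataloaders`
# variants on validate/test
trainer.fit(LitModel(), train_dataloaders=loader)
```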
Ning f6ed0bd8ca
introduce has_len_all_ranks() to check the length of the dataloader across ranks (#9827)
* introduce `has_len_all_ranks()`, update tests

* update CHANGELOG.md

* change staticmethod and hook attribute naming

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix typo

* remove non-essential comment

* fix merge error and comment format

* try to fix test_tpu.py failure

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update on comments

* chlog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* chlog

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* try fix

* Revert back TPUSpawn changes

* Update test

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2021-11-02 13:22:58 -04:00
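
The idea behind a cross-rank length check, as a hedged sketch; the real `has_len_all_ranks()` lives in the Lightning source, and the stand-in below only illustrates the principle:

```python
import torch
import torch.distributed as dist

def all_ranks_have_batches(dataloader) -> bool:
    """Simplified stand-in: True only if every rank reports a non-zero length.

    A rank holding an empty shard would otherwise stall collective ops, which
    is the situation a cross-rank length check guards against.
    """
    local_len = len(dataloader)  # raises TypeError for length-less iterables
    if not (dist.is_available() and dist.is_initialized()):
        return local_len > 0
    t = torch.tensor([local_len])
    dist.all_reduce(t, op=dist.ReduceOp.MIN)  # min length across all ranks
    return int(t.item()) > 0  # zero on any rank fails the check
```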
Danielle Pintz e94dcf6936
Mark `trainer.data_connector` as protected (#10031)
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-25 12:29:09 +01:00
Adrian Wälchli 2c16f1d6b9
remove dataloader patching on the LightningModule (#9764)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-20 15:23:20 +02:00
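
With patching removed, dataloaders are plain hook methods the trainer calls, rather than attributes monkey-patched onto the module. A minimal sketch (the model is a placeholder):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    # a plain hook the trainer queries; nothing is patched onto the module
    def train_dataloader(self):
        ds = TensorDataset(torch.randn(64, 8), torch.randn(64, 1))
        return DataLoader(ds, batch_size=8)
```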
Elad Segal 8c76cf5ae1
reset val dataloader for binsearch (#9975) 2021-10-18 12:54:26 +02:00
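A hedged usage sketch of the binsearch path this fix targets; the placeholder model exposes the `batch_size` attribute the tuner mutates between probe runs:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self, batch_size=2):
        super().__init__()
        self.batch_size = batch_size  # the attribute the tuner searches over
        self.layer = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def train_dataloader(self):
        ds = TensorDataset(torch.randn(256, 8), torch.randn(256, 1))
        return DataLoader(ds, batch_size=self.batch_size)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

trainer = pl.Trainer()
# binary-search mode; this commit ensures the val dataloader is reset
# between probe runs
new_size = trainer.tuner.scale_batch_size(LitModel(), mode="binsearch")
```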
Rohit Gupta 54d4b4b21d
use existing logic to configure optimizers in lr_finder (#9789)
* use predefined logic

* patch init_optimizers

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-10-11 13:05:27 +00:00
Rohit Gupta 46fa703853
disable_logger (#9837) 2021-10-11 16:36:59 +05:30
Rohit Gupta d71501d97f
Reset `val_dataloader` in `tuner/batch_size_scaling` (#9857)
* reset val

* chlog
2021-10-11 09:13:33 +01:00
Rohit Gupta 4decbc0d95
Deprecate `dataloader_idx` from `on_train_batch_start/end` (#9816)
* deprecate hooks

* dep todo

* explicit

* Apply suggestions from code review

* Apply suggestions from code review

* code review

* base
2021-10-07 10:18:11 +00:00
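
The target hook signatures once this deprecation completes, sketched with a placeholder callback (training only ever has a single dataloader, so the trailing `dataloader_idx` is dropped):

```python
import pytorch_lightning as pl

class BatchTimer(pl.Callback):
    # post-deprecation signatures: no trailing `dataloader_idx`
    def on_train_batch_start(self, trainer, pl_module, batch, batch_idx):
        print(f"starting batch {batch_idx}")

    def on_train_batch_end(self, trainer, pl_module, outputs, batch, batch_idx):
        print(f"finished batch {batch_idx}")
```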
Rohit Gupta 9d982080df
Fix some flaky tests in tuner/lr_finder (#9766)
* update tests

* fix more tests
2021-10-01 11:15:16 +05:30
Rohit Gupta 83d83abc9d
Fix `lr_find` to generate same results on multiple calls (#9704)
* dump global_step

* add test

* chlog
2021-09-26 19:20:42 +00:00
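
A hedged sketch of what the fix guarantees: with `global_step` dumped and restored, back-to-back `lr_find` calls on the same model should produce the same suggestion. Model and data are placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    def __init__(self, lr=1e-3):
        super().__init__()
        self.lr = lr  # the attribute lr_find tunes
        self.layer = torch.nn.Linear(8, 1)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self.layer(x), y)

    def train_dataloader(self):
        ds = TensorDataset(torch.randn(128, 8), torch.randn(128, 1))
        return DataLoader(ds, batch_size=16)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)

model = LitModel()
trainer = pl.Trainer()
first = trainer.tuner.lr_find(model).suggestion()
second = trainer.tuner.lr_find(model).suggestion()
print(first, second)  # post-fix, these should match
```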
Rohit Gupta a3def9d228
Use a unique filename to save temp ckpt in tuner (#9682)
* unique filename

* chlog

* update tests
2021-09-25 11:28:51 +00:00
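
A hedged sketch of the idea: a per-call unique temp checkpoint name, so concurrent or repeated tuner runs don't clobber each other's saved state (`_temp_ckpt_path` is a hypothetical helper, not PL's exact code):

```python
import os
import uuid

def _temp_ckpt_path(dirpath: str) -> str:
    # a fresh UUID per call guarantees a unique filename
    return os.path.join(dirpath, f".lr_find_{uuid.uuid4()}.ckpt")

print(_temp_ckpt_path("/tmp"))
```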
Jirka Borovec 6e124e7207
CI: precommit - docformatter (#8584)
* CI: precommit - docformatter
* fix deprecated

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 12:49:09 +00:00
jjenniferdai e97c28a02b
Typing `tuner.auto_gpu_select` (#9292)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
2021-09-03 15:49:58 +01:00
B. Kerim Tshimanga 35876bb75f
remove lightning module datamodule property (#9233)
Co-authored-by: thomas chaton <thomas@grid.ai>
2021-09-02 00:43:47 +02:00
Elad Segal 413f7b2894
fix batch auto scaling when `init_val` causes OOM (#8954)
* fix batch auto scaling when `init_val` causes OOM

* Update CHANGELOG.md

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-08-18 09:56:16 +02:00
Carlos Mocholí a1264a6850
Automatic string fixes (#8886) 2021-08-13 14:28:14 +00:00
Jirka Borovec f67892ea96
CI: yesqa (#8564)
* add yesqa
* fix flake8

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-08-02 16:05:56 +00:00
Carlos Mocholí e63968ab88
Add `pyupgrade` to `pre-commit` (#8557)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 14:38:12 +02:00
Carlos Mocholí a64cc37394
Replace `yapf` with `black` (#7783)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 13:37:35 +02:00
Carlos Mocholí 4a64bc3fd3
Fix DeepSpeed lr scheduler logic (#8527)
* Fix deepspeed scheduler logic

* Fix tests

* Minor changes

* Improve tests

* inference fix

* CHANGELOG

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-07-23 10:08:58 +01:00
Adrian Wälchli 4becd1cf31
rename old `Trainer.train_loop` -> `Trainer.fit_loop` (#8025) 2021-06-22 11:49:32 +02:00
Carlos Mocholí 560b1970af
Standardize positional datamodule and argument names (#7431)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-06-15 11:50:13 +00:00
Adrian Wälchli 8c32bf2dd4
refactor on_gpu handling in checkpoint connector (#7860) 2021-06-07 11:30:22 +02:00
Xinyao(Alvin) Sun 0c958c5a1f
Fix dataloaders not being reset when tuning the model (#7566)
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-05-24 10:21:45 +02:00
Adrian Wälchli ad9118f04a
remove trainer hidden state | sanity refactor [1 / n] (#7437)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-11 11:09:08 +02:00
Akihiro Nitta 710b144b9b
Restore `trainer.current_epoch` after tuning (#7434)
* Add a test

* Save and restore current_epoch

* Update CHANGELOG

* alphabetical order
2021-05-08 07:15:52 +02:00
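
The save/restore pattern this fix applies, as a hedged sketch with a dummy stand-in for the trainer (tuning runs real steps, so counters must be rolled back afterwards; this mirrors the idea, not PL's exact internals):

```python
class DummyTrainer:
    current_epoch = 0

def tune_and_restore(trainer, run_tuning):
    epoch_before = trainer.current_epoch
    try:
        run_tuning()  # may advance trainer.current_epoch
    finally:
        trainer.current_epoch = epoch_before  # restore, as the fix does

t = DummyTrainer()
tune_and_restore(t, lambda: setattr(t, "current_epoch", 3))
assert t.current_epoch == 0
```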
ramonemiliani93 5db832f181
Fix auto scaling mode when calling tune method on trainer. (#7321)
* Add a test for a non-existing mode; the test should fail if something other than `power` or `binsearch` is passed (see the sketch after this entry).

* Add newline.

* Apply fix

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update tests/tuner/test_scale_batch_size.py

* Update pytorch_lightning/tuner/batch_size_scaling.py

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-05-04 12:03:51 +00:00
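
A hedged sketch of the validation added here; Lightning raises its own `MisconfigurationException`, with `ValueError` standing in below:

```python
VALID_MODES = ("power", "binsearch")

def check_scale_mode(mode: str) -> None:
    # reject unknown modes up front instead of silently falling through
    if mode not in VALID_MODES:
        raise ValueError(f"`mode` should be one of {VALID_MODES}, got {mode!r}")

check_scale_mode("power")       # ok
try:
    check_scale_mode("linear")  # raises
except ValueError as e:
    print(e)
```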
Carlos Mocholí 8c0ea92af2
`TrainerState` refactor [5/5] (#7173)
* `TrainerState` refactor

* flake8

* Update finished check

* Test cleanup

* Fix tests

* Fixes

* Reorder

* flake8

* Update CHANGELOG

* Better docs

* Better docs

* Remove default

* Update tests

* Bad merge
2021-05-04 12:50:56 +02:00
Carlos Mocholí 5af086ab9f
Attach data refactor and tuner bugs [4/n] (#7258)
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-04-30 13:54:58 +00:00
ananthsub 969e857690
Rename `trainer._launch` to `trainer._run` (#7265)
* rename-run

* fix
2021-04-30 13:39:02 +01:00
Carlos Mocholí a5ac3f8a16
Code cleaning in preparation for #7258 [3/n] (#7262) 2021-04-29 14:40:51 +02:00
Carlos Mocholí bdc4272e99
`_launch` refactor and types [1/n] (#7232) 2021-04-28 17:41:08 +02:00
Akihiro Nitta 92af363270
Fix `lr_finder` suggesting too high learning rates (#7076)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-04-23 10:59:40 +00:00
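
For context, a hedged sketch of the suggestion heuristic this fix tunes: pick the learning rate at the steepest downward slope of the smoothed loss curve, rather than a point near the end-of-sweep divergence. The skip margins are illustrative:

```python
import numpy as np

def suggest_lr(lrs, losses, skip_begin=10, skip_end=1):
    loss = losses[skip_begin:-skip_end]
    idx = np.gradient(loss).argmin()  # steepest descent of the loss curve
    return lrs[skip_begin:-skip_end][idx]

lrs = np.logspace(-6, 0, 100)
# synthetic sweep: loss falls, then diverges at high lr
losses = np.concatenate([np.linspace(2.0, 0.5, 80), np.linspace(0.5, 5.0, 20)])
print(suggest_lr(lrs, losses))
```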
Tharindu Hasthika f581411210
Fixed missing arguments in `lr_find` call (#6784)
There seem to be 3 arguments missing in the `lr_find` call in the tuning.py file.
2021-04-06 11:37:15 +02:00
Adrian Wälchli b2bcad1132
Fix tuner.scale_batch_size not finding batch size attribute when using datamodule (#5968) 2021-03-14 09:16:19 +01:00
David Palzer 523c59bfdd
fixed bug where tuner would not tune lr if also tuning batch_size (#4688)
* fixed bug where tuner would not tune lr if also tuning batch_size

* added a '+1' when computing the smoothed loss, maintaining the same smoothed-loss behavior as before the bug fix (see the sketch after this entry)

* pep8 fix

* add changelog

Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-03-09 08:30:06 +08:00
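
A hedged sketch of the exponential smoothing with the '+1' bias correction the commit mentions; since steps count from 0, the correction exponent is `step + 1`:

```python
def smoothed_losses(losses, beta=0.98):
    avg, out = 0.0, []
    for step, loss in enumerate(losses):
        avg = beta * avg + (1 - beta) * loss
        out.append(avg / (1 - beta ** (step + 1)))  # bias-corrected
    return out

print(smoothed_losses([2.0, 1.5, 1.2, 1.0]))
```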
Elia Cereda d0596fac94
Refactor RunningStage usage in advance of implementing Trainer.validate() (#4945)
* Update code

Co-authored-by: EliaCereda

* More property updates

* Move properties. Introduce trainer._fitting

* Use trainer.fitting

* Fix reset dataloaders

* Unused code

* RunningStage.SANITY_CHECKING

* Use setters

* Fix bugs

* Fix bugs

* TrainerState.{FITTING,VALIDATING,TESTING,PREDICTING,TUNING}

* Fix bugs

* Fix bugs

* Fix tests

* Update CHANGELOG. Add deprecation warning. Fix tests

* Unused imports

* Optional trainer

* More deprecation. More refactoring

* Correct version

* Use properties

* Address comments

* flake8

* Missed renamings

* Typo

* is -> ==

It is recommended to use `is` for Enums since they are singletons; however, since LightningEnum subclasses str, it's not a good idea in case a user sets the state/stage with a plain str (see the sketch after this entry).

* Also for tests

* Typo

* Address @tchaton's comments

* PEP8

* Correct property

* Update CHANGELOG

* Apply suggestions from code review

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Update pytorch_lightning/trainer/trainer.py

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Remove called sanity check

Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-03-06 12:40:19 +00:00
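
A self-contained illustration of the `is` vs `==` point above, using a stand-in enum (PL's `LightningEnum` also subclasses `str`):

```python
from enum import Enum

class RunningStage(str, Enum):  # str subclass, like LightningEnum
    TRAINING = "train"

user_stage = "train"  # a user-set plain string
assert user_stage == RunningStage.TRAINING  # == compares by value: True
assert (user_stage is RunningStage.TRAINING) is False  # identity check fails
```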
Adrian Wälchli bc577ca792
fix duplicate console logging bug v2 (#6275)
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-02 15:17:55 +05:30
Kunal Mundada 3371d32664
docstring changes in tuner (#6264)
* docstring changes in tuner

* added full stop
2021-03-02 09:22:44 +08:00
Adrian Wälchli 02ac4b0b6a
Replace .get_model() with explicit .lightning_module (#6035)
* rename get_model -> lightning_module

* update references to get_model

* pep8

* add proper deprecation

* remove outdated _get_reference_model

* fix cyclic import
2021-02-18 15:59:54 +01:00
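
A small sketch of the renamed accessor in use; the callback is a placeholder:

```python
import pytorch_lightning as pl

class AssertAccessor(pl.Callback):
    def on_fit_start(self, trainer, pl_module):
        # the explicit accessor that replaces the old trainer.get_model()
        assert trainer.lightning_module is pl_module
```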
Rohit Gupta 99da0d92a5
update lr_finder to check for attribute if not running fast_dev_run (#5990)
* ref lr_finder a bit

* chlog

* Apply suggestions from code review

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-17 07:15:29 -05:00
Jirka Borovec 79d42d83e7
formatting 3/n: PL modules (#5716)
* cb

* log

* prof

* tune

* flake8
2021-02-08 14:28:38 -05:00
Kaushik B 5dfd62c09e
Disable training when num_training_batches is zero due to an insufficient limit_train_batches (#5703)
* disable training when zero num_train_batches with limit_train_batches

* refactor train skip condition

* fix formatting issues

* fix formatting issues

* ref: test error msg

* fix tests for data loader calls

* fix train dataloader condition

* update limit_train_batches upper range in test comment

* remove model state check test

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-05 21:40:42 +01:00
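
A hedged sketch of the failure mode guarded here: a small float `limit_train_batches` truncates to zero usable batches, and post-fix Lightning refuses to "train" on zero batches instead of silently doing nothing:

```python
def num_training_batches(total_batches: int, limit_train_batches: float) -> int:
    # the fraction of batches actually used, truncated to an int
    return int(total_batches * limit_train_batches)

assert num_training_batches(10, 0.05) == 0  # insufficient limit -> 0 batches
```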
noamzilo 84a8d2d178
Bugfix/5487 auto lr ordering (#5638)
* started to write failing test. just getting into the framework...

* started to write failing test. just getting into the framework...

* added failing test for misconfiguration of lr finder

* made test startup quickly. making sure without the fix it also fails slowly

* improved test

* fixed for linter

* fixed for linter

* yet another fix for the linter

* yet another fix for the linter

* fixed comment by @carmocca

* fixed comment by @carmocca

* Fix test

* chlog

* Apply suggestions from code review

* Fix test

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update tests/trainer/test_lr_finder.py

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update pytorch_lightning/tuner/lr_finder.py

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update pytorch_lightning/tuner/lr_finder.py

* Update tests/trainer/test_lr_finder.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2021-02-05 21:40:40 +01:00
Adrian Wälchli 344f3a984a
Refactor access to trainer attributes in LightningModule (#5730)
* rank access

* tests for property

* weekref

* logger

* changelog

* torchscript

* changelog

* chlog

* .

* amp

* yapf

* flake8

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-02-01 14:28:17 +00:00
Arnaud Gelas 6386b8d36b
Fix a few isort failures (#5504)
Remove these modules from the isort skip list in pyproject.toml and fix failures in:
- pytorch_lightning/callbacks/*.py
- pytorch_lightning/cluster_environments/*.py
- pytorch_lightning/profiler/*.py
- pytorch_lightning/tuner/*.py

Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-01-15 17:44:27 -05:00
Jirka Borovec 54d20dc596
Refactor: clean trainer device & distrib getters (#5300)
* warnings

* .

* .

* flake8

* .

* .

* .

* use_tpu

* use_dp

* .

* use_ddp

* .

* use_horovod

* .

* .

* .
2021-01-12 05:22:37 -05:00
Jirka Borovec 957583544a
mark todo exceptions (#5320)
* mark todo exceptions

* .

* .

* .

* .

* .

* .

* .

* .

* try

* .
2021-01-04 09:07:56 +01:00
Rohit Gupta 6d2aeff26a
fast_dev_run can be int (#4629)
* fast_dev_run can be int

* pep

* chlog

* add check and update docs

* logging with fdr

* update docs

* suggestions

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* fdr flush logs

* update trainer.fast_dev_run

* codefactor and pre-commit isort

* tmp

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
2020-12-09 01:37:53 +05:30
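
The resulting API in one line: `fast_dev_run=n` runs exactly `n` batches of each loop as a quick smoke test, and `True` behaves like `1`:

```python
import pytorch_lightning as pl

# run exactly 7 batches of train/val to smoke-test the loop wiring
trainer = pl.Trainer(fast_dev_run=7)
```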