Carlos Mocholí
aea96e45a4
Integrate global step with progress tracking ( #11805 )
2022-03-07 19:21:37 +00:00
Akash Kwatra
7e2f9fbad5
Refactor codebase to use `trainer.loggers` over `trainer.logger` when needed ( #11920 )
2022-02-25 16:01:04 -08:00
Carlos Mocholí
789fae828d
Fix `current_epoch` value on training end ( #8578 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-02-10 17:55:59 +01:00
ananthsub
a64438c897
Centralize rank_zero_only utilities into their own module ( #11747 )
...
* Centralize rank_zero_only utilities into their own module
Fixes #11746
* PossibleUserWarning
* Update test_warnings.py
* update imports
* more imports
* Update CHANGELOG.md
* Update mlflow.py
* Update cli.py
* Update api_references.rst
* Update meta.py
* add deprecation tests
* debug standalone
* fix standalone tests
* Update CHANGELOG.md
2022-02-07 08:09:55 +00:00
Krishna Kalyan
6586dd23b7
Mark `CheckpointConnector` as protected ( #11550 )
...
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-02-03 02:26:08 +00:00
Carlos Mocholí
a44881cd90
Changes in preparation to #8578 ( #11562 )
2022-02-02 19:57:08 +00:00
Carlos Mocholí
075b8801c9
Fix checkpoint values when saving and resetting the tuner state ( #11518 )
2022-01-20 18:54:40 +00:00
Carlos Mocholí
62818dbace
Use a dataclass as the scheduler config ( #11443 )
2022-01-18 20:23:32 +01:00
four4fish
cf5ef32f7b
Deprecate Trainer.training_type_plugin in favor of trainer.strategy ( #11141 )
...
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2021-12-22 02:11:43 +00:00
Carlos Mocholí
fa6d17c96f
Fix typing for utilities.warnings ( #11115 )
2021-12-17 15:07:27 +01:00
Carlos Mocholí
5ba5b72473
Update tests to avoid the deprecated `weights_summary` ( #10446 )
2021-11-11 18:15:18 +01:00
Ning
f6ed0bd8ca
introduce has_len_all_ranks() to check the length of dataloader across ranks ( #9827 )
...
* introduce has_len_all_ranks(), update tests
* update CHANGELOG.md
* change staticmethod and hook attribute naming
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix typo
* remove non-essential comment
* fix merge error and comment format
* try to fix test_tpu.py failure
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update on comments
* chlog
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* chlog
* update
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* try fix
* Revert back TPUSpawn changes
* Update test
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
2021-11-02 13:22:58 -04:00
Danielle Pintz
e94dcf6936
Mark `trainer.data_connector` as protected ( #10031 )
...
Co-authored-by: tchaton <thomas@grid.ai>
2021-10-25 12:29:09 +01:00
Adrian Wälchli
2c16f1d6b9
remove dataloader patching on the LightningModule ( #9764 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2021-10-20 15:23:20 +02:00
Elad Segal
8c76cf5ae1
reset val dataloader for binsearch ( #9975 )
2021-10-18 12:54:26 +02:00
Rohit Gupta
46fa703853
disable_logger ( #9837 )
2021-10-11 16:36:59 +05:30
Rohit Gupta
d71501d97f
Reset `val_dataloader` in `tuner/batch_size_scaling` ( #9857 )
...
* reset val
* chlog
2021-10-11 09:13:33 +01:00
Rohit Gupta
83d83abc9d
Fix `lr_find` to generate same results on multiple calls ( #9704 )
...
* dump global_step
* add test
* chlog
2021-09-26 19:20:42 +00:00
Rohit Gupta
a3def9d228
Use a unique filename to save temp ckpt in tuner ( #9682 )
...
* unique filename
* chlog
* update tests
2021-09-25 11:28:51 +00:00
Jirka Borovec
6e124e7207
CI: precommit - docformatter ( #8584 )
...
* CI: precommit - docformatter
* fix deprecated
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-09-06 12:49:09 +00:00
Elad Segal
413f7b2894
fix batch auto scaling when `init_val` causes OOM ( #8954 )
...
* fix batch auto scaling when `init_val` causes OOM
* Update CHANGELOG.md
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-08-18 09:56:16 +02:00
Carlos Mocholí
a1264a6850
Automatic string fixes ( #8886 )
2021-08-13 14:28:14 +00:00
Carlos Mocholí
a64cc37394
Replace `yapf` with `black` ( #7783 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-07-26 13:37:35 +02:00
Carlos Mocholí
4a64bc3fd3
Fix DeepSpeed lr scheduler logic ( #8527 )
...
* Fix deepspeed scheduler logic
* Fix tests
* Minor changes
* Improve tests
* inference fix
* CHANGELOG
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-07-23 10:08:58 +01:00
Adrian Wälchli
4becd1cf31
rename old `Trainer.train_loop` -> `Trainer.fit_loop` ( #8025 )
2021-06-22 11:49:32 +02:00
Adrian Wälchli
8c32bf2dd4
refactor on_gpu handling in checkpoint connector ( #7860 )
2021-06-07 11:30:22 +02:00
Xinyao(Alvin) Sun
0c958c5a1f
Fix dataloaders are not reset when tuning the model ( #7566 )
...
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
2021-05-24 10:21:45 +02:00
Adrian Wälchli
ad9118f04a
remove trainer hidden state | sanity refactor [1 / n] ( #7437 )
...
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2021-05-11 11:09:08 +02:00
ramonemiliani93
5db832f181
Fix auto scaling mode when calling tune method on trainer. ( #7321 )
...
* Add test for non-existing mode, the test should fail if something different from `power` or `binsearch` is passed.
* Add newline.
* Apply fix
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
* Update tests/tuner/test_scale_batch_size.py
* Update pytorch_lightning/tuner/batch_size_scaling.py
Co-authored-by: Carlos Mocholi <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-05-04 12:03:51 +00:00
Carlos Mocholí
5af086ab9f
Attach data refactor and tuner bugs [4/n] ( #7258 )
...
Co-authored-by: Nicki Skafte <skaftenicki@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2021-04-30 13:54:58 +00:00
ananthsub
969e857690
Rename `trainer._launch` to `trainer._run` ( #7265 )
...
* rename-run
* fix
2021-04-30 13:39:02 +01:00
Carlos Mocholí
a5ac3f8a16
Code cleaning in preparation for #7258 [3/n] ( #7262 )
2021-04-29 14:40:51 +02:00
Carlos Mocholí
bdc4272e99
`_launch` refactor and types [1/n] ( #7232 )
2021-04-28 17:41:08 +02:00
Adrian Wälchli
bc577ca792
fix duplicate console logging bug v2 ( #6275 )
...
Co-authored-by: chaton <thomas@grid.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2021-03-02 15:17:55 +05:30
Kunal Mundada
3371d32664
docstring changes in tuner ( #6264 )
...
* docstring changes in tuner
* added full stop
2021-03-02 09:22:44 +08:00
Adrian Wälchli
02ac4b0b6a
Replace .get_model() with explicit .lightning_module ( #6035 )
...
* rename get_model -> lightning_module
* update references to get_model
* pep8
* add proper deprecation
* remove outdated _get_reference_model
* fix cyclic import
2021-02-18 15:59:54 +01:00
Rohit Gupta
99da0d92a5
update lr_finder to check for attribute if not running fast_dev_run ( #5990 )
...
* ref lr_finder a bit
* chlog
* Apply suggestions from code review
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-02-17 07:15:29 -05:00
Jirka Borovec
79d42d83e7
formatting 3/n: PL modules ( #5716 )
...
* cb
* log
* prof
* tune
* flake8
2021-02-08 14:28:38 -05:00
Arnaud Gelas
6386b8d36b
Fix isort a few failures ( #5504 )
...
Remove from skipped module in pyproject.toml and fix failures on:
- pytorch_lightning/callbacks/*.py
- pytorch_lightning/cluster_environments/*.py
- pytorch_lightning/profiler/*.py
- pytorch_lightning/tuner/*.py
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
Co-authored-by: Jirka Borovec <jirka.borovec@seznam.cz>
2021-01-15 17:44:27 -05:00
Jirka Borovec
54d20dc596
Refactor: clean trainer device & distrib getters ( #5300 )
...
* warnings
* .
* .
* flake8
* .
* .
* .
* use_tpu
* use_dp
* .
* use_ddp
* .
* use_horovod
* .
* .
* .
2021-01-12 05:22:37 -05:00
Rohit Gupta
6d2aeff26a
fast_dev_run can be int ( #4629 )
...
* fast_dev_run can be int
* pep
* chlog
* add check and update docs
* logging with fdr
* update docs
* suggestions
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* fdr flush logs
* update trainer.fast_dev_run
* codefactor and pre-commit isort
* tmp
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Roger Shieh <sh.rog@protonmail.ch>
Co-authored-by: edenlightning <66261195+edenlightning@users.noreply.github.com>
2020-12-09 01:37:53 +05:30
Mohamed Al Salti
cd90dd429b
Fix batch_arg_name bug ( #4812 )
...
Add `batch_arg_name` to all calls to `_adjust_batch_size`
2020-11-23 11:34:11 +05:30
Nicki Skafte
4f3160ba2e
Skip tuner algorithms on fast dev ( #3903 )
...
* skip on fast dev
* fix error
* changelog
* fix recursive issue
* combine tests
* pep8
* move logic to base funcs
* fix mistake
* Update pytorch_lightning/tuner/lr_finder.py
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
* pep
Co-authored-by: William Falcon <waf2107@columbia.edu>
Co-authored-by: Nicki Skafte <nugginea@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: chaton <thomas@grid.ai>
2020-11-10 00:34:42 +01:00
Rohit Gupta
1396321b4d
Add fsspec to tuner ( #4458 )
...
* Add fsspec to tuner
* suggestions
* pathlib
* pep
* missed pep
2020-11-03 15:09:40 +05:30
Adrian Wälchli
d1234c592d
deprecate passing ModelCheckpoint instance to Trainer(checkpoint_callback=...) ( #4336 )
...
* first attempt
* update tests
* support multiple
* test bugfix
* changelog
* pep
* pep
* import order
* import
* improve test for resuming
* test
* update test
* add references test
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
* docstring suggestion deprecation
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
* paramref
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jeff Yang <ydcjeff@outlook.com>
2020-10-30 04:47:37 +01:00
Jirka Borovec
f37444fa3e
CI: add flake8 ( #4239 )
2020-10-19 21:20:17 +01:00
Akihiro Nitta
b45b57cc58
Use `Optional` for arguments set to `None` by default ( #4164 )
...
* Use `Optional` for variables set to `None` by default
* Use `Optional` instead of `Union[None, ...]` for consistency
2020-10-15 23:02:50 +02:00
William Falcon
05e0b4e5a1
Revert "Remove limitation of batch scaler ( #4006 )" ( #4040 )
...
This reverts commit 7e756ca11f.
2020-10-09 21:03:23 -04:00
Nicki Skafte
7e756ca11f
Remove limitation of batch scaler ( #4006 )
...
* working code
* add tests
* fix scaling
* move patch dataloader to utils
* renaming
* fix tests
* add changelog
* update docs
* pep8
2020-10-09 14:53:01 -04:00
Jirka Borovec
8873750cf0
remove deprecated early_stop_callback ( #3982 )
2020-10-08 06:30:33 -04:00