2954 Commits

Author SHA1 Message Date
Adrian Wälchli e6a8283e9c
Organize accelerator tests (#13986) 2022-08-03 13:49:55 +00:00
thomas chaton 5479c60b22
Reduce state size (#13970) 2022-08-03 13:47:16 +00:00
Adrian Wälchli 4ce97f37a2
Validate the model input of trainer methods (#13892)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-08-03 13:38:42 +00:00
Adrian Wälchli ce025bf954
Lazy import check for hydra dependency (#13812) 2022-08-03 04:27:16 -04:00
Raphael Randschau 2919dcf7ee
[CLI] add support for cluster management (#13835) 2022-08-02 10:31:09 +02:00
Jerome Anand b3203d93d0
Added support for HPU device stats monitor (#13819)
* Added support for HPU device stats monitor

Signed-off-by: Jerome <janand@habana.ai>

* Update changelog

Signed-off-by: Jerome <janand@habana.ai>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Apply suggestions from code review

Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>

* Update reference

Signed-off-by: Jerome <janand@habana.ai>

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* fix alignment

* add descriptions

* Update hpu_intermediate.rst

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-08-02 13:31:31 +05:30
Adrian Wälchli eb233ea12d
Snapshot selected globals and restore them in spawned process (#13921)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-08-01 22:21:46 +00:00
Rohit Gupta 0f6caffa57
Fix deepspeed default precision plugin `amp_level` to O2 (#13897)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-07-29 20:36:51 +00:00
thomas chaton aefb9ab43f
(app) Introduce LightningTrainingComponent (#13830) 2022-07-29 16:44:52 +02:00
Adrian Wälchli caaf35689c
Improvements to standalone scripts (#13840)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-28 23:33:22 +00:00
Adrian Wälchli 7708ce22b2
Update GitHub links to PL repo (#13849)
* update lightning links in docs

* update links in chlog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Apply suggestions from code review

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update src/pytorch_lightning/README.md

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* Update src/pytorch_lightning/README.md

Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>

* update

* painful

* badges

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update badges

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-07-28 22:08:07 +02:00
HMellor 07b39c257b
Cast on host instead of IPU when using `precision=16` (#13880)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-07-28 19:26:41 +00:00
Adrian Wälchli 25203d4c81
Organize model summary utilities (#13893) 2022-07-28 19:23:29 +02:00
Carlos Mocholí 406cea7146
Support DeepSpeed <0.7.0 (#13859)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-07-28 14:38:51 +00:00
Carlos Mocholí 1299e4f984
Run GPU tests with PyTorch 1.12 (#13716)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-07-28 19:37:57 +05:30
Carlos Mocholí 511875e567
Support DeepSpeed >=0.6.0, <0.6.5 (#13863)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-07-27 18:57:52 +02:00
Adrian Wälchli fff62f0ae5
Fix TPU testing and collect all tests (#11098)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
2022-07-27 15:40:40 +00:00
otaj 95f5f170f5
Allowed custom `BatchSampler`s when instantiated in `*_dataloader` hook (#13640)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-07-27 15:32:50 +00:00
Adrian Wälchli 2a24b906ac
Add batch size script argument for standalone tests (#13841)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-07-27 12:36:22 +00:00
nitinramvelraj b37e466f28
Change tests/README.md to reflect repo structure change (#13437)
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-27 10:37:29 +00:00
otaj 4c7b9f0b11
Disallow batch sampler with multiple IPU devices (#13854)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-07-27 15:20:43 +05:30
Anton Shevtsov 41f45b475e
Check if the scheduler already has `reduce_on_plateau` (#13838)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-27 09:10:57 +00:00
Adrian Wälchli c3911700d1
Fix error handling in learning rate finder (#13845)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-07-27 04:32:39 -04:00
Rohit Gupta faf7ff57c0
Add support for async checkpointing (#13658) 2022-07-26 21:13:19 +05:30
thomas chaton 4c35867b61
[App] Introduce Commands (#13602) 2022-07-25 17:13:46 +00:00
Adrian Wälchli a8d7b4476c
Fix PyTorch spelling errors (#13774)
* Fix PyTorch spelling errors

* more
2022-07-25 12:51:16 -04:00
Justus Schock 227871982d
Merge different gpu backends with accelerator='gpu' (#13642)
* Rename GPUAccelerator to CUDAAccelerator

* Add back GPUAccelerator and deprecate it

* Remove temporary registration

* accelerator connector reroute

* accelerator_connector tests

* update enums

* lite support + tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* typo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* move "gpu" support up before actual accelerator flag checks

* Stupid arguments

* fix tests

* change exception type

* fix registry test

* pre-commit

* CI: debug HPU flow (#13419)

* Update the hpu-tests.yml to pull docker from vault
* fire & sudo
* habana-gaudi-hpus
* Check the driver status on gaudi server (#13718)

Co-authored-by: arao <arao@habana.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akarsha Rao <94624926+raoakarsha@users.noreply.github.com>

* Update typing-extensions requirement from <4.2.1,>=4.0.0 to >=4.0.0,<4.3.1 in /requirements (#13529)

Update typing-extensions requirement in /requirements

Updates the requirements on [typing-extensions](https://github.com/python/typing_extensions) to permit the latest version.
- [Release notes](https://github.com/python/typing_extensions/releases)
- [Changelog](https://github.com/python/typing_extensions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/python/typing_extensions/compare/4.0.0...4.3.0)

---
updated-dependencies:
- dependency-name: typing-extensions
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [pre-commit.ci] pre-commit suggestions (#13540)

updates:
- [github.com/psf/black: 22.3.0 → 22.6.0](https://github.com/psf/black/compare/22.3.0...22.6.0)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* [FIX] Native FSDP precision + tests (#12985)

* Simplify fetching's loader types (#13111)

* Include app templates to the lightning and app packages (#13731)

* Include app templates to the package

Co-authored-by: mansy <mansy@lightning.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Fix mypy typing errors in pytorch_lightning/callbacks/model_checkpoint.py (#13617)

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Fix typos initialize in docs (#13557)


Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>

* Fix main progress bar counter when `val_check_interval=int` and `check_val_every_n_epoch=None` (#12832)

* Fix mypy errors attributed to `pytorch_lightning.loggers.tensorboard.py` (#13688)

Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Fix mypy errors attributed to `pytorch_lightning.loggers.mlflow` (#13691)

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>

* fix mypy errors for loggers/wandb.py (#13483)


Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* Fix gatekeeper minimum check (#13769)

* changelog

* changelog

* fix order

* move up again

* add missing test

Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: arao <arao@habana.ai>
Co-authored-by: Akarsha Rao <94624926+raoakarsha@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Mansy <ahmed.mansy156@gmail.com>
Co-authored-by: mansy <mansy@lightning.ai>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Lee Jungwon <33821003+BongYang@users.noreply.github.com>
Co-authored-by: Nathaniel D'Amours <88633026+NathanielDamours@users.noreply.github.com>
Co-authored-by: Justin Goheen <26209687+JustinGoheen@users.noreply.github.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
Co-authored-by: Gautier Dagan <s2234411@ed.ac.uk>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-07-25 14:46:45 +00:00
Mauricio Villegas 1b31039c58
Update LightningCLI test for new support in latest release of jsonargparse (#13805) 2022-07-25 09:25:42 +00:00
Adrian Wälchli 81f149e9d4
Rename spawn-based launchers (#13743) 2022-07-23 11:48:15 -04:00
Adrian Wälchli fa886f2a58
Lazy import check for neptune dependency (#13477)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-07-23 14:06:26 +00:00
Adrian Wälchli d24978baa3
Add ddp_notebook alias for ddp_fork (#13744)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-07-23 09:06:35 -04:00
Jinyoung Lim ae9803137a
Add logging messages to notify when `FitLoop` stopping conditions are met (#9749)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-23 12:07:47 +00:00
Carlos Mocholí 4f53e7132f
Promote the CLI out of utilities (#13767) 2022-07-23 12:07:29 +00:00
Adrian Wälchli f6f06d4e42
Set default strategy to ddp_fork in interactive environments (#13746) 2022-07-22 19:34:30 +00:00
Carlos Mocholí 9f51c07604
Support setting the trainer reference recursively for ensembles (#13638)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2022-07-22 19:58:46 +02:00
Adrian Wälchli 596aa8400d
Lazy import check for wandb dependency (#13474) 2022-07-22 19:57:46 +02:00
Adrian Wälchli c3299d2c59
Add support for DDP fork (#13405)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-07-22 16:05:35 +00:00
Rohit Gupta 763fbf6b77
Fix to allow custom `CheckpointIO` with strategy classes (#13785) 2022-07-22 14:32:54 +00:00
Krishna Kalyan 238c9913a2
Do not force `sync_dist=True` on epoch end (#13364)
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-22 10:04:30 +00:00
Jerome Anand 9596fabe7b
Add auto_device_count and device name support (#13423)
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: mansy <mansy@lightning.ai>
Co-authored-by: manskx <mansy@lightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Mansy <ahmed.mansy156@gmail.com>
Co-authored-by: otaj <ota@lightning.ai>
Co-authored-by: Sean Naren <sean@grid.ai>
Co-authored-by: Keiichi Kuroyanagi <kuroyanagi.keiichi@gmail.com>
Co-authored-by: Martino Sorbaro <martinosorb@users.noreply.github.com>
Co-authored-by: Wang Ran (汪然) <wangr@smail.nju.edu.cn>
Co-authored-by: Rhys Goodall <rhys.goodall@outlook.com>
Co-authored-by: Siyuan Li <siyuanli.s.c@gmail.com>
Co-authored-by: Ekagra Ranjan <ekagra.ranjan@gmail.com>
Co-authored-by: S. Kumano <54502860+s-kumano@users.noreply.github.com>
Co-authored-by: otaj <6065855+otaj@users.noreply.github.com>
Co-authored-by: Gautier Dagan <gautierdagan2017@u.northwestern.edu>
Co-authored-by: Sherin Thomas <sherinct@live.com>
Co-authored-by: Cyprien Ricque <48893621+Cyprien-Ricque@users.noreply.github.com>
Co-authored-by: Masahiro Wada <argon.argon.argon@gmail.com>
Co-authored-by: nitinramvelraj <98356761+nitinramvelraj@users.noreply.github.com>
Co-authored-by: donlapark <10988155+donlapark@users.noreply.github.com>
Co-authored-by: Justin Goheen <26209687+JustinGoheen@users.noreply.github.com>
Co-authored-by: Shantam Gilra <64306405+shantam-8@users.noreply.github.com>
Co-authored-by: Bibhabasu Mohapatra <68384968+bibhabasumohapatra@users.noreply.github.com>
Co-authored-by: Jimmy Yao <jiahaoyao.math@gmail.com>
Co-authored-by: Nikhil Shenoy <nikhilshenoy98@gmail.com>
Co-authored-by: Sanjay Aradhyamath <57592361+samz5320@users.noreply.github.com>
2022-07-22 10:29:02 +05:30
Gautier Dagan 0e5312833f
fix mypy errors for loggers/wandb.py (#13483)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-07-21 01:07:24 +00:00
Rohit Gupta e451fa28d0
Fix main progress bar counter when `val_check_interval=int` and `check_val_every_n_epoch=None` (#12832) 2022-07-20 20:33:00 +00:00
Carlos Mocholí bbd364a041
Simplify fetching's loader types (#13111) 2022-07-20 12:15:24 +00:00
Sean Naren d78698528d
[FIX] Native FSDP precision + tests (#12985) 2022-07-20 11:32:35 +00:00
Rohit Gupta c67b075cf5
Use `global_step` while restoring logging step for old checkpoints (#13645)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-07-19 18:53:22 +00:00
Justus Schock abf82b360a
Add back GPUAccelerator and deprecate it 2022-07-19 13:06:30 -04:00
Justus Schock c75457da99
Rename GPUAccelerator to CUDAAccelerator 2022-07-19 13:06:30 -04:00
Carlos Mocholí 0e5a51f55c
Allow CUDA and IPU tests without the CI environment var (#13676) 2022-07-19 13:40:25 +09:00
thomas chaton d4c7f91fec
Change AWS credentials to Lightning ones (#13703) 2022-07-18 16:01:57 +02:00
Carlos Mocholí d058190b6d
Run standalone tests in batches (#13673) 2022-07-18 12:10:35 +00:00
George Stein 0449e861cc
Fix `trainer.predict(return_predictions=False)` does not track `batch_indices` (#13629)
* Pull request for fixing issue #13580
* chlog and test
* disable track for epoch

Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2022-07-18 08:26:15 +00:00
Mansy b822ac1d3e
Add CI app cloud e2e & fix setup UI download (#13499)
* Add CI app e2e
* update UserID
* fix apps cleanup
* download frontend inside setup.py

Co-authored-by: mansy <mansy@lightning.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-07-17 20:19:01 +02:00
thomas chaton 2a873da042
Add --app_args support from the CLI (#13625) 2022-07-15 19:12:40 +01:00
Jirka Borovec aa62fe36df
add testing PT 1.12 (#13386)
* add testing PT 1.12
* Fix quantization tests
* Fix another set of tests
* Fix check since https://github.com/pytorch/pytorch/pull/80139 is only going to be available for 1.13
* Skip this test for now for 1.12

Co-authored-by: SeanNaren <sean@grid.ai>
2022-07-15 19:41:23 +02:00
Adrian Wälchli d42711f22f
Remove deprecated `Strategy.post_dispatch` (#13461)
* Remove deprecated Strategy.post_dispatch

* changelog

* remove unused imports
2022-07-15 13:18:55 -04:00
thomas chaton 5e26840f94
Introduce ServableModuleValidator Callback (#13614)
* wip

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* Update tests/tests_pytorch/serve/test_servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update tests/tests_pytorch/serve/test_servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update src/pytorch_lightning/serve/servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update src/pytorch_lightning/serve/servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update src/pytorch_lightning/serve/servable_module_validator.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Typing improvements

* wip

* update doc

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update examples/pl_servable_module/production.py

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* update

* update

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-15 11:07:40 -04:00
Carlos Mocholí 8355ba1260
Run only CUDA tests on Azure GPU CI (#13651) 2022-07-15 13:51:23 +02:00
Akihiro Nitta 7ba0270552
Remove deprecated `max_steps=None` (#13591)
* Remove max_steps=None

* Update changelog

* Update docs

* Unused import

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-14 12:28:38 +00:00
Akihiro Nitta c1cc112b52
Remove deprecated `LightningDistributed` (#13549)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-13 21:15:28 +00:00
Adrian Wälchli daf7cec01e
Remove deprecated ClusterEnvironment methods (#13458)
* Remove deprecated ClusterEnvironment methods

* update changelog

* ignore typing error

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-07-13 19:53:46 +00:00
Akihiro Nitta feb8e7d344
Remove deprecated `LightningModule.on_post_move_to_device` (#13548)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-13 18:06:27 +00:00
Adrian Wälchli 07e7d6dc3b
Remove deprecated `Trainer.slurm_job_id` (#13459) 2022-07-13 16:50:55 +00:00
Sanjay Aradhyamath 562467402d
Removed deprecated `pytorch_lightning.overrides.distributed.IndexBatchSamplerWrapper.batch_indices` (#13565)
* Removed the deprecated   method

* Removed deprecated  IndexBatchSamplerWrapper.batch_indices

* Update src/pytorch_lightning/CHANGELOG.md

* Missed code

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-13 00:53:18 +00:00
Nikhil Shenoy e034cd31d3
Remove `add_to_queue` and `remove_from_queue` from LightningModule (#13600)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-12 16:47:51 +02:00
Rohit Gupta dba65be911
Remove redundant GPU test (#13623)
Remove redundant test
2022-07-12 09:51:11 -04:00
Rohit Gupta df931e2486
Restore log step during restart (#13467)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-07-12 09:45:59 +00:00
Shantam Gilra bdb6e40392
Remove deprecated `pytorch_lightning.core.decorators.parameter_validation` (#13514)
* Removal of deprecated code from decorators

* Update CHANGELOG.md

* Removed imports
2022-07-11 23:03:54 +00:00
nitinramvelraj 61c28cb428
Remove deprecated `on_keyboard_interrupt` (#13438)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-07-05 07:34:18 +00:00
Mansy 767fe6eefa
Add CI for app examples (#13495)
* Add CI for app examples

Co-authored-by: manskx <mansy@lightning.ai>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-07-02 05:05:16 +00:00
Mansy dc70b6511c
Add CI for Lightning App Python unit tests (#13491)
* Update lightning_app src

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update lightning app tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add CI

* update tests

* requirements

* fix version tests

* todo

* fix tests

* fix tests

* fix tests

* fix tests

* fix formatting

Co-authored-by: mansy <mansy@lightning.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-07-01 16:28:44 -04:00
Adrian Wälchli a80354e3ae
Move deepspeed summary test to correct folder (#13478) 2022-07-01 08:47:04 +00:00
Mansy 2ea35182e0
Add lightning app examples (#13456)
* add lightning app examples

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix CI

* rm init

* restructure app examples

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* img

Co-authored-by: mansy <mansy@lightning.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-06-30 16:45:15 -04:00
Sherin Thomas b90525b5b5
adding LAI test (#13321)
* tests

* ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update tests

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: mansy <mansy@lightning.ai>
2022-06-30 16:43:04 -04:00
Siyuan Li e0a0d1e4f9
Set timeout for DDPSpawnStrategy (#13383)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-29 20:55:06 -04:00
Adrian Wälchli 1f85b6d6a4
Fix validation when accelerator is a string (#13417)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-06-29 22:42:34 +00:00
Rhys Goodall 8c4d640bfc
Convert validation loop config warnings to `PossibleUserWarning` (#13377)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-06-29 22:34:25 +00:00
Adrian Wälchli 2dd332f9c7
Call `set_epoch` for distributed batch samplers (#13396)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-06-29 19:09:35 +00:00
Adrian Wälchli 43635a9a9b
Remove remaining old-style AcceleratorConnector properties (#13412)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-06-29 18:51:48 +00:00
ananthsub 7fca126749
Update gather_all_tensors to handle tensors of different sizes (#12630)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2022-06-29 17:03:00 +00:00
Adrian Wälchli ddbf95516b
Remove support for DDP2 strategy (#12705)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-06-29 15:06:51 +00:00
Sean Naren f145acd2a3
Add model summary when using DeepSpeed Stage 3 (#13427) 2022-06-29 14:49:34 +00:00
Adrian Wälchli c71f32a490
Rename old references to training type plugin in tests (#13421) 2022-06-28 14:57:44 -04:00
Carlos Mocholí b1e38bfd79
Better errors for logging corner cases (#13164) 2022-06-28 16:59:31 +01:00
Carlos Mocholí a4750100cf
[CLI] Support custom trainers without callbacks (#13138) 2022-06-28 17:39:17 +02:00
Sean Naren 54f2d44fb8
Remove unnecessary endpoint logic, rename `collaborative` to `hivemind` (#13392)
* Remove endpoint after collaborate app/dht CLI

* Fix references, change filename

* Add CHANGELOG.md

* Address review

Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-06-28 08:41:08 -04:00
Jirka Borovec d2e4e7e003
create meta package [RFC] (#13327)
* placeholder

* move setup_tools & abstract about

* adjust lightning-app

* notes

* lightning about

* lightning init

* CI check

* ci

* install

* adjust manifest & mv chlog

* manifest

* pkg

* mv __setup__

* parse_requirements

* lit

* ci - pytorch

* wrap func

* ci

* cd draft

* generate lit

* pkg

* utf-8

* root pkg

* req.

* ver

* mypy

* try check

* meta pkg

* meta pkg - vars

* meta pkg - pruning

* meta pkg - fixing

* fix PL for meta

* multi-line wrapper

* hack manifest

* ci

* fix docstr

* fixing

* ci & mypy

* links
2022-06-27 09:34:18 -04:00
Justus Schock f54abc506f
Merge pull request #13123 from Lightning-AI/mps_accelerator
MPS Accelerator
2022-06-24 08:15:48 -04:00
Sean Naren 73e7a5d0c2
Rename `CollaborativeStrategy` to `HivemindStrategy` (#13388) 2022-06-23 15:44:48 +00:00
awaelchli 511f1a6515
Reroute profiler to profilers (#12308)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-06-22 20:55:39 -04:00
awaelchli fc1559e41c
Rename profiler to profilers (#12308)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-06-22 20:55:39 -04:00
Patrick Haller 887dc0ff8c
DummyLogger can be called with unknown methods (#13224)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-06-22 17:51:42 +02:00
Atharva Phatak 63a9ab4ae2
Improved Deepspeed Imports (#13223)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-06-22 11:09:33 -04:00
otaj 33bd270845
Adds Sampler Wrappers for custom samplers in distributed environment (#12959)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-06-22 12:17:53 +02:00
Adrian Wälchli b08259d536
Add `XLAEnvironment` plugin (#11330)
* add xla environment class
* add api reference
* integrate
* use xenv
* remove properties

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-06-22 10:57:50 +02:00
Ray Schireman 8266300b29
Remove `pytorch_lightning.callbacks.lr_monitor.LearningRateMonitor.lr_sch_names` (#13353)
Co-authored-by: Raymond G Schireman <raymond.schireman@uvm.edu>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-06-22 02:03:17 +02:00
otaj 2e9cd72add
Improve support for custom `DataLoader`s when instantiated in `*_dataloader` hook (#12981)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-06-22 01:53:24 +02:00
Mauricio Villegas 6371d7c615
Fix LightningCLI signature parameter resolving for some lightning classes (#13283)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-21 21:58:41 +00:00
Sean Naren 89e2e69b01
[BUG] `estimated_stepping_batches` requires distributed comms in `configure_optimizers` for `DeepSpeedStrategy` (#13350) 2022-06-21 17:48:27 +01:00
Tianshu Wang 749709fb4f
Use run name for logging with WandbLogger (#12604)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-06-21 15:25:37 +00:00
Mauricio Villegas 0ae9627bf8
Deprecate CLI registries and update documentation (#13221)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-21 17:12:04 +02:00
Carlos Mocholí ad87d2cad0
Future 5/n: Move requirements (#13306)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-06-21 17:11:33 +02:00
Siyuan Li c600f987c2
Enable timeout for `DDPStrategy` (#13244) 2022-06-21 15:49:57 +02:00
Ekagra Ranjan 81b7000978
EarlyStopping logging on rank 0 only (#13233)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-21 09:37:41 -04:00
Adam J. Stewart d24178ec29
Fix torch.distributed._sharded_tensor DeprecationWarning (#13261) 2022-06-21 04:52:06 -04:00
Jerome Anand cd44512ab9
Added multi-optimizer tests with hpu (#13217)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-06-21 09:07:31 +02:00
Jirka Borovec ab59f308b1
Future 4/n: test & legacy in test/ folder (#13295)
* move: legacy >> test/

* move: tests >> test/

* rename unittests

* update CI

* tests4pl

* tests_pytorch

* proxi

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* ci

* link

* cli

* standalone

* fixing

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* .

* Apply suggestions from code review

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* alone

* test -> tests

* Standalone fixes

* ci

* Update

* More fixes

* Fix coverage

* Fix mypy

* mypy

* Empty-Commit

* Fix

* mypy just for pl

* Fix standalone

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-15 18:10:49 -04:00
Carlos Mocholí b551921a9f
Remove unused test argument (#13296) 2022-06-15 22:51:14 +02:00
Jirka Borovec 9cc714cdd1
Future 2/n: stand-alone examples (#13294)
* move: pl_examples >> src/

* convert pl_examples package to plain examples

* update CI for examples

* ci

* missing

* install
2022-06-15 08:53:51 -04:00
Carlos Mocholí 0cf9d73d28
Drop PyTorch 1.8 support (#13155)
* Drop PyTorch 1.8 support

* Missed update

* Skip profiler test until supported

* Upgrade ipu dockerfile pytorch version

* Update XLA version
2022-06-14 20:46:44 -04:00
Carlos Mocholí 981a6da121
Remove test's proxy boring classes import (#13297)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2022-06-15 01:53:54 +02:00
Carlos Mocholí 7938293cd9
Add deprecation path for the old Loop module (#13043) 2022-06-02 21:23:14 +02:00
Akihiro Nitta 3c5a8a833e
Decouple pulling legacy checkpoints from existing GHA workflows and docker files (#13185)
* Add pull-legacy-checkpoints action
* Replace pulls with the new action and script
* Simplify
2022-06-02 15:39:14 +02:00
Mauricio Villegas 79de6a95fb
LightningCLI natively support callback list append (#13129)
* LightningCLI natively support callback list append.

* Update jsonargparse version

* Handle case where callbacks is not a list.

* Fix PEP8 issue.

* Handle mypy false positive

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-02 10:00:02 +09:00
ananthsub c1f05021ff
Fix initialization of optimizers in DDP Strategy (#11952)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-01 10:25:05 +00:00
Rohit Gupta 9445a84a12
Fix epoch logging on train epoch end (#13025)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-01 09:05:11 +00:00
Mauricio Villegas 18cdfab83b
Register torch's unresolvable import paths in cli module (#13153)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-06-01 11:00:57 +02:00
Toshiki Ishikawa c16dcb7266
Remove the deprecated `logger.close` (#13149)
* refactor:removed the close instance from the LoggerCollection class

* Also logger.close()

* Update CHANGELOG

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-01 00:22:56 +00:00
Akihiro Nitta a21e6c3f33
Specify `Trainer(benchmark=False)` in parity benchmarks (#13182)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-31 21:23:21 +00:00
Nikhil Shenoy f4f14bb5d8
Removed `weights_summary` argument from Trainer (#13070)
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-31 20:31:39 +00:00
Carlos Mocholí 5bdb936eca
[CLI] Respect existing seed by default (#13110) 2022-05-31 22:31:25 +02:00
Rohit Gupta ba438828e5
Fix logging's step values when multiple dataloaders are used during evaluation (#12184)
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-31 22:00:29 +02:00
Masahiro Wada 90acf909fb
Fix not running test codes (#13089)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-31 19:27:46 +00:00
Carlos Mocholí 01ebb1f2f5
Avoid changing the current `cudnn.benchmark` value (#13154) 2022-05-31 19:22:19 +00:00
Carlos Mocholí 9945d028d3 Fix standalone test collection (#13177) 2022-05-31 20:21:14 +02:00
Carlos Mocholí 50326e9a65 xfail flaky quantization test blocking CI (#13177) 2022-05-31 20:21:14 +02:00
Nikhil Shenoy dd47518322
Removed `flush_logs_every_n_steps` argument from Trainer (#13074)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-25 12:47:15 +02:00
Justus Schock fbd887df9d
Rename min_gpus to min_cuda_gpus (#13133)
* rename min_gpus to min_cuda_gpus
2022-05-24 12:54:05 +00:00
rohitgr7 eb21135b2a Add deprecation path for the old Callback module (#13031) 2022-05-19 13:25:25 +02:00
shantam-8 ddff6c1c7e Rename callbacks/base.py to callbacks/callback.py (#13031) 2022-05-19 13:25:25 +02:00
Carlos Mocholí 42f27c2a34
Update docs about `reduce_fx` (#13101) 2022-05-19 07:24:09 -04:00
Jerome Anand 74941bb790
Enable all ddp params for hpu parallel strategy (#13067)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-19 12:41:16 +05:30
rohitgr7 5023752bd0 Add deprecation path for the old LightningModule module (#12740) 2022-05-18 08:03:55 +05:30
rohitgr7 a68fe66705 Update core/lightning.py to core/module.py (#12740) 2022-05-18 08:03:55 +05:30
Carlos Mocholí 23de1ca90a
Fix CLI test interaction (#13037) 2022-05-16 22:29:18 +00:00
Nikhil Shenoy 830da2c4a4
Removed `process_position` argument from Trainer Class (#13071) 2022-05-16 13:05:50 +02:00
Jirka Borovec fab2ff35ad
CI: Azure - multiple configs (#12984)
* CI: Azure - multiple configs
* names
* benchmark
* Apply suggestions from code review

Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-14 01:59:03 +00:00
Akihiro Nitta 03039a236e
Fix `materialize_module` recursively setting its child module (#12870)
* Don't set materialized child to child's child
* Update CHANGELOG

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-05-13 18:05:38 +00:00
otaj db7b0361a5
Fix number of references to LightningModule (#12897)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-13 11:23:25 -04:00
Mauricio Villegas 8713e0ea61
Improved the jsonargparse[signatures] availability variable (#13035)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-12 14:57:19 +00:00
Geo Jolly 5ab9d53fc1
Remove the deprecated `on_{train,val,test,predict}_dataloader` hooks (#13033)
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-11 23:02:39 -04:00
Adrian Wälchli d24361733c
Provide access to unwrapped model in Lite (#12597)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-05-11 18:28:08 +00:00
Rohit Gupta 4011f379b8
Fix double precision during evaluation (#12983) 2022-05-11 17:43:19 +00:00
Rohit Gupta 9881bf2a2c
Avoid redundant callback restore warning while tuning (#13026) 2022-05-11 16:11:04 +02:00
Nikhil Shenoy b7959e3f51
Remove deprecated `checkpoint_callback` flag in Trainer (#13027)
* Removed lines pertinent to checkpoint_callback

* removed checkpoint callback flag

* Updated Change Log

* Removed deprecation test for checkpoint_callback argument

* updated line in the simple_classif_training.py

* Updated docs

* updated simple_classif_training.py removing enable_checkpointing
2022-05-11 08:01:00 -04:00
Eric Wiener 3f78c4ca7a
Track CPU stats with DeviceStatsMonitor (#11795)
Co-authored-by: ananthsub <ananth.subramaniam@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kaushik B <kaushikbokka@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-10 10:57:38 +00:00
Rohit Gupta c02dc8585c
Profile `LightningDataModule` hooks (#12971)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-06 12:23:36 +00:00
Akash Kwatra c5e1002fe4
Add profiling to dataloader `next()` (#12124)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-06 11:48:12 +02:00
Jirka Borovec 7ce948edb6
Unpin CUDA docker image for GPU CI (#12373)
* unpin CUDA docker image for GPU CI
* Apply suggestions from code review

Co-authored-by: Aki Nitta <nitta@akihironitta.com>
Co-authored-by: Akihiro Nitta <akihiro@pytorchlightning.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-05-06 02:56:57 +00:00
Sean Naren 1a502c061c
[1/2] Collaborative Strategy (#12842) 2022-05-05 16:06:26 +00:00
sisilmehta2000 d337374da7
[FSDP] Adding Native FSDP Strategy (#12447) 2022-05-05 12:48:29 +00:00
otaj e2ea9f045f
Add support for reloading the last checkpoint saved by passing `ckpt_path="last"` (#12816)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-05 08:32:58 +00:00
Rohit Gupta de7c103918
Add a method signature check for `setup` (#12960)
Co-authored-by: otaj <ota@grid.ai>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-04 18:20:59 +00:00
Max Mametkulov 1e96848596
Raise an exception when using DeepSpeed with an invalid accelerator (#12699)
Co-authored-by: manjirou <maxim.mametkulov@halbestunde.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-05-04 18:16:41 +00:00
Abhisek Maiti 2ffc0deaf5
Support `predict_dataset` in `LightningDataModule.from_datasets` (#12942)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-05-04 13:12:22 +00:00
Rohit Gupta 9bfbd9ea80
Fix zero division error for empty dataloaders (#12885)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-03 20:40:30 +00:00
Adrian Wälchli 5641836b96
Callback collection through entry points (#12739)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: thomas chaton <thomas@grid.ai>
2022-05-03 16:54:41 +00:00
Rohit Gupta 46ed9dc62a
Fix fit loop restart logic to enable resume using the checkpoint (#12821)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-03 16:27:13 +00:00
Rohit Gupta 5dc89512e8
Fix `TQDMProgressBar` reset and update to show correct time estimation (#12889)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-03 16:21:59 +00:00
Sean Naren 4d06301c18
[FIX] Enable mixed precision in the Fully Sharded Strategy when `precision=16` (#12965)
* Fix fully sharded mixed precision setter

* Add CHANGELOG.md
2022-05-03 15:39:59 +00:00
Carlos Mocholí f4505ce6b2
Construct the hook kwargs inside each loop (#12100)
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-05-03 17:08:02 +02:00
Rohit Gupta cd01856ffc
Add `LightningDataModule.load_from_checkpoint` to load datamodules directly from checkpoint (#12550)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: otaj <ota@grid.ai>
2022-05-03 12:27:06 +00:00
Mauricio Villegas 1c25ab8daf
Support CLI shorthand natively (#12614)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-03 14:16:37 +02:00
Rohit Gupta eebba9e632
Enforce eval shuffle warning only for default samplers (#12653)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-05-02 16:11:09 +00:00
Carlos Mocholí 917918ade3
Remove duplicate boring classes (#12951) 2022-05-02 17:42:12 +02:00
Carlos Mocholí 26acdd6569
Add hook test for reloading with max epochs (#12932) 2022-05-02 14:41:28 +02:00
otaj c461854fa7
Versioning of last checkpoints (#12902)
* last checkpoint versioning

* changelog

* Simplify test

* Update CHANGELOG.md

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

* Update CHANGELOG.md

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-29 14:13:50 +09:00
Kushashwa Ravi Shrimali 74d46d655d
Threading support for legacy loading of checkpoints (#12814) 2022-04-28 20:37:58 +00:00
otaj 55b3bc3e36
Print ragged dict of metrics in `EvaluationLoop._print_results` properly (#12857)
* first fix

* full bugfix + tests

* Apply Adrian's suggestion

* Add test with tensor(0)

* Minor code simplification

* change sorting to make the comment correct

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-28 16:05:24 +00:00
Sean Naren bcbd9c359e
ShardedGradScaler should only be set for FP16 (#12915)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-28 17:44:31 +02:00
Schinkikami a62c227932
Support automatic seeding of the LightningCLI (#12822)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-28 14:24:32 +00:00
Akihiro Nitta f3e746c145
Fix tests related to DDP communication hooks (#12878)
* Fix ddp_comm_hook tests

* Refactor ddp_comm_hook tests

Co-authored-by: Akihiro Nitta <akihiro@pytorchlightning.ai>
2022-04-27 22:37:19 +05:30
Carlos Mocholí 10c7a7c84f
Fix `trainer.logger` deprecation message (#12671) 2022-04-27 16:11:34 +02:00
Rohit Gupta 70754bea83
Fix to ensure the checkpoint states are saved in a common filepath with deepspeed (#12887)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-04-27 14:41:51 +02:00
Wei Ji 6490996b39
Support deterministic="warn" in Trainer for PyTorch 1.11+ (#12588)
Co-authored-by: carmocca <carlossmocholi@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-04-27 12:05:26 +00:00
otaj a41486245a
Use a single instance of `rich.console.Console` throughout the codebase (#12886) 2022-04-27 01:47:43 +00:00
Adrian Wälchli ab60cdbdcb
Raise better error when calling `Trainer.save_checkpoint` without a model attached (#12772)
* add error message

* add test

* changelog

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-26 11:16:41 +01:00
Akihiro Nitta bb81802bff
Update `deepspeed` and `fairscale` versions (#12860)
* Fix deepspeed installation

* Adapt to deepspeed>=0.5.9

* Fix fairscale installation

Co-authored-by: Akihiro Nitta <akihiro@pytorchlightning.ai>
2022-04-26 01:40:25 +02:00
alvitawa 958310a3fc
Fixed encoding issues on terminals that do not support unicode characters (#12828)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-25 12:24:30 +00:00
stanbiryukov 8034919c44
Remove deprecated `TestTubeLogger` (#12859)
* remove deprecated test_tube logger

* remove testube from logger __init__

* remove relevant testtube tests

* update CHANGELOG with removal of deprecated `TestTubeLogger`
2022-04-24 20:05:48 +02:00
Ray Schireman f931e27373
Remove the deprecated get_progress_bar_dict (#12839)
Co-authored-by: Raymond G Schireman <raymond.schireman@uvm.edu>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-22 22:22:26 +00:00
Ferdinand Schlatt f4f70a8a08
Add required for positional arguments in argparse logic (#12504)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-04-22 17:39:39 +02:00
Ray Schireman 9b2b1bb494
Remove deprecated `LightningDataModule.val_transforms` (#12763)
* remove val_transform from datamodule.py

* remove val_transforms from tests

* update docs

* update changelog

* remove unused imports

Co-authored-by: Raymond G Schireman <raymond.schireman@uvm.edu>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-22 01:24:42 +00:00
Henry Lau b155a6323f
Fix support for `ModelCheckpoint` monitors with dots (#12783)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-21 22:59:32 +02:00
Ray Schireman 54a2b5ceeb
Remove the deprecated `LightningDataModule.test_transforms` (#12773)
Co-authored-by: Raymond G Schireman <raymond.schireman@uvm.edu>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-04-21 16:40:15 +00:00
Nik a758d900ec
Support `val_check_interval` values higher than number of training batches (#11993)
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2022-04-21 09:35:53 +00:00
otaj f300b60f47
Horovod tests do not make sense for 1 gpu (#12710)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-04-21 08:49:40 +00:00
Adrian Wälchli 1233554e73
Make standalone tests less verbose (#12684)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-20 20:57:40 +02:00
Kushashwa Ravi Shrimali 0eefdd4d48
Better error message and type checking for `gpus` arg and `devices` arg in `Trainer` (#12530)
Co-authored-by: Kaushik B <45285388+kaushikb11@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-04-20 17:19:41 +00:00
Minh Chien Vu af03f0a434
Remove the deprecated LightningDataModule.size, LightningDataModule.dims (#12780) 2022-04-18 21:49:05 +02:00
Danielle Pintz 0b22e51462
Remove deprecated `dataloader_idx` argument from `on_train_batch_start/end` callback hooks (#12769)
* remove dataloader_idx
* fix tests
* Update CHANGELOG.md

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-04-18 18:49:59 +02:00
Jirka Borovec bca78bc25b
Merge pull request #12766 from PyTorchLightning/docs/slack
update slack link
2022-04-18 11:13:22 -04:00
twsl ae3226ced9
Add dataclass support to _extract_batch_size (#12573)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
2022-04-15 14:13:33 +02:00
otaj b8d4b81221
Inspect correct function in wrap_init (#12716)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com>
2022-04-15 13:58:28 +02:00
Kaushik B 77a02234e9
Support auto_select_gpus with accelerator and devices api (#12608)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-04-12 21:28:54 +00:00
puhuk 663216fe3e
Remove pytorch_lightning.core.memory.get_gpu_memory_map (#12644)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-04-12 15:55:56 +00:00
Ankita Sharma 313eee00ee
Remove deprecated `XLAStatsMonitor` (#12688)
Co-authored-by: rohitgr7 <rohitgr1998@gmail.com>
2022-04-12 15:35:43 +00:00
Rohit Gupta 6fcb590cfb
Update deepspeed precision test (#12727) 2022-04-12 09:27:14 +00:00
Adrian Wälchli 4a48710506
Remove support for passing strategy name to plugins (#12700)
* remove more code

* update tests

* remove unsupported test

* remove unsupported test

* remove dead enum values

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add changelog

* fix pep

* add xfail test

* remove comment

* Remove support for passing strategy name to plugins

* remove unused import

* chlog

* improve comment

* update chlog

* fix merge error

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2022-04-12 03:51:48 +02:00
Adrian Wälchli b2fe6bda5b
Remove support for passing strategy strings to accelerator (#12696)
Co-authored-by: Kushashwa Ravi Shrimali <kushashwaravishrimali@gmail.com>
2022-04-11 22:52:28 +02:00
Ray Schireman 9d343ba137
Remove the deprecated `pl.callbacks.ProgressBar` (#12658)
Co-authored-by: Raymond G Schireman <raymond.schireman@uvm.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-04-11 18:20:30 +00:00