Commit Graph

1178 Commits

Author SHA1 Message Date
Adrian Wälchli 7749525cbd
Document SLURM interactive mode (#16955) 2023-03-06 20:58:46 +00:00
Jirka Borovec 9e12816de3
typing: fix App's core API - Flow (#16947)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-03-06 19:21:48 +00:00
Carlos Mocholí 900c0ebc84
Mention that the Trainer has started in barebones mode (#16926)
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2023-03-06 19:50:21 +01:00
Jirka Borovec 429b4309cd
typing: fix App's core API - Work (#16946) 2023-03-06 17:23:07 +00:00
Carlos Mocholí c2b28a0e8c
Update mypy job to torch 2.0 (#16933) 2023-03-06 16:54:31 +00:00
Carlos Mocholí 4eab1f3ef1
Call `_cuda_clearCublasWorkspaces` on teardown (#16907) 2023-03-06 17:52:02 +01:00
JuanPablo 6781f3559f
Add `max_size` mode to CombinedLoader (#16939)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-03-06 16:21:49 +00:00
Carlos Mocholí 98b6e42df7
The torchdynamo inline cache is fixed in 2.1 (#16934) 2023-03-06 14:24:10 +00:00
Carlos Mocholí a00e061417
Prepare for ShardedTensor deprecation (#16892) 2023-03-06 13:42:27 +00:00
Justus Schock 24c0cd738c
BYOT example (#16938)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-03-06 10:26:18 +01:00
Adrian Wälchli f2caa01bb3
Document gradient clipping in Fabric (#16943) 2023-03-05 17:03:57 +00:00
Carlos Mocholí fca69e68da
Fabric: Test PyTorch 2.0 pre-release on CPU and CUDA (#16905)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-03-03 17:48:49 +00:00
Sherin Thomas f224c68b15
Maverick message fix (#16940)
* message fix for maverick

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* cleanup

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-03-03 22:31:38 +05:30
Jirka Borovec d9817aa5f1
refactor: move RunIf to PL pkg (#16923)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-03-03 15:45:59 +00:00
Ethan Harris ba2c378ba0
[App] Add `healthz` endpoint to plugin server (#16882)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-03-03 09:47:19 +00:00
Carlos Mocholí c8c4cbdddb
Update CHANGELOG after the v1.9.4 release (#16906) 2023-03-03 02:38:12 +01:00
Adrian Wälchli 7820a117bc
Optimize precision conversion in forward of Fabric module wrapper (#16903) 2023-03-02 23:41:37 +00:00
dependabot[bot] 0cb25f44d7
Update torchmetrics requirement from <0.10.1,>=0.7.0 to >=0.7.0,<0.11.1 in /requirements (#15904)
* Update torchmetrics requirement in /requirements

Updates the requirements on [torchmetrics](https://github.com/Lightning-AI/metrics) to permit the latest version.
- [Release notes](https://github.com/Lightning-AI/metrics/releases)
- [Changelog](https://github.com/Lightning-AI/metrics/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Lightning-AI/metrics/compare/v0.7.0...v0.11.0)

---
updated-dependencies:
- dependency-name: torchmetrics
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* task

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-03-02 23:35:31 +00:00
Yuxuan Lu 8057c66eae
Add support for fsspec paths for CSVLoggers (#16880)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-03-02 23:33:55 +00:00
Matt Whiteway 9e536713f1
update default websocket setting (#16446) 2023-03-02 23:30:05 +00:00
Jirka Borovec a406826bcb
cleaning typing CI and config (#16860)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-03-02 19:12:14 +00:00
Jirka Borovec 37d0322dbe
pkg: append PL to lightning (#16921) 2023-03-02 19:11:26 +00:00
Jirka Borovec aa7fa3a0ce
add `Any` annotation (#16861)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-03-01 21:00:21 +00:00
Sherin Thomas da50c1341e
Maverick registration (#16913)
2023-03-01 18:18:14 +05:30
Justus Schock b4e29e0c8f
PL: Test PyTorch 2.0 pre-release on CPU and CUDA (#16764)
Co-authored-by: Akihiro Nitta <nitta@akihironitta.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-02-28 18:38:40 +01:00
Ethan Harris afc63e9f66
[App] Fix environment check with command redirection (#16883) 2023-02-28 12:34:43 +00:00
Wouter Zwerink dfa35dac99
Require neptune 1.0 (#16888)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-28 12:13:42 +01:00
Marten Lienen 7bc39ae238
Log gradient norms of any magnitude (#16877) 2023-02-28 01:32:07 +00:00
Adrian Wälchli b49c5310e4
Replace obsolete `_FakeQueue` in multiprocessing launcher (#16873)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-28 01:31:18 +00:00
Carlos Mocholí 9aafd557bd
Backwards compatibility for `get_init_args` (#16851) 2023-02-28 01:09:30 +01:00
Justus Schock 3d1927e6bc
Adds Gradient Clipping to Fabric (#16715)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-02-27 23:44:13 +00:00
Jirka Borovec 8884c8970c
test/hotfix: DDPSpawnStrategy (#16889) 2023-02-27 23:16:44 +00:00
Jirka Borovec 52a39c03f8
docs: update `pytorch_lightning` imports (#16864)
* update docs imports

* ci

* fabric

* trigger

* links

* .

* docstring

* chlog

* cleaning
2023-02-27 15:14:23 -05:00
Yi Heng Lim 4444d0c37d
Fix support for passing -1 to `find_usable_cuda_devices` function (#16866)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-02-27 20:08:42 +00:00
Adrian Wälchli 07b89c87ee
Merge DDPStrategy and DDPSpawnStrategy in PL (#16809) 2023-02-27 14:43:23 -05:00
Adrian Wälchli e3efbaa7f6
Incorporate pytorch's fixes in device_count_nvml (#16795)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-27 18:07:55 +00:00
Carlos Mocholí 3ddf100671
Check for dataloader_idx presence in the hooks (#16837) 2023-02-27 17:46:46 +01:00
Adrian Wälchli 5bdf8a52bc
Remove redundant strategy property (#16811) 2023-02-26 00:11:51 +00:00
Carlos Mocholí 6b9ddf00dd
Introduce `Trainer(barebones=True)` (#16854)
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2023-02-24 12:32:31 +01:00
Carlos Mocholí 0130273eb5
Trainer: auto default (#16847) 2023-02-23 18:42:17 +01:00
Carlos Mocholí d486f94dd2
Fabric: auto default (#16842) 2023-02-23 13:45:27 +00:00
Adrian Wälchli bc9651364f
Update changelog after 1.9.3 and bump version for RC (#16833)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-23 13:37:24 +00:00
Liyang90 19139137c7
Fix for hanging issue on TPU Pod (#16844)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-23 14:27:00 +01:00
Carlos Mocholí 235e692259
Fabric: do `set_epoch` for `batch_sampler.sampler` (#16841) 2023-02-23 00:11:29 +00:00
Ethan Harris beced48904
[App] Add support for plugins to return actions (#16832) 2023-02-22 17:12:04 +00:00
Carlos Mocholí 62e3d5854f
Consume the prediction batch indices iteratively (#16826) 2023-02-22 17:03:08 +01:00
Justus Schock 598c2476cd
Remove implicit frontend testing from `testing.run_app_in_cloud` (#16741)
Co-authored-by: Ethan Harris <ethanwharris@gmail.com>
2023-02-22 14:48:10 +00:00
Carlos Mocholí 914effa04c
Rename `replace_sampler_ddp|replace_sampler` to `use_distributed_sampler` (#16829)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-02-22 14:07:02 +01:00
Carlos Mocholí 565d6111f3
Move `max_batches` definition to the Loops (#16820) 2023-02-22 14:01:34 +01:00
Ethan Harris f969411284
[App] Fix local app run with relative import (#16835) 2023-02-22 12:23:28 +00:00