9464 Commits

Author SHA1 Message Date
Carlos Mocholí 3d573d5e79
Fix [TPU] tests (#18136)
* Debug [TPU] tests

* -U

* Uninstall typing extensions

* Minor simplifications

* Silly cancelling logic

* pip3?

* sudo

* More

* Revert "Silly cancelling logic"

This reverts commit ce31d874f3.
2023-07-23 13:39:00 +02:00
Carlos Mocholí 27cab24ca3
bump: typing-extensions >=4.0.0, <=4.7.1 (#18125)
* Update max typing extensions

* Fix some app tests too

* "trigger ci"

* xfail on 3.9
2023-07-21 16:46:18 +02:00
Carlos Mocholí 01b82e4fb1
Minor miscellaneous fixes (#18077)
* Various miscellaneous fixes

* Update

* Update

* succeeded

* Comment everywhere

* hasattr
2023-07-20 14:44:51 +02:00
Adrian Wälchli d6b5f3af15
Fix "optimizer in backward" compatibility with torch 2.1 nightly (#18119)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-20 07:22:54 -04:00
Adrian Wälchli ed6a48ed57
DeepSpeed precision simplifications (#18113)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-20 07:13:31 -04:00
Adrian Wälchli 2af425328a
Add FSDP to the docs glossary (#18121)
2023-07-20 09:07:37 +02:00
Adrian Wälchli f72d702027
Disable running the finetuning callback with DeepSpeed (#18100)
2023-07-20 09:02:11 +02:00
Adrian Wälchli 6ab6ab8193
Fabric FSDP documentation guide (#18109)
2023-07-19 18:39:07 +02:00
Carlos Mocholí 071f85842e
Support NVIDIA's Transformer Engine as a precision plugin (#17597)
2023-07-19 18:21:58 +02:00
Carlos Mocholí d653e4e088
Relax the assumption that the root module is FSDP wrapped (#18054)
2023-07-19 15:34:03 +02:00
Luca Antiga 2b854a84ea
Raise Empty when request_queue is None (#18111)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2023-07-19 07:22:57 -04:00
Adrian Wälchli dab373de54
Support loading a raw PyTorch state-dict checkpoint in Fabric (#18049)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-18 14:06:17 -04:00
Adrian Wälchli 5308e90895
Enable loading legacy checkpoints that pickled the `_FaultToleranceMode` enum (#18094)
2023-07-18 09:50:54 -04:00
Jirka Borovec 6b52b84ef8
docs: fallback for restoring right scroll menu (#18108)
2023-07-18 14:06:18 +02:00
Ishan Dutta 7116a9f9bb
Include parent directory validation check for deepspeed (#17795)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2023-07-17 19:09:38 -04:00
Shihao Yin c31ef77510
Fix `TensorBoardLogger.log_graph` not recording the graph (#17926)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-17 18:18:39 -04:00
Carlos Mocholí a9269aecdb
Add nightly testing to CUDA CI (#18078)
2023-07-17 20:53:29 +02:00
Adrian Wälchli 281d6a27d1
Allow custom loggers without an experiment property (#18093)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-07-17 12:30:11 -04:00
Adrian Wälchli d79eaae334
Update deepspeed model-parallel docs (#18091)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-17 12:02:54 -04:00
Adrian Wälchli b8d4a70db7
Make introduction example run on devices (#18097)
2023-07-17 17:19:16 +02:00
Adrian Wälchli ea92c218cc
Only validate schedulers in automatic optimization (#18092)
2023-07-17 17:18:42 +02:00
dependabot[bot] c21b6de3c7
Update deepdiff requirement from <6.3.1,>=5.7.0 to >=5.7.0,<6.3.2 in /requirements (#18032)
Update deepdiff requirement in /requirements

Updates the requirements on [deepdiff](https://github.com/seperman/deepdiff) to permit the latest version.
- [Release notes](https://github.com/seperman/deepdiff/releases)
- [Changelog](https://github.com/seperman/deepdiff/blob/master/docs/changelog.rst)
- [Commits](https://github.com/seperman/deepdiff/compare/5.7.0...6.3.1)

---
updated-dependencies:
- dependency-name: deepdiff
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-17 07:49:34 -04:00
dependabot[bot] 5191c21387
Update docutils requirement from <0.20,>=0.16 to >=0.16,<0.21 in /requirements (#18095)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-17 07:48:10 -04:00
Adrian Wälchli 080eaf38fa
Enable setting the sharding strategy as string in FSDP (#18087)
2023-07-15 18:07:09 +02:00
Carlos Mocholí c60f67e736
Support sets for policies in FSDP (#18084)
2023-07-15 17:39:28 +02:00
Carlos Mocholí e9c42ed11f
More XLA fixes for nightly support (#18085)
2023-07-15 01:16:42 +02:00
Adrian Wälchli 356f5d0c65
Fix detection of next version in Fabric's CSVLogger (#17986)
2023-07-14 16:08:16 -04:00
Carlos Mocholí 2f657ae46e
Support custom policies for activation checkpointing with FSDP (#18045)
2023-07-14 20:00:52 +02:00
Carlos Mocholí 340eecd846
Add `Trainer.init_module` and `LightningModule.configure_model` (#18004)
2023-07-14 19:15:05 +02:00
Jirka Borovec 00496da92d
docs: update CHANGELOG for 2.0.4 and 2.0.5 (#18071)
2023-07-14 02:46:48 +02:00
Carlos Mocholí a607da97ef
[TPU] Prepare for XLA 2.1 release (#17993)
2023-07-14 02:17:25 +02:00
Justus Schock 207b7c70c0
Fix IAM Credentials Backend (#18073)
* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update

* test-updates

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* test-updates

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-13 19:07:29 +02:00
Carlos Mocholí 234a78a957
Update `print` doc (#18053)
2023-07-13 13:10:07 +02:00
Carlos Mocholí 3a55f0c0a1
Minor miscellaneous fixes (#18068)
2023-07-13 06:01:58 -04:00
Mauricio Villegas e38c71b828
Fix `LightningCLI` not saving correctly seed_everything for `run=True` (#18056)
2023-07-12 11:53:39 +02:00
Jirka Borovec b16c35d673
drop AWS action (#18050)
2023-07-11 15:45:53 +02:00
Nicki Skafte Detlefsen 4fc6b560a7
Fix compatability with pydantic 2.0+ (#18030)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-07-10 23:51:39 +02:00
Adrian Wälchli a97c559d92
Make model test more robust (#18043)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-10 20:36:18 +00:00
Adrian Wälchli 69d7cfe5d8
Enable `self.device` access in setup hook (#18021)
2023-07-10 16:49:47 +02:00
Carlos Mocholí ad74f8623f
Don't reapply activation checkpointing (#18006)
2023-07-10 13:24:09 +00:00
Justus Schock 7ca49f2cb7
Requirements update (#18014)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-07-10 13:00:20 +00:00
Jirka Borovec 1a8baf61de
drop environment.yml (#18040)
2023-07-10 13:52:14 +02:00
Adrian Wälchli 6d888b5ce0
Fix param_group -> param_groups typo (#18020)
2023-07-09 19:13:33 +00:00
Adrian Wälchli acc70d0ae5
Support all half-precision modes in FSDP precision plugin (#17807)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-07-09 18:40:46 +00:00
Giorgio Strano f95275005a
Add option to change "=" symbol in ModelCheckpoint filenames (#17999)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-07-09 18:37:52 +00:00
Kilian Lieret 9780dfddc0
Fix doc for creating custom progress bar (#18024)
2023-07-09 18:35:06 +00:00
Jirka Borovec 913fa99f1b
pin `pydantic <2.0` (#18022)
pin pydantic <2.0
2023-07-08 16:04:38 +02:00
Carlos Mocholí 9a2bb85d82
Drop `torchdistx` support (#17995)
2023-07-08 02:15:05 +00:00
Leng Yue 734a3253cd
Support PyTorch Lightning's FSDP optimizer states saving and loading (#17819)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-07-07 17:55:59 +00:00
Jirka Borovec 1b43aacadd
Update name in pyproject.toml (#18010)
2023-07-07 17:08:59 +00:00