Commit Graph

9571 Commits

Author SHA1 Message Date
dependabot[bot] 2fb90e6a66
Bump coverage from 7.2.7 to 7.3.0 in /requirements (#18300)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-16 17:19:35 +02:00
Jirka Borovec 3375b3a28a
ci: upload all possible packages (#18324) 2023-08-16 17:03:05 +02:00
Adrian Wälchli 725159ed60
Revamp model parallel docs (1/n) (#18314) 2023-08-16 08:06:50 -04:00
PL Ghost c971503363
Adding test for legacy checkpoint created with 2.0.7 (#18321)
[create-pull-request] automated change

Co-authored-by: Borda <Borda@users.noreply.github.com>
2023-08-16 12:41:30 +02:00
Carlos Mocholí dc44fa406a
Mention specific param names and devices in warning (#18273) 2023-08-16 00:51:28 +02:00
Jirka Borovec c98fb36b11
update ci for legacy upload (#18316)
* update ci for legacy upload

* docs
2023-08-15 17:32:39 -04:00
Jirka Borovec 300255854f
drop duplicate version file (#18315)
* link release

* drop false file

* drop tox
2023-08-15 18:19:21 +02:00
Adrian Wälchli a0ca2c8bcd
Disable memory sharing on model parameters in ddp-spawn (#18238)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-15 14:39:51 +02:00
Adrian Wälchli 0d1932ceaa
Re-enable saving and loading model checkpoint in FSDP with PyTorch < 2.0 (#18296)
Co-authored-by: Daniel Dale <danny.dale@gmail.com>
2023-08-14 11:09:38 -04:00
Jirka Borovec 8c518533c0
bump 2.1.0 RC0 (#18310) 2023-08-14 09:21:05 -04:00
dependabot[bot] ede22b97cd
Update rich requirement from <=13.4.2,>=12.3.0 to >=12.3.0,<=13.5.2 in /requirements (#18305)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-14 14:46:30 +02:00
dependabot[bot] 73ccc5c78c
Update matplotlib requirement from <3.6.2,>3.1 to >3.1,<3.7.3 in /requirements (#18301) 2023-08-14 14:37:08 +02:00
dependabot[bot] 1a6a657ed2
Update tensorboard requirement from <2.14.0,>=2.9.1 to >=2.9.1,<2.15.0 in /requirements (#18299) 2023-08-14 14:36:29 +02:00
dependabot[bot] 4da8078c99
Bump pytest-rerunfailures from 10.3 to 12.0 in /requirements (#18302) 2023-08-14 14:36:02 +02:00
dependabot[bot] df68501340
Update fsspec[http] requirement from <2023.5.0,>2021.06.0 to >2021.06.0,<2023.7.0 in /requirements (#18306) 2023-08-14 14:34:29 +02:00
dependabot[bot] 721e25e425
Update jsonargparse[signatures] requirement from <4.23.0,>=4.18.0 to >=4.18.0,<4.24.0 in /requirements (#18307)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-14 13:49:09 +02:00
Dan Dale c081b48324
Accommodate FSDP full-precision `param_dtype` training with PyTorch < 2.0 (#18278) 2023-08-14 12:22:26 +02:00
Jirka Borovec 746dfbd1fa
fix badges in readme (#18260) 2023-08-14 09:40:27 +02:00
Adrian Wälchli 3142ed5e44
Integration tests for XLA precision (#18286) 2023-08-13 09:20:26 -04:00
Adrian Wälchli c95dbac2e8
Validate Trainer settings against cluster environment (#18292) 2023-08-12 21:26:37 +02:00
Jirka Borovec 3ac550f94b
improve debug failing warning test (#18271)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-08-12 07:59:52 -04:00
Adrian Wälchli 03ca31c3d3
Avoid updating the device for XLA FSDP in `Fabric.setup()` [TPU] (#18276) 2023-08-11 22:00:23 -04:00
Adrian Wälchli 97020bf8d7
Support skipping training step when using mixed precision training (#18267)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-11 12:42:17 -04:00
Adrian Wälchli 4da2d8741d
Refresh the internal LightningOptimizer state for inspection (#18280) 2023-08-11 18:20:09 +02:00
Adrian Wälchli 7fe8756917
[TPU] Proper half-precision implementation for XLA (#18213)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-11 11:37:41 -04:00
0x404 b88b8b3937
Explicitly enable grad in closure (#18268)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2023-08-10 16:58:29 -04:00
Ryan Smith e24620c1af
Change error to warning if state_dict is empty in `load_from_checkpoint` (#18266)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-10 16:07:29 -04:00
Carlos Mocholí c83774a109
Update docs about double precision with complex numbers (#18269) 2023-08-10 10:36:55 +02:00
Adrian Wälchli 888466b144
Support true 16-bit precision with FSDP in Trainer (#18219) 2023-08-10 04:15:35 -04:00
Adrian Wälchli 70e31b6480
Make `all_reduce` consistent for both NCCL and GLOO (#18235)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-08-09 17:39:57 -04:00
Adrian Wälchli 27d9125a5d
Cast input before moving to device for all strategies (#18264) 2023-08-09 17:42:55 +02:00
Jirka Borovec efa7b2f9ef
docformatter: config with black (#18064)
* docformatter: config with black

* additional_dependencies: [tomli]

* 119

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-09 10:44:20 -04:00
Ethan Harris e33816ce60
[App] Fix app unit tests (#18262) 2023-08-08 21:20:33 +01:00
Ethan Harris 176e456814
[App] Client retries forever (#18065)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-08-08 21:13:03 +01:00
Adrian Wälchli 8f81dafd95
Support true 16-bit precision with deepspeed in Trainer (#18217)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-08-08 14:07:25 -04:00
Adrian Wälchli 774ea1e551
Improve info message of process observer with rank information (#18257) 2023-08-08 13:07:11 -04:00
PL Ghost 9d43f3eb94
Adding test for legacy checkpoint created with 2.0.5 (#18046)
Co-authored-by: Borda <Borda@users.noreply.github.com>
2023-08-08 18:29:49 +02:00
Adrian Wälchli a0f46abc71
Remove unreachable code in `TorchCheckpointIO` (#18237)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-08 16:30:04 +02:00
pre-commit-ci[bot] 834bd61164
[pre-commit.ci] pre-commit suggestions (#17983)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Jirka B <j.borovec+github@gmail.com>
2023-08-08 16:26:06 +02:00
Jirka Borovec 8f29bb561b
bump sphinx to 5.3 (#18204) 2023-08-08 15:32:34 +02:00
dependabot[bot] cb4715c2cc
Update packaging requirement from <=23.0,>=20.0 to >=20.0,<=23.1 in /requirements (#18245)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 12:33:35 +02:00
dependabot[bot] 6861c97ae8
Update tensorboardx requirement from <=2.6.1,>=2.2 to >=2.2,<=2.6.2 in /requirements (#18243)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 12:05:55 +02:00
dependabot[bot] cee05ea029
Update fsspec[http] requirement from <2023.5.0,>2021.06.0 to >2021.06.0,<2023.7.0 in /requirements (#18248)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 11:50:35 +02:00
dependabot[bot] 175aaafe1e
Update numpy requirement from <1.25.2,>=1.17.2 to >=1.17.2,<1.25.3 in /requirements (#18249)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 11:50:18 +02:00
dependabot[bot] 9fc731f33f
Update pyyaml requirement from <=6.0 to <=6.0.1 in /requirements (#18242)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 10:46:37 +02:00
Adrian Wälchli 1b80d4be07
Replace `_FSDPPolicy` with public import (#18236) 2023-08-08 10:42:13 +02:00
Ethan Harris b8f392934a
Simplify store (#18234)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
2023-08-08 09:34:03 +01:00
dependabot[bot] 41205bbc9b
Update pydantic requirement from <2.1.0,>=1.7.4 to >=1.7.4,<2.2.0 in /requirements (#18241)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 10:10:29 +02:00
dependabot[bot] 272e377788
Bump playwright from 1.35.0 to 1.36.0 in /requirements (#18247)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-08-08 09:43:57 +02:00
dependabot[bot] 301e3db5ae
Update uvicorn requirement from <=0.23.1 to <=0.23.2 in /requirements (#18246)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-08-08 09:43:47 +02:00