Commit Graph

8671 Commits

Author SHA1 Message Date
Jirka Borovec e59685e5f8
align lit-utils version to post0 (#16593) 2023-02-01 15:00:04 +05:30
Adrian Wälchli c1692c6eb6
Bump max deepspeed version to 0.8.0 (#16469)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-02-01 05:38:43 +00:00
Jirka Borovec 34140c0603
move lightning_app >> lightning/app (#16553)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-02-01 06:29:16 +01:00
Carlos Mocholí 59f2d4ce63
Install colossalai==0.1.12 in CI (#16587) 2023-02-01 04:57:22 +00:00
Carlos Mocholí dc298f2340
Drop support for Python 3.7 (#16579)
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-02-01 01:36:42 +00:00
Carlos Mocholí c2b5b77e23
Ignore leaked XLA environment variables (#16582) 2023-01-31 22:33:52 +01:00
Jirka Borovec 95e3117a0c
ci/hotfix: replace SD with Flashy (#16584)
ci: replace SD with Flashy
2023-01-31 13:06:58 -05:00
Carlos Mocholí 9f8043e16f
Cascade SIGTERM to subprocesses (#16525) 2023-01-31 17:24:58 +01:00
Ethan Harris 3001c7c1d5
[App] Fix app name in URL (#16575) 2023-01-31 07:14:56 -05:00
Jirka Borovec 38a98f447f
ci: update doctest for installed packages (#16574) 2023-01-31 11:41:05 +00:00
Ethan Harris 0ec93f2125
[App] Update app URLs to latest format (#16568) 2023-01-30 20:52:46 +00:00
dependabot[bot] dd46873473
Update omegaconf requirement from <2.3.0,>=2.0.5 to >=2.0.5,<2.4.0 in /requirements (#16545)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-30 18:35:08 +00:00
Adrian Wälchli c2aa8c9e10
Remove `optimizer_idx` from `toggle/untoggle_optimizer` methods (#16560)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-01-30 18:06:42 +00:00
Andrea Tupini d634846b5e
Minor formatting fix on model_parallel docs (#16565) 2023-01-30 12:40:03 -05:00
Adrian Wälchli 8fc4fb18e6
Switch multi-optimizer tests to manual optimization (#16559) 2023-01-30 17:18:37 +00:00
Jirka Borovec 86d31eb8d2
update version for Fabric CLI (#16556) 2023-01-30 17:12:39 +00:00
Jirka Borovec cc78efadcf
bump version Dev (#16562) 2023-01-30 10:41:15 -05:00
Carlos Mocholí a78412f11d
Remove DataLoader serialization (under fault tolerance) (#16533) 2023-01-30 16:12:01 +01:00
belerico 55ff87ae82
Fabric RL example: add `torchmetrics` and minor fixes (#16543)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
2023-01-30 09:11:43 -05:00
Adrian Wälchli 8aca46a192
Remove `using_lbfgs` argument from `optimizer_step` module hook (#16538) 2023-01-30 12:49:35 +00:00
Adrian Wälchli 1008f313e8
Remove `on_tpu` argument from `optimizer_step` module hook (#16537)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-01-30 13:17:20 +01:00
Jirka Borovec dc38663a03
store: update API in messages (#16535) 2023-01-30 11:19:57 +00:00
Jirka Borovec d521f2bc99
store: mock/fixture home (#16536)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-01-30 12:02:15 +01:00
Jirka Borovec 879701f52f
ci: hotfix precommit/poetry/isort (#16549) 2023-01-30 11:07:52 +01:00
Carlos Mocholí 07cda8c94e
Ensure SIGTERM handlers other than ours can be added (#16534) 2023-01-29 03:14:38 +01:00
Aniket Maurya 252dd92623
Fix docstring for `LightningWork.has_stopped` (#16532) 2023-01-29 00:01:13 +00:00
William Falcon 493be358a3
Update README.md 2023-01-28 11:13:21 -05:00
Adrian Wälchli bb7b8d601a
Fabric docs feedback 2/n (#16480) 2023-01-27 20:13:20 +01:00
Carlos Mocholí 405bb75553
Move progress file (#16524) 2023-01-27 17:49:52 +00:00
Carlos Mocholí 226290cfc1
PyTorch 2.0 switched the `set_to_none` default (#16531)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2023-01-27 16:51:56 +00:00
Carlos Mocholí e74a8378b4
Remove result serialization (under fault tolerance) (#16516) 2023-01-27 17:41:37 +01:00
Adrian Wälchli b216a114a7
Decouple Tuner from Trainer (#16462)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2023-01-27 15:08:40 +00:00
Jirka Borovec c8c47227df
store: adding group-check (#16528) 2023-01-27 14:01:06 +00:00
Jirka Borovec 2251055be4
store: drop `requirements_file_path` (#16527) 2023-01-27 08:39:07 -05:00
Carlos Mocholí d562319a61
Make the `FaultToleranceCheckpoint` callback opt-in (#16512) 2023-01-27 13:02:14 +00:00
belerico b5599e1320
Add reinforcement learning example for Fabric (#16506)
Co-authored-by: awaelchli <aedu.waelchli@gmail.com>
Co-authored-by: Luca Antiga <luca@lightning.ai>
2023-01-27 11:28:25 +00:00
Kushashwa Ravi Shrimali d738ab17e6
Init: Models store API (#15811)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-01-27 12:27:04 +01:00
dconathan 25e1aff7d7
Allow `MLFlowLogger` to work with `mlflow-skinny` (#16513)
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
Fixes https://github.com/Lightning-AI/lightning/issues/16486
2023-01-27 04:29:00 +00:00
Carlos Mocholí c854e4d1e2
SIGTERM handling is now unrelated to fault tolerance (#16501) 2023-01-27 02:59:42 +00:00
Carlos Mocholí b2387136ba
Fix `torch.compile` tests (#16503) 2023-01-27 02:41:45 +00:00
Carlos Mocholí 76cb048b29
Remove docs about automatic fault tolerance (#16500)
Remove docs about the experimental automatic fault tolerance
2023-01-26 19:47:40 +01:00
Jirka Borovec c3a9bf0419
ci: trigger on action edit (#16514) 2023-01-26 15:59:10 +01:00
Adrian Wälchli 23e71a880a
Fabric checkpointing 3/n: Implement missing `get_module_state_dict` for strategies (#16487) 2023-01-26 13:10:14 +00:00
Jirka Borovec 50fd12f841
fabric: test with tbX (#16511) 2023-01-26 12:52:02 +00:00
Jirka Borovec d871f70389
app: hotfix import `pytest` (#16510) 2023-01-26 12:51:33 +00:00
Jirka Borovec 22c34658f4
relax `packaging` versions (#16508) 2023-01-26 12:42:57 +00:00
Adrian Wälchli c68cfd686e
Rename LiteMultiNode to FabricMultiNode (#16505) 2023-01-26 11:36:27 +00:00
Jirka Borovec f812cb8339
ci: move Flagships to GH (#16420)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-01-26 09:28:30 +00:00
Adrian Wälchli dfd8d80cb1
Multi-node documentation for Fabric (#16495)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
2023-01-25 22:07:09 +00:00
Adrian Wälchli 961fa6a0ea
Fix strict torch_xla availability check (#16476)
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Co-authored-by: Jirka <jirka.borovec@seznam.cz>
2023-01-25 18:52:33 +00:00