14 Commits

Author SHA1 Message Date
awaelchli abae4c903b
Update Lightning AI multi-node guide (Trainer) (#19530)
* update

* update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* configure_model

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-02-28 08:35:53 -05:00
awaelchli e461e90f84
Update the Multi-GPU docs (#19525) 2024-02-26 22:29:26 -05:00
Alex Morehead 095d9cf279
docs: Fix typos and wording in cluster_advanced.rst (#18465) 2023-09-03 09:06:33 -04:00
Adrian Wälchli 7749525cbd
Document SLURM interactive mode (#16955) 2023-03-06 20:58:46 +00:00
Jirka Borovec 52a39c03f8
docs: update `pytorch_lightning` imports (#16864)
* update docs imports

* ci

* fabric

* trigger

* links

* .

* docstring

* chlog

* cleaning
2023-02-27 15:14:23 -05:00
Carlos Mocholí 76cb048b29
Remove docs about automatic fault tolerance (#16500)
Remove docs about the experimental automatic fault tolerance
2023-01-26 19:47:40 +01:00
Carlos Mocholí 486b4d5d9d
Remove old platform docs (#16499)
* Remove old platform docs

* More

* More
2023-01-25 16:16:51 +01:00
Carlos Mocholí cfe87a0b56
Clarify cluster advanced docs (#16403) 2023-01-17 14:58:01 +00:00
edenlightning 1c196da309
Update fault_tolerant_training_basic.rst (#16012) 2022-12-22 07:16:02 +00:00
Adrian Wälchli 7a1e0e801e
Fix typo in definition of world size in docs (#15954) 2022-12-08 18:06:12 +00:00
Adrian Wälchli ff3c5b7b9d
Docs section for SLURM troubleshooting (#14873)
Co-authored-by: Laverne Henderson <laverne.henderson@coupa.com>
Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
2022-09-29 12:41:31 +00:00
Max Ehrlich e5998e6bf2
Make the SLURM Preemption/Timeout Signal Configurable (#14626)
* Add parameter to change the preemption signal
* Make the signal connector use the custom signal from SLURMEnvironment

Signed-off-by: Max Ehrlich <max.ehr@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
2022-09-12 19:24:35 +00:00
Rohit Gupta e21490b9bb
Update old PL links (#13349) 2022-06-21 16:38:04 +02:00
Jirka Borovec b58577fd4d
Future 3/n: docs adjustment (#13299)
* docs: rename source >> source-PL

* docs: fix typing

* readthedocs

* update paths & codeowners

* source-pytorch

* ci

Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2022-06-15 10:54:53 -04:00