Commit Graph

28 Commits

Author SHA1 Message Date
awaelchli 98005bbed0
Add Studio badge to tensor parallel docs (#19913) 2024-05-28 09:04:55 -04:00
awaelchli c09356db1e
(10/10) Support 2D Parallelism - Port Fabric docs to PL (#19899) 2024-05-23 08:55:52 -04:00
awaelchli 341474aaac
(8/n) Support 2D Parallelism - 2D Parallel Fabric Docs (#19887) 2024-05-22 13:47:55 -04:00
awaelchli 987c2c4093
(7/n) Support 2D Parallelism - TP Fabric Docs (#19884)
Co-authored-by: Sebastian Raschka <mail@sebastianraschka.com>
Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com>
2024-05-22 06:20:40 -04:00
awaelchli d9113b61cc
Add additional references in compile guides (#19550) 2024-03-04 08:00:50 -05:00
awaelchli 48c39ce24f
Compile guide for Trainer (#19531)
Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
2024-02-28 09:15:33 -05:00
awaelchli 59e45d6f6d
Update `all_gather` docs (#19469) 2024-02-14 19:37:50 +01:00
Sebastian Raschka 0a77aa2cc5
Some clarifications in the torch.compile Fabric docs (#19456) 2024-02-13 06:52:29 +01:00
awaelchli 130b05fe0c
Fix dead link in docs (#19387) 2024-02-06 11:54:17 +01:00
awaelchli 89ff87def0
Reapply compile in `Fabric.setup()` by default (#19382) 2024-02-01 15:06:18 -05:00
Adam J. Stewart 509b2ca560
Docs: fix FSDP acronym (#19384) 2024-02-01 16:02:59 +01:00
awaelchli c346f4d159
Compile guide for Fabric (#19330) 2024-01-31 14:57:07 -05:00
awaelchli 93c1ab0653
Dedicated docs page for distributed checkpoints (Trainer) (#19299) 2024-01-17 12:20:12 +01:00
awaelchli a4ecf8d5c8
Dedicated docs page for distributed checkpoints (#19287) 2024-01-16 08:44:10 -05:00
Adrian Wälchli b79b68481e
Clarify setup of optimizer when using `empty_init=True` (#19067) 2023-11-26 11:04:36 +01:00
Adrian Wälchli 5d819c91fb
Remove `fsdp_overlap_step_with_backward` in favor of native solution (#18726) 2023-10-06 08:11:41 -04:00
Adrian Wälchli 8094855137
Avoid passing process group to enable FSDP's hybrid-shard (#18583) 2023-09-19 13:46:24 -04:00
Adrian Wälchli d3ee410100
Add dedicated docs page for init-module (#18416) 2023-08-28 11:28:38 -04:00
Adrian Wälchli f4825e5778
Extend FSDP guide with checkpointing (#18374) 2023-08-23 20:23:16 +02:00
Jirka Borovec 547e7aa393
docs: 1/3 enable Sphinx nitpicky [fabric] (#18069)
Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-08-23 10:31:20 +02:00
Adrian Wälchli 6df43685ee
Revamp model parallel docs (FSDP) (3/n) (#18326) 2023-08-17 15:30:58 -04:00
Adrian Wälchli 03ca31c3d3
Avoid updating the device for XLA FSDP in `Fabric.setup()` [TPU] (#18276) 2023-08-11 22:00:23 -04:00
Adrian Wälchli 3fd24f9591
Remove outdated warning about loading full-state checkpoints in FSDP (#18208) 2023-08-01 20:06:30 +02:00
Adrian Wälchli 6ab6ab8193
Fabric FSDP documentation guide (#18109) 2023-07-19 18:39:07 +02:00
Adrian Wälchli 9ff7d7120b
Add `rank_zero_first` utility (#17784)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-06-12 10:32:32 +00:00
Adrian Wälchli 0e84f01b09
Document how to use multiple models and optimizers in Fabric (#16952) 2023-03-07 13:19:43 +01:00
Adrian Wälchli 54147e0745
Update Fabric docs navigation (#16957) 2023-03-06 16:13:51 +01:00
Jirka Borovec 0e8ac7e1c9
docs: move fabric on its own (#16742)
* docs: move fabric to Lai
* update imports
* links
* drop link to Trainer
* own docs
* ci
* trigger
* prune cross-links
* cleaning
* cleaning
* template
* imports
* template
* path
* links
* tensorboardX
* plugins
* label
* drop fixme
* drop copy nb + examples
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Apply suggestions from code review
* try again
* rev

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2023-03-01 12:36:14 +01:00